EVA-Qwen2.5-72B-v0.2
Property | Value |
---|---|
Parameter Count | 72.7B |
Model Type | Large Language Model |
License | Qwen License |
Base Model | Qwen2.5-72B |
Training Hardware | 8x H100 SXM |
What is EVA-Qwen2.5-72B-v0.2?
EVA-Qwen2.5-72B-v0.2 is a specialized language model designed for roleplay and creative writing applications. It represents a full-parameter fine-tune of the Qwen2.5-72B base model, incorporating a diverse mixture of synthetic and natural training data. The model builds upon the Celeste 70B 0.1 data mixture, significantly expanded to enhance versatility, creativity, and distinctive output characteristics.
Implementation Details
The model utilizes the ChatML format for prompting and incorporates advanced training optimizations, including a sequence length of 10240 tokens and carefully tuned hyperparameters. Training was conducted over 17 hours using 8x H100 SXM hardware, with significant improvements in instruction following and reduced repetition compared to previous versions.
- Optimized sampling parameters (Temperature: 0.8, Min-P: 0.05, Top-A: 0.3)
- Enhanced instruction following deeper into context
- Improved repetition handling with 1.03 penalty
- Comprehensive dataset mixture including Opus_Instruct, Sonnet3.5, and synthetic datasets
Core Capabilities
- Specialized in creative writing and roleplay scenarios
- Extended context handling with 10K+ token support
- Improved instruction following and coherence
- Enhanced creative content generation
- Reduced repetition in longer outputs
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its specialized focus on creative writing and roleplay, achieved through careful dataset curation and full-parameter fine-tuning of the Qwen2.5-72B base model. The v0.2 version brings significant improvements in instruction following and reduced repetition.
Q: What are the recommended use cases?
The model excels in creative writing, storytelling, and roleplay scenarios. It's particularly well-suited for applications requiring extended creative content generation with consistent character and plot maintenance.