EVA-Qwen2.5-14B-v0.2
Property | Value |
---|---|
Parameter Count | 14.8B |
Model Type | Text Generation/Roleplay |
Base Model | Qwen2.5-14B |
License | Apache 2.0 |
Training Hardware | 8x H100 SXM |
What is EVA-Qwen2.5-14B-v0.2?
EVA-Qwen2.5-14B-v0.2 is a specialized language model designed for roleplay and creative writing applications. It represents a full-parameter fine-tune of the Qwen2.5-14B model, incorporating a diverse mixture of synthetic and natural datasets. This version (0.2) brings significant improvements in coherence, instruction following, and long-context comprehension compared to its predecessor.
Implementation Details
The model utilizes the ChatML format and implements advanced training techniques including spectrum-based training and optimization through the Axolotl framework. It was trained for 3 hours on 8xH100 SXM hardware provided by FeatherlessAI, using a carefully curated mixture of datasets including Celeste 70B, Opus_Instruct, and various specialized roleplay and writing datasets.
- Optimized sampling parameters (Temperature: 0.8, Min-P: 0.05, Top-A: 0.3)
- 10,240 sequence length with sample packing
- Specialized training configuration using Liger plugins
- Advanced dropout and attention mechanisms
Core Capabilities
- Enhanced creative writing and storytelling
- Improved coherence in long-form content
- Better instruction following compared to v0.1
- Specialized roleplay interactions
- Extended context understanding
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its specialized focus on creative writing and roleplay, utilizing a refined dataset from the 32B v0.2 version and implementing advanced training techniques. The combination of multiple high-quality datasets and careful optimization makes it particularly effective for narrative generation and character interactions.
Q: What are the recommended use cases?
The model excels in creative writing, storytelling, roleplay scenarios, and extended narrative generation. It's particularly well-suited for applications requiring consistent character portrayal and coherent long-form content generation.