EVA-Qwen2.5-14B-v0.2

Maintained By
EVA-UNIT-01

EVA-Qwen2.5-14B-v0.2

PropertyValue
Parameter Count14.8B
Model TypeText Generation/Roleplay
Base ModelQwen2.5-14B
LicenseApache 2.0
Training Hardware8x H100 SXM

What is EVA-Qwen2.5-14B-v0.2?

EVA-Qwen2.5-14B-v0.2 is a specialized language model designed for roleplay and creative writing applications. It represents a full-parameter fine-tune of the Qwen2.5-14B model, incorporating a diverse mixture of synthetic and natural datasets. This version (0.2) brings significant improvements in coherence, instruction following, and long-context comprehension compared to its predecessor.

Implementation Details

The model utilizes the ChatML format and implements advanced training techniques including spectrum-based training and optimization through the Axolotl framework. It was trained for 3 hours on 8xH100 SXM hardware provided by FeatherlessAI, using a carefully curated mixture of datasets including Celeste 70B, Opus_Instruct, and various specialized roleplay and writing datasets.

  • Optimized sampling parameters (Temperature: 0.8, Min-P: 0.05, Top-A: 0.3)
  • 10,240 sequence length with sample packing
  • Specialized training configuration using Liger plugins
  • Advanced dropout and attention mechanisms

Core Capabilities

  • Enhanced creative writing and storytelling
  • Improved coherence in long-form content
  • Better instruction following compared to v0.1
  • Specialized roleplay interactions
  • Extended context understanding

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized focus on creative writing and roleplay, utilizing a refined dataset from the 32B v0.2 version and implementing advanced training techniques. The combination of multiple high-quality datasets and careful optimization makes it particularly effective for narrative generation and character interactions.

Q: What are the recommended use cases?

The model excels in creative writing, storytelling, roleplay scenarios, and extended narrative generation. It's particularly well-suited for applications requiring consistent character portrayal and coherent long-form content generation.

The first platform built for prompt engineering