# NeuralHermes-2.5-Mistral-7B
| Property | Value |
|---|---|
| Parameter Count | 7.24B |
| License | Apache 2.0 |
| Base Model | teknium/OpenHermes-2.5-Mistral-7B |
| Training Method | Direct Preference Optimization (DPO) |
## What is NeuralHermes-2.5-Mistral-7B?
NeuralHermes-2.5-Mistral-7B is a language model fine-tuned from teknium/OpenHermes-2.5-Mistral-7B using Direct Preference Optimization (DPO) on a curated preference dataset. It posts strong results across multiple benchmarks, including ARC-Challenge (66.55% accuracy), HellaSwag (84.9% accuracy), and MMLU (63.32% accuracy).
## Implementation Details
The model was trained with LoRA fine-tuning (r=16, lora_alpha=16) applied to the model's projection layers. Training ran for approximately one hour on a single A100 GPU using the mlabonne/chatml_dpo_pairs dataset; a sketch of the setup follows the list below.
- Uses the ChatML prompt template for consistent dialogue formatting
- Available in multiple quantized versions (GGUF, AWQ, GPTQ, EXL2)
- Optimized with the paged_adamw_32bit optimizer and a cosine learning-rate schedule
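A minimal sketch of this DPO setup, assuming the pre-1.0 trl `DPOTrainer` API (newer releases move several of these arguments into `DPOConfig`). Only r=16, lora_alpha=16, the optimizer, and the scheduler come from this card; the target modules, learning rate, step count, batch sizes, and beta are illustrative values:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "teknium/OpenHermes-2.5-Mistral-7B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# LoRA over the projection layers (target_modules here are assumed, not from the card)
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["k_proj", "gate_proj", "v_proj", "up_proj",
                    "q_proj", "o_proj", "down_proj"],
)

# DPO expects (prompt, chosen, rejected) columns; the raw dataset may need a
# mapping step to produce exactly that schema.
dataset = load_dataset("mlabonne/chatml_dpo_pairs")["train"]

training_args = TrainingArguments(
    output_dir="neuralhermes-dpo",
    per_device_train_batch_size=4,      # illustrative
    gradient_accumulation_steps=4,      # illustrative
    learning_rate=5e-5,                 # illustrative
    lr_scheduler_type="cosine",         # from the card
    optim="paged_adamw_32bit",          # from the card
    max_steps=200,                      # illustrative
    logging_steps=1,
)

trainer = DPOTrainer(
    model,
    ref_model=None,          # with a peft_config, trl builds the reference model internally
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=0.1,                # DPO regularization strength; illustrative
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```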
## Core Capabilities
- Strong performance in reasoning tasks (ARC Challenge)
- Excellent common sense understanding (HellaSwag)
- High accuracy in mathematical reasoning (GSM8k: 61.33%)
- Improved truthfulness in responses (TruthfulQA: 54.93%)
## Frequently Asked Questions
**Q: What makes this model unique?**
It delivers measurable gains over the original OpenHermes-2.5 model across multiple benchmarks, achieved through DPO fine-tuning on preference pairs and careful tuning of the training process.
**Q: What are the recommended use cases?**
The model is well suited to general text generation, instruction following, mathematical reasoning, and truthful Q&A. It integrates with popular frameworks such as transformers or LM Studio; a minimal inference sketch follows.
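A quick inference sketch using transformers, assuming the repository id `mlabonne/NeuralHermes-2.5-Mistral-7B`; because the tokenizer ships a ChatML chat template, `apply_chat_template` handles the prompt formatting:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlabonne/NeuralHermes-2.5-Mistral-7B"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain Direct Preference Optimization in one paragraph."},
]

# Render the conversation with the model's ChatML template and generate a reply.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```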