OpenHermes-2.5-Mistral-7B
Property | Value |
---|---|
Parameter Count | 7.24B |
Base Model | Mistral-7B-v0.1 |
License | Apache 2.0 |
Training Data | 1M GPT-4 generated entries |
Format | ChatML |
What is OpenHermes-2.5-Mistral-7B?
OpenHermes 2.5 is a state-of-the-art language model built on Mistral-7B, representing an evolution of the OpenHermes series. Developed by teknium, it's trained on a carefully curated dataset of 1 million entries, predominantly generated by GPT-4, with additional high-quality data from various open datasets. The model particularly excels in both general language understanding and code generation tasks.
Implementation Details
The model implements the ChatML format for structured dialogue, supporting system prompts and multi-turn conversations. It's been trained with an emphasis on code instruction (approximately 7-14% of the dataset), which has notably improved its performance across various benchmarks.
- Trained on filtered and transformed datasets using the ChatML format
- Supports system prompts for consistent behavior across conversations
- Compatible with OpenAI endpoint format
- Available in multiple quantized versions (GGUF, GPTQ, AWQ, EXL2)
Core Capabilities
- Improved HumanEval score of 50.7% @ Pass 1
- Enhanced performance in TruthfulQA, AGIEval, and GPT4All benchmarks
- Strong code generation and understanding capabilities
- Advanced reasoning and problem-solving abilities
- Sophisticated multi-turn dialogue handling
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its balanced approach to both code and general language tasks, achieving improved benchmark scores through strategic inclusion of code instruction data. It offers superior performance compared to previous OpenHermes versions while maintaining a relatively compact 7B parameter size.
Q: What are the recommended use cases?
The model excels in code generation, technical discussions, general conversation, and complex reasoning tasks. It's particularly well-suited for applications requiring both programming assistance and general language understanding.