NeuralHermes-2.5-Mistral-7B

Maintained by mlabonne

  • Parameter Count: 7.24B
  • License: Apache 2.0
  • Base Model: teknium/OpenHermes-2.5-Mistral-7B
  • Training Method: Direct Preference Optimization (DPO)

What is NeuralHermes-2.5-Mistral-7B?

NeuralHermes-2.5-Mistral-7B is a language model fine-tuned from teknium/OpenHermes-2.5-Mistral-7B with Direct Preference Optimization (DPO) on a curated preference dataset. The model performs strongly across multiple benchmarks, including ARC-Challenge (66.55% accuracy), HellaSwag (84.9% accuracy), and MMLU (63.32% accuracy).
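
As background on the method: DPO trains directly on preference pairs, with no separate reward model. The standard objective below is from the original DPO formulation, not anything specific to this model's particular run:

$$\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log\sigma\!\left(\beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)} - \beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}\right)\right]$$

where $y_w$ and $y_l$ are the preferred and rejected responses for prompt $x$, $\pi_{\mathrm{ref}}$ is the frozen reference model (here, the OpenHermes-2.5 base), and $\beta$ controls how far the fine-tuned policy may drift from that reference.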

Implementation Details

The model was fine-tuned with LoRA adapters (r=16, lora_alpha=16) targeting the base model's projection layers. Training ran for approximately one hour on a single A100 GPU using the mlabonne/chatml_dpo_pairs dataset.
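
A condensed sketch of this recipe using the Hugging Face peft and trl libraries (the DPOTrainer constructor shown matches the trl 0.7.x API contemporary with this model). The r, lora_alpha, optimizer, scheduler, and dataset come from the card above; the target module list, learning rate, batch size, beta, and step count are illustrative assumptions:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "teknium/OpenHermes-2.5-Mistral-7B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Preference pairs, assumed to expose the prompt/chosen/rejected
# columns that DPOTrainer expects, already in ChatML format
dataset = load_dataset("mlabonne/chatml_dpo_pairs")["train"]

# LoRA config from the card: r=16, lora_alpha=16 on the projection
# layers (this exact target_modules list is an assumption)
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,               # assumption
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

training_args = TrainingArguments(
    output_dir="neuralhermes-dpo",
    per_device_train_batch_size=4,   # assumption
    gradient_accumulation_steps=4,   # assumption
    learning_rate=5e-5,              # assumption
    lr_scheduler_type="cosine",      # from the card
    optim="paged_adamw_32bit",       # from the card
    max_steps=200,                   # assumption
    logging_steps=1,
)

dpo_trainer = DPOTrainer(
    model,
    ref_model=None,                  # trl derives the frozen reference from the PEFT base
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=0.1,                        # assumption (common DPO default)
    max_prompt_length=1024,
    max_length=1536,
)
dpo_trainer.train()
```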

  • Implements the ChatML template for consistent dialogue formatting (see the example after this list)
  • Available in multiple quantized versions (GGUF, AWQ, GPTQ, EXL2)
  • Optimized with the paged_adamw_32bit optimizer and a cosine learning-rate schedule
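
For reference, a ChatML-formatted prompt looks like this (the message contents are placeholders):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is a large language model?<|im_end|>
<|im_start|>assistant
```

The trailing <|im_start|>assistant line cues the model to generate its reply.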

Core Capabilities

  • Strong performance on reasoning tasks (ARC-Challenge: 66.55%)
  • Excellent common-sense understanding (HellaSwag: 84.9%)
  • High accuracy in mathematical reasoning (GSM8K: 61.33%)
  • Improved truthfulness in responses (TruthfulQA: 54.93%)

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its improved performance over the original OpenHermes model across multiple benchmarks, achieved through careful DPO fine-tuning and optimization of the training process.

Q: What are the recommended use cases?

The model is well-suited for general text generation, instruction following, mathematical reasoning, and truthful Q&A applications. It integrates easily into applications through popular frameworks such as transformers or LM Studio; a sketch with transformers follows.
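
A minimal inference sketch with the transformers library, assuming the model's tokenizer ships the ChatML chat template noted above; the prompt and generation parameters are illustrative:

```python
import torch
from transformers import AutoTokenizer, pipeline

model_id = "mlabonne/NeuralHermes-2.5-Mistral-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build a ChatML prompt via the tokenizer's chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain Direct Preference Optimization in two sentences."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,  # assumes a GPU with bf16 support
    device_map="auto",
)

outputs = generator(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9)
print(outputs[0]["generated_text"])
```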
