Rhea-72b-v0.5
| Property | Value |
|---|---|
| Parameter Count | 72.3B |
| License | Apache 2.0 |
| Base Model | Smaug-72B-v0.1 |
| Tensor Type | BF16 |
What is Rhea-72b-v0.5?
Rhea-72b-v0.5 is a state-of-the-art language model that has achieved top rankings on Hugging Face's Open LLM Leaderboard. Fine-tuned from Smaug-72B-v0.1, it implements innovative training approaches, including a novel Self-Generated Dataset (SGD) creation method for DPO learning.
Implementation Details
The model is fine-tuned with the nox framework. Training proceeds in two stages: supervised fine-tuning (SFT) on an approximately 4M-sample dataset that combines more than 60 different sources, followed by DPO learning on a specialized 151k-sample dataset. A data-mixing sketch for the SFT stage follows the feature list below.
- Advanced SGD methodology for dataset creation
- Comprehensive evaluation across multiple benchmarks
- Implementation of Direct Preference Optimization (DPO) learning
- Integration with diverse training datasets
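As an illustration of the data-mixing step, the sketch below merges several instruction-tuning sources into one shuffled SFT dataset. The file names and record fields are hypothetical placeholders; the actual 60+ sources and their schemas are not specified here.

```python
import json
import random
from pathlib import Path

# Hypothetical local JSONL files standing in for the 60+ SFT sources.
SOURCE_FILES = [
    "data/source_alpha.jsonl",
    "data/source_beta.jsonl",
    "data/source_gamma.jsonl",
]

def load_jsonl(path):
    """Read one JSONL file into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def build_sft_dataset(source_files, seed=42):
    """Merge all sources into a single shuffled instruction/response list."""
    records = []
    for path in source_files:
        for row in load_jsonl(path):
            # Normalize each source to a common instruction/response schema.
            records.append({
                "instruction": row.get("instruction", row.get("prompt", "")),
                "response": row.get("response", row.get("output", "")),
                "source": Path(path).stem,  # keep provenance for later analysis
            })
    random.Random(seed).shuffle(records)
    return records

if __name__ == "__main__":
    sft_data = build_sft_dataset(SOURCE_FILES)
    print(f"Merged {len(sft_data)} SFT samples from {len(SOURCE_FILES)} sources")
```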
Core Capabilities
- Outstanding performance on AI2 Reasoning Challenge (79.78%)
- Exceptional results on HellaSwag (91.15%)
- Strong MMLU performance (77.95%)
- High accuracy on GSM8k (76.12%)
- Impressive TruthfulQA results (74.50%)
Frequently Asked Questions
Q: What makes this model unique?
The model's distinctive feature is its novel SGD approach for dataset creation, where it autonomously generates training data by comparing model-generated outputs with correct answers, specifically targeting areas where improvement is needed.
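A minimal sketch of that idea, assuming a generic `generate` callable and a simple exact-match correctness check (both placeholders, not the authors' actual implementation): prompts where the model's own answer disagrees with the reference become DPO preference pairs, with the reference as the chosen response and the model output as the rejected one.

```python
def build_sgd_dpo_pairs(prompts_with_answers, generate):
    """Self-generated DPO pairs: keep the cases where the model is wrong.

    prompts_with_answers: iterable of (prompt, reference_answer) tuples.
    generate: callable mapping a prompt string to the model's answer string.
    """
    pairs = []
    for prompt, reference in prompts_with_answers:
        model_answer = generate(prompt)
        # Exact-match comparison is a placeholder; any task-specific
        # correctness check could be substituted here.
        if model_answer.strip() != reference.strip():
            pairs.append({
                "prompt": prompt,
                "chosen": reference,       # the correct answer is preferred
                "rejected": model_answer,  # the model's incorrect attempt
            })
    return pairs

# Toy usage with a stand-in generator.
example_data = [("What is 2 + 2?", "4"), ("Capital of France?", "Paris")]
pairs = build_sgd_dpo_pairs(example_data, generate=lambda p: "I am not sure.")
print(pairs[0]["chosen"], "vs", pairs[0]["rejected"])
```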
Q: What are the recommended use cases?
Given its strong performance across multiple benchmarks, the model is well-suited for complex reasoning tasks, question answering, and general text generation applications requiring high accuracy and reliability.
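For question answering or general text generation, the model can be loaded through the standard Hugging Face transformers API. The sketch below assumes the repository id `davidkim205/Rhea-72b-v0.5` and a machine with enough GPU memory for a 72B model in BF16; verify the id against the model page and adjust device settings to your setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; confirm against the actual model page.
MODEL_ID = "davidkim205/Rhea-72b-v0.5"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed above
    device_map="auto",           # shard across available GPUs
)

prompt = "Explain why the sky is blue in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```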