Rhea-72b-v0.5

Maintained By: davidkim205

Property          Value
Parameter Count   72.3B
License           Apache 2.0
Base Model        Smaug-72B-v0.1
Tensor Type       BF16

What is Rhea-72b-v0.5?

Rhea-72b-v0.5 is a state-of-the-art language model that has achieved top rankings on Hugging Face's Open LLM Leaderboard. Built on Smaug-72B-v0.1, it introduces new approaches to model training, including a Self-Generated Dataset (SGD) creation method for DPO learning.
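
For reference, below is a minimal loading-and-generation sketch using the Hugging Face transformers library. The repository id davidkim205/Rhea-72b-v0.5 and the plain-text prompt are assumptions based on this card; note that a 72B model in BF16 needs on the order of 145 GB of accelerator memory, so multi-GPU sharding, offloading, or quantization is typically required.

```python
# Minimal sketch: loading and querying Rhea-72b-v0.5 with transformers.
# The repo id and prompt format are assumptions; adjust to your setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "davidkim205/Rhea-72b-v0.5"  # assumed Hugging Face repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # matches the BF16 tensor type listed above
    device_map="auto",            # shard across available GPUs
)

prompt = "Explain the difference between supervised fine-tuning and DPO in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```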

Implementation Details

The model was fine-tuned with the nox framework on a comprehensive SFT corpus drawn from over 60 different sources. Training combines supervised fine-tuning on a roughly 4M-example dataset with DPO learning on a specialized 151k-pair preference dataset (a sketch of this kind of DPO step follows the list below).

  • Advanced SGD methodology for dataset creation
  • Comprehensive evaluation across multiple benchmarks
  • Implementation of Direct Preference Optimization (DPO) learning
  • Integration with diverse training datasets
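
As referenced above, here is a hedged sketch of what a DPO fine-tuning step over a preference dataset could look like using the trl library. The dataset file, column names, and hyperparameters are illustrative assumptions, not the exact nox/SGD pipeline used to train this model.

```python
# Illustrative sketch of a DPO training step with trl (not the exact nox pipeline).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "davidkim205/Rhea-72b-v0.5"          # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A preference dataset with "prompt", "chosen", and "rejected" columns,
# analogous to the 151k-pair DPO set described above (placeholder file name).
dataset = load_dataset("json", data_files="sgd_preferences.jsonl", split="train")

config = DPOConfig(
    output_dir="rhea-dpo",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=5e-7,
    beta=0.1,                                   # DPO temperature
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,                 # "tokenizer=" in older trl versions
)
trainer.train()
```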

Core Capabilities

  • Outstanding performance on AI2 Reasoning Challenge (79.78%)
  • Exceptional results on HellaSwag (91.15%)
  • Strong MMLU performance (77.95%)
  • High accuracy on GSM8K (76.12%)
  • Impressive TruthfulQA results (74.50%)

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its novel SGD approach for dataset creation, where it autonomously generates training data by comparing model-generated outputs with correct answers, specifically targeting areas where improvement is needed.
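
To make this concrete, here is a minimal, hypothetical sketch of how such self-generated preference pairs could be constructed: generate an answer, compare it with the reference, and keep only the mismatches as DPO training pairs. The helper names and the exact-match comparison are illustrative assumptions; the card does not publish the precise SGD procedure.

```python
# Hypothetical sketch of SGD-style preference-pair construction.
import json

def build_preference_pairs(model_answer_fn, qa_items, out_path="sgd_preferences.jsonl"):
    """qa_items: iterable of {"question": ..., "answer": ...} dicts."""
    with open(out_path, "w", encoding="utf-8") as f:
        for item in qa_items:
            generated = model_answer_fn(item["question"])
            # Only questions the model currently gets wrong become training pairs,
            # targeting the areas where improvement is needed.
            if generated.strip() != item["answer"].strip():
                pair = {
                    "prompt": item["question"],
                    "chosen": item["answer"],      # reference answer preferred
                    "rejected": generated,         # model's incorrect output
                }
                f.write(json.dumps(pair, ensure_ascii=False) + "\n")
```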

Q: What are the recommended use cases?

Given its strong performance across multiple benchmarks, the model is well-suited for complex reasoning tasks, question answering, and general text generation applications requiring high accuracy and reliability.
