Rhea-72b-v0.5

Maintained By: davidkim205

Property          Value
Parameter Count   72.3B
License           Apache 2.0
Base Model        Smaug-72B-v0.1
Tensor Type       BF16

What is Rhea-72b-v0.5?

Rhea-72b-v0.5 is a state-of-the-art language model that has achieved top rankings on Hugging Face's Open LLM Leaderboard. Built on Smaug-72B-v0.1, it introduces new approaches to model training, including a Self-Generated Dataset (SGD) creation method for DPO learning.
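
For reference, below is a minimal loading-and-generation sketch using the Hugging Face transformers library. The repository id davidkim205/Rhea-72b-v0.5 and the plain-text prompt are assumptions based on this card; note that a 72B model in BF16 needs on the order of 145 GB of accelerator memory, so multi-GPU sharding, offloading, or quantization is typically required.

```python
# Minimal sketch: loading and querying Rhea-72b-v0.5 with transformers.
# The repo id and prompt format are assumptions; adjust to your setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "davidkim205/Rhea-72b-v0.5"  # assumed Hugging Face repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # matches the BF16 tensor type listed above
    device_map="auto",            # shard across available GPUs
)

prompt = "Explain the difference between supervised fine-tuning and DPO in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```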

Implementation Details

The model was fine-tuned with the nox framework on a comprehensive SFT corpus drawn from over 60 different sources. Training combines supervised fine-tuning on a roughly 4M-example dataset with DPO learning on a specialized 151k-pair preference dataset (a sketch of this kind of DPO step follows the list below).

  • Advanced SGD methodology for dataset creation
  • Comprehensive evaluation across multiple benchmarks
  • Implementation of Direct Preference Optimization (DPO) learning
  • Integration with diverse training datasets
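
As referenced above, here is a hedged sketch of what a DPO fine-tuning step over a preference dataset could look like using the trl library. The dataset file, column names, and hyperparameters are illustrative assumptions, not the exact nox/SGD pipeline used to train this model.

```python
# Illustrative sketch of a DPO training step with trl (not the exact nox pipeline).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "davidkim205/Rhea-72b-v0.5"          # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A preference dataset with "prompt", "chosen", and "rejected" columns,
# analogous to the 151k-pair DPO set described above (placeholder file name).
dataset = load_dataset("json", data_files="sgd_preferences.jsonl", split="train")

config = DPOConfig(
    output_dir="rhea-dpo",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=5e-7,
    beta=0.1,                                   # DPO temperature
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,                 # "tokenizer=" in older trl versions
)
trainer.train()
```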

Core Capabilities

  • Outstanding performance on AI2 Reasoning Challenge (79.78%)
  • Exceptional results on HellaSwag (91.15%)
  • Strong MMLU performance (77.95%)
  • High accuracy on GSM8K (76.12%)
  • Impressive TruthfulQA results (74.50%)

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its novel SGD approach for dataset creation, where it autonomously generates training data by comparing model-generated outputs with correct answers, specifically targeting areas where improvement is needed.
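
To make this concrete, here is a minimal, hypothetical sketch of how such self-generated preference pairs could be constructed: generate an answer, compare it with the reference, and keep only the mismatches as DPO training pairs. The helper names and the exact-match comparison are illustrative assumptions; the card does not publish the precise SGD procedure.

```python
# Hypothetical sketch of SGD-style preference-pair construction.
import json

def build_preference_pairs(model_answer_fn, qa_items, out_path="sgd_preferences.jsonl"):
    """qa_items: iterable of {"question": ..., "answer": ...} dicts."""
    with open(out_path, "w", encoding="utf-8") as f:
        for item in qa_items:
            generated = model_answer_fn(item["question"])
            # Only questions the model currently gets wrong become training pairs,
            # targeting the areas where improvement is needed.
            if generated.strip() != item["answer"].strip():
                pair = {
                    "prompt": item["question"],
                    "chosen": item["answer"],      # reference answer preferred
                    "rejected": generated,         # model's incorrect output
                }
                f.write(json.dumps(pair, ensure_ascii=False) + "\n")
```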

Q: What are the recommended use cases?

Given its strong performance across multiple benchmarks, the model is well-suited for complex reasoning tasks, question answering, and general text generation applications requiring high accuracy and reliability.
