Smaug-34B-v0.1

Property	Value
Parameter Count	34.4B
Base Model	bagel-34b-v0.2
License	Apache 2.0
Paper	arXiv:2402.13228
Tensor Type	BF16

What is Smaug-34B-v0.1?

Smaug-34B-v0.1 is a 34.4B-parameter language model fine-tuned with DPO-Positive (DPOP), a variant of Direct Preference Optimization (DPO). It is built on bagel-34b-v0.2 and reports an average score of 77.29% across six benchmarks: ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K.

Implementation Details

The model is trained with DPOP, which targets a weakness of standard DPO: when the preferred and rejected completions in a pair differ by only a few tokens (low edit distance), DPO can reduce the likelihood of the preferred completion as long as the rejected one falls further. DPOP adds a penalty term that discourages this, so the preferred completion stays at least as likely as it was under the reference model.
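To make the difference from standard DPO concrete, here is a minimal per-example sketch in plain Python. It assumes the DPOP objective has the form given in the cited paper (arXiv:2402.13228): the usual DPO margin minus a penalty `lam * max(0, log pi_ref(y_w|x) - log pi(y_w|x))`. The function names, argument names, and the default values for `beta` and `lam` are illustrative assumptions, not the settings used to train Smaug, and this is not the authors' training code.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def dpo_loss(pol_w: float, pol_l: float, ref_w: float, ref_l: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one pair.

    pol_w / pol_l: summed log-probs of the preferred / rejected completion
    under the policy; ref_w / ref_l: the same under the frozen reference model.
    """
    margin = (pol_w - ref_w) - (pol_l - ref_l)
    return -log_sigmoid(beta * margin)


def dpop_loss(pol_w: float, pol_l: float, ref_w: float, ref_l: float,
              beta: float = 0.1, lam: float = 5.0) -> float:
    """DPOP loss for one pair (assumed form, see lead-in).

    The penalty is zero while the policy keeps the preferred completion
    at least as likely as the reference model does.
    """
    margin = (pol_w - ref_w) - (pol_l - ref_l)
    penalty = max(0.0, ref_w - pol_w)
    return -log_sigmoid(beta * (margin - lam * penalty))


# Both completions became less likely, the rejected one more so.
# DPO sees a positive margin; DPOP additionally penalizes the drop in pol_w.
print(dpo_loss(-12.0, -20.0, -10.0, -15.0))
print(dpop_loss(-12.0, -20.0, -10.0, -15.0))
```

In this example the DPOP loss is larger than the DPO loss because the preferred completion lost likelihood relative to the reference. When `pol_w >= ref_w` the penalty vanishes and the two losses coincide.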

Uses new pairwise preference versions of the ARC, HellaSwag, and MetaMath datasets
Stores weights in BF16 (16 bits per parameter)
Reported benchmark scores: 74.23% on ARC, 86.76% on HellaSwag, 76.66% on MMLU
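As a rough guide to what BF16 storage implies for a model of this size, the arithmetic below estimates the memory needed for the weights alone. It uses only the parameter count from the property table and the fact that a BF16 value occupies 2 bytes. Activations, KV cache, and framework overhead are not included, so actual requirements will be higher.

```python
PARAMS = 34.4e9        # parameter count from the property table
BYTES_PER_BF16 = 2     # bfloat16 is a 16-bit format


def weight_bytes(n_params: float, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Bytes needed to hold the weights, ignoring all runtime overhead."""
    return n_params * bytes_per_param


total = weight_bytes(PARAMS)
print(f"{total / 1e9:.1f} GB")     # 68.8 GB (decimal gigabytes)
print(f"{total / 2**30:.1f} GiB")  # 64.1 GiB (binary gibibytes)
```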

Core Capabilities

Mathematical reasoning: 72.18% on GSM8K
Truthfulness: 70.22% on TruthfulQA
Common-sense reasoning: 83.66% on Winogrande
Low reported contamination across benchmark datasets
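The 77.29% average quoted in the overview can be checked against the six individual scores listed on this page. The snippet below is only that arithmetic; the dictionary name is an illustrative choice.

```python
# Benchmark scores (percent) as listed on this page.
scores = {
    "ARC": 74.23,
    "HellaSwag": 86.76,
    "MMLU": 76.66,
    "TruthfulQA": 70.22,
    "Winogrande": 83.66,
    "GSM8K": 72.18,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.3f}")
```

The unweighted mean is 463.71 / 6 = 77.285, which agrees with the reported 77.29% after rounding to two decimals.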

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its DPO-Positive (DPOP) training approach, which addresses a limitation of standard DPO on preference pairs with low edit distance between the two completions. In that setting DPOP is designed to improve the model's preference for the chosen completion without letting the likelihood of that completion degrade.

Q: What are the recommended use cases?

The reported scores point to strengths in mathematical reasoning (GSM8K), truthfulness (TruthfulQA), and common-sense reasoning (Winogrande, HellaSwag). The model is therefore a reasonable candidate for applications that need step-by-step reasoning and factually careful text generation.
