Llama-3-Smaug-8B
| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| License | Llama 2 |
| Base Model | Meta-Llama-3-8B-Instruct |
| Tensor Type | BF16 |
| Research Paper | Smaug Paper |
What is Llama-3-Smaug-8B?
Llama-3-Smaug-8B is an advanced language model developed by Abacus.AI on top of Meta's Llama 3 architecture. It applies the Smaug recipe to improve performance in real-world multi-turn conversations, showing notable gains over the base model in benchmark tests.
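For context, here is a minimal loading-and-generation sketch using the Hugging Face transformers library. It assumes the model is published under the repo id abacusai/Llama-3-Smaug-8B, that enough GPU memory is available for an 8B model in bfloat16, and that the tokenizer ships a Llama 3 chat template; check the model card for any usage details the authors recommend.

```python
# Minimal sketch (not an official example): load the model in BF16 and run one turn.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/Llama-3-Smaug-8B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed above
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain the Pythagorean theorem in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```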
Implementation Details
The model is trained on several high-quality datasets, including AQUA-RAT, Microsoft's Orca math word problems, CodeFeedback, and ShareGPT Vicuna. It applies newer techniques than its predecessor, Smaug-72B, and is optimized for both single-turn and multi-turn conversational scenarios.
- Achieves an average MT-Bench score of 8.33, up from the base model's 8.10
- Significantly improved first-turn performance (8.78 vs. 8.31)
- Consistent second-turn performance (7.89)
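To make the multi-turn numbers above concrete, the sketch below formats an MT-Bench-style two-turn exchange with the tokenizer's chat template, so the second user request is answered with the first exchange still in context. The repo id and prompts are assumptions for illustration, not taken from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/Llama-3-Smaug-8B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A two-turn exchange in the format the chat template expects; the second user
# turn is generated with the first question and answer still in context.
conversation = [
    {"role": "user", "content": "Write a three-sentence summary of the water cycle."},
    {"role": "assistant", "content": "Water evaporates from oceans and lakes, condenses into clouds, and returns as precipitation that flows back to the sea."},
    {"role": "user", "content": "Now rewrite that summary for a five-year-old."},
]

prompt_ids = tokenizer.apply_chat_template(
    conversation, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

reply_ids = model.generate(prompt_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(reply_ids[0][prompt_ids.shape[-1]:], skip_special_tokens=True))
```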
Core Capabilities
- Enhanced multi-turn conversation handling
- Mathematical problem-solving abilities
- Code-related feedback and analysis
- General instruction following and task completion
Frequently Asked Questions
Q: What makes this model unique?
The model's implementation of the Smaug recipe, combined with its optimization for multi-turn conversations, sets it apart. It shows particular strength in first-turn interactions while maintaining competitive performance in follow-up exchanges.
Q: What are the recommended use cases?
The model is well-suited for conversational applications, mathematical problem-solving, code-related tasks, and general instruction-following scenarios. Its balanced performance makes it particularly valuable for applications requiring sustained dialogue.
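As a sketch of such a sustained-dialogue application, the hypothetical chat_turn helper below appends every reply to the conversation history so each new turn is generated with full context. The helper is not part of any library, and the repo id and prompts are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/Llama-3-Smaug-8B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def chat_turn(history, user_message, max_new_tokens=256):
    """Append a user turn, generate a reply with the full history in context, store it."""
    history.append({"role": "user", "content": user_message})
    input_ids = tokenizer.apply_chat_template(
        history, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=False)
    reply = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
    history.append({"role": "assistant", "content": reply})
    return reply

history = []
print(chat_turn(history, "A train travels 120 km in 1.5 hours. What is its average speed?"))
print(chat_turn(history, "Now express that speed in metres per second."))
```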