Llama-3-Smaug-8B

Maintained by abacusai


Property         Value
Parameter Count  8.03B
License          Llama 2
Base Model       Meta-Llama-3-8B-Instruct
Tensor Type      BF16
Research Paper   Smaug Paper

What is Llama-3-Smaug-8B?

Llama-3-Smaug-8B is a language model developed by Abacus.AI on top of Meta's Llama-3-8B-Instruct. It applies the Smaug recipe for improving performance in real-world multi-turn conversations, and it shows notable gains over the base model in benchmark tests.
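As a point of reference, the snippet below is a minimal sketch of loading the model with the Hugging Face transformers library in BF16 and generating a single reply; the prompt and generation settings are illustrative assumptions rather than values from the model card.

```python
# Minimal loading sketch: Hugging Face transformers, BF16 weights.
# The prompt and generation settings below are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/Llama-3-Smaug-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed above
    device_map="auto",
)

# Single-turn generation using the tokenizer's chat template.
messages = [{"role": "user",
             "content": "Explain the difference between a list and a tuple in Python."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```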

Implementation Details

The model was trained on several high-quality datasets, including AQUA-RAT, Microsoft's Orca math word problems, CodeFeedback, and ShareGPT Vicuna. The recipe introduces new techniques relative to its predecessor, Smaug-72B, and optimizes for both single-turn and multi-turn conversational scenarios (a usage sketch follows the benchmark figures below).

  • Achieves an 8.33 average score on MT-Bench (vs. the base model's 8.10)
  • Significantly improved first-turn performance (8.78 vs. 8.31)
  • Second-turn performance on par with the base model (7.89)
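Since the MT-Bench figures above distinguish first-turn from second-turn quality, the following sketch shows one way to run a two-turn exchange by appending the model's first reply and a follow-up question. It reuses the model and tokenizer from the earlier snippet; the questions and generation length are illustrative assumptions.

```python
# Two-turn conversation sketch (assumes `model` and `tokenizer` from the
# loading example above; prompts and max_new_tokens are illustrative).
def chat(messages, max_new_tokens=256):
    # Build a Llama-3-style chat prompt and return only the newly generated text.
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )

# First turn: a math word problem of the kind covered by the training data.
messages = [{"role": "user",
             "content": "A train travels 120 km in 1.5 hours. What is its average speed?"}]
first_reply = chat(messages)

# Second turn: the follow-up depends on the model's own first answer.
messages += [
    {"role": "assistant", "content": first_reply},
    {"role": "user", "content": "At that speed, how long would 200 km take?"},
]
print(chat(messages))
```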

Core Capabilities

  • Enhanced multi-turn conversation handling
  • Mathematical problem-solving abilities
  • Code-related feedback and analysis
  • General instruction following and task completion

Frequently Asked Questions

Q: What makes this model unique?

The model's implementation of the Smaug recipe, combined with its optimization for multi-turn conversations, sets it apart. It shows particular strength in first-turn interactions while maintaining competitive performance in follow-up exchanges.

Q: What are the recommended use cases?

The model is well-suited for conversational applications, mathematical problem-solving, code-related tasks, and general instruction-following scenarios. Its balanced performance makes it particularly valuable for applications requiring sustained dialogue.
