# Llama-3.1-Tulu-3-8B-SFT
| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| License | Llama 3.1 Community License |
| Base Model | Llama-3.1-8B |
| Paper | arXiv:2411.15124 |
Training Type | Supervised Fine-Tuning (SFT) |
## What is Llama-3.1-Tulu-3-8B-SFT?
Llama-3.1-Tulu-3-8B-SFT is an instruction-following model built on Meta's Llama 3.1 architecture. It is the supervised fine-tuning (SFT) checkpoint, the first stage of the Tulu 3 post-training recipe developed by the Allen Institute for AI (Ai2), and targets both general conversation and specialized tasks such as mathematics and reasoning.
## Implementation Details
The model is trained with supervised fine-tuning using a learning rate of 5e-6, an effective batch size of 128, and a maximum sequence length of 4,096 tokens. Training follows a linear learning-rate schedule with a 0.03 warmup ratio over 2 epochs.
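As a hedged illustration only, these reported hyperparameters map onto a standard Hugging Face `TrainingArguments` object roughly as follows; the per-device batch size, gradient-accumulation split, and output directory are assumptions for the sketch, not Ai2's actual training setup:

```python
# Hedged sketch: the reported SFT hyperparameters expressed as Hugging Face
# TrainingArguments. Per-device batch size, accumulation steps, and
# output_dir are illustrative assumptions, not the original training script.
from transformers import TrainingArguments

sft_training_args = TrainingArguments(
    output_dir="tulu-3-8b-sft",       # hypothetical output path
    learning_rate=5e-6,               # reported learning rate
    lr_scheduler_type="linear",       # linear LR schedule
    warmup_ratio=0.03,                # 3% warmup
    num_train_epochs=2,               # 2 epochs over the SFT mixture
    per_device_train_batch_size=1,    # assumption: effective batch size of 128
    gradient_accumulation_steps=128,  #   reached through gradient accumulation
    bf16=True,                        # BF16 precision
)
# The 4,096-token maximum sequence length would be enforced when tokenizing
# the SFT data, not through TrainingArguments.
```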
- Optimized for both general conversation and specialized tasks
- Implements a standardized chat template for consistent interactions
- Supports efficient deployment through vLLM
- Uses the BF16 (bfloat16) tensor type for memory-efficient inference
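For a concrete starting point, here is a minimal, hedged inference sketch using the `transformers` library. It assumes the checkpoint is published on the Hugging Face Hub as `allenai/Llama-3.1-Tulu-3-8B-SFT` and relies on `apply_chat_template` so the bundled chat template handles the role formatting:

```python
# Minimal inference sketch (assumes the Hub ID allenai/Llama-3.1-Tulu-3-8B-SFT
# and a GPU with enough memory for an 8B model in BF16).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Llama-3.1-Tulu-3-8B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The bundled chat template formats the user/assistant roles for us.
messages = [{"role": "user", "content": "Solve 12 * 17 and explain the steps."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For higher-throughput serving, recent vLLM releases can load the same checkpoint directly, for example via its OpenAI-compatible server (`vllm serve allenai/Llama-3.1-Tulu-3-8B-SFT`).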
## Core Capabilities
- Strong performance in mathematical reasoning (MATH and GSM8K benchmarks)
- Excellent safety scores (93.1% average across 6 safety tasks)
- Robust code generation, as measured by HumanEval (86.2% pass@10)
- Effective instruction following as measured by IFEval (72.8%)
## Frequently Asked Questions
### Q: What makes this model unique?
The model stands out for its balanced performance across diverse tasks, particularly excelling in safety and mathematical reasoning while maintaining strong general conversation capabilities.
### Q: What are the recommended use cases?
This model is particularly well-suited for research and educational applications, especially those requiring mathematical reasoning, code generation, and safe, controlled interactions.