# Llama-3.1-Tulu-3-8B-SFT
| Property | Value |
|---|---|
| Parameter Count | 8.03B |
| License | Llama 3.1 Community License |
| Base Model | Llama-3.1-8B |
| Paper | arXiv:2411.15124 |
Training Type | Supervised Fine-Tuning (SFT) |
## What is Llama-3.1-Tulu-3-8B-SFT?
Llama-3.1-Tulu-3-8B-SFT is an instruction-following model built on Meta's Llama 3.1 architecture. It is the supervised fine-tuning (SFT) checkpoint, the first stage of the Tulu 3 post-training recipe developed by the Allen Institute for AI (Ai2), and targets both general conversation and specialized tasks such as mathematics and reasoning.
## Implementation Details
The model is trained with supervised fine-tuning using a learning rate of 5e-6, an effective batch size of 128, and a maximum sequence length of 4,096 tokens. Training follows a linear learning-rate schedule with a 0.03 warmup ratio over 2 epochs.
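As a hedged illustration only, these reported hyperparameters map onto a standard Hugging Face `TrainingArguments` object roughly as follows; the per-device batch size, gradient-accumulation split, and output directory are assumptions for the sketch, not Ai2's actual training setup:

```python
# Hedged sketch: the reported SFT hyperparameters expressed as Hugging Face
# TrainingArguments. Per-device batch size, accumulation steps, and
# output_dir are illustrative assumptions, not the original training script.
from transformers import TrainingArguments

sft_training_args = TrainingArguments(
    output_dir="tulu-3-8b-sft",       # hypothetical output path
    learning_rate=5e-6,               # reported learning rate
    lr_scheduler_type="linear",       # linear LR schedule
    warmup_ratio=0.03,                # 3% warmup
    num_train_epochs=2,               # 2 epochs over the SFT mixture
    per_device_train_batch_size=1,    # assumption: effective batch size of 128
    gradient_accumulation_steps=128,  #   reached through gradient accumulation
    bf16=True,                        # BF16 precision
)
# The 4,096-token maximum sequence length would be enforced when tokenizing
# the SFT data, not through TrainingArguments.
```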
- Optimized for both general conversation and specialized tasks
- Implements a standardized chat template for consistent interactions
- Supports efficient deployment through vLLM
- Uses the BF16 (bfloat16) tensor type for memory-efficient inference
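For a concrete starting point, here is a minimal, hedged inference sketch using the `transformers` library. It assumes the checkpoint is published on the Hugging Face Hub as `allenai/Llama-3.1-Tulu-3-8B-SFT` and relies on `apply_chat_template` so the bundled chat template handles the role formatting:

```python
# Minimal inference sketch (assumes the Hub ID allenai/Llama-3.1-Tulu-3-8B-SFT
# and a GPU with enough memory for an 8B model in BF16).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Llama-3.1-Tulu-3-8B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The bundled chat template formats the user/assistant roles for us.
messages = [{"role": "user", "content": "Solve 12 * 17 and explain the steps."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For higher-throughput serving, recent vLLM releases can load the same checkpoint directly, for example via its OpenAI-compatible server (`vllm serve allenai/Llama-3.1-Tulu-3-8B-SFT`).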
## Core Capabilities
- Strong performance in mathematical reasoning (MATH and GSM8K benchmarks)
- Excellent safety scores (93.1% average across 6 safety tasks)
- Robust code generation, as measured by HumanEval (86.2% pass@10)
- Effective instruction following as measured by IFEval (72.8%)
## Frequently Asked Questions
### Q: What makes this model unique?
The model stands out for its balanced performance across diverse tasks, particularly excelling in safety and mathematical reasoning while maintaining strong general conversation capabilities.
### Q: What are the recommended use cases?
This model is particularly well-suited for research and educational applications, especially those requiring mathematical reasoning, code generation, and safe, controlled interactions.