Llama-3.1-Tulu-3-8B-SFT

Maintained By: allenai

Parameter Count: 8.03B
License: Llama 3.1 Community License
Base Model: Llama-3.1-8B
Paper: arXiv:2411.15124
Training Type: Supervised Fine-Tuning (SFT)

What is Llama-3.1-Tulu-3-8B-SFT?

Llama-3.1-Tulu-3-8B-SFT is a state-of-the-art instruction-following model built on Meta's Llama 3.1 architecture. Developed by the Allen Institute for AI, it is the first stage in the Tulu 3 post-training pipeline, targeting both general conversation and specialized tasks such as mathematics and reasoning.

Implementation Details

The model is trained with supervised fine-tuning using a 5e-6 learning rate, an effective batch size of 128, and a 4,096-token maximum sequence length. Training runs for 2 epochs with a linear learning-rate schedule and a 0.03 warmup ratio.
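
As a rough illustration of how these hyperparameters map onto a training run, here is a minimal sketch using TRL's SFTTrainer and the publicly released allenai/tulu-3-sft-mixture dataset. This is not the official recipe (Tulu 3 was trained with AI2's open-instruct codebase), and the per-device batch size and gradient-accumulation split are assumptions chosen to reach the reported effective batch size of 128.

```python
# Illustrative sketch only; the official Tulu 3 training code lives in
# allenai/open-instruct. Hyperparameter values mirror the model card.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# SFT mixture released alongside the Tulu 3 report.
dataset = load_dataset("allenai/tulu-3-sft-mixture", split="train")

config = SFTConfig(
    output_dir="tulu-3-8b-sft",
    learning_rate=5e-6,                # reported learning rate
    lr_scheduler_type="linear",        # linear decay schedule
    warmup_ratio=0.03,                 # reported warmup ratio
    num_train_epochs=2,                # reported number of epochs
    per_device_train_batch_size=1,     # assumption: 1 x 128 accumulation steps
    gradient_accumulation_steps=128,   #   = effective batch size of 128
    max_seq_length=4096,               # reported maximum sequence length
    bf16=True,                         # card lists BF16 tensors
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B",   # base model named on the card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```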

  • Optimized for both general conversation and specialized tasks
  • Implements a standardized chat template for consistent interactions (see the loading sketch below)
  • Supports efficient deployment through vLLM
  • Uses BF16 tensors for memory-efficient inference
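
A minimal loading and chat sketch, assuming the public Hugging Face model ID allenai/Llama-3.1-Tulu-3-8B-SFT and standard Transformers APIs; the prompt and generation settings are placeholders, not recommended values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Llama-3.1-Tulu-3-8B-SFT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, as listed on the card
    device_map="auto",
)

# apply_chat_template inserts the model's standardized chat markup.
messages = [{"role": "user", "content": "What is 17 * 24? Show your work."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```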

Core Capabilities

  • Strong mathematical reasoning on the MATH and GSM8K benchmarks
  • High safety scores (93.1% average across 6 safety tasks)
  • Solid code generation on HumanEval (86.2% pass@10)
  • Effective instruction following as measured by IFEval (72.8%)

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its balanced performance across diverse tasks, particularly excelling in safety and mathematical reasoning while maintaining strong general conversation capabilities.

Q: What are the recommended use cases?

This model is particularly well-suited for research and educational applications, especially those requiring mathematical reasoning, code generation, and safe, controlled interactions.
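
For such settings, the vLLM deployment path mentioned above can be exercised with a minimal sketch like the one below; the model ID is the public Hugging Face repo, and the prompt and sampling parameters are illustrative placeholders.

```python
from vllm import LLM, SamplingParams

# Load the model with vLLM for efficient inference; BF16 matches the card.
llm = LLM(model="allenai/Llama-3.1-Tulu-3-8B-SFT", dtype="bfloat16")
params = SamplingParams(temperature=0.7, max_tokens=512)

# llm.chat applies the model's chat template before generating.
messages = [{"role": "user", "content": "A rectangle is 7 cm by 12 cm. What is its area?"}]
outputs = llm.chat(messages, params)
print(outputs[0].outputs[0].text)
```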
