OLMo-2-1124-7B-Instruct
| Property | Value |
|---|---|
| Parameter Count | 7.3B |
| License | Apache 2.0 |
| Base Model | OLMo-2-1124-7B-DPO |
| Paper | Tülu 3 Paper |
What is OLMo-2-1124-7B-Instruct?
OLMo-2-1124-7B-Instruct is a language model developed by the Allen Institute for AI (Ai2) as part of the OLMo (Open Language Model) series. It is post-trained from the base OLMo-2 7B model through supervised fine-tuning on the Tülu 3 dataset, followed by DPO training and RLVR optimization. The model is designed for strong performance across diverse tasks, including mathematical reasoning and general instruction following.
Implementation Details
The model builds on the base OLMo-2 7B architecture and is trained through a multi-stage pipeline. Training uses BF16 precision, and the RLVR stage uses specific hyperparameters, including a learning rate of 3×10⁻⁷ and a batch size of 512.
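The hyperparameters above can be collected into a single configuration sketch. The field names below are illustrative, not the actual names from the OLMo training code; only the values come from this model card.

```python
# Illustrative RLVR training configuration for OLMo-2-1124-7B-Instruct.
# Field names are hypothetical; the values are those stated in the card.
rlvr_config = {
    "precision": "bf16",     # BF16 tensor precision
    "learning_rate": 3e-7,   # RLVR learning rate
    "batch_size": 512,       # RLVR batch size
    "max_seq_length": 2048,  # model context window
}

print(rlvr_config["learning_rate"])  # 3e-07
```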
- Comprehensive training pipeline including SFT, DPO, and RLVR stages
- Custom chat template support with specific formatting
- 2048 token context window
- PPO-based settings for the RLVR stage
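The chat template mentioned above can be sketched as a small formatting function. The role markers below follow the Tülu-style convention and are an assumption; the authoritative template ships with the model's tokenizer config, and in practice you would call `tokenizer.apply_chat_template(...)` from `transformers` rather than formatting by hand.

```python
# Minimal sketch of a Tülu-style chat format, assuming <|role|> markers.
# Check the model's tokenizer config for the authoritative template.
def format_chat(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into a prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|assistant|>\n")
    return "".join(parts)

prompt = format_chat([{"role": "user", "content": "What is 2 + 2?"}])
print(prompt)  # <|user|>\nWhat is 2 + 2?\n<|assistant|>\n
```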
Core Capabilities
- Strong performance on mathematical tasks (85.2% on GSM8K)
- Enhanced instruction following abilities
- Robust safety measures (81.2% on safety benchmarks)
- Competitive performance across multiple benchmarks including MMLU and DROP
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its fully open-source nature and strong performance across diverse tasks, particularly in mathematical reasoning. It's part of the OLMo series, which is specifically designed to enable scientific research in language models.
Q: What are the recommended use cases?
The model excels in mathematical problem-solving, instruction following, and general language tasks. It's particularly well-suited for research and educational applications, though users should be aware of its limitations and adhere to the responsible use guidelines.