OLMo-2-1124-7B-Instruct
| Property | Value |
|---|---|
| Parameter Count | 7.3B |
| License | Apache 2.0 |
| Base Model | OLMo-2-1124-7B-DPO |
| Paper | Tülu 3 Paper |
What is OLMo-2-1124-7B-Instruct?
OLMo-2-1124-7B-Instruct is a language model developed by the Allen Institute for AI (Ai2) as part of the OLMo (Open Language Model) series. It is post-trained from the base OLMo-2 7B model through supervised fine-tuning on the Tülu 3 dataset, followed by DPO training and RLVR optimization. The model is designed for strong performance across diverse tasks, including mathematical reasoning and general instruction following.
Implementation Details
The model builds on the base OLMo-2 7B architecture and is trained through a multi-stage pipeline. Training uses BF16 precision, and the RLVR stage uses specific hyperparameters, including a learning rate of 3×10⁻⁷ and a batch size of 512.
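The hyperparameters above can be collected into a single configuration sketch. The field names below are illustrative, not the actual names from the OLMo training code; only the values come from this model card.

```python
# Illustrative RLVR training configuration for OLMo-2-1124-7B-Instruct.
# Field names are hypothetical; the values are those stated in the card.
rlvr_config = {
    "precision": "bf16",     # BF16 tensor precision
    "learning_rate": 3e-7,   # RLVR learning rate
    "batch_size": 512,       # RLVR batch size
    "max_seq_length": 2048,  # model context window
}

print(rlvr_config["learning_rate"])  # 3e-07
```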
- Comprehensive training pipeline including SFT, DPO, and RLVR stages
- Custom chat template support with specific formatting
- 2048 token context window
- PPO-based settings for the RLVR stage
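The chat template mentioned above can be sketched as a small formatting function. The role markers below follow the Tülu-style convention and are an assumption; the authoritative template ships with the model's tokenizer config, and in practice you would call `tokenizer.apply_chat_template(...)` from `transformers` rather than formatting by hand.

```python
# Minimal sketch of a Tülu-style chat format, assuming <|role|> markers.
# Check the model's tokenizer config for the authoritative template.
def format_chat(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into a prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|assistant|>\n")
    return "".join(parts)

prompt = format_chat([{"role": "user", "content": "What is 2 + 2?"}])
print(prompt)  # <|user|>\nWhat is 2 + 2?\n<|assistant|>\n
```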
Core Capabilities
- Strong performance on mathematical tasks (85.2% on GSM8K)
- Enhanced instruction following abilities
- Robust safety measures (81.2% on safety benchmarks)
- Competitive performance across multiple benchmarks including MMLU and DROP
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its fully open-source nature and strong performance across diverse tasks, particularly in mathematical reasoning. It's part of the OLMo series, which is specifically designed to enable scientific research in language models.
Q: What are the recommended use cases?
The model excels in mathematical problem-solving, instruction following, and general language tasks. It's particularly well-suited for research and educational applications, though users should be aware of its limitations and adhere to the responsible use guidelines.