OpenThinker2-7B

Maintained by: open-thoughts

Property                  Value
Base Model                Qwen2.5-7B-Instruct
Training Data             OpenThoughts2-1M
License                   Apache 2.0
Training Infrastructure   32 8xA100 nodes (256 GPUs)
Training Duration         36 hours

What is OpenThinker2-7B?

OpenThinker2-7B is an open-source reasoning model built on the Qwen2.5-7B-Instruct architecture. It delivers strong performance across mathematical and general reasoning benchmarks, with scores comparable to state-of-the-art distilled models such as DeepSeek-R1-Distill-Qwen-7B.

Implementation Details

The model was trained on 32 nodes of 8 A100 GPUs each (256 GPUs in total) over 36 hours, using a learning rate of 8e-05, a cosine learning-rate schedule with a warmup ratio of 0.1, the ADAMW_TORCH optimizer, and a total batch size of 512; a configuration sketch follows the list below.

  • Trained on OpenThoughts2-1M dataset, an enhanced version of OpenThoughts-114k
  • Implements 26 different question generation methodologies
  • Distributed training across 256 GPUs (32 nodes × 8 A100s)
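
The reported hyperparameters can be expressed as a Hugging Face TrainingArguments sketch. This is a reconstruction, not the original training code: only the learning rate, scheduler, warmup ratio, optimizer, and total batch size come from the card, while the output path, per-device batch split, and precision setting are assumptions.

```python
# Hypothetical reconstruction of the reported hyperparameters using
# Hugging Face TrainingArguments. Fields not stated in the card are
# marked as assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="openthinker2-7b-sft",  # assumption: hypothetical output path
    learning_rate=8e-5,                # reported learning rate
    lr_scheduler_type="cosine",        # reported cosine scheduler
    warmup_ratio=0.1,                  # reported warmup ratio
    optim="adamw_torch",               # reported ADAMW_TORCH optimizer
    per_device_train_batch_size=1,     # assumption: 1 sample per GPU
    gradient_accumulation_steps=2,     # assumption: 1 x 2 x 256 GPUs = 512 total
    bf16=True,                         # assumption: bf16 mixed precision on A100s
)
```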

Core Capabilities

  • AIME24 Performance: 50.0%
  • AIME25 Performance: 33.3%
  • AMC23 Performance: 89.5%
  • MATH500 Performance: 88.4%
  • GPQA-D Performance: 49.3%
  • LCBv2 Performance: 55.6%

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its strong performance on mathematical reasoning tasks, achieved through training on the carefully curated OpenThoughts2-1M dataset and the 26 question generation methodologies used to build it.

Q: What are the recommended use cases?

The model excels at mathematical problem-solving, particularly competition mathematics (AIME, AMC), and at general complex reasoning, making it well suited to educational applications. A minimal inference sketch is shown below.
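
The following sketch loads the model with the transformers library. The repo id open-thoughts/OpenThinker2-7B is inferred from the maintainer name above and should be verified on the Hugging Face Hub; the prompt and generation settings are purely illustrative.

```python
# Minimal inference sketch with the transformers library.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker2-7B"  # assumption: inferred repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # requires the accelerate package
    torch_dtype="auto",
)

# Reasoning models typically expect a chat-formatted prompt.
messages = [{"role": "user", "content": "Find all real x with x^2 - 5x + 6 = 0."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Long max_new_tokens leaves room for the model's chain-of-thought.
output = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```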
