AReaL-boba-SFT-32B
| Property | Value |
|---|---|
| Parameter Count | 32 Billion |
| Model Type | Language Model (Mathematical Reasoning) |
| Architecture | Transformer-based SFT Model |
| Model Link | Hugging Face |
What is AReaL-boba-SFT-32B?
AReaL-boba-SFT-32B is a 32B-parameter language model specialized for mathematical reasoning. Fine-tuned on only 200 carefully curated training samples, it reaches performance comparable to QwQ-32B, demonstrating that data quality can matter more than data quantity for efficient large language model training.
Implementation Details
The model uses SGLang v0.4.0 as its generation backend, which delivers significant speedups through RadixAttention, SGLang's KV-cache reuse mechanism. The training pipeline adds optimizations for variable-length sequences and large batches, with high-performance data transfer that scales to 1,000 GPUs:
- Sequence packing of variable-length samples into 1D tensors for efficient GPU memory utilization (see the sketch after this list)
- NCCL with GPU-Direct RDMA for efficient communication
- Specialized evaluation settings for long-context generation
- Temperature setting of 0.6 and top_p of 0.95 for inference
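The packing code itself is not reproduced here, but the core idea is to concatenate variable-length token sequences into a single 1D tensor with no padding and record the boundaries as cumulative sequence lengths, the layout expected by variable-length attention kernels. Below is a minimal illustrative sketch in PyTorch; the function name and shapes are chosen for this example only.

```python
import torch

def pack_sequences(sequences: list[torch.Tensor]) -> tuple[torch.Tensor, torch.Tensor]:
    """Pack variable-length 1D token tensors into one 1D tensor (no padding).

    Returns the packed tensor and cumulative sequence lengths (cu_seqlens),
    the boundary format used by variable-length attention kernels.
    """
    packed = torch.cat(sequences)                       # all tokens, zero padding
    lengths = torch.tensor([len(s) for s in sequences])
    cu_seqlens = torch.zeros(len(sequences) + 1, dtype=torch.int32)
    cu_seqlens[1:] = torch.cumsum(lengths, dim=0)       # e.g. [0, 5, 12, 15]
    return packed, cu_seqlens

# Three prompts of different lengths share one 1D buffer, so no GPU memory
# is spent on padding tokens.
seqs = [torch.randint(0, 32000, (n,)) for n in (5, 7, 3)]
packed, cu_seqlens = pack_sequences(seqs)
print(packed.shape, cu_seqlens.tolist())  # torch.Size([15]) [0, 5, 12, 15]
```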
Core Capabilities
- Achieves 78.8% accuracy on AIME 2024 problems
- 62.1% accuracy on AIME 2025 problems
- 60.1% accuracy on GPQA-Diamond
- Specialized in step-by-step mathematical reasoning
- Efficient handling of long-context problems
Frequently Asked Questions
Q: What makes this model unique?
The model's ability to achieve SOTA performance using only 200 training samples sets it apart, demonstrating exceptional data efficiency. It matches the performance of QwQ-32B while using significantly fewer resources for training.
Q: What are the recommended use cases?
The model excels in mathematical reasoning tasks, particularly in solving complex problems requiring step-by-step solutions. It's specifically optimized for AIME-level mathematics problems and similar advanced mathematical reasoning tasks.
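For experimentation, the model can be loaded like any other causal language model from Hugging Face. The sketch below uses the transformers library with the sampling settings recommended above (temperature 0.6, top_p 0.95); the repository id is an assumption to be replaced with the actual Hugging Face path, and the original setup serves the model with SGLang rather than transformers for higher throughput.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; replace with the actual Hugging Face path of the model.
MODEL_ID = "inclusionAI/AReaL-boba-SFT-32B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # 32B parameters: multi-GPU or offloading is needed in practice
    device_map="auto",
)

# An illustrative AIME-style prompt asking for a step-by-step solution.
prompt = (
    "Solve step by step: how many positive integers less than 1000 "
    "are divisible by neither 2 nor 5?"
)
messages = [{"role": "user", "content": prompt}]
# Assumes the checkpoint ships a chat template.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Recommended sampling settings from the implementation details above.
outputs = model.generate(
    inputs,
    max_new_tokens=4096,  # long reasoning traces need generous generation budgets
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```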