AReaL-boba-SFT-32B

Maintained by: inclusionAI

  • Parameter Count: 32 Billion
  • Model Type: Language Model (Mathematical Reasoning)
  • Architecture: Transformer-based SFT Model
  • Model Link: Hugging Face

What is AReaL-boba-SFT-32B?

AReaL-boba-SFT-32B is a 32B-parameter language model specialized for mathematical reasoning. It reaches performance comparable to QwQ-32B after supervised fine-tuning on only 200 carefully curated training samples, demonstrating that high-quality data can matter more than sheer quantity.
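
For intuition, a small-sample SFT run could be set up along the following lines. This is an illustrative sketch only: the base checkpoint, dataset file, data format, and hyperparameters are all assumptions, not the released AReaL training recipe.

```python
# Hypothetical sketch of a small-sample SFT run (base model, dataset path,
# and hyperparameters are illustrative assumptions, not the official recipe).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments,
                          DataCollatorForLanguageModeling)

base = "Qwen/Qwen2.5-32B"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto")

# ~200 curated reasoning traces in a JSONL file with a single "text" field (assumed format)
data = load_dataset("json", data_files="boba_sft_200.jsonl", split="train")

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=32768)

data = data.map(tokenize, remove_columns=data.column_names)

args = TrainingArguments(
    output_dir="areal-boba-sft-32b",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=10,       # small datasets are usually trained for more epochs
    learning_rate=1e-5,
    bf16=True,
    logging_steps=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```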

Implementation Details

The model uses SGLang v0.4.0 as its generation backend, which accelerates inference through RadixAttention prefix caching. The training process incorporates optimizations for variable-length sequences and large batches, with high-performance data transfer that scales to 1000 GPUs:

  • Advanced sequence packing into 1D tensors for optimal GPU memory utilization
  • NCCL with GPU-Direct RDMA for efficient communication
  • Specialized evaluation settings for long-context generation
  • Temperature of 0.6 and top_p of 0.95 recommended for inference (see the sketch below)
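
A minimal generation sketch with the recommended sampling settings follows. It uses the Hugging Face transformers API rather than the SGLang serving backend; the repository ID, prompt, and token budget are illustrative assumptions.

```python
# Illustrative inference sketch with the recommended sampling settings
# (temperature 0.6, top_p 0.95). Repo ID, prompt, and token budget are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/AReaL-boba-SFT-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user",
             "content": "Find the number of ordered pairs (a, b) of positive integers with a*b = 2024."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt").to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=8192,   # long reasoning traces need a generous budget
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```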

Core Capabilities

  • Achieves 78.8% accuracy on AIME 2024 problems (see the scoring sketch after this list)
  • 62.1% accuracy on AIME 2025 problems
  • 60.1% performance on GPQA-Diamond
  • Specialized in step-by-step mathematical reasoning
  • Efficient handling of long-context problems
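
For context on how accuracy figures like those above are typically computed, here is a small scoring sketch that extracts the final \boxed{...} answer from a generation and checks it against the reference. This reflects a common convention for AIME-style grading and is an assumption, not the official AReaL evaluation script.

```python
# Illustrative scoring sketch: extract the last \boxed{...} answer from a
# generated solution and compare it to the reference answer by exact match.
# This is an assumed grading convention, not the official evaluation code.
import re

def extract_boxed(text: str) -> str | None:
    """Return the contents of the last \\boxed{...} in the model output, if any."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

def score(outputs: list[str], references: list[str]) -> float:
    """Exact-match accuracy over paired generations and reference answers."""
    correct = sum(1 for out, ref in zip(outputs, references)
                  if extract_boxed(out) == ref.strip())
    return correct / len(references)

# Example usage with dummy data
print(score([r"... so the answer is \boxed{204}."], ["204"]))  # -> 1.0
```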

Frequently Asked Questions

Q: What makes this model unique?

The model's ability to achieve SOTA performance using only 200 training samples sets it apart, demonstrating exceptional data efficiency. It matches the performance of QwQ-32B while using significantly fewer resources for training.

Q: What are the recommended use cases?

The model excels in mathematical reasoning tasks, particularly in solving complex problems requiring step-by-step solutions. It's specifically optimized for AIME-level mathematics problems and similar advanced mathematical reasoning tasks.
