Magnum-v1-72b
| Property | Value |
|---|---|
| Parameter Count | 72.7B |
| License | tongyi-qianwen |
| Languages | English, Chinese |
| Training Data | 55M tokens |
| Base Model | Qwen2-72B-Instruct |
What is magnum-v1-72b?
Magnum-v1-72b is a large language model designed to emulate the prose quality of the Claude 3 models (Sonnet and Opus). Built on the foundation of Qwen2-72B-Instruct, it was trained using 8x AMD Instinct™ MI300X accelerators.
Implementation Details
The model was trained on 55 million tokens of high-quality RP data for 1.5 epochs. It uses the ChatML format for interactions and supports both English and Chinese. The weights are stored in the BF16 tensor type for efficient inference.
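Since the model expects ChatML-formatted input, a prompt can be assembled by wrapping each turn in `<|im_start|>`/`<|im_end|>` markers. The helper below is a minimal illustrative sketch; the system message and role names shown are generic examples, not taken from the model card.

```python
# Minimal sketch of building a ChatML-formatted prompt by hand.
# The system message below is an illustrative assumption.

def build_chatml_prompt(messages):
    """Join (role, content) pairs into a ChatML string and append
    the assistant header so the model continues from there."""
    parts = [
        f"<|im_start|>{role}\n{content}<|im_end|>"
        for role, content in messages
    ]
    # Open the assistant turn; the model generates its reply after this.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "Write one sentence about autumn."),
])
print(prompt)
```

In practice, a tokenizer's built-in chat template can produce the same framing automatically; the manual version above just makes the structure explicit.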
- Achieves 76.06% accuracy on IFEval (0-Shot)
- Demonstrates 57.65% normalized accuracy on BBH (3-Shot)
- Shows strong performance with 35.27% exact match on MATH Lvl 5 (4-Shot)
- Overall average performance of 42.21% across benchmark tests
Core Capabilities
- Bilingual support for English and Chinese
- Advanced text generation and conversation abilities
- Strong performance in mathematical reasoning tasks
- Optimized for instruction-following scenarios
- Compatible with text-generation-inference systems
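The capabilities above can be exercised through the standard Hugging Face transformers API. The sketch below assumes the Hub repository id `anthracite-org/magnum-72b-v1` (verify against the actual model page) and is wrapped in a function because actually running it requires multi-GPU hardware with enough memory for 72B-scale BF16 weights.

```python
# Sketch: loading and querying the model with Hugging Face transformers.
# MODEL_ID is an assumption -- check the model's Hub page for the real id.

MODEL_ID = "anthracite-org/magnum-72b-v1"  # assumed repository id

def generate(user_message: str, max_new_tokens: int = 256) -> str:
    # Import inside the function so defining it does not require
    # transformers to be installed; calling it does.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="bfloat16",  # matches the BF16 weights noted above
        device_map="auto",       # shard across available GPUs
    )
    # apply_chat_template emits the ChatML framing the model was trained with.
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": user_message}],
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, dropping the prompt.
    return tokenizer.decode(
        output[0][inputs.shape[-1]:], skip_special_tokens=True
    )
```

For serving rather than one-off scripting, the same repository id can be passed to a text-generation-inference deployment, which the model is listed as compatible with.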
Frequently Asked Questions
Q: What makes this model unique?
A: The model's distinctive feature is its optimization for Claude 3-like prose quality, combined with its impressive performance on various benchmarks, particularly its 76.06% accuracy on IFEval.
Q: What are the recommended use cases?
A: The model excels in conversational AI, text generation, and instruction-following tasks. It's particularly well-suited for applications requiring high-quality prose output and mathematical reasoning.