magnum-v1-72b

Maintained By
anthracite-org

Magnum-v1-72b

Property           Value
Parameter Count    72.7B
License            tongyi-qianwen
Languages          English, Chinese
Training Data      55M tokens
Base Model         Qwen2-72B-Instruct

What is magnum-v1-72b?

Magnum-v1-72b is a large language model designed to emulate the prose quality of the Claude 3 models (Sonnet and Opus). It is built on Qwen2-72B-Instruct and was trained using 8x AMD Instinct™ MI300X accelerators.

Implementation Details

The model was trained on 55 million tokens of high-quality roleplay (RP) data for 1.5 epochs. It uses ChatML formatting for interactions and supports both English and Chinese, and its weights are distributed in the BF16 tensor type (a loading sketch follows the benchmark figures below).

  • Achieves 76.06% accuracy on IFEval (0-Shot)
  • Demonstrates 57.65% normalized accuracy on BBH (3-Shot)
  • Shows strong performance with 35.27% exact match on MATH Lvl 5 (4-Shot)
  • Overall average performance of 42.21% across benchmark tests
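Since the card notes ChatML formatting and BF16 weights, the following is a minimal loading sketch using Hugging Face transformers. The repository id, system prompt, and sampling parameters are assumptions for illustration, not details taken from the card.

```python
# Minimal loading sketch. Assumptions: the repository id and the sampling
# settings below are illustrative, not taken from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "anthracite-org/magnum-v1-72b"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are distributed in BF16
    device_map="auto",
)

# The tokenizer's chat template renders these messages in ChatML format.
messages = [
    {"role": "system", "content": "You are a helpful writing assistant."},
    {"role": "user", "content": "Write the opening paragraph of a short mystery story."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```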

Core Capabilities

  • Bilingual support for English and Chinese
  • Advanced text generation and conversation abilities
  • Strong performance in mathematical reasoning tasks
  • Optimized for instruction-following scenarios
  • Compatible with text-generation-inference systems (see the request sketch after this list)
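
For the text-generation-inference compatibility noted above, the sketch below assumes a TGI server is already running locally and serving this model; the host, port, prompt text, and sampling parameters are illustrative assumptions, and the payload follows TGI's standard /generate endpoint.

```python
# Minimal sketch of querying a locally running text-generation-inference server
# that is assumed to be serving magnum-v1-72b. Host, port, and sampling values
# are assumptions for illustration.
import requests

TGI_URL = "http://localhost:8080/generate"  # assumed local TGI endpoint

# ChatML-formatted prompt, built by hand for illustration.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nSummarize the plot of Hamlet in two sentences.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

payload = {
    "inputs": prompt,
    "parameters": {"max_new_tokens": 200, "temperature": 0.7, "do_sample": True},
}

response = requests.post(TGI_URL, json=payload, timeout=120)
response.raise_for_status()
print(response.json()["generated_text"])
```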

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its optimization for Claude 3-like prose quality, combined with its impressive performance on various benchmarks, particularly its 76.06% accuracy on IFEval.

Q: What are the recommended use cases?

The model excels in conversational AI, text generation, and instruction-following tasks. It's particularly well-suited for applications requiring high-quality prose output and mathematical reasoning.
