magnum-v1-72b

Maintained By
anthracite-org

Magnum-v1-72b

Property           Value
Parameter Count    72.7B
License            tongyi-qianwen
Languages          English, Chinese
Training Data      55M tokens
Base Model         Qwen2-72B-Instruct

What is magnum-v1-72b?

Magnum-v1-72b is a large language model designed to emulate the prose quality of the Claude 3 models (Sonnet and Opus). It is built on Qwen2-72B-Instruct and was trained using 8x AMD Instinct™ MI300X accelerators.

Implementation Details

The model was trained on 55 million tokens of high-quality roleplay (RP) data for 1.5 epochs. It uses ChatML formatting for interactions and supports both English and Chinese, and its weights are distributed in the BF16 tensor type (a loading sketch follows the benchmark figures below).

  • Achieves 76.06% accuracy on IFEval (0-Shot)
  • Demonstrates 57.65% normalized accuracy on BBH (3-Shot)
  • Shows strong performance with 35.27% exact match on MATH Lvl 5 (4-Shot)
  • Overall average performance of 42.21% across benchmark tests
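Since the card notes ChatML formatting and BF16 weights, the following is a minimal loading sketch using Hugging Face transformers. The repository id, system prompt, and sampling parameters are assumptions for illustration, not details taken from the card.

```python
# Minimal loading sketch. Assumptions: the repository id and the sampling
# settings below are illustrative, not taken from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "anthracite-org/magnum-v1-72b"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are distributed in BF16
    device_map="auto",
)

# The tokenizer's chat template renders these messages in ChatML format.
messages = [
    {"role": "system", "content": "You are a helpful writing assistant."},
    {"role": "user", "content": "Write the opening paragraph of a short mystery story."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```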

Core Capabilities

  • Bilingual support for English and Chinese
  • Advanced text generation and conversation abilities
  • Strong performance in mathematical reasoning tasks
  • Optimized for instruction-following scenarios
  • Compatible with text-generation-inference systems (see the request sketch after this list)
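
For the text-generation-inference compatibility noted above, the sketch below assumes a TGI server is already running locally and serving this model; the host, port, prompt text, and sampling parameters are illustrative assumptions, and the payload follows TGI's standard /generate endpoint.

```python
# Minimal sketch of querying a locally running text-generation-inference server
# that is assumed to be serving magnum-v1-72b. Host, port, and sampling values
# are assumptions for illustration.
import requests

TGI_URL = "http://localhost:8080/generate"  # assumed local TGI endpoint

# ChatML-formatted prompt, built by hand for illustration.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nSummarize the plot of Hamlet in two sentences.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

payload = {
    "inputs": prompt,
    "parameters": {"max_new_tokens": 200, "temperature": 0.7, "do_sample": True},
}

response = requests.post(TGI_URL, json=payload, timeout=120)
response.raise_for_status()
print(response.json()["generated_text"])
```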

Frequently Asked Questions

Q: What makes this model unique?

The model's distinctive feature is its optimization for Claude 3-like prose quality, combined with its impressive performance on various benchmarks, particularly its 76.06% accuracy on IFEval.

Q: What are the recommended use cases?

The model excels in conversational AI, text generation, and instruction-following tasks. It's particularly well-suited for applications requiring high-quality prose output and mathematical reasoning.
