Magnum-v2-12b
| Property | Value |
|---|---|
| Parameter Count | 12.2B |
| Base Model | Mistral-Nemo-Base-2407 |
| License | Apache 2.0 |
| Supported Languages | 9 (EN, FR, DE, ES, IT, PT, RU, ZH, JA) |
| Training Hardware | 8x NVIDIA H100 GPUs |
What is magnum-v2-12b?
Magnum-v2-12b is a multilingual language model designed to emulate the prose quality of the Claude 3 models (Sonnet and Opus). Built on the Mistral-Nemo-Base-2407 architecture, it is the second release in the Magnum series, fine-tuned extensively on high-quality instruction datasets.
Implementation Details
The model was trained for 2 epochs on H100 GPUs. It uses the ChatML prompt format for instruction tuning and draws on multiple high-quality datasets, including Stheno, Opus_Instruct, and Sonnet3.5-SlimOrcaDedupCleaned.
- BF16 weights for efficient training and inference
- Supports ChatML prompt formatting
- Trained on diverse instruction datasets
- Implements full-parameter fine-tuning
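Since the model expects ChatML-formatted prompts, a minimal sketch of that format may help. The `to_chatml` helper below is hypothetical (not part of any official API); it simply renders a list of role/content turns with the standard `<|im_start|>` / `<|im_end|>` delimiters and leaves the assistant turn open for the model to complete.

```python
# Hypothetical helper illustrating the ChatML prompt format.
def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the plot of Hamlet in one sentence."},
])
print(prompt)
```

The open `<|im_start|>assistant` turn at the end is what cues the model to respond; generation is typically stopped at the next `<|im_end|>` token.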
Core Capabilities
- Multilingual support across 9 major languages
- Strong performance on IFEval with 37.62% accuracy (0-shot)
- Competitive BBH performance at 28.79% (3-shot)
- MMLU-PRO score of 24.08% (5-shot)
- Advanced reasoning capabilities across multiple domains
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its Claude 3-like prose quality combined with extensive multilingual support and robust instruction-following capabilities. It's particularly notable for its balanced performance across various evaluation metrics and practical applications.
Q: What are the recommended use cases?
The model excels in multilingual text generation, conversational AI, and complex reasoning tasks. It's particularly well-suited for applications requiring high-quality prose generation and cross-lingual capabilities.
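For these use cases, a typical setup loads the weights with Hugging Face Transformers and feeds ChatML-formatted prompts. The sketch below is a non-authoritative example: the repo id `anthracite-org/magnum-v2-12b` is assumed rather than confirmed by this card, and the heavy download is gated behind a `RUN_MODEL` flag so the prompt-building part runs on its own.

```python
MODEL_ID = "anthracite-org/magnum-v2-12b"  # assumed Hugging Face repo id

def build_prompt(user_message: str) -> str:
    # ChatML format, as noted under Implementation Details.
    return (
        "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

RUN_MODEL = False  # set True to actually download the ~12B weights and generate

prompt = build_prompt("Translate 'good morning' into French.")
print(prompt)

if RUN_MODEL:
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens.
    reply = tok.decode(out[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True)
    print(reply)
```

BF16 loading matches the tensor format listed above; `device_map="auto"` lets Transformers place the layers across available GPUs.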