Magnum-v2-12b
| Property | Value |
|---|---|
| Parameter Count | 12.2B |
| Base Model | Mistral-Nemo-Base-2407 |
| License | Apache 2.0 |
| Supported Languages | 9 (EN, FR, DE, ES, IT, PT, RU, ZH, JA) |
| Training Hardware | 8x NVIDIA H100 GPUs |
What is magnum-v2-12b?
Magnum-v2-12b is a multilingual language model designed to emulate the prose quality of the Claude 3 models (Sonnet and Opus). Built on the Mistral-Nemo-Base-2407 architecture, it is the second release in the Magnum series, fine-tuned extensively on high-quality instruction datasets.
Implementation Details
The model was trained for 2 epochs on H100 GPUs. It uses the ChatML prompt format for instruction tuning and draws on multiple high-quality datasets, including Stheno, Opus_Instruct, and Sonnet3.5-SlimOrcaDedupCleaned.
- BF16 weights for efficient training and inference
- Supports ChatML prompt formatting
- Trained on diverse instruction datasets
- Implements full-parameter fine-tuning
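Since the model expects ChatML-formatted prompts, a minimal sketch of that format may help. The `to_chatml` helper below is hypothetical (not part of any official API); it simply renders a list of role/content turns with the standard `<|im_start|>` / `<|im_end|>` delimiters and leaves the assistant turn open for the model to complete.

```python
# Hypothetical helper illustrating the ChatML prompt format.
def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt string."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the plot of Hamlet in one sentence."},
])
print(prompt)
```

The open `<|im_start|>assistant` turn at the end is what cues the model to respond; generation is typically stopped at the next `<|im_end|>` token.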
Core Capabilities
- Multilingual support across 9 major languages
- Strong performance on IFEval with 37.62% accuracy (0-shot)
- Competitive BBH performance at 28.79% (3-shot)
- MMLU-PRO score of 24.08% (5-shot)
- Advanced reasoning capabilities across multiple domains
Frequently Asked Questions
Q: What makes this model unique?
The model stands out for its Claude 3-like prose quality combined with extensive multilingual support and robust instruction-following capabilities. It's particularly notable for its balanced performance across various evaluation metrics and practical applications.
Q: What are the recommended use cases?
The model excels in multilingual text generation, conversational AI, and complex reasoning tasks. It's particularly well-suited for applications requiring high-quality prose generation and cross-lingual capabilities.
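For these use cases, a typical setup loads the weights with Hugging Face Transformers and feeds ChatML-formatted prompts. The sketch below is a non-authoritative example: the repo id `anthracite-org/magnum-v2-12b` is assumed rather than confirmed by this card, and the heavy download is gated behind a `RUN_MODEL` flag so the prompt-building part runs on its own.

```python
MODEL_ID = "anthracite-org/magnum-v2-12b"  # assumed Hugging Face repo id

def build_prompt(user_message: str) -> str:
    # ChatML format, as noted under Implementation Details.
    return (
        "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

RUN_MODEL = False  # set True to actually download the ~12B weights and generate

prompt = build_prompt("Translate 'good morning' into French.")
print(prompt)

if RUN_MODEL:
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens.
    reply = tok.decode(out[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True)
    print(reply)
```

BF16 loading matches the tensor format listed above; `device_map="auto"` lets Transformers place the layers across available GPUs.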