magnum-v2-12b

Maintained By
anthracite-org


| Property | Value |
|---|---|
| Parameter Count | 12.2B |
| Base Model | Mistral-Nemo-Base-2407 |
| License | Apache 2.0 |
| Supported Languages | 9 (EN, FR, DE, ES, IT, PT, RU, ZH, JA) |
| Training Hardware | 8x NVIDIA H100 GPUs |

What is magnum-v2-12b?

Magnum-v2-12b is a multilingual language model designed to emulate the prose quality of the Claude 3 models (Sonnet and Opus). Built on the Mistral-Nemo-Base-2407 architecture, it is the second major iteration in the Magnum series, fine-tuned on high-quality instruction datasets.

Implementation Details

The model was trained for 2 epochs on 8x NVIDIA H100 GPUs. It uses ChatML formatting for instruction tuning and draws on multiple high-quality datasets, including Stheno, Opus_Instruct, and Sonnet3.5-SlimOrcaDedupCleaned.

  • BF16 tensor format for optimal performance
  • Supports ChatML prompt formatting
  • Trained on diverse instruction datasets
  • Implements full-parameter fine-tuning
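Since the model expects ChatML-formatted prompts, a minimal sketch of assembling one is shown below. The role markers follow the standard ChatML convention; the helper function name and the example message contents are illustrative assumptions, not part of the model's official tooling.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn ChatML prompt string.

    ChatML wraps each message in <|im_start|>{role} ... <|im_end|>
    markers; the trailing open assistant turn cues the model to
    generate its reply.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Summarize the plot of Hamlet in two sentences.",
)
print(prompt)
```

In practice, tokenizers shipped with ChatML-tuned models usually expose this same structure through a built-in chat template, so hand-building the string is only needed when working outside that tooling.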

Core Capabilities

  • Multilingual support across 9 major languages
  • Strong performance on IFEval with 37.62% accuracy (0-shot)
  • Competitive BBH performance at 28.79% (3-shot)
  • MMLU-PRO score of 24.08% (5-shot)
  • Advanced reasoning capabilities across multiple domains

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its Claude 3-like prose quality combined with extensive multilingual support and robust instruction following, and it delivers balanced performance across the evaluation metrics listed above.

Q: What are the recommended use cases?

The model excels in multilingual text generation, conversational AI, and complex reasoning tasks. It's particularly well-suited for applications requiring high-quality prose generation and cross-lingual capabilities.
