WizardLM-2-8x22B

Maintained by: alpindale

Parameter Count: 141B
Model Type: Mixture of Experts (MoE)
Base Model: Mixtral-8x22B-v0.1
License: Apache 2.0
Developer: WizardLM@Microsoft AI

What is WizardLM-2-8x22B?

WizardLM-2-8x22B is a large language model and the most advanced release in Microsoft's WizardLM series. It is built on Mixtral-8x22B-v0.1 and uses a Mixture of Experts (MoE) architecture with 141B total parameters. The model targets complex chat, multilingual tasks, reasoning, and agent-based interactions.
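To give a rough sense of what 141B parameters means in practice, the sketch below estimates the memory needed to hold the weights alone. It assumes 2 bytes per parameter, which matches the BF16 tensor type mentioned under Implementation Details. The function name `weight_memory_gb` is hypothetical, and the figure ignores activations, KV cache, and framework overhead, so real requirements will be higher.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Estimate memory for model weights only, in gigabytes (10^9 bytes).

    bytes_per_param=2 corresponds to BF16/FP16 storage (an assumption
    about how the weights are loaded, not a measured value).
    """
    return num_params * bytes_per_param / 1e9


# 141B parameters, as listed in the property table above.
TOTAL_PARAMS = 141_000_000_000

if __name__ == "__main__":
    print(f"BF16 weights: ~{weight_memory_gb(TOTAL_PARAMS):.0f} GB")
    # Hypothetical 8-bit quantized copy, 1 byte per parameter.
    print(f"8-bit weights: ~{weight_memory_gb(TOTAL_PARAMS, 1):.0f} GB")
```

Because an MoE model normally keeps every expert resident in memory even though only some are used per token, the estimate uses the total parameter count rather than an active-parameter count.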

Implementation Details

The model was trained with a fully AI-powered synthetic training system and follows the Vicuna prompt format for multi-turn conversations. It uses BF16 precision. On benchmark evaluations it scores 52.72% on IFEval and 48.58% on BBH (3-shot).

  • Supports multi-turn conversations with Vicuna-style prompting
  • Uses a Mixture of Experts (MoE) architecture, which routes each token through a subset of experts
  • Uses the BF16 tensor type
  • Released under the Apache 2.0 license
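The Vicuna-style multi-turn format mentioned above can be sketched as a small prompt builder. This is an illustration under stated assumptions, not the official template: the system sentence, the `USER:` / `ASSISTANT:` role labels, and the `</s>` end-of-turn marker follow the common Vicuna convention and should be checked against the model's own card before use. The names `SYSTEM_PROMPT`, `EOS`, and `build_vicuna_prompt` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed default system sentence in the Vicuna convention.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
# Assumed end-of-sequence marker that closes each assistant turn.
EOS = "</s>"


def build_vicuna_prompt(
    turns: List[Tuple[str, Optional[str]]],
    system: str = SYSTEM_PROMPT,
) -> str:
    """Build a Vicuna-style prompt from (user, assistant) turns.

    Pass None as the assistant message of the final turn to leave the
    prompt open for the model to complete.
    """
    parts = [system + " "]
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg} ASSISTANT:")
        if assistant_msg is not None:
            # Completed turns end with the EOS marker.
            parts.append(f" {assistant_msg}{EOS}")
    return "".join(parts)


if __name__ == "__main__":
    history = [("Hi", "Hello."), ("Who are you?", None)]
    print(build_vicuna_prompt(history))
```

Leaving the last assistant slot empty means the string ends with `ASSISTANT:`, which is where generation would begin.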

Core Capabilities

  • Achieves competitive performance against leading proprietary models on MT-Bench
  • Excels in complex instruction following with strong win rates against GPT-4 models
  • Demonstrates superior performance in multilingual tasks
  • Shows strong reasoning capabilities across various domains
  • Achieves 39.96% accuracy on MMLU-PRO (5-shot)

Frequently Asked Questions

Q: What makes this model unique?

WizardLM-2-8x22B stands out for benchmark results that are competitive with proprietary models while remaining openly available under the Apache 2.0 license. Its MoE architecture and fully synthetic training system support strong performance across diverse tasks.

Q: What are the recommended use cases?

The model is best suited to complex chat applications, multilingual processing, advanced reasoning tasks, and agent-based interactions, particularly where an application needs sophisticated language understanding and generation.
