DiscoLM-mixtral-8x7b-v2

Maintained By: DiscoResearch

Property          Value
Parameter Count   46.7B
Model Type        Mixtral MoE
License           Apache 2.0
Tensor Type       FP16

What is DiscoLM-mixtral-8x7b-v2?

DiscoLM-mixtral-8x7b-v2 is an experimental Mixture of Experts (MoE) model developed by DiscoResearch, based on Mistral AI's Mixtral 8x7b architecture. Instead of running every parameter for each token, the model routes each token to a small subset of expert feed-forward networks, giving it the capacity of a 46.7B-parameter model at a per-token compute cost closer to that of a much smaller dense model.
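
To make the routing idea concrete, the toy sketch below implements a Mixtral-style sparse MoE layer in which each token is sent to its top-2 of 8 experts. The hidden and feed-forward sizes, the per-token loop, and the expert structure are illustrative simplifications, not the model's actual implementation.

```python
# Toy sketch of Mixtral-style sparse MoE routing (top-2 of 8 experts).
# Dimensions are illustrative, not the real model's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, hidden_dim=64, ffn_dim=128, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(hidden_dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden_dim, ffn_dim), nn.SiLU(), nn.Linear(ffn_dim, hidden_dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                        # x: (tokens, hidden_dim)
        logits = self.router(x)                  # (tokens, num_experts)
        weights, idx = torch.topk(logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e            # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k:k + 1] * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(SparseMoELayer()(tokens).shape)            # torch.Size([10, 64])
```

Only the selected experts run for a given token, which is why the compute per token stays well below the total parameter count.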

Implementation Details

Loading the model requires trust_remote_code=True until the Mixtral architecture is merged into the transformers library. Prompts follow the ChatML format, and the model integrates directly with the Hugging Face Transformers library (see the loading sketch after the list below).

  • Built on Mistral AI's Mixtral 8x7b architecture
  • Fine-tuned on Synthia, MetaMathQA, and Capybara datasets
  • Implements FP16 precision for efficient computation
  • Supports chat template formatting for streamlined interaction
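
As a minimal loading sketch under those assumptions, the snippet below pulls the model with trust_remote_code=True, keeps the weights in FP16, and formats a ChatML conversation through the tokenizer's chat template. The Hub repository id and the generation settings are illustrative assumptions, not values prescribed by the model card.

```python
# Minimal loading sketch, assuming the Hub id below and standard Transformers APIs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DiscoResearch/DiscoLM-mixtral-8x7b-v2"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # matches the FP16 tensor type listed above
    device_map="auto",
    trust_remote_code=True,      # required until the architecture lands in transformers
)

# ChatML-style conversation rendered through the bundled chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what a Mixture of Experts model is."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

If a later revision of the checkpoint ships a native transformers implementation, the trust_remote_code flag can likely be dropped.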

Core Capabilities

  • Strong performance on ARC (67.32%) and HellaSwag (86.25%)
  • Impressive MMLU score of 70.72%
  • MT-Bench category scores of 9.75 in humanities and 9.45 in STEM
  • Specialized in writing, roleplay, and extraction tasks

Frequently Asked Questions

Q: What makes this model unique?

Its strength comes from pairing the sparse Mixture of Experts architecture with fine-tuning on the Synthia, MetaMathQA, and Capybara datasets. This combination makes it effective across diverse tasks while keeping performance high on standard benchmarks.

Q: What are the recommended use cases?

The model excels in conversational AI, academic content generation, and complex reasoning tasks. It is particularly well-suited for applications that demand strong performance in the humanities and STEM, as evidenced by its MT-Bench category scores.
