Multilingual-MiniLM-L12-H384

Maintained by: microsoft


  • Architecture: 12-layer Transformer
  • Hidden Size: 384
  • Parameters: 21M (Transformer) + 96M (Embedding)
  • License: MIT
  • Languages: 16 (including en, ar, bg, de, el, es, fr, hi, ru, sw, th, tr, ur, vi, zh)

What is Multilingual-MiniLM-L12-H384?

Multilingual-MiniLM-L12-H384 is a compact multilingual transformer model developed by Microsoft. It is distilled from a larger teacher model and designed to provide efficient cross-lingual understanding while maintaining strong performance. The model combines BERT's architecture with XLM-R's tokenization approach, striking a practical balance between computational efficiency and multilingual capability.

Implementation Details

The model features a 12-layer transformer architecture with 384 hidden dimensions and 12 attention heads. It uses knowledge distillation techniques to compress the capabilities of larger models into a more efficient form, resulting in just 21M transformer parameters and 96M embedding parameters.

  • Utilizes the XLM-RoBERTa tokenizer for multilingual support (see the loading sketch after this list)
  • Implements a BERT-style transformer architecture
  • Optimized for cross-lingual transfer learning
  • Achieves 71.1% average accuracy on the XNLI benchmark
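Because the checkpoint pairs a BERT-style encoder with the XLM-RoBERTa tokenizer, the two are typically loaded separately rather than through a single Auto class. Below is a minimal loading sketch, assuming the model is available on the Hugging Face Hub as microsoft/Multilingual-MiniLM-L12-H384; the sentence pair, mean-pooling step, and similarity check are illustrative, not part of the model card.

```python
# Minimal loading sketch: BERT-style encoder + XLM-RoBERTa tokenizer.
import torch
from transformers import BertModel, XLMRobertaTokenizer

model_id = "microsoft/Multilingual-MiniLM-L12-H384"
tokenizer = XLMRobertaTokenizer.from_pretrained(model_id)
model = BertModel.from_pretrained(model_id)
model.eval()

# Encode the same sentence in two languages and compare pooled representations.
sentences = ["The weather is nice today.", "Das Wetter ist heute schön."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)

# Mean-pool token embeddings (ignoring padding) to get sentence vectors.
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(1) / mask.sum(1)
similarity = torch.nn.functional.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(f"Cross-lingual cosine similarity: {similarity.item():.3f}")
```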

Core Capabilities

  • Cross-lingual Natural Language Inference (XNLI benchmark performance)
  • Multilingual Question Answering (MLQA benchmark support)
  • Text Classification across 16 languages (see the fine-tuning sketch after this list)
  • Efficient deployment with reduced parameter count
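For classification tasks such as XNLI-style natural language inference, the encoder can be wrapped with a standard sequence-classification head. The sketch below is a hedged illustration: the label mapping, learning rate, and single training step are assumptions for demonstration, not values from the model card.

```python
# Hedged fine-tuning sketch for XNLI-style NLI (entailment / neutral / contradiction).
import torch
from transformers import BertForSequenceClassification, XLMRobertaTokenizer

model_id = "microsoft/Multilingual-MiniLM-L12-H384"
tokenizer = XLMRobertaTokenizer.from_pretrained(model_id)
model = BertForSequenceClassification.from_pretrained(model_id, num_labels=3)
model.train()

# Premise/hypothesis pairs are packed into a single sequence, BERT-style.
premise = "A man is playing a guitar on stage."
hypothesis = "Someone is performing music."
inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")

# One illustrative training step; in practice iterate over full XNLI batches.
labels = torch.tensor([0])  # assumes 0 = entailment in your label mapping
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
```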

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its efficient architecture that maintains strong performance while using only 21M parameters for its transformer component. It successfully combines BERT's architecture with XLM-R's tokenization, making it particularly effective for multilingual applications while requiring fewer computational resources.

Q: What are the recommended use cases?

The model is particularly well-suited for cross-lingual tasks such as natural language inference and question answering. It's ideal for applications requiring multilingual understanding with limited computational resources, showing strong performance on benchmarks like XNLI and MLQA across multiple languages.
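As a concrete example of the question-answering use case, the sketch below shows MLQA-style extractive QA. It assumes a MiniLM checkpoint that has already been fine-tuned on SQuAD-style data; the "your-org/Multilingual-MiniLM-L12-H384-squad" identifier is hypothetical and stands in for whatever fine-tuned checkpoint you actually use.

```python
# Hedged extractive QA sketch (MLQA-style), using a hypothetical fine-tuned checkpoint.
import torch
from transformers import BertForQuestionAnswering, XLMRobertaTokenizer

model_id = "your-org/Multilingual-MiniLM-L12-H384-squad"  # hypothetical checkpoint
tokenizer = XLMRobertaTokenizer.from_pretrained(model_id)
model = BertForQuestionAnswering.from_pretrained(model_id)
model.eval()

question = "¿Dónde vive la tortuga?"  # Spanish question over a Spanish passage
context = "La tortuga vive en el jardín detrás de la casa."
inputs = tokenizer(question, context, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Pick the most likely start/end token positions and decode the answer span.
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax()) + 1
answer = tokenizer.decode(inputs["input_ids"][0][start:end])
print(answer)
```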
