USER-bge-m3

Maintained By: deepvk


Property              Value
Parameter Count       359M
License               Apache 2.0
Embedding Dimension   1024
Primary Language      Russian
Architecture          XLM-RoBERTa-based
Research Paper        LM-Cocktail Paper

What is USER-bge-m3?

USER-bge-m3 is a specialized sentence transformer model designed specifically for the Russian language. It's built upon the BGE-M3 architecture and optimized to generate high-quality 1024-dimensional embeddings for Russian text. The model excels at tasks like semantic search, clustering, and sentence similarity analysis.
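
For a quick start, the model can be loaded through the sentence-transformers library. The snippet below is a minimal sketch that assumes the checkpoint is published on the Hugging Face Hub under the deepvk/USER-bge-m3 identifier; the example sentences are illustrative only.

```python
# Minimal sketch: encoding Russian sentences into 1024-dimensional embeddings.
# Assumes the checkpoint is available as "deepvk/USER-bge-m3" on the Hugging Face Hub.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("deepvk/USER-bge-m3")

sentences = [
    "Москва является столицей России.",              # "Moscow is the capital of Russia."
    "Столицей Российской Федерации является Москва.", # paraphrase of the first sentence
]

# normalize_embeddings=True makes dot products equal to cosine similarity
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings.shape)              # (2, 1024)
print(embeddings[0] @ embeddings[1]) # high similarity for the paraphrase pair
```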

Implementation Details

The model was trained in multiple stages. It was initialized from TatonkaHF/bge-m3_en_ru, fine-tuned with both symmetric and asymmetric objectives, and the resulting models were combined using the LM-Cocktail merging technique. Training drew on over 2.2 million positive pairs and 792,644 negative pairs from various Russian datasets.

  • Advanced training methodology using AnglE loss for symmetric tasks
  • Integrated with popular frameworks such as sentence-transformers and transformers (see the sketch after this list)
  • Comprehensive evaluation on the encodechka benchmark, showing superior performance over the base BGE-M3
  • Trained on 14 diverse Russian datasets
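
The same checkpoint can also be driven through the plain transformers API. The sketch below assumes CLS-token pooling, in line with the BGE family; verify the pooling configuration in the model repository before relying on it.

```python
# Sketch of embedding extraction via transformers directly.
# CLS pooling is an assumption here (typical for BGE-style models).
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "deepvk/USER-bge-m3"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

texts = ["Пример предложения на русском языке."]  # "An example sentence in Russian."

with torch.no_grad():
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    outputs = model(**batch)
    # Take the [CLS] token representation and L2-normalize it
    embeddings = torch.nn.functional.normalize(outputs.last_hidden_state[:, 0], dim=-1)

print(embeddings.shape)  # torch.Size([1, 1024])
```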

Core Capabilities

  • Generation of 1024-dimensional text embeddings
  • Optimized for Russian language understanding
  • Strong performance in semantic similarity tasks
  • Efficient text classification and retrieval
  • Supports both symmetric and asymmetric similarity tasks (illustrated in the sketch below)
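
As an illustration of an asymmetric (query-to-passage) task, the sketch below ranks candidate passages against a query by cosine similarity of normalized embeddings. The model identifier and texts are placeholders, not part of the original card.

```python
# Retrieval-style sketch: rank passages by cosine similarity to a query.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("deepvk/USER-bge-m3")  # assumed Hub identifier

query = "Как оформить загранпаспорт?"  # "How do I apply for a foreign passport?"
passages = [
    "Заявление на загранпаспорт можно подать через портал Госуслуг.",
    "Рецепт борща: свёкла, капуста, картофель и говядина.",
]

q_emb = model.encode([query], normalize_embeddings=True)
p_emb = model.encode(passages, normalize_embeddings=True)

# With normalized vectors the dot product equals cosine similarity
scores = (q_emb @ p_emb.T)[0]
for passage, score in sorted(zip(passages, scores), key=lambda x: -x[1]):
    print(f"{score:.3f}  {passage}")
```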

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialized focus on Russian language processing, achieving superior performance on Russian NLP tasks compared to general multilingual models. It shows clear improvements over the base model across benchmarks, particularly in classification and pair classification tasks.

Q: What are the recommended use cases?

The model is ideal for Russian language applications including semantic search, document clustering, text similarity analysis, and information retrieval. It's particularly effective for tasks requiring deep understanding of Russian text semantics.
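
For document clustering specifically, one possible approach (not prescribed by the model card) is to run a standard algorithm such as k-means over the embeddings; scikit-learn and the sample texts below are assumptions for illustration.

```python
# Clustering sketch: group short Russian documents by topic with k-means.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer("deepvk/USER-bge-m3")  # assumed Hub identifier

docs = [
    "Курс рубля укрепился после решения Центробанка.",
    "Инфляция в России замедлилась в третьем квартале.",
    "Сборная выиграла матч со счётом 3:1.",
    "Нападающий забил два гола в финале.",
]

embeddings = model.encode(docs, normalize_embeddings=True)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)

for doc, label in zip(docs, labels):
    print(label, doc)  # finance-related and sports-related texts should separate
```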
