Solon-embeddings-large-0.1

Maintained by: OrdalieTech

Property          Value
Parameter Count   560M
License           MIT
Language          French
Tensor Type       F32
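
The parameter count and tensor type listed above give a rough lower bound on the memory needed to hold the weights. The sketch below is only that arithmetic: the helper name `weight_memory_gb` and the bytes-per-parameter table are illustrative assumptions, and the estimate ignores activations, tokenizer data, and framework overhead.

```python
# Bytes used by one parameter for a few common tensor types.
# Only F32 is stated for this model; the others are listed for comparison.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_memory_gb(param_count: int, tensor_type: str = "F32") -> float:
    """Approximate size of the weights alone, in decimal gigabytes (1e9 bytes)."""
    return param_count * BYTES_PER_PARAM[tensor_type] / 1e9


# 560M parameters stored as F32: 560e6 * 4 bytes
print(f"{weight_memory_gb(560_000_000):.2f} GB")  # prints "2.24 GB"
```

Under these assumptions, loading the model in F32 needs a little over 2 GB for the weights before any input is processed.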

What is Solon-embeddings-large-0.1?

Solon-embeddings-large-0.1 is a French-language embedding model developed by OrdalieTech. It reports a mean score of 0.749 on MTEB benchmarks, ahead of multilingual models such as cohere/embed-multilingual-v3 and OpenAI's ada-002.

Implementation Details

The model is optimized for French language processing. For retrieval, queries should be prefixed with "query:" before the input text, which improves retrieval performance. It uses the XLM-RoBERTa architecture and is available in both base and large variants; this card covers the large variant.
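
The query prefix described above can be applied with a small helper before encoding. This is a minimal sketch, not the maintainers' reference code. It assumes that the model is published under the Hugging Face ID `OrdalieTech/Solon-embeddings-large-0.1`, that it loads through the `sentence-transformers` library, that the prefix is written as `"query: "` with a trailing space, and that passages are left unprefixed. The names `format_query` and `embed` are hypothetical.

```python
QUERY_PREFIX = "query: "  # assumption: exact spacing after the colon


def format_query(text: str) -> str:
    """Prepend the query prefix unless the text already starts with it."""
    text = text.strip()
    if text.lower().startswith("query:"):
        return text
    return QUERY_PREFIX + text


def embed(queries, passages, model_name="OrdalieTech/Solon-embeddings-large-0.1"):
    """Encode prefixed queries and unprefixed passages into unit-length vectors."""
    # Imported lazily so format_query stays usable without the dependency installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)  # assumed model ID, downloads weights
    query_vecs = model.encode(
        [format_query(q) for q in queries], normalize_embeddings=True
    )
    passage_vecs = model.encode(list(passages), normalize_embeddings=True)
    return query_vecs, passage_vecs
```

A call such as `embed(["Quelle est la capitale de la France ?"], ["Paris est la capitale de la France."])` would return one query vector and one passage vector. Because both are normalized, their dot product equals their cosine similarity.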

  • Achieves 92.7% Recall@500 on mMARCO-fr passage retrieval
  • Reaches 89.26% accuracy on MTOP Domain Classification
  • Reaches 83.31% Spearman correlation on STS22 (Semantic Textual Similarity)
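
To make the Recall@500 figure above concrete, the sketch below computes Recall@k under one common definition: the share of a query's relevant passages that appear among the top k retrieved results, averaged over queries. The benchmark's exact evaluation protocol is not given here, so treat this definition as an assumption. The function names and passage IDs are illustrative.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant passages found in the first k ranked results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    hits = relevant.intersection(ranked_ids[:k])
    return len(hits) / len(relevant)


def mean_recall_at_k(runs, k):
    """Average Recall@k over (ranked_ids, relevant_ids) pairs, one per query."""
    scores = [recall_at_k(ranked, relevant, k) for ranked, relevant in runs]
    return sum(scores) / len(scores)


# Toy ranking for a single query with two relevant passages.
ranked = ["p3", "p1", "p7", "p2"]
relevant = ["p1", "p2"]
print(recall_at_k(ranked, relevant, k=2))  # 0.5: only p1 is in the top 2
print(recall_at_k(ranked, relevant, k=4))  # 1.0: both p1 and p2 are in the top 4
```

With k = 500 the metric is lenient about exact rank. It measures whether relevant passages are retrieved at all, which fits a first-stage retriever feeding a reranker.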

Core Capabilities

  • Text Classification
  • Semantic Search and Retrieval
  • Clustering
  • Bitext Mining
  • Semantic Textual Similarity
  • Reranking
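
Several of the capabilities listed above (semantic search, reranking, semantic textual similarity) reduce to comparing embedding vectors. The sketch below ranks passages by cosine similarity to a query. It uses small hand-written vectors in place of real model output, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0, 1.0]
passages = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.9], [0.5, 0.5, 0.5]]
print([index for index, _ in rank_passages(query, passages)])  # [1, 2, 0]
```

For more than a few thousand passages, a vectorized or approximate nearest-neighbor index would normally replace this loop. The ranking logic stays the same.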

Frequently Asked Questions

Q: What makes this model unique?

The model's specialized focus on French language processing and its evaluation across 9 French benchmarks set it apart. On those benchmarks, it outperforms other multilingual models on French-specific tasks.

Q: What are the recommended use cases?

The model excels in semantic search, document classification, and similarity assessment tasks. It's particularly effective for French language applications requiring precise semantic understanding and retrieval capabilities.
