multi-qa-MiniLM-L6-dot-v1

Maintained By
sentence-transformers

Property                   Value
Parameter Count            22.7M
Embedding Dimensions       384
Training Data              215M Q&A pairs
Maximum Sequence Length    512 tokens

What is multi-qa-MiniLM-L6-dot-v1?

multi-qa-MiniLM-L6-dot-v1 is a specialized sentence transformer model designed for semantic search applications. It transforms text into 384-dimensional dense vector representations, enabling efficient similarity matching between queries and documents. The model was trained on an extensive dataset of 215 million question-answer pairs from diverse sources including WikiAnswers, Stack Exchange, and MS MARCO.
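A minimal usage sketch with the sentence-transformers library, loading the model from the Hugging Face Hub and scoring with the dot product it was trained for (the query and documents are illustrative):

```python
from sentence_transformers import SentenceTransformer, util

# Load the model from the Hugging Face Hub
model = SentenceTransformer("sentence-transformers/multi-qa-MiniLM-L6-dot-v1")

query = "How many people live in London?"
docs = [
    "Around 9 million people live in London.",
    "London is known for its financial district.",
]

# Encode query and documents into 384-dimensional vectors
query_emb = model.encode(query)
doc_embs = model.encode(docs)

# Score with the dot product, the similarity this model is optimized for
scores = util.dot_score(query_emb, doc_embs)[0].tolist()
for doc, score in zip(docs, scores):
    print(f"{score:.4f}  {doc}")
```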

Implementation Details

The model uses CLS pooling and is optimized for dot-product similarity scoring. Built on the MiniLM architecture, it offers an efficient balance between retrieval quality and computational cost. Inputs longer than 512 word pieces are truncated, and because the model was trained on text of up to 250 word pieces, quality may degrade on longer inputs. Its key characteristics are listed below, followed by a minimal CLS-pooling sketch.

  • Produces non-normalized 384-dimensional embeddings
  • Uses CLS pooling for sentence representation
  • Optimized for dot-product similarity scoring
  • Built on the compact 6-layer MiniLM transformer architecture
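
Since the model uses CLS pooling with no normalization, equivalent embeddings can be computed with the plain transformers library. A minimal sketch (the `cls_pooling` helper below is illustrative, not part of the library):

```python
from transformers import AutoTokenizer, AutoModel
import torch

model_name = "sentence-transformers/multi-qa-MiniLM-L6-dot-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

def cls_pooling(model_output):
    # CLS pooling: take the hidden state of the first ([CLS]) token
    return model_output.last_hidden_state[:, 0]

texts = ["How do I bake bread?", "A simple recipe for sourdough bread."]
encoded = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    output = model(**encoded)

embeddings = cls_pooling(output)  # shape: (2, 384), not normalized
print(embeddings.shape)
```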

Core Capabilities

  • Semantic search and document retrieval (see the retrieval sketch after this list)
  • Question-answer matching
  • Text similarity computation
  • Dense passage retrieval
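
Putting these together, a small retrieval sketch using util.semantic_search with dot-product scoring over an in-memory corpus (the corpus and query are illustrative):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/multi-qa-MiniLM-L6-dot-v1")

corpus = [
    "Python is a popular programming language.",
    "The Eiffel Tower is in Paris.",
    "MiniLM is a compact transformer architecture.",
]
corpus_embs = model.encode(corpus, convert_to_tensor=True)
query_emb = model.encode("Where is the Eiffel Tower located?", convert_to_tensor=True)

# Rank the corpus by dot-product score and keep the top 2 hits
hits = util.semantic_search(query_emb, corpus_embs, top_k=2,
                            score_function=util.dot_score)[0]
for hit in hits:
    print(f"{hit['score']:.4f}  {corpus[hit['corpus_id']]}")
```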

Frequently Asked Questions

Q: What makes this model unique?

The model combines extensive training on 215M diverse Q&A pairs with optimization for dot-product similarity. This makes it particularly effective for semantic search applications while remaining computationally light at only 22.7M parameters.

Q: What are the recommended use cases?

The model excels at semantic search, question-answer matching, and document retrieval. It is particularly suitable for applications requiring fast similarity matching between shorter texts (under roughly 250 word pieces) and performs best with dot-product scoring; a simple length check is sketched below.
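
Because quality may degrade beyond roughly 250 word pieces, it can be worth checking input length before encoding. A sketch (the helper name and the 250-piece threshold are assumptions based on the guidance above):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "sentence-transformers/multi-qa-MiniLM-L6-dot-v1"
)

def within_training_length(text: str, limit: int = 250) -> bool:
    # Count word pieces (excluding special tokens) against the length
    # range the model is reported to work best on
    n_tokens = len(tokenizer.encode(text, add_special_tokens=False))
    return n_tokens <= limit

print(within_training_length("How many people live in London?"))  # True
print(within_training_length("word " * 400))                      # False
```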
