all-mpnet-base-v1

Property	Value
Parameter Count	109M
License	Apache 2.0
Architecture	MPNet-based Transformer
Output Dimension	768
Training Data	1B+ sentence pairs

What is all-mpnet-base-v1?

all-mpnet-base-v1 is a sentence embedding model from the sentence-transformers team. Built on Microsoft's MPNet architecture, it maps sentences and paragraphs to 768-dimensional dense vectors that can be used for semantic search, clustering, and similarity tasks. The model was fine-tuned on more than 1 billion sentence pairs drawn from sources such as Reddit comments, academic citations, and question-answer pairs.

Implementation Details

Fine-tuning starts from the pretrained microsoft/mpnet-base checkpoint and uses a contrastive learning objective. The model was trained for 920k steps with a batch size of 512 on TPU v3-8 hardware, using the AdamW optimizer and a learning rate of 2e-5.
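To make the contrastive objective concrete, here is a minimal, dependency-free sketch of one common formulation: cross-entropy over scaled cosine similarities, where the other pairs in the batch serve as negatives. The function names and the `scale` value are illustrative assumptions, not details taken from this page, and the real training code operates on batched tensors rather than Python lists.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the true match for anchors[i].

    Every other positives[j] in the batch acts as a negative for anchors[i].
    `scale` is an assumed temperature-like multiplier, chosen for illustration.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the row of similarities.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Toy 2-dimensional "embeddings" standing in for 768-dimensional ones.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]  # each anchor is closest to its own pair
swapped = [[0.1, 0.9], [0.9, 0.1]]  # each anchor is closest to the wrong pair

loss_aligned = in_batch_contrastive_loss(anchors, aligned)
loss_swapped = in_batch_contrastive_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

The loss is close to zero when each sentence is most similar to its true partner and grows when a different sentence in the batch scores higher, which is the signal that pulls matching pairs together during fine-tuning.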

Supports input sequences of up to 128 tokens; longer input is truncated
Implements mean pooling with attention mask
Uses cosine similarity for sentence pair comparison
Provides both PyTorch and ONNX runtime support
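
Mean pooling with an attention mask, as listed above, can be sketched in plain Python. This is an illustrative version only: `mean_pool` is a hypothetical helper name, and in practice the operation runs on batched tensors inside the model pipeline. The idea is that padding positions (mask value 0) are excluded from the average, so padded and unpadded inputs produce the same sentence vector.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for j, value in enumerate(vector):
                summed[j] += value
    count = max(count, 1)  # guard against an all-zero mask
    return [total / count for total in summed]


# Two real tokens followed by one padding token (mask 0).
token_embeddings = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]

sentence_embedding = mean_pool(token_embeddings, attention_mask)
print(sentence_embedding)  # [2.0, 3.0]
```

The padding vector `[100.0, 100.0]` has no effect on the result, which is the purpose of applying the mask before averaging.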

Core Capabilities

Sentence and paragraph embedding generation
Semantic similarity computation
Text clustering and classification
Information retrieval tasks
Cross-sentence relationship modeling

Frequently Asked Questions

Q: What makes this model unique?

Two things distinguish this model: fine-tuning on more than 1 billion sentence pairs from diverse sources, and the MPNet architecture it is built on. Together they yield general-purpose sentence embeddings that perform well across a range of tasks.

Q: What are the recommended use cases?

The model is suited to semantic search, document clustering, similarity comparison, and other tasks that depend on the meaning of text. It is particularly effective for comparing or matching sentences and short paragraphs.
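
As a rough illustration of the semantic search use case, the sketch below ranks a small corpus against a query by cosine similarity. The 3-dimensional vectors are made-up stand-ins for the model's 768-dimensional embeddings, and `rank_by_similarity` is a hypothetical helper, not part of any library.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_embedding, corpus_embeddings):
    """Return corpus indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_embedding, e) for e in corpus_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Made-up embeddings; real ones would come from encoding text with the model.
query_embedding = [1.0, 0.0, 0.0]
corpus_embeddings = [
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.7, 0.7, 0.0],  # partially aligned with the query
]

ranking = rank_by_similarity(query_embedding, corpus_embeddings)
print(ranking)  # [0, 2, 1]
```

For larger corpora the same ranking is usually computed with vectorized matrix operations or an approximate nearest-neighbor index instead of a Python loop.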
