stella_en_1.5B_v5

Maintained by: dunzhang

Property         Value
Parameter Count  1.54B parameters
Model Type       Sentence Transformer
License          MIT
Paper            MRL (Matryoshka Representation Learning) paper

What is stella_en_1.5B_v5?

stella_en_1.5B_v5 is an advanced sentence embedding model built on Alibaba's GTE models (gte-large-en-v1.5 and gte-Qwen2-1.5B-instruct), designed specifically for semantic search and text similarity tasks. It uses Matryoshka Representation Learning (MRL) to produce embeddings at multiple dimensionalities (from 512 up to 8192), with 1024 dimensions recommended as the best trade-off for most applications.
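
As an illustration of MRL-style dimension selection, here is a minimal sketch using the truncate_dim option available in recent sentence-transformers releases. The dimension value is just an example, and plain truncation is an assumption for this sketch; the official model card may instead expose the alternative sizes through dedicated projection heads.

```python
# Minimal sketch (assumes a recent sentence-transformers release with
# the truncate_dim option). Matryoshka-style models concentrate most
# information in the leading embedding dimensions, so a smaller vector
# can be taken as a prefix of the full one.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "dunzhang/stella_en_1.5B_v5",
    trust_remote_code=True,   # the checkpoint ships custom modeling code
    truncate_dim=1024,        # assumed target size; the card lists 512-8192
)

embeddings = model.encode(["What is semantic search?"])
print(embeddings.shape)  # expected: (1, 1024)
```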

Implementation Details

The model uses two main prompts for different tasks: s2p (sentence-to-passage) for retrieval and s2s (sentence-to-sentence) for similarity. It supports both the SentenceTransformers and Hugging Face Transformers libraries for text encoding, with a maximum sequence length of 512 tokens, and achieves strong scores across the MTEB benchmark suite on tasks including classification, clustering, and retrieval. A usage sketch follows the list below.

  • Multiple dimension support: 512, 768, 1024, 2048, 4096, 6144, and 8192
  • Simplified prompt system for general tasks
  • Built on established architectures (GTE-large and Qwen2)
  • Optimized for a 512-token maximum sequence length
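
The sketch below shows s2p retrieval encoding with the SentenceTransformers API. The prompt name "s2p_query" follows the naming convention described on the model card, and model.similarity assumes a sentence-transformers version that provides the built-in similarity helper.

```python
# s2p (sentence-to-passage) retrieval sketch. Queries are encoded with
# the s2p query prompt; passages are encoded without any prompt.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dunzhang/stella_en_1.5B_v5", trust_remote_code=True)

queries = ["What are some ways to reduce stress?"]
docs = [
    "Regular exercise, meditation, and adequate sleep all help reduce stress.",
    "Green tea contains antioxidants that may benefit overall health.",
]

query_embeddings = model.encode(queries, prompt_name="s2p_query")
doc_embeddings = model.encode(docs)

# One similarity row per query, one column per passage.
similarities = model.similarity(query_embeddings, doc_embeddings)
print(similarities)
```

For symmetric tasks such as sentence similarity, the s2s prompt would be used on both inputs instead.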

Core Capabilities

  • Semantic search and retrieval
  • Text similarity comparison
  • Document clustering (see the sketch after this list)
  • Classification tasks
  • Pair classification
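
As a concrete illustration of the clustering use case, the sketch below groups documents with K-Means over their embeddings; scikit-learn and the two-cluster choice are assumptions made for the example, not part of the model itself.

```python
# Hypothetical clustering example: embed documents, then group them
# with K-Means. scikit-learn is an assumed extra dependency here.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer("dunzhang/stella_en_1.5B_v5", trust_remote_code=True)

docs = [
    "The stock market rallied after strong quarterly earnings.",
    "Central banks signalled further interest-rate cuts.",
    "Astronomers discovered a new exoplanet around a red dwarf.",
    "A telescope captured detailed images of a distant galaxy.",
]

embeddings = model.encode(docs)
labels = KMeans(n_clusters=2, n_init="auto").fit_predict(embeddings)
print(labels)  # e.g. [0, 0, 1, 1]: finance vs. astronomy topics
```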

Frequently Asked Questions

Q: What makes this model unique?

Its Matryoshka Representation Learning setup, with multiple dimension options and a simplified prompt system, makes it highly versatile while maintaining strong performance. The 1024-dimension version achieves nearly identical performance to the 8192-dimension version.

Q: What are the recommended use cases?

The model excels in semantic search, document retrieval, and text similarity tasks. It's particularly effective for applications requiring robust sentence embeddings with flexible dimensionality options.