stsb-roberta-large

Maintained by: sentence-transformers

Property             Value
Parameter Count      355M
License              Apache 2.0
Paper                Sentence-BERT Paper
Embedding Dimension  1024

What is stsb-roberta-large?

stsb-roberta-large is a deprecated sentence embedding model built on the RoBERTa architecture that maps sentences and paragraphs to a 1024-dimensional dense vector space. While it was historically used for semantic tasks such as clustering and similarity search, it is now considered outdated because it produces lower-quality embeddings than modern alternatives.

Implementation Details

The model uses a RoBERTa-large backbone followed by a mean pooling layer, and it processes text with a maximum sequence length of 128 tokens; its architecture comprises both transformer and pooling components. It can be loaded through either the sentence-transformers library or HuggingFace Transformers, as shown in the sketches after the list below.

  • Built on RoBERTa architecture with 355M parameters
  • Produces 1024-dimensional embeddings
  • Implements mean pooling strategy
  • Supports multiple framework implementations (PyTorch, TensorFlow, JAX)
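
A minimal usage sketch with the sentence-transformers library (assumes `pip install sentence-transformers`; the input sentences are placeholder examples):

```python
from sentence_transformers import SentenceTransformer

# Load the model; weights are fetched from the Hugging Face Hub on first use.
model = SentenceTransformer("sentence-transformers/stsb-roberta-large")

sentences = [
    "This is an example sentence.",
    "Each sentence is mapped to a 1024-dimensional vector.",
]

# encode() runs the transformer and applies mean pooling over token embeddings.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 1024)
```

The same transformer-plus-pooling pipeline can be reproduced with HuggingFace Transformers by applying mean pooling by hand. This is a sketch of the standard pattern for sentence-transformers checkpoints, not an official snippet:

```python
import torch
from transformers import AutoModel, AutoTokenizer

def mean_pooling(model_output, attention_mask):
    """Average token embeddings, ignoring padding positions."""
    token_embeddings = model_output[0]  # token-level hidden states
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/stsb-roberta-large")
model = AutoModel.from_pretrained("sentence-transformers/stsb-roberta-large")

# Truncate to the model's 128-token maximum sequence length.
encoded = tokenizer(
    ["This is an example sentence."],
    padding=True, truncation=True, max_length=128, return_tensors="pt",
)

with torch.no_grad():
    model_output = model(**encoded)

embeddings = mean_pooling(model_output, encoded["attention_mask"])
print(embeddings.shape)  # torch.Size([1, 1024])
```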

Core Capabilities

  • Sentence and paragraph embedding generation
  • Semantic similarity computation (see the sketch after this list)
  • Text clustering applications
  • Semantic search functionality
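
As an illustration of the similarity and search capabilities above, the following sketch scores a query against a small corpus with cosine similarity; the query and corpus strings are made-up examples:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/stsb-roberta-large")

query = "How do I bake bread?"
corpus = [
    "A recipe for sourdough loaves.",
    "The stock market closed higher today.",
]

query_emb = model.encode(query, convert_to_tensor=True)
corpus_emb = model.encode(corpus, convert_to_tensor=True)

# Cosine similarity between the query and each corpus sentence;
# higher scores indicate closer semantic meaning.
scores = util.cos_sim(query_emb, corpus_emb)
print(scores)
```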

Frequently Asked Questions

Q: What makes this model unique?

This model was one of the early implementations of sentence-transformers using RoBERTa architecture. However, it's now deprecated in favor of more modern alternatives that provide better quality embeddings.

Q: What are the recommended use cases?

Because the model is deprecated, newer models from the SBERT.net collection are recommended instead. If you are still using it, it remains adequate for basic sentence-similarity and semantic-search tasks where state-of-the-art quality is not critical; a migration sketch follows.
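
Migrating usually amounts to swapping the model name. The sketch below assumes all-mpnet-base-v2, one of the general-purpose models recommended on SBERT.net, is a suitable replacement for your task; note that its embeddings are 768-dimensional rather than 1024, so any stored vectors must be re-encoded:

```python
from sentence_transformers import SentenceTransformer

# Assumed replacement model; choose per the guidance on SBERT.net.
model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

embeddings = model.encode(["Migrating away from stsb-roberta-large."])
print(embeddings.shape)  # (1, 768), a different dimensionality than before
```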
