# multi-sentence-BERTino
| Property | Value |
|---|---|
| Parameter Count | 67.6M |
| Model Type | Sentence Transformer |
| Architecture | DistilBERT-based |
| License | MIT |
| Language | Italian |
## What is multi-sentence-BERTino?
multi-sentence-BERTino is a specialized sentence transformer model designed for Italian language processing. Built upon the indigo-ai/BERTino architecture, it has been specifically trained on Italian datasets including mmarco (200K samples) and stsb to generate high-quality 768-dimensional sentence embeddings.
## Implementation Details
The model pairs a DistilBERT transformer with a mean pooling layer. It was trained with multiple loss functions (TripletLoss, CosineSimilarityLoss, and CachedMultipleNegativesRankingLoss), using the AdamW optimizer with a WarmupLinear learning-rate schedule.
- Maximum sequence length: 512 tokens
- Embedding dimension: 768
- Trained with batch size: 16
- Uses mean pooling strategy
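The mean pooling step listed above can be sketched in NumPy. This is a toy illustration of the strategy, not the model's actual implementation: padding tokens are masked out before averaging, so sentence length does not skew the embedding.

```python
import numpy as np

def mean_pooling(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    token_embeddings: (batch, seq_len, hidden) float array
    attention_mask:   (batch, seq_len) array of 0/1
    """
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                    # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                    # avoid div-by-zero
    return summed / counts

# Toy batch: 2 sentences, 4 token positions, 768-dim hidden states.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 4, 768))
mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])  # second sentence has 2 real tokens
emb = mean_pooling(tokens, mask)
print(emb.shape)  # (2, 768)
```

Because padded positions are zeroed out and the divisor counts only real tokens, each sentence embedding is the true mean of its token vectors.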
## Core Capabilities
- Sentence and paragraph embedding generation
- Semantic similarity computation
- Text clustering support
- Semantic search functionality
- Cross-sentence comparison in Italian
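The similarity and search capabilities above reduce to cosine similarity between embedding vectors. A self-contained sketch with stand-in vectors (in real usage these would come from `model.encode`):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows of a and rows of b."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T

# Stand-in embeddings; in practice these come from model.encode(...).
query = np.array([[1.0, 0.0, 0.0]])
corpus = np.array([
    [0.9, 0.1, 0.0],   # close to the query
    [0.0, 1.0, 0.0],   # orthogonal
    [-1.0, 0.0, 0.0],  # opposite direction
])
scores = cosine_similarity(query, corpus)[0]
best = int(scores.argmax())  # index of the most similar corpus entry
print(best)  # 0
```

Ranking a corpus by these scores is the core of a semantic search system; clustering and cross-sentence comparison use the same similarity matrix.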
## Frequently Asked Questions
**Q: What makes this model unique?**
The model stands out for its specialized Italian-language training on an efficient DistilBERT backbone. The multi-task training regime, which combines triplet, cosine-similarity, and ranking losses, makes it particularly robust on sentence-similarity tasks.
**Q: What are the recommended use cases?**
The model excels in applications requiring semantic understanding of Italian text, including: semantic search systems, document clustering, text similarity analysis, and automated text comparison tasks. It's particularly well-suited for production environments needing efficient sentence embedding generation.
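The document-clustering use case can be illustrated with a minimal k-means over embedding vectors. This is a hypothetical sketch on synthetic stand-in embeddings, not a production pipeline:

```python
import numpy as np

def kmeans(X: np.ndarray, k: int, iters: int = 20) -> np.ndarray:
    """Tiny k-means: returns one cluster label per row of X."""
    # Naive deterministic init: spread the initial centers across the data.
    centers = X[:: max(1, len(X) // k)][:k].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Distance from every point to every center, then assign nearest.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels

# Synthetic "embeddings": two well-separated topic clusters.
rng = np.random.default_rng(42)
docs = np.vstack([
    rng.normal(0.0, 0.1, size=(5, 8)),  # topic A
    rng.normal(5.0, 0.1, size=(5, 8)),  # topic B
])
labels = kmeans(docs, k=2)
```

With real data, the rows of `docs` would be `model.encode(documents)`, and documents sharing a label would land in the same semantic group.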