electra-small-nli-sts
| Property | Value |
|---|---|
| License | Apache-2.0 |
| Language | Korean |
| Vector Dimension | 256 |
| Framework | PyTorch, Transformers |
What is electra-small-nli-sts?
electra-small-nli-sts is a sentence transformer model specialized for Korean. It maps sentences and paragraphs to 256-dimensional dense vectors, which makes it well suited to tasks such as semantic search and clustering. Built on the ELECTRA architecture, the model was fine-tuned on NLI (Natural Language Inference) and STS (Semantic Textual Similarity) tasks.
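For a sense of the basic API, a minimal sentence-transformers sketch might look like the following; the model identifier is a placeholder for the actual Hugging Face repository path.

```python
from sentence_transformers import SentenceTransformer

# Placeholder model ID -- substitute the actual Hugging Face repository path.
model = SentenceTransformer("electra-small-nli-sts")

sentences = [
    "안녕하세요, 반갑습니다.",   # "Hello, nice to meet you."
    "오늘 날씨가 좋네요.",       # "The weather is nice today."
]

# Each sentence is mapped to a 256-dimensional dense vector.
embeddings = model.encode(sentences)
print(embeddings.shape)  # expected: (2, 256)
```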
Implementation Details
The model uses a two-component architecture: an ELECTRA transformer followed by a pooling layer. It was trained with MultipleNegativesRankingLoss at a batch size of 64 and uses mean pooling to produce sentence embeddings. Training included warmup steps and used the AdamW optimizer with a learning rate of 2e-5.
- Maximum sequence length: 512 tokens
- Pooling configuration: Mean tokens pooling enabled
- Training epochs: 1 with evaluation steps every 903 iterations
- Optimization: AdamW with weight decay of 0.01
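The following is a minimal sketch of how such a setup could be reproduced with the sentence-transformers training API. The base checkpoint (monologg/koelectra-small-v3-discriminator), the toy training pairs, and the warmup count are assumptions; the batch size, loss, pooling mode, learning rate, weight decay, and evaluation interval mirror the values listed above.

```python
from sentence_transformers import SentenceTransformer, InputExample, losses, models
from torch.utils.data import DataLoader

# Two-module architecture: ELECTRA encoder followed by mean pooling.
# The base checkpoint name is an assumption; KoELECTRA-small has a
# 256-dimensional hidden size, which matches the reported vector dimension.
word_embedding_model = models.Transformer(
    "monologg/koelectra-small-v3-discriminator", max_seq_length=512
)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Toy premise/hypothesis pairs; the real run used a Korean NLI corpus.
train_examples = [
    InputExample(texts=["한 남자가 기타를 치고 있다.", "남자가 악기를 연주한다."]),
    InputExample(texts=["아이가 공원에서 뛰어논다.", "아이가 야외에서 놀고 있다."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)

# In-batch negatives: every other sentence in the batch serves as a negative.
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=100,                 # exact warmup count is not documented
    optimizer_params={"lr": 2e-5},    # AdamW is the default optimizer
    weight_decay=0.01,
    evaluation_steps=903,             # matches the reported evaluation interval
)
```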
Core Capabilities
- Sentence and paragraph embedding generation
- Semantic similarity computation
- Clustering support
- Cross-lingual content matching
- Support for both sentence-transformers and Hugging Face Transformers implementations (see the usage sketch below)
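For the plain Hugging Face Transformers route, mean pooling over token embeddings has to be applied manually; a minimal sketch (the model identifier again being a placeholder) could look like this:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Placeholder model ID -- substitute the actual repository path.
tokenizer = AutoTokenizer.from_pretrained("electra-small-nli-sts")
model = AutoModel.from_pretrained("electra-small-nli-sts")

sentences = ["서울은 한국의 수도이다.", "부산은 항구 도시이다."]
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    output = model(**encoded)

# Mean pooling: average token embeddings, ignoring padding positions.
mask = encoded["attention_mask"].unsqueeze(-1).float()
embeddings = (output.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # expected: torch.Size([2, 256])
```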
Frequently Asked Questions
Q: What makes this model unique?
Its distinguishing feature is a compact ELECTRA backbone optimized for Korean, with NLI and STS fine-tuning that maintains strong semantic performance. The 256-dimensional output also keeps it lightweight for production deployments.
Q: What are the recommended use cases?
The model is ideal for semantic search applications, document clustering, similarity analysis, and content recommendation systems in Korean language contexts. It's particularly suitable for applications requiring efficient sentence-level semantic understanding.
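As one illustration of the semantic-search use case, corpus documents can be embedded once and then ranked against a query by cosine similarity; this sketch uses a placeholder model identifier and made-up sentences.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("electra-small-nli-sts")  # placeholder model ID

corpus = [
    "이 식당은 불고기가 맛있다.",   # "This restaurant's bulgogi is delicious."
    "내일은 비가 올 예정이다.",     # "It is going to rain tomorrow."
    "그 영화는 정말 재미있었다.",   # "That movie was really fun."
]
query = "맛있는 한식당을 추천해줘."  # "Recommend a tasty Korean restaurant."

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank corpus entries by cosine similarity to the query.
scores = util.cos_sim(query_embedding, corpus_embeddings)[0]
best = scores.argmax().item()
print(corpus[best], scores[best].item())
```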