ko-sbert-nli

Maintained by jhgan

  • Author: jhgan
  • Downloads: 28,533
  • Paper: Research Paper
  • Embedding Dimension: 768
  • Performance: 82.24 Cosine Pearson on KorSTS

What is ko-sbert-nli?

ko-sbert-nli is a specialized Korean language model based on the SBERT (Sentence-BERT) architecture, designed specifically for generating meaningful sentence embeddings. It transforms Korean text into 768-dimensional dense vector representations, making it particularly effective for semantic similarity tasks and natural language understanding applications.
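As a quick illustration (assuming the model is published under the Hugging Face ID jhgan/ko-sbert-nli and loaded with the sentence-transformers library), encoding returns one 768-dimensional vector per sentence:

```python
# Assumes the Hugging Face ID jhgan/ko-sbert-nli and the sentence-transformers library;
# the example sentences are illustrative, not from the model card.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jhgan/ko-sbert-nli")
sentences = [
    "한 남자가 음식을 먹는다.",        # "A man is eating food."
    "한 남자가 빵 한 조각을 먹는다.",  # "A man is eating a piece of bread."
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768): one 768-dimensional vector per sentence
```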

Implementation Details

The model utilizes a BERT-based architecture with mean pooling and is trained using MultipleNegativesRankingLoss with a scale factor of 20.0. It employs the AdamW optimizer with a learning rate of 2e-05 and implements a WarmupLinear scheduler with 889 warmup steps.

  • Maximum sequence length: 128 tokens
  • Pooling strategy: mean pooling over token embeddings
  • Training dataset: KorNLI
  • Evaluation dataset: KorSTS
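
The model card does not include a training script, so the following is only a minimal sketch of a comparable setup using the sentence-transformers library. The base checkpoint (klue/bert-base) and the toy training pairs are assumptions; the loss, scale, pooling, sequence length, learning rate, and warmup steps follow the numbers above.

```python
# Minimal sketch of a comparable training setup (not the author's exact script).
# Assumptions: base checkpoint klue/bert-base and the toy KorNLI-style pairs below.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses, InputExample

# BERT encoder capped at 128 tokens, followed by mean pooling (as listed above)
word_embedding = models.Transformer("klue/bert-base", max_seq_length=128)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), pooling_mode="mean")
model = SentenceTransformer(modules=[word_embedding, pooling])

# KorNLI-style (premise, entailed hypothesis) pairs act as positives;
# MultipleNegativesRankingLoss treats the other in-batch pairs as negatives.
train_examples = [
    InputExample(texts=["한 남자가 음식을 먹는다.", "남자가 식사를 하고 있다."]),
    InputExample(texts=["아이가 공원에서 뛰어논다.", "어린이가 야외에서 놀고 있다."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)

# AdamW (the fit default) at lr 2e-5, WarmupLinear scheduler, 889 warmup steps
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=889,
    scheduler="WarmupLinear",
    optimizer_params={"lr": 2e-5},
)
```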

Core Capabilities

  • Sentence embedding generation for Korean text
  • Semantic similarity computation
  • Cross-sentence relationship understanding
  • Support for clustering and semantic search applications
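
To make the similarity and semantic-search capabilities concrete, here is a small sketch using the library's semantic_search helper; the model ID and example sentences are illustrative assumptions:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("jhgan/ko-sbert-nli")

corpus = [
    "오늘은 하늘이 맑고 화창하다.",   # "The sky is clear and sunny today."
    "나는 어제 영화를 봤다.",         # "I watched a movie yesterday."
    "밖에 비가 쏟아지고 있다.",       # "It is pouring rain outside."
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Rank corpus sentences by cosine similarity to the query
query_embedding = model.encode("날씨가 좋다", convert_to_tensor=True)  # "The weather is nice"
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))
```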

Frequently Asked Questions

Q: What makes this model unique?

This model is specifically optimized for Korean language understanding: it is trained on the KorNLI dataset and evaluated on KorSTS, where it reaches a Cosine Pearson correlation of 82.24.

Q: What are the recommended use cases?

The model excels at semantic similarity comparison, document clustering, information retrieval, and semantic search over Korean-language content.
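
As one concrete example of the clustering use case (scikit-learn, the cluster count, and the sample documents are illustrative choices, not part of the model card):

```python
# Hedged sketch: clustering Korean sentences via the model's embeddings.
# Assumes scikit-learn is installed; KMeans and n_clusters=2 are illustrative.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer("jhgan/ko-sbert-nli")
docs = [
    "주가가 큰 폭으로 상승했다.",       # finance: "Stock prices rose sharply."
    "증시가 강세를 보이고 있다.",       # finance: "The market is showing strength."
    "새 영화가 이번 주에 개봉했다.",    # film: "A new movie opened this week."
    "극장에서 신작이 상영 중이다.",     # film: "A new release is playing in theaters."
]
embeddings = model.encode(docs)

# Group semantically similar documents by their embedding vectors
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
for label, doc in zip(labels, docs):
    print(label, doc)
```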
