klue-sroberta-base-continue-learning-by-mnr

Maintained By: bespin-global


  • Parameter Count: 111M parameters
  • License: CC-BY-4.0
  • Language: Korean
  • Framework: PyTorch, Sentence-Transformers
  • Model Type: Sentence Similarity

What is klue-sroberta-base-continue-learning-by-mnr?

This is a specialized Korean sentence-embedding model developed by Bespin Global that maps sentences and paragraphs to a 768-dimensional dense vector space. The model follows a continue-learning approach, training on the KLUE/NLI and KLUE/STS datasets in sequence, and reaches a Pearson correlation of 0.8901 on the cosine-similarity STS evaluation.

Implementation Details

The model implements a two-phase training approach: it is first trained on the KLUE/NLI dataset with MultipleNegativesRankingLoss, then fine-tuned on the KLUE/STS dataset with CosineSimilarityLoss. The architecture is RoBERTa-base with a mean-pooling strategy for sentence embeddings.
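The first training phase uses in-batch negatives: for each (anchor, positive) pair, every other positive in the batch serves as a negative, and a cross-entropy loss is taken over scaled cosine similarities. A minimal numpy sketch of that objective (the function name and scale value here are illustrative, not the library's code):

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """Sketch of the MultipleNegativesRankingLoss objective: each anchor's
    paired positive is the target class, and the other positives in the
    batch act as in-batch negatives."""
    # L2-normalize so dot products are cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * a @ p.T  # (batch, batch) similarity matrix
    # softmax cross-entropy with the diagonal as the correct class
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

When anchor and positive embeddings agree, the diagonal dominates each row and the loss approaches zero; mismatched pairs drive it up, which is what pushes paired sentences together in the embedding space.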

  • Utilizes sentence-transformers framework for efficient sentence embedding generation
  • Implements mean pooling on token embeddings with attention mask consideration
  • Supports maximum sequence length of 512 tokens
  • Achieves a 0.8901 Pearson correlation on the cosine-similarity STS evaluation
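The mean-pooling step described above can be sketched as follows; in the real pipeline the token embeddings come from the RoBERTa encoder, so the shapes here are illustrative stand-ins:

```python
import numpy as np

def mean_pooling(token_embeddings, attention_mask):
    """Mean pooling with attention-mask consideration: average only the
    real (non-padding) token vectors to get one sentence embedding."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid divide-by-zero
    return summed / counts
```

The mask keeps padding tokens from diluting the average, which matters whenever sentences of different lengths are batched together.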

Core Capabilities

  • Sentence similarity computation
  • Dense vector embeddings for Korean text
  • Semantic search functionality
  • Clustering applications
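The semantic-search capability reduces to ranking corpus embeddings by cosine similarity to a query embedding. A minimal sketch, assuming the 768-dimensional vectors have already been produced by the model (tiny 2-d vectors stand in for them here):

```python
import numpy as np

def cosine_search(query_vec, corpus_vecs, top_k=2):
    """Return the indices and similarities of the top_k corpus embeddings
    closest to the query embedding under cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    sims = c @ q                      # cosine similarity per corpus entry
    order = np.argsort(-sims)[:top_k]  # highest similarity first
    return order, sims[order]
```

Clustering works the same way: once sentences are fixed-length vectors, any standard algorithm (k-means, agglomerative, etc.) can operate on them directly.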

Frequently Asked Questions

Q: What makes this model unique?

The model's unique continue-learning approach combining NLI and STS training, specifically optimized for Korean language understanding, sets it apart. Its impressive evaluation metrics and specialized training on KLUE datasets make it particularly effective for Korean text similarity tasks.

Q: What are the recommended use cases?

The model excels in semantic search applications, document similarity comparison, clustering of Korean text, and any task requiring semantic understanding of Korean language sentences. It's particularly suitable for production environments requiring robust sentence embeddings.
