nb-sbert-base

Maintained by: NbAiLab

Property          Value
Parameter Count   178M
License           Apache 2.0
Architecture      BERT-based Sentence Transformer
Language          Norwegian/English

What is nb-sbert-base?

nb-sbert-base is a sentence transformer model for Norwegian, with cross-lingual capabilities between Norwegian and English. Built on nb-bert-base, it maps sentences and paragraphs to a 768-dimensional dense vector space, so that semantically similar texts end up close together and can be compared with a similarity measure such as cosine similarity.
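As a rough sketch of how such embeddings might be obtained and compared: the snippet below assumes the sentence-transformers package is installed and that the model is published on the Hugging Face Hub as NbAiLab/nb-sbert-base (an assumption inferred from the maintainer name above). The helper names embed and compare are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed(sentences, model_name="NbAiLab/nb-sbert-base"):
    """Encode sentences into dense vectors (one row per sentence).

    Assumption: model_name is the Hub identifier of this model.
    The import is deferred so the helper above works without the package.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(sentences)


def compare(norwegian, english):
    """Cosine similarity between a Norwegian and an English sentence."""
    vec_no, vec_en = embed([norwegian, english])
    return cosine_similarity(vec_no.tolist(), vec_en.tolist())
```

Calling compare("Dette er en norsk gutt", "This is a Norwegian boy") would download the model on first use and return a score between -1 and 1; a translation pair like this one should score higher than an unrelated pair.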

Implementation Details

The model uses the SentenceTransformers framework and was trained on a machine-translated version of the MNLI dataset. It applies mean pooling over token embeddings and reports a Pearson correlation of 0.8275 on similarity tasks.
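Mean pooling can be illustrated with a small standalone sketch. It uses toy 2-dimensional token vectors rather than real 768-dimensional model outputs, and the function name mean_pool is hypothetical; the framework performs this step internally.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1; padding is ignored."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [total / count for total in totals]


# Two real tokens followed by one padding token (mask 0).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 3.0]
```

The padding vector does not influence the result, which is why the attention mask is part of the pooling step.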

  • Trained using MultipleNegativesRankingLoss with a scale of 20.0
  • Uses cosine similarity as the similarity function
  • Trained with a batch size of 32
  • Applies mean pooling to token embeddings to produce the sentence vector
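The training objective listed above can be sketched in plain Python. This is a simplified illustration of the idea behind MultipleNegativesRankingLoss, not the library's implementation: each anchor should score highest against its own positive, and the other positives in the batch serve as negatives. The toy vectors and function names are assumptions for illustration.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over scaled cosine scores.

    For anchor i, positives[i] is the correct match and every other
    positive in the batch acts as a negative.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = multiple_negatives_ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = multiple_negatives_ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)  # True
```

In the sentence-transformers library, the corresponding loss is configured with scale and similarity_fct parameters; the scale of 20.0 sharpens the softmax so that small differences in cosine similarity produce a clear training signal.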

Core Capabilities

  • Semantic similarity computation between sentences
  • Cross-lingual sentence matching (Norwegian-English)
  • Keyword extraction using KeyBERT integration
  • Topic modeling with BERTopic
  • Vector-based similarity search
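Vector-based similarity search, the last capability above, reduces to ranking stored vectors by cosine similarity to a query vector. The sketch below uses toy 2-dimensional vectors in place of real sentence embeddings, and the function name top_k is hypothetical.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def top_k(query_vector, corpus_vectors, k=3):
    """Return (index, score) pairs for the k corpus vectors most similar to the query."""
    query = normalize(query_vector)
    scored = []
    for index, vector in enumerate(corpus_vectors):
        candidate = normalize(vector)
        # Dot product of unit vectors equals cosine similarity.
        scored.append((index, sum(a * b for a, b in zip(query, candidate))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


corpus = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
hits = top_k([1.0, 0.1], corpus, k=2)
print([index for index, _ in hits])  # [0, 1]
```

In practice the corpus vectors would be the model's embeddings of the documents, computed once and stored. For larger collections, the sentence-transformers utility semantic_search or a dedicated vector index is likely a better fit than this linear scan.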

Frequently Asked Questions

Q: What makes this model unique?

The model handles both Norwegian and English content while scoring well on semantic similarity tasks (Pearson correlation of 0.8275), which makes it useful for Nordic NLP applications. The same embeddings also support downstream tasks such as keyword extraction and topic modeling.

Q: What are the recommended use cases?

The model is suited to semantic search, document clustering, cross-lingual information retrieval, and automated keyword extraction. It is particularly useful for organizations working with Norwegian-English bilingual content or needing semantic text analysis in Norwegian.
