indonesian-sbert-large

Maintained By
naufalihsan

  • Author: naufalihsan
  • Downloads: 83,836
  • Embedding Dimension: 1024
  • Framework: PyTorch + Transformers

What is indonesian-sbert-large?

indonesian-sbert-large is a specialized sentence transformer model designed for processing Indonesian text. It's built on the BERT architecture and maps sentences and paragraphs to a high-dimensional vector space (1024 dimensions), making it particularly effective for semantic search, clustering, and similarity analysis tasks in Indonesian language applications.
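The snippet below is a minimal usage sketch via the sentence-transformers library, assuming the model is published on the Hugging Face Hub under the ID naufalihsan/indonesian-sbert-large; the example sentences are illustrative only.

```python
# Minimal sketch: embed Indonesian sentences with sentence-transformers.
# Assumes the Hub ID "naufalihsan/indonesian-sbert-large".
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("naufalihsan/indonesian-sbert-large")

sentences = [
    "Cuaca hari ini sangat cerah.",           # "The weather today is very sunny."
    "Langit tampak biru dan tidak berawan.",  # "The sky looks blue and cloudless."
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # expected: (2, 1024) -- one 1024-dimensional vector per sentence
```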

Implementation Details

The model combines a BERT transformer with a mean-pooling layer. It was trained with CosineSimilarityLoss and the AdamW optimizer, using a learning rate of 2e-05 and 144 warmup steps over 4 epochs. The model supports a maximum sequence length of 128 tokens and can be used through both the sentence-transformers and HuggingFace Transformers interfaces; a mean-pooling sketch for the plain Transformers path follows the list below.

  • Trained with a batch size of 16 using a RandomSampler
  • Uses a mean-pooling strategy for sentence embeddings
  • Applies weight decay of 0.01 and a maximum gradient norm of 1
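As referenced above, here is a hedged sketch of the plain HuggingFace Transformers path with attention-mask-aware mean pooling; the model ID and sentences are assumptions for illustration.

```python
# Sketch of the HuggingFace Transformers interface with mean pooling.
# The Hub ID "naufalihsan/indonesian-sbert-large" is an assumption.
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "naufalihsan/indonesian-sbert-large"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentences = ["Saya suka membaca buku.", "Membaca adalah hobi saya."]
encoded = tokenizer(sentences, padding=True, truncation=True,
                    max_length=128, return_tensors="pt")

with torch.no_grad():
    output = model(**encoded)

# Mean pooling: average token embeddings, weighting out padding via the attention mask.
token_embeddings = output.last_hidden_state             # (batch, seq_len, 1024)
mask = encoded["attention_mask"].unsqueeze(-1).float()  # (batch, seq_len, 1)
sentence_embeddings = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
print(sentence_embeddings.shape)  # (2, 1024)
```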

Core Capabilities

  • Sentence and paragraph embedding generation
  • Semantic similarity computation
  • Support for clustering operations
  • Efficient text feature extraction
  • Cross-lingual transfer potential for Indonesian language processing
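To illustrate the semantic-similarity capability, the sketch below scores an Indonesian query against two candidate sentences with cosine similarity; the query, corpus, and model ID are illustrative assumptions.

```python
# Hedged sketch: rank candidate sentences against a query by cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("naufalihsan/indonesian-sbert-large")  # assumed Hub ID

query = "Bagaimana cara memesan tiket kereta?"  # "How do I book a train ticket?"
corpus = [
    "Tiket kereta dapat dipesan melalui aplikasi resmi.",  # on-topic
    "Resep nasi goreng yang mudah dan cepat.",             # off-topic
]

query_emb = model.encode(query, convert_to_tensor=True)
corpus_emb = model.encode(corpus, convert_to_tensor=True)

scores = util.cos_sim(query_emb, corpus_emb)  # shape (1, len(corpus))
print(scores)  # the on-topic sentence should score noticeably higher
```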

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its specialization in Indonesian language processing, offering high-dimensional (1024D) sentence embeddings that are particularly well-suited for semantic search and similarity tasks in Indonesian text applications.

Q: What are the recommended use cases?

The model is ideal for applications requiring semantic search, document clustering, similarity analysis, and text classification in Indonesian language contexts. It's particularly effective for tasks requiring nuanced understanding of sentence meanings and relationships.
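As a sketch of the clustering use case, the example below groups a few Indonesian sentences by topic with k-means over their embeddings; the sentences, the cluster count, and the scikit-learn dependency are assumptions, not part of the model card.

```python
# Illustrative clustering sketch: embed sentences, then group them with k-means.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer("naufalihsan/indonesian-sbert-large")  # assumed Hub ID

sentences = [
    "Harga saham naik tajam hari ini.",          # finance
    "Pasar modal menunjukkan tren positif.",     # finance
    "Tim nasional menang dalam pertandingan.",   # sports
    "Pertandingan sepak bola berakhir imbang.",  # sports
]

embeddings = model.encode(sentences)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)

for sentence, label in zip(sentences, labels):
    print(label, sentence)  # sentences on the same topic should share a cluster
```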
