Gemma2 9B CPT Sahabat-AI v1 Instruct
Property | Value |
---|---|
Parameter Count | 9.24B |
Model Type | Decoder |
Languages | English, Indonesian, Javanese, Sundanese |
License | Gemma Community License |
Context Length | 8192 tokens |
What is gemma2-9b-cpt-sahabatai-v1-instruct?
Sahabat-AI v1 Instruct is an advanced multilingual language model specifically designed for Indonesian languages and dialects. Co-initiated by GoTo Group and Indosat Ooredoo Hutchison, this model has been fine-tuned on an extensive dataset of 448,000 Indonesian instruction pairs, along with 96,000 Javanese and 98,000 Sundanese instruction pairs, plus 129,000 English instruction pairs.
Implementation Details
Built on the Gemma2 architecture, this model leverages advanced decoder technology and has been fine-tuned through a combination of full parameter tuning and on-policy alignment. The training process involved 4 hours of fine-tuning and 2 hours of alignment on 8x H100-80GB GPUs.
- Context length of 8192 tokens
- BF16 tensor type optimization
- Comprehensive multilingual capabilities
- State-of-the-art performance on regional language benchmarks
Core Capabilities
- Achieves 61.169% overall score on SEA HELM benchmark
- 62.6% performance on IndoMMLU evaluation
- Superior performance in Indonesian (64.154%), Javanese (64.439%), and Sundanese (54.913%) tasks
- Maintains strong English language capabilities with 33.67% average score on standard benchmarks
Frequently Asked Questions
Q: What makes this model unique?
This model stands out for its specialized optimization for Indonesian languages and dialects, while maintaining strong performance across multiple languages. It's particularly notable for achieving state-of-the-art results on regional language benchmarks like SEA HELM and IndoMMLU.
Q: What are the recommended use cases?
The model is ideal for applications requiring multilingual capabilities in Southeast Asian languages, particularly Indonesian, Javanese, and Sundanese. It's suitable for tasks like question answering, sentiment analysis, translation, and abstractive summarization in these languages.