BgGPT-Gemma-2-9B-IT-v1.0
| Property | Value |
|---|---|
| Parameter Count | 9.24B |
| Model Type | Causal decoder-only transformer |
| License | Gemma Terms of Use |
| Languages | Bulgarian, English |
| Base Model | google/gemma-2-9b-it |
What is BgGPT-Gemma-2-9B-IT-v1.0?
BgGPT-Gemma-2-9B-IT-v1.0 is a state-of-the-art language model developed by INSAIT Institute, designed to excel in both Bulgarian and English language tasks. Built upon Google's Gemma 2 9B architecture, it underwent extensive continued pre-training on approximately 100 billion tokens, roughly 85 billion of which are Bulgarian content.
Implementation Details
The model implements a Branch-and-Merge training strategy (presented at EMNLP'24) and draws on data sources including Bulgarian web crawls, Wikipedia, and specialized datasets. It operates in BF16 precision and supports both instruction-following and general language generation tasks.
- Continuously pre-trained on 100B tokens
- Implements Branch-and-Merge strategy
- Instruction-fine-tuned on real-world Bulgarian conversations
- Supports Hugging Face Transformers and GGML/llama.cpp implementations (see the loading sketch below)
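A minimal loading and generation sketch with Hugging Face Transformers, using the BF16 precision noted above. The repository ID is assumed from the model name; check the INSAIT Institute organization on Hugging Face for the exact identifier.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo ID, inferred from the model name.
MODEL_ID = "INSAIT-Institute/BgGPT-Gemma-2-9B-IT-v1.0"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # the model operates in BF16 precision
    device_map="auto",
)

# Gemma 2 instruction models ship a chat template; apply_chat_template
# wraps the conversation in the expected control tokens.
messages = [
    {"role": "user", "content": "Кога е основан Софийският университет?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```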
Core Capabilities
- Outperforms larger models in Bulgarian language tasks
- Maintains strong English language capabilities
- Excels in logical reasoning, mathematics, and knowledge testing
- Handles both standard benchmarks and specialized Bulgarian educational assessments
Frequently Asked Questions
Q: What makes this model unique?
Its Branch-and-Merge training strategy lets it excel in Bulgarian while retaining strong English capabilities, often outperforming much larger models such as Qwen 2.5 72B and Llama 3.1 70B on Bulgarian language tasks.
Q: What are the recommended use cases?
The model is ideal for Bulgarian-English bilingual applications, educational assessments, and general language understanding tasks. It performs particularly well in logical reasoning, mathematics, and knowledge-based applications.
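For CPU or lightweight deployments, the llama.cpp route mentioned under Implementation Details can serve a quantized build. A hedged sketch using llama-cpp-python; the `-GGUF` repository ID and quantization filename pattern are assumptions, not confirmed artifact names:

```python
from llama_cpp import Llama

# Assumed GGUF repo and quantization; the actual quantized files
# (if published) are listed in the model's Hugging Face repositories.
llm = Llama.from_pretrained(
    repo_id="INSAIT-Institute/BgGPT-Gemma-2-9B-IT-v1.0-GGUF",
    filename="*Q4_K_M.gguf",  # glob matched against files in the repo
    n_ctx=4096,
)

# The chat-completion helper applies the model's chat template.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Реши: 12 * 7 + 5 = ?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```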