ko-gemma-2-9b-it
| Property | Value |
|---|---|
| Parameter Count | 9.24B |
| Model Type | Instruction-tuned Language Model |
| Architecture | Gemma 2 |
| License | Gemma |
| Languages | Korean, English |
What is ko-gemma-2-9b-it?
ko-gemma-2-9b-it is a Korean-English language model developed by davidkim205, built on Google's gemma-2-9b-it. The model was fine-tuned for Korean language understanding and generation on a curated QA dataset of 1,851 samples. It performs strongly on Korean language tasks, achieving a KEval score of 7.52, which places it competitively among much larger models.
Implementation Details
The model's weights are stored in BF16 and can optionally be loaded with 4-bit quantization to reduce memory requirements. It is designed to work with the Transformers library and supports text-generation-inference endpoints.
- Base Model: google/gemma-2-9b-it
- Training Dataset: qa_ability_1851.jsonl
- Quantization Support: 4-bit loading capability
- Benchmark Performance: Strong results on Korean language evaluation metrics (KoBEST, Ko-TruthfulQA, Ko-MMLU)
Core Capabilities
- Bilingual processing in Korean and English
- Detailed explanation generation for complex queries
- Strong performance in various Korean language tasks (0.5150 accuracy on KoBEST)
- Efficient memory usage through quantization
- Comprehensive chat template support
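The chat template support listed above can be exercised through the standard Transformers `apply_chat_template` API. A hedged sketch, assuming the usual Gemma-family conventions (user/model turns only, no separate system role); the sample question is purely illustrative:

```python
from transformers import AutoTokenizer

MODEL_ID = "davidkim205/ko-gemma-2-9b-it"


def build_messages(question: str) -> list:
    # Gemma-family chat templates typically expect alternating
    # "user"/"model" turns and no separate "system" role.
    return [{"role": "user", "content": question}]


def format_prompt(question: str) -> str:
    # Renders the question through the tokenizer's built-in chat template
    # (downloads the tokenizer files on first use).
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    return tokenizer.apply_chat_template(
        build_messages(question),
        tokenize=False,
        add_generation_prompt=True,
    )


if __name__ == "__main__":
    # Illustrative Korean question: "Please briefly explain Hangul Day."
    print(format_prompt("한글날에 대해 간단히 설명해 주세요."))
```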
Frequently Asked Questions
Q: What makes this model unique?
A: This model stands out for its specialized optimization for Korean language tasks while maintaining English capabilities, achieving competitive performance against much larger models such as Qwen2-72B-Instruct and WizardLM-2-8x22B despite its smaller size.
Q: What are the recommended use cases?
A: The model is particularly well-suited for:
- Bilingual Korean-English applications
- Detailed explanation generation
- Question-answering systems
- Conversational AI applications requiring nuanced understanding of Korean language contexts