Gemma-2-27B-Chinese-Chat
| Property | Value |
|---|---|
| Parameter Count | 27.2B |
| Context Length | 8K tokens |
| License | Gemma License |
| Base Model | google/gemma-2-27b-it |
| Paper | ORPO Paper |
What is Gemma-2-27B-Chinese-Chat?
Gemma-2-27B-Chinese-Chat is the first instruction-tuned language model built upon Google's gemma-2-27b-it and optimized specifically for Chinese and English users. It was fine-tuned on over 100K preference pairs using the ORPO (Odds Ratio Preference Optimization) algorithm, a reference-model-free monolithic preference optimization method, and substantially improves on the base model's handling of bilingual interactions.
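For readers unfamiliar with ORPO, the objective from the paper combines the standard supervised fine-tuning loss with an odds-ratio penalty that favors the chosen response $y_w$ over the rejected response $y_l$ in each preference pair (notation condensed here; $\lambda$ is a weighting hyperparameter):

$$
\mathcal{L}_{\mathrm{ORPO}} = \mathbb{E}_{(x,\,y_w,\,y_l)}\big[\,\mathcal{L}_{\mathrm{SFT}} + \lambda\,\mathcal{L}_{\mathrm{OR}}\,\big],
\qquad
\mathcal{L}_{\mathrm{OR}} = -\log \sigma\!\left(\log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)}\right),
$$

where $\mathrm{odds}_\theta(y \mid x) = \dfrac{P_\theta(y \mid x)}{1 - P_\theta(y \mid x)}$. Because the penalty is computed from the policy's own odds, no frozen reference model is needed during preference optimization.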
Implementation Details
The model was trained using the LLaMA-Factory framework with full-parameter fine-tuning: 3 epochs, a learning rate of 3e-6 on a cosine scheduler, a warmup ratio of 0.1, and a global batch size of 128. Training used flash-attn-2 rather than eager attention. Additional characteristics include the following (a minimal loading sketch follows the list):
- BF16 precision for training and inference
- 8K context length support
- Comprehensive GGUF format support for various quantization options
- Official Ollama model integration
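A minimal loading sketch using the Hugging Face transformers API, reflecting the BF16 and flash-attn-2 settings above. The repository id below is an assumption (adjust it to wherever your copy of the model lives), and flash_attention_2 requires the flash-attn package and a compatible GPU:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; substitute your own copy of the model if it differs.
model_id = "shenzhi-wang/Gemma-2-27B-Chinese-Chat"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # BF16, matching the training precision
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    device_map="auto",
)
```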
Core Capabilities
- Enhanced bilingual performance in Chinese and English (see the generation sketch after this list)
- Improved handling of roleplay scenarios
- Improved tool-use capabilities
- Stronger mathematical reasoning than the base model
- Fewer language-mixing issues and more consistent responses
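As an illustration of bilingual chat usage, the sketch below continues the loading example above and applies the model's chat template before generating. The prompt content is purely illustrative:

```python
# Continuing from the loading sketch above.
messages = [
    # "Briefly introduce the Pythagorean theorem in Chinese."
    {"role": "user", "content": "请用中文简单介绍一下勾股定理。"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```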
Frequently Asked Questions
Q: What makes this model unique?
A: It is the first instruction-tuned model to adapt Gemma-2-27B specifically for Chinese-English bilingual use. Fine-tuning with ORPO markedly reduces common issues such as mixed-language responses while improving performance across a range of tasks.
Q: What are the recommended use cases?
A: The model excels at bilingual conversation, roleplay scenarios, tool-use applications, and mathematical reasoning. It is particularly well suited to applications that require natural language understanding and generation in both Chinese and English.