Gemma-2-27B-Chinese-Chat

Maintained By
shenzhi-wang

Gemma-2-27B-Chinese-Chat

PropertyValue
Parameter Count27.2B
Context Length8K tokens
LicenseGemma License
Base Modelgoogle/gemma-2-27b-it
PaperORPO Paper

What is Gemma-2-27B-Chinese-Chat?

Gemma-2-27B-Chinese-Chat represents a significant milestone as the first instruction-tuned language model built upon Google's Gemma-2-27b-it specifically optimized for Chinese and English users. Developed through fine-tuning with over 100K preference pairs using the ORPO (Reference-free Monolithic Preference Optimization) algorithm, this model significantly improves upon the base model's capabilities in handling bilingual interactions.

Implementation Details

The model was trained using the LLaMA-Factory framework with specific parameters including 3 epochs, a learning rate of 3e-6 with cosine scheduler, and a warmup ratio of 0.1. The training utilized flash-attn-2 instead of eager attention, and implements full parameter fine-tuning with a global batch size of 128.

  • BF16 precision for optimal performance
  • 8K context length support
  • Comprehensive GGUF format support for various quantization options
  • Official Ollama model integration

Core Capabilities

  • Enhanced bilingual performance in Chinese and English
  • Improved handling of roleplay scenarios
  • Advanced tool-using capabilities
  • Superior mathematical reasoning
  • Reduced issues with language mixing and response consistency

Frequently Asked Questions

Q: What makes this model unique?

This model is the first to specifically fine-tune the Gemma-2-27B architecture for Chinese-English bilingual capabilities, significantly reducing common issues like mixed language responses and improving performance across various tasks.

Q: What are the recommended use cases?

The model excels in bilingual conversations, role-playing scenarios, tool-using applications, and mathematical reasoning tasks. It's particularly suitable for applications requiring natural language understanding and generation in both Chinese and English contexts.

The first platform built for prompt engineering