gemma-2-Ifable-9B
Property | Value |
---|---|
Parameter Count | 9.24B |
Model Type | Text Generation |
License | Gemma |
Framework | Transformers 4.43.4 |
Tensor Type | BF16 |
What is gemma-2-Ifable-9B?
gemma-2-Ifable-9B is a specialized language model built on Google's Gemma-2 architecture, specifically optimized for creative writing tasks. Notable for ranking first on the Creative Writing Benchmark, this model represents a significant advancement in AI-powered creative text generation.
Implementation Details
The model was trained using the SimPO (Simple Preference Optimization) methodology, incorporating both the Gutenberg dataset and a proprietary creative writing dataset. Training utilized a multi-GPU setup with 8 devices, implementing a cosine learning rate scheduler with 0.1 warmup ratio and Adam optimizer.
- Training batch size: 128 (effective)
- Learning rate: 8e-07
- Training epochs: 1.0
- Gradient accumulation steps: 16
Core Capabilities
- Advanced creative writing generation
- High-quality text completion
- Benchmark-leading performance in creative tasks
- Efficient processing with BF16 precision
Frequently Asked Questions
Q: What makes this model unique?
This model distinguishes itself through its top-ranking performance on the Creative Writing Benchmark and its specialized training using SimPO methodology, combining both classic literature and curated creative writing data.
Q: What are the recommended use cases?
The model is particularly well-suited for creative writing applications, including story generation, narrative development, and other creative text-based tasks where high-quality, imaginative output is required.