gemma-2-Ifable-9B

Maintained By
ifable

gemma-2-Ifable-9B

PropertyValue
Parameter Count9.24B
Model TypeText Generation
LicenseGemma
FrameworkTransformers 4.43.4
Tensor TypeBF16

What is gemma-2-Ifable-9B?

gemma-2-Ifable-9B is a specialized language model built on Google's Gemma-2 architecture, specifically optimized for creative writing tasks. Notable for ranking first on the Creative Writing Benchmark, this model represents a significant advancement in AI-powered creative text generation.

Implementation Details

The model was trained using the SimPO (Simple Preference Optimization) methodology, incorporating both the Gutenberg dataset and a proprietary creative writing dataset. Training utilized a multi-GPU setup with 8 devices, implementing a cosine learning rate scheduler with 0.1 warmup ratio and Adam optimizer.

  • Training batch size: 128 (effective)
  • Learning rate: 8e-07
  • Training epochs: 1.0
  • Gradient accumulation steps: 16

Core Capabilities

  • Advanced creative writing generation
  • High-quality text completion
  • Benchmark-leading performance in creative tasks
  • Efficient processing with BF16 precision

Frequently Asked Questions

Q: What makes this model unique?

This model distinguishes itself through its top-ranking performance on the Creative Writing Benchmark and its specialized training using SimPO methodology, combining both classic literature and curated creative writing data.

Q: What are the recommended use cases?

The model is particularly well-suited for creative writing applications, including story generation, narrative development, and other creative text-based tasks where high-quality, imaginative output is required.

The first platform built for prompt engineering