TQ2.5-14B-Sugarquill-v1

Maintained By
allura-org

TQ2.5-14B-Sugarquill-v1

PropertyValue
Parameter Count14.8B
Model TypeText Generation
LicenseApache-2.0
Base Modelarcee-ai/SuperNova-Medius
Training DatasetsMielikki/Erebus-87k, allura-org/r_shortstories_24k

What is TQ2.5-14B-Sugarquill-v1?

TQ2.5-14B-Sugarquill-v1 is an advanced language model specifically designed for creative writing and storytelling. Built upon the SuperNova-Medius architecture, this model has been fine-tuned on a carefully curated collection of short stories to enhance its narrative capabilities while maintaining strong instruction-following abilities.

Implementation Details

The model was trained for 2 epochs on approximately 18.7M tokens, utilizing rsLoRA and the paged_ademamix_8bit optimizer. Training was conducted on a 5x3090Ti workstation with BF16 precision, implementing various optimizations including flash attention and gradient checkpointing.

  • Supports ChatML instruct formatting
  • 8192 token context length
  • Optimized with rsLoRA (r=64, alpha=32)
  • Normalized punctuation and whitespace handling

Core Capabilities

  • Advanced story writing and narrative generation
  • Role-playing (RP) interactions
  • Chat-based co-writing
  • Raw text completion
  • Strong instruction following

Frequently Asked Questions

Q: What makes this model unique?

This model combines the prose capabilities of SuperNova-Medius with enhanced creative writing abilities, featuring an extended context length suitable for longer narratives while maintaining instruction-following capabilities.

Q: What are the recommended use cases?

The model excels at creative writing, storytelling, role-playing scenarios, and interactive narrative generation. It can be used both in chat mode for collaborative writing and direct text completion.

Q: What are the recommended sampling parameters?

The model performs best with Temperature: 0.8, Min-P: 0.05, Top-A: 0.3, and Repetition Penalty: 1.03. It shows particular affinity for Top-A and Smooth Sampling techniques.

The first platform built for prompt engineering