# TQ2.5-14B-Sugarquill-v1
| Property | Value |
|---|---|
| Parameter Count | 14.8B |
| Model Type | Text Generation |
| License | Apache-2.0 |
| Base Model | arcee-ai/SuperNova-Medius |
| Training Datasets | Mielikki/Erebus-87k, allura-org/r_shortstories_24k |
## What is TQ2.5-14B-Sugarquill-v1?
TQ2.5-14B-Sugarquill-v1 is a 14.8B-parameter language model specialized for creative writing and storytelling. Built on arcee-ai/SuperNova-Medius, it was fine-tuned on a curated collection of short stories to strengthen its narrative quality while preserving its instruction-following ability.
## Implementation Details
The model was trained for 2 epochs on approximately 18.7M tokens using rsLoRA and the paged_ademamix_8bit optimizer. Training ran on a 5x RTX 3090 Ti workstation in BF16 precision, with flash attention and gradient checkpointing enabled.
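The reported settings could be expressed as an Axolotl-style training config roughly like the sketch below. This is a hypothetical reconstruction for illustration only; the actual config file is not part of this card.

```yaml
# Hypothetical reconstruction of the reported training setup (not the
# published config). Values are taken from the figures stated in this card.
base_model: arcee-ai/SuperNova-Medius
sequence_len: 8192
num_epochs: 2
optimizer: paged_ademamix_8bit
adapter: lora
peft_use_rslora: true
lora_r: 64
lora_alpha: 32
bf16: true
flash_attention: true
gradient_checkpointing: true
```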
- Supports ChatML instruct formatting
- 8192 token context length
- Optimized with rsLoRA (r=64, alpha=32)
- Normalized punctuation and whitespace handling
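Since the model expects ChatML instruct formatting, a prompt can be assembled as shown in this minimal sketch (the system prompt text is a hypothetical example, and most inference frameworks apply this template for you via the tokenizer's chat template):

```python
# Minimal sketch of ChatML prompt formatting, which this model expects.
# The system/user contents below are hypothetical examples.

def to_chatml(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a creative co-writer."},
    {"role": "user", "content": "Write the opening line of a sea story."},
])
print(prompt)
```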
## Core Capabilities
- Advanced story writing and narrative generation
- Role-playing (RP) interactions
- Chat-based co-writing
- Raw text completion
- Strong instruction following
## Frequently Asked Questions
### Q: What makes this model unique?
This model combines the prose quality of SuperNova-Medius with enhanced creative writing abilities, and its 8192-token context accommodates longer narratives while instruction-following remains intact.
### Q: What are the recommended use cases?
The model excels at creative writing, storytelling, role-playing scenarios, and interactive narrative generation. It can be used both in chat mode for collaborative writing and in raw text-completion mode.
### Q: What are the recommended sampling parameters?
The model performs best with Temperature: 0.8, Min-P: 0.05, Top-A: 0.3, and Repetition Penalty: 1.03. It shows particular affinity for Top-A and Smooth Sampling techniques.