# TQ2.5-14B-Sugarquill-v1
| Property | Value |
|---|---|
| Parameter Count | 14.8B |
| Model Type | Text Generation |
| License | Apache-2.0 |
| Base Model | arcee-ai/SuperNova-Medius |
| Training Datasets | Mielikki/Erebus-87k, allura-org/r_shortstories_24k |
## What is TQ2.5-14B-Sugarquill-v1?
TQ2.5-14B-Sugarquill-v1 is a 14.8B-parameter language model specialized for creative writing and storytelling. Built on arcee-ai/SuperNova-Medius, it was fine-tuned on a curated collection of short stories to strengthen its narrative quality while preserving its instruction-following ability.
## Implementation Details
The model was trained for 2 epochs on approximately 18.7M tokens using rsLoRA and the paged_ademamix_8bit optimizer. Training ran on a 5x RTX 3090 Ti workstation in BF16 precision, with flash attention and gradient checkpointing enabled.
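The reported settings could be expressed as an Axolotl-style training config roughly like the sketch below. This is a hypothetical reconstruction for illustration only; the actual config file is not part of this card.

```yaml
# Hypothetical reconstruction of the reported training setup (not the
# published config). Values are taken from the figures stated in this card.
base_model: arcee-ai/SuperNova-Medius
sequence_len: 8192
num_epochs: 2
optimizer: paged_ademamix_8bit
adapter: lora
peft_use_rslora: true
lora_r: 64
lora_alpha: 32
bf16: true
flash_attention: true
gradient_checkpointing: true
```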
- Supports ChatML instruct formatting
- 8192 token context length
- Optimized with rsLoRA (r=64, alpha=32)
- Normalized punctuation and whitespace handling
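Since the model expects ChatML instruct formatting, a prompt can be assembled as shown in this minimal sketch (the system prompt text is a hypothetical example, and most inference frameworks apply this template for you via the tokenizer's chat template):

```python
# Minimal sketch of ChatML prompt formatting, which this model expects.
# The system/user contents below are hypothetical examples.

def to_chatml(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a creative co-writer."},
    {"role": "user", "content": "Write the opening line of a sea story."},
])
print(prompt)
```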
## Core Capabilities
- Advanced story writing and narrative generation
- Role-playing (RP) interactions
- Chat-based co-writing
- Raw text completion
- Strong instruction following
## Frequently Asked Questions
### Q: What makes this model unique?
This model combines the prose quality of SuperNova-Medius with enhanced creative writing abilities, and its 8192-token context accommodates longer narratives while instruction-following remains intact.
### Q: What are the recommended use cases?
The model excels at creative writing, storytelling, role-playing scenarios, and interactive narrative generation. It can be used both in chat mode for collaborative writing and in raw text-completion mode.
### Q: What are the recommended sampling parameters?
The model performs best with Temperature: 0.8, Min-P: 0.05, Top-A: 0.3, and Repetition Penalty: 1.03. It shows particular affinity for Top-A and Smooth Sampling techniques.