L3-8B-Stheno-v3.2

Maintained By
Sao10K

L3-8B-Stheno-v3.2

PropertyValue
Parameter Count8.03B
Model TypeText Generation
ArchitectureLLaMA-based Transformer
LicenseCC-BY-NC-4.0
Tensor TypeBF16

What is L3-8B-Stheno-v3.2?

L3-8B-Stheno-v3.2 is a sophisticated language model developed by Sao10K, representing the sixth iteration of the Stheno series. Trained on an H100 SXM GPU over approximately 24 hours, this model combines creative writing capabilities with assistant-style functionality. It's built upon the LLaMA architecture and has been fine-tuned using four carefully curated datasets, including writing prompts, instruct data, and filtered conversational logs.

Implementation Details

The model employs specific sampling parameters for optimal performance, including a recommended temperature range of 1.12-1.22, Min-P of 0.075, Top-K of 50, and a repetition penalty of 1.1. It utilizes the LLaMA-3-Instruct prompting template and includes specialized system prompts for roleplay scenarios.

  • Trained on multiple high-quality datasets including Opus-WritingPrompts and Claude-3-Opus-Instruct-15K
  • Implements BF16 tensor format for efficient computation
  • Features improved hyperparameters resulting in lower loss levels
  • Includes sophisticated stopping mechanisms for coherent text generation

Core Capabilities

  • Enhanced narrative and storywriting abilities
  • Balanced handling of SFW and NSFW content
  • Improved multi-turn coherency in conversations
  • Better prompt and instruction adherence
  • Assistant-style task handling
  • Role-playing and character immersion

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its balanced approach to content generation, improved narrative capabilities, and enhanced instruction following compared to previous versions. It represents a careful trade-off between creativity and reliability.

Q: What are the recommended use cases?

The model excels in creative writing, storytelling, roleplay scenarios, and assistant-style tasks. It's particularly well-suited for applications requiring both creative expression and structured response generation.

The first platform built for prompt engineering