L3-8B-Stheno-v3.2

Property	Value
Parameter Count	8.03B
Model Type	Text Generation
Architecture	LLaMA-based Transformer
License	CC-BY-NC-4.0
Tensor Type	BF16

What is L3-8B-Stheno-v3.2?

L3-8B-Stheno-v3.2 is a sophisticated language model developed by Sao10K, representing the sixth iteration of the Stheno series. Trained on an H100 SXM GPU over approximately 24 hours, this model combines creative writing capabilities with assistant-style functionality. It's built upon the LLaMA architecture and has been fine-tuned using four carefully curated datasets, including writing prompts, instruct data, and filtered conversational logs.

Implementation Details

The model employs specific sampling parameters for optimal performance, including a recommended temperature range of 1.12-1.22, Min-P of 0.075, Top-K of 50, and a repetition penalty of 1.1. It utilizes the LLaMA-3-Instruct prompting template and includes specialized system prompts for roleplay scenarios.

Trained on multiple high-quality datasets including Opus-WritingPrompts and Claude-3-Opus-Instruct-15K
Implements BF16 tensor format for efficient computation
Features improved hyperparameters resulting in lower loss levels
Includes sophisticated stopping mechanisms for coherent text generation

Core Capabilities

Enhanced narrative and storywriting abilities
Balanced handling of SFW and NSFW content
Improved multi-turn coherency in conversations
Better prompt and instruction adherence
Assistant-style task handling
Role-playing and character immersion

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its balanced approach to content generation, improved narrative capabilities, and enhanced instruction following compared to previous versions. It represents a careful trade-off between creativity and reliability.

Q: What are the recommended use cases?

The model excels in creative writing, storytelling, roleplay scenarios, and assistant-style tasks. It's particularly well-suited for applications requiring both creative expression and structured response generation.

L3-8B-Stheno-v3.2

L3-8B-Stheno-v3.2

What is L3-8B-Stheno-v3.2?

Implementation Details

Core Capabilities

Frequently Asked Questions

Q: What makes this model unique?

Q: What are the recommended use cases?

Related Models

The first platform built for prompt engineering