Dolphin 2.9.2 Qwen2 72B (dolphin-2.9.2-qwen2-72b)

Maintained by: cognitivecomputations

Property                  Value
Parameter Count           72.7B
Context Length            128,000 tokens
Training Sequence Length  8,192 tokens
License                   tongyi-qianwen
Model Type                Decoder-only Transformer

What is dolphin-2.9.2-qwen2-72b?

Dolphin 2.9.2 Qwen2 72B is a large language model developed by Cognitive Computations, based on the Qwen2-72B architecture. It represents a significant advancement in conversational AI, featuring full-weight fine-tuning and utilizing the ChatML prompt format. The model was trained using carefully selected parameters identified by the Laser Scanner tool, resulting in enhanced performance across various tasks.
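For reference, a ChatML prompt frames each turn with <|im_start|> and <|im_end|> markers. The system message below is illustrative, not an official default:

```
<|im_start|>system
You are Dolphin, a helpful AI assistant.<|im_end|>
<|im_start|>user
Write a haiku about the ocean.<|im_end|>
<|im_start|>assistant
```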

Implementation Details

The model retains the Qwen2-72B decoder-only architecture with 72.7B parameters and was fine-tuned in BF16 precision. It supports a 128k context window, although fine-tuning was conducted with 8k sequence lengths. Interaction uses the ChatML template format for consistent prompting; a minimal loading sketch follows the list below.

  • Utilizes advanced parameter selection via Laser Scanner
  • Implements full-weight fine-tuning methodology
  • Features gradient checkpointing and flash attention
  • Trained on 8 diverse datasets including OpenHermes-2.5 and Dolphin Coder
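As an illustration, loading the model with Hugging Face Transformers in BF16 and prompting it through its chat template might look like the sketch below. The repository id is assumed to match the model name, the tokenizer is assumed to ship the ChatML chat template described above, and device handling is simplified (a 72B model requires multiple large GPUs):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cognitivecomputations/dolphin-2.9.2-qwen2-72b"

# Load tokenizer and model in BF16, the precision used for fine-tuning.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard across available GPUs
)

# apply_chat_template renders the <|im_start|>...<|im_end|> ChatML framing.
messages = [
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Summarize the tradeoffs of BF16 training."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```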

Core Capabilities

  • Strong performance in instruction following and conversation
  • Advanced coding capabilities
  • Function calling support (a prompt-driven sketch follows this list)
  • Initial agentic abilities
  • Benchmark scores: 40.38% on IFEval, 47.7% on BBH, 49.52% on MMLU-Pro
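Function calling with Dolphin-style models is typically prompt-driven rather than a dedicated API. The sketch below shows one common pattern; the tool schema and the expectation of a JSON reply are assumptions for illustration, not a documented interface of this model:

```python
import json

# Hypothetical tool description injected into the system prompt.
TOOLS = [{
    "name": "get_weather",
    "description": "Return current weather for a city.",
    "parameters": {"city": "string"},
}]

system = (
    "You are Dolphin. You may call one of these tools by replying with "
    'JSON of the form {"tool": name, "arguments": {...}}:\n'
    + json.dumps(TOOLS, indent=2)
)

def parse_tool_call(reply: str):
    """Try to interpret the model's reply as a tool call; return None otherwise."""
    try:
        call = json.loads(reply)
        if isinstance(call, dict) and call.get("tool"):
            return call["tool"], call.get("arguments", {})
    except json.JSONDecodeError:
        pass
    return None

# e.g. parse_tool_call('{"tool": "get_weather", "arguments": {"city": "Paris"}}')
# -> ("get_weather", {"city": "Paris"})
```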

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its uncensored nature and high compliance, combined with strong performance across various benchmarks. Its 128k-token context window makes it well suited to processing lengthy documents and long conversations.

Q: What are the recommended use cases?

The model excels in conversational AI applications, coding tasks, and function-calling scenarios, and is particularly well suited to applications requiring long-context understanding and complex instruction following. Because it is uncensored, users should implement their own alignment layer before deploying it as a service; a minimal sketch of one such layer follows.
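The model card does not prescribe a particular alignment layer. As a minimal sketch, a service might wrap generation with simple input and output checks along these lines; the blocklist and refusal text are placeholders, not a complete safety solution:

```python
BLOCKED_TOPICS = ("example_disallowed_topic",)  # placeholder policy list
REFUSAL = "I can't help with that request."

def moderated_generate(generate_fn, user_prompt: str) -> str:
    """Wrap a raw generate function with simple input/output checks."""
    lowered = user_prompt.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        return REFUSAL  # refuse before the model ever sees the prompt
    reply = generate_fn(user_prompt)
    if any(topic in reply.lower() for topic in BLOCKED_TOPICS):
        return REFUSAL  # catch policy violations in the output as well
    return reply
```

In practice this wrapper would sit in front of the generate call from the loading sketch above, and most production deployments would replace the keyword check with a dedicated moderation model.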
