dolphin-2.6-mistral-7b

Maintained By
cognitivecomputations

Dolphin-2.6-Mistral-7B

Parameter Count: 7.24B
Model Type: Text Generation
Architecture: Mistral-7B
License: Apache-2.0
Context Window: 16k tokens
Training Hardware: 4x A100 GPUs

What is dolphin-2.6-mistral-7b?

Dolphin-2.6-Mistral-7B is a language model built on the Mistral-7B architecture and optimized for coding tasks and general text generation. The model is uncensored, tuned for high compliance, and provides a 16k-token context window, making it suitable for complex, long-form interactions. It was trained on seven curated datasets, including specialized coding datasets and general instruction data.

Implementation Details

The model uses the ChatML prompt format and was trained for 3 epochs with full-weight fine-tuning in Axolotl. Weights are stored in BF16, and the training configuration was revised from earlier Dolphin releases to improve overall quality. Training took 2 days on 4 A100 GPUs.
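ChatML wraps each conversation turn in `<|im_start|>` / `<|im_end|>` delimiters. A minimal sketch of assembling a single-turn prompt by hand (the system message text here is illustrative, not a required value):

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Format a single-turn conversation in the ChatML layout used by Dolphin models."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # the model continues generating from here
    )

prompt = build_chatml_prompt(
    "You are Dolphin, a helpful AI assistant.",
    "Write a function that reverses a string.",
)
```

In practice the same layout can also be produced by a tokenizer's chat template, so the prompt string does not need to be built manually.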

  • Enhanced training configuration for improved quality
  • Integration of empathy-focused data
  • Replacement of previous datasets with Capybara
  • Comprehensive coding capabilities
  • Uncensored model, trained on datasets filtered to remove alignment and bias

Core Capabilities

  • Advanced code generation and understanding
  • High compliance and instruction following
  • Extended context handling (16k tokens)
  • Empathetic responses
  • Structured output generation

Frequently Asked Questions

Q: What makes this model unique?

This model stands out for its exceptional coding capabilities, uncensored nature, and the integration of multiple high-quality datasets. It features a unique combination of compliance and capability while maintaining an open, unbiased approach to task completion.

Q: What are the recommended use cases?

The model excels in coding tasks, general text generation, and complex problem-solving scenarios. It's particularly suitable for developers, technical documentation, and situations requiring detailed, unbiased responses. However, users should implement their own alignment layer before deploying it as a service.
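As a hedged illustration of what a minimal alignment layer in front of an uncensored model might look like (the blocklist, policy message, and `generate` stub below are all placeholder assumptions, not part of the model or its tooling):

```python
from typing import Callable

# Illustrative blocklist only; a real deployment would use a moderation
# model or policy engine rather than keyword matching.
BLOCKED_TOPICS = ["example-disallowed-topic"]
POLICY_MESSAGE = "This request falls outside this service's usage policy."

def aligned_generate(generate: Callable[[str], str], user_input: str) -> str:
    """Run a pre-generation policy check before calling the model."""
    lowered = user_input.lower()
    if any(topic in lowered for topic in BLOCKED_TOPICS):
        return POLICY_MESSAGE
    return generate(user_input)

# Stub standing in for the actual model call.
reply = aligned_generate(lambda p: f"model output for: {p}", "Hello!")
```

The point of the sketch is the wrapping pattern, not the check itself: requests are screened before the uncensored model is invoked, so the service, not the model, enforces policy.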
