Dolphin 2.9 LLaMA3 8B

Property	Value
Parameter Count	8.03B
Base Model	Meta-LLaMA-3-8B
Context Length	4096 tokens
License	META LLAMA 3 COMMUNITY LICENSE
Training Duration	2.5 days on 8x L40S

What is dolphin-2.9-llama3-8b?

Dolphin 2.9 is an advanced language model built on Meta's LLaMA3 architecture, fine-tuned by Eric Hartford, Lucas Atkins, and Fernando Fernandes at Cognitive Computations. This model represents a significant advancement in conversational AI, combining instruction-following capabilities with coding expertise and uncensored responses.

Implementation Details

The model underwent full-weight fine-tuning with a 4k sequence length, utilizing the ChatML prompt template format. Training was conducted using 8x L40S GPUs provided by Crusoe Cloud, incorporating multiple high-quality datasets including OpenHermes-2.5, CodeFeedback, and UltraChat.

Trained using Axolotl framework version 0.4.0
Implements BF16 precision and flash attention
Uses cosine learning rate scheduler with 2e-5 learning rate
Trained for 3 epochs with gradient checkpointing

Core Capabilities

Advanced conversational abilities with ChatML format support
Strong coding and technical task performance
Function calling support
Initial agentic capabilities
Uncensored responses (requires careful deployment consideration)
Mathematical problem-solving abilities

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its combination of uncensored capabilities, diverse training datasets, and optimal balance between size and performance. It's particularly notable for its implementation of the latest LLaMA3 architecture with carefully curated training data.

Q: What are the recommended use cases?

This model is well-suited for conversational AI applications, coding assistance, technical documentation, and general instruction-following tasks. However, due to its uncensored nature, implementing appropriate safety layers is recommended for production deployments.