dolphin-2.9-llama3-8b

Maintained By
cognitivecomputations

Dolphin 2.9 LLaMA3 8B

PropertyValue
Parameter Count8.03B
Base ModelMeta-LLaMA-3-8B
Context Length4096 tokens
LicenseMETA LLAMA 3 COMMUNITY LICENSE
Training Duration2.5 days on 8x L40S

What is dolphin-2.9-llama3-8b?

Dolphin 2.9 is an advanced language model built on Meta's LLaMA3 architecture, fine-tuned by Eric Hartford, Lucas Atkins, and Fernando Fernandes at Cognitive Computations. This model represents a significant advancement in conversational AI, combining instruction-following capabilities with coding expertise and uncensored responses.

Implementation Details

The model underwent full-weight fine-tuning with a 4k sequence length, utilizing the ChatML prompt template format. Training was conducted using 8x L40S GPUs provided by Crusoe Cloud, incorporating multiple high-quality datasets including OpenHermes-2.5, CodeFeedback, and UltraChat.

  • Trained using Axolotl framework version 0.4.0
  • Implements BF16 precision and flash attention
  • Uses cosine learning rate scheduler with 2e-5 learning rate
  • Trained for 3 epochs with gradient checkpointing

Core Capabilities

  • Advanced conversational abilities with ChatML format support
  • Strong coding and technical task performance
  • Function calling support
  • Initial agentic capabilities
  • Uncensored responses (requires careful deployment consideration)
  • Mathematical problem-solving abilities

Frequently Asked Questions

Q: What makes this model unique?

The model stands out for its combination of uncensored capabilities, diverse training datasets, and optimal balance between size and performance. It's particularly notable for its implementation of the latest LLaMA3 architecture with carefully curated training data.

Q: What are the recommended use cases?

This model is well-suited for conversational AI applications, coding assistance, technical documentation, and general instruction-following tasks. However, due to its uncensored nature, implementing appropriate safety layers is recommended for production deployments.

The first platform built for prompt engineering