StableBeluga1-Delta

Maintained By
stabilityai


  • Parameter Count: 65.3B
  • License: CC BY-NC 4.0
  • Framework: HuggingFace Transformers
  • Base Model: LLaMA 65B
  • Training Datasets: 4 specialized datasets, including COT, FLAN2021, T0, and NIV2 submixes
  • Research Paper: Orca paper

What is StableBeluga1-Delta?

StableBeluga1-Delta is a language model developed by Stability AI, built on the LLaMA 65B architecture and fine-tuned on an Orca-style dataset. It is an instruction-following model trained with a carefully curated dataset mixture, and it is distributed as delta weights rather than a full checkpoint.

Implementation Details

The model was trained in mixed precision (BF16) with the AdamW optimizer. Key training parameters include a batch size of 512, a learning rate of 3e-5 with cosine decay to 3e-6, and a 100-step warmup period.
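The warmup-plus-cosine schedule described above can be sketched as a small function. This is an illustrative reconstruction from the stated hyperparameters (the total step count here is an assumption, not a published figure), not the training code Stability AI used:

```python
import math

# Hyperparameters taken from the card: 100-step warmup to a peak LR of 3e-5,
# then cosine decay down to a floor of 3e-6.
PEAK_LR, MIN_LR, WARMUP_STEPS = 3e-5, 3e-6, 100

def learning_rate(step, total_steps):
    """Learning rate at a given optimizer step (0-indexed)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to the peak learning rate.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # Cosine decay from PEAK_LR to MIN_LR over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(progress, 1.0)))
    return MIN_LR + (PEAK_LR - MIN_LR) * cosine
```

In a real training loop this would typically be wired up via a scheduler such as PyTorch's `LambdaLR` rather than called by hand.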

  • Released as delta weights that must be combined with the original LLaMA 65B base weights
  • Supports both FP16 and FP32 tensor operations
  • Compatible with the text-generation-inference pipeline
  • Supports efficient inference endpoints
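The delta-weights step above can be sketched as simple parameter-wise addition. This is a minimal illustration assuming the deltas are additive differences (delta = tuned − base); a real merge would load the checkpoints with torch or safetensors, whereas plain floats are used here to keep the arithmetic visible:

```python
# Hypothetical helper: combine base-model weights with released delta weights.
# Assumes both checkpoints share the same parameter names and shapes.

def apply_delta(base_state, delta_state):
    """Return tuned weights as base + delta, matched by parameter name."""
    if base_state.keys() != delta_state.keys():
        raise ValueError("base and delta checkpoints must share parameter names")
    return {name: base_state[name] + delta_state[name] for name in base_state}
```

Consult the official repository's conversion script for the authoritative procedure.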

Core Capabilities

  • Advanced instruction-following abilities
  • Safe and controlled response generation
  • Support for complex explanation traces
  • Optimized for English language tasks
  • Handles various prompt formats similar to Alpaca
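Since the card says the model handles Alpaca-like prompt formats, a template along these lines can be used. The exact delimiters the model was trained on are an assumption here; verify them against the official model card before relying on them:

```python
# Illustrative Alpaca-style prompt template (assumed format, not confirmed
# by the card): an instruction section followed by an empty response section.
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca-style template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)
```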

Frequently Asked Questions

Q: What makes this model unique?

StableBeluga1-Delta combines the LLaMA 65B architecture with Orca-style training, emphasizing safe, controlled responses alongside strong instruction-following. The delta-weights release allows flexible deployment while protecting the integrity of the base model's weights.

Q: What are the recommended use cases?

The model is best suited for applications requiring sophisticated instruction-following, complex explanation generation, and safe interaction patterns, and is particularly valuable where controlled, ethical responses are required.
