StableBeluga-7B

Maintained By
stabilityai

Property          Value
Parameter Count   6.74B
Model Type        Language Model (LLaMA2-based)
License           STABLE BELUGA NON-COMMERCIAL COMMUNITY LICENSE
Research Paper    LLaMA2 Paper
Developer         Stability AI

What is StableBeluga-7B?

StableBeluga-7B is a language model developed by Stability AI. It is based on the LLaMA2 7B architecture and fine-tuned on an Orca-style dataset. The model is designed for instruction-following tasks, with attention to safety and ethical considerations in its responses.

Implementation Details

The model was trained in mixed precision (BF16) with the AdamW optimizer. Training ran in two phases, with batch sizes of 256 and 512 across the two phases and a learning rate of 3e-5 with cosine decay.
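To make the cosine decay concrete, here is a minimal sketch of such a schedule in plain Python. Only the peak learning rate of 3e-5 comes from the description above. The total step count, the final learning rate, and the optional warmup are hypothetical placeholders, not values confirmed by this page.

```python
import math


def cosine_lr(step, total_steps, peak_lr=3e-5, final_lr=0.0, warmup_steps=0):
    """Learning rate at `step` under optional linear warmup, then cosine decay.

    peak_lr=3e-5 matches the text; total_steps, final_lr and warmup_steps
    are illustrative assumptions.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from peak_lr / warmup_steps up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Cosine curve from peak_lr (progress = 0) down to final_lr (progress = 1).
    return final_lr + 0.5 * (peak_lr - final_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for s in (0, 250, 500, 750, 1000):
        print(s, cosine_lr(s, total))
```

The rate starts at the peak, passes through half the peak at the midpoint, and reaches the final value at the last step.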

  • Supports both F32 and FP16 tensor types for flexibility
  • Employs a specific prompt format with System, User, and Assistant components
  • Trained on four specialized datasets for comprehensive knowledge
  • Can be loaded with automatic device mapping (device_map="auto" in Hugging Face Transformers) to place weights across available devices
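The prompt format and loading options listed above can be sketched as follows. This is an illustration rather than official usage: the exact `### System:` / `### User:` / `### Assistant:` header layout is an assumption based on the upstream model card and should be checked there, and `build_prompt` and `generate` are hypothetical helper names. The `generate` helper imports its dependencies lazily and needs the `torch`, `transformers` and `accelerate` packages plus a model download.

```python
def build_prompt(user_message, system_prompt=None):
    """Assemble a System/User/Assistant prompt.

    The '### ' header layout is an assumption; verify it against the model card.
    """
    parts = []
    if system_prompt:
        parts.append(f"### System:\n{system_prompt}\n\n")
    parts.append(f"### User:\n{user_message}\n\n")
    # The model continues the text after the Assistant header.
    parts.append("### Assistant:\n")
    return "".join(parts)


def generate(user_message, system_prompt=None,
             model_id="stabilityai/StableBeluga-7B", max_new_tokens=256):
    """Load the model in FP16 with automatic device mapping and sample a reply."""
    # Imported here so build_prompt stays usable without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # FP16 weights
        low_cpu_mem_usage=True,
        device_map="auto",          # requires the accelerate package
    )
    prompt = build_prompt(user_message, system_prompt)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, do_sample=True, top_p=0.95,
                            max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(build_prompt("Write a haiku about whales.",
                       "You are a helpful, harmless assistant."))
```

Keeping prompt assembly separate from model loading makes the format easy to inspect and test without a GPU.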

Core Capabilities

  • Advanced instruction following with safety considerations
  • Natural language generation and understanding
  • Context-aware responses with system-guided behavior
  • Support for both inference endpoints and local deployment

Frequently Asked Questions

Q: What makes this model unique?

StableBeluga-7B combines a LLaMA2 7B base with fine-tuning on an Orca-style dataset, aiming to balance instruction-following performance with safe, ethically guided responses. At 6.74B parameters, it is optimized for instruction following while keeping resource requirements reasonable.
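To put the resource requirements in rough numbers, the sketch below estimates the memory needed for the weights alone, using the 6.74B parameter count and the F32 and FP16 tensor types mentioned on this page. It is a back-of-the-envelope figure: activations, the attention cache and framework overhead are not included, and `weight_memory_gb` is a hypothetical helper name.

```python
# Bytes used to store one parameter in each tensor type.
BYTES_PER_PARAM = {"F32": 4, "FP16": 2}

PARAM_COUNT = 6.74e9  # parameter count listed for StableBeluga-7B


def weight_memory_gb(param_count, dtype):
    """Approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in ("F32", "FP16"):
        print(f"{dtype}: about {weight_memory_gb(PARAM_COUNT, dtype):.2f} GB")
```

Under these assumptions the weights take roughly 27 GB in F32 and roughly 13.5 GB in FP16.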

Q: What are the recommended use cases?

The model is well-suited to research and non-commercial applications, in line with its license, that need instruction-following language understanding and generation. It fits scenarios that call for safe, controlled responses. For consistent results, use its System, User, and Assistant prompt format.
