stable-diffusion-2-base

Maintained by: stabilityai

Stable Diffusion v2-base

Property        Value
License         CreativeML OpenRAIL++
Architecture    Latent Diffusion Model
Research Paper  High-Resolution Image Synthesis With Latent Diffusion Models
Training Data   LAION-5B filtered subset

What is stable-diffusion-2-base?

Stable Diffusion v2-base is a text-to-image generation model developed by Stability AI. It was trained for 550k steps at 256x256 resolution and then for a further 850k steps at 512x512 resolution. The training data was filtered for explicit content with the LAION-NSFW classifier and restricted to images above an aesthetic-score threshold.

Implementation Details

The model is a latent diffusion model conditioned on text through an OpenCLIP-ViT/H text encoder. An autoencoder maps images to latent representations with a downsampling factor of 8, so the diffusion process runs on a grid that is 8 times smaller per side than the output image. This keeps training and generation considerably cheaper than working directly in pixel space.
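To make the downsampling factor concrete, the sketch below computes the latent grid size for a given image size. The helper name `latent_shape` is hypothetical, and the default of 4 latent channels is an assumption that does not come from the text above; only the factor of 8 does.

```python
def latent_shape(height, width, factor=8, channels=4):
    """Return (channels, latent_height, latent_width) for an image size.

    factor=8 is the downsampling factor stated in the text.
    channels=4 is an assumption about the autoencoder's latent depth.
    """
    if height % factor or width % factor:
        raise ValueError("height and width must be multiples of the factor")
    return (channels, height // factor, width // factor)


def compression_ratio(height, width, factor=8, channels=4, rgb_channels=3):
    """Ratio of pixel-space values to latent-space values."""
    c, h, w = latent_shape(height, width, factor, channels)
    return (height * width * rgb_channels) / (c * h * w)


print(latent_shape(512, 512))
print(compression_ratio(512, 512))
```

Under these assumptions, a 512x512 RGB image corresponds to a 64x64 latent grid, which is why the diffusion steps are much cheaper than they would be in pixel space.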

  • Trained on 32 x 8 A100 GPUs with AdamW optimizer
  • Batch size of 2048
  • Learning rate of 0.0001 with 10,000-step warmup
  • Works with multiple schedulers, including EulerDiscrete (EulerDiscreteScheduler in the diffusers library)
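The training figures above can be checked with a little arithmetic. The sketch below derives the per-GPU share of the batch and models the warmup as a linear ramp that then stays constant. The linear shape and the constant rate afterwards are assumptions; the list states only the peak rate and the warmup length. The function name `learning_rate` is hypothetical.

```python
NUM_GPUS = 32 * 8          # 32 nodes x 8 A100 GPUs each = 256 GPUs
BATCH_SIZE = 2048          # global batch size from the list above
SAMPLES_PER_GPU = BATCH_SIZE // NUM_GPUS  # per optimizer step


def learning_rate(step, peak=1e-4, warmup_steps=10_000):
    """Assumed schedule: linear warmup to `peak`, then constant."""
    return peak * min(1.0, step / warmup_steps)


print(NUM_GPUS, SAMPLES_PER_GPU)
for step in (0, 2_500, 10_000, 500_000):
    print(step, learning_rate(step))
```

Each GPU therefore contributes 8 samples per optimizer step. How those are split between micro-batch size and gradient accumulation is not stated in the list.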

Core Capabilities

  • High-quality image generation at 512x512 resolution
  • Training data filtered for explicit content with the LAION-NSFW classifier
  • Diffusion in a latent space downsampled by a factor of 8, which lowers compute cost
  • Improved aesthetic quality with filtered training data (aesthetic score ≥ 4.5)
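As a minimal sketch of 512x512 generation, the function below loads the checkpoint through the Hugging Face diffusers library with the EulerDiscrete scheduler. It assumes `diffusers`, `transformers`, and `torch` are installed and that the weights are published under the repository id `stabilityai/stable-diffusion-2-base`. The helper name `generate` and the choice of half precision on GPU are illustrative, not prescribed by the model card text above.

```python
MODEL_ID = "stabilityai/stable-diffusion-2-base"  # assumed Hugging Face repo id


def generate(prompt, height=512, width=512, device="cuda"):
    """Generate one image from a text prompt (hypothetical helper)."""
    # Imported lazily so this module loads without the heavy dependencies.
    import torch
    from diffusers import EulerDiscreteScheduler, StableDiffusionPipeline

    # Load the scheduler configuration stored alongside the checkpoint.
    scheduler = EulerDiscreteScheduler.from_pretrained(MODEL_ID, subfolder="scheduler")
    # Half precision is an assumption suited to CUDA GPUs; fall back to float32 elsewhere.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(
        MODEL_ID, scheduler=scheduler, torch_dtype=dtype
    )
    pipe = pipe.to(device)
    # The pipeline output exposes the generated PIL images in `.images`.
    return pipe(prompt, height=height, width=width).images[0]
```

A call such as `generate("a photo of an astronaut riding a horse on mars").save("astronaut.png")` would download the weights on first use and write the result to disk.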

Frequently Asked Questions

Q: What makes this model unique?

This is the base checkpoint of Stable Diffusion v2. It was trained on a LAION-5B subset filtered both for explicit content and for aesthetic quality (aesthetic score ≥ 4.5), and it generates images at 512x512 resolution. As a base version, it aims to balance output quality with practical usability.

Q: What are the recommended use cases?

The model is intended for research purposes, with applications including artistic creation, educational tools, and other creative work. Generating harmful content is explicitly out of scope, and use is subject to the model's ethical usage guidelines.
