AltDiffusion

Maintained By: BAAI

Model Type: Text-to-Image Diffusion
Total Parameters: ~1.8B (AutoEncoder: 83.7M, UNet: 865M, TextEncoder: 859M)
License: CreativeML OpenRAIL-M
Paper: AltDiffusion Paper
Languages: Chinese and English

What is AltDiffusion?

AltDiffusion is a bilingual text-to-image generation model that extends Stable Diffusion to accept both Chinese and English prompts. Developed by BAAI, it uses AltCLIP as its text encoder to align the two languages while preserving the image quality of the underlying diffusion model.

Implementation Details

The model architecture combines three main components: an AutoEncoder (83.7M parameters), a UNet (865M parameters), and the AltCLIP TextEncoder (859M parameters), for a total of roughly 1.8B parameters. It is trained on a combination of the WuDao dataset and LAION, which gives it both Chinese and English text-image pairs to learn from.
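The component sizes above can be checked with a little arithmetic. The following is a minimal sketch that sums the three listed parameter counts and estimates the memory taken by the weights alone. The helper names are hypothetical, and the bytes-per-parameter values (4 for fp32, 2 for fp16) are standard assumptions rather than figures from the model card. Activations and framework overhead come on top of this estimate.

```python
# Component sizes in millions of parameters, as listed in the model card.
COMPONENTS_M = {
    "AutoEncoder": 83.7,
    "UNet": 865.0,
    "TextEncoder (AltCLIP)": 859.0,
}


def total_params_millions(components=COMPONENTS_M):
    """Sum the per-component parameter counts (in millions)."""
    return sum(components.values())


def weight_memory_gb(params_millions, bytes_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes).

    This ignores activations, the CUDA context, and any optimizer state.
    """
    return params_millions * 1e6 * bytes_per_param / 1e9


if __name__ == "__main__":
    total = total_params_millions()
    print(f"total: {total:.1f}M parameters")
    print(f"fp32 weights: {weight_memory_gb(total, 4):.2f} GB")
    print(f"fp16 weights: {weight_memory_gb(total, 2):.2f} GB")
```

The sum comes to 1807.7M, which matches the "~1.8B" headline figure. Under these assumptions the weights occupy about 7.2 GB in fp32 and about 3.6 GB in fp16.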

  • Built on the Stable Diffusion architecture, with a bilingual text encoder in place of the original one
  • Supports a fast DPM scheduler for efficient generation (~2 seconds per image on a V100)
  • Requires at least 10 GB of GPU memory for inference
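As an illustration of the scheduler point, here is a hedged sketch of how inference might look with the Hugging Face diffusers library. It assumes that your installed diffusers version still exports `AltDiffusionPipeline` and `DPMSolverMultistepScheduler`, and that the checkpoint is published under the ID `BAAI/AltDiffusion`; neither is confirmed by this page. The helper names `generation_settings` and `generate`, and the default step count and guidance scale, are illustrative choices only.

```python
def generation_settings(prompt, steps=25, guidance_scale=7.5):
    """Collect the call arguments for the pipeline.

    The defaults are illustrative, not values recommended by the model card.
    """
    if steps < 1:
        raise ValueError("steps must be >= 1")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def generate(prompt, model_id="BAAI/AltDiffusion", device="cuda", **overrides):
    """Generate one image from a Chinese or English prompt.

    Heavy imports are deferred so the module can be imported on machines
    without torch or diffusers installed.
    """
    import torch
    from diffusers import AltDiffusionPipeline, DPMSolverMultistepScheduler

    # Assumption: the checkpoint ID and pipeline class are available as named.
    pipe = AltDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    # Swap in a DPM-Solver scheduler, reusing the existing scheduler's config.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to(device)

    settings = generation_settings(prompt, **overrides)
    return pipe(**settings).images[0]


# Example usage (requires a CUDA GPU and a model download):
#   image = generate("黑暗精灵公主，概念艺术，插图", steps=25)
#   image.save("dark_elf_princess.png")
```

Loading the weights in fp16 roughly halves their memory footprint compared with fp32, which is one common way to stay within a limited GPU memory budget.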

Core Capabilities

  • Bilingual text-to-image generation in Chinese and English
  • Superior cross-lingual alignment compared to other open-source alternatives
  • Retains the original Stable Diffusion capabilities while adding Chinese prompt support
  • Supports high-resolution image generation with detailed control

Frequently Asked Questions

Q: What makes this model unique?

AltDiffusion's main distinction is its bilingual capability, in particular its Chinese-English alignment, combined with image quality comparable to or better than the original Stable Diffusion model.

Q: What are the recommended use cases?

The model is well suited to generating detailed artwork, illustrations, and concept art from either Chinese or English prompts. It is particularly useful for applications that need multilingual support or that depend on culturally specific context in the generated images.
