Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1

Maintained by: IDEA-CCNL


License: CreativeML OpenRAIL-M
Base Model: Stable Diffusion v1.4
Training Data: 20M filtered Chinese image-text pairs
Paper: Fengshenbang 1.0

What is Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1?

This is the first open-source bilingual Stable Diffusion model, enabling both Chinese and English text-to-image generation. Developed by IDEA-CCNL, it was trained on image-text pairs drawn from the Noah-Wukong and Zero datasets and filtered by CLIP score to keep only high-quality pairs.
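The CLIP-score filtering mentioned above can be pictured with a short sketch. The snippet below uses the generic openai/clip-vit-base-patch32 checkpoint via the transformers library as a stand-in; the card does not name the CLIP model IDEA-CCNL actually used for filtering, and the 0.2 threshold is the only detail given.

    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    # Stand-in CLIP checkpoint; the card does not name the model actually
    # used for filtering, only the score threshold (> 0.2).
    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    def clip_score(image: Image.Image, caption: str) -> float:
        """Cosine similarity between CLIP image and text embeddings."""
        inputs = processor(text=[caption], images=image,
                           return_tensors="pt", padding=True)
        with torch.no_grad():
            out = model(**inputs)
        return torch.nn.functional.cosine_similarity(
            out.image_embeds, out.text_embeds
        ).item()

    # Keep only pairs above the threshold reported in the card.
    pairs = [(Image.open("example.jpg"), "a pagoda at sunset")]  # placeholder data
    filtered = [(img, txt) for img, txt in pairs if clip_score(img, txt) > 0.2]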

Implementation Details

The model underwent a two-stage training process on 8 A100 GPUs: the first stage (80 hours) trained the text encoder while all other components were frozen; the second stage (100 hours) fine-tuned the full model for better Chinese-language compatibility.
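As a conceptual sketch (not the team's actual training code), stage 1 amounts to freezing the VAE and UNet of a Stable Diffusion v1.4 checkpoint and leaving only the text encoder trainable. The hyperparameters below are placeholders, and how the bilingual text encoder was initialized is not covered by this card.

    import torch
    from diffusers import StableDiffusionPipeline

    # Conceptual stage-1 setup: only the text encoder learns.
    pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
    pipe.vae.requires_grad_(False)
    pipe.unet.requires_grad_(False)
    pipe.text_encoder.requires_grad_(True)

    # Placeholder optimizer settings; the card reports only training
    # durations, not hyperparameters.
    optimizer = torch.optim.AdamW(pipe.text_encoder.parameters(), lr=1e-5)

    # Stage 2 (full fine-tuning, per the card) would lift the freezes above.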

  • Built on Stable Diffusion v1.4 architecture
  • Uses CLIP Score filtering (>0.2) for training data selection
  • Supports both full-precision and half-precision (FP16) inference (see the loading sketch after this list)
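A minimal loading sketch with the diffusers library, assuming the model's Hugging Face id IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1; omit the torch_dtype argument for full-precision inference. The prompt is illustrative.

    import torch
    from diffusers import StableDiffusionPipeline

    model_id = "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1"

    # Half-precision (FP16) load for GPU inference; drop torch_dtype for FP32.
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")

    # Chinese prompt: "A waterfall plunging three thousand feet, oil painting."
    image = pipe("飞流直下三千尺，油画").images[0]
    image.save("waterfall.png")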

Core Capabilities

  • Bilingual text-to-image generation
  • Support for artistic style transfer (e.g., Van Gogh style)
  • Complex concept combination in both languages
  • DreamBooth fine-tuning compatibility

Frequently Asked Questions

Q: What makes this model unique?

It's the first open-source Stable Diffusion model specifically trained for both Chinese and English text-to-image generation, with carefully curated training data and a two-stage training approach.

Q: What are the recommended use cases?

The model excels at generating images from Chinese or English prompts and at artistic style transfer, and it can be further fine-tuned with DreamBooth for specific subjects. It is particularly effective for culturally specific Chinese concepts and artistic interpretations; the sketch below shows bilingual prompting.
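Reusing the pipeline from the loading sketch above, prompting works the same way in either language. The prompts here are illustrative, not taken from the model card.

    # Chinese prompt with a style cue ("The Great Wall in autumn,
    # in the style of Van Gogh").
    zh_image = pipe("长城秋景，梵高风格").images[0]

    # An equivalent English prompt works with the same pipeline.
    en_image = pipe("The Great Wall in autumn, in the style of Van Gogh").images[0]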
