stable-diffusion-safety-checker

Maintained by: CompVis

Property      Value
Author        CompVis
Downloads     1,236,752
Base Model    CLIP
Paper         CLIP Paper

What is stable-diffusion-safety-checker?

The stable-diffusion-safety-checker is a CLIP-based model developed by CompVis that identifies potentially NSFW content in images. It serves as the safety-checking component of the Stable Diffusion pipeline, screening generated images for inappropriate content.

Implementation Details

Built on the CLIP architecture, this model uses a ViT-L/14 Transformer as its image encoder and a masked self-attention Transformer as its text encoder. The encoders are trained with a contrastive objective that maximizes the similarity of matching image-text pairs.

  • Utilizes Vision Transformer (ViT) architecture for image processing
  • Implements CLIP-based content analysis
  • Designed specifically for integration with Stable Diffusion pipelines
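
The CLIP-based content analysis listed above can be pictured as comparing an image embedding against a set of reference "concept" embeddings and flagging the image when any similarity exceeds a per-concept threshold. The sketch below is a minimal illustration under that assumption, not the checker's actual code: the function names, the three-dimensional toy vectors, the concept labels, and the 0.8 thresholds are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def flag_image(image_embedding, concept_embeddings, thresholds):
    """Return the names of concepts whose similarity exceeds its threshold."""
    flagged = []
    for name, concept in concept_embeddings.items():
        score = cosine_similarity(image_embedding, concept) - thresholds[name]
        if score > 0:
            flagged.append(name)
    return flagged


# Toy data (hypothetical): real embeddings have hundreds of dimensions.
concepts = {"concept_a": [1.0, 0.0, 0.0], "concept_b": [0.0, 1.0, 0.0]}
thresholds = {"concept_a": 0.8, "concept_b": 0.8}

print(flag_image([0.9, 0.1, 0.0], concepts, thresholds))  # ['concept_a']
```

Keeping a separate threshold per concept lets each one be tuned independently, at the cost of one more value to calibrate per concept.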

Core Capabilities

  • NSFW content detection in generated images
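  • Gender classification accuracy above 96% across demographic groups (a figure reported for the underlying CLIP model, not a safety-checking feature)
  • Racial classification accuracy of about 93% (likewise reported for the underlying CLIP model)
  • Age classification accuracy of about 63% (likewise reported for the underlying CLIP model)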

Frequently Asked Questions

Q: What makes this model unique?

This model is built specifically for content safety checking in image generation pipelines. It uses a CLIP-based architecture to identify inappropriate content in generated images.

Q: What are the recommended use cases?

The model is primarily intended for researchers and developers implementing safety features in image generation systems, particularly with Stable Diffusion. It should be used with the diffusers library rather than transformers.
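
As a rough sketch of what using the checker through diffusers can look like, the function below loads it and returns one flag per input image. It assumes that `diffusers`, `transformers`, `torch`, `numpy`, and `Pillow` are installed, that the model repository also provides a preprocessor configuration loadable by `CLIPImageProcessor`, and that weights can be downloaded on first use. The name `flag_nsfw` is hypothetical.

```python
def flag_nsfw(pil_images):
    """Return one boolean per PIL image: True where the checker flags it."""
    # Heavy imports live inside the function so that defining it is cheap.
    import numpy as np
    from diffusers.pipelines.stable_diffusion import StableDiffusionSafetyChecker
    from transformers import CLIPImageProcessor

    model_id = "CompVis/stable-diffusion-safety-checker"
    processor = CLIPImageProcessor.from_pretrained(model_id)
    checker = StableDiffusionSafetyChecker.from_pretrained(model_id)

    rgb_images = [img.convert("RGB") for img in pil_images]

    # Preprocessed pixel tensors for the CLIP vision encoder.
    clip_input = processor(rgb_images, return_tensors="pt").pixel_values

    # Float arrays in [0, 1]; the checker blanks out entries it flags.
    np_images = [np.asarray(img, dtype=np.float32) / 255.0 for img in rgb_images]

    _, has_nsfw = checker(images=np_images, clip_input=clip_input)
    return list(has_nsfw)
```

A call such as `flag_nsfw([Image.open("sample.png")])`, where `sample.png` is a placeholder path, would return a single-element list. In a full Stable Diffusion pipeline the same check is normally applied by the pipeline itself, so a standalone call like this is mainly useful for screening images produced elsewhere.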
