Arc2Face

Maintained by: FoivosPar

Property    Value
License     MIT
Paper       ArXiv Link
Language    English
Framework   Diffusers

What is Arc2Face?

Arc2Face is a face generation model that creates diverse, identity-consistent photos of a person using only their ArcFace ID-embedding. Built on Stable Diffusion v1-5, it pairs a finetuned CLIP ViT-L/14 encoder with a specialized UNet to generate facial images that preserve the subject's identity.

Implementation Details

The architecture has two main components: an encoder (a finetuned CLIP ViT-L/14) that projects ID-embeddings into the CLIP latent space, and the arc2face UNet, which performs the actual image generation. The model is trained on a restored version of the WebFace42M database and further finetuned on the FFHQ and CelebA-HQ datasets.

  • Finetuned CLIP ViT-L/14 encoder for ID embedding projection
  • Specialized UNet model for face generation
  • Additional ControlNet model for pose control
  • Built on Stable Diffusion v1-5 architecture
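
Because the framework is Diffusers, the components listed above could plausibly be assembled by swapping the finetuned encoder and the arc2face UNet into a standard Stable Diffusion v1-5 pipeline. The following is a minimal sketch under stated assumptions, not the maintainer's loading code: the `Arc2FaceComponents` class, the `load_pipeline` helper, the subfolder names, and the base model identifier are all hypothetical, and the encoder is passed in already loaded because its wrapper class is not described here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Arc2FaceComponents:
    """Where each component lives on disk (all names are assumptions)."""
    weights_dir: str                      # local folder holding the downloaded weights
    unet_subfolder: str = "arc2face"      # assumed subfolder for the arc2face UNet
    encoder_subfolder: str = "encoder"    # assumed subfolder for the finetuned CLIP encoder
    base_model: str = "stable-diffusion-v1-5/stable-diffusion-v1-5"  # assumed base checkpoint id


def load_pipeline(components: Arc2FaceComponents, encoder: Any) -> Any:
    """Build a Stable Diffusion pipeline with the UNet and text encoder replaced.

    `encoder` is the finetuned CLIP ViT-L/14 encoder, loaded by the caller in
    float16 so that it matches the UNet dtype used below.
    """
    # Heavy imports are kept local so this module can be imported without a GPU stack.
    import torch
    from diffusers import StableDiffusionPipeline, UNet2DConditionModel

    unet = UNet2DConditionModel.from_pretrained(
        components.weights_dir,
        subfolder=components.unet_subfolder,
        torch_dtype=torch.float16,
    )
    # Keyword arguments that name a pipeline component override the one
    # shipped with the base checkpoint.
    return StableDiffusionPipeline.from_pretrained(
        components.base_model,
        text_encoder=encoder,
        unet=unet,
        torch_dtype=torch.float16,
        safety_checker=None,
    )
```

Keeping the paths in a small dataclass makes it easy to point the same loader at a different weights folder. The ControlNet for pose control would be loaded separately and is left out of this sketch.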

Core Capabilities

  • Generate diverse facial images from ID embeddings
  • Maintain identity consistency across generations
  • Control pose through ControlNet integration
  • Support for frontal hemisphere poses
  • Single-person image generation
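
Identity consistency across generations is commonly checked by comparing face embeddings with cosine similarity, which is how ArcFace embeddings are normally compared. The sketch below shows that check in plain Python. It assumes the embeddings of the generated images have already been extracted by a face recognition model (not shown), and the function names are illustrative rather than part of Arc2Face.

```python
import math
from typing import Sequence


def l2_normalize(vec: Sequence[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def identity_consistency(
    reference: Sequence[float],
    generated: Sequence[Sequence[float]],
) -> float:
    """Mean cosine similarity between the input ID-embedding and the
    embeddings extracted from each generated image."""
    if not generated:
        raise ValueError("need at least one generated embedding")
    sims = [cosine_similarity(reference, g) for g in generated]
    return sum(sims) / len(sims)
```

For example, `identity_consistency(ref, [emb_1, emb_2])` returns a value near 1.0 when both generated faces stay close to the reference identity, and a lower value as they drift away from it.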

Frequently Asked Questions

Q: What makes this model unique?

Arc2Face conditions generation on ArcFace ID-embeddings alone, which keeps identity consistent across outputs while still allowing pose control and diverse results. Its architecture combines a finetuned CLIP ViT-L/14 encoder with a UNet built on Stable Diffusion v1-5.

Q: What are the recommended use cases?

The model is ideal for applications requiring identity-preserved face generation, such as avatar creation, face-based authentication testing, and facial analysis research. However, it's limited to single-person images and frontal hemisphere poses.
