SkyReels-A2

Property	Value
Author	Skywork
Model URL	https://huggingface.co/Skywork/SkyReels-A2
Video Resolution	Up to 720x1080 (Infinity version)
Publication Year	2025

What is SkyReels-A2?

SkyReels-A2 is a video diffusion transformer framework for composing and generating video from reference images. It introduces a dual-branch encoding system that extracts both spatial and semantic features from the reference images, so that the generated video stays coherent and consistent with them.

Implementation Details

The model employs two distinct processing branches: a spatial feature branch utilizing a fine-grained VAE encoder, and a semantic feature branch leveraging CLIP vision encoding with MLP projection. These features are integrated through diffusion transformer blocks, with semantic features incorporated via cross-attention layers.
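The data flow described above can be illustrated with a toy sketch. This is not the SkyReels-A2 implementation: the token counts, the feature width `DIM`, and the helper names (`cross_attention`, `toy_block`, `random_tokens`) are hypothetical, random vectors stand in for the VAE and CLIP outputs, and the attention has no learned projections. How the spatial features enter the transformer is not specified here, so the sketch simply appends them to the token sequence as an assumption; only the use of cross-attention for the semantic features follows the description.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Toy single-head cross-attention: queries attend to context tokens.

    The context tokens serve as both keys and values (no learned projections).
    """
    dim = len(queries[0])
    keys_t = [list(col) for col in zip(*context)]  # transpose to (dim, n_context)
    scores = matmul(queries, keys_t)
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, context), weights


def toy_block(video_tokens, spatial_tokens, semantic_tokens):
    """One toy block combining both conditioning branches.

    Assumption: spatial tokens are appended to the video token sequence.
    Semantic tokens are injected through cross-attention plus a residual add.
    """
    sequence = video_tokens + spatial_tokens
    attended, weights = cross_attention(sequence, semantic_tokens)
    updated = [[h + a for h, a in zip(h_row, a_row)]
               for h_row, a_row in zip(sequence, attended)]
    return updated, weights


rng = random.Random(0)


def random_tokens(count, dim):
    """Stand-in for encoder outputs: `count` random vectors of width `dim`."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


DIM = 8                                  # hypothetical feature width
video_tokens = random_tokens(6, DIM)     # stand-in for noisy video latents
spatial_tokens = random_tokens(4, DIM)   # stand-in for VAE-encoded references
semantic_tokens = random_tokens(3, DIM)  # stand-in for CLIP + MLP features

updated, weights = toy_block(video_tokens, spatial_tokens, semantic_tokens)
print(len(updated), len(updated[0]))     # sequence length and feature width
```

Each row of `weights` sums to 1 and records how strongly one sequence token draws on each semantic token, which is the mechanism that lets the semantic branch steer content without changing the sequence length.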

Supports multiple model variants, including A2-Wan2.1-14B-Preview and an upcoming Infinity version
Generates videos at 81x480x832 (frames x height x width) and higher
Built on a diffusion transformer architecture
Uses a dual-branch encoding system to extract both spatial and semantic features
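To make the 81x480x832 figure concrete, the following sketch reads it as frames x height x width and computes the size of the latent grid a video VAE would produce. The stride values (4 in time, 8 in space, with the first frame kept on its own) are assumptions typical of causal video VAEs and are not confirmed by this page; `VideoShape`, `parse_shape`, and `latent_grid` are hypothetical helper names.

```python
from typing import NamedTuple


class VideoShape(NamedTuple):
    frames: int
    height: int
    width: int


def parse_shape(text: str) -> VideoShape:
    """Parse a 'FxHxW' string such as '81x480x832'."""
    frames, height, width = (int(part) for part in text.lower().split("x"))
    return VideoShape(frames, height, width)


def latent_grid(shape: VideoShape, temporal_stride: int = 4,
                spatial_stride: int = 8) -> VideoShape:
    """Latent grid size under ASSUMED compression factors.

    Assumes the first frame maps to its own latent frame and every further
    group of `temporal_stride` frames maps to one more latent frame.
    """
    if (shape.frames - 1) % temporal_stride:
        raise ValueError("frames - 1 must be divisible by the temporal stride")
    if shape.height % spatial_stride or shape.width % spatial_stride:
        raise ValueError("height and width must be divisible by the spatial stride")
    return VideoShape(
        (shape.frames - 1) // temporal_stride + 1,
        shape.height // spatial_stride,
        shape.width // spatial_stride,
    )


shape = parse_shape("81x480x832")
print(shape)               # VideoShape(frames=81, height=480, width=832)
print(latent_grid(shape))  # VideoShape(frames=21, height=60, width=104)
```

Under these assumed strides, frame counts of the form 4k + 1 (such as 81) and spatial sizes divisible by 8 (such as 480 and 832) divide cleanly.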

Core Capabilities

High-resolution video generation
Semantic-aware content composition
Flexible video length handling (including infinite length in Infinity version)
Reference image-based composition control

Frequently Asked Questions

Q: What makes this model unique?

SkyReels-A2's distinctive feature is its dual-branch encoding system: spatial features and semantic features are extracted from the reference images by separate branches, which gives finer control over how those references appear in the generated video.

Q: What are the recommended use cases?

The model is particularly suited for video composition tasks requiring high-quality output, especially when specific reference images need to guide the generation process. It's ideal for creative content generation and video synthesis applications.
