GeometryCrafter

Property	Value
Developer	TencentARC
Model Type	Geometry Estimation
Hardware Requirements	40GB GPU (full resolution) / 22GB GPU (low resolution)
Repository	https://github.com/TencentARC/GeometryCrafter

What is GeometryCrafter?

GeometryCrafter is a model developed by TencentARC that estimates temporally consistent, high-quality point maps from open-world videos. These point maps support 3D/4D reconstruction and depth-based video editing, which makes the model useful for a range of computer vision applications.

Implementation Details

Throughput depends on the chosen resolution: about 1.27 FPS at full resolution (1024x576, width x height) and about 2.49 FPS at low resolution (640x384). The model relies on diffusion priors to keep geometry estimates consistent across video frames.
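To make the throughput figures concrete, the sketch below estimates how long a clip would take to process in each mode. It only divides the frame count by the quoted FPS values; `FPS_BY_MODE` and `estimate_processing_seconds` are hypothetical names introduced for illustration and are not part of the GeometryCrafter code base. Real timings will vary with hardware and settings.

```python
# Throughput figures quoted in this document (frames per second).
FPS_BY_MODE = {
    "full": 1.27,  # full-resolution mode
    "low": 2.49,   # low-resolution mode
}


def estimate_processing_seconds(num_frames: int, mode: str = "full") -> float:
    """Rough inference-time estimate for a clip (hypothetical helper)."""
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    return num_frames / FPS_BY_MODE[mode]


# A 10-second clip recorded at 30 frames per second has 300 frames.
frames = 10 * 30
full_s = estimate_processing_seconds(frames, "full")
low_s = estimate_processing_seconds(frames, "low")
print(f"full: {full_s:.0f} s, low: {low_s:.0f} s")  # full: 236 s, low: 120 s
```

Under these figures, the low-resolution mode roughly halves the processing time for the same clip.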

Supports multiple resolution modes for different hardware capabilities
Includes both standard and deterministic model variants
Includes visualization tools built on Viser
Supports evaluation on benchmark datasets
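As a rough illustration of matching a resolution mode to the available hardware, the sketch below picks a mode from the GPU memory requirements listed in the property table (40GB for full resolution, 22GB for low resolution). The function name `pick_resolution_mode` and the mode labels are assumptions made for this example; they do not correspond to any option in the repository.

```python
from typing import Optional

# Memory requirements taken from the property table, in gigabytes.
FULL_RES_MIN_GB = 40.0
LOW_RES_MIN_GB = 22.0


def pick_resolution_mode(gpu_memory_gb: float) -> Optional[str]:
    """Return "full", "low", or None if the GPU is too small (hypothetical helper)."""
    if gpu_memory_gb >= FULL_RES_MIN_GB:
        return "full"
    if gpu_memory_gb >= LOW_RES_MIN_GB:
        return "low"
    return None


for memory_gb in (80, 24, 16):
    print(memory_gb, "GB ->", pick_resolution_mode(memory_gb))
```

A GPU below the low-resolution threshold returns `None` here, which a caller could treat as a signal to reduce the workload further or stop.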

Core Capabilities

High-quality point map generation from videos
Temporal consistency maintenance across frames
Scale-invariant point map estimation
Affine-invariant depth estimation
Support for various video resolutions
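The terms "scale-invariant" and "affine-invariant" are commonly read as follows: a scale-invariant point map is correct up to one unknown global scale factor, and an affine-invariant depth map is correct up to an unknown scale and shift. The sketch below shows the usual least-squares alignment for each case in plain Python. It illustrates the general idea only and is not claimed to be GeometryCrafter's own evaluation procedure; `fit_scale` and `fit_scale_shift` are hypothetical helper names.

```python
from statistics import fmean


def fit_scale(pred, target):
    """Least-squares scale s minimizing sum((s * p - t) ** 2)."""
    num = sum(p * t for p, t in zip(pred, target))
    den = sum(p * p for p in pred)
    return num / den


def fit_scale_shift(pred, target):
    """Least-squares (s, b) minimizing sum((s * p + b - t) ** 2)."""
    mean_p, mean_t = fmean(pred), fmean(target)
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var = sum((p - mean_p) ** 2 for p in pred)
    s = cov / var
    return s, mean_t - s * mean_p


# Scale-invariant point map: flatten the (x, y, z) coordinates, fit one scale.
pred_points = [(1.0, 0.0, 2.0), (0.5, 1.0, 4.0)]
true_points = [(3.0, 0.0, 6.0), (1.5, 3.0, 12.0)]  # exactly 3x the prediction
flat_pred = [c for point in pred_points for c in point]
flat_true = [c for point in true_points for c in point]
scale = fit_scale(flat_pred, flat_true)

# Affine-invariant depth: fit scale and shift on per-pixel depth values.
pred_depth = [1.0, 2.0, 4.0]
true_depth = [3.0, 5.5, 10.5]  # equals 2.5 * pred_depth + 0.5
s, b = fit_scale_shift(pred_depth, true_depth)
aligned_depth = [s * d + b for d in pred_depth]
```

After alignment, the prediction can be compared with the reference using any ordinary error metric, since the unknown scale (and shift) has been factored out.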

Frequently Asked Questions

Q: What makes this model unique?

GeometryCrafter maintains temporal consistency when processing open-world videos and is available in standard and deterministic variants for different use cases. Its diffusion priors help keep geometry estimates robust across diverse scenes.

Q: What are the recommended use cases?

The model is suited to 3D/4D reconstruction, depth-based video editing, and geometry estimation in open-world scenarios. It is particularly useful for applications that need consistent point maps from video content.