Depth Any Video with Scalable Synthetic Data

Depth Any Video introduces a scalable synthetic data pipeline, capturing 40,000 video clips from diverse games, and leverages powerful priors of generative video diffusion models to advance video depth estimation. By incorporating rotary position encoding, flow matching, and a mixed-duration training strategy, it robustly handles varying video lengths and frame rates. Additionally, a novel depth interpolation method enables high-resolution depth inference, achieving superior spatial accuracy and temporal consistency over previous models.

This repository is the official implementation of the paper:

Depth Any Video with Scalable Synthetic Data

Honghui Yang*, Di Huang*, Wei Yin, Chunhua Shen, Haifeng Liu, Xiaofei He, Binbin Lin+, Wanli Ouyang, Tong He+

News

[2024-10-20] The Replicate Demo and API is added here.

[2024-10-20] The Hugging Face online demo is live here.

[2024-10-15] The arXiv submission is available here.

Installation

Setting up the environment with conda. With support for the app.

git clone https://github.com/Nightmare-n/DepthAnyVideo
cd DepthAnyVideo

# create env using conda
conda create -n dav python==3.10
conda activate dav
pip install -r requirements.txt
pip install gradio

Inference

To run inference on an image, use the following command:

python run_infer.py --data_path ./demos/arch_2.jpg --output_dir ./outputs/ --max_resolution 2048

To run inference on a video, use the following command:

python run_infer.py --data_path ./demos/wooly_mammoth.mp4 --output_dir ./outputs/ --max_resolution 960

Citation

If you find our work useful, please cite:

@article{yang2024depthanyvideo,
  author    = {Honghui Yang and Di Huang and Wei Yin and Chunhua Shen and Haifeng Liu and Xiaofei He and Binbin Lin and Wanli Ouyang and Tong He},
  title     = {Depth Any Video with Scalable Synthetic Data},
  journal   = {arXiv preprint arXiv:2410.10815},
  year      = {2024}
}

Name	Name	Last commit message	Last commit date
Latest commit Nightmare-n update readme Oct 20, 2024 10ca9aa · Oct 20, 2024 History 10 Commits
assets	assets	clean cache history	Oct 19, 2024
dav	dav	add core eval	Oct 20, 2024
demos	demos	clean cache history	Oct 19, 2024
.gitignore	.gitignore	clean cache history	Oct 19, 2024
LICENSE	LICENSE	clean cache history	Oct 19, 2024
README.md	README.md	update readme	Oct 20, 2024
app.py	app.py	Major Updates to app following latest	Oct 19, 2024
cog.yaml	cog.yaml	replicate demo	Oct 20, 2024
predict.py	predict.py	replicate demo	Oct 20, 2024
requirements.txt	requirements.txt	clean cache history	Oct 19, 2024
run_infer.py	run_infer.py	clean cache history	Oct 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Depth Any Video with Scalable Synthetic Data

News

Installation

Inference

Citation

About

Releases

Packages

Languages

License

GPTAlgoPro/DepthAnyVideo

Folders and files

Latest commit

History

Repository files navigation

Depth Any Video with Scalable Synthetic Data

News

Installation

Inference

Citation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages