FastVideo is a lightweight framework for accelerating large video diffusion models.
FastMochi-Demo.mp4
🤗 FastMochi | 🤗 FastHunyuan | 🔍 Discord
FastVideo currently offers (with more to come):
- FastHunyuan and FastMochi: consistency distilled video diffusion models for 8x inference speedup.
- First open distillation recipes for video DiT, based on PCM.
- Support for distillation, finetuning, and inference with state-of-the-art open video DiTs: Mochi and Hunyuan.
- Scalable training with FSDP, sequence parallelism, and selective activation checkpointing, with near linear scaling to 64 GPUs.
- Memory efficient finetuning with LoRA, precomputed latent, and precomputed text embeddings.
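As a rough illustration of the sequence-parallel idea from the feature list above, each rank holds only a contiguous slice of the latent token sequence. The helper below is a hypothetical sketch of that sharding step, not FastVideo's actual implementation.

```python
def shard_sequence(tokens, world_size, rank):
    """Return the contiguous slice of `tokens` owned by `rank`.

    Minimal sketch of sequence-parallel sharding: the latent token
    sequence is split evenly across ranks so each GPU stores
    activations only for its own chunk. Hypothetical helper, not
    part of the FastVideo API.
    """
    assert len(tokens) % world_size == 0, "sequence must divide evenly across ranks"
    chunk = len(tokens) // world_size
    return tokens[rank * chunk:(rank + 1) * chunk]

# With 8 tokens and 4 ranks, rank 1 owns tokens 2 and 3.
print(shard_sequence(list(range(8)), world_size=4, rank=1))  # → [2, 3]
```

In a real run, attention layers would additionally exchange shards (e.g. via all-to-all) so each rank can still attend over the full sequence.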
Development is in progress and the codebase is highly experimental.
2024/12/17: FastVideo v0.1 is released.
The code is tested with Python 3.10.0 and CUDA 12.1 on H100 GPUs.
./env_setup.sh fastvideo
We recommend using a GPU with 80GB of memory. To run inference, use the following commands:
# Download the model weight
python scripts/huggingface/download_hf.py --repo_id=FastVideo/FastMochi-diffusers --local_dir=data/FastMochi-diffusers --repo_type=model
# CLI inference
bash scripts/inference/inference_mochi_sp.sh
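The distilled models sample in a handful of denoising steps rather than the usual several dozen, which is where the quoted 8x speedup comes from. The snippet below is a hypothetical sketch of how a few-step sampler might pick its timesteps from a 1000-step training schedule; the schedule actually used by the inference scripts may differ.

```python
def few_step_timesteps(num_train_timesteps=1000, num_inference_steps=8):
    """Evenly spaced, descending timesteps for a few-step distilled sampler.

    Hypothetical sketch; the real FastMochi/FastHunyuan schedule
    may choose timesteps differently.
    """
    stride = num_train_timesteps // num_inference_steps
    return [num_train_timesteps - 1 - i * stride for i in range(num_inference_steps)]

print(few_step_timesteps())  # → [999, 874, 749, 624, 499, 374, 249, 124]
```

With 8 steps instead of 64, the denoising loop does roughly one-eighth of the DiT forward passes per video.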
# Download the model weight
python scripts/huggingface/download_hf.py --repo_id=FastVideo/FastHunyuan --local_dir=data/FastHunyuan --repo_type=model
# CLI inference
sh scripts/inference/inference_hunyuan.sh
You can also run FastHunyuan inference using the official Hunyuan GitHub repository.
FastHunyuan-Demo.mp4
compare-hori.mp4
Please refer to the distillation guide.
Please refer to the finetuning guide.
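Part of the memory savings mentioned in the feature list comes from precomputing text embeddings once and reusing them, so the text encoder never has to stay resident during finetuning. Below is a minimal on-disk caching sketch with a hypothetical `encode_fn`; FastVideo's actual precompute scripts work differently.

```python
import hashlib
import json
import os

def cached_text_embedding(prompt, encode_fn, cache_dir="txt_emb_cache"):
    """Encode `prompt` once and reuse the result from disk afterwards.

    Minimal sketch of the precomputed-text-embedding idea: run the
    (expensive) text encoder a single time per prompt, then finetune
    against the cached embeddings. `encode_fn` is a hypothetical
    callable returning a JSON-serializable embedding.
    """
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:16]
    path = os.path.join(cache_dir, key + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    embedding = encode_fn(prompt)
    with open(path, "w") as f:
        json.dump(embedding, f)
    return embedding
```

During finetuning, the dataloader would then read only the cached files, keeping the text encoder off the GPU entirely; the same trick applies to precomputed VAE latents.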
We learned from and reused code from the following projects: PCM, diffusers, OpenSoraPlan, and xDiT.
We thank MBZUAI and Anyscale for their support throughout this project.