From 6debd46482a9f51e108673c752e873970b6b334b Mon Sep 17 00:00:00 2001
From: Your Name
Date: Tue, 17 Dec 2024 12:20:42 -0800
Subject: [PATCH] merge docs

---
 README.md            | 71 +++++++++++++++++++++++++++++++++++++++++---
 docs/distillation.md | 23 --------------
 docs/finetuning.md   | 41 -------------------------
 3 files changed, 67 insertions(+), 68 deletions(-)
 delete mode 100644 docs/distillation.md
 delete mode 100644 docs/finetuning.md

diff --git a/README.md b/README.md
index d7a6beb..98d3576 100644
--- a/README.md
+++ b/README.md
@@ -63,11 +63,74 @@ https://github.com/user-attachments/assets/064ac1d2-11ed-4a0c-955b-4d412a96ef30
 
 https://github.com/user-attachments/assets/122cfa1a-e2a3-47a5-80c8-b8852d347d9a
 
-## Distillation
-Please refer to the [distillation guide](docs/distillation.md).
+## 🎯 Distill
+
+Our distillation recipe is based on the [Phased Consistency Model](https://github.com/G-U-N/Phased-Consistency-Model). We did not find significant improvement using multi-phase distillation, so we keep the one-phase setup similar to the original latent consistency model's recipe.
+
+We use the [MixKit](https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.1.0/tree/main/all_mixkit) dataset for distillation. To avoid running the text encoder and VAE during training, we preprocess all data to generate text embeddings and VAE latents.
+
+Preprocessing instructions can be found in [data_preprocess.md](docs/data_preprocess.md). For convenience, we also provide preprocessed data that can be downloaded directly using the following command:
+
+```bash
+python scripts/huggingface/download_hf.py --repo_id=FastVideo/HD-Mixkit-Finetune-Hunyuan --local_dir=data/HD-Mixkit-Finetune-Hunyuan --repo_type=dataset
+```
+Next, download the original model weights with:
+
+```bash
+python scripts/huggingface/download_hf.py --repo_id=FastVideo/hunyuan --local_dir=data/hunyuan --repo_type=model
+```
+To launch the distillation process, use the following commands:
+
+```bash
+bash scripts/distill/distill_mochi.sh   # for mochi
+bash scripts/distill/distill_hunyuan.sh # for hunyuan
+```
+We also provide an optional script for distillation with adversarial loss, located at `fastvideo/distill_adv.py`. Although we tried adversarial loss, we did not observe significant improvements from it.
+
+## Finetune
+
+### ⚡ Full Finetune
+
+Ensure your data is prepared and preprocessed in the format specified in [data_preprocess.md](docs/data_preprocess.md). For convenience, we also provide preprocessed Black Myth: Wukong data for Mochi that can be downloaded directly:
+```bash
+python scripts/huggingface/download_hf.py --repo_id=FastVideo/Mochi-Black-Myth --local_dir=data/Mochi-Black-Myth --repo_type=dataset
+```
+Download the original model weights with:
+```bash
+python scripts/huggingface/download_hf.py --repo_id=genmo/mochi-1-preview --local_dir=data/mochi --repo_type=model
+python scripts/huggingface/download_hf.py --repo_id=FastVideo/hunyuan --local_dir=data/hunyuan --repo_type=model
+```
+
+Then you can run finetuning with:
+```bash
+bash scripts/finetune/finetune_mochi.sh # for mochi
+```
+**Note that for finetuning, we did not tune the hyperparameters in the provided script.**
+
+### ⚡ LoRA Finetune
+
+Currently, we only provide LoRA finetuning for the Mochi model. The command for LoRA finetuning is:
+```bash
+bash scripts/finetune/finetune_mochi_lora.sh
+```
+### Minimum Hardware Requirements
+- 40 GB GPU memory each for 2 GPUs with LoRA
+- 30 GB GPU memory each for 2 GPUs with CPU offload and LoRA
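+
+Both requirements above assume a 2-GPU run. On a machine with more GPUs, a minimal sketch for pinning the LoRA run to two devices uses the standard `CUDA_VISIBLE_DEVICES` variable; the device indices are illustrative, and this assumes the launch script does not override device selection itself:
+
+```bash
+# Sketch: restrict the LoRA finetune to two GPUs; indices 0,1 are an example.
+CUDA_VISIBLE_DEVICES=0,1 bash scripts/finetune/finetune_mochi_lora.sh
+```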
+
+### Finetune with Both Image and Video
+Our codebase supports finetuning with both images and videos.
+
+```bash
+bash scripts/finetune/finetune_hunyuan.sh
+bash scripts/finetune/finetune_mochi_lora_mix.sh
+```
+For image-video mixture finetuning, make sure to enable the `--group_frame` option in your script.
+
-## Finetuning
-Please refer to the [finetuning guide](docs/finetuning.md).
 
 ## Acknowledgement
 We learned and reused code from the following projects: [PCM](https://github.com/G-U-N/Phased-Consistency-Model), [diffusers](https://github.com/huggingface/diffusers), [OpenSoraPlan](https://github.com/PKU-YuanGroup/Open-Sora-Plan), and [xDiT](https://github.com/xdit-project/xDiT).
diff --git a/docs/distillation.md b/docs/distillation.md
deleted file mode 100644
index 89ef2bd..0000000
--- a/docs/distillation.md
+++ /dev/null
@@ -1,23 +0,0 @@
-## 🎯 Distill
-
-Our distillation recipe is based on [Phased Consistency Model](https://github.com/G-U-N/Phased-Consistency-Model). We did not find significant improvement using multi-phase distillation, so we keep the one phase setup similar to the original latent consistency model's recipe.
-
-We use the [MixKit](https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.1.0/tree/main/all_mixkit) dataset for distillation. To avoid running the text encoder and VAE during training, we preprocess all data to generate text embeddings and VAE latents.
-
-Preprocessing instructions can be found [data_preprocess.md](docs/data_preprocess.md). For convenience, we also provide preprocessed data that can be downloaded directly using the following command:
-
-```bash
-python scripts/huggingface/download_hf.py --repo_id=FastVideo/HD-Mixkit-Finetune-Hunyuan --local_dir=data/HD-Mixkit-Finetune-Hunyuan --repo_type=dataset
-```
-Next, download the original model weights with:
-
-```bash
-python scripts/huggingface/download_hf.py --repo_id=FastVideo/hunyuan --local_dir=data/hunyuan --repo_type=model
-```
-To launch the distillation process, use the following commands:
-
-```
-bash scripts/distill/distill_mochi.sh # for mochi
-bash scripts/distill/distill_hunyuan.sh # for hunyuan
-```
-We also provide an optional script for distillation with adversarial loss, located at `fastvideo/distill_adv.py`. Although we tried adversarial loss, we did not observe significant improvements.
diff --git a/docs/finetuning.md b/docs/finetuning.md
deleted file mode 100644
index 74428a7..0000000
--- a/docs/finetuning.md
+++ /dev/null
@@ -1,41 +0,0 @@
-
-## ⚡ Full Finetune
-
-Ensure your data is prepared and preprocessed in the format specified in [data_preprocess.md](docs/data_preprocess.md).
-For convenience, we also provide a mochi preprocessed Black Myth Wukong data that can be downloaded directly:
-```bash
-python scripts/huggingface/download_hf.py --repo_id=FastVideo/Mochi-Black-Myth --local_dir=data/Mochi-Black-Myth --repo_type=dataset
-```
-Download the original model weights with:
-```bash
-python scripts/huggingface/download_hf.py --repo_id=genmo/mochi-1-preview --local_dir=data/mochi --repo_type=model
-python scripts/huggingface/download_hf.py --repo_id=FastVideo/hunyuan --local_dir=data/hunyuan --repo_type=model
-```
-
-Then you can run the finetune with:
-```
-bash scripts/finetune/finetune_mochi.sh # for mochi
-```
-**Note that we did not tune the hyperparameters in the provided script**
-
-## ⚡ Lora Finetune
-
-Currently, we only provide Lora Finetune for Mochi model, the command for Lora Finetune is
-```
-bash scripts/finetune/finetune_mochi_lora.sh
-```
-### Minimum Hardware Requirement
-- 40 GB GPU memory each for 2 GPUs with lora
-- 30 GB GPU memory each for 2 GPUs with CPU offload and lora.
-
-## Finetune with Both Image and Video
-Our codebase support finetuning with both image and video.
-
-```bash
-bash scripts/finetune/finetune_hunyuan.sh
-bash scripts/finetune/finetune_mochi_lora_mix.sh
-```
-For Image-Video Mixture Fine-tuning, make sure to enable the --group_frame option in your script.
-
-
-