Zhicong Tang
We propose Model-guidance (MG) for training diffusion models, which removes the commonly used Classifier-free guidance (CFG) and achieves SOTA on ImageNet-256 conditional generation with FID=1.34.
conda create -n mg python=3.12 -y
conda activate mg
pip install -r requirements.txt
We provide FID stat files (from ADM) and the checkpoint of our final SiT-XL/2 model. To download and evaluate them, run:
bash download.sh
torchrun --nnodes=2 --nproc_per_node=8 test.py
You can adjust nnodes and nproc_per_node according to your environment. However, changing them may affect evaluation results, because the random seed depends on the GPU rank (see train.py).
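To illustrate why the process layout matters, here is a minimal sketch of a rank-dependent seed. The formula is an assumption modeled on the convention in the DiT scripts (global seed times world size, plus rank), and rank_seed is a hypothetical helper. The actual rule is the one in train.py.

```python
def rank_seed(global_seed: int, world_size: int, rank: int) -> int:
    """Hypothetical per-rank seed (assumed DiT-style rule, not taken from train.py)."""
    return global_seed * world_size + rank


# Same global seed, two different launch configurations.
seeds_16 = [rank_seed(1, 16, r) for r in range(16)]  # 2 nodes x 8 GPUs
seeds_4 = [rank_seed(1, 4, r) for r in range(4)]     # 1 node x 4 GPUs

print(seeds_16)
print(seeds_4)
```

Under this assumed rule the two configurations draw from different seeds, so the sampled noise, and therefore the measured FID, can differ slightly between them.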
The hyper-parameters of the final SiT-XL/2 model in our paper are set as the defaults of train.py. You can train your own models with the following command:
torchrun --nnodes=2 --nproc_per_node=8 train.py
# Or use a REPA checkpoint as initialization. This does not affect final performance.
# torchrun --nnodes=2 --nproc_per_node=8 train.py --ckpt-path output/SiT-XL-2-REPA.pt
If you find our work useful, please consider citing us:
@article{tang2025diffusion,
title={Diffusion Models without Classifier-free Guidance},
author={Zhicong Tang and Jianmin Bao and Dong Chen and Baining Guo},
journal={arXiv preprint arXiv:2502.12154},
year={2025}
}
This code is mainly built upon the DiT, SiT, and REPA repositories.