Personalized text-to-image generation aims to create images tailored to user-defined concepts and textual descriptions. Balancing the fidelity of the learned concept with its adaptability to diverse generation contexts is a significant challenge. Existing methods often address this through diverse fine-tuning parameterizations and improved sampling strategies that integrate superclass trajectories during the diffusion process. While improved sampling offers a cost-effective, training-free way to enhance fine-tuned models, systematic analyses of these methods remain limited. Current approaches typically tie sampling strategies to fixed fine-tuning configurations, making it difficult to isolate their impact on generation outcomes. To address this, we systematically analyze sampling strategies beyond fine-tuning, exploring the impact of concept and superclass trajectories on the results. Building on this analysis, we propose a decision framework that evaluates text alignment, computational constraints, and fidelity objectives to guide strategy selection. It integrates with diverse architectures and training approaches, systematically optimizing concept preservation, prompt adherence, and resource efficiency.
We analyze sampling strategies in personalized image generation, showing how superclass trajectories optimize the fidelity-adaptability trade-off, and offer a framework for selecting a strategy based on text alignment, efficiency, and identity preservation.
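For intuition, the sketch below shows one way superclass-trajectory mixing can be expressed as a plain diffusers denoising loop: at every step the U-Net is queried with both the concept prompt and the superclass prompt, and the two conditional noise predictions are blended before classifier-free guidance. The model ID, prompts, and mixing weight `alpha` are illustrative assumptions; this is not the repository's sampling code (the actual pipelines live in persongen/model and are driven through inference.py).

```python
# Illustrative sketch only: mixing "concept" and "superclass" trajectories
# in a plain diffusers denoising loop. Prompts, model ID, and alpha are placeholders.
import torch
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-base",
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

concept_prompt = "a photo of sks dog on the beach"    # prompt with the learned concept token
superclass_prompt = "a photo of a dog on the beach"   # same prompt with the superclass word
alpha, guidance, steps = 0.5, 7.5, 50                 # alpha trades fidelity vs. adaptability

# Encode the two conditional prompts and the unconditional (empty) prompt once.
concept_emb, _ = pipe.encode_prompt(concept_prompt, device, 1, False)
super_emb, _ = pipe.encode_prompt(superclass_prompt, device, 1, False)
uncond_emb, _ = pipe.encode_prompt("", device, 1, False)

pipe.scheduler.set_timesteps(steps, device=device)
latents = torch.randn(
    (1, pipe.unet.config.in_channels, 64, 64),
    device=device, dtype=concept_emb.dtype,
) * pipe.scheduler.init_noise_sigma

with torch.no_grad():
    for t in pipe.scheduler.timesteps:
        latent_in = pipe.scheduler.scale_model_input(latents, t)
        eps_concept = pipe.unet(latent_in, t, encoder_hidden_states=concept_emb).sample
        eps_super = pipe.unet(latent_in, t, encoder_hidden_states=super_emb).sample
        eps_uncond = pipe.unet(latent_in, t, encoder_hidden_states=uncond_emb).sample
        # Blend the two conditional predictions, then apply classifier-free guidance.
        eps_cond = alpha * eps_concept + (1.0 - alpha) * eps_super
        eps = eps_uncond + guidance * (eps_cond - eps_uncond)
        latents = pipe.scheduler.step(eps, t, latents).prev_sample

    image = pipe.vae.decode(latents / pipe.vae.config.scaling_factor).sample

image = pipe.image_processor.postprocess(image, output_type="pil")[0]
image.save("mixed_trajectory_sample.png")
```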
- [30/01/2025] 🔥🔥🔥 Beyond Fine-Tuning release. The paper is available on arXiv.
You need the following hardware and Python version to run our method:
- Linux
- NVIDIA GPU + CUDA cuDNN
- Conda 24.1.0+ or Python 3.11+
- Clone this repo:
```bash
git clone https://github.com/V-Soboleva/persongen
cd persongen
```
- Create a Conda environment:
```bash
conda create -n persongen python=3.11
conda activate persongen
```
- Install the dependencies in your environment:
```bash
pip install -r requirements.txt
```
- Compile binaries for face detection metrics:
```bash
cd nb_utils/face_align/PIPNet/FaceBoxesV2/utils
bash ./make.sh
```
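Optionally, you can run a quick sanity check before training. This assumes torch and diffusers are installed via requirements.txt, which is expected for this stack:

```python
# Minimal environment check; assumes torch and diffusers come from requirements.txt.
import torch
import diffusers

print("PyTorch:", torch.__version__)
print("diffusers:", diffusers.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```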
Here you can find examples of how to run training and inference on top of the SD-2.0 model. Detailed instructions for SD-XL and PixArt-alpha are provided in the `sdxl` and `pixart` branches.

Inference CLI commands and example notebooks are also available; see docs/INFERENCE.md for the inference CLI and docs/TRAINING.md for the training CLI.
```
.
├── 📂 baselines
│ ├── 📂 custom_diffusion
│ │ ├── ...
│ │ ├── 📄 inference.py # Implementation of CD inference
│ │ └── 📄 train_custom_diffusion.py # Implementation of CD training
│ ├── 📂 elite
│ │ ├── 📂 dreambooth
│ │ ├── ...
│ │ └── 📄 inference.py # Implementation of ELITE inference
│ ├── 📂 profusion
│ │ ├── 📄 inference.py # Implementation of Profusion inference
│ │ └── 📄 pipeline.py # Implementation of Profusion pipelines
│ └── 📂 textual_inversion
│ ├── 📄 inference.py # Implementation of TI inference
│ └── 📄 textual_inversion.py # Implementation of TI training
├── 📂 docs
│ ├── 📂 assets # Folder with diagrams and teaser
│ ├── 📜 INFERENCE.md # File with inference CLI
│ └── 📜 TRAINING.md # File with training CLI
├── 📂 dreambooth
│ ├── 📂 aug_dataset # Preprocessed DB datasets for training
│ ├── 📂 dataset # Original DB datasets for evaluation
│ └── ...
├── 📂 nb_utils # Folder with utility functions
├── 📂 persongen
│ ├── 📂 data # Implementation of image datasets
│ ├── 📂 model # Implementation of sampling pipelines and SVDDiff
│ ├── 📂 utils # Folder with utility functions
│ ├── 📄 ...
│ ├── 📄 inferencer.py # Wrapper for sampling pipelines
│ └── 📄 trainer.py # Wrapper for training methods
│
├── 📜 README.md # This file
├── 📜 requirements.txt # List of required Python packages
│
├── 📄 inference.py # Wrapper for inference
└── 📄 train.py # Wrapper for training
```
This repository builds on several codebases:
- Implementations of the Custom Diffusion and Textual Inversion methods from diffusers
- Implementation of SVDDiff
- Pretrained ELITE models and inference code
- DreamBooth dataset and prompts
- Implementation of ProFusion sampling
If you use this code or our findings in your research, please cite our paper:
```bibtex
@misc{soboleva2025finetuningsystematicstudysampling,
      title={Beyond Fine-Tuning: A Systematic Study of Sampling Techniques in Personalized Image Generation},
      author={Vera Soboleva and Maksim Nakhodnov and Aibek Alanov},
      year={2025},
      eprint={2502.05895},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.05895},
}
```