Stars
A Training-free Iterative Framework for Long Story Visualization
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
[NeurIPS 2024] Generalizable Implicit Motion Modeling for Video Frame Interpolation
Redux StyleModelApply adds more controls
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A Python package to streamline evaluation of unconditional image generation models
Paper: "From Text to Pose to Image: Improving Diffusion Model Control and Quality"
DSPy: The framework for programming—not prompting—language models
Official repository of In-Context LoRA for Diffusion Transformers
An official PyTorch implementation of "MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts"
Nodes for image juxtaposition for Flux in ComfyUI
Official PyTorch implementation of "Visual Style Prompting with Swapping Self-Attention"
InstructG2I: Synthesizing Images from Multimodal Attributed Graphs (NeurIPS 2024)
Official Implementation of "Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function" [NeurIPS 2024]
[ECCV 2024] OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models
A Collection of Papers and Codes for CVPR2024/ECCV2024 AIGC
CSGO: Content-Style Composition in Text-to-Image Generation 🔥
A powerful tool that translates ComfyUI workflows into executable Python code.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Official code for "RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control"
Code release: https://github.com/google/RB-Modulation