
# ControlNet for Affordance based Style Transfer

The goal of image style transfer has so far been to render an image with artistic features guided by a style reference while maintaining the original content. The drawback of neural style transfer algorithms is that they impose a single style uniformly on all parts of the image, which makes it difficult to transfer objects from a scene into a thematically different scene. Affordance-based style transfer instead preserves the underlying object interactions of the input image and performs a more open-ended style transfer: an object is free to become anything in the target theme, as long as its affordance is preserved.

Consider a robot trained to navigate predominantly indoors that is to be deployed in a new outdoor environment: its perception stack is familiar with indoor objects and cannot generalize to objects in the wild. With affordance-based style transfer, we can morph the new environment into a domain familiar to the robot while ensuring that the functionality of objects in the scene is unchanged. This type of style transfer can also find applications in mixed reality (MR), where we can directly restyle the user's surroundings instead of creating an environment from the ground up.

## Approach 1: ControlNet with Affordance-aware Semantic Segmentation

*(pipeline diagram)*

### Results

*(qualitative result images)*

## Approach 2: Blended Diffusion with ControlNet Style Transfer

*(pipeline diagram)*

### Results

*(qualitative result images)*

## Approach 3: End-to-End Affordance-Based Style Transfer

*(pipeline diagram)*

### Results

*(qualitative result images)*

## Approach 4: End-to-End Depth-Aware Affordance-Based Style Transfer

*(pipeline diagram)*

### Results

*(qualitative result image)*

## Usage

First, create and activate the conda environment:

```bash
conda env create -f environment.yaml
conda activate aff_control
```

Download the pretrained ControlNet weights and place them in the `models` folder:

- ADE20k and IIT-AFF mask conditioning: https://drive.google.com/drive/folders/1QeWJuPeEcyJLL3qNXkY7TwLUaO0s5NjG?usp=sharing
- Affordance + MiDaS depth conditioning: https://drive.google.com/file/d/1IIMiMtrD--E2mY5g0zspqphOoOeCl64I/view?usp=share_link
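To fetch the weights from the command line instead, a minimal sketch using the `gdown` package could look like this (`gdown` is an assumption here; it is not part of this repo's environment file):

```python
# Hypothetical download helper using gdown (pip install gdown);
# the repo itself does not ship this script.
import gdown

# Folder with the ADE20k / IIT-AFF ControlNet weights
gdown.download_folder(
    url="https://drive.google.com/drive/folders/1QeWJuPeEcyJLL3qNXkY7TwLUaO0s5NjG?usp=sharing",
    output="models",
    quiet=False,
)

# Single checkpoint for affordance + MiDaS depth conditioning
gdown.download(
    url="https://drive.google.com/file/d/1IIMiMtrD--E2mY5g0zspqphOoOeCl64I/view?usp=share_link",
    output="models/",
    fuzzy=True,  # let gdown extract the file id from the share link
    quiet=False,
)
```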

## Training

Minimum system requirements: an A10 GPU with 24 GB VRAM.

First, run the setup script:

```bash
bash aff_setup/setup.bash
```

To train ControlNet on affordance conditioning, run:

```bash
python train_affordance.py
```
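For orientation, upstream ControlNet wires its trainers up with PyTorch Lightning; the sketch below follows that tutorial trainer, and its paths, hyperparameters, and dataset class are illustrative stand-ins rather than this repo's exact `train_affordance.py` configuration:

```python
# Sketch of a ControlNet training entry point in the style of upstream
# ControlNet's tutorial_train.py; all names here are illustrative.
import pytorch_lightning as pl
from torch.utils.data import DataLoader
from cldm.model import create_model, load_state_dict
from tutorial_dataset import MyDataset  # stand-in for the affordance dataset

# Build the model and initialize it from a ControlNet-augmented SD checkpoint
model = create_model('./models/cldm_v15.yaml').cpu()
model.load_state_dict(load_state_dict('./models/control_sd15_ini.ckpt', location='cpu'))
model.learning_rate = 1e-5
model.sd_locked = True        # freeze the SD backbone; train only the control branch
model.only_mid_control = False

# The dataset yields dicts with 'jpg' (target image), 'txt' (caption),
# and 'hint' (the affordance segmentation mask)
dataloader = DataLoader(MyDataset(), num_workers=0, batch_size=4, shuffle=True)

trainer = pl.Trainer(gpus=1, precision=32)
trainer.fit(model, dataloader)
```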

To train ControlNet on depth + affordance conditioning, run:

```bash
python train_affordance_with_depth.py
```
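The depth-aware variant conditions on both the affordance mask and a MiDaS depth map. One plausible way to combine them into a single hint is channel concatenation; the sketch below is an assumption about the input format (with hypothetical file names), not a description of what `train_affordance_with_depth.py` actually does:

```python
# Hypothetical construction of a combined affordance + depth hint by
# stacking channels; file names are illustrative.
import cv2
import numpy as np

aff = cv2.imread('example_affordance_mask.png')                      # H x W x 3, color-coded
depth = cv2.imread('example_midas_depth.png', cv2.IMREAD_GRAYSCALE)  # H x W

aff = aff.astype(np.float32) / 255.0
depth = depth.astype(np.float32) / 255.0

hint = np.concatenate([aff, depth[..., None]], axis=-1)  # H x W x 4 conditioning tensor
```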

## Inference

System requirements: a K80 GPU with 12 GB VRAM suffices.

Launch a Gradio app where you can upload your images and run inference:

```bash
python gradio_aff2image.py
```

Make sure the correct weights are referenced in the `gradio_aff2image.py` file.
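For reference, the general shape of such an app with the standard Gradio `Interface` API is sketched below; the real `gradio_aff2image.py` defines its own processing function with the loaded ControlNet, prompts, and sampler settings:

```python
# Minimal sketch of an affordance-to-image demo; the actual sampling call
# is elided and all names here are illustrative, not the repo's exact code.
import gradio as gr

def process(affordance_mask, prompt):
    # Run ControlNet sampling conditioned on the uploaded mask (omitted)
    generated = affordance_mask  # placeholder for the model output
    return [generated]

demo = gr.Interface(
    fn=process,
    inputs=[gr.Image(label="Affordance mask"), gr.Textbox(label="Prompt")],
    outputs=gr.Gallery(label="Generated images"),
)
demo.launch()
```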

## Credits

```bibtex
@misc{zhang2023adding,
  title={Adding Conditional Control to Text-to-Image Diffusion Models},
  author={Lvmin Zhang and Maneesh Agrawala},
  year={2023},
  eprint={2302.05543},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```

arXiv: https://arxiv.org/abs/2302.05543
