Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

This is a cog wrapper for the following paper:

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

[Arxiv] [Website] [Colab]

This repo contains code to generate visual anagrams and other multi-view optical illusions. These are images that change appearance or identity when transformed, such as by a rotation, a color inversion, or a jigsaw rearrangement. Please read our paper or visit our website for more details.
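To make the idea of a "view" concrete, here is a minimal sketch of two of the transformations mentioned above, a 180-degree rotation and a color inversion, applied to a tiny grayscale image stored as nested lists. The function names `rotate_180` and `invert_colors` are hypothetical and chosen for illustration only; the view implementations in this repo may look quite different.

```python
def rotate_180(image):
    """Rotate a 2D image (list of rows) by 180 degrees."""
    # Reverse the row order, then reverse each row.
    return [row[::-1] for row in image[::-1]]


def invert_colors(image, max_value=255):
    """Invert pixel intensities, assuming values in [0, max_value]."""
    return [[max_value - pixel for pixel in row] for row in image]


# A 2x2 grayscale "image" used purely as toy data.
image = [
    [0, 64],
    [128, 255],
]

rotated = rotate_180(image)
inverted = invert_colors(image)

# Both transformations undo themselves when applied twice.
print(rotate_180(rotated) == image)
print(invert_colors(inverted) == image)
```

Both toy transformations are invertible and only rearrange or remap pixel values, which is what lets a single image carry a different interpretation in each view.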

Usage

cog predict \
  -i 'seed=0' \
  -i 'style="an oil painting of "' \
  -i 'video=true' \
  -i 'views="identity, jigsaw"' \
  -i 'prompts="a rabbit, a coffee cup"' \
  -i 'num_samples=1' \
  -i 'guidance_scale_1=10' \
  -i 'guidance_scale_2=10' \
  -i 'num_inference_steps_1=30' \
  -i 'num_inference_steps_2=30'
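
The `views` and `prompts` inputs above are both comma-separated lists of the same length. The sketch below shows one plausible reading of how they fit together: each view is paired with the prompt in the same position, and the `style` string is prepended to each subject. This pairing and the helper name `pair_views_with_prompts` are assumptions made for illustration, not a description of what `predict.py` actually does.

```python
def pair_views_with_prompts(style, views, prompts):
    """Pair each view with a full prompt (hypothetical helper).

    Assumes `views` and `prompts` are comma-separated strings, as in the
    usage example, and that `style` is prepended to every subject.
    """
    view_names = [v.strip() for v in views.split(",")]
    subjects = [p.strip() for p in prompts.split(",")]
    if len(view_names) != len(subjects):
        raise ValueError("expected exactly one prompt per view")
    return [
        (view, f"{style.strip()} {subject}")
        for view, subject in zip(view_names, subjects)
    ]


pairs = pair_views_with_prompts(
    style="an oil painting of ",
    views="identity, jigsaw",
    prompts="a rabbit, a coffee cup",
)
for view, prompt in pairs:
    print(f"{view}: {prompt}")
```

Under these assumptions, the untransformed image would be guided toward "an oil painting of a rabbit" and the jigsaw-rearranged image toward "an oil painting of a coffee cup".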

The Art of Choosing Prompts

Choosing prompts for illusions can be fairly tricky and unintuitive. Here are some tips:

Intuition and reasoning work less often than you would expect. Prompts that you think would work great often work poorly, and vice versa. So exploration is key.
Styles such as "a photo of" tend to be harder because realism is a fairly demanding constraint (but this doesn't mean they can't work!).
Conversely, styles such as "an oil painting of" seem to do better because there's more freedom in how the subject can be depicted and interpreted.
In a similar vein, subjects that allow for high degrees of flexibility in depiction tend to be good. For example, prompts such as "houseplants" or "wine and cheese" or "a kitchen".
But make sure the subject is still easily recognizable. Illusions are much better when they are instantly understandable.
Faces often make for very good "hidden" subjects. This is probably because the human visual system is particularly adept at picking out faces. For example, "an old man" or "marilyn monroe" tend to be good subjects.
Perhaps obvious, but 3-view and 4-view illusions are considerably more difficult to get to work.

chigozienri/cog-visual-anagrams

Folders and files

Latest commit

History

Repository files navigation

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

[Arxiv] [Website] [Colab]

Usage

The Art of Choosing Prompts

About

Resources

License

Stars

Watchers

Forks

Languages