This demo application ("demoDiffusion") showcases the acceleration of the Stable Diffusion pipeline using TensorRT.
git clone [email protected]:NVIDIA/TensorRT.git -b release/8.6 --single-branch
cd TensorRT
Install nvidia-docker using these instructions.
docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:23.02-py3 /bin/bash
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade tensorrt
Minimum required version is TensorRT 8.6.0. Check your installed version using:
python3 -c 'import tensorrt;print(tensorrt.__version__)'
NOTE: Alternatively, you can download and install TensorRT packages from NVIDIA TensorRT Developer Zone.
export TRT_OSSPATH=/workspace
cd $TRT_OSSPATH/demo/Diffusion
pip3 install -r requirements.txt
# Create output directories
mkdir -p onnx engine output
NOTE: demoDiffusion has been tested on systems with NVIDIA A100, RTX3090, and RTX4090 GPUs, and the following software configuration.
diffusers 0.14.0
onnx 1.13.1
onnx-graphsurgeon 0.3.26
onnxruntime 1.14.1
polygraphy 0.47.1
tensorrt 8.6.1.6
tokenizers 0.13.2
torch 1.13.0
transformers 4.26.1
NOTE: Optionally, install the HuggingFace accelerate package for faster and less memory-intensive model loading.
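For example, it can be installed with pip alongside the other requirements:
pip3 install accelerate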
python3 demo_txt2img.py --help
python3 demo_img2img.py --help
python3 demo_inpaint.py --help
To download the model checkpoints for the Stable Diffusion pipeline, you will need a read
access token. See instructions.
export HF_TOKEN=<your access token>
python3 demo_txt2img.py "a beautiful photograph of Mt. Fuji during cherry blossom" --hf-token=$HF_TOKEN -v
python3 demo_img2img.py "photorealistic new zealand hills" --hf-token=$HF_TOKEN -v
Use --input-image=<path to image> to specify your image. Otherwise the example image will be downloaded from the Internet.
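For example, to run img2img on a local image (the file name below is only a placeholder):
python3 demo_img2img.py "photorealistic new zealand hills" --hf-token=$HF_TOKEN --input-image=my_photo.png -v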
# Create separate onnx/engine directories when switching versions
mkdir -p onnx-1.5 engine-1.5
python3 demo_inpaint.py "a mecha robot sitting on a bench" --hf-token=$HF_TOKEN --version=1.5 --onnx-dir=onnx-1.5 --engine-dir=engine-1.5 -v
Use --input-image=<path to image> and --mask-image=<path to mask> to specify your inputs. They must have the same dimensions. Otherwise the example image and mask will be downloaded from the Internet.
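For example, to run inpainting with your own image and mask (the image and mask file names below are only placeholders):
python3 demo_inpaint.py "a mecha robot sitting on a bench" --hf-token=$HF_TOKEN --version=1.5 --onnx-dir=onnx-1.5 --engine-dir=engine-1.5 --input-image=bench.png --mask-image=bench_mask.png -v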
- One can set the scheduler using --scheduler=EulerA. Note that some schedulers are not available for some pipelines or versions.
- To accelerate engine build time, one can use --timing-cache=<path to cache file>. This cache file will be created if it does not exist. Note that using a cache file created on different hardware may affect performance, so it is suggested to use this flag only during development. To achieve the best performance during deployment, build engines without a timing cache.
- To switch between versions or pipelines, one needs either to clear the onnx and engine directories, to specify --force-onnx-export --force-onnx-optimize --force-engine-build, or to create new directories and specify --onnx-dir=<new onnx dir> --engine-dir=<new engine dir>.
- Inference performance can be improved by enabling CUDA graphs using --use-cuda-graph. Enabling CUDA graphs requires fixed input shapes, so this flag must be combined with --build-static-batch and cannot be combined with --build-dynamic-shape. See the example command after this list.
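As a sketch of how these flags fit together (the prompt is reused from the earlier example and timing.cache is a placeholder file name):
python3 demo_txt2img.py "a beautiful photograph of Mt. Fuji during cherry blossom" --hf-token=$HF_TOKEN --scheduler=EulerA --timing-cache=timing.cache --build-static-batch --use-cuda-graph -v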