Merge branch 'master' into IMP/ThumbThreads

midcoastal · Apr 24, 2024 · 4163065
2 parents f149050 + de26739

Showing 37 changed files with 241 additions and 182 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,38 +2,45 @@
 
 ## TODO
 
-- resize type: fixed, fill, etc.
 - reference styles
+- quick apply style
 
-## Update for 2024-03-14
+## Update for 2024-03-19
 
-### Highlights 2024-03-14
+### Highlights 2024-03-19
 
 New models:
 - [Stable Cascade](https://github.com/Stability-AI/StableCascade) *Full* and *Lite*
 - [Playground v2.5](https://huggingface.co/playgroundai/playground-v2.5-1024px-aesthetic)
 - [KOALA 700M](https://github.com/youngwanLEE/sdxl-koala)
 - [Stable Video Diffusion XT 1.1](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt-1-1)
 - [VGen](https://huggingface.co/ali-vilab/i2vgen-xl)  
+
 New pipelines and features:
-- Trajectory Consistency Distillation [TCD](https://mhh0318.github.io/tcd) for generate in even less steps
-- Image2image using [LEdit++](https://leditsplusplus-project.static.hf.space/index.html), context aware method with image analysis and positive/negative prompt handling
+- Img2img using [LEdit++](https://leditsplusplus-project.static.hf.space/index.html), context aware method with image analysis and positive/negative prompt handling
+- Trajectory Consistency Distillation [TCD](https://mhh0318.github.io/tcd) for processing in even less steps
 - Visual Query & Answer using [moondream2](https://github.com/vikhyat/moondream) as an addition to standard interrogate methods
-- Face-HiRes: simple detailer for face refinements
+- **Face-HiRes**: simple built-in detailer for face refinements
+- Even simpler outpaint: when resizing image, simply pick outpaint method and if image has different aspect ratio, blank areas will be outpainted!
 - UI aspect-ratio controls and other UI improvements
 - User controllable invisibile and visible watermarking
 - Native composable LoRA
-**Styles**: Not just for prompts! Can apply generate parameters as templates and can be used to apply wildcards to prompts
-**Reference models**: *Networks -> Models -> Reference*: All reference models now come with recommended settings that can be auto-applied if desired
-Additional Improvements such as: Smooth tiling, Refine/HiRes workflow improvements, Control workflow improvements, Additional API endpoints
+
+What else?
+
+- **Reference models**: *Networks -> Models -> Reference*: All reference models now come with recommended settings that can be auto-applied if desired  
+- **Styles**: Not just for prompts! Styles can apply *generate parameters* as templates and can be used to *apply wildcards* to prompts  
+improvements, Additional API endpoints  
+- Given the high interest in [ZLUDA](https://github.com/vosen/ZLUDA) engine introduced in last release we've updated much more flexible/automatic install procedure (see [wiki](https://github.com/vladmandic/automatic/wiki/ZLUDA) for details)  
+- Plus Additional Improvements such as: Smooth tiling, Refine/HiRes workflow improvements, Control workflow 
 
 Further details:  
 - For basic instructions, see [README](https://github.com/vladmandic/automatic/blob/master/README.md)  
 - For more details on all new features see full [CHANGELOG](https://github.com/vladmandic/automatic/blob/master/CHANGELOG.md)  
 - For documentation, see [WiKi](https://github.com/vladmandic/automatic/wiki)
 - [Discord](https://discord.com/invite/sd-next-federal-batch-inspectors-1101998836328697867) server  
 
-### Full Changelog 2024-03-14
+### Full Changelog 2024-03-19
 
 - [Stable Cascade](https://github.com/Stability-AI/StableCascade) *Full* and *Lite*
   - large multi-stage high-quality model from warp-ai/wuerstchen team and released by stabilityai  
@@ -55,17 +62,17 @@ Further details:
     - positive prompt: what to enhance, strength and threshold for auto-masking
     - negative prompt: what to remove, strength and threshold for auto-masking  
   - *note*: not compatible with model offloading
+- **Second Pass / Refine**
+  - independent upscale and hires options: run hires without upscale or upscale without hires or both
+  - upscale can now run 0.1-8.0 scale and will also run if enabled at 1.0 to allow for upscalers that simply improve image quality
+  - update ui section to reflect changes
+  - *note*: behavior using backend:original is unchanged for backwards compatibilty
 - **Visual Query** visual query & answer in process tab  
   - go to process -> visual query  
   - ask your questions, e.g. "describe the image", "what is behind the subject", "what are predominant colors of the image?"
   - primary model is [moondream2](https://github.com/vikhyat/moondream), a *tiny* 1.86B vision language model  
     *note*: its still 3.7GB in size, so not really tiny  
   - additional support for multiple variations of several base models: *GIT, BLIP, ViLT, PIX*, sizes range from 0.3 to 1.7GB  
-- **Second Pass / Refine**
-  - independent upscale and hires options: run hires without upscale or upscale without hires or both
-  - upscale can now run 0.1-8.0 scale and will also run if enabled at 1.0 to allow for upscalers that simply improve image quality
-  - update ui section to reflect changes
-  - *note*: behavior using backend:original is unchanged for backwards compatibilty
 - **Video**
   - **Image2Video**
     - new module for creating videos from images  
@@ -83,13 +90,7 @@ Further details:
   - *note*: this is a very experimental feature and may not work as expected
 - **Control**
   - added *refiner/hires* workflows
-- **Samplers**
-  - [TCD](https://mhh0318.github.io/tcd/): Trajectory Consistency Distillation  
-    new sampler that produces consistent results in a very low number of steps (comparable to LCM but without reliance on LoRA)  
-    for best results, use with TCD LoRA: <https://huggingface.co/h1t/TCD-SDXL-LoRA>
-  - *DPM++ 2M EDM* and *Euler EDM*  
-    EDM is a new solver algorithm currently available for DPM++2M and Euler samplers  
-    Note that using EDM samplers with non-EDM optimized models will provide just noise and vice-versa  
+  - added resize methods to before/after/mask: fixed, crop, fill
 - **Styles**: styles are not just for prompts!
   - new styles editor: *networks -> styles -> edit*
   - styles can apply generate parameters, for example to have a style that enables and configures hires:  
@@ -124,6 +125,13 @@ Further details:
   - reference models will print recommended settings to log if present
   - new setting in extra network: *use reference values when available*  
     disabled by default, if enabled will force use of reference settings for models that have them
+- **Samplers**
+  - [TCD](https://mhh0318.github.io/tcd/): Trajectory Consistency Distillation  
+    new sampler that produces consistent results in a very low number of steps (comparable to LCM but without reliance on LoRA)  
+    for best results, use with TCD LoRA: <https://huggingface.co/h1t/TCD-SDXL-LoRA>
+  - *DPM++ 2M EDM* and *Euler EDM*  
+    EDM is a new solver algorithm currently available for DPM++2M and Euler samplers  
+    Note that using EDM samplers with non-EDM optimized models will provide just noise and vice-versa  
 - **Improvements**
   - **FaceID** extend support for LoRA, HyperTile and FreeU, thanks @Trojaner
   - **Tiling** now extends to both Unet and VAE producing smoother outputs, thanks @AI-Casanova
@@ -132,6 +140,8 @@ Further details:
   - default theme updates and additional built-in theme *black-gray*
   - support models with their own YAML model config files
   - support models with their own JSON per-component config files, for example: `playground-v2.5_vae.config`
+  - prompt can have comments enclosed with `/*` and `*/`  
+    comments are extracted from prompt and added to image metadata  
 - **ROCm**  
   - add **ROCm** 6.0 nightly option to installer, thanks @jicka
   - add *flash attention* support for rdna3, thanks @Disty0  
@@ -167,6 +177,13 @@ Further details:
   - fix *requires_aesthetics_score* errors
   - fix t2i-canny
   - fix *differenital diffusion* for manual mask, thanks @23pennies
+  - fix ipadapter apply/unapply on batch runs
+  - fix control with multiple units and override images
+  - fix control with hires
+  - fix control-lllite
+  - fix font fallback, thanks @NetroScript
+  - update civitai downloader to handler new metadata
+  - improve control error handling
   - use default model variant if specified variant doesnt exist
   - use diffusers lora load override for *lcm/tcd/turbo loras*
   - exception handler around vram memory stats gather
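
The changelog entry above on prompt comments (text enclosed in `/*` and `*/` is extracted from the prompt and added to image metadata) could be sketched roughly as follows. This is a minimal illustration, not SD.Next's actual implementation: the helper name `extract_comments`, the regular expression, and the whitespace cleanup are all assumptions made for the example.

```python
import re
from typing import List, Tuple

# Hypothetical helper, not taken from the SD.Next codebase.
# Matches /* ... */ non-greedily, including comments that span lines.
COMMENT_RE = re.compile(r'/\*(.*?)\*/', re.DOTALL)


def extract_comments(prompt: str) -> Tuple[str, List[str]]:
    """Return (prompt with comments removed, list of comment texts)."""
    comments = [c.strip() for c in COMMENT_RE.findall(prompt)]
    cleaned = COMMENT_RE.sub(' ', prompt)
    # Collapse whitespace left behind by removed comments (assumed cleanup step).
    cleaned = re.sub(r'\s+', ' ', cleaned).strip()
    return cleaned, comments


if __name__ == '__main__':
    text, notes = extract_comments('a photo of a cat /* try dog next */ in the snow')
    print(text)
    print(notes)
```

The cleaned prompt would go to the pipeline and the collected comments could be stored alongside the other generation parameters; which metadata field SD.Next actually uses is not stated in the changelog.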

diff --git a/README.md b/README.md
@@ -20,13 +20,14 @@ All individual features are not listed here, instead check [ChangeLog](CHANGELOG
 - Multiple backends!  
   ▹ **Diffusers | Original**  
 - Multiple diffusion models!  
-  ▹ **Stable Diffusion 1.5/2.1 | SD-XL | LCM | Segmind | Kandinsky | Pixart-α | Würstchen | aMUSEd | DeepFloyd IF | UniDiffusion | SD-Distilled | BLiP Diffusion | etc.**
+  ▹ **Stable Diffusion 1.5/2.1 | SD-XL | LCM | Segmind | Kandinsky | Pixart-α | Stable Cascade | Würstchen | aMUSEd | DeepFloyd IF | UniDiffusion | SD-Distilled | BLiP Diffusion | KOALA | etc.**
 - Built-in Control for Text, Image, Batch and video processing!  
   ▹ **ControlNet | ControlNet XS | Control LLLite | T2I Adapters | IP Adapters**  
 - Multiplatform!  
- ▹ **Windows | Linux | MacOS with CPU | nVidia | AMD | IntelArc | DirectML | OpenVINO | ONNX+Olive**
+ ▹ **Windows | Linux | MacOS with CPU | nVidia | AMD | IntelArc | DirectML | OpenVINO | ONNX+Olive | ZLUDA**
 - Platform specific autodetection and tuning performed on install
-- Optimized processing with latest `torch` developments with built-in support for `torch.compile` and multiple compile backends
+- Optimized processing with latest `torch` developments with built-in support for `torch.compile`  
+  and multiple compile backends: *Triton, ZLUDA, StableFast, DeepCache, OpenVINO, NNCF, IPEX*  
 - Improved prompt parser  
 - Enhanced *Lora*/*LoCon*/*Lyco* code supporting latest trends in training  
 - Built-in queue management  
@@ -62,21 +63,24 @@ Additional models will be added as they become available and there is public int
 
 - [RunwayML Stable Diffusion](https://github.com/Stability-AI/stablediffusion/) 1.x and 2.x *(all variants)*  
 - [StabilityAI Stable Diffusion XL](https://github.com/Stability-AI/generative-models)  
-- [StabilityAI Stable Video Diffusion](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid) Base and XT  
+- [StabilityAI Stable Video Diffusion](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid) Base, XT 1.0, XT 1.1
 - [LCM: Latent Consistency Models](https://github.com/openai/consistency_models)  
+- [Playground](https://huggingface.co/playgroundai/playground-v2-256px-base) *v1, v2 256, v2 512, v2 1024 and latest v2.5*  
+- [Stable Cascade](https://github.com/Stability-AI/StableCascade) *Full* and *Lite*
 - [aMUSEd 256](https://huggingface.co/amused/amused-256) 256 and 512
 - [Segmind Vega](https://huggingface.co/segmind/Segmind-Vega)  
 - [Segmind SSD-1B](https://huggingface.co/segmind/SSD-1B)  
 - [Segmind SegMoE](https://github.com/segmind/segmoe) *SD and SD-XL*  
 - [Kandinsky](https://github.com/ai-forever/Kandinsky-2) *2.1 and 2.2 and latest 3.0*  
 - [PixArt-α XL 2](https://github.com/PixArt-alpha/PixArt-alpha) *Medium and Large*  
 - [Warp Wuerstchen](https://huggingface.co/blog/wuertschen)  
-- [Playground](https://huggingface.co/playgroundai/playground-v2-256px-base) *v1, v2 256, v2 512, v2 1024*  
 - [Tsinghua UniDiffusion](https://github.com/thu-ml/unidiffuser)
 - [DeepFloyd IF](https://github.com/deep-floyd/IF) *Medium and Large*
 - [ModelScope T2V](https://huggingface.co/damo-vilab/text-to-video-ms-1.7b)
 - [Segmind SD Distilled](https://huggingface.co/blog/sd_distillation) *(all variants)*
 - [BLIP-Diffusion](https://dxli94.github.io/BLIP-Diffusion-website/)  
+- [KOALA 700M](https://github.com/youngwanLEE/sdxl-koala)
+- [VGen](https://huggingface.co/ali-vilab/i2vgen-xl)  
 
 
 Also supported are modifiers such as:
@@ -265,7 +269,7 @@ check [ChangeLog](CHANGELOG.md) for when feature was first introduced as it will
 ### **Sponsors**
 
 <div align="center">
-<!-- sponsors --><a href="https://github.com/allangrant"><img src="https://github.com/allangrant.png" width="60px" alt="Allan Grant" /></a><a href="https://github.com/BrentOzar"><img src="https://github.com/BrentOzar.png" width="60px" alt="Brent Ozar" /></a><a href="https://github.com/inktomi"><img src="https://github.com/inktomi.png" width="60px" alt="Matthew Runo" /></a><a href="https://github.com/HELLO-WORLD-SAS"><img src="https://github.com/HELLO-WORLD-SAS.png" width="60px" alt="HELLO WORLD SAS" /></a><a href="https://github.com/4joeknight4"><img src="https://github.com/4joeknight4.png" width="60px" alt="" /></a><a href="https://github.com/SaladTechnologies"><img src="https://github.com/SaladTechnologies.png" width="60px" alt="Salad Technologies" /></a><a href="https://github.com/mantzaris"><img src="https://github.com/mantzaris.png" width="60px" alt="a.v.mantzaris" /></a><a href="https://github.com/JohnnyStreet"><img src="https://github.com/JohnnyStreet.png" width="60px" alt="Johnny Street" /></a><!-- sponsors -->
+<!-- sponsors --><a href="https://github.com/allangrant"><img src="https://github.com/allangrant.png" width="60px" alt="Allan Grant" /></a><a href="https://github.com/BrentOzar"><img src="https://github.com/BrentOzar.png" width="60px" alt="Brent Ozar" /></a><a href="https://github.com/inktomi"><img src="https://github.com/inktomi.png" width="60px" alt="Matthew Runo" /></a><a href="https://github.com/4joeknight4"><img src="https://github.com/4joeknight4.png" width="60px" alt="" /></a><a href="https://github.com/SaladTechnologies"><img src="https://github.com/SaladTechnologies.png" width="60px" alt="Salad Technologies" /></a><a href="https://github.com/mantzaris"><img src="https://github.com/mantzaris.png" width="60px" alt="a.v.mantzaris" /></a><a href="https://github.com/CurseWave"><img src="https://github.com/CurseWave.png" width="60px" alt="" /></a><!-- sponsors -->
 </div>
 
 <br>
diff --git a/cli/clone.py b/cli/clone.py
@@ -1,3 +1,4 @@
+#!/usr/bin/env python
 import os
 import logging
 import git

diff --git a/extensions-builtin/sd-webui-controlnet b/extensions-builtin/sd-webui-controlnet
diff --git a/installer.py b/installer.py
@@ -927,6 +927,7 @@ def get_version():
                 'app': 'sd.next',
                 'updated': updated,
                 'hash': githash,
+                'branch': branch_name.replace('\n', ''),
                 'url': origin.replace('\n', '') + '/tree/' + branch_name.replace('\n', '')
             }
         except Exception:
@@ -950,7 +951,10 @@ def patch_zluda():
     if zluda_path is None:
         log.warning('Failed to automatically patch torch with ZLUDA. Could not find ZLUDA from PATH.')
         return
-    venv_dir = os.environ.get('VENV_DIR', os.path.dirname(shutil.which('python')))
+    python_dir = os.path.dirname(shutil.which('python'))
+    if shutil.which('conda') is None:
+        python_dir = os.path.dirname(python_dir)
+    venv_dir = os.environ.get('VENV_DIR', python_dir)
     dlls_to_patch = {
         'cublas.dll': 'cublas64_11.dll',
         #'cudnn.dll': 'cudnn64_8.dll',
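
The `patch_zluda` hunk above changes how the default venv directory is derived from the location of the `python` executable. In a standard venv the interpreter sits one level below the environment root (`Scripts` on Windows, `bin` elsewhere), so the new code steps up one more directory unless `conda` is found on `PATH`, presumably because a conda environment on Windows keeps `python.exe` at the environment root. The same fallback order can be sketched as a pure function; `resolve_venv_dir` and its parameters are hypothetical names introduced here for illustration, whereas the real code calls `shutil.which` and `os.environ` directly.

```python
import os
from typing import Mapping, Optional


def resolve_venv_dir(python_path: str, conda_path: Optional[str], env: Mapping[str, str]) -> str:
    """Mirror the fallback order in the diff: VENV_DIR wins, else derive from python."""
    python_dir = os.path.dirname(python_path)
    if conda_path is None:
        # No conda on PATH: assume a venv layout and step up from bin/ or Scripts/.
        python_dir = os.path.dirname(python_dir)
    return env.get('VENV_DIR', python_dir)


if __name__ == '__main__':
    print(resolve_venv_dir('/opt/sdnext/venv/bin/python', None, {}))
    print(resolve_venv_dir('/opt/conda/envs/sd/python', '/opt/conda/bin/conda', {}))
```

Passing the paths in as arguments keeps the logic testable without touching the real environment. One thing the sketch makes visible: if `shutil.which('python')` returned `None`, `os.path.dirname` would raise a `TypeError`, so a caller may want to guard against that case.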

diff --git a/javascript/amethyst-nightfall.css b/javascript/amethyst-nightfall.css
@@ -1,6 +1,6 @@
 /* generic html tags */
 :root {
-  --font: "Source Sans Pro", 'ui-sans-serif', 'system-ui', sans-serif;
+  --font: "Source Sans Pro", 'ui-sans-serif', 'system-ui', sans-serif, 'NotoSans';
   --font-size: 16px;
   --highlight-color: #8a3df6; /* Purple color */
   --inactive-color: #404040; /* Darker shade of gray */

diff --git a/javascript/base.css b/javascript/base.css
@@ -1,3 +1,5 @@
+@font-face { font-family: 'NotoSans'; font-display: swap; font-style: normal; font-weight: 100; src: local('NotoSans'), url('notosans-nerdfont-regular.ttf') }
+
 /* toolbutton */
 .gradio-button.tool { max-width: min-content; min-width: min-content !important; align-self: end; font-size: 1.4em; color: var(--body-text-color) !important; }
 
@@ -26,7 +28,7 @@
 
 /* fullpage image viewer */
 #lightboxModal{ display: none; position: fixed; z-index: 1001; left: 0; top: 0; width: 100%; height: 100%; overflow: auto; background-color: rgba(20, 20, 20, 0.75); backdrop-filter: blur(6px);
-  user-select: none; -webkit-user-select: none; flex-direction: row; }
+  user-select: none; -webkit-user-select: none; flex-direction: row; font-family: 'NotoSans'; }
 .modalControls { display: flex; justify-content: space-evenly; background-color: transparent; position: absolute; width: 99%; z-index: 1; }
 .modalControls:hover { background-color: #50505050; }
 .modalControls span { color: white; font-size: 2em; font-weight: bold; cursor: pointer; filter: grayscale(100%); }

diff --git a/javascript/black-teal.css b/javascript/black-teal.css
@@ -144,6 +144,7 @@ textarea[rows="1"] { height: 33px !important; width: 99% !important; padding: 8p
 #txt2img_settings { min-width: var(--left-column); max-width: var(--left-column); background-color: var(--neutral-950); padding-top: 16px; }
 #pnginfo_html2_info { margin-top: -18px; background-color: var(--input-background-fill); padding: var(--input-padding) }
 #txt2img_styles_row, #img2img_styles_row, #control_styles_row { margin-top: -6px; }
+.block > span { margin-bottom: 0 !important; margin-top: var(--spacing-lg); }
 
 /* based on gradio built-in dark theme */
 :root, .light, .dark {
@@ -187,7 +188,7 @@ textarea[rows="1"] { height: 33px !important; width: 99% !important; padding: 8p
   --checkbox-label-border-color: var(--border-color-primary);
   --checkbox-label-border-color-hover: var(--checkbox-label-border-color);
   --checkbox-label-border-width: var(--input-border-width);
-  --checkbox-label-text-color: var(--body-text-color);
+  --checkbox-label-text-color: var(--block-title-text-color);
   --checkbox-label-text-color-selected: var(--checkbox-label-text-color);
   --error-background-fill: var(--background-fill-primary);
   --error-border-color: var(--border-color-primary);

diff --git a/javascript/emerald-paradise.css b/javascript/emerald-paradise.css
@@ -1,6 +1,6 @@
 /* generic html tags */
 :root, .light, .dark {
-  --font: 'system-ui', 'ui-sans-serif', 'system-ui', "Roboto", sans-serif;
+  --font: 'system-ui', 'ui-sans-serif', 'system-ui', "Roboto", sans-serif, 'NotoSans';
   --font-mono: 'ui-monospace', 'Consolas', monospace;
   --font-size: 16px;
   --primary-100: #1e2223; /* bg color*/

diff --git a/javascript/imageViewer.js b/javascript/imageViewer.js
@@ -59,7 +59,7 @@ async function displayExif(el) {
   modalExif.innerHTML = '';
   const exif = await window.exifr.parse(el);
   if (!exif) return;
-  log('exif', exif);
+  // log('exif', exif);
   try {
     let html = `
       <b>Image</b> <a href="${el.src}" target="_blank">${el.src}</a> <b>Size</b> ${el.naturalWidth}x${el.naturalHeight}<br>
@@ -69,7 +69,7 @@ async function displayExif(el) {
     html = html.replace('Negative prompt:', '<br><b>Negative</b>');
     html = html.replace('Steps:', '<br><b>Params</b> Steps:');
     modalExif.innerHTML = html;
-  } catch(e) { }
+  } catch (e) { }
 }
 
 function showModal(event) {
@@ -80,7 +80,7 @@ function showModal(event) {
   modalImage.onload = () => {
     previewInstance.moveTo(0, 0);
     modalPreviewZone.focus();
-    displayExif(modalImage);''
+    displayExif(modalImage);
   };
   modalImage.src = source.src;
   if (modalImage.style.display === 'none') lb.style.setProperty('background-image', `url(${source.src})`);
@@ -225,7 +225,7 @@ async function initImageViewer() {
   // exif
   const modalExif = document.createElement('div');
   modalExif.id = 'modalExif';
-  modalExif.style = 'position: absolute; bottom: 0px; width: 98%; background-color: rgba(0, 0, 0, 0.5); color: var(--neutral-300); padding: 1em; font-size: small;'
+  modalExif.style = 'position: absolute; bottom: 0px; width: 98%; background-color: rgba(0, 0, 0, 0.5); color: var(--neutral-300); padding: 1em; font-size: small;';
 
   // handlers
   modalPreviewZone.addEventListener('mousedown', () => { previewDrag = false; });

diff --git a/javascript/invoked.css b/javascript/invoked.css
@@ -1,6 +1,6 @@
 /* generic html tags */
 :root, .light, .dark {
-  --font: 'system-ui', 'ui-sans-serif', 'system-ui', "Roboto", sans-serif;
+  --font: 'system-ui', 'ui-sans-serif', 'system-ui', "Roboto", sans-serif, 'NotoSans';
   --font-mono: 'ui-monospace', 'Consolas', monospace;
   --font-size: 16px;
   --primary-100: #2b303b;
+2 −2		annotator/clipvision/__init__.py
+3 −1		annotator/lama/saicinpainting/training/losses/segmentation.py
+3 −1		annotator/zoe/zoedepth/models/base_models/midas_repo/run.py