(CapPic)
An image viewer and AI-assisted editing tool that helps with curating datasets for generative AI models, finetunes and LoRA.
- Image Viewer: Display and navigate images
  - Quick-starting desktop application built with Qt
  - Modular interface that lets you place windows on different monitors
  - Open multiple tabs
  - Zoom/pan and fullscreen mode
  - Gallery with thumbnails
  - Compare two images
  - Measure size and pixel distances
  - Slideshow
- Image/Mask Editor: Prepare images for training
  - Crop and save parts of images
  - Scale images, optionally using AI upscale models
  - Manually edit masks with multiple layers
  - Support for pressure-sensitive drawing pens
  - Record masking operations into macros
  - Automated masking
- Captioning: Describe images with text
  - Edit captions manually with drag-and-drop support
  - Tag sorting and filtering rules
  - Colored text highlighting
  - Automated captioning
  - Prompt presets
  - Iterative prompting with each answer saved to different entries in a .json file
  - Further refinement with LLMs
- Batch Processing: Process whole folders at once
  - Flexible batch captioning, tagging and transformation
  - Batch scaling of images
  - Batch masking with user-defined macros
  - Batch cropping of images using your macros
- AI Assistance:
  - Support for state-of-the-art captioning and masking models
  - Model and sampling settings, GPU acceleration with CPU offload support
  - On-the-fly NF4 and INT8 quantization
  - Separate inference subprocess isolates potential crashes and allows complete VRAM cleanup
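The last point, running inference in a separate subprocess, can be illustrated with a minimal sketch. This is not the project's actual implementation: the `InferenceProcess` class, the line-based JSON protocol, and the placeholder "caption" logic are all assumptions made for illustration. The idea it shows is general: when the child process exits, the operating system reclaims everything it held (including GPU memory a real model would have allocated), and a crash in the child surfaces in the parent as an error rather than taking the GUI down.

```python
import json
import subprocess
import sys

# Child program (hypothetical): reads one JSON request per line and
# writes one JSON reply per line. A real child would load a model here.
CHILD_CODE = r"""
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"caption": "caption for " + request["image"]}  # placeholder
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


class InferenceProcess:
    """Owns a child interpreter; stopping it releases all of its resources."""

    def __init__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def infer(self, image_path):
        try:
            self._proc.stdin.write(json.dumps({"image": image_path}) + "\n")
            self._proc.stdin.flush()
        except BrokenPipeError:
            raise RuntimeError("inference process is not running") from None
        line = self._proc.stdout.readline()
        if not line:
            # An empty read means the child closed its output, e.g. it crashed.
            raise RuntimeError("inference process exited unexpectedly")
        return json.loads(line)["caption"]

    def stop(self):
        # Closing stdin ends the child's read loop, so it exits on its own.
        self._proc.stdin.close()
        exit_code = self._proc.wait(timeout=10)
        self._proc.stdout.close()
        return exit_code
```

A caller would create one `InferenceProcess`, call `infer()` per image, and call `stop()` when switching models or closing the application.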
- Tagging
- Captioning
  - Florence-2
  - InternVL2
  - MiniCPM-V-2.6 (GGUF)
  - Molmo (recommended)
  - Ovis-1.6
  - Qwen2-VL
- LLM
  - Models in GGUF format with embedded chat template (llama-cpp backend).
- Upscaling
  - Model architectures supported by the spandrel backend.
  - Find more models at openmodeldb.info.
- Masking
  - Box Detection
    - YOLO/Adetailer detection models
    - Florence-2
  - Segmentation / Background Removal
    - InSPyReNet (Plus_Ultra)
    - RMBG-2.0
    - Florence-2
Requires Python.
By default, prebuilt packages for CUDA 12.4 are installed. If you need a different CUDA version, change the index URL in requirements-pytorch.txt and requirements-llamacpp.txt before running the setup script.
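Changing the index URL by hand is usually enough, but the edit can also be scripted. The sketch below assumes that each requirements file contains a pip option line beginning with `--index-url` or `--extra-index-url` whose URL ends in a CUDA tag such as `cu124` (the form used by the PyTorch wheel index); the actual layout of the files in this repository may differ. The function name `switch_cuda_tag` is hypothetical.

```python
import re

# Matches a CUDA tag such as "cu124" as the last path segment of a URL.
_CUDA_TAG = re.compile(r"(?<=/)cu\d+(?=/?\s*$)")


def switch_cuda_tag(requirements_text, new_tag):
    """Return the text with the CUDA tag replaced on index-URL lines only.

    Assumption: index lines start with --index-url or --extra-index-url.
    Package lines (e.g. pinned versions) are left untouched.
    """
    out = []
    for line in requirements_text.splitlines(keepends=True):
        if line.lstrip().startswith(("--index-url", "--extra-index-url")):
            line = _CUDA_TAG.sub(new_tag, line)
        out.append(line)
    return "".join(out)
```

For example, passing the contents of a requirements file and `"cu121"` would rewrite only the index line; check that wheels for the chosen CUDA version actually exist on both indexes before running the setup script.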
- Git clone or download this repository.
- Run setup.sh on Linux or setup.bat on Windows.
  - This will create a virtual environment that needs 7-9 GB.
If the setup scripts didn't work for you but you got it running manually, please raise an issue and share your solution.
- Linux: run.sh
- Windows: run.bat or run-console.bat
You can open files or folders directly in qapyq by associating the file types with the respective run script in your OS.
For shortcuts, icons are available in the qapyq/res folder.
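A file association ultimately makes the OS call the run script with the selected path. As a rough sketch of that call, the helper below builds the command line for the current platform. It assumes the run scripts accept the file or folder as their first argument, which the association mechanism implies but which is not spelled out here; `launch_command` and its parameters are hypothetical names, and treating every non-Windows system like Linux is also an assumption.

```python
import platform
from pathlib import Path


def launch_command(install_dir, target=None, system=None):
    """Build the command that starts the application, optionally with a path.

    install_dir: folder containing run.sh / run.bat
    target:      image file or folder to open (assumed to be passed as argv[1])
    system:      override for platform.system(), mainly for testing
    """
    system = system or platform.system()
    # Only Linux and Windows scripts are documented; others fall back to run.sh.
    script = "run.bat" if system == "Windows" else "run.sh"
    command = [str(Path(install_dir) / script)]
    if target is not None:
        command.append(str(target))
    return command
```

The resulting list can be handed to `subprocess.run()` or used as a template for the command field of a desktop entry or shortcut.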
If git was used to clone the repository, simply use git pull to update.
If the repository was downloaded as a zip archive, download it again and replace the installed files.
New dependencies may be added. If the program fails to start or crashes, run the setup script again to install the missing packages.
More information is available in the Wiki.
How to set up AI models for automatic captioning and masking: Model Setup
How to use: User Guide
- Natural sorting of files
- Gallery list view with captions
- Summary and stats of captions and tags
- Shortcuts and improved ease-of-use
- AI-assisted mask editing
- Auto-caption after crop
- Overlays (difference image) for comparison tool
- Image resizing
- Run inference on remote machines
- Adapt new captioning and masking models
- Possibly a plugin system for new tools
- Integration with ComfyUI
- Docs, Screenshots, Video Guides
- Selection of second image for comparison in Gallery Window might be wrong (unfinished GUI).
- Icons in Gallery can be inconsistent.
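
Natural sorting of files, one of the items listed above, means ordering names so that embedded numbers compare by value (img2 before img10) instead of character by character. A common way to get this in Python is a sort key that splits each name into text and number parts. The sketch below shows that general technique; it is not taken from this project, and `natural_key` is a hypothetical name.

```python
import re


def natural_key(name):
    """Sort key: digit runs compare as integers, other text case-insensitively.

    re.split with a capturing group yields alternating text and digit parts,
    always starting with a (possibly empty) text part, so keys of different
    names compare str with str and int with int.
    """
    return [
        int(part) if part.isdecimal() else part.casefold()
        for part in re.split(r"(\d+)", name)
    ]


files = ["img10.png", "img2.png", "IMG1.png"]
ordered = sorted(files, key=natural_key)
```

With a plain `sorted(files)`, "img10.png" would come before "img2.png"; with `natural_key` the numeric order is preserved.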