Qapyq

An image viewer and AI-assisted editing/captioning/masking tool that helps with curating datasets for generative AI models, finetunes and LoRA.

<img src="res/qapyq.png" align="left" />

qapyq

<sup>(CapPic)</sup><br /> An image viewer and AI-assisted editing tool that helps with curating datasets for generative AI models, finetunes and LoRA.

<br clear="left"/> <br /><br />

Screenshot of qapyq with its 5 windows open.

<a href="https://camo.githubusercontent.com/059f5cef1671955473d5d3e096263cf85910a2d52094c389b2924cca1b1a33c5/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f647261672d6e2d64726f702e676966"><img alt="Edit captions quickly with drag-and-drop support" src="https://camo.githubusercontent.com/059f5cef1671955473d5d3e096263cf85910a2d52094c389b2924cca1b1a33c5/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f647261672d6e2d64726f702e676966" width="30%"></img></a> <a href="https://camo.githubusercontent.com/71df5556ba81a944f3a28ed3760644b6f7c0c455b4ed639a80418b51d0cae704/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f7461675f6d75742d6578636c75736976652e676966"><img alt="Select one-of-many" src="https://camo.githubusercontent.com/71df5556ba81a944f3a28ed3760644b6f7c0c455b4ed639a80418b51d0cae704/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f7461675f6d75742d6578636c75736976652e676966" width="30%"></img></a> <a href="https://camo.githubusercontent.com/9403e354708969d4c5f1262583294913bd238e8b38df640e2a7fc36a313bf686/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f72756c65732e676966"><img alt="Apply sorting and filtering rules" src="https://camo.githubusercontent.com/9403e354708969d4c5f1262583294913bd238e8b38df640e2a7fc36a313bf686/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f72756c65732e676966" width="30%"></img></a>

<a href="https://camo.githubusercontent.com/74122b177a2f5a1cd4add5d749b90a49ac2f0cec631363ef861199a7c90566d7/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f63726f702e676966"><img alt="Quick cropping" src="https://camo.githubusercontent.com/74122b177a2f5a1cd4add5d749b90a49ac2f0cec631363ef861199a7c90566d7/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f63726f702e676966" width="30%"></img></a> <a href="https://camo.githubusercontent.com/d15df56575d4d69fe2cc04c5ed822e6cc95c0208185df3464a21fc351c4b04fb/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f636f6d706172652e676966"><img alt="Image comparison" src="https://camo.githubusercontent.com/d15df56575d4d69fe2cc04c5ed822e6cc95c0208185df3464a21fc351c4b04fb/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f636f6d706172652e676966" width="30%"></img></a> <a href="https://camo.githubusercontent.com/1583a08a56e63f4d6dae0c59f6572310558bb7c9f8e7b79e7e4e53af6e2663ee/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f6d61736b2d322e676966"><img alt="Draw masks manually or apply automatic detection and segmentation" src="https://camo.githubusercontent.com/1583a08a56e63f4d6dae0c59f6572310558bb7c9f8e7b79e7e4e53af6e2663ee/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f6d61736b2d322e676966" width="30%"></img></a>

<a href="https://camo.githubusercontent.com/b6cf81d56d9d4e9e2bbc8fb031e03e9380bc0a5c5e47b4885edd8ff0cc043b6b/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f636f6e642d666f6f74776561722d686169722d666c6f6f722e676966"><img alt="Transform tags using conditional rules" src="https://camo.githubusercontent.com/b6cf81d56d9d4e9e2bbc8fb031e03e9380bc0a5c5e47b4885edd8ff0cc043b6b/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f636f6e642d666f6f74776561722d686169722d666c6f6f722e676966" width="30%"></img></a> <a href="https://camo.githubusercontent.com/b094b255ba1d18d83253dba4f7f813ac6d64ea6cedac7330437eb0791479f062/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f6d756c7469656469742d666f6375732d636f6d707265737365642e676966"><img alt="Multi-Edit and Focus Mode" src="https://camo.githubusercontent.com/b094b255ba1d18d83253dba4f7f813ac6d64ea6cedac7330437eb0791479f062/68747470733a2f2f7777772e616c6368656d697374732e63682f71617079712f6d756c7469656469742d666f6375732d636f6d707265737365642e676966" width="60%"></img></a>

Features

  • Image Viewer: Display and navigate images

    • Quick-starting desktop application built with Qt
    • Runs smoothly with a million images
    • Modular interface that lets you place windows on different monitors
    • Open multiple tabs
    • Zoom/pan and fullscreen mode
    • Gallery with thumbnails and optional captions
    • Semantic image sorting with text prompts
    • Compare two images
    • Measure size, area and pixel distances
    • Slideshow
  • Image/Mask Editor: Prepare images for training

    • Crop and save parts of images
    • Scale images, optionally using AI upscale models
    • Dynamic save paths with template variables
    • Manually edit masks with multiple layers
    • Generate masks with AI models
    • Record masking operations into macros
    • VAE-encode images and check their latent representation
  • Captioning: Describe images with text

    • Edit captions manually with drag-and-drop support
    • Save multiple captions in a JSON file per image
    • Multi-Edit Mode: Edit captions of multiple images simultaneously
    • Focus Mode: Add the same tags to many files quickly
    • Tag grouping, merging, sorting, filtering and replacement rules
    • Colored text highlighting
    • Autocomplete with tags from your groups and CSV files
    • CLIP Token Counter
    • Automated captioning with support for grounding
    • Dynamic prompts with templates and text transformations
    • Multi-turn conversations with VLMs
    • Further refinement with LLMs
  • Stats/Filters: Summarize your data and get an overview

    • List all tags, image resolutions, masked regions, or size of concept folders
    • Filter images and create subsets
    • Combine and chain filters
    • Export the summaries as CSV
  • Batch Processing: Process whole folders at once

    • Flexible batch captioning, tagging and transformation
    • Batch scaling of images
    • Batch masking with user-defined macros
    • Batch cropping of images using your macros
    • Copy, move and rename files, create symlinks, ZIP captions for backups
  • AI Assistance:

    • Support for state-of-the-art captioning and masking models
    • Model and sampling settings, GPU acceleration with CPU offload support
    • On-the-fly NF4 and INT8 quantization
    • Run inference locally and/or on multiple remote machines over SSH
    • Separate inference subprocess isolates potential crashes and allows complete VRAM cleanup
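
The last point in the list, running inference in a separate subprocess, can be illustrated with a small sketch. This is not qapyq's actual implementation: the worker script, the `run_isolated` helper and the "crash" task are hypothetical stand-ins, and the "inference" is just an uppercase transformation. It only shows the general idea that a crash in the child leaves the parent running, and that the operating system reclaims the child's memory when it exits.

```python
import subprocess
import sys
from typing import Tuple

# Hypothetical worker: stands in for a process that would load a model
# and run inference. Here it only uppercases its input.
WORKER_CODE = """
import os, sys
task = sys.stdin.read().strip()
if task == "crash":
    os._exit(42)  # simulate a hard crash, e.g. a fault in a native library
print(task.upper())
"""


def run_isolated(task: str, timeout: float = 30.0) -> Tuple[bool, str]:
    """Run one task in a fresh child process and return (ok, output)."""
    proc = subprocess.run(
        [sys.executable, "-c", WORKER_CODE],
        input=task,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # Once the child has exited, the OS has reclaimed all of its memory.
    # For a GPU workload, that includes the allocations owned by the process.
    if proc.returncode != 0:
        return False, f"worker exited with code {proc.returncode}"
    return True, proc.stdout.strip()


if __name__ == "__main__":
    print(run_isolated("a photo of a cat"))
    print(run_isolated("crash"))  # the parent keeps running after the crash
```

A real tool would more likely keep a long-lived worker and exchange messages with it rather than start a new process per task. The one-shot form is used here only to keep the sketch short.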

Supported Models

These are the supported architectures with links to the original models.<br> Find more specialized finetuned models on huggingface.co.
