Clearshot
📸 Structured screenshot intelligence for AI coding tools. Give your AI X-ray vision for UI screenshots.
Install / Use
/learn @udayanwalvekar/ClearshotREADME
clearshot
structured screenshot intelligence for AI coding tools.
install — 30 seconds
open claude code and paste this. claude does the rest.
git clone https://github.com/udayanwalvekar/clearshot.git ~/.claude/skills/clearshot && cd ~/.claude/skills/clearshot && ./setup
add to your project so teammates get it (optional)
cp -Rf ~/.claude/skills/clearshot .claude/skills/clearshot && rm -rf .claude/skills/clearshot/.git && cd .claude/skills/clearshot && ./setup
real files get committed to your repo (not a submodule), so git clone just works for your teammates. everything lives inside .claude/.
alternative
also available via npx clearshot.
installed before v1.4.0? reinstall
there was a small bug that was causing clearshot to not get triggered consistently. paste this in your terminal to fix it:
rm -rf ~/.claude/skills/clearshot ~/.clearshot && git clone https://github.com/udayanwalvekar/clearshot.git ~/.claude/skills/clearshot && cd ~/.claude/skills/clearshot && ./setup
troubleshooting
if clearshot stopped working or the skill won't load, run the same command above.
activates automatically when you share a UI screenshot. first time it runs, it asks two questions (update preference + telemetry) and you're done.
what it does
AI sucks at seeing screenshots. paste one into claude code and ask it to rebuild something — it says "i see a dashboard with some cards and a sidebar." meanwhile you're staring at a broken button, inconsistent border radius, and 6px padding where 8px should be.
clearshot fixes this by running structured analysis before the AI responds. not "blue color" but #3B82F6. not "some spacing" but 8px gap. not "a button" but sm/rounded-md/border-gray-200/shadow-sm.
three analysis levels — facts and taste together, every time:
| level | what it does | when it runs | |-------|-------------|--------------| | map | 5×5 spatial grid, full element inventory with colors/borders/states | always | | system | color palette, type scale, spacing patterns, design cohesion | always | | blueprint | layout architecture, CSS patterns, interaction map, responsive context | when building |
levels 1+2 always run. level 3 escalates when you're implementing from the screenshot.
it knows when to shut up
screenshot of a meme? stays quiet. architecture diagram? stays quiet. photo of your lunch? definitely stays quiet. only activates on UI screenshots when you're building or critiquing frontend.
privacy
everything runs locally. no screenshots or code ever leave your machine.
telemetry is opt-in — you choose during first run:
| mode | what's sent | |------|------------| | anonymous | usage events + hashed device ID (no PII) | | off | nothing |
no network calls happen until you explicitly opt in.
research
builds on production systems and academic research:
- microsoft omniparser — improved GPT-4V from 70.5% to 93.8% with structured analysis. paper
- dcgen — recursive divide-and-conquer screenshot decomposition. paper
- replit — multi-agent verification loop with structured screenshot comparison
- google screenai — UI-specific vision-language model. blog
built by
license
MIT
Related Skills
node-connect
351.2kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
110.6kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
351.2kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
351.2kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
