Inspect
Entity-level code review for Git. Graph-based risk scoring, change classification, commit untangling. 95% recall on the Greptile benchmark.
Entity-level code review for Git. Every code review tool today works at the file or line level. inspect works at the entity level: functions, structs, classes, traits. It scores each change by risk and groups them by logical dependency.
The Problem
git diff tells you 12 files changed. But which changes actually matter? A renamed variable, a reformatted function, and a deleted public API method all look the same in a line-level diff. You have to read every line to figure out what needs careful review and what can be skipped.
This gets worse with AI-generated code. DORA 2025 found that AI adoption led to a 154% increase in PR size, a 91% increase in review time, and a 9% increase in bugs shipped. Reviewers are drowning in noise.
inspect gives you two ways to handle this: local triage that ranks changes by structural risk, and optional LLM-powered review that finds the actual bugs.
What inspect Does
For every changed entity, inspect computes:
- Classification: What kind of change is this? Text-only (comments/whitespace), syntax (signature/type change), functional (logic change), or a combination. Based on ConGra.
- Risk score: 0.0 to 1.0, combining classification, blast radius, dependent count, public API exposure, and change type. Cosmetic-only changes get a 70% discount.
- Blast radius: How many entities are transitively affected if this change breaks something. Computed from the full repo entity graph, not just changed files.
- Grouping: Union-Find untangling separates independent logical changes within a single commit, so tangled commits can be reviewed as separate units.
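As a rough illustration of the two graph computations above, here is a minimal Python sketch. It assumes the entity graph is available as a list of `(entity, dependency)` edges; the edge data, the function names, and the rule that two changed entities join a group when a direct dependency edge connects them are all hypothetical, and inspect's actual implementation may use other signals.

```python
from collections import defaultdict, deque

# Hypothetical dependency edges: (entity, entity it depends on).
DEPENDS_ON = [
    ("parse_config", "read_file"),
    ("load_settings", "parse_config"),
    ("main", "load_settings"),
    ("format_output", "pad_column"),
]

def blast_radius(entity, edges):
    """Count the entities that transitively depend on `entity` (BFS)."""
    dependents = defaultdict(set)
    for src, dst in edges:
        dependents[dst].add(src)  # reverse edge: dst is used by src
    seen = {entity}
    queue = deque([entity])
    while queue:
        current = queue.popleft()
        for dep in dependents[current]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return len(seen) - 1  # exclude the entity itself

def group_changes(changed, edges):
    """Union-Find: changed entities joined by an edge end up in one group."""
    parent = {e: e for e in changed}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, dst in edges:
        if src in parent and dst in parent:  # only link changed entities
            parent[find(src)] = find(dst)

    groups = defaultdict(list)
    for e in changed:
        groups[find(e)].append(e)
    return sorted(sorted(g) for g in groups.values())
```

With this toy graph, `blast_radius("read_file", DEPENDS_ON)` counts `parse_config`, `load_settings`, and `main`, while `group_changes` keeps `format_output` apart from the two config-related entities.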
$ inspect diff HEAD~1
inspect 12 entities changed
1 critical, 4 high, 6 medium, 1 low
groups 3 logical groups:
[0] src/merge/ (5 entities)
[1] src/driver/ (4 entities)
[2] validate (3 entities)
entities (by risk):
~ CRITICAL function merge_entities (src/merge/core.rs)
classification: functional score: 0.82 blast: 171 deps: 3/12
public API
>>> 12 dependents may be affected
- HIGH function old_validate (src/validate.rs)
classification: functional score: 0.65 blast: 8 deps: 0/3
public API
+ MEDIUM function parse_config (src/config.rs)
classification: functional score: 0.45 blast: 0 deps: 2/0
~ LOW function format_output (src/display.rs)
classification: text score: 0.05 blast: 0 deps: 0/0
cosmetic only (no structural change)
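To make the classification labels in this output concrete, the sketch below separates text-only changes from structural ones for a single Python function by comparing syntax trees. This is only an illustration: it uses Python's standard `ast` module rather than whatever parser inspect uses, the `classify_change` helper and the `"syntax+functional"` label are hypothetical, and the ConGra-based classifier may draw the boundaries differently.

```python
import ast

def _dump(node):
    """ast.dump that tolerates a missing node (e.g. no return annotation)."""
    return ast.dump(node) if node is not None else ""

def classify_change(before: str, after: str) -> str:
    """Classify a change to a single Python function definition."""
    old = ast.parse(before).body[0]
    new = ast.parse(after).body[0]
    # Comments and whitespace never reach the AST, so equal dumps mean
    # the change is text-only.
    if ast.dump(old) == ast.dump(new):
        return "text"
    signature_changed = (
        old.name != new.name
        or _dump(old.args) != _dump(new.args)
        or _dump(old.returns) != _dump(new.returns)
    )
    body_changed = [ast.dump(s) for s in old.body] != [ast.dump(s) for s in new.body]
    if signature_changed and body_changed:
        return "syntax+functional"
    # Anything else that differs (e.g. decorators) is treated as functional.
    return "syntax" if signature_changed else "functional"
```

Note that a docstring edit changes the tree, so this sketch would not treat it as text-only.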
Two ways to use inspect
Local (free, open source): CLI + MCP server. Entity triage, risk scoring, blast radius, commit untangling. inspect review sends the riskiest entities to any LLM you choose (Anthropic, OpenAI, Ollama, or any OpenAI-compatible server). No vendor lock-in.
Hosted API (optional): Full review via inspect.ataraxy-labs.com. Goes further than local review with 9 specialized lenses, cross-model ensemble, and validation passes. Submit a PR, get back findings.
Install
cargo install --git https://github.com/Ataraxy-Labs/inspect inspect-cli
Or build from source:
git clone https://github.com/Ataraxy-Labs/inspect
cd inspect && cargo build --release
Commands
inspect diff <ref>
Review entity-level changes for a commit or range.
inspect diff HEAD~1 # last commit
inspect diff main..feature # branch comparison
inspect diff abc123 # specific commit
inspect diff HEAD~1 --context # show dependency details
inspect diff HEAD~1 --min-risk high # only high/critical
inspect diff HEAD~1 --format json # JSON output
inspect diff HEAD~1 --format markdown # markdown output (for agents)
inspect pr <number>
Review all changes in a GitHub pull request. Uses gh CLI to resolve base/head refs.
inspect pr 42
inspect pr 42 --min-risk medium
inspect pr 42 --format json
inspect file <path>
Review uncommitted changes in a file.
inspect file src/main.rs
inspect file src/main.rs --context
inspect review <ref>
Triage + LLM review. Triages entities by risk, sends the highest-risk ones to an LLM for review.
inspect review HEAD~1 # Anthropic (default)
inspect review HEAD~1 --provider ollama --model llama3 # local Ollama
inspect review HEAD~1 --api-base http://localhost:8000/v1 --model my-model # any OpenAI-compatible server
inspect review HEAD~1 --min-risk medium # review more entities
inspect review HEAD~1 --max-entities 20 # send more to LLM
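The interaction between `--min-risk` and `--max-entities` can be pictured as a filter followed by a top-k cut. The Python sketch below is an assumption about that behavior, not inspect's code: the default values, the dictionary field names, and the `select_for_review` helper are hypothetical. The sample entities reuse the names, risk levels, and scores from the `inspect diff` output shown earlier.

```python
RISK_ORDER = ["low", "medium", "high", "critical"]

# Entities taken from the sample `inspect diff HEAD~1` output.
SAMPLE = [
    {"name": "format_output", "risk": "low", "score": 0.05},
    {"name": "merge_entities", "risk": "critical", "score": 0.82},
    {"name": "parse_config", "risk": "medium", "score": 0.45},
    {"name": "old_validate", "risk": "high", "score": 0.65},
]

def select_for_review(entities, min_risk="high", max_entities=10):
    """Keep entities at or above `min_risk`, then take the highest scores.

    The defaults here are assumed for illustration only.
    """
    threshold = RISK_ORDER.index(min_risk)
    eligible = [e for e in entities if RISK_ORDER.index(e["risk"]) >= threshold]
    eligible.sort(key=lambda e: e["score"], reverse=True)
    return eligible[:max_entities]
```

Lowering `min_risk` widens the candidate pool, and raising `max_entities` lets more of that pool reach the LLM.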
inspect bench --repo <path>
Benchmark entity-level review across a repo's commit history. Outputs JSON with per-commit details and aggregate metrics.
inspect bench --repo ~/my-project --limit 50
LLM Providers
inspect review works with Anthropic, OpenAI, and any OpenAI-compatible server (Ollama, vLLM, LM Studio, llama.cpp). Pass --api-base and it auto-detects the right client.
# Anthropic (default)
export ANTHROPIC_API_KEY=sk-ant-...
inspect review HEAD~1
# OpenAI
export OPENAI_API_KEY=sk-...
inspect review HEAD~1 --provider openai --model gpt-4o
# Ollama (local, no API key)
inspect review HEAD~1 --provider ollama --model llama3
# Any OpenAI-compatible endpoint (vLLM, LM Studio, etc.)
inspect review HEAD~1 --api-base http://localhost:8000/v1 --model my-model
| Provider | API key env var | Default base URL |
|----------|----------------|-----------------|
| anthropic | ANTHROPIC_API_KEY | api.anthropic.com |
| openai | OPENAI_API_KEY | api.openai.com/v1 |
| ollama | none | localhost:11434/v1 |
--api-base implies the OpenAI-compatible client, so you don't need --provider with it. --provider ollama implies localhost:11434, so you don't need --api-base with it.
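The flag rules above amount to a small resolution step. The following Python sketch encodes them using the default base URLs from the table; the `resolve_client` function, its return shape, and the choice to let `--api-base` win when both flags are passed are assumptions made for illustration.

```python
# Default base URLs, copied from the provider table.
DEFAULT_BASE_URLS = {
    "anthropic": "api.anthropic.com",
    "openai": "api.openai.com/v1",
    "ollama": "localhost:11434/v1",
}

def resolve_client(provider=None, api_base=None):
    """Return (client_kind, base_url) for the given CLI flags."""
    if api_base is not None:
        # --api-base implies the OpenAI-compatible client.
        # Assumption: it also overrides any --provider default.
        return ("openai-compatible", api_base)
    provider = provider or "anthropic"  # Anthropic is the documented default
    if provider not in DEFAULT_BASE_URLS:
        raise ValueError(f"unknown provider: {provider}")
    kind = "anthropic" if provider == "anthropic" else "openai-compatible"
    return (kind, DEFAULT_BASE_URLS[provider])
```

For example, `resolve_client(provider="ollama")` yields the local Ollama endpoint without any `--api-base` value.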
MCP Server
inspect ships an MCP server so any coding agent (Claude Code, Cursor, etc.) can use entity-level review as a tool.
# Build the MCP server
cargo build -p inspect-mcp
# Binary at target/debug/inspect-mcp
6 tools:
| Tool | Purpose |
|------|---------|
| inspect_triage | Primary entry point. Full analysis sorted by risk with verdict. |
| inspect_entity | Drill into one entity: before/after content, dependents, dependencies. |
| inspect_group | Get all entities in a logical change group. |
| inspect_file | Scope review to a single file. |
| inspect_stats | Lightweight summary: stats, verdict, timing. No entity details. |
| inspect_risk_map | File-level risk heatmap with per-file aggregate scores. |
Review verdict (returned by triage and stats):
- likely_approvable: All changes are cosmetic
- standard_review: Normal changes, no high-risk entities
- requires_review: High-risk entities present
- requires_careful_review: Critical-risk entities present
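The four verdicts can be read as a priority ladder over the risk levels present in a change. The Python sketch below shows one way to express that; the `risk` and `cosmetic` fields and the `review_verdict` function are hypothetical, and the real MCP tools may derive the verdict differently.

```python
def review_verdict(entities):
    """Map the entities in a change to one of the four documented verdicts."""
    levels = {e["risk"] for e in entities}
    if "critical" in levels:
        return "requires_careful_review"
    if "high" in levels:
        return "requires_review"
    # Assumption: an empty change set also counts as approvable.
    if all(e["cosmetic"] for e in entities):
        return "likely_approvable"
    return "standard_review"
```

An agent could call `inspect_stats` first and only drill into entities when the verdict is stronger than `likely_approvable`.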
Add to your Claude Code config:
{
  "mcpServers": {
    "inspect": {
      "command": "/path/to/inspect-mcp"
    }
  }
}
Code Review Benchmark
inspect + LLM vs Greptile vs CodeRabbit on the same dataset, same judge, same methodology. 141 planted bugs across 52 PRs in 5 production repos (Sentry, Cal.com, Grafana, Keycloak, Discourse).
| Metric | inspect + LLM | Greptile API | CodeRabbit CLI |
|--------|--------------|-------------|----------------|
| Recall | 95.0% | 91.5% | 56.0% |
| Precision | 33.3% | 21.9% | 48.2% |
| F1 Score | 49.4% | 35.3% | 51.8% |
| HC Recall | 100% | 94.1% | 60.8% |
| Findings | 402 | 590 | 164 |
inspect catches 95% of all bugs and 100% of high-severity bugs. CodeRabbit misses 44% of bugs overall and 39% of high-severity ones. Greptile has decent recall but the lowest precision (21.9%): it reports 590 findings against inspect's 402 and CodeRabbit's 164.
The approach: entity-level triage cuts 100+ changed entities to the 60 riskiest, then sends each to an LLM for review. This costs a fraction of reviewing the full diff, with higher recall than tools that scan everything.
Dataset: HuggingFace. Judge: heuristic keyword matching applied identically to all tools.
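The recall, precision, and F1 figures in the table can be cross-checked with a few lines of Python. The true-positive counts below (134, 129, 79) are not stated in the source; they are inferred from the reported recall over 141 planted bugs, and the check assumes precision is true positives divided by total findings.

```python
TOTAL_BUGS = 141  # planted bugs in the dataset

# (inferred true positives, reported findings) per tool.
TOOLS = {
    "inspect + LLM": (134, 402),
    "Greptile API": (129, 590),
    "CodeRabbit CLI": (79, 164),
}

def metrics(true_positives, findings, total_bugs=TOTAL_BUGS):
    """Return (recall, precision, f1) as percentages rounded to one decimal."""
    recall = true_positives / total_bugs
    precision = true_positives / findings
    f1 = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * x, 1) for x in (recall, precision, f1))

for tool, (tp, findings) in TOOLS.items():
    print(tool, metrics(tp, findings))
```

Under those assumptions the computed values line up with the Recall, Precision, and F1 rows of the table.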
Hosted API
For teams that don't want to manage LLM infrastructure, we run a hosted review service at inspect.ataraxy-labs.com. It goes beyond what inspect review does locally.
What the hosted API does differently:
- Entity triage ranks changes by graph signals (same as local)
- 9 parallel review lenses: 6 specialized (data correctness, concurrency, contracts, security, typos, runtime) + 3 general
- Cross-model ensemble for higher recall
- Structural filter drops findings that reference files not in the diff
- Validation pass confirms each finding against the actual code
# Submit a PR for review
curl -X POST https://inspect.ataraxy-labs.com/api/review \
-H "Authorization: Bearer insp_..." \
-H "Content-Type: application/json" \
-d '{"repo":"owner/repo","pr_number":123}'
# Entity triage only (no LLM, returns in 1-3s)
curl -X POST https://inspect.ataraxy-labs.com/api/triage \
-H "Authorization: Bearer insp_..." \
-H "Content-Type: application/json" \
-d '{"repo":"owner/repo","pr_number":123}'
Get an API key at inspect.ataraxy-labs.com/dashboard/keys.
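For callers who prefer a script to curl, the same requests can be built with Python's standard library. This sketch mirrors only the URL, headers, and JSON body shown in the curl examples; the `build_request` helper is hypothetical, and the response schema is not documented here, so the sketch does not parse a response.

```python
import json
import urllib.request

BASE_URL = "https://inspect.ataraxy-labs.com/api"

def build_request(repo, pr_number, api_key, endpoint="review"):
    """Build a POST request for the /review or /triage endpoint."""
    body = json.dumps({"repo": repo, "pr_number": pr_number}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Sending it is then a matter of `urllib.request.urlopen(build_request("owner/repo", 123, "insp_..."))`, with `endpoint="triage"` for the triage-only call.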
Triage Benchmark
Results from running inspect bench against three Rust codebases (89 commits, 8,870 entities total):
| Metric | sem | weave | agenthub |
|--------|-----|-------|----------|
| Commits analyzed | 31 | 39 | 19 |
| Entities reviewed | 4,955 | 2,803 | 1,112 |
| Avg entities/commit | 159.8 | 71.9 | 58.5 |
| Avg blast radius | 0.0 | 3.4 | 42.5 |
| Max blast radius | 0 | 171 | 595 |
| High/Critical ratio | 15.1% | 40.6% | 77.1% |
| Cross-file impact | 0% | 10.6% | 70.7% |
| Tangled commits | 96.8% | 69.2% | 94.7% |
Key takeaways:
- Blast radius 595 means one entity change in agenthub could affect 595 other entities transitively. A line-level diff won't t
