
Inspect

Entity-level code review for Git. Graph-based risk scoring, change classification, commit untangling. 95% recall on the Greptile benchmark.


<p align="center"> <img src="assets/logo.svg" alt="inspect" width="80" /> </p> <p align="center"> <strong>inspect</strong> </p>

Entity-level code review for Git. Every code review tool today works at the file or line level. inspect works at the entity level: functions, structs, classes, traits. It scores each change by risk and groups them by logical dependency.

The Problem

git diff tells you 12 files changed. But which changes actually matter? A renamed variable, a reformatted function, and a deleted public API method all look the same in a line-level diff. You have to read every line to figure out what needs careful review and what can be skipped.

This gets worse with AI-generated code. DORA 2025 found that AI adoption increased PR size by 154%, review time by 91%, and shipped bugs by 9%. Reviewers are drowning in noise.

inspect gives you two ways to handle this: local triage that ranks changes by structural risk, and optional LLM-powered review that finds the actual bugs.

What inspect Does

For every changed entity, inspect computes:

  • Classification: What kind of change is this? Text-only (comments/whitespace), syntax (signature/type change), functional (logic change), or a combination. Based on ConGra.
  • Risk score: 0.0 to 1.0, combining classification, blast radius, dependent count, public API exposure, and change type. Cosmetic-only changes get a 70% discount.
  • Blast radius: How many entities are transitively affected if this change breaks something. Computed from the full repo entity graph, not just changed files.
  • Grouping: Union-Find untangling separates independent logical changes within a single commit, so tangled commits can be reviewed as separate units.
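
To make the blast radius idea concrete, here is a minimal sketch of a transitive-dependents count over an entity graph. It is not inspect's implementation: the graph shape (entity name mapped to the set of entities that depend on it), the `blast_radius` helper, and the entity names `load_settings`, `run_server`, and `main` are all assumptions made for illustration.

```python
from collections import deque


def blast_radius(dependents: dict[str, set[str]], changed: str) -> int:
    """Count entities reachable from `changed` by following dependent edges."""
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:  # also guards against dependency cycles
                seen.add(dependent)
                queue.append(dependent)
    return len(seen) - 1  # exclude the changed entity itself


# Hypothetical graph: key is an entity, value is the set of entities that depend on it.
graph = {
    "parse_config": {"load_settings"},
    "load_settings": {"main", "run_server"},
    "run_server": {"main"},
    "format_output": set(),
}

print(blast_radius(graph, "parse_config"))   # 3
print(blast_radius(graph, "format_output"))  # 0
```

Because the walk follows edges across the whole graph, an entity in an unchanged file still counts if it depends on the change, which is the point of building the graph from the full repo.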

$ inspect diff HEAD~1

inspect 12 entities changed
  1 critical, 4 high, 6 medium, 1 low

groups 3 logical groups:
  [0] src/merge/ (5 entities)
  [1] src/driver/ (4 entities)
  [2] validate (3 entities)

entities (by risk):

  ~ CRITICAL function merge_entities (src/merge/core.rs)
    classification: functional  score: 0.82  blast: 171  deps: 3/12
    public API
    >>> 12 dependents may be affected

  - HIGH function old_validate (src/validate.rs)
    classification: functional  score: 0.65  blast: 8  deps: 0/3
    public API

  + MEDIUM function parse_config (src/config.rs)
    classification: functional  score: 0.45  blast: 0  deps: 2/0

  ~ LOW function format_output (src/display.rs)
    classification: text  score: 0.05  blast: 0  deps: 0/0
    cosmetic only (no structural change)
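
The logical groups in the output above come from Union-Find untangling. The sketch below shows the general technique, assuming two changed entities belong to the same group whenever a dependency edge connects them. How inspect actually decides which entities are linked is not documented here, so the `group_changes` helper, the edge list, and the names `resolve_conflict` and `load_settings` are hypothetical.

```python
from collections import defaultdict


class UnionFind:
    def __init__(self, items):
        self.parent = {item: item for item in items}

    def find(self, item):
        # Path halving: point each visited node at its grandparent.
        while self.parent[item] != item:
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def group_changes(changed, dependency_edges):
    """Partition changed entities into groups connected by dependency edges."""
    uf = UnionFind(changed)
    changed_set = set(changed)
    for a, b in dependency_edges:
        # Assumption: only edges between two changed entities merge groups.
        if a in changed_set and b in changed_set:
            uf.union(a, b)
    groups = defaultdict(list)
    for entity in changed:
        groups[uf.find(entity)].append(entity)
    return sorted(sorted(group) for group in groups.values())


changed = ["merge_entities", "resolve_conflict", "parse_config", "format_output"]
edges = [("merge_entities", "resolve_conflict"), ("parse_config", "load_settings")]
print(group_changes(changed, edges))
# [['format_output'], ['merge_entities', 'resolve_conflict'], ['parse_config']]
```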

Two ways to use inspect

Local (free, open source): CLI + MCP server. Entity triage, risk scoring, blast radius, commit untangling. inspect review sends the riskiest entities to any LLM you choose (Anthropic, OpenAI, Ollama, or any OpenAI-compatible server). No vendor lock-in.

Hosted API (optional): Full review via inspect.ataraxy-labs.com. Goes further than local review with 9 specialized lenses, cross-model ensemble, and validation passes. Submit a PR, get back findings.

Install

cargo install --git https://github.com/Ataraxy-Labs/inspect inspect-cli

Or build from source:

git clone https://github.com/Ataraxy-Labs/inspect
cd inspect && cargo build --release

Commands

inspect diff <ref>

Review entity-level changes for a commit or range.

inspect diff HEAD~1              # last commit
inspect diff main..feature       # branch comparison
inspect diff abc123              # specific commit
inspect diff HEAD~1 --context    # show dependency details
inspect diff HEAD~1 --min-risk high  # only high/critical
inspect diff HEAD~1 --format json    # JSON output
inspect diff HEAD~1 --format markdown  # markdown output (for agents)

inspect pr <number>

Review all changes in a GitHub pull request. Uses gh CLI to resolve base/head refs.

inspect pr 42
inspect pr 42 --min-risk medium
inspect pr 42 --format json

inspect file <path>

Review uncommitted changes in a file.

inspect file src/main.rs
inspect file src/main.rs --context

inspect review <ref>

Triage + LLM review. Triages entities by risk, sends the highest-risk ones to an LLM for review.

inspect review HEAD~1                          # Anthropic (default)
inspect review HEAD~1 --provider ollama --model llama3  # local Ollama
inspect review HEAD~1 --api-base http://localhost:8000/v1 --model my-model  # any OpenAI-compatible server
inspect review HEAD~1 --min-risk medium        # review more entities
inspect review HEAD~1 --max-entities 20        # send more to LLM

inspect bench --repo <path>

Benchmark entity-level review across a repo's commit history. Outputs JSON with per-commit details and aggregate metrics.

inspect bench --repo ~/my-project --limit 50

LLM Providers

inspect review works with Anthropic, OpenAI, and any OpenAI-compatible server (Ollama, vLLM, LM Studio, llama.cpp). Pass --api-base and inspect uses the OpenAI-compatible client against that endpoint.

# Anthropic (default)
export ANTHROPIC_API_KEY=sk-ant-...
inspect review HEAD~1

# OpenAI
export OPENAI_API_KEY=sk-...
inspect review HEAD~1 --provider openai --model gpt-4o

# Ollama (local, no API key)
inspect review HEAD~1 --provider ollama --model llama3

# Any OpenAI-compatible endpoint (vLLM, LM Studio, etc.)
inspect review HEAD~1 --api-base http://localhost:8000/v1 --model my-model

| Provider | API key env var | Default base URL |
|----------|----------------|-----------------|
| anthropic | ANTHROPIC_API_KEY | api.anthropic.com |
| openai | OPENAI_API_KEY | api.openai.com/v1 |
| ollama | none | localhost:11434/v1 |

--api-base implies the OpenAI-compatible client, so you don't need --provider with it. --provider ollama implies localhost:11434, so you don't need --api-base with it.
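
The resolution rules above can be sketched as a small function. This only illustrates the documented behavior and is not inspect's code: the `resolve_provider` name and the `"openai-compatible"` client label are made up for the example, and letting `--api-base` win when both flags are passed is an assumption.

```python
# Default base URLs, copied from the provider table.
DEFAULT_BASE_URLS = {
    "anthropic": "api.anthropic.com",
    "openai": "api.openai.com/v1",
    "ollama": "localhost:11434/v1",
}


def resolve_provider(provider=None, api_base=None):
    """Return (client, base_url) for a combination of --provider and --api-base."""
    if api_base is not None:
        # --api-base implies the OpenAI-compatible client.
        return ("openai-compatible", api_base)
    name = provider or "anthropic"  # Anthropic is the default provider
    if name not in DEFAULT_BASE_URLS:
        raise ValueError(f"unknown provider: {name}")
    client = "anthropic" if name == "anthropic" else "openai-compatible"
    return (client, DEFAULT_BASE_URLS[name])


print(resolve_provider())                                      # ('anthropic', 'api.anthropic.com')
print(resolve_provider(provider="ollama"))                     # ('openai-compatible', 'localhost:11434/v1')
print(resolve_provider(api_base="http://localhost:8000/v1"))   # ('openai-compatible', 'http://localhost:8000/v1')
```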

MCP Server

inspect ships an MCP server so any coding agent (Claude Code, Cursor, etc.) can use entity-level review as a tool.

# Build the MCP server
cargo build -p inspect-mcp

# Binary at target/debug/inspect-mcp

6 tools:

| Tool | Purpose |
|------|---------|
| inspect_triage | Primary entry point. Full analysis sorted by risk with verdict. |
| inspect_entity | Drill into one entity: before/after content, dependents, dependencies. |
| inspect_group | Get all entities in a logical change group. |
| inspect_file | Scope review to a single file. |
| inspect_stats | Lightweight summary: stats, verdict, timing. No entity details. |
| inspect_risk_map | File-level risk heatmap with per-file aggregate scores. |

Review verdict (returned by triage and stats):

  • likely_approvable: All changes are cosmetic
  • standard_review: Normal changes, no high-risk entities
  • requires_review: High-risk entities present
  • requires_careful_review: Critical-risk entities present
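
One way to read the four verdicts is as a precedence check over the changed entities. The sketch below is an interpretation of the list above, not inspect's implementation: the `review_verdict` helper, the `(risk_level, cosmetic_only)` tuple shape, and the order in which the rules are checked are assumptions.

```python
def review_verdict(entities):
    """Map (risk_level, cosmetic_only) pairs to a review verdict.

    Assumed precedence: critical beats high, high beats everything else.
    """
    levels = {level for level, _ in entities}
    if "critical" in levels:
        return "requires_careful_review"
    if "high" in levels:
        return "requires_review"
    if all(cosmetic_only for _, cosmetic_only in entities):
        return "likely_approvable"
    return "standard_review"


print(review_verdict([("low", True)]))                       # likely_approvable
print(review_verdict([("medium", False), ("low", True)]))    # standard_review
print(review_verdict([("high", False), ("low", True)]))      # requires_review
print(review_verdict([("critical", False), ("high", False)]))  # requires_careful_review
```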

Add to your Claude Code config:

{
  "mcpServers": {
    "inspect": {
      "command": "/path/to/inspect-mcp"
    }
  }
}

Code Review Benchmark

inspect + LLM vs Greptile vs CodeRabbit on the same dataset, same judge, same methodology. 141 planted bugs across 52 PRs in 5 production repos (Sentry, Cal.com, Grafana, Keycloak, Discourse).

| Metric | inspect + LLM | Greptile API | CodeRabbit CLI |
|--------|--------------|-------------|----------------|
| Recall | 95.0% | 91.5% | 56.0% |
| Precision | 33.3% | 21.9% | 48.2% |
| F1 Score | 49.4% | 35.3% | 51.8% |
| HC Recall | 100% | 94.1% | 60.8% |
| Findings | 402 | 590 | 164 |

inspect catches 95% of all bugs and 100% of high-severity bugs. CodeRabbit misses 44% of bugs overall and 39% of high-severity ones. Greptile has decent recall but produces 3x more noise.
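
For readers who want to check how the precision, recall, and F1 rows relate, here is a short worked example. The true-positive counts (134, 129, 79) are not published in this README. They are inferred here as the integers that reproduce the reported percentages given 141 planted bugs and each tool's findings count, so treat them as an assumption.

```python
def precision_recall_f1(true_positives, findings, total_bugs):
    precision = true_positives / findings
    recall = true_positives / total_bugs
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


TOTAL_BUGS = 141  # planted bugs in the dataset

# tool -> (inferred true positives, findings from the table)
tools = {
    "inspect + LLM": (134, 402),
    "Greptile API": (129, 590),
    "CodeRabbit CLI": (79, 164),
}

for name, (tp, findings) in tools.items():
    p, r, f1 = precision_recall_f1(tp, findings, TOTAL_BUGS)
    print(f"{name}: precision {p:.1%}, recall {r:.1%}, F1 {f1:.1%}")
# inspect + LLM: precision 33.3%, recall 95.0%, F1 49.4%
# Greptile API: precision 21.9%, recall 91.5%, F1 35.3%
# CodeRabbit CLI: precision 48.2%, recall 56.0%, F1 51.8%
```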

The approach: entity-level triage cuts 100+ changed entities to the 60 riskiest, then sends each to an LLM for review. This costs a fraction of reviewing the full diff, with higher recall than tools that scan everything.

Dataset: HuggingFace. Judge: heuristic keyword matching applied identically to all tools.

Hosted API

For teams that don't want to manage LLM infrastructure, we run a hosted review service at inspect.ataraxy-labs.com. It goes beyond what inspect review does locally.

What the hosted API does differently:

  1. Entity triage ranks changes by graph signals (same as local)
  2. 9 parallel review lenses: 6 specialized (data correctness, concurrency, contracts, security, typos, runtime) + 3 general
  3. Cross-model ensemble for higher recall
  4. Structural filter drops findings that reference files not in the diff
  5. Validation pass confirms each finding against the actual code
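
Step 4 is the simplest stage to illustrate. The sketch below assumes each finding is a dict with a `file` key. That shape and the `structural_filter` name are hypothetical, since the hosted API's internal data model is not documented here.

```python
def structural_filter(findings, diff_files):
    """Keep only findings that point at a file present in the diff."""
    allowed = set(diff_files)
    return [finding for finding in findings if finding["file"] in allowed]


# Hypothetical findings; the second one references a file outside the diff.
findings = [
    {"file": "src/merge/core.rs", "message": "possible off-by-one"},
    {"file": "src/unrelated.rs", "message": "unused import"},
]
kept = structural_filter(findings, ["src/merge/core.rs", "src/config.rs"])
print([finding["file"] for finding in kept])  # ['src/merge/core.rs']
```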

# Submit a PR for review
curl -X POST https://inspect.ataraxy-labs.com/api/review \
  -H "Authorization: Bearer insp_..." \
  -H "Content-Type: application/json" \
  -d '{"repo":"owner/repo","pr_number":123}'

# Entity triage only (no LLM, returns in 1-3s)
curl -X POST https://inspect.ataraxy-labs.com/api/triage \
  -H "Authorization: Bearer insp_..." \
  -H "Content-Type: application/json" \
  -d '{"repo":"owner/repo","pr_number":123}'

Get an API key at inspect.ataraxy-labs.com/dashboard/keys.
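
The same requests can be built from a script. This sketch uses only the Python standard library and mirrors the curl calls above: the endpoints, headers, and payload fields come from those examples, while the `build_request` helper is made up for illustration. The response format is not documented here, so the commented-out send step assumes a JSON body.

```python
import json
import urllib.request

BASE_URL = "https://inspect.ataraxy-labs.com/api"


def build_request(endpoint, repo, pr_number, api_key):
    """Build a POST request for the `review` or `triage` endpoint."""
    body = json.dumps({"repo": repo, "pr_number": pr_number}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key, as in the curl examples; substitute your own.
request = build_request("triage", "owner/repo", 123, "insp_...")
print(request.get_method(), request.full_url)

# To send it (assumes the service answers with JSON):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```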

Triage Benchmark

Results from running inspect bench against three Rust codebases (89 commits, 8,870 entities total):

| Metric | sem | weave | agenthub |
|--------|-----|-------|----------|
| Commits analyzed | 31 | 39 | 19 |
| Entities reviewed | 4,955 | 2,803 | 1,112 |
| Avg entities/commit | 159.8 | 71.9 | 58.5 |
| Avg blast radius | 0.0 | 3.4 | 42.5 |
| Max blast radius | 0 | 171 | 595 |
| High/Critical ratio | 15.1% | 40.6% | 77.1% |
| Cross-file impact | 0% | 10.6% | 70.7% |
| Tangled commits | 96.8% | 69.2% | 94.7% |

Key takeaways:

  • Blast radius 595 means one entity change in agenthub could affect 595 other entities transitively. A line-level diff won't t