Agentshield
AI agent security scanner. Detect vulnerabilities in agent configurations, MCP servers, and tool permissions. Available as CLI, GitHub Action, ECC plugin, and GitHub App integration. 🛡️
AgentShield
Security auditor for AI agent configurations
Scans Claude Code setups for hardcoded secrets, permission misconfigs, hook injection, MCP server risks, and agent prompt injection vectors. Available as CLI, GitHub Action, and GitHub App integration.
Quick Start · What It Catches · API Reference · Opus Pipeline · GitHub Action · Distribution · MiniClaw · Changelog
Why
The AI agent ecosystem is growing faster than its security tooling. In January 2026 alone:
- 12% of a major agent skill marketplace was malicious (341 of 2,857 community skills)
- A CVSS 8.8 CVE exposed 17,500+ internet-facing instances to one-click RCE
- The Moltbook breach compromised 1.5M API tokens across 770,000 agents
Developers install community skills, connect MCP servers, and configure hooks without any automated way to audit the security of their setup. AgentShield scans your .claude/ directory and flags vulnerabilities before they become exploits.
Built at the Claude Code Hackathon (Cerebral Valley x Anthropic, Feb 2026). Part of the Everything Claude Code ecosystem (42K+ stars).
Quick Start
# Scan your Claude Code config (no install required)
npx ecc-agentshield scan
# Or install globally
npm install -g ecc-agentshield
agentshield scan
That's it. AgentShield auto-discovers your ~/.claude/ directory, scans all config files, and prints a graded security report.
Discovery intentionally skips common generated directories such as node_modules, build output, and .dmux worktree mirrors so transient copies do not duplicate findings.
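The pruning behavior described above can be sketched as a directory walk that never descends into skipped folders. This is a minimal illustration, not AgentShield's implementation: `discover_config_files` is a hypothetical helper, and the `dist` and `build` entries are assumed stand-ins for "build output" (only `node_modules` and `.dmux` are named in this README).

```python
import os

# Assumed skip list: node_modules and .dmux come from the README;
# "dist" and "build" are hypothetical stand-ins for build output.
SKIP_DIRS = {"node_modules", ".dmux", "dist", "build"}


def discover_config_files(root):
    """Yield .json and .md files under root, pruning skipped directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Assigning to dirnames[:] edits the list in place, which stops
        # os.walk from descending into the removed directories.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            if name.endswith((".json", ".md")):
                yield os.path.join(dirpath, name)
```

Pruning during the walk, rather than filtering paths afterwards, is what keeps mirrored copies from being read at all.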
AgentShield Security Report
Grade: F (0/100)
Score Breakdown
Secrets ░░░░░░░░░░░░░░░░░░░░ 0
Permissions ░░░░░░░░░░░░░░░░░░░░ 0
Hooks ░░░░░░░░░░░░░░░░░░░░ 0
MCP Servers ░░░░░░░░░░░░░░░░░░░░ 0
Agents ░░░░░░░░░░░░░░░░░░░░ 0
● CRITICAL Hardcoded Anthropic API key
CLAUDE.md:13
Evidence: sk-ant-a...cdef
Fix: Replace with environment variable reference [auto-fixable]
● CRITICAL Overly permissive allow rule: Bash(*)
settings.json
Evidence: Bash(*)
Fix: Restrict to specific commands: Bash(git *), Bash(npm *), Bash(node *)
Summary
Files scanned: 6
Findings: 73 total - 19 critical, 29 high, 15 medium, 4 low, 6 info
Auto-fixable: 8 (use --fix)
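The Evidence line in the sample report shows a shortened key rather than the full secret. A minimal sketch of that kind of masking follows, assuming the first 8 and last 4 characters are kept. That rule is inferred from the sample value `sk-ant-a...cdef` and is not confirmed as AgentShield's actual behavior; `mask_secret` is a hypothetical helper.

```python
def mask_secret(secret, head=8, tail=4):
    """Return a redacted form that keeps only the start and end of a secret."""
    if len(secret) <= head + tail:
        # Too short to show any part safely, so hide every character.
        return "*" * len(secret)
    return f"{secret[:head]}...{secret[-tail:]}"
```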
More commands
# Scan a specific directory
agentshield scan --path /path/to/.claude
# Auto-fix safe issues (replaces hardcoded secrets with env var references)
agentshield scan --fix
# JSON output for CI pipelines
agentshield scan --format json
# Generate an HTML security report
agentshield scan --format html > report.html
# Three-agent Opus 4.6 adversarial analysis (requires ANTHROPIC_API_KEY)
agentshield scan --opus --stream
# Generate a secure baseline config
agentshield init
JSON reports now expose findings[].runtimeConfidence when AgentShield can distinguish active runtime config from project-local settings, template/example inventories, declarative plugin manifests, and manifest-resolved non-shell hook implementations.
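A CI job might use this field to separate active exposure from template noise. The sketch below assumes the report has a top-level `findings` array (implied by the `findings[].runtimeConfidence` path) and uses a hypothetical helper name, `active_runtime_findings`; other report fields are not assumed.

```python
import json


def active_runtime_findings(report_text):
    """Return findings that should be treated as active runtime exposure."""
    report = json.loads(report_text)
    active = []
    for finding in report.get("findings", []):
        # The field appears only when the source can be classified, so a
        # missing value is treated conservatively as active runtime config.
        confidence = finding.get("runtimeConfidence", "active-runtime")
        if confidence == "active-runtime":
            active.append(finding)
    return active
```

The input would be the text produced by `agentshield scan --format json`. Treating unclassified findings as active is a design choice for this sketch, made so that a gate fails closed.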
What It Catches
102 rules across 5 categories, graded A–F with a 0–100 numeric score.
Secrets Detection (10 rules, 14 patterns)
| What | Examples |
|------|----------|
| API keys | Anthropic (sk-ant-), OpenAI (sk-proj-), AWS (AKIA), Google (AIza), Stripe (sk_test_/sk_live_) |
| Tokens | GitHub PATs (ghp_/github_pat_), Slack (xox[bprs]-), JWTs (eyJ...), Bearer tokens |
| Credentials | Hardcoded passwords, database connection strings (postgres/mongo/mysql/redis), private key material |
| Env leaks | Secrets passed through environment variables in configs, echo $SECRET in hooks |
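
Prefix-based detection of the key formats in the table can be sketched with a handful of regular expressions. These patterns are illustrative approximations written for this example, not AgentShield's actual 14 patterns, and `find_secrets` is a hypothetical helper.

```python
import re

# Illustrative patterns keyed by a made-up finding name.
PATTERNS = {
    "anthropic-api-key": re.compile(r"sk-ant-[A-Za-z0-9_-]{8,}"),
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-pat": re.compile(r"\b(?:ghp_[A-Za-z0-9]{36}|github_pat_[A-Za-z0-9_]{20,})"),
    "slack-token": re.compile(r"\bxox[bprs]-[A-Za-z0-9-]{10,}"),
    "stripe-secret-key": re.compile(r"\bsk_(?:test|live)_[A-Za-z0-9]{10,}"),
}


def find_secrets(text):
    """Return (line_number, pattern_name) pairs for lines that look like secrets."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((line_no, kind))
    return hits
```

An environment variable reference such as `${ANTHROPIC_API_KEY}` does not match any of these patterns, which is why replacing a literal key with a reference clears the finding.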
Permission Audit (10 rules)
| What | Examples |
|------|----------|
| Wildcard access | Bash(*), Write(*), Edit(*): unrestricted tool permissions |
| Missing deny lists | No deny rules for rm -rf, sudo, chmod 777 |
| Dangerous flags | --dangerously-skip-permissions usage |
| Mutable tool exposure | All mutable tools (Write, Edit, Bash) allowed without scoping |
| Destructive git | git push --force, git reset --hard in allowed commands |
| Unrestricted network | curl *, wget, ssh *, scp * in allow list without scope |
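
Two of the checks above, wildcard allow rules and missing deny rules, can be sketched against a parsed settings.json. The sketch assumes the file keeps its rules under `permissions.allow` and `permissions.deny`; `audit_permissions` and the message strings are hypothetical.

```python
WILDCARD_RULES = {"Bash(*)", "Write(*)", "Edit(*)"}
# Commands the table says a deny list should cover.
EXPECTED_DENY_SUBSTRINGS = ("rm -rf", "sudo", "chmod 777")


def audit_permissions(settings):
    """Return human-readable findings for a parsed settings.json dict."""
    permissions = settings.get("permissions", {})
    allow = permissions.get("allow", [])
    deny = permissions.get("deny", [])
    findings = []
    for rule in allow:
        if rule in WILDCARD_RULES:
            findings.append(f"wildcard allow rule: {rule}")
    for needle in EXPECTED_DENY_SUBSTRINGS:
        # A deny rule counts as coverage if it mentions the command at all.
        if not any(needle in rule for rule in deny):
            findings.append(f"no deny rule covering: {needle}")
    return findings
```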
Hook Analysis (34 rules)
| What | Examples |
|------|----------|
| Command injection | ${file} interpolation in shell commands: attacker-controlled filenames become code |
| Data exfiltration | curl -X POST with variable interpolation sending data to external URLs |
| Silent errors | 2>/dev/null, \|\| true: failing security hooks that silently pass |
| Missing hooks | No PreToolUse hooks, no Stop hooks for session-end validation |
| Network exposure | Unthrottled network requests in hooks, sensitive file access without filtering |
| Session startup | SessionStart hooks that download and execute remote scripts |
| Package installs | Global npm install -g, pip install, gem install, cargo install in hooks |
| Container escape | Docker --privileged, --pid=host, --network=host, root volume mounts |
| Credential access | macOS Keychain, GNOME Keyring, /etc/shadow reads |
| Reverse shells | /dev/tcp, mkfifo + nc, Python/Perl socket shells |
| Clipboard access | pbcopy, xclip, xsel, wl-copy: exfiltration via clipboard |
| Log tampering | journalctl --vacuum, rm /var/log, history -c: anti-forensics |
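
Several of the hook patterns in the table are simple enough to match on the command string. The sketch below covers four of them with approximate regular expressions. It is an illustration under that assumption, not the 34 rules AgentShield ships, and `check_hook_command` and the check names are hypothetical.

```python
import re

# (check name, pattern) pairs; names are made up for this example.
HOOK_CHECKS = [
    ("silent-errors", re.compile(r"2>\s*/dev/null|\|\|\s*true\b")),
    ("variable-interpolation", re.compile(r"\$\{[A-Za-z_][A-Za-z0-9_]*\}")),
    ("reverse-shell", re.compile(r"/dev/tcp/")),
    ("privileged-container", re.compile(r"--privileged\b|--pid=host\b|--network=host\b")),
]


def check_hook_command(command):
    """Return the names of all checks that a hook command string triggers."""
    return [name for name, pattern in HOOK_CHECKS if pattern.search(command)]
```

A formatter hook such as `prettier --write ${file} 2>/dev/null || true` trips two checks at once: the filename is interpolated into a shell command, and any failure is discarded.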
MCP Server Security (23 rules)
| What | Examples |
|------|----------|
| High-risk servers | Shell/command MCPs, filesystem with root access, database MCPs, browser automation |
| Supply chain | npx -y auto-install without confirmation (typosquatting vector) |
| Hardcoded secrets | API tokens in MCP environment config instead of env var references |
| Remote transport | MCP servers connecting to remote URLs (SSE/streamable HTTP) |
| Shell metacharacters | &&, \|, ; in MCP server command arguments |
| Missing metadata | No version pin, no description, excessive server count |
| Sensitive file args | .env, .pem, credentials.json passed as server arguments |
| Network exposure | Binding to 0.0.0.0 instead of localhost |
| Auto-approve | autoApprove settings that skip user confirmation for tool calls |
| Missing timeouts | High-risk servers without timeout (resource exhaustion risk) |
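
A few of these checks can be sketched against a single server entry from an MCP config. The sketch assumes each entry is a dict with a `command` string and an `args` list, as in a typical `mcpServers` block; `audit_mcp_server` and its messages are hypothetical.

```python
SHELL_METACHARACTERS = ("&&", "|", ";")
SENSITIVE_SUFFIXES = (".env", ".pem", "credentials.json")


def audit_mcp_server(name, server):
    """Return findings for one MCP server entry (a dict with command and args)."""
    command = server.get("command", "")
    args = server.get("args", [])
    findings = []
    if command == "npx" and "-y" in args:
        findings.append(f"{name}: npx -y auto-installs without confirmation")
    if any("0.0.0.0" in arg for arg in args):
        findings.append(f"{name}: binds to 0.0.0.0 instead of localhost")
    if any(meta in arg for arg in args for meta in SHELL_METACHARACTERS):
        findings.append(f"{name}: shell metacharacters in arguments")
    if any(arg.endswith(SENSITIVE_SUFFIXES) for arg in args):
        findings.append(f"{name}: sensitive file passed as argument")
    return findings
```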
MCP Confidence Notes
AgentShield scans both active MCP config and repository-shipped MCP templates.
- Findings from `mcp.json`, `.claude/mcp.json`, `.claude.json`, and active `settings.json` should be treated as the highest-confidence runtime exposure.
- Findings from `settings.local.json` are emitted as `runtimeConfidence: project-local-optional`.
- Findings from locations such as `mcp-configs/`, `config/mcp/`, or `configs/mcp/` indicate risky MCP definitions present in repository templates, not guaranteed active runtime enablement.
- JSON, markdown, terminal, and HTML outputs now expose source context via `runtimeConfidence: active-runtime | project-local-optional | template-example | docs-example | plugin-manifest | hook-code`.
- Non-secret `template-example` MCP findings are score-weighted at `0.25x`, and one template file is capped at `10` deduction points per score category so a single MCP catalog cannot score like dozens of enabled servers.
- In template files, findings such as risky server type, remote URL transport, `npx -y`, unpinned packages, and environment inheritance are still valuable, but they should be interpreted as "this repo ships a risky MCP template" rather than "this MCP is definitely enabled right now."
- Aggregate findings like large MCP server counts are especially likely to overstate runtime exposure when the source file is a template catalog.
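The template weighting and per-file cap can be worked through in a short sketch. The `0.25` weight and the cap of `10` come from the notes above; everything else is an assumption made for illustration, including the finding fields (`points`, `category`, `file`, `isSecret`), the reading that the cap applies only to the weighted non-secret findings, and the helper name `category_deductions`.

```python
from collections import defaultdict

TEMPLATE_WEIGHT = 0.25    # from the notes above
TEMPLATE_FILE_CAP = 10.0  # deduction points per template file, per category


def category_deductions(findings):
    """Sum deduction points per category, applying template weighting and cap."""
    totals = defaultdict(float)
    template_totals = defaultdict(float)  # keyed by (category, file)
    for f in findings:
        is_template = f["runtimeConfidence"] == "template-example"
        if is_template and not f.get("isSecret", False):
            # Non-secret template findings count at a quarter of their weight.
            template_totals[(f["category"], f["file"])] += f["points"] * TEMPLATE_WEIGHT
        else:
            totals[f["category"]] += f["points"]
    for (category, _file), points in template_totals.items():
        # One template file cannot deduct more than the cap in a category.
        totals[category] += min(points, TEMPLATE_FILE_CAP)
    return dict(totals)
```

With 30 template findings of 5 points each in one catalog file, the weighted sum is 37.5, which the cap reduces to 10. A single active finding worth 15 points in the same category still counts in full.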
Agent Config Review (25 rules)
| What | Examples |
|------|----------|
| Unrestricted tools | Agents with Bash access, no allowedTools restriction |
| Prompt injection surface | Agents processing external/user-provided content without defenses |
| Auto-run instructions | CLAUDE.md containing "Always run", "without asking", "automatically install" |
| Hidden instructions | Unicode zero-width characters, HTML comments, base64-encoded directives |
| URL execution | CLAUDE.md instructing agents to fetch and execute remote URLs |
| Time bombs | Delayed execution instructions triggered by time or absence conditions |
| Data harvesting | Bulk collection of passwords, credentials, or database dumps |
| Prompt reflection | ignore previous instructions, you are now, DAN jailbreak, fake system prompts |
| Output manipulation | always report ok, remove warnings from output, suppress security findings |
Structured JSON under .claude/subagents/ and .claude/slash-commands/ is analyzed like agent config when it declares allowedTools or similar tool metadata. Freeform skill-md prompt text still has narrower security coverage than agent-md and CLAUDE.md.
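Three of the text-level checks from the table (zero-width characters, HTML comments, and auto-run phrases) can be sketched as plain string scans over a CLAUDE.md body. This is a simplified illustration, not AgentShield's rule set; `review_instructions` is a hypothetical helper, and the zero-width character list is an assumed subset.

```python
import re

# Assumed subset of invisible characters: zero-width space, non-joiner,
# joiner, word joiner, and byte order mark.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
AUTO_RUN_PHRASES = ("always run", "without asking", "automatically install")


def review_instructions(text):
    """Return findings for hidden or auto-run instructions in prompt text."""
    findings = []
    if ZERO_WIDTH.search(text):
        findings.append("zero-width characters")
    if HTML_COMMENT.search(text):
        findings.append("HTML comment")
    lowered = text.lower()
    for phrase in AUTO_RUN_PHRASES:
        if phrase in lowered:
            findings.append(f"auto-run phrase: {phrase}")
    return findings
```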
Scanner Accuracy Notes
- Live audit notes and follow-up items are tracked in `false-positive-audit.md`.
- The most useful operator guidance is in the audit's [`Triage Rules For Curr
