Phantom
Autonomous Offensive Security Intelligence · AI-powered multi-agent penetration testing
☠ PHANTOM
Autonomous Adversary Simulation Platform
AI-native penetration testing — autonomous reconnaissance, exploitation, and verified results.
<br/> <br/>Quick Start · Architecture · Usage · Configuration · Contributing
<br/> </div>
Overview
Phantom is an autonomous AI penetration testing agent built on the ReAct (Reason–Act) loop. It connects a large language model to over 30 professional security tools, runs all offensive operations inside an isolated Docker sandbox, and produces verified vulnerability reports — entirely without human intervention.
Unlike CVE-signature scanners, Phantom reasons about your target: it reads HTTP responses, forms hypotheses, selects the right tool, chains multi-step exploits, then writes and executes a proof-of-concept script to confirm every finding before it appears in a report.
| | Traditional Scanners | Phantom |
|--|--|--|
| Approach | Signature matching against CVE databases | LLM reasoning + adaptive tool chaining |
| False Positives | 40–70% — requires manual triage | Every finding verified with a working PoC |
| Depth | Single-pass HTTP probe | Multi-phase: recon → exploit → verify |
| Adaptability | Fixed rules, static payloads | Adapts to target responses in real time |
| Novel Vulns | Known CVEs only | Logic flaws + novel attack paths |
| Reporting | Generic vulnerability lists | MITRE ATT&CK mapped, compliance-ready |
Core Capabilities
<table> <tr> <td align="center">🧠</td> <td><strong>Autonomous ReAct Loop</strong> — Plans, executes tools, reads results, re-plans. Handles dead ends and unexpected responses without human guidance.</td> </tr> <tr> <td align="center">🔧</td> <td><strong>53 Security Tools</strong> — nmap · nuclei · sqlmap · ffuf · httpx · katana · subfinder · nikto · gobuster · arjun · semgrep · playwright — all orchestrated automatically.</td> </tr> <tr> <td align="center">🐳</td> <td><strong>Ephemeral Docker Sandbox</strong> — All offensive tooling runs in a network-restricted Kali Linux container. Zero host filesystem access. Container is destroyed after every scan.</td> </tr> <tr> <td align="center">⚡</td> <td><strong>Multi-Agent Parallelism</strong> — Spawns specialized sub-agents (SQLi, XSS, recon) that work concurrently and report findings to the coordinator.</td> </tr> <tr> <td align="center">🛡️</td> <td><strong>7-Layer Defense Model</strong> — Scope guard → Tool firewall → Docker sandbox → Cost limiter → Time budget → HMAC audit trail → Output sanitizer.</td> </tr> <tr> <td align="center">✅</td> <td><strong>Verified Findings Only</strong> — No hallucinations. Every reported vulnerability includes raw HTTP evidence, reproduction steps, and a working exploit script.</td> </tr> <tr> <td align="center">🗺️</td> <td><strong>MITRE ATT&CK Enrichment</strong> — Automatic CWE, CAPEC, technique-level tagging, and CVSS 3.1 scoring per finding.</td> </tr> <tr> <td align="center">📋</td> <td><strong>Compliance Coverage</strong> — OWASP Top 10 (2021) · PCI DSS v4.0 · NIST 800-53 — mapped automatically per finding.</td> </tr> <tr> <td align="center">💾</td> <td><strong>Knowledge Persistence</strong> — Cross-scan memory stores hosts, past findings, and false-positive signatures. Each scan learns from the last.</td> </tr> <tr> <td align="center">💰</td> <td><strong>Full Cost Control</strong> — Per-request and per-scan budget caps. 
Every token and every dollar tracked in real time.</td> </tr> </table>
Architecture
<details open>
<summary><strong>① System Architecture — Component Overview</strong></summary>
<br/>
%%{init: {"theme": "dark"}}%%
flowchart TD
USER(["👤 User / CI-CD"])
subgraph IFACE["Interface Layer"]
CLI["CLI · TUI"]
PARSER["Output Parser"]
end
subgraph ORCH["Orchestration"]
PROFILE["Scan Profile"]
SCOPE["Scope Guard"]
COST["Cost Controller"]
AUDIT["HMAC Audit Log"]
end
subgraph AGENT["Agent Core — ReAct"]
LLM["LLM via LiteLLM"]
STATE["State Machine"]
MEM["Memory Engine"]
SKILLS["Skills Engine"]
end
subgraph SEC["Security Layer"]
FW["Tool Firewall"]
VERIFY["Verifier"]
SANIT["Sanitizer"]
end
subgraph SANDBOX["Docker Sandbox — Kali Linux"]
TSRV["Tool Server :48081"]
TOOLS["30+ Security Tools"]
BROWSER["Playwright · Chromium"]
PROXY["Caido Proxy :48080"]
end
subgraph OUTPUT["Output Pipeline"]
REPORTS["JSON · MD · HTML"]
GRAPH["Attack Graph"]
MITRE["MITRE ATT&CK Map"]
end
USER --> IFACE
IFACE --> ORCH
ORCH --> AGENT
AGENT <--> SEC
SEC --> SANDBOX
AGENT --> OUTPUT
style IFACE fill:#6c5ce7,stroke:#a29bfe,color:#ffffff
style ORCH fill:#00b894,stroke:#55efc4,color:#ffffff
style AGENT fill:#e17055,stroke:#fab1a0,color:#ffffff
style SEC fill:#d63031,stroke:#ff7675,color:#ffffff
style SANDBOX fill:#0984e3,stroke:#74b9ff,color:#ffffff
style OUTPUT fill:#f9ca24,stroke:#f0932b,color:#2d3436
</details>
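The orchestration layer above includes an HMAC audit log. As a rough illustration of how a tamper-evident trail of that kind can work, the sketch below chains each entry's HMAC to the previous one, so editing or removing an earlier entry invalidates everything after it. This is not taken from Phantom's source: the `AuditLog` class, its methods, and the event fields are all hypothetical.

```python
import hashlib
import hmac
import json


class AuditLog:
    """Hypothetical tamper-evident log: each entry's MAC also covers the previous MAC."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self.entries: list[dict] = []

    def _mac(self, prev_mac: str, event: dict) -> str:
        # Canonical JSON (sorted keys) so the same event always yields the same bytes.
        payload = prev_mac.encode() + json.dumps(event, sort_keys=True).encode()
        return hmac.new(self._key, payload, hashlib.sha256).hexdigest()

    def append(self, event: dict) -> None:
        # The first entry chains from an empty string.
        prev_mac = self.entries[-1]["mac"] if self.entries else ""
        self.entries.append({"event": event, "mac": self._mac(prev_mac, event)})

    def verify(self) -> bool:
        # Recompute the chain from the start; any edited entry breaks it.
        prev_mac = ""
        for entry in self.entries:
            expected = self._mac(prev_mac, entry["event"])
            if not hmac.compare_digest(expected, entry["mac"]):
                return False
            prev_mac = entry["mac"]
        return True
```

A caller would `append` one event per tool invocation and run `verify` before trusting the log. The key has to live outside the sandbox for the chain to mean anything.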
<details>
<summary><strong>② Scan Execution Flow — Phase by Phase</strong></summary>
<br/>
%%{init: {"theme": "dark"}}%%
sequenceDiagram
actor User
participant CLI as Phantom CLI
participant Orch as Orchestrator
participant Agent as Agent ReAct
participant FW as Tool Firewall
participant Box as Docker Sandbox
participant LLM as LLM Provider
participant T as Target App
User->>CLI: phantom scan -t https://app.com
CLI->>Orch: Validate scope · init cost controller
Orch->>Box: Spin up ephemeral Kali container
Orch->>Agent: Begin scan · profile + scope injected
rect rgb(48, 25, 80)
Note over Agent,LLM: Phase 1 — Reconnaissance
Agent->>LLM: Analyze target · plan recon
LLM-->>Agent: Run katana · httpx · nmap
Agent->>FW: Validate tool call
FW-->>Agent: Approved
Agent->>Box: Execute recon tools
Box->>T: HTTP probes · port scans · crawl
T-->>Box: Responses
Box-->>Agent: Endpoints · tech stack · open ports
end
rect rgb(80, 20, 20)
Note over Agent,LLM: Phase 2 — Exploitation
Agent->>LLM: Hypothesize attack vectors
LLM-->>Agent: SQLi on /api/login · XSS on /search
Agent->>Box: sqlmap · custom payload injection
Box->>T: Exploit attempts
T-->>Box: Vulnerability confirmed
Box-->>Agent: Raw HTTP evidence
end
rect rgb(15, 60, 30)
Note over Agent,LLM: Phase 3 — Verification
Agent->>Box: Re-exploit with clean PoC script
Box->>T: Reproduce exact attack
T-->>Box: Confirmed
Agent->>Agent: CVSS 3.1 · CWE tag · MITRE map
end
Agent->>CLI: Findings compiled
CLI->>User: Vulnerabilities + PoCs + Compliance
CLI->>Box: Destroy container
</details>
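The first orchestrator step in the flow above is "Validate scope · init cost controller". A minimal sketch of what those two checks might look like follows. It assumes that scope is a set of allowed hostnames (subdomains included) and that budgets are expressed in US dollars. The names `in_scope`, `CostController`, and `BudgetExceeded` are illustrative, not Phantom's API.

```python
from urllib.parse import urlsplit


def in_scope(url: str, allowed_hosts: set[str]) -> bool:
    """Return True only for http(s) URLs whose host is an allowed host or a subdomain of one."""
    parts = urlsplit(url)
    if parts.scheme not in {"http", "https"}:
        return False
    host = (parts.hostname or "").lower()
    if not host:
        return False
    # The "." prefix prevents "evilapp.com" from matching "app.com".
    return any(host == allowed or host.endswith("." + allowed) for allowed in allowed_hosts)


class BudgetExceeded(RuntimeError):
    """Raised when a charge would break the per-request or per-scan cap."""


class CostController:
    """Hypothetical budget tracker with per-request and per-scan caps."""

    def __init__(self, per_request_usd: float, per_scan_usd: float) -> None:
        self.per_request_usd = per_request_usd
        self.per_scan_usd = per_scan_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd: float) -> None:
        if cost_usd > self.per_request_usd:
            raise BudgetExceeded("request exceeds per-request cap")
        if self.spent_usd + cost_usd > self.per_scan_usd:
            raise BudgetExceeded("scan budget exhausted")
        self.spent_usd += cost_usd
```

Checking the scope before any tool call and the budget before any LLM call means both limits fail closed: an out-of-scope URL or an over-budget request is rejected rather than logged after the fact.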
<details>
<summary><strong>③ Agent ReAct Loop — Decision Cycle</strong></summary>
<br/>
%%{init: {"theme": "dark"}}%%
flowchart LR
INIT(["Scan Start"])
OBS["Observe\nCollect results"]
THINK["Reason\nAnalyze context"]
PLAN["Plan\nChoose tool"]
ACT["Act\nBuild arguments"]
FW{"Firewall?"}
EXEC["Execute\nDocker sandbox"]
DONE{"Stop\nCondition?"}
VERIFY["Verify\nRe-test findings"]
ENRICH["Enrich\nMITRE · CVSS"]
REPORT["Report\nJSON · HTML · MD"]
FINISH(["Scan Complete ☠"])
INIT --> OBS
OBS --> THINK
THINK --> PLAN
PLAN --> ACT
ACT --> FW
FW -- "✓ Pass" --> EXEC
FW -- "✗ Block" --> THINK
EXEC --> OBS
OBS --> DONE
DONE -- "Continue" --> THINK
DONE -- "Done" --> VERIFY
VERIFY --> ENRICH
ENRICH --> REPORT
REPORT --> FINISH
style INIT fill:#6c5ce7,stroke:#a29bfe,color:#fff
style FINISH fill:#6c5ce7,stroke:#a29bfe,color:#fff
style FW fill:#d63031,stroke:#ff7675,color:#fff
style DONE fill:#e17055,stroke:#fab1a0,color:#fff
style EXEC fill:#0984e3,stroke:#74b9ff,color:#fff
style REPORT fill:#00b894,stroke:#55efc4,color:#fff
</details>
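The decision cycle in the flowchart can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the planner, firewall, and executor are passed in as plain callables, the stop condition is either the planner returning `None` or a step limit, and the verify, enrich, and report stages are left out. `ToolCall`, `AgentState`, and `react_loop` are hypothetical names, not Phantom's internals.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ToolCall:
    tool: str
    args: dict


@dataclass
class AgentState:
    observations: list[str] = field(default_factory=list)
    steps: int = 0


def react_loop(
    plan: Callable[[AgentState], Optional[ToolCall]],
    firewall: Callable[[ToolCall], bool],
    execute: Callable[[ToolCall], str],
    max_steps: int = 20,
) -> AgentState:
    """Run observe -> reason -> plan -> act until the planner stops or steps run out."""
    state = AgentState()
    while state.steps < max_steps:
        state.steps += 1
        call = plan(state)  # Reason, Plan, Act: choose a tool and build its arguments
        if call is None:  # the planner signals the stop condition
            break
        if not firewall(call):  # blocked: record the refusal and reason again
            state.observations.append(f"blocked: {call.tool}")
            continue
        state.observations.append(execute(call))  # Execute, then Observe the result
    return state
```

In a real agent `plan` would wrap the LLM call and `execute` would talk to the sandbox's tool server. Keeping them as injected callables makes the loop easy to test with stubs.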
<details>
<summary><strong>④ Docker Sandbox — Isolation Architecture</strong></summary>
<br/>
%%{init: {"theme": "dark"}}%%
flowchart LR
HOST(["Phantom Agent\nHost Machine"])
subgraph CONTAINER["Kali Linux Container — Network Isolated"]
TSRV["Tool Server :48081"]
PROXY["Caido Proxy :48080"]
subgraph TOOLKIT["Security Toolkit"]
SCA["nmap · masscan"]
INJ["sqlmap · nuclei"]
FUZ["ffuf · gobuster · arjun"]
WEB["httpx · katana"]
ANA["nikto · semgrep"]
end
subgraph RUNTIME["Runtime Environment"]
PY["Python 3.12"]
BR["Playwright + Chromium"]
SH["Bash Shell"]
end
end
TARGET(["Target\nApplication"])
HOST -- "Authenticated API" --> TSRV
TSRV --> TOOLKIT
TSR
