LuaN1aoAgent
LuaN1aoAgent is a cognitive-driven AI hacker. It is a fully autonomous AI penetration testing agent powered by DeepSeek V3.2. Using dual-graph reasoning, LuaN1ao achieves a success rate of over 90% on the XBOW Benchmark, with a median exploit cost of just $0.09.
Cognitive-Driven AI Hackers
<div align="center"> <a href="https://zc.tencent.com/competition/competitionHackathon?code=cha004"><img src="imgs/tch.png" alt="[TCH] Top-Ranked Intelligent Pentest Project" width="250" /></a> </div>
🧠 Think Like Human Experts • 📊 Dynamic Graph Planning • 🔄 Learn From Failures • 🎯 Evidence-Driven Decisions
🚀 Quick Start • ✨ Core Innovations • 🏗️ System Architecture • 🗓️ Roadmap
📖 Introduction
LuaN1ao (鸾鸟) is a next-generation Autonomous Penetration Testing Agent powered by Large Language Models (LLMs).
Traditional automated scanning tools rely on predefined rules and struggle with complex real-world scenarios. LuaN1ao addresses these limitations by combining the P-E-R (Planner-Executor-Reflector) Agent Collaboration Framework with Causal Graph Reasoning.
LuaN1ao simulates the thinking patterns of human security experts:
- 🎯 Strategic Planning: Dynamically plan attack paths based on global situational awareness
- 🔍 Evidence-Driven: Build rigorous "Evidence-Hypothesis-Validation" logical chains
- 🔄 Continuous Evolution: Learn from failures and autonomously adjust tactical strategies
- 🧠 Cognitive Loop: Form a complete cognitive cycle of planning-execution-reflection
From information gathering to vulnerability exploitation, LuaN1ao elevates penetration testing from "automated tools" to an "autonomous agent".
🖼️ Showcase
https://github.com/user-attachments/assets/e2c19442-20db-40ab-a5c6-3bf5c9054ae8
💡 More demos coming soon!
🚀 Core Innovations
1️⃣ P-E-R Agent Collaboration Framework ⭐⭐⭐
LuaN1ao decouples penetration testing thinking into three independent yet collaborative cognitive roles, forming a complete decision-making loop:
- 🧠 Planner
  - Strategic Brain: Dynamic planning based on global graph awareness
  - Adaptive Capability: Identify dead ends and automatically generate alternative paths
  - Graph Operation Driven: Output structured graph editing instructions rather than natural language
  - Parallel Scheduling: Automatically identify parallelizable tasks based on topological dependencies
  - Adaptive Step Count: Allocate extra execution steps (`max_steps`) per subtask for complex tasks (blind injection extraction, multi-stage bypass, etc.)
- ⚙️ Executor
  - Tactical Execution: Focus on tool invocation and result analysis for a single subtask
  - Tool Orchestration: Unified scheduling of security tools via MCP (Model Context Protocol)
  - Context Compression: Intelligent message history management to avoid token overflow
  - Fault Tolerance: Automatic handling of transient network errors and tool invocation failures
  - Hypothesis Persistence: Hypotheses from `formulate_hypotheses` are preserved across steps and survive context compression
  - Parallel Discovery Sharing: Parallel subtasks exchange high-value findings in real time via a shared bulletin board (ConfirmedVulnerability and high-confidence KeyFact)
  - First-Step Guidance: When no confirmed vulnerabilities exist, automatically prompts the agent to formulate a hypothesis framework before blind exploration
- ⚖️ Reflector
  - Audit Analysis: Review task execution and validate artifact effectiveness
  - Failure Attribution: Failure pattern analysis across levels L1-L4 to prevent repeated errors
  - Intelligence Generation: Extract attack intelligence and accumulate it as reusable knowledge
  - Termination Control: Judge whether the goal has been achieved or the task is stuck
Key Advantages: Role separation avoids the "split personality" problem of single agents. Each component focuses on its core responsibilities and collaborates with the others via an event bus.
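The role separation described above can be sketched as three handlers connected by a small publish/subscribe bus. This is a minimal illustration, not LuaN1ao's actual implementation: the `EventBus` class, the topic names (`goal.set`, `subtask.ready`, `subtask.done`) and the payload fields are all hypothetical, and the LLM calls are replaced by stubs.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Minimal synchronous publish/subscribe bus (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        # Deliver the payload to every handler registered for this topic.
        for handler in list(self._handlers[topic]):
            handler(payload)


def run_per_cycle(goal: str) -> List[str]:
    """Run one planning -> execution -> reflection pass and return a trace."""
    bus = EventBus()
    trace: List[str] = []

    def planner(event: dict) -> None:
        # Stub: a real planner would emit graph editing operations.
        trace.append(f"planner: plan subtask for '{event['goal']}'")
        bus.publish("subtask.ready", {"goal": event["goal"], "subtask": "probe target"})

    def executor(event: dict) -> None:
        # Stub: a real executor would invoke tools and analyze their output.
        trace.append(f"executor: run '{event['subtask']}'")
        bus.publish("subtask.done", {**event, "result": "stub result"})

    def reflector(event: dict) -> None:
        # Stub: a real reflector would audit the result and decide whether to stop.
        trace.append(f"reflector: audit '{event['result']}'")

    bus.subscribe("goal.set", planner)
    bus.subscribe("subtask.ready", executor)
    bus.subscribe("subtask.done", reflector)
    bus.publish("goal.set", {"goal": goal})
    return trace


cycle_trace = run_per_cycle("demo target")
```

Because the roles only communicate through topics, each one can be replaced or tested on its own, which is the property the role separation is meant to provide.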
2️⃣ Causal Graph Reasoning ⭐⭐⭐
LuaN1ao rejects blind guessing and LLM hallucinations, constructing explicit causal graphs to drive testing decisions:
graph LR
E[🔍 Evidence<br/>Evidence Node] -->|Support| H[💭 Hypothesis<br/>Hypothesis Node]
H -->|Validation| V[⚠️ Vulnerability<br/>Vulnerability Node]
V -->|Exploitation| X[💥 Exploit<br/>Exploit Node]
Core Principles:
- Evidence First: Any hypothesis requires explicit prior evidence support
- Confidence Quantification: Each causal edge has a confidence score to avoid blind advancement
- Traceability: Complete recording of reasoning chains for failure tracing and experience reuse
- Hallucination Prevention: Mandatory evidence validation, rejecting unfounded attacks
Example Scenario:
Evidence: Port scan discovers 3306/tcp open
↓ (Confidence 0.8)
Hypothesis: Target runs MySQL service
↓ (Validation successful)
Vulnerability: MySQL weak password/unauthorized access
↓ (Attempt exploitation)
Exploit: mysql -h target -u root -p [brute-force/empty password]
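The example scenario can be modeled with a small graph structure. The sketch below is an assumption about how such a graph might look, not the project's data model: the `CausalGraph` class, its method names, the node ids and the default threshold of 0.5 are hypothetical. The 0.8 confidence comes from the scenario; the 1.0 on the validation edge is assumed, since the scenario only says validation succeeded. The `h2` node is added purely to show an unsupported hypothesis being rejected.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CausalGraph:
    # node_id -> (kind, description)
    nodes: Dict[str, Tuple[str, str]] = field(default_factory=dict)
    # node_id -> list of (source_id, relation, confidence) for incoming edges
    incoming: Dict[str, List[Tuple[str, str, float]]] = field(default_factory=dict)

    def add_node(self, node_id: str, kind: str, description: str) -> None:
        self.nodes[node_id] = (kind, description)
        self.incoming.setdefault(node_id, [])

    def add_edge(self, src: str, dst: str, relation: str, confidence: float) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        self.incoming[dst].append((src, relation, confidence))

    def is_supported(self, node_id: str, threshold: float = 0.5) -> bool:
        """Evidence first: a non-evidence node needs a confident, supported parent."""
        kind, _ = self.nodes[node_id]
        if kind == "evidence":
            return True  # evidence is an observation and needs no parent
        return any(
            conf >= threshold and self.is_supported(src, threshold)
            for src, _, conf in self.incoming[node_id]
        )

    def trace(self, node_id: str) -> List[str]:
        """Follow the strongest incoming edge back to the root (assumes no cycles)."""
        chain = [node_id]
        while self.incoming[chain[-1]]:
            src, _, _ = max(self.incoming[chain[-1]], key=lambda edge: edge[2])
            chain.append(src)
        return list(reversed(chain))


graph = CausalGraph()
graph.add_node("e1", "evidence", "Port scan discovers 3306/tcp open")
graph.add_node("h1", "hypothesis", "Target runs MySQL service")
graph.add_node("v1", "vulnerability", "MySQL weak password/unauthorized access")
graph.add_node("h2", "hypothesis", "Unsupported guess with no evidence")  # illustration only
graph.add_edge("e1", "h1", "supports", 0.8)
graph.add_edge("h1", "v1", "validates", 1.0)  # assumed value for a successful validation
```

Here `graph.is_supported("v1")` holds because the chain leads back to evidence, while `graph.is_supported("h2")` does not, and `graph.trace("v1")` recovers the reasoning chain for later review.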
3️⃣ Plan-on-Graph (PoG) Dynamic Task Planning ⭐⭐⭐
Say goodbye to static task lists. LuaN1ao models penetration testing plans as dynamically evolving Directed Acyclic Graphs (DAGs):
Core Features:
- Graph Operation Language: Planner outputs standardized graph editing operations (`ADD_NODE`, `UPDATE_NODE`, `DEPRECATE_NODE`)
- Real-time Adaptation: Task graphs reshape in real time as testing progresses
  - Discover new ports → Automatically mount service scanning subgraphs
  - Encounter WAF → Insert bypass strategy nodes
  - Path blocked → Automatically prune or branch planning
- Topological Dependency Management: Automatically identify and parallelize independent tasks based on DAG topology
- State Tracking: Each node contains a state machine (`pending`, `in_progress`, `completed`, `failed`, `deprecated`)
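As a rough illustration of the graph operation language, the sketch below applies a list of operations to an in-memory plan. Only the operation names and the five node states come from the text above; the payload fields (`op`, `id`, `description`, `depends_on`, `fields`), the `apply_ops` helper and the node ids are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class NodeState(str, Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"
    DEPRECATED = "deprecated"


def apply_ops(plan: Dict[str, dict], ops: List[dict]) -> Dict[str, dict]:
    """Apply graph editing operations in order, mutating and returning the plan."""
    for op in ops:
        kind, node_id = op["op"], op["id"]
        if kind == "ADD_NODE":
            if node_id in plan:
                raise ValueError(f"node {node_id!r} already exists")
            plan[node_id] = {
                "description": op["description"],
                "depends_on": list(op.get("depends_on", [])),
                "state": NodeState.PENDING,
            }
        elif kind == "UPDATE_NODE":
            fields = dict(op["fields"])
            if "state" in fields:
                # Reject states outside the node state machine.
                fields["state"] = NodeState(fields["state"])
            plan[node_id].update(fields)
        elif kind == "DEPRECATE_NODE":
            # Deprecated nodes stay in the graph so the history remains traceable.
            plan[node_id]["state"] = NodeState.DEPRECATED
        else:
            raise ValueError(f"unknown operation {kind!r}")
    return plan


plan = apply_ops({}, [
    {"op": "ADD_NODE", "id": "port_scan", "description": "Scan open ports"},
    {"op": "ADD_NODE", "id": "sqli_login", "description": "Test login form for SQLi",
     "depends_on": ["port_scan"]},
    {"op": "UPDATE_NODE", "id": "port_scan", "fields": {"state": "completed"}},
    # A WAF blocks the direct path: insert a bypass node and retire the old one.
    {"op": "ADD_NODE", "id": "waf_bypass", "description": "Try WAF bypass encodings",
     "depends_on": ["port_scan"]},
    {"op": "DEPRECATE_NODE", "id": "sqli_login"},
])
```

Editing the graph locally like this is what lets a plan change without regenerating the whole task list.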
Comparison with Traditional Planning:
| Feature | Traditional Task List | Plan-on-Graph |
|---------|----------------------|---------------|
| Structure | Linear list | Directed graph |
| Dependency Management | Manual sorting | Topological auto-sorting |
| Parallel Capability | None | Auto-identify parallel paths |
| Dynamic Adjustment | Regenerate | Local graph editing |
| Visualization | Difficult | Native support (Web UI) |
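The "Topological auto-sorting" and "Auto-identify parallel paths" rows can be illustrated with Python's standard `graphlib` module (Python 3.9+). This is a sketch of the general technique rather than LuaN1ao's scheduler; the `parallel_batches` helper and the task names are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def parallel_batches(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into batches; tasks within a batch have no mutual dependencies."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the plan is not a DAG
    batches: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose prerequisites are done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock its dependents
    return batches


# Each key maps a task to the set of tasks it depends on.
task_dependencies = {
    "port_scan": set(),
    "http_enum": {"port_scan"},
    "mysql_probe": {"port_scan"},
    "exploit_sqli": {"http_enum"},
}
batches = parallel_batches(task_dependencies)
# batches == [["port_scan"], ["http_enum", "mysql_probe"], ["exploit_sqli"]]
```

In this example `http_enum` and `mysql_probe` land in the same batch because both depend only on `port_scan`, so they could run concurrently.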
Visualization Example: Start the Web Server to view the task graph evolution in real-time in the browser.
Core Capabilities
Tool Integration (MCP Protocol)
LuaN1ao achieves unified integration and scheduling of tools through the Model Context Protocol (MCP):
- HTTP/HTTPS Requests: Support for custom headers, proxies, timeout control
- Shell Command Execution: Securely encapsulated system command invocation (containerized execution recommended)
- Python Code Execution: Dynamic execution of Python scripts for complex logic processing
- Metacognitive Tools: `think` (deep thinking), `formulate_hypotheses` (hypothesis generation), `reflect_on_failure` (failure reflection)
- Task Control: `halt_task` (early task termination)
- Local Graph Query: `query_causal_graph` (direct in-process causal graph lookup, zero MCP latency)
💡 Extensibility: New tools can be integrated via `mcp.json` (e.g., Metasploit, Nuclei, Burp Suite API)
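The exact schema of this project's `mcp.json` is not shown here. The sketch below assumes the `mcpServers` layout (a server name mapped to a `command` and `args`) that many MCP clients use, and the server name and launch command are hypothetical placeholders; check the repository's own `mcp.json` before relying on it.

```python
import json
from typing import List


def add_mcp_server(config: dict, name: str, command: str, args: List[str]) -> dict:
    """Register a tool server under the assumed `mcpServers` key."""
    servers = config.setdefault("mcpServers", {})
    if name in servers:
        raise ValueError(f"server {name!r} is already registered")
    servers[name] = {"command": command, "args": list(args)}
    return config


# Hypothetical entry: "nuclei_mcp_server" is a placeholder module name.
config = add_mcp_server({}, "nuclei", "python", ["-m", "nuclei_mcp_server"])
mcp_json_text = json.dumps(config, indent=2)
```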
Knowledge Enhancement (RAG)
- Vector Retrieval: Efficient knowledge base retrieval based on FAISS
- Domain Knowledge: Integration of PayloadsAllTheThings and other open-source security knowledge bases
- Dynamic Learning: Continuous addition of custom knowledge documents
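To show the retrieval idea without external dependencies, the sketch below swaps FAISS for a brute-force cosine-similarity search over toy bag-of-words vectors. It is not how LuaN1ao builds its index: the `embed`, `cosine` and `retrieve` helpers and the sample documents are hypothetical, and a real setup would use learned embeddings stored in a FAISS index.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the query (brute force, no index)."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:k]


knowledge_base = [
    "sql injection union select payloads",
    "server side template injection payloads for jinja2",
    "xml external entity file read payloads",
]
top_hit = retrieve("union based sql injection", knowledge_base, k=1)
```

Adding a custom knowledge document in this toy version is just appending a string to `knowledge_base`; with a real vector index the new document would also need to be embedded and added to the index.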
Web Visualization (New Architecture)
The Web UI is now a standalone service powered by a database, enabling persistent task monitoring and management.
- Real-time Monitoring: Browser view of dynamic task graph evolution and live logs.
- Node Details: Click nodes to view execution logs, artifacts, state transitions.
- Task Management: Create, abort, and delete historical tasks.
- Data Persistence: All task data is stored in SQLite (`luan1ao.db`), preserving history across restarts.
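Persistence of this kind can be sketched with Python's built-in `sqlite3` module. Only the file name `luan1ao.db` comes from the text above; the `tasks` table, its columns and the helper functions are hypothetical and do not reflect the project's real schema. The example uses an in-memory database so that it does not touch any real file.

```python
import sqlite3
from typing import List, Tuple


def open_store(path: str = "luan1ao.db") -> sqlite3.Connection:
    """Open (or create) the task store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tasks (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               goal TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    conn.commit()
    return conn


def create_task(conn: sqlite3.Connection, goal: str) -> int:
    cursor = conn.execute("INSERT INTO tasks (goal) VALUES (?)", (goal,))
    conn.commit()
    return cursor.lastrowid


def set_status(conn: sqlite3.Connection, task_id: int, status: str) -> None:
    conn.execute("UPDATE tasks SET status = ? WHERE id = ?", (status, task_id))
    conn.commit()


def list_tasks(conn: sqlite3.Connection) -> List[Tuple[int, str, str]]:
    return conn.execute("SELECT id, goal, status FROM tasks ORDER BY id").fetchall()


store = open_store(":memory:")  # in-memory database, for the example only
first_id = create_task(store, "Assess demo target")
set_status(store, first_id, "aborted")
history = list_tasks(store)
```

With a file path instead of `:memory:`, the rows written by one process remain available after a restart, which is the behavior the Data Persistence bullet describes.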
Human-in-the-Loop (HITL) Mode
LuaN1ao Agent supports a Human-in-the-Loop (HITL) mode, allowing experts to supervise and intervene in the decision-making process.
- Enable: Set `HUMAN_IN_THE_LOOP=true` in `.env`.
- Approval: The agent pauses after generating a plan (initial or dynamic), waiting for human approval via Web UI or CLI.
- Modification: Experts can reject or directly modify the plan (JSON editing) before execution.
- Injection: Supports real-time injection of new sub-tasks via the Web UI ("Active Intervention").
Interaction Methods:
- Web UI: Approval modal pops up automatically. Use "Modify" to edit plans or "Add Task" button to inject tasks.
- CLI: Prompts with `HITL >`. Type `y` to approve, `n` to reject, or `m` to modify (opens system editor).
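The approval flow can be sketched as a flag check plus a small decision function. The `HUMAN_IN_THE_LOOP` variable and the `y`/`n`/`m` answers come from the text above; the `hitl_enabled` and `review_plan` helpers, the plan shape, and the choice to treat only the literal value `true` as enabled are assumptions. The system editor is replaced here by an `edit` callback.

```python
import os
from typing import Callable, Mapping, Optional


def hitl_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when HUMAN_IN_THE_LOOP is set to 'true' (assumed parsing rule)."""
    env = os.environ if env is None else env
    return env.get("HUMAN_IN_THE_LOOP", "false").strip().lower() == "true"


def review_plan(
    plan: dict,
    answer: str,
    edit: Optional[Callable[[dict], dict]] = None,
) -> Optional[dict]:
    """Map a reviewer's answer to the plan that should run, or None if rejected."""
    choice = answer.strip().lower()
    if choice == "y":
        return plan  # approved as generated
    if choice == "n":
        return None  # rejected: nothing is executed
    if choice == "m":
        if edit is None:
            raise ValueError("modifying a plan requires an edit callback")
        return edit(dict(plan))  # edit a copy so the original plan is untouched
    raise ValueError(f"unrecognized answer: {answer!r}")


proposed = {"subtasks": ["port_scan"]}
approved = review_plan(proposed, "y")
rejected = review_plan(proposed, "n")
modified = review_plan(
    proposed, "m", edit=lambda p: {**p, "subtasks": p["subtasks"] + ["http_enum"]}
)
```

In an interactive CLI the `answer` argument would come from something like `input("HITL > ")`; it is passed in explicitly here so the logic can be exercised without a terminal.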
