<p align="center"> <img src="imgs/logo.png" alt="LuaN1ao Logo" width="200" /> </p> <h1 align="center">LuaN1aoAgent</h1> <h2 align="center">

Cognitive-Driven AI Hackers

</h2> <div align="center">

</div> <div align="center"> <a href="https://zc.tencent.com/competition/competitionHackathon?code=cha004"><img src="imgs/tch.png" alt="[TCH] Top-Ranked Intelligent Pentest Project" width="250" /></a>

🧠 Think Like Human Experts • 📊 Dynamic Graph Planning • 🔄 Learn From Failures • 🎯 Evidence-Driven Decisions

🚀 Quick Start • ✨ Core Innovations • 🏗️ System Architecture • 🗓️ Roadmap

🌐 中文版 • English Version

</div>

📖 Introduction

LuaN1ao (鸾鸟) is a next-generation Autonomous Penetration Testing Agent powered by Large Language Models (LLMs).

Traditional automated scanning tools rely on predefined rules and struggle with complex real-world scenarios. LuaN1ao addresses these limitations by combining the P-E-R (Planner-Executor-Reflector) Agent Collaboration Framework with Causal Graph Reasoning.

LuaN1ao simulates the thinking patterns of human security experts:

🎯 Strategic Planning: Dynamically plan attack paths based on global situational awareness
🔍 Evidence-Driven: Build rigorous "Evidence-Hypothesis-Validation" logical chains
🔄 Continuous Evolution: Learn from failures and autonomously adjust tactical strategies
🧠 Cognitive Loop: Form a complete cognitive cycle of planning-execution-reflection

From information gathering to vulnerability exploitation, LuaN1ao elevates penetration testing from "automated tools" to an "autonomous agent".

[!NOTE] LuaN1aoAgent achieves a 90.4% success rate on benchmark tasks fully autonomously, with a median exploit cost of only $0.09.

🖼️ Showcase

https://github.com/user-attachments/assets/e2c19442-20db-40ab-a5c6-3bf5c9054ae8

💡 More demos coming soon!

🚀 Core Innovations

1️⃣ P-E-R Agent Collaboration Framework ⭐⭐⭐

LuaN1ao splits penetration-testing reasoning into three independent yet collaborating cognitive roles that together form a complete decision-making loop:

🧠 Planner
- Strategic Brain: Dynamic planning based on global graph awareness
- Adaptive Capability: Identify dead ends and automatically generate alternative paths
- Graph Operation Driven: Output structured graph editing instructions rather than natural language
- Parallel Scheduling: Automatically identify parallelizable tasks based on topological dependencies
- Adaptive Step Count: Allocate extra execution steps (max_steps) per subtask for complex tasks (blind injection extraction, multi-stage bypass, etc.)
⚙️ Executor
- Tactical Execution: Focus on single sub-task tool invocation and result analysis
- Tool Orchestration: Unified scheduling of security tools via MCP (Model Context Protocol)
- Context Compression: Intelligent message history management to avoid token overflow
- Fault Tolerance: Automatic handling of network transient errors and tool invocation failures
- Hypothesis Persistence: Hypotheses from formulate_hypotheses are preserved across steps and survive context compression
- Parallel Discovery Sharing: Parallel subtasks exchange high-value findings (ConfirmedVulnerability and high-confidence KeyFact) in real time via a shared bulletin board
- First-Step Guidance: When no confirmed vulnerabilities exist, automatically prompts the agent to formulate a hypothesis framework before blind exploration
⚖️ Reflector
- Audit Analysis: Review task execution and validate artifact effectiveness
- Failure Attribution: L1-L4 level failure pattern analysis to prevent repeated errors
- Intelligence Generation: Extract attack intelligence and build knowledge accumulation
- Termination Control: Decide whether the goal has been achieved or the task is stuck

Key Advantages: Role separation avoids the "split personality" problem of single agents. Each component focuses on its core responsibilities, and the components collaborate via an event bus.
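The plan-execute-reflect cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not LuaN1ao's implementation: `run_per_loop`, `Reflection`, and the toy callbacks are hypothetical names, and the roles are wired together with plain function calls rather than the event bus the project uses.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Reflection:
    """Verdict returned by the (hypothetical) Reflector callback."""
    goal_achieved: bool
    stuck: bool
    lessons: list[str] = field(default_factory=list)


def run_per_loop(plan, execute, reflect, goal, max_rounds=5):
    """Run plan -> execute -> reflect until the Reflector says stop."""
    lessons: list[str] = []
    for round_no in range(1, max_rounds + 1):
        subtasks = plan(goal, lessons)             # Planner: decide what to try next
        results = [execute(t) for t in subtasks]   # Executor: run each subtask
        verdict = reflect(goal, results)           # Reflector: audit the outcome
        lessons.extend(verdict.lessons)            # failures feed the next plan
        if verdict.goal_achieved:
            return "achieved", round_no, lessons
        if verdict.stuck:
            return "stuck", round_no, lessons
    return "exhausted", max_rounds, lessons


# Toy stand-ins for the three roles (a real system would call an LLM and tools).
def toy_plan(goal, lessons):
    # Do not repeat an approach the Reflector already recorded as failed.
    return ["try_default_creds"] if "sqli_failed" in lessons else ["try_sqli"]


def toy_execute(task):
    return {"task": task, "ok": task == "try_default_creds"}


def toy_reflect(goal, results):
    if any(r["ok"] for r in results):
        return Reflection(goal_achieved=True, stuck=False)
    return Reflection(goal_achieved=False, stuck=False, lessons=["sqli_failed"])


outcome = run_per_loop(toy_plan, toy_execute, toy_reflect, goal="get_access")
```

In this toy run the first round fails, the Reflector records the failure, and the Planner picks a different approach in the second round, which is the "learn from failures" behaviour in miniature.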

2️⃣ Causal Graph Reasoning ⭐⭐⭐

Rather than guessing blindly or acting on LLM hallucinations, LuaN1ao constructs explicit causal graphs to drive testing decisions:

graph LR
    E[🔍 Evidence<br/>Evidence Node] -->|Support| H[💭 Hypothesis<br/>Hypothesis Node]
    H -->|Validation| V[⚠️ Vulnerability<br/>Vulnerability Node]
    V -->|Exploitation| X[💥 Exploit<br/>Exploit Node]

Core Principles:

Evidence First: Any hypothesis requires explicit prior evidence support
Confidence Quantification: Each causal edge has a confidence score to avoid blind advancement
Traceability: Complete recording of reasoning chains for failure tracing and experience reuse
Hallucination Prevention: Mandatory evidence validation, rejecting unfounded attacks

Example Scenario:

Evidence: Port scan discovers 3306/tcp open
  ↓ (Confidence 0.8)
Hypothesis: Target runs MySQL service
  ↓ (Validation successful)
Vulnerability: MySQL weak password/unauthorized access
  ↓ (Attempt exploitation)
Exploit: mysql -h target -u root -p [brute-force/empty password]
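The scenario above can be modelled as a small graph structure. The sketch below is an assumption-laden illustration: `CausalGraph` and its methods are hypothetical names, the 0.5 confidence on the second edge is made up for the example, and multiplying confidences along a path is just one plausible aggregation rule, not necessarily the one LuaN1ao uses.

```python
from dataclasses import dataclass, field


@dataclass
class CausalGraph:
    nodes: dict = field(default_factory=dict)  # node_id -> (kind, description)
    edges: dict = field(default_factory=dict)  # (src, dst) -> confidence in [0, 1]

    def add_evidence(self, node_id, description):
        """Evidence nodes are the only nodes allowed without prior support."""
        self.nodes[node_id] = ("evidence", description)

    def add_derived(self, node_id, kind, description, supported_by, confidence):
        """Add a hypothesis/vulnerability/exploit node that cites an existing node."""
        if supported_by not in self.nodes:
            # "Evidence First": refuse nodes that cite nothing already in the graph.
            raise ValueError(f"{node_id!r} cites unknown node {supported_by!r}")
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        self.nodes[node_id] = (kind, description)
        self.edges[(supported_by, node_id)] = confidence

    def chain_confidence(self, path):
        """Multiply edge confidences along a path (assumed aggregation rule)."""
        score = 1.0
        for src, dst in zip(path, path[1:]):
            score *= self.edges[(src, dst)]
        return score


g = CausalGraph()
g.add_evidence("e1", "Port scan discovers 3306/tcp open")
g.add_derived("h1", "hypothesis", "Target runs MySQL service",
              supported_by="e1", confidence=0.8)
g.add_derived("v1", "vulnerability", "MySQL weak password/unauthorized access",
              supported_by="h1", confidence=0.5)  # illustrative value
chain = g.chain_confidence(["e1", "h1", "v1"])
```

Because every derived node must name an existing supporter, a hypothesis with no evidence behind it is rejected at insertion time, and the stored edges give a traceable reasoning chain.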

3️⃣ Plan-on-Graph (PoG) Dynamic Task Planning ⭐⭐⭐

Say goodbye to static task lists. LuaN1ao models penetration testing plans as dynamically evolving Directed Acyclic Graphs (DAGs):

Core Features:

Graph Operation Language: Planner outputs standardized graph editing operations (ADD_NODE, UPDATE_NODE, DEPRECATE_NODE)
Real-time Adaptation: The task graph is reshaped in real time as testing progresses
- Discover new ports → Automatically attach service scanning subgraphs
- Encounter WAF → Insert bypass strategy nodes
- Path blocked → Automatically prune the path or plan an alternative branch
Topological Dependency Management: Automatically identify and parallelize independent tasks based on DAG topology
State Tracking: Each node contains a state machine (pending, in_progress, completed, failed, deprecated)
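To make the graph operations and topological scheduling concrete, here is a minimal sketch. The operation names and node states come from the list above; the dictionary layout (`op`, `id`, `deps`, `state`) and the helper names `apply_ops` and `ready_nodes` are assumptions for illustration, not the project's actual schema.

```python
STATES = {"pending", "in_progress", "completed", "failed", "deprecated"}


def apply_ops(graph, ops):
    """Apply ADD_NODE / UPDATE_NODE / DEPRECATE_NODE operations in place."""
    for op in ops:
        kind, node_id = op["op"], op["id"]
        if kind == "ADD_NODE":
            graph[node_id] = {"state": "pending", "deps": list(op.get("deps", []))}
        elif kind == "UPDATE_NODE":
            if op["state"] not in STATES:
                raise ValueError(f"unknown state {op['state']!r}")
            graph[node_id]["state"] = op["state"]
        elif kind == "DEPRECATE_NODE":
            graph[node_id]["state"] = "deprecated"
        else:
            raise ValueError(f"unknown operation {kind!r}")
    return graph


def ready_nodes(graph):
    """Pending nodes whose dependencies are all completed; safe to run in parallel."""
    return sorted(
        node_id
        for node_id, node in graph.items()
        if node["state"] == "pending"
        and all(graph[dep]["state"] == "completed" for dep in node["deps"])
    )


# A port scan finishes, which unlocks two independent service scans.
graph = apply_ops({}, [
    {"op": "ADD_NODE", "id": "port_scan"},
    {"op": "ADD_NODE", "id": "scan_http", "deps": ["port_scan"]},
    {"op": "ADD_NODE", "id": "scan_mysql", "deps": ["port_scan"]},
    {"op": "UPDATE_NODE", "id": "port_scan", "state": "completed"},
])
parallel = ready_nodes(graph)
```

Editing the graph locally (for example, deprecating `scan_mysql` once that path is blocked) changes what `ready_nodes` returns without regenerating the whole plan.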

Comparison with Traditional Planning:

| Feature | Traditional Task List | Plan-on-Graph |
|---------|----------------------|---------------|
| Structure | Linear list | Directed graph |
| Dependency Management | Manual sorting | Topological auto-sorting |
| Parallel Capability | None | Auto-identify parallel paths |
| Dynamic Adjustment | Regenerate | Local graph editing |
| Visualization | Difficult | Native support (Web UI) |

Visualization Example: Start the Web Server to watch the task graph evolve in real time in the browser.

Core Capabilities

Tool Integration (MCP Protocol)

LuaN1ao achieves unified integration and scheduling of tools through the Model Context Protocol (MCP):

HTTP/HTTPS Requests: Support for custom headers, proxies, timeout control
Shell Command Execution: Securely encapsulated system command invocation (containerized execution recommended)
Python Code Execution: Dynamic execution of Python scripts for complex logic processing
Metacognitive Tools: think (deep thinking), formulate_hypotheses (hypothesis generation), reflect_on_failure (failure reflection)
Task Control: halt_task (early task termination)
Local Graph Query: query_causal_graph (direct in-process causal graph lookup, zero MCP latency)

💡 Extensibility: New tools can be easily integrated via mcp.json (e.g., Metasploit, Nuclei, Burp Suite API)
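As a rough idea of what registering a tool might look like, the sketch below parses a config in the `mcpServers` layout that several MCP clients use. Whether LuaN1ao's mcp.json follows this exact layout is an assumption, and the `nuclei` entry with its script path `tools/nuclei_mcp_server.py` is purely hypothetical; check the project's own mcp.json for the real schema.

```python
import json

# Hypothetical mcp.json content; the layout and the server entry are assumptions.
EXAMPLE_MCP_JSON = """
{
  "mcpServers": {
    "nuclei": {
      "command": "python",
      "args": ["tools/nuclei_mcp_server.py"]
    }
  }
}
"""


def load_servers(text):
    """Parse the config and return {server_name: command line as a list}."""
    config = json.loads(text)
    servers = {}
    for name, entry in config.get("mcpServers", {}).items():
        if "command" not in entry:
            raise ValueError(f"server {name!r} has no 'command'")
        servers[name] = [entry["command"], *entry.get("args", [])]
    return servers


servers = load_servers(EXAMPLE_MCP_JSON)
```

A quick validation pass like this catches a malformed entry before the agent tries to launch the tool server.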

Knowledge Enhancement (RAG)

Vector Retrieval: Efficient knowledge base retrieval based on FAISS
Domain Knowledge: Integration of PayloadsAllTheThings and other open-source security knowledge bases
Dynamic Learning: Continuous addition of custom knowledge documents
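The retrieval step can be illustrated without any dependencies. The sketch below is not FAISS: it swaps in a brute-force cosine-similarity search over toy bag-of-words vectors, which is conceptually what an exact similarity index computes, just far slower. The documents, `embed`, and `retrieve` are illustrative assumptions; a real setup would use a neural encoder and a FAISS index.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words term counts (stand-in for a neural encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query (brute-force search)."""
    q = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


knowledge_base = [
    "mysql default credentials and weak password checks",
    "sql injection payloads for login forms",
    "waf bypass techniques using encoding",
]
top_hit = retrieve("mysql weak password", knowledge_base, k=1)[0]
```

Adding custom knowledge then amounts to appending documents to the collection and re-indexing them.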

Web Visualization (New Architecture)

The Web UI is now a standalone service powered by a database, enabling persistent task monitoring and management.

Real-time Monitoring: Browser view of dynamic task graph evolution and live logs.
Node Details: Click nodes to view execution logs, artifacts, state transitions.
Task Management: Create, abort, and delete historical tasks.
Data Persistence: All task data is stored in SQLite (luan1ao.db), preserving history across restarts.
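The persistence idea can be sketched with Python's built-in `sqlite3` module. Only the file name luan1ao.db comes from the description above; the `tasks` table, its columns, and the helper functions are hypothetical and will differ from the project's real schema.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def save_task(db_path, task_id, status):
    """Insert or update one task row (hypothetical schema)."""
    with closing(sqlite3.connect(db_path)) as conn, conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS tasks (id TEXT PRIMARY KEY, status TEXT NOT NULL)"
        )
        conn.execute(
            "INSERT OR REPLACE INTO tasks (id, status) VALUES (?, ?)",
            (task_id, status),
        )


def load_tasks(db_path):
    """Read all task rows back, ordered by id."""
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute("SELECT id, status FROM tasks ORDER BY id").fetchall()


# A temporary directory keeps the example from touching a real database file.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "luan1ao.db")
    save_task(db_path, "task-1", "completed")
    # A fresh connection stands in for a restart of the Web UI service.
    history = load_tasks(db_path)
```

Because the data lives in a file rather than in process memory, the second connection sees what the first one wrote.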

Human-in-the-Loop (HITL) Mode

LuaN1aoAgent supports a Human-in-the-Loop (HITL) mode, allowing experts to supervise and intervene in the decision-making process.

Enable: Set HUMAN_IN_THE_LOOP=true in .env.
Approval: The agent pauses after generating a plan (initial or dynamic), waiting for human approval via Web UI or CLI.
Modification: Experts can reject or directly modify the plan (JSON editing) before execution.
Injection: Supports real-time injection of new sub-tasks via the Web UI ("Active Intervention").

Interaction Methods:

Web UI: Approval modal pops up automatically. Use "Modify" to edit plans or "Add Task" button to inject tasks.
CLI: Prompts with HITL >. Type y to approve, n to reject, or m to modify (opens system editor).
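The enable flag and the CLI approval prompt could be handled roughly as follows. The `HUMAN_IN_THE_LOOP` variable, the `HITL >` prompt, and the y/n/m replies come from the description above; the function names, the case-insensitive matching, and the re-prompt on invalid input are assumptions.

```python
import os


def hitl_enabled(env=os.environ):
    """True when HUMAN_IN_THE_LOOP is set to 'true' (case-insensitivity is assumed)."""
    return env.get("HUMAN_IN_THE_LOOP", "false").strip().lower() == "true"


def parse_reply(reply):
    """Map a CLI reply to a decision, or None if the reply is not recognised."""
    decisions = {"y": "approve", "n": "reject", "m": "modify"}
    return decisions.get(reply.strip().lower())


def review_plan(plan, ask=input):
    """Keep prompting with 'HITL > ' until the operator gives a valid answer."""
    while True:
        decision = parse_reply(ask("HITL > "))
        if decision is not None:
            return decision
```

Passing `ask` as a parameter keeps the prompt testable: for example, `review_plan(plan, ask=lambda prompt: "y")` returns `"approve"` without waiting on a terminal.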
