GhidrAssist

An LLM extension for Ghidra to enable AI assistance in RE.

Generate Convert Improve

Install / Use

/learn @symgraph/GhidrAssist

About this skill

Quality Score

0/100

README

GhidrAssist

Author: Jason Tang

An advanced LLM-powered plugin for interactive reverse engineering assistance in Ghidra.

Description

GhidrAssist integrates Large Language Models (LLMs) into Ghidra to provide intelligent assistance for binary exploration and reverse engineering. It supports any OpenAI v1-compatible API, including local models (Ollama, LM-Studio, Open-WebUI) and cloud providers (OpenAI, Anthropic, Azure).

Key Features

Core Functionality:

Code Explanation - Explain functions and instructions in both disassembly and decompiled pseudo-C
- Security analysis panel showing risk level, activity profile, and API usage
- Editable summaries with user-edit protection from auto-overwrite
Interactive Chat - Multi-turn conversational queries with persistent chat history
Custom Queries - Direct LLM queries with optional context from current function/location

Graph-RAG Knowledge System:

Semantic Knowledge Graph - Hierarchical representation of binary analysis
- 5-level semantic hierarchy: Statement → Block → Function → Module → Binary
- Pre-computed LLM summaries enable fast, LLM-free queries
- SQLite persistence with JGraphT graph algorithms
- Full-text search (FTS5) on summaries and security annotations
Community Detection - Automatic module discovery via Leiden algorithm
- Groups related functions into logical modules
- Hierarchical community structure with summaries
- Visual graph exploration with configurable depth
Security Feature Extraction - Comprehensive security analysis
- Network APIs: POSIX sockets, WinSock, DNS, SSL/TLS, WinHTTP, WinINet
- File I/O APIs: POSIX, Windows, C library functions
- Crypto APIs: OpenSSL, Windows crypto, platform-specific
- String patterns: IP addresses, URLs, domains, file paths, registry keys
- Risk level classification (LOW/MEDIUM/HIGH) and activity profiling
Semantic Graph Tab - Visual knowledge graph interface
- Graph view with N-hop depth exploration
- List view of all indexed functions
- Semantic search across summaries
- One-click re-indexing and security analysis

Advanced Capabilities:

Extended Thinking/Reasoning Control - Adjust LLM reasoning depth for quality vs. speed trade-offs
- Support for OpenAI o1/o3/o4, Claude with extended thinking, and local reasoning models
- Configurable effort levels: Low (fast), Medium (balanced), High (thorough)
- Per-program persistence - different binaries can use different reasoning levels
- Provider-agnostic implementation (Anthropic, OpenAI, Azure, LiteLLM, LMStudio, Ollama)
ReAct Agentic Mode - Autonomous investigation using structured reasoning (Think-Act-Observe)
- LLM proposes investigation steps based on your query
- Systematic tool execution with progress tracking via todo lists
- Iteration history preservation showing all investigation steps
- Final synthesis with comprehensive answer and key findings
- Accurate metrics (iterations, tool calls, duration)
MCP Integration - Model Context Protocol client for tool-based analysis
- Works with GhidrAssistMCP for Ghidra-specific tools
- Conversational tool calling with automatic function execution
- Support for SSE (Server-Sent Events) transport
Function Calling - LLM can autonomously navigate binaries and modify analysis
- Rename functions and variables
- Navigate to addresses and cross-references
- Execute Ghidra commands
Actions Tab - Propose and apply bulk analysis improvements
- Security vulnerability detection
- Code quality analysis
- Automated refactoring suggestions
RAG (Retrieval Augmented Generation) - Enhance queries with contextual documents
- Add custom documentation, exploit notes, architecture references
- Lucene-based full-text search
- Context injection into queries
RLHF Dataset Generation - Collect feedback for model fine-tuning

Architecture

The plugin uses a modular, service-oriented architecture:

Core Services:

Query Modes: Regular queries, MCP-enhanced queries, or full agentic investigation
ReAct Orchestrator: Manages autonomous investigation loops with todo tracking and findings accumulation
Conversational Tool Handler: Manages multi-turn tool calling sessions
MCPToolManager: Interfaces with external MCP servers for specialized tools

Graph-RAG Backend:

BinaryKnowledgeGraph: Hybrid SQLite + JGraphT storage for semantic knowledge
GraphRAGEngine: LLM-free query engine using pre-computed summaries
SemanticExtractor: LLM-powered function summarization with batch processing
SecurityFeatureExtractor: Static analysis for network, file I/O, and crypto APIs
CommunityDetector: Leiden algorithm implementation for module discovery

Data Layer:

AnalysisDB: SQLite database for chat history, RLHF feedback, and knowledge graphs
SchemaMigrationRunner: Versioned database migrations for transparent upgrades
RAGEngine: Lucene-powered document search for custom context injection

UI Components:

Tab-based interface: Explain, Query, Actions, Semantic Graph, RAG Management, MCP Servers
Service orchestration via TabController

Future Roadmap:

Model fine-tuning using collected RLHF dataset
Additional MCP tool integrations
Enhanced agentic capabilities, multi-agent collaboration
Embedding-based similarity search

Screenshots

Screenshot

https://github.com/user-attachments/assets/bd79474a-c82f-4083-b432-96625fef1387

Quickstart

If necessary, copy the binary release ZIP archive to the Ghidra_Install/Extensions/Ghidra directory.
Launch Ghidra -> File -> Install Extension -> Enable GhidrAssist.
Load a binary and launch the CodeBrowser.
CodeBrowser -> File -> Configure -> Miscellaneous -> Enable GhidrAssist.
CodeBrowser -> Window -> GhidraAssistPlugin.
Ensure the RLHF and RAG database paths are appropriate for your environment.
Point the API host to your preferred API provider and set the API key.
(Optional) In the Analysis Options tab, set the Reasoning Effort level (None/Low/Medium/High) for models that support extended thinking.
Open GhidrAssist with the GhidrAssist option in the Windows menu and start exploring.

LLM Setup

GhidrAssist works with any OpenAI v1-compatible API. Setup details are provider-specific - here are some helpful resources:

Local LLM Providers:

LM Studio - Easy local model hosting with GUI
Ollama - Command-line local model management
Open-WebUI - Web interface for local models

Cloud Providers:

LiteLLM Proxy (Multi-Provider Gateway):

LiteLLM - Unified API for 100+ LLM providers
Supports AWS Bedrock, Google Vertex AI, Azure, and many others
Select "LiteLLM" as provider type in GhidrAssist settings
Automatic model family detection for proper message formatting

Recommended Models

For Agentic Mode (requires strong reasoning and tool use):

Cloud: GPT-5.1, Claude Sonnet 4.5
Local: GPT-OSS, Llama 3.3 70B, DeepSeek-R1 70B, Qwen2.5 72B

Models with Extended Thinking/Reasoning Support:

OpenAI: o1-preview, o1-mini, o3-mini, o4-mini, gpt-5 (use reasoning_effort parameter)
Anthropic: Claude Sonnet 4.5, Claude Opus 4.5, Claude Haiku 4.5, Claude Opus 4.1/4, Claude Sonnet 4 (use thinking.budget_tokens parameter)
Local: openai/gpt-oss-20b via Ollama/LMStudio (supports effort levels)

Reasoning Effort Guidelines:

Low: Quick analysis, minimal thinking tokens (~5-10s, lower cost)
Medium: Balanced reasoning depth (~15-30s, moderate cost)
High: Deep security analysis (~30-60s, 2x cost, recommended for vulnerability hunting)

Note: Agentic mode requires models with strong function calling and multi-step reasoning capabilities. Smaller models may struggle with complex investigations. Extended thinking is optional but can significantly improve analysis quality for complex reverse engineering tasks.

Using GhidrAssistMCP for Tool-Based Analysis

GhidrAssistMCP provides MCP tools that allow the LLM to interact directly with Ghidra's analysis capabilities.

Setup

Start the MCP Server
Configure GhidrAssist:
- Open Tools → GhidrAssist Settings → MCP Servers tab
- Add server: http://127.0.0.1:8081 as GhidrAssistMCP with transport type SSE
Enable MCP in queries:
- In the Custom Query tab, check "Use MCP"
- Optionally enable "Agentic" for autonomous investigation mode

Usage Modes

Regular MCP Queries:

Enable "Use MCP" checkbox
Ask questions like "What does the current function do?"
LLM can call tools to get decompilation, cross-references, etc.

Agentic Mode (Recommended):

Enable both "Use MCP" and "Agentic" checkboxes
Ask complex questions like "Find vulnerabilities in this function" or "Analyze the call graph"
The ReAct agent will:
1. Propose investigation steps as a todo list
2. Systematically execute tools to gather information
3. Track progress and accumulate findings
4. Synthesize a comprehensive answer with evidence

Example Queries:

"What security vulnerabilities exist in this function?"
"Trace the data flow from user input to this call"
"Find all functions that modify global variable X"
"Analyze the error handling in the current function"

Using the Semantic Graph (Graph-RAG)

The Semantic Graph tab provides a knowledge graph interface for exploring binary analysis results without requiring LLM

Related Skills

node-connect

352.2k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

111.1k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

352.2k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

qqbot-media

352.2k

QQBot 富媒体收发能力。使用 <qqmedia> 标签，系统根据文件扩展名自动识别类型（图片/语音/视频/文件）。