# CodeScan
English | 简体中文
AI-assisted code security scanning for files, repositories, Git diffs, and coding agents.
Start with deterministic rules, use LLM analysis to deepen context, and expose the scanner through MCP and a Codex skill.
## Quick Links
- Why CodeScan
- Who This Is For
- Try It In 5 Minutes
- Quick Start
- Use With Codex
- Example Output
- Get Involved
- Roadmap
## Why CodeScan
Many AI code scanners are just chat wrappers around pasted source files. They can sound smart, but the output is unstable, difficult to integrate, and hard to trust in real workflows.
CodeScan takes a stricter route:
- Start with deterministic rule-based signal
- Use LLM analysis to deepen context and explanation
- Force structured output instead of free-form blob parsing
- Deliver the same result model through CLI, reports, MCP tools, and Codex workflows
CodeScan focuses on review workflows where deterministic checks, structured findings, and agent integration matter.
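The "structured output" point is the load-bearing one: every surface (CLI, reports, MCP tools) consumes the same result model instead of parsing free-form text. As a rough illustration only (the field names below are hypothetical, not CodeScan's actual schema, which lives in `codescan/ai/schemas.py`), a finding can be a small validated record that serializes identically everywhere:

```python
from dataclasses import dataclass, asdict
import json

SEVERITIES = ("critical", "high", "medium", "low", "info")

@dataclass
class Finding:
    rule_id: str
    severity: str
    file: str
    line: int
    message: str

    def __post_init__(self):
        # Reject severities outside the fixed vocabulary so downstream
        # consumers never see a value they cannot sort or filter on.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")

finding = Finding("PY-EXEC-001", "high", "app/views.py", 42,
                  "os.system called with user-controlled input")
print(json.dumps(asdict(finding)))
```

Because the record is validated at construction time, a malformed finding fails loudly in the scanner rather than silently corrupting a report or an agent's review loop.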
## Who This Is For
CodeScan is most useful today for:
- developers who want a second security pass before merging code
- teams using Codex, Cursor, or Claude and wanting structured security tooling
- maintainers who want a lightweight repository triage tool without standing up a large platform
- contributors interested in security rules, AI-assisted analysis, or MCP-native developer tools
Today it works best as a review assistant and agent-native scanning layer rather than a full SAST platform.
## Try It In 5 Minutes
Start with these three items:
- Browse the example fixture at `examples/demo-vulnerable-app`
- Open the representative result at `examples/sample-mcp-result.json`
- Read the visual walkthrough in Example Output
For a local run:
```bash
pip install -e .
python -m codescan config --provider deepseek --api-key YOUR_API_KEY --model deepseek-chat
python -m codescan dir examples/demo-vulnerable-app --output demo-result.json
```
## What Makes It Different
| Area | What it does now | Why it matters |
| --- | --- | --- |
| LangChain providers | Unifies DeepSeek, OpenAI, Anthropic, and OpenAI-compatible endpoints | Swap models without rewriting the scanner |
| LangGraph workflow | Models file analysis as rule_scan -> llm_scan -> merge_and_finalize | Gives the AI runtime a real pipeline instead of prompt spaghetti |
| MCP Server | Exposes structured scan tools for coding agents | Lets Codex and other MCP clients call CodeScan directly |
| Skill layer | Ships an installable codescan-review skill | Teaches Codex when to scan and how to present findings |
| Report system | Generates HTML / JSON / text output | Works for both humans and automation |
| Tests + CI | Verifies runtime, packaging, docs, and entry points | Keeps the repo from slipping back into prototype quality |
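The three-stage workflow in the table can be pictured as a small state machine: each node reads a shared state and returns an extended copy. A dependency-free sketch of that shape (the real pipeline is built with LangGraph in `codescan/ai/workflow.py`; the node names match, but the bodies here are simplified stand-ins):

```python
def rule_scan(state):
    # Deterministic pass: flag lines with obviously dangerous calls.
    hits = [i for i, line in enumerate(state["source"].splitlines(), 1)
            if "os.system(" in line or "eval(" in line]
    return {**state, "rule_findings": hits}

def llm_scan(state):
    # Stand-in for the LLM pass; a real node would call the provider
    # with the rule hits as context and return its own findings.
    return {**state, "llm_findings": []}

def merge_and_finalize(state):
    # Deduplicate and order findings from both passes.
    merged = sorted(set(state["rule_findings"]) | set(state["llm_findings"]))
    return {**state, "findings": merged}

def run_workflow(source):
    state = {"source": source}
    for node in (rule_scan, llm_scan, merge_and_finalize):
        state = node(state)
    return state["findings"]

print(run_workflow("import os\nos.system(cmd)\n"))  # -> [2]
```

The point of modeling it as a graph rather than one mega-prompt is that each stage can be tested, swapped, or skipped independently.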
## Architecture

```mermaid
flowchart LR
    A["CLI / GUI"] --> B["CodeScanner"]
    A2["MCP Server"] --> B
    A3["Codex Skill"] --> A2
    B --> C["AIAnalysisService"]
    C --> D["providers.py"]
    C --> E["chains.py"]
    C --> F["workflow.py"]
    B --> G["VulnerabilityDB"]
    B --> H["report.py"]
    F --> I["rule_scan"]
    F --> J["llm_scan"]
    F --> K["merge_and_finalize"]
```
Core layout:
```text
codescan/
├── ai/
│   ├── providers.py
│   ├── prompts.py
│   ├── chains.py
│   ├── workflow.py
│   ├── schemas.py
│   └── service.py
├── scanner.py
├── report.py
├── vulndb.py
├── mcp_server.py
└── __main__.py
skills/
└── codescan-review/
```
## Quick Start

1. Clone

```bash
git clone https://github.com/HeJiguang/codescan.git
cd codescan
```

2. Install

```bash
python -m venv .venv
# Linux / macOS
source .venv/bin/activate
# Windows
.venv\Scripts\activate
pip install -e .
```

3. Configure a model

```bash
python -m codescan config --show
python -m codescan config --provider deepseek --api-key YOUR_DEEPSEEK_API_KEY --model deepseek-chat
```

4. Try the CLI

```bash
python -m codescan file /path/to/file.py
python -m codescan dir /path/to/project
python -m codescan git-merge main
```

5. Try the MCP server

```bash
codescan-mcp --transport stdio
```
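Most MCP clients register a stdio server through a JSON config entry along these lines (the exact file location and key names vary by client; `codescan` as the server name is just a label you choose):

```json
{
  "mcpServers": {
    "codescan": {
      "command": "codescan-mcp",
      "args": ["--transport", "stdio"]
    }
  }
}
```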
## Use With Codex
<p align="center"> <img src="docs/assets/codex-workflow.svg" alt="CodeScan Codex workflow" width="100%" /> </p>

To use CodeScan from Codex, combine both layers:
- Install the `codescan-review` skill
- Run `codescan-mcp --transport stdio`
- Ask Codex for a security review with a concrete scan scope
That gives Codex workflow guidance plus real structured scan tools.
Good starter prompts:
```text
Use $codescan-review to inspect the current branch against main and report only actionable security findings.
Use $codescan-review to inspect this file for security issues, especially trust boundaries and command execution risks.
Use $codescan-review to scan this repository and summarize the top security risks by severity.
```
More setup detail is in Use With Codex, MCP Guide, and Skill Guide.
## Can MCP Actually Improve Agent Security?
Yes, within a review workflow.
CodeScan can improve the safety of agent-authored code when it is used at the right time and treated as a review tool instead of an automatic guarantee:
- highest value: scan the current branch or diff before merge
- strong value: scan a suspicious file that touches auth, SQL, shell execution, file handling, templating, or secrets
- lower value: run a broad repository sweep for intake or triage
What MCP changes is the integration cost. Instead of asking an agent to shell out, wait for reports, and parse result files, CodeScan can return structured findings directly in the review loop.
What MCP does not solve by itself:
- false positives from lightweight rule matching
- missing deeper data-flow or framework-aware analysis
- the need to manually verify high-severity findings before treating them as confirmed
In other words: MCP makes secure review workflows easier for agents to use consistently. It does not turn any scanner into a complete security gate on its own.
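The highest-value case above, scanning only what a branch touches, is cheap to scope yourself. A hedged sketch that extracts changed file paths from unified diff text (e.g. the output of `git diff main`), which you could then feed to `python -m codescan file`, rather than a call into CodeScan's internal API:

```python
import re

def changed_files(diff_text):
    """Collect target paths from a unified diff, skipping deleted files
    (whose target side is /dev/null)."""
    files = []
    for line in diff_text.splitlines():
        m = re.match(r"\+\+\+ b/(.+)", line)
        if m:
            files.append(m.group(1))
    return files

diff = """\
--- a/app/auth.py
+++ b/app/auth.py
--- a/old.py
+++ /dev/null
"""
print(changed_files(diff))  # -> ['app/auth.py']
```

Scoping the scan this way keeps the review loop fast and keeps findings tied to code someone is actually about to merge.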
## Example Output
<p align="center"> <img src="docs/assets/sample-findings.svg" alt="CodeScan sample findings preview" width="100%" /> </p>

The repo includes a small intentionally vulnerable fixture plus a representative structured scan result:

- `examples/demo-vulnerable-app`
- `examples/sample-mcp-result.json`

These files show the expected result shape before any local model setup.
More detail is in Example Output.
## Get Involved
Current contribution areas:
- improve rule quality and reduce false positives
- add Semgrep or AST-backed checks
- improve GUI usability or split `gui.py`
- add benchmark repositories and evaluation fixtures
- improve docs, examples, onboarding, and Codex workflows
Start here:
- Contributing Guide
- Good First Issues Guide
- Community Guide
- Support
- MCP Guide
- Skill Guide
- the `good first issues` lane described in the contributing guide
Use the GitHub issue templates for bugs and feature proposals.
## What Ships Today
- Unified provider layer for modern chat models
- LangGraph-based file analysis workflow
- File, directory, GitHub repo, and Git diff scanning
- HTML / JSON / text report generation
- Desktop GUI
- MCP server with structured security tools
- Installable `codescan-review` skill for Codex
- Codex-specific setup guide and workflow visuals
- Demo vulnerable fixture and example MCP-style findings
- GitHub Actions CI and test coverage
## Quality Gate

```bash
python -m pytest tests -q
python -m compileall codescan
python -m codescan --help
python -m codescan mcp --help
```
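The same gate translates directly into CI. A minimal GitHub Actions sketch (the repo ships its own workflow under `.github/workflows/`, which may differ from this):

```yaml
name: quality-gate
on: [push, pull_request]
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -e .
      - run: python -m pytest tests -q
      - run: python -m compileall codescan
      - run: python -m codescan --help
```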
## Roadmap
- [x] Rebuild the AI runtime with LangChain + LangGraph
- [x] Repair CLI / GUI / report-layer contract mismatches
- [x] Add packaging metadata, tests, and public CI
- [x] Publish an MCP server surface for coding agents
- [x] Publish an installable Codex skill
- [x] Add concrete example outputs to the repo homepage
- [ ] Strengthen rule trustworthiness with deeper Semgrep / AST review flows
- [ ] Add SARIF output and GitHub code scanning integration
- [ ] Continue splitting scan/export/settings logic out of `gui.py`
- [ ] Add benchmark repositories and repeatable evaluation fixtures
## Docs
- Technical Doc
- MCP Guide
- Skill Guide
- Use With Codex
- Example Output
- Good First Issues
- Community Guide
- Contributing
- Support
## License
MIT. See LICENSE.
