# CodeScan
English | 简体中文
AI-assisted code security scanning for files, repositories, Git diffs, and coding agents.
Start with deterministic rules, use LLM analysis to deepen context, and expose the scanner through MCP and a Codex skill.
## Quick Links
- Why CodeScan
- Who This Is For
- Try It In 5 Minutes
- Quick Start
- Use With Codex
- Example Output
- Get Involved
- Roadmap
## Why CodeScan
Many AI code scanners are just chat wrappers around pasted source files. They can sound smart, but the output is unstable, difficult to integrate, and hard to trust in real workflows.
CodeScan takes a stricter route:
- Start with deterministic rule-based signal
- Use LLM analysis to deepen context and explanation
- Force structured output instead of free-form blob parsing
- Deliver the same result model through CLI, reports, MCP tools, and Codex workflows
CodeScan focuses on review workflows where deterministic checks, structured findings, and agent integration matter.
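The "structured output" point is the load-bearing one: every surface (CLI, reports, MCP tools) consumes the same result model instead of parsing free-form text. As a rough illustration only (the field names below are hypothetical, not CodeScan's actual schema, which lives in `codescan/ai/schemas.py`), a finding can be a small validated record that serializes identically everywhere:

```python
from dataclasses import dataclass, asdict
import json

SEVERITIES = ("critical", "high", "medium", "low", "info")

@dataclass
class Finding:
    rule_id: str
    severity: str
    file: str
    line: int
    message: str

    def __post_init__(self):
        # Reject severities outside the fixed vocabulary so downstream
        # consumers never see a value they cannot sort or filter on.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")

finding = Finding("PY-EXEC-001", "high", "app/views.py", 42,
                  "os.system called with user-controlled input")
print(json.dumps(asdict(finding)))
```

Because the record is validated at construction time, a malformed finding fails loudly in the scanner rather than silently corrupting a report or an agent's review loop.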
## Who This Is For
CodeScan is most useful today for:
- developers who want a second security pass before merging code
- teams using Codex, Cursor, or Claude and wanting structured security tooling
- maintainers who want a lightweight repository triage tool without standing up a large platform
- contributors interested in security rules, AI-assisted analysis, or MCP-native developer tools
Today it works best as a review assistant and agent-native scanning layer rather than a full SAST platform.
## Try It In 5 Minutes
Start with these three items:
- Browse the example fixture at `examples/demo-vulnerable-app`
- Open the representative result at `examples/sample-mcp-result.json`
- Read the visual walkthrough in Example Output
For a local run:
```bash
pip install -e .
python -m codescan config --provider deepseek --api-key YOUR_API_KEY --model deepseek-chat
python -m codescan dir examples/demo-vulnerable-app --output demo-result.json
```
## What Makes It Different
| Area | What it does now | Why it matters |
| --- | --- | --- |
| LangChain providers | Unifies DeepSeek, OpenAI, Anthropic, and OpenAI-compatible endpoints | Swap models without rewriting the scanner |
| LangGraph workflow | Models file analysis as rule_scan -> llm_scan -> merge_and_finalize | Gives the AI runtime a real pipeline instead of prompt spaghetti |
| MCP Server | Exposes structured scan tools for coding agents | Lets Codex and other MCP clients call CodeScan directly |
| Skill layer | Ships an installable codescan-review skill | Teaches Codex when to scan and how to present findings |
| Report system | Generates HTML / JSON / text output | Works for both humans and automation |
| Tests + CI | Verifies runtime, packaging, docs, and entry points | Keeps the repo from slipping back into prototype quality |
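The three-stage workflow in the table can be pictured as a small state machine: each node reads a shared state and returns an extended copy. A dependency-free sketch of that shape (the real pipeline is built with LangGraph in `codescan/ai/workflow.py`; the node names match, but the bodies here are simplified stand-ins):

```python
def rule_scan(state):
    # Deterministic pass: flag lines with obviously dangerous calls.
    hits = [i for i, line in enumerate(state["source"].splitlines(), 1)
            if "os.system(" in line or "eval(" in line]
    return {**state, "rule_findings": hits}

def llm_scan(state):
    # Stand-in for the LLM pass; a real node would call the provider
    # with the rule hits as context and return its own findings.
    return {**state, "llm_findings": []}

def merge_and_finalize(state):
    # Deduplicate and order findings from both passes.
    merged = sorted(set(state["rule_findings"]) | set(state["llm_findings"]))
    return {**state, "findings": merged}

def run_workflow(source):
    state = {"source": source}
    for node in (rule_scan, llm_scan, merge_and_finalize):
        state = node(state)
    return state["findings"]

print(run_workflow("import os\nos.system(cmd)\n"))  # -> [2]
```

The point of modeling it as a graph rather than one mega-prompt is that each stage can be tested, swapped, or skipped independently.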
## Architecture

```mermaid
flowchart LR
    A["CLI / GUI"] --> B["CodeScanner"]
    A2["MCP Server"] --> B
    A3["Codex Skill"] --> A2
    B --> C["AIAnalysisService"]
    C --> D["providers.py"]
    C --> E["chains.py"]
    C --> F["workflow.py"]
    B --> G["VulnerabilityDB"]
    B --> H["report.py"]
    F --> I["rule_scan"]
    F --> J["llm_scan"]
    F --> K["merge_and_finalize"]
```
Core layout:
```text
codescan/
├── ai/
│   ├── providers.py
│   ├── prompts.py
│   ├── chains.py
│   ├── workflow.py
│   ├── schemas.py
│   └── service.py
├── scanner.py
├── report.py
├── vulndb.py
├── mcp_server.py
└── __main__.py
skills/
└── codescan-review/
```
## Quick Start

1. Clone

```bash
git clone https://github.com/HeJiguang/codescan.git
cd codescan
```

2. Install

```bash
python -m venv .venv
# Linux / macOS
source .venv/bin/activate
# Windows
.venv\Scripts\activate
pip install -e .
```

3. Configure a model

```bash
python -m codescan config --show
python -m codescan config --provider deepseek --api-key YOUR_DEEPSEEK_API_KEY --model deepseek-chat
```

4. Try the CLI

```bash
python -m codescan file /path/to/file.py
python -m codescan dir /path/to/project
python -m codescan git-merge main
```

5. Try the MCP server

```bash
codescan-mcp --transport stdio
```
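Most MCP clients register a stdio server through a JSON config entry along these lines (the exact file location and key names vary by client; `codescan` as the server name is just a label you choose):

```json
{
  "mcpServers": {
    "codescan": {
      "command": "codescan-mcp",
      "args": ["--transport", "stdio"]
    }
  }
}
```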
## Use With Codex
<p align="center"> <img src="docs/assets/codex-workflow.svg" alt="CodeScan Codex workflow" width="100%" /> </p>

To use CodeScan from Codex, combine both layers:
- Install the `codescan-review` skill
- Run `codescan-mcp --transport stdio`
- Ask Codex for a security review with a concrete scan scope
That gives Codex workflow guidance plus real structured scan tools.
Good starter prompts:
```text
Use $codescan-review to inspect the current branch against main and report only actionable security findings.
Use $codescan-review to inspect this file for security issues, especially trust boundaries and command execution risks.
Use $codescan-review to scan this repository and summarize the top security risks by severity.
```
More setup detail is in Use With Codex, MCP Guide, and Skill Guide.
## Can MCP Actually Improve Agent Security?
Yes, within a review workflow.
CodeScan can improve the safety of agent-authored code when it is used at the right time and treated as a review tool instead of an automatic guarantee:
- highest value: scan the current branch or diff before merge
- strong value: scan a suspicious file that touches auth, SQL, shell execution, file handling, templating, or secrets
- lower value: run a broad repository sweep for intake or triage
What MCP changes is the integration cost. Instead of asking an agent to shell out, wait for reports, and parse result files, CodeScan can return structured findings directly in the review loop.
What MCP does not solve by itself:
- false positives from lightweight rule matching
- missing deeper data-flow or framework-aware analysis
- the need to manually verify high-severity findings before treating them as confirmed
In other words: MCP makes secure review workflows easier for agents to use consistently. It does not turn any scanner into a complete security gate on its own.
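The highest-value case above, scanning only what a branch touches, is cheap to scope yourself. A hedged sketch that extracts changed file paths from unified diff text (e.g. the output of `git diff main`), which you could then feed to `python -m codescan file`, rather than a call into CodeScan's internal API:

```python
import re

def changed_files(diff_text):
    """Collect target paths from a unified diff, skipping deleted files
    (whose target side is /dev/null)."""
    files = []
    for line in diff_text.splitlines():
        m = re.match(r"\+\+\+ b/(.+)", line)
        if m:
            files.append(m.group(1))
    return files

diff = """\
--- a/app/auth.py
+++ b/app/auth.py
--- a/old.py
+++ /dev/null
"""
print(changed_files(diff))  # -> ['app/auth.py']
```

Scoping the scan this way keeps the review loop fast and keeps findings tied to code someone is actually about to merge.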
## Example Output
<p align="center"> <img src="docs/assets/sample-findings.svg" alt="CodeScan sample findings preview" width="100%" /> </p>

The repo includes a small intentionally vulnerable fixture plus a representative structured scan result:

- `examples/demo-vulnerable-app`
- `examples/sample-mcp-result.json`

These files show the expected result shape before any local model setup.
More detail is in Example Output.
## Get Involved
Current contribution areas:
- improve rule quality and reduce false positives
- add Semgrep or AST-backed checks
- improve GUI usability or split `gui.py`
- add benchmark repositories and evaluation fixtures
- improve docs, examples, onboarding, and Codex workflows
Start here:
- Contributing Guide
- Good First Issues Guide
- Community Guide
- Support
- MCP Guide
- Skill Guide
- the `good first issues` lane described in the contributing guide
Use the GitHub issue templates for bugs and feature proposals.
## What Ships Today
- Unified provider layer for modern chat models
- LangGraph-based file analysis workflow
- File, directory, GitHub repo, and Git diff scanning
- HTML / JSON / text report generation
- Desktop GUI
- MCP server with structured security tools
- Installable `codescan-review` skill for Codex
- Codex-specific setup guide and workflow visuals
- Demo vulnerable fixture and example MCP-style findings
- GitHub Actions CI and test coverage
## Quality Gate

```bash
python -m pytest tests -q
python -m compileall codescan
python -m codescan --help
python -m codescan mcp --help
```
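The same gate translates directly into CI. A minimal GitHub Actions sketch (the repo ships its own workflow under `.github/workflows/`, which may differ from this):

```yaml
name: quality-gate
on: [push, pull_request]
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -e .
      - run: python -m pytest tests -q
      - run: python -m compileall codescan
      - run: python -m codescan --help
```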
## Roadmap
- [x] Rebuild the AI runtime with LangChain + LangGraph
- [x] Repair CLI / GUI / report-layer contract mismatches
- [x] Add packaging metadata, tests, and public CI
- [x] Publish an MCP server surface for coding agents
- [x] Publish an installable Codex skill
- [x] Add concrete example outputs to the repo homepage
- [ ] Strengthen rule trustworthiness with deeper Semgrep / AST review flows
- [ ] Add SARIF output and GitHub code scanning integration
- [ ] Continue splitting scan/export/settings logic out of `gui.py`
- [ ] Add benchmark repositories and repeatable evaluation fixtures
## Docs
- Technical Doc
- MCP Guide
- Skill Guide
- Use With Codex
- Example Output
- Good First Issues
- Community Guide
- Contributing
- Support
## License
MIT. See LICENSE.
