Auditor

Antidote to VibeCoding

Install / Use

/learn @TheAuditorTool/Auditor


Hello everyone. After being Sherlocked twice, lol... I've decided I'm going closed source. I don't feel like giving architecture away to supposed "Silicon Valley legends" for nothing anymore. Everything I've done has been thoroughly validated, and I've decided to pivot into a product with free and paid subscription tiers. I'm also going to rework the current architecture to support that product. It's going to be even more amazing, more deterministic, and more automated. I have already solved Java, fully integrated: Maven, Spring, Gradle, Jakarta, etc. The full shebang is already wired up to taint, and I've also achieved polyglot taint with multi-hop, cross-file paths and cross-language indexing/taint, with full call chain provenance across guards, validators, classpaths, and frameworks.

It's going to be amazing and I'm personally thrilled. The hate, spite, and revenge tanks are fully loaded and the engines upgraded to V8... I know you and your team are reading this... You committed the cardinal sin... Everyone knows the recipe for pasta carbonara... Very few can cook it at the level of a three-star Michelin chef... And even fewer can invent new dishes at that level...

It's now 2026/01/23... I have polished every language to make sure it's at parity for "full call chain provenance", which is now done... Drinking celebratory beers today, and tomorrow I start work on the product pivot. I expect it to take 2-3 more months from today until the product launch, which will be announced here and in other places :)

I'm not going to be lame and delete or private the repo. I'm happy to contribute to the open-source community, but the tool will not get any more updates and it's a bit broken in some places. I wish I could share those fixes, but sadly I can't anymore... :(

Was a blast. Wish you all the best :)

TheAuditor

Database-First Static Analysis and Code Context Intelligence

Multi-language security analysis platform with strict data fidelity guarantees for Python, JavaScript/TypeScript, Go, Rust, Bash, and Terraform/HCL projects

🔒 Privacy-First: All code analysis runs locally. Your source code never leaves your machine.

Network Features (fully optional - use --offline to disable):

Dependency version checks (npm, pip, cargo registries)
Documentation fetching for improved AI context
Public vulnerability database updates

Default mode includes network calls. Run aud full --offline for air-gapped operation.

What is TheAuditor?

TheAuditor is a database-first code intelligence platform that indexes your entire codebase into a structured SQLite database, enabling:

25 rule categories with 200+ detection functions for framework-aware vulnerability detection
Complete data flow analysis with cross-file taint tracking
Architectural intelligence with hotspot detection and circular dependency analysis
Deterministic query tools providing ground truth for AI agents (prevents hallucination)
Database-first queries replacing slow file I/O with indexed lookups
Framework-aware detection for Django, Flask, FastAPI, React, Vue, Next.js, Express, Angular, SQLAlchemy, Prisma, Sequelize, TypeORM, Celery, GraphQL, Terraform, AWS CDK, GitHub Actions

Key Differentiator: While most SAST tools re-parse files for every query, TheAuditor indexes incrementally and queries from the database - enabling sub-second queries across 100K+ LOC. Re-index only when files change, for example after code edits or a branch switch.
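The index-once, query-many idea can be illustrated with a minimal sketch using Python's standard `sqlite3` and `ast` modules. The `symbols` table, its columns, and the helper names `build_index` and `find_symbol` are hypothetical and chosen for illustration; they are not TheAuditor's actual schema or API.

```python
import ast
import sqlite3


def build_index(conn, files):
    """Parse each (path -> source) entry once and store its function definitions."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS symbols (name TEXT, path TEXT, line INTEGER)"
    )
    # The index on `name` is what turns later lookups into B-tree searches.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_symbols_name ON symbols(name)")
    for path, source in files.items():
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                conn.execute(
                    "INSERT INTO symbols VALUES (?, ?, ?)",
                    (node.name, path, node.lineno),
                )
    conn.commit()


def find_symbol(conn, name):
    """Answer a lookup from the index; no source file is re-read or re-parsed."""
    rows = conn.execute(
        "SELECT path, line FROM symbols WHERE name = ? ORDER BY path, line",
        (name,),
    )
    return rows.fetchall()


# Hypothetical two-file project held in memory for the example.
conn = sqlite3.connect(":memory:")
build_index(conn, {
    "auth.py": "def validate_user(u):\n    return bool(u)\n",
    "api.py": "def login(req):\n    return req\n\ndef validate_user(x):\n    return x\n",
})
print(find_symbol(conn, "validate_user"))  # [('api.py', 4), ('auth.py', 1)]
```

Parsing happens only inside `build_index`; every later question is a plain SQL lookup, which is the trade of disk space for query speed described above.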

📺 See the A/B Test

TheAuditor vs. Standard AI: Head-to-Head Refactor

The Experiment: We ran an A/B test giving the exact same problem statement to two Claude Code sessions.

Session A (Standard): File reading, grepping, assumptions about the codebase.

Session B (TheAuditor): Used aud planning to verify the problem, aud impact for blast radius, and aud refactor to guide implementation.

Result: Watch how the database-first approach verifies the fix before writing code, preventing the hallucinations and incomplete refactors seen in Session A.

# Index your codebase
aud full

# Query from the database
aud query --symbol validateUser --show-callers --depth 3
aud blueprint --security
aud taint --severity critical
aud impact --symbol AuthService --planning-context

# Re-index after changes (incremental via workset)
aud workset --diff main..HEAD
aud full --index

Architecture: Custom Compilers, Not Generic Parsers

TheAuditor's analysis accuracy comes from deep compiler integrations, not generic parsing:

Python Analysis Engine

Built on Python's native ast module with 27 specialized extractor modules:

| Extractor Category | Modules |
|-------------------|---------|
| Core | core_extractors, fundamental_extractors, control_flow_extractors |
| Framework | django_web_extractors, flask_extractors, orm_extractors, task_graphql_extractors |
| Security | security_extractors, validation_extractors, data_flow_extractors |
| Advanced | async_extractors, protocol_extractors, type_extractors, cfg_extractor |

Each extractor performs semantic analysis—understanding Django signals, Flask routes, Celery tasks, Pydantic validators, and 100+ framework-specific patterns.
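As a rough illustration of what an `ast`-based framework extractor might look like, the sketch below finds Flask-style `@app.route("...")` decorators. The function name `extract_routes` and the sample source are hypothetical; the real extractor modules are not shown in this README and almost certainly handle many more cases.

```python
import ast


def extract_routes(source):
    """Return (url_rule, function_name) pairs for @<obj>.route("...") decorators."""
    routes = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for deco in node.decorator_list:
            # Match a call such as app.route("/path", ...) with a literal string rule.
            if (
                isinstance(deco, ast.Call)
                and isinstance(deco.func, ast.Attribute)
                and deco.func.attr == "route"
                and deco.args
                and isinstance(deco.args[0], ast.Constant)
                and isinstance(deco.args[0].value, str)
            ):
                routes.append((deco.args[0].value, node.name))
    return routes


# Sample input; it is only parsed, never executed, so Flask need not be installed.
SAMPLE = '''
from flask import Flask
app = Flask(__name__)

@app.route("/login", methods=["POST"])
def login():
    return "ok"

def helper():
    return 1
'''

print(extract_routes(SAMPLE))  # [('/login', 'login')]
```

Because the extractor works on the syntax tree rather than on text, a commented-out decorator or a string that merely contains `@app.route` would not be reported.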

JavaScript/TypeScript Analysis Engine

Uses the actual TypeScript Compiler API via Node.js subprocess integration:

Full semantic type resolution (not regex pattern matching)
Module resolution across complex import graphs
JSX/TSX transformation with component tree analysis
tsconfig.json-aware path aliasing
Vue SFC script extraction and analysis

This is not tree-sitter. The TypeScript Compiler provides the same semantic analysis as your IDE.

Polyglot Support

| Language | Parser | Fidelity |
|----------|--------|----------|
| Python | Native ast module + 27 extractors | Full semantic |
| TypeScript/JavaScript | TypeScript Compiler API | Full semantic |
| Go | tree-sitter | Structural + taint |
| Rust | tree-sitter | Structural + taint |
| Bash | tree-sitter | Structural + taint |

Tree-sitter provides fast structural parsing for Go, Rust, and Bash. The heavy lifting for Python and JS/TS uses language-native compilers.

Key Differentiators

| Traditional Tools | TheAuditor |
|-------------------|------------|
| Re-parse files per query | Index incrementally, query from database |
| Single analysis dimension | 4-vector convergence (static + structural + process + flow) |
| Human-only interfaces | Deterministic query tools for AI agents |
| File-based navigation | Database-first with recursive CTEs |
| Point-in-time analysis | ML models trained on your codebase history |
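To make "database-first with recursive CTEs" concrete, here is a minimal sketch of a depth-limited caller lookup in SQLite, similar in spirit to a `--show-callers --depth 3` query. The `calls` table, its contents, and the helper `transitive_callers` are assumptions made for this example, not TheAuditor's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [
        ("login", "validate_user"),
        ("handle_request", "login"),
        ("main", "handle_request"),
        ("cron_job", "validate_user"),
    ],
)

# Walk the call graph upward from a symbol, one hop per recursion step.
# The depth bound guarantees termination even if the graph contains cycles.
CALLERS_SQL = """
WITH RECURSIVE callers(name, depth) AS (
    SELECT caller, 1 FROM calls WHERE callee = :symbol
    UNION
    SELECT c.caller, callers.depth + 1
    FROM calls AS c
    JOIN callers ON c.callee = callers.name
    WHERE callers.depth < :max_depth
)
SELECT name, MIN(depth) FROM callers GROUP BY name ORDER BY MIN(depth), name
"""


def transitive_callers(conn, symbol, max_depth):
    """Return (caller, shortest_distance) pairs up to max_depth hops away."""
    params = {"symbol": symbol, "max_depth": max_depth}
    return conn.execute(CALLERS_SQL, params).fetchall()


print(transitive_callers(conn, "validate_user", 3))
# [('cron_job', 1), ('login', 1), ('handle_request', 2), ('main', 3)]
```

The whole traversal runs inside the database engine, so no source file is opened to answer the question.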

Limitations & Trade-offs

Analysis Speed vs Correctness:

We prioritize correctness over speed
Full indexing: 1-10 minutes depending on codebase size (framework-heavy projects slower)
Complete call graph construction rather than approximate heuristics

Language Support Fidelity:

Python & TypeScript/JavaScript: Full semantic analysis via native compilers
Go & Rust: Structural analysis via Tree-sitter (no type resolution)
C++: Not currently supported

Database Size:

repo_index.db: 50MB (5K LOC) to 500MB+ (100K+ LOC)
graphs.db: 30MB (5K LOC) to 300MB+ (100K+ LOC)
Trade-off: Disk space for instant queries

Setup Overhead:

Requires initial aud full run before querying (1-10 min first-time)
Not suitable for quick one-off file scans
Designed for sustained development on a codebase

Current Scope:

Security-focused static analysis, not a linter replacement
Complements (doesn't replace) language-specific tools like mypy, eslint
No IDE integration (CLI-only, designed for terminal and AI agent workflows)

What This Is NOT

Not a Traditional SAST:

We don't provide "risk scores" or subjective ratings
We provide facts (FCE shows evidence convergence, not risk opinions)
You interpret the findings based on your context

Not a Code Formatter:

We detect patterns, we don't fix them
See findings as signals to investigate, not auto-fix targets

Not a Replacement for Linters:

TheAuditor focuses on security patterns and architecture
Use alongside Ruff, ESLint, Clippy for comprehensive coverage

Installation

pip install theauditor

# Or from source
git clone https://github.com/TheAuditorTool/Auditor.git
cd Auditor
pip install -e .

# Install language tooling (Node.js runtime, linters)
aud setup-ai

Prerequisites:

Python 3.14+ (Strict Requirement)
- Why? We rely on PEP 649 (Deferred Evaluation of Annotations) for accurate type resolution in the Taint Engine. We cannot track data flow through Pydantic models or FastAPI endpoints correctly without it.

Quick Start

# 1. Index your codebase
cd your-project
aud full

# 2. Explore architecture
aud blueprint --structure

# 3. Find security issues
aud taint --severity high
aud boundaries --type input-validation

# 4. Query anything
aud explain src/auth/service.ts
aud query --symbol authenticate --show-callers

Feature Overview

Core Analysis Engine

| Command | Purpose |
|---------|---------|
| aud full | Comprehensive 24-phase indexing pipeline |
| aud workset | Create focused file subsets for targeted analysis |
| aud detect-patterns | 25 rule categories with 200+ detection functions |
| aud taint | Source-to-sink data flow tracking |
| aud boundaries | Security boundary enforcement analysis |
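Source-to-sink tracking can be pictured as a reachability search over data-flow edges. The sketch below is a deliberately simplified illustration: the `FLOWS` graph, the node names, and the `taint_paths` helper are all hypothetical, and it ignores sanitizers, validators, and cross-file resolution that a real taint engine has to model.

```python
from collections import deque


def taint_paths(flows, sources, sinks):
    """Breadth-first search over data-flow edges.

    Returns one shortest path for each (source, reachable sink) pair.
    """
    paths = []
    for source in sources:
        queue = deque([[source]])
        seen = {source}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                paths.append(path)
                continue  # do not propagate past a sink
            for nxt in flows.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
    return paths


# Hypothetical edges: "value at key flows into each listed name".
FLOWS = {
    "request.args['id']": ["user_id"],
    "user_id": ["query", "log_line"],
    "query": ["cursor.execute"],
    "config.DEBUG": ["log_line"],
}

print(taint_paths(FLOWS, ["request.args['id']"], {"cursor.execute"}))
# [["request.args['id']", 'user_id', 'query', 'cursor.execute']]
```

Keeping the full path, rather than a yes/no answer, is what lets a finding show the chain of assignments between the untrusted input and the dangerous call.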

Intelligence & Queries

| Command | Purpose |
|---------|---------|
| aud explain | Complete briefing packet for any file/symbol/component |
| aud query | SQL-powered code structure queries |
| aud blueprint | Architectural visualization (8 analysis modes) |
| aud impact | Blast radius calculation