CodeSnapAI
CodeSnapAI compresses massive codebases into ultra-compact semantic snapshots that preserve over 95% of debugging-critical information, resolving the “context explosion vs. information loss” paradox and enabling AI-assisted development at unprecedented scale.
AI-Powered Semantic Code Analysis & Intelligent Governance Platform
🎯 Mission Statement
CodeSnapAI addresses the critical "context explosion vs. information loss" paradox in modern software engineering. We compress massive codebases into ultra-compact semantic snapshots while preserving 95%+ debugging-critical information, enabling AI-assisted development at unprecedented scale.
Core Innovation: Transform 5MB+ codebases into <200KB semantic representations that LLMs can actually understand and act upon.
💡 Why CodeSnapAI?
Industry Pain Points
Modern software development faces four critical bottlenecks:

| Challenge | Current State | CodeSnapAI Solution |
|-----------|---------------|---------------------|
| Context Overload | Large codebases contain millions of details, overwhelming AI debuggers and human developers | Intelligent semantic compression with risk-weighted preservation |
| Semantic Loss | Traditional code summarization loses critical dependency relationships and error patterns | Multi-dimensional semantic tagging system maintaining architectural integrity |
| Governance Fragmentation | Complexity detection tools (SonarQube, Codacy) report issues but require manual remediation | Automated end-to-end workflow: scan → AI-generated patches → validation → deployment |
| Multi-Language Chaos | Each language requires separate toolchains and analysis frameworks | Unified semantic abstraction layer across Go, Java, C/C++, Rust, Python |
Competitive Advantages
🚀 20:1 Compression Ratio - Industry-leading semantic snapshot technology
🎯 95%+ Information Retention - Preserves more than 95% of debugging-critical information, including dependency relationships
🔄 Closed-Loop Automation - From issue detection to validated patch deployment
🌐 Universal Language Support - Unified analysis across 5+ major languages
⚡ Sub-30s Analysis - Process 100K LOC projects in under 30 seconds
🔓 Open Source & Extensible - Plugin architecture for custom rules and languages
✨ Key Features
1. Multi-Language Semantic Analyzer
- Unified AST Parsing: Leverages tree-sitter for Go, Java, C/C++, Rust, Python, Shell
- Deep Semantic Extraction:
  - Function signatures, call graphs, dependency trees
  - Complexity metrics (cyclomatic, cognitive, nesting depth)
  - Error handling patterns (panic/error wrapping/exceptions)
  - Concurrency primitives (goroutines, async/await, channels)
  - Database/network operation markers
- Incremental Analysis: File-level hashing for efficient change detection
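The incremental analysis bullet above can be illustrated with a minimal sketch. This is not CodeSnapAI's actual implementation: the function names `file_digest` and `changed_files`, the SHA-256 choice, and the manifest layout are all assumptions made for illustration. The idea is to hash each source file and re-analyze only the files whose digest differs from the previous run.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(root: Path, previous: dict, suffixes=(".py", ".go", ".java")):
    """Compare current digests with a previous manifest (hypothetical layout:
    relative path -> digest).

    Returns (current_manifest, paths_needing_reanalysis).
    """
    current = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            current[path.relative_to(root).as_posix()] = file_digest(path)
    # A file is "changed" if it is new or its digest differs from last time.
    changed = [p for p, digest in current.items() if previous.get(p) != digest]
    return current, changed
```

On a first run, pass an empty dict as `previous` so every file is reported as changed; persist the returned manifest and pass it back on the next run.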
2. Intelligent Snapshot Generator
- Advanced Compression Strategies:
  - Package-level aggregation with representative sampling
  - Critical path extraction (high-call-count functions prioritized)
  - Semantic clustering by functional tags
  - Risk-weighted pruning (high-risk modules preserved verbatim)
- Multiple Output Formats: YAML (human-readable), JSON (API), Binary (performance)
- Rich Metadata: Project structure, dependency graphs, risk heatmaps, git context
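As a rough illustration of risk-weighted pruning, the sketch below keeps full bodies only for functions above a risk cutoff and reduces everything else to its signature. The `FunctionInfo` type, the 0-to-1 risk scale, and the `0.7` cutoff are hypothetical and not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class FunctionInfo:
    name: str
    signature: str
    body: str
    risk: float  # assumed scale: 0.0 (low risk) to 1.0 (high risk)


def prune(functions, keep_verbatim_above: float = 0.7):
    """Build snapshot entries, highest risk first.

    High-risk functions keep their body verbatim; the rest keep only
    their name, signature, and risk score.
    """
    snapshot = []
    for fn in sorted(functions, key=lambda f: f.risk, reverse=True):
        entry = {"name": fn.name, "signature": fn.signature, "risk": fn.risk}
        if fn.risk >= keep_verbatim_above:
            entry["body"] = fn.body  # preserved verbatim
        snapshot.append(entry)
    return snapshot
```

Sorting by risk first means that a consumer with a tight context budget can truncate the snapshot from the end and lose the least critical entries.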
3. Risk Scoring Engine
- Multi-Dimensional Risk Model:
  - Complexity score (weighted McCabe + Cognitive Complexity)
  - Error pattern analysis (unsafe operations, missing handlers)
  - Test coverage penalties for critical paths
  - Transitive dependency vulnerability propagation
  - Change frequency from git history (instability indicators)
- Configurable Thresholds: Custom scoring rules per project type
- Actionable Reports: Drill-down capabilities with root cause analysis
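One simple way to combine the dimensions above is a weighted average of normalized signals. The sketch below assumes each signal has already been scaled to the range 0 to 1; the signal names and weights are invented for illustration, since the README does not publish the engine's actual formula.

```python
# Hypothetical weights, one per risk dimension listed above.
WEIGHTS = {
    "complexity": 0.30,
    "error_patterns": 0.25,
    "coverage_gap": 0.20,
    "dependency_vulns": 0.15,
    "churn": 0.10,
}


def risk_score(signals: dict, weights: dict = WEIGHTS) -> float:
    """Weighted average of signals; each signal is clamped to [0, 1].

    Missing signals count as 0.0, so the result is also in [0, 1].
    """
    total_weight = sum(weights.values())
    score = 0.0
    for name, weight in weights.items():
        value = min(max(signals.get(name, 0.0), 0.0), 1.0)
        score += weight * value
    return score / total_weight
```

Per-project thresholds could then be expressed as a different `weights` mapping passed by the caller.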
4. AI Governance Orchestrator
- Automated Issue Detection:
  - Cyclomatic complexity > 10 (configurable)
  - Cognitive complexity > 15
  - Nesting depth > 4
  - Function length > 50 LOC
  - Parameter count > 5
  - Code duplication > 3%
- LLM-Powered Refactoring:
  - Context-enriched prompt generation
  - Structured JSON output validation
  - Multi-turn conversation support
- Patch Management Pipeline:
  - Syntax validation via language parsers
  - Automated test execution (pre/post patching)
  - Git-based rollback mechanism
  - Optional approval workflows
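To make the issue-detection thresholds concrete, here is a minimal sketch that checks four of them for Python source using the standard-library `ast` module. CodeSnapAI itself is described as using tree-sitter; this stand-in only approximates the metrics, and the names `THRESHOLDS`, `cyclomatic`, `max_nesting`, and `violations` are hypothetical.

```python
import ast

# Default limits copied from the list above (cognitive complexity and
# duplication are omitted from this sketch).
THRESHOLDS = {"cyclomatic": 10, "nesting": 4, "length": 50, "params": 5}

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.IfExp, ast.comprehension)
NESTING_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.With)


def cyclomatic(fn: ast.AST) -> int:
    """Approximate McCabe complexity: 1 + number of decision points.

    Nested function definitions are included in the count.
    """
    score = 1
    for node in ast.walk(fn):
        if isinstance(node, BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            score += len(node.values) - 1  # each extra and/or operand
    return score


def max_nesting(node: ast.AST, depth: int = 0) -> int:
    """Deepest chain of nested control-flow blocks under node.

    Approximation: an elif is parsed as a nested if, so it adds a level.
    """
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def violations(source: str, thresholds: dict = THRESHOLDS) -> list:
    """Return (function, metric, value) for every exceeded threshold."""
    found = []
    for fn in ast.walk(ast.parse(source)):
        if not isinstance(fn, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = fn.args
        metrics = {
            "cyclomatic": cyclomatic(fn),
            "nesting": max_nesting(fn),
            "length": fn.end_lineno - fn.lineno + 1,
            "params": len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs),
        }
        found += [(fn.name, k, v) for k, v in metrics.items() if v > thresholds[k]]
    return found
```

Each returned tuple could serve as the seed for a refactoring prompt, with the function source attached as context.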
5. Interactive Debugging Assistant
- Natural Language Queries:
  - "Why did TestUserLogin fail?" → Full call chain localization
  - "Show high-risk modules" → Ranked list with justifications
  - "Explain function ProcessPayment" → Semantic summary + dependencies
- Debugger Integration: Compatible with pdb, gdb, lldb, delve
- Real-Time Navigation: Semantic search across codebase
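Call chain localization can be pictured as a path search over the call graph stored in a snapshot. The sketch below is an assumption about how such a lookup might work, not the assistant's real algorithm: it runs a breadth-first search over a hypothetical `call_graph` mapping (caller name to list of callee names) and returns the shortest chain from a test to a suspect function.

```python
from collections import deque
from typing import Optional


def call_chain(call_graph: dict, start: str, target: str) -> Optional[list]:
    """Shortest caller-to-callee path from start to target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for callee in call_graph.get(path[-1], []):
            if callee not in seen:  # avoid revisiting and cycles
                seen.add(callee)
                queue.append(path + [callee])
    return None


# Hypothetical graph; only TestUserLogin appears in the queries above.
graph = {
    "TestUserLogin": ["Login"],
    "Login": ["LoadUser", "CheckPassword"],
    "CheckPassword": ["HashPassword"],
}
chain = call_chain(graph, "TestUserLogin", "HashPassword")
print(" → ".join(chain))
```

Breadth-first search is used here because it returns the shortest chain, which tends to be the easiest one for a human to review.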
🎉 Latest Updates (Phase 1: Core Analyzer Stabilization)
✅ Production-Ready Analyzers
- Python Parser: Fixed nested async function extraction, added Python 3.10+ match statement support, and enhanced error recovery
- Go Parser: Added Go 1.18+ generics support with type constraints and improved struct tag parsing
- Java Parser: Enhanced parsing of nested annotations and added record class support and lambda expression filtering
🧪 Comprehensive Testing
- 97.5% Test Coverage: 100+ real-world code samples with ground truth validation
- Performance Optimized: Analyze 1000 LOC in <500ms (40% faster than previous version)
- Error Recovery: Robust partial AST parsing on syntax errors
🔧 Enhanced Features
- Semantic Extraction: >95% accuracy against hand-annotated ground truth
- CI Integration: Automated GitHub Actions workflow with coverage reporting
- Type Safety: Full Pydantic model validation for all AST nodes
🚀 Getting Started
Prerequisites
- Python 3.10 or higher
- Git (for repository analysis features)
Installation
Via pip (Recommended)
pip install codesage
From Source
git clone https://github.com/turtacn/CodeSnapAI.git
cd CodeSnapAI
poetry install
Quick Start (CLI)
- Initialize Configuration:
poetry run codesage config init --interactive
- Analyze Your Code:
# Auto-detect languages (Python, Go, Java, Shell)
poetry run codesage scan ./your-project --language auto
- Create a Snapshot:
poetry run codesage snapshot create ./your-project
Docker Usage
You can run CodeSnapAI using Docker without installing dependencies locally.
# Build the image
docker build -t codesage .
# Run a scan
docker run -v "$(pwd)":/workspace codesage scan .
Quick Start
1. Generate Semantic Snapshot
# Analyze a Go microservice project
codesage snapshot ./my-go-service -o snapshot.yaml
# Output: snapshot.yaml (compressed semantic representation)
2. Analyze Architecture
codesage analyze snapshot.yaml
# Output example:
# Project: my-go-service (Go 1.21)
# Total Functions: 342
# High-Risk Modules: 12 (see details below)
# Top Complexity Hotspots:
# - handlers/auth.go::ValidateToken (Cyclomatic: 18, Cognitive: 24)
# - services/payment.go::ProcessRefund (Cyclomatic: 15, Cognitive: 21)
3. Debug Test Failures
codesage debug snapshot.yaml TestUserRegistration
# Output:
# Test Failure Localization:
# Root Cause: handlers/user.go::RegisterUser, Line 45
# Call Chain: RegisterUser → ValidateEmail → CheckDuplicates
# Risk Factors: Missing error handling for database timeout (Line 52)
# Suggested Fix: Wrap db.Query with context.WithTimeout
4. Complexity Governance Workflow
# Scan for complexity violations
codesage scan ./my-go-service --threshold cyclomatic=10 cognitive=15
# Auto-generate refactoring with LLM
codesage govern scan_results.json --llm claude-3-5-sonnet --apply
# Output:
# Detected 8 violations
# Generated 8 refactoring patches
# Validation: 7/8 passed tests (1 requires manual review)
# Applied patches to: handlers/auth.go, services/payment.go, ...
Web Console
CodeSnapAI includes a web-based console for visualizing analysis results, reports, and governance plans.
Launch the Console:
codesage web-console
This will start a local web server (default: http://127.0.0.1:8080) where you can browse the project dashboard, file details, and governance tasks.
Using in CI
You can use the codesage report command to generate reports and enforce CI policies.
# Generate reports
codesage report \
--input /path/to/snapshot.yaml \
--out-json /path/to/report.json \
--out-md /path/to/report.md \
--out-junit /path/to/report.junit.xml
# Enforce CI policy
codesage report \
--input /path/to/snapshot.yaml \
--ci-policy-strict
📊 Usage Examples
Example 1: CI/CD Integration
You can easily integrate CodeSnapAI into your GitHub Actions workflow using our official action.
# .github/workflows/codesnap_audit.yml
name: CodeSnapAI Security Audit
on: [pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
      checks: write
    steps:
      - uses: actions/checkout@v4
      - name: Run CodeSnapAI
        uses: turtacn/CodeSnapAI@main # Replace with tagged version in production
        with:
          target: "."
          language: "python"
