
CodeWise

CodeWise is an AI-powered coding copilot that indexes your entire project (code, configs, docs), understands its structure through AST analysis, and answers questions with a hybrid vector + keyword search engine. It runs in a four-container stack: Next.js UI, FastAPI backend, Python indexer, and secure file server.

Install / Use

/learn @samballington/CodeWise

AI-powered code intelligence platform for understanding large codebases. Combines semantic search, knowledge graphs, and LLM reasoning with production-grade error handling, caching, and observability. Check out the design decisions.

Demo

Check out the live demo! Note: queries on the hosted demo can take up to 2 minutes because of AWS EC2 cost constraints; they run faster locally.

Check out the video demo

🚀 Quick Setup

Get CodeWise running in 5 minutes with Docker. Jump to setup instructions

Screenshots

<div align="center">

[Screenshot: CodeWise Interface Example 1] CodeWise analyzing a complex codebase with semantic search and knowledge graph integration

[Screenshot: CodeWise Interface Example 2]

</div>

What It Does

Hybrid Search: Combines FAISS vector similarity with knowledge graph structure queries. Vector search finds semantically similar code; the knowledge graph provides architectural relationships.

Multi-Language Analysis: Processes 23 programming languages using tree-sitter parsers. Extracts symbols, relationships, and patterns into queryable structures.

Performance: 0.5-15 second response times. 95% accuracy for architectural discovery queries with intelligent error recovery.

Scale: Currently indexing 44,599 code chunks across production deployments with 92% query success rate.

Core Components

Vector Search Engine

  • FAISS index with BGE-large-en-v1.5 embeddings (1024 dimensions)
  • Semantic similarity for finding related code across the codebase
  • Chunk-based indexing preserves context while enabling precise retrieval
  • 4-layer cache system with 85% hit rate reducing API costs proportionally
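
As a rough illustration of what the semantic similarity step computes, here is a minimal pure-Python sketch of a brute-force cosine top-k search. It is a stand-in for the FAISS index, not CodeWise's code: the function names, chunk ids, and 3-dimensional toy vectors are all assumptions made for the example (real BGE-large-en-v1.5 embeddings have 1024 dimensions).

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def normalize(vec: Vector) -> Vector:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def top_k(query: Vector, chunks: Dict[str, Vector], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k chunk ids most similar to the query, best match first."""
    q = normalize(query)
    scored = [
        (chunk_id, sum(a * b for a, b in zip(q, normalize(vec))))
        for chunk_id, vec in chunks.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical chunk ids with toy 3-dimensional embeddings.
CHUNKS = {
    "auth/login.py:0": [0.9, 0.1, 0.0],
    "auth/token.py:0": [0.8, 0.3, 0.1],
    "ui/button.tsx:0": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    for chunk_id, score in top_k([1.0, 0.0, 0.0], CHUNKS, k=2):
        print(f"{chunk_id}  {score:.3f}")
```

A dedicated index library earns its place once the chunk count makes a linear scan in Python too slow; the ranking idea stays the same.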

Knowledge Graph

  • SQLite-based relationship store with VSS extension support
  • Symbol extraction for classes, functions, interfaces, variables
  • Relationship mapping for inheritance, calls, dependencies, imports
  • Structural queries for architecture analysis and dependency graphs
  • Recursive traversal with optimized WAL mode and 64MB cache
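
The recursive traversal described above can be sketched with Python's built-in sqlite3 module and a recursive common table expression. The `relationships` table, its columns, and the function names are hypothetical, chosen for illustration; only the WAL mode and the 64MB cache size come from the list above.

```python
import sqlite3
from typing import List


def open_graph(path: str = ":memory:") -> sqlite3.Connection:
    """Open a relationship store tuned along the lines described above."""
    conn = sqlite3.connect(path)
    # WAL only takes effect for file-backed databases; in-memory ones ignore it.
    conn.execute("PRAGMA journal_mode=WAL")
    # A negative cache_size is measured in KiB, so -65536 is about 64 MiB.
    conn.execute("PRAGMA cache_size=-65536")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS relationships ("
        "source TEXT NOT NULL, target TEXT NOT NULL, kind TEXT NOT NULL)"
    )
    return conn


def transitive_callees(conn: sqlite3.Connection, symbol: str) -> List[str]:
    """Return every symbol reachable from `symbol` through 'calls' edges."""
    rows = conn.execute(
        """
        WITH RECURSIVE reachable(name) AS (
            SELECT target FROM relationships
            WHERE source = ? AND kind = 'calls'
            UNION  -- UNION (not UNION ALL) drops duplicates, so cycles terminate
            SELECT r.target FROM relationships AS r
            JOIN reachable ON r.source = reachable.name
            WHERE r.kind = 'calls'
        )
        SELECT name FROM reachable ORDER BY name
        """,
        (symbol,),
    ).fetchall()
    return [row[0] for row in rows]
```

Usage: insert `(source, target, kind)` rows as symbols are extracted, then call `transitive_callees(conn, "main")` to list everything `main` reaches, directly or indirectly.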

LLM Agent Architecture

  • Function calling with 2 specialized tools (query_codebase, navigate_filesystem)
  • Workflow system for intelligent query routing and mode selection
  • Progressive retry logic with 3-attempt correction prompts
  • Parallel tool execution and streaming responses via WebSocket
  • Multi-level fallback chains ensuring 95% query success rate
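
One way to picture the 3-attempt correction loop is the sketch below. It assumes the model is asked for a JSON object and that each failed attempt is answered with a prompt describing what went wrong; `ask_model`, the prompt wording, and the JSON requirement are assumptions for illustration, not CodeWise's actual prompts.

```python
import json
from typing import Callable, Optional


def call_with_correction(
    ask_model: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
) -> Optional[dict]:
    """Ask for a JSON object, feeding each failure back as a correction prompt."""
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        reply = ask_model(current_prompt)
        try:
            parsed = json.loads(reply)
        except json.JSONDecodeError as exc:
            # Tell the model what was wrong instead of repeating the same prompt.
            current_prompt = (
                f"{prompt}\n\nAttempt {attempt} was not valid JSON ({exc.msg}). "
                "Reply with a single JSON object and nothing else."
            )
            continue
        if isinstance(parsed, dict):
            return parsed
        current_prompt = (
            f"{prompt}\n\nAttempt {attempt} returned a JSON "
            f"{type(parsed).__name__}, but a JSON object is required."
        )
    return None  # caller decides which fallback to use after the final attempt
```

Returning `None` rather than raising keeps the decision about fallbacks with the caller.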

Universal Pattern Recognition

Detects architectural patterns across 23 languages:

  • Dependency injection (Spring @Autowired, Angular services)
  • Interface implementations (Java implements, Go interfaces)
  • Inheritance hierarchies (class extends/inherits patterns)
  • Call relationships (method invocations, function calls)
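
To make the inheritance and call-relationship extraction concrete, here is a small sketch for Python source only. It uses the standard library's `ast` module as a stand-in for tree-sitter, so it illustrates the idea rather than CodeWise's multi-language implementation; the `(source, kind, target)` tuple shape and the function name are assumptions.

```python
import ast
from typing import List, Tuple

Relationship = Tuple[str, str, str]  # (source symbol, kind, target symbol)


def extract_relationships(source: str) -> List[Relationship]:
    """Collect 'inherits' and 'calls' edges from Python source code."""
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            # Only simple base names (class A(B)); dotted bases are skipped.
            for base in node.bases:
                if isinstance(base, ast.Name):
                    found.add((node.name, "inherits", base.id))
        elif isinstance(node, ast.FunctionDef):
            # Record direct calls to plain names made inside this function.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    found.add((node.name, "calls", inner.func.id))
    return sorted(found)


SAMPLE = '''
class Base:
    pass

class Service(Base):
    def run(self):
        return helper()

def helper():
    return load()
'''

if __name__ == "__main__":
    for edge in extract_relationships(SAMPLE):
        print(edge)
```

Edges in this shape could be stored as rows in a relationship table and queried structurally.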

Architecture

graph TB
    subgraph "Client Layer"
        FE[React Frontend<br/>Port 3000]
    end

    subgraph "API Layer"
        BE[FastAPI Backend<br/>WebSocket Server<br/>Port 8000<br/>17 REST Endpoints]
    end

    subgraph "Processing Layer"
        QE[Query Engine<br/>Unified Search]
        PR[Pattern Recognizer<br/>Universal AST]
        AG[LLM Agent<br/>Function Calling]
    end

    subgraph "Storage Layer"
        KG[(SQLite KG<br/>VSS Extension<br/>relationships)]
        VS[(FAISS Vector<br/>BGE embeddings<br/>1024 dims)]
        CC[4-Layer Cache<br/>85% hit rate]
        FS[File System<br/>/workspace]
    end

    subgraph "External"
        LLM[Cerebras API<br/>gpt-oss-120b<br/>65K context]
    end

    FE -.->|WebSocket| BE
    BE --> AG
    AG --> QE
    QE --> VS
    QE --> KG
    QE --> PR
    QE --> CC
    PR --> FS
    AG --> LLM

    classDef storage fill:#e1f5fe
    classDef processing fill:#f3e5f5
    classDef api fill:#e8f5e8
    classDef client fill:#fff3e0
    classDef external fill:#ffebee

    class KG,VS,FS,CC storage
    class QE,PR,AG processing
    class BE api
    class FE client
    class LLM external

Production Features

Error Handling & Resilience

  • Progressive retry with 3-attempt correction prompts
  • Multi-level fallback chains (Vector → BM25, SDK → Adapter, JSON → String)
  • Graceful degradation maintaining 95% query success despite component failures
  • WebSocket state management preventing connection errors
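
A fallback chain such as Vector → BM25 can be sketched as an ordered list of backends, tried until one returns results. Everything named here is hypothetical: the backend names, the deliberately failing vector backend, and the plain substring match that stands in for BM25 scoring.

```python
from typing import Callable, List, Optional, Sequence, Tuple

SearchFn = Callable[[str], List[str]]


def search_with_fallback(
    query: str,
    backends: Sequence[Tuple[str, SearchFn]],
) -> Tuple[Optional[str], List[str]]:
    """Try each backend in order; return the first non-empty result and its name."""
    for name, backend in backends:
        try:
            hits = backend(query)
        except Exception:
            # A failing component degrades to the next backend instead of
            # failing the whole query.
            continue
        if hits:
            return name, hits
    return None, []


# Toy backends for illustration only.
DOCS = ["def login(user): ...", "class TokenStore: ...", "def logout(user): ..."]


def failing_vector_search(query: str) -> List[str]:
    raise RuntimeError("index not loaded")


def keyword_search(query: str) -> List[str]:
    # Substring match standing in for a real BM25 ranking.
    return [doc for doc in DOCS if query.lower() in doc.lower()]


if __name__ == "__main__":
    chain = [("vector", failing_vector_search), ("keyword", keyword_search)]
    print(search_with_fallback("login", chain))
```

Returning the name of the backend that answered makes it possible to log how often each fallback level is used.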

Performance & Caching

  • 4-layer cache system: Discovery (24h TTL) → Embeddings (persistent HDF5) → Chunks (7d) → Metrics
  • 85% cache hit rate reduces API costs and latency proportionally
  • SHA256-based invalidation prevents false cache misses
  • Incremental indexing 80-95% faster than full reindex
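
One plausible reading of SHA256-based invalidation is that cache entries are validated against a digest of the file content rather than a timestamp, so a file that is touched but unchanged still hits the cache. The sketch below assumes that reading; the class name, the in-memory dictionary, and the `embed` callable are illustrative, and persistence (HDF5) and TTLs are left out.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


class ContentHashCache:
    """Cache embeddings per path, invalidated only when the content changes."""

    def __init__(self, embed: Callable[[str], List[float]]) -> None:
        self._embed = embed
        self._entries: Dict[str, Tuple[str, List[float]]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str, content: str) -> List[float]:
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        cached = self._entries.get(path)
        if cached is not None and cached[0] == digest:
            self.hits += 1  # same bytes as last time: reuse the stored vector
            return cached[1]
        self.misses += 1  # new file or changed content: recompute and store
        vector = self._embed(content)
        self._entries[path] = (digest, vector)
        return vector

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The same digest comparison is also a natural basis for incremental indexing: only files whose digest changed need to be re-embedded.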

API & Integration

  • 17 REST endpoints for health checks, project management, cache monitoring
  • WebSocket streaming for real-time LLM responses
  • FastAPI auto-generated docs at /docs and /redoc
  • Pydantic validation on all request/response models

Observability

  • Structured logging with severity-coded emojis for readability
  • Performance metrics tracking tokens/sec, bottleneck detection
  • Cache dashboard at /api/cache/performance with optimization recommendations
  • Request tracing with unique IDs and full conversation state

CI/CD & Quality

  • Automated testing on every PR: pytest, flake8, ESLint, TypeScript checks
  • Docker matrix builds to ghcr.io with version tags
  • Conventional commits validation for clean git history
  • 1,265 type hints across 71 Python files enforced by CI

Security

  • Path traversal prevention with Path.resolve() validation
  • API key masking in logs (csk-xxxx...xxxx format)
  • CORS restrictions to localhost:3000
  • Request timeouts (300s enforcement)
  • Pydantic input validation on all endpoints
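
The first two items can be sketched as follows. The function names and the `/workspace` root are assumptions, and the masking function reflects one reading of the `csk-xxxx...xxxx` format (keep the prefix plus four characters and the last four characters); the actual log format may differ.

```python
from pathlib import Path


def resolve_inside(root: str, requested: str) -> Path:
    """Resolve `requested` under `root`, rejecting anything that escapes it."""
    base = Path(root).resolve()
    # resolve() collapses "..", so the containment check sees the final
    # location rather than the raw string. It also follows symlinks that exist.
    candidate = (base / requested).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"path escapes workspace: {requested!r}")
    return candidate


def mask_api_key(key: str) -> str:
    """Keep the first eight and last four characters, hiding the middle."""
    if len(key) <= 12:
        return "***"  # too short to show any part safely
    return f"{key[:8]}...{key[-4:]}"


if __name__ == "__main__":
    print(resolve_inside("/workspace", "project/src/app.py"))
    print(mask_api_key("csk-abcd1234efgh5678wxyz"))
```

An absolute `requested` path replaces `base` entirely when joined, so it is rejected by the same containment check.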

Architecture Decisions

SQLite over Graph Database

Chose SQLite instead of Neo4j. Zero administration overhead, ACID compliance, sufficient performance at current scale (44K+ chunks). Trade-off: manual relationship traversal vs operational simplicity.

FAISS over Managed Vector DB

Local FAISS with custom caching instead of Pinecone/Weaviate. Full control over embeddings, no API dependencies, predictable costs. Works offline. 4-layer cache achieves 85% hit rate.

Two-Tool Architecture

Consolidated from 6 overlapping tools to 2: query_codebase + navigate_filesystem. Eliminated tool selection paralysis. 87% complexity reduction, 58% accuracy improvement.

WebSocket over REST

Real-time streaming of LLM responses. Persistent connections reduce latency for interactive queries. State management prevents connection errors.

Technical Stack

| Component | Technology | Purpose |
|-----------|------------|---------|
| Backend | FastAPI, Python 3.9+ | WebSocket handling, query orchestration, 17 REST endpoints |
| Vector Store | FAISS, BGE embeddings | Semantic code search with HDF5-backed caching |
| Knowledge Graph | SQLite with VSS extension | Code structure, relationships, recursive queries |
| Pattern Engine | tree-sitter parsers | Multi-language AST analysis (23 languages) |
| LLM Integration | Cerebras SDK | gpt-oss-120b, qwen models with function calling |
| Frontend | React, TypeScript | Query interface with project management |
| Caching | 4-layer system | Discovery, embeddings, chunks, metrics |
| CI/CD | GitHub Actions | pytest, flake8, ESLint, Docker builds |

Performance Numbers

Query Response Times (Production Data):

  • Simple searches: 0.5-2 seconds
  • Architecture analysis: 5-15 seconds
  • Complex queries with fallbacks: 5-12 seconds
  • Diagram generation: 10-25 seconds

Indexing Performance (GPU-accelerated embedding generation):

  • Small projects (<1K files): 30-60 seconds
  • Medium projects (1-5K files): 2-5 minutes
  • Large projects (5K+ files): 10-30 minutes
  • Incremental updates: 80-95% faster than full reindex

Note: Times achieved with GPU acceleration for embedding generation. CPU-only deployment would be significantly slower.

Resource Usage:

  • Memory: 2-4GB during indexing, 1-2GB at rest
  • Storage: 150-500MB per project
  • Query success rate: 95% (with error recovery)
  • Cache hit rate: 85% (4-layer system)

Scale Metrics:

  • Production deployment: 44,599 indexed code chunks
  • Active projects: 2 (bitchat, SWE_Project)
  • Languages supported: 23 via tree-sitter
  • API endpoints: 17 (health, monitoring, management)

Evolution Metrics (Phase 2 → Phase 3):

  • Response time: 15-30s → 0.5-15s (50-97% improvement)
  • Discovery accuracy: 60% → 95% (58% relative improvement, 35 percentage points)
  • Tool complexity: 6 tools → 2 tools (87% reduction)
  • Query success: 60% → 95% (error recovery implemented)
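
For readers who want to check the percentages, the snippet below recomputes them as relative changes from the before and after values listed above. The helper name is an assumption. Note that the raw tool count (6 to 2) works out to roughly a 67% drop, so the 87% figure presumably measures complexity in some other unit that the source does not specify.

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change relative to the starting value."""
    return (after - before) / before * 100.0


# Discovery accuracy: 60% -> 95%
accuracy_gain = relative_change(60, 95)        # about +58.3

# Response time: upper bounds 30s -> 15s, lower bounds 15s -> 0.5s
slowest_case_gain = -relative_change(30, 15)   # 50.0
fastest_case_gain = -relative_change(15, 0.5)  # about 96.7

# Raw tool count: 6 -> 2
tool_count_drop = -relative_change(6, 2)       # about 66.7

if __name__ == "__main__":
    print(f"accuracy: +{accuracy_gain:.1f}%")
    print(f"response time: {slowest_case_gain:.1f}% to {fastest_case_gain:.1f}% faster")
    print(f"tool count: -{tool_count_drop:.1f}%")
```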

Supported Languages

Pattern recognition works across 23 languages, including Python, Java, JavaScript, TypeScript, Go, Rust, Swift, C#, PHP, Ruby, Kotlin, Dart, and C, with the rest provided by the tree-sitter language pack.

API Endpoints

CodeWise provides 17 REST endpoints for comprehensive system control:

Health & Monitoring:

  • GET /health - System health check
  • GET /api/provider/health - LLM provider health
  • GET /api/kg/status - Knowledge Graph status
  • GET /indexer/status - Vector index readiness

Project Management:

  • GET /projects/ - List all projects
  • GET /projects/{name}/tree - File tree exploration
  • GET /projects/{name}/file - File content retrieval
  • GET /projects/{name}/summary - AI-generated summary
  • POST /projects/clone - Clone GitHub repositories
  • POST /projects/import - Import local files
  • `DELETE /projects/{na