🧠 RepoMind

Codebase Cognitive Modeling System

Transform repositories into interactive knowledge graphs with AI-powered insights

Features • Quick Start • Architecture • Documentation • Contributing


🎯 Overview

RepoMind is an open-source, research-oriented platform that analyzes software repositories to build a "cognitive model" of a codebase. It combines static analysis, graph theory, and optional LLM reasoning to help developers understand complex codebases through interactive visualizations and actionable insights.

Why RepoMind?

🔍 Deep Analysis: AST parsing, module graphs, call graphs, and domain entity extraction
📊 Rich Metrics: Fan-in/out, cyclomatic complexity, centrality, and coupling analysis
🤖 AI-Powered: Optional LLM integration for architecture inference and tech debt detection
🎨 Beautiful UI: Dark-mode interface with interactive force-directed graphs
🔬 Research-Grade: Built for code comprehension studies and architecture recovery research

✨ Features

🔬 Static Analysis

Repository Scanner: Intelligently discovers source files while respecting .gitignore
AST Parser: Extracts imports, functions, classes, and decorators
Module Graph: Visualizes import dependencies across your codebase
Call Graph: Maps function-level relationships and invocations
Domain Extraction: Automatically detects Pydantic models, dataclasses, and ORM entities
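
To make the AST Parser and Domain Extraction items concrete, here is a minimal sketch using only Python's standard `ast` module. It is not RepoMind's actual parser; the `summarize` function, the `SAMPLE` source, and the dictionary layout are assumptions made for illustration.

```python
import ast

# Hypothetical source file to analyze.
SAMPLE = '''
import os
from dataclasses import dataclass

@dataclass
class User:
    name: str

def greet(user):
    return "hi " + user.name
'''


def summarize(source: str) -> dict:
    """Collect imports, functions, classes, and decorator names from source."""
    tree = ast.parse(source)
    summary = {"imports": [], "functions": [], "classes": [], "decorators": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            summary["imports"] += [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            summary["imports"].append(node.module or "")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            summary["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            summary["classes"].append(node.name)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # ast.unparse (Python 3.9+) renders each decorator expression as text.
            summary["decorators"] += [ast.unparse(d) for d in node.decorator_list]
    return summary


result = summarize(SAMPLE)
print(result)
```

A class whose decorator list contains `dataclass` is a simple candidate for a domain entity; detecting Pydantic models or ORM entities would additionally require inspecting base classes.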

📈 Metrics & Insights

Coupling Analysis: Fan-in/fan-out metrics for dependency tracking
Complexity Metrics: Cyclomatic complexity and lines of code
Centrality Analysis: Graph-based importance scoring
Hotspot Detection: Identifies critical modules and potential bottlenecks
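
The coupling and complexity metrics above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not RepoMind's implementation: the `IMPORTS` graph is made up, and the complexity function is a common approximation (one plus the number of decision points) rather than a full control-flow analysis. The project lists NetworkX in its stack, which could supply the graph degrees instead of the hand-rolled counting shown here.

```python
import ast
from collections import defaultdict

# Hypothetical module import graph: module -> modules it imports.
IMPORTS = {
    "api": ["core", "models"],
    "core": ["models"],
    "cli": ["core"],
    "models": [],
}


def fan_metrics(graph: dict) -> dict:
    """Return {module: (fan_in, fan_out)} for a dependency graph."""
    fan_in = defaultdict(int)
    for deps in graph.values():
        for dep in deps:
            fan_in[dep] += 1  # one more module depends on `dep`
    return {module: (fan_in[module], len(deps)) for module, deps in graph.items()}


# Node types counted as one decision point each (an approximation).
BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + number of decision points."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            score += len(node.values) - 1
    return score


SNIPPET = '''
def classify(n):
    if n < 0 and n != -1:
        return "negative"
    for i in range(n):
        if i % 2:
            return "odd-seen"
    return "done"
'''

print(fan_metrics(IMPORTS))
print(cyclomatic_complexity(SNIPPET))
```

In this toy graph, `core` has the highest combined fan-in and fan-out, which is the kind of signal a hotspot detector could rank on.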

🤖 LLM-Powered Analysis (Optional)

Architecture Inference: Detects patterns (MVC, microservices, layered, etc.)
Design Intent: Understands the purpose and responsibility of modules
Tech Debt Detection: Identifies code smells, violations, and improvement areas
Invariant Discovery: Extracts business rules and contracts from code

🎨 Interactive Visualization

Force-Directed Graphs: Explore module and call dependencies
Metrics Dashboard: Real-time analytics and trend visualization
Tech Debt Reports: Prioritized actionable recommendations
Module Explorer: Drill down into individual components

🚀 Quick Start

Prerequisites

Python 3.10 or higher
Node.js 16+ and npm/yarn
(Optional) OpenRouter API key for LLM features

Installation

1️⃣ Clone the Repository

git clone https://github.com/Adil-Ijaz7/RepoMind.git
cd RepoMind

2️⃣ Backend Setup

cd backend

# Install dependencies
pip install -r requirements.txt

# Configure environment (optional for LLM)
cp .env.example .env
# Edit .env and add your OPENROUTER_API_KEY

3️⃣ Frontend Setup

cd ../frontend   # step 2 left you in backend/; use "cd frontend" from the repository root

# Install dependencies
npm install
# or
yarn install

Running RepoMind

Start the Backend Server

cd backend
uvicorn server:app --host 0.0.0.0 --port 8001 --reload

The API will be available at http://localhost:8001

Start the Frontend (in a second terminal, from the repository root)

cd frontend
npm start
# or
yarn start

The UI will be available at http://localhost:3000

Analyze a Repository

cd backend
python -m repomind.cli analyze ./examples/sample_repo

# Without LLM (static analysis only)
python -m repomind.cli analyze ./path/to/repo --no-llm

🏗️ Architecture

RepoMind/
├── backend/                    # FastAPI Backend
│   ├── repomind/
│   │   ├── core/              # Static analysis engine
│   │   │   ├── scanner.py     # Repository discovery
│   │   │   ├── parser.py      # AST parsing
│   │   │   └── graph_builder.py # Dependency graphs
│   │   ├── llm/               # LLM integration
│   │   │   ├── client.py      # OpenRouter client
│   │   │   └── prompts.py     # Analysis prompts
│   │   ├── analysis/          # High-level analysis
│   │   │   ├── architecture.py # Pattern detection
│   │   │   └── tech_debt.py   # Debt analysis
│   │   ├── api/               # REST API routes
│   │   ├── models/            # Data models
│   │   └── reports/           # Report generators
│   ├── artifacts/             # Generated analysis results
│   ├── examples/              # Sample repositories
│   └── requirements.txt
│
├── frontend/                   # React Frontend
│   ├── src/
│   │   ├── components/        # Reusable UI components
│   │   ├── pages/             # Page components
│   │   │   ├── Dashboard.js   # Main visualization
│   │   │   ├── ModuleView.js  # Module explorer
│   │   │   └── TechDebt.js    # Debt dashboard
│   │   └── api/               # API client
│   └── package.json
│
└── design_guidelines.json      # UI/UX specifications

Technology Stack

Backend

FastAPI: High-performance async web framework
NetworkX: Graph analysis and algorithms
Pandas/NumPy: Data processing and metrics computation
Motor/PyMongo: MongoDB integration (optional)
Rich/Typer: Beautiful CLI interface

Frontend

React 19: UI library with hooks
Radix UI: Accessible component primitives
Tailwind CSS: Utility-first styling
Framer Motion: Smooth animations
react-force-graph-2d: Interactive graph visualization
Recharts: Metrics charting
Lucide React: Icon library

📖 Documentation

Configuration

Environment Variables

Create a .env file in the backend directory:

# LLM Configuration (Optional)
OPENROUTER_API_KEY=your-api-key-here
OPENROUTER_MODEL=qwen/qwen-2.5-coder-32b-instruct

# API Configuration
CORS_ORIGINS=http://localhost:3000
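
As a rough illustration of how a backend might consume these variables, the sketch below reads them from the environment and treats a missing API key as "LLM disabled". The `Settings` class, the `load_settings` function, the fallback defaults, and the comma-separated handling of `CORS_ORIGINS` are all assumptions; RepoMind's real configuration code may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class Settings:
    """Illustrative settings container (not RepoMind's actual class)."""
    api_key: Optional[str]
    model: str
    cors_origins: Tuple[str, ...]

    @property
    def llm_enabled(self) -> bool:
        # Without a key, fall back to static analysis only.
        return self.api_key is not None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables; defaults here are assumptions."""
    origins = env.get("CORS_ORIGINS", "http://localhost:3000")
    return Settings(
        api_key=env.get("OPENROUTER_API_KEY") or None,
        model=env.get("OPENROUTER_MODEL", "qwen/qwen-2.5-coder-32b-instruct"),
        # Assumption: multiple origins are separated by commas.
        cors_origins=tuple(o.strip() for o in origins.split(",") if o.strip()),
    )


settings = load_settings({"CORS_ORIGINS": "http://localhost:3000"})
print(settings.llm_enabled)
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.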

API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| /api/ | GET | API health check |
| /api/repo/summary | GET | Architecture summary |
| /api/repo/metrics | GET | Repository-wide metrics |
| /api/modules | GET | List all analyzed modules |
| /api/modules/{name} | GET | Detailed module information |
| /api/graph/modules | GET | Module dependency graph data |
| /api/graph/calls | GET | Function call graph data |
| /api/tech-debt | GET | Tech debt items and recommendations |

Full API documentation available at http://localhost:8001/docs
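
The endpoints above can be queried from any HTTP client. The following is a small standard-library sketch; the helper names `endpoint` and `get_json` are hypothetical, and the shape of the returned JSON is not documented here, so the code only decodes it without assuming any fields.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "http://localhost:8001"  # backend address from the Quick Start


def endpoint(path: str, base: str = BASE_URL) -> str:
    """Build an absolute URL for an API path such as '/api/repo/metrics'."""
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))


def get_json(path: str, timeout: float = 10.0):
    """GET an endpoint and decode the JSON body (requires a running backend)."""
    with urlopen(endpoint(path), timeout=timeout) as response:
        return json.load(response)


print(endpoint("/api/repo/metrics"))
```

With the backend running, `get_json("/api/repo/metrics")` would return the decoded response for the metrics endpoint.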

CLI Usage

# Basic analysis
python -m repomind.cli analyze <repo_path>

# Static analysis only (no LLM)
python -m repomind.cli analyze <repo_path> --no-llm

# Custom output directory
python -m repomind.cli analyze <repo_path> --output ./custom-output

# Verbose logging
python -m repomind.cli analyze <repo_path> --verbose

🎨 Design Philosophy

RepoMind follows a "Control Room" design language:

Dark Mode Only: Deep blacks (#050506) with red accents (#EF233C)
Information Density: High-density interfaces for expert users
Precision UI: Sharp edges, minimal rounded corners
Data as Art: Graphs and visualizations take center stage
Technical Language: No "friendly" copy, precision-first communication

See design_guidelines.json for complete specifications.

🔬 Research Applications

RepoMind is designed as a research artifact for:

Code Comprehension Studies: Understanding how developers navigate codebases
Architecture Recovery: Automated detection of architectural patterns
Technical Debt Quantification: Measuring and tracking code quality over time
LLM-Augmented Analysis: Exploring the effectiveness of AI in program analysis

Evaluation Capabilities

Ground-truth annotation comparison
Architecture style accuracy metrics
Tech debt detection precision/recall
Performance benchmarking on large codebases
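
For the precision/recall item, the computation against a ground-truth annotation set can be sketched as below. The file names are invented for illustration, and `precision_recall` is a hypothetical helper, not part of RepoMind's published API.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], ground_truth: Set[str]) -> Tuple[float, float]:
    """Compare flagged items against annotated ground truth."""
    true_positives = len(predicted & ground_truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical data: modules flagged as tech debt vs. modules annotated by reviewers.
flagged = {"core/parser.py", "api/routes.py", "llm/client.py"}
annotated = {"core/parser.py", "llm/client.py", "core/scanner.py", "models/user.py"}

precision, recall = precision_recall(flagged, annotated)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three flagged modules are correct (precision 2/3) and two of the four annotated modules were found (recall 1/2).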

🛠️ Development

Running Tests

# Backend tests
cd backend
pytest

# Frontend tests
cd ../frontend
npm test

Code Quality

# Backend linting
cd backend
flake8 repomind/
black repomind/
mypy repomind/

# Frontend linting
cd ../frontend
npm run lint

Project Structure Conventions

Modular Design: Each analysis component is independent
Type Safety: Full type hints in Python, PropTypes in React
Error Handling: Graceful degradation when LLM unavailable
Performance: Async operations, lazy loading, caching

🤝 Contributing

We welcome contributions! Please follow these guidelines:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Contribution Areas

🌐 Multi-language support (JavaScript, Java, Go)
🔄 Incremental analysis for changed files
📊 Additional metrics and visualizations
🤖 Custom LLM backend support (Ollama, local models)
🧪 Test coverage improvements
📚 Documentation enhancements

⚠️ Limitations

Language Support: Currently Python-only (extensible architecture)
LLM Dependency: AI features require external API (or can be disabled)
Call Graph: Best-effort analysis (dynamic dispatch not fully tracked)
Scale: Large repositories (>100k LOC) may require batched processing
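
The call-graph limitation can be seen in a small example. The sketch below extracts only calls whose target is a plain name, which is one simple static strategy and not necessarily the one RepoMind uses; `direct_calls` and the `SOURCE` snippet are illustrative assumptions.

```python
import ast

# Hypothetical code under analysis; it is parsed, never executed.
SOURCE = '''
def save(x): ...
def load(x): ...

def run(action, x):
    save(x)                       # direct call: target is known statically
    getattr(handlers, action)(x)  # dynamic dispatch: target chosen at runtime
'''


def direct_calls(source: str, func_name: str) -> list:
    """Return sorted names called directly inside the given function."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            return sorted({
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            })
    return []


print(direct_calls(SOURCE, "run"))  # ['getattr', 'save']
```

The analysis sees `save` and the built-in `getattr`, but not whichever handler `getattr` resolves to, so that edge is missing from a purely static call graph.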

🗺️ Roadmap

[ ] Multi-language support (TypeScript, Java, Go)
[ ] Historical trend analysis and codebase evolution t
