
Reflyx

Reflyx — a free, open-source AI coding assistant for VS Code. Works like Augment, using only free resources. Runs fully offline with local LLMs (Ollama, LM Studio, DeepSeek, Qwen) or online with GPT-4o/Claude. Indexes your entire codebase with Tree-sitter + embeddings + vector DB. Private, fast, and auto-syncs as you code.

Install / Use

/learn @njrgourav11/Reflyx
About this skill

Quality Score: 0/100

Supported Platforms

  • Claude Code
  • Claude Desktop
  • GitHub Copilot
README

🤖 AI Coding Assistant - Enhanced with Dual-Mode AI Processing

A comprehensive, production-ready AI coding assistant that matches and exceeds the functionality of Augment Code, Cursor, and Windsurf. Features dual-mode operation with seamless switching between local privacy and cloud-powered performance, advanced semantic search, intelligent code generation, and contextual explanations.

🌟 NEW: Augment Code-Level Features

  • 🔄 Dual-Mode AI Processing: Seamlessly switch between local (Ollama) and online (GPT-4o, Claude-3.5-Sonnet, Gemini Pro) AI models
  • 🔐 Secure API Key Management: Built-in secure storage using VS Code's SecretStorage API
  • ⚡ Ultra-Fast Inference: Groq integration with 500+ tokens/second processing
  • 🎯 Smart Provider Selection: Automatic fallback and intelligent routing
  • 📱 Enhanced UI: Context-aware chat, inline suggestions, and real-time streaming
  • 🔧 Advanced Configuration: Comprehensive settings panel with provider management

🚀 Core Features

🤖 Dual-Mode AI Processing

  • Local Mode: Complete privacy with Ollama (CodeLlama, DeepSeek-Coder, Qwen2.5-Coder)
  • Online Mode: Access to latest models (GPT-4o, Claude-3.5-Sonnet, Gemini Pro, Groq)
  • Hybrid Mode: Intelligent routing between local and cloud for optimal performance
  • Smart Fallback: Automatic provider switching if primary fails
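The smart-fallback behavior above can be sketched as a simple provider chain. This is a minimal sketch only; the provider names and the `ask_with_fallback` interface are illustrative, not the extension's actual API:

```python
# Minimal sketch of smart-fallback routing across AI providers.
# Provider names and the call interface are illustrative only.

class ProviderError(Exception):
    pass

def ask_with_fallback(prompt, providers):
    """Try each provider in order; return the first successful answer."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append((name, str(exc)))  # record failure, fall through
    raise RuntimeError(f"all providers failed: {errors}")

# Example: a local provider that is "down" and a cloud fallback.
def local_ollama(prompt):
    raise ProviderError("connection refused on localhost:11434")

def cloud_groq(prompt):
    return f"(groq) answer to: {prompt}"

name, answer = ask_with_fallback("explain this function", [
    ("ollama", local_ollama),
    ("groq", cloud_groq),
])
```

The same chain structure supports Hybrid mode: put the local provider first for privacy, cloud providers after it for resilience.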

🔍 Advanced Code Intelligence

  • Semantic Code Search: Query your entire codebase using natural language
  • Context-Aware Explanations: Detailed code explanations with surrounding context
  • Intelligent Code Generation: Generate production-ready code from prompts
  • Smart Refactoring: AI-powered refactoring suggestions with examples
  • Pattern Detection: Find similar code patterns and potential duplications
  • Real-time Indexing: Automatic re-indexing when files change
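Semantic search of this kind boils down to comparing embedding vectors. A minimal cosine-similarity ranking sketch follows; the toy 3-dimensional vectors stand in for real 384-dimensional all-MiniLM-L6-v2 embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy index: chunk id -> embedding (real embeddings come from MiniLM).
index = {
    "auth.py:login":    [0.9, 0.1, 0.0],
    "db.py:connect":    [0.1, 0.9, 0.1],
    "utils.py:slugify": [0.0, 0.1, 0.9],
}

def search(query_vec, k=2):
    """Return the k chunk ids most similar to the query embedding."""
    ranked = sorted(index.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# A query embedding close to the "auth" chunk should rank it first.
results = search([0.8, 0.2, 0.1])
```

In the real pipeline the query text is embedded with the same model as the indexed chunks, and the nearest-neighbor search runs inside Qdrant rather than in Python.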

🎯 Enhanced User Experience

  • Inline Code Suggestions: Real-time AI suggestions as you type
  • Streaming Responses: See AI responses as they're generated
  • Context-Aware Chat: Persistent chat with full codebase context
  • Quick Actions: Right-click context menu for instant AI help
  • Status Indicators: Real-time status of AI providers and indexing progress

🤖 Supported AI Providers

🏠 Local Providers (Free & Private)

| Provider | Models | Context | Speed | Privacy |
|----------|--------|---------|-------|---------|
| Ollama | CodeLlama 7B/13B/34B<br>DeepSeek-Coder 6.7B<br>Qwen2.5-Coder 7B | 16K-32K | Hardware-dependent | 🟢 Complete |

☁️ Online Providers (Cloud APIs)

| Provider | Models | Context | Speed | Free Tier |
|----------|--------|---------|-------|-----------|
| OpenAI | GPT-4o, GPT-4 Turbo | 128K | Fast | $5 credit |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus | 200K | Fast | Limited |
| Google AI | Gemini 1.5 Pro, Gemini 1.5 Flash | 2M | Medium | Generous |
| Groq | Llama 3.1 70B, Mixtral 8x7B | 131K | Ultra-fast | 14.4K req/day |
| Together AI | Llama 3 70B, CodeLlama 34B | 8K-16K | Fast | $25 credit |

🎯 Quick Setup Recommendations

  • Privacy-First: Use Local mode with Ollama
  • Speed-First: Use Groq (500+ tokens/second, generous free tier)
  • Quality-First: Use GPT-4o or Claude 3.5 Sonnet
  • Budget-First: Use Hybrid mode (local + Groq fallback)
  • Enterprise: Use Local mode with larger models (13B/34B)

🏗️ Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   VS Code       │    │   FastAPI        │    │   Qdrant        │
│   Extension     │◄──►│   Backend        │◄──►│   Vector DB     │
│   (TypeScript)  │    │   (Python)       │    │   (Self-hosted) │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                │
                       ┌────────▼────────┐
                       │   Ollama        │
                       │   Local LLMs    │
                       │   (Free Models) │
                       └─────────────────┘
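The request path through these components can be sketched end to end. Every function below is a stub standing in for the real extension → backend → Qdrant → Ollama hops, which happen over HTTP:

```python
# End-to-end sketch of an "ask the codebase" request. All stubs.

def embed(text):
    """Stub for the Sentence-Transformers embedding step."""
    return [float(len(text) % 7), 1.0]

def vector_search(query_vec, k=3):
    """Stub for a Qdrant similarity query; returns matching chunks."""
    return ["auth.py: def login(...)", "auth.py: def logout(...)"]

def generate(prompt):
    """Stub for the Ollama / cloud LLM completion call."""
    return f"Answer based on {prompt.count('CHUNK')} retrieved chunks."

def ask_codebase(question):
    """Retrieve relevant code, then ask the LLM with that context."""
    chunks = vector_search(embed(question))
    prompt = question + "".join(f"\nCHUNK: {c}" for c in chunks)
    return generate(prompt)

answer = ask_codebase("Where is user authentication handled?")
```

This retrieve-then-generate shape is what keeps answers grounded in your own codebase rather than the model's training data.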

🛠️ Tech Stack (100% Free & Open Source)

Core Components

  • Frontend: VS Code Extension (TypeScript)
  • Backend: Python FastAPI
  • Database: Qdrant Vector Database (self-hosted)
  • AI Models: Ollama (CodeLlama, DeepSeek-Coder, Qwen2.5-Coder)
  • Code Parsing: Tree-sitter
  • Embeddings: Sentence-Transformers (all-MiniLM-L6-v2)
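Before embedding, source files are split into chunks. A minimal line-based chunker sketch follows; the real indexer uses Tree-sitter to cut at AST boundaries, so this size-only version is an approximation, and the 500-character limit mirrors the `maxChunkSize` setting shown later in this README:

```python
def chunk_source(text, max_chars=500):
    """Split source text into chunks of at most max_chars,
    breaking only at line boundaries (never mid-line)."""
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks

code = "def f():\n    return 1\n" * 40  # ~880 chars of toy source
chunks = chunk_source(code, max_chars=500)
```

Smaller chunks retrieve more precisely; larger chunks give the LLM more surrounding context per hit.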

Free Cloud Alternatives (Optional)

  • Vector DB: Pinecone (free tier), Weaviate Cloud
  • LLM APIs: Groq (free tier), Together AI, Google Gemini Pro
  • Hosting: Railway, Render, Fly.io (all have free tiers)

📋 Prerequisites

Minimum Requirements

  • RAM: 8GB (16GB recommended)
  • CPU: 4-core (8-core recommended)
  • Storage: 5GB free space (10GB recommended)
  • OS: Windows 10+, macOS 10.15+, or Linux

Required Software

  • VS Code (for the extension)
  • Python 3 (backend and setup script)
  • Node.js with npm (to build the extension)
  • Docker with Docker Compose (Qdrant and services)
  • Git (to clone the repository)
  • Ollama (optional, for local mode)

🚀 Quick Start (3 Steps)

1. Automated Setup

# Clone and setup everything automatically
git clone https://github.com/njrgourav11/Reflyx.git
cd Reflyx
python setup.py  # Installs everything: Docker, models, dependencies

# Start all services
docker-compose up -d
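For reference, a minimal docker-compose.yml for the two self-hosted services might look like the sketch below. Service names, ports, the backend build path, and the host-gateway Ollama URL are assumptions; check the repository's actual compose file:

```yaml
version: "3.8"
services:
  backend:
    build: ./server          # FastAPI app (assumed path)
    ports:
      - "8000:8000"
    environment:
      - QDRANT_URL=http://qdrant:6333
      # Ollama typically runs on the host, not in a container (assumption)
      - OLLAMA_URL=http://host.docker.internal:11434
    depends_on:
      - qdrant
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"
    volumes:
      - qdrant_data:/qdrant/storage

volumes:
  qdrant_data:
```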

2. Install VS Code Extension

# Build and package the extension
cd extension
npm install && npm run compile && npm run package

# Install the .vsix file in VS Code:
# 1. Open VS Code
# 2. Ctrl+Shift+P → "Extensions: Install from VSIX"
# 3. Select the generated .vsix file
# 4. Restart VS Code

3. Configure AI Providers

# Option A: Local Only (Complete Privacy)
# 1. Install Ollama: https://ollama.ai
# 2. Pull models: ollama pull codellama:7b-code
# 3. In VS Code: Ctrl+Shift+, → Set mode to "Local"

# Option B: Online + Local (Best Performance)
# 1. Get free API keys (see API Keys Guide)
# 2. In VS Code: Ctrl+Shift+, → Configure providers
# 3. Set mode to "Hybrid" for best of both worlds

🎯 First Steps After Installation

Immediate Setup (2 minutes)

  1. Open VS Code in your project directory
  2. Index your codebase: Ctrl+Shift+P → "AI Coding Assistant: Index Workspace"
  3. Open settings: Ctrl+Shift+, to configure AI providers
  4. Start chatting: Ctrl+Shift+C to open the AI chat panel

Test Your Installation

# Quick health check
make health

# Or manually test each service:
curl http://localhost:8000/api/v1/health  # Backend
curl http://localhost:6333/health         # Vector DB
curl http://localhost:11434/api/tags      # Ollama (if using local)
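Ollama's `/api/tags` endpoint returns JSON listing the installed models. A small helper (hypothetical, shown here against a sample payload rather than a live server) can verify that the models the assistant needs are actually pulled:

```python
import json

def missing_models(tags_json, required):
    """Return required model names absent from an Ollama /api/tags payload."""
    installed = {m["name"] for m in json.loads(tags_json).get("models", [])}
    return [name for name in required if name not in installed]

# Sample payload shaped like Ollama's /api/tags response.
sample = json.dumps({"models": [
    {"name": "codellama:7b-code"},
    {"name": "qwen2.5-coder:7b"},
]})

missing = missing_models(sample, ["codellama:7b-code", "deepseek-coder:6.7b"])
```

Any name that comes back in `missing` can be fetched with `ollama pull <name>`.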

Try These Example Queries

  • "Where is user authentication handled in this codebase?"
  • "Show me all database connection functions"
  • "Find error handling patterns"
  • Select code → Right-click → "Explain Selection"
  • Ctrl+Shift+G → "Create a REST API endpoint for user login"

Alternatively, start each service individually instead of using Docker Compose:

Backend server

cd server && python -m uvicorn app.main:app --reload

Qdrant vector database

docker run -p 6333:6333 qdrant/qdrant

Ollama (install from https://ollama.ai)

ollama pull codellama:7b-code


Install VS Code Extension

cd extension
npm install
npm run compile
# Install the .vsix package in VS Code

Configure Workspace

  1. Open VS Code in your project directory
  2. Run command: AI Coding Assistant: Index Workspace
  3. Start chatting with your codebase!

📖 Usage Guide

Basic Commands

  • Ctrl+Shift+P → AI Coding Assistant: Ask Codebase
  • Ctrl+Shift+P → AI Coding Assistant: Explain Selection
  • Ctrl+Shift+P → AI Coding Assistant: Generate Code
  • Ctrl+Shift+P → AI Coding Assistant: Find Similar

Chat Interface

  • Open the AI Assistant sidebar panel
  • Type natural language queries about your code
  • Get contextual responses with code references

Example Queries

"Where is user authentication handled?"
"Explain this function and its dependencies"
"Generate a REST API endpoint for user login"
"Find all database connection functions"
"Suggest refactoring for this class"

⚙️ Configuration

VS Code Settings

{
  "aiCodingAssistant.modelProvider": "ollama",
  "aiCodingAssistant.embeddingModel": "all-MiniLM-L6-v2",
  "aiCodingAssistant.maxChunkSize": 500,
  "aiCodingAssistant.retrievalCount": 10,
  "aiCodingAssistant.ignorePatterns": [
    "node_modules/**",
    ".git/**",
    "*.min.js",
    "dist/**"
  ]
}
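The `ignorePatterns` globs above can be applied during indexing with something as simple as the stdlib `fnmatch`. This is a sketch; note that `fnmatch`'s `*` crosses `/`, which is looser than gitignore-style matching:

```python
from fnmatch import fnmatch

IGNORE_PATTERNS = ["node_modules/**", ".git/**", "*.min.js", "dist/**"]

def is_ignored(path, patterns=IGNORE_PATTERNS):
    """True if the workspace-relative path matches any ignore glob."""
    return any(fnmatch(path, p) for p in patterns)

skipped = is_ignored("node_modules/lodash/index.js")
indexed = not is_ignored("src/app.ts")
minified = is_ignored("assets/app.min.js")
```

Skipping dependency trees and build artifacts keeps the index small and the retrieved context relevant.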

Environment Variables

# Backend Configuration
QDRANT_URL=http://localhost:6333
OLLAMA_URL=http://localhost:11434
EMBEDDING_MODEL=all-MiniLM-L6-v2

# Optional Cloud APIs (Free Tiers)
OPENAI_API_KEY=your_key_here
GROQ_API_KEY=your_key_here
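The backend can read these variables with the stdlib `os.environ`, falling back to the local-stack defaults. This is a minimal sketch of the settings shown above; the actual backend may use a settings library such as pydantic instead:

```python
import os

def load_config(env=os.environ):
    """Collect backend settings, defaulting to the local-stack URLs."""
    return {
        "qdrant_url": env.get("QDRANT_URL", "http://localhost:6333"),
        "ollama_url": env.get("OLLAMA_URL", "http://localhost:11434"),
        "embedding_model": env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        # Cloud keys are optional; None means the provider stays disabled.
        "openai_api_key": env.get("OPENAI_API_KEY"),
        "groq_api_key": env.get("GROQ_API_KEY"),
    }

cfg = load_config(env={"QDRANT_URL": "http://qdrant:6333"})
```

Passing `env` explicitly makes the loader easy to test without touching the real environment.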

🧪 Supported Languages

  • Primary: TypeScript, JavaScript, Python, Java
  • Additional: C++, Rust, Go, C#, PHP, Ruby
  • Extensible: Easy to add new languages via Tree-sitter grammars

📊 Performance Metrics

Indexing Performance

  • Speed: ~1000 files/minute (4-core CPU)
  • Memory: <2GB during indexing
  • Storage: ~100MB per 10K files

Query Performance

  • Simple Queries: <3 seconds
  • Code Generation: <8 seconds
  • Context Window: Up to 32K tokens

🔧 Development

Project Structure

ai-coding-assistant/
├── extension/          # VS Code extension (TypeScript)
├── server/            # FastAPI backend (Python)
├── indexer/           # Code parsing & embedding
├── docker-compose.yml # Local development setup
└── docs/             # Documentation

Running in Development Mode

# Backend with hot reload
cd server && python -m uvicorn app.main:app --reload

# Extension development
cd extension && npm run watch

# Vector database
docker run -p 6333:6333 qdrant/qdrant

🐛 Troubleshooting

Common Issues

1. Ollama Connection Failed

# Check if Ollama is running
curl http://localhost:11434/api/tags
View on GitHub

Category: Development
GitHub Stars: 6 · Forks: 1
Updated: 6 months ago
Languages: Python

Security Score: 77/100 (audited on Sep 25, 2025; no findings)
No findings