FalconEYE
AI-powered security code analyzer using local LLMs for semantic vulnerability detection. Unlike traditional SAST tools, FalconEYE reasons about code contextually rather than matching patterns. Supports Python, JavaScript, TypeScript, Go, Rust, C/C++, Java, and more.
███████╗ █████╗ ██╗ ██████╗ ██████╗ ███╗ ██╗███████╗██╗ ██╗███████╗
██╔════╝██╔══██╗██║ ██╔════╝██╔═══██╗████╗ ██║██╔════╝╚██╗ ██╔╝██╔════╝
█████╗ ███████║██║ ██║ ██║ ██║██╔██╗ ██║█████╗ ╚████╔╝ █████╗
██╔══╝ ██╔══██║██║ ██║ ██║ ██║██║╚██╗██║██╔══╝ ╚██╔╝ ██╔══╝
██║ ██║ ██║███████╗╚██████╗╚██████╔╝██║ ╚████║███████╗ ██║ ███████╗
╚═╝ ╚═╝ ╚═╝╚══════╝ ╚═════╝ ╚═════╝ ╚═╝ ╚═══╝╚══════╝ ╚═╝ ╚══════╝
Next-Generation Security Code Analysis Powered by Local LLMs
by hardw00t & h4ckologic
FalconEYE represents a paradigm shift in static code analysis. Instead of relying on predefined vulnerability patterns, it leverages large language models to reason about your code the same way a security expert would — understanding context, intent, and subtle security implications that traditional tools miss.
Table of Contents
- Why FalconEYE?
- How It Works
- Getting Started
- MLX Backend (Apple Silicon)
- Supported Languages
- CLI Reference
- Output Formats
- Configuration
- Architecture
- Development
- FAQ
- Discoveries
- License
Why FalconEYE?
Traditional security scanners are limited by their pattern databases. They can only find what they've been programmed to look for. FalconEYE is different:
- No Pattern Matching — Uses pure AI reasoning to understand your code semantically
- Context-Aware Analysis — Retrieval-Augmented Generation (RAG) provides relevant code context for deeper insights
- Novel Vulnerability Detection — Identifies security issues that don't match known patterns
- LLM-Powered Enrichment — Every finding gets AI-generated descriptions, mitigations, code snippets, and line numbers
- Reduced False Positives — Optional AI validation pass filters noise from false alarms
- Rich Reporting — Console, JSON, HTML, and SARIF output with interactive dashboards
- Smart Re-indexing — Incremental analysis means re-scans only process changed files
- Privacy-First — Runs entirely locally with Ollama or MLX — your code never leaves your machine
- Apple Silicon Accelerated — Native MLX backend delivers 20-40% faster inference on M-series chips
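The smart re-indexing bullet above can be sketched as a content-hash cache: a re-scan only touches files whose hash changed since the last run. Everything here (the cache filename, function names) is a hypothetical illustration, not FalconEYE's actual implementation:

```python
import hashlib
import json
from pathlib import Path

CACHE = Path(".falconeye_cache.json")  # hypothetical cache location

def file_hash(path: Path) -> str:
    """Content hash used to detect changed files between scans."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(root: Path) -> list[Path]:
    """Return only the files whose hash differs from the last scan."""
    old = json.loads(CACHE.read_text()) if CACHE.exists() else {}
    new, dirty = {}, []
    for p in sorted(root.rglob("*.py")):
        h = file_hash(p)
        new[str(p)] = h
        if old.get(str(p)) != h:
            dirty.append(p)
    CACHE.write_text(json.dumps(new))
    return dirty
```

On an unchanged tree the second call returns an empty list, which is why re-scans of large repositories stay cheap.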
How It Works
FalconEYE follows a multi-stage analysis pipeline:
┌─────────────────────────────────────────────────────────────────┐
│ 1. CODE INGESTION │
│ Scans repository -> Detects languages -> Parses AST structure │
└──────────────────────────────┬──────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────┐
│ 2. INTELLIGENT INDEXING │
│ Chunks code semantically -> Generates embeddings -> Stores in │
│ vector database for fast semantic search │
└──────────────────────────────┬──────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────┐
│ 3. CONTEXT ASSEMBLY (RAG) │
│ For each code segment -> Retrieves similar code -> Gathers │
│ relevant context from your entire codebase │
└──────────────────────────────┬──────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────┐
│ 4. AI SECURITY ANALYSIS │
│ LLM analyzes code with context -> Reasons about vulns -> │
│ Understands data flow -> Identifies security implications │
└──────────────────────────────┬──────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────┐
│ 5. LLM-POWERED ENRICHMENT │
│ Incomplete findings sent back to the LLM for detailed │
│ reasoning, specific mitigations, code snippets & line numbers │
└──────────────────────────────┬──────────────────────────────────┘
│
┌──────────────────────────────▼──────────────────────────────────┐
│ 6. VALIDATION & REPORTING │
│ Optional AI validation pass -> Formats findings -> Outputs in │
│ Console/JSON/HTML/SARIF format with actionable remediation │
└─────────────────────────────────────────────────────────────────┘
Semantic Understanding: FalconEYE reads your code like a security engineer, understanding business logic, data flows, and architectural patterns to identify real vulnerabilities.
RAG-Enhanced Analysis: By retrieving similar code patterns from your entire codebase, the AI gets crucial context about how functions are used, what data they handle, and potential security implications across your application.
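The retrieval step behind RAG can be illustrated with plain cosine similarity over chunk embeddings. This is a minimal sketch: the in-memory `index` dict stands in for the vector database, and real embeddings would come from the embedding model rather than hand-written vectors:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], index: dict, k: int = 3) -> list[str]:
    """Rank indexed chunks by similarity to the query chunk.

    `index` maps chunk text -> its embedding vector.
    """
    ranked = sorted(index.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

The top-k chunks are what gets packed into the prompt alongside the code under analysis, giving the model cross-file context.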
LLM-Powered Enrichment: When the initial analysis produces incomplete findings (missing line numbers, generic descriptions, or vague mitigations), FalconEYE sends them back to the LLM with the full source code for enrichment. Every displayed finding includes specific line numbers, the vulnerable code snippet, a detailed description of the exploit, and actionable remediation referencing actual identifiers from your code.
Getting Started
Prerequisites
- Python 3.12 or newer
- Ollama running locally (Install Ollama)
Installation
# Pull required AI models
ollama pull qwen3-coder:30b
ollama pull embeddinggemma:300m
# Install FalconEYE
pip install -e .
Apple Silicon Installation (MLX)
For native Apple Silicon acceleration:
# Pull embedding model (still required -- MLX uses Ollama for embeddings)
ollama pull embeddinggemma:300m
# Install FalconEYE with MLX support
pip install -e ".[mlx]"
Note: MLX requires an Apple Silicon Mac (M1/M2/M3/M4). The MLX backend uses a hybrid approach — MLX handles LLM inference while Ollama handles embeddings. Ollama must still be running for the indexing step. The MLX analysis model is downloaded automatically from HuggingFace on first use.
Keeping Up to Date
Once installed from a git clone, you can upgrade FalconEYE to the latest version with a single command:
falconeye upgrade
This will:
- `git pull origin main` — fetch and merge the latest changes
- Reinstall dependencies from `pyproject.toml`
- Show what changed (commits pulled) and the new version

Note: `falconeye upgrade` requires FalconEYE to be installed from a git clone (`pip install -e .`). If installed from a package archive, reinstall from the repository instead.
Your First Scan
# Full scan (index + review in one step)
falconeye scan /path/to/your/project
# Or run the steps separately:
falconeye index /path/to/your/project # Index codebase (one-time)
falconeye review /path/to/your/project # Analyze for vulnerabilities
# Scan with MLX backend on Apple Silicon
falconeye scan /path/to/your/project --backend mlx
# Generate an HTML report
falconeye scan /path/to/your/project --format html --output report.html
# Verbose mode -- see LLM streaming and full logs
falconeye scan /path/to/your/project -v
MLX Backend (Apple Silicon)
FalconEYE supports native Apple Silicon inference via MLX, Apple's machine learning framework optimized for the unified memory architecture of M-series chips.
Performance Benefits
MLX delivers significant performance improvements over Ollama (which uses llama.cpp internally) on Apple Silicon hardware:
| Metric | Improvement | Details |
|--------|-------------|---------|
| Token Generation | 20-40% faster | MLX runs inference directly on the Apple GPU via Metal. On MoE models like Qwen3-30B-A3B, benchmarks show 17-43% higher tok/s vs llama.cpp. Smaller models see up to 87% gains. |
| Memory Usage | ~30% lower RAM | Zero-copy unified memory eliminates data duplication between CPU and GPU. Lazy evaluation fuses operations and reduces allocation overhead. |
| First-Token Latency | ~50% lower | In-process inference removes the HTTP round-trip overhead of Ollama's REST API on localhost:11434. |
| Prompt Processing | ~25% faster prefill | Native Metal compute path vs llama.cpp's Metal abstraction layer. |
| Model Availability | Broader selection | Access thousands of quantized models from HuggingFace's mlx-community, compared to Ollama's curated library. |
Figures based on published benchmarks of MLX vs llama.cpp on M-series chips (Barrios et al., arXiv:2601.19139; arXiv:2511.05502; Google Cloud Community Gemma 3 benchmarks). Actual results vary by model size, quantization level, and hardware generation.
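As a rough illustration of what those percentages mean in practice (the baseline throughput here is an arbitrary assumption, not a measured figure):

```python
baseline_tps = 50.0  # assumed llama.cpp throughput, tokens/sec
for gain in (0.20, 0.40):
    # a 20-40% speedup over a 50 tok/s baseline lands at 60-70 tok/s
    print(f"+{gain:.0%}: {baseline_tps * (1 + gain):.0f} tok/s")
```

On a long scan that compounds: the same number of generated tokens finishes in roughly 70-85% of the wall-clock time.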
When to use MLX:
- You have an Apple Silicon Mac (M1/M2/M3/M4)
- You want faster scan times and lower memory consumption
- You want access to the latest quantized models from HuggingFace
When to stick with Ollama:
- Cross-platform deployment (Linux, Intel Mac, Windows via WSL)
- You prefer Ollama's curated model library and CLI tooling
- Running on non-Apple hardware
Hybrid Architecture
The MLX backend uses a hybrid approach because MLX does not natively support embedding models:
┌──────────────────────────┐
│ FalconEYE CLI │
└────────────┬─────────────┘
│
┌───────────────┴───────────────┐
│ │
┌─────────▼──────────┐ ┌──────────▼──────────┐
│ MLX (Analysis) │ │ Ollama (Embeddings) │
│ │ │ │
│ Qwen3-Coder-30B │
