FalconEYE

AI-powered security code analyzer using local LLMs for semantic vulnerability detection. Unlike traditional SAST tools, FalconEYE reasons about code contextually instead of matching patterns. Supports Python, JavaScript, TypeScript, Go, Rust, C/C++, Java, and more.

Install / Use

/learn @FalconEYE-ai/FalconEYE
FalconEYE

███████╗ █████╗ ██╗      ██████╗ ██████╗ ███╗   ██╗███████╗██╗   ██╗███████╗
██╔════╝██╔══██╗██║     ██╔════╝██╔═══██╗████╗  ██║██╔════╝╚██╗ ██╔╝██╔════╝
█████╗  ███████║██║     ██║     ██║   ██║██╔██╗ ██║█████╗   ╚████╔╝ █████╗
██╔══╝  ██╔══██║██║     ██║     ██║   ██║██║╚██╗██║██╔══╝    ╚██╔╝  ██╔══╝
██║     ██║  ██║███████╗╚██████╗╚██████╔╝██║ ╚████║███████╗   ██║   ███████╗
╚═╝     ╚═╝  ╚═╝╚══════╝ ╚═════╝ ╚═════╝ ╚═╝  ╚═══╝╚══════╝   ╚═╝   ╚══════╝

Next-Generation Security Code Analysis Powered by Local LLMs

by hardw00t & h4ckologic

FalconEYE represents a paradigm shift in static code analysis. Instead of relying on predefined vulnerability patterns, it leverages large language models to reason about your code the same way a security expert would — understanding context, intent, and subtle security implications that traditional tools miss.


Why FalconEYE?

Traditional security scanners are limited by their pattern databases. They can only find what they've been programmed to look for. FalconEYE is different:

  • No Pattern Matching — Uses pure AI reasoning to understand your code semantically
  • Context-Aware Analysis — Retrieval-Augmented Generation (RAG) provides relevant code context for deeper insights
  • Novel Vulnerability Detection — Identifies security issues that don't match known patterns
  • LLM-Powered Enrichment — Every finding gets AI-generated descriptions, mitigations, code snippets, and line numbers
  • Reduced False Positives — Optional AI validation pass filters noise from false alarms
  • Rich Reporting — Console, JSON, HTML, and SARIF output with interactive dashboards
  • Smart Re-indexing — Incremental analysis means re-scans only process changed files
  • Privacy-First — Runs entirely locally with Ollama or MLX — your code never leaves your machine
  • Apple Silicon Accelerated — Native MLX backend delivers 20-40% faster inference on M-series chips
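To make the pattern-matching contrast concrete, here is an illustrative (hypothetical, not from the FalconEYE docs) snippet of the kind of flaw semantic analysis is suited to: the injection sink is separated from the tainted input by a helper function, so there is no single suspicious line for a signature to match.

```python
# Hypothetical example: an SQL injection that signature-based scanners
# tend to miss, because the tainted value passes through a helper
# before it reaches the query string.

def build_filter(column: str, value: str) -> str:
    # Looks harmless in isolation: just string formatting.
    return f"{column} = '{value}'"

def find_user(user_input: str) -> str:
    # The query line contains no obvious user input, but the clause
    # returned by build_filter embeds it unescaped.
    clause = build_filter("username", user_input)
    return f"SELECT * FROM users WHERE {clause}"

# A rule keyed on "execute(... + input)" sees nothing suspicious here;
# following the data flow from user_input into the SQL string does.
query = find_user("alice' OR '1'='1")
```

Tracing `user_input` through `build_filter` into the final query string is exactly the kind of cross-function reasoning the feature list above describes.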

How It Works

FalconEYE follows a multi-stage analysis pipeline:

┌─────────────────────────────────────────────────────────────────┐
│                     1. CODE INGESTION                          │
│  Scans repository -> Detects languages -> Parses AST structure │
└──────────────────────────────┬──────────────────────────────────┘
                               │
┌──────────────────────────────▼──────────────────────────────────┐
│                    2. INTELLIGENT INDEXING                      │
│  Chunks code semantically -> Generates embeddings -> Stores in │
│  vector database for fast semantic search                      │
└──────────────────────────────┬──────────────────────────────────┘
                               │
┌──────────────────────────────▼──────────────────────────────────┐
│                   3. CONTEXT ASSEMBLY (RAG)                     │
│  For each code segment -> Retrieves similar code -> Gathers    │
│  relevant context from your entire codebase                    │
└──────────────────────────────┬──────────────────────────────────┘
                               │
┌──────────────────────────────▼──────────────────────────────────┐
│                    4. AI SECURITY ANALYSIS                      │
│  LLM analyzes code with context -> Reasons about vulns ->      │
│  Understands data flow -> Identifies security implications     │
└──────────────────────────────┬──────────────────────────────────┘
                               │
┌──────────────────────────────▼──────────────────────────────────┐
│                   5. LLM-POWERED ENRICHMENT                    │
│  Incomplete findings sent back to the LLM for detailed         │
│  reasoning, specific mitigations, code snippets & line numbers │
└──────────────────────────────┬──────────────────────────────────┘
                               │
┌──────────────────────────────▼──────────────────────────────────┐
│                     6. VALIDATION & REPORTING                  │
│  Optional AI validation pass -> Formats findings -> Outputs in │
│  Console/JSON/HTML/SARIF format with actionable remediation    │
└─────────────────────────────────────────────────────────────────┘

Semantic Understanding: FalconEYE reads your code like a security engineer, understanding business logic, data flows, and architectural patterns to identify real vulnerabilities.

RAG-Enhanced Analysis: By retrieving similar code patterns from your entire codebase, the AI gets crucial context about how functions are used, what data they handle, and potential security implications across your application.

LLM-Powered Enrichment: When the initial analysis produces incomplete findings (missing line numbers, generic descriptions, or vague mitigations), FalconEYE sends them back to the LLM with the full source code for enrichment. Every displayed finding includes specific line numbers, the vulnerable code snippet, a detailed description of the exploit, and actionable remediation referencing actual identifiers from your code.
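The retrieval step at the heart of stage 3 can be sketched in a few lines. This is a minimal illustration of the idea (toy embeddings, invented function names, not FalconEYE's actual API): chunks are embedded at indexing time, and the most similar ones are pulled in as context for the segment under analysis.

```python
# Minimal sketch of RAG context assembly: rank indexed code chunks by
# cosine similarity to the query embedding and keep the top k.
# Embeddings here are tiny made-up vectors for illustration.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Pretend index: (chunk, embedding) pairs produced at indexing time.
index = [
    ("def sanitize(s): ...", [0.9, 0.1, 0.0]),
    ("def render_template(t): ...", [0.1, 0.8, 0.2]),
    ("def run_query(q): ...", [0.8, 0.3, 0.1]),
]

def retrieve(query_vec: list[float], k: int = 2) -> list[str]:
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# A segment whose embedding is close to the sanitize/run_query chunks
# pulls those in as analysis context.
context = retrieve([0.9, 0.15, 0.05])
```

In the real pipeline the embeddings come from the embedding model (`embeddinggemma:300m`) and the index lives in a vector database, but the ranking principle is the same.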


Getting Started

Prerequisites

  • Python 3.12 or newer
  • Ollama running locally (Install Ollama)

Installation

# Pull required AI models
ollama pull qwen3-coder:30b
ollama pull embeddinggemma:300m

# Install FalconEYE
pip install -e .

Apple Silicon Installation (MLX)

For native Apple Silicon acceleration:

# Pull embedding model (still required -- MLX uses Ollama for embeddings)
ollama pull embeddinggemma:300m

# Install FalconEYE with MLX support
pip install -e ".[mlx]"

Note: MLX requires an Apple Silicon Mac (M1/M2/M3/M4). The MLX backend uses a hybrid approach — MLX handles LLM inference while Ollama handles embeddings. Ollama must still be running for the indexing step. The MLX analysis model is downloaded automatically from HuggingFace on first use.

Keeping Up to Date

Once installed from a git clone, you can upgrade FalconEYE to the latest version with a single command:

falconeye upgrade

This will:

  1. git pull origin main — fetch and merge the latest changes
  2. Reinstall dependencies from pyproject.toml
  3. Show what changed (commits pulled) and the new version

Note: falconeye upgrade requires FalconEYE to be installed from a git clone (pip install -e .). If installed from a package archive, reinstall from the repository instead.

Your First Scan

# Full scan (index + review in one step)
falconeye scan /path/to/your/project

# Or run the steps separately:
falconeye index /path/to/your/project     # Index codebase (one-time)
falconeye review /path/to/your/project    # Analyze for vulnerabilities

# Scan with MLX backend on Apple Silicon
falconeye scan /path/to/your/project --backend mlx

# Generate an HTML report
falconeye scan /path/to/your/project --format html --output report.html

# Verbose mode -- see LLM streaming and full logs
falconeye scan /path/to/your/project -v
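Since the scan can emit JSON, a natural follow-up is gating a CI job on the results. The sketch below shows the idea; the field names (`findings`, `severity`) are assumptions for illustration, so check them against the actual output of `--format json` in your version.

```python
# Sketch of failing a build on high-severity findings from a JSON
# report. The report schema here is assumed, not documented.
import json

def count_high_severity(report_text: str) -> int:
    report = json.loads(report_text)
    return sum(1 for f in report.get("findings", [])
               if f.get("severity", "").lower() in ("high", "critical"))

# Stand-in for the contents of a report file produced by
# `falconeye scan . --format json --output report.json`.
sample = json.dumps({
    "findings": [
        {"severity": "high", "title": "SQL injection"},
        {"severity": "low", "title": "Verbose error message"},
    ]
})

# In CI: exit nonzero if any high/critical finding is present.
high = count_high_severity(sample)
```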

MLX Backend (Apple Silicon)

FalconEYE supports native Apple Silicon inference via MLX, Apple's machine learning framework optimized for the unified memory architecture and Neural Engine of M-series chips.

Performance Benefits

MLX delivers significant performance improvements over Ollama (which uses llama.cpp internally) on Apple Silicon hardware:

| Metric | Improvement | Details |
|--------|-------------|---------|
| Token Generation | 20-40% faster | MLX runs inference directly on the Apple GPU/Neural Engine. On MoE models like Qwen3-30B-A3B, benchmarks show 17-43% higher tok/s vs llama.cpp. Smaller models see up to 87% gains. |
| Memory Usage | ~30% lower RAM | Zero-copy unified memory eliminates data duplication between CPU and GPU. Lazy evaluation fuses operations and reduces allocation overhead. |
| First-Token Latency | ~50% lower | In-process inference removes the HTTP round-trip overhead of Ollama's REST API on localhost:11434. |
| Prompt Processing | ~25% faster prefill | Native Metal compute path vs llama.cpp's Metal abstraction layer. |
| Model Availability | Broader selection | Access thousands of quantized models from HuggingFace's mlx-community, compared to Ollama's curated library. |

Figures based on published benchmarks of MLX vs llama.cpp on M-series chips (Barrios et al., arXiv:2601.19139; arXiv:2511.05502; Google Cloud Community Gemma 3 benchmarks). Actual results vary by model size, quantization level, and hardware generation.

When to use MLX:

  • You have an Apple Silicon Mac (M1/M2/M3/M4)
  • You want faster scan times and lower memory consumption
  • You want access to the latest quantized models from HuggingFace

When to stick with Ollama:

  • Cross-platform deployment (Linux, Intel Mac, Windows via WSL)
  • You prefer Ollama's curated model library and CLI tooling
  • Running on non-Apple hardware

Hybrid Architecture

The MLX backend uses a hybrid approach because MLX does not natively support embedding models:

                 ┌──────────────────────────┐
                 │      FalconEYE CLI       │
                 └────────────┬─────────────┘
                              │
              ┌───────────────┴───────────────┐
              │                               │
    ┌─────────▼──────────┐        ┌──────────▼──────────┐
    │   MLX (Analysis)   │        │ Ollama (Embeddings) │
    │                    │        │                     │
    │  Qwen3-Coder-30B   │        │ embeddinggemma:300m │
    │  (HuggingFace)     │        │ (localhost:11434)   │
    └────────────────────┘        └─────────────────────┘