Neurondb
NeuronDB PostgreSQL extension: vector similarity search (HNSW, IVFFlat), embeddings, kNN, ML in SQL, and hybrid full-text + vector retrieval.
NeuronDB - AI Database Extension for PostgreSQL
<div align="center">PostgreSQL extension for vector similarity search (HNSW, IVFFlat), kNN, embeddings, machine learning, and hybrid full-text + vector search in SQL
</div>

Run the system in 5 minutes
From the repository root:
```bash
docker compose -f docker/docker-compose.yml up -d neurondb
docker compose -f docker/docker-compose.yml ps   # wait until healthy
docker compose -f docker/docker-compose.yml exec neurondb psql -U neurondb -d neurondb -c "CREATE EXTENSION IF NOT EXISTS neurondb; SELECT neurondb.version();"
```
Then run your first vector search: Simple Start or Quick Start.
Table of Contents
<details> <summary><strong>Expand full table of contents</strong></summary>

- Overview
- Documentation
- Official Documentation
- Architecture
- Compatibility
- Support & Community
- Contributing
- License
- Authors

</details>
Overview
Vectors, embeddings, and ML—inside PostgreSQL. NeuronDB keeps similarity search and models on your live rows, not in a separate database you have to sync and babysit.
HNSW · IVFFlat · kNN · hybrid full-text + vector · RAG pieces · train & predict in SQL—all first-class in the engine.
One extension: same Postgres backups, HA, and security. Start with `CREATE EXTENSION neurondb;`, then index and query from SQL.
Key Capabilities
<details> <summary><strong>Feature summary</strong></summary>

| Category | Details |
|:---------|:--------|
| Vector types | vector, vectorp, vecmap, vgraph, rtext, halfvec, binaryvec, sparsevec (8 types) |
| Index access methods | HNSW and IVF only. PQ and OPQ are quantization (codebook training), not separate index types. Hybrid and multi-vector search are query-level functions. |
| Distance metrics | L2, cosine, inner product, L1, Hamming, Jaccard, and others |
| ML | 25+ algorithm families (train/predict/evaluate): linear regression, XGBoost, LightGBM, CatBoost, K-Means, etc. |
| SQL | ~650+ functions and operators (vector, ML, embeddings, RAG, indexing). See FEATURES.md and SQL API. |
| GPU | CUDA, ROCm, Metal (distance and search; index build is CPU only). See GPU feature matrix. |
| Background workers | neuranq, neuranmon, neurandefrag, neuranllm |

</details>
Performance Metrics
The expressions below describe how HNSW index build time and query latency scale with dataset size and index parameters. They are asymptotic estimates, not benchmark results.
Index Build Performance:
The index build time for HNSW follows the relationship:
$$T_{build} = O(N \cdot \log N \cdot m \cdot ef_{construction})$$
Where:
- $N$ = number of vectors
- $m$ = number of connections per node (typically 16-32)
- $ef_{construction}$ = size of candidate list during construction (typically 64-200)
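To make the build-time relationship concrete, the following Python sketch evaluates $N \cdot \log N \cdot m \cdot ef_{construction}$ as a unitless relative cost. The function name `hnsw_build_cost` and the use of a base-2 logarithm are assumptions made for illustration; the result is only meaningful as a ratio between two configurations, not as a time in seconds.

```python
import math


def hnsw_build_cost(n_vectors: int, m: int, ef_construction: int) -> float:
    """Relative (unitless) build cost: N * log2(N) * m * ef_construction.

    Illustrative only: constants hidden by the O(...) notation are ignored.
    """
    if n_vectors < 2:
        return 0.0  # log2(1) == 0, so a single vector has no modelled cost
    return n_vectors * math.log2(n_vectors) * m * ef_construction


# Compare two settings of m on the same dataset size.
baseline = hnsw_build_cost(1_000_000, m=16, ef_construction=64)
denser = hnsw_build_cost(1_000_000, m=32, ef_construction=64)
print(f"m=32 vs m=16: {denser / baseline:.1f}x")  # prints "m=32 vs m=16: 2.0x"
```

Under this model, doubling either `m` or `ef_construction` doubles the estimated build cost, while doubling the number of vectors slightly more than doubles it because of the logarithmic factor.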
Query Performance:
Query latency for HNSW search:
$$T_{query} = O(\log N + ef_{search} \cdot k)$$
Where:
- $ef_{search}$ = size of candidate list during search (typically 40-200)
- $k$ = number of results requested
Throughput Calculation:
For a single connection running queries back to back, throughput is approximately the inverse of per-query latency:
$$QPS \approx \frac{1}{T_{query}}$$
> [!TIP]
> For optimal performance, tune `ef_search` based on your recall requirements. Higher values improve recall but increase latency.
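The latency and throughput expressions above can be explored with a small Python sketch. It treats $\log N + ef_{search} \cdot k$ as a unitless cost and reports throughput relative to a baseline setting. The helper names `hnsw_query_cost` and `relative_qps` are hypothetical, the logarithm base is an assumption, and recall is not modelled at all, so this only shows the latency side of the trade-off.

```python
import math


def hnsw_query_cost(n_vectors: int, ef_search: int, k: int) -> float:
    """Relative (unitless) query cost: log2(N) + ef_search * k."""
    if n_vectors < 1 or ef_search < 1 or k < 1:
        raise ValueError("n_vectors, ef_search and k must all be >= 1")
    return math.log2(n_vectors) + ef_search * k


def relative_qps(n_vectors: int, ef_search: int, k: int) -> float:
    """Single-connection throughput estimate: the inverse of the query cost."""
    return 1.0 / hnsw_query_cost(n_vectors, ef_search, k)


n, k = 1_000_000, 10
base = relative_qps(n, 40, k)
for ef_search in (40, 100, 200):
    qps = relative_qps(n, ef_search, k)
    # Throughput is reported relative to the ef_search=40 baseline.
    print(f"ef_search={ef_search}: relative throughput {qps / base:.2f}")
```

Because the $ef_{search} \cdot k$ term dominates for realistic values, the modelled throughput falls almost in proportion to `ef_search`.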
Documentation
Getting Started
- Installation - Install NeuronDB extension
- Extension packaging - Control file, file layout, CREATE/UPDATE/DROP EXTENSION, dump/restore
- Quick Start - Get up and running quickly
Vector Search & Indexing
- Vector Types - vector, vectorp, vecmap, vgraph, rtext, halfvec, binaryvec, sparsevec
- Indexing - HNSW and IVF indexing
- Distance Metrics - L2, cosine, inner product, and more
- Quantization - PQ and OPQ compression
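For readers who want the definitions behind the metric names listed above, here is a minimal Python sketch of the textbook formulas. These functions are illustrative and are not NeuronDB's SQL operators or functions; in particular, the sign and normalization conventions NeuronDB uses (for example, whether inner product is negated for ordering) are not stated here and may differ.

```python
import math
from typing import Sequence, Set


def l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def l1(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; undefined for zero vectors."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - inner_product(a, b) / norm


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def jaccard_distance(a: Set[int], b: Set[int]) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


print(l2([0, 0], [3, 4]))            # 5.0
print(hamming([1, 0, 1], [1, 1, 0]))  # 2
```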
ML Algorithms & Analytics
- Random Forest - Classification and regression
- Gradient Boosting - XGBoost, LightGBM, CatBoost
- Clustering - K-Means, DBSCAN, GMM, Hierarchical
- Dimensionality Reduction - PCA and PCA Whitening
- Classification - SVM, Logistic Regression, Naive Bayes, Decision Trees
- Regression - Linear, Ridge, Lasso, Deep Learning
- Outlier Detection - Z-score, Modified Z-score, IQR
- Quality Metrics - Recall@K, Precision@K, F1@K, MRR
- Drift Detection - Centroid drift, Distribution divergence
- Topic Discovery - Topic modeling and analysis
- Time Series - Forecasting and analysis
- Recommendation Systems - Collaborative filtering
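The quality metrics named in the list above (Recall@K, Precision@K, MRR) have standard definitions that are easy to check by hand. The Python sketch below implements those textbook definitions; the function names are hypothetical and do not correspond to NeuronDB's SQL functions, whose signatures are not shown here.

```python
from typing import Hashable, List, Sequence, Set


def precision_at_k(retrieved: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / k


def recall_at_k(retrieved: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)


def mean_reciprocal_rank(
    rankings: List[Sequence[Hashable]], relevant_sets: List[Set[Hashable]]
) -> float:
    """Average of 1/rank of the first relevant item per query (0 if none)."""
    total = 0.0
    for retrieved, relevant in zip(rankings, relevant_sets):
        for rank, doc in enumerate(retrieved, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(rankings)


retrieved = [7, 3, 9, 1, 4]   # ranked result ids for one query
relevant = {3, 4, 8}          # ground-truth relevant ids
print(precision_at_k(retrieved, relevant, 5))           # 0.4
print(mean_reciprocal_rank([retrieved], [relevant]))    # 0.5
```

In the example, two of the five results are relevant (precision 0.4), two of the three relevant items are found (recall 2/3), and the first relevant item sits at rank 2 (reciprocal rank 0.5).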
ML & Embeddings
- Embedding Generation - Text, image, multimodal embeddings
- Model Inference - ONNX runtime, batch processing
- Model Management - Load, export, version models
- AutoML - Automated hyperparameter tuning
- Feature Store - Feature management and versioning
Hybrid Search & Retrieval
- Hybrid Search - Combine vector and full-text search
- Multi-Vector - Multiple embeddings per document
- Faceted Search - Category-aware retrieval
- Temporal Search - Time-decay relevance scoring
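Hybrid search has to merge a vector-similarity ranking with a full-text ranking into one result list. One widely used way to do that is reciprocal rank fusion (RRF), sketched below in Python. This is an illustration of the general idea only: it is an assumption that RRF is a reasonable stand-in, since the fusion method NeuronDB actually uses is not described here. The constant `k=60` is the conventional default for RRF, and the document ids are made up.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists: score(doc) = sum over lists of 1 / (k + rank)."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


vector_hits = ["d1", "d2", "d3"]  # hypothetical ids ranked by vector distance
text_hits = ["d3", "d1", "d4"]    # hypothetical ids ranked by full-text score
print(reciprocal_rank_fusion([vector_hits, text_hits]))
# ['d1', 'd3', 'd2', 'd4']
```

Documents that appear in both lists (`d1`, `d3`) rise above those found by only one retriever, which is the behaviour hybrid retrieval is after. RRF uses only ranks, so the two retrievers' raw scores never need to be put on a common scale.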
Reranking
- Cross-Encoder - Neural reranking models
- LLM Reranking - GPT/Claude-powered scoring
- ColBERT - Late interaction models
- Ensemble - Combine multiple strategies
RAG Pipeline
- Complete RAG Support - End-to-end RAG
- LLM Integration - Hugging Face and OpenAI
- Document Processing - Text processing and NLP
Background Workers
- neuranq - Async job queue executor
- neuranmon - Live query auto-tuner
- neurandefrag - Index maintenance
- neuranllm - LLM job processor
GPU Acceleration
- CUDA Support - NVIDIA GPU acceleration
- ROCm Support - AMD GPU acceleration
- Metal Support - Apple Silicon GPU acceleration
- Auto-Detection - Automatic GPU detection
Performance & Security
- SIMD Optimization - AVX2/AVX512, NEON optimization
- Security - Encryption, privacy, RLS
- Monitoring - Monitoring views and Prometheus
Configuration & Operations
- Configuration - Essential configuration options
- Troubleshooting - Common issues and solutions (getting started and operations)
Official Documentation
https://www.neurondb.ai/docs — API reference (~650+ SQL functions), tutorials, deployment, and troubleshooting.
Architecture
NeuronDB follows PostgreSQL's architectural patterns and extends the database with AI capabilities.
System Architecture
graph TB
subgraph SQL["SQL Interface Layer"]
FUNC["~650+ SQL Functions"]
TYPES["Vector Types: vector, vectorp, vecmap, vgraph, rtext, halfvec, binaryvec, sparsevec"]
OPS["Distance Operators: <->, <=>, <#>"]
end
subgraph VECTOR["Vector Operations"]
INDEX["HNSW/IVF Indexes"]
DIST["Distance Metrics: L2, Cosine,
