Vectra: local, file‑backed vector database for Node.js
Overview
Vectra is a file‑backed, in‑memory vector database for Node.js. It works like a local Pinecone or Qdrant: each index is just a folder on disk with an index.json file containing vectors and any metadata fields you choose to index; all other metadata is stored per‑item as separate JSON files. Queries use a Pinecone‑compatible subset of MongoDB‑style operators for filtering, then rank matches by cosine similarity. Because the entire index is loaded into memory, lookups are extremely fast (often <1 ms for small indexes, commonly 1–2 ms for larger local sets). It’s ideal when you want simple, zero‑infrastructure retrieval over a small, mostly static corpus. Pinecone‑style namespaces aren’t built‑in, but you can mimic them by using separate folders (indexes).
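The ranking math is plain cosine similarity. As a rough stand-alone illustration (not Vectra's internal code): pre-normalizing every stored vector reduces each query-time comparison to a dot product.

```typescript
// Illustration of the ranking math only -- not Vectra's implementation.
// If every stored vector is normalized to unit length up front, cosine
// similarity at query time is just a dot product.
function normalize(v: number[]): number[] {
  const len = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return v.map((x) => x / len);
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Cosine similarity of the raw vectors == dot product of the normalized ones.
const query = normalize([1, 2, 3]);
const stored = normalize([2, 4, 6]); // same direction -> similarity ~1
console.log(dot(query, stored).toFixed(4)); // 1.0000
```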
Typical use cases:
- Prompt augmentation over a small, mostly static corpus
- Infinite few‑shot example libraries
- Single‑document or small multi‑document Q&A
- Local/dev workflows where hosted vector DBs are overkill
Table of contents
- Why Vectra
- When to use (and when not)
- Requirements
- Install
- Quick Start
- CLI in 60 seconds
- Core concepts
- File-backed vs in-memory usage
- Best practices
- Performance and limits
- Troubleshooting (quick)
- Next steps
- License
- Project links
Why Vectra
- Zero infrastructure: everything lives in a local folder; no servers, clusters, or managed services required.
- Predictable local performance: full in‑memory scans with pre‑normalized cosine similarity deliver sub‑millisecond to low‑millisecond latency for small/medium corpora.
- Simple mental model: one folder per index; index.json holds vectors and indexed fields, while non‑indexed metadata is stored as per‑item JSON.
- Easy portability: because the format is file‑based and language‑agnostic, indexes can be written in one language and read in another.
- Pinecone‑style filtering: use a familiar subset of MongoDB query operators to filter by metadata before similarity ranking.
- Great for prompt engineering: quickly assemble and retrieve few‑shot examples or small static corpora without external dependencies.
When to use (and when not)
Use Vectra when:
- You have a small, mostly static corpus (e.g., a few hundred to a few thousand chunks).
- You want zero‑infrastructure local retrieval with fast, predictable latency.
- You’re assembling “infinite few‑shot” example libraries or single/small document Q&A.
- You need portable, file‑based indexes that other languages can read/write.
- You want simple “namespaces” by using separate folders per dataset.
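The folder-per-namespace pattern needs nothing more than a path convention. A minimal sketch (`namespaceFolder` is a hypothetical helper, not part of Vectra; each resulting folder would be opened as its own index):

```typescript
import path from 'node:path';

// Hypothetical helper (not part of Vectra): map a namespace name to its own
// index folder. Each folder is then opened as an independent LocalIndex.
function namespaceFolder(baseDir: string, namespace: string): string {
  const safe = namespace.replace(/[^A-Za-z0-9_-]/g, '_'); // keep names filesystem-safe
  return path.join(baseDir, safe);
}

console.log(namespaceFolder('/data/indexes', 'tenant a')); // /data/indexes/tenant_a (POSIX)
```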
Avoid Vectra when:
- You need long‑term, ever‑growing chat memory or very large corpora (the entire index loads into RAM).
- You require multi‑tenant, networked, or horizontally scalable serving.
- You need advanced vector DB features like HNSW/IVF indexing, sharding/replication, or distributed operations.
Notes and tips:
- Mimic namespaces via separate index folders.
- Index only the metadata fields you’ll filter on; keep everything else in per‑item JSON.
- Rough sizing: a 1536‑dim float32 vector is ~6 KB, plus JSON/metadata overhead; size indexes according to your RAM budget.
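As a sanity check on that rule of thumb, the raw vector payload is easy to estimate. This is back-of-envelope only: index.json stores numbers as JSON text and JavaScript holds them as 8‑byte doubles in memory, so real usage is higher.

```typescript
// Back-of-envelope estimate of raw vector memory: 4 bytes per float32 dimension.
function estimateVectorBytes(items: number, dims: number): number {
  return items * dims * 4;
}

console.log(estimateVectorBytes(1, 1536));      // 6144 bytes ~= 6 KB per item
console.log(estimateVectorBytes(10_000, 1536)); // 61,440,000 bytes ~= 61 MB raw
```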
Requirements
- Node.js 20.x or newer
- A package manager (npm or yarn)
- An embeddings provider for similarity search:
- OpenAI (API key + model, e.g., text-embedding-3-large or compatible)
- Azure OpenAI (endpoint, deployment name, API key)
- OpenAI‑compatible OSS endpoint (model name + base URL)
- If you plan to ingest web pages via the CLI or API, outbound network access to those URLs
- Sufficient RAM to hold your entire index in memory during queries (see “Performance and limits”)
Install
- npm:
  npm install vectra
- yarn:
  yarn add vectra
CLI usage
- Run without installing globally:
  npx vectra --help
- Optional global install:
  npm install -g vectra (then use vectra --help)
Quick Start
Two common paths:
- Path A: you already have vectors (or can generate them) and want to store items + metadata.
- Path B: you have raw text documents; Vectra will chunk, embed, and retrieve relevant spans.
Path A: LocalIndex (items + metadata)
- Create a folder‑backed index
- Choose which metadata fields to index (others are stored per‑item on disk)
- Insert items (vector + metadata)
- Query by vector with optional metadata filters
TypeScript example:
import path from 'node:path';
import { LocalIndex } from 'vectra';
import { OpenAI } from 'openai';
// 1) Create an index folder
const index = new LocalIndex(path.join(process.cwd(), 'my-index'));
// 2) Create the index (set which metadata fields you want searchable)
if (!(await index.isIndexCreated())) {
await index.createIndex({
version: 1,
metadata_config: { indexed: ['category'] }, // only these fields live in index.json; others go to per-item JSON
});
}
// 3) Prepare an embeddings helper (use any provider you like)
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
async function getVector(text: string): Promise<number[]> {
const resp = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: text,
});
return resp.data[0].embedding;
}
// 4) Insert items
await index.insertItem({
vector: await getVector('apple'),
metadata: { text: 'apple', category: 'food', note: 'stored on disk if not indexed' },
});
await index.insertItem({
vector: await getVector('blue'),
metadata: { text: 'blue', category: 'color' },
});
// 5) Query by vector, optionally filter by metadata
async function query(text: string) {
const v = await getVector(text);
// Signature: queryItems(vector, queryString, topK, filter?)
const results = await index.queryItems(v, '', 3, { category: { $eq: 'food' } });
for (const r of results) {
console.log(r.score.toFixed(4), r.item.metadata.text);
}
}
await query('banana'); // should surface 'apple' in top results
Supported filter operators (subset): $and, $or, $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin. Only fields listed in metadata_config.indexed are stored inline and should be used for filtering (everything else is kept per‑item on disk).
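To make the filter semantics concrete, here is a stand-alone sketch of how such a filter could be evaluated against an item's indexed metadata. It is illustrative only (not Vectra's implementation), and the item shown is hypothetical.

```typescript
// Illustrative evaluator for the Pinecone-style operator subset -- not Vectra's code.
type Metadata = Record<string, string | number | boolean>;
type Filter = Record<string, unknown>;

function matchesFilter(metadata: Metadata, filter: Filter): boolean {
  return Object.entries(filter).every(([key, cond]) => {
    if (key === '$and') return (cond as Filter[]).every((f) => matchesFilter(metadata, f));
    if (key === '$or') return (cond as Filter[]).some((f) => matchesFilter(metadata, f));
    const value = metadata[key];
    // A bare value is shorthand for { $eq: value }.
    const ops = (typeof cond === 'object' && cond !== null ? cond : { $eq: cond }) as Record<string, any>;
    return Object.entries(ops).every(([op, target]) => {
      switch (op) {
        case '$eq': return value === target;
        case '$ne': return value !== target;
        case '$gt': return value > target;
        case '$gte': return value >= target;
        case '$lt': return value < target;
        case '$lte': return value <= target;
        case '$in': return (target as unknown[]).includes(value);
        case '$nin': return !(target as unknown[]).includes(value);
        default: return false;
      }
    });
  });
}

console.log(matchesFilter(
  { text: 'apple', category: 'food', calories: 95 }, // hypothetical item
  { $and: [{ category: { $eq: 'food' } }, { calories: { $lt: 100 } }] }
)); // true
```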
Path B: LocalDocumentIndex (documents + chunking + retrieval)
- Create a document index backed by an embeddings model
- Add documents (raw strings, files, or URLs)
- Query by text; Vectra returns the most relevant chunks grouped by document
- Render top sections for direct drop‑in to prompts
- Optional hybrid retrieval: add BM25 keyword matches alongside semantic matches
TypeScript example:
import path from 'node:path';
import { LocalDocumentIndex, OpenAIEmbeddings } from 'vectra';
// 1) Configure embeddings (OpenAI, Azure OpenAI, or OpenAI‑compatible OSS)
const embeddings = new OpenAIEmbeddings({
apiKey: process.env.OPENAI_API_KEY!,
model: 'text-embedding-3-small',
maxTokens: 8000, // batching limit for chunked requests
});
// 2) Create the index
const docs = new LocalDocumentIndex({
folderPath: path.join(process.cwd(), 'my-doc-index'),
embeddings,
// optional: customize chunking
// chunkingConfig: { chunkSize: 512, chunkOverlap: 0, keepSeparators: true }
});
if (!(await docs.isIndexCreated())) {
await docs.createIndex({ version: 1 });
}
// 3) Add a document (string); you can also add files/URLs via FileFetcher/WebFetcher or the CLI
const uri = 'doc://welcome';
const text = `
Vectra is a file-backed, in-memory vector DB for Node.js. It supports Pinecone-like metadata filtering
and fast local retrieval. It’s ideal for small, mostly static corpora and prompt augmentation.
`;
await docs.upsertDocument(uri, text, 'md'); // optional docType hints chunking
// 4) Query and render sections for your prompt
const results = await docs.queryDocuments('What is Vectra best suited for?', {
maxDocuments: 5,
maxChunks: 20,
// isBm25: true, // turn on hybrid (semantic + keyword) retrieval
});
// Take top document and render spans of text
if (results.length > 0) {
const top = results[0];
console.log('URI:', top.uri, 'score:', top.score.toFixed(4));
const sections = await top.renderSections(2000, 1, true); // max tokens per section, number of sections, include overlapping context
for (const s of sections) {
console.log('Section score:', s.score.toFixed(4), 'tokens:', s.tokenCount, 'bm25:', s.isBm25);
console.log(s.text);
}
}
Notes:
- queryDocuments returns LocalDocumentResult objects, each with scored chunks. renderSections merges adjacent chunks, keeps within your token budget, and can optionally add overlapping context for readability.
- Hybrid retrieval: set isBm25: true in queryDocuments to include keyword‑based chunks (Okapi‑BM25) alongside semantic chunks. Each rendered section includes isBm25 to help you distinguish them.
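For background, Okapi‑BM25 is a classic keyword-scoring function over term frequencies. A minimal self-contained sketch of the idea (illustrative only; Vectra's internal scorer may differ):

```typescript
// Minimal Okapi-BM25 sketch (illustrative; not Vectra's implementation).
// docs are pre-tokenized; higher scores mean stronger keyword matches.
function bm25Scores(docs: string[][], query: string[], k1 = 1.2, b = 0.75): number[] {
  const N = docs.length;
  const avgdl = docs.reduce((sum, d) => sum + d.length, 0) / N; // average doc length
  const df = new Map<string, number>(); // how many docs contain each term
  for (const d of docs) for (const t of new Set(d)) df.set(t, (df.get(t) ?? 0) + 1);
  return docs.map((d) => {
    let score = 0;
    for (const term of new Set(query)) {
      const f = d.filter((t) => t === term).length; // term frequency in this doc
      if (f === 0) continue;
      const n = df.get(term) ?? 0;
      const idf = Math.log((N - n + 0.5) / (n + 0.5) + 1); // smoothed inverse doc frequency
      score += (idf * f * (k1 + 1)) / (f + k1 * (1 - b + (b * d.length) / avgdl));
    }
    return score;
  });
}

const scores = bm25Scores([['apple', 'pie'], ['blue', 'sky']], ['apple']);
console.log(scores[0] > scores[1]); // true: the doc containing 'apple' ranks higher
```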
CLI in 60 seconds
Three steps: create → add → query. No servers, just a folder.
- Create an index folder
npx vectra create ./my-d
