
Vectra

Vectra is a local vector database for Node.js with features similar to Pinecone, but built using local files.

Install / Use

/learn @Stevenic/Vectra

README

Vectra: local, file‑backed vector database for Node.js

Overview

Vectra is a file‑backed, in‑memory vector database for Node.js. It works like a local Pinecone or Qdrant: each index is just a folder on disk with an index.json file containing vectors and any metadata fields you choose to index; all other metadata is stored per‑item as separate JSON files. Queries use a Pinecone‑compatible subset of MongoDB‑style operators for filtering, then rank matches by cosine similarity. Because the entire index is loaded into memory, lookups are extremely fast (often <1 ms for small indexes, commonly 1–2 ms for larger local sets). It’s ideal when you want simple, zero‑infrastructure retrieval over a small, mostly static corpus. Pinecone‑style namespaces aren’t built‑in, but you can mimic them by using separate folders (indexes).

Typical use cases:

  • Prompt augmentation over a small, mostly static corpus
  • Infinite few‑shot example libraries
  • Single‑document or small multi‑document Q&A
  • Local/dev workflows where hosted vector DBs are overkill
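
The cosine‑similarity ranking described in the overview can be sketched in a few lines. This is an illustration of the metric only, not Vectra's internal code; the README notes that vectors are pre‑normalized, so ranking reduces to a dot product:

```typescript
// Normalize a vector to unit length so cosine similarity becomes a plain dot product.
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return v.map((x) => x / norm);
}

// Dot product of two pre-normalized vectors == their cosine similarity.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

const queryVec = normalize([1, 2, 3]);
const itemVec = normalize([2, 4, 6]); // same direction => similarity of 1
console.log(dot(queryVec, itemVec).toFixed(4)); // 1.0000
```

Pre‑normalizing at insert time is what makes full in‑memory scans cheap: each query costs one dot product per stored vector.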

Table of contents

  • Why Vectra
  • When to use (and when not)
  • Requirements
  • Install
  • Quick Start
  • CLI in 60 seconds

Why Vectra

  • Zero infrastructure: everything lives in a local folder; no servers, clusters, or managed services required.
  • Predictable local performance: full in‑memory scans with pre‑normalized cosine similarity deliver sub‑millisecond to low‑millisecond latency for small/medium corpora.
  • Simple mental model: one folder per index; index.json holds vectors and indexed fields, while non‑indexed metadata is stored as per‑item JSON.
  • Easy portability: because the format is file‑based and language‑agnostic, indexes can be written in one language and read in another.
  • Pinecone‑style filtering: use a familiar subset of MongoDB query operators to filter by metadata before similarity ranking.
  • Great for prompt engineering: quickly assemble and retrieve few‑shot examples or small static corpora without external dependencies.
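
To illustrate the portability claim: because an index is just JSON on disk, any language can round‑trip it. The schema below is a deliberately simplified stand‑in for illustration; the real index.json layout has more fields:

```typescript
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

// Hypothetical, minimal stand-in for the index.json shape -- illustration only.
interface SketchIndex {
  version: number;
  items: { id: string; vector: number[]; metadata: Record<string, string> }[];
}

const folder = fs.mkdtempSync(path.join(os.tmpdir(), 'vectra-sketch-'));
const index: SketchIndex = {
  version: 1,
  items: [{ id: '1', vector: [0.1, 0.2], metadata: { category: 'food' } }],
};

// Write the index; any language with a JSON parser can read the same folder back.
fs.writeFileSync(path.join(folder, 'index.json'), JSON.stringify(index));
const roundTrip: SketchIndex = JSON.parse(
  fs.readFileSync(path.join(folder, 'index.json'), 'utf8')
);
console.log(roundTrip.items[0].metadata.category); // 'food'
```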

When to use (and when not)

Use Vectra when:

  • You have a small, mostly static corpus (e.g., a few hundred to a few thousand chunks).
  • You want zero‑infrastructure local retrieval with fast, predictable latency.
  • You’re assembling “infinite few‑shot” example libraries or single/small document Q&A.
  • You need portable, file‑based indexes that other languages can read/write.
  • You want simple “namespaces” by using separate folders per dataset.

Avoid Vectra when:

  • You need long‑term, ever‑growing chat memory or very large corpora (the entire index loads into RAM).
  • You require multi‑tenant, networked, or horizontally scalable serving.
  • You need advanced vector DB features like HNSW/IVF indexing, sharding/replication, or distributed operations.

Notes and tips:

  • Mimic namespaces via separate index folders.
  • Index only the metadata fields you’ll filter on; keep everything else in per‑item JSON.
  • Rough sizing: a 1536‑dim float32 vector is ~6 KB, plus JSON/metadata overhead; size indexes accordingly to your RAM budget.
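
The sizing rule of thumb above is simple arithmetic: dimensions × 4 bytes for float32, plus overhead. A back‑of‑envelope helper (the 1.5× overhead factor for JSON encoding and indexed metadata is an assumption, not a measured figure):

```typescript
// Raw float32 storage for one vector: dimensions * 4 bytes.
function vectorBytes(dims: number): number {
  return dims * 4;
}

// Rough index footprint: raw vector bytes times an assumed overhead factor
// for JSON encoding and indexed metadata (1.5x is a guess, not measured).
function estimateIndexBytes(items: number, dims: number, overhead = 1.5): number {
  return Math.round(items * vectorBytes(dims) * overhead);
}

console.log(vectorBytes(1536));                 // 6144 bytes, i.e. ~6 KB per vector
console.log(estimateIndexBytes(10_000, 1536));  // ~92 MB for 10k vectors
```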

Requirements

  • Node.js 20.x or newer
  • A package manager (npm or yarn)
  • An embeddings provider for similarity search:
    • OpenAI (API key + model, e.g., text-embedding-3-large or compatible)
    • Azure OpenAI (endpoint, deployment name, API key)
    • OpenAI‑compatible OSS endpoint (model name + base URL)
  • If you plan to ingest web pages via the CLI or API, outbound network access to those URLs
  • Sufficient RAM to hold your entire index in memory during queries (see “Performance and limits”)

Install

  • npm: npm install vectra
  • yarn: yarn add vectra

CLI usage

  • Run without installing globally: npx vectra --help
  • Optional global install: npm install -g vectra (then use vectra --help)

Quick Start

Two common paths:

  • Path A: you already have vectors (or can generate them) and want to store items + metadata.
  • Path B: you have raw text documents; Vectra will chunk, embed, and retrieve relevant spans.

Path A: LocalIndex (items + metadata)

  • Create a folder‑backed index
  • Choose which metadata fields to index (others are stored per‑item on disk)
  • Insert items (vector + metadata)
  • Query by vector with optional metadata filters

TypeScript example:

import path from 'node:path';
import { LocalIndex } from 'vectra';
import { OpenAI } from 'openai';

// 1) Create an index folder
const index = new LocalIndex(path.join(process.cwd(), 'my-index'));

// 2) Create the index (set which metadata fields you want searchable)
if (!(await index.isIndexCreated())) {
  await index.createIndex({
    version: 1,
    metadata_config: { indexed: ['category'] }, // only these fields live in index.json; others go to per-item JSON
  });
}

// 3) Prepare an embeddings helper (use any provider you like)
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
async function getVector(text: string): Promise<number[]> {
  const resp = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return resp.data[0].embedding;
}

// 4) Insert items
await index.insertItem({
  vector: await getVector('apple'),
  metadata: { text: 'apple', category: 'food', note: 'stored on disk if not indexed' },
});
await index.insertItem({
  vector: await getVector('blue'),
  metadata: { text: 'blue', category: 'color' },
});

// 5) Query by vector, optionally filter by metadata
async function query(text: string) {
  const v = await getVector(text);
  // Signature: queryItems(vector, queryString, topK, filter?)
  const results = await index.queryItems(v, '', 3, { category: { $eq: 'food' } });
  for (const r of results) {
    console.log(r.score.toFixed(4), r.item.metadata.text);
  }
}

await query('banana'); // should surface 'apple' in top results

Supported filter operators (subset): $and, $or, $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin. Only fields listed in metadata_config.indexed are stored inline and should be used for filtering (everything else is kept per‑item on disk).
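
To make the filter semantics concrete, here is a toy evaluator for that operator subset. It is a sketch of the MongoDB‑style matching rules, not Vectra's actual matcher:

```typescript
type Primitive = string | number | boolean;
type Filter = Record<string, any>;

// Evaluate a small subset of MongoDB-style operators against one item's
// metadata. Illustrative only.
function matches(metadata: Record<string, Primitive>, filter: Filter): boolean {
  return Object.entries(filter).every(([key, cond]) => {
    if (key === '$and') return (cond as Filter[]).every((f) => matches(metadata, f));
    if (key === '$or') return (cond as Filter[]).some((f) => matches(metadata, f));
    const value = metadata[key];
    // A bare primitive is shorthand for $eq.
    if (typeof cond !== 'object' || cond === null) return value === cond;
    return Object.entries(cond).every(([op, operand]) => {
      switch (op) {
        case '$eq':  return value === operand;
        case '$ne':  return value !== operand;
        case '$gt':  return (value as number) > (operand as number);
        case '$gte': return (value as number) >= (operand as number);
        case '$lt':  return (value as number) < (operand as number);
        case '$lte': return (value as number) <= (operand as number);
        case '$in':  return (operand as Primitive[]).includes(value);
        case '$nin': return !(operand as Primitive[]).includes(value);
        default:     return false;
      }
    });
  });
}

const item = { category: 'food', calories: 95 };
console.log(matches(item, { category: { $eq: 'food' } })); // true
console.log(matches(item, {
  $and: [{ category: { $in: ['food', 'drink'] } }, { calories: { $lt: 100 } }],
})); // true
```

Remember that only fields in metadata_config.indexed live in index.json, so filters should reference those fields.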

Path B: LocalDocumentIndex (documents + chunking + retrieval)

  • Create a document index backed by an embeddings model
  • Add documents (raw strings, files, or URLs)
  • Query by text; Vectra returns the most relevant chunks grouped by document
  • Render top sections for direct drop‑in to prompts
  • Optional hybrid retrieval: add BM25 keyword matches alongside semantic matches

TypeScript example:

import path from 'node:path';
import { LocalDocumentIndex, OpenAIEmbeddings } from 'vectra';

// 1) Configure embeddings (OpenAI, Azure OpenAI, or OpenAI‑compatible OSS)
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',
  maxTokens: 8000, // batching limit for chunked requests
});

// 2) Create the index
const docs = new LocalDocumentIndex({
  folderPath: path.join(process.cwd(), 'my-doc-index'),
  embeddings,
  // optional: customize chunking
  // chunkingConfig: { chunkSize: 512, chunkOverlap: 0, keepSeparators: true }
});

if (!(await docs.isIndexCreated())) {
  await docs.createIndex({ version: 1 });
}

// 3) Add a document (string); you can also add files/URLs via FileFetcher/WebFetcher or the CLI
const uri = 'doc://welcome';
const text = `
Vectra is a file-backed, in-memory vector DB for Node.js. It supports Pinecone-like metadata filtering
and fast local retrieval. It’s ideal for small, mostly static corpora and prompt augmentation.
`;
await docs.upsertDocument(uri, text, 'md'); // optional docType hints chunking

// 4) Query and render sections for your prompt
const results = await docs.queryDocuments('What is Vectra best suited for?', {
  maxDocuments: 5,
  maxChunks: 20,
  // isBm25: true, // turn on hybrid (semantic + keyword) retrieval
});

// Take top document and render spans of text
if (results.length > 0) {
  const top = results[0];
  console.log('URI:', top.uri, 'score:', top.score.toFixed(4));
  const sections = await top.renderSections(2000, 1, true); // maxTokens per section, number of sections, include overlapping context
  for (const s of sections) {
    console.log('Section score:', s.score.toFixed(4), 'tokens:', s.tokenCount, 'bm25:', s.isBm25);
    console.log(s.text);
  }
}

Notes:

  • queryDocuments returns LocalDocumentResult objects, each with scored chunks. renderSections merges adjacent chunks, keeps within your token budget, and can optionally add overlapping context for readability.
  • Hybrid retrieval: set isBm25: true in queryDocuments to include keyword‑based chunks (Okapi‑BM25) alongside semantic chunks. Each rendered section includes isBm25 to help you distinguish them.
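
For intuition on the keyword side of hybrid retrieval, here is a minimal Okapi BM25 scorer over whitespace tokens. Vectra's own tokenization and parameters may differ; this only illustrates why keyword matches can surface chunks that embedding similarity misses:

```typescript
// Minimal Okapi BM25 over whitespace-tokenized documents. Illustrative only;
// real-world tokenization and normalization are more involved.
function bm25Scores(corpus: string[], query: string, k1 = 1.2, b = 0.75): number[] {
  const tokenized = corpus.map((d) => d.toLowerCase().split(/\s+/).filter(Boolean));
  const avgLen = tokenized.reduce((sum, t) => sum + t.length, 0) / tokenized.length;
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return tokenized.map((tokens) => {
    let score = 0;
    for (const term of terms) {
      const df = tokenized.filter((t) => t.includes(term)).length; // document frequency
      if (df === 0) continue;
      const idf = Math.log(1 + (corpus.length - df + 0.5) / (df + 0.5));
      const tf = tokens.filter((t) => t === term).length;          // term frequency
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * tokens.length) / avgLen));
    }
    return score;
  });
}

const corpus = [
  'vectra is a local vector database',
  'pinecone is a hosted vector database',
  'this sentence is about cooking pasta',
];
const scores = bm25Scores(corpus, 'local vector database');
console.log(scores.map((s) => s.toFixed(3))); // first doc scores highest: it contains all three query terms
```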

CLI in 60 seconds

Three steps: create → add → query. No servers, just a folder.

  1. Create an index folder
npx vectra create ./my-d