# Trieve

All-in-one platform for search, recommendations, RAG, and analytics offered via API
## Features
- 🔒 Self-Hosting in your VPC or on-prem: We have full self-hosting guides for AWS, GCP, Kubernetes generally, and docker compose available on our documentation page here.
- 🧠 Semantic Dense Vector Search: Integrates with OpenAI or Jina embedding models and Qdrant to provide semantic vector search.
- 🔍 Typo Tolerant Full-Text/Neural Search: Every uploaded chunk is vectorized with naver/efficient-splade-VI-BT-large-query for typo tolerant, quality neural sparse-vector search.
- 🖊️ Sub-Sentence Highlighting: Highlight the matching words or sentences within a chunk and bold them on search to enhance UX for your users. Shout out to the simsearch crate!
- 🌟 Recommendations: Find similar chunks (or files if using grouping) with the recommendation API. Very helpful if you have a platform where users favorite, bookmark, or upvote content.
- 🤖 Convenient RAG API Routes: We integrate with OpenRouter to provide you with access to any LLM you would like for RAG. Try our routes for fully-managed RAG with topic-based memory management or select your own context RAG.
- 💼 Bring Your Own Models: If you'd like, you can bring your own text-embedding, SPLADE, cross-encoder re-ranking, and/or large-language model (LLM) and plug it into our infrastructure.
- 🔄 Hybrid Search with cross-encoder re-ranking: For the best results, use hybrid search with BAAI/bge-reranker-large re-rank optimization.
- 📆 Recency Biasing: Easily bias search results for what was most recent to prevent staleness
- 🛠️ Tunable Merchandizing: Adjust relevance using signals like clicks, add-to-carts, or citations
- 🕳️ Filtering: Date-range, substring match, tag, numeric, and other filter types are supported.
- 👥 Grouping: Mark multiple chunks as being part of the same file and search on the file-level such that the same top-level result never appears twice
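As a rough illustration of how the search features above combine, here is a sketch of a hybrid-search request body with a tag filter. The field names and endpoint shown in the comment are assumptions based on the Trieve API docs; the dataset ID, API key, and tag values are placeholders.

```sh
# Build a sample request body for hybrid search with a tag filter
# (field names assumed from the Trieve API docs; values are placeholders).
cat > /tmp/trieve_search_body.json <<'EOF'
{
  "query": "red wool sweater",
  "search_type": "hybrid",
  "filters": {
    "must": [
      { "field": "tag_set", "match": ["apparel"] }
    ]
  }
}
EOF

# The request itself would look roughly like this (not executed here):
#   curl -X POST https://api.trieve.ai/api/chunk/search \
#     -H "Authorization: $TRIEVE_API_KEY" \
#     -H "TR-Dataset: $TRIEVE_DATASET_ID" \
#     -H "Content-Type: application/json" \
#     -d @/tmp/trieve_search_body.json
echo "wrote /tmp/trieve_search_body.json"
```

See the OpenAPI reference (served at `localhost:8090/redoc` in local dev) for the authoritative request schema.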
Are we missing a feature that your use case needs? Call us at 628-222-4090, open a GitHub issue, or join the Matrix community and tell us! We are a small company that is still very hands-on and eager to build what you need; professional services are available.
## Local Development

### Installing via Smithery

To install Trieve for Claude Desktop automatically via Smithery:

```sh
npx -y @smithery/cli install trieve-mcp-server --client claude
```
### System Dependencies

#### Linux (Debian/Ubuntu)

```sh
sudo apt install curl \
    gcc \
    g++ \
    make \
    pkg-config \
    python3 \
    python3-pip \
    libpq-dev \
    libssl-dev \
    openssl
```

#### Linux (Arch)

```sh
sudo pacman -S base-devel postgresql-libs
```

#### MacOS

```sh
# Install Xcode command line tools
xcode-select --install

# Install Homebrew if not already installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install required packages
brew install pkg-config openssl
```
### Install NodeJS and Yarn

You can install NVM using its install script:

```sh
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash
```

Restart your terminal so NVM is added to your shell profile. Then install the NodeJS LTS release and Yarn:

```sh
nvm install --lts
npm install -g yarn
```

### Make the server tmp dir

```sh
mkdir server/tmp
```

### Install Rust

```sh
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

### Install cargo-watch

```sh
cargo install cargo-watch
```
### Set up env files

You might need to create the analytics directory in ./frontends.

```sh
cp .env.analytics ./frontends/analytics/.env
cp .env.chat ./frontends/chat/.env
cp .env.search ./frontends/search/.env
cp .env.example ./server/.env
cp .env.dashboard ./frontends/dashboard/.env
```

### Add your LLM_API_KEY to ./server/.env

Here is a guide for acquiring that.

Steps once you have the key:

- Open the `./server/.env` file.
- Replace the value for `LLM_API_KEY` with your own OpenAI API key.
- Replace the value for `OPENAI_API_KEY` with your own OpenAI API key.
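You can also script the substitution. The sketch below demonstrates it against a throwaway file using GNU `sed` (on macOS, `sed -i` needs an extra `''` argument); for the real setup, run it against `server/.env`, and note that the key value here is a placeholder.

```sh
# Demo of the key substitution in a throwaway file; the key value is a
# placeholder, and the real target would be server/.env.
printf 'LLM_API_KEY=\nOPENAI_API_KEY=\n' > /tmp/demo.env
sed -i 's/^LLM_API_KEY=.*/LLM_API_KEY=sk-your-openai-key/' /tmp/demo.env
sed -i 's/^OPENAI_API_KEY=.*/OPENAI_API_KEY=sk-your-openai-key/' /tmp/demo.env
grep 'API_KEY' /tmp/demo.env
```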
### Override the following keys in your terminal for local dev

`PAGEFIND_CDN_BASE_URL` and `S3_SECRET_KEY_CSVJSONL` can be set to a random list of strings.

```sh
export OPENAI_API_KEY="your_OpenAI_api_key" \
    LLM_API_KEY="your_OpenAI_api_key" \
    PAGEFIND_CDN_BASE_URL="lZP8X4h0Q5Sj2ZmV,aAmu1W92T6DbFUkJ,DZ5pMvz8P1kKNH0r,QAqwvKh8rI5sPmuW,YMwgsBz7jLfN0oX8" \
    S3_SECRET_KEY_CSVJSONL="Gq6wzS3mjC5kL7i4KwexnL3gP8Z1a5Xv,V2c4ZnL0uHqBzFvR2NcN8Pb1g6CjmX9J,TfA1h8LgI5zYkH9A9p7NvWlL0sZzF9p8N,pKr81pLq5n6MkNzT1X09R7Qb0Vn5cFr0d,DzYwz82FQiW6T3u9A4z9h7HLOlJb7L2V1" \
    GROQ_API_KEY="GROQ_API_KEY_if_applicable"
```
### Start docker container services needed for local dev

```sh
cat .env.chat .env.search .env.server .env.docker-compose > .env
./convenience.sh -l
```
### Start embedding servers

We offer two docker-compose files for embedding servers: one for GPU and one for CPU.

```sh
docker compose -f docker-compose-cpu-embeddings.yml up -d
```

or

```sh
docker compose -f docker-compose-gpu-embeddings.yml up -d
```

Note: if you want to use a separate GPU-enabled device for the embedding servers, you will need to update the following parameters:

- `SPARSE_SERVER_QUERY_ORIGIN`
- `SPARSE_SERVER_DOC_ORIGIN`
- `EMBEDDING_SERVER_ORIGIN`
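For example, pointing those parameters at a remote GPU box might look like the following. The host and ports here are placeholders for illustration, not defaults from this repo.

```sh
# Hypothetical origins for a remote GPU-enabled embedding host
# (10.0.0.5 and the ports are placeholders, not repo defaults).
export EMBEDDING_SERVER_ORIGIN="http://10.0.0.5:9999"
export SPARSE_SERVER_QUERY_ORIGIN="http://10.0.0.5:5000"
export SPARSE_SERVER_DOC_ORIGIN="http://10.0.0.5:5001"
echo "embedding origin: $EMBEDDING_SERVER_ORIGIN"
```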
#### Using CPU embeddings on MacOS

```sh
cp docker-compose-cpu-embeddings.override.mac docker-compose-cpu-embeddings.override.yml
docker compose -f docker-compose-cpu-embeddings.yml -f docker-compose-cpu-embeddings.override.yml up -d
```
### Install front-end packages for local dev

```sh
cd frontends
yarn
cd ..
```

```sh
cd clients/ts-sdk
yarn build
cd ../..
```
### Start services for local dev

It is recommended to manage these services through tmuxp (see the guide here) or terminal tabs.

```sh
cd frontends
yarn
yarn dev
```

```sh
cd server
cargo watch -x run
```

```sh
cd server
cargo run --bin ingestion-worker
```

```sh
cd server
cargo run --bin file-worker
```

```sh
cd server
cargo run --bin delete-worker
```

```sh
cd search
yarn
yarn dev
```
### Verify Working Setup

After the cargo build has finished (after `tmuxp load trieve`):

- Check that you can see the Redoc OpenAPI reference at localhost:8090/redoc.
- Make an account and create a dataset with test data at localhost:5173.
- Search that dataset with test data at localhost:5174.
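A small script can do the first reachability check for you. This sketch assumes the API server port from this guide; the `/api/health` route name is an assumption, so substitute whatever route the Redoc page lists if it differs.

```sh
# Reachability check for the local API server; port is from this guide,
# the /api/health route name is an assumption.
status="down"
if command -v curl >/dev/null 2>&1 && curl -fsS "http://localhost:8090/api/health" >/dev/null 2>&1; then
  status="up"
fi
echo "API server is $status (localhost:8090)"
```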
### Additional Instructions for testing cross encoder reranking models

To test the cross encoder rerankers in local dev:

- Click on the dataset, go to Dataset Settings -> Dataset Options -> Additional Options, and uncheck the `Fulltext Enabled` option.
- In the Embedding Settings, select your reranker model and enter the respective key.
