# Trieve

All-in-one platform for search, recommendations, RAG, and analytics offered via API
## Features
- 🔒 Self-Hosting in your VPC or on-prem: We have full self-hosting guides for AWS, GCP, Kubernetes generally, and docker compose available on our documentation page here.
- 🧠 Semantic Dense Vector Search: Integrates with OpenAI or Jina embedding models and Qdrant to provide semantic vector search.
- 🔍 Typo Tolerant Full-Text/Neural Search: Every uploaded chunk is vectorized with naver/efficient-splade-VI-BT-large-query for typo tolerant, quality neural sparse-vector search.
- 🖊️ Sub-Sentence Highlighting: Highlight the matching words or sentences within a chunk and bold them on search to enhance UX for your users. Shout out to the simsearch crate!
- 🌟 Recommendations: Find similar chunks (or files if using grouping) with the recommendation API. Very helpful if you have a platform where users favorite, bookmark, or upvote content.
- 🤖 Convenient RAG API Routes: We integrate with OpenRouter to provide you with access to any LLM you would like for RAG. Try our routes for fully-managed RAG with topic-based memory management or select your own context RAG.
- 💼 Bring Your Own Models: If you'd like, you can bring your own text-embedding, SPLADE, cross-encoder re-ranking, and/or large-language model (LLM) and plug it into our infrastructure.
- 🔄 Hybrid Search with cross-encoder re-ranking: For the best results, use hybrid search with BAAI/bge-reranker-large re-rank optimization.
- 📆 Recency Biasing: Easily bias search results for what was most recent to prevent staleness
- 🛠️ Tunable Merchandizing: Adjust relevance using signals like clicks, add-to-carts, or citations
- 🕳️ Filtering: Date-range, substring match, tag, numeric, and other filter types are supported.
- 👥 Grouping: Mark multiple chunks as being part of the same file and search on the file-level such that the same top-level result never appears twice
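As a rough illustration of how the search features above combine, here is a sketch of a hybrid-search request body with a tag filter. The field names and endpoint shown in the comment are assumptions based on the Trieve API docs; the dataset ID, API key, and tag values are placeholders.

```sh
# Build a sample request body for hybrid search with a tag filter
# (field names assumed from the Trieve API docs; values are placeholders).
cat > /tmp/trieve_search_body.json <<'EOF'
{
  "query": "red wool sweater",
  "search_type": "hybrid",
  "filters": {
    "must": [
      { "field": "tag_set", "match": ["apparel"] }
    ]
  }
}
EOF

# The request itself would look roughly like this (not executed here):
#   curl -X POST https://api.trieve.ai/api/chunk/search \
#     -H "Authorization: $TRIEVE_API_KEY" \
#     -H "TR-Dataset: $TRIEVE_DATASET_ID" \
#     -H "Content-Type: application/json" \
#     -d @/tmp/trieve_search_body.json
echo "wrote /tmp/trieve_search_body.json"
```

See the OpenAPI reference (served at `localhost:8090/redoc` in local dev) for the authoritative request schema.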
Are we missing a feature that your use case needs? Call us at 628-222-4090, open a GitHub issue, or join the Matrix community and tell us! We are a small company that is still very hands-on and eager to build what you need; professional services are available.
## Local Development

### Installing via Smithery

To install Trieve for Claude Desktop automatically via Smithery:

```sh
npx -y @smithery/cli install trieve-mcp-server --client claude
```
### System Dependencies

#### Linux (Debian/Ubuntu)

```sh
sudo apt install curl \
    gcc \
    g++ \
    make \
    pkg-config \
    python3 \
    python3-pip \
    libpq-dev \
    libssl-dev \
    openssl
```

#### Linux (Arch)

```sh
sudo pacman -S base-devel postgresql-libs
```

#### MacOS

```sh
# Install Xcode command line tools
xcode-select --install

# Install Homebrew if not already installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install required packages
brew install pkg-config openssl
```
### Install NodeJS and Yarn

You can install NVM using its install script:

```sh
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash
```

Restart your terminal so NVM is added to your shell profile. Then install the NodeJS LTS release and Yarn:

```sh
nvm install --lts
npm install -g yarn
```

### Make the server tmp dir

```sh
mkdir server/tmp
```

### Install Rust

```sh
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

### Install cargo-watch

```sh
cargo install cargo-watch
```
### Set up env files

You might need to create the analytics directory in ./frontends.

```sh
cp .env.analytics ./frontends/analytics/.env
cp .env.chat ./frontends/chat/.env
cp .env.search ./frontends/search/.env
cp .env.example ./server/.env
cp .env.dashboard ./frontends/dashboard/.env
```

### Add your LLM_API_KEY to ./server/.env

Here is a guide for acquiring that.

Steps once you have the key:

- Open the `./server/.env` file.
- Replace the value for `LLM_API_KEY` with your own OpenAI API key.
- Replace the value for `OPENAI_API_KEY` with your own OpenAI API key.
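You can also script the substitution. The sketch below demonstrates it against a throwaway file using GNU `sed` (on macOS, `sed -i` needs an extra `''` argument); for the real setup, run it against `server/.env`, and note that the key value here is a placeholder.

```sh
# Demo of the key substitution in a throwaway file; the key value is a
# placeholder, and the real target would be server/.env.
printf 'LLM_API_KEY=\nOPENAI_API_KEY=\n' > /tmp/demo.env
sed -i 's/^LLM_API_KEY=.*/LLM_API_KEY=sk-your-openai-key/' /tmp/demo.env
sed -i 's/^OPENAI_API_KEY=.*/OPENAI_API_KEY=sk-your-openai-key/' /tmp/demo.env
grep 'API_KEY' /tmp/demo.env
```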
### Override the following keys in your terminal for local dev

`PAGEFIND_CDN_BASE_URL` and `S3_SECRET_KEY_CSVJSONL` can be set to a random list of strings.

```sh
export OPENAI_API_KEY="your_OpenAI_api_key" \
    LLM_API_KEY="your_OpenAI_api_key" \
    PAGEFIND_CDN_BASE_URL="lZP8X4h0Q5Sj2ZmV,aAmu1W92T6DbFUkJ,DZ5pMvz8P1kKNH0r,QAqwvKh8rI5sPmuW,YMwgsBz7jLfN0oX8" \
    S3_SECRET_KEY_CSVJSONL="Gq6wzS3mjC5kL7i4KwexnL3gP8Z1a5Xv,V2c4ZnL0uHqBzFvR2NcN8Pb1g6CjmX9J,TfA1h8LgI5zYkH9A9p7NvWlL0sZzF9p8N,pKr81pLq5n6MkNzT1X09R7Qb0Vn5cFr0d,DzYwz82FQiW6T3u9A4z9h7HLOlJb7L2V1" \
    GROQ_API_KEY="GROQ_API_KEY_if_applicable"
```
### Start docker container services needed for local dev

```sh
cat .env.chat .env.search .env.server .env.docker-compose > .env
./convenience.sh -l
```
### Start embedding servers

We offer two docker-compose files for embedding servers: one for GPU and one for CPU.

```sh
docker compose -f docker-compose-cpu-embeddings.yml up -d
```

or

```sh
docker compose -f docker-compose-gpu-embeddings.yml up -d
```

Note: if you want to use a separate GPU-enabled device for the embedding servers, you will need to update the following parameters:

- `SPARSE_SERVER_QUERY_ORIGIN`
- `SPARSE_SERVER_DOC_ORIGIN`
- `EMBEDDING_SERVER_ORIGIN`
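For example, pointing those parameters at a remote GPU box might look like the following. The host and ports here are placeholders for illustration, not defaults from this repo.

```sh
# Hypothetical origins for a remote GPU-enabled embedding host
# (10.0.0.5 and the ports are placeholders, not repo defaults).
export EMBEDDING_SERVER_ORIGIN="http://10.0.0.5:9999"
export SPARSE_SERVER_QUERY_ORIGIN="http://10.0.0.5:5000"
export SPARSE_SERVER_DOC_ORIGIN="http://10.0.0.5:5001"
echo "embedding origin: $EMBEDDING_SERVER_ORIGIN"
```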
#### Using CPU embeddings on MacOS

```sh
cp docker-compose-cpu-embeddings.override.mac docker-compose-cpu-embeddings.override.yml
docker compose -f docker-compose-cpu-embeddings.yml -f docker-compose-cpu-embeddings.override.yml up -d
```
### Install front-end packages for local dev

```sh
cd frontends
yarn
cd ..
```

```sh
cd clients/ts-sdk
yarn build
cd ../..
```
### Start services for local dev

It is recommended to manage these services through tmuxp (see the guide here) or terminal tabs.

```sh
cd frontends
yarn
yarn dev
```

```sh
cd server
cargo watch -x run
```

```sh
cd server
cargo run --bin ingestion-worker
```

```sh
cd server
cargo run --bin file-worker
```

```sh
cd server
cargo run --bin delete-worker
```

```sh
cd search
yarn
yarn dev
```
### Verify Working Setup

After the cargo build has finished (after `tmuxp load trieve`):

- Check that you can see the Redoc OpenAPI reference at localhost:8090/redoc.
- Make an account and create a dataset with test data at localhost:5173.
- Search that dataset with test data at localhost:5174.
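A small script can do the first reachability check for you. This sketch assumes the API server port from this guide; the `/api/health` route name is an assumption, so substitute whatever route the Redoc page lists if it differs.

```sh
# Reachability check for the local API server; port is from this guide,
# the /api/health route name is an assumption.
status="down"
if command -v curl >/dev/null 2>&1 && curl -fsS "http://localhost:8090/api/health" >/dev/null 2>&1; then
  status="up"
fi
echo "API server is $status (localhost:8090)"
```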
### Additional Instructions for testing cross encoder reranking models

To test the cross encoder rerankers in local dev:

- Click on the dataset, go to Dataset Settings -> Dataset Options -> Additional Options, and uncheck the `Fulltext Enabled` option.
- In the Embedding Settings, select your reranker model and enter the respective key.
