Canopy
Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone
<p align="center"> <a href="https://pypi.org/project/canopy-sdk" target="_blank"> <img src="https://img.shields.io/pypi/pyversions/canopy-sdk" alt="Supported Python versions"> </a> <a href="https://pypi.org/project/canopy-sdk" target="_blank"> <img src="https://img.shields.io/pypi/v/canopy-sdk?label=pypi%20package" alt="Package version"> </a> </p>

> [!NOTE]
> The Canopy team is no longer maintaining this repository. Thank you for your support and enthusiasm for the project! If you're looking for a high-quality managed RAG solution with continued updates and improvements, please check out the Pinecone Assistant.
Canopy is an open-source Retrieval Augmented Generation (RAG) framework and context engine built on top of the Pinecone vector database. Canopy enables you to quickly and easily experiment with and build applications using RAG. Start chatting with your documents or text data with a few simple commands.
Canopy takes on the heavy lifting for building RAG applications: from chunking and embedding your text data to chat history management, query optimization, context retrieval (including prompt engineering), and augmented generation.
Canopy provides a configurable built-in server so you can effortlessly deploy a RAG-powered chat application to your existing chat UI or interface. Or you can build your own, custom RAG application using the Canopy library.
Canopy lets you evaluate your RAG workflow with a CLI based chat tool. With a simple command in the Canopy CLI you can interactively chat with your text data and compare RAG vs. non-RAG workflows side-by-side.
Check out our blog post to learn more, or see a quick tutorial here.
RAG with Canopy

Canopy implements the full RAG workflow to prevent hallucinations and augment your LLM with your own text data.
Canopy has two flows: knowledge base creation and chat. In the knowledge base creation flow, users upload their documents and transform them into meaningful representations stored in Pinecone's Vector Database. In the chat flow, incoming queries and chat history are optimized to retrieve the most relevant documents, the knowledge base is queried, and a meaningful context is generated for the LLM to answer.
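The knowledge base creation flow starts with chunking. As a rough illustration of the idea (this is a simplified word-based sketch, not Canopy's actual token-based chunker), splitting a document into overlapping chunks might look like:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping word-based chunks.

    Overlap keeps context that straddles a chunk boundary retrievable
    from either side. Canopy's real chunker works on tokens, not words.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk is then embedded and upserted to the Pinecone index; at query time, the most similar chunks are retrieved and assembled into the LLM's context.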
What's inside the box?
- Canopy Core Library - The library has 3 main classes that are responsible for different parts of the RAG workflow:
  - ChatEngine - Exposes a chat interface to interact with your data. Given the history of chat messages, the ChatEngine formulates relevant queries to the ContextEngine, then uses the LLM to generate a knowledgeable response.
  - ContextEngine - Performs the "retrieval" part of RAG. The ContextEngine utilizes the underlying KnowledgeBase to retrieve the most relevant documents, then formulates a coherent textual context to be used as a prompt for the LLM.
  - KnowledgeBase - Manages your data for the RAG workflow. It automatically chunks and transforms your text data into text embeddings, storing them in a Pinecone (default) or Qdrant vector database. Given a text query, the knowledge base retrieves the most relevant document chunks from the database.
More information about the Core Library usage can be found in the Library Documentation
- Canopy Server - This is a webservice that wraps the Canopy Core library and exposes it as a REST API. The server is built on top of FastAPI, Uvicorn and Gunicorn and can be easily deployed in production. The server also comes with a built-in Swagger UI for easy testing and documentation. After you start the server, you can access the Swagger UI at http://host:port/docs (default: http://localhost:8000/docs).
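Since the server speaks an OpenAI-style chat API, a client request can be built with only the standard library. This is a hedged sketch: the base URL and the `/v1/chat/completions` path are assumptions for a default local deployment; check the Swagger UI at `/docs` for the exact routes your server exposes.

```python
import json
import urllib.request

def build_chat_request(messages, base_url="http://localhost:8000/v1"):
    """Build an OpenAI-style chat completion request for a local Canopy server.

    `messages` is a list of {"role": ..., "content": ...} dicts.
    """
    payload = {"messages": messages, "stream": False}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request([{"role": "user", "content": "What is Canopy?"}])
# To actually send it (requires a running Canopy server):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```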
- Canopy CLI - A built-in development tool that allows users to swiftly set up their own Canopy server and test its configuration.
With just three CLI commands, you can create a new Canopy server, upload your documents to it, and then interact with the Chatbot using a built-in chat application directly from the terminal. The built-in chatbot also enables comparison of RAG-infused responses against a native LLM chatbot.
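A typical terminal session might look like the sketch below. The data path is illustrative, and flag names may vary between versions; run `canopy --help` to confirm the commands available in your installation.

```shell
canopy new                    # create a new Pinecone index configured for Canopy
canopy upsert /path/to/data   # chunk, embed, and upload your documents
canopy start                  # start the Canopy server (leave running in one terminal)
canopy chat                   # in another terminal: chat with your data, RAG-infused
```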
Setup
- Set up a virtual environment (optional):

```bash
python3 -m venv canopy-env
source canopy-env/bin/activate
```
More information about virtual environments can be found here
- Install the package:

```bash
pip install canopy-sdk
```
<details>
<summary>You can also install canopy-sdk with extras. <b><u>CLICK HERE</u></b> to see the available extras
<br />
</summary>
Extras
| Name | Description |
| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| grpc | To unlock some performance improvements by working with the GRPC version of the Pinecone Client |
| torch | To enable embeddings provided by sentence-transformers |
| transformers | If you are using Anyscale LLMs, it's recommended to use LLamaTokenizer tokenizer which requires transformers as dependency |
| cohere | To use Cohere reranker or/and Cohere LLM |
| qdrant | To use Qdrant as an alternate knowledge base |

</details>
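Extras are installed with the standard pip bracket syntax, for example (picking two extras from the table above):

```shell
pip install "canopy-sdk[grpc,cohere]"
```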
- Set up the environment variables:

```bash
export PINECONE_API_KEY="<PINECONE_API_KEY>"
export OPENAI_API_KEY="<OPENAI_API_KEY>"
export INDEX_NAME="<INDEX_NAME>"
```
<details>
<summary><b><u>CLICK HERE</u></b> for more information about the environment variables
<br />
</summary>
Mandatory Environment Variables
| Name | Description | How to get it? |
|-----------------------|-----------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PINECONE_API_KEY | The API key for Pinecone. Used to authenticate to Pinecone services to create indexes and to insert, delete and search data | Register or log into your Pinecone account in the console. You can access your API key from the "API Keys" section in the sidebar of your dashboard |
| OPENAI_API_KEY | API key for OpenAI. Used to authenticate to OpenAI's services for embedding and chat API | You can find your OpenAI API key here. You might need to login or register to OpenAI services |
| INDEX_NAME | Name of the Pinecone index Canopy will work with | You can choose any name as long as it follows Pinecone's restrictions |
| CANOPY_CONFIG_FILE | The path of a configuration yaml file to be used by the Canopy server. | Optional - if not provided, the default configuration will be used |
Optional Environment Variables
These optional environment variables are used to authenticate to other supported services for embeddings and LLMs. If you configure Canopy to use any of these providers, you will need to set the relevant environment variables.
| Name | Description | How to get it? |
|-----------------------|-----------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| ANYSCALE_API_KEY | API key for Anyscale. Used to authenticate to Anyscale Endpoints for open source LLMs | You can register for Anyscale Endpoints and find your API key here |
| CO_API_KEY | API key for Cohere. Used to authenticate to Cohere services for embedding | You can find more information on registering to Cohere here |
| JINA_API_KEY | API key for Jina AI. Used to authenticate to Jina AI's services for embedding | You can find your Jina API key here |

</details>
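Before starting the server, a quick sanity check can confirm the mandatory variables from the table above are set. This is a small helper sketch; the variable names follow the tables above, and CANOPY_CONFIG_FILE is deliberately left out because it is optional.

```python
import os

# The mandatory variables the Canopy server reads at startup.
REQUIRED_VARS = ("PINECONE_API_KEY", "OPENAI_API_KEY", "INDEX_NAME")

def missing_env_vars(required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]

missing = missing_env_vars()
print("Missing variables:", missing or "none")
```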
