
<div align="center"> <a href="https://www.colette.chat/"> <img src="https://www.colette.chat/img/colette_logo.png" width="320" alt="colette logo"> </a> </div> <p align="center"> <b>Search and interact locally with technical documents of any kind</b> </p>

What is Colette?

Colette is an open-source, self-hosted RAG and LLM serving software. It is well suited to searching and interacting with technical documents that cannot be leaked to external APIs.

As its core feature, Colette embeds a Vision-RAG (V-RAG) that transforms and analyzes all documents as images. This preserves and handles all visual elements in documents, such as images, figures, schemas, visual highlights and layouts. The underlying idea is that most documents are targeted at human eyes, and can thus be analyzed more thoroughly by vision and multimodal LLMs.

Colette was co-financed by Jolibrain, CNES and Airbus.

Demo

https://github.com/user-attachments/assets/7e36b4af-880a-4260-af61-3041b7d60439

Key Features

  • 📊 Vision Retrieval-Augmented Generation (V-RAG) system, combining the Document Screenshot Embedding/ColPali retrievers for document retrieval with a Vision Language Model (VLM)

  • 📚 Text-based RAG system, combining unstructured-based text extraction, text embeddings and common LLMs

  • 🚀 Multi-Model Support for both embedders and inference VLMs

  • 🎨 Image Generation Integration with diffusers

  • 🚀 Effortless Setup, dockerized; our tests show decent results on many corpora, including technical documentation with images, figures and schemas

System Architecture

Get Started

Prerequisites

  • Python >= 3.12
  • CUDA >= 12.1
  • GPU >= 24GB
  • RAM >= 16GB
  • Disk >= 50GB
  • Docker >= 24.0.0 & Docker Compose >= v2.26.1

    If you have not installed Docker on your local machine (Windows, Mac, or Linux), see Install Docker Engine.

NOTE: Colette requires a GPU with at least 24GB of VRAM to run the default models. If you have less VRAM, you can try changing the models to lighter ones in the configuration files, but performance may be impacted.

The default config file, vrag_default_lite.json, is designed to run on 24GB-VRAM GPUs. If you have multiple GPUs, you can try vrag_default.json, which uses larger models and should provide better results, but also requires multiple GPUs.
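If you need to swap in lighter models, one pattern is to load the JSON config, override the relevant entries, and save a private copy. The key names below (`llm`, `source`) and the checkpoint value are illustrative assumptions, not the actual schema — inspect `vrag_default_lite.json` for the real structure. A minimal sketch:

```python
import json

# Stand-in for the shipped config; the real vrag_default_lite.json has many more keys.
config = {
    "app": {"repository": "app_colette"},
    "llm": {"source": "default-vlm"},  # hypothetical key names
}

# Override the model entry to point at a smaller checkpoint (illustrative value).
config["llm"]["source"] = "some/smaller-vlm-checkpoint"

# Serialize a private copy rather than editing the shipped file in place.
serialized = json.dumps(config, indent=2)
print(serialized)
```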

Docker (recommended)

The easiest way to get started is with Docker. If you wish to install from source, see Developer Setup.

For rebuilding images after local code changes, see Container Rebuild Guide.

Jenkins (Container Publish)

Use the dedicated pipeline file Jenkinsfile.images for container build/push automation.

For a Jenkins Multibranch Pipeline job:

  1. Set Script Path to Jenkinsfile.images.

  2. Keep the root Jenkinsfile for test CI only.

  3. The image pipeline builds colette_gpu, colette_gpu_server, and colette_ui with a short SHA tag.

  4. Push runs only on main and release/* branches; latest is refreshed on main only.

  5. Optional RUN_INTEGRATION=true runs containerized integration tests via ci/container_integration.sh before push.

  1. Pull the Docker image:

```bash
docker pull docker.jolibrain.com/colette_gpu:latest
```

  2. Create folders for models and app_colette:

```bash
mkdir -p models
mkdir -p app_colette
```

This ensures that your user owns these folders, and the files created inside them, when the Docker containers run.
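Since the containers run with your UID via `--user $(id -u):$(id -g)`, the bind-mounted folders must exist and be writable beforehand. A small illustrative pre-flight check (not part of Colette):

```python
import os

def writable_dir(path: str) -> bool:
    """Create the directory if needed and check the current user can write to it."""
    os.makedirs(path, exist_ok=True)
    return os.access(path, os.W_OK)

# Check the two bind-mounted folders used by the docker commands.
for d in ("models", "app_colette"):
    status = "ok" if writable_dir(d) else "NOT writable -- fix ownership first"
    print(f"{d}: {status}")
```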

  3. Index your data:

```bash
docker run --gpus all --user $(id -u):$(id -g) \
  -e HOME=/tmp \
  -v $PWD:/rag \
  -v $PWD/docs:/data \
  -v $PWD/models:/app/models \
  docker.jolibrain.com/colette_gpu \
  bash -c "git config --global --add safe.directory /app && colette_cli index --app-dir /rag/app_colette --data-dir /data/pdf --config-file src/colette/config/vrag_default_lite.json --models-dir /app/models"
```

  4. Test by sending a question:

```bash
docker run --gpus all --user $(id -u):$(id -g) \
  -e HOME=/tmp \
  -v $PWD:/rag \
  -v $PWD/app_colette:/app/app_colette \
  -v $PWD/models:/models \
  docker.jolibrain.com/colette_gpu \
  bash -c "git config --global --add safe.directory /app && colette_cli chat --app-dir app_colette --models-dir /models --msg \"What are the identified sources of errors of a RAG?\""
```

Activate venv_colette for Command line & Developer Setup (Python API)

  1. Clone the repo:

```bash
git clone https://github.com/jolibrain/colette.git
```

  2. Create a virtual environment and install dependencies:

```bash
cd colette
chmod +x create_venv_colette.sh
./create_venv_colette.sh
source venv_colette/bin/activate
```

NOTE: This process may take a while, as there are many dependencies to install and some of them require compilation.

For platform-specific source setup, use:

  • create_venv_colette_ARM.sh for ARM machines (see docs/source/users/get_started_ARM_machine.md)
  • create_venv_colette_DGX.sh for DGX machines (see docs/source/users/get_started_DGX_machine.md)

Command Line Interface (CLI)

(don't forget to activate the virtual environment, see above)

Index the data

Let's index a PDF slide deck from docs/pdf:

```bash
colette_cli index --app-dir app_colette --data-dir docs/pdf/ --config-file src/colette/config/vrag_default_lite.json
```

Test with a question

```bash
colette_cli chat --app-dir app_colette --msg "What are the identified sources of errors ?" #--crop-label "text"
```

Python API

(don't forget to activate the virtual environment, see above)

The example below is also available in examples/get_start_python_api.py. There is also a Jupyter notebook version in examples/get_start_python_api.ipynb. For text-search-only examples, see examples/text_search_demo.py and examples/text_search_demo.ipynb.

Index PDFs and query
import json
import re
import base64
from io import BytesIO
from PIL import Image

import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

from colette.jsonapi import JSONApi
from colette.apidata import APIData

# Get the root path of the colette package
import os
colette_root = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
print(f'Colette root path: {colette_root}')

colette_api = JSONApi()

documents_dir = os.path.join(colette_root, 'docs/pdf') # where the input documents are located
app_dir = os.path.join(colette_root, 'app_colette') # where to store the app
models_dir = os.path.join(colette_root, 'models') # where the models are located
app_name = 'app_colette' # name of the app

# read the configuration file
config_file = os.path.join(colette_root, 'src/colette/config/vrag_default_lite.json')
index_file = os.path.join(colette_root, 'src/colette/config/vrag_default_index.json')

with open(config_file, 'r') as f:
    create_config = json.load(f)
with open(index_file, 'r') as f:
    index_config = json.load(f)

create_config['app']['repository'] = app_dir
create_config['app']['models_repository'] = models_dir
index_config['parameters']['input']['data'] = [documents_dir]
# Index in hybrid mode so embedding + text-search retrieval are both available.
index_config['parameters']['input']['rag']['retrieval_mode'] = 'hybrid'
#index_config['parameters']['input']['rag']['reindex'] = False # if True, the RAG will be reindexed

# Create the service
api_data_create = APIData(**create_config)
colette_api.service_create(app_name, api_data_create)

# Index the documents
api_data_index = APIData(**index_config)
colette_api.service_index(app_name, api_data_index)

# Note the optional 'crop_label' parameter to filter the sources by crop label
# The default crop labels are: 'text', 'table', 'figure'

# Query the vision RAG
query_api_msg = {
    'parameters': {
        'input': {
            'message': 'What are the identified sources of errors ?',
            # 'crop_label': 'text'
        }
    }
}
query_data = APIData(**query_api_msg)
response = colette_api.service_predict(app_name, query_data)

# Get the text output
print(response.output)

# Get the image sources
for item in response.sources['context']:
    print(f"Key: {item['key']}, Distance: {item['distance']}")

    # Extract base64 string (remove 'data:image/png;base64,' prefix)
    base64_data = re.sub('^data:image/.+;base64,', '', item['content'])

    # Decode base64 string
    image_data = base64.b64decode(base64_data)
    
    # Create PIL Image
    image = Image.open(BytesIO(image_data))

    # Export image (optional)
    image_filename = f"{item['key']}.png"
    image.save(image_filename)
    print(f"Image saved as: {image_filename}")

# Optional: override retrieval mode per request
query_text_search = {
    'parameters': {
        'input': {
            'message': 'What are the identified sources of errors ?',
            'rag': {'retrieval_mode': 'text_search_retrieval'}
        }
    }
}
response_text_search = colette_api.service_predict(app_name, APIData(**query_text_search))
for hit in response_text_search.sources.get('text_context', []):
    print(hit['source'], hit.get('page_number'), hit.get('score'))

# Notes:
# - If you index with retrieval_mode='hybrid', both embedding and text-search data are available.
# - The indexing retrieval_mode is persisted in app_colette/config.json.
# - If chat/predict requests omit parameters.input.rag.retrieval_mode,
#   Colette falls back to the persisted value from config.json.

Retrieval modes

parameters.input.rag.retrieval_mode supports:

  • embedding_retrieval
  • text_search_retrieval
  • hybrid

See Text Search Engine for details.
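When composing predict payloads by hand, it can help to validate the mode string up front. Below is a hypothetical helper (not part of the Colette API) that builds the request dict passed to `APIData`:

```python
# The three modes accepted by parameters.input.rag.retrieval_mode.
VALID_RETRIEVAL_MODES = {"embedding_retrieval", "text_search_retrieval", "hybrid"}

def build_query(message, retrieval_mode=None):
    """Build a predict payload; retrieval_mode=None falls back to the
    value persisted in app_colette/config.json at indexing time."""
    inp = {"message": message}
    if retrieval_mode is not None:
        if retrieval_mode not in VALID_RETRIEVAL_MODES:
            raise ValueError(f"unknown retrieval_mode: {retrieval_mode!r}")
        inp["rag"] = {"retrieval_mode": retrieval_mode}
    return {"parameters": {"input": inp}}

# Example: force text-search retrieval for one request.
payload = build_query("What are the identified sources of errors ?", "text_search_retrieval")
```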

Configurations

Colette use
