Qinco
Residual Quantization with Implicit Neural Codebooks
Vector Compression and Search with Improved Implicit Neural Codebooks (QINCo2)
This repository has been updated with the code from QINCo2. To access the original QINCo1 code, see the qinco_v1 directory.
This code repository corresponds to the paper QINCo2: Vector Compression and Search with Improved Implicit Neural Codebooks, introducing an improved quantization process over QINCo. We also include code reproducing the ICML'24 paper Residual Quantization with Implicit Neural Codebooks (qinco_v1 directory), in which Quantization with Implicit Neural Codebooks (QINCo) was proposed. Please read both papers to learn about QINCo and QINCo2.
QINCo is a neurally-augmented algorithm for multi-codebook vector quantization, specifically residual quantization (RQ). Instead of using a fixed codebook per quantization step, QINCo uses a neural network to predict the codebook for the next quantization step, conditioned on the vector quantized so far. In other words, the codebooks to be used depend on the Voronoi cells selected previously. This greatly enhances the capacity of the compression system without the need to store more codebook vectors explicitly. An additional advantage of QINCo is its modularity: because each quantization step is trained on its own quantization error, a system trained for a certain compression rate can also be used at lower compression rates, making QINCo a dynamic-rate quantizer.
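The mechanism can be sketched in a few lines of NumPy. This is an illustration only, not the released implementation: the real QINCo codebook predictor is a trained neural network, while below a fixed nonlinear map of the partial reconstruction stands in for it, and the base codebooks are random.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K, STEPS = 8, 16, 4  # toy vector dimension, codebook size, quantization steps

# One base codebook per residual-quantization step, with decreasing
# scale so later steps refine the reconstruction.
bases = [rng.normal(size=(K, D)) * 0.5**t for t in range(STEPS)]

def implicit_codebook(partial, base):
    """Stand-in for QINCo's neural network: the effective codebook of a
    step is a function of the vector reconstructed so far."""
    return base + 0.1 * np.tanh(partial)  # broadcasts (K, D) + (D,)

def encode(x, bases):
    codes, partial = [], np.zeros_like(x)
    for base in bases:
        cb = implicit_codebook(partial, base)  # codebook depends on history
        residual = x - partial
        idx = int(np.argmin(((cb - residual) ** 2).sum(axis=1)))
        codes.append(idx)
        partial = partial + cb[idx]
    return codes, partial

def decode(codes, bases):
    partial = np.zeros(bases[0].shape[1])
    for idx, base in zip(codes, bases):
        partial = partial + implicit_codebook(partial, base)[idx]
    return partial

x = rng.normal(size=D)
codes, x_hat = encode(x, bases)
# Dynamic rate: decoding only a prefix of the codes still yields a
# valid, coarser reconstruction.
coarse = decode(codes[:2], bases[:2])
```

Since the decoder replays the same codebook predictions as the encoder, `decode(codes, bases)` reproduces the encoder's reconstruction exactly, and truncating the code sequence gives the dynamic-rate behaviour described above.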
QINCo2 introduces several key novelties:
- A fast approximate encoding method, yielding similar MSE with much faster training and encoding.
- Integration of beam search into the encoding process, reaching much lower compression errors than QINCo1 for a similar encoding time when combined with approximate encoding.
- A new optional module in the large-scale retrieval pipeline, improving accuracy using a pairwise decoder.
- An overall upgrade of the architecture and training process.
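To illustrate the beam-search component, here is a minimal sketch over plain residual quantization with fixed codebooks; it omits the implicit codebooks and the fast candidate pre-selection of QINCo2, and all sizes are toy values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
D, K, STEPS = 6, 8, 2
codebooks = rng.normal(size=(STEPS, K, D))  # plain fixed RQ codebooks

def beam_encode(x, codebooks, beam_size):
    """Keep the beam_size best partial code sequences at every step;
    beam_size=1 is ordinary greedy residual quantization."""
    beam = [(float(np.sum(x**2)), (), np.zeros(len(x)))]  # (error, codes, reconstruction)
    for cb in codebooks:
        expanded = []
        for _, codes, recon in beam:
            recons = recon + cb                      # (K, D): all K extensions
            errs = ((x - recons) ** 2).sum(axis=1)
            expanded += [(errs[k], codes + (k,), recons[k]) for k in range(K)]
        expanded.sort(key=lambda cand: cand[0])      # rank by squared error
        beam = expanded[:beam_size]
    return beam[0]  # best surviving sequence

x = rng.normal(size=D)
err_greedy, _, _ = beam_encode(x, codebooks, beam_size=1)
err_beam, codes, recon = beam_encode(x, codebooks, beam_size=K)
```

With only two steps and `beam_size=K`, the beam above is exhaustive, so `err_beam` is the true minimum over all code sequences and can only be lower than or equal to the greedy error; in general a moderate beam size trades a small amount of encoding time for a lower quantization error.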
Citation
If you use QINCo in a research work, please cite our paper:
@inproceedings{vallaeys2025qinco2,
  title={Qinco2: Vector Compression and Search with Improved Implicit Neural Codebooks},
  author={Th{\'e}ophane Vallaeys and Matthew J. Muckley and Jakob Verbeek and Matthijs Douze},
  booktitle={ICLR},
  year={2025},
  url={https://openreview.net/forum?id=2zMHHZ569S}
}
Setup
Installation
QINCo2 requires Python 3 and the dependencies listed in the environment.yml file. They can be installed using conda:
git clone https://github.com/facebookresearch/Qinco2
cd Qinco2
conda env create -f environment.yml
Downloading the data
Download the datasets used in the paper by running the corresponding bash scripts:
BigANN
./data/bigann/download_data.sh
or, if you have limited storage space, you can download only the first 1M database vectors and the first 10M training vectors using:
./data/bigann/download_data.sh -small
Deep1B
./data/deep1b/download_data.sh
or, if you have limited storage space, you can download only the first 1M database vectors and the first 10M training vectors using:
./data/deep1b/download_data.sh -small
Contriever
./data/contriever/download_data.sh
FB-ssnpp
./data/fb_ssnpp/download_data.sh
Pretrained checkpoints
Base experiments checkpoints
Below are the checkpoints for the QINCo1 and QINCo2-L models, trained on all four datasets. The instructions below show how to use and evaluate them. The commands assume that the checkpoint files are stored in the models/ directory.
| | BigANN1M | Deep1M | Contriever1M | FB-ssnpp1M |
|---|---|---|---|---|
| Qinco1 (8 bytes) | qinco1-bigann1M-8x8.pt | qinco1-deep1M-8x8.pt | qinco1-contriever1M-8x8.pt | qinco1-FB_ssnpp1M-8x8.pt |
| Qinco1 (16 bytes) | qinco1-bigann1M-16x8.pt | qinco1-deep1M-16x8.pt | qinco1-contriever1M-16x8.pt | qinco1-FB_ssnpp1M-16x8.pt |
| Qinco2-L (8 bytes) | qinco2_L-bigann1M-8x8.pt | qinco2_L-deep1M-8x8.pt | qinco2_L-contriever1M-8x8.pt | qinco2_L-FB_ssnpp1M-8x8.pt |
| Qinco2-L (16 bytes) | qinco2_L-bigann1M-16x8.pt | qinco2_L-deep1M-16x8.pt | qinco2_L-contriever1M-16x8.pt | qinco2_L-FB_ssnpp1M-16x8.pt |
IVF models checkpoints
These models are trained with an additional codebook of $K_{IVF}=2^{20}=1048576$ codewords, corresponding to the IVF step. They can be used to evaluate large-scale search.
| | BigANN1B | Deep1B |
|---|---|---|
| Qinco2-S (8 bytes) | IVF-qinco2_S-bigann1B-8x8.pt | IVF-qinco2_S-deep1B-8x8.pt |
| Qinco2-S (16 bytes) | IVF-qinco2_S-bigann1B-16x8.pt | IVF-qinco2_S-deep1B-16x8.pt |
| Qinco2-S (32 bytes) | IVF-qinco2_S-bigann1B-32x8.pt | IVF-qinco2_S-deep1B-32x8.pt |
| Qinco2-M (8 bytes) | IVF-qinco2_M-bigann1B-8x8.pt | IVF-qinco2_M-deep1B-8x8.pt |
| Qinco2-M (16 bytes) | IVF-qinco2_M-bigann1B-16x8.pt | IVF-qinco2_M-deep1B-16x8.pt |
| Qinco2-M (32 bytes) | IVF-qinco2_M-bigann1B-32x8.pt | IVF-qinco2_M-deep1B-32x8.pt |
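As an illustration of the IVF step, the sketch below builds inverted lists from a toy set of coarse centroids and probes only the closest lists at query time. It stores raw database vectors for clarity, whereas the real pipeline stores compact QINCo2 codes of the residuals, and the sizes here are tiny compared to $K_{IVF}=2^{20}$.

```python
import numpy as np

rng = np.random.default_rng(2)
D, K_IVF, N = 4, 8, 200  # toy sizes; the released models use K_IVF = 2**20
centroids = rng.normal(size=(K_IVF, D))
db = rng.normal(size=(N, D))

# Build inverted lists: each database vector is assigned to its nearest centroid.
assign = np.argmin(((db[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
lists = {c: np.where(assign == c)[0] for c in range(K_IVF)}

def search(q, nprobe):
    """Scan only the nprobe inverted lists whose centroids are closest to q."""
    order = np.argsort(((centroids - q) ** 2).sum(axis=1))[:nprobe]
    cand = np.concatenate([lists[c] for c in order])
    dists = ((db[cand] - q) ** 2).sum(axis=1)
    return int(cand[np.argmin(dists)])

q = rng.normal(size=D)
# Probing every list degenerates to exhaustive search and recovers
# the exact nearest neighbour; a small nprobe trades recall for speed.
nn = search(q, nprobe=K_IVF)
```

In practice only a small fraction of the $2^{20}$ lists is probed per query, which is what makes billion-scale search tractable.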
We also provide the IVF centroids used to create these models:
Usage
Every command uses the run.py entry point. It can be run on multiple GPUs with accelerate using the ./run.sh script. If you are familiar with accelerate, you can use it directly to launch run.py.
Command-line arguments are parsed using the Hydra format. You can find the default configuration and all overloadable parameters inside the config/qinco_cfg.yaml file.
Running on a single GPU or on CPU: in any of the following commands, you can replace ./run.sh with python run.py to use a single GPU. Set the cpu argument to true (python run.py cpu=true) to run on CPU instead.
Datasets: for all commands below, you can either use your own data via the db, trainset, queries and queries_gt arguments, or use one of the default datasets used in the paper. To use one of these datasets, replace these arguments with db=<name of the dataset>. The paths will be populated automatically. Be sure to download the corresponding datasets beforehand (see above).
Available default datasets: db=FB_ssnpp1M, db=contriever1M, db=bigann1M, db=bigann1B, db=deep1B.
The 1M datasets are intended for most tasks, while the 1B datasets should only be used to build and search an index for large-scale search.
Using your own data: you will need at least a vector database (db) and a set of training vectors (trainset) to train and evaluate the MSE of the model.
Available data formats are bvecs, fvecs, ivecs, and npy; each file should contain a single matrix of dimensions (N_samples, D).
During training, the last 10,000 vectors of the training set are set apart as the validation set. The database serves as the test set.
Additionally, for nearest-neighbour search, you need a set of queries (queries, a matrix of dimensions (N_queries, D)) and the ids of their ground-truth nearest neighbours in the database (queries_gt, a matrix of dimensions (N_queries, 1)).
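For instance, a toy dataset in the expected layout can be generated and saved in npy format as follows. The file names and output directory below are arbitrary choices for the example; the resulting paths would be passed through the db, trainset, queries and queries_gt arguments.

```python
import os
import tempfile

import numpy as np

D = 16
out = tempfile.mkdtemp()  # arbitrary output directory

trainset = np.random.rand(50_000, D).astype("float32")  # (N_train, D)
db = np.random.rand(2_000, D).astype("float32")         # (N_samples, D)
queries = np.random.rand(50, D).astype("float32")       # (N_queries, D)

# Ground truth: id of each query's exact nearest neighbour in the
# database, stored as an (N_queries, 1) matrix.
dists = ((queries[:, None, :] - db[None, :, :]) ** 2).sum(-1)
queries_gt = dists.argmin(axis=1).reshape(-1, 1)

for name, arr in [("trainset", trainset), ("db", db),
                  ("queries", queries), ("queries_gt", queries_gt)]:
    np.save(os.path.join(out, f"{name}.npy"), arr)
```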
Using a subset of the data files: you can use only a subset of the training set and/or database, which can be useful for testing or when working with limited data. Use ds.trainset=<...> and ds.db=<...> to limit their sizes. Similarly, you can control the number of validation samples extracted from the training set.
