Unq
Supplementary Code for Unsupervised Neural Quantization for Compressed-Domain Similarity Search
What does it do?
It trains a neural network that maps database vectors into 8- or 16-byte codes optimized for nearest neighbor search.
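
To make "8-byte codes" and compressed-domain search concrete, here is a toy sketch. It is not the UNQ network: it swaps in plain product quantization with randomly sampled (untrained) codebooks, and it assumes each byte indexes one of 256 codewords. All names (`encode`, `search`, `codebooks`) and the data are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, M, K = 1000, 32, 8, 256   # M bytes per vector, K codewords per byte (assumed)
d_sub = D // M
base = rng.standard_normal((N, D)).astype(np.float32)  # synthetic "database"

# Toy codebooks: K randomly chosen database subvectors per subspace (no training).
sub = base.reshape(N, M, d_sub)
codebooks = np.stack(
    [sub[rng.choice(N, K, replace=False), m] for m in range(M)]
)  # shape (M, K, d_sub)

def encode(x):
    """Map vectors (n, D) to uint8 codes (n, M): nearest codeword per subspace."""
    xs = x.reshape(len(x), M, d_sub)
    dist = ((xs[:, :, None, :] - codebooks[None]) ** 2).sum(-1)  # (n, M, K)
    return dist.argmin(-1).astype(np.uint8)

def search(query, codes, topk=5):
    """Rank database codes by approximate squared distance to an uncompressed query."""
    q = query.reshape(M, d_sub)
    # One lookup table per byte: distance from the query part to every codeword.
    table = ((q[:, None, :] - codebooks) ** 2).sum(-1)   # (M, K)
    # Distance to each database vector is a sum of M table lookups.
    approx = table[np.arange(M), codes].sum(axis=1)      # (N,)
    order = np.argsort(approx)[:topk]
    return order, approx[order]

codes = encode(base)                 # 8 bytes per database vector
ids, dists = search(base[0], codes)  # the database is never decompressed
```

The point of the sketch is the search loop: the query stays uncompressed, while every database vector is scored from its byte code alone through table lookups.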

What do I need to run it?
- A machine with several CPU cores (preferably 8+) and a GPU
  - Running with no GPU or fewer than 4 CPU cores may cause premature senility;
- Some popular Linux x64 distribution
  - Tested on Ubuntu 16.04, should work fine on any popular 64-bit Linux and even macOS;
  - Windows and 32-bit systems may require heavy wizardry to run;
- When in doubt, use Docker, preferably GPU-enabled (i.e. nvidia-docker)
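
A quick way to compare a machine against the requirements above is a small pre-flight check. The sketch below is not part of this repo. The `check_hardware` helper is hypothetical, and finding `nvidia-smi` on the PATH is only a rough proxy for a usable GPU.

```python
import os
import shutil

def check_hardware(min_cores=4, preferred_cores=8):
    """Report CPU core count and whether an NVIDIA driver tool is on PATH."""
    cores = os.cpu_count() or 0
    has_nvidia_smi = shutil.which("nvidia-smi") is not None
    notes = []
    if cores < min_cores:
        notes.append(f"only {cores} CPU cores; at least {min_cores} are advised")
    elif cores < preferred_cores:
        notes.append(f"{cores} CPU cores; {preferred_cores}+ are preferred")
    if not has_nvidia_smi:
        # Assumption: a missing nvidia-smi usually means no NVIDIA driver is installed.
        notes.append("nvidia-smi not found on PATH; a GPU may be unavailable")
    return {"cpu_cores": cores, "nvidia_smi_found": has_nvidia_smi, "notes": notes}

report = check_hardware()
print(report)
```

An empty `notes` list suggests the machine is in the comfortable range. It does not guarantee that the notebooks will run.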
How do I run it?
- Clone or download this repo. `cd` yourself to its root directory.
- Grab or build a working Python environment. Anaconda works fine.
- Install standard compilers (e.g. `gcc` and `g++` for most Linux distributions) and `swig3.0`
  - On Ubuntu, just `sudo apt-get -y install swig3.0 gcc-4.9 g++-4.9 libstdc++6 wget unzip`
  - and maybe `sudo ln -s /usr/bin/swig3.0 /usr/bin/swig` for good measure
- Install packages from `requirements.txt`, with a little twist
  - The FAISS library is hard to install via pip; we recommend using their Anaconda installation
  - You will also need Jupyter or some other way to work with .ipynb files
- Run jupyter notebook and open a notebook in `./notebooks/`
  - Before you run the first cell, change `%env CUDA_VISIBLE_DEVICES=#` to the devices that you plan to use.
  - First, it downloads data from Dropbox. You will need up to 1.5 GB
  - Second, it defines an experiment setup. The setups are:
    - `bigann1m_unq_8b.ipynb` - BIGANN1M dataset, 8 bytes per vector
    - `deep1m_unq_8b.ipynb` - DEEP1M dataset, 8 bytes per vector
    - `bigann1m_unq_16b.ipynb` - BIGANN1M dataset, 16 bytes per vector
    - `deep1m_unq_16b.ipynb` - DEEP1M dataset, 16 bytes per vector
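
Before opening a notebook, it can help to confirm that the tools from the steps above are actually reachable. The sketch below is not shipped with this repo, and `missing_prerequisites` and `select_gpus` are hypothetical helpers. It assumes the tools are exposed on the PATH as `gcc`, `g++`, `swig` and `jupyter`, and that FAISS is importable as the `faiss` module. `select_gpus` is a plain-Python counterpart of the `%env CUDA_VISIBLE_DEVICES=...` line; the device id `0` is only a placeholder.

```python
import importlib.util
import os
import shutil

def missing_prerequisites(tools=("gcc", "g++", "swig", "jupyter"), modules=("faiss",)):
    """Return the names of command-line tools and Python modules that were not found."""
    missing = [t for t in tools if shutil.which(t) is None]
    missing += [m for m in modules if importlib.util.find_spec(m) is None]
    return missing

def select_gpus(device_ids):
    """Restrict CUDA to the given device ids; call before any CUDA library initializes."""
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value

print("missing:", missing_prerequisites())
select_gpus([0])  # placeholder id; pick the devices you plan to use
```

The environment variable only takes effect if it is set before the first CUDA call in the process, which is why the notebooks set it in their first cell.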
