USearch
Fast Open-Source Search & Clustering engine for Vectors & Arbitrary Objects in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram
Install / Use
- ✅ 10x faster HNSW implementation than FAISS.
- ✅ Simple and extensible single C++11 header library.
- ✅ Trusted by giants like Google and DBs like ClickHouse & DuckDB.
- ✅ SIMD-optimized and user-defined metrics with JIT compilation.
- ✅ Hardware-agnostic f16 & i8: half-precision & quarter-precision support.
- ✅ View large indexes from disk without loading into RAM.
- ✅ Heterogeneous lookups, renaming/relabeling, and on-the-fly deletions.
- ✅ Binary Tanimoto and Sorensen coefficients for Genomics and Chemistry applications.
- ✅ Space-efficient point-clouds with uint40_t, accommodating 4B+ size.
- ✅ Compatible with OpenMP and custom "executors" for fine-grained parallelism.
- ✅ Semantic Search and Joins.
- 🔥 Near-real-time clustering and sub-clustering for Tens or Millions of clusters.
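The binary Tanimoto metric in the list above reduces to popcounts over bit-vectors. A minimal NumPy sketch of the formula, independent of USearch's own SIMD implementation (the helper name here is made up for illustration):

```python
import numpy as np

def tanimoto_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Tanimoto distance between two packed binary vectors (dtype uint8):
    1 - |A AND B| / |A OR B|, where |x| counts set bits."""
    intersection = int(np.unpackbits(np.bitwise_and(a, b)).sum())
    union = int(np.unpackbits(np.bitwise_or(a, b)).sum())
    return 1.0 - intersection / union

# Two tiny 8-bit fingerprints: 2 shared bits, 4 bits set in total
a = np.packbits(np.array([1, 1, 1, 0, 0, 0, 0, 0], dtype=np.uint8))
b = np.packbits(np.array([1, 1, 0, 1, 0, 0, 0, 0], dtype=np.uint8))
print(tanimoto_distance(a, b))  # → 0.5
```

In practice the fingerprints are much longer (e.g. 2048-bit molecular fingerprints), but the arithmetic is the same bitwise AND/OR plus popcount.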
Technical Insights and related articles:
- Uses Arm SVE and x86 AVX-512's masked loads to eliminate tail for-loops.
- Uses Horner's method for polynomial approximations, beating GCC 12 by 119x.
- Implements a custom, separate binding for every supported language.
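Horner's method, referenced above, evaluates a degree-n polynomial with just n multiply-adds, which compilers can fuse into FMA instructions. A sketch with arbitrary coefficients (not USearch's actual approximation tables):

```python
def horner(coeffs, x):
    """Evaluate a polynomial given coefficients ordered from the highest
    degree down to the constant term, one multiply-add per coefficient."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

# 3x^2 + 2x + 1 at x = 2: ((0*2 + 3)*2 + 2)*2 + 1 = 17
print(horner([3.0, 2.0, 1.0], 2.0))  # → 17.0
```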
Comparison with FAISS
FAISS is a widely recognized standard for high-performance vector search engines. USearch and FAISS both employ the same HNSW algorithm, but they differ significantly in their design principles. USearch is compact and broadly compatible without sacrificing performance, primarily focusing on user-defined metrics and fewer dependencies.
| | FAISS | USearch | Improvement |
| :------------------------------------------- | ----------------------: | -----------------------: | ----------------------: |
| Indexing time ⁰ | | | |
| 100 Million 96d f32, f16, i8 vectors | 2.6 · 2.6 · 2.6 h | 0.3 · 0.2 · 0.2 h | 9.6 · 10.4 · 10.7 x |
| 100 Million 1536d f32, f16, i8 vectors | 5.0 · 4.1 · 3.8 h | 2.1 · 1.1 · 0.8 h | 2.3 · 3.6 · 4.4 x |
| | | | |
| Codebase length ¹ | 84 K SLOC | 3 K SLOC | maintainable |
| Supported metrics ² | 9 fixed metrics | any metric | extendible |
| Supported languages ³ | C++, Python | 10 languages | portable |
| Supported ID types ⁴ | 32-bit, 64-bit | 32-bit, 40-bit, 64-bit | efficient |
| Filtering ⁵ | ban-lists | any predicates | composable |
| Required dependencies ⁶ | BLAS, OpenMP | - | light-weight |
| Bindings ⁷ | SWIG | Native | low-latency |
| Python binding size ⁸ | ~ 10 MB | < 1 MB | deployable |
⁰ Tested on Intel Sapphire Rapids, with the simplest inner-product distance, equivalent recall, and memory consumption, while also providing far superior search speed.
¹ A shorter codebase of usearch/ over faiss/ makes the project easier to maintain and audit.
² User-defined metrics allow you to customize your search for various applications, from GIS to creating custom metrics for composite embeddings from multiple AI models or hybrid full-text and semantic search.
³ With USearch, you can reuse the same preconstructed index in various programming languages.
⁴ The 40-bit integer allows you to store 4B+ vectors without allocating 8 bytes for every neighbor reference in the proximity graph.
⁵ With USearch, the index can be combined with arbitrary external containers, like Bloom filters or third-party databases, to filter out irrelevant keys during index traversal.
⁶ Lack of obligatory dependencies makes USearch much more portable.
⁷ Native bindings introduce lower call latencies than more straightforward approaches.
⁸ Lighter bindings make downloads and deployments faster.
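The impact of the 40-bit identifiers from footnote ⁴ is easy to estimate: every neighbor reference in the proximity graph shrinks from 8 bytes to 5. A back-of-the-envelope sketch, assuming a hypothetical graph connectivity of 32 neighbors per node (an assumption for illustration, not a USearch default):

```python
num_vectors = 4_000_000_000   # 4B+ points, the scale uint40_t accommodates
neighbors_per_node = 32       # assumed graph connectivity, for illustration only

bytes_u64 = num_vectors * neighbors_per_node * 8  # 64-bit neighbor references
bytes_u40 = num_vectors * neighbors_per_node * 5  # 40-bit neighbor references

saved_gb = (bytes_u64 - bytes_u40) / 1e9
print(f"{saved_gb:.0f} GB saved")  # → 384 GB saved
```

At billion-vector scale, the neighbor lists dominate the graph's memory footprint, so trimming each reference by 3 bytes translates into hundreds of gigabytes.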
Base functionality is identical to FAISS, and the interface will feel familiar if you have ever investigated Approximate Nearest Neighbors search:
```python
# pip install usearch

import numpy as np
from usearch.index import Index

index = Index(ndim=3)               # Default settings for 3D vectors
vector = np.array([0.2, 0.6, 0.4])  # Can be a matrix for batch operations
index.add(42, vector)               # Add one or many vectors in parallel
matches = index.search(vector, 10)  # Find 10 nearest neighbors

assert matches[0].key == 42
assert matches[0].distance <= 0.001
assert np.allclose(index[42], vector, atol=0.1)  # Ensure high tolerance in mixed-precision comparisons
```
More settings are always available, and the API is designed to be as flexible as possible.
The default storage/quantization level is hardware-dependent for efficiency, but bf16 is recommended for most modern CPUs.
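To see why low-precision storage rarely hurts rankings, here is a standalone NumPy sketch of symmetric int8 quantization; this illustrates the general technique, not USearch's internal scheme:

```python
import numpy as np

rng = np.random.default_rng(42)
v = rng.standard_normal(96).astype(np.float32)
v /= np.linalg.norm(v)  # Normalize, as typical for cosine-based search

# Symmetric int8 quantization: map [-max|v|, +max|v|] onto [-127, 127]
scale = np.abs(v).max() / 127.0
v_i8 = np.round(v / scale).astype(np.int8)
v_restored = v_i8.astype(np.float32) * scale

cosine = float(np.dot(v, v_restored) /
               (np.linalg.norm(v) * np.linalg.norm(v_restored)))
assert cosine > 0.99  # Quantization error stays far below ranking noise
```

The vector occupies a quarter of the f32 footprint, yet the round-tripped copy remains almost perfectly aligned with the original, which is why distance orderings are largely preserved.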
```python
index = Index(
    ndim=3,        # Define the number of dimensions in input vectors
    metric='cos',  # Choose 'ip', 'l2sq', 'haversine', or another metric
)
```