SkillAgentSearch skills...

Rapidhash

Very fast, high quality, platform-independent hashing algorithm.

Install / Use

/learn @Nicoshev/Rapidhash
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

rapidhash - Very fast, high quality, platform-independent

Family of three hash functions: rapidhash, rapidhashMicro and rapidhashNano

Used by Chromium, Folly's F14, Fuchsia, Ninja, JuliaLang, ziglang, fb303, zxc, among others

Rapidhash
General purpose hash function, amazing performance across all sizes.
Surpasses 70GB/s on Apple's M4 cpus.
Clang-18+ compiles it to ~185 instructions, both on x86-64 and aarch64.
The fastest recommended hash function by SMHasher and SMHasher3.

RapidhashMicro
Designed for HPC and server applications, where cache misses make a noticeable performance detriment.
Clang-18+ compiles it to ~140 instructions without stack usage, both on x86-64 and aarch64.
Faster for sizes up to 512 bytes, just 15%-20% slower for inputs above 1kb.
Produces same output as Rapidhash for inputs up to 80 bytes.

RapidhashNano
Designed for Mobile and embedded applications, where keeping a small code size is a top priority.
Clang-18+ compiles it to less than 100 instructions without stack usage, both on x86-64 and aarch64.
The fastest for sizes up to 48 bytes, but may be considerably slower for larger inputs.
Produces same output as Rapidhash for inputs up to 48 bytes.

Streamable
The three functions can be computed without knowing the input length upfront.

Universal
All functions have been optimized for both AMD64 and AArch64 systems.
Compatible with gcc, clang, icx and MSVC.
They do not use machine-specific vectorized or cryptographic instruction sets.

Excellent
All functions pass all tests in both SMHasher and SMHasher3.
Collision-based study showed a collision probability close to ideal.
Outstanding collision ratio when tested with datasets of 16B and 67B keys:

| Input Len | Nb Hashes | Expected | Nb Collisions | | --- | --- | --- | --- | | 12 | 15 Gi | 7.0 | 6 | | 16 | 15 Gi | 7.0 | 7 | | 24 | 15 Gi | 7.0 | 7 | | 32 | 15 Gi | 7.0 | 10 | | 40 | 15 Gi | 7.0 | 4 | | 48 | 15 Gi | 7.0 | 7 | | 64 | 15 Gi | 7.0 | 6 | | 80 | 15 Gi | 7.0 | 11 | | 96 | 15 Gi | 7.0 | 6 | | 120 | 15 Gi | 7.0 | 8 | | 128 | 15 Gi | 7.0 | 6 | | 12 | 62 Gi | 120.1 | 122 | | 16 | 62 Gi | 120.1 | 97 | | 24 | 62 Gi | 120.1 | 125 | | 32 | 62 Gi | 120.1 | 131 | | 40 | 62 Gi | 120.1 | 117 | | 48 | 62 Gi | 120.1 | 146 | | 64 | 62 Gi | 120.1 | 162 | | 80 | 62 Gi | 120.1 | 165 | | 96 | 62 Gi | 120.1 | 180 | | 120 | 62 Gi | 120.1 | 168 |

More results can be found in the collisions folder

Outstanding performance

Average latency when hashing keys of 4, 8 and 16 bytes

| Hash | M1 Pro | M3 Pro | Neoverse V2 | AMD Turin | Ryzen 9700X | | --- | --- | --- | --- | --- | --- | | rapidhash | 1.79ns | 1.38ns | 2.05ns | 2.31ns | 1.46ns | | xxh3 | 1.92ns | 1.50ns | 2.15ns | 2.35ns | 1.45ns |

Peak throughput when hashing files of 16Kb-2Mb

| Hash | M1 Pro | M3 Pro | M3 Ultra | M4 | Neoverse V2 | Ryzen 9700X | | --- | --- | --- | --- | --- | --- | --- | | rapidhash | 47GB/s | 57GB/s | 61GB/s | 71GB/s | 38GB/s | 68GB/s | | xxh3 | 37GB/s | 43GB/s | 47GB/s | 49GB/s | 34GB/s | 78GB/s |

Long-input measurements were taken compiling with the RAPIDHASH_UNROLLED macro.

The benchmarking program can be found in the bench folder

Collision-based hash quality study

A perfect hash function distributes its domain uniformly onto the image.
When the domain's cardinality is a multiple of the image's cardinality, each potential output has the same probability of being produced.
A function producing 64-bit hashes should have a $p=1/2^{64}$ of generating each output.

If we compute $n$ hashes, the expected amount of collisions should be the number of unique input pairs times the probability of producing a given hash.
This should be $(n*(n-1))/2 * 1/2^{64}$, or simplified: $(n*(n-1))/2^{65}$.
In the case of hashing $15*2^{30}$ (~16.1B) different keys, we should expect to see $7.03$ collisions.

We present an experiment in which we use rapidhash to hash $68$ datasets of $15*2^{30}$ (15Gi) keys each.
For each dataset, the amount of collisions produced is recorded as measurement.
Ideally, the average among measurements should be $7.03$ and the results collection should be a binomial distribution.
We obtained a mean value of $7.60$, just $8.11$% over $7.03$.
Each dataset individual result and the collisions test program can be found in the collisions folder.
The default seed $0$ was used in all experiments.

Ports

Java by hash4j
Rust by hoxxep
JavaScript by komiya-atsushi
Go by dwisiswant0

Related Skills

View on GitHub
GitHub Stars791
CategoryDevelopment
Updated1d ago
Forks43

Languages

C++

Security Score

95/100

Audited on Mar 24, 2026

No findings