SkillAgentSearch skills...

Prvhash

PRVHASH - Pseudo-Random-Value Hash. Hash functions, PRNG with unlimited period, randomness extractor, and a glimpse into abyss. (header-only C/C++) (Codename Gradilac/Градилак)

Install / Use

/learn @avaneev/Prvhash

README

PRVHASH - Pseudo-Random-Value Hash (in C/C++)

Introduction

PRVHASH is a hash function that generates a uniform pseudo-random number sequence derived from the message. PRVHASH is conceptually similar (in the sense of using a pseudo-random number sequence as a hash) to keccak and RadioGatun schemes, but is a completely different implementation of such concept. PRVHASH acts as both a "randomness extractor" and an "extendable-output function" (XOF).

PRVHASH can generate 64- to unlimited-bit hashes, yielding hashes of approximately equal quality independent of the chosen hash length. PRVHASH is based on 64-bit math, which is scalar, portable, cross-platform, inlinable, C++ compatible (fairly efficient on 32-bit systems as well). The function can easily be used beyond 1024-bit hashes, but such cases should be statistically tested (until a rigorous formal extrapolation is available). For example, any 32-bit element extracted from 2048-, or 4096-bit resulting hash is as collision resistant as just a 32-bit hash. It is a constant-time hash function, meaning its execution time depends only on the message's length. A streamed, higher-security, hashing implementation is available.

PRVHASH is solely based on the butterfly effect, inspired by LCG pseudo-random number generators. The generated hashes exhibit strong avalanche properties. For best security, a random seed should be supplied to the hash function, but this is not a requirement.

64-, 128-, 192-, 256-, 512-, and 1024-bit PRVHASH hashes pass all SMHasher tests. Other hash lengths were not thoroughly tested, but extrapolations can be made. The author makes no cryptographic claims (neither positive nor negative) about PRVHASH-based constructs.

PRVHASH core function can be used as a PRNG with an arbitrarily-chosen (practically unlimited) period, depending on the number of hashwords in the system.

PRVHASH64

Please see the prvhash64.h file for the details of the basic hash function implementation (the prvhash.h, prvhash4.h, prvhash42.h, and versions below 4.3 are outdated versions). While this hash function is likely irreversible, according to SAT solver-based testing, it does not provide second preimage resistance. This function should not be used in open systems without a secret seed. Note that 64 refers to core function's variable size.

The default prvhash64.h-based 64-bit hash of the string The cat is out of the bag is 6ac39f7ac0c94d63.

A proposed short name for hashes created with prvhash64.h is PRH64-N, where N is the hash length in bits (e.g., PRH64-256).

PRVHASH64_64M

This is a minimized implementation of the prvhash64 hash function. It is one of very few smallest hash functions which produces 64-bit hashes of this quality level. While this function does not provide a bulk throughput that can be considered "fast", due to its statistical properties it is fast for short-key hash-maps and hash-tables.

Streamed Hashing

The file prvhash64s.h includes a relatively fast streamed hashing function which uses a "fused" PRVHASH arrangement. Please take a look at the prvhash64s_oneshot() function for usage example. The prvhash64s offers both an increased security and hashing speed.

This function has an increased preimage resistance compared to the basic hash function implementation. Preimage resistance cannot currently be estimated exactly, but the hash length affects it exponentially. Also, second preimage attack usually boils down to exchange of forged symbols to "trash" symbols (at any place of the data stream); substitutions usually result in random values, possibly damaging to any compressed or otherwise structured file. Which means that data compression software and libraries should always check any left-over, "unused", data beyond the valid compressed stream, for security reasons.

The time complexity of a preimage attack is highly variable and appears to follow a random-logarithmic distribution.

Even though a formal proof is not yet available, the author believes this hash function can compete with widely-used SHA2 and SHA3 families of hash functions while at the same time offering a considerably higher performance and scalability. When working in open systems, supplying a secret seed is not a requirement for this hash function.

The performance (expressed in cycles/byte) of this hash function on various platforms can be evaluated at the ECRYPT/eBASH project.

The default prvhash64s.h-based 64-bit hash of the string The cat is out of the bag is 2043ccf52ae2ca6f.

The default prvhash64s.h-based 256-bit hash of the string Only a toilet bowl does not leak is b13683799b840002689a1a42d93c826c25cc2d1f1bc1e48dcd005aa566a47ad8.

The default prvhash64s.h-based 256-bit hash of the string Only a toilet bowl does not leaj is d4534a922fd4f15ae8c6cc637006d1f33f655b06d60007a226d350e87e866250.

This demonstrates the Avalanche effect. On a set of 216553 English words, pair-wise hash comparisons give a 50.0% average difference in resulting hash bits, which fully satisfies the strict avalanche criterion.

This streamed hash function produces hash values that are different to the prvhash64 hash function. It is incorrect to use both of these hash function implementations on the same data set. While prvhash64 can be used as a hash for hash-tables and in-memory data blocks, prvhash64s can be used to create hashes of large data blocks like files, in streamed mode.

A proposed short name for hashes created with prvhash64s.h is PRH64S-N, where N is the hash length in bits (e.g., PRH64S-256). Or simply, SH4-N, Secure Hash 4.

Minimal PRNG for Everyday Use

The core function can be easily integrated into your applications, to be used as an effective PRNG. The period of this minimal PRNG is at least 2<sup>159</sup>. The initial parameters can be varied at will, and will not "break" the PRNG. Setting only the Seed value guarantees a random start point within the whole PRNG period, with at least 2<sup>64</sup> spacing. It is generally recommended to initialize lcg and Hash variables to 0.

Since the PRNG is based on multiplication with recurrence, it can be classified as "chaotic". An extremely small number (just sys_size_bits which is 192 out of 2<sup>64</sup> for this minimal PRNG) of its possible PRNG cycles have reduced periods, but they are never shorter than 2<sup>hash_bits-hash_count</sup>, which depends on the number of hashwords, as hashwords are linear accumulators similar to Weyl sequences (2<sup>63</sup> for this minimal PRNG).

The example code follows:

#include "prvhash_core.h"
#include <stdio.h>

int main()
{
    uint64_t Seed = 0;
    uint64_t lcg = 0;
    uint64_t Hash = 0;

    uint64_t v = 0;
    uint64_t i;

    for( i = 0; i < ( (uint64_t) 1 << 28 ); i++ )
    {
        v = prvhash_core64( &Seed, &lcg, &Hash );
    }

    printf( "%llu\n", v );
}

For implementation assurance, here are the first 16 output values in hex (starting from the all-zeroes state):

0x5555555555555555 0x00000000DB6DB6DB 0x2492492192492492 0x75D75DA0AAAAAA79
0x93064E905C127FE5 0xE2585C9CA95671A3 0x28A44B31D428179E 0x11B0B6A8D4BA3A73
0x195C6A4C23EE71AD 0x5AA47859226BA23E 0xA7D42121695056D4 0x142D7CD5D83342F2
0x3D42E83328C09C8F 0x7E691C66BAC23222 0x82E1032F441F23A5 0xA4BDE5C4A05E6256

Note that such minimal 1-hashword PRNG is definitely not cryptographically secure: its state can be solved by a SAT solver pretty fast; this applies to other arrangements ("fused", "parallel", multiple hashwords; with daisy-chaining being harder to solve). The known way to make PRNG considerably harder to solve for a SAT solver, with complexity corresponding to system's size, is to combine two adjacent PRNG outputs via XOR operation; this obviously has a speed impact and produces output with more than 1 solution (most probably, 2). This, however, does not measurably increase the probability of PRNG output overlap, which stays below 1/2<sup>sys_size_bits</sup>; in tests, practically undetectable.

So, the basic PRNG with some, currently not formally proven, security is as follows (XOR two adjacent outputs to produce a single "compressed" PRNG output):

v = prvhash_core64( &Seed, &lcg, &Hash );
v ^= prvhash_core64( &Seed, &lcg, &Hash );

A similar approach is to simply skip the next generated random number, but it is slightly less secure. It is likely that PRVHASH's k-equidistribution of separate outputs is implicitly secure. The reason is that skipping or XORing creates uncertainty or entanglement of current output with system's state "hashword array length" of outputs back. 3 XORs are needed to provide preimage resistance, or resistance against selection of entropy input that leads to a desired output. The security becomes effective only after system's initialization: initial "conditioning" rounds and a full hashword array pass.

TPDF Dithering

The core function can be used to implement a "statistically-good" and "neutrally-sounding" dithering noise for audio signals; for both floating-point to fixed-point, and bit-depth conversions.

uint64_t rv = prvhash_core64( &Seed, &lcg, &Hash );
double tpdf = ( (int64_t) (uint32_t) rv - (int64_t) ( rv >> 32 )) * 0x1p-32;

Floating-Point PRNG

The following expression can be used to convert 64-bit unsigned value to full-mantissa floating-point value, without a truncation bias:

uint64_t rv = prvhash_core64( &Seed, &lcg, &Hash );
double v = ( rv >> ( 64 - 53 )) * 0x1p-53;

Gradilac PRNG (C++)

The gradilac.h file includes the

View on GitHub
GitHub Stars325
CategoryDevelopment
Updated6d ago
Forks25

Languages

C

Security Score

100/100

Audited on Mar 21, 2026

No findings