<!-- Required extensions: pymdownx.betterem, pymdownx.tilde, pymdownx.emoji, pymdownx.tasklist, pymdownx.superfences -->

The origin has been migrated to GitFlic

In the spring of 2022, without any warning or explanation, the GitHub administration deleted my account and all projects. A few months later, without any involvement on my part or any notification, the projects were restored/opened in the "public read-only archive" status from some kind of incomplete backup. I regard these actions of GitHub as malicious sabotage, and I consider the GitHub service itself to have lost all trust forever.

As a result of what has happened, I will never, under any circumstances, post the primary sources (aka origins) of my projects on GitHub, or rely in any way on the GitHub infrastructure.

Nevertheless, realizing that it is more convenient for users of my projects to access them on GitHub, I do not want to restrict their freedom or create inconvenience, and therefore I place mirrors of my project repositories on GitHub. At the same time, I would like to emphasize once again that these are only mirrors, which can be frozen, blocked or deleted at any time, as was the case in 2022.


t1ha

Fast Positive Hash, aka "Позитивный Хэш" by Positive Technologies. Included in the Awesome C list of open source C software.

The Future will (be) Positive. Everything will be fine.


Briefly, it is a portable non-cryptographic 64-bit hash function:

  1. Intended for 64-bit little-endian platforms, predominantly Elbrus and x86_64, but portable: it runs on any 64-bit CPU without penalties.

  2. In most cases up to 15% faster than xxHash, StadtX, MUM and other portable hash functions (those that do not use specific hardware tricks).

    Currently wyhash outperforms t1ha on x86_64. However, the next version, t1ha3_atonce(), will be even faster on all platforms, especially on E2K, architectures with SIMD and most RISC-V implementations. It should also be noted that wyhash has a "blinding multiplication" flaw and can lose entropy (similar to what is described below), for instance when the data is correlated with the seed ^ _wypN values or equal to them. Another case is when one of the _wymum() multipliers becomes zero. As a result of such blinding, all preceding data has no influence on the hash value.

  3. Licensed under the zlib License.

See also the Rust, Erlang, Golang and Kotlin Multiplatform implementations.

FAQ: Why doesn't t1ha follow the NH approach like FARSH, XXH3, HighwayHash and so on?

Okay, just for clarity, we should distinguish the function families: MMH (Multilinear-Modular-Hashing), NMH (Non-linear Modular-Hashing) and the next simplification step, UMAC's NH.

Now take a look at the definition of the NH hash-function family on Wikipedia.
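For reference, the NH definition used by UMAC can be written as below. This is a reconstruction of the standard formula rather than text from this README; it assumes a message $M = (m_0, \dots, m_{n-1})$ and a key $K = (k_0, \dots, k_{n-1})$, both made of $W$-bit words, with $n$ even:

$$
\mathrm{NH}_K(M) = \left( \sum_{i=0}^{n/2-1} \bigl( (m_{2i} + k_{2i}) \bmod 2^{W} \bigr) \cdot \bigl( (m_{2i+1} + k_{2i+1}) \bmod 2^{W} \bigr) \right) \bmod 2^{2W}
$$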

It is very SIMD-friendly, since SSE2's _mm_add_epi32() and _mm_mul_epu32() are enough for W = 32. On the other hand, the result of the inner multiplication becomes zero when (m[2i] + k[2i]) mod 2^32 == 0 or (m[2i+1] + k[2i+1]) mod 2^32 == 0, in which case the opposite multiplier does not affect the result of hashing, i.e. the NH function simply ignores part of the input data. I call this a "blinding multiplication". That's all. More related information can be found by searching for "UMAC NH key recovery attack".
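The blinding effect can be reproduced with a few lines of C. The sketch below is a minimal illustration, not code from t1ha: `nh32()`, `nh32_blinding_demo()` and the key values are hypothetical names and constants chosen only for this demo. It implements classic NH with W = 32 and builds two messages whose first word cancels the first key word.

```c
#include <stddef.h>
#include <stdint.h>

/* Classic NH over 32-bit words (W = 32): each sum wraps modulo 2^32
 * before the 32x32 -> 64-bit multiplication. n_even must be even. */
uint64_t nh32(const uint32_t *m, const uint32_t *k, size_t n_even) {
  uint64_t h = 0;
  for (size_t i = 0; i < n_even / 2; ++i) {
    const uint32_t a = m[2 * i] + k[2 * i];         /* wraps mod 2^32 */
    const uint32_t b = m[2 * i + 1] + k[2 * i + 1]; /* wraps mod 2^32 */
    h += (uint64_t)a * b;                           /* wraps mod 2^64 */
  }
  return h;
}

/* Returns 1 if two messages that differ only in the "opposite" word collide. */
int nh32_blinding_demo(void) {
  const uint32_t key[2] = {0x9E3779B9u, 0x85EBCA6Bu}; /* arbitrary demo key */
  /* The first word is chosen so that (m[0] + key[0]) mod 2^32 == 0. */
  const uint32_t msg_a[2] = {0u - key[0], 0x11111111u};
  const uint32_t msg_b[2] = {0u - key[0], 0x22222222u};
  return nh32(msg_a, key, 2) == nh32(msg_b, key, 2);
}
```

Calling `nh32_blinding_demo()` returns 1: the first factor is zero for both messages, so the second word never reaches the result.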

Correct NMH/NH code without entropy loss should look like this:

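    #include <stddef.h>
    #include <stdint.h>

    /* NH-style block hash: operands are widened to 64 bits before the addition. */
    uint64_t proper_NH_block(const uint32_t *M /* message data */,
                             const uint64_t *K /* 64-bit primes */,
                             size_t N_even, uint64_t optional_weak_seed) {
      uint64_t H = optional_weak_seed;
      /* Each iteration consumes one pair of message words and one pair of keys. */
      for (size_t i = 0; i < N_even / 2; ++i)
        H += ((uint64_t)M[i*2] + K[i*2]) * ((uint64_t)M[i*2+1] + K[i*2+1]);
      return H;
    }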
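To see what the widening buys, the following sketch feeds the 64-bit variant two messages that differ only in the second word. It is an illustration under stated assumptions, not part of t1ha: `nh_wide()` repeats the computation of `proper_NH_block()` so that the snippet compiles on its own, while `nh_wide_demo()` and the key constants are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Same computation as proper_NH_block(), repeated to keep the sketch standalone. */
uint64_t nh_wide(const uint32_t *M, const uint64_t *K, size_t N_even,
                 uint64_t seed) {
  uint64_t H = seed;
  for (size_t i = 0; i < N_even / 2; ++i)
    H += ((uint64_t)M[i * 2] + K[i * 2]) *
         ((uint64_t)M[i * 2 + 1] + K[i * 2 + 1]);
  return H;
}

/* Returns 1 if two messages that differ only in the second word hash differently. */
int nh_wide_demo(void) {
  /* Arbitrary demo keys; the README calls for 64-bit primes. */
  const uint64_t key[2] = {UINT64_C(0x9E3779B185EBCA87),
                           UINT64_C(0xC2B2AE3D27D4EB4F)};
  const uint32_t msg_a[2] = {0x01234567u, 0x11111111u};
  const uint32_t msg_b[2] = {0x01234567u, 0x22222222u};
  return nh_wide(msg_a, key, 2, 0) != nh_wide(msg_b, key, 2, 0);
}
```

Because a 32-bit message word is added to a 64-bit key word, a factor stays non-zero whenever its key word is non-zero and below 2^64 - 2^32, so no message word can cancel a key word. Here `nh_wide_demo()` returns 1.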

Usage

The t1ha library provides several terraced hash functions with dissimilar properties, intended for different cases. These functions are briefly described below; see t1ha.h for more API details.

To use it in your own project, you may link with the t1ha library, or just add the corresponding source files from the /src directory to your project.

Please feel free to file an issue or make a pull request.

t1ha0 = 64 bits, "Just Only Faster"

Provides the fastest possible hashing for the current CPU, including 32-bit systems, engaging the available hardware acceleration. You can rest assured that t1ha0 is faster than all other fast hashes of comparable quality; otherwise we will extend and refine it from time to time.

On the other hand, there is no warranty that the hash result will be the same for a particular key on another machine or in another version. Moreover, it is known for certain that the result will differ between systems with different bitness or endianness. Briefly, such hash results and their derivatives should be used only at runtime and should not be persisted or transferred over a network.

It should also be noted that the quality of t1ha0() hashing is subject to tradeoffs with performance. Therefore the quality and strength of t1ha0() may be lower than those of t1ha1() and t1ha2(), especially on 32-bit targets, in exchange for much higher speed. However, it is guaranteed to pass all SMHasher tests.

Internally t1ha0() selects the fastest implementation for the current CPU; for now these include:

| Implementation          | Platform/CPU                           |
| :---------------------- | :------------------------------------- |
| t1ha0_ia32aes_avx()     | x86 with AES-NI and AVX extensions     |
| t1ha0_ia32aes_avx2()    | x86 with AES-NI and AVX2 extensions    |
| t1ha0_ia32aes_noavx()   | x86 with AES-NI without AVX extensions |
| t1ha0_32le()            | 32-bit little-endian                   |
| t1ha0_32be()            | 32-bit big-endian                      |
| t1ha1_le()              | 64-bit little-endian                   |
| t1ha1_be()              | 64-bit big-endian                      |
| t1ha2_atonce()          | 64-bit little-endian                   |
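Selecting an implementation per CPU is commonly done by resolving a function pointer once and reusing it. The sketch below shows that general pattern only; it is not t1ha's actual dispatch code. Every name in it (`hash_fn`, `demo_hash_portable()`, `demo_hash_accelerated()`, `demo_cpu_has_acceleration()`, `demo_resolve()`, `demo_hash0()`) is hypothetical, the "portable" stand-in is plain FNV-1a, and the CPU probe is a stub that always reports no acceleration.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*hash_fn)(const void *data, size_t len, uint64_t seed);

/* Hypothetical portable stand-in: FNV-1a mixed with the seed. */
uint64_t demo_hash_portable(const void *data, size_t len, uint64_t seed) {
  const unsigned char *p = (const unsigned char *)data;
  uint64_t h = seed ^ UINT64_C(0xCBF29CE484222325); /* FNV-1a offset basis */
  for (size_t i = 0; i < len; ++i)
    h = (h ^ p[i]) * UINT64_C(0x100000001B3);       /* FNV-1a prime */
  return h;
}

/* Hypothetical "accelerated" stand-in; a real build would use AES-NI/AVX.
 * It deliberately returns a different value, mirroring the fact that
 * different implementations may produce different results. */
uint64_t demo_hash_accelerated(const void *data, size_t len, uint64_t seed) {
  return ~demo_hash_portable(data, len, seed);
}

/* Hypothetical CPU probe; a stub that reports no acceleration. */
int demo_cpu_has_acceleration(void) { return 0; }

/* Pick the implementation that matches the (stubbed) CPU capabilities. */
hash_fn demo_resolve(void) {
  return demo_cpu_has_acceleration() ? demo_hash_accelerated
                                     : demo_hash_portable;
}

/* Public entry point: resolves once, then forwards every call.
 * The cache is not synchronized; a sketch, not thread-safe code. */
uint64_t demo_hash0(const void *data, size_t len, uint64_t seed) {
  static hash_fn cached = NULL;
  if (!cached)
    cached = demo_resolve();
  return cached(data, len, seed);
}
```

With this layout the caller only ever sees `demo_hash0()`, which is also why results obtained through such an entry point should stay inside the running process.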

t1ha1 = 64 bits, baseline fast portable hash

The first version of "Fast Positive Hash", with reasonable quality for checksums, hash tables and thin fingerprinting. It is stable, i.e. it returns the same result on all architectures and CPUs.

  1. Speed with reasonable hashing quality.
  2. Efficiency on modern 64-bit CPUs, but not in hardware.
  3. As strong as possible, as long as it incurs no performance penalty.

Unfortunately, Yves Orton discovered that the t1ha1() family fails the strict avalanche criterion in some cases. This flaw is insignificant for the purposes of t1ha1() and imperceptible from a practical point of view. Nevertheless, this issue has been resolved in the next function, t1ha2(), which was initially planned to provide slightly higher quality.

The basic version of t1ha1() is intended for little-endian systems and will run slowly on big-endian ones. Therefore a dedicated big-endian version is also provided, but it returns a different result than the basic version.
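The difference between the two variants comes down to the byte order in which input words are read. The sketch below is a general illustration with hypothetical helper names (`load_le64()`, `load_be64()`), not t1ha's actual code. A hash that always reads its input as little-endian stays stable across machines but needs a byte swap on big-endian hosts, while a hash that reads big-endian words sees different values for the same bytes.

```c
#include <stdint.h>

/* Assemble a 64-bit word from 8 bytes in little-endian order,
 * independent of the host's native byte order. */
uint64_t load_le64(const unsigned char *p) {
  uint64_t v = 0;
  for (int i = 7; i >= 0; --i)
    v = (v << 8) | p[i];
  return v;
}

/* Assemble a 64-bit word from the same 8 bytes in big-endian order. */
uint64_t load_be64(const unsigned char *p) {
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i)
    v = (v << 8) | p[i];
  return v;
}
```

For the bytes 01 02 03 04 05 06 07 08, `load_le64()` yields 0x0807060504030201 and `load_be64()` yields 0x0102030405060708, so any mixing applied afterwards starts from different words.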

t1ha2 = 64 and
