SkillAgentSearch skills...

ExcaliburHash

Excalibur Hash is a high-speed hash map and hash set, ideal for performance-critical uses like video games

Install / Use

/learn @SergeyMakeev/ExcaliburHash
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

ExcaliburHash

License ci codecov

About

Excalibur Hash is a high-speed hash map and hash set, ideal for performance-critical uses like video games. Its design focuses on being friendly to the CPU cache, making it very efficient and fast. It uses an open addressing hash table and manages removed items with a method called tombstones.

Engineered for ease of use, Excalibur Hash, in the vast majority of cases (99%), serves as a seamless, drop-in alternative to std::unordered_map. However, it's important to note that Excalibur Hash does not guarantee stable addressing. So, if your project needs to hold direct pointers to the keys or values, Excalibur Hash might not work as you expect. It also offers improved iterator design, which is not 100% compatible with STL iterators and might require some minor code changes. That aside, its design and efficiency make it a great choice for applications where speed is crucial.

Features

  1. Extremely fast (see Performance section for details)
  2. CPU cache friendly
  3. Built-in configurable inline storage
  4. Can either work as a map (key, value) or as a set (keys only)
  5. Header-only library
  6. Standard container-like interface
  7. Support for custom key types via KeyInfo<T> specialization

Requirements

  • C++17 compatible compiler
  • Supported compilers: MSVC, GCC, Clang
  • Platforms: Windows, Linux, macOS

Design and Optimizations

ExcaliburHash achieves its exceptional performance through several carefully engineered design decisions and optimizations:

Core Architecture

Open Addressing with Linear Probing

  • Uses a single contiguous array instead of separate chaining, providing excellent cache locality
  • Linear probing minimizes memory indirection and maximizes CPU cache efficiency
  • No pointer chasing or dynamic memory allocation for individual elements

Tombstone Management

  • Deleted elements are marked as "tombstones" rather than shifting elements
  • Maintains stable iteration order and prevents expensive rehashing on deletions
  • Smart load factor calculation includes tombstones to prevent performance degradation
  • rehash() method available to clean up tombstones when needed

Memory Layout Optimizations

Configurable Inline Storage

  • Small hash tables store elements directly within the object (no heap allocation)
  • Default inline storage for 1 element eliminates malloc/free overhead for small maps
  • Configurable via template parameter: HashTable<K,V,64> for 64 inline elements
  • Seamlessly transitions to heap allocation when capacity is exceeded

Cache-Friendly Design

  • Power-of-2 bucket sizing enables fast modulo operations using bit masking
  • Contiguous memory layout maximizes cache line utilization
  • Elements stored in-place with proper alignment for optimal memory access

Performance Algorithms

Advanced Load Factor Management

  • Maintains 75% load factor (including tombstones) for optimal performance
  • Uses bit shifts for fast threshold calculation: (buckets >> 1) + (buckets >> 2) + 1
  • Prevents pathological cases where tombstones degrade lookup performance
  • Automatic growth by 2x when threshold is exceeded

High-Quality Hash Function

  • Uses WyHash algorithm, one of the fastest non-cryptographic hash functions
  • Platform-optimized implementation using compiler intrinsics
  • Multiplier constants derived from research (https://arxiv.org/abs/2001.05304)
  • Excellent distribution properties minimize clustering

Code Generation Optimizations

Template Metaprogramming

  • Single implementation for both const/non-const variants using std::conditional_t
  • Eliminates code duplication while maintaining type safety
  • Zero-cost abstractions for different iterator types
  • Compile-time feature detection for optimal code paths

Branch Prediction Friendly

  • Optimizes common cases (successful lookups, non-full tables)
  • Helps CPU branch predictor make better decisions

Memory Access Patterns

  • __restrict keywords inform compiler about non-aliasing pointers
  • Enables aggressive compiler optimizations and vectorization
  • Aligned memory allocation for SIMD-friendly access patterns

Customization Points

KeyInfo Specialization

  • Type traits system for custom key types
  • Defines empty/tombstone values and hash/equality functions
  • Compile-time customization without runtime overhead
  • Built-in optimized implementations for common types

Iterator Design: A Different Approach

ExcaliburHash takes a fundamentally different approach to iteration compared to standard library containers, prioritizing clarity, performance, and true genericity between maps and sets.

Default Iterator Behavior: Keys First

⚠️ Key Difference: Unlike STL containers, the default range-based for loop iterates over keys only:

Excalibur::HashMap<int, std::string> map;

// Default iteration - keys only (different from STL!)
for (const auto& key : map) {
    // 'key' is an int, not a pair
    std::cout << "Key: " << key << std::endl;
}

// Compare with STL:
std::unordered_map<int, std::string> stdMap;
for (const auto& pair : stdMap) {
    // 'pair' is std::pair<const int, std::string>
    std::cout << "Key: " << pair.first << std::endl;
}

Three Specialized Iterator Types

ExcaliburHash provides three distinct iterator types for different use cases:

Excalibur::HashMap<int, std::string> map;

// 1. Key-only iteration (IteratorK)
for (const auto& key : map.keys()) {
    std::cout << "Key: " << key << std::endl;
}

// 2. Value-only iteration (IteratorV) - hash maps only  
for (auto& value : map.values()) {
    value += "_modified";
}

// 3. Key-value iteration (IteratorKV) - hash maps only
for (const auto& item : map.items()) {
    std::cout << "Key: " << item.key() << ", Value: " << item.value() << std::endl;
}

Python-Inspired Design

This approach mirrors Python's dictionary iteration, which many developers find intuitive:

# Python dictionaries
for key in my_dict.keys():        # Keys only
for value in my_dict.values():    # Values only  
for key, value in my_dict.items():  # Key-value pairs

Why This Design?

1. Intent-Driven API

  • map.keys() clearly expresses "I want to iterate over keys"
  • map.values() clearly expresses "I want to iterate over values"
  • map.items() clearly expresses "I want both keys and values"

2. Semantic Clarity

  • item.key() and item.value() are self-documenting
  • No confusion about .first vs .second like STL pairs
  • Prevents common bugs from mixing up pair members

3. True Map/Set Genericity

The most important benefit: the same code works for both maps and sets:

template<typename Container>
void process_keys(const Container& container) {
    // Same code works for both maps and sets!
    for (const auto& key : container) {  // Always iterates keys
        std::cout << key << std::endl;
    }
}

// Usage:
Excalibur::HashMap<int, std::string> map;  // Key-value container
Excalibur::HashSet<int> set;               // Key-only container

process_keys(map);  // ✅ Works
process_keys(set);  // ✅ Works - same exact code!

STL Cannot Achieve This

template<typename Container>
void process_keys_stl(const Container& container) {
    for (const auto& item : container) {
        // ❌ Breaks! item.first doesn't exist for std::set
        std::cout << item.first << std::endl;  
    }
}

Iterator Availability Matrix

The type system automatically provides appropriate iterators based on container type:

// Hash Map: HashTable<Key, Value>
map.keys()    // ✅ Available
map.values()  // ✅ Available  
map.items()   // ✅ Available

// Hash Set: HashTable<Key, std::nullptr_t>
set.keys()    // ✅ Available
set.values()  // ❌ Compile error - no values in sets
set.items()   // ❌ Compile error - no key-value pairs in sets

This design enables writing truly generic algorithms that work across both container types without runtime checks, template specialization, or different APIs. The type system enforces correctness at compile time while maintaining optimal performance.

The combination of these optimizations results in a hash table that often outperforms std::unordered_map by 2-10x in real-world scenarios while maintaining a familiar API.

Performance

In this section, you can see a performance comparison against a few popular hash table implementations. This comparison will show their speed and efficiency, helping you understand which hash table might work best for your project.

Unless otherwise stated, all tests were run using the following configuration

OS: Windows 11 Pro (22621.4317)
CPU: Intel i9-12900K
RAM: 128Gb 
Compiled in the Release mode using VS2022 (19.39.33520)

Performance test repository

absl::flat_hash_map
boost::unordered_map
boost::unordered_flat_map
ska::flat_hash_map
ska::unordered_map
folly::F14ValueMap
llvm::DenseMap
[

View on GitHub
GitHub Stars334
CategoryContent
Updated2mo ago
Forks9

Languages

C++

Security Score

100/100

Audited on Jan 10, 2026

No findings