MerkleForge

A high-performance, unified Merkle tree library for the Rust ecosystem.

The Rust ecosystem for Merkle trees is fragmented. Libraries like rs-merkle cover only binary trees, while Ethereum-specific crates are maintained separately and incompatibly. merkleforge fixes this by providing all three major Merkle tree variants — Binary, Sparse, and Patricia — under a single, cohesive API with pluggable hash functions and a rigorous benchmarking suite.

This library is the implementation artifact of a final-year Software Engineering research project at MIVA Open University, supervised by Dr. Oluwasegun Ishaya Adelaiye.

Features
Architecture
Workspace Structure
Getting Started
Hash Functions
Tree Variants
Proof Generation & Verification
Benchmarking
Development Status
Roadmap
Contributing
License

Features

Unified API — BinaryMerkleTree<H>, SparseMerkleTree<H>, and MerklePatriciaTrie<H> all implement the same MerkleTree<H> trait. Switch variants without rewriting your integration code.
Pluggable hash functions — swap SHA-256, Keccak-256, and BLAKE3 with zero changes to tree logic. Rust generics monomorphise at compile time — no runtime overhead.
Zero-cost abstractions — the MerkleTree<H: HashFunction> bound compiles to the same machine code as a hand-rolled implementation for each hash.
Stateless proof verification — the ProofVerifier module lets light clients verify inclusion proofs using only the root hash, without storing the full tree.
Systematic benchmarking — a dedicated merkle-bench crate runs Criterion statistical benchmarks (mean/median/stddev) across all tree sizes and hash algorithms.
Property-based testing — Proptest verifies mathematical invariants: any leaf change must update the root; a corrupted proof must fail verification.
#[forbid(unsafe_code)] — guaranteed memory safety across the entire workspace.
CI/CD — GitHub Actions runs tests, Clippy (pedantic), rustfmt, doc builds, and a benchmark compile check on every push.

Architecture

The library is structured in two logical layers:

┌─────────────────────────────────────────────────────────────────┐
│                         merkle-core                             │
│  HashFunction trait · MerkleTree trait · ProofVerifier trait    │
│  MerkleProof · LeafIndex · NodeIndex · MerkleError              │
└───────────────────────────┬─────────────────────────────────────┘
                            │ depends on
          ┌─────────────────┼──────────────────┐
          ▼                 ▼                  ▼
   merkle-hash       merkle-variants       merkle-bench
  SHA-256 adapter   BinaryMerkleTree    Criterion suite
  Keccak-256        SparseMerkleTree    baseline_construction
  BLAKE3            PatriciaTrie        hash_throughput

Trait Hierarchy

HashFunction  ←  pluggable crypto abstraction
      │
      └──▶  MerkleTree<H: HashFunction>
                 insert() · remove() · root()
                 generate_proof() · metadata()
                      │
                      └──▶  ProofVerifier<H>  (stateless)
                                verify(root, leaf_data, proof)

Domain separation is enforced at the hash level:

Leaf hashes: H(0x00 || data)
Internal node hashes: H(0x01 || left || right)

This prevents second-preimage attacks where an attacker substitutes an internal node for a leaf.

Workspace Structure

MerkleForge/
├── Cargo.toml                        # workspace manifest + shared dep versions
│
├── merkle-core/                      # foundation — traits, types, errors
│   └── src/
│       ├── lib.rs
│       ├── error.rs                  # MerkleError (non_exhaustive)
│       ├── traits/
│       │   ├── hash_function.rs      # HashFunction trait
│       │   ├── merkle_tree.rs        # MerkleTree + ProofVerifier traits
│       │   └── serializable.rs       # blanket serde/bincode impl
│       └── types/
│           └── common.rs             # NodeIndex, LeafIndex, MerkleProof, ...
│
├── merkle-hash/                      # pluggable hash adapters
│   └── src/
│       ├── lib.rs
│       ├── sha256.rs
│       ├── keccak256.rs
│       └── blake3.rs
│
├── merkle-bench/                     # isolated benchmarking crate
│   ├── benches/
│   │   ├── baseline_construction.rs  # leaf + node hashing latency
│   │   └── hash_throughput.rs        # sustained MB/s per algorithm
│   └── src/lib.rs
│
└── .github/workflows/ci.yml          # automated test · lint · bench · docs

Getting Started

Add the crates you need to your Cargo.toml:

[dependencies]
merkle-core = "0.1"
merkle-hash = "0.1"
# merkle-variants = "0.1"   # coming in Phase 2-4

Quick example (once `merkle-variants` lands in Phase 2)

use merkle_hash::Sha256;
use merkle_variants::BinaryMerkleTree;
use merkle_core::prelude::*;

fn main() -> Result<(), MerkleError> {
    let mut tree = BinaryMerkleTree::<Sha256>::new();

    tree.insert(b"alice:100")?;
    tree.insert(b"bob:250")?;
    tree.insert(b"carol:75")?;

    let root = tree.root().expect("tree is non-empty");
    println!("Root: {:?}", root);

    // Generate an inclusion proof for leaf 0 (alice)
    let proof = tree.generate_proof(LeafIndex(0))?;
    println!("Proof depth: {}", proof.depth());

    // Stateless verification — no tree needed
    let valid = BinaryMerkleTree::<Sha256>::verify(root, b"alice:100", &proof);
    assert!(valid);
    Ok(())
}

Swap the hash function

// From SHA-256 to BLAKE3 — one character change, same API
use merkle_hash::Blake3;
let mut tree = BinaryMerkleTree::<Blake3>::new();

Hash Functions

| Type | Algorithm | Output | Best for | |------|-----------|--------|----------| | Sha256 | SHA-256 | 32 bytes | Production deployments with hardware SHA extensions | | Keccak256 | Keccak-256 | 32 bytes | Ethereum-compatible Patricia Tries | | Blake3 | BLAKE3 | 32 bytes | Maximum throughput on modern CPUs |

All three adapters implement HashFunction with domain-separated leaf and internal-node hashing. SHA-256 and Keccak-256 use 0x00/0x01 byte prefixes; BLAKE3 uses its native derive_key mode with context strings for zero-overhead domain separation.

Benchmark data comparing throughput across all three will be published after Phase 5.

Tree Variants

| Crate | Variant | Status | Best for | |-------|---------|--------|----------| | merkle-variants | BinaryMerkleTree<H> | 🔜 Phase 2 | Transaction batching, Bitcoin-style SPV | | merkle-variants | SparseMerkleTree<H> | 🔜 Phase 3 | Account state, Layer-2 rollups | | merkle-variants | MerklePatriciaTrie<H> | 🔜 Phase 4 | Ethereum state roots, EVM compatibility |

Binary Merkle Tree

Balanced, power-of-two tree with iterative bottom-up construction. Optimised for throughput when the full leaf set is known at build time. Proof size is O(log n).

Sparse Merkle Tree

256-bit key space (2²⁵⁶ possible slots). Implements shortcut nodes (Buterin, 2018), precomputed empty-hash caching, and node batching (Ouvrard, 2018/2019) for sub-linear memory use on mostly-empty trees. Supports one-phase batch updates (Ma et al., 2023) for rollup workloads.

Merkle Patricia Trie

Ethereum-compatible trie with four node types (branch, extension, leaf, empty), RLP encoding, and state-root outputs that match Ethereum mainnet. Validated against official Ethereum test vectors.

Proof Generation & Verification

A MerkleProof<D> carries the sibling hashes along the path from a specific leaf to the root:

pub struct MerkleProof<D> {
    pub leaf_index: LeafIndex,
    pub leaf_count:  usize,
    pub path:        Vec<ProofNode<D>>,   // O(log n) siblings
}

Verification is stateless — a light client needs only the root hash (e.g. from a trusted block header) and the proof:

1. current = H(0x00 || leaf_data)
2. for each ProofNode in proof.path:
     if side == Left:  current = H(0x01 || sibling || current)
     if side == Right: current = H(0x01 || current || sibling)
3. return current == expected_root

Proofs are serde-serialisable out of the box via the blanket Serializable impl — store them in a database or send them over a network with no extra setup.

Benchmarking

The merkle-bench crate provides an isolated Criterion suite that reports mean, median, and standard deviation to eliminate noise.

# Run all benchmarks and open the HTML report
cargo bench --workspace
open target/criterion/report/index.html

# Run just the hash throughput comparison
cargo bench --bench hash_throughput

Metrics tracked:

Construction latency — time to hash one leaf / combine two nodes
Throughput — sustained MB/s per hash algorithm (32 B → 1 MB inputs)
(Phase 5) Tree construction — 100 / 1 000 / 10 000 / 100 000 / 1 000 000 leaves
(Phase 5) Proof generation & verification — latency + proof size in bytes
(Phase 5) Memory consumption — peak RSS during construction

Development Status

| Phase | Scope | Status | |-------|-------|--------| | 1 — Core Infrastructure | Trait hierarchy, hash adapters, CI/CD | ✅ Complete | | 2 — Binary Merkle Tree | `Bin

MerkleForge

Install / Use

README

MerkleForge

Table of Contents

Features

Architecture

Trait Hierarchy

Workspace Structure

Getting Started

Quick example (once `merkle-variants` lands in Phase 2)

Swap the hash function

Hash Functions

Tree Variants

Binary Merkle Tree

Sparse Merkle Tree

Merkle Patricia Trie

Proof Generation & Verification

Benchmarking

Development Status

MerkleForge

Install / Use

README

MerkleForge

Table of Contents

Features

Architecture

Trait Hierarchy

Workspace Structure

Getting Started

Quick example (once merkle-variants lands in Phase 2)

Swap the hash function

Hash Functions

Tree Variants

Binary Merkle Tree

Sparse Merkle Tree

Merkle Patricia Trie

Proof Generation & Verification

Benchmarking

Development Status

Quick example (once `merkle-variants` lands in Phase 2)