Quadrable is an authenticated multi-version database that can efficiently sync itself with remote instances. It is implemented as a sparse binary merkle tree with compact partial-tree proofs. The primary interface is a header-only C++ library, but there is also a git-like command-line tool.

Introduction
Building
Data Structure
Proofs
Syncing
Integer Keys
- Logs
- Proof Ranges
Storage
Command-line
C++ Library
Alternate Implementations
Author and Copyright

Introduction

Authenticated: The state of the database can be digested down to a 32-byte value, known as the "root". This represents the complete contents of the database, and any modifications will result in a new root. Anyone who knows the root value of a database can perform remote queries on it and be confident that the responses are authentic. To accomplish this, the remote server provides a proof along with each response, which is validated against the root.
Multi-version: Many different versions of the database can exist at the same time. Deriving one version from another doesn't require copying the database. Instead, all of the data that is common between the versions is shared. This copy-on-write behaviour allows very inexpensive database snapshots and checkpoints.
Syncable: By leveraging the proof functionality, applications can synchronise with remote instances. Conflict-handling is flexible, and can work by either cloning one side or by merging the states together (like a CRDT). By determining which records are present on the remote instance but not the local instance and vice versa, sync overhead is minimised, and can be optimised for either transfer size or round-trips.
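The copy-on-write behaviour behind the multi-version property can be illustrated with a small persistent tree. This is only a sketch under simplifying assumptions, not Quadrable's implementation: nodes live in memory behind std::shared_ptr rather than in LMDB, keys are 8 bits instead of 256, and hashing is omitted entirely. The names Node, put and get are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

struct Node;
using NodePtr = std::shared_ptr<const Node>;

// Hypothetical in-memory node; immutable once it is reachable from a root.
struct Node {
    NodePtr left, right;  // children (null means an empty subtree)
    std::string value;    // only meaningful at the bottom level
};

constexpr int kKeyBits = 8;  // toy key size, chosen for illustration only

// Returns a NEW root in which key maps to value. Only the nodes along the
// key's path are copied; every other subtree is shared with the old version.
inline NodePtr put(const NodePtr &root, uint8_t key, const std::string &value, int depth = 0) {
    auto copy = root ? std::make_shared<Node>(*root) : std::make_shared<Node>();
    if (depth == kKeyBits) {
        copy->value = value;
        return copy;
    }
    bool goRight = (key >> (kKeyBits - 1 - depth)) & 1;
    if (goRight) copy->right = put(copy->right, key, value, depth + 1);
    else copy->left = put(copy->left, key, value, depth + 1);
    return copy;
}

// Looks up key under the given root; returns nullptr if it is absent.
inline const std::string *get(const NodePtr &root, uint8_t key) {
    const Node *node = root.get();
    for (int depth = 0; node && depth < kKeyBits; depth++) {
        bool goRight = (key >> (kKeyBits - 1 - depth)) & 1;
        node = (goRight ? node->right : node->left).get();
    }
    return node ? &node->value : nullptr;
}
```

Each call to put returns a new root and leaves the old one untouched, so an earlier root keeps working as a snapshot while the subtrees that were not modified are shared between the two versions.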

Although not required to use the library, it may help to understand the core data-structure used by Quadrable:

Merkle tree: Each version of the database is a tree. The leaves of this tree are the inserted records, which are combined with calls to a cryptographic hash function, creating a smaller set of intermediate nodes. These intermediate nodes are then combined in a similar way to create a still smaller set, and this procedure continues until a single node is left, which is the root node. These "hash trees" are commonly called merkle trees, and they provide the mechanism for Quadrable's authentication.
Binary: The style of merkle tree used by Quadrable combines exactly two nodes to create a node in the next layer. There are alternative designs such as N-ary radix trees, AVL trees, and tries, but they are more complicated to implement and typically have a higher authentication overhead (in terms of proof size). With a few optimisations and attention to implementation detail, binary merkle trees enjoy almost all the benefits of these alternative designs.
Sparse: A traditional binary merkle tree does not have a concept of an "empty" leaf. This means that the leaves must be in a sequence, for example 1 through N (with no gaps). This raises the question of what to do when N is not a power of two. Furthermore, adding new records in a "path-independent" way, where insertion order doesn't matter, is difficult to do efficiently. Quadrable uses a sparse merkle tree structure, where there is a concept of an empty leaf, and leaf nodes can be placed anywhere inside a large (256-bit) key-space. This means that hashes of keys can be used directly as each leaf's location in the tree. Alternatively, Quadrable supports using sequential integers as keys to implement ordered data-sets.

Values are authenticated by exporting and importing proofs:

Compact proofs: In the classic description of a merkle tree, a value is proved to exist in the tree by providing a list of witness values as a proof. The value to be proved is hashed and then combined with the witnesses in order to reconstruct the hashes of the intermediate nodes along the path from the leaf to the root. If at the end of the list of witnesses you end up with the root hash, the value is considered authenticated. However, if you wish to authenticate multiple values in the tree at the same time, these linear proofs will contain duplicated hashes, which wastes space. Additionally, some hashes that would need to be included with a proof for a single value can instead be calculated by the verifier. Quadrable's proof encoding never includes redundant sibling hashes, or ones that could be calculated during verification. It does this with a low overhead (approximately 0-6 bytes per proved item, not including sibling hashes).
Partial-trees: Since the process of verifying a merkle proof reconstructs some intermediate nodes of the tree, Quadrable constructs a "partial-tree" when authenticating a set of values. This partial-tree can be queried in the same way as if you had the full tree locally, although it will throw errors if you try to access non-authenticated values. You can also make modifications on a partial-tree, so long as you don't attempt to modify a non-authenticated value. After an update, the new root of the partial-tree will be the same as the root would be if you made the same modifications to the full tree. Once a proof has been imported, additional proofs that were exported from the same tree can be merged in, expanding a partial-tree over time as new proofs are received. New proofs can also be generated from a partial-tree, as long as the values to prove are present (or were proved to not be present).
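The classic linear proof described under "Compact proofs" can be sketched as follows. This is a minimal illustration of the textbook scheme, not Quadrable's compact encoding or its API. It assumes a toy 64-bit FNV-1a hash (toyHash) purely so the example is self-contained; that function is not cryptographic, and a real system would use a 256-bit cryptographic hash. The Witness struct and verifyLinearProof are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// ASSUMPTION: toy 64-bit FNV-1a hash. NOT cryptographic; a stand-in only.
inline uint64_t toyHash(const std::string &data) {
    uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Hash of an interior node: hash(left || right), each child as 8 big-endian bytes.
inline uint64_t combine(uint64_t left, uint64_t right) {
    std::string buf;
    for (int shift = 56; shift >= 0; shift -= 8) buf.push_back(static_cast<char>((left >> shift) & 0xFF));
    for (int shift = 56; shift >= 0; shift -= 8) buf.push_back(static_cast<char>((right >> shift) & 0xFF));
    return toyHash(buf);
}

// One step of a linear proof: the sibling's hash and which side it sits on.
struct Witness {
    uint64_t siblingHash;
    bool siblingIsLeft;
};

// Rebuilds the path from the leaf up to the root (witnesses are ordered
// leaf-to-root) and checks the result against the trusted root.
inline bool verifyLinearProof(const std::string &value,
                              const std::vector<Witness> &witnesses,
                              uint64_t expectedRoot) {
    uint64_t current = toyHash(value);
    for (const Witness &w : witnesses) {
        current = w.siblingIsLeft ? combine(w.siblingHash, current)
                                  : combine(current, w.siblingHash);
    }
    return current == expectedRoot;
}
```

Proving two neighbouring leaves this way would send the same upper-level witnesses twice, which is the duplication that a combined multi-value proof avoids.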

Quadrable is a Log Periodic project.

Building

Dependencies

The LMDB and BLAKE2 libraries and header files are required. On Ubuntu/Debian run this:

sudo apt install -y liblmdb-dev libb2-dev

Compilation

Clone the repo, cd into it, and run these commands:

git submodule update --init
make -j

Tests

You can run the tests like so:

make test

If you run make clean before make test, a version of the check binary will be compiled with address sanitiser support.

You can view a coverage report of the tests by running:

make coverage

You will need lcov installed.
The report will be in this file: coverage-report/index.html

Data Structure

Trees and Exponential Growth

We use trees because the number of nodes grows exponentially as the number of levels is increased. In other words, the number of intermediate nodes that must be traversed to get to a leaf grows much more slowly than the total number of nodes.
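The arithmetic behind this can be sketched with two small helpers. They assume a full binary tree in which every level is completely populated, which is a simplification for illustration; depthForLeafCount and totalNodesInFullTree are hypothetical names, not part of Quadrable.

```cpp
#include <cassert>
#include <cstdint>

// Smallest depth d such that a full binary tree with 2^d leaf slots
// can hold leafCount leaves (the root alone is depth 0).
inline unsigned depthForLeafCount(uint64_t leafCount) {
    assert(leafCount >= 1);
    unsigned depth = 0;
    while (depth < 64 && ((leafCount - 1) >> depth) != 0) depth++;
    return depth;
}

// Total number of nodes (interior nodes plus leaves) in a full binary
// tree of the given depth: 2^(depth + 1) - 1.
inline uint64_t totalNodesInFullTree(unsigned depth) {
    assert(depth < 63);
    return (uint64_t{1} << (depth + 1)) - 1;
}
```

For example, a full tree of depth 20 has room for 1,048,576 leaves and contains 2,097,151 nodes in total, yet reaching any leaf takes only 20 steps from the root.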

For some reason, computer science trees are usually drawn as growing in the downwards direction. Because of this, we use the term "depth" to refer to how many levels down you are from the top node (the "root").

Merkle Trees

In a merkle tree, each node has a "nodeHash" which is formed by hashing the concatenation of its children's nodeHashes. In Quadrable the tree is binary, so there are always exactly two children (except for leaf nodes, which have none). The order of the concatenation is important: The left child's nodeHash comes first, followed by the right child's.
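The concatenation order can be sketched as follows. This is an illustration under stated assumptions rather than Quadrable's actual hashing code: toyHash is a 64-bit FNV-1a stand-in that is not cryptographic, nodeHashes are shortened to 64 bits, and branchNodeHash and rootOfFour are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// ASSUMPTION: toy 64-bit FNV-1a hash. NOT cryptographic; a stand-in only.
inline uint64_t toyHash(const std::string &data) {
    uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Encodes a toy nodeHash as 8 big-endian bytes so it can be concatenated.
inline std::string encode(uint64_t h) {
    std::string out;
    for (int shift = 56; shift >= 0; shift -= 8) out.push_back(static_cast<char>((h >> shift) & 0xFF));
    return out;
}

// nodeHash of an interior node: the left child's nodeHash comes first.
inline uint64_t branchNodeHash(uint64_t leftHash, uint64_t rightHash) {
    return toyHash(encode(leftHash) + encode(rightHash));
}

// Root of a tiny tree with exactly four leaves, combined pairwise.
inline uint64_t rootOfFour(const std::string &a, const std::string &b,
                           const std::string &c, const std::string &d) {
    return branchNodeHash(branchNodeHash(toyHash(a), toyHash(b)),
                          branchNodeHash(toyHash(c), toyHash(d)));
}
```

Swapping the two arguments of branchNodeHash yields a different nodeHash, and changing any one leaf passed to rootOfFour changes the resulting root.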

The advantage of a merkle tree is that the nodeHash of the node at depth 0 (the top level) is a digest of all the other nodes and leaves. This top-level nodeHash is often just called the "root". As long as the tree structure is carefully designed, and the hash function is secure, any changes to the tree or its contents will result in a new, distinct root.

Keys

In Quadrable's implementation of a merkle tree, keys are 256 bits long and these bits are used to traverse the tree to find the locations where the values are stored. A 0 bit means use the left child of a node, and a 1 bit means use the right child.
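The traversal rule can be sketched as follows. The bit ordering here is an assumption made for illustration (most significant bit of the first byte is consulted at the root); the text above does not specify it. Key, keyBitAtDepth and pathPrefix are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

using Key = std::array<uint8_t, 32>;  // a 256-bit key

// Returns the key bit consulted at the given depth (0 = at the root).
// ASSUMPTION: bits are read most-significant-bit first, starting at byte 0.
inline bool keyBitAtDepth(const Key &key, std::size_t depth) {
    assert(depth < 256);
    return (key[depth / 8] >> (7 - depth % 8)) & 1;
}

// Renders the first `levels` turns of the path: 'L' for a 0 bit, 'R' for a 1 bit.
inline std::string pathPrefix(const Key &key, std::size_t levels) {
    std::string path;
    for (std::size_t d = 0; d < levels; d++) path.push_back(keyBitAtDepth(key, d) ? 'R' : 'L');
    return path;
}
```

Under that assumption, a key whose first byte is 0xA0 (binary 10100000) starts its descent right, left, right, left.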

When using Quadrable as a map (as opposed to a log), keys are first hashed, and these hashes are used to traverse the tree. Keys are hashed for multiple reasons:

Keys of any length can be supported, since a hash will always return a fixed-size output.
It puts a bound on the depth of the tree. Because Quadrable uses a 256-bit hash fu
