Install / Use

/learn @bneb/Lattice
Salt + Lattice

A Sovereign Microkernel for High-Performance Distributed Workloads, built in a systems language with embedded formal verification.

Salt is an ahead-of-time compiled systems language that combines the performance of C with compile-time safety through an embedded Z3 theorem prover. Lattice is a microkernel operating system written entirely in Salt, achieving unikernel-level latency while maintaining hardware-enforced Ring 0 / Ring 3 isolation.

Together, they form a single system where the language's core capabilities (formal verification and MLIR-based lowering) become the operating system's capabilities: zero-trap IPC, proof-carrying descriptors, and cache-line-deterministic data planes.


package main

use std.collections.HashMap

fn main() {
    let mut map = HashMap<StringView, i64>::new();
    map.insert("hello", 1);
    map.insert("world", 2);

    let result = map.get("hello") |?> println(f"Found: {_}");

    for entry in map.iter() {
        println(f"{entry.key}: {entry.value}");
    }
}

Why Salt + Lattice?

Most operating systems are written in C (Linux, Xv6, seL4) or C++ (Fuchsia). They rely on extensive runtime checks, POSIX syscall conventions, and manual memory management. Salt replaces all three with compile-time proofs, zero-trap shared memory, and arena-based allocation, giving Lattice the performance of a unikernel with the isolation guarantees of a microkernel.

The Three Pillars

🔥 Pillar A: Zero-Trap Data Plane (SPSC + Shared Memory)

Salt: High-performance, low-level memory control with MLIR-optimized lowering.

Lattice: Instead of legacy POSIX syscalls (read/write) that trap into the kernel on every packet, Lattice uses Shared Memory SPSC (Single-Producer, Single-Consumer) Rings. The networking stack (NetD) and storage stack (LatticeStore) run as Ring 3 "System Daemons" that communicate with the kernel through lock-free ring buffers in shared pages.

Traditional OS:  App → syscall → trap → kernel copy → return    (~1000 cycles)
Lattice:         App → SPSC write → shared memory → NetD reads  (~150 cycles)

The kernel's only role in the data plane is pushing raw Ethernet frames into the SPSC ring and firing a wake notification. All protocol parsing (ARP, TCP, IP) happens in Ring 3, isolating the kernel from packet-parsing RCE vulnerabilities.

🔒 Pillar B: The Formal Shadow (Z3-Verified Sovereignty)

Salt: A built-in Z3 verification gate that proves memory safety and alignment at compile time.

Lattice: Proof-Carrying IPC. The compiler "seals" a Z3 proof into a 64-bit proof_hint embedded in every SPSC descriptor. The NetD arbiter verifies this hint in O(1) time (two CPU instructions: alignment mask + bitwise compare).

// At compile time, Z3 proves @align(64) fields are on separate cache lines.
// The compiler seals this proof:
//   proof_hint = hash_combine(struct_id, field_offset, alignment)
// The arbiter validates the seal before touching any shared memory.

struct SpscDescriptor {
    ptr: u64,           // Must be 64-byte aligned (mechanical check)
    len: u32,
    proof_hint: u64,    // Z3-sealed "Right to Access" token
}

This eliminates the "Security Tax." We don't need expensive runtime bounds checks because the hardware (MMU page tables) and the math (Z3 SMT solver) have already validated the memory access before the binary is even loaded.

⚡ Pillar C: Mechanical Sympathy (The Cache-Line Guarantee)

Salt: First-class support for physical memory layout via the @align(N) attribute with Z3-verified struct padding.

Lattice: False-sharing elimination. Lattice SPSC rings are formally proven to isolate Producer and Consumer indexes on separate L3 cache lines:

struct SpscRing {
    @align(64)
    head: u64,         // Producer-owned (cache line 0)
    capacity: u64,

    @align(64)
    tail: u64,         // Consumer-owned (cache line 1)
}
// Z3 PROVED: head at offset 0, tail at offset 64 (z3_align_verified)

This targets the Cycles per Packet (Cpp) KPI. We aren't just fast; we are deterministic. No cache-line "ping-pong" between cores, no prefetcher-induced jitter, no false-sharing invalidation storms.


Approach

Salt takes a different path from runtime-checked systems languages. The compiler integrates Z3 as a first-class verification backend: developers write requires preconditions and ensures postconditions on functions, and the compiler checks each contract using Z3. Preconditions are verified at every call site; postconditions are verified at every return site using Weakest Precondition (WP) generation with path-sensitive branch analysis. When Z3 proves the condition always holds, the check is elided entirely, at zero runtime cost. When Z3 finds a concrete counterexample, it reports the violating values. When neither can be determined, the compiler emits a standard runtime assertion as a fallback.

Memory is managed through arenas with compile-time escape analysis. No garbage collector, no lifetime annotations, no borrow checker. The ArenaVerifier verifies statically that no reference outlives its region, giving you the performance profile of manual allocation with the safety properties of managed memory.

Multi-Dialect Compilation

The compiler routes code through multiple MLIR dialects depending on the optimization opportunity:

| Pattern | Dialect | Optimization |
|---------|---------|--------------|
| Tensor/matrix loops | affine.for | Polyhedral tiling, loop fusion |
| Scalar-heavy loops | scf.for | Register pressure optimization |
| Branching control flow | cf + llvm | Standard LLVM backend |
| Arena operations | Custom lowering | Escape analysis, bulk free |

This is the mechanism behind Salt's performance results. When a matmul kernel is compiled through the affine dialect, MLIR can tile the iteration space for cache locality in a way that a flat LLVM IR representation cannot express. The compiler emits 120 unique MLIR operations across these dialects.

Performance

All benchmarks use runtime-dynamic inputs to prevent constant folding, and results are printed to prevent dead code elimination. Each measurement averages 3 runs with cached binaries. Full methodology is documented in the benchmark suite.

Verified March 1, 2026 on Apple M4

| Benchmark | Salt | C (clang -O3) | Rust | vs. C |
|-----------|------|---------------|------|-------|
| matmul (1024³) | 203ms | 923ms | 970ms | 4.5× |
| buffered_writer | 43ms | 363ms | 60ms | 8.4× |
| fstring_perf (10M) | 240ms | 1,113ms | 773ms | 4.6× |
| forest (depth-22)* | 60ms | 237ms | 330ms | 4×* |
| longest_consecutive | 260ms | 803ms | 393ms | 3.1× |
| http_parser | 77ms | 97ms | 153ms | 1.3× |
| trie | 83ms | 107ms | 277ms | 1.3× |
| vector_add | 110ms | 133ms | 147ms | 1.2× |
| sudoku_solver | 33ms | 50ms | 37ms | 1.5× |
| lru_cache | 57ms | 77ms | 80ms | 1.4× |
| window_access | 93ms | 120ms | 140ms | 1.3× |
| hashmap_bench | 87ms | 100ms | 93ms | 1.1× |
| sieve (10M) | 173ms | 200ms | 280ms | 1.2× |
| fib | 207ms | 247ms | 233ms | 1.2× |
| fannkuch | 177ms | 200ms | 200ms | 1.1× |
| global_counter | 147ms | 183ms | 123ms | 1.2× |
| binary_tree_path | 37ms | 40ms | 40ms | parity |
| string_hashmap | 77ms | 77ms | 83ms | parity |
| bitwise | 67ms | 67ms | 53ms | parity |
| trapping_rain_water | 103ms | 97ms | 107ms | 0.9× |
| merge_sorted_lists | 187ms | 167ms | 143ms | 0.9× |
| writer_perf | 153ms | 123ms | 117ms | 0.8× |

Salt ≤ C in 18/22 head-to-head benchmarks. 28 total (including 6 Salt-only). 0 build failures. Binary size ~38KB (vs Rust ~430KB).

The "Abstraction Tax" is zero: Salt's Z3 verification, arena memory, and MLIR pipeline add no runtime overhead. The proofs discharge at compile time, the arenas free in O(1), and MLIR optimizes the same way LLVM does, or better when polyhedral tiling applies.

* Forest measures arena allocation strategy (O(1) bump + O(1) reset) vs individual malloc/free. The advantage is Salt's arena stdlib, not codegen.

Verified Safety

Contracts are proof obligations checked by Z3 at compile time. When Z3 can prove a requires precondition holds at a call site, the check is elided entirely, at zero runtime cost. When it cannot, the compiler emits a runtime assertion as a safe fallback.

fn binary_search(arr: &[i64], target: i64) -> i64
    requires(arr.len() > 0)
{
    let mut lo: i64 = 0;
    let mut hi: i64 = arr.len() - 1;

    while lo <= hi {
        let mid = lo + (hi - lo) / 2;
        if arr[mid] == target {
            return mid;
        } else if arr[mid] < target {
            lo = mid + 1;
        } else {
            hi = mid - 1;
        }
    }
    return -1;
}

Z3 verifies requires(arr.len() > 0) at every call site. Passing an empty array is a compile-time error with a concrete counterexample. Passing a non-empty array causes the check to be elided — the binary contains no guard.

Postconditions (v0.9.2)

ensures postconditions are verified at every return site using Weakest Precondition (WP) generation. The compiler tracks branch conditions through the control flow graph and provides Z3 with path-sensitive context at each exit point:

fn absolute_value(x: i32) -> i32
    ensures(result >= 0)
{
    if x < 0 {
        return -x;    // Z3 proves: given x < 0, -x >= 0  ✓
    }
    return x;         // Z3 proves: given x >= 0, x >= 0  ✓
}
