
turbolite

turbolite is a SQLite VFS in Rust that serves point lookups and joins directly from S3 with sub-250ms cold latency.

It also offers page-level compression (zstd) and encryption (AES-256) for efficiency and security at rest, which can be used separately from S3.

turbolite is experimental. It is new and contains bugs. It may corrupt your data. Please be careful.

Object storage is getting fast. S3 Express One Zone delivers single-digit millisecond GETs and Tigris is also extremely fast. The gap between local disk and cloud storage is shrinking, and turbolite exploits that.

The design and name are inspired by turbopuffer's approach of ruthlessly architecting around cloud storage constraints. The project's initial goal was to beat Neon's 500ms+ cold starts. Goal achieved.

If you have one database per server, use a volume. turbolite explores what happens when you have hundreds or thousands of databases (one per tenant, one per workspace, one per device), don't want a volume for each one, and are okay with a single write source.

turbolite ships as a Rust library, a SQLite loadable extension (.so/.dylib), and language packages for Python and Node.js, plus GitHub deps for Go. Any S3-compatible storage works (AWS S3, Tigris, R2, MinIO, etc.). It's a standard SQLite VFS operating at the page level, so most SQLite features should work: FTS, R-tree, JSON, WAL mode, etc.

If you want to contribute to turbolite or find bugs, please create a pull request or open an issue.

Performance

| Query | Type | Cold (S3 Express) | Cold (Tigris) |
|-------|------|-------------------|---------------|
| Post + user | point lookup + join | 77ms | 192ms |
| Profile | multi-table join (5 JOINs) | 190ms | 524ms |
| Who-liked | index search + join | 129ms | 340ms |
| Mutual friends | multi-search join | 82ms | 183ms |
| Indexed filter | covered index scan | 74ms | 173ms |
| Full scan + filter | full table scan | 586ms | 984ms |

1M rows, 1.5GB, with nothing cached: every byte comes from S3. EC2 c5.2xlarge + S3 Express One Zone (same AZ, ~4ms GET latency); Fly performance-8x + Tigris (~25ms GET latency). Both: 8 dedicated vCPUs, 16GB RAM, 8 prefetch threads. See Benchmarking and Storage backend matters.

Benchmarks are organized by cache level (what's already on local disk when the query runs):

| Cache level | What's cached | What's fetched from S3 | When this happens |
|-------------|---------------|------------------------|-------------------|
| none | nothing | everything | Fresh start, empty cache |
| interior | interior B-tree pages | index + data pages | First query after connection open |
| index | interior + index pages | data pages only | Normal turbolite operation |
| data | everything | nothing | Equivalent to local SQLite |

interior is the most realistic cold benchmark: interior pages load eagerly on connection open, so by the time you run your first query, they're cached. Index pages are prefetched aggressively in the background on first access and may not be ready yet.

Quick Start

Python

```shell
pip install turbolite
```

```python
import turbolite

# cloud database: serve cold queries from S3-compatible storage (Tigris)
conn = turbolite.connect("my.db", mode="s3",
    bucket="my-bucket",
    endpoint="https://t3.storage.dev")

conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice', 'alice@example.com')")
conn.commit()

alice = conn.cursor().execute("SELECT * FROM users").fetchone()
print(alice[1])  # alice
```

See Installation for Node, Go, Rust, local-only mode, and using the .so loadable extension directly.

Design

turbolite is designed for S3's constraints over filesystem constraints. Every decision flows from this model:

| S3 Constraint | Implication |
|---------------|-------------|
| Round trips are slow | Minimize request count. Batch writes, prefetch reads aggressively. |
| Bandwidth is a bottleneck | Maximize bandwidth utilization. |
| PUTs and GETs charge per-operation | A 64KB GET costs the same as a 16MB GET. Optimize request count, not byte efficiency. |
| Objects are immutable | Never update in place. Write new versions, swap a pointer. No partial-write corruption. |
| Storage is cheap | Don't optimize for space. Over-provision, keep old versions, let GC clean up later. |

Architecture

turbolite adds introspection and indirection layers between SQLite and S3 that efficiently group, compress, track, and fetch pages.

SQLite organizes tables and indexes as B-trees and requests one page at a time. It knows page N sits at byte offset (N - 1) * page_size, and pages from different trees are interleaved throughout the file, which suits random access on a local disk. But on S3, fetching one page per request would mean thousands of potentially random GETs per query.

But pages are not created equal. SQLite uses several distinct page types, and turbolite separates page groups by type: interior B-tree, index leaf, and data leaf pages.

Interior pages are touched on every query to route lookups to leaf pages. turbolite detects them, stores them in compressed bundles in S3, and loads them eagerly on VFS open. After that, every B-tree traversal is a cache hit.

Index leaf pages get the same treatment: separate bundles, lazy background prefetch, pinned against eviction. Cold queries only need to fetch data pages.

turbolite takes advantage of B-tree introspection to understand which tree (a table or index) a page is part of, and intelligently stores those pages together in S3 as page groups: many pages chunked into a single S3 object. Big enough to saturate bandwidth on prefetch, small enough for point queries. Default: 256 pages per group, ~16MB at 64KB pages.
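The grouping arithmetic can be sketched with the default figures quoted above (the helper names here are illustrative, not turbolite's API):

```python
# Defaults from the text: 256 pages per group, 64KB pages -> 16MB groups.
PAGES_PER_GROUP = 256
PAGE_SIZE = 64 * 1024

def group_of(tree_page_index: int) -> int:
    """Which page group (S3 object) holds the i-th page of a tree."""
    return tree_page_index // PAGES_PER_GROUP

def offset_in_group(tree_page_index: int) -> int:
    """Byte offset of that page inside its (uncompressed) group."""
    return (tree_page_index % PAGES_PER_GROUP) * PAGE_SIZE

# The 1000th page of a table lives in its tree's 4th group (index 3):
print(group_of(1000), offset_in_group(1000))
```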

Storing pages from the same table or index together means turbolite makes the fewest possible GETs for cold queries.

turbolite indirects page lookups with a manifest file that is the source of truth for where every page lives. It replaces SQLite's implicit offset = page * size with explicit pointers. Old page group versions are never overwritten; the manifest PUT is the atomic commit point. Old versions become garbage, cleaned up by gc().
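A minimal model of that indirection, with a plain dict standing in for the manifest (real turbolite stores it as an S3 object; the keys and field names here are invented):

```python
# Manifest maps page number -> (group object key, slot within group).
# Commits write NEW group objects, then atomically swap the manifest.
manifest = {
    1: ("groups/users/0-v1", 0),
    2: ("groups/users/0-v1", 1),
}

def commit(dirty_pages: dict[int, bytes]) -> None:
    """Write dirty pages to a new group version, then update the manifest.
    The manifest write is the atomic commit point; old versions stay
    untouched and become garbage for gc() to collect."""
    new_key = "groups/users/0-v2"  # never overwrite v1 in place
    # ... PUT new_key to S3 containing the rewritten pages ...
    for slot, page in enumerate(dirty_pages):
        manifest[page] = (new_key, slot)
    # ... PUT the manifest object: this single write commits the change ...

commit({2: b"new page contents"})
print(manifest[1])  # unchanged pages still point at the old version
print(manifest[2])  # updated pages point at the new version
```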

SQLite defaults to 4KB pages to match filesystem disk page size. On S3, disk page size is irrelevant. What matters is minimizing request count and maximizing B-tree fan-out. The answer is large pages: turbolite defaults to 64KB pages. Fewer pages = fewer S3 round trips to reach a leaf.
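The fan-out argument can be made concrete. Assuming ~50 bytes per interior cell (an illustrative figure, not measured from turbolite), tree depth is roughly the log of the row count in base fan-out:

```python
import math

def btree_depth(rows: int, page_size: int, bytes_per_cell: int = 50) -> int:
    """Rough B-tree depth: each interior page routes to
    ~page_size // bytes_per_cell children."""
    fanout = page_size // bytes_per_cell
    return math.ceil(math.log(rows, fanout))

# Larger pages mean higher fan-out and a shallower tree,
# i.e. fewer round trips to reach a leaf.
print(btree_depth(100_000_000, 4 * 1024))   # 4KB pages
print(btree_depth(100_000_000, 64 * 1024))  # 64KB pages
```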

To make point queries fast, turbolite uses seekable compression: each page group is encoded as multiple zstd frames (~4 pages per frame). The manifest stores byte offsets per frame, so a cache miss fetches just the ~256KB sub-chunk with the needed page via S3 range GET, not the entire group.
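The frame arithmetic works roughly like this sketch (the per-frame offsets would live in the manifest; zlib stands in for zstd since it's in the standard library, and the sizes are scaled-down demo values):

```python
import zlib

PAGES_PER_FRAME = 4
PAGE_SIZE = 64 * 1024

# Build a demo group: compress every 4 pages as an independent frame
# and record each frame's byte offset, as the manifest does.
pages = [bytes([i]) * PAGE_SIZE for i in range(16)]
frames, offsets, pos = [], [], 0
for i in range(0, len(pages), PAGES_PER_FRAME):
    frame = zlib.compress(b"".join(pages[i:i + PAGES_PER_FRAME]))
    offsets.append(pos)
    frames.append(frame)
    pos += len(frame)
group_blob = b"".join(frames)

def read_page(n: int) -> bytes:
    """Point read: range-GET only the needed frame, decompress, slice."""
    f = n // PAGES_PER_FRAME
    start = offsets[f]
    end = offsets[f + 1] if f + 1 < len(offsets) else len(group_blob)
    chunk = zlib.decompress(group_blob[start:end])  # one frame only
    return chunk[(n % PAGES_PER_FRAME) * PAGE_SIZE:][:PAGE_SIZE]

assert read_page(9) == pages[9]
```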

Prefetching has two layers: proactive (query-plan frontrunning) and reactive (adaptive miss-based).

Query-plan frontrunning runs first. Before a query executes, turbolite intercepts the SQLite query plan via EXPLAIN QUERY PLAN, extracts the exact tables and indexes the query will touch, and submits all their page groups to the prefetch pool before the first page is even read. A five-table join that would otherwise trigger five sequential miss-then-fetch cycles instead fires all five fetches in parallel at query start. For SCAN queries, this means the entire table is prefetched upfront.

Caveat: SQLite supports one trace callback per connection. If another extension claims the slot first, frontrunning silently falls back to reactive prefetch.
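Extracting the touched tables and indexes from the plan can be reproduced with the stdlib sqlite3 module; this is a simplified sketch of the idea, not turbolite's parser:

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE INDEX posts_user ON posts(user_id);
""")

def plan_objects(query: str) -> set[str]:
    """Tables/indexes that EXPLAIN QUERY PLAN says the query will touch."""
    objs = set()
    for row in conn.execute("EXPLAIN QUERY PLAN " + query):
        detail = row[3]  # e.g. "SEARCH posts USING INDEX posts_user (user_id=?)"
        m = re.match(r"(?:SCAN|SEARCH) (\w+)(?: USING (?:COVERING )?INDEX (\w+))?",
                     detail)
        if m:
            objs.update(g for g in m.groups() if g)
    return objs

print(plan_objects("SELECT name FROM users JOIN posts ON posts.user_id = users.id"))
```

Every object named in the result can have its page groups submitted to the prefetch pool before the query starts reading.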

Reactive prefetching handles what frontrunning misses and acts as fallback. On a cache miss, two things happen concurrently:

  1. Inline range GET: fetch the specific sub-chunk containing the needed page, return to SQLite immediately.
  2. Background prefetch: submit sibling groups for that tree to the prefetch pool according to a schedule.

Miss counters are tracked per B-tree, not globally. A profile query that hits users (miss 1) then posts (miss 1) correctly tracks each tree at 1, not 2. This prevents a multi-table join from accidentally escalating prefetch on every tree just because it touches several.

Each consecutive miss advances through a prefetch schedule that controls what fraction of same-tree groups to prefetch. turbolite selects a schedule automatically based on the query plan:

  • Search schedule [0.3, 0.3, 0.4]: for SEARCH ... USING INDEX queries that scan unknown portions of indexes. Aggressive from the first miss because we don't know how much of the index will be scanned.
  • Lookup schedule [0.0, 0.0, 0.0]: for point queries and index lookups that hit 1-2 pages per tree. Three free hops before any prefetch. Zero-heavy schedules outperform early-ramp on both S3 Express and Tigris.
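Taken together, the per-tree counters and the schedules above amount to logic like this (a behavioral sketch with invented names, not turbolite's code):

```python
from collections import defaultdict

SEARCH = [0.3, 0.3, 0.4]   # aggressive: unknown portion of an index scanned
LOOKUP = [0.0, 0.0, 0.0]   # point queries: three free hops, no prefetch

misses = defaultdict(int)  # tracked per B-tree, not globally

def on_miss(tree: str, schedule: list[float]) -> float:
    """Return the fraction of this tree's groups to prefetch on a miss."""
    i = misses[tree]
    misses[tree] += 1
    # Past the end of the schedule, keep using the last fraction.
    return schedule[min(i, len(schedule) - 1)]

# A two-table point join: each tree misses once, neither triggers prefetch.
assert on_miss("users", LOOKUP) == 0.0
assert on_miss("posts", LOOKUP) == 0.0
# An index search prefetches 30% of the index's groups on its first miss.
assert on_miss("idx_posts_user", SEARCH) == 0.3
```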

You can tune the prefetch schedule for each query via SELECT turbolite_config_set(...): you know the query's storage needs, so the VFS doesn't have to guess. See Runtime tuning.

Both schedules take advantage of B-tree introspection: every prefetched group is guaranteed to contain pages from the right tree. An example: if SQLite requests a page from the users table, then requests another from the same table, turbolite assumes a scan is coming and prefetches the rest of the users table in the background, and nothing else. Without B-tree introspection, it would accidentally fetch half the users table and half the posts table just because the data lives next to each other on disk.

Encryption & Compression

Compression

All data is zstd-compressed before storage. Page groups use the seekable multi-frame encoding described above, so a point read decompresses only the frame containing the needed page.
