# Verneuil: streaming replication for SQLite
Verneuil[^verneuil-process] [vɛʁnœj] is a VFS (OS abstraction layer) for SQLite that accesses local database files like the default unix VFS while asynchronously replicating snapshots to S3-compatible blob stores. We wrote it to improve the scalability and availability of pre-existing services for which SQLite is a good fit, at least for single-node deployments.
Backtrace relies on Verneuil to back up and replicate thousands of SQLite databases that range in size from 100KB to a few gigabytes, some of which see updates every second... for less than $40/day in S3 costs.
It has been tested on linux/amd64, linux/aarch64 (little endian), and darwin/aarch64. The SQLite file format and the Verneuil replication data are both platform-agnostic.
[^verneuil-process]: The Verneuil process was the first commercial method of manufacturing synthetic gemstones... and DRH insists on pronouncing SQLite like a mineral, surely a precious one (:
The primary design goal of Verneuil is to add asynchronous read replication to working single-node systems without introducing new catastrophic failure modes. Avoiding new failure modes takes precedence over all other considerations, including replication lag: there is no attempt to bound or minimise the staleness of read replicas. Verneuil read replicas should only be used when stale data is acceptable.
In keeping with this conservative approach to replication, the local
database file on disk remains the source of truth, and the VFS is
fully compatible with SQLite's default unix VFS, even for concurrent
(with file locking) accesses. Verneuil stores all state that must
persist across SQLite transactions on disk, so multiple processes can
still access and replicate the same database with Verneuil.
Verneuil also paces all API calls (with a [currently hardcoded] limit of 30 calls/second/process) to avoid "surprising" cloud bills. The SQLite VFS is decoupled from the replication worker threads that upload data to a remote blob store by a crash-safe buffer directory whose worst-case disk footprint is bounded to roughly four times the size of the source database file. It's thus always safe to disable access to the blob store: buffered replication data may grow over time, but always within bounds.
Replacing the default unix VFS with Verneuil impacts local SQLite
operations, of course: writes must be slower, in order to queue
updates for replication. However, this slowdown is usually
proportional to the time it took to perform the write itself, and
often dominated by the two fsyncs incurred by SQLite transaction
commits in rollback mode. Moreover, the replication logic runs with
the write lock downgraded to a read lock, so subsequent transactions
only block on the new replication step once they're ready to commit.
This effort is not directly comparable with litestream:
Verneuil is meant for asynchronous read replication, with streaming
backups as a nice side effect, so the replication approach is
completely different. In particular, while litestream only works with
SQLite databases in WAL mode, Verneuil only supports rollback
journaling. See doc/DESIGN.md for details.
## What's in this repo
- A "Linux" VFS (`c/linuxvfs.c`) that implements everything SQLite needs for a non-WAL DB, without all the backward-compatibility cruft in SQLite's Unix VFS. The new VFS's behaviour is fully compatible with upstream's Unix VFS! It's a simpler starting point for new (Linux-only) SQLite VFSes.
- A Rust crate with a C interface (see `include/verneuil.h`) to configure and register:
  - The `verneuil` VFS, which hooks into the Linux VFS to track changes, generate snapshots in spooling directories, and asynchronously upload spooled data to a remote blob store like S3. This VFS is only compatible with SQLite's rollback journal mode. It can be used directly from Rust, or via its C interface.
  - The `verneuil_snapshot` VFS that lets SQLite access snapshots stored in S3-compatible blob stores.
- A runtime-loadable SQLite extension, `libverneuil_vfs`, that lets SQLite open databases with the `verneuil` VFS (to replicate the database to remote storage), or with the `verneuil_snapshot` VFS (to access a replicated snapshot).
- The `verneuilctl` command-line tool to restore snapshots, forcibly upload spooled data, synchronise a database file to remote storage, and perform other ad hoc administrative tasks.
## Quick start
There is more detailed setup information, including how to directly
link against the `verneuil` crate instead of loading it as a SQLite
extension, in doc/VFS.md and doc/SNAPSHOT_VFS.md. The
`rusqlite_integration` example shows how that works for a Rust crate.
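For a taste of what that looks like, here's a minimal sketch (not the
crate's exact setup code): it assumes the `verneuil` VFS has already
been configured and registered as described in doc/VFS.md, relies only
on rusqlite's stock `Connection::open_with_flags_and_vfs`, and the
`kv` table is made up for illustration.

```rust
// Minimal sketch, assuming the `verneuil` VFS is already registered
// (see doc/VFS.md and the rusqlite_integration example for the real
// setup steps); the schema below is purely illustrative.
use rusqlite::{Connection, OpenFlags};

fn open_replicated(path: &str) -> rusqlite::Result<Connection> {
    // Ask SQLite for the VFS by name, exactly like passing a zVfs
    // argument to sqlite3_open_v2.
    Connection::open_with_flags_and_vfs(path, OpenFlags::default(), "verneuil")
}

fn main() -> rusqlite::Result<()> {
    let conn = open_replicated("source.db")?;
    // Writes go through the replicating VFS: each committed
    // transaction is spooled for asynchronous upload.
    conn.execute_batch("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")?;
    Ok(())
}
```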
For quick hacks and test drives, the easiest way to use Verneuil is to
build it as a runtime loadable extension for SQLite
(`libverneuil_vfs`).

```sh
cargo build --release --examples --features='dynamic_vfs'
```
The `verneuilctl` tool will also be useful.

```sh
cargo build --release --examples --features='vendor_sqlite'
```
Verneuil needs additional configuration to know where to spool
replication data, and where to upload or fetch data from remote
storage. That configuration data must be encoded in JSON, and will be
deserialised into a `verneuil::Options` struct (in src/lib.rs).
A minimal configuration string looks as follows. See doc/VFS.md and
doc/SNAPSHOT_VFS.md for more details.
```js
{
    // "make_default": true, to use the replicating VFS by default
    // "tempdir": "/my/tmp/", to override the location of temporary files
    "replication_spooling_dir": "/tmp/verneuil/",
    "replication_targets": [
        {
            "s3": {
                "region": "us-east-1",
                // "endpoint": "http://127.0.0.1:9000", // for non-standard regions
                "chunk_bucket": "verneuil_chunks",
                "manifest_bucket": "verneuil_manifests",
                "domain_addressing": true // or false for the legacy bucket-as-path interface
                // "create_buckets_on_demand": true // to create private buckets as needed
            }
        }
    ]
}
```
That's a mouthful to pass as query string parameters to
`sqlite3_open_v2`, so Verneuil currently looks for that configuration
string in the `VERNEUIL_CONFIG` environment variable. If that
variable's value starts with an at sign, like `"@/path/to/config.json"`,
Verneuil looks for the configuration JSON in that file.
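In other words, the lookup behaves like the following sketch; this is
an illustration of the convention, not Verneuil's actual
implementation.

```rust
// Illustrative sketch of the configuration lookup convention described
// above; not Verneuil's actual code.
fn verneuil_config_string() -> std::io::Result<String> {
    let raw = std::env::var("VERNEUIL_CONFIG")
        .expect("VERNEUIL_CONFIG must be set");
    match raw.strip_prefix('@') {
        // "@/path/to/config.json": read the JSON from that file.
        Some(path) => std::fs::read_to_string(path),
        // Otherwise, the variable's value is the JSON itself.
        None => Ok(raw),
    }
}
```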
The configuration file does not include any credentials: Verneuil
gets those from the environment, either by hitting the local EC2
credentials daemon, or by reading the `AWS_ACCESS_KEY_ID` and
`AWS_SECRET_ACCESS_KEY` environment variables.
Now that the environment is set up, we can load the extension in SQLite, and start replicating our writes to S3, or any other compatible blob server (we use minio for testing).
```console
$ RUST_LOG=warn VERNEUIL_CONFIG=@verneuil.json sqlite3
SQLite version 3.22.0 2018-01-22 18:45:57
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .load ./libverneuil_vfs -- Load the Verneuil VFS extension.
sqlite> .open file:source.db?vfs=verneuil
-- The contents of source.db will now be spooled for replication before
-- letting each transaction close.
sqlite> .open file:verneuil://source.host.name/path/to/replicated.db?vfs=verneuil_snapshot
-- opens a read replica for the most current snapshot replicated to s3 by `source.host.name`
-- for the database at `/path/to/replicated.db`.
```
Outside the SQLite shell, extension loading must be enabled
in order to allow access to the `load_extension` SQL function.
URI filenames must also be enabled
in order to specify the VFS in the connection string; it's also possible
to pass a VFS argument to `sqlite3_open_v2`.
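For example, with rusqlite (whose default open flags already enable
URI filenames), opening a read replica looks like the following
sketch; the Verneuil VFSes must already be registered in the process,
and the host name and database path are the placeholders from the
shell session above.

```rust
// Sketch: open a read replica through the `verneuil_snapshot` VFS by
// URI, assuming the Verneuil VFSes are registered in this process.
use rusqlite::Connection;

fn main() -> rusqlite::Result<()> {
    // rusqlite's default open flags include SQLITE_OPEN_URI, so the
    // `vfs=` query parameter is honoured.
    let replica = Connection::open(
        "file:verneuil://source.host.name/path/to/replicated.db?vfs=verneuil_snapshot",
    )?;
    let n: i64 = replica.query_row("SELECT count(*) FROM sqlite_master", [], |r| r.get(0))?;
    println!("replica has {n} schema entries");
    Ok(())
}
```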
Replication data is buffered to the `replication_spooling_dir`
synchronously, before the end of each SQLite transaction. Actually
uploading the data to remote storage happens asynchronously: we
wouldn't want to block transaction commit on network calls.
After exiting the shell or closing an application, we can make sure
that all spooled data is flushed to remote storage with
`verneuilctl flush $REPLICATION_SPOOLING_DIR`: this command will attempt to
synchronously upload all pending spooled data in the spooling
directory, and log noisily / error out on failure.
Find documentation for other `verneuilctl` subcommands with `verneuilctl help`:
```console
$ ./verneuilctl --help
verneuilctl 0.1.0
utilities to interact with Verneuil snapshots

USAGE:
    verneuilctl [OPTIONS] <SUBCOMMAND>

FLAGS:
    -h, --help
            Prints help information

    -V, --version
            Prints version information

OPTIONS:
    -c, --config <config>
            The Verneuil JSON configuration used when originally copying the database to remote storage.

            A value of the form "@/path/to/json.file" refers to the contents of that file; otherwise, the argument
            itself is the configuration string.

            This parameter is optional, and defaults to the value of the `VERNEUIL_CONFIG` environment variable.

    -l, --log <log>
            Log level, in the same format as `RUST_LOG`
```