Sqlflite
An example Flight SQL Server implementation - with DuckDB and SQLite back-ends.
Install / Use
/learn @voltrondata/SqlfliteREADME
SQLFlite
An Arrow Flight SQL Server with DuckDB or SQLite back-end execution engines
<img src="https://img.shields.io/badge/dockerhub-image-green.svg?logo=Docker"> <img src="https://img.shields.io/badge/Documentation-dev-yellow.svg?logo="> <img src="https://img.shields.io/badge/GitHub-voltrondata%2Fsqlflite-blue.svg?logo=Github"> <img src="https://img.shields.io/badge/Arrow%20JDBC%20Driver-download%20artifact-red?logo=Apache%20Maven"> <img src="https://img.shields.io/badge/PyPI-Arrow%20ADBC%20Flight%20SQL%20driver-blue?logo=PyPI"> <img src="https://img.shields.io/badge/PyPI-SQLFlite%20Ibis%20Backend-blue?logo=PyPI"> <img src="https://img.shields.io/badge/PyPI-SQLFlite%20SQLAlchemy%20Dialect-blue?logo=PyPI">
Description
This repo demonstrates how to build an Apache Arrow Flight SQL server implementation using DuckDB or SQLite as a backend database.
It enables authentication via middleware and allows for encrypted connections to the database via TLS.
For more information about Apache Arrow Flight SQL - please see this article.
Option 1 - Running from the published Docker image
Open a terminal, then pull and run the published Docker image which has everything setup (change: "--detach" to "--interactive" if you wish to see the stdout on your screen) - with command:
docker run --name sqlflite \
--detach \
--rm \
--tty \
--init \
--publish 31337:31337 \
--env TLS_ENABLED="1" \
--env SQLFLITE_PASSWORD="sqlflite_password" \
--env PRINT_QUERIES="1" \
--pull missing \
voltrondata/sqlflite:latest
The above command will automatically mount a very small TPC-H DuckDB database file.
Note: You can disable TLS in the container by setting environment variable: TLS_ENABLED to "0" (default is "1" - enabled). This is not recommended unless you are using an mTLS sidecar in Kubernetes or something similar, as it will be insecure.
Optional - open a different database file
When running the Docker image - you can have it run your own DuckDB database file (the database must be built with DuckDB version: 1.1.1).
Prerequisite: DuckDB CLI
Install DuckDB CLI version 1.1.1 - and make sure the executable is on your PATH.
Platform Downloads:
Linux x86-64
Linux arm64 (aarch64)
MacOS Universal
In this example, we'll generate a new TPC-H Scale Factor 1 (1GB) database file, and then run the docker image to mount it:
# Generate a TPC-H database in the host's /tmp directory
pushd /tmp
duckdb ./tpch_sf1.duckdb << EOF
.bail on
.echo on
SELECT VERSION();
INSTALL tpch;
LOAD tpch;
CALL dbgen(sf=1);
EOF
# Run the sqlflite docker container image - and mount the host's DuckDB database file created above inside the container
docker run --name sqlflite \
--detach \
--rm \
--tty \
--init \
--publish 31337:31337 \
--env TLS_ENABLED="1" \
--env SQLFLITE_PASSWORD="sqlflite_password" \
--pull missing \
--mount type=bind,source=$(pwd),target=/opt/sqlflite/data \
--env DATABASE_FILENAME="data/tpch_sf1.duckdb" \
voltrondata/sqlflite:latest
Running initialization SQL commands
You can now run initialization commands upon container startup by setting environment variable: INIT_SQL_COMMANDS to a string of SQL commands separated by semicolons - example value:
SET threads = 1; SET memory_limit = '1GB';.
Here is a full example of running the Docker image with initialization SQL commands:
docker run --name sqlflite \
--detach \
--rm \
--tty \
--init \
--publish 31337:31337 \
--env TLS_ENABLED="1" \
--env SQLFLITE_PASSWORD="sqlflite_password" \
--env PRINT_QUERIES="1" \
--env INIT_SQL_COMMANDS="SET threads = 1; SET memory_limit = '1GB';" \
--pull missing \
voltrondata/sqlflite:latest
You can also specify a file containing initialization SQL commands by setting environment variable: INIT_SQL_COMMANDS_FILE to the path of the file containing the SQL commands - example value: /tmp/init.sql. The file must be mounted inside the container.
Note: for the DuckDB back-end - the following init commands are automatically run for you:
SET autoinstall_known_extensions = true; SET autoload_known_extensions = true;
Note: Initialization SQL commands which SELECT data will NOT show the results (this is not supported).
Note: Initialization SQL commands which fail will cause the Flight SQL server to abort and exit with a non-zero exit code.
Connecting to the server via JDBC
Download the Apache Arrow Flight SQL JDBC driver
You can then use the JDBC driver to connect from your host computer to the locally running Docker Flight SQL server with this JDBC string (change the password value to match the value specified for the SQLFLITE_PASSWORD environment variable if you changed it from the example above):
jdbc:arrow-flight-sql://localhost:31337?useEncryption=true&user=sqlflite_username&password=sqlflite_password&disableCertificateVerification=true
For instructions on setting up the JDBC driver in popular Database IDE tool: DBeaver Community Edition - see this repo.
Note - if you stop/restart the Flight SQL Docker container, and attempt to connect via JDBC with the same password - you could get error: "Invalid bearer token provided. Detail: Unauthenticated". This is because the client JDBC driver caches the bearer token signed with the previous instance's secret key. Just change the password in the new container by changing the "SQLFLITE_PASSWORD" env var setting - and then use that to connect via JDBC.
Connecting to the server via the new ADBC Python Flight SQL driver
You can now use the new Apache Arrow Python ADBC Flight SQL driver to query the Flight SQL server. ADBC offers performance advantages over JDBC - because it minimizes serialization/deserialization, and data stays in columnar format at all phases.
You can learn more about ADBC and Flight SQL here.
Ensure you have Python 3.9+ installed, then open a terminal, then run:
# Create a Python virtual environment
python3 -m venv .venv
# Activate the virtual environment
. .venv/bin/activate
# Install the requirements including the new Arrow ADBC Flight SQL driver
pip install --upgrade pip
pip install pandas pyarrow adbc_driver_flightsql
# Start the python interactive shell
python
In the Python shell - you can then run:
import os
from adbc_driver_flightsql import dbapi as sqlflite, DatabaseOptions
with sqlflite.connect(uri="grpc+tls://localhost:31337",
db_kwargs={"username": os.getenv("SQLFLITE_USERNAME", "sqlflite_username"),
"password": os.getenv("SQLFLITE_PASSWORD", "sqlflite_password"),
DatabaseOptions.TLS_SKIP_VERIFY.value: "true" # Not needed if you use a trusted CA-signed TLS cert
}
) as conn:
with conn.cursor() as cur:
cur.execute("SELECT n_nationkey, n_name FROM nation WHERE n_nationkey = ?",
parameters=[24]
)
x = cur.fetch_arrow_table()
print(x)
You should see results:
pyarrow.Table
n_nationkey: int32
n_name: string
----
n_nationkey: [[24]]
n_name: [["UNITED STATES"]]
Connecting via the new sqlflite_client CLI tool
You can also use the new sqlflite_client CLI tool to connect to the Flight SQL server, and then run a single command. This tool is built into the Docker image, and is also available as a standalone executable for Linux and MacOS.
Example (run from the host computer's terminal):
sqlflite_client \
--command Execute \
--host "localhost" \
--port 31337 \
--username "sqlflite_username" \
--password "sqlflite_password" \
--query "SELECT version()" \
--use-tls \
--tls-skip-verify
That should return:
Results from endpoint 1 of 1
Schema:
version(): string
Results:
version(): [
"v1.1.1"
]
Total: 1
Connecting via Ibis
See: https://github.com/ibis-project/ibis-sqlflite
Connecting via SQLAlchemy
See: https://github.com/prmoore77/sqlalchemy-sqlflite-adbc-dialect
Tear-down
Stop the docker image with:
docker stop sqlflite
Option 2 - Download and run the sqlflite CLI executable
Download (and unzip) the latest release of the sqlflite_server CLI executable from these currently supported platforms:
[Linux x86-64](https://github.com/voltrond
Related Skills
feishu-drive
342.5k|
things-mac
342.5kManage Things 3 via the `things` CLI on macOS (add/update projects+todos via URL scheme; read/search/list from the local Things database)
clawhub
342.5kUse the ClawHub CLI to search, install, update, and publish agent skills from clawhub.com
postkit
PostgreSQL-native identity, configuration, metering, and job queues. SQL functions that work with any language or driver
