SQLGateway

Access your SQL database over HTTP like it’s a SQL database but with superpowers. An edge function's best friend.

Superpowers include:

HTTP access for SQL databases enables WASM-based runtimes to use TCP-connected DBs
Connection pooling protects against reconnects, wasted idle connections, and bursts of load
Automatic query and transaction tracing
Caching capabilities

Currently only the PSQL protocol is supported. Additional protocol support (like MySQL) is on the roadmap.

Quick Start
Why This Exists
API
Configuration
Auth
Clustered vs. Single Node
Transactions
Running distributed tests

Quick Start

Pull this repo:

git clone https://github.com/danthegoodman1/SQLGateway && cd SQLGateway

Run the docker-compose.yml file:

docker compose up

Then in another terminal, run:

curl --location --request POST 'http://localhost:8080/psql/query' \
--header 'Content-Type: application/json' \
--data-raw '{
    "Queries": [
        {
            "Statement": "SELECT 1 as num_one, NOW() as current_time"
        }
    ]
}'

You should get the following back (with a different time):

{"Queries":[{"Columns":["num_one","current_time"],"Rows":[[1,"2022-11-27T19:20:13.030114Z"]],"TimeNS":958400}]}
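The same request can also be issued from code. Below is a minimal Python sketch using only the standard library. The helper names `build_query_request` and `run_query` are hypothetical (they are not part of any SQLGateway client library), and the base URL assumes the docker compose setup above is listening on localhost:8080.

```python
import json
import urllib.request

# Assumption: the gateway started by `docker compose up` listens here.
GATEWAY_URL = "http://localhost:8080"


def build_query_request(statement, params=None, base_url=GATEWAY_URL):
    """Build (but do not send) a POST /psql/query request for one statement."""
    query = {"Statement": statement}
    if params is not None:
        query["Params"] = list(params)
    body = json.dumps({"Queries": [query]}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/psql/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(statement, params=None, base_url=GATEWAY_URL):
    """Send the request and return the decoded JSON response body."""
    req = build_query_request(statement, params, base_url)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (requires the gateway to be running):
#   run_query("SELECT 1 as num_one, NOW() as current_time")
```

Building the request separately from sending it keeps the payload easy to inspect or log before anything touches the network.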

Why This Exists

I wanted to use Cloudflare Workers, but also the Postgres ecosystem (specifically CockroachDB Serverless).

The idea was to keep the HTTP layer out of the way and make it feel like you are talking to a normal SQL database.

Now we can connect the two worlds of WASM runtimes and SQL databases without vendor lock-in!

Some WASM runtimes that can now use SQL databases:

Cloudflare Workers
Vercel Edge Functions
Fastly Compute@Edge
Netlify Functions (note: there are indications that TCP connections may already be supported, since they use, or at least used to use, WasmEdge. I have not tested this.)

Some databases that WASM runtimes can now use:

AWS RDS & Aurora
GCP Cloud SQL
CockroachDB Dedicated & Serverless
DigitalOcean managed databases
UpCloud Managed Databases

Querying and Transactions

Send single queries, or send an array of queries to run atomically in a transaction.

Start a transaction and go back and forth between the DB and your code just like normal. The nodes in the cluster will automatically route transaction queries to the correct node (coordinated through Redis). Abandoned transactions will be garbage collected.

Automatic query and transaction tracing

Metric logs are emitted for the performance of individual queries, as well as entire transactions. Build dashboards and create alerts to find slowdowns and hot-spots in your code.

Coming soon (maybe?): Alerting and dashboards (for now just use some logging provider)

Caching (Coming Soon)

Specify SELECTs that don’t need to be consistent, and have them cached with a TTL and stale-while-revalidate support.

Connection Pooling

Prevent constant session creation from creating unnecessary load on the DB, and burst execution environments from holding idle connections that won't be used again.

Use HTTP Keep-Alive to keep connections warm for Lambda-like environments, but don’t risk overloading the DB with new connections or leaving tons of resource-intensive DB sessions idle.

Database Throttling Under Load

With a finite number of pool connections, you prevent uncapped load from hitting your database directly.
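To illustrate the throttling effect, here is a small, self-contained Python sketch. It is not how SQLGateway is implemented. It only models a fixed-size pool with a semaphore (the names `BoundedPool`, `fake_query`, and `simulate_burst` are hypothetical) and records the peak number of concurrent "queries" to show that it never exceeds the pool size.

```python
import threading
import time


class BoundedPool:
    """Toy model of a fixed-size connection pool: at most `size` callers run at once."""

    def __init__(self, size):
        self._slots = threading.BoundedSemaphore(size)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of simultaneous "connections" observed

    def run(self, fn):
        with self._slots:  # blocks while all pool slots are in use
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return fn()
            finally:
                with self._lock:
                    self._active -= 1


def fake_query():
    time.sleep(0.01)  # stand-in for a database round trip
    return 1


def simulate_burst(pool_size, requests):
    """Fire `requests` concurrent callers at a pool and return the peak concurrency."""
    pool = BoundedPool(pool_size)
    threads = [
        threading.Thread(target=pool.run, args=(fake_query,))
        for _ in range(requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.peak
```

In this model, a burst of 20 callers against a pool of 2 never has more than 2 queries in flight at once; the remaining callers wait for a free slot instead of opening new database sessions.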

API

GET /hc

Health check endpoint. It only guarantees that the HTTP server is running.

POST /psql/query

Request Body:

* indicates optional

{
  Queries: []{
    Statement:   string
    Params:      []any
    Exec:        *bool // if provided, then no `Rows` or `Columns` will be returned for this query.
    TxKey:       *string
  }

  TxID:    *string
}

Examples:

{
  "Queries": [
    {
      "Statement": "SELECT $1::INT8 as a_number",
      "Params": [
        42
      ]
    }
  ]
}

{
  "Queries": [
    {
      "Statement": "CREATE TABLE IF NOT EXISTS test_table ( id TEXT NOT NULL, val TEXT NOT NULL, PRIMARY KEY(id) )",
      "Exec": true
    }
  ]
}

Note: Explicit casts are probably required for parameters, because JSON only has primitive types and the database cannot always infer which SQL type a JSON value should map to.

If given a single query, it will be run directly on the connection.

If given multiple queries, they will be run within the same transaction. If any single query fails, the entire transaction fails and none of the queries are applied, regardless of whether rows were returned. The response still includes the rows for the queries that succeeded within the failing transaction.

If a TxID is provided, then it will be run within a transaction, proxying if required.

DO NOT CALL COMMIT OR ROLLBACK through here. Those should be handled via the respective endpoints, or the corresponding functions within the client libraries.

Response Body:

{
  Queries: []{
    Columns:  []any
    Rows:     [][]any
    Error:    *string
    TimeNS:   *int64
  }

  // Whether this was proxied to a remote node
  Remote: bool
}

Any query errors that occur will be included in the response body, rather than failing the request.
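Because query errors arrive inside a successful response, callers need to inspect each entry of `Queries`. The following Python sketch shows one way to do that. The helper names `rows_as_dicts` and `split_results` are hypothetical, and the sketch assumes that an absent or null `Error` means the query succeeded (the schema above only marks the field as optional).

```python
import json


def rows_as_dicts(query_result):
    """Zip one query result's Columns with each of its Rows into dictionaries."""
    columns = query_result.get("Columns") or []
    rows = query_result.get("Rows") or []
    return [dict(zip(columns, row)) for row in rows]


def split_results(response):
    """Return (results, errors).

    results holds a list of row dicts per query (None for a failed query);
    errors holds (query_index, message) pairs for the failed queries.
    """
    results, errors = [], []
    for i, q in enumerate(response.get("Queries", [])):
        if q.get("Error"):  # assumption: missing/null Error means success
            errors.append((i, q["Error"]))
            results.append(None)
        else:
            results.append(rows_as_dicts(q))
    return results, errors


# The Quick Start response, parsed from its JSON text.
sample = json.loads(
    '{"Queries":[{"Columns":["num_one","current_time"],'
    '"Rows":[[1,"2022-11-27T19:20:13.030114Z"]],"TimeNS":958400}]}'
)
results, errors = split_results(sample)
print(results[0])  # [{'num_one': 1, 'current_time': '2022-11-27T19:20:13.030114Z'}]
print(errors)      # []
```

Queries sent with `Exec` return no `Rows` or `Columns`, so in this sketch they simply produce an empty list.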

/psql/begin

Starts a new transaction.

Request Body:

{
    TxTimeoutSec: *int64 // sets the garbage collection timeout, default `30`
}

Returns the transaction ID that must be carried through subsequent requests.

Response Body:

{
    TxID: string
}

/psql/commit

Commits an existing transaction. Returns status 200 and no content if successful.

Request Body:

{
    TxID: string
}

/psql/rollback

Rolls back an existing transaction. Returns status 200 and no content if successful.

Request Body:

{
    TxID: string
}
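Putting the begin, query, commit, and rollback endpoints together, an interactive transaction might be driven as in the Python sketch below. It is an illustration under stated assumptions, not an official client: `http_post` and `run_in_transaction` are hypothetical names, and the use of POST for `/psql/begin`, `/psql/commit`, and `/psql/rollback` is assumed from the fact that they take request bodies. The transport is injectable so the flow can be exercised without a running gateway.

```python
import json
import urllib.request


def http_post(url, payload):
    """Default transport: POST JSON and return the decoded body (None if empty)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: all transaction endpoints accept POST
    )
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


def run_in_transaction(base_url, work, post=http_post, timeout_sec=None):
    """Begin a transaction, call work(query), then commit; roll back if work raises."""
    begin_body = {} if timeout_sec is None else {"TxTimeoutSec": timeout_sec}
    tx_id = post(base_url + "/psql/begin", begin_body)["TxID"]

    def query(statement, params=None, exec_only=False):
        q = {"Statement": statement}
        if params is not None:
            q["Params"] = list(params)
        if exec_only:
            q["Exec"] = True
        # The TxID ties this request to the open transaction.
        return post(base_url + "/psql/query", {"Queries": [q], "TxID": tx_id})

    try:
        result = work(query)
    except Exception:
        post(base_url + "/psql/rollback", {"TxID": tx_id})
        raise
    post(base_url + "/psql/commit", {"TxID": tx_id})
    return result
```

Since query errors are reported in the response body rather than as HTTP failures, the `work` callback in this sketch has to check each result's `Error` field and raise if it wants the transaction rolled back.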

Error handling

All processing errors (not query errors) will return a 4XX/5XX status code with a text/plain response body.
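A client therefore has two failure paths to handle: HTTP-level processing errors and per-query errors inside a 200 response. The Python sketch below covers the first path with the standard library. `GatewayError` and `send` are hypothetical names introduced for illustration.

```python
import urllib.error
import urllib.request


class GatewayError(Exception):
    """Raised for processing errors, i.e. 4XX/5XX responses with a text/plain body."""

    def __init__(self, status, message):
        super().__init__(f"gateway returned {status}: {message}")
        self.status = status
        self.message = message


def send(req, opener=urllib.request.urlopen):
    """Send a prepared request; convert HTTP-level failures into GatewayError."""
    try:
        with opener(req) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        # The error body is plain text rather than JSON, so decode it as-is.
        body = err.read().decode("utf-8", "replace")
        raise GatewayError(err.code, body) from err
```

Keeping this separate from query-error handling means a caller can treat a `GatewayError` as a problem with the request or the gateway itself, and still parse the JSON body normally on success.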

Configuration

Configuration is done through environment variables.

| Env Var | Description | Required? | Default |
|---|---|---|---|
| PG_DSN | PSQL wire protocol DSN. Used to connect to DB | Yes | |
| PG_POOL_CONNS | Number of pool connections to acquire | No | 2 |
| REDIS_ADDR | Redis Address. Currently used in non-cluster mode (standard client). If omitted then clustering features are disabled. | No | |
| REDIS_PASSWORD | Redis connection password | No | |
| REDIS_POOL_CONNS | Number of pool connections to Redis. | No | 2 |
| V_NAMESPACE | Virtual namespace for Redis. Sets the key prefix for Service discovery. | Yes (WIP, so No currently) | |
| POD_URL | Direct URL that this pod/node can be reached at. Replaces POD_NAME and POD_BASE_DOMAIN if exists. | Yes (conditional) | |
| POD_NAME | Name of the node/pod (k8s semantics). Pod can be reached at {POD_NAME}{POD_BASE_DOMAIN} | Yes (conditional) | |
| POD_BASE_DOMAIN | Base domain of the node/pod (k8s semantics). Pod can be reached at {POD_NAME}{POD_BASE_DOMAIN} | Yes (conditional) | |
| HTTP_PORT | HTTP port to run the HTTP(2) server on | No | 8080 |
| POD_HTTPS | Indicates whether the pods should use HTTPS to contact each other. Set to 1 if they should use HTTPS. | No | |
| TRACES | Indicates whether query trace information should be included in log contexts. Set to 1 if they should be. | No | |
| DEBUG | Indicates whether the debug log level should be enabled. Set to 1 to enable. | No | |
| PRETTY | Indicates whether pretty logs should be printed. Set to 1 to enable. | | |
| AUTH_USER | Sets the Basic Auth username required to connect. Requires that AUTH_PASS be set as well | Yes (
