PgDog is a proxy for scaling PostgreSQL. It supports connection pooling, load balancing queries and sharding entire databases. Written in Rust, PgDog is fast, secure and can manage thousands of connections on commodity hardware.

Documentation

📘 PgDog documentation can be found here. Any questions? Chat with us on Discord.

Quick start

Kubernetes

Helm chart is here. To install it, run:

helm repo add pgdogdev https://helm.pgdog.dev
helm install pgdog pgdogdev/pgdog

AWS

If you're using AWS RDS, you can deploy PgDog using one of two supported methods:

Helm chart with EKS, or a self-hosted Kubernetes cluster
Terraform module to deploy PgDog on ECS

Try in Docker

You can try PgDog quickly using Docker. Install Docker Compose and run:

docker-compose up

Once started, you can connect to PgDog with psql or any other PostgreSQL client:

PGPASSWORD=postgres psql -h 127.0.0.1 -p 6432 -U postgres

The demo comes with 3 shards and 2 sharded tables:

INSERT INTO users (id, email) VALUES (1, 'admin@acme.com');
INSERT INTO payments (id, user_id, amount) VALUES (1, 1, 100.0);

SELECT * FROM users WHERE id = 1;
SELECT * FROM payments WHERE user_id = 1;

Features

📘 Configuration

All PgDog features are configurable and can be turned on and off. PgDog requires 2 configuration files to operate:

pgdog.toml: hosts, sharding configuration, and other settings
users.toml: usernames and passwords

Example

Most options have reasonable defaults, so a basic configuration for a single user and database running on the same machine is pretty short:

pgdog.toml

[general]
port = 6432
default_pool_size = 10

[[databases]]
name = "pgdog"
host = "127.0.0.1"

users.toml

[[users]]
name = "alice"
database = "pgdog"
password = "hunter2"

If a database in pgdog.toml doesn't have a user in users.toml, the connection pool for that database will not be created and users won't be able to connect.

If you'd like to try it out locally, create the database and user like so:

CREATE DATABASE pgdog;
CREATE USER pgdog PASSWORD 'pgdog' LOGIN;

Transaction pooling

📘 Transactions

Like PgBouncer, PgDog supports transaction (and session) pooling, allowing thousands of clients to use just a few PostgreSQL server connections.

Unlike PgBouncer, PgDog can parse and handle SET statements and startup options, ensuring session state is set correctly when sharing server connections between clients with different parameters.

PgDog also has more advanced connection recovery options, like automatic abandoned transaction rollbacks and connection re-synchronization to avoid churning server connections during an application crash.

Load balancer

📘 Load balancer

PgDog is an application layer (OSI Level 7) load balancer for PostgreSQL. It understands the Postgres protocol, can proxy multiple replicas (and primary) and distributes transactions evenly between databases. The load balancer supports 3 strategies: round robin, random and least active connections.

Example

The load balancer is enabled automatically when a database has more than one host:

[[databases]]
name = "prod"
host = "10.0.0.1"
role = "primary"

[[databases]]
name = "prod"
host = "10.0.0.2"
role = "replica"

Health checks

📘 Healthchecks

PgDog maintains a real-time list of healthy hosts. When a database fails a health check, it's removed from the active rotation and queries are re-routed to other replicas. This works like an HTTP load balancer, except it's for your database.

Health checks maximize database availability and protect against bad network connections, temporary hardware failures or misconfiguration.

Single endpoint

📘 Single endpoint

PgDog uses pg_query, which includes the PostgreSQL native parser. By parsing queries, PgDog can detect writes (e.g. INSERT, UPDATE, CREATE TABLE, etc.) and send them to the primary, leaving the replicas to serve reads (SELECT). This allows applications to connect to the same PgDog deployment for both reads and writes.

Transactions

📘 Load balancer & transactions

Transactions can execute multiple statements, so in a primary & replica configuration, PgDog routes them to the primary. Clients can indicate a transaction is read-only, in which case PgDog will send it to a replica:

BEGIN READ ONLY;
-- This goes to a replica.
SELECT * FROM users LIMIT 1;
COMMIT;

Failover

📘 Failover

PgDog monitors Postgres replication state and can automatically redirect writes to a different database if a replica is promoted. This doesn't replace tools like Patroni that actually orchestrate failovers. You can use PgDog alongside Patroni (or AWS RDS or other managed Postgres host), to gracefully failover live traffic.

Example

To enable failover, set all database role attributes to auto and enable replication monitoring (lsn_check_delay setting):

[general]
lsn_check_delay = 0

[[databases]]
name = "prod"
host = "10.0.0.1"
role = "auto"

[[databases]]
name = "prod"
host = "10.0.0.2"
role = "auto"

Authentication

📘 Authentication

PgDog supports two authentication methods:

Password-based
AWS RDS IAM

Password-based authentication

Password-based authentication allows for clients to authenticate to PgDog and for PgDog to authenticate to PostgreSQL. It currently supports the following password hashing algorithms:

SCRAM-SHA-256
MD5
Plain

RDS IAM backend authentication

PgDog can keep client-to-PgDog authentication unchanged while using AWS RDS IAM tokens for PgDog-to-PostgreSQL authentication on a per-user basis.

Example

[[users]]
name = "alice"
database = "pgdog"
password = "client-password"
server_auth = "rds_iam"
# Optional; PgDog infers region from *.region.rds.amazonaws.com(.cn) hostnames when omitted.
# server_iam_region = "us-east-1"

When any user has server_auth = "rds_iam", the following settings must be configured as well:

tls_verify must not be "disabled".
passthrough_auth must be "disabled".

Sharding

📘 Sharding

PgDog is able to manage databases with multiple shards. By using the PostgreSQL parser, PgDog extracts sharding keys and determines the best routing strategy for each query.

For cross-shard queries, PgDog assembles and transforms results in memory, sending all rows to the client as if they are coming from a single database.

Example

Configuring multiple hosts for the same database with different shard numbers (shard setting) enables sharding:

[[databases]]
name = "prod"
host = "10.0.0.1"
shard = 0

[[databases]]
name = "prod"
host = "10.0.0.2"
shard = 1

Note: read below for how to configure query routing. At least one sharded table is required for sharding to work as expected.

Sharding functions

📘 Sharding functions

PgDog has two main sharding algorithms:

PostgreSQL partition functions (HASH, LIST, RANGE)
Using schemas

Partition-based sharding

Partition-based sharding functions are taken directly from Postgres source code. This choice intentionally allows to shard data both with PgDog and with Postgres foreign tables and postgres_fdw.

Examples

The PARTITION BY HASH algorithm is used by default when configuring sharded tables:

[[sharded_tables]]
database = "prod"
column = "user_id"

List-based sharding (same as PARTITION BY LIST in Postgres) can be configured as follows:

# Sharded table definition still required.
[[sharded_tables]]
database = "prod"
column = "user_id"

# Value-specific shard mappings.
[[sharded_mapping]]
database = "prod"
column = "user_id"
values = [1, 2, 3, 4]
shard = 0

[[sharded_mapping]]
database = "prod"
column = "user_id"
values = [5, 6, 7, 8]
shard = 1

For range-based sharding, replace the values setting with a range, for example:

start = 0 # include
end = 5 # exclusive

Schema-based sharding

📘 Schema-based sharding

Schema-based sharding works on the basis of PostgreSQL schemas. Tables under the same schema are placed on the same shard and all queries that refer to those tables are routed to

Pgdog

Install / Use

README

Documentation

Quick start

Kubernetes

AWS

Try in Docker

Features

Example

Transaction pooling

Load balancer

Health checks

Single endpoint

Transactions

Failover

Authentication

Password-based authentication

RDS IAM backend authentication

Sharding

Sharding functions

Partition-based sharding

Schema-based sharding

Related Skills