Hugr
The Hugr service
The DataMesh service that provides access to various data sources through common GraphQL API.
Hugr is built on top of DuckDB and uses it as its calculation engine. Hugr can work with the following data sources:
- PostgreSQL (incl. with extensions: PostGIS, TimescaleDB)
- MySQL
- Microsoft SQL Server (MSSQL)
- DuckDB files
- Apache Iceberg tables
- DuckLake catalogs
- HTTP REST API (supports OpenAPI v3)
- Redis
- Hugr Apps — pluggable applications via DuckDB Airport (Arrow Flight gRPC)
- LLM providers (OpenAI, Anthropic, Gemini) — as function data sources
- All file formats supported by DuckDB (CSV, Parquet, JSON, ESRI Shape, etc.)
Files can be stored in the local file system or in cloud storage (S3, Azure, GCS, R2).
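As a sketch of what the common GraphQL API looks like, a query against a hypothetical registered data source might read as follows. The object name (`devices`), its fields, and the `limit` argument are assumptions for illustration; the actual schema is generated from the data sources you register.

```graphql
# Hypothetical schema — object and field names depend on your registered data sources.
query {
  devices(limit: 10) {
    id
    name
  }
}
```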
Executable
Two executables are provided in the repository:
- cmd/server/main.go - main executable for the Hugr server (handles standalone, management, and worker roles)
- cmd/migrate/main.go - executable for the core-db migrations
A single server binary serves all roles. The role is determined by CLUSTER_ROLE environment variable.
The executable is built with Go and can be run on any platform that supports Go:
CGO_ENABLED=1 go build -tags='duckdb_arrow' -o server cmd/server/*.go
Dependencies
The Hugr uses the Hugr query engine package.
Deployment
The common way to deploy Hugr is with Docker. Docker images are provided by the docker repository. Two images are provided:
- ghcr.io/hugr-lab/server - hugr server (for standalone, management, and worker modes),
- ghcr.io/hugr-lab/automigrate - hugr server with automigration for the core-db schema.
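A minimal single-node container start might look like the following sketch. The volume name, mount path, and port mapping are assumptions; the environment variables are the ones documented below.

```shell
# Illustrative only: adjust volume, paths, and port to your environment.
docker run -d \
  -p 15000:15000 \
  -v hugr-data:/data \
  -e DB_HOME_DIRECTORY=/data/.hugr \
  -e CORE_DB_PATH=/data/core.duckdb \
  ghcr.io/hugr-lab/server
```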
Modes of Operation
Standalone Mode (default)
Start the server without any CLUSTER_* env vars. Uses DuckDB or PostgreSQL as CoreDB.
./server
Cluster Mode — Management Node
CLUSTER_ENABLED=true \
CLUSTER_ROLE=management \
CLUSTER_NODE_NAME=mgmt-1 \
CLUSTER_NODE_URL=http://mgmt-1:15000/ipc \
CLUSTER_SECRET=my-secret \
CORE_DB_PATH="postgres://user:pass@db:5432/hugr" \
./server
Cluster Mode — Worker Node
CLUSTER_ENABLED=true \
CLUSTER_ROLE=worker \
CLUSTER_NODE_NAME=worker-1 \
CLUSTER_NODE_URL=http://worker-1:15000/ipc \
CLUSTER_SECRET=my-secret \
CORE_DB_PATH="postgres://user:pass@db:5432/hugr" \
./server
Important: Cluster mode requires PostgreSQL as CoreDB (CORE_DB_PATH must be a postgres:// DSN). The server will refuse to start in cluster mode with DuckDB CoreDB.
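Putting the two node examples together, a cluster can be sketched in Docker Compose. This is an illustrative layout only (service names, image tags, and credentials are assumptions); the environment variables match the examples above.

```yaml
# Sketch of a minimal cluster: one PostgreSQL CoreDB, one management node, one worker.
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: hugr
  mgmt-1:
    image: ghcr.io/hugr-lab/server
    environment:
      CLUSTER_ENABLED: "true"
      CLUSTER_ROLE: management
      CLUSTER_NODE_NAME: mgmt-1
      CLUSTER_NODE_URL: http://mgmt-1:15000/ipc
      CLUSTER_SECRET: my-secret
      CORE_DB_PATH: postgres://user:pass@db:5432/hugr
  worker-1:
    image: ghcr.io/hugr-lab/server
    environment:
      CLUSTER_ENABLED: "true"
      CLUSTER_ROLE: worker
      CLUSTER_NODE_NAME: worker-1
      CLUSTER_NODE_URL: http://worker-1:15000/ipc
      CLUSTER_SECRET: my-secret
      CORE_DB_PATH: postgres://user:pass@db:5432/hugr
```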
Environment variables for the server
General
- BIND - string that defines the network interface and port, default: :15000
- SERVICE_BIND - string that defines the network interface and port for metrics and health checks; if not set, the HTTP server for the service will not start, default: ""
- ADMIN_UI - flag to enable the AdminUI (GraphiQL) at path /admin, default: true
- ADMIN_UI_FETCH_PATH - path to fetch the AdminUI, default: "/admin"
- DEBUG - flag to run in debug mode (SQL queries are printed to stdout), default: false
- ALLOW_PARALLEL - flag to allow running queries in parallel, default: true
- MAX_PARALLEL_QUERIES - limit on the number of parallel queries executed, default: 0 (unlimited)
- MAX_DEPTH - maximum depth of the GraphQL type hierarchy, default: 7
TLS
- TLS_CERT_FILE - path to PEM-encoded TLS certificate file, default: "" (disabled). When set together with TLS_KEY_FILE, the server serves HTTPS instead of HTTP.
- TLS_KEY_FILE - path to PEM-encoded TLS private key file, default: "" (disabled). Both TLS_CERT_FILE and TLS_KEY_FILE must be set together.
Example:
TLS_CERT_FILE=/etc/ssl/certs/hugr.crt \
TLS_KEY_FILE=/etc/ssl/private/hugr.key \
BIND=:443 \
./server
For local development, generate self-signed certificates with make certs and use the TLS-enabled .env files in .local/.
Note: The sidecar service endpoint (health/metrics on SERVICE_BIND) always uses plain HTTP regardless of TLS configuration.
MCP OAuth (OIDC Authentication for MCP Clients)
When MCP is enabled (MCP_ENABLED=true) and OIDC is configured, Hugr can act as a stateless OAuth 2.1 proxy so that MCP clients (Claude Desktop, Cursor, etc.) can authenticate via your OIDC provider.
Required environment variables (in addition to existing OIDC config):
- OIDC_CLIENT_SECRET - client secret for the OIDC authorization code exchange, default: "" (OAuth proxy disabled)
- OIDC_SCOPES - space-separated scopes to request from the OIDC provider, default: "openid profile email"
- OIDC_REDIRECT_URL - optional override for Hugr's OAuth callback URL (auto-derived from Host header if not set)
- SECRET_KEY - required for encrypting OAuth state (must be set when using MCP OAuth)
How it works: Hugr exposes /.well-known/oauth-authorization-server metadata pointing to its own /oauth/authorize, /oauth/token, and /oauth/register endpoints. MCP clients discover these endpoints, register dynamically, and complete an Authorization Code + PKCE flow. Hugr proxies the authentication to the external OIDC provider and passes the OIDC tokens back to the client. The existing auth middleware validates the tokens on subsequent MCP requests.
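For orientation, the authorization-server metadata document follows the shape defined by RFC 8414. The JSON below is an illustrative example of that shape with a placeholder host; the exact fields Hugr returns may differ.

```json
{
  "issuer": "https://hugr.example.com",
  "authorization_endpoint": "https://hugr.example.com/oauth/authorize",
  "token_endpoint": "https://hugr.example.com/oauth/token",
  "registration_endpoint": "https://hugr.example.com/oauth/register",
  "code_challenge_methods_supported": ["S256"]
}
```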
Setup with Cloudflare Tunnel (for local development):
# Create a named tunnel (one-time)
cloudflared tunnel create hugr-dev
cloudflared tunnel route dns hugr-dev hugr-dev.yourdomain.com
# Configure OIDC provider with redirect URI: https://hugr-dev.yourdomain.com/oauth/callback
# Run Hugr + tunnel
MCP_ENABLED=true \
OIDC_ISSUER=https://your-provider.com/realms/your-realm \
OIDC_CLIENT_ID=hugr-server \
OIDC_CLIENT_SECRET=your-secret \
SECRET_KEY=your-secret-key \
ALLOWED_ANONYMOUS=false \
./server &
cloudflared tunnel run --url http://localhost:15000 hugr-dev
Then in Claude Desktop settings:
{
"mcpServers": {
"hugr": {
"url": "https://hugr-dev.yourdomain.com/mcp"
}
}
}
DuckDB engine settings
- DB_HOME_DIRECTORY - path to the DuckDB home directory, default: "", example: "/data/.hugr". This is important for persisting secrets (like S3 credentials) in container environments.
- DB_PATH - path to the management DB file; if empty, in-memory storage is used, default: ""
- DB_MAX_OPEN_CONNS - maximum number of open connections to the database, default: 0 (unlimited)
- DB_MAX_IDLE_CONNS - maximum number of idle connections to the database, default: 0 (unlimited)
- DB_ALLOWED_DIRECTORIES - comma separated list of allowed directories for the database, default: "", example: "/data,/tmp"
- DB_ALLOWED_PATHS - comma separated list of allowed paths for the database, default: "", example: "/data/.local,/tmp/.local"
- DB_ENABLE_LOGGING - flag to enable logging, default: false
- DB_MAX_MEMORY - maximum memory limit for the database, default: 80% of the system memory
- DB_MAX_TEMP_DIRECTORY_SIZE - maximum size of the temporary directory, default: 80% of the system memory
- DB_TEMP_DIRECTORY - path to the temporary directory, default: ".tmp"
- DB_WORKER_THREADS - number of worker threads, default: 0 (number of CPU cores)
- DB_PG_CONNECTION_LIMIT - maximum number of connections to an attached PostgreSQL database, default: 64
- DB_PG_PAGES_PER_TASK - number of pages per task, default: 1000
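A typical tuned start in a container might combine these settings as follows. The specific values are illustrative assumptions, not recommendations; the variable names are the ones documented above.

```shell
# Illustrative values — size memory and threads to your host.
DB_HOME_DIRECTORY=/data/.hugr \
DB_MAX_MEMORY=8GB \
DB_TEMP_DIRECTORY=/data/.tmp \
DB_WORKER_THREADS=4 \
./server
```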
Cluster settings
- CLUSTER_ENABLED - flag to enable cluster mode, default: false
- CLUSTER_ROLE - role of the node in the cluster: "management" or "worker", required when CLUSTER_ENABLED=true
- CLUSTER_NODE_NAME - name of the node, default: "", example: "mgmt-1"
- CLUSTER_NODE_URL - URL of the node for IPC communication, default: "", example: "http://mgmt-1:15000/ipc"
- CLUSTER_SECRET - secret key for the cluster communication, default: ""
- CLUSTER_HEARTBEAT - interval for node heartbeat, default: 30s
- CLUSTER_GHOST_TTL - time-to-live for unresponsive nodes before removal, default: 2m
- CLUSTER_POLL_INTERVAL - interval for workers to poll schema changes, default: 30s
Note: Cluster mode requires PostgreSQL as CoreDB. The CORE_DB_PATH must be a postgres:// DSN shared between management and worker nodes.
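The PostgreSQL-only rule for cluster CoreDB can be expressed as a small check. This is a sketch of the documented behavior, not Hugr's actual code; the function name `clusterCoreDBSupported` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// clusterCoreDBSupported reports whether a CORE_DB_PATH value can back
// cluster mode. Per the documented rule, only a PostgreSQL DSN qualifies;
// a DuckDB file path (or empty in-memory storage) is refused.
// The function is illustrative and not part of the Hugr codebase.
func clusterCoreDBSupported(dsn string) bool {
	return strings.HasPrefix(dsn, "postgres://") ||
		strings.HasPrefix(dsn, "postgresql://")
}

func main() {
	fmt.Println(clusterCoreDBSupported("postgres://user:pass@db:5432/hugr")) // true
	fmt.Println(clusterCoreDBSupported("/data/core.duckdb"))                 // false
}
```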
CoreDB settings
The core-db stores the metadata for the data sources and manages access to them: data sources, catalog sources, roles and role permissions, and other metadata. It can be a DuckDB file, in-memory storage, or PostgreSQL. A PostgreSQL-based core-db is required for cluster mode (to keep all replicas in sync). A DuckDB file can be placed in the local file system or in cloud storage (currently only S3 cloud storage is supported; Azure, GCS, and R2 are planned); in that case the path should be s3://bucket/path/to/file.duckdb.
- CORE_DB_PATH - path to the core-db file or a PostgreSQL DSN; if empty, in-memory storage is used, default: ""
- CORE_DB_READONLY - flag to open core-db in read-only mode, default: false
- CORE_DB_S3_ENDPOINT - s3 endpoint, default: ""
- CORE_DB_S3_REGION - s3 region, default: ""
- CORE_DB_S3_KEY - s3 access key, default: ""
- CORE_DB_S3_SECRET - s3 secret key, default: ""
- CORE_DB_S3_USE_SSL - flag to use SSL for s3, default: false
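An S3-backed core-db combines these variables like so. The bucket name, region, and placeholder credentials are illustrative assumptions.

```shell
# Illustrative: substitute your bucket, region, and credentials.
CORE_DB_PATH=s3://my-bucket/hugr/core.duckdb \
CORE_DB_S3_REGION=us-east-1 \
CORE_DB_S3_KEY=my-access-key \
CORE_DB_S3_SECRET=my-secret-key \
CORE_DB_S3_USE_SSL=true \
./server
```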
CORS
- CORS_ALLOWED_ORIGINS - comma separated list of allowed origins for CORS, default: "", example: "http://localhost:3000,http://localhost:3001"
- CORS_ALLOWED_METHODS - comma separated list of allowed methods for CORS, default: "GET,POST,PUT,DELETE,OPTIONS"
- CORS_ALLOWED_HEADERS - comma separated list of allowed headers for CORS, default: "Content-Type,Authorization,x-api-key,Accept,Content-Length,Accept-Encoding,X-CSRF-Token"
Authentication and authorization
Each node configures auth independently from its own env vars; auth configuration is not pushed from management nodes to workers.
- ALLOWED_ANONYMOUS - flag to allow anonymous access, de