gproxy
gproxy is a Rust-based multi-channel LLM proxy that exposes OpenAI / Claude / Gemini-style APIs through a unified gateway, with a built-in admin console, user/key management, and request/usage auditing.
Chinese version: README.zh.md
If you want to look at the full docs, click here.
Key Features
- Unified multi-channel gateway: route requests to different upstreams by channel (builtin + custom).
- Multi-protocol compatibility: one upstream can accept OpenAI/Claude/Gemini requests (controlled by dispatch rules).
- Credential pool and health states: supports `healthy` / `partial` / `dead` with model-level cooldown retry.
- OAuth and API Key support: OAuth channels (Codex, ClaudeCode, GeminiCli, Antigravity) and API Key channels.
- Built-in Web console: available at `/`, supports English and Chinese.
- Observability: records upstream/downstream requests and usage metrics (filterable by user/model/time).
- Async batched storage writes: queue + aggregation to reduce database pressure under load.
Built-in Channels
| Channel ID | Default Upstream | Auth Type |
|---|---|---|
| openai | https://api.openai.com | API Key |
| anthropic | https://api.anthropic.com | API Key |
| aistudio | https://generativelanguage.googleapis.com | API Key |
| vertexexpress | https://aiplatform.googleapis.com | API Key |
| vertex | https://aiplatform.googleapis.com | GCP service account (builtin object) |
| geminicli | https://cloudcode-pa.googleapis.com | OAuth (builtin object) |
| claudecode | https://api.anthropic.com | OAuth/Cookie (builtin object) |
| codex | https://chatgpt.com/backend-api/codex | OAuth (builtin object) |
| antigravity | https://daily-cloudcode-pa.sandbox.googleapis.com | OAuth (builtin object) |
| nvidia | https://integrate.api.nvidia.com | API Key |
| deepseek | https://api.deepseek.com | API Key |
| custom (for example mycustom) | your configured base_url | API Key (secret) |
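Each channel id also serves as the route prefix under the gateway. As a sketch, composing a request URL for a channel from the table, assuming every channel follows the same `/<channel-id>/v1/...` pattern as the `/openai/v1/models` example in the Quick Start below (this mapping is an assumption, not confirmed for every channel):

```shell
# ASSUMPTION: the channel id maps directly to the path prefix,
# as in the /openai/v1/models Quick Start example.
BASE="http://127.0.0.1:8787"
CHANNEL="deepseek"
URL="$BASE/$CHANNEL/v1/models"
echo "$URL"
```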
Quick Start
1. Prerequisites
- Rust (must support `edition = 2024`)
- SQLite (the default DSN uses sqlite)
- Optional: Node.js + `pnpm` (if you want to rebuild the admin frontend)
2. Prepare Config
```bash
cp gproxy.example.toml gproxy.toml
```
At minimum, set:
- `global.admin_key`
- at least one enabled channel credential (`credentials.secret` or a builtin credential object)

Bootstrap login defaults:
- username: `admin`
- password: value of `global.admin_key`
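A minimal `gproxy.toml` covering those two requirements might look like this (a sketch assembled from the configuration reference below; the credential table shape follows the `[[channels]]` conventions documented there, and all values are placeholders):

```toml
[global]
admin_key = "replace-with-a-strong-admin-key"

[[channels]]
id = "openai"
enabled = true

[channels.settings]
base_url = "https://api.openai.com"

# API Key channel: one credential with a `secret`
[[channels.credentials]]
secret = "sk-your-openai-api-key"
```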
3. Run
```bash
cargo run -p gproxy
```
On startup, gproxy prints:
- listening address (default `http://127.0.0.1:8787`)
- current admin key (`password:`)

If `./gproxy.toml` does not exist, gproxy starts with in-memory defaults and auto-generates a 16-digit admin key (printed to stdout).
4. Minimal Verification
```bash
curl -sS http://127.0.0.1:8787/openai/v1/models \
  -H "x-api-key: <your user key or admin key>"
```
Get a user/admin API key via password login:
```bash
curl -sS http://127.0.0.1:8787/login \
  -H "content-type: application/json" \
  -d '{
    "name": "admin",
    "password": "<your admin_key>"
  }'
```
Deployment
Local deployment
Binary
- Download the binary from Releases.
- Prepare config: `cp gproxy.example.toml gproxy.toml`
- Run the binary: `./gproxy`
Docker
Pull prebuilt image (recommended):
```bash
docker pull ghcr.io/leenhawk/gproxy:latest
```
Build from local source (only if you need local code changes):
```bash
docker build -t gproxy:local .
```
Run:
```bash
docker run --rm -p 8787:8787 \
  -e GPROXY_HOST=0.0.0.0 \
  -e GPROXY_PORT=8787 \
  -e GPROXY_ADMIN_KEY=your-admin-key \
  -e DATABASE_SECRET_KEY='replace-with-long-random-string' \
  -e GPROXY_DSN='sqlite:///app/data/gproxy.db?mode=rwc' \
  -v $(pwd)/data:/app/data \
  ghcr.io/leenhawk/gproxy:latest
```
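For repeatable deployments, the same flags can be expressed as a Compose file. This is an untested sketch mirroring the `docker run` command above one-to-one:

```yaml
# docker-compose.yml sketch (same values as the docker run example)
services:
  gproxy:
    image: ghcr.io/leenhawk/gproxy:latest
    ports:
      - "8787:8787"
    environment:
      GPROXY_HOST: "0.0.0.0"
      GPROXY_PORT: "8787"
      GPROXY_ADMIN_KEY: "your-admin-key"
      DATABASE_SECRET_KEY: "replace-with-long-random-string"
      GPROXY_DSN: "sqlite:///app/data/gproxy.db?mode=rwc"
    volumes:
      - ./data:/app/data
```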
Set `DATABASE_SECRET_KEY` via env vars or your platform secret manager rather than committing it to the repo. Especially on free-tier or shared managed databases, configure it before the first bootstrap so sensitive fields are not stored in plaintext, and keep the same key on every instance using that database.
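One way to generate a suitably long random value (assuming `openssl` is available on your system):

```shell
# 48 random bytes, base64-encoded: a 64-character secret
export DATABASE_SECRET_KEY="$(openssl rand -base64 48)"
echo "$DATABASE_SECRET_KEY"
```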
Cloud deployment
ClawCloud Run
- Template file: `claw.yaml`
- Use `claw.yaml` as a custom template in ClawCloud Run App Store -> My Apps -> Debugging.
- Key inputs: `admin_key` (generated by default), `proxy_url`, `rust_log`, `volume_size`
- Recommended persistence: mount `/app/data` as a persistent volume.
Release downloads and self-update (Cloudflare Pages)
- Release CI publishes signed binaries and update manifests to a dedicated Cloudflare Pages downloads project.
- Default public base URL: `https://download-gproxy.leenhawk.com`
- Generated manifests:
  - `/manifest.json` — full download index used by the docs downloads page
  - `/releases/manifest.json` — stable self-update feed
  - `/staging/manifest.json` — staging self-update feed
- The admin UI `Cloudflare` update source and `/admin/system/self_update` read from this downloads site.
- Required GitHub Actions secrets for the downloads deployment: `CLOUDFLARE_API_TOKEN`, `CLOUDFLARE_ACCOUNT_ID`, `CLOUDFLARE_DOWNLOADS_PROJECT_NAME`
- Optional secrets:
  - `DOWNLOAD_PUBLIC_BASE_URL` — custom public domain or Pages URL exposed in docs/manifests
  - `UPDATE_SIGNING_KEY_ID` — manifest key id override (default `gproxy-release-v1`)
  - `UPDATE_SIGNING_PRIVATE_KEY_B64` and `UPDATE_SIGNING_PUBLIC_KEY_B64` — checksum signature generation and verification
Admin Frontend
- Console entry: `GET /`
- Static assets: `/assets/*`
- Frontend build output: `apps/gproxy/frontend/dist`
- Backend embeds `dist` into the binary via `rust-embed`
If you changed frontend code, rebuild first:
```bash
cd apps/gproxy/frontend
pnpm install
pnpm build
cd ../../..
cargo run -p gproxy
```
Configuration (gproxy.toml)
Reference files:
- `gproxy.example.toml` (minimal)
- `gproxy.example.full.toml` (full)
global
| Field | Description |
|---|---|
| host | Bind host, default 127.0.0.1 |
| port | Bind port, default 8787 |
| proxy | Upstream proxy (empty string means disabled) |
| hf_token | HuggingFace token (optional for tokenizer download) |
| hf_url | HuggingFace base URL, default https://huggingface.co |
| admin_key | Admin bootstrap credential; used as admin password and admin API key on bootstrap, auto-generated if empty |
| mask_sensitive_info | Redact sensitive request/response payloads in logs/events |
| data_dir | Data directory, default ./data |
| dsn | Database DSN; if omitted and data_dir is changed, sqlite DSN is derived automatically |
runtime
| Field | Default | Description |
|---|---:|---|
| storage_write_queue_capacity | 4096 | Storage write queue size |
| storage_write_max_batch_size | 1024 | Max events per aggregated storage batch |
| storage_write_aggregate_window_ms | 25 | Aggregation window (ms) |
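For write-heavy deployments these knobs can be raised; the following is an illustrative sketch of the three settings (the values here are arbitrary examples, not recommendations):

```toml
[runtime]
storage_write_queue_capacity = 8192      # default 4096
storage_write_max_batch_size = 2048      # default 1024
storage_write_aggregate_window_ms = 50   # default 25
```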
channels
Each channel is declared with [[channels]]:
- `id`: channel id (for example `openai`, `claude`, `mycustom`)
- `enabled`: runtime enable switch (`false` disables routing to this channel)
- `settings`: channel settings (must include `base_url`)
- `dispatch`: optional; defaults to a channel-specific dispatch table when omitted
- `credentials`: credential list (supports multi-credential retry/fallback)
Anthropic/ClaudeCode Cache Rewrite (cache_breakpoints)
For anthropic and claudecode, configure cache-control rewrite with:
- setting key: `channels.settings.cache_breakpoints`
- max 4 rules
- targets: `top_level` (`global` alias), `tools`, `system`, `messages`
- `messages` indexing uses flattened `messages[*].content` blocks after normalizing Claude shorthands (`content: "..."` becomes one text block)
- for `messages`, you may also set `content_position` / `content_index`; when either field is present, `position`/`index` first select a message, then `content_*` selects a block inside that message
- `ttl`: `auto` / `5m` / `1h` (`auto` means no ttl field is injected)
- existing request-side `cache_control` is always preserved and counts toward the 4-rule limit
No-ttl default note:
- `anthropic`: upstream default is `5m`
- `claudecode`: upstream default is `5m`
- use an explicit ttl when you need deterministic behavior
Example:
```toml
[[channels]]
id = "anthropic"
enabled = true

[channels.settings]
base_url = "https://api.anthropic.com"
cache_breakpoints = [
  { target = "top_level", ttl = "auto" },
  { target = "messages", position = "last_nth", index = 1, ttl = "5m" },
  { target = "messages", position = "last_nth", index = 1, content_position = "last_nth", content_index = 1, ttl = "5m" }
]

[[channels]]
id = "claudecode"
enabled = true

[channels.settings]
base_url = "https://api.anthropic.com"
claudecode_flatten_system_text_before_cache_control = true
cache_breakpoints = [
  { target = "top_level", ttl = "auto" },
  { target = "messages", position = "last_nth", index = 1, content_position = "last_nth", content_index = 1, ttl = "1h" }
]
```
ClaudeCode also supports an optional setting:
- setting key: `channels.settings.claudecode_flatten_system_text_before_cache_control`
- when `true`, after cache breakpoint rewrite and before billing-header injection, gproxy concatenates consecutive `system` text blocks before the first `system` `cache_control` block into one text block
- the Claude Code billing header remains a separate `system` block and is not merged into that prefix
channels.credentials
Each credential can include:
- `id` / `label`: optional identifiers
- `secret`: for API key channels
- `builtin`: structured credential object for OAuth/service-account channels
- `state`: opti
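For example, an API Key channel with two credentials in its pool might be declared as below (a sketch: the ids, labels, and secrets are placeholders, and whether list order determines fallback priority is an assumption):

```toml
[[channels]]
id = "openai"
enabled = true

[channels.settings]
base_url = "https://api.openai.com"

[[channels.credentials]]
id = "key-a"
label = "primary"
secret = "sk-first"

[[channels.credentials]]
id = "key-b"
label = "backup"
secret = "sk-second"
```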
