Foundry
Multi-agent code factory. GitHub Issues that write their own code. Claude Code + Codex + Gemini with 3 AI reviewers per PR.
Install / Use
/learn @merlinrabens/Foundry
Category
Development & Engineering
README
The Pitch
Add a `foundry` label to a GitHub Issue. Go to sleep. Wake up to a pull request with three AI code reviews, all fixes applied, CI green.
No prompts. No terminals. No babysitting. The issue body IS the spec. The agent figures out the rest.
Real Numbers
Early production use across 6 private repos (numbers from internal testing, updated periodically):
| Metric | Value |
|---|---|
| Tasks spawned | 47 |
| Merged successfully | 42 (89%) |
| Average time to merge | 4.2 hours |
| Cost per task | $2-8 |
| Avg review-fix cycles | 3.7 |
| Required human help | 5 (11%) |
Most failures trace back to vague specs, not agent limitations. Fix the spec, re-run, it works.
Hardware: 2019 MacBook Pro 16" (Intel i9, 64GB RAM). Two Claude Max subscriptions ($200/mo each), Codex, Gemini. ~$400/month total.
How It Works
<p align="center"> <img src="docs/architecture.png" alt="Foundry Architecture" width="100%"> </p>
The Full Chain
GitHub Issue (with `foundry` label)
↓
Foundry Orchestrator reads the issue body as a spec
↓
Routes to best agent (Claude Code / Codex / Gemini)
↓
Creates git worktree + branch
↓
Spawns agent with the spec as context
↓
Agent writes code, opens PR (with `fixes #N` in body)
↓
3 AI reviewers review independently
↓
Agent reads ALL reviews, pushes fixes in one cycle
↓
CI passes + all reviewers approve = ready to merge
↓
PR merges → Issue auto-closes
Spawning an Agent
<p align="center"> <img src="docs/gifs/foundry-spawn.gif" alt="foundry orchestrate in action" width="100%"> </p>
Agent Routing
Not every agent is good at everything:
- Claude Code: Frontend, React, complex refactors, nuanced review feedback
- Codex: Backend, APIs, infrastructure, bulk changes (the workhorse)
- Gemini: Design systems, documentation, config files, creative structure
The router is a grep. When it picks wrong, you override with a hint in the issue.
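Since the router is described as a grep, a minimal shell sketch might look like the following. It uses the keyword patterns listed in this section. The function name `route_agent`, the Codex fallback, and the `-w` whole-word flag are assumptions for illustration, not Foundry's actual code. Without `-w`, a short keyword such as `ui` would also match inside words like "build".

```shell
# Sketch only: route_agent and the fallback are hypothetical, not Foundry's code.
route_agent() {
  # $1: issue title and body as a single string
  if printf '%s\n' "$1" | grep -Eiqw 'frontend|react|component|css|ui'; then
    echo "claude"
  elif printf '%s\n' "$1" | grep -Eiqw 'api|backend|database|migration'; then
    echo "codex"
  elif printf '%s\n' "$1" | grep -Eiqw 'design|theme|token|style'; then
    echo "gemini"
  else
    echo "codex"  # assumed fallback when no keyword matches
  fi
}
```

Calling `route_agent "Add a database migration for user roles"` prints `codex`. The first matching pattern wins, so an issue that mentions both "react" and "api" goes to Claude Code under this ordering, which is one case where a hint in the issue would be needed.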
# Keyword routing (yes, really)
"frontend|react|component|css|ui" → Claude Code
"api|backend|database|migration" → Codex
"design|theme|token|style" → Gemini
The Review Loop
<p align="center"> <img src="docs/review-loop.png" alt="Three AI Reviewers" width="100%"> </p>
Every PR gets reviewed by three independent AI reviewers:
- Claude (Opus 4.6) — Principal Engineer review. Architecture, security, correctness, maintainability. Blocking.
- Codex — architecture, API contracts, test coverage. Creates blocking status check.
- Gemini — design patterns, naming, documentation. Advisory.
Critical: all three must report before the agent starts fixing. This prevents wasted cycles where fixing one reviewer's feedback breaks another's.
Budget: 20 fix cycles per attempt. 5 attempts per task. If it can't converge, it notifies you.
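The "wait for all three, then fix once" rule can be sketched as a small decision function. The state names (`pending`, `approved`, `changes_requested`), the function name `review_gate`, and the returned action strings are all hypothetical. Only the 20-cycle budget comes from the text above.

```shell
# Sketch only: state names and actions are assumptions, not Foundry's code.
MAX_FIX_CYCLES=20  # fix cycles allowed per attempt

review_gate() {
  # $1..$3: Claude, Codex, Gemini review state; $4: fix cycles used this attempt
  for state in "$1" "$2" "$3"; do
    if [ "$state" = "pending" ]; then
      echo "wait"  # never start fixing on partial feedback
      return 0
    fi
  done
  if [ "$1" = "approved" ] && [ "$2" = "approved" ] && [ "$3" = "approved" ]; then
    echo "approved"
  elif [ "$4" -ge "$MAX_FIX_CYCLES" ]; then
    echo "attempt_exhausted"  # caller starts a new attempt or notifies
  else
    echo "fix"  # hand all reviews to the agent in one cycle
  fi
}
```

For example, `review_gate approved pending changes_requested 0` prints `wait`, because one reviewer has not reported yet even though another already requested changes.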
The Dashboard
<p align="center"> <img src="docs/gifs/foundry-status.gif" alt="foundry status dashboard" width="100%"> </p>
`foundry status` shows all active tasks with BACKEND and CRKGS columns:

| Letter | Gate |
|--------|------|
| C | CI green |
| R | Claude approved |
| K | Codex approved |
| G | Gemini approved |
| S | Branch synced |
A task showing CRKGS is ready to merge.
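As an illustration of how the five gates could collapse into one column, here is a small sketch. The function name `gate_flags` and the `-` placeholder for an unmet gate are assumptions. The real dashboard may render missing gates differently.

```shell
# Sketch only: gate_flags and the "-" placeholder are hypothetical.
gate_flags() {
  # $1..$5: 1 if the gate passed, else 0, in the order C R K G S
  out=""
  for pair in "C:$1" "R:$2" "K:$3" "G:$4" "S:$5"; do
    letter=${pair%%:*}
    passed=${pair#*:}
    if [ "$passed" = "1" ]; then
      out="$out$letter"
    else
      out="$out-"
    fi
  done
  printf '%s\n' "$out"
}
```

With this sketch, `gate_flags 1 0 1 0 1` prints `C-K-S`, which shows at a glance that the Claude and Gemini approvals are still missing.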
The Respawn Engine
Agents crash. Rate limits. Token expiry. OOM kills. Network timeouts.
Foundry checks every 30 minutes:
- Agent alive? If not → respawn with same spec, branch, PR
- New reviews? → trigger fix cycle
- CI failed? → mark for investigation
- Budget exhausted? → archive, notify you
Most tasks complete on the first attempt. Persistent failures get escalated to you via Telegram.
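The periodic check can be pictured as one decision per task. This is a sketch under stated assumptions: the function names, the argument layout, the action strings, and the order in which the conditions are evaluated are all hypothetical. The liveness test uses `kill -0`, which sends no signal and only reports whether the process exists and can be signalled by the current user.

```shell
# Sketch only: names, arguments, and check order are assumptions.
agent_alive() {
  # kill -0 delivers nothing; it fails if the PID is gone
  kill -0 "$1" 2>/dev/null
}

check_task() {
  # $1: agent PID  $2: new reviews? (1/0)  $3: CI status  $4: attempts left
  if [ "$4" -le 0 ]; then
    echo "archive_and_notify"  # budget exhausted
  elif ! agent_alive "$1"; then
    echo "respawn"             # same spec, branch, and PR
  elif [ "$2" = "1" ]; then
    echo "fix_cycle"
  elif [ "$3" = "failed" ]; then
    echo "investigate"
  else
    echo "ok"
  fi
}
```

Checking the budget first is a design choice in this sketch: it keeps a task that has run out of attempts from being respawned again.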
Visual Evidence: Agents That Prove Their Work
For PRs with frontend changes, Foundry expects visual proof. Screenshots, videos, before/after comparisons. If the PR body has no images and the diff touches .tsx/.jsx/.vue/.css, Foundry flags it.
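A check like this can be approximated with two greps: one over the changed file paths, one over the PR body. The function name `needs_visual_evidence` is hypothetical, and the image detection (Markdown `![alt](url)` or an HTML `<img` tag) is an assumption about what counts as an image. Videos and other attachments are not covered.

```shell
# Sketch only: needs_visual_evidence is hypothetical, not Foundry's code.
needs_visual_evidence() {
  # $1: changed file paths, one per line   $2: PR body (Markdown)
  # Returns 0 (flag the PR) when frontend files changed and no image is embedded.
  if ! printf '%s\n' "$1" | grep -Eq '\.(tsx|jsx|vue|css)$'; then
    return 1  # no frontend files touched
  fi
  if printf '%s\n' "$2" | grep -Eiq '!\[[^]]*]\([^)]+\)|<img[[:space:]]'; then
    return 1  # Markdown or HTML image found
  fi
  return 0
}
```

The inputs could come from the GitHub CLI, for example `gh pr diff <number> --name-only` for the file list and `gh pr view <number> --json body --jq .body` for the body. How Foundry itself collects them is not stated here.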
Telegram Topics: One Thread Per Agent
Every `foundry spawn --topic` creates a dedicated Telegram forum topic for that task. All status updates (CI results, review verdicts, respawns, merges) go to that thread. Your main chat gets a one-liner ("spawned TASK-123 — tracking in topic") and stays clean.
# Spawn with a new topic (auto-created)
foundry spawn my-org/my-repo specs/backlog/add-auth.md claude --topic
# Reuse an existing topic
foundry spawn my-org/my-repo specs/backlog/add-auth.md claude --topic-id 4821
Under the hood, `tg_notify_task` checks the SQLite registry for a `tg_topic_id`. If one exists, the message goes to that thread. If not, it falls back to your main chat. Zero config changes needed for existing tasks.
Requirements:
- Telegram supergroup with Topics enabled (group settings → Topics → On)
- Your bot must be an admin in the group
- Set `TG_CHAT_ID` to the supergroup ID and `OPENCLAW_TG_BOT_TOKEN` to your bot token
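The topic fallback described in this section maps onto the Telegram Bot API's `sendMessage` method, which accepts an optional `message_thread_id` for forum topics. The sketch below is not the real `tg_notify_task`: the function name `tg_send` is hypothetical, and the SQLite lookup of the topic ID is left to the caller, who passes an empty string when the task has no topic.

```shell
# Sketch only: tg_send is hypothetical; the registry lookup is not shown.
tg_send() {
  # $1: topic id from the registry, or empty   $2: message text
  topic_id="$1"
  text="$2"
  set -- --data-urlencode "chat_id=${TG_CHAT_ID}" --data-urlencode "text=${text}"
  if [ -n "$topic_id" ]; then
    # Post into the task's forum topic instead of the main chat
    set -- "$@" --data-urlencode "message_thread_id=${topic_id}"
  fi
  curl -fsS "https://api.telegram.org/bot${OPENCLAW_TG_BOT_TOKEN}/sendMessage" "$@"
}
```

With both environment variables set, `tg_send 4821 "CI green"` posts into topic 4821, while `tg_send "" "CI green"` lands in the main chat.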
OpenClaw Integration
<p align="center"> <img src="docs/acp-flow.png" alt="ACP Protocol Flow" width="100%"> </p>
What is ACP?
ACP (Agent Client Protocol) is like LSP (Language Server Protocol), but for AI coding agents. One standardized protocol that any agent can speak, any orchestrator can dispatch to.
OpenClaw as Orchestrator
OpenClaw is an AI gateway that speaks ACP natively. It turns Foundry from "scripts on a laptop" into "managed agent fleet you control from your phone":
# Spawn via your orchestrator agent (e.g. from Telegram, Slack, or CLI)
foundry spawn my-org/my-repo "Build the tracking integration per issue #6" claude --topic
What OpenClaw Adds
Push-based notifications: No more cron polling. When an agent finishes, OpenClaw pushes a notification to Telegram, Slack, Discord, or email. Instantly.
Remote control from your phone: Merge PRs, respawn agents, check status, all via Telegram message. You don't need to be at your laptop.
ACP Adapters: Each agent (Claude, Codex, Gemini) has an adapter that translates its native CLI into ACP. When a new agent drops, write one adapter. Instantly compatible.
Horizontal scaling: Run agents across multiple machines. Your Mac at home, a cloud instance, a colleague's server. OpenClaw distributes work based on capacity.
The notification chain:
Agent finishes work
↓ ACP result event
OpenClaw receives completion
↓ routes to your channel
📱 "PR #47 ready. CI green. 3 reviews pending."
↓ you tap "merge"
Done.
The Full Stack: Paperclip → OpenClaw → Foundry
Foundry is the execution layer in a three-tier autonomous development stack:
Paperclip (CEO/PM layer)
Creates issues, assigns priorities, tracks progress
↓ wake event
OpenClaw (orchestrator layer)
Receives wake, loads agent context, routes by label
↓ foundry spawn / foundry orchestrate
Foundry (execution layer)
Spawns coding agents, manages worktrees, runs review loop
↓ PR with fixes
GitHub (delivery layer)
CI, code review, merge, issue auto-close
↓ status update
Paperclip (closes the loop)
Agent reports PR link back to the original issue
Paperclip acts as the product management layer. Its CEO agent creates prioritized issues with labels. OpenClaw wakes on those events and routes them through Foundry based on label:
- engineering/foundry → `foundry spawn` (isolated worktree, own branch + PR)
- ops → handled directly by the orchestrator agent (config, crons, research)
The full loop has been verified end-to-end: CEO creates issue → OpenClaw agent wakes → creates GitHub issue → spawns Foundry agent → agent codes + opens PR → reviews pass → agent reports PR link back to Paperclip. Zero human keystrokes.
Without OpenClaw
Foundry works perfectly standalone. Cron jobs + local agents + GitHub. OpenClaw is the upgrade path when you want remote control, push notifications, and multi-machine scaling. Add Paperclip on top when you want autonomous project management.
Quick Start
# Install (or update)
curl -fsSL https://raw.githubusercontent.com/merlinrabens/foundry/main/install.sh | bash
# Configure repos, agents, notifications
foundry setup
# Go
foundry status # Dashboard
foundry scan ~/projects/my-repo # Find labeled issues
foundry orchestrate # Full auto: scan → spawn → check
That's it. The installer handles cloning, PATH, prerequisites, and database setup. The setup wizard walks you through everything else.
Requirements
- macOS or Linux
- GitHub CLI (`gh`), authenticated (`gh auth login`)
- At least one AI agent (sign in via each CLI for OAuth, or set API key as fallback):
  - Claude Code: `npm i -g @anthropic-ai/claude-code` + `claude /login`
  - Codex: `npm i -g @openai/codex` + run `codex` to sign in
  - Gemini: `npm i -g @google/gemini-cli` + run `gemini` to sign in
- SQLite3, jq (installer checks for these)
Setup Guide
`foundry setup` handles everything interactively:
- Repos — wh
