Agentops
The missing DevOps layer for coding agents. Flow, feedback, and memory that compounds between sessions.
AgentOps
Every session starts where the last one left off.
Validation, memory, lifecycle gates, and briefing-first control for coding agents. AgentOps acts as a software-factory control plane: bounded context goes in, validated code and durable learning come out.
<p align="center"> <img src="docs/assets/swarm-6-rpi.png" alt="Agents running full development cycles in parallel with validation gates and a coordinating team leader" width="800"> </p>
What AgentOps Gives You
Session 1, your agent spends 2 hours debugging a timeout bug. Session 15, a new agent finds the answer in 10 seconds — because /retro captured the lesson and the flywheel promoted it. Three capabilities make this work:
- Judgment validation — agents get risk context that challenges the plan and the code before shipping.
- Durable learning — solved problems stay solved. Your repo accumulates institutional knowledge across sessions, agents, and runtimes.
- Loop closure — completed work produces better next work, stronger rules, and richer future context.
Every skill, hook, and CLI command exists to deliver one of these three. They form a single lifecycle contract, not separate features.
Operationally, that means AgentOps behaves like a software factory:
- briefings and startup context prepare the work order
- RPI runs the delivery line
- validation gates accept or reject output
- the flywheel turns completed work into future advantage
See Software Factory Surface for the explicit operator lane.
| Capability | What you get |
|-----|---------------|
| Judgment validation | /pre-mortem challenges your plan before build; /vibe + /council validate code before commit |
| Durable learning | Repo-native memory via .agents/ — lessons compound across sessions, agents, and runtimes |
| Loop closure | Every cycle produces artifacts, issues, and next-work suggestions the next session acts on |
Install
# Claude Code (recommended): marketplace + plugin install
claude plugin marketplace add boshu2/agentops
claude plugin install agentops@agentops-marketplace
# Codex CLI (0.110.0+ native plugin)
curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-codex.sh | bash
# OpenCode
curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install-opencode.sh | bash
# Other Skills-compatible agents (agent-specific, install only what you need)
# Example (Cursor):
npx skills@latest add boshu2/agentops --cursor -g
On Linux, also install system bubblewrap so Codex uses it directly:
sudo apt-get install -y bubblewrap
Install ao CLI (optional)
Skills work standalone. The ao CLI unlocks the full repo-native layer: knowledge extraction, retrieval and injection, maturity scoring, goals, and control-plane workflows.
Homebrew (recommended)
brew tap boshu2/agentops https://github.com/boshu2/homebrew-agentops
brew install agentops
which ao
ao version
Or install via release binaries or build from source.
Then type /quickstart in your agent chat.
In Claude Code, CLAUDE.md is the startup surface. Installed hooks stay
silent: SessionStart prepares runtime state and can stage factory goal or
briefing files, while UserPromptSubmit can capture first-prompt intake
without injecting additional context into the session.
Start Here
Three commands, zero methodology. Pick one and go:
/council validate this PR # Multi-model code review — immediate value
/research "how does auth work" # Codebase exploration with memory
/implement "fix the login bug" # Full lifecycle for one task
When you're ready for more:
/plan → /crank # Decompose into issues, parallel-execute
/rpi "add retry backoff" # Full pipeline: research → plan → build → validate → learn
/evolve # Fitness-scored improvement loop — walk away, come back to better code
Every skill works alone. Compose them however you want. Full catalog: Skills.
If you want the explicit operator lane instead of individual primitives:
ao factory start --goal "fix auth startup"
/rpi "fix auth startup" # or: ao rpi phased "fix auth startup"
ao codex stop
That path keeps briefing, runtime startup, delivery, and loop closure on one surface. See Software Factory Surface.
How It Works
Each phase delivers one or more of the three capabilities — judgment, learning, loop closure:
| Phase | Primary skills | What you get |
|------|----------------|---------------------|
| Discovery | /brainstorm -> /research -> /plan -> /pre-mortem | Repo context, scoped work, known risks, execution packet |
| Implementation | /crank -> /swarm -> /implement | Closed issues, validated wave outputs, ratchet checkpoints |
| Validation + learning | /validation -> /vibe -> /post-mortem -> /retro -> /forge | Findings, learnings, next work, stronger prevention artifacts |
/rpi orchestrates all three phases. /evolve keeps running /rpi against GOALS.md so the worst fitness gap gets addressed next. The output is code + state + memory + gates.
The explicit CLI operator surface around that line is:
- ao factory start for briefing-first startup
- /rpi or ao rpi phased for delivery
- ao codex stop for explicit loop closure
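To make the /evolve selection step concrete, here is a minimal Python sketch of "address the worst fitness gap next." It is not AgentOps code: the `Goal` fields, the 0 to 1 score scale, and the sample goals are assumptions made for illustration, and the real GOALS.md format may look quite different.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Goal:
    name: str
    target: float   # desired fitness score (hypothetical 0..1 scale)
    current: float  # most recently measured fitness score


def worst_gap(goals: List[Goal]) -> Optional[Goal]:
    """Return the goal furthest below its target, or None if every goal is met."""
    unmet = [g for g in goals if g.current < g.target]
    if not unmet:
        return None
    return max(unmet, key=lambda g: g.target - g.current)


# Illustrative goals only; these names and numbers are made up.
goals = [
    Goal("test coverage", target=0.80, current=0.74),
    Goal("lint clean", target=1.00, current=1.00),
    Goal("doc coverage", target=0.90, current=0.60),
]
next_goal = worst_gap(goals)
print(next_goal.name if next_goal else "all goals met")
```

A loop in this style would run one full cycle against the selected goal, re-measure, and select again, which is the behavior the paragraph on /evolve describes.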
| Pattern | Chain | When |
|---------|-------|------|
| Quick fix | /implement | One issue, clear scope |
| Validated fix | /implement → /vibe | One issue, want confidence |
| Planned epic | /plan → /pre-mortem → /crank → /post-mortem | Multi-issue, structured |
| Full pipeline | /rpi (chains all above) | End-to-end, autonomous |
| Evolve loop | /evolve (chains /rpi repeatedly) | Fitness-scored improvement |
| PR contribution | /pr-research → /pr-plan → /pr-implement → /pr-validate → /pr-prep | External repo |
| Knowledge query | ao search → /research (if gaps) | Understanding before building |
| Standalone review | /council validate <target> | Ad-hoc multi-judge review |
Primitive chains underneath it
- Mission and fitness: GOALS.md, ao goals, /evolve
- Discovery chain: /brainstorm -> ao search / ao lookup -> /research -> /plan -> /pre-mortem
- Execution chain: /crank -> /swarm -> /implement -> /vibe -> ratchet checkpoints
- Compiled prevention chain: findings registry -> planning rules / pre-mortem checks / constraints -> later planning and validation
- Continuity chain: session hooks + phased manifests + /handoff + /recover
Each cycle adds new rules, learnings, and constraints — without anyone shipping new code. See Primitive Chains for the audited map.
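The compiled prevention chain (findings become planning rules that gate later plans) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how AgentOps stores or evaluates rules: the `PlanningRule` shape and the keyword matching are hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlanningRule:
    name: str
    applies_if: str  # the rule is relevant when the plan mentions this term
    requires: str    # in that case the plan must also mention this term


def gate_plan(plan_text: str, rules: List[PlanningRule]) -> List[str]:
    """Return the names of violated rules; an empty list means the plan passes."""
    text = plan_text.lower()
    return [
        rule.name
        for rule in rules
        if rule.applies_if.lower() in text and rule.requires.lower() not in text
    ]


# A rule of the kind a past timeout finding might have produced (hypothetical).
rules = [PlanningRule("network calls need timeouts", applies_if="http", requires="timeout")]

print(gate_plan("Add an HTTP client for the billing API.", rules))
print(gate_plan("Add an HTTP client with a 5s timeout and retries.", rules))
```

The first plan is rejected because it mentions HTTP without a timeout; the second passes. No new code shipped between the two checks, only a rule was added.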
How Agent Memory Works
Session 50 starts with 50 sessions of accumulated wisdom.
.agents/ is a directory in your repo that stores what your agents learned — as plain files. Grep replaces RAG. Plain text you can diff, review in PRs, and open in Obsidian.
┌──────────────────────────────────────────────────────────────────────┐
│   Traditional Cache       .agents/ Knowledge Store                   │
│ ┌────────────────────┐  ┌──────────────────────────────────────────┐ │
│ │ Stores results     │  │ Stores extracted lessons                 │ │
│ │ Hit = skip compute │  │ Hit = skip the 2-hour debugging          │ │
│ │ Flat key-value     │  │ Hierarchical: learning → pattern → rule  │ │
│ │ Static after write │  │ Promotes through tiers over time         │ │
│ │ One consumer       │  │ Any agent, any runtime, any session      │ │
│ └────────────────────┘  └──────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
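Because the store is plain files, "grep replaces RAG" amounts to a text search over a directory. The Python sketch below shows the idea; the `learnings/` subfolder, the file names, and the `grep_learnings` helper are all hypothetical, and the actual layout under .agents/ may differ.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def grep_learnings(root: Path, term: str) -> List[Tuple[str, int, str]]:
    """Return (relative path, line number, line) for each case-insensitive match."""
    needle = term.lower()
    hits = []
    for path in sorted(root.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits


# Demo against a throwaway directory. The learnings/ subfolder and the file
# names are hypothetical, not the layout AgentOps actually writes.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / ".agents"
    (store / "learnings").mkdir(parents=True)
    (store / "learnings" / "timeout-bug.md").write_text(
        "# Lesson\nClient timeout must exceed upstream p99 latency.\n", encoding="utf-8"
    )
    (store / "learnings" / "auth.md").write_text(
        "# Lesson\nRefresh tokens before they expire.\n", encoding="utf-8"
    )
    demo_hits = grep_learnings(store, "timeout")

for rel_path, lineno, line in demo_hits:
    print(f"{rel_path}:{lineno}: {line}")
```

Nothing here needs an index or an embedding model, which is also why the same files can be diffed and reviewed in a pull request.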
How it compounds: Session 1, your agent hits a timeout bug and spends 2 hours debugging. /retro captures the lesson. /athena promotes it to a pattern. Session 15, a new agent greps "timeout" and finds the answer in 2 operations — turning a 2-hour investigation into a 10-second lookup. Session 20, a planning rule gates plans that omit timeout checks. That's institutional knowledge that survives agent death.
┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐
│  1. WORK   │─>│  2. FORGE  │─>│  3. POOL   │─>│ 4. PROMOTE │
│  Session   │  │  Extract   │  │  Score &   │  │  Graduate  │
└────────────┘  └────────────┘  └────────────┘  └────────────┘
      ^                                                │
      │         ┌────────────┐  ┌────────────┐         │
      └─────────│ 6. INJECT  │<─│5. LEARNINGS│<────────┘
                │  Surface   │  │ Permanent  │
                └────────────┘  └────────────┘
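The promotion step in the flywheel (learning to pattern to rule) can be pictured as a small state machine. The sketch below is one possible reading, not the AgentOps implementation: promoting on a reuse count, and the thresholds of 3 and 10, are assumptions chosen only to make the example run.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    LEARNING = 1
    PATTERN = 2
    RULE = 3


# Hypothetical thresholds: successful reuses needed to leave each tier.
PROMOTION_THRESHOLDS = {Tier.LEARNING: 3, Tier.PATTERN: 10}


@dataclass
class KnowledgeItem:
    summary: str
    tier: Tier = Tier.LEARNING
    reuse_count: int = 0


def record_reuse(item: KnowledgeItem) -> KnowledgeItem:
    """Count one successful reuse and promote the item if it reached its tier's threshold."""
    item.reuse_count += 1
    threshold = PROMOTION_THRESHOLDS.get(item.tier)  # None for RULE, the top tier
    if threshold is not None and item.reuse_count >= threshold:
        item.tier = Tier(item.tier + 1)
    return item


item = KnowledgeItem("Set client timeout above upstream p99 latency")
for _ in range(3):
    record_reuse(item)
print(item.tier.name)
```

Under these assumed thresholds the item graduates to a pattern after its third reuse and to a rule after its tenth, then stays there.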
> /research "retry backoff strategies"
[lookup] 3 prior learnings found (freshness-weighted):
- Token bucket with Redis (established, high confidence)
- Rate limit at middleware layer, not per-handler (pattern)
- /login endpoint was missing rate limiting (decision)
[research] Found prior art in your codebase + retrieved context
Recommends: exponential backoff with jitter, reuse existing Redis client
Stale insights decay automatically. Useful ones compound.
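One common way to get "stale insights decay, useful ones compound" is exponential decay with a reset on reuse. The README does not state the formula AgentOps uses, so treat this Python sketch as an assumption throughout: the half-life of 30 days, the `confidence` field, and the `reconfirm` bump are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Learning:
    title: str
    confidence: float  # 0.0 to 1.0 (hypothetical field)
    age_days: float    # days since the learning was last confirmed


def freshness_score(item: Learning, half_life_days: float = 30.0) -> float:
    """Confidence halves for every half_life_days that pass without confirmation."""
    return item.confidence * 0.5 ** (item.age_days / half_life_days)


def reconfirm(item: Learning, bump: float = 0.1) -> Learning:
    """A reuse resets the age and raises confidence, capped at 1.0."""
    item.age_days = 0.0
    item.confidence = min(1.0, item.confidence + bump)
    return item


def rank(items: List[Learning], half_life_days: float = 30.0) -> List[Learning]:
    """Order learnings from most to least relevant under the decay model."""
    return sorted(items, key=lambda it: freshness_score(it, half_life_days), reverse=True)


items = [
    Learning("old but confident", confidence=0.9, age_days=120.0),
    Learning("recent", confidence=0.6, age_days=5.0),
]
for it in rank(items):
    print(f"{it.title}: {freshness_score(it):.3f}")
```

With these sample values the recent learning outranks the older one despite its lower confidence, and calling `reconfirm` on the older one would move it back to the top.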
