# Hoyeon

> Requirements-first Harness — derive, verify, execute.

**Install / Use**

```
/learn @team-attention/HoyeonREADME
```

All you need is requirements. A Claude Code plugin that derives requirements from your intent, verifies every derivation, and delivers traced code — without you writing a plan.
Quick Start · Philosophy · The Chain · Commands · Agents
**AI can build anything. The hard part is knowing what to build — precisely.**
Most AI coding fails at the input, not the output. The bottleneck isn't AI capability. It's human clarity. You say "add dark mode" and there are a hundred decisions hiding behind those three words.
Most tools either force you to enumerate them upfront, or ignore them entirely. Hoyeon does neither — it derives them. Layer by layer. Gate by gate. From intent to verified code.
## Requirements Are Not Written

> You don't know what you want until you're asked the right questions.
Requirements aren't artifacts you produce before coding. They're discoveries — surfaced through structured interrogation of your intent. Every "add a feature" conceals unstated assumptions. Every "fix the bug" hides a root cause you haven't named yet.
Hoyeon's job is to find what you haven't said.
```
You say:      "add dark mode toggle"
                  │
Hoyeon asks:  "System preference or manual?"     ← assumption exposed
              "Which components need variants?"  ← scope clarified
              "Persist where? How?"              ← decision forced
                  │
Result:       3 requirements, 7 scenarios, 4 tasks — all with verify commands
```
This is not just process. It's built on three beliefs about how AI coding should work.
### 1. Requirements over tasks

> Get the requirements right, and the code writes itself. Get them wrong, and no amount of code fixes it.
Most AI tools jump straight to tasks — "create file X, edit function Y." But tasks are derivatives. They change when requirements change. If you start from tasks, you're building on sand.
Hoyeon starts from goals and derives downward through a layer chain:
```
Goal → Decisions → Requirements → Scenarios → Tasks
```
Requirements are refined from multiple angles before a single line of code is written. Interviewers probe assumptions. Gap analyzers find what's missing. UX reviewers check user impact. Tradeoff analyzers weigh alternatives. Each perspective sharpens the requirements until they're precise enough to generate verifiable scenarios.
The chain is directional: requirements produce tasks, never the reverse. If requirements change, scenarios and tasks are re-derived. This is why Hoyeon can recover from mid-execution blockers — the requirements are still valid, only the tasks need adjustment.
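The directional trace can be checked mechanically. The sketch below is illustrative only: the field names (`id`, `requirement_id`, `scenario_ids`) are hypothetical stand-ins, not Hoyeon's actual `spec.json` schema.

```python
# Sketch: verify that every layer only points at the layer above it.
# Field names here are hypothetical, not Hoyeon's real schema.

def broken_links(spec):
    """Return trace links that point at nonexistent parents."""
    req_ids = {r["id"] for r in spec["requirements"]}
    scn_ids = {s["id"] for s in spec["scenarios"]}
    broken = []
    for s in spec["scenarios"]:
        if s["requirement_id"] not in req_ids:
            broken.append(("scenario", s["id"]))
    for t in spec["tasks"]:
        if not set(t["scenario_ids"]) <= scn_ids:
            broken.append(("task", t["id"]))
    return broken

spec = {
    "requirements": [{"id": "R1"}],
    "scenarios": [{"id": "S1", "requirement_id": "R1"}],
    "tasks": [{"id": "T1", "scenario_ids": ["S1", "S9"]}],  # S9 does not exist
}
print(broken_links(spec))  # [('task', 'T1')]
```

Because the check only walks upward, changing a requirement invalidates exactly the scenarios and tasks that trace to it, which is what makes re-derivation after a blocker cheap.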
### 2. Determinism by design

> LLMs are non-deterministic. The system around them doesn't have to be.
An LLM given the same prompt twice may produce different code. This is the fundamental challenge of AI-assisted development. Hoyeon's answer: constrain the LLM with programmatic control so that non-determinism doesn't propagate.
Three mechanisms enforce this:
- **`spec.json` as single source of truth** — Every agent reads from and writes to the same structured spec. No agent invents its own context. No information lives only in a conversation. The spec is the shared memory that survives context windows, compaction, and agent handoffs.
- **CLI-enforced structure** — `hoyeon-cli` validates every merge to `spec.json`. Field names, types, required relationships — all checked programmatically before the LLM ever sees the data. The CLI doesn't suggest structure; it rejects invalid structure.
- **Derivation chain as contract** — Goal → Decisions → Requirements → Scenarios → Tasks are linked. Each layer references the one above it. A scenario traces to a requirement. A task traces to scenarios. If the chain breaks, the gate blocks. This means: if you have valid requirements, the system will produce a result — deterministically routed, even if the LLM's individual outputs vary.
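Reject-don't-suggest validation can be sketched in a few lines. The schema below is a stand-in written for illustration; `hoyeon-cli`'s actual rules are not reproduced here.

```python
# Sketch of CLI-style structural validation: invalid structure is
# rejected with errors, never silently repaired. The REQUIRED schema
# is a hypothetical stand-in, not hoyeon-cli's real rule set.

REQUIRED = {"id": str, "given": str, "when": str, "then": str, "verified_by": str}

def validate_scenario(obj):
    errors = []
    for field, typ in REQUIRED.items():
        if field not in obj:
            errors.append(f"missing field: {field}")
        elif not isinstance(obj[field], typ):
            errors.append(f"wrong type for {field}: expected {typ.__name__}")
    if obj.get("verified_by") not in ("machine", "human"):
        errors.append("verified_by must be 'machine' or 'human'")
    return errors

bad = {"id": "S1", "given": "user clicks toggle", "then": "theme switches"}
# Three errors: missing 'when', missing 'verified_by', invalid verified_by.
print(validate_scenario(bad))
```

An empty error list is the only state in which a merge would proceed; anything else blocks before the LLM's output touches the spec.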
The LLM does the creative work. The system ensures it stays on rails.
### 3. Machine-verifiable by default

> If a human has to check it, the system failed to automate it.

Every scenario in `spec.json` carries a `verified_by` classification:
```json
{
  "given": "user clicks dark mode toggle",
  "when": "toggle is activated",
  "then": "theme switches to dark",
  "verified_by": "machine",
  "verify": { "type": "command", "run": "npm test -- --grep 'dark mode'" }
}
```
The system pushes everything toward machine verification. AC Quality Gate reviews each scenario and suggests converting human items to machine where possible. Multi-model code review (Codex + Gemini + Claude) runs independently and synthesizes a consensus verdict. Independent verifiers check Definition of Done in isolated contexts to eliminate self-verification bias.
Human review is reserved for what machines genuinely can't judge — UX feel, business logic correctness, naming decisions. Everything else runs automatically, every time, without asking.
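A minimal dispatcher for that classification might look like the following. The `"command"` type mirrors the JSON above; how Hoyeon actually shells out and records results is an assumption.

```python
# Sketch: route a scenario by its verified_by field. Machine scenarios
# run their verify command; human scenarios are queued for review.
import subprocess

def verify(scenario):
    if scenario["verified_by"] != "machine":
        return "needs-human-review"  # reserved for what machines can't judge
    v = scenario["verify"]
    if v["type"] == "command":
        result = subprocess.run(v["run"], shell=True)
        return "pass" if result.returncode == 0 else "fail"
    raise ValueError(f"unknown verify type: {v['type']}")

print(verify({
    "verified_by": "machine",
    "verify": {"type": "command", "run": "true"},  # trivially passing command
}))  # pass
```

The key property is that `"machine"` scenarios never wait on a person: exit code 0 is a pass, anything else is a fail, every run.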
### 4. Knowledge compounds

> Most AI tools start from zero every session. Hoyeon remembers.
Every execution generates structured learnings — not logs, not chat history, but typed knowledge: what went wrong, why, and the rule to prevent it next time.
```
/execute runs → Worker hits edge case
     │
Worker records:
  { problem: "localStorage quota exceeded at 5MB",
    cause:   "No size check before write",
    rule:    "Always check remaining quota before localStorage.setItem" }
     │
Next /specify → searches past learnings via BM25
     │
Result: "Found: localStorage quota issue in todo-app spec.
         → Adding R5: quota guard requirement automatically"
```
This is cross-spec compounding. A lesson learned in one project surfaces as a requirement in the next. The system doesn't just avoid repeating mistakes — it actively strengthens future specs with evidence from past executions.
Three mechanisms make this work:
- **spec learning** — Workers record structured learnings during execution, auto-mapped to the requirements and tasks that produced them
- **spec search** — BM25 search across all specs: requirements, scenarios, constraints, and learnings. What you learned in project A informs what you ask in project B
- **Compounding loop** — Each `/specify` session starts by searching past learnings. More projects → richer search results → more complete requirements → fewer surprises during execution → better learnings → the cycle continues
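BM25 itself is a small, deterministic ranking function. Here is a hand-rolled sketch over a couple of learnings; Hoyeon's actual index, tokenizer, and searched fields are assumptions.

```python
# Sketch: Okapi BM25 ranking over past learnings (hand-rolled for
# illustration; not Hoyeon's actual search implementation).
import math
from collections import Counter

def bm25_rank(docs, query, k1=1.5, b=0.75):
    """Return document indices, best match first."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for t in tokenized if term in t)  # document frequency
            if df == 0:
                continue
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return sorted(range(n), key=lambda i: -scores[i])

learnings = [
    "always check remaining quota before localStorage.setItem",
    "debounce resize handlers to avoid layout thrash",
]
print(bm25_rank(learnings, "localStorage quota"))  # [0, 1]
```

Because the scoring is purely lexical and deterministic, the same query over the same knowledge base always surfaces the same learnings, which keeps the compounding loop reproducible.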
The result: the tenth project you run through Hoyeon is meaningfully better than the first — not because the LLM improved, but because the knowledge base did.
These aren't aspirations. They're enforced by the architecture — the CLI rejects invalid specs, gates block unverified layers, hooks guard writes, agents verify in isolation, and learnings compound across projects. The system is designed so that doing the right thing is the path of least resistance.
## See It In Action
```
You: /specify "add dark mode toggle to settings page"

Hoyeon interviews you (scenario-based):
├─ "User opens the app at night — should it auto-detect OS dark mode
    or require a manual toggle?"
├─ "User switches to dark mode mid-session — should charts/images also invert?"
└─ derives implications: CSS variables needed, localStorage for persistence,
   prefers-color-scheme media query

Agents research your codebase in parallel:
├─ code-explorer scans component structure
├─ docs-researcher checks design system conventions
└─ ux-reviewer flags potential regression

→ spec.json generated:
  3 requirements, 7 scenarios, 4 tasks — all with verify commands

You: /execute

Hoyeon orchestrates:
├─ Worker agents implement each task in parallel
├─ Verifier agents independently check scenarios per task
├─ Code review: Codex + Gemini + Claude (multi-model consensus)
└─ Final Verify: goal + constraints + AC — holistic check

→ Done. Every file change traced to a requirement.
```
<details>
<summary><strong>What just happened?</strong></summary>

```
/specify  → Interview exposed hidden assumptions
          → Agents researched codebase in parallel
          → Layer-by-layer derivation: L0→L1→L2→L3→L4→L5
          → Each layer gated by CLI validation + agent review

/execute  → Orchestrator read spec.json, dispatched parallel workers
          → Independent verifiers checked each scenario mechanically
          → Multi-model code review synthesized verdict
          → Final Verify checked goal, constraints, AC holistically
          → Atomic commits with full traceability
```

The chain ran from intent to proof. Every derivation verified.
</details>

## The Derivation Chain
Six layers. Each derived from the one before it. Each gated before the next begins.
```
L0: Goal          "add dark mode toggle"
      ↓  ◇ gate: is the goal clear?
L1: Context       codebase analysis, UX review, docs research
      ↓  ◇ gate: is the context sufficient?
L2: Decisions     scenario interview → implications derivation (L2.5)
      ↓  ◇ gate: are decisions justified?
L3: Requirements  R1: "Toggle switches theme" → scenarios + verify
      ↓  ◇ gate: are requirements complete? (AC Quality Gate)
L4: Tasks         T1: "Add toggle component" → file_scope, AC
      ↓  ◇ gate: do tasks cover all requirements?
L5: Review        plan-reviewer + step-back gate-keeper
```
Each gate runs two checks:

- **Merge checkpoint** — the CLI validates structure and completeness
- **Gate-keeper** — an agent team reviews the layer's output before the next layer begins
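The two-check gate reduces to a small control-flow pattern. This is a sketch under stated assumptions: the callables stand in for the real CLI validation and agent review, which are far richer.

```python
# Sketch: a gate only lets the next layer start when both checks pass.
# merge_checkpoint and gate_keeper are hypothetical stand-ins for the
# CLI validator and the reviewing agent team.

def run_gate(layer, merge_checkpoint, gate_keeper):
    if not merge_checkpoint(layer):
        return f"blocked at {layer['name']}: invalid structure"
    if not gate_keeper(layer):
        return f"blocked at {layer['name']}: review failed"
    return f"{layer['name']} passed"

layer = {"name": "L3:Requirements", "items": ["R1", "R2"]}
print(run_gate(layer,
               merge_checkpoint=lambda l: bool(l["items"]),
               gate_keeper=lambda l: True))
# L3:Requirements passed
```

Ordering matters here: the cheap, deterministic structural check runs first, so the agent review never spends effort on a layer the CLI would reject anyway.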