The /be Workflow · the kolu Atlas

/be is how kolu ships. You type /be <task>, answer a couple of questions, and walk away — the agent takes it from a fresh branch to a draft PR that has already survived a four-reviewer gauntlet, with green CI and visual evidence attached. This note is the map of that pipeline: what each phase does, which skill runs it, and how they hand off. Every blue node in the diagram is a link — to the skill’s canonical SKILL.md in whichever repo it lives (kolu, agency, odu), or to deeper reading where there is some.

The /be pipeline. The spine (top to bottom) is the six phases; skills attach to the right of each. The magenta loop is the review gauntlet — four reviewers run strictly one after another, each the sole editor of the branch while it runs. Ship fans into two parallel lanes (CI and evidence) that rejoin before Done. Click any skill node to open its source.

How /be differs from /do

/be is the Claude-Code-native descendant of /do. Both run autonomously once under way: /do is “mostly autonomous” by design and asks nothing along the way either, so autonomy is not what sets them apart. Two things actually do:

It’s built for Claude Code’s dynamic workflows. That’s the reason /be exists as its own skill. A dynamic workflow is a script Claude Code’s runtime executes to orchestrate many subagents at scale. /do is harness-agnostic; /be is optimized for Claude Code so its review gauntlet can run as dynamic workflows — the debate skills /be-review drives (/lens-debate, /codex-debate) each fan out dozens of subagents from a workflow script, which /do’s flow has no way to express.
It opens with a short interview. Before any work, /be asks a single batched AskUserQuestion — the one and only moment it asks you anything — to pin down a few choices up front rather than guessing.

/be lives in kolu today, but it’s meant to be upstreamed — to agency (where /do and most review skills already live) or another home — so it can decouple from kolu and be used anywhere. The kolu-specific pieces it leans on (the /pu box, /dev-server, the Atlas) are the seams that work would tease apart.

The interview covers three things:

Plan first? Write the plan as an Atlas note for review before implementing, or go straight to code. Default: straight, unless the task is large or ambiguous.
Task kind — bug · feature · refactor. This picks the test strategy: a bug needs a red-then-green reproduction; a feature needs a covering test written first; a refactor leans on existing coverage.
Ultracode? — only asked when it isn’t already on. With it, the review gauntlet fans out deeper and adversarially verifies each finding.
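The way the interview answers steer the rest of the run can be sketched as a small lookup. This is an illustration only: `InterviewAnswers`, `TEST_STRATEGY`, `default_plan_first`, and `strategy_for` are hypothetical names, not taken from the /be skill source. Only the three questions and their stated consequences come from the note.

```python
from dataclasses import dataclass

# Hypothetical mapping from task kind to test strategy, paraphrasing the note.
TEST_STRATEGY = {
    "bug": "reproduce red, then fix until the repro turns green",
    "feature": "write a covering test first",
    "refactor": "lean on existing coverage",
}


@dataclass(frozen=True)
class InterviewAnswers:
    plan_first: bool  # write the plan as an Atlas note before implementing?
    task_kind: str    # "bug", "feature", or "refactor"
    ultracode: bool   # deeper gauntlet that adversarially verifies findings


def default_plan_first(task_is_large: bool, task_is_ambiguous: bool) -> bool:
    """Stated default: go straight to code unless the task is large or ambiguous."""
    return task_is_large or task_is_ambiguous


def strategy_for(answers: InterviewAnswers) -> str:
    """Look up the test strategy implied by the chosen task kind."""
    try:
        return TEST_STRATEGY[answers.task_kind]
    except KeyError:
        raise ValueError(f"unknown task kind: {answers.task_kind!r}") from None
```

For example, `strategy_for(InterviewAnswers(False, "refactor", False))` returns the "lean on existing coverage" entry.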

The six phases

The spine of the diagram is six phases, run in order. The interview above opens the run; everything after it is fully autonomous.

1. Set up: git fetch, branch off origin/HEAD (never commit to master), and read .agency/do.md for the project’s check / fmt / test / ci commands. If “plan first” was chosen, the plan of record is an Atlas note authored via /atlas.
2. Implement: test-first. A bug is reproduced red before it’s theorized about, then fixed until the repro flips green, via the /test harness when a failing e2e can express it. Heavy work (builds, the dev server, reproductions) runs off-machine on a /pu box, launched through /dev-server, because production kolu lives on the same machine the agent runs on. A lockfile change refreshes the Nix FOD hash via /nix-typescript. Docs and the changelog are synced in the same commit.
3. Open the PR: a draft PR, written with /forge-pr, before any review, so every reviewer’s findings land as comments on a real PR. The changelog link is backfilled and, if there’s a plan note, the note is finalized to status: implemented via /atlas.
4. Review gauntlet: the heart of /be (next section).
5. Ship: CI and evidence, in parallel (section after next).
6. Done: report the PR, the gauntlet outcome, and CI status. /be never merges; the human reviews the commits and merges when satisfied. /be then runs /self-improve to mine the session for recurring friction.
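The red-then-green rule from the Implement phase can be sketched as a guard around the fix. This is a minimal sketch, not the /test harness: `red_then_green`, `repro`, and `apply_fix` are hypothetical names, and the repro is assumed to be a callable that returns True when the behaviour is correct.

```python
from typing import Callable


def red_then_green(repro: Callable[[], bool], apply_fix: Callable[[], None]) -> None:
    """Require the repro to fail before the fix and pass after it.

    repro() returns True when the behaviour is correct (the test is green).
    """
    if repro():
        # A repro that is green before the fix does not capture the bug.
        raise RuntimeError("repro is already green: it does not reproduce the bug")
    apply_fix()
    if not repro():
        raise RuntimeError("repro is still red after the fix")


# Toy usage: the "bug" is a flag that the fix flips.
state = {"fixed": False}
red_then_green(lambda: state["fixed"], lambda: state.update(fixed=True))
```

The first check is the one that matters most: it stops a fix from being declared done against a test that never failed.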

The review gauntlet

Phase 4 is /be-review, which runs four reviewers serially — each the sole editor of the branch while it runs. Serial, not parallel, by design: two reviewers writing the same worktree at once would see torn, half-edited state. Running one at a time means every reviewer reads a clean, committed tree and applies its own fixes directly — no snapshot machinery, no separate apply pass.

1. /lens-debate: two structural lenses, lowy (volatility-based boundaries) and hickey (structural simplicity), review independently, cross-examine every finding to consensus, then apply the agreed fixes. Both lenses, and why kolu reviews with them, are explained in the blog post Hickey & Lowy.
2. /codex-debate: codex (reviewer) and a Claude author debate the diff to consensus, each round auto-committing its fix(…).
3. /simplify: the self-applying reuse, simplification, and efficiency pass over the changed code. (A built-in Claude Code skill, so this links to its docs rather than a repo.)
4. /code-police: runs its rule-checklist and /fact-check passes and applies their fixes. It runs with --no-elegance here, because the elegance pass would just re-invoke /simplify, which step 3 already ran over this same tree.
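The serial, sole-editor property of the four reviewers can be modelled in a few lines. This is a toy model under stated assumptions, not how /be-review is implemented: the branch is reduced to a list of commit messages, and `Reviewer` and `run_gauntlet` are hypothetical names.

```python
from typing import Callable, List, Tuple

# A reviewer reads the committed history and returns the fixes it wants to commit.
Reviewer = Callable[[Tuple[str, ...]], List[str]]


def run_gauntlet(commits: List[str], reviewers: List[Tuple[str, Reviewer]]) -> List[str]:
    """Run reviewers strictly one after another on a shared history."""
    history = list(commits)
    for name, review in reviewers:
        # Each reviewer reads a clean, fully committed tree: everything the
        # previous reviewers committed, and nothing half-written.
        committed = tuple(history)
        fixes = review(committed)
        # Only this reviewer writes to the branch while it runs.
        history.extend(f"{name}: {fix}" for fix in fixes)
    return history
```

Because the loop never starts a reviewer before the previous one has committed, no snapshotting or separate apply pass is needed in this model.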

/be-review commits each step locally but pushes once at the end, then posts the PR comments, so no comment ever advertises a commit that’s still local-only. If the diff touches a perf-sensitive surface, a performance pass checks it against the performance map and updates that note when a win is banked or a new one surfaces.
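The commit-locally, push-once, comment-last ordering can be sketched as an invariant on a small model. `ReviewBranch` and its methods are hypothetical and stand in for real git and PR operations; the only thing taken from the note is the ordering itself.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReviewBranch:
    local: List[str] = field(default_factory=list)    # commits made locally
    remote: List[str] = field(default_factory=list)   # commits visible on the PR
    pending: List[Tuple[str, str]] = field(default_factory=list)  # (sha, finding)
    comments: List[str] = field(default_factory=list)  # comments posted to the PR

    def commit(self, sha: str, finding: str) -> None:
        """Record a local fix and queue the comment that will describe it."""
        self.local.append(sha)
        self.pending.append((sha, finding))

    def push(self) -> None:
        """Single push at the end of the gauntlet."""
        self.remote = list(self.local)

    def post_comments(self) -> None:
        """Post queued comments, refusing to reference local-only commits."""
        unpushed = [sha for sha, _ in self.pending if sha not in self.remote]
        if unpushed:
            raise RuntimeError(f"refusing to comment on local-only commits: {unpushed}")
        self.comments += [f"{finding} (fixed in {sha})" for sha, finding in self.pending]
        self.pending.clear()
```

Queuing comments instead of posting them per step is what lets the push happen once without any comment pointing at a commit readers cannot see.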

Ship in parallel, then close the loop

Phase 5 runs two independent lanes at once — there’s no reason to wait for green before capturing:

CI — /ci kicks off the pipeline first, backgrounded, driven through the odu MCP face (/odu-mcp): run → wait for settle → read the red node’s log → rerun. It reacts to failures the moment they land.
Evidence — /evidence captures on-screen behavior (a screenshot or a video) for any change with a visible effect, then posts it under an ## Evidence comment. Even a backend bug fix demonstrates the now-fixed behavior.
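The CI lane's loop (run, wait for settle, read the red node's log, rerun) can be sketched as follows. This does not use the odu MCP interface, whose calls are not shown in this note: `drive_ci`, `run_and_settle`, and `on_red` are hypothetical callables, and the `max_attempts` cap is an assumption added so the sketch terminates.

```python
from typing import Callable, Optional


def drive_ci(
    run_and_settle: Callable[[], Optional[str]],
    on_red: Callable[[str], None],
    max_attempts: int = 3,  # assumption: the note does not state a retry limit
) -> bool:
    """Rerun the pipeline until it is green or attempts run out.

    run_and_settle() starts the pipeline, waits for it to settle, and returns
    None when green or the red node's log when something failed.
    """
    for _ in range(max_attempts):
        red_log = run_and_settle()
        if red_log is None:
            return True
        # React to the failure using the log, then loop around to rerun.
        on_red(red_log)
    return False
```

A quick check with a pipeline that fails once and then passes:

```python
outcomes = iter(["lint failed", None])
logs = []
ok = drive_ci(lambda: next(outcomes), logs.append)
```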

Both lanes run on a /pu box, never locally — a prior run piled local builds beside production kolu and the OOM-killer took production down. The lanes rejoin before Done: CI must be green on the final HEAD and evidence posted.
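The fan-out and rejoin of the Ship phase maps naturally onto `asyncio.gather`. The sketch below is illustrative: `ship`, `CiLane`, and `EvidenceLane` are hypothetical names, and the lanes are passed in as coroutines so that nothing about the real /ci or /evidence skills is assumed beyond what the note states.

```python
import asyncio
from typing import Awaitable, Callable, Tuple

# A CI lane reports (sha it ran on, status); an evidence lane reports whether
# the evidence comment was posted.
CiLane = Callable[[str], Awaitable[Tuple[str, str]]]
EvidenceLane = Callable[[str], Awaitable[bool]]


async def ship(final_head: str, ci: CiLane, evidence: EvidenceLane) -> bool:
    """Run both lanes concurrently and gate Done on both results."""
    (ci_sha, ci_status), evidence_posted = await asyncio.gather(
        ci(final_head),
        evidence(final_head),
    )
    # The lanes rejoin here: CI must be green on the final HEAD, not on an
    # earlier commit, and the evidence must already be posted.
    return ci_sha == final_head and ci_status == "green" and evidence_posted
```

Neither lane waits on the other; only the final gate needs both results.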

After Done, /be closes the loop with /self-improve, which runs forked (off the main context) to mine this session’s transcript for every point a human had to intervene. It produces nothing unless a lesson durably recurs; when one does, it ships a small fix to the skill sources on its own draft PR for a human to review — never on the /be branch, never merged.

This note shipped in #1565.