The /be Workflow
How kolu takes a task to a shipped, reviewed PR — one interview up front, then a fully autonomous pipeline of skills. The whole flow, every skill in place, in one diagram.
/be is how kolu ships. You type /be <task>, answer a couple of questions, and walk away: the agent takes it from a fresh branch to a draft PR that has already survived a four-reviewer gauntlet, with green CI and visual evidence attached. This note is the map of that pipeline: what each phase does, which skill runs it, and how they hand off. Every blue node in the diagram is a link, either to the skill’s canonical SKILL.md in whichever repo it lives (kolu, agency, odu) or to deeper reading where there is some.
How /be differs from /do
/be is the Claude-Code-native descendant of /do. Both run
fully autonomously start to finish — /do is “mostly autonomous” by design
and asks nothing along the way either, so autonomy is not what sets them apart.
Two things actually do:
- It’s built for Claude Code’s dynamic workflows. That’s the reason /be exists as its own skill. A dynamic workflow is a script Claude Code’s runtime executes to orchestrate many subagents at scale. /do is harness-agnostic; /be is optimized for Claude Code so its review gauntlet can run as dynamic workflows: the debate skills /be-review drives (/lens-debate, /codex-debate) each fan out dozens of subagents from a workflow script, which /do’s flow has no way to express.
- It opens with a short interview. Before any work, /be asks a single batched AskUserQuestion, the one and only moment it asks you anything, to pin down a few choices up front rather than guessing.
/be lives in kolu today, but it’s meant to be upstreamed — to
agency (where /do and most review skills
already live) or another home — so it can decouple from kolu and be used
anywhere. The kolu-specific pieces it leans on (the /pu box,
/dev-server, the Atlas) are the seams that work would tease apart.
The interview covers three things:
- Plan first? Write the plan as an Atlas note for review before implementing, or go straight to code. Default: straight, unless the task is large or ambiguous.
- Task kind — bug · feature · refactor. This picks the test strategy: a bug needs a red-then-green reproduction; a feature needs a covering test written first; a refactor leans on existing coverage.
- Ultracode? — only asked when it isn’t already on. With it, the review gauntlet fans out deeper and adversarially verifies each finding.
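As a rough illustration of the interview above, the three answers and the test strategy each task kind selects can be modeled as plain data. This is a minimal sketch, not /be's actual implementation: the names `TaskKind`, `InterviewAnswers`, `TEST_STRATEGY`, and `questions_to_ask` are hypothetical, and the strategy strings paraphrase this note.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class TaskKind(Enum):
    BUG = "bug"
    FEATURE = "feature"
    REFACTOR = "refactor"


# Hypothetical lookup: the task kind picks the test strategy.
TEST_STRATEGY = {
    TaskKind.BUG: "reproduce red, then fix until the repro turns green",
    TaskKind.FEATURE: "write a covering test first",
    TaskKind.REFACTOR: "lean on existing coverage",
}


@dataclass(frozen=True)
class InterviewAnswers:
    """The choices pinned down by the single up-front interview."""
    plan_first: bool
    task_kind: TaskKind
    ultracode: bool


def questions_to_ask(ultracode_already_on: bool) -> List[str]:
    """List the questions for the one batched prompt.

    The ultracode question is only included when it isn't already on.
    """
    questions = ["plan_first", "task_kind"]
    if not ultracode_already_on:
        questions.append("ultracode")
    return questions
```

Keeping the strategy in a lookup table rather than in branching code makes it obvious that the task kind is the only input that decides how the change gets tested.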
The six phases
The spine of the diagram is six phases, run in order. The first is the interview above; the rest are fully autonomous.
- Set up: git fetch, branch off origin/HEAD (never commit to master), and read .agency/do.md for the project’s check / fmt / test / ci commands. If “plan first” was chosen, the plan of record is an Atlas note authored via /atlas.
- Implement: test-first. A bug is reproduced red before it’s theorized about, then fixed until the repro flips green, via the /test harness when a failing e2e can express it. Heavy work (builds, the dev server, reproductions) runs off-machine on a /pu box, launched through /dev-server, because production kolu lives on the same machine. A lockfile change refreshes the Nix FOD hash via /nix-typescript. Docs and the changelog are synced in the same commit.
- Open the PR: a draft PR, written with /forge-pr, before any review, so every reviewer’s findings land as comments on a real PR. The changelog link is backfilled and, if there’s a plan note, it’s finalized to status: implemented via /atlas.
- Review gauntlet: the heart of /be (next section).
- Ship: CI and evidence, in parallel (section after next).
- Done: report the PR, the gauntlet outcome, and CI status. /be never merges; the human reviews the commits and merges when satisfied. Then it runs /self-improve to mine the session for recurring friction.
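The git half of the Set up phase can be sketched as a small command builder. This is only an illustration under stated assumptions: the function name `setup_commands`, the `PROTECTED_BRANCHES` set, and the example branch name are hypothetical, and the sketch builds the commands without running them.

```python
from typing import List

# Assumption for illustration: the only protected branch is master.
PROTECTED_BRANCHES = {"master"}


def setup_commands(branch: str) -> List[List[str]]:
    """Return the git commands that start work on a fresh branch.

    The branch is cut from origin/HEAD after a fetch, so work never
    starts from (or lands on) a protected branch.
    """
    if branch in PROTECTED_BRANCHES:
        raise ValueError(f"refusing to work directly on {branch!r}")
    return [
        ["git", "fetch"],
        # `git switch -c <new-branch> <start-point>` creates and checks out
        # the branch at the remote's default HEAD.
        ["git", "switch", "-c", branch, "origin/HEAD"],
    ]
```

Each inner list can be handed to `subprocess.run(cmd, check=True)` inside a real checkout; for example, `setup_commands("fix-login")` yields the fetch followed by the branch creation.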
The review gauntlet
Phase 4 is /be-review, which runs four reviewers serially — each the sole editor of the branch while it runs. Serial, not parallel, by design: two reviewers writing the same worktree at once would see torn, half-edited state. Running one at a time means every reviewer reads a clean, committed tree and applies its own fixes directly — no snapshot machinery, no separate apply pass.
- /lens-debate: two structural lenses, lowy (volatility-based boundaries) and hickey (structural simplicity), review independently, cross-examine every finding to consensus, then apply the agreed fixes. Both lenses, and why kolu reviews with them, are explained in the blog post Hickey & Lowy.
- /codex-debate: codex (reviewer) and a Claude author debate the diff to consensus, each round auto-committing its fix(…).
- /simplify: the self-applying reuse, simplification, and efficiency pass over the changed code. (A built-in Claude Code skill, so this links to its docs rather than a repo.)
- /code-police: its rule-checklist and /fact-check passes, applying their fixes. It runs --no-elegance here, because the elegance pass would just re-invoke /simplify, which step 3 already ran over this same tree.
/be-review commits each step locally but pushes once at the end, then posts
the PR comments — so no comment ever advertises a commit that’s still
local-only. If the diff touches a perf-sensitive surface, a performance pass
checks it against the performance map and updates that
note when a win is banked or a new one surfaces.
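The serial, commit-locally, push-once discipline described above can be sketched with an in-memory stand-in for the branch. This is a toy model under stated assumptions, not how /be-review is actually written: `Branch`, `Reviewer`, and `run_gauntlet` are hypothetical names.

```python
from typing import Callable, List


class Branch:
    """In-memory stand-in for the PR branch (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.local: List[str] = []   # committed locally, not yet pushed
        self.remote: List[str] = []  # visible on the PR

    def commit(self, message: str) -> None:
        self.local.append(message)

    def push(self) -> None:
        self.remote.extend(self.local)
        self.local.clear()


# A reviewer edits the branch, commits its fixes, and returns its PR comments.
Reviewer = Callable[[Branch], List[str]]


def run_gauntlet(branch: Branch, reviewers: List[Reviewer]) -> List[str]:
    """Run reviewers one at a time, push once, then release their comments."""
    pending: List[str] = []
    for review in reviewers:   # serial: exactly one editor of the tree at a time
        pending.extend(review(branch))
    branch.push()              # single push after the last reviewer finishes
    return pending             # safe to post: every cited commit is now remote
```

Holding the comments in `pending` until after the push is what guarantees that no comment ever points at a commit that exists only locally.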
Ship in parallel, then close the loop
Phase 5 runs two independent lanes at once — there’s no reason to wait for green before capturing:
- CI: /ci kicks off the pipeline first, backgrounded, and drives it through the odu MCP face (/odu-mcp) in a loop of run → wait for settle → read the red node’s log → rerun. It reacts to failures the moment they land.
- Evidence: /evidence captures on-screen behavior (a screenshot or a video) for any change with a visible effect, then posts it under an ## Evidence comment. Even a backend bug fix demonstrates the now-fixed behavior.
Both lanes run on a /pu box, never locally — a prior run piled
local builds beside production kolu and the OOM-killer took production down. The
lanes rejoin before Done: CI must be green on the final HEAD and evidence
posted.
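The two lanes and their rejoin can be sketched with a thread pool. This is a minimal sketch under stated assumptions, not the real /ci or /evidence skills: `ci_lane`, `evidence_lane`, `ship`, and the `max_runs` cap are hypothetical, and the pipeline and capture steps are passed in as plain callables.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional, Tuple


def ci_lane(
    run_and_settle: Callable[[], Optional[str]],
    react: Callable[[str], None],
    max_runs: int = 5,  # assumption: the note does not state a retry cap
) -> bool:
    """Loop: run, wait for settle, read the red node's log, rerun."""
    for _ in range(max_runs):
        red_log = run_and_settle()  # None means every node settled green
        if red_log is None:
            return True
        react(red_log)              # respond to the failure, then rerun
    return False


def evidence_lane(capture: Callable[[], str]) -> str:
    """Capture the visible behavior and format it as the PR comment body."""
    return "## Evidence\n" + capture()


def ship(
    run_and_settle: Callable[[], Optional[str]],
    react: Callable[[str], None],
    capture: Callable[[], str],
) -> Tuple[bool, str]:
    """Run both lanes concurrently and rejoin before reporting."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        ci = pool.submit(ci_lane, run_and_settle, react)  # CI starts first
        evidence = pool.submit(evidence_lane, capture)
        return ci.result(), evidence.result()             # the lanes rejoin here
```

Neither lane waits on the other; only the final `result()` calls block, which mirrors the rule that Done needs both green CI and posted evidence.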
After Done, /be closes the loop with /self-improve, which runs
forked (off the main context) to mine this session’s transcript for every
point a human had to intervene. It produces nothing unless a lesson durably
recurs; when one does, it ships a small fix to the skill sources on its own
draft PR for a human to review — never on the /be branch, never merged.
This note shipped in #1565.