
odu: a CI runner for agents and humans

Local CI built on @kolu/surface and oRPC-over-ssh — it provisions real build hosts with nothing installed, holds the pipeline as live typed state you attach to from a terminal, and lets your coding agent drive the whole run over MCP.

Sridhar Ratnakumar
juspay/odu

In the last post I argued that @kolu/surface over ssh makes a new class of app cheap: install-free, ephemeral, typed, and reactive — the source of truth lives on another machine and the plumbing is identical to running locally. drishti, htop for a whole fleet with nothing installed on the remotes, was the flagship. This post is about the second app in that class, and it’s one I use every day: odu, the thing that runs Kolu’s own CI.

odu (Tamil ஓடு — run) is a CI runner. You tag one just recipe as your pipeline, point odu at a couple of machines, and it runs the recipe’s dependency DAG across all of them, posts a GitHub commit status per node, and gives you back a verdict. That part is unremarkable; every team has a tool that does it. What’s different is the shape: odu doesn’t hand you a batch job and a directory of logs. It holds the run as live, typed state you attach to — and because that state is a @kolu/surface, the same run is a terminal dashboard for me and an MCP server for my coding agent, off one definition. It’s local CI built for both of the things that read CI now: humans, and the agents working alongside them.

odu run on the left, odu attach on the right — two terminals reading one live pipeline, the recipes×platforms matrix repainting as nodes go green.

The whole thing stands on four pieces, and naming them up front is the fastest way to say what odu is: oRPC for the transport, @kolu/surface for the typed reactive state, @kolu/surface-nix-host to provision the build machines over ssh with nothing installed, and @kolu/surface-mcp to hand the whole thing to an agent. The rest of this post is what each of those buys.

Why CI should be a live service

A normal local CI tool is a translator. You give it a task graph, it compiles that graph into a batch process — for justci, the tool odu replaced, the compilation target was a process-compose document — runs it once to a terminal verdict, and leaves you log files. If you want to know what’s happening mid-run, you scrape those logs or poll a process supervisor’s socket with a separately-versioned client. The run isn’t something you can talk to; it’s a job that happens and is then over.

I wanted the other thing. An agent — and, honestly, a person — wants to attach to a running system and ask it questions: what nodes exist, which one is red, show me that node’s log, run this one again. Those aren’t batch-job operations. They’re what you’d run against a live service. So I built odu as one: the runner owns the pipeline as state and serves it, live, the entire time it’s up. A run isn’t a process you scrape after the fact. It’s a service you attach to while it’s happening.

That single decision is what makes everything downstream fall out. justci actually tried to grow an agent-facing mode and backed it out, because there was no way to express “the pipeline is registered and idle, nothing running yet” on top of a substrate that starts every recipe the moment it comes up — and no live event source to push a node’s transition as it happened. odu doesn’t hit either wall, and not because it’s cleverer. It’s the other shape. The idle DAG and the live stream aren’t features I added; they’re what “the runner owns the state” already means.

[Figure: batch translator vs. live service, the same DAG in two shapes. Batch translator: git bundle in, compile and run once with everything at once, scrape .log files out; no idle state to attach to and no live source to push from. odu as a live service: the DAG sits idle while the service is already up; attach or run starts nodes explicitly, with a live snapshot-then-deltas stream. Idle-vs-running separation is free: attach to a quiet DAG, or a busy one.]

What a run actually does

The reason odu can own the pipeline as state without a server you stand up is that the state is small and the work is remote. A run is a coordinator on your machine driving lanes on other machines over ssh. Concretely:

  • It refuses to lie. Strict by default: a real run won’t touch a dirty tree. It pins HEAD in a git worktree, and that pinned commit is what gets tested.
  • It reads your DAG from just. Exactly one recipe carries [metadata("ci")]; odu takes its dependency closure as the pipeline. No second config file describing the graph — the graph is your justfile.
  • It provisions each platform over ssh. For every machine in hosts.json, the coordinator runs nix copy to send the runner derivation over, realises it on the host, and starts odu-runner --stdio. The host then runs git fetch to pull your pushed commit into a per-SHA workspace and runs each node as just --no-deps <recipe>. A lane host needs ssh, Nix, and outbound https, and nothing else. No agent installed, no port opened, no daemon configured; the runner travels as a Nix closure and the toolchain comes from your repo’s own dev shell. This is @kolu/surface-nix-host doing exactly what it does for drishti.
  • It fans the lanes into one surface. Every lane’s state merges into a single pipeline surface, served on a unix socket (.ci/odu.sock) that status, logs, and attach dial — live.
  • It posts GitHub statuses and keeps durable logs. A commit status per <recipe>@<platform>, posted on transitions read off the state — your token never leaves your machine — and a per-SHA log file per node that survives even if the runner dies.
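To make the "dependency closure as the pipeline" step concrete, here is a minimal sketch. It is not odu's code: the Recipes shape, the recipe names, and the platform strings are invented for illustration, and it assumes the tagged root recipe is itself a node.

```typescript
// Hypothetical shape: recipe name -> direct dependencies, as one might read them from a justfile.
type Recipes = Record<string, string[]>;

// Collect the root recipe plus everything it transitively depends on,
// ordered so that each recipe comes after its dependencies.
function dependencyClosure(recipes: Recipes, root: string): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (name: string): void => {
    const seen = state.get(name);
    if (seen === "done") return;
    if (seen === "visiting") throw new Error(`dependency cycle at ${name}`);
    const deps = recipes[name];
    if (deps === undefined) throw new Error(`unknown recipe: ${name}`);
    state.set(name, "visiting");
    for (const dep of deps) visit(dep);
    state.set(name, "done");
    order.push(name);
  };

  visit(root);
  return order;
}

// One status context per node, following the <recipe>@<platform> naming described above.
function statusContexts(pipeline: string[], platforms: string[]): string[] {
  const contexts: string[] = [];
  for (const recipe of pipeline) {
    for (const platform of platforms) contexts.push(`${recipe}@${platform}`);
  }
  return contexts;
}

const recipes: Recipes = {
  ci: ["test", "e2e"], // imagine this one carries the ci metadata tag
  test: ["build"],
  e2e: ["build"],
  build: [],
  fmt: [], // not reachable from `ci`, so not part of the pipeline
};

const pipeline = dependencyClosure(recipes, "ci");
console.log(pipeline); // build, test, e2e, ci
console.log(statusContexts(pipeline, ["x86_64-linux", "aarch64-darwin"]));
```

The depth-first walk doubles as a topological sort, which is all a scheduler needs to know which nodes become runnable once their dependencies are green.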

Because odu inherited justci’s status contexts, log layout, and flag table wholesale, switching Kolu’s CI over to it (#1252) was invisible to branch protection. The DAG didn’t change; the thing watching it did.

The pipeline is three primitives

Here’s the part that ties back to the framework. odu’s entire live state is three @kolu/surface primitives — that’s the whole contract every frontend reads:

| Primitive | Call | What it carries |
| --- | --- | --- |
| Cell | surface.nodes.get({}) | The whole pipeline: one snapshot, then deltas as nodes change. |
| Stream | surface.nodeLog.get({ id }) | One node’s output: a buffered snapshot first (late subscribers replay from the top), then appends. |
| Procedure | surface.node.rerun({ id }) | The only mutation: reset a node and its transitive dependents, and reschedule. |

That’s it. A reactive cell for “what’s the state of everything,” a stream for “show me this log including what I missed,” and a single procedure for the one thing you’re allowed to change. Every feature odu has — the live dashboard, log-follow, rerunning a failed lane, the agent face — is some frontend reading those three over a typed contract, the same useCell/useCollection reactivity I get locally in Kolu, except the truth is on a build box across an ssh connection.
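A toy model of the first two primitives may help pin down what "snapshot, then deltas" and "replay from the top" mean. The NodesCell and LogStream classes below are hypothetical stand-ins written for this illustration; they are not the @kolu/surface API, and the node id is made up.

```typescript
type NodeStatus = "pending" | "running" | "passed" | "failed";
type NodesEvent =
  | { kind: "snapshot"; nodes: Record<string, NodeStatus> }
  | { kind: "delta"; id: string; status: NodeStatus };

// Cell: a subscriber first receives the current state, then one delta per change.
class NodesCell {
  private nodes: Record<string, NodeStatus> = {};
  private listeners = new Set<(event: NodesEvent) => void>();

  subscribe(listener: (event: NodesEvent) => void): () => void {
    // Snapshot and registration happen in the same synchronous step,
    // so no change can slip in between them.
    listener({ kind: "snapshot", nodes: { ...this.nodes } });
    this.listeners.add(listener);
    return () => { this.listeners.delete(listener); };
  }

  set(id: string, status: NodeStatus): void {
    this.nodes[id] = status;
    this.listeners.forEach((listener) => listener({ kind: "delta", id, status }));
  }
}

// Stream: a late subscriber replays the buffered lines, then receives appends.
class LogStream {
  private buffer: string[] = [];
  private listeners = new Set<(line: string) => void>();

  subscribe(listener: (line: string) => void): () => void {
    this.buffer.forEach((line) => listener(line));
    this.listeners.add(listener);
    return () => { this.listeners.delete(listener); };
  }

  append(line: string): void {
    this.buffer.push(line);
    this.listeners.forEach((listener) => listener(line));
  }
}

const cell = new NodesCell();
cell.set("build@linux", "running"); // happens before anyone attaches

const events: string[] = [];
const detach = cell.subscribe((e) => {
  events.push(e.kind === "snapshot" ? `snapshot ${JSON.stringify(e.nodes)}` : `delta ${e.id}=${e.status}`);
});
cell.set("build@linux", "passed");
detach();
cell.set("build@linux", "failed"); // no longer observed

const log = new LogStream();
log.append("compiling...");
const lines: string[] = [];
log.subscribe((line) => { lines.push(line); });
log.append("done");

console.log(events); // one snapshot, then one delta
console.log(lines);  // the replayed line, then the live one
```

A real implementation also has to carry this over a wire and survive reconnects, which is the part the framework is described as handling.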

Driving CI from your coding agent

The frontend I built this whole arc to reach is the MCP one. odu mcp serves odu’s surface as an MCP server over stdio, so a coding agent — Claude Code, Codex, opencode, Gemini CLI — drives CI with structured calls instead of scraping my terminal. It’s in-band, exactly like status and attach: it dials the .ci/odu.sock of a run in the current repo and predetermines no host, because which boxes run the lanes stays the coordinator’s job.

The agent gets a small, deliberate surface:

  • run starts a run and returns once it’s live.
  • wait_for_settle blocks until the run settles — or, fail-fast, the instant a node goes red. An agent shouldn’t wait twenty minutes for a nix build it’s about to throw away when the e2e lane already failed in two.
  • node_rerun resets a node and its dependents and reschedules — the only mutation, same as the surface’s.
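The semantics of the last two tools fit in a few lines. This is a sketch under assumptions, not odu's implementation: the four status names, the verdict values, and the deps map are invented here, and the real runner may track more states (skipped or cancelled nodes, for example).

```typescript
type Status = "pending" | "running" | "passed" | "failed";
type Verdict = "running" | "passed" | "failed";

// Fail-fast settle check: a single red node decides the verdict immediately;
// otherwise the run has settled only once every node has passed.
function verdict(nodes: Record<string, Status>): Verdict {
  const statuses = Object.keys(nodes).map((id) => nodes[id]);
  if (statuses.some((s) => s === "failed")) return "failed";
  if (statuses.every((s) => s === "passed")) return "passed";
  return "running";
}

// Rerun: reset a node and everything that transitively depends on it.
// `deps` maps each node to its direct dependencies (hypothetical shape).
function rerun(
  nodes: Record<string, Status>,
  deps: Record<string, string[]>,
  id: string,
): Record<string, Status> {
  const reset = new Set<string>([id]);
  let grew = true;
  while (grew) {
    grew = false;
    for (const node of Object.keys(deps)) {
      if (!reset.has(node) && deps[node].some((d) => reset.has(d))) {
        reset.add(node);
        grew = true;
      }
    }
  }
  const next: Record<string, Status> = { ...nodes };
  reset.forEach((node) => { next[node] = "pending"; });
  return next;
}

const deps = { build: [], test: ["build"], e2e: ["build"], ci: ["test", "e2e"] };
const before: Record<string, Status> = {
  build: "passed",
  test: "passed",
  e2e: "failed",
  ci: "pending",
};

console.log(verdict(before)); // failed, even though ci has not run yet
const after = rerun(before, deps, "e2e");
console.log(after);           // e2e and ci reset; build and test keep their results
console.log(verdict(after));  // running
```

Keeping the results of unaffected nodes is what makes a rerun cheap: only the failed node and whatever sits downstream of it is rescheduled.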

The pipeline snapshot and the per-node logs aren’t tools, they’re subscribable resources (surface://streams/nodes and surface://collections/logs/{id}) that fire notifications/resources/updated on every transition, so a notification-aware host gets pushed the changes. wait_for_settle is just the blocking-pull floor for hosts that don’t wake the model on a notification. The whole agent loop is four moves: run → wait_for_settle → read the red node’s log resource → node_rerun.

[Figure: one agent, two tempos: block on the verdict, or subscribe to the stream. A coding agent (Claude, Codex, opencode, …) uses tools in request/response style (run, wait_for_settle, node_rerun; wait_for_settle returns the instant a node goes red) and resources in subscribe/push style (surface://streams/nodes and surface://collections/logs/{id}, both firing notifications/resources/updated on every change). The verdict wait_for_settle returns and the values the resources push come from the same live Cell the terminal dashboard repaints from.]

Wiring it up is one stdio entry — nix run github:juspay/odu -- mcp — and repos that manage agent config with APM get it injected automatically just by depending on odu.

Claude Code driving the run over MCP on top — run, then wait_for_settle — while odu attach on the bottom shows the same pipeline going green. One surface, two readers.

One surface, three frontends

The reason the MCP face was cheap to build, and stays honest, is that it isn’t a separate integration. It’s a projection of the same surface the terminal dashboard reads. odu has procedures and detail an agent shouldn’t see; what the agent gets is a curated view: dangerous procedures dropped, logs bounded, a verdict bit derived. @kolu/surface-mcp and a primitive called projectSurface (#1270) express exactly that: derive a curated surface B from a live client of surface A, then serve B. A server that’s a client.
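The shape of that projection can be sketched without the real library. Everything below is hypothetical: RunnerSurface, AgentSurface, projectForAgent, and the shutdown procedure are names invented for this example, and the actual projectSurface signature is not shown in this post.

```typescript
type Status = "pending" | "running" | "passed" | "failed";
type Verdict = "running" | "passed" | "failed";

// Surface A: everything the runner exposes, including things an agent should not touch.
interface RunnerSurface {
  nodes(): Record<string, Status>;
  nodeLog(id: string): string[];
  rerun(id: string): void;
  shutdown(): void; // imagined "dangerous" procedure
}

// Surface B: the curated view. There is no shutdown, and a derived verdict is added.
interface AgentSurface {
  nodes(): Record<string, Status>;
  nodeLog(id: string): string[];
  rerun(id: string): void;
  verdict(): Verdict;
}

// Derive B from a client of A: pass through what is safe, bound the logs,
// compute the verdict, and simply do not forward anything else.
function projectForAgent(full: RunnerSurface, maxLogLines: number): AgentSurface {
  return {
    nodes: () => full.nodes(),
    nodeLog: (id) => {
      const lines = full.nodeLog(id);
      return maxLogLines > 0 ? lines.slice(-maxLogLines) : [];
    },
    rerun: (id) => full.rerun(id),
    verdict: () => {
      const nodes = full.nodes();
      const statuses = Object.keys(nodes).map((id) => nodes[id]);
      if (statuses.some((s) => s === "failed")) return "failed";
      return statuses.every((s) => s === "passed") ? "passed" : "running";
    },
  };
}

const rerunCalls: string[] = [];
const full: RunnerSurface = {
  nodes: () => ({ "build@linux": "passed", "e2e@linux": "failed" }),
  nodeLog: () => ["line 1", "line 2", "line 3"],
  rerun: (id) => { rerunCalls.push(id); },
  shutdown: () => { throw new Error("agents must never reach this"); },
};

const agent = projectForAgent(full, 2);
console.log(agent.verdict());             // failed
console.log(agent.nodeLog("e2e@linux"));  // only the last two lines
console.log("shutdown" in agent);         // false
```

Because the projection is built by listing what is allowed, a procedure added to the full surface later stays hidden from agents until someone opts it in.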

So the curation lives in one place — surface-land — and every frontend reads it. There are three:

  • The terminal dashboard (odu attach / odu run) — the recipes×platforms matrix, repainting live, with a focused log pane and a key to rerun the lane under the cursor.
  • The MCP server (odu mcp) — the agent face above.
  • A web dashboard — designed on the same surface, on the roadmap.

None of them owns the truth; each is a thin adapter over one contract. When the web face lands it won’t be a rewrite, it’ll be a third reader of a surface that already exists — the way drishti’s UI and Kolu’s are just readers of theirs.

[Figure: one surface → one projection → three frontends. oduSurface (full live runner state, over ssh and oRPC) goes through projectSurface to become oduAgentSurface (drop dangerous procs, bound logs, derive verdict; curated once, observer-safe), which feeds the terminal dashboard (attach · run), the MCP server (an agent drives), and the web dashboard (on the roadmap). Curation happens once, in surface-land. Every frontend reads the same projection.]

The stack underneath

Step back and odu is almost entirely the same four pieces that power drishti and Kolu, pointed at a new domain:

  • oRPC is the transport. The coordinator dials the lane’s odu-runner --stdio over ssh; the contract is base64-framed over stdout, snapshot-then-deltas on the wire, no daemon and no port. The same typed client the browser would use, talking over a pipe instead of a socket.
  • @kolu/surface is the state. I declared odu’s pipeline once — a cell, a stream, a procedure — and got it typed and reactive on every frontend, with the snapshot-then-deltas framing and race-free attach handled for me. I wrote zero subscribe-reconcile-retry code, which was the entire point of extracting surface in the first place.
  • @kolu/surface-nix-host is the provisioning. “Run my CI on that machine” becomes a nix copy of a closure and an ssh command, and the host needs nothing but ssh, Nix, and outbound https. The same property that lets drishti monitor a box you’ve never installed anything on lets odu build on one.
  • @kolu/surface-mcp is the agent face. Project the surface, serve it default-deny, add the two genuinely call-shaped tools, done.
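The post does not spell out the wire format beyond "base64-framed over stdout", so the following is only one plausible reading: each message is JSON, base64-encoded, and terminated by a newline. encodeFrame and FrameDecoder are hypothetical names, and the message shapes are made up.

```typescript
// Encode one message as a single base64 line (assumed framing: one frame per line).
function encodeFrame(message: unknown): string {
  const bytes = new TextEncoder().encode(JSON.stringify(message));
  let binary = "";
  bytes.forEach((b) => { binary += String.fromCharCode(b); });
  return btoa(binary) + "\n";
}

// Decode one complete base64 line back into a message.
function decodeLine(line: string): unknown {
  const binary = atob(line);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return JSON.parse(new TextDecoder().decode(bytes));
}

// A pipe delivers arbitrary chunks, so the decoder buffers until it sees a full line.
class FrameDecoder {
  private pending = "";

  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    this.pending = lines.pop() ?? ""; // the last piece may be an incomplete frame
    return lines.filter((line) => line.length > 0).map(decodeLine);
  }
}

const wire =
  encodeFrame({ kind: "snapshot", nodes: { "build@linux": "running" } }) +
  encodeFrame({ kind: "delta", id: "build@linux", status: "passed" });

// Split the byte stream at an arbitrary point to mimic how stdout chunks arrive.
const cut = Math.floor(wire.length / 2);
const decoder = new FrameDecoder();
const received = [
  ...decoder.push(wire.slice(0, cut)),
  ...decoder.push(wire.slice(cut)),
];
console.log(received); // both messages, in order, regardless of where the cut fell
```

Base64 keeps every frame on one line of plain ASCII, which makes a newline a safe delimiter even when a payload contains newlines of its own.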

The way I think about it now, building a CI runner stopped being “build a CI runner.” It was: declare a surface for the pipeline, provision the hosts over ssh, project the surface for agents. Each of those steps leaned on libraries I already had. odu is mostly the domain (the strict gate, the just DAG ingestion, the lane fan-in) sitting on a transport, a state model, a provisioner, and an adapter that Kolu and drishti proved out first.

What’s next

odu runs Kolu’s CI today, on Linux and macOS, driven both ways — I attach to it from a terminal, and my agent drives it over MCP. The honest edges are the Phase-1 ones: live state doesn’t yet survive a runner restart (the per-SHA logs do), it’s one run per checkout, and there’s no long-lived idle runner you can attach to before a run exists. The web dashboard and that idle-attach mode are the next surface readers, not rewrites.

If you’ve read the surface and ssh posts, odu is the answer to “what else is in that class of apps?” — and the answer turned out to be the tool I most wanted myself: CI that’s a live service, shaped once, read by whoever’s looking, human or not.


odu · @kolu/surface · @kolu/surface-nix-host · @kolu/surface-mcp · oRPC · the earlier chapters: Announcing @kolu/surface and Apps that ship themselves over ssh