
herdr vs. kolu — what to adopt

analysis · budding

A shipped Rust agent-multiplexer (herdr) makes the same first-party-owns-the-PTYs bet kolu's remote-terminals plan chose — so it's a reference implementation of R-4 Phase B, not a competitor. One handoff-discipline borrow, several validations, two gaps the plans don't cover (native resume; resize arbitration). Claims fact-checked against both codebases.

A study of ogulcancelik/herdr (cloned @ HEAD) read through kolu’s remote-terminals plan (#951 and the pty-daemon / kolu-tui / chrome-bar docs). Sibling to the Ghostex vs. kolu remote-terminals analysis. Every load-bearing claim below was fact-checked against both codebases: all herdr citations were verified, and three kolu-side claims were corrected (noted inline). Method: 13-agent workflow + adversarial critique.

Both projects are AGPL-3.0-or-later, so herdr’s open-source code is license-compatible with kolu — porting is permitted under the AGPL’s terms, not blocked. The reason most recommendations are techniques and design rather than verbatim code is the stack gap (herdr is Rust; kolu is TypeScript/SolidJS), not licensing.

herdr is a ~107K-LOC Rust TUI agent multiplexer: one binary, a long-lived background server that owns every PTY (one ghostty VT emulator + one OS-thread PTY actor per pane), and thin clients that attach/detach over a unix socket. Its hierarchy is workspaces → tabs → panes. An agent-awareness sidebar rolls each workspace up to its most urgent state (blocked / working / done / idle). A second socket exposes a JSON API so agents themselves can create panes, read output, and wait for state. It also offers named sessions, remote-over-SSH, 14+ agent integrations, and an experimental zero-downtime live handoff.
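The sidebar rollup described above can be sketched in a few lines. This is a minimal illustration, not herdr's code: the four state names come from the text, but the `Pane` and `Workspace` shapes, the `rollup` function, and the exact urgency ordering (taken here to be the order in which the states are listed) are assumptions.

```typescript
// Illustrative sketch only. State names are from the text; all types,
// the function name, and the urgency ordering are assumptions.
type AgentState = "blocked" | "working" | "done" | "idle";

// Higher number = more urgent (assumed: the order the states are listed in).
const URGENCY: Record<AgentState, number> = {
  blocked: 3,
  working: 2,
  done: 1,
  idle: 0,
};

interface Pane {
  id: string;
  state: AgentState;
}

interface Workspace {
  name: string;
  panes: Pane[];
}

// Roll a workspace up to the most urgent state among its panes.
// An empty workspace is reported as idle.
function rollup(ws: Workspace): AgentState {
  let worst: AgentState = "idle";
  for (const pane of ws.panes) {
    if (URGENCY[pane.state] > URGENCY[worst]) {
      worst = pane.state;
    }
  }
  return worst;
}
```

With this ordering, a workspace holding one idle pane, one working pane, and one blocked pane surfaces as blocked, so the sidebar draws attention to whichever agent needs a human first.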

The architectural contrast

[Diagram: module correspondence between herdr and kolu]

herdr: Rust TUI multiplexer (first-party server owns the PTYs)
  - background server (server/headless.rs)
  - PTY actor per pane (pty/actor.rs)
  - live handoff: SCM_RIGHTS fd-pass (server/handoff.rs)
  - agent detect + socket API (detect/, api/)
  - native resume: claude --resume (agent_resume.rs)
  - foreground_client_id: shared geometry (server/headless.rs)

kolu: SolidJS web ADE (R-4: pty-host is the survivor)
  - @kolu/pty-host (node-pty + @xterm/headless mirror)
  - kolu-server: provider DAG, runs fresh
  - @kolu/surface (oRPC links: ws / stdio / direct)
  - kolu-tui (raw client), shipped: list / snapshot / attach (spawn/kill = Phase 3, planned)
  - Phase B recovery (capture -> drain -> respawn), planned

Edge labels:
  - validates: thin survivor owns PTYs (A2)
  - validates: snapshot-on-attach (A3)
  - borrow DISCIPLINE; reject fd-pass (A1 / A7)
  - Done=unseen rollup + optional hooks (U1 / U2)
  - GAP: native session resume (G1)
  - GAP: resize arbitration (G2)
Module correspondence. SOLID edges = herdr validates a decision kolu already made, or a direct borrow. DASHED edges = a gap or an explicit non-goal. herdr owns one long-lived server; kolu's R-4 inverts the survivor to @kolu/pty-host while the provider DAG runs fresh in kolu-server.
Comparison by concern: herdr vs. kolu (built + planned)

Concern: Who owns PTY lifetime
  herdr: First-party long-lived server owns every master fd; clients are stateless front-ends.
  kolu (built + planned): Same bet. R-4 makes @kolu/pty-host the thin survivor; the volatile provider DAG runs fresh in kolu-server.

Concern: Survive restart
  herdr: Server outlives clients; full restart restores from a snapshot; resume_agents_on_restore respawns agents.
  kolu (built + planned): kolu-tui = client detach/reattach; Phase B = daemon survives systemctl restart via cgroup-escape + reattach-by-id. The #1034 hazard lives here.

Concern: Recovery on owner restart
  herdr: Transactional: old owner stays alive and re-binds sockets until the new one acks; one bool gates who may signal children; injected-failure tested.
  kolu (built + planned): Phase B’s composed captureSession → drainTerminals → respawn → finalize with waitForPidGone, designed to never repeat the “kill-then-pray” loss.

Concern: Late / lazy attach
  herdr: A live screen snapshot, never a byte replay: reset baseline → re-render the live emulator into one full frame.
  kolu (built + planned): Same: ptyHost.ts subscribes then serializes a snapshot | delta union (~4KB).

Concern: Renderer
  herdr: Server diffs a cell-grid → ANSI. No web terminal.
  kolu (built + planned): Raw VT → xterm.js in the browser; the headless mirror is for snapshot + taps only.
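The late-attach row is easier to see in code. The sketch below is a toy model under stated assumptions, not kolu's ptyHost.ts or herdr's renderer: `MirrorSession`, `AttachEvent`, and `applyEvent` are hypothetical names, and a line buffer that keeps only the visible rows stands in for a real headless VT emulator. It shows the two properties the comparison names: the listener is subscribed before the snapshot is serialized, and the snapshot is the current screen rather than a replay of every byte ever written.

```typescript
// Toy model: all names here are hypothetical, and a line buffer stands in
// for a real headless terminal emulator.
type AttachEvent =
  | { kind: "snapshot"; screen: string } // the current visible screen
  | { kind: "delta"; bytes: string };    // output produced after attach

type Listener = (e: AttachEvent) => void;

class MirrorSession {
  private rows: number;
  private done: string[] = []; // completed lines still on screen
  private current = "";        // line currently being written
  private listeners = new Set<Listener>();

  constructor(rows: number) {
    this.rows = rows;
  }

  // Feed PTY output into the mirror, then fan it out to attached clients.
  write(chunk: string): void {
    chunk.split("\n").forEach((part, i) => {
      if (i > 0) {
        this.done.push(this.current);
        this.current = "";
      }
      this.current += part;
    });
    // Scroll: keep only what fits on screen alongside the current line.
    while (this.done.length > this.rows - 1) {
      this.done.shift();
    }
    this.listeners.forEach((l) => l({ kind: "delta", bytes: chunk }));
  }

  // Subscribe first, then serialize, so no output can fall in between.
  attach(listener: Listener): () => void {
    this.listeners.add(listener);
    listener({ kind: "snapshot", screen: [...this.done, this.current].join("\n") });
    return () => {
      this.listeners.delete(listener);
    };
  }
}

// Client side: a snapshot replaces the view, a delta extends it.
function applyEvent(view: string, e: AttachEvent): string {
  return e.kind === "snapshot" ? e.screen : view + e.bytes;
}
```

A client that attaches after three lines were written to a two-row session receives only the two visible lines as its snapshot, followed by deltas for anything written afterwards. The scrolled-off line is never sent, which is the difference between a screen snapshot and a byte replay.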

Architecture — what to adopt

UX — what to adopt

Gaps the plans don’t cover

These are the highest-leverage items: herdr ideas that no current plan covers. One is real and clean; one is a pre-existing condition the plans never arbitrate.
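To make the resize-arbitration gap concrete, here is one possible arbiter policy, sketched under assumptions. It is loosely inspired by the idea that herdr's foreground_client_id suggests (one client's geometry is authoritative at a time), but it does not reproduce herdr's logic, and `ResizeArbiter`, its methods, and the fallback rule on detach are all hypothetical. Other policies, such as smallest-client-wins, would be equally valid choices.

```typescript
// Hypothetical sketch of a foreground-wins resize policy. None of these
// names come from herdr or kolu.
interface Size {
  cols: number;
  rows: number;
}

class ResizeArbiter {
  private sizes = new Map<string, Size>(); // last size each client reported
  private foreground: string | null = null;
  private applied: Size | null = null;     // size last pushed to the PTY
  private apply: (size: Size) => void;

  // `apply` is whatever actually resizes the PTY; it is injected so the
  // policy can be exercised without a real terminal.
  constructor(apply: (size: Size) => void) {
    this.apply = apply;
  }

  // A client reports its viewport. The first reporter becomes foreground.
  report(clientId: string, size: Size): void {
    this.sizes.set(clientId, size);
    if (this.foreground === null) {
      this.foreground = clientId;
    }
    this.sync();
  }

  // A known client takes focus, so its geometry becomes authoritative.
  focus(clientId: string): void {
    if (!this.sizes.has(clientId)) return;
    this.foreground = clientId;
    this.sync();
  }

  // A client leaves. If it was foreground, fall back to any remaining one.
  detach(clientId: string): void {
    this.sizes.delete(clientId);
    if (this.foreground === clientId) {
      this.foreground = Array.from(this.sizes.keys())[0] ?? null;
    }
    this.sync();
  }

  // Resize the PTY only when the foreground client's size differs from
  // what was last applied.
  private sync(): void {
    if (this.foreground === null) return;
    const want = this.sizes.get(this.foreground);
    if (want === undefined) return;
    if (
      this.applied !== null &&
      this.applied.cols === want.cols &&
      this.applied.rows === want.rows
    ) {
      return;
    }
    this.applied = { cols: want.cols, rows: want.rows };
    this.apply(this.applied);
  }
}
```

Under last-resize-wins, a background tab that reflows would resize the PTY out from under the client the user is actually typing in. In this sketch a background client's report is only recorded, and it takes effect when that client gains focus or the foreground client detaches.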

What to do next

  1. Phase B, now (low risk): adopt the transactional handoff discipline (A1), the single-owner kill invariant (A2), the snapshot invariant test (A3), the two-axis honest-state + inline-recovery-hint (A4). Add the SCM_RIGHTS non-goal note (A7).
  2. Close G2 (multi-client resize arbitration) — Phase 2 shipped with documented last-resize-wins; an arbiter (and the size-change tap attach.ts already names) is still open.
  3. kolu-tui: A5’s socket path and G3’s attach TTY-guard shipped (#1084, #1255); carry the non-tty contract forward to Phase 3’s kill/spawn.
  4. UX: ship the attention rollup with Done = unseen (U1), keeping unread-bytes distinct from turn-finished; fold the navigator into the palette (U3).
  5. Investigate G1 (native --resume) — the clearest missed adoptable; builds on data kolu already has.
  6. U2’s blocked signal: #905 shipped (screen-scrape awaiting_user) — remaining: map awaiting_user into the U1 rollup as Blocked. Later: R-2 reattach-hint UX (A6).
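Item 1's handoff discipline and single-owner kill invariant can be illustrated with a small synchronous model. This is a sketch under assumptions, not herdr's handoff.rs or kolu's Phase B code: the `Owner` shape, the `handoff` function, and the convention that the successor's start function returns only once it has acknowledged are all hypothetical. The point is the shape of the transaction: the old owner gives up nothing it cannot take back until the new owner has acked, and at no time do two owners hold permission to signal the children.

```typescript
// Hypothetical model of a transactional owner handoff. Names and shapes
// are illustrative only.
interface Owner {
  name: string;
  listening: boolean;       // currently bound to the client socket
  mayKillChildren: boolean; // the single flag gating who may signal children
}

// Starts the successor and returns it only after it has acknowledged.
// Throws if the successor fails before acking.
type StartSuccessor = () => Owner;

function handoff(old: Owner, startSuccessor: StartSuccessor): Owner {
  // Step 1: release the socket so the successor can bind it.
  old.listening = false;
  try {
    // Step 2: wait for the successor's ack (modeled as a normal return).
    const next = startSuccessor();
    // Step 3: commit. Clear the old flag before setting the new one, so
    // there is never a moment when both owners may kill children.
    old.mayKillChildren = false;
    next.mayKillChildren = true;
    next.listening = true;
    return next;
  } catch {
    // Rollback: the old owner re-binds and keeps kill authority.
    old.listening = true;
    return old;
  }
}
```

Passing a start function that throws plays the role of an injected failure: the call returns the original owner, still listening and still the only one allowed to signal children, so a failed upgrade leaves the running terminals untouched.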

Net: herdr is the reference implementation for the survivor kolu already chose to build — most of its architecture validates R-4 rather than redirecting it, with one battle-tested checklist to harden Phase B (A1) and one explicit non-goal to write down (A7). The durable surprises are the two gaps: native session resume (shipped in weaker form) and multi-client resize arbitration (latent in an already-endorsed feature).