Remote terminals over SSH
The phased master plan for kolu#951 — remote terminals over SSH. The foundation is shipped — the TerminalBackend seam (R-1), the @kolu/surface framework + @kolu/surface-nix-host (R-1.5), the shared provider engine (R-1.6), and local PTY daemonization (R-4, the kaval daemon — terminals survive a deploy). What remains is the ssh transport + ChromeBar host switcher (R-2) then network-blip resilience (R-3). The end state is multi-host — 1 local + N ssh-remote kaval daemons, all surviving and reattachable, switchable from the ChromeBar.
The master plan for #951, remote terminals over SSH, shipped as phases on different volatility axes. The foundation is done: R-1 (#981), R-1.5 (#984), R-1.6 (#1004), and R-4 (local PTY daemonization) are all on master. R-4 is the big one: the long-lived survivor is kaval (Tamil kāval, watch/guard), a standalone PTY daemon kolu spawns and supervises, and local terminals now survive a kolu deploy: process, scrollback, and a running agent all reattach. Its design and full build history (A1 → A2 → B0 → B1 → B2 → B3.1–B3.4, the currency nudge last in #1353) live in its own note, pty-daemon. R-4 was deliberately re-sequenced before R-2/R-3: a whole-plan /lowy pass showed that building the multi-client survivor once collapses R-2 + R-3 from ~1860 to ~740 LoC combined. Two phases remain: R-2 adds the ssh transport against the same backend; R-3 closes the network-blip resilience loop.
The end state is multi-host: every host — the local machine and each ssh remote — runs the same hashed kaval daemon (one process entry, single-instance pid-gate included), and kolu-server keys its pty-host endpoint + status by hostId (a map of one — local — today, host-count-agnostic by construction). The user switches hosts from the ChromeBar — kinda like tmux sessions — and kolu reattaches to local and remote daemons after a server restart or a network blip. R-4’s redo baked the host-count-agnostic shapes in (hostId-keyed status collection, per-host endpoint lifecycle, id-keyed whole-record adoption, and the TerminalLocation discriminator at the single getTerminalBackendFor dispatch seam); R-2 then adds the ssh driver as an additive sibling of R-4’s local driver — never a retrofit.
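A minimal sketch of what "keyed by hostId, a map of one today" can look like. The names `HostId`, `HostStatus`, and `PtyHostRegistry`, the status states, and the endpoint string are all hypothetical illustrations, not kolu's actual types; only the hostId-keyed shape comes from the text above.

```typescript
// Hypothetical names throughout; only the hostId-keyed shape is from the plan.
type HostId = string;

type HostStatus =
  | { state: "connecting" }
  | { state: "ready"; endpoint: string }
  | { state: "degraded"; reason: string };

class PtyHostRegistry {
  // Keyed by hostId from day one: today the map would hold only "local".
  private readonly hosts = new Map<HostId, HostStatus>();

  set(hostId: HostId, status: HostStatus): void {
    this.hosts.set(hostId, status);
  }

  get(hostId: HostId): HostStatus | undefined {
    return this.hosts.get(hostId);
  }

  // Callers iterate the collection and never special-case "local".
  readyHosts(): HostId[] {
    const ready: HostId[] = [];
    this.hosts.forEach((status, id) => {
      if (status.state === "ready") ready.push(id);
    });
    return ready;
  }
}

const registry = new PtyHostRegistry();
// The socket path is a made-up placeholder.
registry.set("local", { state: "ready", endpoint: "unix:/run/kaval.sock" });
// Adding a remote later is one more entry, not a new code path.
registry.set("ssh:build-box", { state: "connecting" });
```

Because nothing in the registry knows how many hosts exist, the step from one host to N is a data change rather than a structural one, which is the property the plan calls host-count-agnostic.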
Migrated + heavily compressed from the 208 KB docs/plans/remote-terminals.html monolith (Zed v1.5.0, 3a81f8e9, was the studied reference; the file:line citation list lives in git history). Cf. Ghostex vs. kolu remote-terminals and herdr vs. kolu.
Phases at a glance
| Phase | Gist | User-facing change | Status |
|---|---|---|---|
| R-1 | Server spine collapses around one TerminalBackend interface; the meta/*.ts orchestrators dissolve into LocalTerminalBackend. | None — byte-identical refactor. | shipped #981 |
| R-1.5 | @kolu/surface grows the stdio + loopback links, peer-server, and in-memory pub/sub; new @kolu/surface-nix-host (ssh HostSession + .drv-provisioning). Validated by the remote-process-monitor demo. | None — framework capability. | shipped #984 |
| R-1.6 | The per-terminal provider DAG extracts into terminalBackend/providers.ts behind ProviderHooks — one engine, two future hosts. | None — byte-identical refactor. | shipped #1004 |
| R-4 | Local PTYs move into kaval, a standalone surviving daemon kolu spawns + supervises; the survival chain (adoption + the currency nudge) lands. Owned by pty-daemon (A1–B3.4, all on master). | Local terminals survive a kolu-server restart; replaces tmux/zmx and closes #671. | shipped |
| R-2 | Remote via the kaval-endpoint approach — one backend bound to a dialed endpoint (no RemoteTerminalBackend); the ssh driver reaches + provisions the same kaval closure, and the canvas multiplexes local + ssh kavals. See kaval-sessions. | Remote terminals on the canvas, beside local ones. | next |
| R-3 | Close the overflow-recovery loop (a slow-subscriber drop must re-attach for a fresh snapshot, not freeze); persist remoteSessionId. The reattach + handshake machinery itself shipped with R-4. | Remote terminals survive network blips with full scrollback. | last |
Volatility axes — the two that the remaining phases extend
Lowy’s question — what changes for its own reasons, and is each axis isolated to one module? R-4 closed out the two axes it owned (where the PTY process runs + how a host’s daemon is supervised/restarted — now detailed in pty-daemon’s own axes table). The two cross-phase axes R-2 and R-3 still hang on:
| Axis | Encapsulated by |
|---|---|
| Where a terminal’s state lives (this machine vs an SSH host) | One backend bound to a TerminalLocation-resolved kaval endpoint — { kind: "local" } today, { kind: "remote", host } in R-2 (no second backend implementation — R-2 confirms the kaval-endpoint approach, kaval-sessions). The location resolver is the sole place that maps a tile to its kaval; everything downstream talks to the backend and never asks “which kind?”. |
| How the backend reaches its agent (transport, framing, reconnect) | HostSession + @kolu/surface/links/stdio — unix-socket/loopback today, ssh in R-2, mTLS-over-TCP conceivable later. |
Hickey’s complement: the single-TerminalBackend cut keeps zero domain knowledge crossing the transport boundary — the agent runs the unmodified providers and the boundary just ships their output. An earlier draft’s per-domain remote providers (RemoteGitInfoProvider, …) were each a transport adapter wearing a domain costume; the seam dissolved them.
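The first axis can be sketched as a discriminated union resolved in exactly one place. The two `TerminalLocation` shapes are taken from the table above; `KavalEndpoint`, `resolveKavalEndpoint`, `describeEndpoint`, and the transport labels are hypothetical names for illustration.

```typescript
// The two location shapes come from the axes table; the rest is assumed.
type TerminalLocation =
  | { kind: "local" }
  | { kind: "remote"; host: string };

// Hypothetical endpoint descriptor: what a single backend would dial.
interface KavalEndpoint {
  hostId: string;
  transport: "unix-socket" | "ssh-stdio";
}

// The one place that inspects `kind`.
function resolveKavalEndpoint(location: TerminalLocation): KavalEndpoint {
  switch (location.kind) {
    case "local":
      return { hostId: "local", transport: "unix-socket" };
    case "remote":
      return { hostId: `ssh:${location.host}`, transport: "ssh-stdio" };
    default: {
      // Exhaustiveness check: a new kind fails to compile here.
      const unreachable: never = location;
      throw new Error(`unknown location: ${JSON.stringify(unreachable)}`);
    }
  }
}

// Downstream code takes an endpoint and never asks "which kind?".
function describeEndpoint(endpoint: KavalEndpoint): string {
  return `${endpoint.hostId} via ${endpoint.transport}`;
}
```

The `never` assignment in the default branch is a common TypeScript idiom: adding a third location kind without extending the resolver becomes a type error instead of a silent fallthrough.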
The foundation — what shipped (R-1 · R-1.5 · R-1.6 · R-4)
R-1 — the TerminalBackend seam #981
A TerminalBackend is the per-terminal world a terminal lives in: what process owns the PTY, what filesystem the Code tab reads, where the git watcher and agent detectors observe. The interface (packages/common/src/terminalBackend.ts): spawnPty / terminalChannel<K> / killTerminal / uploadFile plus fs and git sub-surfaces. Invariants that carry forward: kill convergence (killTerminal is the sole termination path — no dispose()); snapshot-then-delta on every stream (what makes reconnect transparent); the backend owns its filesystem; sync shadow entry, async I/O (the instant-tile UX). LocalTerminalBackend absorbed every meta/*.ts orchestrator; getTerminalBackendFor(location) is the single dispatch point.
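The snapshot-then-delta invariant is the one that makes reconnect transparent, so here is a small synchronous sketch of it. The `StreamEvent` shape and the `SnapshotThenDelta` class are hypothetical; the actual `terminalChannel<K>` contract is not reproduced in this note.

```typescript
// Hypothetical event shape, not the real terminalChannel<K> contract.
type StreamEvent<S, D> =
  | { type: "snapshot"; state: S }
  | { type: "delta"; delta: D };

class SnapshotThenDelta<S, D> {
  private state: S;
  private readonly apply: (state: S, delta: D) => S;
  private readonly listeners = new Set<(e: StreamEvent<S, D>) => void>();

  constructor(initial: S, apply: (state: S, delta: D) => S) {
    this.state = initial;
    this.apply = apply;
  }

  // Returns a detach function.
  subscribe(listener: (e: StreamEvent<S, D>) => void): () => void {
    // The first event for every subscriber is the full current state.
    listener({ type: "snapshot", state: this.state });
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  publish(delta: D): void {
    this.state = this.apply(this.state, delta);
    this.listeners.forEach((l) => l({ type: "delta", delta }));
  }
}

const scrollback = new SnapshotThenDelta<string, string>("", (s, d) => s + d);
scrollback.publish("$ ls\n");

// A client that attaches late (or re-attaches) rebuilds its view
// from the stream alone, with no side channel for history.
let view = "";
const detach = scrollback.subscribe((e) => {
  view = e.type === "snapshot" ? e.state : view + e.delta;
});
scrollback.publish("README.md\n");
detach();
```

A reconnecting client runs the same code path as a first-time client, which is why a reattach needs no special handling on the consumer side.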
R-1.5 — @kolu/surface gaps + @kolu/surface-nix-host #984
Closed the framework’s browser-coupling gaps generically: the stdio link (base64+newline framing over a Readable/Writable pair), the loopback pair, the peer server (serveOverStdio), and the in-memory pub/sub family (inMemoryChannelByName is load-bearing — without name-dedupe, publish and subscribe land on different channels and every delta is lost). The new @kolu/surface-nix-host package extracted the wholly-generic remote machinery R-2 consumes whole: HostSession<C> (one ref-counted ssh subprocess per host, state machine, reconnect), provisionAgent (nix copy --derivation + remote realise — ships the derivation, so a darwin parent drives a linux remote with no cross-builder), resolveSystem (the nix-system probe), mirrorRemoteCollection, waitForNextClient. The falsifiability gate was the remote-process-monitor demo — the same three-tier shape R-2 will have, exercising stdio-over-ssh, snapshot-then-delta across a network hop, and deferred heartbeat surviving multi-minute cold realisation.
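The base64+newline framing used by the stdio link can be illustrated as follows. This is a sketch of the general technique, not the `@kolu/surface` implementation: `encodeFrame` and `FrameDecoder` are hypothetical names, and the real link works over a Readable/Writable pair rather than plain strings.

```typescript
// Hypothetical helpers illustrating base64+newline framing.
function encodeFrame(payload: Uint8Array): string {
  let binary = "";
  payload.forEach((byte) => {
    binary += String.fromCharCode(byte);
  });
  // Base64 output never contains "\n", so newline is a safe delimiter.
  return btoa(binary) + "\n";
}

class FrameDecoder {
  private buffer = "";

  // Feed an arbitrary chunk; returns every frame that chunk completes.
  push(chunk: string): Uint8Array[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete frame (possibly empty); keep it.
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.length > 0)
      .map((line) => {
        const binary = atob(line);
        const bytes = new Uint8Array(binary.length);
        for (let i = 0; i < binary.length; i++) {
          bytes[i] = binary.charCodeAt(i);
        }
        return bytes;
      });
  }
}

const utf8 = new TextEncoder();
const wire =
  encodeFrame(utf8.encode("hello")) + encodeFrame(utf8.encode("wörld\n"));

// Split the wire bytes at an arbitrary point to mimic a chunked pipe.
const decoder = new FrameDecoder();
const mid = Math.floor(wire.length / 2);
const frames = [
  ...decoder.push(wire.slice(0, mid)),
  ...decoder.push(wire.slice(mid)),
];
const decoded = frames.map((f) => new TextDecoder().decode(f));
```

The second payload deliberately contains a newline and a non-ASCII character: both survive because the delimiter only ever appears between encoded frames, never inside one.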
R-1.6 — the shared provider engine #1004
The per-terminal provider DAG (agent detectors, git/PR watchers, command tracker, foreground observer) moved out of LocalTerminalBackend into terminalBackend/providers.ts, parameterized over ProviderHooks so the same engine runs on kolu-server and a future remote agent. The type-fence is load-bearing: writing a live field through updateServerMetadata is a compile error, so the terminals:dirty autosave firehose can’t be reintroduced by a new provider.
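One way such a type-fence can be built is by splitting persisted and live fields into separate types and giving each its own writer. The field names and the `updateLiveMetadata` helper below are hypothetical; only `updateServerMetadata` and the persisted-versus-live split come from the text above, and the real signature may differ.

```typescript
// Hypothetical field names; only the persisted/live split is from the plan.
interface PersistedMetadata {
  title: string;
  cwd: string;
}
interface LiveMetadata {
  foreground: string | null;
  agentState: "idle" | "running";
}
type TerminalRecord = PersistedMetadata & LiveMetadata;

let autosaves = 0;

// Accepts persisted keys only, so every call is a legitimate autosave.
function updateServerMetadata(
  record: TerminalRecord,
  patch: Partial<PersistedMetadata>,
): void {
  Object.assign(record, patch);
  autosaves += 1; // stands in for emitting terminals:dirty
}

// Live fields take a separate path that never marks the record dirty.
function updateLiveMetadata(
  record: TerminalRecord,
  patch: Partial<LiveMetadata>,
): void {
  Object.assign(record, patch);
}

const record: TerminalRecord = {
  title: "shell",
  cwd: "/home",
  foreground: null,
  agentState: "idle",
};
updateServerMetadata(record, { cwd: "/srv" });
updateLiveMetadata(record, { foreground: "vim" });

// Rejected by the type checker (excess property on an object literal):
// updateServerMetadata(record, { foreground: "vim" });
```

In this sketch the fence relies on TypeScript's excess-property check for object literals; a production fence would likely need something stricter to also catch patches built up in variables.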
R-4 — kaval, the surviving daemon shipped · A1–B3.4
Local PTYs moved into kaval, a standalone hashed daemon kolu spawns and supervises over a unix socket. A kolu-server restart re-runs the cheap layer (the providers) fresh against the surviving PTYs, so detection is never stale while the terminals persist — only a rare wire-contract change forces terminal loss. The full arc — the spawn-policy inversion (B0), the kaval binary + client (B1), the topology flip with the honest degraded state (B2), and the survival chain (B3.1 refactor → B3.2 supervised restart → B3.3 adoption → B3.4 currency nudge) — plus the #1034 postmortem that constrains it, all live in pty-daemon. What matters to this plan is what R-4 left behind for R-2/R-3: a hostId-keyed endpoint + status model, id-keyed whole-record adoption, the per-connect system.version handshake, and B0’s fully-specified spawn {argv, env, initFiles} + system.info wire — every one of them shaped host-count-agnostic.
What’s next — R-2 then R-3
Everything before R-2 is shipped, so the remaining work is narrow and additive. R-4 and @kolu/surface-nix-host already paid for the hard parts:
R-2 — remote terminals, the kaval-endpoint way (confirmed; see kaval-sessions). A kaval is a dialed endpoint, so remote is not a second backend: one backend bound to an endpoint, never complected with location — P0 collapses R-1’s anticipated second implementation. The one genuinely-new piece is the ssh driver — reach + provision of the same kaval closure via @kolu/surface-nix-host — proven first in kaval-tui (create → --host, hosts auto-detected from ssh config). The canvas then multiplexes kavals: a tile per kaval, local always present, remotes on demand. The P0–P4 phases, the remote-only seams (the composeSpawnInput cleanEnv()→system.info inversion, the paste/upload PRECONDITION_FAILED guard, the drvPath resolver via resolveSystem), and the switch-vs-multiplex call all live in the child note.
R-3 — close the overflow-recovery loop (~130 LoC with tests). This is the one genuinely remaining item, and it can be tested locally against master. kaval’s bounded channel sheds a slow attach subscriber by ending its iterator, which today is indistinguishable on the wire from a PTY exit, so the client treats the drop as terminal and silently freezes scrollback (onOverflow exists on the channel but is unwired). The fix threads a typed “overflow → re-attach” signal through the attach contract so the client re-attaches for a fresh snapshot (reusing the existing onRetry xterm/scroll-lock reset), distinct from PTY-exit. Everything else R-3 once owned has already landed or moved: the reattach machinery and the system.version handshake shipped with R-4; remoteSessionId persistence (an additive ServerPersistedTerminalFieldsSchema field) and host-picker pre-warm are inert until R-2’s ssh driver consumes them, so they fold into R-2; the --refresh force-fresh-eval hack lived only in the discarded #994 prototype, so there is nothing to retire against master. (Open question: whether an overflow-drop is reachable on the local socket at all, or only meaningfully across an ssh hop; if the latter, R-3 collapses to an R-2 companion. herdr vs. kolu independently flagged the same missing overflow-vs-exit distinction.)
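The overflow-versus-exit distinction can be sketched with a toy bounded channel. Everything here is hypothetical (`AttachEnd`, `BoundedAttachChannel`, `AttachClient`, the queue-based delivery); it illustrates the shape of the fix described above, not kaval's attach contract.

```typescript
// Hypothetical end-of-stream reasons: the point is that they differ.
type AttachEnd =
  | { reason: "exit"; exitCode: number }
  | { reason: "overflow" };

interface Subscriber {
  queue: string[];
  onEnd: (end: AttachEnd) => void;
}

class BoundedAttachChannel {
  private scrollback = "";
  private readonly subscribers = new Set<Subscriber>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Attach hands back a fresh snapshot plus a queue for later deltas.
  attach(onEnd: (end: AttachEnd) => void): { snapshot: string; sub: Subscriber } {
    const sub: Subscriber = { queue: [], onEnd };
    this.subscribers.add(sub);
    return { snapshot: this.scrollback, sub };
  }

  write(data: string): void {
    this.scrollback += data;
    // Iterate a copy: a re-attach during this loop must not be fed
    // data its snapshot already contains.
    Array.from(this.subscribers).forEach((sub) => {
      if (sub.queue.length >= this.capacity) {
        this.subscribers.delete(sub);
        sub.onEnd({ reason: "overflow" }); // shed, but say why
      } else {
        sub.queue.push(data);
      }
    });
  }

  exit(exitCode: number): void {
    Array.from(this.subscribers).forEach((sub) =>
      sub.onEnd({ reason: "exit", exitCode }),
    );
    this.subscribers.clear();
  }
}

class AttachClient {
  view = "";
  exited = false;
  reattaches = 0;
  private sub: Subscriber | undefined;
  private readonly channel: BoundedAttachChannel;

  constructor(channel: BoundedAttachChannel) {
    this.channel = channel;
    this.attach();
  }

  private attach(): void {
    const { snapshot, sub } = this.channel.attach((end) => {
      if (end.reason === "overflow") {
        this.reattaches += 1;
        this.attach(); // fresh snapshot replaces the stale view
      } else {
        this.exited = true; // only a real PTY exit is terminal
      }
    });
    this.view = snapshot;
    this.sub = sub;
  }

  // Drain queued deltas; a slow client simply calls this rarely.
  drain(): void {
    if (!this.sub) return;
    this.view += this.sub.queue.join("");
    this.sub.queue.length = 0;
  }
}

const channel = new BoundedAttachChannel(2);
const client = new AttachClient(channel);
channel.write("a");
channel.write("b");
channel.write("c"); // queue is full and undrained: overflow, then re-attach
client.drain();
```

With an untyped end-of-stream, the third write would leave the client frozen at an empty view; with the typed reason, the client recovers the full scrollback from the new snapshot and stays attached.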
History
- 2026-05-26 → 28: R-1 (#981), R-1.5 (#984), R-1.6 (#1004) shipped: the seam, the framework gaps + @kolu/surface-nix-host, and the shared provider engine. The earlier hand-rolled-stdio R-2 exploration (#976) closed as wrong-altitude; the #994 ssh prototype (working remote terminals, 46 commits, never merged) seeded the design and was superseded by this carve.
- R-4 re-sequenced before R-2/R-3 (the /lowy collapse insight) and split into its own note. Its full build history (R4a/R4b, the #1010 local-daemon and #994 remote prototypes, the #1031/#1034 production postmortems, the kaval reframing, and B0 → B3.4) lives in pty-daemon, not here.
- Multi-host direction amended (2026-06-11): the end state is 1 local + N ssh-remote kaval daemons, host-switched in the ChromeBar; R-4’s redo bakes the host-count-agnostic shapes so R-2 retrofits nothing.
- This revamp (2026-06-14): compacted now that R-4 ships the whole survival chain — the prototype retros, process lessons, and R-4-internal volatility axes moved to their real owner (pty-daemon) — and R-2/R-3 re-scoped against master (R-2 = a driver behind a shipped seam; R-3 = the overflow-recovery loop, the rest folding into R-2).