kaval-tui that roams — remote attach that survives the network
Today kaval-tui attaches to a remote kaval over ssh stdio (TCP — it drops the moment you change networks). zmosh's idea, applied to kaval — teach kaval to also bind an encrypted-UDP listener beside its unix socket, so you attach once and keep the session through Wi-Fi↔cellular and sleep/wake, with no reconnect and no lost scrollback. No extra process; kaval already owns the hard half (server-side VT + snapshot-then-delta); only the transport changes.
kaval-tui attaches to kaval sessions — local over a unix socket, and (R-2) remote over ssh. zmosh (mmonad/zmosh) is a session daemon whose remote attach survives Wi-Fi↔cellular and sleep/wake with no reconnect. This note: give kaval-tui the same. 24-agent workflow · verified vs. both codebases
See it
Same command, today vs. with roaming. The difference is the 20 minutes you don’t notice.
That is the whole user-facing win: attach once, and the session is yours until you kill it — across network switches, VPN flips, and a closed lid. No ⚠, no spinner, no lost output.
One hop, one swap
kaval-tui ↔ kaval is a single hop — exactly zmosh’s shape. Local attach (unix socket) is already roam-proof; only the remote transport changes. Nothing about kaval the daemon, your shell, or detach/reattach moves.
- Local: kaval-tui attach is untouched (unix socket, same machine, nothing to roam).
- Remote: kaval also binds an encrypted-UDP (QUIC) listener, a serveOverUdp link beside today’s serveOverUnixSocket, serving the same shared router. No extra process. Started over ssh it hands off its port + cert (à la mosh); the ssh pipe closes and QUIC carries the session.
Already half-built
The expensive half of “mosh for a session daemon” is keeping a server-side terminal-state authority so a reconnecting client can be re-hydrated. kaval has it — @xterm/headless mirrors every byte and attach() hands back a race-free snapshot-then-delta (packages/kaval/src/ptyHost.ts:565-573). On overflow it sheds the wedged consumer and recovers with a fresh snapshot (packages/kaval/src/channel.ts:120-131). That is zmosh’s recovery model, already shipped.
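A minimal sketch of the snapshot-then-delta idea, to make the recovery model concrete. This is a toy model, not kaval's ptyHost or channel code: SessionAuthority, Subscriber, and replay are hypothetical names, a plain string stands in for the headless terminal state, and "shed the wedged consumer" is simplified to "drop its backlog and queue one fresh snapshot".

```typescript
// Toy model of a server-side state authority with snapshot-then-delta attach.
// All names are hypothetical; this is not kaval's implementation.

type Frame =
  | { kind: "snapshot"; seq: number; screen: string }
  | { kind: "delta"; seq: number; data: string };

class Subscriber {
  readonly queue: Frame[] = [];
  readonly maxQueued: number;
  constructor(maxQueued: number) {
    this.maxQueued = maxQueued;
  }
}

class SessionAuthority {
  private screen = ""; // stand-in for a headless terminal's serialized state
  private seq = 0;
  private readonly subscribers = new Set<Subscriber>();

  // Every output chunk updates the authority first, then fans out as a delta.
  write(data: string): void {
    this.screen += data;
    this.seq += 1;
    for (const sub of this.subscribers) {
      if (sub.queue.length >= sub.maxQueued) {
        // Overflow: drop the backlog and recover with one fresh snapshot.
        sub.queue.length = 0;
        sub.queue.push({ kind: "snapshot", seq: this.seq, screen: this.screen });
      } else {
        sub.queue.push({ kind: "delta", seq: this.seq, data });
      }
    }
  }

  // Snapshot and subscription happen in one synchronous step, so no write
  // can land between them: the first delta always follows the snapshot.
  attach(maxQueued = 4): Subscriber {
    const sub = new Subscriber(maxQueued);
    sub.queue.push({ kind: "snapshot", seq: this.seq, screen: this.screen });
    this.subscribers.add(sub);
    return sub;
  }
}

// Client side: replaying the queue reproduces the authority's screen.
function replay(frames: Frame[]): string {
  let screen = "";
  for (const f of frames) {
    screen = f.kind === "snapshot" ? f.screen : screen + f.data;
  }
  return screen;
}
```

The point of the sketch is that a slow or reconnecting client never needs the full history: whatever it missed is folded into the next snapshot, which is why swapping the transport does not touch the recovery logic.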
So the work is only the transport — the daemon, the VT, the wire contract (ptyHostSurface · contract 3.0), and kaval-tui’s attach loop all stay as they are.
The transport — QUIC
The remote wire should be QUIC — and not as a hedge. QUIC is UDP + TLS 1.3 + reliable, multiplexed, ordered streams (RFC 9000/9001/9002). Two of its properties are exactly this problem:
- Connection migration is the roaming (RFC 9000 §9, §5.1, verified). A QUIC connection is keyed by a Connection ID in the packet, not the 4-tuple, so an IP/port change (Wi-Fi→cellular, NAT-rebind on wake) keeps the same connection after a one-round-trip PATH_CHALLENGE/PATH_RESPONSE. No re-dial, no re-handshake. Only the client migrates, which is exactly kaval-tui’s direction.
- A QUIC stream is reliable + ordered, so kaval’s wire rides it unchanged. kaval’s base64+newline oRPC already runs over a reliable duplex (stdioLink); a QUIC bidi stream is that duplex: hand it to the same codec, zero reassembly shim.
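To illustrate why a base64+newline wire needs nothing more than a reliable, ordered byte stream, here is a small framing sketch. It is an assumption-laden illustration, not the stdioLink codec: encodeFrame and FrameDecoder are hypothetical names, and the real message envelope is not modeled.

```typescript
import { Buffer } from "node:buffer";

// Encode one message as a base64 line. Base64 output never contains "\n",
// so the newline is an unambiguous frame delimiter.
function encodeFrame(payload: Uint8Array): string {
  return Buffer.from(payload).toString("base64") + "\n";
}

// Incremental decoder. A reliable, ordered stream may split or merge writes
// at arbitrary byte offsets, so partial lines are buffered until complete.
class FrameDecoder {
  private pending = "";

  push(chunk: string): Uint8Array[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an unfinished line ("" if the chunk ended on "\n").
    this.pending = lines.pop() ?? "";
    return lines
      .filter((line) => line.length > 0)
      .map((line) => new Uint8Array(Buffer.from(line, "base64")));
  }
}
```

Because the decoder only assumes in-order delivery without loss, any transport that offers that guarantee (a pipe, a unix socket, or a QUIC bidi stream) can sit underneath it without a reassembly layer.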
So QUIC isn’t “zmosh’s idea, maybe” — it is zmosh’s design (per-packet AEAD, roaming, reliable transport), standardized, fuzzed, interop-tested, with path-spoofing defenses the bare seq rule lacks. You write none of it. That’s why hand-rolling a UDP transport is the wrong call: raw UDP forces you to rebuild sequencing/retransmit/ARQ by hand — the one piece mosh got to skip (it discards lost frames; an RPC byte-stream can’t).
The one honest caveat is the Node library, not the protocol (verified, mid-2026): there is no turnkey-perfect QUIC for Node yet.
| Option | State | Call |
|---|---|---|
| node:quic (Node built-in, ngtcp2) | generic bidi streams ✓, but experimental (“Stability 1.0”): needs a custom Node build (compile-time --experimental-quic + OpenSSL 3.5) plus a runtime flag; landed ~Node 26.2 | the target: bet on the standard; own the build |
| @matrixai/quic (quiche bindings) | real ReadableStream/WritableStream ✓, but ~15 mo stale, single-sponsor, no linux-arm64 prebuilt (aarch64 compiles quiche from Rust) | prototype only: validates the design, too thin to depend on |
| hand-rolled crypto + ARQ | rejected — rebuilds what QUIC ships |
Doing it properly means node:quic, with the QUIC-enabled Node owned in the Nix closure — a clean cost that shrinks as the flag graduates — not an under-maintained binding. Nix is the ideal place to own that runtime build.
Phases
Prerequisite — node:quic in Nix (day one, not a phase). kolu is Nix-first, so the QUIC-enabled Node is baked into the flake from the start, not bolted on later: build Node with --experimental-quic (needs OpenSSL ≥ 3.5) in the devShell and in the kaval/agent closures. One change to the flake’s Node derivation; kolu and drishti both inherit it (both run node:quic). Everything below assumes it. Drop the compile flag if/when it graduates upstream — no code change, just the derivation.
There’s no decision-spike left (library/package settled: node:quic, codec in @kolu/surface/links/quic.ts), and nothing is demonstrable until the daemon roams — so the transport and its demo land together. The “session survives a roam” win is a daemon property, so the demo is kaval, not drishti: drishti has no surviving session (a dropped agent just re-streams), so it can de-risk the transport (and .claude/rules/surface.md mandates its PR) but it can’t show the feature.
| Phase | Ships | Visible? |
|---|---|---|
| P1 · transport + kaval-tui --host roaming demo | the links/quic codec, kaval’s QUIC listener, the HostSession dial seam, and kaval-tui’s --host path, end to end | yes: attach → roam Wi-Fi↔cellular → session + scrollback survive, no reconnect |
| P2 · kolu dials a remote kaval | kolu’s TerminalEndpoint registry dials remote kavals over QUIC; ssh-stdio kept as the auto-fallback | yes — a roaming remote tile on the canvas |
P1, file by file
A single vertical slice. Each piece mirrors an existing member of the link family, so the pattern is already established in-tree:
- @kolu/surface: the codec, client + server halves (mirror the unix-socket pair):
  - src/links/quic.ts → quicLink({ host, port, certFingerprint }): open a node:quic session, take one bidi QuicStream (a Node Duplex), hand { read, write } to the existing stdioLink codec. Direct sibling of src/links/unix-socket.ts, which does exactly this with a net.Socket.
  - src/quic.ts → serveOverQuic({ router, tlsKeyPair }): bind a QuicEndpoint, serve router over each accepted bidi stream with the same peer framing as serveOverUnixSocket (src/unix-socket.ts:205) and serveOverStdio (src/peer-server.ts:130).
  - Add ./links/quic + ./quic to package exports.
- packages/kaval: bind the listener beside the unix socket (additive, never a replacement):
  - src/servePtyHostOverQuic.ts → wraps serveOverQuic for the same servedRouter (createInProcessPtyHost); the unix-socket precedent is src/serveOverSocket.ts. Mints an ephemeral cert, binds an ephemeral UDP port, returns { port, certFingerprint, close }.
  - src/daemonMain.ts (~:45–50, beside servePtyHostOverUnixSocket) → under a --quic flag, also serve QUIC and print one line to stdout: KAVAL_QUIC <port> <certFingerprint>.
- @kolu/surface-nix-host: the dial + supervise seam (the one genuinely hard change):
  - src/hostSession.ts:spawn() (:388–547) → add transport: "stdio" | "quic". Today it wraps stdioLink({ read: child.stdout, write: child.stdin }) (:524) and treats child.on("exit") as the disconnect. For "quic": ssh starts the agent with --quic, read its KAVAL_QUIC line, then close the ssh child and build quicLink({ host, port, certFingerprint }). The load-bearing flip: with stdio the ssh child is the link (its TCP death = disconnect); with QUIC the ssh child is only the bootstrap, so it must be closed; otherwise a roam kills its TCP and trips handleChildDone even though QUIC migrated. Supervision moves from the child’s exit to the QUIC link’s connection-close event; scheduleReconnect/backoff/recheck reuse unchanged.
- packages/kaval-tui: the demo driver:
  - src/connect.ts → add connectPtyHostQuic({ host, port, certFingerprint }) via quicLink; the file already anticipates it (“the same link the ssh/daemon path will reuse”).
  - src/main.ts → --host <ssh>: provisionAgent → ssh-start kaval --quic → read the bootstrap line → connectPtyHostQuic → attach. This --host path is the driver kolu reuses in P2.
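The supervision flip in the HostSession item can be written down as a small decision table. The sketch below is hypothetical: classify, Phase, and the event names are illustrative and are not the HostSession API; it only encodes the rule stated in the list, plus the assumption that an ssh exit before the bootstrap line has been read counts as a failed dial.

```typescript
// Hypothetical model of which event means "the link is gone" per transport.
// Names are illustrative; this is not the real HostSession code.

type Transport = "stdio" | "quic";
type Phase = "bootstrapping" | "connected";
type SupervisorEvent = "ssh-child-exit" | "quic-connection-close";
type Verdict = "disconnect" | "ignore";

function classify(
  transport: Transport,
  phase: Phase,
  event: SupervisorEvent,
): Verdict {
  if (transport === "stdio") {
    // The ssh child is the link: its exit is the disconnect.
    return event === "ssh-child-exit" ? "disconnect" : "ignore";
  }
  // transport === "quic": the QUIC connection is the link.
  if (event === "quic-connection-close") return "disconnect";
  // The ssh child is only the bootstrap. Losing it before the bootstrap
  // line has been read means the dial failed (an assumption of this sketch);
  // afterwards its exit is expected, because the client closed it itself.
  return phase === "bootstrapping" ? "disconnect" : "ignore";
}
```

Keeping the rule in one pure function would make the "roam must not trip the child-exit handler" case a single assertion rather than a property spread across event listeners.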
Bootstrap (mosh-style). (1) provisionAgent (nix copy --derivation → realise) puts kaval on the host. (2) ssh runs kaval --quic, which binds a unix socket (local) + a QUIC listener (ephemeral UDP port, ephemeral self-signed cert) and prints KAVAL_QUIC <port> <certFingerprint>. (3) kaval-tui reads that line and closes ssh. (4) It dials quic://host:<port> pinning the fingerprint — TLS secures the channel, and the fingerprint, delivered over the authenticated ssh hop, is the trust anchor (TLS alone doesn’t say which kaval). (5) Only QUIC after that; an IP change migrates by Connection ID, no re-dial.
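Steps (2) to (4) hinge on parsing one line and pinning one fingerprint. The sketch below shows one way that could look. The note does not specify the fingerprint encoding, so this assumes lowercase hex SHA-256 over the DER-encoded certificate; parseBootstrapLine, fingerprintOf, and matchesPin are hypothetical helper names, not existing kaval functions.

```typescript
import { Buffer } from "node:buffer";
import { createHash, timingSafeEqual } from "node:crypto";

interface QuicBootstrap {
  port: number;
  certFingerprint: string;
}

// Parse "KAVAL_QUIC <port> <certFingerprint>".
// Assumption: the fingerprint is 64 lowercase hex chars (SHA-256).
function parseBootstrapLine(line: string): QuicBootstrap | null {
  const match = /^KAVAL_QUIC (\d{1,5}) ([0-9a-f]{64})$/.exec(line.trim());
  if (!match) return null;
  const [, portText = "", certFingerprint = ""] = match;
  const port = Number(portText);
  if (port < 1 || port > 65535) return null;
  return { port, certFingerprint };
}

// Fingerprint of a certificate given as DER bytes (assumed encoding).
function fingerprintOf(certDer: Uint8Array): string {
  return createHash("sha256").update(certDer).digest("hex");
}

// Compare the presented certificate against the pinned fingerprint.
// timingSafeEqual throws on unequal lengths, so lengths are checked first.
function matchesPin(certDer: Uint8Array, pinned: string): boolean {
  const actual = Buffer.from(fingerprintOf(certDer), "hex");
  const expected = Buffer.from(pinned, "hex");
  return actual.length === expected.length && timingSafeEqual(actual, expected);
}
```

A caller would read lines from the ssh child's stdout until parseBootstrapLine returns non-null, close ssh, and then reject the QUIC handshake unless matchesPin holds for the server's certificate.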
Acceptance — prove the roam, don’t assume it. Attach to a remote kaval running a counter; force a client IP/port change (a NAT rebind, or move the client between two network namespaces); the session keeps streaming with one PATH_CHALLENGE/PATH_RESPONSE and no reconnect log line, scrollback intact. The same flow over transport: "stdio" drops (agent exited → reconnect) — that contrast is the demo.
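The contrast this acceptance test looks for can be modeled in a few lines. The sketch is a toy, not a QUIC or TCP implementation: both server classes are hypothetical, and the QUIC-like one skips the PATH_CHALLENGE/PATH_RESPONSE validation that RFC 9000 §9 requires before a new path is trusted.

```typescript
// Toy demultiplexers: one keys sessions by peer address (TCP-like), the
// other by a connection ID carried in the packet (QUIC-like).

interface Packet {
  from: string; // "ip:port" of the sender as seen by the server
  connectionId: string; // carried inside the packet
  payload: string;
}

type Outcome = "delivered" | "migrated" | "dropped";

class AddressKeyedServer {
  private readonly sessions = new Map<string, string[]>();

  open(from: string): void {
    this.sessions.set(from, []);
  }

  receive(p: Packet): Outcome {
    const log = this.sessions.get(p.from);
    if (!log) return "dropped"; // a new address is an unknown peer
    log.push(p.payload);
    return "delivered";
  }
}

class ConnectionIdKeyedServer {
  private readonly sessions = new Map<string, { path: string; log: string[] }>();

  open(connectionId: string, from: string): void {
    this.sessions.set(connectionId, { path: from, log: [] });
  }

  receive(p: Packet): Outcome {
    const session = this.sessions.get(p.connectionId);
    if (!session) return "dropped";
    const moved = session.path !== p.from;
    session.path = p.from; // a real stack validates the new path first
    session.log.push(p.payload);
    return moved ? "migrated" : "delivered";
  }

  received(connectionId: string): string[] {
    return this.sessions.get(connectionId)?.log ?? [];
  }
}
```

In the toy, the address-keyed server loses the client the moment its source address changes, while the connection-ID-keyed one keeps appending to the same log, which is the behaviour the namespace-switch test is meant to observe on the real transport.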
Companion (mandated, not a phase). The paired drishti PR for the @kolu/surface + HostSession change (.claude/rules/surface.md). drishti dials its process-monitor agent with transport: "quic"; a sequence/echo agent proves the bytes migrate across an IP change. drishti can’t show session-survival (no daemon) — that’s P1’s job.
P2 — kolu + the fallback
kolu’s endpoint registry (the TerminalEndpoint seam, #1364) dials remote kavals through the same HostSession transport: "quic". ssh-stdio stays as the automatic fallback: if QUIC can’t establish (UDP blocked, handshake timeout) fall back to transport: "stdio" — no roaming, but it always connects. Terminal input never rides 0-RTT early data (no replay protection). The canvas multiplexes a roaming remote kaval beside local ones.
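A minimal sketch of the fallback rule, assuming the two dial paths can be injected as functions. dialWithFallback, withTimeout, Dialer, and Link are hypothetical names for illustration; the real HostSession wiring and timeout values are not specified in this note.

```typescript
// Try QUIC first; on any failure or timeout, fall back to ssh-stdio.
// All names are illustrative, not the HostSession API.

type Transport = "stdio" | "quic";

interface Link {
  transport: Transport;
}

type Dialer = () => Promise<Link>;

// Reject if the promise has not settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

async function dialWithFallback(
  dialQuic: Dialer,
  dialStdio: Dialer,
  quicTimeoutMs: number,
): Promise<Link> {
  try {
    return await withTimeout(dialQuic(), quicTimeoutMs);
  } catch {
    // UDP blocked, handshake timeout, or any other QUIC failure:
    // no roaming, but the session still connects.
    return dialStdio();
  }
}
```

One design point the sketch leaves open: if the QUIC dial succeeds after the timeout has already fired, the late link needs to be closed by the caller, otherwise it leaks.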
zmosh cloned at HEAD; kaval claims verified against packages/kaval; QUIC claims verified against RFC 9000/9001/9002 and the mid-2026 Node QUIC landscape; placement per a lowy + hickey lens pass. Sibling to kaval-sessions (the kaval-tui plan).