docs: update docs for TCP probe refactor and frontend reconnection listeners

Replace stale attempt_ws_connect() references with attempt_tcp_probe()
across all docs. Add progress entry for reconnection hardening session.
Update CHANGELOG with new entries and probe refactor change.
Hibryda 2026-03-06 21:50:54 +01:00
parent 71100da125
commit 4c06b5f121
6 changed files with 24 additions and 7 deletions

@@ -185,8 +185,9 @@ Controller Relay
- Controller reconnects with exponential backoff (1s, 2s, 4s, 8s, 16s, 30s cap)
- Reconnection runs as an async tokio task spawned on disconnect
-- Uses `attempt_ws_connect()` probe: connects with auth header, immediately closes (5s timeout)
+- Uses `attempt_tcp_probe()`: TCP connect only (no WS upgrade), 5s timeout, default port 9750. Avoids allocating per-connection resources (PtyManager, SidecarManager) on the relay during probes.
- Emits `remote-machine-reconnecting` event (with backoff duration) and `remote-machine-reconnect-ready` when probe succeeds
+- Frontend listens via `onRemoteMachineReconnecting` and `onRemoteMachineReconnectReady` in remote-bridge.ts; machines store sets status to 'reconnecting' and auto-calls `connectMachine()` on ready
- Cancels if machine is removed or manually reconnected (checks status == "disconnected" && connection == None)
- On reconnect, relay sends current state snapshot (active sessions, PTY list)
- Controller reconciles: updates pane states, re-subscribes to streams
@@ -274,7 +275,7 @@ Stored in SQLite `settings` table as JSON: `remote_machines` key.
- 12 Tauri commands: remote_add_machine, remote_remove_machine, remote_connect, remote_disconnect, remote_list_machines, remote_pty_spawn/write/resize/kill, remote_agent_query/stop, remote_sidecar_restart
- Heartbeat ping every 15s
- PTY creation event: emits `remote-pty-created` Tauri event with machineId, ptyId, commandId
-- Exponential backoff reconnection on disconnect (1s/2s/4s/8s/16s/30s cap) via `attempt_ws_connect()` probe
+- Exponential backoff reconnection on disconnect (1s/2s/4s/8s/16s/30s cap) via `attempt_tcp_probe()` (TCP-only, no WS upgrade)
- Reconnection events: `remote-machine-reconnecting`, `remote-machine-reconnect-ready`
### Phase D: Frontend integration [DONE]

@@ -282,7 +282,7 @@ Architecture designed in [multi-machine.md](multi-machine.md). Implementation ex
- [x] Heartbeat ping every 15s
- [x] PTY creation event: emits remote-pty-created Tauri event with machineId, ptyId, commandId
- [x] Exponential backoff reconnection on disconnect (1s/2s/4s/8s/16s/30s cap)
-- [x] attempt_ws_connect() probe function (5s timeout, auth header, immediate close)
+- [x] attempt_tcp_probe() function: TCP-only probe (5s timeout, default port 9750) — avoids allocating per-connection resources on relay during probes
- [x] Reconnection events: remote-machine-reconnecting, remote-machine-reconnect-ready
### Phase D: Frontend integration [status: complete]

@@ -323,7 +323,7 @@ Design: No separate sidecar process per subagent. Parent's sidecar handles all;
#### RemoteManager Reconnection
- [x] Exponential backoff reconnection in remote.rs: spawns async tokio task on disconnect
- [x] Backoff schedule: 1s, 2s, 4s, 8s, 16s, 30s (capped)
-- [x] attempt_ws_connect() probe function: connects with proper WebSocket upgrade + auth header, 5s timeout, immediate close
+- [x] attempt_tcp_probe() function: TCP-only connect probe (5s timeout, default port 9750) — avoids allocating per-connection resources on relay
- [x] Emits remote-machine-reconnecting (with backoffSecs) and remote-machine-reconnect-ready Tauri events
- [x] Cancellation: stops if machine removed (not in HashMap) or manually reconnected (status != disconnected)
- [x] Fixed scoping: disconnection cleanup uses inner block to release mutex before emitting event
@@ -331,6 +331,20 @@ Design: No separate sidecar process per subagent. Parent's sidecar handles all;
#### RemoteManager PTY Creation Confirmation
- [x] Handles pty_created event type from relay: emits remote-pty-created Tauri event with machineId, ptyId, commandId
+### Session: 2026-03-06 (continued) — Reconnection Hardening
+#### TCP Probe Refactor
+- [x] Replaced attempt_ws_connect() with attempt_tcp_probe() in remote.rs: TCP-only connect (no WS upgrade), 5s timeout, default port 9750
+- [x] Avoids allocating per-connection resources (PtyManager, SidecarManager) on the relay during reconnection probes
+- [x] Probe no longer needs auth token — only checks TCP reachability
+#### Frontend Reconnection Listeners
+- [x] Added onRemoteMachineReconnecting() listener in remote-bridge.ts: receives machineId + backoffSecs
+- [x] Added onRemoteMachineReconnectReady() listener in remote-bridge.ts: receives machineId when probe succeeds
+- [x] machines.svelte.ts: reconnecting handler sets machine status to 'reconnecting', shows toast with backoff duration
+- [x] machines.svelte.ts: reconnect-ready handler auto-calls connectMachine() to re-establish full WebSocket connection
+- [x] Updated docs/multi-machine.md to reflect TCP probe and frontend listener changes
### Next Steps
- [ ] Real-world relay testing (2 machines)
- [ ] TLS/certificate pinning for relay connections
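
For readers who want the reconnection behaviour documented in this commit in concrete form, here is a minimal sketch of the backoff schedule (1s, 2s, 4s, 8s, 16s, 30s cap) and a TCP-only reachability probe. It is an illustration under stated assumptions, not the code in remote.rs. The documented implementation runs as an async tokio task, while this sketch is synchronous and uses only the standard library. The name `backoff_secs` and both signatures are hypothetical. Only the name `attempt_tcp_probe`, the 5s timeout, and the schedule come from the docs above.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Backoff delay in seconds for a zero-based attempt index:
/// 1, 2, 4, 8, 16, then capped at 30 for every later attempt.
/// (Hypothetical helper; the real code may compute this differently.)
fn backoff_secs(attempt: u32) -> u64 {
    // 2^attempt, with the exponent clamped to 5 so the shift stays small,
    // then capped at 30 because 2^5 = 32 would overshoot the documented cap.
    (1u64 << attempt.min(5)).min(30)
}

/// TCP-only reachability probe: open a connection and drop it right away.
/// No WebSocket upgrade is performed and no auth token is sent, so the
/// peer never gets as far as setting up per-connection state.
/// (Synchronous stand-in for the async probe described in the docs.)
fn attempt_tcp_probe(addr: &SocketAddr, timeout: Duration) -> bool {
    // connect_timeout returns Err on refusal, unreachable host, or timeout.
    TcpStream::connect_timeout(addr, timeout).is_ok()
}
```

A caller would sleep for `backoff_secs(attempt)` seconds and then call `attempt_tcp_probe(&addr, Duration::from_secs(5))`. On success it would hand off to the full WebSocket connect, which is the step that carries the auth token.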