docs: update docs for relay hardening, reconnection, and session wrap
Update multi-machine docs with reconnection implementation details, command response propagation, and pty_created confirmation flow. Mark reconnection as complete in phases.md, progress.md, TODO.md. Update CLAUDE.md files with reconnection and relay response info. Add CHANGELOG entries for new features.
parent b0cce7ae4f
commit 218570ac35
10 changed files with 84 additions and 35 deletions
|
|
@ -48,7 +48,8 @@
|
||||||
- RemoteManager (src-tauri/src/remote.rs) manages WebSocket client connections to bterminal-relay instances. 12 Tauri commands prefixed with `remote_`.
|
- RemoteManager (src-tauri/src/remote.rs) manages WebSocket client connections to bterminal-relay instances. 12 Tauri commands prefixed with `remote_`.
|
||||||
- remote-bridge.ts adapter wraps remote machine management IPC. machines.svelte.ts store tracks remote machine state.
|
- remote-bridge.ts adapter wraps remote machine management IPC. machines.svelte.ts store tracks remote machine state.
|
||||||
- Pane.remoteMachineId?: string routes operations through RemoteManager instead of local managers. Bridge adapters (pty-bridge, agent-bridge) check this field.
|
- Pane.remoteMachineId?: string routes operations through RemoteManager instead of local managers. Bridge adapters (pty-bridge, agent-bridge) check this field.
|
||||||
- bterminal-relay binary (v2/bterminal-relay/) is a standalone WebSocket server with token auth, rate limiting, and per-connection isolated managers.
|
- bterminal-relay binary (v2/bterminal-relay/) is a standalone WebSocket server with token auth, rate limiting, and per-connection isolated managers. Commands return structured responses (pty_created, pong, error) with commandId for correlation via send_error() helper.
|
||||||
|
- RemoteManager reconnection: exponential backoff (1s-30s cap) on disconnect, attempt_ws_connect() probe, emits remote-machine-reconnecting and remote-machine-reconnect-ready events.
|
||||||
|
|
||||||
## Memora Tags
|
## Memora Tags
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -8,6 +8,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
||||||
## [Unreleased]
|
## [Unreleased]
|
||||||
|
|
||||||
### Added
|
### Added
|
||||||
|
- Exponential backoff reconnection in RemoteManager: on disconnect, spawns async task with 1s/2s/4s/8s/16s/30s-cap backoff, uses attempt_ws_connect() probe (5s timeout), emits remote-machine-reconnecting and remote-machine-reconnect-ready events
|
||||||
|
- Relay command response propagation: bterminal-relay now sends structured responses (pty_created, pong, error) back to client via shared event channel with commandId correlation
|
||||||
|
- send_error() helper in bterminal-relay for consistent error reporting across all command handlers
|
||||||
|
- PTY creation confirmation flow: pty_create command returns pty_created event with session ID and commandId; RemoteManager emits remote-pty-created Tauri event
|
||||||
- bterminal-core shared crate with EventSink trait: extracted PtyManager and SidecarManager into reusable crate at v2/bterminal-core/, EventSink trait abstracts event emission for both Tauri and WebSocket contexts
|
- bterminal-core shared crate with EventSink trait: extracted PtyManager and SidecarManager into reusable crate at v2/bterminal-core/, EventSink trait abstracts event emission for both Tauri and WebSocket contexts
|
||||||
- bterminal-relay WebSocket server binary: standalone Rust binary at v2/bterminal-relay/ with token auth (--port, --token, --insecure CLI flags), rate limiting (10 attempts, 5min lockout), per-connection isolated PTY + sidecar managers
|
- bterminal-relay WebSocket server binary: standalone Rust binary at v2/bterminal-relay/ with token auth (--port, --token, --insecure CLI flags), rate limiting (10 attempts, 5min lockout), per-connection isolated PTY + sidecar managers
|
||||||
- RemoteManager for multi-machine WebSocket connections: v2/src-tauri/src/remote.rs manages WebSocket client connections to relay instances, 12 new Tauri commands for remote operations, heartbeat ping every 15s
|
- RemoteManager for multi-machine WebSocket connections: v2/src-tauri/src/remote.rs manages WebSocket client connections to relay instances, 12 new Tauri commands for remote operations, heartbeat ping every 15s
|
||||||
|
|
@ -44,6 +48,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
||||||
- tempfile dev dependency for Rust test isolation
|
- tempfile dev dependency for Rust test isolation
|
||||||
|
|
||||||
### Changed
|
### Changed
|
||||||
|
- bterminal-relay command handlers refactored: all error paths now use send_error() helper instead of log::error!() only; pong response sent via event channel instead of no-op
|
||||||
|
- RemoteManager disconnect handler: scoped mutex release before event emission to prevent deadlocks; spawns reconnection task
|
||||||
- PtyManager and SidecarManager extracted from src-tauri to bterminal-core shared crate (src-tauri now has thin re-export wrappers)
|
- PtyManager and SidecarManager extracted from src-tauri to bterminal-core shared crate (src-tauri now has thin re-export wrappers)
|
||||||
- Cargo workspace structure at v2/ level: members = [src-tauri, bterminal-core, bterminal-relay], Cargo.lock moved from src-tauri/ to workspace root
|
- Cargo workspace structure at v2/ level: members = [src-tauri, bterminal-core, bterminal-relay], Cargo.lock moved from src-tauri/ to workspace root
|
||||||
- agent-bridge.ts and pty-bridge.ts extended with remote routing (check remoteMachineId, route to remote_* commands)
|
- agent-bridge.ts and pty-bridge.ts extended with remote routing (check remoteMachineId, route to remote_* commands)
|
||||||
|
|
|
||||||
29
CLAUDE.md
|
|
@ -2,7 +2,7 @@
|
||||||
|
|
||||||
## Project Overview
|
## Project Overview
|
||||||
|
|
||||||
Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Python) is production-stable. v2 redesign (Tauri 2.x + Svelte 5 + Claude Agent SDK) Phases 1-7 complete. Packaging: .deb + AppImage via GitHub Actions CI.
|
Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Python) is production-stable. v2 redesign (Tauri 2.x + Svelte 5 + Claude Agent SDK) Phases 1-7 + multi-machine (A-D) complete. Packaging: .deb + AppImage via GitHub Actions CI.
|
||||||
|
|
||||||
- **Repository:** github.com/DexterFromLab/BTerminal
|
- **Repository:** github.com/DexterFromLab/BTerminal
|
||||||
- **License:** MIT
|
- **License:** MIT
|
||||||
|
|
@ -22,13 +22,18 @@ Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Pyth
|
||||||
| `install-v2.sh` | v2 build-from-source installer (Node.js 20+, Rust 1.77+, system libs) |
|
| `install-v2.sh` | v2 build-from-source installer (Node.js 20+, Rust 1.77+, system libs) |
|
||||||
| `.github/workflows/release.yml` | CI: builds .deb + AppImage on v* tags, uploads to GitHub Releases |
|
| `.github/workflows/release.yml` | CI: builds .deb + AppImage on v* tags, uploads to GitHub Releases |
|
||||||
| `docs/task_plan.md` | v2 architecture decisions and strategies |
|
| `docs/task_plan.md` | v2 architecture decisions and strategies |
|
||||||
| `docs/phases.md` | v2 implementation phases (1-7) |
|
| `docs/phases.md` | v2 implementation phases (1-7 + multi-machine A-D) |
|
||||||
| `docs/findings.md` | v2 research findings |
|
| `docs/findings.md` | v2 research findings |
|
||||||
| `docs/progress.md` | Session progress log |
|
| `docs/progress.md` | Session progress log |
|
||||||
| `docs/multi-machine.md` | Multi-machine architecture (WebSocket relay, 4-phase plan) |
|
| `docs/multi-machine.md` | Multi-machine architecture (implemented, Phases A-D) |
|
||||||
| `v2/src-tauri/src/pty.rs` | PTY backend (portable-pty, PtyManager) |
|
| `v2/Cargo.toml` | Cargo workspace root (members: src-tauri, bterminal-core, bterminal-relay) |
|
||||||
| `v2/src-tauri/src/lib.rs` | Tauri commands (pty + agent + session + file + settings) |
|
| `v2/bterminal-core/` | Shared crate: EventSink trait, PtyManager, SidecarManager |
|
||||||
| `v2/src-tauri/src/sidecar.rs` | SidecarManager (Deno-first + Node.js fallback, SidecarCommand, NDJSON) |
|
| `v2/bterminal-relay/` | Standalone relay binary (WebSocket server, token auth, CLI) |
|
||||||
|
| `v2/src-tauri/src/pty.rs` | PTY backend (thin re-export from bterminal-core) |
|
||||||
|
| `v2/src-tauri/src/lib.rs` | Tauri commands (pty + agent + session + file + settings + 12 remote) |
|
||||||
|
| `v2/src-tauri/src/sidecar.rs` | SidecarManager (thin re-export from bterminal-core) |
|
||||||
|
| `v2/src-tauri/src/event_sink.rs` | TauriEventSink (implements EventSink for AppHandle) |
|
||||||
|
| `v2/src-tauri/src/remote.rs` | RemoteManager (WebSocket client connections to relays) |
|
||||||
| `v2/src-tauri/src/session.rs` | SessionDb (rusqlite, sessions + layout + settings + ssh_sessions) |
|
| `v2/src-tauri/src/session.rs` | SessionDb (rusqlite, sessions + layout + settings + ssh_sessions) |
|
||||||
| `v2/src-tauri/src/watcher.rs` | FileWatcherManager (notify crate, file change events) |
|
| `v2/src-tauri/src/watcher.rs` | FileWatcherManager (notify crate, file change events) |
|
||||||
| `v2/src-tauri/src/ctx.rs` | CtxDb (read-only access to ~/.claude-context/context.db) |
|
| `v2/src-tauri/src/ctx.rs` | CtxDb (read-only access to ~/.claude-context/context.db) |
|
||||||
|
|
@ -44,6 +49,8 @@ Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Pyth
|
||||||
| `v2/src/lib/adapters/settings-bridge.ts` | Settings IPC wrapper (get/set/list) |
|
| `v2/src/lib/adapters/settings-bridge.ts` | Settings IPC wrapper (get/set/list) |
|
||||||
| `v2/src/lib/adapters/ctx-bridge.ts` | ctx database IPC wrapper |
|
| `v2/src/lib/adapters/ctx-bridge.ts` | ctx database IPC wrapper |
|
||||||
| `v2/src/lib/adapters/ssh-bridge.ts` | SSH session IPC wrapper |
|
| `v2/src/lib/adapters/ssh-bridge.ts` | SSH session IPC wrapper |
|
||||||
|
| `v2/src/lib/adapters/remote-bridge.ts` | Remote machine management IPC wrapper |
|
||||||
|
| `v2/src/lib/stores/machines.svelte.ts` | Remote machine state store (Svelte 5 runes) |
|
||||||
| `v2/src/lib/utils/agent-tree.ts` | Agent tree builder (hierarchy from messages) |
|
| `v2/src/lib/utils/agent-tree.ts` | Agent tree builder (hierarchy from messages) |
|
||||||
| `v2/src/lib/utils/highlight.ts` | Shiki syntax highlighter (lazy singleton, 13 languages) |
|
| `v2/src/lib/utils/highlight.ts` | Shiki syntax highlighter (lazy singleton, 13 languages) |
|
||||||
| `v2/src/lib/utils/detach.ts` | Detached pane mode (pop-out windows via URL params) |
|
| `v2/src/lib/utils/detach.ts` | Detached pane mode (pop-out windows via URL params) |
|
||||||
|
|
@ -75,16 +82,20 @@ Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Pyth
|
||||||
- Context DB: `~/.claude-context/context.db`
|
- Context DB: `~/.claude-context/context.db`
|
||||||
- Theme: Catppuccin Mocha
|
- Theme: Catppuccin Mocha
|
||||||
|
|
||||||
## v2 Stack (Phases 1-7 complete, branch: v2-mission-control)
|
## v2 Stack (Phases 1-7 + Multi-Machine A-D complete, branch: v2-mission-control)
|
||||||
|
|
||||||
- Tauri 2.x (Rust backend) + Svelte 5 (frontend)
|
- Tauri 2.x (Rust backend) + Svelte 5 (frontend)
|
||||||
|
- Cargo workspace: bterminal-core (shared), bterminal-relay (remote binary), src-tauri (Tauri app)
|
||||||
- xterm.js with Canvas addon (no WebGL on WebKit2GTK)
|
- xterm.js with Canvas addon (no WebGL on WebKit2GTK)
|
||||||
- Agent sessions via `claude` CLI subprocess with `--output-format stream-json`
|
- Agent sessions via `claude` CLI subprocess with `--output-format stream-json`
|
||||||
- Sidecar manages claude processes (Deno-first + Node.js fallback, stdio NDJSON to Rust)
|
- Sidecar manages claude processes (Deno-first + Node.js fallback, stdio NDJSON to Rust)
|
||||||
- portable-pty for terminal management
|
- portable-pty for terminal management (in bterminal-core)
|
||||||
|
- Multi-machine: bterminal-relay WebSocket server + RemoteManager WebSocket client
|
||||||
- SQLite session persistence (rusqlite, WAL mode) + layout restore on startup
|
- SQLite session persistence (rusqlite, WAL mode) + layout restore on startup
|
||||||
- File watcher (notify crate) for live markdown viewer
|
- File watcher (notify crate) for live markdown viewer
|
||||||
- Rust deps: tauri, portable-pty, rusqlite (bundled), dirs, notify, uuid, serde, tokio, tauri-plugin-updater
|
- Rust deps (src-tauri): tauri, bterminal-core (path), rusqlite (bundled), dirs, notify, serde, tokio, tokio-tungstenite, futures-util, tauri-plugin-updater
|
||||||
|
- Rust deps (bterminal-core): portable-pty, uuid, serde, serde_json, log
|
||||||
|
- Rust deps (bterminal-relay): bterminal-core, tokio, tokio-tungstenite, clap, env_logger, futures-util
|
||||||
- npm deps: @xterm/xterm, @xterm/addon-canvas, @xterm/addon-fit, @tauri-apps/api, @tauri-apps/plugin-updater, marked, shiki, vitest (dev)
|
- npm deps: @xterm/xterm, @xterm/addon-canvas, @xterm/addon-fit, @tauri-apps/api, @tauri-apps/plugin-updater, marked, shiki, vitest (dev)
|
||||||
- Source: `v2/` directory
|
- Source: `v2/` directory
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -136,7 +136,7 @@ cd v2 && cargo build --release -p bterminal-relay
|
||||||
./target/release/bterminal-relay --port 9750 --token <secret> --insecure
|
./target/release/bterminal-relay --port 9750 --token <secret> --insecure
|
||||||
```
|
```
|
||||||
|
|
||||||
Add remote machines in BTerminal Settings > Remote Machines (label, URL, token). Remote panes auto-group by machine label in the sidebar.
|
Add remote machines in BTerminal Settings > Remote Machines (label, URL, token). Remote panes auto-group by machine label in the sidebar. Connections automatically reconnect with exponential backoff (1s-30s cap) on disconnect.
|
||||||
|
|
||||||
See [docs/multi-machine.md](docs/multi-machine.md) for full architecture details.
|
See [docs/multi-machine.md](docs/multi-machine.md) for full architecture details.
|
||||||
|
|
||||||
|
|
@ -145,10 +145,10 @@ See [docs/multi-machine.md](docs/multi-machine.md) for full architecture details
|
||||||
| Document | Description |
|
| Document | Description |
|
||||||
|----------|-------------|
|
|----------|-------------|
|
||||||
| [docs/task_plan.md](docs/task_plan.md) | v2 architecture decisions, error handling, testing strategy |
|
| [docs/task_plan.md](docs/task_plan.md) | v2 architecture decisions, error handling, testing strategy |
|
||||||
| [docs/phases.md](docs/phases.md) | v2 implementation phases (1-7) with checklists |
|
| [docs/phases.md](docs/phases.md) | v2 implementation phases (1-7 + multi-machine A-D) with checklists |
|
||||||
| [docs/findings.md](docs/findings.md) | Research findings (Agent SDK, Tauri, xterm.js, performance) |
|
| [docs/findings.md](docs/findings.md) | Research findings (Agent SDK, Tauri, xterm.js, performance) |
|
||||||
| [docs/progress.md](docs/progress.md) | Session-by-session progress log |
|
| [docs/progress.md](docs/progress.md) | Session-by-session progress log |
|
||||||
| [docs/multi-machine.md](docs/multi-machine.md) | Multi-machine architecture (WebSocket relay, remote agents) |
|
| [docs/multi-machine.md](docs/multi-machine.md) | Multi-machine architecture (implemented, WebSocket relay, reconnection) |
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
|
||||||
|
|
|
||||||
19
TODO.md
|
|
@ -4,26 +4,19 @@
|
||||||
|
|
||||||
- [ ] **Deno sidecar real-world testing** -- Integrated into sidecar.rs (Deno-first + Node.js fallback). Needs testing with real claude CLI and startup time benchmark vs Node.js.
|
- [ ] **Deno sidecar real-world testing** -- Integrated into sidecar.rs (Deno-first + Node.js fallback). Needs testing with real claude CLI and startup time benchmark vs Node.js.
|
||||||
- [ ] **E2E testing (Playwright/WebDriver)** -- Scaffold at v2/tests/e2e/README.md. Needs display server to run. Test: open terminal, run command, open agent, verify output.
|
- [ ] **E2E testing (Playwright/WebDriver)** -- Scaffold at v2/tests/e2e/README.md. Needs display server to run. Test: open terminal, run command, open agent, verify output.
|
||||||
- [ ] **Multi-machine reconnection** -- Implement exponential backoff reconnection logic (1s-30s cap) in RemoteManager for dropped WebSocket connections.
|
|
||||||
- [ ] **Multi-machine real-world testing** -- Test bterminal-relay with 2 machines (local + 1 remote). Verify PTY + agent operations over WebSocket.
|
- [ ] **Multi-machine real-world testing** -- Test bterminal-relay with 2 machines (local + 1 remote). Verify PTY + agent operations over WebSocket.
|
||||||
- [ ] **Multi-machine TLS/certificate pinning** -- Add TLS support to bterminal-relay and certificate pinning in RemoteManager for production security.
|
- [ ] **Multi-machine TLS/certificate pinning** -- Add TLS support to bterminal-relay and certificate pinning in RemoteManager for production security.
|
||||||
- [ ] **Agent Teams real-world testing** -- Frontend routing implemented (Phase 7). Needs testing with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 and real subagent spawning.
|
- [ ] **Agent Teams real-world testing** -- Frontend routing implemented (Phase 7). Needs testing with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 and real subagent spawning.
|
||||||
|
|
||||||
## Completed
|
## Completed
|
||||||
|
|
||||||
|
- [x] **Multi-machine reconnection** -- Exponential backoff reconnection (1s-30s cap) in RemoteManager, attempt_ws_connect() probe, reconnection events. | Done: 2026-03-06
|
||||||
|
- [x] **Relay command response propagation** -- Structured responses (pty_created, pong, error) with commandId correlation, send_error() helper. | Done: 2026-03-06
|
||||||
- [x] **Multi-machine support (Phases A-D)** -- bterminal-core crate extraction, bterminal-relay WebSocket binary, RemoteManager, frontend integration. | Done: 2026-03-06
|
- [x] **Multi-machine support (Phases A-D)** -- bterminal-core crate extraction, bterminal-relay WebSocket binary, RemoteManager, frontend integration. | Done: 2026-03-06
|
||||||
|
|
||||||
- [x] **Set TAURI_SIGNING_PRIVATE_KEY secret** -- Set via `gh secret set` on DexterFromLab/BTerminal. | Done: 2026-03-06
|
|
||||||
- [x] **Dispatcher tests for subagent routing** -- 10 new tests covering spawn, dedup, child message routing, init/cost forwarding, fallbacks. Total: 28 dispatcher tests. | Done: 2026-03-06
|
|
||||||
- [x] **Subagent cost aggregation** -- `getTotalCost()` recursive helper in agents store, total cost shown in parent pane done-bar when children present. | Done: 2026-03-06
|
|
||||||
- [x] **Agent Teams frontend support** -- Subagent pane spawning, parent/child navigation, message routing by parentId, SUBAGENT_TOOL_NAMES detection in dispatcher. | Done: 2026-03-06
|
- [x] **Agent Teams frontend support** -- Subagent pane spawning, parent/child navigation, message routing by parentId, SUBAGENT_TOOL_NAMES detection in dispatcher. | Done: 2026-03-06
|
||||||
|
- [x] **Subagent cost aggregation** -- `getTotalCost()` recursive helper in agents store, total cost shown in parent pane done-bar when children present. | Done: 2026-03-06
|
||||||
|
- [x] **Dispatcher tests for subagent routing** -- 10 new tests covering spawn, dedup, child message routing, init/cost forwarding, fallbacks. Total: 28 dispatcher tests. | Done: 2026-03-06
|
||||||
- [x] **Session groups/folders** -- group_name column in sessions table, setPaneGroup in layout store, collapsible group headers in sidebar, right-click to set group. | Done: 2026-03-06
|
- [x] **Session groups/folders** -- group_name column in sessions table, setPaneGroup in layout store, collapsible group headers in sidebar, right-click to set group. | Done: 2026-03-06
|
||||||
- [x] **Auto-update signing key** -- Generated minisign keypair, pubkey set in tauri.conf.json. | Done: 2026-03-06
|
|
||||||
- [x] **Deno sidecar integration** -- SidecarCommand struct, resolve_sidecar_command() with Deno-first + Node.js fallback, both runners bundled in tauri.conf.json resources. | Done: 2026-03-06
|
- [x] **Deno sidecar integration** -- SidecarCommand struct, resolve_sidecar_command() with Deno-first + Node.js fallback, both runners bundled in tauri.conf.json resources. | Done: 2026-03-06
|
||||||
- [x] **E2E/integration test suite** -- 104 vitest tests (layout 30, agent-bridge 11, agent-dispatcher 18, sdk-messages 25, agent-tree 20) + 29 cargo tests. | Done: 2026-03-06
|
- [x] **E2E/integration test suite** -- 114 vitest tests + 29 cargo tests. | Done: 2026-03-06
|
||||||
- [x] **Copy/paste (Ctrl+Shift+C/V)** -- TerminalPane attachCustomKeyEventHandler, C copies selection, V writes clipboard to PTY. | Done: 2026-03-06
|
- [x] **Set TAURI_SIGNING_PRIVATE_KEY secret** -- Set via `gh secret set` on DexterFromLab/BTerminal. | Done: 2026-03-06
|
||||||
- [x] **Terminal theme hot-swap** -- onThemeChange callback registry in theme.svelte.ts, TerminalPane subscribes. All open terminals update. | Done: 2026-03-06
|
|
||||||
- [x] **Tree node click -> scroll to message** -- handleTreeNodeClick in AgentPane, scrollIntoView smooth. | Done: 2026-03-06
|
|
||||||
- [x] **Subtree cost display** -- Yellow cost text below each tree node (subtreeCost util, NODE_H 32->40). | Done: 2026-03-06
|
|
||||||
- [x] **Session resume** -- Follow-up prompt in AgentPane, resume_session_id passed to SDK. | Done: 2026-03-06
|
|
||||||
- [x] **Pane drag-resize handles** -- Splitter overlays in TilingGrid with mouse drag, 10-90% clamping. | Done: 2026-03-06
|
|
||||||
|
|
|
||||||
|
|
@ -17,7 +17,7 @@ Project documentation lives here.
|
||||||
| Document | Description |
|
| Document | Description |
|
||||||
|----------|-------------|
|
|----------|-------------|
|
||||||
| [task_plan.md](task_plan.md) | v2 architecture decisions, error handling, testing strategy |
|
| [task_plan.md](task_plan.md) | v2 architecture decisions, error handling, testing strategy |
|
||||||
| [phases.md](phases.md) | v2 implementation phases (1-7) with checklists |
|
| [phases.md](phases.md) | v2 implementation phases (1-7 + multi-machine A-D) with checklists |
|
||||||
| [findings.md](findings.md) | Research findings (Agent SDK, Tauri, xterm.js, performance) |
|
| [findings.md](findings.md) | Research findings (Agent SDK, Tauri, xterm.js, performance) |
|
||||||
| [progress.md](progress.md) | Session-by-session progress log |
|
| [progress.md](progress.md) | Session-by-session progress log |
|
||||||
| [multi-machine.md](multi-machine.md) | Multi-machine support architecture (WebSocket, relay binary) |
|
| [multi-machine.md](multi-machine.md) | Multi-machine support architecture (implemented, WebSocket relay, reconnection) |
|
||||||
|
|
|
||||||
|
|
@ -146,7 +146,7 @@ interface RelayCommand {
|
||||||
|
|
||||||
// Relay → Controller (events)
|
// Relay → Controller (events)
|
||||||
interface RelayEvent {
|
interface RelayEvent {
|
||||||
type: 'pty_data' | 'pty_exit'
|
type: 'pty_data' | 'pty_exit' | 'pty_created'
|
||||||
| 'sidecar_message' | 'sidecar_exited'
|
| 'sidecar_message' | 'sidecar_exited'
|
||||||
| 'error' | 'pong' | 'ready';
|
| 'error' | 'pong' | 'ready';
|
||||||
sessionId?: string;
|
sessionId?: string;
|
||||||
|
|
@ -181,9 +181,13 @@ Controller Relay
|
||||||
│── reconnect (exp backoff) ─────→│ (1s, 2s, 4s, 8s, max 30s)
|
│── reconnect (exp backoff) ─────→│ (1s, 2s, 4s, 8s, max 30s)
|
||||||
```
|
```
|
||||||
|
|
||||||
### Reconnection
|
### Reconnection (Implemented)
|
||||||
|
|
||||||
- Controller reconnects with exponential backoff (1s → 30s cap)
|
- Controller reconnects with exponential backoff (1s, 2s, 4s, 8s, 16s, 30s cap)
|
||||||
|
- Reconnection runs as an async tokio task spawned on disconnect
|
||||||
|
- Uses `attempt_ws_connect()` probe: connects with auth header, immediately closes (5s timeout)
|
||||||
|
- Emits `remote-machine-reconnecting` event (with backoff duration) and `remote-machine-reconnect-ready` when probe succeeds
|
||||||
|
- Cancels if machine is removed or manually reconnected (checks status == "disconnected" && connection == None)
|
||||||
- On reconnect, relay sends current state snapshot (active sessions, PTY list)
|
- On reconnect, relay sends current state snapshot (active sessions, PTY list)
|
||||||
- Controller reconciles: updates pane states, re-subscribes to streams
|
- Controller reconciles: updates pane states, re-subscribes to streams
|
||||||
- Active agent sessions continue on relay regardless of controller connection
|
- Active agent sessions continue on relay regardless of controller connection
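The reconnection bullets above can be sketched roughly as follows. This is an illustrative TypeScript outline only: the documented implementation is a tokio task in `remote.rs`, and the names `backoffSecs`, `reconnectLoop`, and `ReconnectDeps`, the event payload shapes, and the order of emit and sleep are assumptions made for this example. The `probe` callback stands in for `attempt_ws_connect()`.

```typescript
// Hypothetical sketch; the real logic lives in Rust (remote.rs) and may differ.
const BACKOFF_CAP_SECS = 30;

/** Delay before reconnect attempt `attempt` (0-based): 1, 2, 4, 8, 16, then capped at 30. */
function backoffSecs(attempt: number): number {
  return Math.min(2 ** attempt, BACKOFF_CAP_SECS);
}

interface ReconnectDeps {
  /** True while the machine still exists, is still "disconnected", and has no live connection. */
  shouldContinue: () => boolean;
  /** Stand-in for attempt_ws_connect(): resolves true if the probe connected. */
  probe: () => Promise<boolean>;
  /** Stand-in for Tauri event emission. */
  emit: (event: string, payload: Record<string, unknown>) => void;
  /** Injected so the loop can be driven without real timers. */
  sleep: (secs: number) => Promise<void>;
}

/** Resolves true once a probe succeeds, false if the loop was cancelled. */
async function reconnectLoop(machineId: string, deps: ReconnectDeps): Promise<boolean> {
  for (let attempt = 0; ; attempt++) {
    // Cancel if the machine was removed or manually reconnected.
    if (!deps.shouldContinue()) return false;
    const secs = backoffSecs(attempt);
    deps.emit("remote-machine-reconnecting", { machineId, backoffSecs: secs });
    await deps.sleep(secs);
    // State may have changed while sleeping, so check again before probing.
    if (!deps.shouldContinue()) return false;
    if (await deps.probe()) {
      deps.emit("remote-machine-reconnect-ready", { machineId });
      return true;
    }
  }
}
```

Injecting `sleep` and `probe` keeps the loop testable without a network or real delays; the cancellation check runs both before and after each wait so a manual reconnect during the backoff window stops the task.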
|
||||||
|
|
@ -260,12 +264,18 @@ Stored in SQLite `settings` table as JSON: `remote_machines` key.
|
||||||
- Routes RelayCommand to bterminal-core managers, forwards RelayEvent over WebSocket
|
- Routes RelayCommand to bterminal-core managers, forwards RelayEvent over WebSocket
|
||||||
- Rate limiting: 10 failed auth attempts triggers 5-minute lockout
|
- Rate limiting: 10 failed auth attempts triggers 5-minute lockout
|
||||||
- Per-connection isolated PtyManager + SidecarManager instances
|
- Per-connection isolated PtyManager + SidecarManager instances
|
||||||
|
- Command response propagation: structured responses (pty_created, pong, error) sent back via shared event channel
|
||||||
|
- send_error() helper: all command failures emit RelayEvent with commandId + error message
|
||||||
|
- PTY creation confirmation: pty_create command returns pty_created event with session ID and commandId for correlation
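To illustrate how `commandId` correlation might be consumed on the controller side, here is a minimal TypeScript sketch. The `PendingCommands` class, the `Outcome` type, and the `message` field name are hypothetical and not taken from the codebase; only the response types (`pty_created`, `pong`, `error`) and the `commandId` and session ID fields come from the description above.

```typescript
// Hypothetical sketch of commandId correlation; not the project's actual code.
type ResponseType = "pty_created" | "pong" | "error";

interface RelayResponse {
  type: ResponseType;
  commandId?: string;
  sessionId?: string; // session ID carried by pty_created
  message?: string;   // assumed field name for the error text
}

type Outcome =
  | { ok: true; sessionId: string | undefined }
  | { ok: false; error: string };

class PendingCommands {
  private pending = new Map<string, (outcome: Outcome) => void>();

  /** Register a callback to run when the response for `commandId` arrives. */
  register(commandId: string, onDone: (outcome: Outcome) => void): void {
    this.pending.set(commandId, onDone);
  }

  /** Route a response to its waiting callback; returns false if nothing matched. */
  handle(resp: RelayResponse): boolean {
    if (resp.commandId === undefined) return false;
    const onDone = this.pending.get(resp.commandId);
    if (!onDone) return false;
    // Each command is resolved at most once.
    this.pending.delete(resp.commandId);
    if (resp.type === "error") {
      onDone({ ok: false, error: resp.message ?? "unknown error" });
    } else {
      onDone({ ok: true, sessionId: resp.sessionId });
    }
    return true;
  }
}
```

A caller would register a callback before sending `pty_create`, then pass every incoming relay response to `handle()`; responses without a matching `commandId` are simply reported as unhandled.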
|
||||||
|
|
||||||
### Phase C: Add `RemoteManager` to controller [DONE]
|
### Phase C: Add `RemoteManager` to controller [DONE]
|
||||||
|
|
||||||
- v2/src-tauri/src/remote.rs — RemoteManager struct with WebSocket client connections
|
- v2/src-tauri/src/remote.rs — RemoteManager struct with WebSocket client connections
|
||||||
- 12 Tauri commands: remote_add_machine, remote_remove_machine, remote_connect, remote_disconnect, remote_list_machines, remote_pty_spawn/write/resize/kill, remote_agent_query/stop, remote_sidecar_restart
|
- 12 Tauri commands: remote_add_machine, remote_remove_machine, remote_connect, remote_disconnect, remote_list_machines, remote_pty_spawn/write/resize/kill, remote_agent_query/stop, remote_sidecar_restart
|
||||||
- Heartbeat ping every 15s
|
- Heartbeat ping every 15s
|
||||||
|
- PTY creation event: emits `remote-pty-created` Tauri event with machineId, ptyId, commandId
|
||||||
|
- Exponential backoff reconnection on disconnect (1s/2s/4s/8s/16s/30s cap) via `attempt_ws_connect()` probe
|
||||||
|
- Reconnection events: `remote-machine-reconnecting`, `remote-machine-reconnect-ready`
|
||||||
|
|
||||||
### Phase D: Frontend integration [DONE]
|
### Phase D: Frontend integration [DONE]
|
||||||
|
|
||||||
|
|
@ -278,7 +288,8 @@ Stored in SQLite `settings` table as JSON: `remote_machines` key.
|
||||||
|
|
||||||
### Remaining Work
|
### Remaining Work
|
||||||
|
|
||||||
- [ ] Reconnection logic with exponential backoff (1s-30s cap)
|
- [x] Reconnection logic with exponential backoff (1s-30s cap) — implemented in remote.rs
|
||||||
|
- [x] Relay command response propagation (pty_created, pong, error events) — implemented in main.rs
|
||||||
- [ ] Real-world relay testing (2 machines)
|
- [ ] Real-world relay testing (2 machines)
|
||||||
- [ ] TLS/certificate pinning
|
- [ ] TLS/certificate pinning
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -271,12 +271,19 @@ Architecture designed in [multi-machine.md](multi-machine.md). Implementation ex
|
||||||
- [x] Routes RelayCommand to PtyManager/SidecarManager, forwards RelayEvent over WebSocket
|
- [x] Routes RelayCommand to PtyManager/SidecarManager, forwards RelayEvent over WebSocket
|
||||||
- [x] Rate limiting on auth failures (10 attempts, 5min lockout)
|
- [x] Rate limiting on auth failures (10 attempts, 5min lockout)
|
||||||
- [x] Per-connection isolated PTY + sidecar managers
|
- [x] Per-connection isolated PTY + sidecar managers
|
||||||
|
- [x] Command response propagation: structured responses (pty_created, pong, error) via shared event channel
|
||||||
|
- [x] send_error() helper for consistent error reporting with commandId correlation
|
||||||
|
- [x] PTY creation confirmation: pty_created event with session ID and commandId
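The auth rate limit listed above (10 attempts, 5min lockout) could look something like the following TypeScript sketch. It is an assumption-laden illustration, not the relay's Rust code: the `AuthRateLimiter` class, keying by an opaque client string, and resetting the counter on success or lockout are all choices made for this example.

```typescript
// Hypothetical sketch of "10 failed attempts, 5-minute lockout"; not the relay's actual code.
const MAX_FAILURES = 10;
const LOCKOUT_MS = 5 * 60 * 1000;

class AuthRateLimiter {
  // Keyed by an opaque client identifier (for example a remote address); this keying is an assumption.
  private state = new Map<string, { failures: number; lockedUntil: number }>();

  /** True if `key` is currently locked out at time `now` (milliseconds). */
  isLocked(key: string, now: number): boolean {
    const s = this.state.get(key);
    return s !== undefined && now < s.lockedUntil;
  }

  /** Count a failed auth attempt; the 10th failure starts a 5-minute lockout. */
  recordFailure(key: string, now: number): void {
    const s = this.state.get(key) ?? { failures: 0, lockedUntil: 0 };
    s.failures += 1;
    if (s.failures >= MAX_FAILURES) {
      s.lockedUntil = now + LOCKOUT_MS;
      s.failures = 0; // start counting afresh once the lockout expires
    }
    this.state.set(key, s);
  }

  /** A successful auth clears the failure history for `key`. */
  recordSuccess(key: string): void {
    this.state.delete(key);
  }
}
```

Passing `now` explicitly rather than reading the clock inside the class keeps the limiter deterministic and easy to test.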
|
||||||
|
|
||||||
### Phase C: Add `RemoteManager` to controller [status: complete]
|
### Phase C: Add `RemoteManager` to controller [status: complete]
|
||||||
- [x] New remote.rs module in src-tauri — WebSocket client connections to relay instances
|
- [x] New remote.rs module in src-tauri — WebSocket client connections to relay instances
|
||||||
- [x] Machine lifecycle: add/remove/connect/disconnect
|
- [x] Machine lifecycle: add/remove/connect/disconnect
|
||||||
- [x] 12 new Tauri commands for remote operations
|
- [x] 12 new Tauri commands for remote operations
|
||||||
- [x] Heartbeat ping every 15s
|
- [x] Heartbeat ping every 15s
|
||||||
|
- [x] PTY creation event: emits remote-pty-created Tauri event with machineId, ptyId, commandId
|
||||||
|
- [x] Exponential backoff reconnection on disconnect (1s/2s/4s/8s/16s/30s cap)
|
||||||
|
- [x] attempt_ws_connect() probe function (5s timeout, auth header, immediate close)
|
||||||
|
- [x] Reconnection events: remote-machine-reconnecting, remote-machine-reconnect-ready
|
||||||
|
|
||||||
### Phase D: Frontend integration [status: complete]
|
### Phase D: Frontend integration [status: complete]
|
||||||
- [x] remote-bridge.ts adapter for machine management + remote events
|
- [x] remote-bridge.ts adapter for machine management + remote events
|
||||||
|
|
@ -287,6 +294,7 @@ Architecture designed in [multi-machine.md](multi-machine.md). Implementation ex
|
||||||
- [x] Sidebar auto-groups remote panes by machine label
|
- [x] Sidebar auto-groups remote panes by machine label
|
||||||
|
|
||||||
### Remaining Work
|
### Remaining Work
|
||||||
- [ ] Reconnection logic with exponential backoff
|
- [x] Reconnection logic with exponential backoff — implemented in remote.rs
|
||||||
|
- [x] Relay command response propagation — implemented in bterminal-relay main.rs
|
||||||
- [ ] Real-world relay testing (2 machines)
|
- [ ] Real-world relay testing (2 machines)
|
||||||
- [ ] TLS/certificate pinning
|
- [ ] TLS/certificate pinning
|
||||||
|
|
|
||||||
|
|
@ -311,8 +311,27 @@ Design: No separate sidecar process per subagent. Parent's sidecar handles all;
|
||||||
- bterminal-relay: tokio, tokio-tungstenite, clap, env_logger, futures-util
|
- bterminal-relay: tokio, tokio-tungstenite, clap, env_logger, futures-util
|
||||||
- src-tauri: tokio-tungstenite, tokio, futures-util, uuid (added for RemoteManager)
|
- src-tauri: tokio-tungstenite, tokio, futures-util, uuid (added for RemoteManager)
|
||||||
|
|
||||||
|
### Session: 2026-03-06 (continued) — Relay Hardening & Reconnection
|
||||||
|
|
||||||
|
#### Relay Command Response Propagation
|
||||||
|
- [x] Shared event channel between EventSink and command response sender (sink_tx clone in bterminal-relay)
|
||||||
|
- [x] send_error() helper function: all command failures now emit RelayEvent with commandId + error message instead of just logging
|
||||||
|
- [x] ping command: now sends pong response via event channel (was a no-op)
|
||||||
|
- [x] pty_create: returns pty_created event with session ID and commandId for correlation
|
||||||
|
- [x] All error paths (pty_write, pty_resize, pty_close, agent_query, agent_stop, sidecar_restart) use send_error()
|
||||||
|
|
||||||
|
#### RemoteManager Reconnection
|
||||||
|
- [x] Exponential backoff reconnection in remote.rs: spawns async tokio task on disconnect
|
||||||
|
- [x] Backoff schedule: 1s, 2s, 4s, 8s, 16s, 30s (capped)
|
||||||
|
- [x] attempt_ws_connect() probe function: connects with proper WebSocket upgrade + auth header, 5s timeout, immediate close
|
||||||
|
- [x] Emits remote-machine-reconnecting (with backoffSecs) and remote-machine-reconnect-ready Tauri events
|
||||||
|
- [x] Cancellation: stops if machine removed (not in HashMap) or manually reconnected (status != disconnected)
|
||||||
|
- [x] Fixed scoping: disconnection cleanup uses inner block to release mutex before emitting event
|
||||||
|
|
||||||
|
#### RemoteManager PTY Creation Confirmation
|
||||||
|
- [x] Handles pty_created event type from relay: emits remote-pty-created Tauri event with machineId, ptyId, commandId
|
||||||
|
|
||||||
### Next Steps
|
### Next Steps
|
||||||
- [ ] Reconnection logic with exponential backoff
|
|
||||||
- [ ] Real-world relay testing (2 machines)
|
- [ ] Real-world relay testing (2 machines)
|
||||||
- [ ] TLS/certificate pinning for relay connections
|
- [ ] TLS/certificate pinning for relay connections
|
||||||
- [ ] Deno sidecar: test with real claude CLI, benchmark startup time vs Node.js
|
- [ ] Deno sidecar: test with real claude CLI, benchmark startup time vs Node.js
|
||||||
|
|
|
||||||
|
|
@ -151,7 +151,7 @@ See [phases.md](phases.md) for the full phased implementation plan.
|
||||||
## Open Questions
|
## Open Questions
|
||||||
|
|
||||||
1. **Node.js or Deno for sidecar?** Resolved: Deno-first with Node.js fallback. SidecarCommand struct in sidecar.rs abstracts the choice. Deno preferred (runs TS directly, compiles to single binary). Falls back to Node.js if Deno not in PATH.
|
1. **Node.js or Deno for sidecar?** Resolved: Deno-first with Node.js fallback. SidecarCommand struct in sidecar.rs abstracts the choice. Deno preferred (runs TS directly, compiles to single binary). Falls back to Node.js if Deno not in PATH.
|
||||||
2. **Multi-machine support?** Resolved: Implemented (Phases A-D complete). See [multi-machine.md](multi-machine.md) for architecture. bterminal-core crate extracted, bterminal-relay binary built, RemoteManager + frontend integration done. Remaining: reconnection logic, real-world testing, TLS.
|
2. **Multi-machine support?** Resolved: Implemented (Phases A-D complete). See [multi-machine.md](multi-machine.md) for architecture. bterminal-core crate extracted, bterminal-relay binary built, RemoteManager + frontend integration done. Reconnection with exponential backoff implemented. Remaining: real-world testing, TLS.
|
||||||
3. **Agent Teams integration?** Phase 7 — frontend routing implemented (subagent pane spawning, parent/child navigation). Needs real-world testing with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1.
|
3. **Agent Teams integration?** Phase 7 — frontend routing implemented (subagent pane spawning, parent/child navigation). Needs real-world testing with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1.
|
||||||
4. **Electron escape hatch threshold?** If Canvas xterm.js proves >50ms latency on target system with 4 panes, switch to Electron. Benchmark in Phase 2.
|
4. **Electron escape hatch threshold?** If Canvas xterm.js proves >50ms latency on target system with 4 panes, switch to Electron. Benchmark in Phase 2.
|
||||||
|
|
||||||
|
|
|
||||||