diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index a147b75..49f6082 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -5,7 +5,7 @@ - v1 is a single-file Python app (`bterminal.py`). Changes are localized. - v2 docs are in `docs/`. Architecture decisions are in `docs/task_plan.md`. - v2 Phases 1-7 + multi-machine (A-D) + profiles/skills complete. Extras: SSH, ctx, themes, detached mode, auto-updater, shiki, copy/paste, session resume, drag-resize, session groups, Deno sidecar, Claude profiles, skill discovery. -- v3 Mission Control (All Phases 1-10 Complete + S-1 Phase 1/1.5/2/3 + S-2 Session Anchors + Provider Adapter Pattern + Provider Runners + Memora Adapter + SOLID Phase 3 + Multi-Agent Orchestration): project groups, workspace store, 15 Workspace components, session continuity, workspace teardown, file overlap conflict detection, inotify-based external write detection, multi-provider adapter pattern (3 phases + Codex/Ollama runners), worktree isolation, session anchors, Memora adapter (read-only SQLite), SOLID refactoring (agent-dispatcher split → 4 utils, session.rs split → 7 sub-modules, branded types), multi-agent orchestration (btmsg inter-agent messaging, bttask kanban task board, agent prompt generator, BTMSG_AGENT_ID env passthrough, periodic re-injection, role-specific tabs: Manager=Tasks, Architect=Arch, Tester=Selenium+Tests, Reviewer=Tasks), dead v2 component cleanup, dashboard metrics panel (MetricsPanel.svelte — live health + task counts + SVG sparkline history), auto-wake Manager scheduler (3 strategies: persistent/on-demand/smart, 6 signal types, configurable threshold), reviewer agent role (workflow prompt, #review-queue/#review-log auto-channels, reviewQueueDepth attention scoring 10pts/task cap 50, Tasks tab). 388 vitest + 68 cargo + 82 E2E (smoke + Phase A + Phase B LLM-judged). +- v3 Mission Control (All Phases 1-10 + Production Readiness Complete): project groups, workspace store, 15+ Workspace components, session continuity, multi-provider adapter pattern, worktree isolation, session anchors, Memora adapter, SOLID refactoring, multi-agent orchestration (btmsg/bttask, 4 Tier 1 roles, role-specific tabs), dashboard metrics, auto-wake scheduler, reviewer agent. Production: sidecar supervisor (auto-restart, exponential backoff), FTS5 search (3 virtual tables, Spotlight overlay), plugin system (sandboxed new Function(), permission-gated), Landlock sandbox (kernel 6.2+), secrets management (system keyring), OS+in-app notifications, keyboard-first UX (18+ palette commands, vi-nav), agent health monitoring (heartbeats, dead letter queue), audit logging, error classification (6 types), optimistic locking (bttask). 409 vitest + 109 cargo + 82 E2E. - v3 docs: `docs/v3-task_plan.md`, `docs/v3-findings.md`, `docs/v3-progress.md`. - Consult Memora (tag: `bterminal`) before making architectural changes. diff --git a/CHANGELOG.md b/CHANGELOG.md index 2f2ddda..8e1791a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] ### Added +- **Sidecar crash recovery/supervision** — `bterminal-core/src/supervisor.rs`: SidecarSupervisor wraps SidecarManager with auto-restart, exponential backoff (1s base, 30s cap, 5 retries), SidecarHealth enum (Healthy/Degraded/Failed), 5min stability window. 17 tests +- **Notification system** — OS desktop notifications via `notify-rust` + in-app NotificationCenter.svelte (bell icon, unread badge, history max 100, 6 notification types). Agent dispatcher emits on complete/error/crash. notifications-bridge.ts adapter +- **Secrets management** — `keyring` crate with linux-native (libsecret). SecretsManager in secrets.rs: store/get/delete/list with `__bterminal_keys__` metadata tracking. SettingsTab Secrets section. secrets-bridge.ts adapter. No plaintext fallback +- **Keyboard-first UX** — Alt+1-5 project jump, Ctrl+H/L vi-nav, Ctrl+Shift+1-9 tab switch, Ctrl+J terminal toggle, Ctrl+Shift+K focus agent, Ctrl+Shift+F search overlay. `isEditing()` guard prevents conflicts. CommandPalette rewritten: 18+ commands, 6 categories, fuzzy filter, arrow nav, keyboard shortcuts overlay +- **Agent health monitoring** — heartbeats table + dead_letter_queue table in btmsg.db. 15s heartbeat polling in ProjectBox. Stale detection (5min threshold). ProjectHeader heart indicator (green/yellow/red). StatusBar health badge +- **FTS5 full-text search** — rusqlite upgraded to `bundled-full`. SearchDb with 3 FTS5 virtual tables (search_messages, search_tasks, search_btmsg). SearchOverlay.svelte: Spotlight-style Ctrl+Shift+F overlay, 300ms debounce, grouped results with FTS5 highlight snippets +- **Plugin system** — `~/.config/bterminal/plugins/` with plugin.json manifest. plugins.rs: discovery, path-traversal-safe file reading, permission validation. plugin-host.ts: sandboxed `new Function()` execution, permission-gated API (palette, btmsg:read, bttask:read, events). plugins.svelte.ts store. SettingsTab plugins section. Example hello plugin +- **Landlock sandbox** — `bterminal-core/src/sandbox.rs`: SandboxConfig with RW/RO paths, applied via `pre_exec()` in sidecar child process. Requires kernel 6.2+ (graceful fallback). Per-project toggle in SettingsTab +- **Error classification** — `error-classifier.ts`: classifyApiError() with 6 types (rate_limit, auth, quota, overloaded, network, unknown), actionable messages, retry delays. 20 tests +- **Audit log** — audit_log table in btmsg.db. AuditLogTab.svelte: Manager-only tab, filter by type+agent, 5s auto-refresh. audit-bridge.ts adapter. Events: agent_start/stop/error, task changes, wake events, prompt injection +- **Usage meter** — UsageMeter.svelte: compact inline cost/token meter with color thresholds (50/75/90%), hover tooltip. Integrated in AgentPane cost bar +- **Team agent orchestration** — install_cli_tools() copies btmsg/bttask to ~/.local/bin on startup. register_agents_from_groups() with bidirectional contacts. ensure_review_channels_for_group() creates #review-queue/#review-log per group +- **Optimistic locking for bttask** — `version` column in tasks table. `WHERE id=? AND version=?` in update_task_status(). Conflict detection in TaskBoardTab. Both Rust + Python CLI updated - **Unified test runner** — `v2/scripts/test-all.sh` runs vitest + cargo tests with optional E2E (`--e2e` flag). npm scripts: `test:all`, `test:all:e2e`, `test:cargo`. Summary output with color-coded pass/fail - **Testing gate rule** — `.claude/rules/20-testing-gate.md` requires running full test suite after every major change (new features, refactors touching 3+ files, store/adapter/bridge/backend changes) - **E2E test mode infrastructure** — `BTERMINAL_TEST=1` env var disables file watchers (watcher.rs, fs_watcher.rs), wake scheduler, and allows data/config dir overrides via `BTERMINAL_TEST_DATA_DIR`/`BTERMINAL_TEST_CONFIG_DIR`. New `is_test_mode` Tauri command bridges test state to frontend diff --git a/CLAUDE.md b/CLAUDE.md index 5968e58..094a1b0 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -2,7 +2,7 @@ ## Project Overview -Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Python) is production-stable. v2 redesign (Tauri 2.x + Svelte 5 + Claude Agent SDK) Phases 1-7 + multi-machine (A-D) + profiles/skills complete. Packaging: .deb + AppImage via GitHub Actions CI. v3 Mission Control (All Phases 1-10 Complete + Sidebar Redesign + Multi-Agent Orchestration): multi-project dashboard with project groups, per-project Claude sessions with session continuity, team agents panel, terminal tabs, VSCode-style left sidebar, multi-agent orchestration (Tier 1 management agents: Manager/Architect/Tester with role-specific tabs, btmsg inter-agent messaging, bttask kanban task board, BTMSG_AGENT_ID env passthrough, periodic system prompt re-injection, custom context for all tiers). +Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Python) is production-stable. v2 redesign (Tauri 2.x + Svelte 5 + Claude Agent SDK) Phases 1-7 + multi-machine (A-D) + profiles/skills complete. Packaging: .deb + AppImage via GitHub Actions CI. v3 Mission Control (All Phases 1-10 Complete + Production Readiness): multi-project dashboard with project groups, per-project Claude sessions with session continuity, team agents panel, terminal tabs, VSCode-style left sidebar, multi-agent orchestration (Tier 1 management agents: Manager/Architect/Tester/Reviewer with role-specific tabs, btmsg inter-agent messaging, bttask kanban task board with optimistic locking). Production features: sidecar crash recovery/supervision, FTS5 full-text search, plugin system, Landlock sandboxing, secrets management (system keyring), OS + in-app notifications, keyboard-first UX (18+ palette commands), agent health monitoring + dead letter queue, audit logging, error classification. - **Repository:** github.com/DexterFromLab/BTerminal - **License:** MIT @@ -37,9 +37,15 @@ Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Pyth | `v2/src-tauri/src/groups.rs` | Groups config (load/save ~/.config/bterminal/groups.json) | | `v2/src-tauri/src/fs_watcher.rs` | ProjectFsWatcher (inotify per-project recursive file change detection, S-1 Phase 2) | | `v2/src-tauri/src/lib.rs` | AppState + setup + handler registration (~170 lines) | -| `v2/src-tauri/src/commands/` | 12 domain command modules (pty, agent, watcher, session, persistence, knowledge, claude, groups, files, remote, misc, bttask) | -| `v2/src-tauri/src/btmsg.rs` | Agent messaging backend (agents, DMs, channels, contacts ACL; SQLite WAL mode, named column access) | -| `v2/src-tauri/src/bttask.rs` | Task board backend (list, create, update status, delete, comments, review_queue_count; shared btmsg.db; auto-posts to #review-queue channel on task→review transition) | +| `v2/src-tauri/src/commands/` | 16 domain command modules (pty, agent, watcher, session, persistence, knowledge, claude, groups, files, remote, misc, bttask, notifications, plugins, search, secrets) | +| `v2/src-tauri/src/btmsg.rs` | Agent messaging backend (agents, DMs, channels, contacts ACL, heartbeats, dead_letter_queue, audit_log; SQLite WAL mode, named column access) | +| `v2/src-tauri/src/bttask.rs` | Task board backend (list, create, update status with optimistic locking, delete, comments, review_queue_count; shared btmsg.db) | +| `v2/src-tauri/src/search.rs` | FTS5 full-text search (SearchDb, 3 virtual tables: search_messages/tasks/btmsg, index/search/rebuild) | +| `v2/src-tauri/src/secrets.rs` | SecretsManager (keyring crate, linux-native/libsecret, store/get/delete/list with metadata tracking) | +| `v2/src-tauri/src/plugins.rs` | Plugin discovery (scan config dir for plugin.json, path-traversal-safe file reading, permission validation) | +| `v2/src-tauri/src/notifications.rs` | Desktop notifications (notify-rust, graceful fallback if daemon unavailable) | +| `v2/bterminal-core/src/supervisor.rs` | SidecarSupervisor (auto-restart, exponential backoff 1s-30s, 5 retries, SidecarHealth enum, 17 tests) | +| `v2/bterminal-core/src/sandbox.rs` | Landlock sandbox (SandboxConfig RW/RO paths, pre_exec() integration, kernel 6.2+ graceful fallback) | | `v2/src-tauri/src/sidecar.rs` | SidecarManager (thin re-export from bterminal-core) | | `v2/src-tauri/src/event_sink.rs` | TauriEventSink (implements EventSink for AppHandle) | | `v2/src-tauri/src/remote.rs` | RemoteManager (WebSocket client connections to relays) | @@ -106,7 +112,19 @@ Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Pyth | `v2/src/lib/utils/highlight.ts` | Shiki syntax highlighter (lazy singleton, 13 languages) | | `v2/src/lib/utils/detach.ts` | Detached pane mode (pop-out windows via URL params) | | `v2/src/lib/utils/updater.ts` | Tauri auto-updater utility | -| `v2/src/lib/stores/notifications.svelte.ts` | Toast notification store (notify, dismiss) | +| `v2/src/lib/stores/notifications.svelte.ts` | Notification store (toast + history, 6 NotificationTypes, unread badge, max 100 history) | +| `v2/src/lib/stores/plugins.svelte.ts` | Plugin store (command registry, event bus, loadAllPlugins/unloadAllPlugins) | +| `v2/src/lib/adapters/audit-bridge.ts` | Audit log IPC adapter (logAuditEvent, getAuditLog, AuditEntry, AuditEventType) | +| `v2/src/lib/adapters/notifications-bridge.ts` | Desktop notification IPC wrapper (sendDesktopNotification) | +| `v2/src/lib/adapters/plugins-bridge.ts` | Plugin discovery IPC wrapper (discoverPlugins, readPluginFile) | +| `v2/src/lib/adapters/search-bridge.ts` | FTS5 search IPC wrapper (initSearch, searchAll, rebuildIndex, indexMessage) | +| `v2/src/lib/adapters/secrets-bridge.ts` | Secrets IPC wrapper (storeSecret, getSecret, deleteSecret, listSecrets, hasKeyring) | +| `v2/src/lib/utils/error-classifier.ts` | API error classification (6 types: rate_limit/auth/quota/overloaded/network/unknown, retry logic, 20 tests) | +| `v2/src/lib/plugins/plugin-host.ts` | Sandboxed plugin runtime (new Function(), permission-gated API, load/unload lifecycle) | +| `v2/src/lib/components/Agent/UsageMeter.svelte` | Compact inline usage meter (color thresholds 50/75/90%, hover tooltip) | +| `v2/src/lib/components/Notifications/NotificationCenter.svelte` | Bell icon + dropdown notification panel (unread badge, history, mark read/clear) | +| `v2/src/lib/components/Workspace/AuditLogTab.svelte` | Manager audit log tab (filter by type+agent, 5s auto-refresh, max 200 entries) | +| `v2/src/lib/components/Workspace/SearchOverlay.svelte` | FTS5 search overlay (Ctrl+Shift+F, Spotlight-style, 300ms debounce, grouped results) | | `v2/src/lib/stores/theme.svelte.ts` | Theme store (17 themes: 4 Catppuccin + 7 Editor + 6 Deep Dark, UI + terminal font restoration on startup) | | `v2/src/lib/styles/themes.ts` | Theme palette definitions (17 themes), ThemeId/ThemePalette/ThemeMeta types, THEME_LIST | | `v2/src/lib/styles/catppuccin.css` | CSS custom properties: 26 --ctp-* color vars + --ui-font-* + --term-font-* | @@ -167,8 +185,8 @@ Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Pyth - SQLite session persistence (rusqlite, WAL mode) + layout restore on startup - File watcher (notify crate) for live markdown viewer - OpenTelemetry: tracing + tracing-subscriber + opentelemetry 0.28 + tracing-opentelemetry 0.29, OTLP/HTTP to Tempo, BTERMINAL_OTLP_ENDPOINT env var -- Rust deps (src-tauri): tauri, bterminal-core (path), rusqlite (bundled), dirs, notify, serde, tokio, tokio-tungstenite, futures-util, tracing, tracing-subscriber, opentelemetry, opentelemetry_sdk, opentelemetry-otlp, tracing-opentelemetry, tauri-plugin-updater, tauri-plugin-dialog -- Rust deps (bterminal-core): portable-pty, uuid, serde, serde_json, log +- Rust deps (src-tauri): tauri, bterminal-core (path), rusqlite (bundled-full, FTS5), dirs, notify, serde, tokio, tokio-tungstenite, futures-util, tracing, tracing-subscriber, opentelemetry, opentelemetry_sdk, opentelemetry-otlp, tracing-opentelemetry, tauri-plugin-updater, tauri-plugin-dialog, notify-rust, keyring (linux-native) +- Rust deps (bterminal-core): portable-pty, uuid, serde, serde_json, log, landlock - Rust deps (bterminal-relay): bterminal-core, tokio, tokio-tungstenite, clap, env_logger, futures-util - npm deps: @anthropic-ai/claude-agent-sdk, @xterm/xterm, @xterm/addon-canvas, @xterm/addon-fit, @tauri-apps/api, @tauri-apps/plugin-updater, @tauri-apps/plugin-dialog, marked, shiki, pdfjs-dist, vitest (dev) - Source: `v2/` directory diff --git a/TODO.md b/TODO.md index d567107..9237ef4 100644 --- a/TODO.md +++ b/TODO.md @@ -2,13 +2,6 @@ ## Active -### v3 Production Readiness (from Tribunal Assessment 2026-03-12) -- [ ] **Sidecar crash recovery/supervision** -- Rust supervisor threads for sidecar processes: detect exit codes, restart with exponential backoff (max 5 retries), surface alerts in dashboard. Currently silent failure. | Impact: 9, Effort: M -- [ ] **Notification system (OS + in-app)** -- notify-rust for desktop notifications on agent events (task complete, error, review requested), in-app bell icon with notification history. | Impact: 8, Effort: S -- [ ] **Secrets management (system keyring)** -- Tauri keychain plugin for API key storage, migrate env var keys to system keyring on first run. | Impact: 8, Effort: M -- [ ] **Keyboard-first UX pass** -- Vi-style pane switching (Ctrl+hjkl), Alt+1-5 project jump, ensure command palette covers 100% of actions. | Impact: 8, Effort: S -- [ ] **Agent health monitoring + dead letter queue** -- Tier 1 agents respond to heartbeat within configurable timeout, dead agents flagged in dashboard, queue undelivered btmsg. | Impact: 9, Effort: M - ### v3 Remaining - [ ] **Multi-machine real-world testing** -- Test bterminal-relay with 2 machines. Mark as experimental until docker-compose integration tests pass. - [ ] **Multi-machine TLS/certificate pinning** -- TLS support for bterminal-relay + certificate pinning in RemoteManager. @@ -16,6 +9,7 @@ ## Completed +- [x] **v3 Production Readiness — ALL tribunal items** -- Implemented all 13 features from tribunal assessment: sidecar supervisor, notifications, secrets, keyboard UX, agent health, search, plugins, sandbox, error classifier, audit log, team agent orchestration, optimistic locking, usage meter. 409 vitest + 109 cargo. | Done: 2026-03-12 - [x] **Unified test runner + testing gate rule** -- Created v2/scripts/test-all.sh (vitest + cargo + optional E2E), added npm scripts (test:all, test:all:e2e, test:cargo), added .claude/rules/20-testing-gate.md requiring full suite after major changes. | Done: 2026-03-12 - [x] **E2E testing — Phase B+ & test fixes** -- Phase B: LLM judge (llm-judge.ts, claude-haiku-4-5), 6 multi-project scenarios, CI workflow (3 jobs). Test fixes: 27 failures across 3 spec files. 388 vitest + 68 cargo + 82 E2E (0 fail, 4 skip). | Done: 2026-03-12 - [x] **Reviewer agent role** -- Tier 1 specialist with role='reviewer'. Reviewer workflow in agent-prompts.ts (8-step process). #review-queue/#review-log auto-channels. reviewQueueDepth in attention scoring (10pts/task, cap 50). 388 vitest + 76 cargo. | Done: 2026-03-12 @@ -25,4 +19,3 @@ - [x] **Regression tests + sidecar env security** -- 49 new tests. Added ANTHROPIC_* to Rust env strip. 327 vitest + 72 cargo. | Done: 2026-03-11 - [x] **Integrate dexter_changes + fix 5 critical bugs** -- Fixed: btmsg.rs column index, btmsg-bridge camelCase, GroupAgentsPanel stopPropagation, ArchitectureTab PlantUML, TestingTab Tauri 2.x. | Done: 2026-03-11 - [x] **SOLID Phase 3 — Primitive obsession** -- Branded types SessionId/ProjectId. Applied to ~130 sites. 293 vitest + 49 cargo. | Done: 2026-03-11 -- [x] **SOLID Phases 1-2** -- AttentionScorer extraction, type guards, lib.rs split (976→170), agent-dispatcher split (496→260), session.rs split (1008→7 modules). | Done: 2026-03-11 diff --git a/docs/v3-progress.md b/docs/v3-progress.md index c68a39b..031edd1 100644 --- a/docs/v3-progress.md +++ b/docs/v3-progress.md @@ -1006,3 +1006,38 @@ Reviewed and integrated Dexter's multi-agent orchestration branch (dexter_change - [x] Vitest: 388 passed, 0 failed (was 345, +43 from prior sessions) - [x] Cargo: 68 passed, 0 failed - [x] No regressions + +--- + +### Session: v3 Production Readiness — Tribunal Implementation (2026-03-12) + +Implemented ALL 13 features from tribunal assessment in 3 parallel waves (11 sub-agents total). 60+ files changed, 3861 insertions. + +#### Wave 1: Core Infrastructure (3 agents) +- [x] **Sidecar supervisor** — `bterminal-core/src/supervisor.rs`: SidecarSupervisor with exponential backoff (1s-30s, 5 retries), SidecarHealth enum, 5min stability window, 17 tests +- [x] **FTS5 search** — rusqlite `bundled-full`, SearchDb with 3 FTS5 virtual tables, SearchOverlay.svelte (Ctrl+Shift+F), search-bridge.ts +- [x] **Secrets management** — `keyring` crate (linux-native/libsecret), SecretsManager, secrets-bridge.ts, SettingsTab section + +#### Wave 2: Features (5 agents) +- [x] **Notification system** — notify-rust + NotificationCenter.svelte (bell icon, history, 6 types), notifications-bridge.ts +- [x] **Keyboard-first UX** — Alt+1-5 jump, Ctrl+H/L vi-nav, Ctrl+Shift+1-9 tabs, Ctrl+J terminal, CommandPalette rewrite (18+ cmds, 6 categories) +- [x] **Agent health monitoring** — heartbeats + dead_letter_queue tables, 15s poll, ProjectHeader heart indicator, StatusBar badge +- [x] **Plugin system** — plugins.rs discovery, plugin-host.ts sandboxed runtime, plugins.svelte.ts store, example plugin +- [x] **Landlock sandbox** — bterminal-core/src/sandbox.rs, SandboxConfig, pre_exec() integration, per-project toggle + +#### Wave 3: Integration (3 agents) +- [x] **Error classification** — error-classifier.ts (6 types, retry logic), 20 tests +- [x] **Audit log** — audit_log table, AuditLogTab.svelte, audit-bridge.ts +- [x] **Team agent orchestration** — install_cli_tools(), register_agents_from_groups(), bidirectional contacts, review channels +- [x] **Optimistic locking** — version column in bttask, conflict detection in Rust + Python CLI +- [x] **Usage meter** — UsageMeter.svelte, AgentPane integration + +#### Key Fixes +- PRAGMA journal_mode=WAL: changed to query_row for bundled-full compatibility (session/mod.rs, btmsg.rs, bttask.rs) +- SidecarConfig: changed to `Mutex` for interior mutability (sandbox updates) +- New Cargo deps: notify-rust 4, keyring 3 (linux-native), landlock 0.4 + +#### Verification +- [x] Vitest: 409 passed, 0 failed (+21 from prior) +- [x] Cargo: 109 passed, 0 failed (+41 from prior) +- [x] No regressions