docs: update meta files for OTEL telemetry session
parent fd9f55faff
commit a69022756a
6 changed files with 73 additions and 16 deletions
@@ -67,6 +67,7 @@
- RemoteManager reconnection: exponential backoff (1s-30s cap) on disconnect, attempt_tcp_probe() (TCP-only, no WS upgrade), emits remote-machine-reconnecting and remote-machine-reconnect-ready events. Frontend listeners in remote-bridge.ts; machines store auto-reconnects on ready.
- v3 workspace store (`workspace.svelte.ts`) replaces layout store for v3. Groups loaded from `~/.config/bterminal/groups.json` via `groups-bridge.ts`. State: groups, activeGroupId, activeTab, focusedProjectId. Derived: activeGroup, activeProjects.
- v3 groups backend (`groups.rs`): load_groups(), save_groups(), default_groups(). Tauri commands: groups_load, groups_save.
- Telemetry (`telemetry.rs`): tracing + optional OTLP export to Tempo. `BTERMINAL_OTLP_ENDPOINT` env var controls (absent = console-only). TelemetryGuard in AppState with Drop-based shutdown. Frontend events route through `frontend_log` Tauri command → Rust tracing (no browser OTEL SDK — WebKit2GTK incompatible). `telemetry-bridge.ts` provides `tel.info/warn/error()` convenience API. Docker stack at `docker/tempo/` (Grafana port 9715).
- v3 SQLite additions: agent_messages table (per-project message persistence), project_agent_state table (sdkSessionId, cost, status per project), sessions.project_id column.
- v3 App.svelte: VSCode-style sidebar layout. Horizontal: left icon rail (GlobalTabBar, 2.75rem, single Settings gear icon) + expandable drawer panel (Settings only, content-driven width, max 50%) + main workspace (ProjectGrid always visible) + StatusBar. Sidebar has Settings only — Sessions/Docs/Context are project-specific (in ProjectBox tabs). Keyboard: Ctrl+B (toggle sidebar), Ctrl+, (settings), Escape (close).
- v3 component tree: App -> GlobalTabBar (settings icon) + sidebar-panel? (SettingsTab) + workspace (ProjectGrid) + StatusBar. See `docs/v3-task_plan.md` for full tree.
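
The reconnection note above mentions exponential backoff with a 1s floor and a 30s cap. A minimal sketch of such a delay schedule follows, for illustration only: the real logic lives in the Rust RemoteManager, `reconnectDelayMs` is a hypothetical name, and the doubling factor is an assumption (the note states only the 1s and 30s bounds).

```typescript
// Hypothetical helper: delay before reconnect attempt `attempt` (0-based).
// The 1 s base and 30 s cap come from the note above; doubling per attempt is assumed.
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;

function reconnectDelayMs(attempt: number): number {
  if (!Number.isInteger(attempt) || attempt < 0) {
    throw new RangeError("attempt must be a non-negative integer");
  }
  // Clamp the exponent so very large attempt counts cannot overflow.
  const exponent = Math.min(attempt, 16);
  return Math.min(BASE_DELAY_MS * 2 ** exponent, MAX_DELAY_MS);
}

// Delays for the first seven attempts: the schedule doubles until it hits the cap.
const delays = Array.from({ length: 7 }, (_, i) => reconnectDelayMs(i));
console.log(delays);
```

Capping the delay keeps a long outage from pushing retries arbitrarily far apart, while the doubling avoids hammering a machine that has just gone down.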
18
CHANGELOG.md
@@ -20,8 +20,26 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `remove_machine` now aborts `WsConnection` tasks before removal, preventing resource leak (remote.rs)
- `save_agent_messages` wrapped in `unchecked_transaction()` for atomic DELETE+INSERT, preventing partial writes on crash (session.rs)
- Non-null assertion `msg.event!` replaced with safe check `if (msg.event)` in agent bridge event handler (agent-bridge.ts)
- Runtime type guards (`str()`, `num()`) replace bare `as` casts on untrusted SDK wire format in sdk-messages.ts
- ANTHROPIC_* environment variables now stripped alongside CLAUDE* in sidecar agent-runner.ts
- Frontend persistence timestamps use `Math.floor(Date.now() / 1000)` matching Rust seconds convention (agent-dispatcher.ts)
- Remote disconnect handler converted from `try_lock()` to async `.lock().await` (remote.rs)
- `save_layout` pane_ids serialization error now propagated instead of silent fallback (session.rs)
- ctx.rs Mutex::lock() returns Err instead of panicking on poisoned lock (5 occurrences)
- ctx CLI: `int()` limit argument validated with try/except (ctx)
- ctx CLI: FTS5 MATCH query wrapped in try/except for syntax errors (ctx)
- File watcher: explicit error for root-level path instead of silent fallback (watcher.rs)
- Agent bridge payload validated before cast to SidecarMessage (agent-bridge.ts)
- Profile.toml and resource_dir failures now log::warn instead of silent empty fallback (lib.rs)
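
The entry above about runtime type guards replacing bare `as` casts can be illustrated with a small sketch. The guard names `str()` and `num()` come from the changelog; their signatures, the fallback behaviour, and the `parseCost` helper with its field names are assumptions made for this example, not the actual contents of sdk-messages.ts.

```typescript
// Assumed shape of the guards: return the value only if it has the expected
// runtime type, otherwise fall back to a safe default.
function str(value: unknown, fallback = ""): string {
  return typeof value === "string" ? value : fallback;
}

function num(value: unknown, fallback = 0): number {
  return typeof value === "number" && Number.isFinite(value) ? value : fallback;
}

// Hypothetical wire message; the field names are illustrative, not the SDK's schema.
function parseCost(raw: unknown): { sessionId: string; costUsd: number } {
  const obj =
    typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  return { sessionId: str(obj["session_id"]), costUsd: num(obj["cost_usd"]) };
}

// A malformed cost field degrades to the fallback instead of leaking a string
// into code that expects a number.
console.log(parseCost({ session_id: "abc", cost_usd: "oops" }));
```

A bare `raw as CostMessage` would compile but perform no check at runtime; the guards move that check to the boundary where untrusted data enters.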

### Added
- OpenTelemetry instrumentation: `telemetry.rs` module with TelemetryGuard (Drop-based shutdown), tracing + optional OTLP/HTTP export to Tempo, controlled by `BTERMINAL_OTLP_ENDPOINT` env var (absent = console-only fallback)
- `#[tracing::instrument]` on 10 key Tauri commands: pty_spawn, pty_kill, agent_query, agent_stop, agent_restart, remote_connect, remote_disconnect, remote_agent_query, remote_agent_stop, remote_pty_spawn
- `frontend_log` Tauri command: routes frontend telemetry events (level + message + context JSON) to Rust tracing layer with `source="frontend"` field
- `telemetry-bridge.ts` adapter: `tel.info/warn/error/debug/trace()` convenience wrappers for frontend → Rust tracing bridge via IPC
- Agent dispatcher telemetry: structured events for agent_started, agent_stopped, agent_error, sidecar_crashed, and agent_cost (with full metrics: costUsd, tokens, turns, duration)
- Docker Tempo + Grafana stack (`docker/tempo/`): Tempo (OTLP gRPC 4317, HTTP 4318, query 3200) + Grafana (port 9715) with auto-provisioned Tempo datasource
- 6 new Rust dependencies: tracing 0.1, tracing-subscriber 0.3, opentelemetry 0.28, opentelemetry_sdk 0.28, opentelemetry-otlp 0.28, tracing-opentelemetry 0.29
- `ctx_register_project` Tauri command and `ctxRegisterProject()` bridge function: registers a project in the ctx database via `INSERT OR IGNORE` into sessions table; opens DB read-write briefly then closes
- Agent preview terminal (`AgentPreviewPane.svelte`): read-only xterm.js terminal that subscribes to agent session messages in real-time; renders Bash commands as cyan `❯ command`, file operations as yellow `[Read/Write/Edit] path`, tool results (80-line truncation), text summaries, errors in red, session start/complete with cost; uses `disableStdin: true`, Canvas addon, theme hot-swap; spawned via 👁 button in TerminalTabs tab bar (appears when agent session is active); deduplicates — only one preview per session
- `TerminalTab.type` extended with `'agent-preview'` variant and `agentSessionId?: string` field in workspace store
10
CLAUDE.md
@@ -42,6 +42,7 @@ Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Pyth
| `v2/src-tauri/src/session.rs` | SessionDb (rusqlite, sessions + layout + settings + ssh_sessions + agent_messages + project_agent_state) |
| `v2/src-tauri/src/watcher.rs` | FileWatcherManager (notify crate, file change events) |
| `v2/src-tauri/src/ctx.rs` | CtxDb (read-only access to ~/.claude-context/context.db) |
| `v2/src-tauri/src/telemetry.rs` | OTEL telemetry (TelemetryGuard, tracing + OTLP export, BTERMINAL_OTLP_ENDPOINT) |
| `v2/src/lib/stores/workspace.svelte.ts` | v3 workspace store (project groups, tabs, focus, replaces layout store) |
| `v2/src/lib/stores/layout.svelte.ts` | v2 layout store (panes, presets, groups, persistence, Svelte 5 runes) |
| `v2/src/lib/stores/agents.svelte.ts` | Agent session store (messages, cost, parent/child hierarchy) |
@@ -59,6 +60,8 @@ Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Pyth
| `v2/src/lib/adapters/claude-bridge.ts` | Claude profiles + skills IPC wrapper |
| `v2/src/lib/adapters/groups-bridge.ts` | Groups config IPC wrapper (load/save) |
| `v2/src/lib/adapters/remote-bridge.ts` | Remote machine management IPC wrapper |
| `v2/src/lib/adapters/telemetry-bridge.ts` | Frontend telemetry bridge (routes events to Rust tracing via IPC) |
| `docker/tempo/` | Docker compose: Tempo + Grafana for trace visualization (port 9715) |
| `v2/src/lib/stores/machines.svelte.ts` | Remote machine state store (Svelte 5 runes) |
| `v2/src/lib/utils/agent-tree.ts` | Agent tree builder (hierarchy from messages) |
| `v2/src/lib/utils/highlight.ts` | Shiki syntax highlighter (lazy singleton, 13 languages) |
@@ -104,7 +107,8 @@ Terminal emulator with SSH and Claude Code session management. v1 (GTK3+VTE Pyth
- Multi-machine: bterminal-relay WebSocket server + RemoteManager WebSocket client
- SQLite session persistence (rusqlite, WAL mode) + layout restore on startup
- File watcher (notify crate) for live markdown viewer
- Rust deps (src-tauri): tauri, bterminal-core (path), rusqlite (bundled), dirs, notify, serde, tokio, tokio-tungstenite, futures-util, tauri-plugin-updater, tauri-plugin-dialog
- OpenTelemetry: tracing + tracing-subscriber + opentelemetry 0.28 + tracing-opentelemetry 0.29, OTLP/HTTP to Tempo, BTERMINAL_OTLP_ENDPOINT env var
- Rust deps (src-tauri): tauri, bterminal-core (path), rusqlite (bundled), dirs, notify, serde, tokio, tokio-tungstenite, futures-util, tracing, tracing-subscriber, opentelemetry, opentelemetry_sdk, opentelemetry-otlp, tracing-opentelemetry, tauri-plugin-updater, tauri-plugin-dialog
- Rust deps (bterminal-core): portable-pty, uuid, serde, serde_json, log
- Rust deps (bterminal-relay): bterminal-core, tokio, tokio-tungstenite, clap, env_logger, futures-util
- npm deps: @anthropic-ai/claude-agent-sdk, @xterm/xterm, @xterm/addon-canvas, @xterm/addon-fit, @tauri-apps/api, @tauri-apps/plugin-updater, @tauri-apps/plugin-dialog, marked, shiki, vitest (dev)
@@ -130,6 +134,10 @@ cd v2/src-tauri && cargo test # Cargo tests (backend)

# v2 install from source (builds + installs to ~/.local/bin/bterminal-v2)
./install-v2.sh

# Telemetry stack (Tempo + Grafana)
cd docker/tempo && docker compose up -d # Grafana at http://localhost:9715
BTERMINAL_OTLP_ENDPOINT=http://localhost:4318 npm run tauri dev # Enable OTLP export
```

## Conventions
15
README.md
@@ -140,6 +140,21 @@ Add remote machines in BTerminal Settings > Remote Machines (label, URL, token).

See [docs/multi-machine.md](docs/multi-machine.md) for full architecture details.

## Telemetry (v2)

BTerminal supports OpenTelemetry tracing with optional export to Tempo + Grafana.

```bash
# Start the tracing stack
cd docker/tempo && docker compose up -d
# Grafana at http://localhost:9715

# Run BTerminal with OTLP export enabled
BTERMINAL_OTLP_ENDPOINT=http://localhost:4318 npm run tauri dev
```

Without `BTERMINAL_OTLP_ENDPOINT`, telemetry falls back to console-only tracing (no network calls). Key Tauri commands (PTY, agent, remote) are instrumented with `#[tracing::instrument]`. Frontend events (agent lifecycle, errors, cost) route to Rust tracing via IPC bridge.
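
As a rough sketch of how the frontend side of that IPC bridge could look: the `frontend_log` command name and the level/message/context payload come from the changelog, while `createTel`, the argument names, and the injected `invoke` function are assumptions for this example. The real `telemetry-bridge.ts` may be structured differently, and in the app the injected function would be Tauri's own `invoke`.

```typescript
type Level = "error" | "warn" | "info" | "debug" | "trace";
type Context = Record<string, unknown>;
// Stand-in for Tauri's invoke; injected so the sketch runs without a Tauri runtime.
type Invoke = (cmd: string, args?: Record<string, unknown>) => Promise<unknown>;

function createTel(invoke: Invoke) {
  const send = (level: Level, message: string, context?: Context): Promise<void> =>
    invoke("frontend_log", { level, message, context: context ?? null })
      .then(() => undefined)
      // Telemetry is best-effort: a failed IPC call must not break the caller.
      .catch(() => undefined);

  const at = (level: Level) => (message: string, context?: Context) =>
    send(level, message, context);

  return {
    error: at("error"),
    warn: at("warn"),
    info: at("info"),
    debug: at("debug"),
    trace: at("trace"),
  };
}

// Usage with a fake invoke that records calls instead of crossing the IPC boundary.
const calls: Array<{ cmd: string; args?: Record<string, unknown> }> = [];
const tel = createTel(async (cmd, args) => {
  calls.push({ cmd, args });
});
void tel.info("agent_started", { sessionId: "s1" });
console.log(calls);
```

Routing every level through a single command keeps the Rust side as the only place that decides where events go (console or OTLP), which matches the console-only fallback described above.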

## Documentation

| Document | Description |
26
TODO.md
@@ -3,26 +3,22 @@
## Active

### v2/v3 Remaining
- [ ] **OTEL logging** -- Full-scope OpenTelemetry instrumentation: Rust backend (tracing + opentelemetry crates) + frontend bridge to Rust. Target: Tempo + Grafana. Research complete (Memora #1529).
- [ ] **Fix remaining audit findings** -- 5 HIGH + 10 MEDIUM + 6 LOW open from 2026-03-08 audit (Memora #1528). Includes: workspace teardown race, sdk-messages unvalidated casts, ANTHROPIC_* env leak, ctx CLI input validation.
- [ ] **E2E testing (Playwright/WebDriver)** -- Scaffold at v2/tests/e2e/README.md. Needs display server.
- [ ] **Multi-machine real-world testing** -- Test bterminal-relay with 2 machines.
- [ ] **Multi-machine TLS/certificate pinning** -- TLS support for bterminal-relay + certificate pinning in RemoteManager.
- [ ] **Agent Teams real-world testing** -- Test with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1.
- [ ] **Convert remaining components to rem** -- Apply rule 18 (relative-units.md) to all remaining px-based layout CSS across v3 components.
- [ ] **Workspace teardown race** -- clearAllAgentSessions in workspace.svelte.ts:69 races with in-flight persistence (1 remaining HIGH audit finding).

## Completed

- [x] **Security & correctness audit fixes** -- 5 CRITICAL + 4 HIGH findings fixed: path traversal in claude_read_skill (canonicalize + starts_with), re-entrant exit handler race (restarting guard), memory leak (clear maps in stopAgentDispatcher), listener leak (UnlistenFn array + destroyMachineListeners), fragile abort detection (controller.signal.aborted), unhandled rejection (async handleMessage + .catch), remote.rs try_lock→async lock, remove_machine task abort, session.rs transaction safety. 3 false positives dismissed. All 172 tests pass. | Done: 2026-03-08
- [x] **ctx dead code cleanup** -- Removed ContextTab.svelte (dead wrapper), CtxProject struct, list_projects() method, ctx_list_projects command, ctxListProjects() bridge function. Simplified register_project() guard. Added FTS5 limitation docs. 4 insertions, 81 deletions across 6 files. | Done: 2026-03-08
- [x] **ContextPane project-scoped redesign** -- ContextPane now takes projectName + projectCwd props from ProjectBox. Auto-registers project in ctx DB on mount (INSERT OR IGNORE). Removed project selector — context shown directly for current project. Added ctx_register_project Tauri command. | Done: 2026-03-08
- [x] **ctx init fix + UI init button** -- Fixed ctx CLI script (missing parent directory creation). Added ctx_init_db Tauri command + "Initialize Database" button in ContextPane that creates ~/.claude-context/context.db with full schema (tables + FTS5 + triggers) when DB doesn't exist. | Done: 2026-03-08
- [x] **Premium markdown typography** -- MarkdownPane CSS overhaul: hardcoded Inter font (not --ui-font-family which resolves to monospace), text-rendering optimizeLegibility, antialiased, font-feature-settings (Inter cv01-cv04, ss01). Tailwind-prose-inspired spacing (1.15-1.75em margins), gradient HR (fade to transparent edges), fade-in link underlines (text-decoration-color transition), italic blockquotes with translucent bg, inset box-shadow on code blocks, h5/h6 uppercase styles. Body color softened to --ctp-subtext1. All colors via --ctp-* vars for 17-theme compatibility. | Done: 2026-03-08
- [x] **Collapsible terminal panel** -- Terminal section on Claude tab collapses to a status bar (chevron + "Terminal" label + tab count badge). Click to expand/collapse. Default collapsed. Hidden on Files/Context tabs. | Done: 2026-03-08
- [x] **Sidebar simplification + markdown fixes** -- Sidebar stripped to Settings-only (Sessions/Docs/Context removed — project-specific). MarkdownPane file switching fixed ($effect watches filePath changes). MarkdownPane restyled: sans-serif font, --ctp-* vars, styled blockquotes/tables/links. Terminal area hidden on Files/Context tabs. | Done: 2026-03-08
- [x] **Agent preview terminal** -- AgentPreviewPane.svelte: read-only xterm.js terminal subscribing to agent session messages. Renders Bash commands (cyan), file ops (yellow), tool results, errors. 👁 button in TerminalTabs spawns preview tab. TerminalTab type extended with 'agent-preview' + agentSessionId field. | Done: 2026-03-08
- [x] **Terminal tabs close fix** -- Svelte 5 `$state<Map>` reactivity bug: Map.set() didn't trigger $derived updates. Changed projectTerminals from Map to Record (plain object). Fixes: tabs can now be closed, sequential tab naming works. | Done: 2026-03-08
- [x] **Project settings card redesign** -- SettingsTab project section redesigned: card layout per project with Svelte-state icon picker, inline-editable name, CWD left-ellipsis (direction:rtl), account/profile dropdown (listProfiles), custom toggle switch, subtle remove footer. ProjectHeader profile badge styled as blue pill. All CSS in rem. | Done: 2026-03-08
- [x] **VSCode-style sidebar redesign** -- VSCode-style left sidebar with icon rail + expandable drawer + always-visible workspace. | Done: 2026-03-08
- [x] **v3 Phases 6-10 Complete** -- Session continuity, workspace teardown, dead v2 component removal (~1,836 lines). | Done: 2026-03-07
- [x] **v3 Mission Control MVP (Phases 1-5)** -- Data model + groups.rs + workspace store + 12 Workspace components. 138 vitest + 36 cargo tests. | Done: 2026-03-07
- [x] **OTEL telemetry** -- Full-scope OpenTelemetry: telemetry.rs (TelemetryGuard, tracing + OTLP layers), telemetry-bridge.ts (frontend→Rust), #[tracing::instrument] on 10 commands, agent dispatcher lifecycle logging, Docker Tempo+Grafana stack (port 9715). BTERMINAL_OTLP_ENDPOINT env var controls export. | Done: 2026-03-08
- [x] **Medium/Low audit fixes** -- All 6 MEDIUM + 8 LOW findings fixed: runtime type guards in sdk-messages.ts, ANTHROPIC_* env stripping, timestamp mismatch, async lock, error propagation, input validation, mutex poisoning, log warnings, payload validation. 172/172 tests pass. | Done: 2026-03-08
- [x] **Security & correctness audit fixes** -- 5 CRITICAL + 4 HIGH findings fixed: path traversal, race conditions, memory leaks, listener leaks, transaction safety. 3 false positives dismissed. | Done: 2026-03-08
- [x] **ctx dead code cleanup** -- Removed ContextTab.svelte, CtxProject struct, list_projects(), ctx_list_projects command. | Done: 2026-03-08
- [x] **ContextPane project-scoped redesign** -- Auto-registers project in ctx DB on mount. Removed project selector. | Done: 2026-03-08
- [x] **ctx init fix + UI init button** -- Fixed ctx CLI, added Initialize Database button in ContextPane. | Done: 2026-03-08
- [x] **Premium markdown typography** -- MarkdownPane CSS overhaul with Inter font, prose spacing, gradient HR. | Done: 2026-03-08
- [x] **Collapsible terminal panel** -- Terminal section collapses to status bar. Default collapsed. | Done: 2026-03-08
- [x] **Sidebar simplification + markdown fixes** -- Sidebar stripped to Settings-only. MarkdownPane file switching fixed. | Done: 2026-03-08
- [x] **Agent preview terminal** -- AgentPreviewPane.svelte: read-only xterm.js terminal for agent activity. | Done: 2026-03-08
@@ -377,3 +377,22 @@ All editor themes map to the same `--ctp-*` CSS custom property names (26 vars).
#### Verification
- All 138 vitest tests pass
- Vite build succeeds

### Session: 2026-03-08 — Security Audit Fixes + OTEL Telemetry

#### Security Audit Fixes
- [x] Fixed all CRITICAL (5) + HIGH (4) findings — path traversal, race conditions, memory leaks, listener leaks, transaction safety
- [x] Fixed all MEDIUM (6) findings — runtime type guards, ANTHROPIC_* env stripping, timestamp mismatch, async lock, error propagation
- [x] Fixed all LOW (8) findings — input validation, mutex poisoning, log warnings, payload validation
- [x] 3 false positives dismissed with rationale
- [x] 172/172 tests pass (138 vitest + 34 cargo)

#### OTEL Telemetry Implementation
- [x] Added 6 Rust deps: tracing, tracing-subscriber, opentelemetry 0.28, opentelemetry_sdk 0.28, opentelemetry-otlp 0.28, tracing-opentelemetry 0.29
- [x] Created `v2/src-tauri/src/telemetry.rs` — TelemetryGuard, layer composition, OTLP export via BTERMINAL_OTLP_ENDPOINT env var
- [x] Integrated into lib.rs: TelemetryGuard in AppState, init before Tauri builder
- [x] Instrumented 10 Tauri commands with `#[tracing::instrument]`: pty_spawn, pty_kill, agent_query/stop/restart, remote_connect/disconnect/agent_query/agent_stop/pty_spawn
- [x] Added `frontend_log` Tauri command for frontend→Rust tracing bridge
- [x] Created `v2/src/lib/adapters/telemetry-bridge.ts` — `tel.info/warn/error/debug/trace()` convenience API
- [x] Wired agent dispatcher lifecycle events: agent_started, agent_stopped, agent_error, sidecar_crashed, cost metrics
- [x] Created Docker compose stack: `docker/tempo/` — Tempo (4317/4318/3200) + Grafana (port 9715)