Commit graph

328 commits

Author SHA1 Message Date
DexterFromLab
a7077c7987 Add Ctx Manager panel as new sidebar tab for browsing and editing project contexts
Adds a StackSwitcher to the sidebar with Sessions and Ctx tabs. The Ctx panel
provides a tree view of all ctx projects/entries with a detail preview pane,
CRUD operations (add/edit/delete projects and entries), right-click context
menus, and auto-refresh on tab switch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 16:02:17 +01:00
DexterFromLab
f9ec78ce1e Remove shared context from ctx get output to avoid misleading project info
Shared entries (server, webhooks, workflow) were shown for every project,
causing Claude to misattribute them. Now ctx get shows only project-specific
data. Use --shared flag to include shared context when needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 16:02:17 +01:00
DexterFromLab
af670871ed Replace ctx auto-setup with step-by-step wizard
- Remove silent setup_ctx and "Edit ctx entries" button from ClaudeCodeDialog
- Add CtxSetupWizard: 3-step guided flow (project registration, first entry, confirm)
- Show ctx status label in session dialog (registered vs new project)
- Launch wizard automatically on save when project_dir is set and ctx not initialized
- Add ctx cleanup prompt when deleting a Claude session
- Extract helper functions: _detect_project_description, _is_ctx_project_registered

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 16:02:17 +01:00
Hibryda
6938e8c3a9 chore: add nested Claude session E2E TODO + trim completed list
2026-03-12 07:34:40 +01:00
Hibryda
c9927a41e6 docs: update CHANGELOG and TODO for E2E fixture/judge fixes
2026-03-12 07:30:56 +01:00
Hibryda
4c0d27aca3 fix: LLM judge CLI context isolation (--setting-sources user, cwd /tmp)
2026-03-12 07:30:56 +01:00
Hibryda
78afb0e552 test: increase WebDriverIO timeout for LLM-judged E2E tests
Increase global mocha timeout from 60s to 180s in wdio.conf.js to accommodate longer-running LLM judge tests that evaluate agent responses and code generation. Add explicit per-test overrides for Phase B scenarios B4 and B5 to ensure adequate time for agent startup, execution, and LLM verification.

- wdio.conf.js: global timeout 60_000 → 180_000ms
- phase-b.test.ts: explicit 180_000ms timeout for B4 and B5 scenarios
2026-03-12 07:13:57 +01:00
Hibryda
f555186843 test: update WebDriverIO configuration with improved fixture setup and logging
2026-03-12 06:58:58 +01:00
Hibryda
a8a10ee4af chore: remove obsolete rules files (consolidated into 53/54 sequence)
2026-03-12 06:47:58 +01:00
Hibryda
ee198a2fdb chore: reorganize rules files — consolidate duplicates
Migrates legacy rule numbering (18, 20) to standardized sequence (53, 54) and adds new 18-preexisting-issues.md for handling pre-existing issues during development. This consolidates duplicate rule coverage across the old and new numbering schemes.

Files changed:
- Removed: 18-relative-units.md (moved to 53-relative-units.md)
- Removed: 20-testing-gate.md (moved to 54-testing-gate.md)
- Added: 18-preexisting-issues.md (new)
- Added: 53-relative-units.md (renamed from 18)
- Added: 54-testing-gate.md (renamed from 20)
2026-03-12 06:47:47 +01:00
Hibryda
65973fbf06 docs: add comprehensive E2E testing facility documentation
New docs/e2e-testing.md covering all 3 pillars: test fixtures
(isolated temp environments), test mode (BTERMINAL_TEST=1), and
LLM judge (dual-mode CLI/API). Includes spec phases, CI integration,
WebKit2GTK pitfalls, and troubleshooting guide.
2026-03-12 06:35:04 +01:00
Hibryda
a3185656eb feat: refactor LLM judge to dual-mode CLI/API and fix config test race
Refactor llm-judge.ts from raw API-only to dual-mode: CLI first
(spawns claude with --output-format text, unsets CLAUDECODE), API
fallback. Backend selectable via LLM_JUDGE_BACKEND env var.

Fix pre-existing race condition in config.rs tests where parallel
test execution caused env var mutations to interfere. Added static
Mutex to serialize env-mutating tests.
2026-03-12 06:35:04 +01:00
Hibryda
05c9e1abbb test: add Phase C E2E tests and fix pre-existing test failures
- Add phase-c.test.ts: 27 new E2E tests across 11 scenarios covering
  hardening sprint features (command palette, search overlay, notification
  center, keyboard navigation, settings panel, project health, metrics tab,
  context tab, files tab, LLM-judged settings/status bar)
- Fix 3 pre-existing failures in bterminal.test.ts: update stale CSS
  selectors (.group-name → .cmd-label, .palette-item.active → .selected)
- Register phase-c.test.ts in wdio.conf.js specs array
- Update test counts: 444 vitest + 151 cargo + 109 E2E = 704 total
2026-03-12 06:20:21 +01:00
Hibryda
661f092fb2 fix: use tauri::async_runtime::spawn for WAL checkpoint task
tokio::spawn() panics during Tauri setup in WebDriver E2E mode because
the Tokio runtime is not directly accessible. Switch to
tauri::async_runtime::spawn() which uses Tauri's managed runtime.
2026-03-12 05:51:51 +01:00
Hibryda
2aec5889f8 docs: add v3.0 release notes and update meta files for hardening sprint
- docs/v3-release-notes.md: comprehensive v3.0 release notes covering
  Mission Control, multi-agent orchestration, production readiness,
  multi-machine early access, test coverage, and known limitations
- docs/v3-progress.md: hardening sprint session entry
- CHANGELOG.md: security entries (TLS, WAL, plugin sandbox, Landlock)
  and bug fixes (subagent delegation, gitignore)
- TODO.md: hardening complete, remaining items moved to v3.1
- CLAUDE.md: updated test counts (444 vitest + 111 cargo)
2026-03-12 05:30:32 +01:00
Hibryda
8754b64ee3 fix: track plugin-host source and add 35 sandbox security tests
Fix .gitignore 'plugins/' rule that was accidentally ignoring source
files in v2/src/lib/plugins/. Narrow to /plugins/ and /v2/plugins/
(runtime plugin directories only). Track plugin-host.ts (was written
but never committed) and add comprehensive test suite covering all 13
shadowed globals, this-binding, permission gating, API freeze, and
lifecycle management.
2026-03-12 05:25:12 +01:00
Hibryda
e46b9e06d1 feat: add WAL checkpoint task and improve Landlock fallback logging
Add periodic PRAGMA wal_checkpoint(TRUNCATE) every 5 minutes for both
sessions.db and btmsg.db to prevent unbounded WAL growth under sustained
multi-agent load. Improve Landlock fallback log message with kernel
version requirement. Add WAL checkpoint tests.
2026-03-12 05:21:39 +01:00
Hibryda
83c6711cd6 feat: add TLS support to bterminal-relay
Add optional --tls-cert and --tls-key CLI args. When provided, the relay
wraps TCP streams with native-tls before WebSocket upgrade. Refactored
to generic accept_ws_with_auth<S> and run_ws_session<S> to avoid code
duplication between plain and TLS paths. Client side already supports
wss:// URLs via connect_async with native-tls feature.
2026-03-12 05:21:33 +01:00
Hibryda
cd774ab4bd feat: fix subagent delegation for Manager agents
Add multi-agent delegation documentation to Manager system prompt so
Claude knows it can spawn child agents via the Agent tool. Also inject
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 env var for Manager agents.
2026-03-12 05:21:26 +01:00
Hibryda
c304a8c06b docs: update meta files for testing facility and tribunal assessment
2026-03-12 04:57:29 +01:00
Hibryda
bbb5f24cf9 test: update tests for production readiness features
Update btmsg-bridge, bttask-bridge, and agent-dispatcher tests for new
APIs (registerAgents, version param, notification mocks).
2026-03-12 04:57:29 +01:00
Hibryda
c193db49a8 feat: integrate all production readiness modules
Register new commands in lib.rs, add command modules, update Cargo deps
(notify-rust, keyring, bundled-full), fix PRAGMA WAL for bundled-full,
add notifications/heartbeats/FTS5 indexing to agent-dispatcher,
update SettingsTab with secrets/plugins/sandbox/updates sections.
2026-03-12 04:57:29 +01:00
Hibryda
3cb65fd5e5 feat: add optimistic locking for bttask and error classification
Version column in tasks table with WHERE id=? AND version=? guard.
Conflict detection in TaskBoardTab. error-classifier.ts: 6 error types
with actionable messages and retry logic. UsageMeter.svelte.
2026-03-12 04:57:29 +01:00
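
The version guard described in 3cb65fd5e5 can be illustrated with a minimal in-memory sketch. This is not the project's code: `TaskStore`, `updateStatus`, and the rule that a successful write increments the version by one are assumptions made for illustration. The map lookup stands in for a SQL `UPDATE ... WHERE id=? AND version=?`, and the return value mimics the affected-row count a caller would inspect to detect a conflict.

```typescript
// Hypothetical in-memory stand-in for the tasks table (not the project's schema).
interface Task {
  id: string;
  status: string;
  version: number;
}

class TaskStore {
  private tasks = new Map<string, Task>();

  insert(id: string, status: string): Task {
    const task: Task = { id, status, version: 1 };
    this.tasks.set(id, task);
    return { ...task };
  }

  get(id: string): Task | undefined {
    const task = this.tasks.get(id);
    return task ? { ...task } : undefined;
  }

  // Mimics: UPDATE tasks SET status=?, version=version+1 WHERE id=? AND version=?
  // Returns the number of "rows" changed: 1 on success, 0 on a version conflict.
  // Assumption: the version is bumped by one on every successful write.
  updateStatus(id: string, expectedVersion: number, status: string): number {
    const current = this.tasks.get(id);
    if (!current || current.version !== expectedVersion) {
      return 0; // someone else wrote first, or the task does not exist
    }
    this.tasks.set(id, { id, status, version: current.version + 1 });
    return 1;
  }
}

// Two writers read the same version, then both try to write.
const store = new TaskStore();
const seenByA = store.insert("t1", "todo");
const seenByB = store.get("t1")!;

const rowsA = store.updateStatus("t1", seenByA.version, "in_progress");
const rowsB = store.updateStatus("t1", seenByB.version, "done");

console.log(rowsA, rowsB); // prints "1 0": the second writer is rejected
```

A caller that receives 0 would typically reload the task and either retry or surface the conflict to the user, which is the role the commit assigns to conflict detection in TaskBoardTab.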
Hibryda
0fe43de357 feat: add keyboard-first UX and rewrite CommandPalette
Alt+1-5 project jump, Ctrl+H/L vi-nav, Ctrl+Shift+1-9 tab switch,
Ctrl+J terminal toggle, Ctrl+Shift+K focus agent. isEditing() guard.
CommandPalette: 18+ commands, 6 categories, fuzzy filter, arrow nav.
2026-03-12 04:57:29 +01:00
Hibryda
5c31668760 feat: add agent health monitoring, audit log, and dead letter queue
heartbeats + dead_letter_queue + audit_log tables in btmsg.db. 15s
heartbeat polling in ProjectBox, stale detection, ProjectHeader heart
indicator. AuditLogTab for Manager. register_agents_from_groups() with
bidirectional contacts and review channel creation.
2026-03-12 04:57:29 +01:00
Hibryda
b2932273ba feat: add plugin system with sandboxed runtime
Plugin discovery from ~/.config/bterminal/plugins/ with plugin.json
manifest. Sandboxed new Function() execution, permission-gated API
(palette, btmsg:read, bttask:read, events). Plugin store + SettingsTab.
2026-03-12 04:57:29 +01:00
Hibryda
5dd7df03cb feat: add OS + in-app notification system
notify-rust for desktop notifications, NotificationCenter.svelte with
bell icon, unread badge, history (max 100), 6 notification types.
Extended notification store with history and type support.
2026-03-12 04:57:29 +01:00
Hibryda
7cb5cddc7c feat: add secrets management via system keyring
SecretsManager using keyring crate (linux-native/libsecret). Store/get/
delete/list with __bterminal_keys__ metadata tracking. SettingsTab
Secrets section. No plaintext fallback.
2026-03-12 04:57:29 +01:00
Hibryda
944b48ff13 feat: add FTS5 full-text search with Spotlight-style overlay
Upgrade rusqlite to bundled-full for FTS5. SearchDb with 3 virtual tables
(messages, tasks, btmsg). SearchOverlay.svelte: Ctrl+Shift+F, 300ms
debounce, grouped results with highlight snippets.
2026-03-12 04:57:29 +01:00
Hibryda
b2c379516c feat: add Landlock sandbox for sidecar process isolation
SandboxConfig with RW/RO paths applied via pre_exec() in sidecar child
process. Requires kernel 6.2+ with graceful fallback. Per-project toggle
in SettingsTab. 9 unit tests.
2026-03-12 04:57:29 +01:00
Hibryda
548478f115 feat: add sidecar crash recovery supervisor with exponential backoff
SidecarSupervisor wraps SidecarManager with auto-restart (1s-30s backoff,
5 retries), SidecarHealth enum, 5min stability window. 17 unit tests.
2026-03-12 04:57:29 +01:00
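
As a rough illustration of the restart policy in 548478f115, the sketch below computes a capped exponential backoff schedule. It is not the SidecarSupervisor implementation: the commit only states the 1s-30s range and the 5-retry limit, so the doubling factor, the `BackoffPolicy` shape, and the function name `backoffDelayMs` are assumptions.

```typescript
// Hypothetical policy object; the numeric defaults come from the commit message.
interface BackoffPolicy {
  baseDelayMs: number; // delay before the first restart
  maxDelayMs: number;  // ceiling for any single delay
  maxRetries: number;  // restarts allowed before giving up
}

const DEFAULT_POLICY: BackoffPolicy = {
  baseDelayMs: 1_000,
  maxDelayMs: 30_000,
  maxRetries: 5,
};

// Returns the delay before restart number `attempt` (0-based),
// or null once the retry budget is exhausted.
// Assumption: the delay doubles on each attempt.
function backoffDelayMs(
  attempt: number,
  policy: BackoffPolicy = DEFAULT_POLICY,
): number | null {
  if (attempt < 0 || attempt >= policy.maxRetries) {
    return null;
  }
  const uncapped = policy.baseDelayMs * Math.pow(2, attempt);
  return Math.min(uncapped, policy.maxDelayMs);
}

const schedule: Array<number | null> = [];
for (let attempt = 0; attempt < DEFAULT_POLICY.maxRetries; attempt++) {
  schedule.push(backoffDelayMs(attempt));
}

console.log(schedule); // [ 1000, 2000, 4000, 8000, 16000 ]
```

Under the doubling assumption, five retries never reach the 30s ceiling, so the real supervisor may use a different growth factor. The 5min stability window mentioned in the commit would sit outside this function, for example by resetting the attempt counter after the sidecar has stayed up that long.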
Hibryda
243faafd9e docs: update meta files for testing facility and tribunal assessment
Update CLAUDE.md with test runner in key paths and build commands.
Update .claude/CLAUDE.md with testing gate rule index entry.
Update TODO.md with tribunal-derived roadmap items.
Update CHANGELOG.md with test runner and testing gate entries.
2026-03-12 04:05:52 +01:00
Hibryda
c5188757ad feat: add unified test runner and testing gate rule
Create v2/scripts/test-all.sh (vitest + cargo + optional E2E via --e2e).
Add npm scripts: test:all, test:all:e2e, test:cargo.
Add .claude/rules/20-testing-gate.md requiring full suite after major changes.
2026-03-12 04:05:52 +01:00
Hibryda
2e29ba5d9a docs: update meta files for E2E test fixes
Update test counts (82 E2E passing), add CHANGELOG entries for
27 fixed failures and AgentPane template fix, update TODO.md.
2026-03-12 03:50:13 +01:00
Hibryda
9ce7c35325 fix(e2e): fix 27 E2E test failures across 3 spec files
Fix stale v2 CSS selectors for v3 UI, WebKit2GTK keyboard/focus
quirks (JS-dispatched KeyboardEvent, programmatic focus check,
backdrop click close), conditional render timing (waitUntil for
project boxes, null handling for burn-rate/cost elements), and
AgentPane missing closing > on data-testid div tag.
2026-03-12 03:50:13 +01:00
Hibryda
e3594074d2 docs: update meta files for E2E testing engine Phase B+
2026-03-12 03:07:38 +01:00
Hibryda
c43c83fbe6 ci: add E2E test workflow with xvfb and LLM-judged test gating
3 jobs (vitest, cargo, e2e), path-filtered triggers on v2 source changes,
xvfb-run for headless WebKit2GTK, LLM-judged tests gated on
ANTHROPIC_API_KEY secret availability.
2026-03-12 03:07:38 +01:00
Hibryda
5e4357e4ac feat(e2e): add Phase B scenarios with LLM-judged assertions and multi-project tests
Adds 6 new E2E scenarios in phase-b.test.ts covering multi-project grid
rendering, independent tab switching, status bar fleet state, and
LLM-judged agent response quality evaluation via Claude API.
Includes llm-judge.ts helper (raw Anthropic API fetch, haiku-4-5,
structured verdicts with confidence thresholds).
2026-03-12 03:07:38 +01:00
Hibryda
c4c673a4b0 docs: update meta files for E2E testing engine Phase A
2026-03-12 02:52:14 +01:00
Hibryda
c6c38b91c6 feat(e2e): add Phase A scenarios, fixtures, and results store
7 human-authored test scenarios (22 tests) using data-testid
selectors. Test fixture generator for isolated environments.
JSON results store (no native deps). WebDriverIO config updated
with TCP readiness probe and multi-spec support.
2026-03-12 02:52:14 +01:00
Hibryda
2746b34f83 feat(e2e): add data-testid attributes to 7 key Svelte components
Stable test selectors for E2E: agent-pane, data-agent-status,
project-box, data-project-id, status-bar, agent-session,
sidebar-rail, command-palette, terminal-tabs and more.
2026-03-12 02:52:14 +01:00
Hibryda
4097253921 feat(e2e): add test mode infrastructure with BTERMINAL_TEST env isolation
Rust: watcher.rs/fs_watcher.rs skip watchers in test mode,
is_test_mode Tauri command. Frontend: wake-scheduler disable,
App.svelte test mode detection. AppConfig centralization in
bterminal-core (OnceLock pattern for path overrides).
2026-03-12 02:52:14 +01:00
Hibryda
d1a4d9f220 docs: update meta files for reviewer agent role
2026-03-12 00:54:43 +01:00
Hibryda
323bb1b040 feat(reviewer): add Tier 1 reviewer agent role with auto-channel notifications
Reviewer workflow in agent-prompts.ts (8-step process), Rust auto-post
to #review-queue on task->review transition, reviewQueueDepth in
attention scoring (10pts/task cap 50), Tasks tab for reviewer in
ProjectBox with 10s queue polling. 7 vitest + 4 cargo tests.
2026-03-12 00:54:43 +01:00
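
The review-queue term in 323bb1b040 ("10pts/task cap 50") amounts to a clamped linear score. The sketch below shows that arithmetic only. The function name `reviewQueueScore` and the handling of negative or fractional depths are assumptions, and how this term combines with the other attention signals is not stated in the commit.

```typescript
// Values taken from the commit message: 10 points per queued task, capped at 50.
const POINTS_PER_REVIEW_TASK = 10;
const REVIEW_SCORE_CAP = 50;

// Hypothetical helper: contribution of the review queue to an attention score.
// Assumption: invalid depths (negative, fractional) are floored and clamped to 0.
function reviewQueueScore(reviewQueueDepth: number): number {
  const depth = Math.max(0, Math.floor(reviewQueueDepth));
  return Math.min(depth * POINTS_PER_REVIEW_TASK, REVIEW_SCORE_CAP);
}

console.log(reviewQueueScore(0));  // 0
console.log(reviewQueueScore(3));  // 30
console.log(reviewQueueScore(12)); // 50 (cap reached at 5 tasks)
```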
Hibryda
61f01e22b8 docs: update meta files for auto-wake Manager session
2026-03-12 00:30:41 +01:00
Hibryda
c774f352ee feat(wake): add auto-wake Manager scheduler with 3 selectable strategies
New wake system for Manager agents: persistent (resume prompt), on-demand
(fresh session), smart (threshold-gated). 6 wake signals from tribunal S-3
hybrid. Pure scorer function (24 tests), Svelte 5 rune scheduler store,
SettingsTab UI (strategy button + threshold slider), AgentSession integration.
2026-03-12 00:30:41 +01:00
Hibryda
5576392d4b docs: update meta files for dashboard metrics panel session
2026-03-12 00:15:09 +01:00
Hibryda
6ca3ffdb8d feat(metrics): add Dashboard Metrics Panel with live health and SVG sparkline history
New MetricsPanel.svelte component as ProjectBox tab (PERSISTED-LAZY, all projects).
Live view: fleet aggregates, project health grid, task board summary, attention queue.
History view: 5 switchable SVG sparklines (cost/tokens/turns/tools/duration), stats row,
recent sessions table. 25 tests for pure utility functions.
2026-03-12 00:15:09 +01:00
Hibryda
d9d67b2bc6 docs: update meta files for branded type call-site fixes
2026-03-11 22:56:52 +01:00
Hibryda
0742309595 refactor(adapters): brand btmsg/bttask/groups bridge interfaces with GroupId/AgentId
Apply branded types to all IPC bridge interfaces and function
parameters. Update test mock data with branded constructors.
2026-03-11 22:56:52 +01:00