Commit graph

292 commits

Author SHA1 Message Date
Hibryda
c43c83fbe6 ci: add E2E test workflow with xvfb and LLM-judged test gating
3 jobs (vitest, cargo, e2e), path-filtered triggers on v2 source changes,
xvfb-run for headless WebKit2GTK, LLM-judged tests gated on
ANTHROPIC_API_KEY secret availability.
2026-03-12 03:07:38 +01:00
Hibryda
5e4357e4ac feat(e2e): add Phase B scenarios with LLM-judged assertions and multi-project tests
Adds 6 new E2E scenarios in phase-b.test.ts covering multi-project grid
rendering, independent tab switching, status bar fleet state, and
LLM-judged agent response quality evaluation via Claude API.
Includes llm-judge.ts helper (raw Anthropic API fetch, haiku-4-5,
structured verdicts with confidence thresholds).
2026-03-12 03:07:38 +01:00
Hibryda
c4c673a4b0 docs: update meta files for E2E testing engine Phase A 2026-03-12 02:52:14 +01:00
Hibryda
c6c38b91c6 feat(e2e): add Phase A scenarios, fixtures, and results store
7 human-authored test scenarios (22 tests) using data-testid
selectors. Test fixture generator for isolated environments.
JSON results store (no native deps). WebDriverIO config updated
with TCP readiness probe and multi-spec support.
2026-03-12 02:52:14 +01:00
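Note on c6c38b91c6: the "TCP readiness probe" can be pictured as polling a port until a connection succeeds. The sketch below is illustrative only and is not the project's WebDriverIO config; the function name `wait_for_port` and its parameters are assumptions made for this example.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.1) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: name and defaults are not taken from the repository.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval)
    return False
```

A test runner would call something like this before starting the first spec, so tests do not race the driver's startup.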
Hibryda
2746b34f83 feat(e2e): add data-testid attributes to 7 key Svelte components
Stable test selectors for E2E: agent-pane, data-agent-status,
project-box, data-project-id, status-bar, agent-session,
sidebar-rail, command-palette, terminal-tabs and more.
2026-03-12 02:52:14 +01:00
Hibryda
4097253921 feat(e2e): add test mode infrastructure with BTERMINAL_TEST env isolation
Rust: watcher.rs/fs_watcher.rs skip watchers in test mode,
is_test_mode Tauri command. Frontend: wake-scheduler disable,
App.svelte test mode detection. AppConfig centralization in
bterminal-core (OnceLock pattern for path overrides).
2026-03-12 02:52:14 +01:00
Hibryda
d1a4d9f220 docs: update meta files for reviewer agent role 2026-03-12 00:54:43 +01:00
Hibryda
323bb1b040 feat(reviewer): add Tier 1 reviewer agent role with auto-channel notifications
Reviewer workflow in agent-prompts.ts (8-step process), Rust auto-post
to #review-queue on task->review transition, reviewQueueDepth in
attention scoring (10pts/task cap 50), Tasks tab for reviewer in
ProjectBox with 10s queue polling. 7 vitest + 4 cargo tests.
2026-03-12 00:54:43 +01:00
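Note on 323bb1b040: the "10pts/task cap 50" rule for reviewQueueDepth amounts to a capped linear score. A minimal sketch, assuming the contribution is simply added to the rest of the attention score; the constant and function names are hypothetical and the real scorer lives in the frontend.

```python
POINTS_PER_TASK = 10  # from the commit message: 10 points per queued task
MAX_POINTS = 50       # from the commit message: contribution capped at 50


def review_queue_score(review_queue_depth: int) -> int:
    """Attention points contributed by the review queue (hypothetical helper)."""
    depth = max(review_queue_depth, 0)  # treat negative input as an empty queue
    return min(depth * POINTS_PER_TASK, MAX_POINTS)
```

With these numbers the cap is reached at five queued tasks, so a longer queue does not raise the score further.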
Hibryda
61f01e22b8 docs: update meta files for auto-wake Manager session 2026-03-12 00:30:41 +01:00
Hibryda
c774f352ee feat(wake): add auto-wake Manager scheduler with 3 selectable strategies
New wake system for Manager agents: persistent (resume prompt), on-demand
(fresh session), smart (threshold-gated). 6 wake signals from tribunal S-3
hybrid. Pure scorer function (24 tests), Svelte 5 rune scheduler store,
SettingsTab UI (strategy button + threshold slider), AgentSession integration.
2026-03-12 00:30:41 +01:00
Hibryda
5576392d4b docs: update meta files for dashboard metrics panel session 2026-03-12 00:15:09 +01:00
Hibryda
6ca3ffdb8d feat(metrics): add Dashboard Metrics Panel with live health and SVG sparkline history
New MetricsPanel.svelte component as ProjectBox tab (PERSISTED-LAZY, all projects).
Live view: fleet aggregates, project health grid, task board summary, attention queue.
History view: 5 switchable SVG sparklines (cost/tokens/turns/tools/duration), stats row,
recent sessions table. 25 tests for pure utility functions.
2026-03-12 00:15:09 +01:00
Hibryda
d9d67b2bc6 docs: update meta files for branded type call-site fixes 2026-03-11 22:56:52 +01:00
Hibryda
0742309595 refactor(adapters): brand btmsg/bttask/groups bridge interfaces with GroupId/AgentId
Apply branded types to all IPC bridge interfaces and function
parameters. Update test mock data with branded constructors.
2026-03-11 22:56:52 +01:00
Hibryda
f928abd6ce refactor(types): add GroupId and AgentId branded types to ids.ts
Extend the branded type system with two new domain types for
btmsg/bttask agent and group identifiers. Apply to groups.ts
interfaces including agentToProject() domain crossing cast.
2026-03-11 22:56:52 +01:00
Hibryda
46df7949a7 refactor(components): apply branded types at Svelte component call sites
GroupAgentsPanel, TaskBoardTab, SettingsTab now use GroupId/AgentId
branded constructors at their IPC call sites.
2026-03-11 22:56:52 +01:00
Hibryda
ce389a2a39 docs: update meta files for regression tests and sidecar security session 2026-03-11 22:19:03 +01:00
Hibryda
70ebbff699 security(sidecar): add ANTHROPIC_* to Rust env strip + unit tests
Defense-in-depth: Claude CLI uses credentials file for auth, not
ANTHROPIC_API_KEY from env. OPENAI_* intentionally kept (Codex runner
needs it). 8 unit tests for strip_provider_env_var.
2026-03-11 22:19:03 +01:00
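Note on 70ebbff699: the env strip can be sketched as a prefix filter over the environment handed to the sidecar. This is an illustration in Python, not the Rust strip_provider_env_var code; it covers only the ANTHROPIC_ prefix named in this commit, and the names `STRIPPED_PREFIXES` and `strip_provider_env` are assumptions.

```python
# Prefixes removed before spawning the sidecar. Only ANTHROPIC_ is named in
# the commit; OPENAI_ is deliberately absent because the Codex runner needs it.
STRIPPED_PREFIXES = ("ANTHROPIC_",)


def strip_provider_env(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env without variables that match a stripped prefix."""
    return {
        key: value
        for key, value in env.items()
        if not key.startswith(STRIPPED_PREFIXES)
    }
```

Returning a new dict rather than mutating the input keeps the parent process environment untouched.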
Hibryda
e41d237745 test(btmsg): add regression tests for named column access and camelCase serialization
Covers the CRITICAL status vs system_prompt bug (positional index 7),
JOIN alias disambiguation, serde camelCase serialization, TypeScript
bridge IPC commands, and plantuml hex encoding algorithm.
49 new tests: 8 btmsg.rs + 7 bttask.rs + 8 sidecar + 17 btmsg-bridge.ts + 10 bttask-bridge.ts + 7 plantuml-encode.ts
2026-03-11 22:19:03 +01:00
Hibryda
a12f2bec7b docs: update meta files for dexter_changes bug fix session 2026-03-11 21:54:19 +01:00
Hibryda
8678e3474d fix(components): stopPropagation, PlantUML encoding, Tauri 2.x asset URL
GroupAgentsPanel: added e.stopPropagation() on toggleAgent button.
ArchitectureTab: collapsed rawDeflate no-op into single plantumlEncode().
TestingTab: replaced asset://localhost/ with convertFileSrc().
2026-03-11 21:54:19 +01:00
Hibryda
93c2cdf434 fix(btmsg): convert positional column access to named, fix camelCase mismatch
CRITICAL: get_agents() used SELECT * positional index 7 for status,
but column 7 is system_prompt (column 8 is status). Converted all
query functions in btmsg.rs and bttask.rs to named column access.

Fixed BtmsgAgent/BtmsgMessage TypeScript interfaces to use camelCase
matching Rust serde(rename_all = camelCase). Updated CommsTab consumer.
2026-03-11 21:54:19 +01:00
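Note on 93c2cdf434: the bug class here is reading a column by position from SELECT *, which silently returns a neighbouring column when the index is off by one. A minimal sketch with Python's sqlite3 module; the four-column table is a hypothetical stand-in, not the real btmsg schema (where the indices involved were 7 and 8).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows support both index and name lookup
# Hypothetical schema: system_prompt sits directly before status.
conn.execute("CREATE TABLE agents (id TEXT, name TEXT, system_prompt TEXT, status TEXT)")
conn.execute("INSERT INTO agents VALUES ('a1', 'manager', 'You are the manager.', 'active')")


def status_by_position(conn: sqlite3.Connection, index: int) -> str:
    """Fragile: the result depends on the table's column order."""
    row = conn.execute("SELECT * FROM agents").fetchone()
    return row[index]


def status_by_name(conn: sqlite3.Connection) -> str:
    """Robust: the column is selected and read by name."""
    row = conn.execute("SELECT status FROM agents").fetchone()
    return row["status"]
```

Here `status_by_position(conn, 2)` returns the system prompt instead of the status, while `status_by_name(conn)` keeps working if columns are added or reordered.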
DexterFromLab
32f6d7eadf docs: update meta files for multi-agent orchestration
- v3-progress.md: full session log for agent orchestration work
- v3-task_plan.md: 7 new decisions (agent rendering, env passthrough,
  re-injection, shared DB, role tabs, PlantUML encoding)
- CLAUDE.md: updated overview, key paths, component list
- .claude/CLAUDE.md: updated workflow, ProjectBox tabs, orchestration docs
2026-03-11 15:25:53 +01:00
DexterFromLab
2ca7756a74 feat(agents): role-specific tabs + bttask Tauri backend
- TaskBoardTab: kanban board (5 columns, CRUD, comments, 5s poll) for Manager
- ArchitectureTab: PlantUML viewer/editor (4 templates, plantuml.com) for Architect
- TestingTab: Selenium screenshots + test file discovery for Tester
- bttask.rs: Rust backend (list, create, update_status, delete, comments)
- bttask-bridge.ts: TypeScript IPC adapter
- ProjectBox: conditional role tabs (isAgent && agentRole), PERSISTED-LAZY
2026-03-11 15:25:41 +01:00
DexterFromLab
0c28f204c7 feat(agents): custom context for Tier 2 + periodic system prompt re-injection
- SettingsTab: Custom Context textarea for Tier 2 project cards
- AgentSession passes systemPrompt for ALL projects (Tier 1 gets full
  generated prompt, Tier 2 gets custom context)
- Periodic re-injection: 1-hour timer checks if agent is idle, then
  auto-sends context refresh prompt with role/tools reminder
- AgentPane: autoPrompt prop consumed when session is done/error,
  resumes session with fresh system prompt
2026-03-11 15:02:28 +01:00
DexterFromLab
14808a97e9 feat(settings): add Tier 1 agent config panel with system prompt editor
- Agent cards in SettingsTab: name, enable/disable, CWD, model, wake interval
- Custom Context textarea for editable system prompt per agent
- Collapsible preview of full generated introductory prompt
- Agent cards styled with mauve left border accent and role badge
- Export AGENT_ROLE_ICONS from groups.ts, add updateAgent() to workspace store
2026-03-11 14:55:38 +01:00
DexterFromLab
a158ed9544 feat(orchestration): multi-agent communication, unified agents, env passthrough
- btmsg: admin role (tier 0), channel messaging (create/list/send/history),
  admin global feed, mark-read conversations
- Rust btmsg module: admin bypass, channels, feed, 8 new Tauri commands
- CommsTab: sidebar chat interface with activity feed, DMs, channels (Ctrl+M)
- Agent unification: Tier 1 agents rendered as ProjectBoxes via agentToProject()
  converter, getAllWorkItems() combines agents + projects in ProjectGrid
- GroupAgentsPanel: click-to-navigate agents to their ProjectBox
- Agent system prompts: generateAgentPrompt() builds comprehensive introductory
  context (role, environment, team, btmsg/bttask docs, workflow instructions)
- AgentSession passes group context to prompt generator via $derived.by()
- BTMSG_AGENT_ID env var passthrough: extra_env field flows through full chain
  (agent-bridge → Rust AgentQueryOptions → NDJSON → sidecar runners → cleanEnv)
- workspace store: updateAgent() for Tier 1 agent config persistence
2026-03-11 14:53:39 +01:00
DexterFromLab
1331d094b3 feat(GroupAgentsPanel): add Tier 1/2 division with project agents
Show Tier 1 (Management: Manager, Architect, Tester) and Tier 2
(Execution: project agents) separated by a divider line. Tier 2
cards show project icon and name, are slightly smaller, no start/stop
button. Header dots show all agents with a separator between tiers.
2026-03-11 14:05:09 +01:00
DexterFromLab
f2dcedc460 feat(orchestration): add bttask CLI + GroupAgentsPanel + btmsg Tauri bridge
Phase 2: bttask CLI (Python, SQLite) — task management with role-based
visibility. Kanban board view. Manager/Architect can create tasks,
Tier 2 agents receive tasks via btmsg only.

Phase 3: GroupAgentConfig in groups.json + Rust backend. GroupAgentsPanel
Svelte component above ProjectGrid with status dots, role icons,
unread badges, start/stop buttons.

Phase 4: btmsg Rust bridge (btmsg.rs) — read/write access to btmsg.db.
6 Tauri commands for agent status, messages, and history.
GroupAgentsPanel polls btmsg.db every 5s for live status updates.
2026-03-11 14:03:11 +01:00
DexterFromLab
485b279659 feat(btmsg): add graph command — visual agent hierarchy with status
Shows tier boxes, communication links, status dots (green/yellow/red),
unread message badges, and model assignments per agent.
2026-03-11 13:54:27 +01:00
DexterFromLab
e1025a0a8a feat(btmsg): add group agent messenger CLI
Python CLI tool for hierarchical multi-agent communication.
SQLite-backed (WAL mode), agent identity via BTMSG_AGENT_ID env var.

Features:
- inbox/read/send/reply — message CRUD with read tracking
- contacts — role-based communication hierarchy enforcement
- history — per-agent conversation view
- status — all agents with tier/role/model/unread counts
- register/allow — agent and contact management
- notify — single-line notification for agent injection
- Short ID prefix matching for convenience

Also: change default Claude model to opus-4-6
2026-03-11 13:51:40 +01:00
DexterFromLab
44610f3177 fix(workspace): docs discovery for doc/ dirs + SSH terminal tab args
- Add doc/ alongside docs/ in markdown file discovery (groups.rs)
- Add SETUP.md to priority root files
- Fix SSH terminal tabs: resolve session args via sshArgsCache derived
  from listSshSessions() instead of passing empty args
- Fix Agent Preview: only mount xterm when tab is active (prevents
  CanvasAddon crash on hidden elements)
- Separate tab type rendering (shell/ssh/agent-preview) with proper guards
2026-03-11 13:04:32 +01:00
Hibryda
dc0ffb6dbf docs: update meta files for branded type call-site fixes 2026-03-11 05:46:22 +01:00
Hibryda
af3cd45324 refactor(components): apply branded types at Svelte component call sites 2026-03-11 05:46:22 +01:00
Hibryda
c3d2e1daee docs: update meta files for SOLID Phase 3 branded types 2026-03-11 05:40:28 +01:00
Hibryda
889adcb004 refactor(agent-dispatcher): brand sessionId at sidecar boundary 2026-03-11 05:40:28 +01:00
Hibryda
a06b9d5053 refactor(utils): apply branded types to session-persistence and auto-anchoring 2026-03-11 05:40:28 +01:00
Hibryda
3f4f2d70af refactor(stores): apply branded types to conflicts and health Map keys 2026-03-11 05:40:28 +01:00
Hibryda
f2a7d385d6 feat(types): introduce SessionId/ProjectId branded types (SOLID Phase 3) 2026-03-11 05:40:28 +01:00
Hibryda
7ba63db101 refactor(agent-dispatcher): remove dead detectWorktreeFromCwd re-export 2026-03-11 05:29:28 +01:00
Hibryda
584a38d096 docs: update meta files for SOLID Phase 2 refactoring 2026-03-11 05:25:32 +01:00
Hibryda
9c94272ca7 refactor(session): split session.rs into 7 sub-modules (SOLID Phase 2) 2026-03-11 05:25:32 +01:00
Hibryda
450756f540 refactor(agent-dispatcher): split into 4 focused modules (SOLID Phase 2) 2026-03-11 05:25:32 +01:00
Hibryda
54b1c60810 docs: update meta files for SOLID Phase 1 refactoring 2026-03-11 05:09:15 +01:00
Hibryda
af369f30d2 test(attention-scorer): add 14 tests for extracted scorer function 2026-03-11 05:09:15 +01:00
Hibryda
4d93b77f6a refactor(frontend): extract attention scorer and shared type guards 2026-03-11 05:09:15 +01:00
Hibryda
30c21256bc refactor(backend): split lib.rs into 11 domain command modules 2026-03-11 05:09:15 +01:00
Hibryda
b1bc5d18a4 docs: update meta files for reconnect loop fix 2026-03-11 04:51:46 +01:00
Hibryda
fc7fe3180e fix(remote): cancel reconnect loop on machine removal 2026-03-11 04:51:46 +01:00
Hibryda
4ac0336e72 chore: broaden .audit gitignore to cover all subdirectories 2026-03-11 04:47:25 +01:00