heartbeats + dead_letter_queue + audit_log tables in btmsg.db. 15s
heartbeat polling in ProjectBox, stale detection, ProjectHeader heart
indicator. AuditLogTab for Manager. register_agents_from_groups() with
bidirectional contacts and review channel creation.
Plugin discovery from ~/.config/bterminal/plugins/ with plugin.json
manifest. Sandboxed new Function() execution, permission-gated API
(palette, btmsg:read, bttask:read, events). Plugin store + SettingsTab.
notify-rust for desktop notifications, NotificationCenter.svelte with
bell icon, unread badge, history (max 100), 6 notification types.
Extended notification store with history and type support.
SandboxConfig with RW/RO paths applied via pre_exec() in sidecar child
process. Requires kernel 6.2+ with graceful fallback. Per-project toggle
in SettingsTab. 9 unit tests.
Update CLAUDE.md with test runner in key paths and build commands.
Update .claude/CLAUDE.md with testing gate rule index entry.
Update TODO.md with tribunal-derived roadmap items.
Update CHANGELOG.md with test runner and testing gate entries.
Create v2/scripts/test-all.sh (vitest + cargo + optional E2E via --e2e).
Add npm scripts: test:all, test:all:e2e, test:cargo.
Add .claude/rules/20-testing-gate.md requiring full suite after major changes.
Adds 6 new E2E scenarios in phase-b.test.ts covering multi-project grid
rendering, independent tab switching, status bar fleet state, and
LLM-judged agent response quality evaluation via Claude API.
Includes llm-judge.ts helper (raw Anthropic API fetch, haiku-4-5,
structured verdicts with confidence thresholds).
7 human-authored test scenarios (22 tests) using data-testid
selectors. Test fixture generator for isolated environments.
JSON results store (no native deps). WebDriverIO config updated
with TCP readiness probe and multi-spec support.
Stable test selectors for E2E: agent-pane, data-agent-status,
project-box, data-project-id, status-bar, agent-session,
sidebar-rail, command-palette, terminal-tabs and more.
Reviewer workflow in agent-prompts.ts (8-step process), Rust auto-post
to #review-queue on task->review transition, reviewQueueDepth in
attention scoring (10pts/task cap 50), Tasks tab for reviewer in
ProjectBox with 10s queue polling. 7 vitest + 4 cargo tests.
New wake system for Manager agents: persistent (resume prompt), on-demand
(fresh session), smart (threshold-gated). 6 wake signals from tribunal S-3
hybrid. Pure scorer function (24 tests), Svelte 5 rune scheduler store,
SettingsTab UI (strategy button + threshold slider), AgentSession integration.
New MetricsPanel.svelte component as ProjectBox tab (PERSISTED-LAZY, all projects).
Live view: fleet aggregates, project health grid, task board summary, attention queue.
History view: 5 switchable SVG sparklines (cost/tokens/turns/tools/duration), stats row,
recent sessions table. 25 tests for pure utility functions.
Extend the branded type system with two new domain types for
btmsg/bttask agent and group identifiers. Apply to groups.ts
interfaces including agentToProject() domain crossing cast.
Defense-in-depth: Claude CLI uses credentials file for auth, not
ANTHROPIC_API_KEY from env. OPENAI_* intentionally kept (Codex runner
needs it). 8 unit tests for strip_provider_env_var.
GroupAgentsPanel: added e.stopPropagation() on toggleAgent button.
ArchitectureTab: collapsed rawDeflate no-op into single plantumlEncode().
TestingTab: replaced asset://localhost/ with convertFileSrc().
CRITICAL: get_agents() used SELECT * positional index 7 for status,
but column 7 is system_prompt (column 8 is status). Converted all
query functions in btmsg.rs and bttask.rs to named column access.
Fixed BtmsgAgent/BtmsgMessage TypeScript interfaces to use camelCase
matching Rust serde(rename_all = camelCase). Updated CommsTab consumer.
- Agent cards in SettingsTab: name, enable/disable, CWD, model, wake interval
- Custom Context textarea for editable system prompt per agent
- Collapsible preview of full generated introductory prompt
- Agent cards styled with mauve left border accent and role badge
- Export AGENT_ROLE_ICONS from groups.ts, add updateAgent() to workspace store
Show Tier 1 (Management: Manager, Architect, Tester) and Tier 2
(Execution: project agents) separated by a divider line. Tier 2
cards show project icon and name, are slightly smaller, no start/stop
button. Header dots show all agents with a separator between tiers.
Python CLI tool for hierarchical multi-agent communication.
SQLite-backed (WAL mode), agent identity via BTMSG_AGENT_ID env var.
Features:
- inbox/read/send/reply — message CRUD with read tracking
- contacts — role-based communication hierarchy enforcement
- history — per-agent conversation view
- status — all agents with tier/role/model/unread counts
- register/allow — agent and contact management
- notify — single-line notification for agent injection
- Short ID prefix matching for convenience
Also: change default Claude model to opus-4-6
- Add doc/ alongside docs/ in markdown file discovery (groups.rs)
- Add SETUP.md to priority root files
- Fix SSH terminal tabs: resolve session args via sshArgsCache derived
from listSshSessions() instead of passing empty args
- Fix Agent Preview: only mount xterm when tab is active (prevents
CanvasAddon crash on hidden elements)
- Separate tab type rendering (shell/ssh/agent-preview) with proper guards