Each plugin now runs in a dedicated Web Worker, with a permission-gated
API proxied via postMessage. This eliminates the prototype-walking and
arguments.callee.constructor escape vectors inherent to the same-realm
new Function() sandbox.
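
A minimal sketch of the host-side permission gate, assuming a simple
request/response message shape. The message types, the method names
(btmsg.inbox, bttask.list), and the method-to-permission table are
hypothetical; only the permission strings btmsg:read and bttask:read
appear elsewhere in this log.

```typescript
// Hypothetical wire format between a plugin worker and the host.
type PluginRequest = { id: number; method: string; args: unknown[] };
type PluginResponse =
  | { id: number; ok: true; result: unknown }
  | { id: number; ok: false; error: string };
type Handler = (...args: unknown[]) => unknown;

// Assumed mapping from API method to the permission it requires.
// A Map is used so that names like "constructor" never resolve.
const REQUIRED_PERMISSION = new Map<string, string>([
  ["btmsg.inbox", "btmsg:read"],
  ["bttask.list", "bttask:read"],
]);

// Decide whether a request from the worker may reach a host handler.
function handlePluginRequest(
  granted: ReadonlySet<string>,
  handlers: ReadonlyMap<string, Handler>,
  req: PluginRequest,
): PluginResponse {
  const needed = REQUIRED_PERMISSION.get(req.method);
  const handler = handlers.get(req.method);
  if (needed === undefined || handler === undefined) {
    return { id: req.id, ok: false, error: `unknown method: ${req.method}` };
  }
  if (!granted.has(needed)) {
    return { id: req.id, ok: false, error: `permission denied: ${needed}` };
  }
  try {
    return { id: req.id, ok: true, result: handler(...req.args) };
  } catch (err) {
    return { id: req.id, ok: false, error: String(err) };
  }
}
```

In a host, this function would typically be called from the worker's
message listener, with the response sent back through postMessage. The
plugin only ever sees structured-cloned data, never host objects.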
Update test counts (516 vitest + 159 cargo), add entries for all 5
tribunal priorities, mark certificate pinning as done, and add SPKI
persistence and seen_messages pruning as new TODOs.
v2 architecture doc superseded by architecture.md, sidecar.md,
orchestration.md, and production.md. Updated cross-references in
README.md, phases.md, and .claude/CLAUDE.md.
README.md: expand from a 42-line index into a documentation hub with
project overview, reading order, and key directory listing.
v3-findings.md: expand from 63 lines into research findings covering
adversarial review details, provider coupling analysis, codebase reuse,
session anchor design, multi-agent design rationale, theme evolution,
and performance measurements.
New documentation covering end-to-end system architecture, multi-provider
sidecar lifecycle, btmsg/bttask multi-agent orchestration, and production
hardening features (supervisor, sandbox, search, plugins, secrets, audit).
Fix NoneType bug in generate_report (counterArgument can be None).
Add consult to install.sh alongside ctx for symlink-based installation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Emit 'input' events so agents show received prompts in their console
- Execute detected shell commands (btmsg, bttask, etc.) from LLM output
- Feed command results back to aider for iterative autonomous work
- Detect commands in code blocks, bare btmsg/bttask lines, and $ prefixes
- More robust THINKING/ANSWER marker detection (multiple unicode variants)
- Adapter handles new 'input' and 'tool_result' event types
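
A rough sketch of the command detection described above, assuming an
allow-list of tool names. The function name and the allow-list are
hypothetical; the actual runner may accept more commands ("etc." above).

```typescript
// Assumed allow-list; the log names btmsg and bttask and then says "etc."
const KNOWN_TOOLS: readonly string[] = ["btmsg", "bttask"];

// Return the shell commands found in one turn of LLM output.
// Handles "$ cmd" lines, bare tool lines, and lines inside code fences.
function detectShellCommands(output: string): string[] {
  const commands: string[] = [];
  for (const raw of output.split("\n")) {
    const line = raw.trim();
    // "$ cmd" form: drop the prompt prefix. Fence marker lines fall
    // through and are rejected below because their first word is not
    // a known tool.
    const candidate = line.startsWith("$ ") ? line.slice(2).trim() : line;
    const firstWord = candidate.split(/\s+/)[0] ?? "";
    if (KNOWN_TOOLS.includes(firstWord)) {
      commands.push(candidate);
    }
  }
  return commands;
}
```

Matching on the first word keeps the sketch short, but it also means a
prose line that happens to start with a tool name would be treated as a
command, so a real implementation likely needs stricter rules.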
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aider runner now buffers entire turn output and parses it into thinking,
text, shell command, and cost blocks. Adapter updated for new event types.
Fixes console UI showing individual chevrons per output line.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rust groups.rs ProjectConfig was missing provider, model, and other optional
fields — serde silently dropped them on save, causing all projects to fall
back to claude/Opus on reload. Added all missing fields to both ProjectConfig
and GroupAgentConfig structs.
Rewrote aider-runner from one-shot --message mode to interactive stdin/stdout:
- Persistent aider process with multi-turn conversation support
- Pre-fetches btmsg inbox and bttask board before sending prompt to LLM
- Autonomous agent override prompt so LLM acts instead of asking for files
- Line-buffered output (no token-by-token fragments)
- Thinking block classification for DeepSeek R1
- Graceful /exit shutdown
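
A small sketch of line buffering for streamed stdout, assuming chunks
arrive as strings at arbitrary boundaries. The LineBuffer class is
illustrative and not taken from aider-runner.

```typescript
// Accumulates stdout chunks and releases only complete lines, so the
// consumer never sees token-by-token fragments.
class LineBuffer {
  private pending = "";

  // Add a chunk; returns every line completed by this chunk.
  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    // The last element is the unfinished tail (or "" after a newline).
    this.pending = parts.pop() ?? "";
    return parts;
  }

  // Release whatever is left when the process exits.
  flush(): string[] {
    const rest = this.pending;
    this.pending = "";
    return rest === "" ? [] : [rest];
  }
}
```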
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
scrollIntoView() in AgentPane was scrolling all ancestor containers
including ProjectGrid (overflow-x: auto), causing the entire project
grid to jump horizontally every time any agent produced output.
Replaced with direct scrollTop/scrollTo manipulation that only affects
the intended scroll container. Also removed scroll-snap-type which
caused additional snap recalculation on layout changes.
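
A sketch of the replacement approach, assuming the pane keeps a
reference to its own scroll container. The interface and function names
are hypothetical; any HTMLElement satisfies the interface structurally.

```typescript
// The subset of HTMLElement this helper needs.
interface ScrollContainer {
  scrollTop: number;
  readonly scrollHeight: number;
  readonly clientHeight: number;
}

// Pin one specific container to its bottom edge. Unlike
// scrollIntoView(), this never touches ancestor scroll positions.
function scrollToBottom(container: ScrollContainer): void {
  container.scrollTop = Math.max(
    0,
    container.scrollHeight - container.clientHeight,
  );
}
```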
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Set BTMSG_AGENT_ID for all projects (not just Tier 1) so Tier 2
  agents can use btmsg/bttask CLI tools
- Add btmsg/bttask documentation to Tier 2 system prompt with
  workflow instructions (inbox, tasks, status updates)
- Unify wake/start prompts to always reference btmsg inbox
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add btmsg inbox polling (10s) to AgentSession so agents wake when
  they receive messages from other agents (not just admin DMs)
- Remove automatic setActiveProject on agent activation to prevent
  focus stealing from the user
- Use untrack() in ProjectGrid scroll effect so agent re-renders
  don't trigger unwanted scrollIntoView
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add aider-runner.ts sidecar that spawns aider CLI in non-interactive mode
- Add Aider provider metadata with OpenRouter model presets
- Add aider-messages.ts adapter for Aider event format
- Refactor SidecarManager from single-process to per-provider process management
  with lazy startup on first query and session→provider routing
- Add openrouter_api_key to secrets system (keyring storage)
- Inject OPENROUTER_API_KEY from secrets into Aider agent environment
- Register Aider in provider registry, build pipeline, and resource bundle
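
A simplified sketch of per-provider process management with lazy
startup and session→provider routing. The class, interface, and method
names are hypothetical, and the real SidecarManager is not shown in
this log.

```typescript
// Stand-in for a running sidecar process.
interface SidecarProcess {
  send(sessionId: string, prompt: string): void;
}

class PerProviderSidecars {
  private readonly processes = new Map<string, SidecarProcess>();
  private readonly sessionProvider = new Map<string, string>();
  private readonly spawn: (provider: string) => SidecarProcess;

  constructor(spawn: (provider: string) => SidecarProcess) {
    this.spawn = spawn;
  }

  // Start the provider's process on first use, then route the query.
  query(sessionId: string, provider: string, prompt: string): void {
    let proc = this.processes.get(provider);
    if (proc === undefined) {
      proc = this.spawn(provider);
      this.processes.set(provider, proc);
    }
    this.sessionProvider.set(sessionId, provider);
    proc.send(sessionId, prompt);
  }

  // Look up which provider owns a session, e.g. to route a stop request.
  providerFor(sessionId: string): string | undefined {
    return this.sessionProvider.get(sessionId);
  }
}
```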
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add provider and model fields to both GroupAgentConfig and ProjectConfig
- Wire model override through AgentSession → AgentPane → queryAgent → sidecar
- Add model preset dropdown per provider (Opus/Sonnet/Haiku, GPT-5.4/o3, etc.)
  with custom model ID input at the bottom
- Add provider dropdown to Tier 1 agents (was Tier 2 only)
- Add "Apply & Restart" button on both tiers to restart agent with new settings
- Changing provider auto-resets model selection
- Admin bypasses stale heartbeat check in btmsg so DMs always deliver
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Play button in GroupAgentsPanel now starts agent session via emitAgentStart
- Stop button now stops running agent session via emitAgentStop
- Sending a DM to a stopped agent auto-wakes it (sets active + emitAgentStart)
- Fix autoPrompt in AgentPane to work for fresh sessions (not just done/error)
- Fix btmsg: admin (tier 0) bypasses stale heartbeat check so messages deliver
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rename Cargo package from bterminal to agent-orchestrator so WM_CLASS
  matches desktop entry and taskbar groups correctly
- Update lib name (agent_orchestrator_lib) and telemetry service name
- Add Pandora's Box splash screen with progress steps during startup
- Prevent white window flash with inline CSS and Tauri backgroundColor
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace old BTerminal V1 README with comprehensive Agent Orchestrator
documentation covering multi-agent orchestration, production hardening,
testing infrastructure, and architecture overview. Update screenshot
to show current V3 UI with Messages panel and agent workspace.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rebrand all user-visible BTerminal references to Agent Orchestrator
(window title, product name, identifier, status bar, updater URL,
context registration, CLAUDE.md branch reference).
Fix critical btmsg/bttask crash: pragma_update uses execute() internally,
but PRAGMA busy_timeout returns a result row, causing an "Execute returned
results" error that silently broke all CommsTab message loading.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Increase global mocha timeout from 60s to 180s in wdio.conf.js to
accommodate longer-running LLM judge tests that evaluate agent responses
and code generation. Add explicit per-test overrides for Phase B
scenarios B4 and B5 to ensure adequate time for agent startup,
execution, and LLM verification.
- wdio.conf.js: global timeout 60_000 ms → 180_000 ms
- phase-b.test.ts: explicit 180_000 ms timeout for B4 and B5 scenarios
Migrates legacy rule numbering (18, 20) to standardized sequence (53, 54)
and adds new 18-preexisting-issues.md for handling pre-existing issues
during development. This consolidates duplicate rule coverage across the
old and new numbering schemes.
Files changed:
- Removed: 18-relative-units.md (moved to 53-relative-units.md)
- Removed: 20-testing-gate.md (moved to 54-testing-gate.md)
- Added: 18-preexisting-issues.md (new)
- Added: 53-relative-units.md (renamed from 18)
- Added: 54-testing-gate.md (renamed from 20)
New docs/e2e-testing.md covering all 3 pillars: test fixtures
(isolated temp environments), test mode (BTERMINAL_TEST=1), and
LLM judge (dual-mode CLI/API). Includes spec phases, CI integration,
WebKit2GTK pitfalls, and troubleshooting guide.
Refactor llm-judge.ts from raw API-only to dual-mode: CLI first
(spawns claude with --output-format text, unsets CLAUDECODE), API
fallback. Backend selectable via LLM_JUDGE_BACKEND env var.
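
A sketch of the backend selection, assuming LLM_JUDGE_BACKEND accepts
the values "cli" and "api" and that any other value falls back to the
default order. Those values and both function names are assumptions;
only the variable names LLM_JUDGE_BACKEND and CLAUDECODE come from the
text above.

```typescript
type JudgeBackend = "cli" | "api";
type Env = Record<string, string | undefined>;

// Backends to try, in order. Unset or unrecognized: CLI first, then API.
function backendOrder(env: Env): JudgeBackend[] {
  const forced = env["LLM_JUDGE_BACKEND"];
  if (forced === "cli" || forced === "api") {
    return [forced];
  }
  return ["cli", "api"];
}

// Environment for the spawned CLI: a copy with CLAUDECODE removed.
function cliEnv(env: Env): Env {
  const copy: Env = { ...env };
  delete copy["CLAUDECODE"];
  return copy;
}
```

Passing the environment in as a parameter, rather than reading it
globally, keeps both helpers easy to test.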
Fix pre-existing race condition in config.rs tests where parallel
test execution caused env var mutations to interfere. Added static
Mutex to serialize env-mutating tests.
tokio::spawn() panics during Tauri setup in WebDriver E2E mode because
the Tokio runtime is not directly accessible from that context. Switch to
tauri::async_runtime::spawn(), which uses Tauri's managed runtime.
Fix .gitignore 'plugins/' rule that was accidentally ignoring source
files in v2/src/lib/plugins/. Narrow to /plugins/ and /v2/plugins/
(runtime plugin directories only). Track plugin-host.ts (was written
but never committed) and add comprehensive test suite covering all 13
shadowed globals, this-binding, permission gating, API freeze, and
lifecycle management.
Add periodic PRAGMA wal_checkpoint(TRUNCATE) every 5 minutes for both
sessions.db and btmsg.db to prevent unbounded WAL growth under sustained
multi-agent load. Improve Landlock fallback log message with kernel
version requirement. Add WAL checkpoint tests.
Add optional --tls-cert and --tls-key CLI args. When provided, the relay
wraps TCP streams with native-tls before the WebSocket upgrade. Refactor
into generic accept_ws_with_auth<S> and run_ws_session<S> to avoid code
duplication between the plain and TLS paths. The client side already
supports wss:// URLs via connect_async with the native-tls feature.
Add multi-agent delegation documentation to Manager system prompt so
Claude knows it can spawn child agents via the Agent tool. Also inject
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 env var for Manager agents.
Add a version column to the tasks table and guard updates with
WHERE id=? AND version=? (optimistic locking). Add conflict detection in
TaskBoardTab. Add error-classifier.ts with 6 error types, actionable
messages, and retry logic. Add UsageMeter.svelte.
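
An in-memory sketch of the version guard, assuming each successful
update increments the version by one. The class and method names are
hypothetical, and the SQL in the comment is an assumed shape of the
statement, not copied from the codebase.

```typescript
interface Task {
  id: string;
  status: string;
  version: number;
}

class TaskTable {
  private readonly rows = new Map<string, Task>();

  insert(id: string, status: string): void {
    this.rows.set(id, { id, status, version: 1 });
  }

  get(id: string): Task | undefined {
    return this.rows.get(id);
  }

  // Mirrors (assumed):
  //   UPDATE tasks SET status=?, version=version+1
  //   WHERE id=? AND version=?
  // Returns false when zero rows would match, i.e. a conflict.
  updateStatus(id: string, expectedVersion: number, status: string): boolean {
    const row = this.rows.get(id);
    if (row === undefined || row.version !== expectedVersion) {
      return false;
    }
    this.rows.set(id, { ...row, status, version: row.version + 1 });
    return true;
  }
}
```

A caller that gets false back has lost a race with another writer and
can reload the task before retrying or surfacing the conflict.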
Add heartbeats, dead_letter_queue, and audit_log tables to btmsg.db. Add
15s heartbeat polling in ProjectBox, stale detection, and a heart
indicator in ProjectHeader. Add AuditLogTab for Manager. Add
register_agents_from_groups() with bidirectional contacts and review
channel creation.
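
A sketch of stale detection over last-seen timestamps. The 60-second
threshold is an assumption (the text states only the 15s polling
interval), and the function name is hypothetical.

```typescript
// Assumed threshold: several missed 15s polls.
const STALE_AFTER_MS = 60_000;

// Return the IDs of agents whose last heartbeat is older than the
// threshold. Timestamps are milliseconds since the epoch.
function staleAgents(
  lastSeen: ReadonlyMap<string, number>,
  nowMs: number,
  staleAfterMs: number = STALE_AFTER_MS,
): string[] {
  const stale: string[] = [];
  for (const [agentId, seenAt] of lastSeen) {
    if (nowMs - seenAt > staleAfterMs) {
      stale.push(agentId);
    }
  }
  return stale;
}
```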
Add plugin discovery from ~/.config/bterminal/plugins/ with a plugin.json
manifest. Plugins execute in a sandboxed new Function() with a
permission-gated API (palette, btmsg:read, bttask:read, events). Add
plugin store and SettingsTab.
Add notify-rust for desktop notifications and NotificationCenter.svelte
with a bell icon, unread badge, history (max 100), and 6 notification
types. Extend the notification store with history and type support.
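
A sketch of a capped notification history, assuming newest-first
ordering and that the oldest entries are dropped past the limit. The
type and function names are hypothetical; only the limit of 100 comes
from the text above.

```typescript
const MAX_HISTORY = 100;

interface AppNotification {
  id: number;
  type: string;
  message: string;
  read: boolean;
}

// Prepend a notification and drop the oldest entries past the cap.
function addNotification(
  history: readonly AppNotification[],
  entry: AppNotification,
): AppNotification[] {
  return [entry, ...history].slice(0, MAX_HISTORY);
}

// Number shown on the unread badge.
function unreadCount(history: readonly AppNotification[]): number {
  return history.filter((n) => !n.read).length;
}
```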