Hibryda de8dd04f4b docs: add architecture, sidecar, orchestration, and production guides

New documentation covering end-to-end system architecture, multi-provider
sidecar lifecycle, btmsg/bttask multi-agent orchestration, and production
hardening features (supervisor, sandbox, search, plugins, secrets, audit).

2026-03-14 02:33:59 +01:00


Multi-Agent Orchestration

Agent Orchestrator supports running multiple AI agents that communicate with each other, coordinate work through a shared task board, and are managed by a hierarchy of specialized roles. This document covers the inter-agent messaging system (btmsg), the task board (bttask), agent roles and system prompts, and the auto-wake scheduler.

Agent Roles (Tier 1 and Tier 2)

Agents are organized into two tiers:

Tier 1 — Management Agents

Management agents are defined in groups.json under a group's agents[] array. Each one gets a full ProjectBox in the UI (converted via agentToProject() in the workspace store) and has role-specific capabilities, tabs, and system prompts.

Role	Tabs	btmsg Permissions	bttask Permissions	Purpose
Manager	Model, Tasks	Full (send, receive, create channels)	Full CRUD	Coordinates work, creates/assigns tasks, delegates to subagents
Architect	Model, Architecture	Send, receive	Read-only + comments	Designs solutions, creates PlantUML diagrams, reviews architecture
Tester	Model, Selenium, Tests	Send, receive	Read-only + comments	Runs tests, monitors screenshots, discovers test files
Reviewer	Model, Tasks	Send, receive	Read + status + comments	Reviews code, manages review queue, approves/rejects tasks

Tier 2 — Project Agents

Project agents are regular ProjectConfig entries in groups.json. Each project gets its own Claude session with optional custom context via project.systemPrompt. Project agents have the standard tabs (Model, Docs, Context, Files, SSH, Memory) but no role-specific tabs.

System Prompt Generation

Tier 1 agents receive auto-generated system prompts built by generateAgentPrompt() in utils/agent-prompts.ts. The prompt has 7 sections:

Identity — Role name, project context, team membership
Environment — Working directory, available tools, shell info
Team — List of other agents in the group with their roles
btmsg documentation — CLI usage, channel commands, message format
bttask documentation — CLI usage, task lifecycle, role-specific permissions
Custom context — Optional project.systemPrompt (Tier 2) or role-specific instructions
Workflow — Role-specific workflow guidelines (e.g., Manager delegates, Reviewer checks review queue)

Tier 2 agents receive only the custom context section (if project.systemPrompt is set), injected as the system_prompt field in AgentQueryOptions.

BTMSG_AGENT_ID

Tier 1 agents receive the BTMSG_AGENT_ID environment variable, injected via extra_env in AgentQueryOptions. This flows through 5 layers: TypeScript → Rust AgentQueryOptions → NDJSON → JS runner → SDK env. The CLI tools (btmsg, bttask) read this variable to identify which agent is sending messages or creating tasks.
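As a rough illustration of the last hop, a Python CLI could resolve the sender as shown below. Only the variable name BTMSG_AGENT_ID comes from this document; the helper name `resolve_agent_id` and the error message are hypothetical, and the real CLI may behave differently.

```python
import os


def resolve_agent_id(env=None):
    """Return the calling agent's ID from BTMSG_AGENT_ID (hypothetical helper)."""
    env = os.environ if env is None else env
    agent_id = env.get("BTMSG_AGENT_ID", "").strip()
    if not agent_id:
        # Without the variable the CLI cannot tell which agent is acting.
        raise RuntimeError("BTMSG_AGENT_ID is not set")
    return agent_id
```

Passing the environment as a parameter keeps the lookup easy to test without mutating the process environment.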

Periodic Re-injection

LLM context degrades over long sessions as important instructions scroll out of the context window. To counter this, AgentSession runs a 1-hour timer that re-sends the system prompt when the agent is idle. The mechanism:

1. The AgentSession timer fires after 60 minutes of agent inactivity.
2. It sets the autoPrompt flag, which AgentPane reads via the onautopromptconsumed callback.
3. AgentPane calls startQuery() with resume=true and the refresh prompt.
4. The agent receives the role/tools reminder as a follow-up message.
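The firing condition can be sketched as a pure check, written here in Python for brevity. The function name and its parameters are hypothetical; the actual timer lives in AgentSession.

```python
REINJECT_AFTER_S = 60 * 60  # one hour of inactivity, per the description above


def should_reinject(last_activity_s, now_s, agent_idle):
    """True when an idle agent has been inactive for at least one hour."""
    return bool(agent_idle) and (now_s - last_activity_s) >= REINJECT_AFTER_S
```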

btmsg — Inter-Agent Messaging

btmsg is a messaging system that lets agents communicate with each other. It consists of a Rust backend (SQLite), a Python CLI tool (for agents to use in their shell), and a Svelte frontend (CommsTab).

Architecture

Agent (via btmsg CLI)
    │
    ├── btmsg send <recipient> "message"     → writes to btmsg.db
    ├── btmsg read                           → reads from btmsg.db
    ├── btmsg channel create #review-queue   → creates channel
    ├── btmsg channel post #review-queue "msg" → posts to channel
    └── btmsg heartbeat                      → updates agent heartbeat
         │
         ▼
btmsg.db (SQLite, WAL mode, ~/.local/share/bterminal/btmsg.db)
    │
    ├── agents table        — registered agents with roles
    ├── messages table      — DMs and channel messages
    ├── channels table      — named channels (#review-queue, #review-log)
    ├── contacts table      — ACL (who can message whom)
    ├── heartbeats table    — agent liveness tracking
    ├── dead_letter_queue   — undeliverable messages
    └── audit_log           — all operations for debugging
         │
         ▼
Rust Backend (btmsg.rs, ~600 lines)
    │
    ├── btmsg_list_messages, btmsg_send_message, ...
    ├── 15+ Tauri commands for full CRUD
    └── Shared database connection (WAL + 5s busy_timeout)
         │
         ▼
Frontend (btmsg-bridge.ts → CommsTab.svelte)
    ├── Activity feed — all messages across all agents
    ├── DM view — direct messages between specific agents
    └── Channel view — channel messages (#review-queue, etc.)

Database Schema

The btmsg database (btmsg.db) stores all messaging data:

Table	Purpose	Key Columns
agents	Agent registry	id, name, role, project_id, status, created_at
messages	All messages	id, sender_id, recipient_id, channel_id, content, read, created_at
channels	Named channels	id, name, created_by, created_at
contacts	ACL	agent_id, contact_id (bidirectional)
heartbeats	Liveness	agent_id, last_heartbeat, status
dead_letter_queue	Failed delivery	message_id, reason, created_at
audit_log	All operations	id, event_type, agent_id, details, created_at
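To make the table above concrete, here is a minimal sketch of opening such a database with Python's sqlite3 module and creating two of the tables. The column names, WAL mode, and the 5-second busy timeout come from this document; the column types, defaults, and the function name `open_btmsg_db` are assumptions, not the project's actual DDL.

```python
import sqlite3


def open_btmsg_db(path):
    """Open a btmsg-style database (sketch; types and defaults are assumed)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")   # concurrent readers, single writer
    conn.execute("PRAGMA busy_timeout=5000")  # wait up to 5 s on a locked database
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS agents (
            id TEXT PRIMARY KEY,
            name TEXT NOT NULL,
            role TEXT NOT NULL,
            project_id TEXT,
            status TEXT NOT NULL DEFAULT 'active',
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            sender_id TEXT NOT NULL,
            recipient_id TEXT,   -- set for direct messages
            channel_id TEXT,     -- set for channel posts
            content TEXT NOT NULL,
            "read" INTEGER NOT NULL DEFAULT 0,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        """
    )
    return conn
```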

CLI Usage (for agents)

Agents use the btmsg Python CLI tool in their shell. The tool reads BTMSG_AGENT_ID to identify the sender:

# Send a direct message
btmsg send architect "Please review the auth module design"

# Read unread messages
btmsg read

# Create a channel
btmsg channel create #architecture-decisions

# Post to a channel
btmsg channel post #review-queue "PR #42 ready for review"

# Send heartbeat (agents do this periodically)
btmsg heartbeat

# List all agents
btmsg agents

Frontend (CommsTab)

The CommsTab component (rendered in ProjectBox for all agents) shows:

Activity Feed — chronological view of all messages across all agents
DMs — direct message threads between agents
Channels — named channel message streams
Polling-based updates (5s interval)

Dead Letter Queue

Messages sent to non-existent or offline agents are moved to the dead letter queue instead of being silently dropped. The Rust backend checks agent status before delivery and queues failures. The Manager agent's health dashboard shows the dead letter count.
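The delivery check can be sketched as follows, in Python rather than the backend's Rust. The 'offline' status value, the reason strings, and the function name `deliver` are assumptions made for illustration; the tables are expected to exist already.

```python
import sqlite3


def deliver(conn, message_id, recipient_id):
    """Return True if deliverable; otherwise record the message in the DLQ."""
    row = conn.execute(
        "SELECT status FROM agents WHERE id = ?", (recipient_id,)
    ).fetchone()
    if row is None:
        reason = "unknown_recipient"   # assumed reason label
    elif row[0] == "offline":          # assumed status value
        reason = "recipient_offline"
    else:
        return True
    conn.execute(
        "INSERT INTO dead_letter_queue (message_id, reason) VALUES (?, ?)",
        (message_id, reason),
    )
    conn.commit()
    return False
```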

Audit Logging

Every btmsg operation is logged to the audit_log table with event type, agent ID, and JSON details. Event types include: message_sent, message_read, channel_created, agent_registered, heartbeat, and prompt_injection_detected.

bttask — Task Board

bttask is a kanban-style task board that agents use to coordinate work. It shares the same SQLite database as btmsg (btmsg.db) for deployment simplicity.

Architecture

Agent (via bttask CLI)
    │
    ├── bttask list                    → list all tasks
    ├── bttask create "Fix auth bug"   → create task (Manager only)
    ├── bttask status <id> in_progress → update status
    ├── bttask comment <id> "Done"     → add comment
    └── bttask review-count            → count review queue tasks
         │
         ▼
btmsg.db → tasks table + task_comments table
    │
    ▼
Rust Backend (bttask.rs, ~300 lines)
    │
    ├── 7 Tauri commands: list, create, update_status, delete, add_comment, comments, review_queue_count
    └── Optimistic locking via version column
         │
         ▼
Frontend (bttask-bridge.ts → TaskBoardTab.svelte)
    └── Kanban board: 5 columns, 5s poll, drag-and-drop

Task Lifecycle

┌──────────┐   assign   ┌─────────────┐   complete   ┌──────────┐
│ Backlog  │───────────►│ In Progress │─────────────►│  Review  │
└──────────┘            └─────────────┘              └──────────┘
                                                          │
                                              ┌───────────┼───────────┐
                                              ▼                       ▼
                                         ┌────────┐             ┌──────────┐
                                         │  Done  │             │ Rejected │
                                         └────────┘             └──────────┘

When a task moves to the "Review" column, the system automatically posts a notification to the #review-queue btmsg channel. The ensure_review_channels() function creates #review-queue and #review-log channels idempotently on first use.

Optimistic Locking

To prevent concurrent updates from corrupting task state, bttask uses optimistic locking via a version column:

1. The client reads the task with its current version (e.g., version=3).
2. The client sends an update with expected version=3.
3. The server's UPDATE query includes WHERE version = 3.
4. If another client updated first (so the stored version is now 4), the WHERE clause matches 0 rows.
5. The server returns a conflict error; the client must re-read and retry.

This is critical because multiple agents may try to update the same task simultaneously.
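The steps above can be sketched with Python's sqlite3 module. This is an illustration of the technique, not the code in bttask.rs; the `VersionConflict` exception, the function name, and the minimal tasks schema it expects (id, status, version) are assumptions.

```python
import sqlite3


class VersionConflict(Exception):
    """Raised when the stored version no longer matches the expected one."""


def update_status(conn, task_id, new_status, expected_version):
    """Update a task only if nobody else has changed it since it was read."""
    cur = conn.execute(
        "UPDATE tasks SET status = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_status, task_id, expected_version),
    )
    conn.commit()
    if cur.rowcount == 0:
        # Either the task does not exist or another writer bumped the version.
        raise VersionConflict(f"task {task_id}: expected version {expected_version}")
    return expected_version + 1
```

Because the version check and the write happen in a single UPDATE statement, there is no window between reading and writing in which a second agent could slip in.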

Role-Based Permissions

Role	List	Create	Update Status	Delete	Comments
Manager	Yes	Yes	Yes	Yes	Yes
Reviewer	Yes	No	Yes (review decisions)	No	Yes
Architect	Yes	No	No	No	Yes
Tester	Yes	No	No	No	Yes
Project (Tier 2)	Yes	No	No	No	Yes

Permissions are enforced in the CLI tool based on the agent's role (read from BTMSG_AGENT_ID → agents table lookup).
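A permission check of this kind can be expressed as a lookup table. The sketch below mirrors the matrix above; the lowercase role and action identifiers and the `is_allowed` helper are hypothetical names, and the restriction of Reviewer updates to review decisions is not modelled.

```python
# Allowed bttask actions per role, transcribed from the table above.
PERMISSIONS = {
    "manager": {"list", "create", "update_status", "delete", "comment"},
    "reviewer": {"list", "update_status", "comment"},
    "architect": {"list", "comment"},
    "tester": {"list", "comment"},
    "project": {"list", "comment"},  # Tier 2 project agents
}


def is_allowed(role, action):
    """Unknown roles get no permissions at all."""
    return action in PERMISSIONS.get(role, set())
```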

Review Queue Integration

The Reviewer agent gets special treatment in the attention scoring system:

reviewQueueDepth is an input to attention scoring: 10 points per review task, capped at 50
Priority: between file_conflict (70) and context_high (40)
ProjectBox polls review_queue_count every 10 seconds for reviewer agents
Results feed into setReviewQueueDepth() in the health store

Frontend (TaskBoardTab.svelte)

The kanban board renders 5 columns (Backlog, In Progress, Review, Done, Rejected) with task cards. Features:

5-second polling for updates
Click to expand task details + comments
Manager-only create/delete buttons
Color-coded status badges

Wake Scheduler

The wake scheduler automatically re-activates idle Manager agents when attention-worthy events occur. It runs in wake-scheduler.svelte.ts and supports three user-selectable strategies.

Strategies

Strategy	Behavior	Use Case
Persistent	Sends a resume prompt to the existing session	Long-running managers that should maintain context
On-demand	Starts a fresh session	Managers that work in bursts
Smart	On-demand, but only when wake score exceeds threshold	Avoids waking for minor events

Strategy and threshold are configurable per group agent via GroupAgentConfig.wakeStrategy and GroupAgentConfig.wakeThreshold fields, persisted in groups.json.

Wake Signals

The wake scorer evaluates 6 signals (defined in types/wake.ts, scored by utils/wake-scorer.ts):

Signal	Weight	Trigger
AttentionSpike	1.0	Any project's attention score exceeds threshold
ContextPressureCluster	0.9	Multiple projects have >75% context usage
BurnRateAnomaly	0.8	Cost rate deviates significantly from baseline
TaskQueuePressure	0.7	Task backlog grows beyond threshold
ReviewBacklog	0.6	Review queue has pending items
PeriodicFloor	0.1	Minimum periodic check (floor signal)

The pure scoring function in wake-scorer.ts is tested with 24 unit tests. The types are in types/wake.ts (WakeStrategy, WakeSignal, WakeEvaluation, WakeContext).
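This document does not state how the six signals are combined, so the sketch below makes two loud assumptions: the wake score is the highest weight among the active signals, and the non-smart strategies wake on any active signal. The weights come from the table above; the function names and the default threshold of 0.5 are hypothetical.

```python
# Signal weights from the table above.
SIGNAL_WEIGHTS = {
    "AttentionSpike": 1.0,
    "ContextPressureCluster": 0.9,
    "BurnRateAnomaly": 0.8,
    "TaskQueuePressure": 0.7,
    "ReviewBacklog": 0.6,
    "PeriodicFloor": 0.1,
}


def wake_score(active_signals):
    """Assumed combination rule: the strongest active signal wins."""
    return max((SIGNAL_WEIGHTS[s] for s in active_signals), default=0.0)


def should_wake(strategy, active_signals, threshold=0.5):
    """Smart waits for the score to exceed the threshold; others wake on any signal."""
    score = wake_score(active_signals)
    if strategy == "smart":
        return score > threshold
    return score > 0.0
```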

Lifecycle

ProjectBox registers manager agents via $effect on mount
Wake scheduler creates per-manager timers
Every 5 seconds, AgentSession polls wake events
If score exceeds threshold (for smart strategy), triggers wake
On group switch, clearWakeScheduler() cancels all timers
In test mode (BTERMINAL_TEST=1), wake scheduler is disabled via disableWakeScheduler()

Health Monitoring & Attention Scoring

The health store (health.svelte.ts) tracks per-project health with a 5-second tick timer. It provides the data that feeds the StatusBar, wake scheduler, and attention queue.

Activity States

State	Meaning	Visual
Inactive	No agent running, no recent activity	Dim dot
Running	Agent actively processing	Green pulse
Idle	Agent finished, waiting for input	Gray dot
Stalled	Agent hasn't produced output for >N minutes	Orange pulse

The stall threshold is configurable per-project via stallThresholdMin in ProjectConfig (default 15 min, range 5-60, step 5).
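Stall detection can be sketched as a small classifier. It assumes that only a running agent can become stalled and that the caller already knows the base state; the function name and the string state labels are hypothetical.

```python
def classify(status, minutes_since_output, stall_threshold_min=15):
    """Promote a running agent to 'stalled' once it has been silent too long.

    status is one of 'inactive', 'running', or 'idle' (assumed labels).
    """
    if status == "running" and minutes_since_output > stall_threshold_min:
        return "stalled"
    return status
```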

Attention Scoring

Each project gets an attention score (0-100) based on its current state. The attention queue in the StatusBar shows the top 5 projects sorted by urgency:

Condition	Score	Priority
Stalled agent	100	Highest — agent may be stuck
Error state	90	Agent crashed or API error
Context >90%	80	Context window nearly full
File conflict	70	Two agents wrote same file
Review queue depth	10/task, cap 50	Reviewer has pending reviews
Context >75%	40	Context pressure building

The pure scoring function is in utils/attention-scorer.ts (14 tests). It takes AttentionInput and returns a numeric score.
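A Python sketch of such a function follows. The scores are taken from the table above; the fields of `AttentionInput` and the rule that the highest applicable score wins are assumptions, since the document names the type but does not list its fields or the combination rule.

```python
from dataclasses import dataclass


@dataclass
class AttentionInput:
    # Field names are assumed for illustration.
    stalled: bool = False
    error: bool = False
    context_pct: float = 0.0
    file_conflict: bool = False
    review_queue_depth: int = 0


def attention_score(inp):
    """Return the highest score among the conditions that apply (0 if none)."""
    candidates = [0]
    if inp.stalled:
        candidates.append(100)
    if inp.error:
        candidates.append(90)
    if inp.context_pct > 90:
        candidates.append(80)
    if inp.file_conflict:
        candidates.append(70)
    if inp.review_queue_depth > 0:
        candidates.append(min(10 * inp.review_queue_depth, 50))  # 10 per task, cap 50
    if inp.context_pct > 75:
        candidates.append(40)
    return max(candidates)
```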

Burn Rate

Cost tracking uses a 5-minute exponential moving average (EMA) of cost snapshots. The StatusBar displays aggregate $/hr across all running agents.
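One common way to build a time-based EMA is shown below. It is a sketch under stated assumptions: the smoothing factor is derived from the elapsed time and a 300-second window, and the function name and parameters are hypothetical rather than taken from the health store.

```python
import math


def update_burn_rate(prev_rate, cost_delta_usd, dt_s, window_s=300.0):
    """Fold one cost snapshot into an EMA of spend, expressed in $/hr."""
    if dt_s <= 0:
        return prev_rate  # no time elapsed, nothing to fold in
    instant_rate = cost_delta_usd / dt_s * 3600.0   # $/hr over this interval
    alpha = 1.0 - math.exp(-dt_s / window_s)        # longer gaps weigh more
    return prev_rate + alpha * (instant_rate - prev_rate)
```

With a steady spend the average converges to the true hourly rate, while a single expensive turn moves it only partially.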

File Conflict Detection

The conflicts store (conflicts.svelte.ts) detects two types of conflicts:

Agent overlap: two agents in the same worktree write the same file (tracked via tool_call analysis in the dispatcher).
External writes: a file watched by an agent is modified externally. These are detected via inotify in fs_watcher.rs, and a 2-second timing heuristic (AGENT_WRITE_GRACE_MS) distinguishes agent writes from external ones.

Both types show badges in ProjectHeader (orange ⚡ for external, red ⚠ for agent overlap).
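Both checks can be sketched in a few lines of Python. The constant name and its 2-second value come from this document; the function names, the millisecond timestamps, and the dictionary used to track writers are assumptions for illustration.

```python
AGENT_WRITE_GRACE_MS = 2000  # 2 s grace window named in the text


def is_external_write(fs_event_ms, last_agent_write_ms):
    """Treat a filesystem event as external unless an agent wrote the file
    within the grace window around the event."""
    if last_agent_write_ms is None:
        return True
    return abs(fs_event_ms - last_agent_write_ms) > AGENT_WRITE_GRACE_MS


def record_write(writers, path, agent_id):
    """Track which agents wrote each file; True signals an agent overlap."""
    agents = writers.setdefault(path, set())
    agents.add(agent_id)
    return len(agents) > 1
```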

Session Anchors

Session anchors preserve important conversation turns through Claude's context compaction process. Without anchors, valuable early context (architecture decisions, debugging breakthroughs) can be lost when the context window fills up.

Anchor Types

Type	Created By	Behavior
Auto	System (on first compaction)	Captures first 3 turns, observation-masked (reasoning preserved, tool outputs compacted)
Pinned	User (pin button in AgentPane)	Marks specific turns as important
Promoted	User (from pinned)	Re-injectable into future sessions via system prompt

Anchor Budget

The budget controls how many tokens are spent on anchor re-injection:

Scale	Token Budget	Use Case
Small	2,000	Quick sessions, minimal context needed
Medium	6,000	Default, covers most scenarios
Large	12,000	Complex debugging sessions
Full	20,000	Maximum context preservation

The scale is configurable per project via a slider in SettingsTab and is stored as ProjectConfig.anchorBudgetScale in groups.json.
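The effect of the budget can be sketched as a greedy selection. The token budgets are taken from the table above; the four-characters-per-token estimate, the stop-at-first-overflow rule, and the function names are assumptions and may differ from anchor-serializer.ts.

```python
ANCHOR_BUDGETS = {"small": 2000, "medium": 6000, "large": 12000, "full": 20000}


def estimate_tokens(text):
    """Rough heuristic (assumed): about four characters per token."""
    return max(1, len(text) // 4)


def select_anchors(anchors, scale="medium"):
    """Keep anchors in order until the next one would exceed the budget."""
    budget = ANCHOR_BUDGETS[scale]
    chosen, used = [], 0
    for text in anchors:
        cost = estimate_tokens(text)
        if used + cost > budget:
            break
        chosen.append(text)
        used += cost
    return chosen, used
```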

Re-injection Flow

When a session resumes with promoted anchors:

1. anchors.svelte.ts loads promoted anchors for the project.
2. anchor-serializer.ts serializes them (turn grouping, observation masking, token estimation).
3. AgentPane.startQuery() includes the serialized anchors in the system_prompt field.
4. The sidecar passes the system prompt to the SDK.
5. Claude receives the anchors as context alongside the new prompt.

Storage

Anchors are persisted in the session_anchors table in sessions.db. The ContextTab shows an anchor section with a budget meter (derived from the configured scale) and promote/demote buttons.
