docs: move orchestration and sidecar docs into subdirectories

Hibryda 2026-03-17 04:19:04 +01:00
parent b6c1d4b6af
commit 8641f260f7
2 changed files with 376 additions and 0 deletions


@@ -0,0 +1,197 @@
# Multi-Agent Orchestration
Agor supports running multiple AI agents that communicate with each other, coordinate work through a shared task board, and are managed by a hierarchy of specialized roles. This document covers the inter-agent messaging system (btmsg), the task board (bttask), agent roles and system prompts, and the auto-wake scheduler.
---
## Agent Roles (Tier 1 and Tier 2)
### Tier 1 — Management Agents
Defined in `groups.json` under a group's `agents[]` array. Each management agent gets a full ProjectBox in the UI (converted via `agentToProject()`). They have role-specific capabilities, tabs, and system prompts.
| Role | Tabs | btmsg Permissions | bttask Permissions | Purpose |
|------|------|-------------------|-------------------|---------|
| **Manager** | Model, Tasks | Full (send, receive, create channels) | Full CRUD | Coordinates work, creates/assigns tasks |
| **Architect** | Model, Architecture | Send, receive | Read-only + comments | Designs solutions, creates PlantUML diagrams |
| **Tester** | Model, Selenium, Tests | Send, receive | Read-only + comments | Runs tests, monitors screenshots |
| **Reviewer** | Model, Tasks | Send, receive | Read + status + comments | Reviews code, manages review queue |
### Tier 2 — Project Agents
Regular `ProjectConfig` entries in `groups.json`. Each project gets its own Claude session with optional custom context via `project.systemPrompt`. Projects get the standard tabs (Model, Docs, Context, Files, SSH, Memory) but no role-specific tabs.
### System Prompt Generation
Tier 1 agents receive auto-generated system prompts built by `generateAgentPrompt()` in `utils/agent-prompts.ts` with 7 sections: Identity, Environment, Team, btmsg docs, bttask docs, Custom context, Workflow.
Tier 2 agents receive only the custom context section (if `project.systemPrompt` is set).
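As a rough illustration of the seven-section layout, the sketch below assembles a prompt from named sections. It is not the actual `generateAgentPrompt()` implementation: the `PromptInput` shape, the `buildAgentPrompt` name, the placeholder section bodies, and the choice to skip empty sections are all assumptions made for this example.

```typescript
// Hypothetical input shape; the real types in utils/agent-prompts.ts may differ.
interface PromptInput {
  role: string;
  agentId: string;
  cwd: string;
  teammates: string[];
  customContext?: string;
}

// Builds the prompt in the section order listed above.
// Section bodies are placeholders, not the real prompt text.
function buildAgentPrompt(input: PromptInput): string {
  const sections: Array<[string, string]> = [
    ["Identity", `You are ${input.agentId}, acting as ${input.role}.`],
    ["Environment", `Working directory: ${input.cwd}`],
    ["Team", input.teammates.join(", ")],
    ["btmsg", "(btmsg CLI documentation goes here)"],
    ["bttask", "(bttask CLI documentation goes here)"],
    ["Custom context", input.customContext ?? ""],
    ["Workflow", "(role-specific workflow goes here)"],
  ];
  return sections
    .filter(([, body]) => body.trim().length > 0) // assumption: skip empty sections
    .map(([title, body]) => `## ${title}\n${body}`)
    .join("\n\n");
}
```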
### BTMSG_AGENT_ID
Tier 1 agents receive the `BTMSG_AGENT_ID` environment variable, injected via `extra_env` in AgentQueryOptions. This flows through 5 layers: TypeScript -> Rust -> NDJSON -> JS runner -> SDK env. The CLI tools (`btmsg`, `bttask`) read this variable to identify which agent is sending messages.
### Periodic Re-injection
AgentSession runs a 1-hour timer that re-sends the system prompt when the agent is idle, countering LLM context degradation over long sessions.
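The re-injection check can be pictured as a small pure predicate that a timer tick calls. This is only a sketch: `shouldReinject` and its parameters are hypothetical names, and the real AgentSession logic may track idleness differently.

```typescript
const REINJECT_INTERVAL_MS = 60 * 60 * 1000; // 1 hour, per the text above

// Hypothetical helper: returns true when a timer tick should re-send the
// system prompt. Timestamps are milliseconds since the epoch.
function shouldReinject(
  lastInjectedAt: number,
  now: number,
  agentIsIdle: boolean,
): boolean {
  return agentIsIdle && now - lastInjectedAt >= REINJECT_INTERVAL_MS;
}
```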
---
## btmsg — Inter-Agent Messaging
btmsg lets agents communicate with each other via a Rust backend (SQLite), a Python CLI tool, and a Svelte frontend (CommsTab).
### Database Schema
The btmsg database (`btmsg.db`, stored at `~/.local/share/agor/btmsg.db`) holds all messaging data:
| Table | Purpose | Key Columns |
|-------|---------|-------------|
| `agents` | Agent registry | id, name, role, project_id, status, created_at |
| `messages` | All messages | id, sender_id, recipient_id, channel_id, content, read, created_at |
| `channels` | Named channels | id, name, created_by, created_at |
| `contacts` | ACL | agent_id, contact_id (bidirectional) |
| `heartbeats` | Liveness | agent_id, last_heartbeat, status |
| `dead_letter_queue` | Failed delivery | message_id, reason, created_at |
| `audit_log` | All operations | id, event_type, agent_id, details, created_at |
### CLI Usage (for agents)
```bash
btmsg send architect "Please review the auth module design"
btmsg read
btmsg channel create #architecture-decisions
btmsg channel post #review-queue "PR #42 ready for review"
btmsg heartbeat
btmsg agents
```
### Dead Letter Queue
Messages sent to non-existent or offline agents are moved to the dead letter queue instead of being silently dropped.
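The routing decision can be sketched as follows, using in-memory stand-ins for the `agents` and `dead_letter_queue` tables. The function name, the status values, and the reason strings are assumptions for illustration; the real logic lives in the Rust backend.

```typescript
type AgentStatus = "online" | "offline"; // assumed status values

interface DeadLetter {
  messageId: string;
  reason: string;
  createdAt: number;
}

// Delivers to online agents; everything else is recorded in the dead letter
// list with a reason instead of being dropped.
function routeMessage(
  messageId: string,
  recipientId: string,
  registry: Map<string, AgentStatus>,
  deadLetters: DeadLetter[],
  now: number,
): "delivered" | "dead_lettered" {
  const recipientStatus = registry.get(recipientId);
  if (recipientStatus === "online") return "delivered";
  const reason =
    recipientStatus === undefined ? "unknown recipient" : "recipient offline";
  deadLetters.push({ messageId, reason, createdAt: now });
  return "dead_lettered";
}
```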
---
## bttask — Task Board
bttask is a kanban-style task board sharing the same SQLite database as btmsg (`btmsg.db`).
### Task Lifecycle
```
Backlog -> In Progress -> Review -> Done / Rejected
```
When a task moves to "Review", the system auto-posts to the `#review-queue` btmsg channel.
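A minimal sketch of the lifecycle as a transition table, with the review notification as a side effect. The status identifiers, the `moveTask` name, and the `postToChannel` callback are hypothetical, and the sketch assumes that only the forward moves shown in the diagram are allowed.

```typescript
type TaskStatus = "backlog" | "in_progress" | "review" | "done" | "rejected";

// Assumption: only the forward moves from the lifecycle diagram are valid.
const ALLOWED_TRANSITIONS: Record<TaskStatus, readonly TaskStatus[]> = {
  backlog: ["in_progress"],
  in_progress: ["review"],
  review: ["done", "rejected"],
  done: [],
  rejected: [],
};

// `postToChannel` is a hypothetical stand-in for a btmsg channel post.
function moveTask(
  current: TaskStatus,
  next: TaskStatus,
  taskTitle: string,
  postToChannel: (channel: string, text: string) => void,
): TaskStatus {
  if (!ALLOWED_TRANSITIONS[current].includes(next)) {
    throw new Error(`invalid transition: ${current} -> ${next}`);
  }
  if (next === "review") {
    postToChannel("#review-queue", `Task ready for review: ${taskTitle}`);
  }
  return next;
}
```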
### Optimistic Locking
To prevent concurrent updates from corrupting task state, bttask uses a `version` column:
1. Client reads task with current version (e.g., version=3)
2. Client sends update with expected version=3
3. Server's UPDATE includes `WHERE version = 3`
4. If another client updated first (version=4), WHERE matches 0 rows -> conflict error
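The four steps above can be sketched with an in-memory map standing in for the SQLite table. The `TaskRow` shape, the `tasks` table name in the comment, and the error class are assumptions; the point is only the compare-and-increment on `version`.

```typescript
interface TaskRow {
  id: string;
  status: string;
  version: number;
}

class VersionConflictError extends Error {}

// In-memory stand-in for something like:
//   UPDATE tasks SET status = ?, version = version + 1
//   WHERE id = ? AND version = ?
// A missing row or a stale version corresponds to "0 rows matched".
function updateStatus(
  rows: Map<string, TaskRow>,
  id: string,
  newStatus: string,
  expectedVersion: number,
): TaskRow {
  const row = rows.get(id);
  if (!row || row.version !== expectedVersion) {
    throw new VersionConflictError(
      `task ${id}: expected version ${expectedVersion}`,
    );
  }
  const updated = { ...row, status: newStatus, version: row.version + 1 };
  rows.set(id, updated);
  return updated;
}
```

Two clients that both read version 3 illustrate the conflict: the first update succeeds and bumps the row to version 4, and the second, still expecting version 3, is rejected and must re-read the task before retrying.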
### Role-Based Permissions
| Role | List | Create | Update Status | Delete | Comments |
|------|------|--------|---------------|--------|----------|
| Manager | Yes | Yes | Yes | Yes | Yes |
| Reviewer | Yes | No | Yes (review decisions) | No | Yes |
| Architect | Yes | No | No | No | Yes |
| Tester | Yes | No | No | No | Yes |
| Project (Tier 2) | Yes | No | No | No | Yes |
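The permission table translates directly into a lookup structure. The sketch below encodes the same rows; the role and action identifiers and the `can` helper are illustrative names, not the actual bttask API.

```typescript
type BoardRole = "manager" | "reviewer" | "architect" | "tester" | "project";
type BoardAction = "list" | "create" | "updateStatus" | "delete" | "comment";

// One entry per row of the table above.
const BOARD_PERMISSIONS: Record<BoardRole, readonly BoardAction[]> = {
  manager: ["list", "create", "updateStatus", "delete", "comment"],
  reviewer: ["list", "updateStatus", "comment"], // status = review decisions
  architect: ["list", "comment"],
  tester: ["list", "comment"],
  project: ["list", "comment"],
};

function can(role: BoardRole, action: BoardAction): boolean {
  return BOARD_PERMISSIONS[role].includes(action);
}
```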
### Review Queue Integration
The Reviewer agent gets special treatment in attention scoring: `reviewQueueDepth` adds 10 points per review task (capped at 50). ProjectBox polls `review_queue_count` every 10 seconds for reviewer agents.
---
## Wake Scheduler
The wake scheduler automatically re-activates idle Manager agents when attention-worthy events occur (`wake-scheduler.svelte.ts`).
### Strategies
| Strategy | Behavior | Use Case |
|----------|----------|----------|
| **Persistent** | Resume prompt to existing session | Long-running managers |
| **On-demand** | Fresh session | Burst-work managers |
| **Smart** | On-demand when score exceeds threshold | Avoids waking for minor events |
### Wake Signals
| Signal | Weight | Trigger |
|--------|--------|---------|
| AttentionSpike | 1.0 | Project attention score exceeds threshold |
| ContextPressureCluster | 0.9 | Multiple projects >75% context usage |
| BurnRateAnomaly | 0.8 | Cost rate deviates from baseline |
| TaskQueuePressure | 0.7 | Task backlog grows beyond threshold |
| ReviewBacklog | 0.6 | Review queue has pending items |
| PeriodicFloor | 0.1 | Minimum periodic check |
The scoring logic is a pure function in `wake-scorer.ts` (covered by 24 tests); the types live in `types/wake.ts`.
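The source does not spell out how the weights are combined, so the following is only one plausible reading: each active signal carries a strength between 0 and 1, and the wake score is the largest weighted strength. The `strength` field, the max-combination rule, and the function names are all assumptions, not the contents of `wake-scorer.ts`.

```typescript
type WakeSignalKind =
  | "AttentionSpike"
  | "ContextPressureCluster"
  | "BurnRateAnomaly"
  | "TaskQueuePressure"
  | "ReviewBacklog"
  | "PeriodicFloor";

// Weights copied from the table above.
const SIGNAL_WEIGHTS: Record<WakeSignalKind, number> = {
  AttentionSpike: 1.0,
  ContextPressureCluster: 0.9,
  BurnRateAnomaly: 0.8,
  TaskQueuePressure: 0.7,
  ReviewBacklog: 0.6,
  PeriodicFloor: 0.1,
};

interface ActiveSignal {
  kind: WakeSignalKind;
  strength: number; // assumed to be in [0, 1]; clamped below
}

// Assumption: the strongest weighted signal determines the score.
function wakeScore(signals: ActiveSignal[]): number {
  return signals.reduce((best, s) => {
    const clamped = Math.min(1, Math.max(0, s.strength));
    return Math.max(best, SIGNAL_WEIGHTS[s.kind] * clamped);
  }, 0);
}

// "Smart" strategy sketch: wake only when the score reaches the threshold.
function shouldWake(signals: ActiveSignal[], threshold: number): boolean {
  return wakeScore(signals) >= threshold;
}
```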
---
## Health Monitoring & Attention Scoring
The health store (`health.svelte.ts`) tracks per-project health with a 5-second tick timer.
### Activity States
| State | Meaning | Visual |
|-------|---------|--------|
| Inactive | No agent running | Dim dot |
| Running | Agent actively processing | Green pulse |
| Idle | Agent finished, waiting for input | Gray dot |
| Stalled | No output for >N minutes | Orange pulse |
Stall threshold is configurable per-project (default 15 min, range 5-60, step 5).
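The four states can be derived from a small snapshot on each tick. The sketch below is an assumption about how such a classifier might look; the `AgentSnapshot` fields and function names are hypothetical, and only the threshold defaults and range come from the text above.

```typescript
type ActivityState = "inactive" | "running" | "idle" | "stalled";

interface AgentSnapshot {
  hasAgent: boolean; // false when no agent is running for the project
  isProcessing: boolean; // false once the agent has finished and awaits input
  lastOutputAt: number; // ms since epoch
}

// Clamps to the documented range (5-60 min) and snaps to the 5-minute step.
function normalizeStallThreshold(minutes: number): number {
  const stepped = Math.round(minutes / 5) * 5;
  return Math.min(60, Math.max(5, stepped));
}

function classifyActivity(
  snapshot: AgentSnapshot,
  now: number,
  stallThresholdMin = 15,
): ActivityState {
  if (!snapshot.hasAgent) return "inactive";
  if (!snapshot.isProcessing) return "idle";
  const silentMs = now - snapshot.lastOutputAt;
  const limitMs = normalizeStallThreshold(stallThresholdMin) * 60_000;
  return silentMs > limitMs ? "stalled" : "running";
}
```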
### Attention Scoring
| Condition | Score |
|-----------|-------|
| Stalled agent | 100 |
| Error state | 90 |
| Context >90% | 80 |
| File conflict | 70 |
| Review queue depth | 10/task, cap 50 |
| Context >75% | 40 |
The scoring logic is a pure function in `utils/attention-scorer.ts` (covered by 14 tests).
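The table gives a score per condition but not how several conditions combine. The sketch below assumes the highest applicable score wins, which is one reasonable reading and not necessarily what `utils/attention-scorer.ts` does. The `ProjectHealth` shape is hypothetical.

```typescript
interface ProjectHealth {
  stalled: boolean;
  hasError: boolean;
  contextUsage: number; // fraction of the context window in use, 0..1
  hasFileConflict: boolean;
  reviewQueueDepth: number; // only meaningful for reviewer agents
}

// Assumption: the single highest-scoring condition determines the result.
function attentionScore(h: ProjectHealth): number {
  const candidates = [
    h.stalled ? 100 : 0,
    h.hasError ? 90 : 0,
    h.contextUsage > 0.9 ? 80 : h.contextUsage > 0.75 ? 40 : 0,
    h.hasFileConflict ? 70 : 0,
    Math.min(h.reviewQueueDepth * 10, 50), // 10 per task, capped at 50
  ];
  return Math.max(...candidates);
}
```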
### File Conflict Detection
`conflicts.svelte.ts` detects two types of conflict:
1. **Agent overlap** — Two agents in the same worktree write the same file
2. **External writes** — File modified externally (detected via inotify, 2s timing heuristic)
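The first conflict type, agent overlap, amounts to grouping writes by worktree and file and flagging groups with more than one writer. The sketch below shows that grouping with hypothetical `FileWrite` and `Overlap` shapes; it does not cover external-write detection, which depends on inotify events and timing.

```typescript
interface FileWrite {
  agentId: string;
  worktree: string;
  filePath: string;
}

interface Overlap {
  worktree: string;
  filePath: string;
  agentIds: string[];
}

// Flags every (worktree, file) pair written by two or more distinct agents.
function findAgentOverlaps(writes: FileWrite[]): Overlap[] {
  const groups = new Map<string, { worktree: string; filePath: string; agents: Set<string> }>();
  for (const w of writes) {
    const key = `${w.worktree}\u0000${w.filePath}`; // NUL cannot appear in paths
    let group = groups.get(key);
    if (!group) {
      group = { worktree: w.worktree, filePath: w.filePath, agents: new Set() };
      groups.set(key, group);
    }
    group.agents.add(w.agentId);
  }
  return Array.from(groups.values())
    .filter((g) => g.agents.size > 1)
    .map((g) => ({
      worktree: g.worktree,
      filePath: g.filePath,
      agentIds: Array.from(g.agents),
    }));
}
```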
---
## Session Anchors
Session anchors preserve important conversation turns through Claude's context compaction process.
### Anchor Types
| Type | Created By | Behavior |
|------|-----------|----------|
| **Auto** | System (first compaction) | Captures first 3 turns, observation-masked |
| **Pinned** | User (pin button) | Marks specific turns as important |
| **Promoted** | User (from pinned) | Re-injectable via system prompt |
### Anchor Budget
| Scale | Token Budget | Use Case |
|-------|-------------|----------|
| Small | 2,000 | Quick sessions |
| Medium | 6,000 | Default |
| Large | 12,000 | Complex debugging |
| Full | 20,000 | Maximum preservation |
Re-injection flow: `anchors.svelte.ts` -> `anchor-serializer.ts` -> `AgentPane.startQuery()` -> `system_prompt` -> sidecar -> SDK.
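To make the budget scales concrete, the sketch below picks anchors until the token budget for the chosen scale is used up. The greedy in-order selection, the `SessionAnchor` shape, and the precomputed `tokens` field are assumptions; the actual serializer may prioritise or truncate differently.

```typescript
// Budgets copied from the table above.
const ANCHOR_BUDGETS = {
  small: 2000,
  medium: 6000,
  large: 12000,
  full: 20000,
} as const;

type AnchorScale = keyof typeof ANCHOR_BUDGETS;

interface SessionAnchor {
  text: string;
  tokens: number; // assumed to be precomputed by the caller
}

// Assumption: keep anchors in order and stop at the first one that would
// exceed the budget.
function selectAnchors(
  anchors: SessionAnchor[],
  scale: AnchorScale = "medium",
): SessionAnchor[] {
  const budget = ANCHOR_BUDGETS[scale];
  const kept: SessionAnchor[] = [];
  let used = 0;
  for (const anchor of anchors) {
    if (used + anchor.tokens > budget) break;
    kept.push(anchor);
    used += anchor.tokens;
  }
  return kept;
}
```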


@@ -0,0 +1,179 @@
# Sidecar Architecture
The sidecar is the bridge between agor's Rust backend and AI provider APIs. Because the Claude Agent SDK and OpenAI Codex SDK are JavaScript/TypeScript libraries, they cannot run inside Rust or WebKit2GTK's webview. Instead, the Rust backend spawns child processes (sidecars) that handle AI interactions and communicate back via NDJSON over stdio. The Ollama runner follows the same model, although it calls Ollama's REST API directly rather than going through an SDK.
---
## Overview
```
Rust Backend (SidecarManager)
|
+-- Spawns child process (Deno preferred, Node.js fallback)
+-- Writes QueryMessage to stdin (NDJSON)
+-- Reads response lines from stdout (NDJSON)
+-- Emits Tauri events for each message
+-- Manages lifecycle (start, stop, crash recovery)
|
v
Sidecar Process (one of):
+-- claude-runner.mjs -> @anthropic-ai/claude-agent-sdk
+-- codex-runner.mjs -> @openai/codex-sdk
+-- ollama-runner.mjs -> native fetch to localhost:11434
```
---
## Provider Runners
Each provider has its own runner file in `sidecar/`, compiled to a standalone ESM bundle in `sidecar/dist/` by esbuild. The runners are self-contained — all dependencies (including SDKs) are bundled into the `.mjs` file.
### Claude Runner (`claude-runner.ts` -> `claude-runner.mjs`)
The primary runner. It uses the `query()` function from `@anthropic-ai/claude-agent-sdk`.
**Startup sequence:**
1. Reads NDJSON messages from stdin in a loop
2. On `query` message: resolves Claude CLI path via `findClaudeCli()`
3. Calls SDK `query()` with options: prompt, cwd, permissionMode, model, settingSources, systemPrompt, additionalDirectories, worktreeName, pathToClaudeCodeExecutable
4. Streams SDK messages as NDJSON to stdout
5. On `stop` message: calls AbortController.abort()
**Claude CLI detection (`findClaudeCli()`):**
Checks paths in order: `~/.local/bin/claude` -> `~/.claude/local/claude` -> `/usr/local/bin/claude` -> `/usr/bin/claude` -> `which claude`. If none found, emits `agent_error` immediately.
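The lookup order can be sketched as a function over injected helpers, which keeps it testable without touching the filesystem. This is not the real `findClaudeCli()`: the `exists` and `which` parameters are hypothetical stand-ins for a file check and a `which claude` call.

```typescript
// Returns the first existing candidate path, falling back to a PATH lookup.
// A null result corresponds to the case where the runner emits `agent_error`.
function findClaudeCliSketch(
  home: string,
  exists: (path: string) => boolean,
  which: (command: string) => string | null,
): string | null {
  const candidates = [
    `${home}/.local/bin/claude`,
    `${home}/.claude/local/claude`,
    "/usr/local/bin/claude",
    "/usr/bin/claude",
  ];
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  return which("claude");
}
```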
**Session resume:** Passes `resume: sessionId` to the SDK.
**Multi-account support:** When `claudeConfigDir` is provided, it is set as `CLAUDE_CONFIG_DIR` in the SDK's env option.
**Worktree isolation:** When `worktreeName` is provided, it is passed as `extraArgs: { worktree: name }` to the SDK, which translates to `--worktree <name>` on the CLI.
### Codex Runner (`codex-runner.ts` -> `codex-runner.mjs`)
Uses `@openai/codex-sdk` via dynamic import (graceful failure if not installed).
**Key differences from Claude:**
- Authentication via `CODEX_API_KEY` environment variable
- Sandbox mode mapping: `bypassPermissions` -> `full-auto`, `default` -> `suggest`
- Session resume via thread ID
- No profile/skill support
- ThreadEvent format parsed by `codex-messages.ts`
### Ollama Runner (`ollama-runner.ts` -> `ollama-runner.mjs`)
Talks directly to Ollama's REST API over HTTP, with zero external dependencies.
**Key differences:**
- No SDK — uses native `fetch()` to `http://localhost:11434/api/chat`
- Health check on startup (`GET /api/tags`)
- Supports Qwen3's `<think>` tags for reasoning display
- Configurable: host, model, num_ctx, temperature
- Cost is always $0 (local inference)
- No subagent support, no profiles, no skills
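Handling `<think>` tags can be illustrated on a complete response string. The sketch below separates reasoning from the visible answer with a regular expression; it is an assumption about the approach and ignores streaming, where a tag may be split across chunks.

```typescript
interface SplitResponse {
  reasoning: string; // concatenated contents of all <think> blocks
  answer: string; // text with the <think> blocks removed
}

// Works on a complete response; unclosed tags are left in the answer as-is.
function splitThinking(text: string): SplitResponse {
  const reasoningParts: string[] = [];
  const answer = text.replace(
    /<think>([\s\S]*?)<\/think>/g,
    (_match: string, inner: string) => {
      reasoningParts.push(inner.trim());
      return "";
    },
  );
  return { reasoning: reasoningParts.join("\n"), answer: answer.trim() };
}
```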
---
## Communication Protocol
### Messages from Rust to Sidecar (stdin)
```typescript
// Query -- start a new agent session
{
"type": "query",
"session_id": "uuid",
"prompt": "Fix the bug in auth.ts",
"cwd": "/home/user/project",
"provider": "claude",
"model": "claude-sonnet-4-6",
"permission_mode": "bypassPermissions",
"resume_session_id": "previous-uuid", // optional
"system_prompt": "You are an architect...", // optional
"claude_config_dir": "~/.config/switcher-claude/work/", // optional
"setting_sources": ["user", "project"], // optional
"additional_directories": ["/shared/lib"], // optional
"worktree_name": "session-123", // optional
"provider_config": { ... }, // provider-specific blob
"extra_env": { "BTMSG_AGENT_ID": "manager-1" } // optional
}
// Stop -- abort a running session
{ "type": "stop", "session_id": "uuid" }
```
### Messages from Sidecar to Rust (stdout)
The sidecar writes one JSON object per line (NDJSON). Claude messages follow the same format as the Claude CLI's `--output-format stream-json`.
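Because stdout arrives in arbitrary chunks, a reader of this protocol has to buffer until it sees a newline before parsing. The following is a minimal sketch of that framing in TypeScript, with hypothetical names (`NdjsonDecoder`, `encodeNdjson`); the actual reader on the Rust side is not shown in this document.

```typescript
// Serializes one message as a single NDJSON line.
function encodeNdjson(message: object): string {
  return JSON.stringify(message) + "\n";
}

// Accumulates chunks and yields one parsed object per complete line.
class NdjsonDecoder {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is either "" or an incomplete line; keep it buffered.
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as unknown);
  }
}
```

JSON string escaping guarantees that an encoded message never contains a raw newline, which is what makes the newline a safe record delimiter.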
---
## Environment Variable Stripping
When agor is launched from within a Claude Code terminal session, parent `CLAUDE*` environment variables must not leak to the sidecar. The solution is **dual-layer stripping**:
1. **Rust layer (primary):** `SidecarManager` calls `env_clear()` on the child process command, then explicitly sets only needed variables.
2. **JavaScript layer (defense-in-depth):** Each runner strips provider-specific variables (Claude: `CLAUDE*` except `CLAUDE_CODE_EXPERIMENTAL_*`; Codex: `CODEX*`; Ollama: `OLLAMA*` except `OLLAMA_HOST`).
The `extra_env` field in AgentQueryOptions allows injecting specific variables (like `BTMSG_AGENT_ID`) after stripping.
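The JavaScript-layer rule for the Claude runner can be sketched as a filter followed by a merge. The function name and signature below are illustrative, not the runner's actual code; the prefix rules come from the list above.

```typescript
// Drops CLAUDE* variables (except CLAUDE_CODE_EXPERIMENTAL_*), then applies
// extra_env on top so injected values such as BTMSG_AGENT_ID survive.
function stripClaudeEnv(
  env: Record<string, string | undefined>,
  extraEnv: Record<string, string> = {},
): Record<string, string> {
  const cleaned: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    if (value === undefined) continue;
    const isClaudeVar = key.startsWith("CLAUDE");
    const isAllowed = key.startsWith("CLAUDE_CODE_EXPERIMENTAL_");
    if (isClaudeVar && !isAllowed) continue;
    cleaned[key] = value;
  }
  return { ...cleaned, ...extraEnv };
}
```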
---
## Sidecar Lifecycle
### Startup
SidecarManager is initialized during Tauri app setup. No sidecar processes spawn until the first agent query.
### Runtime Resolution
`resolve_sidecar_for_provider(provider)` finds the appropriate runner:
1. Looks for `{provider}-runner.mjs` in sidecar dist directory
2. Checks for Deno first, then Node.js
3. Returns `SidecarCommand` struct with runtime binary and script path
Deno is preferred for its faster cold start (~50ms vs ~150ms for Node.js).
### Crash Recovery (SidecarSupervisor)
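The `SidecarSupervisor` restarts crashed sidecars with exponential backoff (starting at 1s, capped at 30s), allows at most 5 restart attempts, and reports state through the `SidecarHealth` enum. See [production/hardening.md](../production/hardening.md) for details.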
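The backoff schedule can be sketched as a pure function of the attempt number. The doubling factor is an assumption (the source only gives the 1s start, the 30s cap, and the 5-attempt limit), and the supervisor itself is implemented in Rust, so this TypeScript version is purely illustrative.

```typescript
const BASE_DELAY_MS = 1_000; // 1s initial delay
const MAX_DELAY_MS = 30_000; // 30s cap
const MAX_RESTART_ATTEMPTS = 5;

// Returns the delay before restart attempt `attempt` (1-based), or null when
// the attempt number is out of range and the supervisor should give up.
// Assumption: the delay doubles on each attempt.
function restartDelayMs(attempt: number): number | null {
  if (attempt < 1 || attempt > MAX_RESTART_ATTEMPTS) return null;
  return Math.min(BASE_DELAY_MS * 2 ** (attempt - 1), MAX_DELAY_MS);
}
```

With doubling and five attempts the delays are 1s, 2s, 4s, 8s, and 16s, so the 30s cap only comes into play if the attempt limit or the growth factor is raised.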
### Shutdown
On app exit, SidecarManager sends stop messages to all active sessions and kills remaining child processes. `Drop` implementation ensures cleanup even on panic.
---
## Build Pipeline
```bash
npm run build:sidecar
# Internally runs esbuild 3 times:
# sidecar/claude-runner.ts -> sidecar/dist/claude-runner.mjs
# sidecar/codex-runner.ts -> sidecar/dist/codex-runner.mjs
# sidecar/ollama-runner.ts -> sidecar/dist/ollama-runner.mjs
```
Each bundle is standalone ESM with all dependencies included. Built `.mjs` files are included as Tauri resources in `tauri.conf.json`.
---
## Message Adapter Layer
On the frontend, raw sidecar messages pass through a provider-specific adapter before reaching the agent store:
```
Sidecar stdout -> Rust SidecarManager -> Tauri event
-> agent-dispatcher.ts
-> message-adapters.ts (registry)
-> claude-messages.ts / codex-messages.ts / ollama-messages.ts
-> AgentMessage[] (common type)
-> agents.svelte.ts store
```
The `AgentMessage` type is provider-agnostic. The adapter layer is the only code that understands provider-specific formats.
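A registry of this kind can be sketched as a map from provider name to a conversion function. The names and the simplified message shape below are hypothetical; the real `AgentMessage` type and the adapters in `message-adapters.ts` are richer than this.

```typescript
// Simplified stand-in for the provider-agnostic AgentMessage type.
interface AgentMessageSketch {
  kind: "text" | "error";
  content: string;
}

type MessageAdapter = (raw: unknown) => AgentMessageSketch[];

const adapterRegistry = new Map<string, MessageAdapter>();

function registerAdapter(provider: string, adapter: MessageAdapter): void {
  adapterRegistry.set(provider, adapter);
}

// Dispatches a raw sidecar message to the adapter for its provider.
// Assumption: an unknown provider yields an error message rather than a throw.
function adaptMessage(provider: string, raw: unknown): AgentMessageSketch[] {
  const adapter = adapterRegistry.get(provider);
  if (!adapter) {
    return [{ kind: "error", content: `no adapter for provider: ${provider}` }];
  }
  return adapter(raw);
}

// Toy adapter for illustration only; it does not reflect any real wire format.
registerAdapter("example", (raw) => [
  { kind: "text", content: String((raw as { text?: unknown }).text ?? "") },
]);
```

Keeping the lookup in one place means that supporting a new provider is a matter of registering one more adapter, while the store and UI continue to see only the common message type.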
### Test Coverage
- `claude-messages.test.ts` — 25 tests
- `codex-messages.test.ts` — 19 tests
- `ollama-messages.test.ts` — 11 tests