From 8641f260f7dfd57f40efedb8ce37b4de7b2ae9f6 Mon Sep 17 00:00:00 2001
From: Hibryda
Date: Tue, 17 Mar 2026 04:19:04 +0100
Subject: [PATCH] docs: move orchestration and sidecar docs into subdirectories

---
 docs/agents/orchestration.md | 197 +++++++++++++++++++++++++++++++++++
 docs/sidecar/architecture.md | 179 +++++++++++++++++++++++++++++++
 2 files changed, 376 insertions(+)
 create mode 100644 docs/agents/orchestration.md
 create mode 100644 docs/sidecar/architecture.md

diff --git a/docs/agents/orchestration.md b/docs/agents/orchestration.md
new file mode 100644
index 0000000..9e9f356
--- /dev/null
+++ b/docs/agents/orchestration.md
@@ -0,0 +1,197 @@
+# Multi-Agent Orchestration
+
+Agor supports running multiple AI agents that communicate with each other, coordinate work through a shared task board, and are managed by a hierarchy of specialized roles. This document covers the inter-agent messaging system (btmsg), the task board (bttask), agent roles and system prompts, and the auto-wake scheduler.
+
+---
+
+## Agent Roles (Tier 1 and Tier 2)
+
+### Tier 1 — Management Agents
+
+Defined in `groups.json` under a group's `agents[]` array. Each management agent gets a full ProjectBox in the UI (converted via `agentToProject()`). They have role-specific capabilities, tabs, and system prompts.
+
+| Role | Tabs | btmsg Permissions | bttask Permissions | Purpose |
+|------|------|-------------------|-------------------|---------|
+| **Manager** | Model, Tasks | Full (send, receive, create channels) | Full CRUD | Coordinates work, creates/assigns tasks |
+| **Architect** | Model, Architecture | Send, receive | Read-only + comments | Designs solutions, creates PlantUML diagrams |
+| **Tester** | Model, Selenium, Tests | Send, receive | Read-only + comments | Runs tests, monitors screenshots |
+| **Reviewer** | Model, Tasks | Send, receive | Read + status + comments | Reviews code, manages review queue |
+
+### Tier 2 — Project Agents
+
+Regular `ProjectConfig` entries in `groups.json`. Each project gets its own Claude session with optional custom context via `project.systemPrompt`. Standard tabs (Model, Docs, Context, Files, SSH, Memory) but no role-specific tabs.
+
+### System Prompt Generation
+
+Tier 1 agents receive auto-generated system prompts built by `generateAgentPrompt()` in `utils/agent-prompts.ts` with 7 sections: Identity, Environment, Team, btmsg docs, bttask docs, Custom context, Workflow.
+
+Tier 2 agents receive only the custom context section (if `project.systemPrompt` is set).
+
+### BTMSG_AGENT_ID
+
+Tier 1 agents receive the `BTMSG_AGENT_ID` environment variable, injected via `extra_env` in AgentQueryOptions. This flows through 5 layers: TypeScript -> Rust -> NDJSON -> JS runner -> SDK env. The CLI tools (`btmsg`, `bttask`) read this variable to identify which agent is sending messages.
+
+### Periodic Re-injection
+
+AgentSession runs a 1-hour timer that re-sends the system prompt when the agent is idle, countering LLM context degradation over long sessions.
+
+---
+
+## btmsg — Inter-Agent Messaging
+
+btmsg lets agents communicate with each other via a Rust backend (SQLite), a Python CLI tool, and a Svelte frontend (CommsTab).
+
+### Database Schema
+
+The btmsg database (`btmsg.db`, stored at `~/.local/share/agor/btmsg.db`) holds all messaging data:
+
+| Table | Purpose | Key Columns |
+|-------|---------|-------------|
+| `agents` | Agent registry | id, name, role, project_id, status, created_at |
+| `messages` | All messages | id, sender_id, recipient_id, channel_id, content, read, created_at |
+| `channels` | Named channels | id, name, created_by, created_at |
+| `contacts` | ACL | agent_id, contact_id (bidirectional) |
+| `heartbeats` | Liveness | agent_id, last_heartbeat, status |
+| `dead_letter_queue` | Failed delivery | message_id, reason, created_at |
+| `audit_log` | All operations | id, event_type, agent_id, details, created_at |
+
+### CLI Usage (for agents)
+
+```bash
+# Channel names are quoted: an unquoted word starting with '#' begins a shell comment
+btmsg send architect "Please review the auth module design"
+btmsg read
+btmsg channel create '#architecture-decisions'
+btmsg channel post '#review-queue' "PR #42 ready for review"
+btmsg heartbeat
+btmsg agents
+```
+
+### Dead Letter Queue
+
+Messages sent to non-existent or offline agents are moved to the dead letter queue instead of being silently dropped.
+
+---
+
+## bttask — Task Board
+
+bttask is a kanban-style task board sharing the same SQLite database as btmsg (`btmsg.db`).
+
+### Task Lifecycle
+
+```
+Backlog -> In Progress -> Review -> Done / Rejected
+```
+
+When a task moves to "Review", the system auto-posts to the `#review-queue` btmsg channel.
+
+### Optimistic Locking
+
+To prevent concurrent updates from corrupting task state, bttask uses a `version` column:
+
+1. Client reads task with current version (e.g., version=3)
+2. Client sends update with expected version=3
+3. Server's UPDATE includes `WHERE version = 3`
+4. If another client updated first (version=4), WHERE matches 0 rows -> conflict error
+
+### Role-Based Permissions
+
+| Role | List | Create | Update Status | Delete | Comments |
+|------|------|--------|---------------|--------|----------|
+| Manager | Yes | Yes | Yes | Yes | Yes |
+| Reviewer | Yes | No | Yes (review decisions) | No | Yes |
+| Architect | Yes | No | No | No | Yes |
+| Tester | Yes | No | No | No | Yes |
+| Project (Tier 2) | Yes | No | No | No | Yes |
+
+### Review Queue Integration
+
+The Reviewer agent gets special treatment in attention scoring: `reviewQueueDepth` adds 10 points per review task (capped at 50). ProjectBox polls `review_queue_count` every 10 seconds for reviewer agents.
+
+---
+
+## Wake Scheduler
+
+The wake scheduler automatically re-activates idle Manager agents when attention-worthy events occur (`wake-scheduler.svelte.ts`).
+
+### Strategies
+
+| Strategy | Behavior | Use Case |
+|----------|----------|----------|
+| **Persistent** | Resume prompt to existing session | Long-running managers |
+| **On-demand** | Fresh session | Burst-work managers |
+| **Smart** | On-demand when score exceeds threshold | Avoids waking for minor events |
+
+### Wake Signals
+
+| Signal | Weight | Trigger |
+|--------|--------|---------|
+| AttentionSpike | 1.0 | Project attention score exceeds threshold |
+| ContextPressureCluster | 0.9 | Multiple projects >75% context usage |
+| BurnRateAnomaly | 0.8 | Cost rate deviates from baseline |
+| TaskQueuePressure | 0.7 | Task backlog grows beyond threshold |
+| ReviewBacklog | 0.6 | Review queue has pending items |
+| PeriodicFloor | 0.1 | Minimum periodic check |
+
+Pure scoring function in `wake-scorer.ts` (24 tests). Types in `types/wake.ts`.
+
+---
+
+## Health Monitoring & Attention Scoring
+
+The health store (`health.svelte.ts`) tracks per-project health with a 5-second tick timer.
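As a rough illustration of the attention scores tabulated in this section, the sketch below maps a project's health flags to a single score. The input shape and the function name `scoreAttention` are hypothetical, and the combination rule (highest matching condition wins, rather than summing) is an assumption; the actual logic lives in `utils/attention-scorer.ts` and may differ.

```typescript
// Hypothetical input shape; the real types in utils/attention-scorer.ts may differ.
interface AttentionInput {
  stalled: boolean;
  hasError: boolean;
  contextUsage: number; // fraction of the context window in use, 0..1
  hasFileConflict: boolean;
  reviewQueueDepth: number;
}

// Assumption: the highest matching condition wins; scores are not summed.
function scoreAttention(input: AttentionInput): number {
  const candidates: number[] = [0];
  if (input.stalled) candidates.push(100);
  if (input.hasError) candidates.push(90);
  if (input.contextUsage > 0.9) candidates.push(80);
  if (input.hasFileConflict) candidates.push(70);
  // 10 points per queued review task, capped at 50.
  candidates.push(Math.min(Math.max(input.reviewQueueDepth, 0) * 10, 50));
  if (input.contextUsage > 0.75) candidates.push(40);
  return Math.max(...candidates);
}
```

Keeping the function pure (no timers, no store access) is what makes it straightforward to unit-test, which matches how the document describes the scorer.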
+
+### Activity States
+
+| State | Meaning | Visual |
+|-------|---------|--------|
+| Inactive | No agent running | Dim dot |
+| Running | Agent actively processing | Green pulse |
+| Idle | Agent finished, waiting for input | Gray dot |
+| Stalled | No output for >N minutes | Orange pulse |
+
+Stall threshold is configurable per-project (default 15 min, range 5-60, step 5).
+
+### Attention Scoring
+
+| Condition | Score |
+|-----------|-------|
+| Stalled agent | 100 |
+| Error state | 90 |
+| Context >90% | 80 |
+| File conflict | 70 |
+| Review queue depth | 10/task, cap 50 |
+| Context >75% | 40 |
+
+Pure scoring function in `utils/attention-scorer.ts` (14 tests).
+
+### File Conflict Detection
+
+Two types detected by `conflicts.svelte.ts`:
+
+1. **Agent overlap** — Two agents in the same worktree write the same file
+2. **External writes** — File modified externally (detected via inotify, 2s timing heuristic)
+
+---
+
+## Session Anchors
+
+Session anchors preserve important conversation turns through Claude's context compaction process.
+
+### Anchor Types
+
+| Type | Created By | Behavior |
+|------|-----------|----------|
+| **Auto** | System (first compaction) | Captures first 3 turns, observation-masked |
+| **Pinned** | User (pin button) | Marks specific turns as important |
+| **Promoted** | User (from pinned) | Re-injectable via system prompt |
+
+### Anchor Budget
+
+| Scale | Token Budget | Use Case |
+|-------|-------------|----------|
+| Small | 2,000 | Quick sessions |
+| Medium | 6,000 | Default |
+| Large | 12,000 | Complex debugging |
+| Full | 20,000 | Maximum preservation |
+
+Re-injection flow: `anchors.svelte.ts` -> `anchor-serializer.ts` -> `AgentPane.startQuery()` -> `system_prompt` -> sidecar -> SDK.
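Returning to bttask's optimistic locking: the four steps listed under that heading can be sketched with an in-memory stand-in for the SQLite table. The `TaskBoard` class, the `UpdateResult` type, and the method names are hypothetical; the real backend is Rust over SQLite, where the version check is part of the `UPDATE ... WHERE version = ?` statement rather than a separate comparison.

```typescript
interface Task {
  id: string;
  status: string;
  version: number;
}

type UpdateResult =
  | { ok: true; task: Task }
  | { ok: false; reason: "not_found" | "conflict" };

// Hypothetical in-memory stand-in for the task table.
class TaskBoard {
  private tasks: Record<string, Task | undefined> = {};

  create(id: string, status: string): Task {
    const task: Task = { id, status, version: 1 };
    this.tasks[id] = task;
    return { ...task };
  }

  // Step 1: the client reads the task together with its current version.
  read(id: string): Task | undefined {
    const task = this.tasks[id];
    return task ? { ...task } : undefined;
  }

  // Steps 2-4: the update applies only if the stored version still matches
  // the version the client read; otherwise it reports a conflict.
  updateStatus(id: string, status: string, expectedVersion: number): UpdateResult {
    const current = this.tasks[id];
    if (!current) return { ok: false, reason: "not_found" };
    if (current.version !== expectedVersion) return { ok: false, reason: "conflict" };
    const updated: Task = { ...current, status, version: current.version + 1 };
    this.tasks[id] = updated;
    return { ok: true, task: { ...updated } };
  }
}
```

A caller that receives `conflict` would typically re-read the task and decide whether its change still applies, instead of retrying blindly with the stale version.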
diff --git a/docs/sidecar/architecture.md b/docs/sidecar/architecture.md
new file mode 100644
index 0000000..5ecac86
--- /dev/null
+++ b/docs/sidecar/architecture.md
@@ -0,0 +1,179 @@
+# Sidecar Architecture
+
+The sidecar is the bridge between agor's Rust backend and AI provider APIs. Because the Claude Agent SDK, OpenAI Codex SDK, and Ollama API are JavaScript/TypeScript libraries, they cannot run inside Rust or WebKit2GTK's webview. Instead, the Rust backend spawns child processes (sidecars) that handle AI interactions and communicate back via stdio NDJSON.
+
+---
+
+## Overview
+
+```
+Rust Backend (SidecarManager)
+    |
+    +-- Spawns child process (Deno preferred, Node.js fallback)
+    +-- Writes QueryMessage to stdin (NDJSON)
+    +-- Reads response lines from stdout (NDJSON)
+    +-- Emits Tauri events for each message
+    +-- Manages lifecycle (start, stop, crash recovery)
+    |
+    v
+Sidecar Process (one of):
+    +-- claude-runner.mjs -> @anthropic-ai/claude-agent-sdk
+    +-- codex-runner.mjs -> @openai/codex-sdk
+    +-- ollama-runner.mjs -> native fetch to localhost:11434
+```
+
+---
+
+## Provider Runners
+
+Each provider has its own runner file in `sidecar/`, compiled to a standalone ESM bundle in `sidecar/dist/` by esbuild. The runners are self-contained — all dependencies (including SDKs) are bundled into the `.mjs` file.
+
+### Claude Runner (`claude-runner.ts` -> `claude-runner.mjs`)
+
+The primary runner. Uses `@anthropic-ai/claude-agent-sdk` query() function.
+
+**Startup sequence:**
+1. Reads NDJSON messages from stdin in a loop
+2. On `query` message: resolves Claude CLI path via `findClaudeCli()`
+3. Calls SDK `query()` with options: prompt, cwd, permissionMode, model, settingSources, systemPrompt, additionalDirectories, worktreeName, pathToClaudeCodeExecutable
+4. Streams SDK messages as NDJSON to stdout
+5. On `stop` message: calls AbortController.abort()
+
+**Claude CLI detection (`findClaudeCli()`):**
+Checks paths in order: `~/.local/bin/claude` -> `~/.claude/local/claude` -> `/usr/local/bin/claude` -> `/usr/bin/claude` -> `which claude`. If none found, emits `agent_error` immediately.
+
+**Session resume:** Passes `resume: sessionId` to the SDK.
+
+**Multi-account support:** When `claudeConfigDir` is provided, it is set as `CLAUDE_CONFIG_DIR` in the SDK's env option.
+
+**Worktree isolation:** When `worktreeName` is provided, it is passed as `extraArgs: { worktree: name }` to the SDK, which translates to `--worktree <name>` on the CLI.
+
+### Codex Runner (`codex-runner.ts` -> `codex-runner.mjs`)
+
+Uses `@openai/codex-sdk` via dynamic import (graceful failure if not installed).
+
+**Key differences from Claude:**
+- Authentication via `CODEX_API_KEY` environment variable
+- Sandbox mode mapping: `bypassPermissions` -> `full-auto`, `default` -> `suggest`
+- Session resume via thread ID
+- No profile/skill support
+- ThreadEvent format parsed by `codex-messages.ts`
+
+### Ollama Runner (`ollama-runner.ts` -> `ollama-runner.mjs`)
+
+Direct HTTP to Ollama's REST API — zero external dependencies.
+
+**Key differences:**
+- No SDK — uses native `fetch()` to `http://localhost:11434/api/chat`
+- Health check on startup (`GET /api/tags`)
+- Supports Qwen3's `<think>` tags for reasoning display
+- Configurable: host, model, num_ctx, temperature
+- Cost is always $0 (local inference)
+- No subagent support, no profiles, no skills
+
+---
+
+## Communication Protocol
+
+### Messages from Rust to Sidecar (stdin)
+
+```typescript
+// Query -- start a new agent session
+{
+  "type": "query",
+  "session_id": "uuid",
+  "prompt": "Fix the bug in auth.ts",
+  "cwd": "/home/user/project",
+  "provider": "claude",
+  "model": "claude-sonnet-4-6",
+  "permission_mode": "bypassPermissions",
+  "resume_session_id": "previous-uuid", // optional
+  "system_prompt": "You are an architect...", // optional
+  "claude_config_dir": "~/.config/switcher-claude/work/", // optional
+  "setting_sources": ["user", "project"], // optional
+  "additional_directories": ["/shared/lib"], // optional
+  "worktree_name": "session-123", // optional
+  "provider_config": { ... }, // provider-specific blob
+  "extra_env": { "BTMSG_AGENT_ID": "manager-1" } // optional
+}
+
+// Stop -- abort a running session
+{ "type": "stop", "session_id": "uuid" }
+```
+
+### Messages from Sidecar to Rust (stdout)
+
+The sidecar writes one JSON object per line (NDJSON). Claude messages follow the same format as the Claude CLI's `--output-format stream-json`.
+
+---
+
+## Environment Variable Stripping
+
+When agor is launched from within a Claude Code terminal session, parent `CLAUDE*` environment variables must not leak to the sidecar. The solution is **dual-layer stripping**:
+
+1. **Rust layer (primary):** `SidecarManager` calls `env_clear()` on the child process command, then explicitly sets only needed variables.
+2. **JavaScript layer (defense-in-depth):** Each runner strips provider-specific variables (Claude: `CLAUDE*` except `CLAUDE_CODE_EXPERIMENTAL_*`; Codex: `CODEX*`; Ollama: `OLLAMA*` except `OLLAMA_HOST`).
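The JavaScript-layer rules in the list above can be sketched as a pure function over an environment map. Everything named here (`STRIP_RULES`, `buildRunnerEnv`, the `extraEnv` parameter) is hypothetical and only mirrors the prefixes and exceptions stated in the list; the actual runners may structure this differently.

```typescript
type ProviderName = "claude" | "codex" | "ollama";

interface StripRule {
  prefix: string;                   // variables starting with this prefix are dropped
  keep: (name: string) => boolean;  // exceptions that survive stripping
}

// Mirrors the per-provider rules listed above (hypothetical table).
const STRIP_RULES: Record<ProviderName, StripRule> = {
  claude: { prefix: "CLAUDE", keep: (n) => n.indexOf("CLAUDE_CODE_EXPERIMENTAL_") === 0 },
  codex: { prefix: "CODEX", keep: () => false },
  ollama: { prefix: "OLLAMA", keep: (n) => n === "OLLAMA_HOST" },
};

function buildRunnerEnv(
  provider: ProviderName,
  parentEnv: Record<string, string | undefined>,
  extraEnv: Record<string, string> = {},
): Record<string, string> {
  const rule = STRIP_RULES[provider];
  const result: Record<string, string> = {};
  for (const name of Object.keys(parentEnv)) {
    const value = parentEnv[name];
    if (value === undefined) continue;
    // Drop provider-prefixed variables unless explicitly allowed.
    if (name.indexOf(rule.prefix) === 0 && !rule.keep(name)) continue;
    result[name] = value;
  }
  // Explicitly injected variables are applied after stripping.
  return { ...result, ...extraEnv };
}
```

Merging the injected variables last means a value such as `BTMSG_AGENT_ID` is present regardless of what the parent environment contained.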
+
+The `extra_env` field in AgentQueryOptions allows injecting specific variables (like `BTMSG_AGENT_ID`) after stripping.
+
+---
+
+## Sidecar Lifecycle
+
+### Startup
+
+SidecarManager is initialized during Tauri app setup. No sidecar processes spawn until the first agent query.
+
+### Runtime Resolution
+
+`resolve_sidecar_for_provider(provider)` finds the appropriate runner:
+1. Looks for `{provider}-runner.mjs` in sidecar dist directory
+2. Checks for Deno first, then Node.js
+3. Returns `SidecarCommand` struct with runtime binary and script path
+
+Deno preferred (~50ms cold-start vs ~150ms for Node.js).
+
+### Crash Recovery (SidecarSupervisor)
+
+See [production/hardening.md](../production/hardening.md) for details. Exponential backoff (1s-30s cap), max 5 restart attempts, `SidecarHealth` enum.
+
+### Shutdown
+
+On app exit, SidecarManager sends stop messages to all active sessions and kills remaining child processes. `Drop` implementation ensures cleanup even on panic.
+
+---
+
+## Build Pipeline
+
+```bash
+npm run build:sidecar
+# Internally runs esbuild 3 times:
+#   sidecar/claude-runner.ts -> sidecar/dist/claude-runner.mjs
+#   sidecar/codex-runner.ts -> sidecar/dist/codex-runner.mjs
+#   sidecar/ollama-runner.ts -> sidecar/dist/ollama-runner.mjs
+```
+
+Each bundle is standalone ESM with all dependencies included. Built `.mjs` files are included as Tauri resources in `tauri.conf.json`.
+
+---
+
+## Message Adapter Layer
+
+On the frontend, raw sidecar messages pass through a provider-specific adapter before reaching the agent store:
+
+```
+Sidecar stdout -> Rust SidecarManager -> Tauri event
+    -> agent-dispatcher.ts
+    -> message-adapters.ts (registry)
+    -> claude-messages.ts / codex-messages.ts / ollama-messages.ts
+    -> AgentMessage[] (common type)
+    -> agents.svelte.ts store
+```
+
+The `AgentMessage` type is provider-agnostic. The adapter layer is the only code that understands provider-specific formats.
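The registry step in the flow above can be pictured roughly as follows. This is a sketch under assumptions: the `AgentMessage` fields, `registerAdapter`, and `adaptMessage` are hypothetical names, and the example adapter only handles a simplified Ollama-style chunk with a `message.content` string; the real `message-adapters.ts` and per-provider adapters are likely richer.

```typescript
// Hypothetical, simplified common message shape.
interface AgentMessage {
  provider: string;
  kind: "text" | "error" | "unknown";
  text: string;
}

type MessageAdapter = (raw: unknown) => AgentMessage[];

// Registry keyed by provider name (hypothetical structure).
const adapterRegistry: Record<string, MessageAdapter | undefined> = {};

function registerAdapter(provider: string, adapter: MessageAdapter): void {
  adapterRegistry[provider] = adapter;
}

// Converts a raw sidecar message into the common type via the provider's adapter.
function adaptMessage(provider: string, raw: unknown): AgentMessage[] {
  const adapter = adapterRegistry[provider];
  if (!adapter) {
    return [{ provider, kind: "error", text: `No adapter registered for provider "${provider}"` }];
  }
  return adapter(raw);
}

// Illustrative adapter for a simplified Ollama-style chat chunk.
registerAdapter("ollama", (raw) => {
  const chunk = raw as { message?: { content?: unknown } } | null;
  const content = chunk?.message?.content;
  if (typeof content === "string") {
    return [{ provider: "ollama", kind: "text", text: content }];
  }
  return [{ provider: "ollama", kind: "unknown", text: JSON.stringify(raw) }];
});
```

Because the store only ever sees `AgentMessage[]`, adding a provider under this design means registering one more adapter without touching downstream code.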
+
+### Test Coverage
+
+- `claude-messages.test.ts` — 25 tests
+- `codex-messages.test.ts` — 19 tests
+- `ollama-messages.test.ts` — 11 tests