docs: add 11 new documentation files across all categories

New reference docs:
- agents/ref-btmsg.md: inter-agent messaging schema and CLI
- agents/ref-bttask.md: kanban task board operations
- providers/ref-providers.md: Claude/Codex/Ollama/Aider comparison
- config/ref-settings.md: (already committed)

New guides:
- contributing/dual-repo-workflow.md: community vs commercial repos
- plugins/guide-developing.md: Web Worker sandbox API and publishing

New pro docs:
- pro/features/knowledge-base.md: persistent memory + symbol graph
- pro/features/git-integration.md: context injection + branch policy
- pro/marketplace/README.md: 13 plugins catalog

Split files:
- architecture/data-model.md: from architecture.md (schemas, layout)
- production/hardening.md: from production.md (supervisor, sandbox, WAL)
- production/features.md: from production.md (FTS5, plugins, secrets, audit)
2026-03-17 04:18:05 +01:00
commit b6c1d4b6af
parent 8251321dac
11 changed files with 2198 additions and 0 deletions
--- a/docs/pro/features/knowledge-base.md
+++ b/docs/pro/features/knowledge-base.md
@@ -0,0 +1,247 @@
+# Knowledge Base
+
+> This documentation covers Pro edition features available in the agents-orchestrator/agents-orchestrator private repository.
+
+The Knowledge Base provides two complementary systems: Persistent Agent Memory (structured knowledge fragments with search and TTL) and the Codebase Symbol Graph (regex-based symbol extraction for code navigation context).
+
+---
+
+## Persistent Agent Memory
+
+### Purpose
+
+Persistent Agent Memory stores knowledge fragments that agents produce during sessions and makes them available in future sessions. Unlike session anchors (community feature, per-session), memory fragments persist across sessions and are searchable via FTS5.
+
+### Memory Fragments
+
+A memory fragment is a discrete piece of knowledge with metadata:
+
+```typescript
+interface MemoryFragment {
+  id: number;
+  projectId: string;
+  content: string;          // The knowledge itself (plain text or markdown)
+  source: string;           // Where it came from (session ID, file path, user)
+  trustTier: TrustTier;     // agent | human | auto
+  tags: string[];           // Categorization tags
+  createdAt: string;        // ISO 8601
+  updatedAt: string;        // ISO 8601
+  expiresAt: string | null; // ISO 8601, null = never expires
+  accessCount: number;      // Times retrieved for injection
+}
+
+type TrustTier = 'agent' | 'human' | 'auto';
+```
+
+### Trust Tiers
+
+| Tier | Source | Injection Priority | Editable |
+|------|--------|-------------------|----------|
+| `human` | Created or approved by user | Highest | Yes |
+| `agent` | Extracted by agent during session | Medium | Yes |
+| `auto` | Auto-extracted from patterns | Lowest | Yes |
+
+When injecting memories into agent prompts, higher-trust memories are prioritized. Within the same tier, more recently accessed memories rank higher.
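The ordering rule above can be sketched as a two-key sort. This is a minimal illustration, not the shipped implementation: `RankableMemory`, `TIER_ORDER`, and `rankForInjection` are hypothetical names, and `lastAccessedAt` is an assumed field, since the documented schema stores only `accessCount` and does not say how recency of access is tracked.

```typescript
type TrustTier = 'agent' | 'human' | 'auto';

// Subset of a memory fragment needed for ranking. `lastAccessedAt` is a
// hypothetical field: the documented schema only stores `accessCount`.
interface RankableMemory {
  id: number;
  trustTier: TrustTier;
  lastAccessedAt: string; // ISO 8601 (assumed)
}

// Lower number = injected first, following the Trust Tiers table.
const TIER_ORDER: Record<TrustTier, number> = { human: 0, agent: 1, auto: 2 };

function rankForInjection(memories: RankableMemory[]): RankableMemory[] {
  // Copy first so the caller's array is not mutated.
  return memories.slice().sort((a, b) => {
    const byTier = TIER_ORDER[a.trustTier] - TIER_ORDER[b.trustTier];
    if (byTier !== 0) return byTier;
    // Same tier: most recently accessed first.
    return Date.parse(b.lastAccessedAt) - Date.parse(a.lastAccessedAt);
  });
}
```

Keeping the tier order in a lookup table means a change in priority touches one line rather than the comparator.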
+
+### TTL (Time-To-Live)
+
+Memories can have an optional expiration date. Expired memories are excluded from search results and injection. A background cleanup runs on plugin init and deletes memories that expired more than 30 days ago.
+
+Default TTL by trust tier:
+
+| Tier | Default TTL |
+|------|-------------|
+| `human` | None (permanent) |
+| `agent` | 90 days |
+| `auto` | 30 days |
+
+Users can override TTL on any individual memory.
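As a rough sketch of how the tier defaults and a per-memory override could combine, assuming expiry is computed once at creation time: the function and constant names below are illustrative, and treating a memory as expired exactly at its `expiresAt` instant is an assumption the source does not settle.

```typescript
type TrustTier = 'agent' | 'human' | 'auto';

// Default TTL in days per tier, from the table above; null = permanent.
const DEFAULT_TTL_DAYS: Record<TrustTier, number | null> = {
  human: null,
  agent: 90,
  auto: 30,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the ISO 8601 expiry for a new memory, or null if it never expires.
// An explicit ttlDays overrides the tier default.
function computeExpiresAt(tier: TrustTier, createdAt: Date, ttlDays?: number): string | null {
  const days = ttlDays !== undefined ? ttlDays : DEFAULT_TTL_DAYS[tier];
  if (days === null) return null;
  return new Date(createdAt.getTime() + days * MS_PER_DAY).toISOString();
}

// Expired memories are excluded from search and injection.
// Assumption: a memory counts as expired at the exact expiry instant.
function isExpired(expiresAt: string | null, now: Date): boolean {
  return expiresAt !== null && Date.parse(expiresAt) <= now.getTime();
}
```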
+
+### Auto-Extraction
+
+When an agent session completes, the dispatcher can trigger auto-extraction. The extractor scans the session transcript for:
+
+- Explicit knowledge statements ("I learned that...", "Note:", "Important:")
+- Error resolutions (error message followed by successful fix)
+- Configuration discoveries (env vars, file paths, API endpoints)
+
+Extracted fragments are created with `trustTier: 'auto'` and that tier's default TTL (30 days). The user can promote them to the `agent` or `human` tier.
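The first of the three scan targets (explicit knowledge statements) could be approximated line by line as follows. This is only a sketch under stated assumptions: the marker phrases come from the list above, but the matching rules, `ExtractedFragment`, and `extractKnowledgeStatements` are hypothetical, and error resolutions and configuration discoveries are not covered.

```typescript
// Marker phrases are taken from the list above; the exact matching rules
// (line start, case-insensitive, non-empty remainder) are an assumption.
const KNOWLEDGE_MARKERS = /^\s*(?:I learned that\b|Note:|Important:)\s*(\S.*)$/i;

interface ExtractedFragment {
  content: string;
  trustTier: 'auto'; // auto-extracted fragments always start in the lowest tier
  source: string;    // session ID the fragment came from
}

function extractKnowledgeStatements(transcript: string, sessionId: string): ExtractedFragment[] {
  const fragments: ExtractedFragment[] = [];
  for (const line of transcript.split(/\r?\n/)) {
    const match = KNOWLEDGE_MARKERS.exec(line);
    if (match) {
      // Keep only the text after the marker phrase.
      fragments.push({ content: match[1].trim(), trustTier: 'auto', source: sessionId });
    }
  }
  return fragments;
}
```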
+
+### Memory Injection
+
+Before an agent session starts, the top-K most relevant memories are retrieved and formatted into a `## Project Knowledge` section in the system prompt. Relevance is determined by FTS5 rank score against the project context (project name, CWD, recent file paths).
+
+The default is K = 5, configurable per project via `pro_memory_set_config`.
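For illustration, formatting the selected memories into the prompt section might look like the sketch below. Only the `## Project Knowledge` heading and the default of 5 come from the text above; the bullet layout, the tag suffix, and the names `InjectableMemory` and `formatProjectKnowledge` are assumptions.

```typescript
interface InjectableMemory {
  content: string;
  tags: string[];
}

// Formats already-ranked memories into the system prompt section.
// The bullet layout and tag suffix are assumed, not documented.
function formatProjectKnowledge(ranked: InjectableMemory[], topK: number = 5): string {
  const selected = ranked.slice(0, topK);
  // Emit nothing rather than an empty heading when no memory qualifies.
  if (selected.length === 0) return '';
  const bullets = selected.map((m) => {
    const tagSuffix = m.tags.length > 0 ? ` [${m.tags.join(', ')}]` : '';
    return `- ${m.content}${tagSuffix}`;
  });
  return ['## Project Knowledge', '', ...bullets].join('\n');
}
```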
+
+### SQLite Schema
+
+In `agor_pro.db`:
+
+```sql
+CREATE TABLE pro_memories (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    project_id TEXT NOT NULL,
+    content TEXT NOT NULL,
+    source TEXT NOT NULL,
+    trust_tier TEXT NOT NULL DEFAULT 'auto',
+    tags TEXT NOT NULL DEFAULT '[]',       -- JSON array
+    created_at TEXT NOT NULL DEFAULT (datetime('now')),
+    updated_at TEXT NOT NULL DEFAULT (datetime('now')),
+    expires_at TEXT,
+    access_count INTEGER NOT NULL DEFAULT 0
+);
+
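+-- External-content FTS5 index over pro_memories. SQLite does not update it
+-- automatically; it must be kept in sync (e.g. with triggers on pro_memories).
+CREATE VIRTUAL TABLE pro_memories_fts USING fts5(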
+    content,
+    tags,
+    content=pro_memories,
+    content_rowid=id
+);
+
+CREATE INDEX idx_pro_memories_project ON pro_memories(project_id);
+CREATE INDEX idx_pro_memories_expires ON pro_memories(expires_at);
+```
+
+### Commands
+
+#### pro_memory_create
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `projectId` | `String` | Yes | Project identifier |
+| `content` | `String` | Yes | Memory content |
+| `source` | `String` | Yes | Origin (session ID, "user", file path) |
+| `trustTier` | `String` | No | `agent`, `human`, or `auto` (default: `auto`) |
+| `tags` | `Vec<String>` | No | Categorization tags |
+| `ttlDays` | `u32` | No | Days until expiration (omitted or null = tier default) |
+
+#### pro_memory_search
+
+FTS5 search across memory fragments for a project.
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `projectId` | `String` | Yes | Project identifier |
+| `query` | `String` | Yes | FTS5 search query |
+| `limit` | `u32` | No | Max results (default: 10) |
+
+#### pro_memory_get, pro_memory_update, pro_memory_delete
+
+Read, update, or delete a single memory by its ID.
+
+#### pro_memory_list
+
+List memories for a project with optional filters.
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `projectId` | `String` | Yes | Project identifier |
+| `trustTier` | `String` | No | Filter by tier |
+| `tag` | `String` | No | Filter by tag |
+| `limit` | `u32` | No | Max results (default: 50) |
+
+#### pro_memory_inject
+
+Returns formatted memory text for system prompt injection.
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `projectId` | `String` | Yes | Project identifier |
+| `contextHints` | `Vec<String>` | No | Additional FTS5 terms for relevance |
+| `topK` | `u32` | No | Number of memories to include (default: 5) |
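Because `contextHints` are described as FTS5 terms, values such as file paths need quoting before they reach a `MATCH` expression. One way a caller might build the query, assuming hints are combined with `OR` (the source does not say how they are combined); `toFtsPhrase` and `buildFtsQuery` are hypothetical helpers. The quoting itself follows FTS5 string syntax, where an embedded double quote is escaped by doubling it.

```typescript
// Wraps a term as an FTS5 string literal; embedded double quotes are doubled.
function toFtsPhrase(term: string): string {
  return '"' + term.replace(/"/g, '""') + '"';
}

// Combines hints into a single FTS5 query.
// Assumption: any matching hint is enough, so phrases are joined with OR.
function buildFtsQuery(hints: string[]): string {
  return hints
    .map((h) => h.trim())
    .filter((h) => h.length > 0)
    .map(toFtsPhrase)
    .join(' OR ');
}
```

Quoting every hint also keeps characters such as `/`, `.` or `-` from being parsed as query syntax.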
+
+#### pro_memory_set_config
+
+Sets memory injection configuration for a project.
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `projectId` | `String` | Yes | Project identifier |
+| `topK` | `u32` | No | Default injection count |
+| `autoExtract` | `bool` | No | Enable auto-extraction on session end |
+
+---
+
+## Codebase Symbol Graph
+
+### Purpose
+
+The Symbol Graph provides structural code awareness by scanning source files with regex patterns to extract function/method/class/struct definitions and build a lightweight call graph. This gives agents contextual knowledge about code structure without requiring a full language server.
+
+### Scanning
+
+The scanner processes files matching configurable glob patterns. Default extensions:
+
+| Language | Extensions | Patterns Extracted |
+|----------|------------|--------------------|
+| TypeScript | `.ts`, `.tsx` | functions, classes, interfaces, type aliases, exports |
+| Rust | `.rs` | functions, structs, enums, traits, impls |
+| Python | `.py` | functions, classes, decorators |
+
+Scanning is triggered manually or on project open. Results are stored in `agor_pro.db`. A full re-scan replaces all symbols for the project.
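A line-oriented, regex-based extractor of the kind described above could be sketched like this. The patterns are illustrative guesses, not the ones the scanner ships with, and `ScannedSymbol`, `PATTERNS`, and `scanSource` are hypothetical names; `parentName` tracking and `impl` blocks are left out.

```typescript
type SymbolKind = 'function' | 'class' | 'struct' | 'enum' | 'trait' | 'interface' | 'type';

interface ScannedSymbol {
  name: string;
  kind: SymbolKind;
  line: number;      // 1-based
  signature: string; // trimmed source line
}

// Illustrative patterns only, keyed by file extension.
const PATTERNS: Record<string, Array<{ kind: SymbolKind; regex: RegExp }>> = {
  ts: [
    { kind: 'function', regex: /^\s*(?:export\s+)?(?:async\s+)?function\s+([A-Za-z_$][\w$]*)/ },
    { kind: 'class', regex: /^\s*(?:export\s+)?(?:abstract\s+)?class\s+([A-Za-z_$][\w$]*)/ },
    { kind: 'interface', regex: /^\s*(?:export\s+)?interface\s+([A-Za-z_$][\w$]*)/ },
    { kind: 'type', regex: /^\s*(?:export\s+)?type\s+([A-Za-z_$][\w$]*)/ },
  ],
  rs: [
    { kind: 'function', regex: /^\s*(?:pub(?:\([^)]*\))?\s+)?(?:async\s+)?fn\s+([A-Za-z_]\w*)/ },
    { kind: 'struct', regex: /^\s*(?:pub(?:\([^)]*\))?\s+)?struct\s+([A-Za-z_]\w*)/ },
    { kind: 'enum', regex: /^\s*(?:pub(?:\([^)]*\))?\s+)?enum\s+([A-Za-z_]\w*)/ },
    { kind: 'trait', regex: /^\s*(?:pub(?:\([^)]*\))?\s+)?trait\s+([A-Za-z_]\w*)/ },
  ],
  py: [
    { kind: 'function', regex: /^\s*(?:async\s+)?def\s+([A-Za-z_]\w*)/ },
    { kind: 'class', regex: /^\s*class\s+([A-Za-z_]\w*)/ },
  ],
};

function scanSource(source: string, extension: string): ScannedSymbol[] {
  const patterns = PATTERNS[extension] || [];
  const symbols: ScannedSymbol[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { kind, regex } of patterns) {
      const match = regex.exec(text);
      if (match) {
        symbols.push({ name: match[1], kind, line: index + 1, signature: text.trim() });
        break; // first matching pattern wins for this line
      }
    }
  });
  return symbols;
}
```

Matching one line at a time keeps the scan cheap, at the cost of missing definitions split across lines.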
+
+### Symbol Types
+
+```typescript
+interface CodeSymbol {
+  id: number;
+  projectId: string;
+  filePath: string;         // Relative to project root
+  name: string;             // Symbol name
+  kind: SymbolKind;         // function | class | struct | enum | trait | interface | type
+  line: number;             // Line number (1-based)
+  signature: string;        // Full signature line
+  parentName: string | null; // Enclosing class/struct/impl
+}
+
+type SymbolKind = 'function' | 'class' | 'struct' | 'enum' | 'trait' | 'interface' | 'type';
+```
+
+### Commands
+
+#### pro_symbols_scan
+
+Triggers a full scan of the project's source files.
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `projectId` | `String` | Yes | Project identifier |
+| `rootPath` | `String` | Yes | Absolute path to project root |
+| `extensions` | `Vec<String>` | No | File extensions to scan (default: ts, tsx, rs, py) |
+
+#### pro_symbols_search
+
+Search symbols by name (prefix match).
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `projectId` | `String` | Yes | Project identifier |
+| `query` | `String` | Yes | Symbol name prefix |
+| `kind` | `String` | No | Filter by symbol kind |
+| `limit` | `u32` | No | Max results (default: 20) |
+
+#### pro_symbols_find_callers
+
+Searches for references to a symbol name across the project's scanned files using text matching.
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `projectId` | `String` | Yes | Project identifier |
+| `symbolName` | `String` | Yes | Symbol to find references to |
+
+Returns file paths and line numbers where the symbol name appears (excluding its definition).
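The text-matching behavior can be illustrated with a short sketch. Whole-word matching and the in-memory `files` map are assumptions made for the example, and `Reference`, `escapeRegExp`, and `findReferences` are hypothetical names rather than part of the command's implementation.

```typescript
interface Reference {
  filePath: string;
  line: number; // 1-based
}

// Escapes regex metacharacters so the symbol name is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// files: map of relative path -> file contents (hypothetical input shape).
// Assumption: matches are whole-word; the source only says "text matching".
function findReferences(
  files: Record<string, string>,
  symbolName: string,
  definition: Reference,
): Reference[] {
  const wholeWord = new RegExp('\\b' + escapeRegExp(symbolName) + '\\b');
  const hits: Reference[] = [];
  for (const filePath of Object.keys(files)) {
    files[filePath].split(/\r?\n/).forEach((text, index) => {
      const line = index + 1;
      // Skip the line where the symbol is defined.
      const isDefinition = filePath === definition.filePath && line === definition.line;
      if (!isDefinition && wholeWord.test(text)) hits.push({ filePath, line });
    });
  }
  return hits;
}
```

Note that a purely textual match like this also reports occurrences inside comments and string literals.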
+
+#### pro_symbols_file
+
+Returns all symbols in a specific file.
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `projectId` | `String` | Yes | Project identifier |
+| `filePath` | `String` | Yes | Relative file path |
+
+### Limitations
+
+- Regex-based extraction is approximate. It does not parse ASTs, so it may miss symbols written with unusual formatting and may report false positives inside comments or strings.
+- `pro_symbols_find_callers` uses text matching, not semantic analysis, so it also reports occurrences of the name inside comments and string literals.
+- Large codebases (10,000+ files) may take several seconds to scan. Scanning runs on a background thread and emits a completion event.