docs: add 11 new documentation files across all categories

New reference docs:
- agents/ref-btmsg.md: inter-agent messaging schema and CLI
- agents/ref-bttask.md: kanban task board operations
- providers/ref-providers.md: Claude/Codex/Ollama/Aider comparison
- config/ref-settings.md: (already committed)

New guides:
- contributing/dual-repo-workflow.md: community vs commercial repos
- plugins/guide-developing.md: Web Worker sandbox API and publishing

New pro docs:
- pro/features/knowledge-base.md: persistent memory + symbol graph
- pro/features/git-integration.md: context injection + branch policy
- pro/marketplace/README.md: 13 plugins catalog

Split files:
- architecture/data-model.md: from architecture.md (schemas, layout)
- production/hardening.md: from production.md (supervisor, sandbox, WAL)
- production/features.md: from production.md (FTS5, plugins, secrets, audit)
Hibryda 2026-03-17 04:18:05 +01:00
parent 8251321dac
commit b6c1d4b6af
11 changed files with 2198 additions and 0 deletions


@ -0,0 +1,247 @@
# Knowledge Base
> This documentation covers Pro edition features available in the agents-orchestrator/agents-orchestrator private repository.
The Knowledge Base provides two complementary systems: Persistent Agent Memory (structured knowledge fragments with search and TTL) and the Codebase Symbol Graph (regex-based symbol extraction for code navigation context).
---
## Persistent Agent Memory
### Purpose
Persistent Agent Memory stores knowledge fragments that agents produce during sessions and makes them available in future sessions. Unlike session anchors (community feature, per-session), memory fragments persist across sessions and are searchable via FTS5.
### Memory Fragments
A memory fragment is a discrete piece of knowledge with metadata:
```typescript
interface MemoryFragment {
  id: number;
  projectId: string;
  content: string;          // The knowledge itself (plain text or markdown)
  source: string;           // Where it came from (session ID, file path, user)
  trustTier: TrustTier;     // agent | human | auto
  tags: string[];           // Categorization tags
  createdAt: string;        // ISO 8601
  updatedAt: string;        // ISO 8601
  expiresAt: string | null; // ISO 8601, null = never expires
  accessCount: number;      // Times retrieved for injection
}

type TrustTier = 'agent' | 'human' | 'auto';
```
### Trust Tiers
| Tier | Source | Injection Priority | Editable |
|------|--------|-------------------|----------|
| `human` | Created or approved by user | Highest | Yes |
| `agent` | Extracted by agent during session | Medium | Yes |
| `auto` | Auto-extracted from patterns | Lowest | Yes |
When injecting memories into agent prompts, higher-trust memories are prioritized. Within the same tier, more recently accessed memories rank higher.
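
The ordering rule above can be sketched as a two-key sort. This is a minimal illustration, not the shipped implementation: `RankableMemory`, `TIER_PRIORITY`, and `rankForInjection` are hypothetical names, and `lastAccessedAt` is an assumed field, since `MemoryFragment` documents only `accessCount`.

```typescript
type TrustTier = 'agent' | 'human' | 'auto';

// Minimal shape for ranking. `lastAccessedAt` is a hypothetical field:
// MemoryFragment does not document an access timestamp.
interface RankableMemory {
  id: number;
  trustTier: TrustTier;
  lastAccessedAt: string; // ISO 8601 (assumed)
}

// Lower number = higher injection priority, per the Trust Tiers table.
const TIER_PRIORITY: Record<TrustTier, number> = { human: 0, agent: 1, auto: 2 };

function rankForInjection(memories: RankableMemory[]): RankableMemory[] {
  // Copy before sorting so the caller's array is left untouched.
  return memories.slice().sort((a, b) => {
    const byTier = TIER_PRIORITY[a.trustTier] - TIER_PRIORITY[b.trustTier];
    if (byTier !== 0) return byTier;
    // Same tier: most recently accessed first.
    return Date.parse(b.lastAccessedAt) - Date.parse(a.lastAccessedAt);
  });
}
```

With this ordering, a `human` memory last accessed months ago still ranks ahead of an `agent` memory accessed yesterday.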
### TTL (Time-To-Live)
Memories can have an optional expiration date. Expired memories are excluded from search results and injection, but they stay in the database until a cleanup pass, run at plugin init, deletes those that expired more than 30 days earlier.
Default TTL by trust tier:
| Tier | Default TTL |
|------|-------------|
| `human` | None (permanent) |
| `agent` | 90 days |
| `auto` | 30 days |
Users can override TTL on any individual memory.
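
The TTL rules above reduce to a few date calculations. The following is a sketch under stated assumptions: the function names and the `ttlDaysOverride` parameter are hypothetical, and treating a memory as expired exactly at its `expiresAt` instant is a guess about the boundary behavior.

```typescript
type TrustTier = 'agent' | 'human' | 'auto';

// Default TTL in days per tier, from the table above; null = permanent.
const DEFAULT_TTL_DAYS: Record<TrustTier, number | null> = {
  human: null,
  agent: 90,
  auto: 30,
};

const DAY_MS = 24 * 60 * 60 * 1000;

// Resolve the expiration timestamp for a new memory.
// `ttlDaysOverride` models the per-memory override; undefined = tier default.
function resolveExpiresAt(tier: TrustTier, createdAt: Date, ttlDaysOverride?: number): string | null {
  const ttlDays = ttlDaysOverride !== undefined ? ttlDaysOverride : DEFAULT_TTL_DAYS[tier];
  if (ttlDays === null) return null;
  return new Date(createdAt.getTime() + ttlDays * DAY_MS).toISOString();
}

// Expired memories are excluded from search and injection.
// Counting the exact expiresAt instant as expired is an assumption.
function isExpired(expiresAt: string | null, now: Date): boolean {
  return expiresAt !== null && Date.parse(expiresAt) <= now.getTime();
}

// The cleanup pass only deletes memories that expired more than 30 days ago.
function isPurgeable(expiresAt: string | null, now: Date): boolean {
  return expiresAt !== null && now.getTime() - Date.parse(expiresAt) > 30 * DAY_MS;
}
```

Separating `isExpired` from `isPurgeable` mirrors the two-stage behavior described above: a memory is first hidden, then deleted later.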
### Auto-Extraction
When an agent session completes, the dispatcher can trigger auto-extraction. The extractor scans the session transcript for:
- Explicit knowledge statements ("I learned that...", "Note:", "Important:")
- Error resolutions (error message followed by successful fix)
- Configuration discoveries (env vars, file paths, API endpoints)
Extracted fragments are created with `trustTier: 'auto'` and default TTL. The user can promote them to `agent` or `human` tier.
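
As an illustration of the first scan category (explicit knowledge statements), a line-based matcher might look like the sketch below. The marker regex and the names `KNOWLEDGE_MARKER`, `ExtractedCandidate`, and `extractKnowledgeStatements` are assumptions; the actual extractor's patterns are not documented here, and error resolutions and configuration discoveries would need separate logic.

```typescript
// A candidate produced by the scan; mirrors a subset of MemoryFragment.
interface ExtractedCandidate {
  content: string;
  trustTier: 'auto';
}

// Hypothetical marker pattern covering the explicit-statement phrases listed above.
const KNOWLEDGE_MARKER = /^\s*(?:I learned that|Note:|Important:)\s*(.+)$/i;

function extractKnowledgeStatements(transcript: string): ExtractedCandidate[] {
  const candidates: ExtractedCandidate[] = [];
  transcript.split(/\r?\n/).forEach((line) => {
    const match = KNOWLEDGE_MARKER.exec(line);
    if (match) {
      // Keep only the text after the marker; every candidate starts in the 'auto' tier.
      candidates.push({ content: match[1].trim(), trustTier: 'auto' });
    }
  });
  return candidates;
}
```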
### Memory Injection
Before an agent session starts, the top-K most relevant memories are retrieved and formatted into a `## Project Knowledge` section in the system prompt. Relevance is determined by FTS5 rank score against the project context (project name, CWD, recent file paths).
Default K = 5. Configurable per project via `pro_memory_set_config`.
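
The formatting step might look like the following sketch. Only the `## Project Knowledge` heading and the default K of 5 come from the description above; the bullet layout, the tier label, and the names `InjectableMemory` and `formatProjectKnowledge` are assumptions made for illustration.

```typescript
interface InjectableMemory {
  content: string;
  trustTier: 'agent' | 'human' | 'auto';
}

// Format the top-K memories (already ordered by relevance) as a prompt section.
// The bullet layout and tier label are assumptions; only the heading is documented.
function formatProjectKnowledge(ranked: InjectableMemory[], topK: number = 5): string {
  const selected = ranked.slice(0, topK);
  // Emit nothing rather than an empty section when there are no memories.
  if (selected.length === 0) return '';
  const bullets = selected.map((m) => `- [${m.trustTier}] ${m.content}`);
  return ['## Project Knowledge', ''].concat(bullets).join('\n');
}
```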
### SQLite Schema
In `agor_pro.db`:
```sql
CREATE TABLE pro_memories (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    project_id TEXT NOT NULL,
    content TEXT NOT NULL,
    source TEXT NOT NULL,
    trust_tier TEXT NOT NULL DEFAULT 'auto',
    tags TEXT NOT NULL DEFAULT '[]', -- JSON array
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    updated_at TEXT NOT NULL DEFAULT (datetime('now')),
    expires_at TEXT,
    access_count INTEGER NOT NULL DEFAULT 0
);

CREATE VIRTUAL TABLE pro_memories_fts USING fts5(
    content,
    tags,
    content=pro_memories,
    content_rowid=id
);

CREATE INDEX idx_pro_memories_project ON pro_memories(project_id);
CREATE INDEX idx_pro_memories_expires ON pro_memories(expires_at);
```
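
A row from `pro_memories` differs from the `MemoryFragment` interface in three ways: column names are snake_case, `tags` is JSON text, and `datetime('now')` produces UTC timestamps in `YYYY-MM-DD HH:MM:SS` form rather than with a `T` separator. The sketch below shows one way to map between them. `ProMemoryRow` and the helper names are hypothetical, and falling back to `'auto'` for an unrecognized tier is an assumption.

```typescript
type TrustTier = 'agent' | 'human' | 'auto';

interface MemoryFragment {
  id: number;
  projectId: string;
  content: string;
  source: string;
  trustTier: TrustTier;
  tags: string[];
  createdAt: string;
  updatedAt: string;
  expiresAt: string | null;
  accessCount: number;
}

// Row shape as stored in pro_memories: snake_case columns, tags as JSON text.
interface ProMemoryRow {
  id: number;
  project_id: string;
  content: string;
  source: string;
  trust_tier: string;
  tags: string;
  created_at: string;
  updated_at: string;
  expires_at: string | null;
  access_count: number;
}

// SQLite's datetime('now') yields UTC "YYYY-MM-DD HH:MM:SS"; add the T and Z.
function sqliteToIso(raw: string): string {
  return /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}$/.test(raw) ? raw.replace(' ', 'T') + 'Z' : raw;
}

// Fall back to 'auto' (the column default) for unrecognized values (assumption).
function toTrustTier(raw: string): TrustTier {
  return raw === 'human' || raw === 'agent' ? raw : 'auto';
}

// Decode the JSON-array tags column, tolerating malformed data.
function parseTags(raw: string): string[] {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (Array.isArray(parsed)) {
      return parsed.filter((t): t is string => typeof t === 'string');
    }
  } catch (e) {
    // Malformed JSON: treat as no tags.
  }
  return [];
}

function rowToFragment(row: ProMemoryRow): MemoryFragment {
  return {
    id: row.id,
    projectId: row.project_id,
    content: row.content,
    source: row.source,
    trustTier: toTrustTier(row.trust_tier),
    tags: parseTags(row.tags),
    createdAt: sqliteToIso(row.created_at),
    updatedAt: sqliteToIso(row.updated_at),
    expiresAt: row.expires_at === null ? null : sqliteToIso(row.expires_at),
    accessCount: row.access_count,
  };
}
```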
### Commands
#### pro_memory_create
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `content` | `String` | Yes | Memory content |
| `source` | `String` | Yes | Origin (session ID, "user", file path) |
| `trustTier` | `String` | No | agent, human, or auto (default: auto) |
| `tags` | `Vec<String>` | No | Categorization tags |
| `ttlDays` | `u32` | No | Days until expiration (omit to use the tier default) |
#### pro_memory_search
FTS5 search across memory fragments for a project.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `query` | `String` | Yes | FTS5 search query |
| `limit` | `u32` | No | Max results (default: 10) |
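
Because `query` is passed as an FTS5 expression, raw user text containing characters such as `-`, `:`, or `"` can be misread as query syntax. One hedged way to build a safe query on the caller's side is to quote every term as an FTS5 string, as sketched below. `quoteFtsTerm` and `buildFtsQuery` are hypothetical helpers, and joining terms with `OR` is an assumption; whether the command already sanitizes its input is not stated here.

```typescript
// Quote one term as an FTS5 string: wrap it in double quotes and double any
// embedded double quote, so characters such as '-' or ':' are matched literally.
function quoteFtsTerm(term: string): string {
  return '"' + term.replace(/"/g, '""') + '"';
}

// Combine free-text terms into a single FTS5 MATCH expression.
// Joining with OR is an assumption; a space (implicit AND) would be stricter.
function buildFtsQuery(terms: string[]): string {
  return terms
    .map((t) => t.trim())
    .filter((t) => t.length > 0)
    .map(quoteFtsTerm)
    .join(' OR ');
}
```

The resulting string can be supplied as the `query` field, for example `buildFtsQuery(['agents-orchestrator', 'src/lib'])`.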
#### pro_memory_get, pro_memory_update, pro_memory_delete
Standard CRUD by memory ID.
#### pro_memory_list
List memories for a project with optional filters.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `trustTier` | `String` | No | Filter by tier |
| `tag` | `String` | No | Filter by tag |
| `limit` | `u32` | No | Max results (default: 50) |
#### pro_memory_inject
Returns formatted memory text for system prompt injection.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `contextHints` | `Vec<String>` | No | Additional FTS5 terms for relevance |
| `topK` | `u32` | No | Number of memories to include (default: 5) |
#### pro_memory_set_config
Sets memory injection configuration for a project.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `topK` | `u32` | No | Default injection count |
| `autoExtract` | `bool` | No | Enable auto-extraction on session end |
---
## Codebase Symbol Graph
### Purpose
The Symbol Graph provides structural code awareness by scanning source files with regex patterns to extract function/method/class/struct definitions and build a lightweight call graph. This gives agents contextual knowledge about code structure without requiring a full language server.
### Scanning
The scanner processes files matching configurable glob patterns. Default extensions:
| Language | Extensions | Patterns Extracted |
|----------|------------|--------------------|
| TypeScript | `.ts`, `.tsx` | functions, classes, interfaces, type aliases, exports |
| Rust | `.rs` | functions, structs, enums, traits, impls |
| Python | `.py` | functions, classes, decorators |
Scanning is triggered manually or on project open. Results are stored in `agor_pro.db`. A full re-scan replaces all symbols for the project.
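
To make the regex-based approach concrete, here is a minimal single-language sketch. The patterns in `TS_PATTERNS` are illustrative assumptions covering a few common TypeScript declaration forms; they are not the scanner's actual pattern set, and `ScannedSymbol` and `scanTypeScript` are hypothetical names.

```typescript
type SymbolKind = 'function' | 'class' | 'struct' | 'enum' | 'trait' | 'interface' | 'type';

interface ScannedSymbol {
  name: string;
  kind: SymbolKind;
  line: number;      // 1-based
  signature: string; // The matching source line, trimmed
}

// Illustrative TypeScript-only patterns; the real pattern set is not documented here.
const TS_PATTERNS: { kind: SymbolKind; regex: RegExp }[] = [
  { kind: 'function', regex: /^\s*(?:export\s+)?(?:async\s+)?function\s+([A-Za-z_$][\w$]*)/ },
  { kind: 'class', regex: /^\s*(?:export\s+)?(?:abstract\s+)?class\s+([A-Za-z_$][\w$]*)/ },
  { kind: 'interface', regex: /^\s*(?:export\s+)?interface\s+([A-Za-z_$][\w$]*)/ },
  { kind: 'type', regex: /^\s*(?:export\s+)?type\s+([A-Za-z_$][\w$]*)\s*=/ },
];

function scanTypeScript(source: string): ScannedSymbol[] {
  const symbols: ScannedSymbol[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    // The first matching pattern wins; a line yields at most one symbol.
    for (let i = 0; i < TS_PATTERNS.length; i++) {
      const match = TS_PATTERNS[i].regex.exec(text);
      if (match) {
        symbols.push({ name: match[1], kind: TS_PATTERNS[i].kind, line: index + 1, signature: text.trim() });
        break;
      }
    }
  });
  return symbols;
}
```

Line-anchored patterns like these skip declarations hidden in the middle of a line, but they still match a declaration-shaped line inside a multi-line comment or template string.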
### Symbol Types
```typescript
interface CodeSymbol {
  id: number;
  projectId: string;
  filePath: string;          // Relative to project root
  name: string;              // Symbol name
  kind: SymbolKind;          // function | class | struct | enum | trait | interface | type
  line: number;              // Line number (1-based)
  signature: string;         // Full signature line
  parentName: string | null; // Enclosing class/struct/impl
}

type SymbolKind = 'function' | 'class' | 'struct' | 'enum' | 'trait' | 'interface' | 'type';
```
### Commands
#### pro_symbols_scan
Triggers a full scan of the project's source files.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `rootPath` | `String` | Yes | Absolute path to project root |
| `extensions` | `Vec<String>` | No | File extensions to scan (default: ts,tsx,rs,py) |
#### pro_symbols_search
Search symbols by name (prefix match).
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `query` | `String` | Yes | Symbol name prefix |
| `kind` | `String` | No | Filter by symbol kind |
| `limit` | `u32` | No | Max results (default: 20) |
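
The search semantics (prefix match, optional kind filter, result cap) can be modeled in memory as below. This is a sketch only: `IndexedSymbol` and `searchSymbols` are hypothetical names, the real command presumably queries `agor_pro.db`, and case-sensitive matching is an assumption.

```typescript
interface IndexedSymbol {
  name: string;
  kind: string;
  filePath: string;
  line: number;
}

// In-memory model of the search: prefix match on name, optional kind filter,
// capped at `limit` results. Case-sensitive matching is an assumption.
function searchSymbols(symbols: IndexedSymbol[], query: string, kind?: string, limit: number = 20): IndexedSymbol[] {
  const results: IndexedSymbol[] = [];
  for (let i = 0; i < symbols.length && results.length < limit; i++) {
    const s = symbols[i];
    const isPrefix = s.name.slice(0, query.length) === query;
    const kindMatches = kind === undefined || s.kind === kind;
    if (isPrefix && kindMatches) results.push(s);
  }
  return results;
}
```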
#### pro_symbols_find_callers
Searches for references to a symbol name across the project's scanned files using text matching.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `symbolName` | `String` | Yes | Symbol to find references to |
Returns file paths and line numbers where the symbol name appears (excluding its definition).
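
A text-matching reference search of this kind can be sketched as follows. The in-memory `files` map, the `Reference` type, and `findCallers` are hypothetical, and whole-word matching via `\b` is an assumption about how the command treats longer identifiers that contain the symbol name.

```typescript
interface Reference {
  filePath: string;
  line: number; // 1-based
}

// `files` maps a relative path to file contents (hypothetical in-memory input).
function findCallers(files: Record<string, string>, symbolName: string, definition: Reference): Reference[] {
  // Escape regex metacharacters so the name is matched literally.
  const escaped = symbolName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Word boundaries (an assumption) avoid matching `loadAll` when searching for `load`.
  const pattern = new RegExp('\\b' + escaped + '\\b');
  const refs: Reference[] = [];
  Object.keys(files).forEach((filePath) => {
    files[filePath].split(/\r?\n/).forEach((text, index) => {
      const line = index + 1;
      // Skip the line that holds the symbol's own definition.
      const isDefinition = filePath === definition.filePath && line === definition.line;
      if (!isDefinition && pattern.test(text)) refs.push({ filePath, line });
    });
  });
  return refs;
}
```

Because the match is purely textual, a comment that mentions the symbol is reported alongside real call sites.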
#### pro_symbols_file
Returns all symbols in a specific file.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `filePath` | `String` | Yes | Relative file path |
### Limitations
- Regex-based extraction is approximate. It does not parse ASTs, so it may miss symbols written with unusual formatting and may report false positives from comments or strings.
- `find_callers` uses text matching, not semantic analysis, so it also reports occurrences of the name inside comments and string literals.
- Large codebases (10,000+ files) may take several seconds to scan. Scanning runs on a background thread and emits a completion event.