docs: add 11 new documentation files across all categories

New reference docs:
- agents/ref-btmsg.md: inter-agent messaging schema and CLI
- agents/ref-bttask.md: kanban task board operations
- providers/ref-providers.md: Claude/Codex/Ollama/Aider comparison
- config/ref-settings.md: (already committed)

New guides:
- contributing/dual-repo-workflow.md: community vs commercial repos
- plugins/guide-developing.md: Web Worker sandbox API and publishing

New pro docs:
- pro/features/knowledge-base.md: persistent memory + symbol graph
- pro/features/git-integration.md: context injection + branch policy
- pro/marketplace/README.md: 13 plugins catalog

Split files:
- architecture/data-model.md: from architecture.md (schemas, layout)
- production/hardening.md: from production.md (supervisor, sandbox, WAL)
- production/features.md: from production.md (FTS5, plugins, secrets, audit)
This commit is contained in:
parent
8251321dac
commit
b6c1d4b6af
11 changed files with 2198 additions and 0 deletions
247
docs/pro/features/knowledge-base.md
Normal file
# Knowledge Base

> This documentation covers Pro edition features available in the agents-orchestrator/agents-orchestrator private repository.

The Knowledge Base provides two complementary systems: Persistent Agent Memory (structured knowledge fragments with search and TTL) and the Codebase Symbol Graph (regex-based symbol extraction for code navigation context).

---

## Persistent Agent Memory

### Purpose

Persistent Agent Memory stores knowledge fragments that agents produce during sessions and makes them available in future sessions. Unlike session anchors (community feature, per-session), memory fragments persist across sessions and are searchable via FTS5.

### Memory Fragments

A memory fragment is a discrete piece of knowledge with metadata:

```typescript
interface MemoryFragment {
  id: number;
  projectId: string;
  content: string;          // The knowledge itself (plain text or markdown)
  source: string;           // Where it came from (session ID, file path, user)
  trustTier: TrustTier;     // agent | human | auto
  tags: string[];           // Categorization tags
  createdAt: string;        // ISO 8601
  updatedAt: string;        // ISO 8601
  expiresAt: string | null; // ISO 8601, null = never expires
  accessCount: number;      // Times retrieved for injection
}

type TrustTier = 'agent' | 'human' | 'auto';
```

### Trust Tiers

| Tier | Source | Injection Priority | Editable |
|------|--------|-------------------|----------|
| `human` | Created or approved by user | Highest | Yes |
| `agent` | Extracted by agent during session | Medium | Yes |
| `auto` | Auto-extracted from patterns | Lowest | Yes |

When injecting memories into agent prompts, higher-trust memories are prioritized. Within the same tier, more recently accessed memories rank higher.
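The ordering rule above can be sketched as a sort comparator. This is a minimal illustration, not the Pro implementation: `RankedMemory`, `TIER_RANK`, and `byInjectionPriority` are hypothetical names, and `lastAccessedAt` is an assumed field, since `MemoryFragment` as documented here does not expose a last-access timestamp.

```typescript
type TrustTier = 'agent' | 'human' | 'auto';

// Hypothetical shape: only the fields this ordering needs.
interface RankedMemory {
  id: number;
  trustTier: TrustTier;
  lastAccessedAt: string; // ISO 8601; assumed field, not part of MemoryFragment
}

// Lower number = injected first (human > agent > auto).
const TIER_RANK: Record<TrustTier, number> = { human: 0, agent: 1, auto: 2 };

function byInjectionPriority(a: RankedMemory, b: RankedMemory): number {
  const tierDiff = TIER_RANK[a.trustTier] - TIER_RANK[b.trustTier];
  if (tierDiff !== 0) return tierDiff;
  // Same tier: the more recently accessed memory comes first.
  return Date.parse(b.lastAccessedAt) - Date.parse(a.lastAccessedAt);
}

const sample: RankedMemory[] = [
  { id: 1, trustTier: 'auto', lastAccessedAt: '2025-03-01T00:00:00Z' },
  { id: 2, trustTier: 'agent', lastAccessedAt: '2025-01-01T00:00:00Z' },
  { id: 3, trustTier: 'human', lastAccessedAt: '2024-12-01T00:00:00Z' },
  { id: 4, trustTier: 'agent', lastAccessedAt: '2024-11-01T00:00:00Z' },
];

// Copy before sorting so the input array is left untouched.
const ordered = sample.slice().sort(byInjectionPriority);
```

In this sample the `human` memory sorts first even though it was accessed least recently, because tier is compared before recency.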

### TTL (Time-To-Live)

Memories can have an optional expiration date. Expired memories are excluded from search results and injection. A background cleanup runs on plugin init, deleting memories expired more than 30 days ago.

Default TTL by trust tier:

| Tier | Default TTL |
|------|-------------|
| `human` | None (permanent) |
| `agent` | 90 days |
| `auto` | 30 days |

Users can override TTL on any individual memory.
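As a rough sketch of how the tier defaults and a per-memory override could combine into an `expiresAt` value: the helper names `computeExpiresAt` and `isExpired` are illustrative assumptions, and the actual expiry logic in the Pro plugin is not shown in this document.

```typescript
type TrustTier = 'agent' | 'human' | 'auto';

// Default TTL in days per tier, taken from the table above; null = permanent.
const DEFAULT_TTL_DAYS: Record<TrustTier, number | null> = {
  human: null,
  agent: 90,
  auto: 30,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the ISO 8601 expiry, or null when the memory never expires.
// A null or omitted override falls back to the tier default.
function computeExpiresAt(
  tier: TrustTier,
  createdAt: Date,
  ttlDaysOverride?: number | null,
): string | null {
  const ttlDays = ttlDaysOverride ?? DEFAULT_TTL_DAYS[tier];
  if (ttlDays === null) return null;
  return new Date(createdAt.getTime() + ttlDays * MS_PER_DAY).toISOString();
}

// Expired memories are excluded from search results and injection.
function isExpired(expiresAt: string | null, now: Date): boolean {
  return expiresAt !== null && Date.parse(expiresAt) <= now.getTime();
}
```

The arithmetic is done on UTC milliseconds, so daylight-saving changes do not shift the computed expiry.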

### Auto-Extraction

When an agent session completes, the dispatcher can trigger auto-extraction. The extractor scans the session transcript for:

- Explicit knowledge statements ("I learned that...", "Note:", "Important:")
- Error resolutions (error message followed by successful fix)
- Configuration discoveries (env vars, file paths, API endpoints)

Extracted fragments are created with `trustTier: 'auto'` and default TTL. The user can promote them to `agent` or `human` tier.
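The first category (explicit knowledge statements) lends itself to a line-based sketch. The regex below is built only from the three example markers listed above; the extractor's real patterns are not documented here, and `ExtractedFragment` and `extractKnowledgeStatements` are hypothetical names.

```typescript
// Markers taken from the examples above; the real extractor may use different patterns.
const KNOWLEDGE_MARKER = /^\s*(?:I learned that|Note:|Important:)\s*(.+)$/i;

// Hypothetical result shape: a subset of the fields pro_memory_create accepts.
interface ExtractedFragment {
  content: string;
  trustTier: 'auto';
  source: string;
}

function extractKnowledgeStatements(transcript: string, sessionId: string): ExtractedFragment[] {
  const fragments: ExtractedFragment[] = [];
  for (const line of transcript.split(/\r?\n/)) {
    const match = KNOWLEDGE_MARKER.exec(line);
    if (match) {
      // Auto-extracted fragments always start in the lowest trust tier.
      fragments.push({ content: match[1].trim(), trustTier: 'auto', source: sessionId });
    }
  }
  return fragments;
}
```

Error resolutions and configuration discoveries need context across several transcript lines, so a single-line pattern like this one would not cover them.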

### Memory Injection

Before an agent session starts, the top-K most relevant memories are retrieved and formatted into a `## Project Knowledge` section in the system prompt. Relevance is determined by FTS5 rank score against the project context (project name, CWD, recent file paths).

Default K = 5. Configurable per project via `pro_memory_set_config`.
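The formatting step might look roughly like the sketch below. Only the `## Project Knowledge` heading and the default of 5 come from this document; the bullet layout, the tier label in brackets, and the names `InjectableMemory` and `formatProjectKnowledge` are assumptions for illustration.

```typescript
// Hypothetical shape: only the fields the formatter reads.
interface InjectableMemory {
  content: string;
  trustTier: 'agent' | 'human' | 'auto';
}

// Formats already-ranked memories into a prompt section.
// Returns an empty string when there is nothing to inject.
function formatProjectKnowledge(ranked: InjectableMemory[], topK = 5): string {
  const selected = ranked.slice(0, topK);
  if (selected.length === 0) return '';
  // Assumed layout: one bullet per memory, prefixed with its trust tier.
  const bullets = selected.map((m) => `- [${m.trustTier}] ${m.content}`);
  return ['## Project Knowledge', '', ...bullets].join('\n');
}
```

The function expects its input to be ranked already, which keeps relevance scoring (FTS5) separate from presentation.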

### SQLite Schema

In `agor_pro.db`:

```sql
CREATE TABLE pro_memories (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    project_id TEXT NOT NULL,
    content TEXT NOT NULL,
    source TEXT NOT NULL,
    trust_tier TEXT NOT NULL DEFAULT 'auto',
    tags TEXT NOT NULL DEFAULT '[]', -- JSON array
    created_at TEXT NOT NULL DEFAULT (datetime('now')),
    updated_at TEXT NOT NULL DEFAULT (datetime('now')),
    expires_at TEXT,
    access_count INTEGER NOT NULL DEFAULT 0
);

CREATE VIRTUAL TABLE pro_memories_fts USING fts5(
    content,
    tags,
    content=pro_memories,
    content_rowid=id
);

CREATE INDEX idx_pro_memories_project ON pro_memories(project_id);
CREATE INDEX idx_pro_memories_expires ON pro_memories(expires_at);
```

### Commands

#### pro_memory_create

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `content` | `String` | Yes | Memory content |
| `source` | `String` | Yes | Origin (session ID, "user", file path) |
| `trustTier` | `String` | No | agent, human, or auto (default: auto) |
| `tags` | `Vec<String>` | No | Categorization tags |
| `ttlDays` | `u32` | No | Days until expiration (null = tier default) |

#### pro_memory_search

FTS5 search across memory fragments for a project.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `query` | `String` | Yes | FTS5 search query |
| `limit` | `u32` | No | Max results (default: 10) |
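
Because `query` is passed as an FTS5 search query, raw user input containing characters such as `/`, `-`, or `"` can be read as query syntax. One way a caller might build a safe query is sketched below. Whether `pro_memory_search` escapes input itself is not stated in this document, so treat this as a defensive assumption; `toFts5Query` is a hypothetical helper name.

```typescript
// Wraps each term in double quotes so FTS5 treats it as a string literal
// rather than query syntax. Embedded double quotes are escaped by doubling them.
// Terms are joined with OR so any single matching term is enough.
function toFts5Query(terms: string[]): string {
  return terms
    .map((t) => t.trim())
    .filter((t) => t.length > 0)
    .map((t) => `"${t.replace(/"/g, '""')}"`)
    .join(' OR ');
}

const query = toFts5Query(['src/lib.rs', ' ', 'say "hi"']);
```

A quoted term that tokenizes into several words is matched as a phrase, so `"src/lib.rs"` only matches rows where those tokens appear consecutively.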

#### pro_memory_get, pro_memory_update, pro_memory_delete

Standard CRUD by memory ID.

#### pro_memory_list

List memories for a project with optional filters.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `trustTier` | `String` | No | Filter by tier |
| `tag` | `String` | No | Filter by tag |
| `limit` | `u32` | No | Max results (default: 50) |

#### pro_memory_inject

Returns formatted memory text for system prompt injection.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `contextHints` | `Vec<String>` | No | Additional FTS5 terms for relevance |
| `topK` | `u32` | No | Number of memories to include (default: 5) |

#### pro_memory_set_config

Sets memory injection configuration for a project.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `topK` | `u32` | No | Default injection count |
| `autoExtract` | `bool` | No | Enable auto-extraction on session end |

---

## Codebase Symbol Graph

### Purpose

The Symbol Graph provides structural code awareness by scanning source files with regex patterns to extract function/method/class/struct definitions and build a lightweight call graph. This gives agents contextual knowledge about code structure without requiring a full language server.

### Scanning

The scanner processes files matching configurable glob patterns. Default extensions:

| Language | Extensions | Patterns Extracted |
|----------|------------|--------------------|
| TypeScript | `.ts`, `.tsx` | functions, classes, interfaces, type aliases, exports |
| Rust | `.rs` | functions, structs, enums, traits, impls |
| Python | `.py` | functions, classes, decorators |

Scanning is triggered manually or on project open. Results are stored in `agor_pro.db`. A full re-scan replaces all symbols for the project.
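To make the regex-based approach concrete, here is a small sketch that extracts a few Rust symbol kinds line by line. The patterns are illustrative guesses, not the scanner's actual regexes, and `ScannedSymbol`, `RUST_PATTERNS`, and `scanRustSource` are hypothetical names.

```typescript
type SymbolKind = 'function' | 'class' | 'struct' | 'enum' | 'trait' | 'interface' | 'type';

// Hypothetical result shape: a subset of the stored symbol fields.
interface ScannedSymbol {
  name: string;
  kind: SymbolKind;
  line: number; // 1-based
  signature: string;
}

// Optional visibility prefix such as `pub` or `pub(crate)`.
const VIS = '(?:pub(?:\\([^)]*\\))?\\s+)?';

// Illustrative Rust-only patterns; each captures the symbol name in group 1.
const RUST_PATTERNS: Array<{ kind: SymbolKind; regex: RegExp }> = [
  { kind: 'function', regex: new RegExp(`^\\s*${VIS}(?:async\\s+)?fn\\s+([A-Za-z_]\\w*)`) },
  { kind: 'struct', regex: new RegExp(`^\\s*${VIS}struct\\s+([A-Za-z_]\\w*)`) },
  { kind: 'enum', regex: new RegExp(`^\\s*${VIS}enum\\s+([A-Za-z_]\\w*)`) },
  { kind: 'trait', regex: new RegExp(`^\\s*${VIS}trait\\s+([A-Za-z_]\\w*)`) },
];

function scanRustSource(source: string): ScannedSymbol[] {
  const symbols: ScannedSymbol[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { kind, regex } of RUST_PATTERNS) {
      const match = regex.exec(text);
      if (match) {
        symbols.push({ name: match[1], kind, line: index + 1, signature: text.trim() });
        break; // At most one symbol per line in this sketch.
      }
    }
  });
  return symbols;
}
```

Because each line is matched in isolation, a definition inside a block comment or multi-line string would still be reported, which is the kind of false positive a regex scanner has to accept.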

### Symbol Types

```typescript
interface CodeSymbol {
  id: number;
  projectId: string;
  filePath: string;          // Relative to project root
  name: string;              // Symbol name
  kind: SymbolKind;          // function | class | struct | enum | trait | interface | type
  line: number;              // Line number (1-based)
  signature: string;         // Full signature line
  parentName: string | null; // Enclosing class/struct/impl
}

type SymbolKind = 'function' | 'class' | 'struct' | 'enum' | 'trait' | 'interface' | 'type';
```

### Commands

#### pro_symbols_scan

Triggers a full scan of the project's source files.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `rootPath` | `String` | Yes | Absolute path to project root |
| `extensions` | `Vec<String>` | No | File extensions to scan (default: ts,rs,py) |

#### pro_symbols_search

Search symbols by name (prefix match).

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `query` | `String` | Yes | Symbol name prefix |
| `kind` | `String` | No | Filter by symbol kind |
| `limit` | `u32` | No | Max results (default: 20) |

#### pro_symbols_find_callers

Searches for references to a symbol name across the project's scanned files using text matching.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `symbolName` | `String` | Yes | Symbol to find references to |

Returns file paths and line numbers where the symbol name appears (excluding its definition).
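A text-matching reference search of this kind can be sketched as follows. The whole-word matching, the in-memory `files` map, and the names `Reference`, `escapeRegExp`, and `findReferences` are assumptions for illustration; the command's actual matching rules are not specified here.

```typescript
// Hypothetical result shape.
interface Reference {
  filePath: string;
  line: number; // 1-based
}

// Escapes regex metacharacters so the symbol name is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// `files` maps relative paths to file contents.
// `definition` is the known definition site, which is excluded from the results.
function findReferences(
  files: Record<string, string>,
  symbolName: string,
  definition: Reference,
): Reference[] {
  // Assumption: match whole words only, so `load` does not match `reload`.
  const pattern = new RegExp(`\\b${escapeRegExp(symbolName)}\\b`);
  const refs: Reference[] = [];
  for (const filePath of Object.keys(files)) {
    files[filePath].split(/\r?\n/).forEach((text, index) => {
      const line = index + 1;
      const isDefinition = filePath === definition.filePath && line === definition.line;
      if (!isDefinition && pattern.test(text)) refs.push({ filePath, line });
    });
  }
  return refs;
}
```

Since this is plain text matching, a mention of the symbol inside a comment is returned alongside real call sites.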

#### pro_symbols_file

Returns all symbols in a specific file.

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `filePath` | `String` | Yes | Relative file path |

### Limitations

- Regex-based extraction is approximate. It does not parse ASTs and may miss symbols in unusual formatting or produce false positives in comments/strings.
- `find_callers` uses text matching, not semantic analysis. It will find string matches in comments and string literals.
- Large codebases (10,000+ files) may take several seconds to scan. Scanning runs on a background thread and emits a completion event.