docs: add 11 new documentation files across all categories

New reference docs:
- agents/ref-btmsg.md: inter-agent messaging schema and CLI
- agents/ref-bttask.md: kanban task board operations
- providers/ref-providers.md: Claude/Codex/Ollama/Aider comparison
- config/ref-settings.md: (already committed)

New guides:
- contributing/dual-repo-workflow.md: community vs commercial repos
- plugins/guide-developing.md: Web Worker sandbox API and publishing

New pro docs:
- pro/features/knowledge-base.md: persistent memory + symbol graph
- pro/features/git-integration.md: context injection + branch policy
- pro/marketplace/README.md: 13 plugins catalog

Split files:
- architecture/data-model.md: from architecture.md (schemas, layout)
- production/hardening.md: from production.md (supervisor, sandbox, WAL)
- production/features.md: from production.md (FTS5, plugins, secrets, audit)
Hibryda 2026-03-17 04:18:05 +01:00
parent 8251321dac
commit b6c1d4b6af
11 changed files with 2198 additions and 0 deletions


@ -0,0 +1,196 @@
# Git Integration
> This documentation covers Pro edition features available in the agents-orchestrator/agents-orchestrator private repository.
Git Integration provides two features: Git Context Injection (branch, commit, and diff information formatted for agent prompts) and Branch Policy (session-level protection for sensitive branches).
---
## Git Context Injection
### Purpose
Git Context Injection gathers repository state and formats it as markdown for inclusion in agent system prompts. This gives agents awareness of the current branch, recent commits, and modified files without requiring them to run git commands.
### Context Gathering
The system collects three categories of information by invoking the `git` CLI:
#### Branch Information
- Current branch name (`git rev-parse --abbrev-ref HEAD`)
- Tracking branch and ahead/behind counts (`git rev-list --left-right --count`)
- Last commit on branch (hash, author, date, subject)
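As an illustration of the ahead/behind step, the sketch below parses the two counts that `git rev-list --left-right --count` prints. It assumes the command is run against a symmetric range such as `HEAD...@{upstream}` (the range is not stated above), in which case the left count is "ahead" and the right count is "behind". The function name `parseAheadBehind` is hypothetical.

```typescript
// Sketch only: assumes the output comes from
//   git rev-list --left-right --count HEAD...@{upstream}
// which prints two whitespace-separated integers: "<left>\t<right>".
// Left = commits only on HEAD (ahead), right = commits only on upstream (behind).
function parseAheadBehind(output: string): [number, number] {
  const [left = "", right = "", ...rest] = output.trim().split(/\s+/);
  const ahead = parseInt(left, 10);
  const behind = parseInt(right, 10);
  if (rest.length > 0 || isNaN(ahead) || isNaN(behind)) {
    throw new Error("unexpected rev-list output: " + JSON.stringify(output));
  }
  return [ahead, behind];
}
```

The tuple order matches the `aheadBehind: [ahead, behind]` field of the `GitContext` response.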
#### Recent Commits
- Last N commits on the current branch (default: 10)
- Each commit includes: short hash, author, relative date, subject line
- Collected via `git log` with a custom `--format` string (`--oneline` alone omits author and date)
#### Modified Files
- Staged files (`git diff --cached --name-status`)
- Unstaged modifications (`git diff --name-status`)
- Untracked files (`git ls-files --others --exclude-standard`)
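The `--name-status` output used for staged and unstaged files is one tab-separated record per line. A minimal parsing sketch follows; the `FileChange` shape and the `parseNameStatus` name are assumptions for illustration, not part of the Pro API.

```typescript
// Hypothetical shape for one entry of `git diff --name-status` output.
interface FileChange {
  status: string;        // First letter of the status code: "M", "A", "D", "R", ...
  path: string;          // Current path of the file
  previousPath?: string; // Only set for renames and copies
}

// Parses lines such as "M\tsrc/a.ts" or "R100\told.ts\tnew.ts".
function parseNameStatus(output: string): FileChange[] {
  const changes: FileChange[] = [];
  for (const line of output.split(/\r?\n/)) {
    if (line.trim() === "") continue;
    const fields = line.split("\t");
    const status = (fields[0] ?? "").charAt(0);
    if ((status === "R" || status === "C") && fields.length >= 3) {
      // Renames and copies carry a similarity score plus old and new paths.
      changes.push({ status, path: fields[2] ?? "", previousPath: fields[1] ?? "" });
    } else {
      changes.push({ status, path: fields[1] ?? "" });
    }
  }
  return changes;
}
```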
### Formatted Output
The collected information is formatted as a markdown section:
```markdown
## Git Context
**Branch:** feature/new-dashboard (ahead 3, behind 0 of origin/main)
### Recent Commits (last 10)
- `a1b2c3d` (2 hours ago) fix: resolve null check in analytics
- `e4f5a6b` (5 hours ago) feat: add daily cost chart
- ...
### Working Tree
**Staged:**
- M src/lib/components/Analytics.svelte
- A src/lib/stores/analytics.svelte.ts
**Modified:**
- M src/lib/adapters/pro-bridge.ts
**Untracked:**
- tests/analytics.test.ts
```
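A formatter producing the layout above could look roughly like this. The input types (`GitContextInput`, `CommitSummary`, `WorkingTree`) and the function name are hypothetical; only the output layout is taken from the example.

```typescript
// Hypothetical input shapes; only the rendered layout follows the example above.
interface CommitSummary { shortHash: string; relativeDate: string; subject: string; }
interface WorkingTree { staged: string[]; modified: string[]; untracked: string[]; }
interface GitContextInput {
  branch: string;
  trackingBranch: string | null;
  ahead: number;
  behind: number;
  commits: CommitSummary[];
  tree: WorkingTree;
}

function formatGitContext(ctx: GitContextInput): string {
  const lines: string[] = ["## Git Context"];
  // Tracking details are only shown when an upstream is configured.
  const tracking = ctx.trackingBranch === null
    ? ""
    : " (ahead " + ctx.ahead + ", behind " + ctx.behind + " of " + ctx.trackingBranch + ")";
  lines.push("**Branch:** " + ctx.branch + tracking);
  lines.push("### Recent Commits (last " + ctx.commits.length + ")");
  for (const c of ctx.commits) {
    lines.push("- `" + c.shortHash + "` (" + c.relativeDate + ") " + c.subject);
  }
  lines.push("### Working Tree");
  const groups: Array<[string, string[]]> = [
    ["Staged", ctx.tree.staged],
    ["Modified", ctx.tree.modified],
    ["Untracked", ctx.tree.untracked],
  ];
  for (const [label, entries] of groups) {
    if (entries.length === 0) continue; // Skip empty groups (an assumption)
    lines.push("**" + label + ":**");
    for (const entry of entries) lines.push("- " + entry);
  }
  return lines.join("\n");
}
```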
### Commands
#### pro_git_context
Gathers all git context for a project directory and returns formatted markdown.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `cwd` | `String` | Yes | Absolute path to the git repository |
| `commitCount` | `u32` | No | Number of recent commits to include (default: 10) |
**Response:**
```typescript
interface GitContext {
markdown: string; // Formatted markdown section
branch: string; // Current branch name
isDirty: boolean; // Has uncommitted changes
aheadBehind: [number, number]; // [ahead, behind]
}
```
#### pro_git_inject
Convenience command that gathers git context and prepends it to a given system prompt.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `cwd` | `String` | Yes | Repository path |
| `systemPrompt` | `String` | Yes | Existing system prompt to augment |
| `commitCount` | `u32` | No | Number of recent commits (default: 10) |
Returns the combined prompt string.
#### pro_git_branch_info
Returns structured branch information without formatting.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `cwd` | `String` | Yes | Repository path |
**Response:**
```typescript
interface BranchInfo {
name: string;
trackingBranch: string | null;
ahead: number;
behind: number;
lastCommitHash: string;
lastCommitSubject: string;
lastCommitDate: string;
}
```
### Implementation Notes
- All git commands are executed via `std::process::Command` (not libgit2). This avoids a heavy native dependency and matches the git CLI behavior users expect.
- Commands run with a 5-second timeout. If git is not installed or the directory is not a repository, commands return structured errors.
- Output encoding is handled as UTF-8 with lossy conversion for non-UTF-8 paths.
---
## Branch Policy
### Purpose
Branch Policy prevents agents from making commits or modifications on protected branches. This is a session-level safeguard: the policy is checked when an agent session starts and when git operations are detected in tool calls.
### Protection Rules
Protected branch patterns are configurable per project. The defaults are:
| Pattern | Matches |
|---------|---------|
| `main` | Exact match |
| `master` | Exact match |
| `release/*` | Any branch starting with `release/` |
When an agent session starts on a protected branch, the system emits a warning notification. It does not block the session, because agents may need to read code on these branches. However, the branch name is included in the agent's system prompt with a clear instruction not to commit.
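The pattern rules in the table can be sketched as a small matcher. This is an illustration, not the Pro implementation: it assumes `*` matches any run of characters (including `/`) and that everything else is compared literally. The names `patternToRegExp` and `matchProtectedPattern` are hypothetical.

```typescript
const DEFAULT_PROTECTED_PATTERNS = ["main", "master", "release/*"];

// Converts a branch pattern into an anchored regular expression.
// Assumption: "*" means "any characters"; all other characters are literal.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + escaped + "$");
}

// Returns the first pattern that matches the branch, or null if none does.
function matchProtectedPattern(
  branch: string,
  patterns: string[] = DEFAULT_PROTECTED_PATTERNS
): string | null {
  for (const pattern of patterns) {
    if (patternToRegExp(pattern).test(branch)) return pattern;
  }
  return null;
}
```

Because the expression is anchored at both ends, `main` does not match `maintenance`, and `release/*` does not match a branch named just `release`.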
### Commands
#### pro_branch_check
Checks whether the current branch is protected.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `cwd` | `String` | Yes | Repository path |
| `projectId` | `String` | Yes | Project identifier (for project-specific policies) |
**Response:**
```typescript
interface BranchCheckResult {
branch: string;
isProtected: boolean;
matchedPattern: string | null; // Which pattern matched
}
```
#### pro_branch_policy_set
Sets custom protected branch patterns for a project. Replaces any existing patterns.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `patterns` | `Vec<String>` | Yes | Branch patterns (exact names or glob with `*`) |
#### pro_branch_policy_get
Returns the current branch policy for a project. Returns default patterns if none are set.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
#### pro_branch_policy_delete
Removes custom branch policy for a project, reverting to defaults.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
### Integration
Branch policy is checked at two points:
1. **Session start:** `AgentSession.startQuery()` calls `pro_branch_check`. If protected, a warning toast is shown and the branch protection instruction is appended to the system prompt.
2. **System prompt injection:** The formatted git context (from `pro_git_inject`) includes a `PROTECTED BRANCH` warning banner when applicable.
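The session-start step can be sketched as a pure function over the `BranchCheckResult` returned by `pro_branch_check`. The wording of the instruction and the function name `appendBranchProtection` are assumptions; the actual text appended by the Pro edition is not documented here.

```typescript
interface BranchCheckResult {
  branch: string;
  isProtected: boolean;
  matchedPattern: string | null;
}

// Appends a protection notice to the system prompt when the branch is protected.
// The notice wording below is illustrative only.
function appendBranchProtection(systemPrompt: string, check: BranchCheckResult): string {
  if (!check.isProtected) return systemPrompt;
  const pattern = check.matchedPattern ?? check.branch;
  const notice =
    "PROTECTED BRANCH: the current branch '" + check.branch +
    "' matches the protected pattern '" + pattern +
    "'. Do not commit or modify files on this branch.";
  return systemPrompt + "\n\n" + notice;
}
```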


@ -0,0 +1,247 @@
# Knowledge Base
> This documentation covers Pro edition features available in the agents-orchestrator/agents-orchestrator private repository.
The Knowledge Base provides two complementary systems: Persistent Agent Memory (structured knowledge fragments with search and TTL) and the Codebase Symbol Graph (regex-based symbol extraction for code navigation context).
---
## Persistent Agent Memory
### Purpose
Persistent Agent Memory stores knowledge fragments that agents produce during sessions and makes them available in future sessions. Unlike session anchors (community feature, per-session), memory fragments persist across sessions and are searchable via FTS5.
### Memory Fragments
A memory fragment is a discrete piece of knowledge with metadata:
```typescript
interface MemoryFragment {
id: number;
projectId: string;
content: string; // The knowledge itself (plain text or markdown)
source: string; // Where it came from (session ID, file path, user)
trustTier: TrustTier; // agent | human | auto
tags: string[]; // Categorization tags
createdAt: string; // ISO 8601
updatedAt: string; // ISO 8601
expiresAt: string | null; // ISO 8601, null = never expires
accessCount: number; // Times retrieved for injection
}
type TrustTier = 'agent' | 'human' | 'auto';
```
### Trust Tiers
| Tier | Source | Injection Priority | Editable |
|------|--------|-------------------|----------|
| `human` | Created or approved by user | Highest | Yes |
| `agent` | Extracted by agent during session | Medium | Yes |
| `auto` | Auto-extracted from patterns | Lowest | Yes |
When injecting memories into agent prompts, higher-trust memories are prioritized. Within the same tier, more recently accessed memories rank higher.
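The ordering rule can be illustrated with a comparator. Note the assumption: `MemoryFragment` as documented has no last-access timestamp, so the sketch uses a hypothetical `lastAccessedAt` field to express "more recently accessed".

```typescript
type TrustTier = "agent" | "human" | "auto";

// `lastAccessedAt` is a hypothetical field used only for this illustration.
interface RankableMemory {
  id: number;
  trustTier: TrustTier;
  lastAccessedAt: string; // ISO 8601
}

// Lower number = higher injection priority.
const TIER_PRIORITY: Record<TrustTier, number> = { human: 0, agent: 1, auto: 2 };

// Returns a new array: highest trust first, most recently accessed first within a tier.
function rankForInjection<T extends RankableMemory>(memories: T[]): T[] {
  return memories.slice().sort((a, b) => {
    const byTier = TIER_PRIORITY[a.trustTier] - TIER_PRIORITY[b.trustTier];
    if (byTier !== 0) return byTier;
    return Date.parse(b.lastAccessedAt) - Date.parse(a.lastAccessedAt);
  });
}
```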
### TTL (Time-To-Live)
Memories can have an optional expiration date. Expired memories are excluded from search results and injection. A background cleanup runs on plugin init, deleting memories expired more than 30 days ago.
Default TTL by trust tier:
| Tier | Default TTL |
|------|-------------|
| `human` | None (permanent) |
| `agent` | 90 days |
| `auto` | 30 days |
Users can override TTL on any individual memory.
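A small sketch of how the tier defaults and a per-memory override could combine into an `expiresAt` value. The function names are hypothetical; the default values come from the table above.

```typescript
type TrustTier = "agent" | "human" | "auto";

// Default TTL in days per tier; null means the memory never expires.
const DEFAULT_TTL_DAYS: Record<TrustTier, number | null> = {
  human: null,
  agent: 90,
  auto: 30,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns an ISO 8601 expiry timestamp, or null for permanent memories.
// An explicit ttlDays overrides the tier default.
function computeExpiresAt(
  tier: TrustTier,
  createdAt: Date,
  ttlDays?: number | null
): string | null {
  const days = ttlDays ?? DEFAULT_TTL_DAYS[tier];
  if (days === null) return null;
  return new Date(createdAt.getTime() + days * MS_PER_DAY).toISOString();
}

// A memory is expired once its expiry timestamp has passed.
function isExpired(expiresAt: string | null, now: Date): boolean {
  return expiresAt !== null && Date.parse(expiresAt) <= now.getTime();
}
```

For example, an `agent` memory created at `2026-01-01T00:00:00Z` would expire at `2026-04-01T00:00:00.000Z` under the 90-day default.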
### Auto-Extraction
When an agent session completes, the dispatcher can trigger auto-extraction. The extractor scans the session transcript for:
- Explicit knowledge statements ("I learned that...", "Note:", "Important:")
- Error resolutions (error message followed by successful fix)
- Configuration discoveries (env vars, file paths, API endpoints)
Extracted fragments are created with `trustTier: 'auto'` and default TTL. The user can promote them to `agent` or `human` tier.
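As a rough illustration of the first category (explicit knowledge statements), the sketch below scans a transcript line by line for the markers listed above. The real extractor's patterns are not documented; the regular expression and the function name are assumptions.

```typescript
// Matches lines that begin with one of the explicit knowledge markers.
// Illustrative pattern only; the actual extractor may differ.
const KNOWLEDGE_MARKERS = /^\s*(?:I learned that|Note:|Important:)\s*(.+)$/i;

// Returns the text following each marker, one entry per matching line.
function extractKnowledgeStatements(transcript: string): string[] {
  const found: string[] = [];
  for (const line of transcript.split(/\r?\n/)) {
    const match = KNOWLEDGE_MARKERS.exec(line);
    if (match && match[1]) found.push(match[1].trim());
  }
  return found;
}
```

Each returned string would then become the `content` of a new fragment with `trustTier: 'auto'`.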
### Memory Injection
Before an agent session starts, the top-K most relevant memories are retrieved and formatted into a `## Project Knowledge` section in the system prompt. Relevance is determined by FTS5 rank score against the project context (project name, CWD, recent file paths).
Default K = 5. Configurable per project via `pro_memory_set_config`.
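The two halves of injection, building an FTS5 query from context hints and rendering the result, might be sketched as follows. Both function names and the bullet layout of the rendered section are assumptions; only the `## Project Knowledge` heading and the default K of 5 come from the text above. Quoting each hint as an FTS5 string (with embedded double quotes doubled) avoids syntax errors from characters such as `/` or `-`.

```typescript
// Builds an FTS5 MATCH expression that ORs together the quoted hints.
function buildFtsQuery(hints: string[]): string {
  return hints
    .map((hint) => hint.trim())
    .filter((hint) => hint.length > 0)
    .map((hint) => '"' + hint.replace(/"/g, '""') + '"')
    .join(" OR ");
}

// Renders already-ranked memories as a prompt section.
// The "- [tier] content" layout is illustrative.
function formatProjectKnowledge(
  memories: { content: string; trustTier: string }[],
  topK: number = 5
): string {
  const lines = ["## Project Knowledge"];
  for (const memory of memories.slice(0, topK)) {
    lines.push("- [" + memory.trustTier + "] " + memory.content);
  }
  return lines.join("\n");
}
```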
### SQLite Schema
In `agor_pro.db`:
```sql
CREATE TABLE pro_memories (
id INTEGER PRIMARY KEY AUTOINCREMENT,
project_id TEXT NOT NULL,
content TEXT NOT NULL,
source TEXT NOT NULL,
trust_tier TEXT NOT NULL DEFAULT 'auto',
tags TEXT NOT NULL DEFAULT '[]', -- JSON array
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now')),
expires_at TEXT,
access_count INTEGER NOT NULL DEFAULT 0
);
CREATE VIRTUAL TABLE pro_memories_fts USING fts5(
content,
tags,
content=pro_memories,
content_rowid=id
);
CREATE INDEX idx_pro_memories_project ON pro_memories(project_id);
CREATE INDEX idx_pro_memories_expires ON pro_memories(expires_at);
```
### Commands
#### pro_memory_create
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `content` | `String` | Yes | Memory content |
| `source` | `String` | Yes | Origin (session ID, "user", file path) |
| `trustTier` | `String` | No | agent, human, or auto (default: auto) |
| `tags` | `Vec<String>` | No | Categorization tags |
| `ttlDays` | `u32` | No | Days until expiration (null = tier default) |
#### pro_memory_search
FTS5 search across memory fragments for a project.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `query` | `String` | Yes | FTS5 search query |
| `limit` | `u32` | No | Max results (default: 10) |
#### pro_memory_get, pro_memory_update, pro_memory_delete
Standard CRUD by memory ID.
#### pro_memory_list
List memories for a project with optional filters.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `trustTier` | `String` | No | Filter by tier |
| `tag` | `String` | No | Filter by tag |
| `limit` | `u32` | No | Max results (default: 50) |
#### pro_memory_inject
Returns formatted memory text for system prompt injection.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `contextHints` | `Vec<String>` | No | Additional FTS5 terms for relevance |
| `topK` | `u32` | No | Number of memories to include (default: 5) |
#### pro_memory_set_config
Sets memory injection configuration for a project.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `topK` | `u32` | No | Default injection count |
| `autoExtract` | `bool` | No | Enable auto-extraction on session end |
---
## Codebase Symbol Graph
### Purpose
The Symbol Graph provides structural code awareness by scanning source files with regex patterns to extract function/method/class/struct definitions and build a lightweight call graph. This gives agents contextual knowledge about code structure without requiring a full language server.
### Scanning
The scanner processes files matching configurable glob patterns. Default extensions:
| Language | Extensions | Patterns Extracted |
|----------|------------|--------------------|
| TypeScript | `.ts`, `.tsx` | functions, classes, interfaces, type aliases, exports |
| Rust | `.rs` | functions, structs, enums, traits, impls |
| Python | `.py` | functions, classes, decorators |
Scanning is triggered manually or on project open. Results are stored in `agor_pro.db`. A full re-scan replaces all symbols for the project.
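To make the regex-based approach concrete, here is a reduced line-by-line extractor. The patterns are illustrative guesses covering a subset of the kinds in the table; the scanner's actual expressions are not documented, and `extractSymbols` is a hypothetical name.

```typescript
type SymbolKind = "function" | "class" | "struct" | "enum" | "trait" | "interface" | "type";

interface ExtractedSymbol {
  name: string;
  kind: SymbolKind;
  line: number;      // 1-based
  signature: string; // Trimmed source line
}

// Illustrative patterns keyed by file extension (without the dot).
const PATTERNS: Record<string, Array<[SymbolKind, RegExp]>> = {
  ts: [
    ["function", /^\s*(?:export\s+)?(?:async\s+)?function\s+([A-Za-z_$][\w$]*)/],
    ["class", /^\s*(?:export\s+)?(?:abstract\s+)?class\s+([A-Za-z_$][\w$]*)/],
    ["interface", /^\s*(?:export\s+)?interface\s+([A-Za-z_$][\w$]*)/],
    ["type", /^\s*(?:export\s+)?type\s+([A-Za-z_$][\w$]*)\s*=/],
  ],
  rs: [
    ["function", /^\s*(?:pub(?:\([^)]*\))?\s+)?(?:async\s+)?fn\s+([A-Za-z_]\w*)/],
    ["struct", /^\s*(?:pub(?:\([^)]*\))?\s+)?struct\s+([A-Za-z_]\w*)/],
    ["enum", /^\s*(?:pub(?:\([^)]*\))?\s+)?enum\s+([A-Za-z_]\w*)/],
    ["trait", /^\s*(?:pub(?:\([^)]*\))?\s+)?trait\s+([A-Za-z_]\w*)/],
  ],
  py: [
    ["function", /^\s*(?:async\s+)?def\s+([A-Za-z_]\w*)/],
    ["class", /^\s*class\s+([A-Za-z_]\w*)/],
  ],
};

// Scans source text line by line; the first matching pattern wins per line.
function extractSymbols(source: string, extension: string): ExtractedSymbol[] {
  const patterns = PATTERNS[extension] ?? [];
  const symbols: ExtractedSymbol[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const [kind, regex] of patterns) {
      const match = regex.exec(text);
      if (match && match[1]) {
        symbols.push({ name: match[1], kind, line: index + 1, signature: text.trim() });
        break;
      }
    }
  });
  return symbols;
}
```

A line-oriented scan like this cannot see definitions split across lines, and it does not track the enclosing `impl` or class needed for `parentName`.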
### Symbol Types
```typescript
interface CodeSymbol {
id: number;
projectId: string;
filePath: string; // Relative to project root
name: string; // Symbol name
kind: SymbolKind; // function | class | struct | enum | trait | interface | type
line: number; // Line number (1-based)
signature: string; // Full signature line
parentName: string | null; // Enclosing class/struct/impl
}
type SymbolKind = 'function' | 'class' | 'struct' | 'enum' | 'trait' | 'interface' | 'type';
```
### Commands
#### pro_symbols_scan
Triggers a full scan of the project's source files.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `rootPath` | `String` | Yes | Absolute path to project root |
| `extensions` | `Vec<String>` | No | File extensions to scan (default: ts,rs,py) |
#### pro_symbols_search
Search symbols by name (prefix match).
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `query` | `String` | Yes | Symbol name prefix |
| `kind` | `String` | No | Filter by symbol kind |
| `limit` | `u32` | No | Max results (default: 20) |
#### pro_symbols_find_callers
Searches for references to a symbol name across the project's scanned files using text matching.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `symbolName` | `String` | Yes | Symbol to find references to |
Returns file paths and line numbers where the symbol name appears (excluding its definition).
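The text-matching behavior can be sketched as below. The use of word boundaries is an assumption (the description above only says "text matching"), and `findReferences` with its in-memory `files` map is a hypothetical stand-in for the scanned file set.

```typescript
interface Reference {
  filePath: string;
  line: number; // 1-based
}

// Escapes characters that have special meaning in a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Finds lines mentioning symbolName as a whole word, skipping the definition line.
// `files` maps relative file paths to file contents.
function findReferences(
  files: Record<string, string>,
  symbolName: string,
  definition: Reference
): Reference[] {
  const pattern = new RegExp("\\b" + escapeRegExp(symbolName) + "\\b");
  const refs: Reference[] = [];
  for (const filePath of Object.keys(files)) {
    const lines = (files[filePath] ?? "").split(/\r?\n/);
    lines.forEach((text, index) => {
      const line = index + 1;
      const isDefinition = filePath === definition.filePath && line === definition.line;
      if (!isDefinition && pattern.test(text)) refs.push({ filePath, line });
    });
  }
  return refs;
}
```

As the test case in a comment line would show, a mention inside `// load it` is reported just like a real call, which is the limitation noted in the next section.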
#### pro_symbols_file
Returns all symbols in a specific file.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `projectId` | `String` | Yes | Project identifier |
| `filePath` | `String` | Yes | Relative file path |
### Limitations
- Regex-based extraction is approximate. It does not parse ASTs and may miss symbols in unusual formatting or produce false positives in comments/strings.
- `find_callers` uses text matching, not semantic analysis. It also reports occurrences of the name inside comments and string literals.
- Large codebases (10,000+ files) may take several seconds to scan. Scanning runs on a background thread and emits a completion event.