docs: add v3.0 release notes and update meta files for hardening sprint
- docs/v3-release-notes.md: comprehensive v3.0 release notes covering Mission Control, multi-agent orchestration, production readiness, multi-machine early access, test coverage, and known limitations - docs/v3-progress.md: hardening sprint session entry - CHANGELOG.md: security entries (TLS, WAL, plugin sandbox, Landlock) and bug fixes (subagent delegation, gitignore) - TODO.md: hardening complete, remaining items moved to v3.1 - CLAUDE.md: updated test counts (444 vitest + 111 cargo)
This commit is contained in:
parent
5e949696d5
commit
58054e56fc
6 changed files with 158 additions and 6 deletions
11
TODO.md
11
TODO.md
|
|
@ -2,13 +2,16 @@
|
|||
|
||||
## Active
|
||||
|
||||
### v3 Remaining
|
||||
- [ ] **Multi-machine real-world testing** -- Test bterminal-relay with 2 machines. Mark as experimental until docker-compose integration tests pass.
|
||||
- [ ] **Multi-machine TLS/certificate pinning** -- TLS support for bterminal-relay + certificate pinning in RemoteManager.
|
||||
- [ ] **Agent Teams real-world testing** -- Env var whitelist fix done. 3 test sessions ran ($1.10, $0.69, $1.70) but model didn't spawn subagents — needs complex multi-part prompts to trigger delegation. Test with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1.
|
||||
### v3.1 Remaining
|
||||
- [ ] **Multi-machine real-world testing** -- TLS added to relay. Needs real 2-machine test. Multi-machine UI not surfaced in v3, code exists in bridges/stores only.
|
||||
- [ ] **Certificate pinning** -- TLS encryption done (v3.0). Pin cert hash in RemoteManager for v3.1.
|
||||
- [ ] **Agent Teams real-world testing** -- Subagent delegation prompt fix done + env var injection. Needs real multi-agent session to verify Manager spawns child agents.
|
||||
- [ ] **Plugin sandbox migration** -- `new Function()` has inherent escape vectors (prototype walking, arguments.callee.constructor). Consider Web Worker isolation for v3.2.
|
||||
- [ ] **Soak test** -- Run 4-hour soak with 6+ agents across 3+ projects. Monitor memory, SQLite WAL size, xterm.js instances.
|
||||
|
||||
## Completed
|
||||
|
||||
- [x] **v3 Hardening Sprint** -- Fixed subagent delegation (prompt + env var), added TLS to relay, WAL checkpoint (5min), Landlock logging, plugin sandbox tests (35), gitignore fix. 444 vitest + 111 cargo. | Done: 2026-03-12
|
||||
- [x] **v3 Production Readiness — ALL tribunal items** -- Implemented all 13 features from tribunal assessment: sidecar supervisor, notifications, secrets, keyboard UX, agent health, search, plugins, sandbox, error classifier, audit log, team agent orchestration, optimistic locking, usage meter. 409 vitest + 109 cargo. | Done: 2026-03-12
|
||||
- [x] **Unified test runner + testing gate rule** -- Created v2/scripts/test-all.sh (vitest + cargo + optional E2E), added npm scripts (test:all, test:all:e2e, test:cargo), added .claude/rules/20-testing-gate.md requiring full suite after major changes. | Done: 2026-03-12
|
||||
- [x] **E2E testing — Phase B+ & test fixes** -- Phase B: LLM judge (llm-judge.ts, claude-haiku-4-5), 6 multi-project scenarios, CI workflow (3 jobs). Test fixes: 27 failures across 3 spec files. 388 vitest + 68 cargo + 82 E2E (0 fail, 4 skip). | Done: 2026-03-12
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue