feat(e2e): add Phase B scenarios with LLM-judged assertions and multi-project tests

Adds 6 new E2E scenarios in phase-b.test.ts covering multi-project grid
rendering, independent tab switching, status bar fleet state, and
LLM-judged evaluation of agent response quality via the Claude API.
Includes the llm-judge.ts helper (raw Anthropic API fetch, haiku-4-5,
structured verdicts with confidence thresholds).
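
For illustration only, the "structured verdicts with confidence thresholds" idea could be sketched as below. This is not the contents of llm-judge.ts: the Verdict field names, the parseVerdict and meetsThreshold helpers, and the 0.7 default threshold are all assumptions made for the example, and the API request itself is omitted.

```typescript
// Hypothetical verdict shape; the real llm-judge.ts may use different fields.
interface Verdict {
  pass: boolean;
  confidence: number; // assumed to be in the range 0..1
  reasoning: string;
}

// Extract and validate a JSON verdict from the judge model's raw text reply.
// Models sometimes wrap JSON in prose, so take the outermost {...} span.
function parseVerdict(raw: string): Verdict {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object found in judge reply");
  }
  const data: unknown = JSON.parse(raw.slice(start, end + 1));
  if (typeof data !== "object" || data === null) {
    throw new Error("judge reply is not a JSON object");
  }
  const { pass, confidence, reasoning } = data as Record<string, unknown>;
  if (
    typeof pass !== "boolean" ||
    typeof confidence !== "number" ||
    !Number.isFinite(confidence) ||
    confidence < 0 ||
    confidence > 1
  ) {
    throw new Error("malformed verdict");
  }
  return {
    pass,
    confidence,
    reasoning: typeof reasoning === "string" ? reasoning : "",
  };
}

// A verdict counts only if the judge passed it AND was confident enough.
// The 0.7 default is an arbitrary value chosen for this sketch.
function meetsThreshold(v: Verdict, minConfidence = 0.7): boolean {
  return v.pass && v.confidence >= minConfidence;
}
```

In a test, a scenario might assert meetsThreshold(parseVerdict(replyText)) so that a low-confidence "pass" is treated as a failure rather than silently accepted.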
Hibryda 2026-03-12 03:07:38 +01:00
parent c4c673a4b0
commit 5e4357e4ac
3 changed files with 469 additions and 0 deletions

@@ -28,6 +28,7 @@ export const config = {
specs: [
resolve(__dirname, 'specs/bterminal.test.ts'),
resolve(__dirname, 'specs/agent-scenarios.test.ts'),
resolve(__dirname, 'specs/phase-b.test.ts'),
],
// ── Capabilities ──