agents-orchestrator/agent-orchestrator

Author	SHA1	Message	Date
Hibryda	f88d10888b	feat: refactor LLM judge to dual-mode CLI/API and fix config test race Refactor llm-judge.ts from raw API-only to dual-mode: CLI first (spawns claude with --output-format text, unsets CLAUDECODE), API fallback. Backend selectable via LLM_JUDGE_BACKEND env var. Fix pre-existing race condition in config.rs tests where parallel test execution caused env var mutations to interfere. Added static Mutex to serialize env-mutating tests.	2026-03-12 11:10:50 +01:00
Hibryda	9a90c2499a	test: add Phase C E2E tests and fix pre-existing test failures - Add phase-c.test.ts: 27 new E2E tests across 11 scenarios covering hardening sprint features (command palette, search overlay, notification center, keyboard navigation, settings panel, project health, metrics tab, context tab, files tab, LLM-judged settings/status bar) - Fix 3 pre-existing failures in bterminal.test.ts: update stale CSS selectors (.group-name → .cmd-label, .palette-item.active → .selected) - Register phase-c.test.ts in wdio.conf.js specs array - Update test counts: 444 vitest + 151 cargo + 109 E2E = 704 total	2026-03-12 11:10:50 +01:00
Hibryda	1f293083b2	fix(e2e): fix 27 E2E test failures across 3 spec files Fix stale v2 CSS selectors for v3 UI, WebKit2GTK keyboard/focus quirks (JS-dispatched KeyboardEvent, programmatic focus check, backdrop click close), conditional render timing (waitUntil for project boxes, null handling for burn-rate/cost elements), and AgentPane missing closing > on data-testid div tag.	2026-03-12 11:10:50 +01:00
Hibryda	90c997d3e9	feat(e2e): add Phase B scenarios with LLM-judged assertions and multi-project tests Adds 6 new E2E scenarios in phase-b.test.ts covering multi-project grid rendering, independent tab switching, status bar fleet state, and LLM-judged agent response quality evaluation via Claude API. Includes llm-judge.ts helper (raw Anthropic API fetch, haiku-4-5, structured verdicts with confidence thresholds).	2026-03-12 11:10:50 +01:00
Hibryda	22fe723816	docs: update meta files for E2E testing engine Phase A	2026-03-12 11:10:50 +01:00
Hibryda	8bc8a1a33d	feat(e2e): add Phase A scenarios, fixtures, and results store 7 human-authored test scenarios (22 tests) using data-testid selectors. Test fixture generator for isolated environments. JSON results store (no native deps). WebdriverIO config updated with TCP readiness probe and multi-spec support.	2026-03-12 11:10:50 +01:00
Hibryda	b9b5ef9cb3	fix(e2e): scope terminal tab selectors to .tab-bar for reliable matching	2026-03-08 22:42:48 +01:00
Hibryda	2eb323fba8	test(e2e): expand coverage from 25 to 48 tests across 8 describe blocks	2026-03-08 22:27:51 +01:00
Hibryda	4c02b87e33	chore: remove old individual E2E spec files Consolidated into single bterminal.test.ts (Tauri single-session requirement).	2026-03-08 21:58:28 +01:00
Hibryda	d12cbffda7	fix(e2e): consolidate specs into single file and fix WebDriver click issues Tauri creates one app session per spec file; multiple files caused invalid session id on subsequent specs. WebDriver clicks on Svelte 5 components inside scrollable panels don't trigger onclick handlers via WebKit2GTK/tauri-driver; use browser.execute() JS clicks instead. Also removed tauri-plugin-log (redundant with telemetry::init()).	2026-03-08 21:58:23 +01:00
Hibryda	bfbdb2cc18	fix(e2e): resolve wdio v9 BiDi + tauri-driver compatibility issues	2026-03-08 21:32:16 +01:00
Hibryda	3c3a8ab54e	test(e2e): scaffold WebdriverIO + tauri-driver E2E testing infrastructure	2026-03-08 21:13:38 +01:00
Hibryda	020dc20d4f	test(v2): add integration tests for layout, agent-bridge, and dispatcher Add 59 new vitest tests: layout.test.ts (30), agent-bridge.test.ts (11), agent-dispatcher.test.ts (18). Fix unused import in sdk-messages.test.ts. Add WebDriver E2E scaffold README. Total: 104 vitest + 29 cargo tests.	2026-03-06 15:42:34 +01:00

13 commits
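
Commit f88d10888b describes a dual-mode judge: CLI first, API fallback, with the backend selectable via the LLM_JUDGE_BACKEND env var. The following is a minimal sketch of that selection-and-fallback pattern only, not the contents of llm-judge.ts. The function names, the `Backend` type, and the accepted values `"cli"` and `"api"` are assumptions made for illustration; only the env var name comes from the commit message.

```typescript
// Hypothetical sketch; names and accepted values are assumptions, not taken from llm-judge.ts.
type Backend = "cli" | "api";

// Decide which backends to try, in order, from an env-style map.
function resolveBackendOrder(env: Record<string, string | undefined>): Backend[] {
  const choice = (env["LLM_JUDGE_BACKEND"] ?? "").trim().toLowerCase();
  if (choice === "cli") return ["cli"];
  if (choice === "api") return ["api"];
  // Unset or unrecognized value: CLI first, then API as fallback.
  return ["cli", "api"];
}

// Try each backend in order and return the first result that resolves.
async function judgeWithFallback<T>(
  order: Backend[],
  runners: Record<Backend, () => Promise<T>>,
): Promise<T> {
  let lastError: unknown = new Error("no backend configured");
  for (const backend of order) {
    try {
      return await runners[backend]();
    } catch (err) {
      // Remember the failure and fall through to the next backend.
      lastError = err;
    }
  }
  throw lastError;
}
```

A caller would pass `process.env` to `resolveBackendOrder` and supply one runner per backend. Keeping the ordering logic in a pure function means it can be unit-tested without mutating the real environment, which sidesteps the kind of env-var race the same commit fixes on the Rust side.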