E2E Testing Module

Browser automation tests for Agent Orchestrator using WebDriverIO + tauri-driver.

Quick Start

# Preflight check (validates dependencies)
./scripts/preflight-check.sh

# Build debug binary + run E2E
npm run test:all:e2e

# Run E2E only (skip build)
SKIP_BUILD=1 npm run test:e2e

# Headless (CI)
xvfb-run --auto-servernum npm run test:e2e

System Dependencies

| Tool | Required | Install / Notes |
|------|----------|-----------------|
| tauri-driver | Yes | cargo install tauri-driver |
| Debug binary | Yes | cargo tauri build --debug --no-bundle |
| X11/Wayland | Yes (Linux) | Use xvfb-run in CI |
| Claude CLI | Optional | LLM-judged tests skip if absent |
| ANTHROPIC_API_KEY | Optional | Alternative to Claude CLI for LLM judge |
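
The two optional entries imply a fallback order for the LLM judge. The following is a minimal sketch of that selection, not the actual contents of infra/llm-judge.ts. The names `JudgeBackend`, `JudgeEnvironment`, and `selectJudgeBackend` are hypothetical, and preferring the CLI over the API key is an assumption (this README only calls the key an alternative).

```typescript
// Hypothetical sketch: decide which LLM judge backend to use, if any.
type JudgeBackend = "cli" | "api" | "skip";

interface JudgeEnvironment {
  /** True when the Claude CLI was detected (detection itself is not shown). */
  cliAvailable: boolean;
  /** Value of ANTHROPIC_API_KEY, if it is set. */
  apiKey?: string;
}

function selectJudgeBackend(env: JudgeEnvironment): JudgeBackend {
  // Assumption: the CLI wins when both backends are available.
  if (env.cliAvailable) return "cli";
  // A blank key is treated the same as a missing one.
  if (env.apiKey !== undefined && env.apiKey.trim() !== "") return "api";
  // Neither backend is usable: LLM-judged tests should skip rather than fail.
  return "skip";
}
```

A spec could call this once at startup and skip its LLM-judged cases when the result is "skip".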

Directory Structure

tests/e2e/
├── wdio.conf.js          # WebDriverIO config + tauri-driver lifecycle
├── tsconfig.json          # TypeScript config for specs
├── README.md              # This file
├── infra/                 # Test infrastructure (not specs)
│   ├── fixtures.ts        # Test fixture generator (isolated temp dirs)
│   ├── llm-judge.ts       # LLM-based assertion engine (Claude CLI / API)
│   ├── results-db.ts      # JSON test results store
│   └── test-mode-constants.ts  # Typed env var names for test mode
└── specs/                 # Test specifications
    ├── agor.test.ts       # Smoke + UI tests (50+ tests)
    ├── agent-scenarios.test.ts  # Phase A: agent interaction (22 tests)
    ├── phase-b.test.ts    # Phase B: multi-project + LLM judge
    └── phase-c.test.ts    # Phase C: hardening features (11 scenarios)

Test Mode Environment Variables

| Variable | Purpose | Read By |
|----------|---------|---------|
| AGOR_TEST=1 | Enable test isolation | config.rs, misc.rs, lib.rs, watcher.rs, fs_watcher.rs, telemetry.rs, App.svelte |
| AGOR_TEST_DATA_DIR | Override data dir | config.rs |
| AGOR_TEST_CONFIG_DIR | Override config dir | config.rs |

Effects when AGOR_TEST=1:

  • File watchers disabled (watcher.rs, fs_watcher.rs)
  • OTLP telemetry export disabled (telemetry.rs)
  • CLI tool installation skipped (lib.rs)
  • Wake scheduler disabled (App.svelte)
  • Test env vars forwarded to sidecar processes (lib.rs)
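
To illustrate how the three variables fit together, here is a minimal sketch of typed variable names plus a helper that builds the environment for an isolated run. It is an assumption about shape only: `TEST_ENV`, `TestEnvName`, and `buildTestEnv` are hypothetical names and may not match what infra/test-mode-constants.ts or infra/fixtures.ts actually export.

```typescript
// Hypothetical sketch: typed names for the test-mode environment variables.
const TEST_ENV = {
  enabled: "AGOR_TEST",
  dataDir: "AGOR_TEST_DATA_DIR",
  configDir: "AGOR_TEST_CONFIG_DIR",
} as const;

// Union of the three variable names, derived from the constant above.
type TestEnvName = (typeof TEST_ENV)[keyof typeof TEST_ENV];

// Build the env entries for one isolated run. The caller supplies the
// temp directories; creating them is out of scope for this sketch.
function buildTestEnv(dataDir: string, configDir: string): Record<TestEnvName, string> {
  return {
    [TEST_ENV.enabled]: "1",
    [TEST_ENV.dataDir]: dataDir,
    [TEST_ENV.configDir]: configDir,
  };
}
```

The result can be spread over the parent environment when launching the app under test, so a typo in a variable name becomes a compile error instead of a silently ignored flag.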

Test Phases

| Phase | File | Tests | Type |
|-------|------|-------|------|
| Smoke | agor.test.ts | 50+ | Deterministic (CSS/DOM assertions) |
| A | agent-scenarios.test.ts | 22 | Deterministic (data-testid selectors) |
| B | phase-b.test.ts | 6+ | LLM-judged (multi-project, agent quality) |
| C | phase-c.test.ts | 11 scenarios | Mixed (deterministic + LLM-judged) |

Adding a New Spec

  1. Create tests/e2e/specs/my-feature.test.ts.
  2. Import browser and expect from @wdio/globals.
  3. Select elements by data-testid (preferred) or by CSS class.
  4. Add the new file to the specs array in wdio.conf.js.
  5. For LLM assertions: import { assertWithJudge } from '../infra/llm-judge'
  6. Run ./scripts/check-test-flags.sh if you added new AGOR_TEST references.
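
For the selector step, a small helper keeps data-testid lookups consistent across specs. This is a sketch under assumptions: `byTestId` is a hypothetical name and is not known to exist in the infra/ module.

```typescript
// Hypothetical helper: build a CSS attribute selector for a data-testid value.
function byTestId(id: string): string {
  // Escape backslashes and double quotes so the id cannot terminate
  // the quoted attribute value early.
  const escaped = id.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `[data-testid="${escaped}"]`;
}
```

In a spec, the returned string would be passed to WebdriverIO's `$`, for example `await expect($(byTestId('status-bar'))).toBeDisplayed()`, where `status-bar` is an illustrative id rather than one taken from the app.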

CI Workflow

See .github/workflows/e2e.yml, which defines 3 jobs:

  1. unit-tests: Vitest tests for the frontend
  2. cargo-tests: Rust backend tests
  3. e2e-tests: WebDriverIO under xvfb-run (Phases A, B, and C; LLM-judged tests gated on a secret)