# E2E Testing Module

Browser automation tests for Agent Orchestrator using WebDriverIO + tauri-driver.

## Quick Start

```bash
# Preflight check (validates dependencies)
./scripts/preflight-check.sh

# Build debug binary + run E2E
npm run test:all:e2e

# Run E2E only (skip build)
SKIP_BUILD=1 npm run test:e2e

# Headless (CI)
xvfb-run --auto-servernum npm run test:e2e
```

## System Dependencies

| Tool | Required | Install |
|------|----------|---------|
| tauri-driver | Yes | `cargo install tauri-driver` |
| Debug binary | Yes | `cargo tauri build --debug --no-bundle` |
| X11/Wayland | Yes (Linux) | Use `xvfb-run` in CI |
| Claude CLI | Optional | LLM-judged tests skip if absent |
| ANTHROPIC_API_KEY | Optional | Alternative to Claude CLI for LLM judge |
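The two optional rows describe a fallback: LLM-judged tests can use either the Claude CLI or an API key, and are skipped when neither is present. A minimal sketch of that check follows. It is not the code in `infra/llm-judge.ts`; the helper name `detectJudgeBackend`, the CLI binary name `claude`, and the CLI-before-API priority are assumptions made for illustration.

```typescript
import { existsSync } from "node:fs";
import { delimiter, join } from "node:path";

type JudgeBackend = "cli" | "api" | "none";

// Returns true if a file named `name` exists in any directory listed in pathVar.
// This only checks existence, not execute permission, and does not handle
// Windows extensions such as .exe or .cmd.
function isOnPath(name: string, pathVar: string): boolean {
  return pathVar
    .split(delimiter)
    .filter((dir) => dir.length > 0)
    .some((dir) => existsSync(join(dir, name)));
}

// Hypothetical helper: decides which judge backend is usable.
// Assumption: the CLI binary is called "claude" and is preferred over the API key.
function detectJudgeBackend(
  env: Record<string, string | undefined>,
): JudgeBackend {
  if (isOnPath("claude", env.PATH ?? "")) return "cli";
  if (env.ANTHROPIC_API_KEY) return "api";
  return "none";
}
```

A spec could call `detectJudgeBackend(process.env)` once and skip its LLM-judged cases when the result is `"none"`.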

## Directory Structure

```
tests/e2e/
├── wdio.conf.js          # WebDriverIO config + tauri-driver lifecycle
├── tsconfig.json          # TypeScript config for specs
├── README.md              # This file
├── infra/                 # Test infrastructure (not specs)
│   ├── fixtures.ts        # Test fixture generator (isolated temp dirs)
│   ├── llm-judge.ts       # LLM-based assertion engine (Claude CLI / API)
│   ├── results-db.ts      # JSON test results store
│   └── test-mode-constants.ts  # Typed env var names for test mode
└── specs/                 # Test specifications
    ├── agor.test.ts       # Smoke + UI tests (50+ tests)
    ├── agent-scenarios.test.ts  # Phase A: agent interaction (22 tests)
    ├── phase-b.test.ts    # Phase B: multi-project + LLM judge
    └── phase-c.test.ts    # Phase C: hardening features (11 scenarios)
```

## Test Mode Environment Variables

| Variable | Purpose | Read By |
|----------|---------|---------|
| `AGOR_TEST=1` | Enable test isolation | config.rs, misc.rs, lib.rs, watcher.rs, fs_watcher.rs, telemetry.rs, App.svelte |
| `AGOR_TEST_DATA_DIR` | Override data dir | config.rs |
| `AGOR_TEST_CONFIG_DIR` | Override config dir | config.rs |

**Effects when `AGOR_TEST=1`:**
- File watchers disabled (watcher.rs, fs_watcher.rs)
- OTLP telemetry export disabled (telemetry.rs)
- CLI tool installation skipped (lib.rs)
- Wake scheduler disabled (App.svelte)
- Test env vars forwarded to sidecar processes (lib.rs)
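The three variables above can be combined into one isolated environment per test run. The sketch below shows one way to do that. It is not the contents of `infra/fixtures.ts`; the function name `createTestEnv`, the `agor-e2e-` prefix, and the `data` / `config` subdirectory names are assumptions.

```typescript
import { mkdirSync, mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

interface TestEnv {
  AGOR_TEST: "1";
  AGOR_TEST_DATA_DIR: string;
  AGOR_TEST_CONFIG_DIR: string;
}

// Hypothetical helper: creates a fresh temp directory with separate data and
// config subdirectories and returns the matching test-mode variables.
function createTestEnv(): TestEnv {
  // mkdtempSync appends random characters, so each call gets its own root.
  const root = mkdtempSync(join(tmpdir(), "agor-e2e-"));
  const dataDir = join(root, "data");
  const configDir = join(root, "config");
  mkdirSync(dataDir);
  mkdirSync(configDir);
  return {
    AGOR_TEST: "1",
    AGOR_TEST_DATA_DIR: dataDir,
    AGOR_TEST_CONFIG_DIR: configDir,
  };
}
```

The result can be merged into the environment of the launched binary, for example `{ ...process.env, ...createTestEnv() }`, so that a run never touches the real data or config directories.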

## Test Phases

| Phase | File | Tests | Type |
|-------|------|-------|------|
| Smoke | agor.test.ts | 50+ | Deterministic (CSS/DOM assertions) |
| A | agent-scenarios.test.ts | 22 | Deterministic (data-testid selectors) |
| B | phase-b.test.ts | 6+ | LLM-judged (multi-project, agent quality) |
| C | phase-c.test.ts | 11 scenarios | Mixed (deterministic + LLM-judged) |

## Adding a New Spec

1. Create `tests/e2e/specs/my-feature.test.ts`
2. Import `browser` and `expect` from `@wdio/globals`
3. Use `data-testid` selectors (preferred) or CSS classes
4. Add the new file to the `specs` array in `wdio.conf.js`
5. For LLM assertions: `import { assertWithJudge } from '../infra/llm-judge'`
6. Run `./scripts/check-test-flags.sh` if you added new `AGOR_TEST` references
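As an illustration of steps 2 and 3, a small selector helper keeps `data-testid` lookups consistent across specs. This is a sketch only: `byTestId` and the `my-feature-panel` test id are hypothetical and not part of the existing suite. The spec itself is shown as a comment because it only runs under the WebDriverIO runner.

```typescript
// Hypothetical helper: builds a CSS attribute selector for a data-testid value.
function byTestId(id: string): string {
  // Escape backslashes and double quotes so the value stays a valid CSS string.
  const escaped = id.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `[data-testid="${escaped}"]`;
}

// Usage inside tests/e2e/specs/my-feature.test.ts:
//
//   import { browser, expect } from "@wdio/globals";
//
//   describe("my feature", () => {
//     it("renders the panel", async () => {
//       const panel = await browser.$(byTestId("my-feature-panel"));
//       await expect(panel).toBeDisplayed();
//     });
//   });
```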

## CI Workflow

See `.github/workflows/e2e.yml`, which defines 3 jobs:
1. **unit-tests**: vitest frontend
2. **cargo-tests**: Rust backend
3. **e2e-tests**: WebDriverIO (xvfb-run, Phase A+B+C, LLM tests gated on secret)