- Move fixtures.ts, llm-judge.ts, results-db.ts to tests/e2e/infra/
- Deduplicate wdio.conf.js: use createTestFixture() instead of inline copy
- Replace __dirname paths with projectRoot-anchored paths
- Create test-mode-constants.ts (typed env var names, flag registry)
- Create scripts/preflight-check.sh (validates tauri-driver, display, Claude CLI)
- Create scripts/check-test-flags.sh (CI lint for AGOR_TEST flag drift)
- Rewrite tests/e2e/README.md with full documentation
- Update spec imports for moved infra files

# E2E Testing Module

Browser automation tests for Agent Orchestrator using WebDriverIO + tauri-driver.

## Quick Start

```bash
# Preflight check (validates dependencies)
./scripts/preflight-check.sh

# Build debug binary + run E2E
npm run test:all:e2e

# Run E2E only (skip build)
SKIP_BUILD=1 npm run test:e2e

# Headless (CI)
xvfb-run --auto-servernum npm run test:e2e
```

## System Dependencies

| Tool | Required | Install |
|------|----------|---------|
| tauri-driver | Yes | `cargo install tauri-driver` |
| Debug binary | Yes | `cargo tauri build --debug --no-bundle` |
| X11/Wayland | Yes (Linux) | Use `xvfb-run` in CI |
| Claude CLI | Optional | LLM-judged tests skip if absent |
| ANTHROPIC_API_KEY | Optional | Alternative to Claude CLI for LLM judge |

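The two optional rows above imply a three-way decision for LLM-judged tests: use the Claude CLI, fall back to the API key, or skip. A minimal sketch of that decision follows. The function name, the types, and the CLI-before-API priority are assumptions for illustration; the actual logic in `infra/llm-judge.ts` may differ.

```typescript
// Hypothetical helper (not taken from infra/llm-judge.ts): decide how
// LLM-judged tests should run given what is available on the machine.
type JudgeBackend = "claude-cli" | "anthropic-api" | "skip";

interface JudgeInputs {
  hasClaudeCli: boolean;      // e.g. the result of probing PATH for the CLI
  anthropicApiKey?: string;   // value of ANTHROPIC_API_KEY, if set
}

function selectJudgeBackend(inputs: JudgeInputs): JudgeBackend {
  // Assumption: the CLI is preferred when both options are present.
  if (inputs.hasClaudeCli) return "claude-cli";
  // An empty or whitespace-only key is treated as "not set".
  if (inputs.anthropicApiKey && inputs.anthropicApiKey.trim() !== "") {
    return "anthropic-api";
  }
  // Neither is available: LLM-judged tests are skipped, not failed.
  return "skip";
}
```

Returning `"skip"` rather than throwing mirrors the table: both tools are optional, so their absence should not fail the deterministic suites.
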
## Directory Structure

```
tests/e2e/
├── wdio.conf.js                  # WebDriverIO config + tauri-driver lifecycle
├── tsconfig.json                 # TypeScript config for specs
├── README.md                     # This file
├── infra/                        # Test infrastructure (not specs)
│   ├── fixtures.ts               # Test fixture generator (isolated temp dirs)
│   ├── llm-judge.ts              # LLM-based assertion engine (Claude CLI / API)
│   ├── results-db.ts             # JSON test results store
│   └── test-mode-constants.ts    # Typed env var names for test mode
└── specs/                        # Test specifications
    ├── agor.test.ts              # Smoke + UI tests (50+ tests)
    ├── agent-scenarios.test.ts   # Phase A: agent interaction (22 tests)
    ├── phase-b.test.ts           # Phase B: multi-project + LLM judge
    └── phase-c.test.ts           # Phase C: hardening features (11 scenarios)
```

## Test Mode Environment Variables

| Variable | Purpose | Read By |
|----------|---------|---------|
| `AGOR_TEST=1` | Enable test isolation | config.rs, misc.rs, lib.rs, watcher.rs, fs_watcher.rs, telemetry.rs, App.svelte |
| `AGOR_TEST_DATA_DIR` | Override data dir | config.rs |
| `AGOR_TEST_CONFIG_DIR` | Override config dir | config.rs |

**Effects when `AGOR_TEST=1`:**
- File watchers disabled (watcher.rs, fs_watcher.rs)
- OTLP telemetry export disabled (telemetry.rs)
- CLI tool installation skipped (lib.rs)
- Wake scheduler disabled (App.svelte)
- Test env vars forwarded to sidecar processes (lib.rs)

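To make the three variables concrete, here is a small sketch of typing them and assembling the environment for a test run. Only the variable names come from the table above. The type names, `buildTestEnv`, `isTestMode`, and the strict `=== "1"` check are illustrative assumptions, not the contents of `infra/test-mode-constants.ts`.

```typescript
// Variable names are from the table above; everything else is hypothetical.
type TestEnvVar = "AGOR_TEST" | "AGOR_TEST_DATA_DIR" | "AGOR_TEST_CONFIG_DIR";
type TestEnv = Record<TestEnvVar, string>;

// Build the env block to pass to the app under test. The caller supplies
// isolated directories (for example, freshly created temp dirs).
function buildTestEnv(dataDir: string, configDir: string): TestEnv {
  return {
    AGOR_TEST: "1",
    AGOR_TEST_DATA_DIR: dataDir,
    AGOR_TEST_CONFIG_DIR: configDir,
  };
}

// Assumption: test mode means AGOR_TEST is exactly "1". The real readers
// (config.rs and others) may accept other values.
function isTestMode(env: Record<string, string | undefined>): boolean {
  return env["AGOR_TEST"] === "1";
}
```

Typing the names as a union means a misspelled variable fails at compile time instead of silently leaving test isolation off.
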
## Test Phases

| Phase | File | Tests | Type |
|-------|------|-------|------|
| Smoke | agor.test.ts | 50+ | Deterministic (CSS/DOM assertions) |
| A | agent-scenarios.test.ts | 22 | Deterministic (data-testid selectors) |
| B | phase-b.test.ts | 6+ | LLM-judged (multi-project, agent quality) |
| C | phase-c.test.ts | 11 scenarios | Mixed (deterministic + LLM-judged) |

## Adding a New Spec

1. Create `tests/e2e/specs/my-feature.test.ts`
2. Import from `@wdio/globals` for `browser` and `expect`
3. Use `data-testid` selectors (preferred) or CSS classes
4. Add to `wdio.conf.js` specs array
5. For LLM assertions: `import { assertWithJudge } from '../infra/llm-judge'`
6. Run `./scripts/check-test-flags.sh` if you added new AGOR_TEST references

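Step 3 above can be kept consistent with a small selector helper. The sketch below is an assumption, not part of the existing infra: `byTestId` and the `sidebar` test id are hypothetical names. The commented usage shows how it might appear in a spec run by WebDriverIO.

```typescript
// Hypothetical helper: build a CSS attribute selector for a data-testid.
function byTestId(id: string): string {
  // Escape backslashes and double quotes so the id stays valid inside
  // the quoted attribute value.
  const escaped = id.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `[data-testid="${escaped}"]`;
}

// Usage sketch inside tests/e2e/specs/my-feature.test.ts (requires the
// WebDriverIO runner, so it is shown as comments here):
//
//   import { $, expect } from "@wdio/globals";
//
//   describe("my feature", () => {
//     it("shows the sidebar", async () => {
//       await expect($(byTestId("sidebar"))).toBeDisplayed();
//     });
//   });
```

Building every selector in one function keeps specs from depending on CSS class names, which tend to change more often than test ids.
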
## CI Workflow

See `.github/workflows/e2e.yml`, which defines 3 jobs:

1. **unit-tests**: vitest frontend
2. **cargo-tests**: Rust backend
3. **e2e-tests**: WebDriverIO (xvfb-run, Phase A+B+C, LLM tests gated on secret)