BPMN-style live-execution smoke runner for aim2be (fresh-tree fork patterns from ~/code/vioxen/qa-rig per rule 53)
hibryda 4e84e5f46e
feat(qa-rig): L0 T0 #6 PR-OPAQUE-6 — Centrifugo opaque-ticket E2E flows (happy path + single-use rejection) (#5)
Two new SCAFFOLD flows under qa-rig discipline: 21-centrifugo-opaque-ticket-happy-path (login → mint → connect → ticket-consumed) + 22-centrifugo-opaque-ticket-single-use-rejection (with Prometheus delta assertion). BPMN sidecars + drift-detector test. 138 vitest cases. Reused adult-1 mock-data persona per rule 55. R-cycle: R1 BLOCKED → R2 CONDITIONAL (4 findings) → R3 CONDITIONAL (3 findings) → R4 CONDITIONAL_APPROVE quorum (1 minor, reviewer-deferred to carve-out PR). All carve-out atomicity requirements tracked via TOML NOTE + TODO followup.
2026-05-25 02:55:37 +02:00

im2be-qa-rig

BPMN-style live-execution smoke runner for the aim2be platform. Driven from im2be-mono; outputs land in runs/<run-id>/ per invocation.

Forked patterns from ~/code/vioxen/qa-rig per rule 53 — fresh tree, no git-clone-rebrand. We borrow the naming, the TOML schema, and the assertion language; every source file in this repo was authored fresh.

Status

  • M1 — TIER 1, shipped 2026-05-13. Python smoke runner. One flow (flows/01-adult-1-home-family-diary.toml) drives the PWA at localhost:9620 with adult-1's mock-data storage state injected, walks through home/family/diary, captures per-step screenshots + console messages, exits 0 if all assertions pass.
  • M2 — TIER 2, shipped 2026-05-13. pytest suite (62 cases across spec / storage-state / harvest / infer / bpmn / scheduler) + Forgejo Actions workflow (.forgejo/workflows/qa-rig-ci.yml) with lint + test + gated live-k3d smoke jobs on the aim2be-rework runner.
  • M3 — TIER 3, shipped 2026-05-13. im2be-qa-rig graph emits BPMN 2.0 XML for any FlowSpec; optional run-report overlay colours tasks pass/fail via bioc namespace. Vite + bpmn-js viewer under web/ on port 9710.
  • M4 — TIER 4, shipped 2026-05-13. im2be-qa-rig harvest <sub> reads <meta-repo>/code-intelligence/<sub>/ and emits a REST/Kafka/WebSocket inventory. im2be-qa-rig infer <sub> drafts a FlowSpec TOML — either via the Claude Agent SDK ([triage] extra) or the deterministic stub mode.
  • M5 — TIER 5, shipped 2026-05-13. im2be-qa-rig schedule runs a config-driven scheduler with two job kinds: flow (subprocess invoking the smoke runner) and sentinel (HTTP probe with status-range assertions). Heartbeat at runs/.heartbeat.json after every cycle. Systemd-user unit at deploy/systemd/im2be-qa-rig.service.
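The sentinel job kind and the per-cycle heartbeat described under M5 can be illustrated roughly as follows. This is a minimal sketch, not the rig's scheduler: the function names and every key in the heartbeat payload are hypothetical, and only the file name runs/.heartbeat.json and the idea of a status-range assertion come from the text above.

```python
import json
import time
from pathlib import Path


def status_in_range(status: int, low: int = 200, high: int = 299) -> bool:
    """Return True when an HTTP status falls inside the inclusive range."""
    return low <= status <= high


def write_heartbeat(runs_dir: Path, cycle: int, results: dict[str, bool]) -> Path:
    """Write a heartbeat file after one scheduler cycle.

    The payload keys are assumptions made for illustration only.
    """
    runs_dir.mkdir(parents=True, exist_ok=True)
    heartbeat = runs_dir / ".heartbeat.json"
    payload = {
        "cycle": cycle,
        "written_at": time.time(),
        "all_ok": all(results.values()),
        "jobs": results,
    }
    heartbeat.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return heartbeat
```

A watchdog could then read the file's written_at value and alert when it is older than the expected cycle interval.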

See ~/code/vioxen/qa-rig for the upstream M1-M26 reference (read-only — DO NOT modify).

Path conventions (mock-data)

FlowSpec TOML files reference the meta-repo's mock-data/ tree via paths like ../../mock-data/pwa/playwright-storage/adult-1.json. These are meta-repo-relative: this rig is always consumed as a submodule of im2be-mono, so <meta-repo>/im2be-qa-rig/flows/<x>.toml resolves ../../mock-data/... to <meta-repo>/mock-data/.... That layout is documented in <meta-repo>/CLAUDE.md and <meta-repo>/.claude/rules/55-mock-data-discipline.md.
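The resolution rule above can be checked with a few lines of Python. This sketch assumes relative paths are resolved against the directory that holds the flow file, which is what the ../../mock-data example implies; the /work/im2be-mono checkout location and the helper name are hypothetical.

```python
import posixpath


def resolve_from_flow(flow_path: str, relative: str) -> str:
    """Resolve a FlowSpec-relative path against the flow file's directory."""
    flow_dir = posixpath.dirname(flow_path)
    return posixpath.normpath(posixpath.join(flow_dir, relative))


if __name__ == "__main__":
    # Hypothetical checkout of the meta-repo under /work/im2be-mono.
    flow = "/work/im2be-mono/im2be-qa-rig/flows/01-adult-1-home-family-diary.toml"
    print(resolve_from_flow(flow, "../../mock-data/pwa/playwright-storage/adult-1.json"))
    # prints /work/im2be-mono/mock-data/pwa/playwright-storage/adult-1.json
```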

A standalone clone of im2be-qa-rig (without the meta-repo around it) will NOT find mock-data/, and the flows will fail at the storage-state load step. This is intentional: qa-rig has no source of truth for mock data outside the meta-repo's deterministic-persona set. To run the rig fully standalone, make the meta-repo's mock-data/ available at ../mock-data relative to this directory (e.g. via symlink), or patch the storage_state paths in the flows to point at wherever you mount it. Do NOT copy mock-data into this repo; its commit history lives under the meta-repo's mock-data/ source.

Prerequisites

  • The aim2be Stage A.3 demo environment up (scripts/stage-a-demo-up.sh from the meta-repo). The first flow's [setup] invokes this automatically; use --skip-setup if you already have the environment running.
  • Python ≥ 3.11.
  • uv ≥ 0.9 (for installation).

Installation

From inside this repo:

uv sync
uv run playwright install chromium   # one-time Playwright browser download

Running the smoke flow

# From the im2be-mono meta-repo root:
cd im2be-qa-rig

# Run the first flow (assumes Stage A.3 demo is up via scripts/stage-a-demo-up.sh)
uv run im2be-qa-rig --spec flows/01-adult-1-home-family-diary.toml --skip-setup

# Or let the runner bring up the demo itself (~1 min):
uv run im2be-qa-rig --spec flows/01-adult-1-home-family-diary.toml

# Visible browser for local debugging:
uv run im2be-qa-rig --spec flows/01-adult-1-home-family-diary.toml --skip-setup --headed

A successful run produces runs/<run-id>/report.json + runs/<run-id>/screenshots/NN-step.png and exits 0. A failed step fails fast (subsequent steps are not run) and the runner exits 1.
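For callers that drive the runner from another script, the invocations above and the 0/1 exit contract could be wrapped like this. It is a sketch under the assumption that uv is on PATH and the working directory is im2be-qa-rig; the helper names are hypothetical, while the flags are the ones shown in the commands above.

```python
import subprocess


def build_run_command(spec: str, skip_setup: bool = False, headed: bool = False) -> list[str]:
    """Assemble the documented runner invocation as an argv list."""
    cmd = ["uv", "run", "im2be-qa-rig", "--spec", spec]
    if skip_setup:
        cmd.append("--skip-setup")
    if headed:
        cmd.append("--headed")
    return cmd


def run_flow(spec: str, skip_setup: bool = False, headed: bool = False) -> bool:
    """Run one flow; True means exit code 0 (all assertions passed)."""
    completed = subprocess.run(build_run_command(spec, skip_setup, headed), check=False)
    return completed.returncode == 0
```

Passing check=False keeps a failed flow from raising, so the caller can decide what a non-zero exit means in its own context.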

report.json schema

| Field | Type | Description |
|---|---|---|
| spec_path | string | Absolute path to the FlowSpec TOML file. |
| run_id | string | 12-char hex run identifier (directory name under runs/). |
| started_at | ISO 8601 string | UTC timestamp when the browser flow began. |
| finished_at | ISO 8601 string | UTC timestamp when the browser flow completed. |
| base_url | string or null | Effective base URL (from FlowSpec or --base-url override). |
| passed | boolean | true iff all steps ran and all assertions passed. |
| trace_id | string | 32-char lowercase hex OpenTelemetry trace ID for the qa_rig.flow span. All-zeros ("00000000000000000000000000000000") when QA_RIG_OTEL_ENABLED is unset (local dev without a collector). Use this to correlate evidence in the runs/ directory with traces in Grafana/Tempo. |
| steps[] | array | Per-step evidence (see below). |
| steps[].name | string | Step name from the FlowSpec. |
| steps[].action | string | Step action (e.g. navigate, expect_text). |
| steps[].passed | boolean | true iff the step assertion succeeded. |
| steps[].duration_ms | integer | Wall-clock step duration in milliseconds. |
| steps[].error | string or null | Error message when passed=false, otherwise null. |
| steps[].screenshot_path | string or null | Relative path to the step screenshot under runs/<run-id>/screenshots/. |
| steps[].console_slice | array | Browser console messages emitted during this step. |
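A report with this shape is straightforward to post-process. The sketch below uses only the fields listed in the schema; the summary keys and function names are hypothetical choices for illustration.

```python
import json

ZERO_TRACE_ID = "0" * 32  # value documented for runs without OpenTelemetry


def load_report(path: str) -> dict:
    """Load runs/<run-id>/report.json from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_report(report: dict) -> dict:
    """Condense a report into the fields most useful for triage."""
    steps = report.get("steps", [])
    failed = [step for step in steps if not step["passed"]]
    first = failed[0] if failed else None
    return {
        "run_id": report["run_id"],
        "passed": report["passed"],
        "traced": report["trace_id"] != ZERO_TRACE_ID,
        "failed_step": first["name"] if first else None,
        "error": first["error"] if first else None,
        "total_ms": sum(step["duration_ms"] for step in steps),
    }
```

Because the runner fails fast, at most one step should be marked as failed, so reporting only the first failure loses nothing.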

FlowSpec schema (TIER 1)

[meta]
name = "human-readable name"
type = "browser"
base_url = "http://localhost:9620"

[setup]                                       # optional
shell = "./scripts/stage-a-demo-up.sh --skip-yarn"

[storage_state]                                # optional
path = "../../mock-data/pwa/playwright-storage/adult-1.json"
inject_auth_storage = true                    # default — adds the Zustand auth-storage key

[[steps]]
name = "step-name"
action = "navigate"                           # one of: navigate | fill | click | wait_for | expect_text | expect_url | screenshot | expect_no_errors
# ... action-specific fields

See src/im2be_qa_rig/types.py for the canonical action list + per-action fields.

Active flows

Per-flow status. EXECUTABLE flows run end-to-end on the local k3d cluster today. SCAFFOLD flows lock the BPMN, step anchors, and acceptance criteria ahead of a later carve-out that wires in the real assertions (rule 57, BPMN-first; the http/grpc/metric step kinds land in a follow-up tier). Each scaffold records its assertion contract in # TODO (real-execution): comments.

| File | Status | Coverage |
|---|---|---|
| flows/01-adult-1-home-family-diary.toml | EXECUTABLE | M1 smoke: adult-1 home → family → diary. |
| flows/02-… through flows/20-… | SCAFFOLD | L-1 happy-path scaffolds (OAuth, tasks, subscriptions, push). |
| flows/21-centrifugo-opaque-ticket-happy-path.toml | SCAFFOLD | L0 T0 #6 PR-OPAQUE-6: login → mint → connect → validate happy path against realtime-service /centrifugo/connect. |
| flows/22-centrifugo-opaque-ticket-single-use-rejection.toml | SCAFFOLD | L0 T0 #6 PR-OPAQUE-6: single-use semantic; second connect with same ticket → 401 + Disconnect{4401}; realtime_centrifugo_connect_total{outcome="invalid"} +1. |
| flows/30-… through flows/33-… | SCAFFOLD | L-1 failure-mode flows (SPIRE down, Centrifugo mid-session, Kafka lag, identity 503). |
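The "+1" assertion on realtime_centrifugo_connect_total{outcome="invalid"} in flow 22 amounts to scraping the counter before and after the second connect and comparing the values. The sketch below shows one way to compute that delta from Prometheus text-format scrapes. It is not the rig's metric step kind (which the text says lands in a later tier), the function names are hypothetical, and the label parsing is deliberately naive: it assumes label values contain no commas or braces.

```python
def read_counter(exposition: str, metric: str, labels: dict[str, str]) -> float:
    """Sum the samples of `metric` whose labels include every wanted pair."""
    wanted = {f'{key}="{value}"' for key, value in labels.items()}
    total = 0.0
    for raw in exposition.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        if "{" in line:
            name, _, tail = line.partition("{")
            label_part, _, value_part = tail.rpartition("}")
        else:
            name, _, value_part = line.partition(" ")
            label_part = ""
        present = {pair.strip() for pair in label_part.split(",") if pair.strip()}
        if name == metric and wanted <= present:
            total += float(value_part.split()[0])  # ignore optional timestamp
    return total


def counter_delta(before: str, after: str, metric: str, labels: dict[str, str]) -> float:
    """Difference between two scrapes of the same counter."""
    return read_counter(after, metric, labels) - read_counter(before, metric, labels)
```

Asserting on a delta rather than an absolute value keeps the check independent of whatever traffic the cluster saw before the flow started.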

The committed .bpmn sidecars (M3 emitter output) are regenerated by im2be-qa-rig graph --spec flows/<name>.toml --output flows/<name>.bpmn and verified against the TOML source-of-truth in tests/test_opaque_ticket_flows.py (rule 57.1 — BPMN-first).

Rules

  • Live execution only. No mocking at the QA-rig layer. The PWA's own MSW handlers (Phase 3) cover the external SaaS surface; everything else hits real cluster services via kubectl port-forward. (Rule 57)
  • Allow-list of targets. Only allow-listed base_url values are accepted; production targets are explicitly rejected in code. (Rule 57)
  • No flake tolerance. Intermittent failures are bugs against the application or the FlowSpec, never the rig.
  • Reproducible. Same mock dataset + same flow + same commits = same evidence. The runner pins Playwright's chromium revision via lockfile.
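The allow-list rule can be pictured as a guard that runs before any browser is launched. The sketch below is an illustration only: the rig's real allow-list is not shown in this README, so ALLOWED_HOSTS and the function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; the rig's actual list lives in its source code.
ALLOWED_HOSTS = frozenset({"localhost", "127.0.0.1"})


def assert_allowed_target(base_url: str) -> str:
    """Return base_url unchanged, or raise if its host is not allow-listed."""
    parts = urlsplit(base_url)
    if parts.scheme not in {"http", "https"}:
        raise ValueError(f"unsupported scheme in base_url: {base_url!r}")
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"base_url host is not allow-listed: {parts.hostname!r}")
    return base_url
```

Checking the parsed hostname rather than a string prefix avoids being fooled by URLs such as http://localhost.example.com.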

Layout

im2be-qa-rig/
├── pyproject.toml
├── src/im2be_qa_rig/
│   ├── __init__.py
│   ├── cli.py               # `im2be-qa-rig` entry point
│   ├── browser.py           # Playwright sync-API flow runner
│   ├── spec.py              # TOML FlowSpec parser + validator
│   └── types.py             # StepResult / FlowResult dataclasses + VALID_ACTIONS
├── flows/
│   └── 01-adult-1-home-family-diary.toml
├── runs/                     # per-run evidence (gitignored)
└── docs/                     # TIER-1+ design notes

Subcommands

| Command | Tier | Purpose |
|---|---|---|
| im2be-qa-rig run --spec <toml> | M1 | Execute a FlowSpec (Playwright). Default if no subcommand. |
| im2be-qa-rig graph --spec <toml> [--run <report.json>] [--output <path>] | M3 | Emit BPMN 2.0 XML; optional pass/fail colour overlay from a run report. |
| im2be-qa-rig harvest <sub> | M4 | Read <meta-repo>/code-intelligence/<sub>/ and emit harvested/<sub>/inventory.json. |
| im2be-qa-rig infer <sub> [--stub] [--out <toml>] | M4 | Draft a FlowSpec TOML from a harvested inventory. --stub skips the Claude Agent SDK call. |
| im2be-qa-rig schedule --config schedule.toml [--max-cycles N] | M5 | Run the scheduler loop. Writes runs/.heartbeat.json after every cycle. |

See also

  • ~/code/vioxen/qa-rig/README.md — full M1-M26 reference (read-only).
  • <im2be-mono>/docs/local-deploy/04-stage-a3-demo-runbook.md — how to bring up the aim2be demo this rig drives.
  • <im2be-mono>/.claude/rules/57-bpmn-qa-rig.md — the discipline this rig enforces.
  • <im2be-mono>/.claude/rules/53-borrowed-reference-repos.md — the no-touch-upstream rule.
  • web/README.md — bpmn-js viewer dev + build instructions.
  • deploy/systemd/im2be-qa-rig.service — systemd-user unit for the laptop scheduler deployment.