feat(rmp): full status-page reportability for distributed-quorum reviews #9
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "feat/rmp-distributed-leg-observability"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Brings the distributed production mode to full status-page reportability parity with the in-process quorum. Operating distributed previously showed only the MAIN active-job state — the progress-line frontier, ETA, per-leg Quorum column, and token/cost ribbon were all blind.
Gaps closed (
src/rmp/worker-bridge.ts+src/engine-telemetry.ts)distributing/arbitrating) isn't in the servingPHASE_ORDER, so the ledger couldn't position it (no frontier) andPHASE_ORDER.includes→ null ETA. The bridge sink now folds module phase keys onto the serving taxonomy (distributing/arbitrating→sdk-streaming,parsingpasses). Arbitration stays surfaced in the Quorum column (PHASE_ORDER has no slot for a distinct arbitrating phase — same as in-process).makePerLegSinkderives per-leg state from the leg-tagged stream and writes the SAMEwriteQuorumLegState/writeQuorumArbiterStatehash the in-process quorum uses. Explicitthink/streamhandling matches the in-processSessionSupervisor.distributed-quorum.tsemits a leg-done marker on each leg's terminal capture.observeSdkMessageForTelemetrytap never sees them. NewrecordRmpTokenUsage/recordRmpRateLimitfold cross-process usage (from the parsed{t:"telemetry"}events) into the same ribbon counters. No double-count: the arbiter doesn't tap the ribbon directly and no in-process reviewer runs in distributed mode.SessionState/ArbiterState(guarded arbiter parse) → no writer cast. Closes both prior review findings (thinkfall-through + the unchecked casts).Tests
+5 (phase-mapping ⊂ PHASE_ORDER, ribbon-feed delta, the two new engine-telemetry recorders); full suite 2148; typecheck clean. This PR is itself reviewed via the live distributed path.
Show previous round
hib-pr-reviewer review — PR #9 (hib/hib-pr-reviewer)
Round 1 — head
18ffb5e73b4f, basemain, triggeropenedTL;DR: CONDITIONAL_APPROVE — verified and kept unique-to-B minor (think→streaming divergence from LivenessWatchdog) and unique-to-A info (string-typed callbacks with as-casts); no blocking issues.
Summary
Both findings are unique-to-one-reviewer but verified against the actual source. A (info, kept):
worker-bridge.tslines 148/150 confirmed —onLegStateandonArbiterStatecallbacks are typed asstring;worker.tslines 3125/3136 confirmed — both cast withas SessionState/as ArbiterState. No runtime bug today but zero compile-time protection against future state-string divergence. B (minor, kept):worker-bridge.tslines 177–184 confirmed — the bareelsebranch catcheskind:"think"events and forcesstate = "streaming"/tool = null, diverging fromLivenessWatchdogatliveness-watchdog.tsline 185 which explicitly documentskind:"think" → progress only, no state change. If a leg emits athinkevent while intool-executing(possible with interleaved assistant text), the status-page Quorum column flashesstreamingand drops the tool name mid-execution. Two verified, independent findings kept; no findings dropped.Blast Radius
Changes are behind the
RMP_DISTRIBUTEDflag, limited toworker-bridge.ts(newmakePerLegSinkwrapper + extendedDistributedReviewDriverCtx) and a single call site inworker.ts. The distributed quorum path is additive; the in-process quorum and default flag-off path are entirely unaffected.BLAST_SCORE: 3/10
Risk Indicators
writeQuorumLegState,writeQuorumArbiterStateCI status (head
18ffb5e73b4f)Overall: ✓ success
3 checks: 3 pending
Findings (2)
[MINOR]
kind: "think"activity falls through tostreamingbranch, diverging from LivenessWatchdog semanticssrc/rmp/worker-bridge.ts:181
In
makePerLegSinkthe inner conditional forevent.t === "activity"is:LivenessWatchdogatliveness-watchdog.tsline 185 explicitly documents// kind:"think" → progress only, no state change, and itsobserveInnerswitch leavesthis.stateandthis.activeTooluntouched forthink. The currentelsebranch silently maps athinkevent tostreamingand clearstool. If a leg emitsthinkwhile intool-executing(e.g. interleaved assistant-text before thetool_resultarrives), the status-page Quorum column flashesstreamingand loses the tool name mid-execution.Fix: replace the bare
elsewith an explicitelse if (event.kind === "stream")and add athinkno-op comment matching the watchdog:[INFO]
onLegState/onArbiterStatetyped asstring— uncheckedascasts at call sitesrc/rmp/worker-bridge.ts:148
The two optional callbacks in
DistributedReviewDriverCtxare typed(leg, state: string, ...) => void(lines 148–150). At the call site inworker.ts(lines 3125 and 3136) the strings are re-widened withas SessionState/as ArbiterState, which TypeScript treats as a no-op assertion — it won't catch a future state not in those unions.All current values produced by the state machine (
"streaming","tool-executing","delivered","died") are validSessionStatemembers, and the arbiter detail originates from typedArbiterStatevalues, so there is no runtime bug today. But theascasts provide zero compile-time safety.If the
worker-bridge.ts↔worker.tsdecoupling is intentional, a narrow string-literal union("streaming" | "tool-executing" | "delivered" | "died")on the callback would give TypeScript something to check without importing fromsession-supervisor.ts. If decoupling is not a concern, importing and usingSessionState/ArbiterStatedirectly would eliminate the casts entirely.Verdict
CONDITIONAL_APPROVE
hib-pr-reviewer • round 1 • 2 findings (1m/1i) • 2026-06-03T08:28:23.028Z → 2026-06-03T08:35:04.288Z • posted-as: pr-reviewer-bot • model: auto
Operating the distributed production mode showed only the MAIN active-job state; the progress-line frontier, ETA, per-leg Quorum column, and token/cost ribbon were all blind. Brings distributed parity with the in-process quorum. - Progress line + ETA: the distributed module's phase vocabulary (distributing/arbitrating) isn't in the serving PHASE_ORDER, so the ledger couldn't position it (no frontier) and PHASE_ORDER.includes → null ETA. The bridge sink folds module phase keys onto the serving taxonomy (distributing/arbitrating → sdk-streaming, parsing passes) so both work as the single/in-process path. Arbitration stays in the Quorum column. - Token/cost ribbon: the legs (+ arbiter) stream in separate containers, so the engine's in-process observeSdkMessageForTelemetry tap never sees them. New recordRmpTokenUsage / recordRmpRateLimit (twins of the SDK-message recorders, fed from the already-parsed {t:"telemetry"} RMP events) fold cross-process usage into the same ribbon counters — no double-count (the arbiter doesn't tap the ribbon directly; no in-process reviewer runs in distributed mode). - Per-leg column: explicit think/stream handling (matches the in-process SessionSupervisor); callbacks typed SessionState/ArbiterState (guarded arbiter parse) so the writer needs no cast — closes the two PR-9 review findings. +5 tests; full suite 2148; typecheck clean.feat(rmp): distributed quorum drives the per-leg status-page Quorum columnto feat(rmp): full status-page reportability for distributed-quorum reviewsShow previous round
hib-pr-reviewer review — PR #9 (hib/hib-pr-reviewer)
Round 2 — head
b166ebfd7148, basemain, triggersynchronizeTL;DR: CONDITIONAL_APPROVE — kept 1 agreed finding (JSDoc orphan), verified and kept 2 unique findings (missing awaiting-rate-limit-reset transition, now-injection bypass); no blocking issues.
Summary
Arbitration — Round 2
Agreed finding (both A and B): The JSDoc block at lines 181–188 is displaced before
ARBITER_STATESandDEFAULT_LEG_STATE, leavingmakePerLegSinkat line 203 with no attached doc. Verified by reading lines 181–207 directly: the multi-line JSDoc opens at 181, a separate single-line JSDoc at 189–190 precedesARBITER_STATES, andmakePerLegSinkat 203 follows a blank line with no JSDoc. Kept as minor.Unique to A (line 216/229–244): The leg-state machine handles only
activityandroundevents; atelemetryevent withrateLimit.throttled:truefor a known leg bumps the ribbon counter but never transitionslegState[leg].statetoawaiting-rate-limit-reset. Verified by reading lines 229–244 (no telemetry branch present) and cross-checkingLivenessWatchdoglines 154–159 (which does apply the transition) andSessionStateat session-supervisor.ts lines 47–53 (which includesawaiting-rate-limit-reset). The gap is real and produces a display inconsistency under throttle. Kept as minor.Unique to B (line 222):
recordRmpRateLimitis called withnew Date().toISOString()directly whileDistributedReviewDriverCtxcarries anow?: () => stringinjection hook. Verified by reading lines 178 (ctx.now) and 222 (hardcodednew Date()).makePerLegSinkreceives nonowparameter, so the rate-limit timestamp stored inlastRateLimitAtis always wall-clock in tests. Low practical impact (no test currently asserts this timestamp), but it breaks the existing injection pattern established elsewhere in the same file. Kept as info.Result: 1 agreed finding kept, 2 unique findings verified and kept. Total: 3 findings (2 minor, 1 info).
Blast Radius
The diff touches
worker-bridge.ts(the coordination layer for the distributed-quorum production path),engine-telemetry.ts(shared ribbon counters read by the status page), anddistributed-quorum.ts(the leg orchestrator). The distributed path is flag-gated (RMP_DISTRIBUTED=1) so blast radius is bounded to that mode; the in-process quorum is unchanged. Changes export new functions fromengine-telemetry.tsthat any module could call.BLAST_SCORE: 5/10
Risk Indicators
recordRmpRateLimit,recordRmpTokenUsage,runDistributedReviewViaDriverCI status (head
b166ebfd7148)Overall: ✓ success
3 checks: 3 pending
Related PRs
Findings (3)
[MINOR] JSDoc for
makePerLegSinkis orphaned — attaches toARBITER_STATESinstead of the functionsrc/rmp/worker-bridge.ts:181
Lines 181–188 contain a
/** … */block describingmakePerLegSink, but two declarations are interleaved before the function:ARBITER_STATES(line 191, with its own JSDoc at 189–190) andDEFAULT_LEG_STATE(line 198). TypeScript tooling and TypeDoc attach a doc-comment to the immediately following declaration, so the block at 181–188 is orphaned onARBITER_STATESwhilemakePerLegSinkat line 203 has no attached documentation (it is preceded only by a blank line). Editor hover-docs and generated API docs for the function are lost.Fix: move the JSDoc block to immediately above
function makePerLegSink(line 203), afterDEFAULT_LEG_STATE. The comment at line 189 already correctly documentsARBITER_STATESand can stay.[MINOR]
makePerLegSinkleg-state machine has no telemetry branch — a throttled distributed leg never transitions toawaiting-rate-limit-resetsrc/rmp/worker-bridge.ts:229
The state machine at lines 229–244 handles
event.t === "activity"(→tool-executingorstreaming) andevent.t === "round"(→delivered/died), but has notelemetrycase. When a distributed leg emits{t:"telemetry", rateLimit:{throttled:true}, leg:"a"}, the ribbon counter is correctly bumped at line 222, butlegState[leg].stateremains whatever it was (typically"streaming"), so the Quorum column for that leg never showsawaiting-rate-limit-reset.The in-process path via
LivenessWatchdog(liveness-watchdog.ts lines 154–159) does make this transition, andSessionState(session-supervisor.ts lines 47–53) explicitly includes"awaiting-rate-limit-reset". During a throttled distributed review the two paths diverge visually.Suggested fix — add a telemetry branch inside the
leg === "a" || leg === "b"block:This mirrors the LivenessWatchdog's telemetry→rate-limit handling and closes the display inconsistency.
[INFO]
recordRmpRateLimitcall usesnew Date()directly, bypassing thenowinjection hook onDistributedReviewDriverCtxsrc/rmp/worker-bridge.ts:222
Line 222:
recordRmpRateLimit(event.rateLimit, new Date().toISOString()).DistributedReviewDriverCtxcarriesnow?: () => stringfor test-clock injection (line 178), used byrunDistributedQuorumandrunReviewViaProducer, butmakePerLegSinkreceives nonowparameter so the rate-limit timestamp stored inlastRateLimitAtis always wall-clock — even in tests.Practical impact is low today (no test currently asserts the rate-limit timestamp from this path), but it is an inconsistency with the established injection pattern in the same file.
Consider threading
nowthroughmakePerLegSink:And pass
ctx.now ?? (() => new Date().toISOString())at the call site inrunDistributedReviewViaDriver.Verdict
CONDITIONAL_APPROVE
hib-pr-reviewer • round 2 • 3 findings (2m/1i) • 2026-06-03T08:48:06.212Z → 2026-06-03T08:53:51.957Z • posted-as: pr-reviewer-bot • model: auto
- Reorder so the makePerLegSink JSDoc attaches to the function, not the ARBITER_STATES/DEFAULT_LEG_STATE consts (orphaned by the prior edit). - Add the missing throttle branch to the per-leg state machine: a throttled {t:"telemetry"} (rate_limit_event, non-allowed) → awaiting-rate-limit-reset, mirroring the in-process SessionSupervisor; the next activity recovers it. Without it a rate-limited distributed leg silently showed "streaming". - Thread an injectable `now` into makePerLegSink for the ribbon's rate-limit timestamp instead of calling new Date() directly. +1 test (throttle → awaiting-rate-limit-reset → recover); full suite 2149.Show previous round
hib-pr-reviewer review — PR #9 (hib/hib-pr-reviewer)
Round 3 — head
7bc1fee08c02, basemain, triggersynchronizeTL;DR: CONDITIONAL_APPROVE — kept 1 agreed minor finding (missing
nowMsinrecordRmpTokenUsagecall) and 1 verified unique-to-B info finding (missing rolling-window prune test); no blocking issues.Summary
Arbitration — Round 3
Agreed finding (A ∩ B): Both reviewers independently flag
recordRmpTokenUsageatsrc/rmp/worker-bridge.ts:220for omittingnowMs, leaving the rolling-window ring pinned to real wall-clock time even though the injectednowhook is in scope one line later. Verified: lines 220-224 show no second argument; line 212 declaresnow: () => string; line 225 already forwardsnow()torecordRmpRateLimit. Finding confirmed — kept at minor severity.Unique-to-B finding (info):
tests/engine-telemetry.test.ts:63-80covers cumulative accumulation forrecordRmpTokenUsagebut not the rolling-window prune path. Verified: the existing prune test at line 141 is exclusive torecordSdkResultUsage; no analogous test exists forrecordRmpTokenUsagedespite both functions pushing into the same ring and callingpruneRecent. The gap is real — kept at info severity.Dropped: Nothing. Both unique-to-B findings survived verification.
Kept: 2 findings (1 agreed minor, 1 unique-to-B info). Verdict: CONDITIONAL_APPROVE.
Blast Radius
The diff spans 8 files across the engine telemetry module, the distributed-quorum RMP bridge, the worker entrypoint, docs, and tests. The two new exported functions (
recordRmpTokenUsage,recordRmpRateLimit) are consumed byworker-bridge.tsand extend shared mutable counters used by the status-page ribbon. The blast radius is moderate — touching exported telemetry surface shared by all job paths — but the changes are purely additive and the in-process quorum path is unaffected.BLAST_SCORE: 5/10
Risk Indicators
recordRmpTokenUsage,recordRmpRateLimit,makePerLegSinkCI status (head
7bc1fee08c02)Overall: ✓ success
3 checks: 3 pending
Related PRs
Findings (2)
[MINOR]
recordRmpTokenUsagecall omitsnowMs— rolling-window timestamp bypasses the injected clocksrc/rmp/worker-bridge.ts:220
The round-2 fix correctly threaded the injected
nowclock intorecordRmpRateLimitat line 225, but the adjacentrecordRmpTokenUsagecall at lines 220-224 passes nonowMs, so the rolling-window ring always falls back to the realDate.now()regardless of whatnowinjects:DistributedReviewDriverCtx.nowis typed() => string(ISO format). The fix converts it to milliseconds at the call site:Without this, the rolling-window stat used by the status-page ribbon's recent-window display cannot be controlled in tests. Any integration test driving the cross-process path that tries to assert window-pruning behaviour at a deterministic timestamp will find entries pruned immediately (the ring entry uses wall clock while
engineTelemetrySnapshot(nowMs)uses the injectable — the window check becomes inconsistent). No production-correctness impact: in productionnow()returns real time so the result is identical toDate.now().[INFO] No rolling-window prune test for
recordRmpTokenUsage—recentUsageboundary unverified for the new functiontests/engine-telemetry.test.ts:63
Lines 63-80 correctly test cumulative accumulation and edge cases for
recordRmpTokenUsage, but the rolling-window prune behaviour (entries older thanTELEMETRY_WINDOW_MSdropped fromrecentUsage) is only exercised forrecordSdkResultUsage(see the symmetric prune test at line 141). Both functions push into the samerecentUsagering and callpruneRecent, so a boundary test should be symmetric. Suggested addition:This is low-urgency (the
pruneRecentlogic itself is exercised by the existingrecordSdkResultUsagetest), but the gap means a future regression in howrecordRmpTokenUsagecallspruneRecentwould go undetected. Should be addressed in a follow-on commit.Verdict
CONDITIONAL_APPROVE
hib-pr-reviewer • round 3 • 2 findings (1m/1i) • 2026-06-03T08:57:49.759Z → 2026-06-03T09:03:07.631Z • posted-as: pr-reviewer-bot • model: auto
Show previous round
hib-pr-reviewer review — PR #9 (hib/hib-pr-reviewer)
Round 4 — head
aa88ac1fb9e5, basemain, triggersynchronizeTL;DR: NEEDS_WORK — unique-to-B double-count finding verified across four files; arbiter runs in-process and both ribbon paths fire concurrently on divergence.
Summary
Arbitration — Round 4
Agreed findings: none (Reviewer A found 0, Reviewer B found 1).
Unique-to-B finding — verified. Reviewer B claims the in-process arbiter double-counts into the ribbon: once via
observeSdkMessageForTelemetryinsidestreamSdkQueryInner, and again when its adapted{t:"telemetry"}events flow throughmakePerLegSink.observewith no arbiter exclusion guard.Verification steps and results:
Arbiter is in-process.
worker.ts:3067passesarbiter: new Arbiter({ sdkSemaphore, memoraMcpUrl })— a live in-engine instance — intorunDistributedReviewViaDriver. The two reviewer legs are cross-process; the arbiter is not.Arbiter's SDK loop calls
observeSdkMessageForTelemetry.arbiter.ts:335callsstreamSdkQuery→streamSdkQueryInner. Atreviewer.ts:1327, every message passing throughstreamSdkQueryInneris fed toobserveSdkMessageForTelemetry. When the arbiter receives a{type:"result"}carrier, this callsrecordSdkResultUsage→ ribbon +1.Same messages re-enter the ribbon via
makePerLegSink.distributed-quorum.ts:207doesfor (const event of arbiterAdapter.adapt(msg)) emit(tagLeg(event, "arbiter")). The adapted{t:"telemetry", inputTokens:…, outputTokens:…, costUsd:…}event, taggedleg:"arbiter", is pushed into the event stream consumed bymakePerLegSink.observe. Atworker-bridge.ts:219, the check isif (event.t === "telemetry")— noeventLeg(event) !== "arbiter"guard — sorecordRmpTokenUsageis called → ribbon +2. Rate-limit events follow the same dual path viarecordRmpRateLimit.JSDoc claim falsified.
engine-telemetry.ts:166-167states "No double-count: the arbiter does not tap the ribbon directly" — this is true for the two out-of-process leg reviewers but false for the in-process arbiter when legs diverge and arbitration runs.Finding kept as major. Proposed fix (add
&& eventLeg(event) !== "arbiter"atworker-bridge.ts:219) is correct and also requires updating the docstring atengine-telemetry.ts:166-167.Blast Radius
The double-count affects the engine-wide token/cost ribbon (
engine-telemetry.tsmodule-level singletons) and is visible on the status page for all distributed-quorum reviews where legs diverge. The fix is a one-line guard inworker-bridge.tsplus a docstring update; no protocol or data-model changes required.BLAST_SCORE: 5/10
Risk Indicators
recordRmpTokenUsage,recordRmpRateLimit,recordSdkResultUsage,observeSdkMessageForTelemetry,makePerLegSinkCI status (head
aa88ac1fb9e5)Overall: ✓ success
3 checks: 3 pending
Findings (1)
[MAJOR] Arbiter telemetry double-counted in ribbon:
makePerLegSinkfolds arbiter-tagged{t:"telemetry"}events but the in-process arbiter also taps the ribbon viaobserveSdkMessageForTelemetrysrc/rmp/worker-bridge.ts:219
Confirmed double-count path (triggered when legs diverge and the in-process arbiter runs):
worker.ts:3067constructsnew Arbiter(...)— the arbiter runs in the engine process, not in a separate container.arbiter.ts:335callsstreamSdkQuery→streamSdkQueryInner. Atreviewer.ts:1327,observeSdkMessageForTelemetry(message)is called for every SDK message. When the arbiter's{type:"result"}carrier arrives, this callsrecordSdkResultUsage. Ribbon +1.onMessagecallback supplied toarbiter.runatdistributed-quorum.ts:206-207doesfor (const event of arbiterAdapter.adapt(msg)) emit(tagLeg(event, "arbiter")). The adapted{t:"telemetry", inputTokens, outputTokens, costUsd}event flows intomakePerLegSink.observe.worker-bridge.ts:219, the guard isif (event.t === "telemetry")with no arbiter exclusion.recordRmpTokenUsageis called unconditionally. Ribbon +2. The same dual path applies to rate-limit events viarecordRmpRateLimit.The JSDoc at
engine-telemetry.ts:166-167— "No double-count: the arbiter does not tap the ribbon directly" — is false when legs diverge; it is only true for the two out-of-process leg reviewers.Fix. Guard line 219 with the already-imported
eventLeg:Also update
engine-telemetry.ts:166-167to read: "No double-count: arbiter telemetry is excluded fromrecordRmpTokenUsage(guarded byeventLeg !== 'arbiter'); the in-process arbiter accumulates exclusively viaobserveSdkMessageForTelemetry."This applies equally to the
recordRmpRateLimitcall at line 231 — the rate-limit guard at line 249 (event.t === "telemetry" && event.rateLimit?.throttled) also lacks an arbiter exclusion and should be updated in tandem.Verdict
NEEDS_WORK
hib-pr-reviewer • round 4 • 1 finding (1M) • 2026-06-03T09:08:31.510Z → 2026-06-03T09:19:07.139Z • posted-as: pr-reviewer-bot • model: auto
Round-4 review caught a real MAJOR bug. makePerLegSink folded EVERY {t:"telemetry"} event (legs a/b AND the arbiter) into the engine ribbon. But the arbiter runs in-engine, and its SDK loop (arbiter.ts → streamSdkQuery → streamSdkQueryInner, reviewer.ts:1327) ALREADY taps the ribbon via observeSdkMessageForTelemetry. So on divergence (arbiter runs) its tokens were counted twice. Fix: fold telemetry into the ribbon ONLY for the out-of-process legs (a/b), whose tokens reach the engine solely via the RMP event stream. The arbiter's leg-tagged telemetry is left to its in-engine direct tap. The legs run in separate containers (their own ribbon is irrelevant), so no leg is double-tapped. +1 regression test (arbiter usage counted exactly once); full suite 2151.Show previous round
hib-pr-reviewer review — PR #9 (hib/hib-pr-reviewer)
Round 5 — head
a6ab5a073b93, basemain, triggersynchronizeTL;DR: CONDITIONAL_APPROVE — kept 2 agreed-or-verified minor findings and 1 agreed info; no blocking issues.
Summary
Arbitration round 5 — reconciliation
All three unique findings verified against HEAD. Kept all three.
Finding-by-finding
Agreed (A+B): CHANGELOG.md typo — line 10
Confirmed at
/var/lib/pr-reviewer/repos/hib/hib-pr-reviewer/1538/CHANGELOG.md:10: the bullet reads`{t:"telemetry}`(closing"absent before}). Both reviewers grounded on the same file and line. Kept as info.Unique to A: double
now()inmakePerLegSinkRead
src/rmp/worker-bridge.tslines 228 and 237.now()is called at line 228 (const nowMs = Date.parse(now())) and again at line 237 (recordRmpRateLimit(event.rateLimit, now())). Both are inside the sameif (event.t === "telemetry")block withrecordRmpTokenUsagebetween them. In production the two ISO strings can diverge by the time spent inrecordRmpTokenUsage. Finding confirmed; kept as minor.Unique to B: integration test missing ribbon counter assertion
Read
tests/rmp-worker-bridge-distributed.test.tslines 243–259. The test correctly emits a{t:"telemetry", rateLimit:{throttled:true}}event and asserts leg-state transitions (awaiting-rate-limit-reset→streaming), but there is noengineTelemetrySnapshot().rateLimitEventsbefore/after assertion. The wiring from the cross-process throttle event through torecordRmpRateLimit→ ribbon counter is unverified at integration level. Finding confirmed; kept as minor.Result: 2 minor + 1 info kept; no findings dropped.
Blast Radius
The diff spans 8 files across engine telemetry, the distributed worker bridge, distributed quorum logic, docs, and tests. Core changes are additive (new exported functions in engine-telemetry.ts, new sink logic in worker-bridge.ts); the arbiter double-count guard and the rolling-window/rate-limit paths are the only non-trivial mutation surfaces. Impact is bounded to the RMP_DISTRIBUTED flag-gated path — the in-process quorum and single-reviewer paths are untouched.
BLAST_SCORE: 4/10
Risk Indicators
recordRmpTokenUsage,recordRmpRateLimit,recordRateLimitEvent,recordSdkResultUsage,makePerLegSinkCI status (head
a6ab5a073b93)Overall: ✓ success
3 checks: 3 pending
Related PRs
Findings (3)
[MINOR]
now()called twice per telemetry event — rolling-window timestamp and rate-limit ISO string can skewsrc/rmp/worker-bridge.ts:228
Inside the
event.t === "telemetry"branch ofmakePerLegSink,now()is invoked at line 228 (const nowMs = Date.parse(now())) and again at line 237 (recordRmpRateLimit(event.rateLimit, now())). In production the two calls can return timestamps separated by the time spent executingrecordRmpTokenUsage, meaning the rolling-window bucket andlastRateLimitAtare stamped at different instants for the same event.Capture once:
This also avoids a second
new Date().toISOString()allocation and makes both timestamps provably consistent.[MINOR] Throttle integration test verifies leg-state transitions but not ribbon rate-limit counter
tests/rmp-worker-bridge-distributed.test.ts:258
The test at line 243 exercises the throttled-telemetry path (leg-a emits
{t:"telemetry", rateLimit:{throttled:true, reason:"rejected"}}), confirmsawaiting-rate-limit-resetinaStates, and confirms recovery — but never asserts thatengineTelemetrySnapshot().rateLimitEventsincremented. This is the only integration-level coverage of therecordRmpRateLimitwiring inmakePerLegSinkline 237; unit tests inengine-telemetry.test.tscover the function itself but not the cross-process dispatch path.Add before/after snapshot assertions:
[INFO] Typo — missing closing
"in`{t:"telemetry}`CHANGELOG.md:10
Line 10 (Per-leg A/B/arbiter Quorum column bullet) contains the malformed literal
`{t:"telemetry}`— the closing"aftertelemetryis absent. No runtime impact; only the shipped changelog is affected. Should read`{t:"telemetry"}`.Verdict
CONDITIONAL_APPROVE
hib-pr-reviewer • round 5 • 3 findings (2m/1i) • 2026-06-03T09:24:17.376Z → 2026-06-03T09:28:22.272Z • posted-as: pr-reviewer-bot • model: auto
- makePerLegSink reads now() once per telemetry event and reuses the ISO for both the rolling-window ms timestamp and the rate-limit time (was calling now() twice, which could read two slightly different instants). - Extend the throttle test to assert the ribbon's rate-limit counter + lastRateLimitStatus, not just the per-leg state transition. - Fix CHANGELOG typo (missing closing quote in `{t:"telemetry"}`). full suite 2151.hib-pr-reviewer review — PR #9 (hib/hib-pr-reviewer)
Round 6 — head
f65090534b46, basemain, triggersynchronizeTL;DR: APPROVE — all 11 prior-round findings resolved; no new actionable issues.
Summary
Round 6 review — all prior findings resolved
Every open finding from rounds 1–5 has been addressed in this synchronise push. Verified against HEAD:
kind:"think"falls through to wrong branchthink || stream → streamingonLegState/onArbiterStatetyped asstringSessionState/ArbiterState(lines 173-175)ARBITER_STATESmakePerLegSink(208)awaiting-rate-limit-resettelemetry && rateLimit.throttledbranchnew Date()bypass of injected clockconst iso = now()at line 228;Date.parse(iso)at 235recordRmpTokenUsageomitsnowMsDate.parse(iso)passed at line 235leg === "a" || leg === "b"(line 219)now()called twice per event (clock skew)const iso = now()reused for both consumers (line 228)rateLimitEventsdelta +lastRateLimitStatus{t:"telemetry}missing closing"{t:"telemetry"}No new actionable issues found. The implementation is coherent end-to-end: clock injection threads from
DistributedReviewDriverCtx.nowthroughmakePerLegSink(line 301 fallback to wall-clock for production), single clock read per telemetry event, arbiter correctly excluded from the ribbon fold, andmakeWatchdogSinkphase mapping viatoServingPhasecorrectly bridges the distributed vocabulary toPHASE_ORDER.Blast Radius
Changes touch the RMP worker-bridge (cross-process telemetry sink + phase mapping), the engine-telemetry module (two new exported functions), distributed-quorum (leg-done signals), and the worker entrypoint (Quorum-column wiring for the distributed path). All changes are additive; the
RMP_DISTRIBUTED=0fast path is provably unaffected (the new sink and callbacks are only wired inside the distributed branch). Exported surface ofengine-telemetry.tsgrows by two functions used only fromworker-bridge.ts.BLAST_SCORE: 5/10
Risk Indicators
CI status (head
f65090534b46)Overall: ✓ success
3 checks: 3 pending
Related PRs
Findings
No new findings this round.
Claude reviewed and found nothing actionable.
Verdict
APPROVE
hib-pr-reviewer • round 6 • 0 findings • 2026-06-03T09:30:47.168Z → 2026-06-03T09:33:24.249Z • posted-as: pr-reviewer-bot • model: auto