feat(realtime): propagate trace context into the Centrifugo publish (GC-23 relay-rooted trace) #7
Header
realtime-service • Stage B slice-1 • GC-23 relay trace-propagation

TL;DR
NEEDS_REVIEW: the Kafka→Centrifugo relay now runs its per-message span as the ACTIVE context and injects the W3C `traceparent` into the `/api/publish` call → ONE `trace_id` across relay-consume → Centrifugo-publish (GC-23 local/relay-rooted form). Pairs with flux #48 (Centrifugo OTLP export fix).

Summary
The relay span was created but not made active, so the publish did not run under it and carried no trace context; Centrifugo's spans (once it exports) would therefore form a separate trace. Now: `context.with(trace.setSpan(active, span), …)` wraps message handling; `propagation.inject(active, headers, setter)` adds `traceparent` to the publish request. The Node OTel SDK's `provider.register()` already installs the W3C propagator (no new prod dep). Producer-origin (outbox `traceparent`-carry for the business-event root) is out of scope (Wave PLATFORM).

Findings (changes)
- `src/relay/notification-relay.ts`
- `src/relay/centrifugo-publisher.ts`: `propagation.inject` → `traceparent` on `/api/publish` (no-op without an active span)
- `src/otel.ts`: `register()` installs the global W3C propagator
- `tests/relay.*`

Verdict
NEEDS_REVIEW.
`tsc --noEmit` clean; 156/156 vitest. Live GC-23 verify (one trace_id relay→Centrifugo in Tempo) is the parent's deploy-time check, after flux #48 lands (Centrifugo OTLP export).

Footer
realtime-service • PR R0 • GC-23 relay trace-propagation • 2026-06-01
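The publish-side change described here relies on `propagation.inject(context.active(), headers, setter)` from `@opentelemetry/api`. As a rough illustration of what that call contributes to the request, the sketch below is a dependency-free model of the step, not the OTel implementation. `SpanContextLike`, `TextMapSetterLike`, `injectTraceparent`, the sample ids, and the placeholder header values are all hypothetical names introduced for this example. It assumes the W3C Trace Context layout `version-traceid-parentid-flags` and mirrors the no-op behaviour when there is no valid span.

```typescript
// Hand-rolled stand-in for propagation.inject (assumption: illustration only;
// the real relay uses @opentelemetry/api and the globally registered propagator).

interface SpanContextLike {
  traceId: string;    // 32 lowercase hex characters
  spanId: string;     // 16 lowercase hex characters
  traceFlags: number; // 0x01 means "sampled"
}

// Same shape as the plain setter mentioned in the PR: { set(c,k,v){ c[k]=v } }
interface TextMapSetterLike<C> {
  set(carrier: C, key: string, value: string): void;
}

const headerSetter: TextMapSetterLike<Record<string, string>> = {
  set(carrier, key, value) {
    carrier[key] = value;
  },
};

// W3C Trace Context treats all-zero ids as invalid.
const TRACE_ID_RE = /^(?!0{32}$)[0-9a-f]{32}$/;
const SPAN_ID_RE = /^(?!0{16}$)[0-9a-f]{16}$/;

function injectTraceparent<C>(
  span: SpanContextLike | undefined,
  carrier: C,
  setter: TextMapSetterLike<C>,
): void {
  // No active span, or an invalid one: leave the carrier untouched.
  if (!span || !TRACE_ID_RE.test(span.traceId) || !SPAN_ID_RE.test(span.spanId)) {
    return;
  }
  const flags = (span.traceFlags & 0xff).toString(16).padStart(2, "0");
  setter.set(carrier, "traceparent", `00-${span.traceId}-${span.spanId}-${flags}`);
}

// Existing lowercase headers object (values are placeholders).
const headers: Record<string, string> = {
  "content-type": "application/json",
  authorization: "<redacted>",
};
injectTraceparent(
  { traceId: "0af7651916cd43dd8448eb211c80319c", spanId: "b7ad6b7169203331", traceFlags: 1 },
  headers,
  headerSetter,
);

// Without a span nothing is added and nothing throws.
const untouched: Record<string, string> = { "content-type": "application/json" };
injectTraceparent(undefined, untouched, headerSetter);
```

Injecting into the existing headers object rather than building a new one is what keeps `content-type` and `authorization` intact, which matches the preserved-semantics claim in the change list.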
Make the Kafka→Centrifugo relay's per-message span the ACTIVE context for the whole message handler, and inject its W3C `traceparent` (+ `tracestate`) onto the `/api/publish` HTTP request, so the relay-consume span and the Centrifugo-side publish share ONE trace_id (GC-23 local form, relay-rooted).

Local form / relay-rooted: the relay's own span roots the trace; the publish carries that trace outward to Centrifugo (now OTel-enabled). The producer→relay origin (incoming Kafka outbox headers) is a separate Wave-PLATFORM concern and is deliberately NOT extracted here.

Changes:
- notification-relay.ts: wrap the message-handling body in `context.with(trace.setSpan(context.active(), span), …)` so the publish + metrics + logging run under the relay span. Span lifecycle preserved: startSpan unchanged, single `span.end()` in the outer finally, all skip/throw/outcome semantics intact.
- centrifugo-publisher.ts: `propagation.inject(context.active(), headers, …)` before fetch, with a plain `{ set(c,k,v){c[k]=v} }` TextMapSetter over the existing lowercase headers object. content-type/authorization preserved; injectable fetchFn, AbortController timeout, and all error semantics unchanged. No-op when no active span / no global propagator.
- otel.ts: no code change; documented that NodeTracerProvider.register() already sets the global W3C TraceContext + Baggage CompositePropagator (verified against @opentelemetry/sdk-trace-base _buildPropagatorFromEnv). No new dependency required.

Tests (vitest): 153 → 156.
- publisher: injects a W3C traceparent carrying the active span's trace_id; no traceparent / no crash when no propagator is registered.
- relay: handleNotificationMessage runs the publish within the relay span context (publisher receives a valid, non-zero-trace_id traceparent).

Verification:
- npx tsc --noEmit → clean
- npx vitest run → 156 passed (11 files)

hib-pr-reviewer review: PR #7 (affinity-intelligence-rework/im2be-realtime-service)
Round 1: head `1c58bc49fc15`, base `main`, trigger `opened`
TL;DR: CONDITIONAL_APPROVE. Kept 1 info finding (unique-to-B, verified at line 195); the span_id wildcard weakens the W3C propagation fidelity assertion but is non-blocking.
Summary
Arbitration — PR #7 GC-23 relay-rooted trace propagation
Memora context: No prior run history for this PR or submodule — first arbitration.
Reconciliation: Reviewer A filed 0 formal findings (noted a devDependencies prose concern but did not file it). Reviewer B filed 1 finding (info severity) about a span_id wildcard in the W3C traceparent assertion.
Verification: Read `tests/relay.centrifugo-publisher.test.ts` lines 155–201. Confirmed: `SPAN_ID = "b7ad6b7169203331"` is declared and injected into the synthetic span context at line 179, but the assertion at line 195 uses `[0-9a-f]{16}` rather than the pinned `${SPAN_ID}`. The finding is grounded and kept.
devDependencies prose concern (mentioned in both summaries but filed by neither reviewer as a formal finding): not elevated per rule 6 / arbiter-is-reconciler-not-fresh-reviewer constraint.
Result: kept=1 (unique-to-B, verified), dropped=0. No blocking issues.
Blast Radius
The diff spans 3 source modules (`otel.ts`, `centrifugo-publisher.ts`, `notification-relay.ts`) and 2 test files across the relay+publish pipeline. The changed surface is the core Kafka→Centrifugo relay path: an exported handler (`handleNotificationMessage`) and an exported publisher function (`publishToUserChannel`) that together constitute the service's primary message-delivery contract. Blast radius is moderate: it's an internal service with no public API, but the relay path is the hot path for every notification.
BLAST_SCORE: 5/10
Risk Indicators
`publishToUserChannel`, `handleNotificationMessage`, `propagation.inject`, `initTelemetry`
CI status (head `1c58bc49fc15`)
No CI checks reported for this commit.
Findings (1)
[INFO] Traceparent regex wildcards the `span_id` segment; propagation fidelity assertion is incomplete
tests/relay.centrifugo-publisher.test.ts:194
The test sets a synthetic span context with a known `spanId`, but the assertion at line 195 uses a wildcard for that position. `W3CTraceContextPropagator.inject` is spec-required to encode exactly the `spanId` from the active span context. With `[0-9a-f]{16}` the assertion passes even if propagation encoded a different span_id, silently missing any regression in context-propagation fidelity. Tighten the assertion to match the pinned `${SPAN_ID}` instead of the wildcard. This is the only test that exercises the exact synthetic context-propagation path with a known span_id.
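The difference between the wildcard and the pinned assertion can be sketched without the test harness. The snippet below is a minimal illustration and not the actual test code: `SPAN_ID` is the value quoted in this review, while `TRACE_ID`, the `-01` flags suffix, the `regressed` header, and the variable names are assumptions made for the example.

```typescript
// SPAN_ID comes from the review text; TRACE_ID and the "01" flags are assumed
// placeholders, since the real test's values are not shown here.
const TRACE_ID = "0af7651916cd43dd8448eb211c80319c";
const SPAN_ID = "b7ad6b7169203331";

// Current style of assertion: any 16 hex characters are accepted as span_id.
const wildcard = new RegExp(`^00-${TRACE_ID}-[0-9a-f]{16}-01$`);

// Tightened assertion: the span_id must be exactly the pinned one.
const pinned = new RegExp(`^00-${TRACE_ID}-${SPAN_ID}-01$`);

// A correct header, and one where propagation encoded a different span_id.
const correct = `00-${TRACE_ID}-${SPAN_ID}-01`;
const regressed = `00-${TRACE_ID}-00f067aa0ba902b7-01`;

const results = {
  wildcardAcceptsRegressed: wildcard.test(regressed), // true: regression goes unnoticed
  pinnedAcceptsRegressed: pinned.test(regressed),     // false: regression is caught
  pinnedAcceptsCorrect: pinned.test(correct),         // true: valid header still passes
};
```

In a vitest test the pinned pattern would typically be passed to `expect(...).toMatch(pinned)` on the captured `traceparent` header; the exact variable holding that header in the real test is not shown in this review.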
Verdict
CONDITIONAL_APPROVE
hib-pr-reviewer • round 1 • 1 finding (1i) • 2026-06-01T12:51:10.626Z → 2026-06-01T12:52:20.088Z • posted-as: pr-reviewer-bot • model: auto