feat(realtime): propagate trace context into the Centrifugo publish (GC-23 relay-rooted trace) #7

Merged
hibryda merged 1 commit from feat/relay-trace-propagation into main 2026-06-01 14:53:08 +02:00
Owner

Header

realtime-service • Stage B slice-1 • GC-23 relay trace-propagation

TL;DR

NEEDS_REVIEW — the Kafka→Centrifugo relay now runs its per-message span as the ACTIVE context and injects the W3C traceparent into the /api/publish call → ONE trace_id across relay-consume → Centrifugo-publish (GC-23 local/relay-rooted form). Pairs with flux #48 (Centrifugo OTLP export fix).

Summary

The relay span was created but never made active, so the publish neither ran under it nor carried any trace context; Centrifugo's spans (once it exports them) would have formed a separate trace. Now context.with(trace.setSpan(active, span), …) wraps message handling, and propagation.inject(active, headers, setter) adds traceparent to the publish request. The Node OTel SDK's provider.register() already installs the W3C propagator, so no new production dependency is needed. Producer-origin propagation (carrying traceparent through the outbox so the business event roots the trace) is out of scope and belongs to Wave PLATFORM.

Findings (changes)

| File | Change |
|---|---|
| src/relay/notification-relay.ts | wrap handling in the relay span's active context (lifecycle/semantics preserved) |
| src/relay/centrifugo-publisher.ts | propagation.inject → traceparent on /api/publish (no-op without an active span) |
| src/otel.ts | comment: SDK register() installs the global W3C propagator |
| tests/relay.* | +3 tests (publisher injects traceparent / no-op without propagator; relay publishes within span context) |

Verdict

NEEDS_REVIEW. tsc --noEmit clean; 156/156 vitest. Live GC-23 verify (one trace_id relay→Centrifugo in Tempo) is the parent's deploy-time check, after flux #48 lands (Centrifugo OTLP export).

realtime-service • PR R0 • GC-23 relay trace-propagation • 2026-06-01

Make the Kafka→Centrifugo relay's per-message span the ACTIVE context for the
whole message handler, and inject its W3C `traceparent` (+ `tracestate`) onto
the `/api/publish` HTTP request — so the relay-consume span and the
Centrifugo-side publish share ONE trace_id (GC-23 local form, relay-rooted).

Local form / relay-rooted: the relay's own span roots the trace; the publish
carries that trace outward to Centrifugo (now OTel-enabled). The producer→relay
origin (incoming Kafka outbox headers) is a separate Wave-PLATFORM concern and
is deliberately NOT extracted here.
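
For readers unfamiliar with the header, the following is a minimal, dependency-free sketch of the version-00 `traceparent` layout defined by W3C Trace Context. It is not code from this PR: the `TraceParent` type, both helper functions, and the ids are hypothetical and exist only to show why the relay span and the Centrifugo-side span end up under one trace_id.

```typescript
// Layout of a version-00 `traceparent` value:
//   00-<32 hex trace-id>-<16 hex parent-id>-<2 hex trace-flags>
interface TraceParent {
  traceId: string; // 32 lowercase hex chars, must not be all zeros
  spanId: string; // 16 lowercase hex chars, must not be all zeros
  sampled: boolean; // bit 0 of trace-flags
}

const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Serialize a span context into the header value.
function formatTraceparent(tp: TraceParent): string {
  return `00-${tp.traceId}-${tp.spanId}-${tp.sampled ? "01" : "00"}`;
}

// Parse a header value; returns undefined for malformed or all-zero ids.
function parseTraceparent(value: string): TraceParent | undefined {
  const m = TRACEPARENT_RE.exec(value);
  if (!m) return undefined;
  const traceId = m[1] ?? "";
  const spanId = m[2] ?? "";
  const flags = m[3] ?? "00";
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 0x01 };
}

// Hypothetical ids, for illustration only.
const relaySpanContext: TraceParent = {
  traceId: "0af7651916cd43dd8448eb211c80319c",
  spanId: "b7ad6b7169203331",
  sampled: true,
};

// What the relay would put on the publish request...
const outgoingHeader = formatTraceparent(relaySpanContext);
// ...and what a receiver would recover from it.
const seenByReceiver = parseTraceparent(outgoingHeader);
```

The receiver starts its own span with a new span id but reuses the parsed trace id, which is what makes both sides appear as one trace.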

Changes:
- notification-relay.ts: wrap the message-handling body in
  `context.with(trace.setSpan(context.active(), span), …)` so the publish +
  metrics + logging run under the relay span. Span lifecycle preserved:
  startSpan unchanged, single `span.end()` in the outer finally, all
  skip/throw/outcome semantics intact.
- centrifugo-publisher.ts: `propagation.inject(context.active(), headers, …)`
  before fetch, with a plain `{ set(c,k,v){c[k]=v} }` TextMapSetter over the
  existing lowercase headers object. content-type/authorization preserved;
  injectable fetchFn, AbortController timeout, and all error semantics
  unchanged. No-op when no active span / no global propagator.
- otel.ts: no code change — documented that NodeTracerProvider.register()
  already sets the global W3C TraceContext + Baggage CompositePropagator
  (verified against @opentelemetry/sdk-trace-base _buildPropagatorFromEnv).
  No new dependency required.
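
The interaction between the first two changes can be sketched without the OTel packages. This is a simplified stand-in, not the service's code: `withActiveSpan` only mimics the effect of `context.with(trace.setSpan(…), …)`, `injectTraceparent` only mimics `propagation.inject`, and the single synchronous slot is an assumption made for brevity (the real SDK relies on an async-aware context manager). `buildPublishHeaders`, the ids, and the authorization value are hypothetical.

```typescript
type HeaderBag = Record<string, string>;

interface RelaySpanContext {
  traceId: string;
  spanId: string;
}

// Stand-in for the "active context": one slot, valid for synchronous code only.
let activeSpan: RelaySpanContext | undefined;

// Mimics context.with(trace.setSpan(context.active(), span), fn):
// the span is active for the duration of fn and restored afterwards.
function withActiveSpan<T>(span: RelaySpanContext, fn: () => T): T {
  const previous = activeSpan;
  activeSpan = span;
  try {
    return fn();
  } finally {
    activeSpan = previous;
  }
}

// Same shape as the plain setter described in the commit message.
const headerSetter = {
  set(carrier: HeaderBag, key: string, value: string): void {
    carrier[key] = value;
  },
};

// Mimics propagation.inject(context.active(), headers, setter):
// writes traceparent when a span is active, otherwise does nothing.
function injectTraceparent(headers: HeaderBag): void {
  const span = activeSpan;
  if (!span) return;
  headerSetter.set(headers, "traceparent", `00-${span.traceId}-${span.spanId}-01`);
}

// Hypothetical publisher step: existing lowercase headers stay untouched.
function buildPublishHeaders(): HeaderBag {
  const headers: HeaderBag = {
    "content-type": "application/json",
    authorization: "apikey example-key", // placeholder value
  };
  injectTraceparent(headers);
  return headers;
}

const relaySpan: RelaySpanContext = {
  traceId: "0af7651916cd43dd8448eb211c80319c",
  spanId: "b7ad6b7169203331",
};

// Before the change: the span exists but is not active, so nothing is injected.
const headersWithoutActiveSpan = buildPublishHeaders();
// After the change: the handler body runs under the span.
const headersWithActiveSpan = withActiveSpan(relaySpan, () => buildPublishHeaders());
```

The first call models the old behaviour (span created, publish outside it); the second models the wrapped handler, where the publish picks up the relay span automatically.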

Tests (vitest): 153 → 156.
- publisher: injects a W3C traceparent carrying the active span's trace_id;
  no traceparent / no crash when no propagator is registered.
- relay: handleNotificationMessage runs the publish within the relay span
  context (publisher receives a valid, non-zero-trace_id traceparent).

Verification:
- npx tsc --noEmit → clean
- npx vitest run → 156 passed (11 files)

hib-pr-reviewer review — PR #7 (affinity-intelligence-rework/im2be-realtime-service)

Round 1 — head 1c58bc49fc15, base main, trigger opened

TL;DR: CONDITIONAL_APPROVE — kept 1 info finding (unique-to-B, verified at line 195); span_id wildcard weakens W3C propagation fidelity assertion but is non-blocking.

Summary

Arbitration — PR #7 GC-23 relay-rooted trace propagation

Memora context: No prior run history for this PR or submodule — first arbitration.

Reconciliation: Reviewer A filed 0 formal findings (noted a devDependencies prose concern but did not file it). Reviewer B filed 1 finding (info severity) about a span_id wildcard in the W3C traceparent assertion.

Verification: Read tests/relay.centrifugo-publisher.test.ts lines 155–201. Confirmed: SPAN_ID = "b7ad6b7169203331" is declared, injected into the synthetic span context at line 179, but the assertion at line 195 uses [0-9a-f]{16} rather than the pinned ${SPAN_ID}. The finding is grounded and kept.

devDependencies prose concern (mentioned in both summaries but filed by neither reviewer as a formal finding): not elevated per rule 6 / arbiter-is-reconciler-not-fresh-reviewer constraint.

Result: kept=1 (unique-to-B, verified), dropped=0. No blocking issues.

Blast Radius

The diff spans 3 source modules (otel.ts, centrifugo-publisher.ts, notification-relay.ts) and 2 test files across the relay+publish pipeline. The changed surface is the core Kafka→Centrifugo relay path: an exported handler (handleNotificationMessage) and an exported publisher function (publishToUserChannel) that together constitute the service's primary message-delivery contract. Blast radius is moderate — it's an internal service with no public API, but the relay path is the hot path for every notification.

BLAST_SCORE: 5/10

Risk Indicators

| Indicator | Value |
|---|---|
| Sensitive functions | publishToUserChannel, handleNotificationMessage, propagation.inject, initTelemetry |
| Migration touched | — |
| Test delta | +126 / -2 lines in test files |
| Dependency changes | — |

CI status (head 1c58bc49fc15)

No CI checks reported for this commit.

Findings (1)

[INFO] Traceparent regex wildcards the span_id segment; propagation fidelity assertion is incomplete

tests/relay.centrifugo-publisher.test.ts:194

The test sets a synthetic span context with a known spanId:

const SPAN_ID = "b7ad6b7169203331";
// …
trace.setSpanContext(context.active(), { traceId: TRACE_ID, spanId: SPAN_ID, … })

but the assertion at line 195 uses a wildcard for that position:

expect(init.headers["traceparent"]).toMatch(
  new RegExp(`^00-${TRACE_ID}-[0-9a-f]{16}-0[01]$`),  // ← should be ${SPAN_ID}
);

W3CTraceContextPropagator.inject is spec-required to encode exactly the spanId from the active span context. With [0-9a-f]{16} the assertion passes even if propagation encoded a different span_id, silently missing any regression in context-propagation fidelity. Tighten to:

new RegExp(`^00-${TRACE_ID}-${SPAN_ID}-0[01]$`)

This is the only test that exercises the exact synthetic context-propagation path with a known span_id.
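
The difference between the two patterns can be checked in isolation. In this small sketch `SPAN_ID` is the value quoted above, while the `TRACE_ID` value and the second span id are hypothetical (the review does not show the test's actual `TRACE_ID`).

```typescript
const TRACE_ID = "0af7651916cd43dd8448eb211c80319c"; // hypothetical value
const SPAN_ID = "b7ad6b7169203331"; // value quoted in the finding

// Pattern currently in the test: any 16-hex span id is accepted.
const loosePattern = new RegExp(`^00-${TRACE_ID}-[0-9a-f]{16}-0[01]$`);
// Suggested pattern: the span id segment is pinned.
const pinnedPattern = new RegExp(`^00-${TRACE_ID}-${SPAN_ID}-0[01]$`);

// A correctly propagated header, and one carrying some other span id.
const correctHeader = `00-${TRACE_ID}-${SPAN_ID}-01`;
const wrongSpanHeader = `00-${TRACE_ID}-00f067aa0ba902b7-01`;

const comparison = {
  looseAcceptsCorrect: loosePattern.test(correctHeader),
  looseAcceptsWrongSpan: loosePattern.test(wrongSpanHeader), // the regression slips through
  pinnedAcceptsCorrect: pinnedPattern.test(correctHeader),
  pinnedAcceptsWrongSpan: pinnedPattern.test(wrongSpanHeader), // the regression is caught
};
```

Only the pinned pattern distinguishes the two headers, which is the gap the finding describes.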

Verdict

CONDITIONAL_APPROVE


hib-pr-reviewer • round 1 • 1 finding (1i) • 2026-06-01T12:51:10.626Z → 2026-06-01T12:52:20.088Z • posted-as: pr-reviewer-bot • model: auto

hibryda deleted branch feat/relay-trace-propagation 2026-06-01 14:53:08 +02:00