fix(outbox): stop outboxOpenTelemetryFallback hijacking the real OTEL SDK (#343) #22
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "fix/outbox-otel-fallback-no-hijack-343"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
#343 — stop
outboxOpenTelemetryFallbackhijacking the real OTEL SDKCritical pre-existing defect surfaced while verifying #342 (Micrometer→OTLP bridge) live: the 7 outbox services were exporting no telemetry at all — neither metrics nor traces — while
error-event-sink(no outbox dependency) exported normally.Root cause
OutboxAutoConfiguration.outboxOpenTelemetryFallback()returnedOpenTelemetry.noop()gated on@ConditionalOnMissingBeanalone. The class is@AutoConfigureAfterDataSource/Hibernate/Kafka but not the OpenTelemetry Spring Boot starter'sOpenTelemetryAutoConfiguration. Auto-config ordering let the fallback win the race → it registered the no-op SDK → the starter's realOpenTelemetrybean (itself@ConditionalOnMissingBean) backed off → the whole app ran on a no-op OpenTelemetry. Consequences:MicrometerBridgeAutoConfiguration(@ConditionalOnBean(OpenTelemetry.class)) didn't bind → every Micrometer meter (im2be_outbox_*,kafka_consumer_*,jvm_*) fell to the non-exportingSimpleMeterRegistry.Evidence: user-service condition report —
MicrometerBridgeAutoConfiguration Did not match: ... did not find any beans of type io.opentelemetry.api.OpenTelemetry. At the collector:8889, onlyservice_name=error-event-sinkproduced telemetry.Fix
Add
@ConditionalOnMissingClass("io.opentelemetry.instrumentation.spring.autoconfigure.OpenTelemetryAutoConfiguration")to the fallback bean. Now the decision is order-independent: when the starter is on the classpath (all 8 services) the fallback never registers, so the real SDK always wins.@ConditionalOnMissingBeanis retained for the bare-consumer case (app supplies its ownOpenTelemetry). The fallback's original purpose — a bare consumer without the starter still boots on no-op — is preserved.Tests (
OutboxOpenTelemetryFallbackTest)fallbackProvidesNoopWhenNoOpenTelemetryBeanAndStarterAbsent— bare-consumer path still boots on no-op.fallbackYieldsToAUserProvidedOpenTelemetryBean—@ConditionalOnMissingBeanbehaviour preserved.fallbackIsGuardedByConditionalOnMissingClassOfOtelStarter— pins the guard. The OTEL starter is intentionally not a dependency of this library (onlyopentelemetry-apiis), so the classpath-condition'sfalsestate can't be exercised in-module; that path is covered by the live post-deploy verification.mvn -pl outbox-publisher -am install -DskipITs→ 106/106 unit, 0 warnings.Post-merge (operator-gated)
Republish
outbox-publisherSNAPSHOT → rebuild the 7 outbox service images--no-cache→ re-import → rollout-restart → confirm traces+metrics withservice_namein {user,social-login,family,diary,calendar,admin}-service appear at:8889(not just the sink). Unblocks #342's live close-out + PR-OTEL-9 alerts having data. Memora #3651.Show previous round
hib-pr-reviewer review — PR #22 (affinity-intelligence-rework/im2be-platform-libs)
Round 1 — head
ff2fd2f99fed, basemain, triggeropenedTL;DR: CONDITIONAL_APPROVE — kept 1 agreed finding (A+B on @ConditionalOnMissingClass gap) and 1 verified unique-to-B finding (weak .contains() assertion); no blocking issues.
Summary
Arbitration — PR #22
fix(outbox): stop outboxOpenTelemetryFallback hijacking the real OTEL SDKMemora: No prior run history found; first arbitration for this PR.
Reconciliation:
Finding 1 (A+B agreed → KEPT): Both reviewers independently flag
@ConditionalOnMissingClassatOutboxAutoConfiguration.java:270–271as creating aNoSuchBeanDefinitionExceptionstartup failure when the OTel starter class is on the classpath but its autoconfiguration is excluded (e.g., viaspring.autoconfigure.exclude=...). Read confirmed the annotation is at lines 270–271, and that bothoutboxPublisher(line 362) andoutboxPollerWorker(line 391) injectOpenTelemetryas a non-optional required parameter — noObjectProviderwrapper. Before this PR,@ConditionalOnMissingBeanwould have correctly filled the gap; the new guard regresses that scenario.Finding 2 (Unique to B → VERIFIED + KEPT): Read confirmed line 90–92 of the test uses
.contains(...)not.containsExactly(...). The assertion passes as long as the target class name appears somewhere in theString[]annotation value; extra entries added in a future refactor would silently evade the regression guard. Verification confirms B's claim is accurate.Memora: Run 1 summary persisted (memory id 467).
Blast Radius
The diff touches one auto-configuration class and one new test class within a single
outbox-publishermodule. The@ConditionalOnMissingClasschange affects every service that carries the outbox dependency alongside the OTel Spring Boot starter — a real but bounded surface. No public API or protocol contracts are modified.BLAST_SCORE: 3/10
CI status (head
ff2fd2f99fed)Overall: ✓ success
2 checks: 2 pending
Findings (2)
[MINOR]
@ConditionalOnMissingClassintroduces aNoSuchBeanDefinitionExceptionwhen the OTel starter is on the classpath but its autoconfiguration is excludedoutbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:270
The new guard is a classpath check, not a bean-presence check. When
opentelemetry-spring-boot-starteris a transitive dependency but the application explicitly excludes its autoconfiguration (e.g.,spring.autoconfigure.exclude=io.opentelemetry.instrumentation.spring.autoconfigure.OpenTelemetryAutoConfiguration): (1) the class IS on the classpath →@ConditionalOnMissingClassevaluates tofalse→ the fallback does not register; (2) the starter'sOpenTelemetryAutoConfigurationis excluded → no realOpenTelemetrybean is registered either; (3) bothoutboxPublisher(line 362) andoutboxPollerWorker(line 391) takefinal OpenTelemetry openTelemetryas a required (non-ObjectProvider) parameter → context startup fails withNoSuchBeanDefinitionException. Before this PR,@ConditionalOnMissingBeanalone correctly covered this scenario by registering the noop fallback.Preferred fix: Replace the per-method
@ConditionalOnMissingClassguard with a class-level@AutoConfigureAfter(name = "io.opentelemetry.instrumentation.spring.autoconfigure.OpenTelemetryAutoConfiguration")and keep only@ConditionalOnMissingBeanon the method. When the starter is active, ordering guarantees the real bean exists before the fallback is evaluated; when the starter is absent or excluded,@ConditionalOnMissingBeanregisters the noop as before — covering all three deployment paths without the gap.If the classpath guard is retained: Document the known limitation in the Javadoc and inject
OpenTelemetryviaObjectProvider<OpenTelemetry>in the affected beans, or add a dedicated@ConditionalOnMissingBeanfallback that applies even when the class is present but excluded.[MINOR] Annotation-pin assertion uses
.contains()instead of.containsExactly(), allowing silent annotation bloat to go undetectedoutbox-publisher/src/test/java/com/aim2be/platform/outbox/OutboxOpenTelemetryFallbackTest.java:90
At line 90–92,
assertThat(guard.value()).contains("io.opentelemetry...")only verifies that the target class name is somewhere in theString[]value of@ConditionalOnMissingClass. If a future refactor adds a second class name to the annotation (e.g., to guard an additional starter version), the test still passes while the annotation contract has silently drifted and the extra guard may suppress the fallback in unintended scenarios.Replace with:
This pins both the content and the count of guarded class names, which is the point of a regression-guard test.
Verdict
CONDITIONAL_APPROVE
hib-pr-reviewer • round 1 • 2 findings (2m) • 2026-05-31T21:44:58.159Z → 2026-05-31T21:46:16.452Z • posted-as: pr-reviewer-bot • model: auto
R1 CONDITIONAL_APPROVE findings (kept=2, both MINOR, no blockers): (1) MINOR OutboxAutoConfiguration.java — the @ConditionalOnMissingClass guard has a gap: when opentelemetry-spring-boot-starter is on the classpath but its OpenTelemetryAutoConfiguration is excluded (spring.autoconfigure.exclude), the class-present check skips the fallback AND no real OpenTelemetry bean exists → outboxPublisher / outboxPollerWorker (required, non-ObjectProvider OpenTelemetry params) fail with NoSuchBeanDefinitionException. The classpath guard mishandles the present-but-excluded path. (2) MINOR OutboxOpenTelemetryFallbackTest.java:90 — the annotation-pin used .contains() not .containsExactly(), so a stray added guard name would pass undetected. Fix (adopted the reviewer's preferred solution — strictly better): - Replaced the per-method @ConditionalOnMissingClass with a class-level @AutoConfigureAfter(name="...OpenTelemetryAutoConfiguration") (by name — the starter is not a dependency of this lib) and kept only @ConditionalOnMissingBean on the bean. Ordering guarantees the real SDK bean is registered before the fallback's @ConditionalOnMissingBean is evaluated → real bean wins when present; @ConditionalOnMissingBean registers the no-op when the starter is absent OR excluded. Covers all three paths (present / absent / present-but-excluded) without a startup failure. - Test: the guard now asserts the class-level @AutoConfigureAfter.name() containsExactly the OTEL starter FQN (pins content + count). Verification: - mvn -pl outbox-publisher -am install -DskipITs → 106/106 unit, 0 warnings - mvn -pl outbox-publisher clean compile test-compile → 0 [WARNING]hib-pr-reviewer review — PR #22 (affinity-intelligence-rework/im2be-platform-libs)
Round 2 — head
e7bd62ffdd88, basemain, triggersynchronizeTL;DR: NO_NEW_FINDINGS — No new findings this round.
Summary
[quorum-converged] A=0 = B=0. Both R1 findings are addressed cleanly:
R1
@ConditionalOnMissingClasshazard — the code does not use@ConditionalOnMissingClass. The chosen fix (@AutoConfigureAfter(name = "…OpenTelemetryAutoConfiguration")) correctly covers all three deployment paths (absent / present / present-but-excluded) without aNoSuchBeanDefinitionException. The Javadoc at lines 270–276 explicitly documents why the class-only guard would be wrong, satisfying the R1 concern.R1
.contains()→.containsExactly()—OutboxOpenTelemetryFallbackTest.java:100now uses.containsExactly(...). The inline comment at lines 92–93 explicitly cites#22 R1, confirming the intent.The
@AutoConfigureAfter(value = {...}, name = "...")split is idiomatic Spring Boot autoconfiguration (both attributes are honoured during ordering resolution). TheisSameAs(OpenTelemetry.noop())reference-equality check is valid because the OTel API returns the same singleton instance fromnoop(). TheApplicationContextRunnersetup provides all required beans (OutboxRecordRepository,DataSource,KafkaProperties) andoutboxMetricsBinderconsumesMeterRegistryviaObjectProvider, so absence of a registry does not fail startup. No new actionable issues found.CI status (head
e7bd62ffdd88)Overall: ⏳ pending
2 checks: 2 pending
Findings
No new findings this round.
Quorum converged on empty findings (A + B both returned 0).
Verdict
NO_NEW_FINDINGS
hib-pr-reviewer • round 2 • 0 findings • 2026-05-31T21:49:30.161Z → 2026-05-31T21:51:21.847Z • posted-as: pr-reviewer-bot • model: auto • [bookkeeping fallback]