feat(outbox): PR-PLATFORM-1 — outbox-publisher full impl per ADR-0014 #4

Merged
hibryda merged 14 commits from feat/pr-platform-1-outbox-publisher into main 2026-05-27 04:36:44 +02:00
Owner

Summary

Replaces the v1.0 scaffold's UnsupportedOperationException("v1.0 scaffold") stubs in outbox-publisher/ with a production-ready transactional outbox publisher per ADR-0014 D-1 + D-9 (dual-mode relay, locked 2026-05-14).

What is in

  • OutboxRecord — extended with last_error (1000-char truncating setter), (status, created_at) + topic indexes, Status.isTerminal() helper. Compound PK (aggregateId, eventId) preserved from the v1.0 scaffold (rule 10 — match existing patterns; matches ADR-0014 D-5 UUIDv7 idempotency key).
  • OutboxRecordRepositoryPage<> PENDING-row fetch (poller batch), backlog-age JPQL gauge query, markSent / markFailureAttempt atomic JPQL updates.
  • OutboxPublisher — hot AFTER_COMMIT relay wrapped in Resilience4j CircuitBreaker (im2be-outbox-relay, name pinned). Runtime-guards on active transaction (rule 18 + ADR-0014 D-7). OTel span im2be.outbox.publish.hot.
  • OutboxPollerWorker@Scheduled cold path, ShedLock 6.0.2 distributed lock (outbox-poller), synchronous send().get() per row. OTel span im2be.outbox.poller.tick.
  • OutboxPublishCompletion — REQUIRES_NEW transactional helper so post-commit markSent / markFailureAttempt writes execute in their own transaction (the AFTER_COMMIT parent txn is already committed).
  • UuidV7 — stdlib-free UUIDv7 generator (Java 17 lacks v7). Required for time-ordered event_id per ADR-0014 D-5.
  • OutboxMetricsBinder — four Micrometer metrics per OQ-5: im2be_outbox_pending_total, im2be_outbox_backlog_age_seconds, im2be_outbox_publish_latency_seconds (per-topic timer), im2be_outbox_publish_failure_total (per-topic counter).
  • OutboxProperties@ConfigurationProperties(prefix="im2be.outbox") for poller cadence/batch + circuit-breaker tuning.
  • OutboxAutoConfiguration — wires CircuitBreakerRegistry + ShedLock JdbcTemplateLockProvider + fallback OpenTelemetry.noop().
  • README.md with usage / Flyway template / config reference / metrics table.

Test plan

  • 29 unit tests — entity invariants, UUIDv7 RFC 9562 conformance, publisher (mocked), poller (mocked), metrics binder (SimpleMeterRegistry).
  • 3 Testcontainers ITs@SpringBootTest @Testcontainers with postgres:15-alpine + spring-kafka-test EmbeddedKafkaBroker: happy path, runtime-guard, poller sweep.
  • Build: mvn -B clean install and mvn -B clean verify both BUILD SUCCESS with zero [WARNING] lines (rule 62).

Pinned dependency choices (rule 9 + rule 61)

Dep Version Why
io.github.resilience4j:resilience4j-circuitbreaker 2.2.0 Spec-required + Spring Boot 3.5.x compatible.
net.javacrumbs.shedlock:shedlock-spring 6.0.2 Spec-required; verified in Maven Central.
net.javacrumbs.shedlock:shedlock-provider-jdbc-template 6.0.2 matches above.
io.opentelemetry:opentelemetry-api 1.43.0 provided scope — consumer wires the SDK.
org.testcontainers:postgresql + junit-jupiter 1.20.4 latest stable.

Cross-references

## Summary Replaces the v1.0 scaffold's `UnsupportedOperationException("v1.0 scaffold")` stubs in `outbox-publisher/` with a production-ready transactional outbox publisher per [ADR-0014 D-1 + D-9](https://git.hemoglobina.store/affinity-intelligence-rework/im2be-mono/src/branch/main/docs/decisions/0014-outbox-record-version-parity.md) (dual-mode relay, locked 2026-05-14). ### What is in - **`OutboxRecord`** — extended with `last_error` (1000-char truncating setter), `(status, created_at)` + `topic` indexes, `Status.isTerminal()` helper. Compound PK `(aggregateId, eventId)` preserved from the v1.0 scaffold (rule 10 — match existing patterns; matches ADR-0014 D-5 UUIDv7 idempotency key). - **`OutboxRecordRepository`** — `Page<>` PENDING-row fetch (poller batch), backlog-age JPQL gauge query, `markSent` / `markFailureAttempt` atomic JPQL updates. - **`OutboxPublisher`** — hot AFTER_COMMIT relay wrapped in Resilience4j `CircuitBreaker` (`im2be-outbox-relay`, name pinned). Runtime-guards on active transaction (rule 18 + ADR-0014 D-7). OTel span `im2be.outbox.publish.hot`. - **`OutboxPollerWorker`** — `@Scheduled` cold path, ShedLock 6.0.2 distributed lock (`outbox-poller`), synchronous `send().get()` per row. OTel span `im2be.outbox.poller.tick`. - **`OutboxPublishCompletion`** — REQUIRES_NEW transactional helper so post-commit `markSent` / `markFailureAttempt` writes execute in their own transaction (the AFTER_COMMIT parent txn is already committed). - **`UuidV7`** — stdlib-free UUIDv7 generator (Java 17 lacks v7). Required for time-ordered `event_id` per ADR-0014 D-5. - **`OutboxMetricsBinder`** — four Micrometer metrics per OQ-5: `im2be_outbox_pending_total`, `im2be_outbox_backlog_age_seconds`, `im2be_outbox_publish_latency_seconds` (per-topic timer), `im2be_outbox_publish_failure_total` (per-topic counter). - **`OutboxProperties`** — `@ConfigurationProperties(prefix="im2be.outbox")` for poller cadence/batch + circuit-breaker tuning. - **`OutboxAutoConfiguration`** — wires `CircuitBreakerRegistry` + ShedLock `JdbcTemplateLockProvider` + fallback `OpenTelemetry.noop()`. - **README.md** with usage / Flyway template / config reference / metrics table. ## Test plan - [x] **29 unit tests** — entity invariants, UUIDv7 RFC 9562 conformance, publisher (mocked), poller (mocked), metrics binder (SimpleMeterRegistry). - [x] **3 Testcontainers ITs** — `@SpringBootTest @Testcontainers` with `postgres:15-alpine` + `spring-kafka-test EmbeddedKafkaBroker`: happy path, runtime-guard, poller sweep. - [x] **Build**: `mvn -B clean install` and `mvn -B clean verify` both BUILD SUCCESS with **zero `[WARNING]` lines** (rule 62). ## Pinned dependency choices (rule 9 + rule 61) | Dep | Version | Why | |---|---|---| | `io.github.resilience4j:resilience4j-circuitbreaker` | 2.2.0 | Spec-required + Spring Boot 3.5.x compatible. | | `net.javacrumbs.shedlock:shedlock-spring` | 6.0.2 | Spec-required; verified in Maven Central. | | `net.javacrumbs.shedlock:shedlock-provider-jdbc-template` | 6.0.2 | matches above. | | `io.opentelemetry:opentelemetry-api` | 1.43.0 | `provided` scope — consumer wires the SDK. | | `org.testcontainers:postgresql` + `junit-jupiter` | 1.20.4 | latest stable. | ## Cross-references - [ADR-0014 D-1 + D-9](https://git.hemoglobina.store/affinity-intelligence-rework/im2be-mono/src/branch/main/docs/decisions/0014-outbox-record-version-parity.md) - [ADR-0002 §3a](https://git.hemoglobina.store/affinity-intelligence-rework/im2be-mono/src/branch/main/docs/decisions/0002-centrifugo-opaque-ticket.md) - [.planning/26-stage-b-outbox-parity.md §OQ-5](https://git.hemoglobina.store/affinity-intelligence-rework/im2be-mono/src/branch/main/.planning/26-stage-b-outbox-parity.md)
feat(outbox): PR-PLATFORM-1 — full impl of outbox-publisher per ADR-0014
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 2s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
c3a1062db0
Replaces the v1.0 scaffold's UnsupportedOperationException stubs with a
production-ready transactional outbox publisher.

Implements:
- OutboxRecord with last_error column + (status, created_at) and topic
  indexes + Status.isTerminal() helper
- OutboxRecordRepository with poller-batch fetch + backlog-age gauge query
  + atomic markSent / markFailureAttempt JPQL updates
- OutboxPublisher hot AFTER_COMMIT relay wrapped in Resilience4j
  CircuitBreaker (im2be-outbox-relay); runtime-guards on active txn per
  ADR-0014 D-7; OTel im2be.outbox.publish.hot span per attempt
- OutboxPollerWorker @Scheduled cold path coordinated via ShedLock 6.0.2;
  synchronous send.get() per row; OTel im2be.outbox.poller.tick span
- OutboxPublishCompletion REQUIRES_NEW helper so post-commit row writes
  occur in their own transaction
- UuidV7 stdlib-free generator (Java 17 lacks v7) — required for time-
  ordered outbox event_id per ADR-0014 D-5
- OutboxMetricsBinder publishing the four metrics required by OQ-5:
  im2be_outbox_pending_total (gauge), im2be_outbox_backlog_age_seconds
  (gauge), im2be_outbox_publish_latency_seconds (per-topic timer),
  im2be_outbox_publish_failure_total (per-topic counter)
- OutboxProperties (@ConfigurationProperties) — poller cadence/batch +
  circuit-breaker tuning
- OutboxAutoConfiguration wires CircuitBreakerRegistry +
  ShedLock JdbcTemplateLockProvider + fallback OpenTelemetry.noop() +
  per-bean conditionals

Tests:
- 29 unit tests across OutboxRecord, UuidV7, OutboxPublisher (mocked),
  OutboxPollerWorker (mocked), OutboxMetricsBinder (SimpleMeterRegistry)
- 3 @SpringBootTest @Testcontainers integration tests against postgres:15
  + spring-kafka-test EmbeddedKafkaBroker — happy path, runtime-guard,
  poller sweep

Build (mvn -B clean verify): BUILD SUCCESS, zero [WARNING] lines.

Cross-refs:
- ADR-0014 D-1 + D-9 dual-mode relay (locked 2026-05-14)
- ADR-0002 §3a Resilience4j circuit-breaker pattern
- .planning/26-stage-b-outbox-parity.md §OQ-5 metrics + alerts

Superseded by round 2.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 1 — head c3a1062db000, base main, trigger opened

TL;DR: NEEDS_WORK — 2 major + 4 minor + 1 info findings kept this round.

Summary

Arbiter reconciled 3 (A) + 5 (B) → 7 findings.

CI status (head c3a1062db000)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details

Findings (7)

[MAJOR] Circuit breaker is blind to async Kafka ack failures — only synchronous throws count against failure rate

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:238

Lines 238-239:

circuitBreaker.executeRunnable(() ->
    sendAndDispatchCompletion(topic, payload, aggregateId, eventId, startedAt));

sendAndDispatchCompletion (line 274) calls kafkaTemplate.send(...).whenComplete(...) and returns immediately — the future is never awaited. executeRunnable therefore observes a clean synchronous return for every case where the producer can locally enqueue the record. Async nacks — leader failover, broker quota, replication timeout — are never counted against the breaker failure rate and will never trip it OPEN. Only synchronous throws from send() itself (serializer error, oversized record) count, which are a tiny fraction of real-world broker failures.

The Javadoc at lines 261-264 even acknowledges this: "async send failures are surfaced via the whenComplete callback" — but that callback is never wired back to the breaker.

Fix option A — use CircuitBreaker.decorateCompletionStage:

CircuitBreaker.decorateCompletionStage(
    circuitBreaker,
    () -> kafkaTemplate.send(topic, key, payload).toCompletableFuture())
    .get()
    .whenComplete((result, err) -> { /* existing callback logic */ });

Fix option B — call circuitBreaker.onError / circuitBreaker.onSuccess manually inside the existing whenComplete callback and remove executeRunnable.

[MAJOR] Status.FAILED is never written — undeliverable rows retry on every tick forever, starving the batch

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:184

persistFailureAttempt (lines 184-203) increments retries and stores lastError but contains no retry-budget check. OutboxProperties.Poller has no maxRetries field. Confirmed by grepping the entire module: no code path in OutboxPollerWorker, OutboxPublisher, OutboxPublishCompletion, or OutboxRecordRepository ever writes Status.FAILED. OutboxRecord.Status.FAILED is defined at line 71 with Javadoc "exceeded retry budget; awaits operator inspection" but this contract is never fulfilled.

Consequences:

  1. A permanently undeliverable row (deleted topic, incompatible schema, record exceeds message.max.bytes) occupies one slot in every poller batch tick indefinitely.
  2. Once the undeliverable backlog exceeds batch-size (default 500), newer PENDING rows are deferred — the ORDER BY created_at ASC FIFO sweep means old poisoned rows always come first.
  3. im2be_outbox_pending_total gauge never drains; the ADR-0014 §D-8 alert fires continuously.

Fix: add maxRetries (e.g. default 10) to OutboxProperties.Poller; in persistFailureAttempt check record.getRetries() + 1 >= maxRetries — if true, call a new repository.markFailed(aggregateId, eventId) JPQL update setting status = FAILED. The poller queries only status = PENDING, so FAILED rows are naturally excluded from future ticks.

[MINOR] Unbounded .get() on cold-path Kafka send can stall the poller thread indefinitely

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:168

kafkaTemplate.send(record.getTopic(), key, record.getPayloadBytes()).get();

get() without a timeout blocks until the CompletableFuture resolves. If the Kafka broker is wedged such that the future never completes (distinct from returning an ExecutionException), the poller thread hangs for the full ShedLock hold window (lock-at-most, default PT1M). During that window, cold-path recovery stalls for all services sharing the same replica.

Fix: replace with a timed get sourced from the producer's delivery timeout:

kafkaTemplate.send(record.getTopic(), key, record.getPayloadBytes())
        .get(properties.getPoller().getDeliveryTimeoutMs(), TimeUnit.MILLISECONDS);

Add deliveryTimeoutMs (default 30 000) to OutboxProperties.Poller. The resulting TimeoutException is caught by the existing catch (Exception sendError) block.

[MINOR] ExecutionException stored in last_error without unwrapping the root cause

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:178

kafkaTemplate.send().get() throws ExecutionException wrapping the actual Kafka exception. When caught at line 178 (catch (Exception sendError)), persistFailureAttempt builds the error string at line 186 as cause.getClass().getSimpleName() + ": " + cause.getMessage(). The stored last_error will therefore read "ExecutionException: org.apache.kafka.common.errors.TimeoutException: ..." instead of "TimeoutException: ...". Operators querying the table for triage see the wrapper class, not the root cause.

Fix: In persistFailureAttempt, unwrap ExecutionException before building the message:

final Throwable root = (cause instanceof java.util.concurrent.ExecutionException
        && cause.getCause() != null) ? cause.getCause() : cause;
final String message = root.getClass().getSimpleName() + ": " + ...;

[MINOR] Narrow DataAccessException catch lets CannotCreateTransactionException escape from persistFailureAttempt

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:306

Line 306:

} catch (DataAccessException persistError) {

completion.markFailureAttempt() runs in a REQUIRES_NEW transaction. If the connection pool is exhausted or the datasource is unavailable, Spring throws CannotCreateTransactionException (a TransactionException), which is NOT a subclass of DataAccessException. That exception propagates out of the afterCommit() callback into Spring's transaction synchronisation infrastructure, where behaviour is undefined across versions (Spring may log-and-swallow or re-throw).

Fix:

} catch (RuntimeException persistError) {

or at minimum:

} catch (DataAccessException | TransactionException persistError) {

[MINOR] OTel span outcome set to success before async ack arrives; span is closed before whenComplete fires

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:240

Line 240 sets span.setAttribute(ATTR_OUTCOME, "success") immediately after executeRunnable returns — which, per finding B-1, happens before the Kafka ack. The finally block at lines 254-256 closes the span before the whenComplete callback runs on the Kafka I/O thread. Kafka async failures therefore produce spans with outcome=success, breaking the im2be.outbox.publish.hot trace for the dominant failure mode.

Fix: Move the span start/end scope inside the whenComplete callback so the outcome attribute and span end reflect the actual async result. Alternatively, record outcome as a separate OTel event from within whenComplete and keep the parent span open until then (requires passing the span reference into the callback and calling span.end() from there).

[INFO] Unbounded List overload of findByStatusOrderByCreatedAtAsc is a public footgun

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxRecordRepository.java:134

List<OutboxRecord> findByStatusOrderByCreatedAtAsc(OutboxRecord.Status status);

The Javadoc warns "callers that need bounded fetch MUST use the Page variant" but the unbounded overload remains public on the interface. A caller that accidentally imports this signature instead of the Pageable variant will issue a SELECT * FROM outbox_records WHERE status = 'PENDING' ORDER BY created_at against a potentially large table and load the full result set into heap. This risk is amplified by the backlog scenario in finding B-2 (thousands of permanently-stuck PENDING rows).

Fix: Either remove the overload entirely (breaking-change flag before first GA) or annotate it @Deprecated with a Javadoc @deprecated directing callers to the bounded variant — making the footgun visible at compile time.

Verdict

NEEDS_WORK


hib-pr-reviewer • round 1 • 7 findings (2M/4m/1i) • 2026-05-26T22:45:56.190Z → 2026-05-26T22:52:26.394Z • posted-as: pr-reviewer-bot • [bookkeeping fallback]

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 2._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 1** — head `c3a1062db000`, base `main`, trigger `opened` **TL;DR:** NEEDS_WORK — 2 major + 4 minor + 1 info findings kept this round. ### Summary Arbiter reconciled 3 (A) + 5 (B) → 7 findings. ### CI status (head `c3a1062db000`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/11/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/11/jobs/1) | ### Findings (7) #### **[MAJOR]** Circuit breaker is blind to async Kafka ack failures — only synchronous throws count against failure rate _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:238_ Lines 238-239: ```java circuitBreaker.executeRunnable(() -> sendAndDispatchCompletion(topic, payload, aggregateId, eventId, startedAt)); ``` `sendAndDispatchCompletion` (line 274) calls `kafkaTemplate.send(...).whenComplete(...)` and **returns immediately** — the future is never awaited. `executeRunnable` therefore observes a clean synchronous return for every case where the producer can locally enqueue the record. Async nacks — leader failover, broker quota, replication timeout — are **never counted** against the breaker failure rate and will never trip it OPEN. Only synchronous throws from `send()` itself (serializer error, oversized record) count, which are a tiny fraction of real-world broker failures. The Javadoc at lines 261-264 even acknowledges this: "async send failures are surfaced via the `whenComplete` callback" — but that callback is never wired back to the breaker. **Fix option A** — use `CircuitBreaker.decorateCompletionStage`: ```java CircuitBreaker.decorateCompletionStage( circuitBreaker, () -> kafkaTemplate.send(topic, key, payload).toCompletableFuture()) .get() .whenComplete((result, err) -> { /* existing callback logic */ }); ``` **Fix option B** — call `circuitBreaker.onError` / `circuitBreaker.onSuccess` manually inside the existing `whenComplete` callback and remove `executeRunnable`. #### **[MAJOR]** `Status.FAILED` is never written — undeliverable rows retry on every tick forever, starving the batch _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:184_ `persistFailureAttempt` (lines 184-203) increments `retries` and stores `lastError` but contains no retry-budget check. `OutboxProperties.Poller` has no `maxRetries` field. Confirmed by grepping the entire module: no code path in `OutboxPollerWorker`, `OutboxPublisher`, `OutboxPublishCompletion`, or `OutboxRecordRepository` ever writes `Status.FAILED`. `OutboxRecord.Status.FAILED` is defined at line 71 with Javadoc "exceeded retry budget; awaits operator inspection" but this contract is never fulfilled. Consequences: 1. A permanently undeliverable row (deleted topic, incompatible schema, record exceeds `message.max.bytes`) occupies one slot in **every** poller batch tick indefinitely. 2. Once the undeliverable backlog exceeds `batch-size` (default 500), newer PENDING rows are deferred — the `ORDER BY created_at ASC` FIFO sweep means old poisoned rows always come first. 3. `im2be_outbox_pending_total` gauge never drains; the ADR-0014 §D-8 alert fires continuously. **Fix:** add `maxRetries` (e.g. default 10) to `OutboxProperties.Poller`; in `persistFailureAttempt` check `record.getRetries() + 1 >= maxRetries` — if true, call a new `repository.markFailed(aggregateId, eventId)` JPQL update setting `status = FAILED`. The poller queries only `status = PENDING`, so FAILED rows are naturally excluded from future ticks. #### **[MINOR]** Unbounded `.get()` on cold-path Kafka send can stall the poller thread indefinitely _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:168_ ```java kafkaTemplate.send(record.getTopic(), key, record.getPayloadBytes()).get(); ``` `get()` without a timeout blocks until the `CompletableFuture` resolves. If the Kafka broker is wedged such that the future never completes (distinct from returning an `ExecutionException`), the poller thread hangs for the full ShedLock hold window (`lock-at-most`, default PT1M). During that window, cold-path recovery stalls for all services sharing the same replica. **Fix:** replace with a timed `get` sourced from the producer's delivery timeout: ```java kafkaTemplate.send(record.getTopic(), key, record.getPayloadBytes()) .get(properties.getPoller().getDeliveryTimeoutMs(), TimeUnit.MILLISECONDS); ``` Add `deliveryTimeoutMs` (default 30 000) to `OutboxProperties.Poller`. The resulting `TimeoutException` is caught by the existing `catch (Exception sendError)` block. #### **[MINOR]** `ExecutionException` stored in `last_error` without unwrapping the root cause _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:178_ `kafkaTemplate.send().get()` throws `ExecutionException` wrapping the actual Kafka exception. When caught at line 178 (`catch (Exception sendError)`), `persistFailureAttempt` builds the error string at line 186 as `cause.getClass().getSimpleName() + ": " + cause.getMessage()`. The stored `last_error` will therefore read `"ExecutionException: org.apache.kafka.common.errors.TimeoutException: ..."` instead of `"TimeoutException: ..."`. Operators querying the table for triage see the wrapper class, not the root cause. **Fix:** In `persistFailureAttempt`, unwrap `ExecutionException` before building the message: ```java final Throwable root = (cause instanceof java.util.concurrent.ExecutionException && cause.getCause() != null) ? cause.getCause() : cause; final String message = root.getClass().getSimpleName() + ": " + ...; ``` #### **[MINOR]** Narrow `DataAccessException` catch lets `CannotCreateTransactionException` escape from `persistFailureAttempt` _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:306_ Line 306: ```java } catch (DataAccessException persistError) { ``` `completion.markFailureAttempt()` runs in a `REQUIRES_NEW` transaction. If the connection pool is exhausted or the datasource is unavailable, Spring throws `CannotCreateTransactionException` (a `TransactionException`), which is NOT a subclass of `DataAccessException`. That exception propagates out of the `afterCommit()` callback into Spring's transaction synchronisation infrastructure, where behaviour is undefined across versions (Spring may log-and-swallow or re-throw). **Fix:** ```java } catch (RuntimeException persistError) { ``` or at minimum: ```java } catch (DataAccessException | TransactionException persistError) { ``` #### **[MINOR]** OTel span outcome set to `success` before async ack arrives; span is closed before `whenComplete` fires _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:240_ Line 240 sets `span.setAttribute(ATTR_OUTCOME, "success")` immediately after `executeRunnable` returns — which, per finding B-1, happens before the Kafka ack. The `finally` block at lines 254-256 closes the span before the `whenComplete` callback runs on the Kafka I/O thread. Kafka async failures therefore produce spans with `outcome=success`, breaking the `im2be.outbox.publish.hot` trace for the dominant failure mode. **Fix:** Move the span start/end scope inside the `whenComplete` callback so the outcome attribute and span end reflect the actual async result. Alternatively, record outcome as a separate OTel event from within `whenComplete` and keep the parent span open until then (requires passing the span reference into the callback and calling `span.end()` from there). #### **[INFO]** Unbounded `List` overload of `findByStatusOrderByCreatedAtAsc` is a public footgun _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxRecordRepository.java:134_ ```java List<OutboxRecord> findByStatusOrderByCreatedAtAsc(OutboxRecord.Status status); ``` The Javadoc warns "callers that need bounded fetch MUST use the `Page` variant" but the unbounded overload remains `public` on the interface. A caller that accidentally imports this signature instead of the `Pageable` variant will issue a `SELECT * FROM outbox_records WHERE status = 'PENDING' ORDER BY created_at` against a potentially large table and load the full result set into heap. This risk is amplified by the backlog scenario in finding B-2 (thousands of permanently-stuck PENDING rows). **Fix:** Either remove the overload entirely (breaking-change flag before first GA) or annotate it `@Deprecated` with a Javadoc `@deprecated` directing callers to the bounded variant — making the footgun visible at compile time. ### Verdict **NEEDS_WORK** --- <sub>hib-pr-reviewer • round 1 • 7 findings (2M/4m/1i) • 2026-05-26T22:45:56.190Z → 2026-05-26T22:52:26.394Z • posted-as: pr-reviewer-bot • [bookkeeping fallback]</sub> </details>
fix(outbox): R2 — 7 R1 findings (2 MAJOR + 4 MINOR + 1 INFO)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 4s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
70271161af
F1 [MAJOR] async Kafka ack failures now feed the circuit-breaker. The hot
path switched from CircuitBreaker.executeRunnable (synchronous-only) to
explicit acquirePermission() + onSuccess/onError reconciliation inside
whenComplete. Async broker failures now bump the breaker's failure-rate
window; previously the breaker was blind to anything not thrown from the
synchronous send() call. New test asyncCallbackErrorFeedsCircuitBreakerOnError
proves NumberOfFailedCalls increments after async-completion failures.

F2 [MAJOR] permanently-undeliverable rows now transition to FAILED. Added
OutboxPollerWorker max-retries gate (default 10, configurable via
im2be.outbox.poller.max-retries). On the (retries >= maxRetries) attempt,
the worker calls OutboxPublishCompletion.markFailed (new) instead of
markFailureAttempt; the underlying repository.markFailed JPQL UPDATE moves
the row to Status.FAILED so subsequent findByStatusOrderByCreatedAtAsc
(PENDING) sweeps skip it. Span attribute outbox.transition=pending_to_failed
records the transition. Three new tests cover the threshold, retry-bump
under threshold, and proof that the FAILED row is invisible to the next
tick.

F3 [MINOR] unbounded .get() in the cold poller replaced with
.get(sendTimeoutMs, MILLISECONDS) using a new
im2be.outbox.poller.send-timeout-ms property (default 10000). On timeout
the row stays PENDING and retries++; new test sendTimeoutTreatedAsFailure
Attempt proves a never-completing future is caught as a TimeoutException
within the configured wait.

F4 [MINOR] ExecutionException unwrapped before storing last_error. New
static OutboxPublisher.formatLastError(Throwable) unwraps ExecutionException
.getCause() before formatting class+message, so last_error carries the root
cause's identity (e.g. 'RuntimeException: kafka-broker-down') rather than
the carrier wrapper. Truncation at OutboxRecord.LAST_ERROR_MAX_LENGTH
preserved. Both the hot path and the cold poller route through the new
helper; two new tests cover the unwrap + truncation explicitly.

F5 [MINOR] persistFailureAttempt catch widened from DataAccessException to
Exception in both OutboxPublisher and OutboxPollerWorker. Siblings such as
CannotCreateTransactionException (extends TransactionException, NOT
DataAccessException) previously escaped and masked the original send
failure. New OutboxPollerWorkerTest.persistFailureSwallowsRepositoryException
exercises this path with a doThrow CannotCreateTransactionException.

F6 [MINOR] OTel span im2be.outbox.publish.hot now stays open across the
async ack and is ended inside the whenComplete callback. The outcome
attribute reflects the ACTUAL post-ack result (success / kafka_error /
circuit_open), not just the synchronous send call's outcome. New
opentelemetry-sdk-testing test dep (1.43.0, matching API version) wires an
InMemorySpanExporter; two new tests
hotPublishSpanOutcomeReflectsAsyncSuccess and
hotPublishSpanOutcomeReflectsAsyncKafkaError verify the attribute reaches
the exporter after the async callback completes.

F7 [INFO] removed the unbounded findByStatusOrderByCreatedAtAsc(Status)
List overload from OutboxRecordRepository. Only the Page/Pageable variant
survives; callers needing a List can iterate the Page or chain pages
explicitly.

Verification:
- mvn -B -pl outbox-publisher -am clean install → BUILD SUCCESS
- zero [WARNING] lines (rule 62)
- 41 unit + 3 IT = 44 tests pass (was 29 unit + 3 IT = 32)

Superseded by round 3.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 2 — head 70271161aff4, base main, trigger synchronize

TL;DR: NEEDS_WORK — kept 4 findings (1 major, 2 minor, 1 info); all unique-to-one-reviewer but individually verified against HEAD; B-1 status-guard omission is a live correctness race that can silently promote FAILED rows to SENT.

Summary

Arbitration — Round 2

No prior Memora history found (Memora persistence encountered tag-allowlist restrictions; run summary write was attempted). This is treated as the first recorded arbitration for this PR, though the PR itself is at round 2 with all 7 round-1 findings already resolved by the author.

Approach: All 4 findings (2 unique-to-A, 2 unique-to-B) required individual verification via Read. None overlap between A and B.

A-unique findings

  • A-1 (minor, line 192, OutboxPollerWorker): CONFIRMED. tryColdPublish catches InterruptedException, re-sets the flag with Thread.currentThread().interrupt(), then returns false to the for-loop in pollAndPublishPending (line 141). The loop has no guard; every subsequent kafkaTemplate.send(...).get(sendTimeoutMs, MILLISECONDS) call on remaining batch records will see the sticky interrupt flag and throw InterruptedException immediately, triggering persistFailureOutcome on work never attempted. Kept.
  • A-2 (info, line 277, OutboxPublisher): CONFIRMED. scope opened at line 274, whenComplete registered at lines 276–284, scope.close() fires in the synchronous finally block at line 291 — before the Kafka I/O thread ever invokes the callback. handleAsyncAck therefore runs with no active OTel context on the I/O thread; child spans from markSent/markFailureAttempt inside it are trace roots. Kept.

B-unique findings

  • B-1 (major, line 89, OutboxRecordRepository): CONFIRMED. markSent WHERE clause (lines 87–89): WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId — no status predicate. markFailureAttempt (lines 112–114): same. The Javadoc at line 83 claims the method returns 0 for already-terminal rows, but the query does not enforce that. Two distinct races are live: (a) delayed hot-relay success ack can promote a FAILED row back to SENT; (b) poller wins markSent, then hot-relay error callback bumps retries/lastError on the now-SENT row. @Version-based optimistic locking does not protect JPQL bulk updates. Kept.
  • B-2 (minor, deferred, line 30, OutboxProperties): CONFIRMED. No @Validated annotation on the class. Fields batchSize, sendTimeoutMs, failureRateThreshold carry no @Min/@Max constraints; invalid values silently bind and cause obscure runtime failures. B explicitly deferred this to a follow-on PR. Kept (minor/deferred).

Summary: 4 findings kept (1 major, 2 minor, 1 info). B-1 is a correctness gap that can silently corrupt terminal row states in production; it is the blocking concern.

Blast Radius

The diff touches 19 files across the entire outbox-publisher module — publisher, poller, repository, completion helper, metrics binder, auto-configuration, and properties. This is the core library consumed by all 7 aim2be JPA services. The status-guard omission (B-1) affects every row that traverses the hot+cold relay concurrently, which is the normal operating path under any broker hiccup.

BLAST_SCORE: 6/10

Risk Indicators

Indicator Value
Sensitive functions markSent, markFailureAttempt, markFailed, tryColdPublish, handleAsyncAck, persistFailureOutcome
Migration touched
Test delta
Dependency changes

CI status (head 70271161aff4)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details
  • PR-PLATFORM-2 (processed-kafka-events — consumer-side dedup)
  • PR-PLATFORM-3 (ApicurioCodec — Avro encoding)

Findings (4)

[MAJOR] markSent and markFailureAttempt lack a status guard — delayed hot-relay callbacks can corrupt terminal rows

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxRecordRepository.java:89

Both JPQL bulk-update queries match only on (aggregateId, eventId) with no status predicate:

// lines 87-89
"UPDATE OutboxRecord o "
+ "SET o.status = :newStatus, o.sentAt = :sentAt, o.lastError = NULL "
+ "WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId"

// lines 112-114
"UPDATE OutboxRecord o "
+ "SET o.retries = o.retries + 1, o.lastError = :lastError "
+ "WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId"

The Javadoc on markSent (line 83) claims the method returns 0 rows updated when the row is already terminal, but the query does not enforce this.

Race A — FAILED → SENT resurrection: With maxRetries=10 and poller.interval-ms=5000 the poller can exhaust the retry budget and write FAILED in ~50 s. The hot relay's whenComplete fires within the Kafka producer's request.timeout.ms (default 30 s, configurable higher). If the ack arrives after budget exhaustion: poller writes FAILED → operator is alerted → hot-relay ack arrives → completion.markSent executes markSent → WHERE clause matches the FAILED row → row silently promoted to SENT with no log entry.

Race B — SENT row retries corruption: Poller wins markSent first; row is SENT. Hot-relay callback fires with sendError != nullpersistFailureAttemptmarkFailureAttempt increments retries and writes lastError on the already-SENT row.

Note: @Version-based optimistic locking does NOT protect JPQL bulk updates — it only applies to entity-level save/merge operations.

Fix: Add a status predicate to both queries:

// markSent
"WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId "
+ "AND o.status = com.aim2be.platform.outbox.OutboxRecord.Status.PENDING"

// markFailureAttempt — same guard
"WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId "
+ "AND o.status = com.aim2be.platform.outbox.OutboxRecord.Status.PENDING"

With these guards a delayed second write affects 0 rows, matching the idempotency the Javadoc already promises.

[MINOR] InterruptedException re-sets flag but loop continues — spurious retry increments on shutdown

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:192

// tryColdPublish lines 192-195
} catch (InterruptedException interrupted) {
    Thread.currentThread().interrupt();  // interrupt flag re-set HERE
    persistFailureOutcome(record, interrupted);
    return false;  // control returns to the for-loop in pollAndPublishPending
}

After Thread.currentThread().interrupt() the sticky interrupt flag is set on the thread. pollAndPublishPending (line 141) continues iterating the remaining records in batch. Every subsequent kafkaTemplate.send(...).get(sendTimeoutMs, MILLISECONDS) call will immediately throw InterruptedException — triggering persistFailureOutcome on each remaining row. On a graceful JVM shutdown with a large batch, many rows receive a spurious retries++ and lastError update for work that was never attempted.

Fix — add an interrupt guard at the loop entry in pollAndPublishPending:

for (final OutboxRecord record : batch) {
    if (Thread.currentThread().isInterrupted()) {
        LOG.info("outbox.poller.tick interrupted — stopping mid-batch");
        break;
    }
    if (tryColdPublish(record)) {
        successCount++;
    } else {
        failureCount++;
    }
}

[MINOR] OutboxProperties has no JSR-303 validation — out-of-range values cause confusing runtime failures

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxProperties.java:30

The @ConfigurationProperties class carries no @Validated annotation and no field-level @Min/@Max/@Positive constraints. The following bad values silently pass binding and then blow up at call-site runtime:

Property Invalid value Runtime failure
poller.batch-size=0 PageRequest.of(0, 0) IllegalArgumentException: Page size must not be less than one
poller.send-timeout-ms=0 .get(0, MILLISECONDS) immediate TimeoutException on every cold send
circuit-breaker.failure-rate-threshold=0 CircuitBreakerConfig.custom().failureRateThreshold(0) IllegalArgumentException from Resilience4j at startup
circuit-breaker.failure-rate-threshold=101 same same

Note: B deferred this finding to a follow-on hardening PR since defaults are all valid and there is no correctness regression introduced in this PR. Recommended fix is to add @Validated to the class and @Min/@Max constraints to affected fields.

[INFO] OTel context not propagated into the Kafka I/O callback — DB child spans appear orphaned in traces

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:277

final Scope scope = span.makeCurrent();   // line 274
try {
    kafkaTemplate.send(topic, aggregateKey(aggregateId), payload)
            .whenComplete((result, sendError) -> handleAsyncAck(..., span, sendError)); // line 277
} catch (RuntimeException syncError) {
    handleAsyncAck(...);
} finally {
    scope.close();  // fires BEFORE whenComplete — OTel context removed from calling thread
}

scope.close() removes the span from the calling thread's OTel context in the synchronous finally block. The Kafka I/O thread that invokes whenComplete has no active OTel context. When handleAsyncAck calls completion.markSent(...) or completion.markFailureAttempt(...), any child spans those methods create appear as trace roots, disconnected from the im2be.outbox.publish.hot parent.

This is not a correctness regression — the round-1 span-lifecycle fix (ending the span inside handleAsyncAck) was necessary and correct; this is the follow-on observability gap that fix exposed.

Fix — capture the context before the scope closes and wrap the callback:

final Scope scope = span.makeCurrent();
final io.opentelemetry.context.Context capturedCtx = io.opentelemetry.context.Context.current();
try {
    kafkaTemplate.send(topic, aggregateKey(aggregateId), payload)
            .whenComplete(capturedCtx.wrap((result, sendError) ->
                    handleAsyncAck(topic, aggregateId, eventId, startedAt, startedNanos, span, sendError)));
} catch (RuntimeException syncError) {
    handleAsyncAck(topic, aggregateId, eventId, startedAt, startedNanos, span, syncError);
} finally {
    scope.close();
}

Verdict

NEEDS_WORK


hib-pr-reviewer • round 2 • 4 findings (1M/2m/1i) • 2026-05-27T00:05:22.793Z → 2026-05-27T00:08:11.217Z • posted-as: pr-reviewer-bot

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 3._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 2** — head `70271161aff4`, base `main`, trigger `synchronize` **TL;DR:** NEEDS_WORK — kept 4 findings (1 major, 2 minor, 1 info); all unique-to-one-reviewer but individually verified against HEAD; B-1 status-guard omission is a live correctness race that can silently promote FAILED rows to SENT. ### Summary ## Arbitration — Round 2 No prior Memora history found (Memora persistence encountered tag-allowlist restrictions; run summary write was attempted). This is treated as the first recorded arbitration for this PR, though the PR itself is at round 2 with all 7 round-1 findings already resolved by the author. **Approach:** All 4 findings (2 unique-to-A, 2 unique-to-B) required individual verification via `Read`. None overlap between A and B. ### A-unique findings - **A-1 (minor, line 192, `OutboxPollerWorker`)**: CONFIRMED. `tryColdPublish` catches `InterruptedException`, re-sets the flag with `Thread.currentThread().interrupt()`, then returns `false` to the for-loop in `pollAndPublishPending` (line 141). The loop has no guard; every subsequent `kafkaTemplate.send(...).get(sendTimeoutMs, MILLISECONDS)` call on remaining batch records will see the sticky interrupt flag and throw `InterruptedException` immediately, triggering `persistFailureOutcome` on work never attempted. **Kept.** - **A-2 (info, line 277, `OutboxPublisher`)**: CONFIRMED. `scope` opened at line 274, `whenComplete` registered at lines 276–284, `scope.close()` fires in the synchronous `finally` block at line 291 — before the Kafka I/O thread ever invokes the callback. `handleAsyncAck` therefore runs with no active OTel context on the I/O thread; child spans from `markSent`/`markFailureAttempt` inside it are trace roots. **Kept.** ### B-unique findings - **B-1 (major, line 89, `OutboxRecordRepository`)**: CONFIRMED. `markSent` WHERE clause (lines 87–89): `WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId` — no status predicate. `markFailureAttempt` (lines 112–114): same. The Javadoc at line 83 claims the method returns 0 for already-terminal rows, but the query does not enforce that. Two distinct races are live: (a) delayed hot-relay success ack can promote a FAILED row back to SENT; (b) poller wins `markSent`, then hot-relay error callback bumps `retries`/`lastError` on the now-SENT row. `@Version`-based optimistic locking does not protect JPQL bulk updates. **Kept.** - **B-2 (minor, deferred, line 30, `OutboxProperties`)**: CONFIRMED. No `@Validated` annotation on the class. Fields `batchSize`, `sendTimeoutMs`, `failureRateThreshold` carry no `@Min`/`@Max` constraints; invalid values silently bind and cause obscure runtime failures. B explicitly deferred this to a follow-on PR. **Kept (minor/deferred).** **Summary:** 4 findings kept (1 major, 2 minor, 1 info). B-1 is a correctness gap that can silently corrupt terminal row states in production; it is the blocking concern. ### Blast Radius The diff touches 19 files across the entire outbox-publisher module — publisher, poller, repository, completion helper, metrics binder, auto-configuration, and properties. This is the core library consumed by all 7 aim2be JPA services. The status-guard omission (B-1) affects every row that traverses the hot+cold relay concurrently, which is the normal operating path under any broker hiccup. **BLAST_SCORE: 6/10** ### Risk Indicators | Indicator | Value | |---|---| | Sensitive functions | `markSent`, `markFailureAttempt`, `markFailed`, `tryColdPublish`, `handleAsyncAck`, `persistFailureOutcome` | | Migration touched | — | | Test delta | — | | Dependency changes | — | ### CI status (head `70271161aff4`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/28/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/28/jobs/1) | ### Related PRs - PR-PLATFORM-2 (processed-kafka-events — consumer-side dedup) - PR-PLATFORM-3 (ApicurioCodec — Avro encoding) ### Findings (4) #### **[MAJOR]** markSent and markFailureAttempt lack a status guard — delayed hot-relay callbacks can corrupt terminal rows _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxRecordRepository.java:89_ Both JPQL bulk-update queries match only on `(aggregateId, eventId)` with no status predicate: ```java // lines 87-89 "UPDATE OutboxRecord o " + "SET o.status = :newStatus, o.sentAt = :sentAt, o.lastError = NULL " + "WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId" // lines 112-114 "UPDATE OutboxRecord o " + "SET o.retries = o.retries + 1, o.lastError = :lastError " + "WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId" ``` The Javadoc on `markSent` (line 83) claims the method returns 0 rows updated when the row is already terminal, but the query does not enforce this. **Race A — FAILED → SENT resurrection:** With `maxRetries=10` and `poller.interval-ms=5000` the poller can exhaust the retry budget and write `FAILED` in ~50 s. The hot relay's `whenComplete` fires within the Kafka producer's `request.timeout.ms` (default 30 s, configurable higher). If the ack arrives after budget exhaustion: poller writes `FAILED` → operator is alerted → hot-relay ack arrives → `completion.markSent` executes `markSent` → WHERE clause matches the **FAILED** row → row silently promoted to `SENT` with no log entry. **Race B — SENT row retries corruption:** Poller wins `markSent` first; row is `SENT`. Hot-relay callback fires with `sendError != null` → `persistFailureAttempt` → `markFailureAttempt` increments `retries` and writes `lastError` on the already-`SENT` row. **Note:** `@Version`-based optimistic locking does NOT protect JPQL bulk updates — it only applies to entity-level `save`/`merge` operations. **Fix:** Add a status predicate to both queries: ```java // markSent "WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId " + "AND o.status = com.aim2be.platform.outbox.OutboxRecord.Status.PENDING" // markFailureAttempt — same guard "WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId " + "AND o.status = com.aim2be.platform.outbox.OutboxRecord.Status.PENDING" ``` With these guards a delayed second write affects 0 rows, matching the idempotency the Javadoc already promises. #### **[MINOR]** InterruptedException re-sets flag but loop continues — spurious retry increments on shutdown _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:192_ ```java // tryColdPublish lines 192-195 } catch (InterruptedException interrupted) { Thread.currentThread().interrupt(); // interrupt flag re-set HERE persistFailureOutcome(record, interrupted); return false; // control returns to the for-loop in pollAndPublishPending } ``` After `Thread.currentThread().interrupt()` the sticky interrupt flag is set on the thread. `pollAndPublishPending` (line 141) continues iterating the remaining records in `batch`. Every subsequent `kafkaTemplate.send(...).get(sendTimeoutMs, MILLISECONDS)` call will immediately throw `InterruptedException` — triggering `persistFailureOutcome` on each remaining row. On a graceful JVM shutdown with a large batch, many rows receive a spurious `retries++` and `lastError` update for work that was never attempted. **Fix** — add an interrupt guard at the loop entry in `pollAndPublishPending`: ```java for (final OutboxRecord record : batch) { if (Thread.currentThread().isInterrupted()) { LOG.info("outbox.poller.tick interrupted — stopping mid-batch"); break; } if (tryColdPublish(record)) { successCount++; } else { failureCount++; } } ``` #### **[MINOR]** OutboxProperties has no JSR-303 validation — out-of-range values cause confusing runtime failures _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxProperties.java:30_ The `@ConfigurationProperties` class carries no `@Validated` annotation and no field-level `@Min`/`@Max`/`@Positive` constraints. The following bad values silently pass binding and then blow up at call-site runtime: | Property | Invalid value | Runtime failure | |---|---|---| | `poller.batch-size=0` | `PageRequest.of(0, 0)` | `IllegalArgumentException: Page size must not be less than one` | | `poller.send-timeout-ms=0` | `.get(0, MILLISECONDS)` | immediate `TimeoutException` on every cold send | | `circuit-breaker.failure-rate-threshold=0` | `CircuitBreakerConfig.custom().failureRateThreshold(0)` | `IllegalArgumentException` from Resilience4j at startup | | `circuit-breaker.failure-rate-threshold=101` | same | same | **Note:** B deferred this finding to a follow-on hardening PR since defaults are all valid and there is no correctness regression introduced in this PR. Recommended fix is to add `@Validated` to the class and `@Min`/`@Max` constraints to affected fields. #### **[INFO]** OTel context not propagated into the Kafka I/O callback — DB child spans appear orphaned in traces _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:277_ ```java final Scope scope = span.makeCurrent(); // line 274 try { kafkaTemplate.send(topic, aggregateKey(aggregateId), payload) .whenComplete((result, sendError) -> handleAsyncAck(..., span, sendError)); // line 277 } catch (RuntimeException syncError) { handleAsyncAck(...); } finally { scope.close(); // fires BEFORE whenComplete — OTel context removed from calling thread } ``` `scope.close()` removes the span from the calling thread's OTel context in the synchronous `finally` block. The Kafka I/O thread that invokes `whenComplete` has no active OTel context. When `handleAsyncAck` calls `completion.markSent(...)` or `completion.markFailureAttempt(...)`, any child spans those methods create appear as trace roots, disconnected from the `im2be.outbox.publish.hot` parent. This is not a correctness regression — the round-1 span-lifecycle fix (ending the span inside `handleAsyncAck`) was necessary and correct; this is the follow-on observability gap that fix exposed. **Fix** — capture the context before the scope closes and wrap the callback: ```java final Scope scope = span.makeCurrent(); final io.opentelemetry.context.Context capturedCtx = io.opentelemetry.context.Context.current(); try { kafkaTemplate.send(topic, aggregateKey(aggregateId), payload) .whenComplete(capturedCtx.wrap((result, sendError) -> handleAsyncAck(topic, aggregateId, eventId, startedAt, startedNanos, span, sendError))); } catch (RuntimeException syncError) { handleAsyncAck(topic, aggregateId, eventId, startedAt, startedNanos, span, syncError); } finally { scope.close(); } ``` ### Verdict **NEEDS_WORK** --- <sub>hib-pr-reviewer • round 2 • 4 findings (1M/2m/1i) • 2026-05-27T00:05:22.793Z → 2026-05-27T00:08:11.217Z • posted-as: pr-reviewer-bot</sub> </details>
fix(outbox): R3 — 4 R2 findings (1 MAJOR status-guard race + 2 MINOR + 1 INFO)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 2s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
e039f4ba85
F1 [MAJOR] markSent / markFailureAttempt / markFailed lacked a status guard
— a delayed hot-relay callback could corrupt terminal rows by re-writing
a FAILED row back to SENT (or vice versa) when fired after the cold poller
had already transitioned the row. All three JPQL UPDATEs now carry a
WHERE … status = PENDING clause; the int rowcount return value lets
OutboxPublishCompletion log a DEBUG no-op when the guard fires (rowcount=0
= row already terminal, transition skipped). Two new test classes cover
the contract: OutboxPublishCompletionTest exercises the three transitions
under both the happy-path and rowcount=0 scenarios, proving the helper does
NOT throw on guard-fire. The repository-level guard is exercised end-to-end
via the existing pollerSweepsPreInsertedPendingRows IT.

F2 [MINOR] poller's InterruptedException handling spuriously incremented
the retries counter for unattempted rows on graceful shutdown. The send
loop now propagates InterruptedException out of tryColdPublish; the per-row
for-loop in pollAndPublishPending catches it, re-sets the interrupt flag
via Thread.currentThread().interrupt(), and breaks out immediately —
rows 2..N in the batch are NOT touched (no retries++, no markFailureAttempt,
no markFailed). New OTel attribute outbox.poll.interrupted=true on the
tick span gives forensic visibility. Two new tests:
interruptedExceptionBreaksBatchAndDoesNotIncrementUnattemptedRetries (3-row
batch, asserts ZERO markFailureAttempt / markFailed / markSent calls AND
exactly 1 kafkaTemplate.send) and tryColdPublishPropagatesInterruptedException
(unit-level contract proof).

F3 [MINOR] OutboxProperties carried no JSR-303 constraints — out-of-range
values crashed deep inside JPA Pageable or Resilience4j config at first
use. Added @Validated to the class + @Valid on the nested @Configuration
holders + jakarta-validation constraints on every numeric field:
@Min(1)@Max(10000) batchSize, @Min(100) intervalMs, @Min(1) maxRetries,
@Min(100) sendTimeoutMs, @Min(1)@Max(100) failureRateThreshold,
@Min(100) waitDurationInOpenStateMs. New spring-boot-starter-validation
dependency (transitively brings hibernate-validator per Spring Boot
3.5.14 BOM). New OutboxPropertiesValidationTest exercises the property
binding via ApplicationContextRunner — 11 tests covering each constraint's
fail-fast behaviour plus a defaults-pass sanity check.

F4 [INFO] OTel context was not propagated into the KafkaTemplate.send()
whenComplete callback (which runs on Kafka's internal sender thread). DB
child spans emitted by OutboxPublishCompletion.markSent inside the callback
appeared as orphan roots in traces, breaking the trace structure. Fixed by
capturing Context.current() AFTER attaching the hot-publish span as current
(captures the span-carrying context) and restoring it via
publishContext.makeCurrent() as the first action inside the whenComplete
lambda. New test hotPublishWhenCompleteRestoresOtelContextForDbCalls wires
an InMemorySpanExporter + a stub completion that records a child span
inside markSent; asserts the child span's traceId equals the parent
hot-publish span's traceId — proof of context propagation across the
async boundary.

Verification:
- mvn -B -pl outbox-publisher -am clean install → BUILD SUCCESS
- zero [WARNING] lines (rule 62)
- 58 unit + 3 IT = 61 tests pass (was 41 unit + 3 IT = 44)
- mvn -B -pl outbox-publisher verify → BUILD SUCCESS (IT untouched)

Superseded by round 4.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 3 — head e039f4ba8550, base main, trigger synchronize

TL;DR: BLOCKED — kept 3 findings (1 blocking @EntityScan entity-scan clobber, 1 major MeterRegistry no-guard, 1 minor isTerminal API footgun); all unique-to-one reviewer but each verified by direct Read of the cited file:line.

Summary

Arbitration — Round 3

Memora persistence was attempted but the server rejected all tag configurations (memora/issues not in allowlist; memory_absorb required an additional permission grant); the Memora step is a best-effort annotation and does not affect the verdict.

Verification performed

Finding Reviewer File Line Read? Result
@EntityScan clobbers consumer entity scanning A-only OutboxAutoConfiguration.java 65 Confirmed — @EntityScan(basePackageClasses = OutboxRecord.class) present at line 65; no ImportBeanDefinitionRegistrar workaround
outboxMetricsBinder unconditional MeterRegistry dep A-only OutboxAutoConfiguration.java 122 Confirmed — lines 121–127: MeterRegistry is a required constructor arg; no @ConditionalOnBean guard; micrometer-core is provided in pom.xml
isTerminal() returns false for FAILED B-only OutboxRecord.java 81 Confirmed — lines 81–83: return this == SENT || this == ARCHIVED; Javadoc at 75–79 explains intent but method name creates downstream footgun

All three findings are unique-to-one reviewer but all three are grounded in the actual source at the cited lines. Per Rule 2 they are kept.

No prior-round findings re-raised (all 11 R1+R2 findings confirmed resolved by both reviewers; arbitration defers to that shared assessment).

Blast Radius

This is a shared library consumed by all 7 aim2be JPA services. The @EntityScan defect strikes at JPA bootstrap and will silently drop every consumer's own domain entities from Hibernate scanning the moment the library is adopted — a fleet-wide breakage at startup. The MeterRegistry defect gates context startup for any consumer not running Actuator. Both issues manifest before any business code runs.

BLAST_SCORE: 8/10

Risk Indicators

Indicator Value
Sensitive functions outboxMetricsBinder, outboxLockProvider, OutboxPollerWorker.pollAndPublishPending
Migration touched
Test delta
Dependency changes

CI status (head e039f4ba8550)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details

Findings (3)

[BLOCKING] @EntityScan(basePackageClasses = OutboxRecord.class) silently drops consuming-service entities from Hibernate scanning

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:65

Line 65: @EntityScan(basePackageClasses = OutboxRecord.class) is present on the auto-configuration class.

Spring's JpaBaseConfiguration.getPackagesToScan() applies this logic:

List<String> packages = EntityScanPackages.get(beanFactory).getPackageNames();
if (packages.isEmpty() && AutoConfigurationPackages.has(beanFactory)) {
    packages = AutoConfigurationPackages.get(beanFactory); // @SpringBootApplication package
}

This annotation registers ["com.aim2be.platform.outbox"] into EntityScanPackages. Because that list is now non-empty, the AutoConfigurationPackages fallback — which carries the consuming service's root package — is never reached. Any of the 7 consuming services that do not declare their own @EntityScan will have all their domain entities invisible to Hibernate at boot, producing silent EntityNotFoundException / MappingException failures in production.

The integration test does not catch this because ITConfig (@SpringBootApplication) lives in the same package as OutboxRecord (com.aim2be.platform.outbox).

Fix — remove the @EntityScan annotation and register the package additively via an ImportBeanDefinitionRegistrar:

@Import(OutboxEntityScanRegistrar.class)
public class OutboxAutoConfiguration {  }

class OutboxEntityScanRegistrar implements ImportBeanDefinitionRegistrar {
    @Override
    public void registerBeanDefinitions(AnnotationMetadata meta, BeanDefinitionRegistry reg) {
        EntityScanPackages.register(reg, OutboxRecord.class.getPackageName());
    }
}

This mirrors how Spring Data itself adds packages without clobbering the consumer's scanning.

[MAJOR] outboxMetricsBinder has an unconditional hard dependency on MeterRegistry; consuming services without Actuator cannot start

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:122

Lines 121–127:

@Bean
public OutboxMetricsBinder outboxMetricsBinder(
        final io.micrometer.core.instrument.MeterRegistry meterRegistry,  // required, no guard
        final OutboxRecordRepository repository
) {
    return new OutboxMetricsBinder(meterRegistry, repository);
}

micrometer-core is provided scope in pom.xml (the consumer is expected to bring spring-boot-starter-actuator), but there is no @ConditionalOnBean(MeterRegistry.class) guard. If a consuming service enables the outbox (im2be.outbox.enabled=true) without Actuator on the classpath, Spring throws NoSuchBeanDefinitionException: No qualifying bean of type 'MeterRegistry' at context refresh — before any business code runs.

OutboxPublisher and OutboxPollerWorker both declare OutboxMetricsBinder as a required constructor argument, so the failure cascades.

Fix option 1 — conditional + no-op fallback:

@Bean
@ConditionalOnMissingBean
public OutboxMetricsBinder outboxNoOpMetricsBinder() {
    return OutboxMetricsBinder.noOp();
}

@Bean
@ConditionalOnBean(MeterRegistry.class)
public OutboxMetricsBinder outboxMetricsBinder(
        final MeterRegistry meterRegistry,
        final OutboxRecordRepository repository
) {
    return new OutboxMetricsBinder(meterRegistry, repository);
}

Fix option 2 — promote spring-boot-starter-actuator to compile scope in pom.xml and document it as a hard requirement, so the build fails fast for consumers that omit it.

[MINOR] isTerminal() returns false for FAILED — misleading public API contract for downstream consumers

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxRecord.java:81

Lines 81–83:

public boolean isTerminal() {
    return this == SENT || this == ARCHIVED;  // FAILED not included
}

The Javadoc at lines 75–79 defines the return contract as "the poller MUST NOT re-attempt the row" and notes that FAILED rows "await operator inspection". The intent is correct: the poller queries only WHERE status = PENDING, so FAILED rows are never auto-retried regardless of isTerminal().

However, the method name isTerminal() conventionally means "nothing more will happen to this record automatically". A consumer in any of the 7 downstream services writing if (!record.getStatus().isTerminal()) { processRow(record); } will incorrectly include FAILED rows in an automated working set — rows that should be left for operator inspection or manual reset.

Preferred fix — widen the condition and split the Javadoc intent:

public boolean isTerminal() {
    return this == SENT || this == ARCHIVED || this == FAILED;
}

and document FAILED separately as "terminal but operator-resettable (→ PENDING)". Alternatively rename to isAutoRetriable() returning this == PENDING for unambiguous poller semantics. At minimum, add a bold warning to the FAILED enum constant's Javadoc.

Verdict

BLOCKED


hib-pr-reviewer • round 3 • 3 findings (1b/1M/1m) • 2026-05-27T00:26:11.452Z → 2026-05-27T00:28:30.870Z • posted-as: pr-reviewer-bot

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 4._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 3** — head `e039f4ba8550`, base `main`, trigger `synchronize` **TL;DR:** BLOCKED — kept 3 findings (1 blocking @EntityScan entity-scan clobber, 1 major MeterRegistry no-guard, 1 minor isTerminal API footgun); all unique-to-one reviewer but each verified by direct Read of the cited file:line. ### Summary ## Arbitration — Round 3 Memora persistence was attempted but the server rejected all tag configurations (`memora/issues` not in allowlist; `memory_absorb` required an additional permission grant); the Memora step is a best-effort annotation and does not affect the verdict. ### Verification performed | Finding | Reviewer | File | Line | Read? | Result | |---------|----------|------|------|-------|--------| | `@EntityScan` clobbers consumer entity scanning | A-only | `OutboxAutoConfiguration.java` | 65 | ✅ | Confirmed — `@EntityScan(basePackageClasses = OutboxRecord.class)` present at line 65; no `ImportBeanDefinitionRegistrar` workaround | | `outboxMetricsBinder` unconditional `MeterRegistry` dep | A-only | `OutboxAutoConfiguration.java` | 122 | ✅ | Confirmed — lines 121–127: `MeterRegistry` is a required constructor arg; no `@ConditionalOnBean` guard; `micrometer-core` is `provided` in pom.xml | | `isTerminal()` returns `false` for `FAILED` | B-only | `OutboxRecord.java` | 81 | ✅ | Confirmed — lines 81–83: `return this == SENT \|\| this == ARCHIVED`; Javadoc at 75–79 explains intent but method name creates downstream footgun | All three findings are unique-to-one reviewer but all three are grounded in the actual source at the cited lines. Per Rule 2 they are kept. No prior-round findings re-raised (all 11 R1+R2 findings confirmed resolved by both reviewers; arbitration defers to that shared assessment). ### Blast Radius This is a shared library consumed by all 7 aim2be JPA services. The `@EntityScan` defect strikes at JPA bootstrap and will silently drop every consumer's own domain entities from Hibernate scanning the moment the library is adopted — a fleet-wide breakage at startup. The `MeterRegistry` defect gates context startup for any consumer not running Actuator. Both issues manifest before any business code runs. **BLAST_SCORE: 8/10** ### Risk Indicators | Indicator | Value | |---|---| | Sensitive functions | `outboxMetricsBinder`, `outboxLockProvider`, `OutboxPollerWorker.pollAndPublishPending` | | Migration touched | — | | Test delta | — | | Dependency changes | — | ### CI status (head `e039f4ba8550`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/35/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/35/jobs/1) | ### Findings (3) #### **[BLOCKING]** `@EntityScan(basePackageClasses = OutboxRecord.class)` silently drops consuming-service entities from Hibernate scanning _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:65_ Line 65: `@EntityScan(basePackageClasses = OutboxRecord.class)` is present on the auto-configuration class. Spring's `JpaBaseConfiguration.getPackagesToScan()` applies this logic: ```java List<String> packages = EntityScanPackages.get(beanFactory).getPackageNames(); if (packages.isEmpty() && AutoConfigurationPackages.has(beanFactory)) { packages = AutoConfigurationPackages.get(beanFactory); // @SpringBootApplication package } ``` This annotation registers `["com.aim2be.platform.outbox"]` into `EntityScanPackages`. Because that list is now **non-empty**, the `AutoConfigurationPackages` fallback — which carries the consuming service's root package — is **never reached**. Any of the 7 consuming services that do not declare their own `@EntityScan` will have all their domain entities invisible to Hibernate at boot, producing silent `EntityNotFoundException` / `MappingException` failures in production. The integration test does not catch this because `ITConfig` (`@SpringBootApplication`) lives in the same package as `OutboxRecord` (`com.aim2be.platform.outbox`). **Fix** — remove the `@EntityScan` annotation and register the package additively via an `ImportBeanDefinitionRegistrar`: ```java @Import(OutboxEntityScanRegistrar.class) public class OutboxAutoConfiguration { … } class OutboxEntityScanRegistrar implements ImportBeanDefinitionRegistrar { @Override public void registerBeanDefinitions(AnnotationMetadata meta, BeanDefinitionRegistry reg) { EntityScanPackages.register(reg, OutboxRecord.class.getPackageName()); } } ``` This mirrors how Spring Data itself adds packages without clobbering the consumer's scanning. #### **[MAJOR]** `outboxMetricsBinder` has an unconditional hard dependency on `MeterRegistry`; consuming services without Actuator cannot start _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:122_ Lines 121–127: ```java @Bean public OutboxMetricsBinder outboxMetricsBinder( final io.micrometer.core.instrument.MeterRegistry meterRegistry, // required, no guard final OutboxRecordRepository repository ) { return new OutboxMetricsBinder(meterRegistry, repository); } ``` `micrometer-core` is `provided` scope in `pom.xml` (the consumer is expected to bring `spring-boot-starter-actuator`), but there is no `@ConditionalOnBean(MeterRegistry.class)` guard. If a consuming service enables the outbox (`im2be.outbox.enabled=true`) without Actuator on the classpath, Spring throws `NoSuchBeanDefinitionException: No qualifying bean of type 'MeterRegistry'` at context refresh — before any business code runs. `OutboxPublisher` and `OutboxPollerWorker` both declare `OutboxMetricsBinder` as a required constructor argument, so the failure cascades. **Fix option 1** — conditional + no-op fallback: ```java @Bean @ConditionalOnMissingBean public OutboxMetricsBinder outboxNoOpMetricsBinder() { return OutboxMetricsBinder.noOp(); } @Bean @ConditionalOnBean(MeterRegistry.class) public OutboxMetricsBinder outboxMetricsBinder( final MeterRegistry meterRegistry, final OutboxRecordRepository repository ) { return new OutboxMetricsBinder(meterRegistry, repository); } ``` **Fix option 2** — promote `spring-boot-starter-actuator` to `compile` scope in `pom.xml` and document it as a hard requirement, so the build fails fast for consumers that omit it. #### **[MINOR]** `isTerminal()` returns `false` for `FAILED` — misleading public API contract for downstream consumers _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxRecord.java:81_ Lines 81–83: ```java public boolean isTerminal() { return this == SENT || this == ARCHIVED; // FAILED not included } ``` The Javadoc at lines 75–79 defines the return contract as _"the poller MUST NOT re-attempt the row"_ and notes that `FAILED` rows _"await operator inspection"_. The intent is correct: the poller queries only `WHERE status = PENDING`, so FAILED rows are never auto-retried regardless of `isTerminal()`. However, the method name `isTerminal()` conventionally means _"nothing more will happen to this record automatically"_. A consumer in any of the 7 downstream services writing `if (!record.getStatus().isTerminal()) { processRow(record); }` will incorrectly include FAILED rows in an automated working set — rows that should be left for operator inspection or manual reset. **Preferred fix** — widen the condition and split the Javadoc intent: ```java public boolean isTerminal() { return this == SENT || this == ARCHIVED || this == FAILED; } ``` and document `FAILED` separately as _"terminal but operator-resettable (→ PENDING)"_. Alternatively rename to `isAutoRetriable()` returning `this == PENDING` for unambiguous poller semantics. At minimum, add a bold warning to the FAILED enum constant's Javadoc. ### Verdict **BLOCKED** --- <sub>hib-pr-reviewer • round 3 • 3 findings (1b/1M/1m) • 2026-05-27T00:26:11.452Z → 2026-05-27T00:28:30.870Z • posted-as: pr-reviewer-bot</sub> </details>
fix(outbox): R4 — 3 R3 findings (1 BLOCKING @EntityScan clobber + 1 MAJOR + 1 MINOR)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 2s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
4e34df6dbe
R3 verdict findings (kept=3):

(1) [BLOCKING] OutboxAutoConfiguration.java:65 — @EntityScan(basePackageClasses = OutboxRecord.class) OVERRIDES consumer's @SpringBootApplication entity scan. Spring's JpaBaseConfiguration.getPackagesToScan() consults EntityScanPackages first and falls back to AutoConfigurationPackages ONLY when EntityScanPackages is empty. Adding @EntityScan in a SHARED LIBRARY therefore silently drops every consumer-domain entity from Hibernate scanning at startup — fleet-fatal silent breakage.

Fix: Remove @EntityScan; add new OutboxAutoConfigurationPackagesRegistrar implements ImportBeanDefinitionRegistrar that calls AutoConfigurationPackages.register(beanFactory, "com.aim2be.platform.outbox") — the supported library-side hook that APPENDS to the existing list rather than replacing it. @Import-ed from OutboxAutoConfiguration.

(2) [MAJOR] OutboxAutoConfiguration.java:122 — outboxMetricsBinder bean takes MeterRegistry directly with no @ConditionalOnBean guard. micrometer-core is `provided` scope in pom.xml, so consumers without Spring Boot Actuator (or Micrometer on the classpath) fail context startup. The publisher + poller both depend on this bean unconditionally so making it conditional would cascade-break them.

Fix: Switch outboxMetricsBinder ctor signature to take ObjectProvider<MeterRegistry> + pass meterRegistry.getIfAvailable() (nullable) to OutboxMetricsBinder. OutboxMetricsBinder gained a no-op mode: when constructed with a null registry, gauge registration is skipped and recordPublishLatency / recordFailure return immediately. Publisher + poller wiring unchanged. Coverage: new test nullRegistryYieldsNoOpBinder.

(3) [MINOR] OutboxRecord.java:81 — Status.isTerminal() returned false for FAILED, contradicting:
  - OutboxPollerWorker:238 comment ("FAILED terminally")
  - The poller's actual behavior (fetches PENDING-only via findAllByStatusOrderByCreatedAtAscEventIdAsc; FAILED rows are never re-attempted automatically)
  - Naming intuition for "isTerminal"

Fix: Include FAILED in isTerminal(). Updated Javadoc to clarify FAILED is terminal from the poller's perspective; operator inspection MAY manually reset FAILED → PENDING to re-enter the non-terminal lane. Test flipped: OutboxRecordTest now asserts FAILED.isTerminal() == true.

Verification:
- mvn -B -pl outbox-publisher test → 64 tests passed (3 new: registrar additivity + bare-consumer-contrast + null-registry no-op)
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]
- New files:
  - outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfigurationPackagesRegistrar.java
  - outbox-publisher/src/test/java/com/aim2be/platform/outbox/OutboxAutoConfigurationPackagesRegistrarTest.java
  - outbox-publisher/src/test/java/com/aim2be/platform/outbox/fakeconsumer/FakeConsumerApplication.java

Ripple-effect inspection (rule 63):
- @EntityScan removal: OutboxAutoConfiguration's class Javadoc updated; @Import(...Registrar.class) replaces @EntityScan; the registrar runs as part of the autoconfig's BeanDefinitionRegistryPostProcessor pass, so JpaBaseConfiguration sees the appended package via AutoConfigurationPackages fallback. No other call sites to validate.
- MeterRegistry conditional: outboxMetricsBinder's only callers (outboxPublisher + outboxPollerWorker) take the binder by type; they continue to work whether the binder is wired against a real registry or running in no-op mode.
- isTerminal change: searched all callsites — single internal callsite is the OutboxRecordTest; updated.

Memora #3531 captures the @EntityScan-as-shared-lib-anti-pattern lesson for future shared-Spring-Boot-library work.

Superseded by round 5.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 4 — head 4e34df6dbeee, base main, trigger synchronize

TL;DR: NEEDS_WORK — kept 2 unique-to-B findings (1 major circuit-breaker config bypass, 1 minor public API leakage); all R1-R3 findings confirmed closed; Reviewer A's @EnableJpaRepositories note not escalated (A explicitly dropped it from their own findings).

Summary

Reconciliation: Round 4

Reviewer A returned 0 findings (JSON []); A's prose noted a potential @EnableJpaRepositories shadow issue but explicitly chose not to raise it as a structured finding. Reviewer B returned 2 unique-to-A findings. Both were verified via Read against HEAD.

B1 (Major) — KEPT

Verified in OutboxAutoConfiguration.java (lines 81-93, 154-169) and OutboxPublisher.java (lines 127-141):

  • outboxCircuitBreakerRegistry is @ConditionalOnMissingBean (line 82), so it is skipped when a consumer provides any CircuitBreakerRegistry bean.
  • outboxPublisher at line 158 injects whichever CircuitBreakerRegistry is in context and passes it directly to the OutboxPublisher(registry,…) constructor overload.
  • That constructor calls circuitBreakerRegistry.circuitBreaker(CIRCUIT_BREAKER_NAME) at line 137 — a Resilience4j lazy-create call that uses the registry's default config when no im2be-outbox-relay breaker is pre-registered, silently ignoring every im2be.outbox.circuit-breaker.* property the operator configured. The finding is accurate and the cited line matches reality.

B2 (Minor) — KEPT

Verified in OutboxRecordRepository.java (lines 92-103) and OutboxPublishCompletion.java (lines 64-65):

  • markSent is a public interface method accepting @Param("newStatus") OutboxRecord.Status newStatus.
  • JPQL at line 94 unconditionally writes o.status = :newStatus, o.sentAt = :sentAt — a caller passing Status.FAILED would persist a non-null sentAt on a never-delivered row.
  • OutboxPublishCompletion correctly hard-codes Status.SENT at its call site, but the public repository API exposes the unsafe overload to all downstream consumers.

A's @EnableJpaRepositories note — NOT raised

A's prose noted this but explicitly filed no finding (JSON []). The annotation is scoped to basePackageClasses = OutboxRecordRepository.class (line 70), limiting scan to the library's own package; the OutboxAutoConfigurationPackagesRegistrar handles entity scanning additively. Dropping this per A's own verdict.

Memora persistence attempted (3 calls); MCP permissions / tag allowlist prevented writes — run summary could not be stored.

Blast Radius

The major finding is in the library's auto-configuration wiring layer (OutboxAutoConfiguration), which is the entry point for all 7 consuming services that opt in via im2be.outbox.enabled=true. A consumer that provides their own Resilience4j registry — a common pattern in multi-library services — will silently receive a circuit breaker with default thresholds regardless of environment-tuned properties. The minor finding (OutboxRecordRepository) affects a public interface method but practical blast radius is limited since OutboxPublishCompletion is the sole intended caller within the library.

BLAST_SCORE: 5/10

Risk Indicators

Indicator Value
Sensitive functions outboxCircuitBreakerRegistry, outboxPublisher, OutboxPublisher.<init>(OutboxRecordRepository,KafkaTemplate,CircuitBreakerRegistry,OutboxPublishCompletion,OpenTelemetry,OutboxMetricsBinder)
Migration touched
Test delta
Dependency changes

CI status (head 4e34df6dbeee)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details

Findings (2)

[MAJOR] im2be.outbox.circuit-breaker.* properties silently ignored when consumer provides their own CircuitBreakerRegistry

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:163

The outboxCircuitBreakerRegistry bean is @ConditionalOnMissingBean (line 82), so it is skipped when a consuming service already declares a CircuitBreakerRegistry bean. The outboxPublisher factory (lines 154-169) injects whatever CircuitBreakerRegistry is present and passes it to the OutboxPublisher(registry,…) constructor overload (line 163). That overload calls circuitBreakerRegistry.circuitBreaker(CIRCUIT_BREAKER_NAME) at OutboxPublisher.java:137, which lazily creates the im2be-outbox-relay breaker using the consumer registry's default configuration — silently ignoring every im2be.outbox.circuit-breaker.failure-rate-threshold / wait-duration-in-open-state-ms property the operator set.

Fix: in outboxPublisher(), always configure the outbox breaker explicitly:

final CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .failureRateThreshold(properties.getCircuitBreaker().getFailureRateThreshold())
    .waitDurationInOpenState(
        Duration.ofMillis(properties.getCircuitBreaker().getWaitDurationInOpenStateMs()))
    .build();
final CircuitBreaker breaker =
    circuitBreakerRegistry.circuitBreaker(OutboxPublisher.CIRCUIT_BREAKER_NAME, config);
return new OutboxPublisher(repository, kafkaTemplate, breaker, completion, openTelemetry, metrics);

Add OutboxProperties properties as a parameter to outboxPublisher(). This is safe even when the library's own registry is in use — Resilience4j ignores the supplied config when the named breaker already exists, so the outboxCircuitBreakerRegistry pre-registration remains a useful warm-up step but is no longer the sole config source.

[MINOR] markSent exposes newStatus as a public parameter — allows callers to set sentAt on a non-SENT transition

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxRecordRepository.java:98

The method signature at lines 98-103 accepts @Param("newStatus") OutboxRecord.Status newStatus. The JPQL at line 94 writes o.status = :newStatus, o.sentAt = :sentAt unconditionally; a caller passing Status.ARCHIVED or Status.FAILED would persist a non-null sentAt on a row that was never successfully delivered. OutboxPublishCompletion hard-codes Status.SENT at line 65, but OutboxRecordRepository is a public interface — any downstream consumer of this library can call markSent(id1, id2, Status.FAILED, Instant.now()) without a compile-time error.

Fix: remove the newStatus parameter; hardcode SENT in the JPQL:

@Query("UPDATE OutboxRecord o "
    + "SET o.status = com.aim2be.platform.outbox.OutboxRecord.Status.SENT, "
    + "o.sentAt = :sentAt, o.lastError = NULL "
    + "WHERE o.aggregateId = :aggregateId "
    + "AND o.eventId = :eventId "
    + "AND o.status = com.aim2be.platform.outbox.OutboxRecord.Status.PENDING")
int markSent(
    @Param("aggregateId") UUID aggregateId,
    @Param("eventId")     UUID eventId,
    @Param("sentAt")      Instant sentAt
);

Update OutboxPublishCompletion.markSent() to drop the extra argument.

Verdict

NEEDS_WORK


hib-pr-reviewer • round 4 • 2 findings (1M/1m) • 2026-05-27T00:48:57.057Z → 2026-05-27T00:51:30.789Z • posted-as: pr-reviewer-bot

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 5._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 4** — head `4e34df6dbeee`, base `main`, trigger `synchronize` **TL;DR:** NEEDS_WORK — kept 2 unique-to-B findings (1 major circuit-breaker config bypass, 1 minor public API leakage); all R1-R3 findings confirmed closed; Reviewer A's @EnableJpaRepositories note not escalated (A explicitly dropped it from their own findings). ### Summary ## Reconciliation: Round 4 Reviewer A returned 0 findings (JSON `[]`); A's prose noted a potential `@EnableJpaRepositories` shadow issue but explicitly chose not to raise it as a structured finding. Reviewer B returned 2 unique-to-A findings. Both were verified via `Read` against HEAD. ### B1 (Major) — KEPT ✅ Verified in `OutboxAutoConfiguration.java` (lines 81-93, 154-169) and `OutboxPublisher.java` (lines 127-141): - `outboxCircuitBreakerRegistry` is `@ConditionalOnMissingBean` (line 82), so it is skipped when a consumer provides any `CircuitBreakerRegistry` bean. - `outboxPublisher` at line 158 injects whichever `CircuitBreakerRegistry` is in context and passes it directly to the `OutboxPublisher(registry,…)` constructor overload. - That constructor calls `circuitBreakerRegistry.circuitBreaker(CIRCUIT_BREAKER_NAME)` at line 137 — a Resilience4j lazy-create call that uses the **registry's default config** when no `im2be-outbox-relay` breaker is pre-registered, silently ignoring every `im2be.outbox.circuit-breaker.*` property the operator configured. The finding is accurate and the cited line matches reality. ### B2 (Minor) — KEPT ✅ Verified in `OutboxRecordRepository.java` (lines 92-103) and `OutboxPublishCompletion.java` (lines 64-65): - `markSent` is a `public` interface method accepting `@Param("newStatus") OutboxRecord.Status newStatus`. - JPQL at line 94 unconditionally writes `o.status = :newStatus, o.sentAt = :sentAt` — a caller passing `Status.FAILED` would persist a non-null `sentAt` on a never-delivered row. - `OutboxPublishCompletion` correctly hard-codes `Status.SENT` at its call site, but the public repository API exposes the unsafe overload to all downstream consumers. ### A's @EnableJpaRepositories note — NOT raised A's prose noted this but explicitly filed no finding (JSON `[]`). The annotation is scoped to `basePackageClasses = OutboxRecordRepository.class` (line 70), limiting scan to the library's own package; the `OutboxAutoConfigurationPackagesRegistrar` handles entity scanning additively. Dropping this per A's own verdict. Memora persistence attempted (3 calls); MCP permissions / tag allowlist prevented writes — run summary could not be stored. ### Blast Radius The major finding is in the library's auto-configuration wiring layer (`OutboxAutoConfiguration`), which is the entry point for all 7 consuming services that opt in via `im2be.outbox.enabled=true`. A consumer that provides their own Resilience4j registry — a common pattern in multi-library services — will silently receive a circuit breaker with default thresholds regardless of environment-tuned properties. The minor finding (`OutboxRecordRepository`) affects a public interface method but practical blast radius is limited since `OutboxPublishCompletion` is the sole intended caller within the library. **BLAST_SCORE: 5/10** ### Risk Indicators | Indicator | Value | |---|---| | Sensitive functions | `outboxCircuitBreakerRegistry`, `outboxPublisher`, `OutboxPublisher.<init>(OutboxRecordRepository,KafkaTemplate,CircuitBreakerRegistry,OutboxPublishCompletion,OpenTelemetry,OutboxMetricsBinder)` | | Migration touched | — | | Test delta | — | | Dependency changes | — | ### CI status (head `4e34df6dbeee`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/38/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/38/jobs/1) | ### Findings (2) #### **[MAJOR]** `im2be.outbox.circuit-breaker.*` properties silently ignored when consumer provides their own `CircuitBreakerRegistry` _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:163_ The `outboxCircuitBreakerRegistry` bean is `@ConditionalOnMissingBean` (line 82), so it is skipped when a consuming service already declares a `CircuitBreakerRegistry` bean. The `outboxPublisher` factory (lines 154-169) injects whatever `CircuitBreakerRegistry` is present and passes it to the `OutboxPublisher(registry,…)` constructor overload (line 163). That overload calls `circuitBreakerRegistry.circuitBreaker(CIRCUIT_BREAKER_NAME)` at `OutboxPublisher.java:137`, which lazily creates the `im2be-outbox-relay` breaker using the consumer registry's **default** configuration — silently ignoring every `im2be.outbox.circuit-breaker.failure-rate-threshold` / `wait-duration-in-open-state-ms` property the operator set. **Fix**: in `outboxPublisher()`, always configure the outbox breaker explicitly: ```java final CircuitBreakerConfig config = CircuitBreakerConfig.custom() .failureRateThreshold(properties.getCircuitBreaker().getFailureRateThreshold()) .waitDurationInOpenState( Duration.ofMillis(properties.getCircuitBreaker().getWaitDurationInOpenStateMs())) .build(); final CircuitBreaker breaker = circuitBreakerRegistry.circuitBreaker(OutboxPublisher.CIRCUIT_BREAKER_NAME, config); return new OutboxPublisher(repository, kafkaTemplate, breaker, completion, openTelemetry, metrics); ``` Add `OutboxProperties properties` as a parameter to `outboxPublisher()`. This is safe even when the library's own registry is in use — Resilience4j ignores the supplied config when the named breaker already exists, so the `outboxCircuitBreakerRegistry` pre-registration remains a useful warm-up step but is no longer the sole config source. #### **[MINOR]** `markSent` exposes `newStatus` as a public parameter — allows callers to set `sentAt` on a non-SENT transition _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxRecordRepository.java:98_ The method signature at lines 98-103 accepts `@Param("newStatus") OutboxRecord.Status newStatus`. The JPQL at line 94 writes `o.status = :newStatus, o.sentAt = :sentAt` unconditionally; a caller passing `Status.ARCHIVED` or `Status.FAILED` would persist a non-null `sentAt` on a row that was never successfully delivered. `OutboxPublishCompletion` hard-codes `Status.SENT` at line 65, but `OutboxRecordRepository` is a `public` interface — any downstream consumer of this library can call `markSent(id1, id2, Status.FAILED, Instant.now())` without a compile-time error. **Fix**: remove the `newStatus` parameter; hardcode `SENT` in the JPQL: ```java @Query("UPDATE OutboxRecord o " + "SET o.status = com.aim2be.platform.outbox.OutboxRecord.Status.SENT, " + "o.sentAt = :sentAt, o.lastError = NULL " + "WHERE o.aggregateId = :aggregateId " + "AND o.eventId = :eventId " + "AND o.status = com.aim2be.platform.outbox.OutboxRecord.Status.PENDING") int markSent( @Param("aggregateId") UUID aggregateId, @Param("eventId") UUID eventId, @Param("sentAt") Instant sentAt ); ``` Update `OutboxPublishCompletion.markSent()` to drop the extra argument. ### Verdict **NEEDS_WORK** --- <sub>hib-pr-reviewer • round 4 • 2 findings (1M/1m) • 2026-05-27T00:48:57.057Z → 2026-05-27T00:51:30.789Z • posted-as: pr-reviewer-bot</sub> </details>
fix(outbox): R5 — 2 R4 findings (1 MAJOR config-bypass + 1 MINOR markSent API footgun)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 3s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
746ba9a66b
R4 verdict findings (kept=2):

(1) [MAJOR] OutboxAutoConfiguration.java:81-93 + OutboxPublisher.java:127-141 — outboxCircuitBreakerRegistry is @ConditionalOnMissingBean; when a consumer provides their own CircuitBreakerRegistry, the breaker named "im2be-outbox-relay" gets lazy-created via the OutboxPublisher registry-overload constructor (registry.circuitBreaker(NAME)) with the consumer-registry's DEFAULT config, silently ignoring every im2be.outbox.circuit-breaker.* property.

Fix: introduce a dedicated `outboxCircuitBreaker` bean that builds an EXPLICIT CircuitBreakerConfig from OutboxProperties and calls registry.circuitBreaker(NAME, config) — this creates-or-returns the named breaker with our explicit config regardless of registry source. The outboxPublisher bean now takes CircuitBreaker directly (not the registry). Library no longer imposes its outbox-specific thresholds on the registry's default config (the conditional registry bean reverts to ofDefaults()) so consumers' OTHER breakers in the same registry aren't accidentally configured against our policy. The registry-overload constructor on OutboxPublisher is REMOVED — the public API no longer carries the footgun shape.

(2) [MINOR] OutboxRecordRepository.java:92-103 — `int markSent(aggregateId, eventId, @Param("newStatus") Status newStatus, sentAt)` accepts the target status as a method parameter; a caller passing Status.FAILED would persist a non-null sentAt on a never-delivered row. OutboxPublishCompletion hard-coded Status.SENT internally, but the public repository API exposed the unsafe overload.

Fix: drop the newStatus parameter; hardcode OutboxRecord.Status.SENT inside the JPQL `SET o.status = ...SENT`. Callers no longer have any path to write sentAt with a non-SENT status. Distinct lifecycle transitions belong on distinct repository methods. Updated OutboxPublishCompletion + OutboxPublishCompletionTest stubs.

Verification:
- mvn -B -pl outbox-publisher test → 64 tests passed
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]

Ripple-effect inspection (rule 63):
- CircuitBreaker bean rewiring: registry bean is now ofDefaults() to avoid contaminating consumer's other breakers with our 50% failure-rate default; outbox-specific config lives on the named breaker only. The OutboxPublisher direct-CircuitBreaker ctor was already present (the registry-overload was the offender), so the publisher's runtime behavior is unchanged when fed the same breaker — only the construction path differs.
- markSent signature change: searched all callsites — OutboxPublishCompletion is the only caller in src/main; tests adjusted to match the new 3-arg overload. No external consumer was depending on the unsafe newStatus parameter at HEAD.
- Removed OutboxPublisher's registry-overload constructor — its only autoconfig user is gone, and the tests now use the direct CircuitBreaker constructor matching the autoconfig path.

Memora #3531 + #3524 capture the lessons catalogue.

Superseded by round 6.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 5 — head 746ba9a66bc6, base main, trigger synchronize

TL;DR: CONDITIONAL_APPROVE — kept 1 agreed finding (SpEL ShedLock bypass + README gap) and 1 verified unique-to-B finding (stale retries snapshot concurrency bug); no findings dropped; all 16 prior-round issues confirmed resolved.

Summary

Arbitration — Round 5

Memora note: Memory persistence failed (tool permission denied); run history not stored. Reconciliation proceeded on in-session evidence.

Verification performed

  • OutboxPollerWorker.java lines 115–165: confirmed @SchedulerLock at lines 121-123 reads both lock-at-most and lock-at-least via raw SpEL (${im2be.outbox.poller.lock-at-most:PT1M} / ${im2be.outbox.poller.lock-at-least:PT0S}).
  • OutboxProperties.java: grep for lockAtMost, lockAtLeast, lock-at-most, lock-at-least returned no matches — no typed fields, no JSR-303 constraints.
  • README.md line 157: lock-at-most present in config table; lock-at-least absent.
  • OutboxPollerWorker.java lines 250-255: final int currentRetries = record.getRetries() at line 253 confirmed as in-memory batch-load snapshot used for budgetExhausted evaluation.
  • OutboxRecordRepository.java markFailureAttempt JPQL: SET o.retries = o.retries + 1 — DB-side arithmetic, guarded only on status = PENDING, not on retries count. Confirms that a concurrent hot-relay call between batch fetch and persistFailureOutcome can push DB retries to maxRetries + 1 before the poller evaluates the stale budget.
  • OutboxPublishCompletion.java: no additional guards on retries in markFailureAttempt path.

Reconciliation decisions

Finding A B Action
@SchedulerLock SpEL bypass / README gap ✓ (line 121) ✓ (line 123) KEEP — both agree; B's citation (line 123) is more precise for lock-at-least specifically; B's body more technically accurate (SpEL-from-bean not viable for annotation constants → doc fix is primary mitigation)
Stale record.getRetries() at line 253 KEEP — unique-to-B, fully verified: stale snapshot + unguarded markFailureAttempt JPQL confirms the one-over-budget race

No prior-run findings re-raised. No findings dropped.

Blast Radius

Changes are confined to a single library module (outbox-publisher) but the module is infrastructure shared by all 7 aim2be JPA services. Both findings are in OutboxPollerWorker — the cold-path sweep component — and do not touch the hot-relay or entity write paths. Blast radius is moderate: a misconfigured ShedLock duration or a stale-retries race would affect the poller tick on any consumer service, but the impact is bounded to one extra retry per window rather than data loss or unavailability.

BLAST_SCORE: 4/10

Risk Indicators

Indicator Value
Sensitive functions pollAndPublishPending, persistFailureOutcome, markFailureAttempt, markFailed
Migration touched
Test delta
Dependency changes

CI status (head 746ba9a66bc6)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details
  • PR-PLATFORM-2 (processed-kafka-events / consumer-side dedup)
  • PR-PLATFORM-3 (ApicurioCodec / Avro encoding)

Findings (2)

[MINOR] lock-at-most and lock-at-least bypass OutboxProperties/@Validated; lock-at-least absent from README

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:123

@SchedulerLock(name = LOCK_NAME,
        lockAtMostFor = "${im2be.outbox.poller.lock-at-most:PT1M}",
        lockAtLeastFor = "${im2be.outbox.poller.lock-at-least:PT0S}")

Neither lock-at-most nor lock-at-least has a typed field in OutboxProperties.Poller — a grep across the entire OutboxProperties.java file returns no matches for either name. This means:

  1. No @Validated / JSR-303 protection. An invalid ISO-8601 string (e.g. lock-at-most=5000, missing the PT prefix) survives startup and only produces a cryptic DateTimeParseException from ShedLock on the first tick, not a BindValidationException at context initialisation.
  2. lock-at-least is entirely absent from the README configuration reference table (line 157 of README.md lists lock-at-most but not lock-at-least), giving operators no documented discovery path.

Important constraint: @SchedulerLock annotation members are compile-time constants — they cannot source values via #{@outboxProperties.poller.lockAtMostFor} SpEL bean-reference syntax; only ${property.key} placeholder resolution is available. Adding typed fields to OutboxProperties.Poller is therefore aspirational (you cannot wire the annotation member to the bean field at runtime). The actionable fix is documentation:

  • Add lock-at-least to the README configuration reference table alongside lock-at-most.
  • Add a comment on or near the @SchedulerLock annotation noting that both values must be valid ISO-8601 Duration strings (e.g. PT30S, not 30000).
  • Optionally add a @PostConstruct or SmartInitializingSingleton validation in OutboxPollerWorker that parses and validates the resolved placeholder strings at startup so misconfiguration surfaces before the first tick.

[MINOR] Stale in-memory retries snapshot allows one over-budget retry under concurrent hot + cold relay

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:253

final int currentRetries = record.getRetries();   // ← snapshot from batch load (lines 126-130)
final int maxRetries = properties.getPoller().getMaxRetries();
final boolean budgetExhausted = currentRetries >= maxRetries;

record.getRetries() reflects the value at the moment findByStatusOrderByCreatedAtAsc ran. A concurrent hot-relay markFailureAttempt call (SET o.retries = o.retries + 1 — JPQL guarded only on status = PENDING, not on retries count) that fires between the batch fetch and this budget-exhaustion check increments the DB counter without the poller seeing it.

Concrete race:

  1. DB retries = N−1 (one below maxRetries = N).
  2. Batch load: record.getRetries() → N−1.
  3. Hot-relay markFailureAttempt fires: DB retries becomes N.
  4. Poller evaluates: (N-1) >= N → false → calls markFailureAttempt again → DB retries becomes N+1.
  5. Row is now PENDING with retries = N+1 = maxRetries+1, violating the maxRetries contract.
  6. Next tick: fresh fetch sees N+1 ≥ N → correctly calls markFailed.

Impact is bounded (self-corrects on next tick, row does not become permanently stuck), but the maxRetries property contract is silently violated by one attempt per stale-read window, and last_error reflects an extra failure attempt.

Suggested fix: Move the budget-exhaustion decision to the JPQL layer so the DB evaluates the current retries value atomically:

-- markFailedIfBudgetExhausted
UPDATE OutboxRecord o
SET o.status = FAILED, o.lastError = :lastError
WHERE o.aggregateId = :aggregateId
AND o.eventId = :eventId
AND o.status = PENDING
AND o.retries >= :maxRetries

-- markFailureAttemptIfBudgetRemaining
UPDATE OutboxRecord o
SET o.retries = o.retries + 1, o.lastError = :lastError
WHERE o.aggregateId = :aggregateId
AND o.eventId = :eventId
AND o.status = PENDING
AND o.retries < :maxRetries

The return value of each (0 = no-op) tells the poller which transition was applied, eliminating the stale-snapshot race entirely.

Verdict

CONDITIONAL_APPROVE


hib-pr-reviewer • round 5 • 2 findings (2m) • 2026-05-27T01:02:25.798Z → 2026-05-27T01:05:13.485Z • posted-as: pr-reviewer-bot

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 6._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 5** — head `746ba9a66bc6`, base `main`, trigger `synchronize` **TL;DR:** CONDITIONAL_APPROVE — kept 1 agreed finding (SpEL ShedLock bypass + README gap) and 1 verified unique-to-B finding (stale retries snapshot concurrency bug); no findings dropped; all 16 prior-round issues confirmed resolved. ### Summary ## Arbitration — Round 5 **Memora note:** Memory persistence failed (tool permission denied); run history not stored. Reconciliation proceeded on in-session evidence. ### Verification performed - `OutboxPollerWorker.java` lines 115–165: confirmed `@SchedulerLock` at lines 121-123 reads both `lock-at-most` and `lock-at-least` via raw SpEL (`${im2be.outbox.poller.lock-at-most:PT1M}` / `${im2be.outbox.poller.lock-at-least:PT0S}`). - `OutboxProperties.java`: grep for `lockAtMost`, `lockAtLeast`, `lock-at-most`, `lock-at-least` returned **no matches** — no typed fields, no JSR-303 constraints. - `README.md` line 157: `lock-at-most` present in config table; `lock-at-least` **absent**. - `OutboxPollerWorker.java` lines 250-255: `final int currentRetries = record.getRetries()` at line 253 confirmed as in-memory batch-load snapshot used for `budgetExhausted` evaluation. - `OutboxRecordRepository.java` `markFailureAttempt` JPQL: `SET o.retries = o.retries + 1` — DB-side arithmetic, guarded only on `status = PENDING`, **not** on retries count. Confirms that a concurrent hot-relay call between batch fetch and `persistFailureOutcome` can push DB retries to `maxRetries + 1` before the poller evaluates the stale budget. - `OutboxPublishCompletion.java`: no additional guards on retries in `markFailureAttempt` path. ### Reconciliation decisions | Finding | A | B | Action | |---|---|---|---| | `@SchedulerLock` SpEL bypass / README gap | ✓ (line 121) | ✓ (line 123) | **KEEP** — both agree; B's citation (line 123) is more precise for `lock-at-least` specifically; B's body more technically accurate (SpEL-from-bean not viable for annotation constants → doc fix is primary mitigation) | | Stale `record.getRetries()` at line 253 | — | ✓ | **KEEP** — unique-to-B, fully verified: stale snapshot + unguarded `markFailureAttempt` JPQL confirms the one-over-budget race | No prior-run findings re-raised. No findings dropped. ### Blast Radius Changes are confined to a single library module (`outbox-publisher`) but the module is infrastructure shared by all 7 aim2be JPA services. Both findings are in `OutboxPollerWorker` — the cold-path sweep component — and do not touch the hot-relay or entity write paths. Blast radius is moderate: a misconfigured ShedLock duration or a stale-retries race would affect the poller tick on any consumer service, but the impact is bounded to one extra retry per window rather than data loss or unavailability. **BLAST_SCORE: 4/10** ### Risk Indicators | Indicator | Value | |---|---| | Sensitive functions | `pollAndPublishPending`, `persistFailureOutcome`, `markFailureAttempt`, `markFailed` | | Migration touched | — | | Test delta | — | | Dependency changes | — | ### CI status (head `746ba9a66bc6`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/39/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/39/jobs/1) | ### Related PRs - PR-PLATFORM-2 (processed-kafka-events / consumer-side dedup) - PR-PLATFORM-3 (ApicurioCodec / Avro encoding) ### Findings (2) #### **[MINOR]** `lock-at-most` and `lock-at-least` bypass `OutboxProperties`/@Validated; `lock-at-least` absent from README _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:123_ ```java @SchedulerLock(name = LOCK_NAME, lockAtMostFor = "${im2be.outbox.poller.lock-at-most:PT1M}", lockAtLeastFor = "${im2be.outbox.poller.lock-at-least:PT0S}") ``` Neither `lock-at-most` nor `lock-at-least` has a typed field in `OutboxProperties.Poller` — a grep across the entire `OutboxProperties.java` file returns no matches for either name. This means: 1. **No `@Validated` / JSR-303 protection.** An invalid ISO-8601 string (e.g. `lock-at-most=5000`, missing the `PT` prefix) survives startup and only produces a cryptic `DateTimeParseException` from ShedLock on the first tick, not a `BindValidationException` at context initialisation. 2. **`lock-at-least` is entirely absent from the README configuration reference table** (line 157 of README.md lists `lock-at-most` but not `lock-at-least`), giving operators no documented discovery path. **Important constraint:** `@SchedulerLock` annotation members are compile-time constants — they cannot source values via `#{@outboxProperties.poller.lockAtMostFor}` SpEL bean-reference syntax; only `${property.key}` placeholder resolution is available. Adding typed fields to `OutboxProperties.Poller` is therefore aspirational (you cannot wire the annotation member to the bean field at runtime). **The actionable fix is documentation:** - Add `lock-at-least` to the README configuration reference table alongside `lock-at-most`. - Add a comment on or near the `@SchedulerLock` annotation noting that both values must be valid ISO-8601 Duration strings (e.g. `PT30S`, not `30000`). - Optionally add a `@PostConstruct` or `SmartInitializingSingleton` validation in `OutboxPollerWorker` that parses and validates the resolved placeholder strings at startup so misconfiguration surfaces before the first tick. #### **[MINOR]** Stale in-memory `retries` snapshot allows one over-budget retry under concurrent hot + cold relay _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:253_ ```java final int currentRetries = record.getRetries(); // ← snapshot from batch load (lines 126-130) final int maxRetries = properties.getPoller().getMaxRetries(); final boolean budgetExhausted = currentRetries >= maxRetries; ``` `record.getRetries()` reflects the value at the moment `findByStatusOrderByCreatedAtAsc` ran. A concurrent hot-relay `markFailureAttempt` call (`SET o.retries = o.retries + 1` — JPQL guarded only on `status = PENDING`, **not** on retries count) that fires between the batch fetch and this budget-exhaustion check increments the DB counter without the poller seeing it. **Concrete race:** 1. DB retries = N−1 (one below `maxRetries = N`). 2. Batch load: `record.getRetries()` → N−1. 3. Hot-relay `markFailureAttempt` fires: DB retries becomes N. 4. Poller evaluates: `(N-1) >= N` → false → calls `markFailureAttempt` again → DB retries becomes N+1. 5. **Row is now PENDING with retries = N+1 = maxRetries+1**, violating the `maxRetries` contract. 6. Next tick: fresh fetch sees N+1 ≥ N → correctly calls `markFailed`. Impact is bounded (self-corrects on next tick, row does not become permanently stuck), but the `maxRetries` property contract is silently violated by one attempt per stale-read window, and `last_error` reflects an extra failure attempt. **Suggested fix:** Move the budget-exhaustion decision to the JPQL layer so the DB evaluates the current `retries` value atomically: ```sql -- markFailedIfBudgetExhausted UPDATE OutboxRecord o SET o.status = FAILED, o.lastError = :lastError WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId AND o.status = PENDING AND o.retries >= :maxRetries -- markFailureAttemptIfBudgetRemaining UPDATE OutboxRecord o SET o.retries = o.retries + 1, o.lastError = :lastError WHERE o.aggregateId = :aggregateId AND o.eventId = :eventId AND o.status = PENDING AND o.retries < :maxRetries ``` The return value of each (0 = no-op) tells the poller which transition was applied, eliminating the stale-snapshot race entirely. ### Verdict **CONDITIONAL_APPROVE** --- <sub>hib-pr-reviewer • round 5 • 2 findings (2m) • 2026-05-27T01:02:25.798Z → 2026-05-27T01:05:13.485Z • posted-as: pr-reviewer-bot</sub> </details>
fix(outbox): R6 — 2 R5 findings (ShedLock typed properties + atomic retry-budget race)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 2s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
44a18b19cb
R5 verdict findings (kept=2):

(1) [AGREED A+B] OutboxPollerWorker:121-123 + README — @SchedulerLock reads `${im2be.outbox.poller.lock-at-most:PT1M}` and `${im2be.outbox.poller.lock-at-least:PT0S}` via raw property-placeholder SpEL (annotation values can NOT be bound from a @ConfigurationProperties bean at the annotation level — Spring resolves annotation strings via the Environment's PropertyResolver). OutboxProperties.Poller had NO typed fields for either knob and README.md documented only `lock-at-most`.

Fix: added typed `String lockAtMost` (default "PT1M") + `String lockAtLeast` (default "PT0S") to OutboxProperties.Poller with @Pattern JSR-303 validation enforcing ISO-8601 duration format (PT30S, PT1M, PT5M30S, etc.). The typed fields are the documentation + binding-time validation surface — operators see them in the bound OutboxProperties bean and an `im2be.outbox.poller.lock-at-most=not-a-duration` startup property fails binding fast instead of silently propagating to @SchedulerLock as raw text. Added 3 new property-validation tests. README config table updated to document `lock-at-least` (plus `max-retries` + `send-timeout-ms` rows that were missing).

(2) [UNIQUE-TO-B] OutboxPollerWorker:253 — `final int currentRetries = record.getRetries()` read the IN-MEMORY batch-load snapshot of retries; OutboxRecordRepository.markFailureAttempt's JPQL did unguarded `SET o.retries = o.retries + 1` against the DB. A concurrent hot-relay failure between batch fetch and persistFailureOutcome could push DB retries past the snapshot value, leaving the poller to compute `budgetExhausted=false` when the DB already showed `>= maxRetries` — one extra increment past budget.

Fix: refactored OutboxRecordRepository.markFailureAttempt to a single atomic JPQL UPDATE with a CASE-WHEN that increments retries AND conditionally transitions the row to FAILED when `retries + 1 >= maxRetries`. The budget is now evaluated INSIDE the UPDATE statement against the database's CURRENT retries value (not against any in-memory snapshot). markFailureAttempt signature now takes maxRetries; piped through OutboxPublishCompletion + OutboxPollerWorker + OutboxPublisher (which gained an OutboxProperties dep so the hot-relay path uses the same atomic semantics).

The Java-side `budgetExhausted` decision in persistFailureOutcome is removed entirely; the in-memory `record.getRetries()` is used only as best-effort input for the log line + Span attribute (`pending_retry` vs `pending_to_failed`). Both findings' fixes use the same configured `im2be.outbox.poller.max-retries` value.

Verification:
- mvn -B -pl outbox-publisher test → 67 tests passed (3 new validation tests: lockAtMostMalformedFailsBinding, lockAtLeastMalformedFailsBinding, lockDurationsOverridePassValidation; defaultsPassValidation extended to assert ShedLock defaults)
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]

Ripple-effect inspection (rule 63):
- markFailureAttempt signature changed everywhere — repo, completion, poller, publisher all updated. Test stubs across OutboxPublishCompletionTest, OutboxPollerWorkerTest, OutboxPublisherTest updated with 4-arg matcher signatures.
- rowAtMaxRetriesTransitionsToFailed + failedRowsAreNotReSweptByPoller tests rewritten: previous shape asserted poller called markFailed() at budget exhaustion. New shape asserts markFailureAttempt() with maxRetries arg — the atomic JPQL handles the transition.
- markFailed() repo + completion methods preserved for direct operator-driven terminal transitions (operator runbook reset path); poller no longer calls markFailed() in the normal failure flow.

Superseded by round 7.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 6 — head 44a18b19cb22, base main, trigger synchronize

TL;DR: CONDITIONAL_APPROVE — kept 1 agreed minor finding (dead markFailed public API); all prior-round findings confirmed resolved.

Summary

Arbitration — Round 6

Memora recall: no prior run history found (both queries returned 0 results). Memora persist attempted but failed due to tag permission errors; run summary could not be stored.

Verification performed:

  • Read OutboxPublishCompletion.java lines 120–141 → confirmed markFailed exists at line 129 exactly as cited by both A and B.
  • Read OutboxRecordRepository.java lines 185–204 → confirmed markFailed exists at lines 192–203 (B's cite range) / line 199 (method declaration).
  • Grep for markFailed across all .java files → confirmed zero production callers; all production-path hits are the two method definitions themselves plus a Javadoc reference in OutboxProperties.java:110; test-side hits are exclusively in OutboxPublishCompletionTest (unit-tests of the method itself) and four verify(completion, never()).markFailed(...) assertions in OutboxPollerWorkerTest at lines 160, 183, 259, 329 — exactly as cited by both reviewers.

Reconciliation outcome: 1 agreed finding kept (Rule 1 — both A and B cite the same file, same line, materially-equivalent body). No unique-to-one findings to adjudicate. Severity preserved as minor.

Blast Radius

The diff is self-contained within the outbox-publisher submodule. The dead markFailed methods are on an internal completion helper and a Spring Data repository interface — both unexported outside the module's own package. Impact is limited to future contributors who might mis-use the orphaned surface before it is removed.

BLAST_SCORE: 3/10

Risk Indicators

Indicator Value
Sensitive functions OutboxPublishCompletion.markFailureAttempt, OutboxPollerWorker.pollAndPublishPending, OutboxPublisher.tryHotPublish
Migration touched
Test delta
Dependency changes

CI status (head 44a18b19cb22)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details

Findings (1)

[MINOR] markFailed is orphaned dead public API — bypasses atomic retry-budget enforcement

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublishCompletion.java:129

OutboxPublishCompletion.markFailed (line 129–140) and OutboxRecordRepository.markFailed (line 199–203) are public methods with zero production callers. The FAILED terminal transition is now handled atomically inside markFailureAttempt's JPQL CASE WHEN o.retries + 1 >= :maxRetries THEN FAILED clause. Neither OutboxPollerWorker nor OutboxPublisher ever invokes completion.markFailed(…) — this is explicitly confirmed by four verify(completion, never()).markFailed(…) assertions in OutboxPollerWorkerTest (lines 160, 183, 259, 329).

Leaving these methods on the public surface is a footgun: a future contributor who encounters markFailed might use it to force-terminate a row directly, silently bypassing the atomic retry counter increment in markFailureAttempt. That produces a FAILED row whose retries column was never bumped — inconsistent state that undermines the operator runbook and any alerting that keys on retries.

Recommended fix: remove OutboxPublishCompletion.markFailed (lines 129–140), OutboxRecordRepository.markFailed (lines 192–203), and the corresponding OutboxPublishCompletionTest cases. If an admin escape-hatch for forced terminal transitions is needed in the future, it should be a distinct, loudly-documented admin-only API that accepts an explicit retries-bump argument so accounting stays consistent.

Verdict

CONDITIONAL_APPROVE


hib-pr-reviewer • round 6 • 1 finding (1m) • 2026-05-27T01:20:14.613Z → 2026-05-27T01:21:39.664Z • posted-as: pr-reviewer-bot

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 7._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 6** — head `44a18b19cb22`, base `main`, trigger `synchronize` **TL;DR:** CONDITIONAL_APPROVE — kept 1 agreed minor finding (dead `markFailed` public API); all prior-round findings confirmed resolved. ### Summary ## Arbitration — Round 6 **Memora recall**: no prior run history found (both queries returned 0 results). Memora persist attempted but failed due to tag permission errors; run summary could not be stored. **Verification performed**: - `Read` `OutboxPublishCompletion.java` lines 120–141 → confirmed `markFailed` exists at line 129 exactly as cited by both A and B. - `Read` `OutboxRecordRepository.java` lines 185–204 → confirmed `markFailed` exists at lines 192–203 (B's cite range) / line 199 (method declaration). - `Grep` for `markFailed` across all `.java` files → confirmed zero production callers; all production-path hits are the two method definitions themselves plus a Javadoc reference in `OutboxProperties.java:110`; test-side hits are exclusively in `OutboxPublishCompletionTest` (unit-tests of the method itself) and four `verify(completion, never()).markFailed(...)` assertions in `OutboxPollerWorkerTest` at lines 160, 183, 259, 329 — exactly as cited by both reviewers. **Reconciliation outcome**: 1 agreed finding kept (Rule 1 — both A and B cite the same file, same line, materially-equivalent body). No unique-to-one findings to adjudicate. Severity preserved as `minor`. ### Blast Radius The diff is self-contained within the `outbox-publisher` submodule. The dead `markFailed` methods are on an internal completion helper and a Spring Data repository interface — both unexported outside the module's own package. Impact is limited to future contributors who might mis-use the orphaned surface before it is removed. **BLAST_SCORE: 3/10** ### Risk Indicators | Indicator | Value | |---|---| | Sensitive functions | `OutboxPublishCompletion.markFailureAttempt`, `OutboxPollerWorker.pollAndPublishPending`, `OutboxPublisher.tryHotPublish` | | Migration touched | — | | Test delta | — | | Dependency changes | — | ### CI status (head `44a18b19cb22`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/40/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/40/jobs/1) | ### Findings (1) #### **[MINOR]** `markFailed` is orphaned dead public API — bypasses atomic retry-budget enforcement _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublishCompletion.java:129_ `OutboxPublishCompletion.markFailed` (line 129–140) and `OutboxRecordRepository.markFailed` (line 199–203) are public methods with zero production callers. The FAILED terminal transition is now handled atomically inside `markFailureAttempt`'s JPQL `CASE WHEN o.retries + 1 >= :maxRetries THEN FAILED` clause. Neither `OutboxPollerWorker` nor `OutboxPublisher` ever invokes `completion.markFailed(…)` — this is explicitly confirmed by four `verify(completion, never()).markFailed(…)` assertions in `OutboxPollerWorkerTest` (lines 160, 183, 259, 329). Leaving these methods on the public surface is a footgun: a future contributor who encounters `markFailed` might use it to force-terminate a row directly, silently bypassing the atomic retry counter increment in `markFailureAttempt`. That produces a FAILED row whose `retries` column was never bumped — inconsistent state that undermines the operator runbook and any alerting that keys on `retries`. **Recommended fix**: remove `OutboxPublishCompletion.markFailed` (lines 129–140), `OutboxRecordRepository.markFailed` (lines 192–203), and the corresponding `OutboxPublishCompletionTest` cases. If an admin escape-hatch for forced terminal transitions is needed in the future, it should be a distinct, loudly-documented admin-only API that accepts an explicit `retries`-bump argument so accounting stays consistent. ### Verdict **CONDITIONAL_APPROVE** --- <sub>hib-pr-reviewer • round 6 • 1 finding (1m) • 2026-05-27T01:20:14.613Z → 2026-05-27T01:21:39.664Z • posted-as: pr-reviewer-bot</sub> </details>
fix(outbox): R7 — remove dead markFailed methods (1 R6 MINOR)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 2s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
bd90d41ed1
R6 verdict findings (kept=1):

(1) [MINOR] OutboxPublishCompletion.java:129 + OutboxRecordRepository.java:199 — `markFailed` is dead public API after R6's atomic markFailureAttempt refactor. The R6 JPQL UPDATE evaluates the retry budget atomically via CASE-WHEN and transitions to FAILED in the same statement; no production code path calls markFailed anymore. The Spring Data repository method + the @Transactional REQUIRES_NEW helper on the completion class were both unreachable orphaned surface.

Fix: removed `OutboxPublishCompletion.markFailed(...)` + `OutboxRecordRepository.markFailed(...)` entirely. Operator-driven FAILED→PENDING reset (the runbook recovery path) uses raw SQL, NOT this API — and that runbook is documented in the JPQL operator-runbook comment that survives in `markFailureAttempt`. Removed the 2 unit tests in OutboxPublishCompletionTest that exercised the removed method, plus the 4 `verify(completion, never()).markFailed(...)` assertions in OutboxPollerWorkerTest that were asserting the negative on a method that no longer exists. Updated OutboxProperties.java's stale Javadoc reference (`{@link OutboxRecordRepository#markFailed}` → markFailureAttempt's atomic CASE-WHEN).

Verification:
- mvn -B -pl outbox-publisher test → 65 tests passed (down from 67: 2 removed markFailed-specific tests + removed never-verify clutter)
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]

Ripple-effect inspection (rule 63):
- Searched for all production callers via `grep -rn "markFailed"`: zero hits outside the now-deleted method definitions + the removed test stubs. The R6 refactor genuinely orphaned the method; removing it tightens the public API surface as the reviewer flagged.
- The operator runbook comment in markFailureAttempt's Javadoc preserves the recovery path documentation; consumer services do not lose any actual capability.

Superseded by round 8.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 7 — head bd90d41ed142, base main, trigger synchronize

TL;DR: NO_NEW_FINDINGS — No new findings this round.

Summary

[quorum-converged] A=0 = B=0. ## Round 7 review

All 19 prior-round findings are resolved. The diff addresses: async CB accounting (R1), FAILED-status transition via JPQL CASE-WHEN (R1), bounded .get(timeout,…) (R1), ExecutionException unwrapping (R1), broad Exception catch in persistFailureAttempt (R1), span closed after async ack (R1), pageable-only repo method (R1), status = PENDING guards on markSent/markFailureAttempt (R2), InterruptedException breaks the batch loop (R2), JSR-303 @Validated on OutboxProperties (R2), OTel context propagated into the async callback (R2), OutboxAutoConfigurationPackagesRegistrar replacing @EntityScan (R3), ObjectProvider<MeterRegistry> no-op fallback (R3), isTerminal() now returns true for FAILED (R3), explicit CircuitBreakerConfig per properties (R4), markSent hard-codes SENT (R4), lockAtMost/lockAtLeast documented + pattern-validated (R5), atomic budget evaluation inside JPQL (R5), markFailed dead API removed (R6).

One new blocking finding: the @EnableJpaRepositories annotation left on OutboxAutoConfiguration silently backs off JpaRepositoriesAutoConfiguration for all consumer services.

CI status (head bd90d41ed142)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details

Findings

No new findings this round.

Quorum converged on empty findings (A + B both returned 0).

Verdict

NO_NEW_FINDINGS


hib-pr-reviewer • round 7 • 0 findings • 2026-05-27T01:24:25.970Z → 2026-05-27T01:31:15.793Z • posted-as: pr-reviewer-bot • [bookkeeping fallback]

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 8._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 7** — head `bd90d41ed142`, base `main`, trigger `synchronize` **TL;DR:** NO_NEW_FINDINGS — No new findings this round. ### Summary [quorum-converged] A=0 = B=0. ## Round 7 review **All 19 prior-round findings are resolved.** The diff addresses: async CB accounting (R1), FAILED-status transition via JPQL CASE-WHEN (R1), bounded `.get(timeout,…)` (R1), `ExecutionException` unwrapping (R1), broad `Exception` catch in `persistFailureAttempt` (R1), span closed after async ack (R1), pageable-only repo method (R1), `status = PENDING` guards on `markSent`/`markFailureAttempt` (R2), `InterruptedException` breaks the batch loop (R2), JSR-303 `@Validated` on `OutboxProperties` (R2), OTel context propagated into the async callback (R2), `OutboxAutoConfigurationPackagesRegistrar` replacing `@EntityScan` (R3), `ObjectProvider<MeterRegistry>` no-op fallback (R3), `isTerminal()` now returns `true` for `FAILED` (R3), explicit `CircuitBreakerConfig` per properties (R4), `markSent` hard-codes `SENT` (R4), `lockAtMost`/`lockAtLeast` documented + pattern-validated (R5), atomic budget evaluation inside JPQL (R5), `markFailed` dead API removed (R6). One new **blocking** finding: the `@EnableJpaRepositories` annotation left on `OutboxAutoConfiguration` silently backs off `JpaRepositoriesAutoConfiguration` for all consumer services. ### CI status (head `bd90d41ed142`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/41/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/41/jobs/1) | ### Findings **No new findings this round.** _Quorum converged on empty findings (A + B both returned 0)._ ### Verdict **NO_NEW_FINDINGS** --- <sub>hib-pr-reviewer • round 7 • 0 findings • 2026-05-27T01:24:25.970Z → 2026-05-27T01:31:15.793Z • posted-as: pr-reviewer-bot • [bookkeeping fallback]</sub> </details>
fix(outbox): R8 — remove @EnableJpaRepositories that disabled consumer repo scan
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 2s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
8978c78013
R7 verdict contradicted itself: TL;DR said NO_NEW_FINDINGS but the prose surfaced one new "blocking" observation. The observation is real and fleet-fatal — same shape as the R3 @EntityScan clobber, but for Spring Data JPA repositories.

The diagnosis:
- OutboxAutoConfiguration carried `@EnableJpaRepositories(basePackageClasses = OutboxRecordRepository.class)`.
- That annotation imports `JpaRepositoriesRegistrar`, which registers a `JpaRepositoryFactoryBean` + `JpaRepositoryConfigExtension` scoped to ONLY the library's package.
- Spring Boot's `JpaRepositoriesAutoConfiguration` is guarded by `@ConditionalOnMissingBean({ JpaRepositoryFactoryBean.class, JpaRepositoryConfigExtension.class })` — the library's registration trips the guard and silently disables the consumer's autoconfig.
- Result: every consumer service's own Spring Data repositories (in non-library packages) go unscanned at startup. Fleet-fatal silent breakage, same class as R3.

Fix: removed `@EnableJpaRepositories` from `OutboxAutoConfiguration`. The existing `OutboxAutoConfigurationPackagesRegistrar` (R3 fix) registers `com.aim2be.platform.outbox` with `AutoConfigurationPackages` ADDITIVELY. Spring Boot's `JpaRepositoriesAutoConfiguration` reads `AutoConfigurationPackages.get(...)` when it scans for `@Repository` beans, so `OutboxRecordRepository` is picked up alongside the consumer's own repos — without the library having to declare `@EnableJpaRepositories`, without disabling the consumer's autoconfig.

Updated the registrar's Javadoc to document both concerns (entity scan + repository scan) — they share the same underlying mechanism (`AutoConfigurationPackages.get(...)`), and library authors should NEVER apply `@EntityScan` OR `@EnableJpaRepositories` on autoconfig classes.

Verification:
- mvn -B -pl outbox-publisher test → 65 tests passed
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]

Ripple-effect inspection (rule 63):
- OutboxRecordRepository.java is unchanged; it remains a Spring Data interface annotated effectively via auto-config scan.
- All existing tests + ITs continue to pass because the existing OutboxAutoConfigurationPackagesRegistrarTest already proves the additive package-registration shape works for the consumer-scan path.

Superseded by round 9.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 8 — head 8978c78013e6, base main, trigger synchronize

TL;DR: CONDITIONAL_APPROVE — kept 2 findings (1 agreed minor @EnableSchedulerLock default-leak, 1 unique-to-A info test coupling); all 19 prior-round findings confirmed resolved.

Summary

Round 8 Arbitration — 2 kept (1 agreed, 1 unique-to-A verified)

Prior findings: All 19 findings from rounds 1–6 confirmed resolved by both reviewers; no re-raises required.

Agreed finding (A+B): @EnableSchedulerLock(defaultLockAtMostFor = "PT1M") at OutboxAutoConfiguration.java:79 confirmed present — both reviewers independently raised it as minor. Kept.

Unique-to-A finding: ITConfig's @SpringBootApplication at OutboxPublisherIT.java:211 — verified: ITConfig is indeed annotated @SpringBootApplication (static inner class in package com.aim2be.platform.outbox), and FakeConsumerApplication is confirmed in com.aim2be.platform.outbox.fakeconsumer annotated @Configuration, so it will be component-scanned by ITConfig. Currently harmless (no entities/repos in fakeconsumer). Kept as info.

Memora: No prior run history found (Memora empty for this PR despite round 8); created new entry (id=297) with review_count=8 via fallback path.

Blast Radius

The @EnableSchedulerLock side-effect touches every consuming service that loads the auto-configuration without declaring their own @EnableSchedulerLock. This is an exported auto-configuration surface; the impact is application-wide ShedLock default behavior in every downstream service. The IT finding is test-scope only with zero production blast.

BLAST_SCORE: 6/10

Risk Indicators

Indicator Value
Sensitive functions OutboxAutoConfiguration.outboxCircuitBreaker, OutboxPollerWorker.pollAndPublishPending, OutboxPublisher.tryHotPublish
Migration touched
Test delta
Dependency changes

CI status (head 8978c78013e6)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details
  • PR-PLATFORM-2
  • PR-PLATFORM-3

Findings (2)

[MINOR] @EnableSchedulerLock(defaultLockAtMostFor = "PT1M") on the auto-configuration silently becomes the application-wide ShedLock default for consumers

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:79

The class-level @EnableSchedulerLock(defaultLockAtMostFor = "PT1M") on OutboxAutoConfiguration registers PT1M as the fallback lockAtMostFor for the entire application context. The library's own pollAndPublishPending() is unaffected because it always specifies an explicit lockAtMostFor = "${im2be.outbox.poller.lock-at-most:PT1M}". However, any consuming-service @SchedulerLock annotation that omits lockAtMostFor will silently inherit PT1M.

If the consumer also declares @EnableSchedulerLock on their @SpringBootApplication, Spring deduplicates the import and the consumer's attributes win. But a common first-time adoption pattern is to rely solely on the library's auto-configuration — in that case PT1M becomes an unconfigurable application-wide default that shortens whatever lock window the consumer intended.

The README's configuration-reference table documents im2be.outbox.poller.lock-at-most but does not disclose that loading the auto-configuration activates @EnableSchedulerLock globally.

Recommended fixes (in order of robustness):

  1. (Structural) Remove @EnableSchedulerLock from OutboxAutoConfiguration and instead register ShedLock's infrastructure beans programmatically with @ConditionalOnMissingBean, matching the pattern already used for LockProvider. This prevents any class-level default from leaking.
  2. (Minimal) Add a prominent note in the README under "How to enable": "The library activates @EnableSchedulerLock(defaultLockAtMostFor = "PT1M"); if your service has additional @SchedulerLock-annotated methods, always specify lockAtMostFor explicitly on those annotations, or declare your own @EnableSchedulerLock on @SpringBootApplication to control the application-wide default."

[INFO] ITConfig's @SpringBootApplication component-scans FakeConsumerApplication from the fakeconsumer sub-package

outbox-publisher/src/test/java/com/aim2be/platform/outbox/OutboxPublisherIT.java:211

OutboxPublisherIT.ITConfig is a @SpringBootApplication-annotated static class whose effective package is com.aim2be.platform.outbox. Spring Boot's default component scan covers all sub-packages — including com.aim2be.platform.outbox.fakeconsumer. FakeConsumerApplication there is @Configuration-annotated, so it is picked up by the IT context, triggering its @AutoConfigurationPackage (which registers com.aim2be.platform.outbox.fakeconsumer with AutoConfigurationPackages) and its @Import(OutboxAutoConfigurationPackagesRegistrar.class) (which re-registers com.aim2be.platform.outbox — idempotent).

Currently harmless because fakeconsumer contains no entities or repositories, but it creates unintended coupling: FakeConsumerApplication was authored exclusively for OutboxAutoConfigurationPackagesRegistrarTest and its presence in the IT context is a side-effect of package layout, not design.

Concrete fix (either):

  • (a) Move FakeConsumerApplication to a package that is not a sub-package of the IT scan root (e.g. com.aim2be.platform.outbox.test.fakeconsumer) and add an explicit exclusion filter in ITConfig.
  • (b) Restrict ITConfig's scan: @SpringBootApplication(scanBasePackages = "com.aim2be.platform.outbox") plus an explicit excludeFilters that excludes FakeConsumerApplication.

Appropriate for a follow-on test-hygiene PR; not a blocker.

Verdict

CONDITIONAL_APPROVE


hib-pr-reviewer • round 8 • 2 findings (1m/1i) • 2026-05-27T01:42:21.369Z → 2026-05-27T01:43:56.417Z • posted-as: pr-reviewer-bot

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 9._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 8** — head `8978c78013e6`, base `main`, trigger `synchronize` **TL;DR:** CONDITIONAL_APPROVE — kept 2 findings (1 agreed minor @EnableSchedulerLock default-leak, 1 unique-to-A info test coupling); all 19 prior-round findings confirmed resolved. ### Summary ## Round 8 Arbitration — 2 kept (1 agreed, 1 unique-to-A verified) **Prior findings:** All 19 findings from rounds 1–6 confirmed resolved by both reviewers; no re-raises required. **Agreed finding (A+B):** `@EnableSchedulerLock(defaultLockAtMostFor = "PT1M")` at `OutboxAutoConfiguration.java:79` confirmed present — both reviewers independently raised it as `minor`. Kept. **Unique-to-A finding:** `ITConfig`'s `@SpringBootApplication` at `OutboxPublisherIT.java:211` — verified: `ITConfig` is indeed annotated `@SpringBootApplication` (static inner class in package `com.aim2be.platform.outbox`), and `FakeConsumerApplication` is confirmed in `com.aim2be.platform.outbox.fakeconsumer` annotated `@Configuration`, so it will be component-scanned by `ITConfig`. Currently harmless (no entities/repos in `fakeconsumer`). Kept as `info`. **Memora:** No prior run history found (Memora empty for this PR despite round 8); created new entry (id=297) with `review_count=8` via fallback path. ### Blast Radius The `@EnableSchedulerLock` side-effect touches every consuming service that loads the auto-configuration without declaring their own `@EnableSchedulerLock`. This is an exported auto-configuration surface; the impact is application-wide ShedLock default behavior in every downstream service. The IT finding is test-scope only with zero production blast. **BLAST_SCORE: 6/10** ### Risk Indicators | Indicator | Value | |---|---| | Sensitive functions | `OutboxAutoConfiguration.outboxCircuitBreaker`, `OutboxPollerWorker.pollAndPublishPending`, `OutboxPublisher.tryHotPublish` | | Migration touched | — | | Test delta | — | | Dependency changes | — | ### CI status (head `8978c78013e6`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/42/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/42/jobs/1) | ### Related PRs - PR-PLATFORM-2 - PR-PLATFORM-3 ### Findings (2) #### **[MINOR]** `@EnableSchedulerLock(defaultLockAtMostFor = "PT1M")` on the auto-configuration silently becomes the application-wide ShedLock default for consumers _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:79_ The class-level `@EnableSchedulerLock(defaultLockAtMostFor = "PT1M")` on `OutboxAutoConfiguration` registers `PT1M` as the fallback `lockAtMostFor` for the entire application context. The library's own `pollAndPublishPending()` is unaffected because it always specifies an explicit `lockAtMostFor = "${im2be.outbox.poller.lock-at-most:PT1M}"`. However, any consuming-service `@SchedulerLock` annotation that omits `lockAtMostFor` will silently inherit `PT1M`. If the consumer also declares `@EnableSchedulerLock` on their `@SpringBootApplication`, Spring deduplicates the import and the consumer's attributes win. But a common first-time adoption pattern is to rely solely on the library's auto-configuration — in that case `PT1M` becomes an unconfigurable application-wide default that shortens whatever lock window the consumer intended. The README's configuration-reference table documents `im2be.outbox.poller.lock-at-most` but does not disclose that loading the auto-configuration activates `@EnableSchedulerLock` globally. **Recommended fixes (in order of robustness):** 1. *(Structural)* Remove `@EnableSchedulerLock` from `OutboxAutoConfiguration` and instead register ShedLock's infrastructure beans programmatically with `@ConditionalOnMissingBean`, matching the pattern already used for `LockProvider`. This prevents any class-level default from leaking. 2. *(Minimal)* Add a prominent note in the README under "How to enable": *"The library activates `@EnableSchedulerLock(defaultLockAtMostFor = "PT1M")`; if your service has additional `@SchedulerLock`-annotated methods, always specify `lockAtMostFor` explicitly on those annotations, or declare your own `@EnableSchedulerLock` on `@SpringBootApplication` to control the application-wide default."* #### **[INFO]** `ITConfig`'s `@SpringBootApplication` component-scans `FakeConsumerApplication` from the `fakeconsumer` sub-package _outbox-publisher/src/test/java/com/aim2be/platform/outbox/OutboxPublisherIT.java:211_ `OutboxPublisherIT.ITConfig` is a `@SpringBootApplication`-annotated static class whose effective package is `com.aim2be.platform.outbox`. Spring Boot's default component scan covers all sub-packages — including `com.aim2be.platform.outbox.fakeconsumer`. `FakeConsumerApplication` there is `@Configuration`-annotated, so it is picked up by the IT context, triggering its `@AutoConfigurationPackage` (which registers `com.aim2be.platform.outbox.fakeconsumer` with `AutoConfigurationPackages`) and its `@Import(OutboxAutoConfigurationPackagesRegistrar.class)` (which re-registers `com.aim2be.platform.outbox` — idempotent). Currently harmless because `fakeconsumer` contains no entities or repositories, but it creates unintended coupling: `FakeConsumerApplication` was authored exclusively for `OutboxAutoConfigurationPackagesRegistrarTest` and its presence in the IT context is a side-effect of package layout, not design. **Concrete fix (either):** - (a) Move `FakeConsumerApplication` to a package that is not a sub-package of the IT scan root (e.g. `com.aim2be.platform.outbox.test.fakeconsumer`) and add an explicit exclusion filter in `ITConfig`. - (b) Restrict `ITConfig`'s scan: `@SpringBootApplication(scanBasePackages = "com.aim2be.platform.outbox")` plus an explicit `excludeFilters` that excludes `FakeConsumerApplication`. Appropriate for a follow-on test-hygiene PR; not a blocker. ### Verdict **CONDITIONAL_APPROVE** --- <sub>hib-pr-reviewer • round 8 • 2 findings (1m/1i) • 2026-05-27T01:42:21.369Z → 2026-05-27T01:43:56.417Z • posted-as: pr-reviewer-bot</sub> </details>
fix(outbox): R9 — 2 R8 findings (ShedLock default override + fakeconsumer scan-leak)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 2s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
8383bd152d
R8 verdict findings (kept=2):

(1) [MINOR, agreed A+B] OutboxAutoConfiguration.java:79 — `@EnableSchedulerLock(defaultLockAtMostFor = "PT1M")` hard-coded the PT1M context-wide default. Every consumer that loads the autoconfig without declaring their own `@EnableSchedulerLock` would have ShedLock apply PT1M to their `@SchedulerLock`-annotated methods that omit `lockAtMostFor`. The library was leaking its opinion into downstream services.

Fix: ShedLock 6.x requires `defaultLockAtMostFor` (the annotation declares it without a Java default value), so we cannot simply drop it. Switched to Spring's annotation-value property-placeholder: `defaultLockAtMostFor = "${im2be.outbox.scheduler-lock.default-at-most:PT1M}"`. The library still has a sensible fallback (PT1M) but consumers can override `im2be.outbox.scheduler-lock.default-at-most` to change it. The library's own `@SchedulerLock` on `pollAndPublishPending()` sets `lockAtMostFor` + `lockAtLeastFor` explicitly via separate property placeholders, so the default never affects the poller itself.

Documented the property + ShedLock interaction in OutboxAutoConfiguration's class Javadoc and the README config table.

(2) [INFO, unique-to-A] OutboxPublisherIT.java:211 — `ITConfig`'s `@SpringBootApplication` component-scans `com.aim2be.platform.outbox`, which includes the `fakeconsumer` subpackage where `FakeConsumerApplication` lives. Currently harmless (the IT context has no entities/repos that depend on the fakeconsumer's wiring), but a future-leak.

Fix: gated `FakeConsumerApplication` with `@ConditionalOnProperty(prefix = "im2be.outbox.test.fakeconsumer", name = "enabled", havingValue = "true")`. The IT's broad component scan still TOUCHES the class but the conditional keeps it out of the bean factory unless explicitly enabled. The OutboxAutoConfigurationPackagesRegistrarTest sets `im2be.outbox.test.fakeconsumer.enabled=true` on its ApplicationContextRunner so the additivity test still works.

Verification:
- mvn -B -pl outbox-publisher test → 65 tests passed
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]

Ripple-effect inspection (rule 63):
- The placeholder-driven `defaultLockAtMostFor` value is resolved by Spring's annotation post-processor at @Configuration parsing time; the only consumer-visible behavior change is that the property is now respected.
- `@ConditionalOnProperty` on `FakeConsumerApplication` is a no-op when the property is absent (default) — Spring sees the class metadata, evaluates the condition false, skips bean registration. The IT context is unaffected. The registrar test explicitly enables the property.

Superseded by round 10.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 9 — head 8383bd152d07, base main, trigger synchronize

TL;DR: CONDITIONAL_APPROVE — kept 2 findings (1 agreed minor, 1 unique-to-A info verified); 21 prior-round findings remain resolved; no blocking issues.

Summary

Arbitration — Round 9

Memora recall: No prior run history found (tags allowlist prevented write; both create and absorb paths blocked by permission/tag constraints).

Finding reconciliation:

Finding 1 — AGREED (A+B), kept: Both reviewers independently flag that im2be.outbox.scheduler-lock.default-at-most is consumed via annotation placeholder at OutboxAutoConfiguration.java:94 but has no corresponding field in OutboxProperties. Verified with Read + Grep: OutboxProperties.java contains zero matches for schedulerLock/scheduler-lock/SchedulerLock. Line 94–95 confirms @EnableSchedulerLock(defaultLockAtMostFor = "${im2be.outbox.scheduler-lock.default-at-most:PT1M}") with no JSR-303 backing. The parallel poller.lock-at-most and poller.lock-at-least each have @Pattern-validated fields in OutboxProperties.Poller — the gap is real. Kept as minor.

Finding 2 — UNIQUE-TO-A, verified, kept: A flags that markSent (line 99) and markFailureAttempt (line 150) JPQL bulk UPDATEs never increment o.recordVersion. Verified: the SET clause of markSent (lines 101–106) sets status, sentAt, and lastError only; markFailureAttempt (lines 152–161) sets retries, lastError, and status only — neither includes o.recordVersion = o.recordVersion + 1. @Version private long recordVersion is confirmed present at OutboxRecord.java:136–138. Hibernate does not auto-increment @Version columns in JPQL bulk UPDATE paths. No active runtime defect; the AND o.status = PENDING guard provides safe idempotency for the library's own write paths. Kept as info (deferred).

Result: 1 agreed finding kept, 1 unique-to-A verified and kept, 0 dropped.

Blast Radius

The diff touches a single submodule (outbox-publisher) but that module ships a Spring Boot autoconfiguration that is imported by all 7 aim2be JPA services. Any misconfiguration of the scheduler-lock property surface propagates silently to every consuming service at startup. The JPQL version-column gap is scoped to the outbox_records table writes only.

BLAST_SCORE: 4/10

Risk Indicators

Indicator Value
Sensitive functions markSent, markFailureAttempt, tryHotPublish, persistFailureAttempt, OutboxPublishCompletion.handleAsyncAck
Migration touched
Test delta
Dependency changes

CI status (head 8383bd152d07)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details

Findings (2)

[MINOR] im2be.outbox.scheduler-lock.default-at-most has no OutboxProperties binding — no JSR-303 validation and no IDE autocomplete

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:94

Line 94–95 reads the property via annotation placeholder:

@EnableSchedulerLock(defaultLockAtMostFor =
        "${im2be.outbox.scheduler-lock.default-at-most:PT1M}")

OutboxProperties (bound via @ConfigurationProperties(prefix = "im2be.outbox")) has no field for scheduler-lock.default-at-most — confirmed by grep: zero matches for any schedulerLock/SchedulerLock/scheduler-lock symbol in OutboxProperties.java. By contrast, the parallel im2be.outbox.poller.lock-at-most and im2be.outbox.poller.lock-at-least each have a @Pattern(regexp = "^PT[0-9]+...")-validated field in OutboxProperties.Poller, so an invalid value there produces a clean BindValidationException at context startup. An invalid scheduler-lock.default-at-most instead silently passes binding and fails late inside ShedLock's Duration.parse() with an opaque DateTimeParseException. The property is also absent from OutboxPropertiesValidationTest and will not appear in spring-configuration-metadata.json for IDE autocomplete.

Fix: add a SchedulerLock nested class (or top-level field) to OutboxProperties with the same ISO-8601 @Pattern as lockAtMost, default "PT1M", and add a schedulerLockDefaultAtMostMalformedFailsBinding test alongside the existing lockAtMostMalformedFailsBinding test.

[INFO] JPQL bulk UPDATEs in markSent/markFailureAttempt do not increment record_version@Version optimistic lock non-functional for all post-INSERT status transitions

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxRecordRepository.java:99

Both markSent (lines 99–111) and markFailureAttempt (lines 150–167) are Hibernate JPQL bulk UPDATE statements. Hibernate executes bulk JPQL as a native SQL UPDATE and does not automatically increment @Version-annotated columns in that path. The SET clause of markSent covers status, sentAt, and lastError only; markFailureAttempt covers retries, lastError, and status only — neither includes o.recordVersion = o.recordVersion + 1. OutboxRecord.recordVersion is annotated @Version (line 136) and required by ADR-0001 §1.4 EntityVersionParity, but it stays at its INSERT-time value (0) for the row's entire lifecycle after the first save().

The library's own concurrent-write safety is intact — the AND o.status = PENDING predicate provides idempotency for all current write paths. The risk is latent: a consuming service (or future contributor) that loads an OutboxRecord, mutates a field, and calls repository.save() will find record_version = 0 in both the entity snapshot and the DB, so the optimistic-lock check passes and the entity-level save silently overwrites whatever status the JPQL paths had written.

Deferred — no active runtime defect. Remediation options: (1) add o.recordVersion = o.recordVersion + 1 to both JPQL SET clauses; (2) make OutboxRecord setters package-private to block consuming-service entity-level status writes; (3) document the constraint in @Modifying Javadoc on both methods.

Verdict

CONDITIONAL_APPROVE


hib-pr-reviewer • round 9 • 2 findings (1m/1i) • 2026-05-27T01:52:59.993Z → 2026-05-27T01:55:02.972Z • posted-as: pr-reviewer-bot

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 10._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 9** — head `8383bd152d07`, base `main`, trigger `synchronize` **TL;DR:** CONDITIONAL_APPROVE — kept 2 findings (1 agreed minor, 1 unique-to-A info verified); 21 prior-round findings remain resolved; no blocking issues. ### Summary ## Arbitration — Round 9 **Memora recall:** No prior run history found (tags allowlist prevented write; both create and absorb paths blocked by permission/tag constraints). **Finding reconciliation:** **Finding 1 — AGREED (A+B), kept:** Both reviewers independently flag that `im2be.outbox.scheduler-lock.default-at-most` is consumed via annotation placeholder at `OutboxAutoConfiguration.java:94` but has no corresponding field in `OutboxProperties`. Verified with `Read` + `Grep`: `OutboxProperties.java` contains zero matches for `schedulerLock`/`scheduler-lock`/`SchedulerLock`. Line 94–95 confirms `@EnableSchedulerLock(defaultLockAtMostFor = "${im2be.outbox.scheduler-lock.default-at-most:PT1M}")` with no JSR-303 backing. The parallel `poller.lock-at-most` and `poller.lock-at-least` each have `@Pattern`-validated fields in `OutboxProperties.Poller` — the gap is real. Kept as **minor**. **Finding 2 — UNIQUE-TO-A, verified, kept:** A flags that `markSent` (line 99) and `markFailureAttempt` (line 150) JPQL bulk UPDATEs never increment `o.recordVersion`. Verified: the SET clause of `markSent` (lines 101–106) sets `status`, `sentAt`, and `lastError` only; `markFailureAttempt` (lines 152–161) sets `retries`, `lastError`, and `status` only — neither includes `o.recordVersion = o.recordVersion + 1`. `@Version private long recordVersion` is confirmed present at `OutboxRecord.java:136–138`. Hibernate does not auto-increment `@Version` columns in JPQL bulk UPDATE paths. No active runtime defect; the `AND o.status = PENDING` guard provides safe idempotency for the library's own write paths. Kept as **info** (deferred). **Result:** 1 agreed finding kept, 1 unique-to-A verified and kept, 0 dropped. ### Blast Radius The diff touches a single submodule (`outbox-publisher`) but that module ships a Spring Boot autoconfiguration that is imported by all 7 aim2be JPA services. Any misconfiguration of the scheduler-lock property surface propagates silently to every consuming service at startup. The JPQL version-column gap is scoped to the `outbox_records` table writes only. **BLAST_SCORE: 4/10** ### Risk Indicators | Indicator | Value | |---|---| | Sensitive functions | `markSent`, `markFailureAttempt`, `tryHotPublish`, `persistFailureAttempt`, `OutboxPublishCompletion.handleAsyncAck` | | Migration touched | — | | Test delta | — | | Dependency changes | — | ### CI status (head `8383bd152d07`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/43/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/43/jobs/1) | ### Findings (2) #### **[MINOR]** `im2be.outbox.scheduler-lock.default-at-most` has no `OutboxProperties` binding — no JSR-303 validation and no IDE autocomplete _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxAutoConfiguration.java:94_ Line 94–95 reads the property via annotation placeholder: ```java @EnableSchedulerLock(defaultLockAtMostFor = "${im2be.outbox.scheduler-lock.default-at-most:PT1M}") ``` `OutboxProperties` (bound via `@ConfigurationProperties(prefix = "im2be.outbox")`) has **no** field for `scheduler-lock.default-at-most` — confirmed by grep: zero matches for any `schedulerLock`/`SchedulerLock`/`scheduler-lock` symbol in `OutboxProperties.java`. By contrast, the parallel `im2be.outbox.poller.lock-at-most` and `im2be.outbox.poller.lock-at-least` each have a `@Pattern(regexp = "^PT[0-9]+...")`-validated field in `OutboxProperties.Poller`, so an invalid value there produces a clean `BindValidationException` at context startup. An invalid `scheduler-lock.default-at-most` instead silently passes binding and fails late inside ShedLock's `Duration.parse()` with an opaque `DateTimeParseException`. The property is also absent from `OutboxPropertiesValidationTest` and will not appear in `spring-configuration-metadata.json` for IDE autocomplete. **Fix:** add a `SchedulerLock` nested class (or top-level field) to `OutboxProperties` with the same ISO-8601 `@Pattern` as `lockAtMost`, default `"PT1M"`, and add a `schedulerLockDefaultAtMostMalformedFailsBinding` test alongside the existing `lockAtMostMalformedFailsBinding` test. #### **[INFO]** JPQL bulk UPDATEs in `markSent`/`markFailureAttempt` do not increment `record_version` — `@Version` optimistic lock non-functional for all post-INSERT status transitions _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxRecordRepository.java:99_ Both `markSent` (lines 99–111) and `markFailureAttempt` (lines 150–167) are Hibernate JPQL bulk UPDATE statements. Hibernate executes bulk JPQL as a native SQL `UPDATE` and does **not** automatically increment `@Version`-annotated columns in that path. The SET clause of `markSent` covers `status`, `sentAt`, and `lastError` only; `markFailureAttempt` covers `retries`, `lastError`, and `status` only — neither includes `o.recordVersion = o.recordVersion + 1`. `OutboxRecord.recordVersion` is annotated `@Version` (line 136) and required by ADR-0001 §1.4 `EntityVersionParity`, but it stays at its INSERT-time value (0) for the row's entire lifecycle after the first `save()`. The library's own concurrent-write safety is intact — the `AND o.status = PENDING` predicate provides idempotency for all current write paths. The risk is latent: a consuming service (or future contributor) that loads an `OutboxRecord`, mutates a field, and calls `repository.save()` will find `record_version = 0` in both the entity snapshot and the DB, so the optimistic-lock check passes and the entity-level save silently overwrites whatever status the JPQL paths had written. **Deferred — no active runtime defect.** Remediation options: (1) add `o.recordVersion = o.recordVersion + 1` to both JPQL SET clauses; (2) make `OutboxRecord` setters package-private to block consuming-service entity-level status writes; (3) document the constraint in `@Modifying` Javadoc on both methods. ### Verdict **CONDITIONAL_APPROVE** --- <sub>hib-pr-reviewer • round 9 • 2 findings (1m/1i) • 2026-05-27T01:52:59.993Z → 2026-05-27T01:55:02.972Z • posted-as: pr-reviewer-bot</sub> </details>
fix(outbox): R10 — 1 R9 MINOR + 1 R9 INFO (typed SchedulerLock prop + @Version JPQL doc)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 2s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
5027e5d49d
R9 verdict findings (kept=2):

(1) [MINOR, agreed A+B] OutboxAutoConfiguration:94 + OutboxProperties — the R9 fix that turned `defaultLockAtMostFor` into a property-placeholder (`${im2be.outbox.scheduler-lock.default-at-most:PT1M}`) closed the leak but left the property undocumented + unvalidated. The parallel `poller.lock-at-most` and `poller.lock-at-least` both have `@Pattern`-validated typed fields in OutboxProperties.Poller; this one didn't, breaking symmetry.

Fix: added nested `OutboxProperties.SchedulerLock` class with `String defaultAtMost` field (default "PT1M") + @Pattern ISO-8601 validation. Wired into the parent OutboxProperties with @Valid + a `getSchedulerLock()` getter. Updated the class-level property-reference Javadoc to include the new knob. Added 2 new validation tests (`schedulerLockDefaultAtMostMalformedFailsBinding` + `schedulerLockDefaultAtMostOverridePassValidation`).

(2) [INFO deferred, unique-to-A] OutboxRecordRepository:101-106 + 152-161 — JPQL bulk UPDATEs in `markSent` + `markFailureAttempt` don't increment `o.recordVersion` because Hibernate only auto-bumps @Version on entity-managed updates (load → modify → save), not on JPQL bulk UPDATE paths. Reviewer marks this no-active-defect because the `AND o.status = PENDING` status-guard provides equivalent concurrency safety for the PENDING → terminal transitions the library cares about.

Fix: added explicit Javadoc on the `@Version private long recordVersion` field in OutboxRecord explaining the JPQL bulk-UPDATE bypass + the status-guard concurrency-safety rationale. The @Version field is retained to satisfy ADR-0001 §1.4 (EntityVersionParity ArchUnit rule) and to provide optimistic locking for any future entity-managed write that consumer code might add — but the library's own bulk-UPDATE paths intentionally bypass it.

Verification:
- mvn -B -pl outbox-publisher test → 67 tests passed (2 new validation tests)
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]

Ripple-effect inspection (rule 63):
- The new SchedulerLock property class is purely additive — no callsite changes are required since the @EnableSchedulerLock annotation still reads the raw property placeholder. The typed field is documentation + binding-time validation only.
- @Version field documentation change is comment-only; no behavioral impact.

Superseded by round 11.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 10 — head 5027e5d49d10, base main, trigger synchronize

TL;DR: CONDITIONAL_APPROVE — kept 1 minor verified unique-to-A finding (markSent exception silently swallowed in whenComplete); all 23 prior-round findings confirmed closed by both reviewers.

Summary

Round 10 arbitration

Reconciliation: A=1 minor, B=0. Unique-to-A finding verified via Read at OutboxPublisher.java lines 306–362.

Verification of A's finding (line 320): The try block at line 316 covers both the success path (including completion.markSent() at line 320) and the failure path, with only a finally clause (lines 332–334) and no catch. If completion.markSent() throws, the exception bubbles through the finally (span.end() fires), exits handleAsyncAck, and is re-thrown inside the whenComplete lambda — where CompletableFuture silently discards it. This asymmetry with persistFailureAttempt (lines 350–362), which wraps the equivalent completion.markFailureAttempt() call in an explicit try { … } catch (Exception e) { LOG.error(…) }, is confirmed real. Finding kept as minor.

Memora persistence: create attempted; permission denied — best-effort record-keeping not available this run.

Kept 1 agreed/verified finding (unique-to-A, verified). Dropped 0. All 23 prior-round findings confirmed closed by both reviewers.

Blast Radius

The diff introduces a net-new Spring Boot autoconfiguration module (outbox-publisher) that exports shared JPA entities, a transactional publisher, and a ShedLock-coordinated poller to all 7 aim2be JPA services. The exported autoconfiguration surface and OutboxRecord entity schema affect every consuming service that enables im2be.outbox.enabled=true, but the module itself is additive and does not modify any existing shared contracts or migration files.

BLAST_SCORE: 5/10

Risk Indicators

Indicator Value
Sensitive functions OutboxPublisher.handleAsyncAck, OutboxPublisher.persistFailureAttempt, OutboxPublishCompletion.markSent, OutboxPublishCompletion.markFailureAttempt, OutboxPollerWorker.pollAndPublishPending, OutboxAutoConfiguration.outboxCircuitBreaker
Migration touched
Test delta
Dependency changes

CI status (head 5027e5d49d10)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details
  • PR-PLATFORM-2 (processed-kafka-events / consumer-side dedup)
  • PR-PLATFORM-3 (ApicurioCodec / Avro encoding)

Findings (1)

[MINOR] completion.markSent() exception inside whenComplete silently swallowed — no log, unlike persistFailureAttempt

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:320

Verified at lines 306–362. handleAsyncAck opens a try block at line 316 with only a finally clause (lines 332–334, span.end()); there is no catch. If completion.markSent() at line 320 throws (e.g. CannotCreateTransactionException, any DataAccessException, or another unchecked exception from the REQUIRES_NEW transaction), the exception propagates out of handleAsyncAck, is re-thrown inside the whenComplete lambda, and is silently discarded by CompletableFuture machinery — no log entry, no metric, no alert.

Contrast with persistFailureAttempt (lines 350–362), which wraps the equivalent completion.markFailureAttempt() call in an explicit try { … } catch (Exception e) { LOG.error(…) } block for exactly this reason.

The poller provides eventual correctness (the PENDING row is picked up and marked SENT on the next tick), but operators have zero visibility into silent hot-relay bookkeeping failures. Add a defensive wrapper analogous to persistFailureAttempt:

try {
    completion.markSent(aggregateId, eventId, Instant.now());
} catch (Exception markSentError) {
    LOG.error("outbox.hot.publish unable to persist markSent eventId={} aggregateId={}: {}",
            eventId, aggregateId, markSentError.getMessage());
}

This mirrors the failure path's existing error-swallowing contract while restoring operator observability.

Verdict

CONDITIONAL_APPROVE


hib-pr-reviewer • round 10 • 1 finding (1m) • 2026-05-27T02:03:00.895Z → 2026-05-27T02:04:32.287Z • posted-as: pr-reviewer-bot

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 11._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 10** — head `5027e5d49d10`, base `main`, trigger `synchronize` **TL;DR:** CONDITIONAL_APPROVE — kept 1 minor verified unique-to-A finding (markSent exception silently swallowed in whenComplete); all 23 prior-round findings confirmed closed by both reviewers. ### Summary ## Round 10 arbitration **Reconciliation:** A=1 minor, B=0. Unique-to-A finding verified via `Read` at `OutboxPublisher.java` lines 306–362. **Verification of A's finding (line 320):** The `try` block at line 316 covers both the success path (including `completion.markSent()` at line 320) and the failure path, with **only a `finally` clause** (lines 332–334) and no `catch`. If `completion.markSent()` throws, the exception bubbles through the `finally` (`span.end()` fires), exits `handleAsyncAck`, and is re-thrown inside the `whenComplete` lambda — where `CompletableFuture` silently discards it. This asymmetry with `persistFailureAttempt` (lines 350–362), which wraps the equivalent `completion.markFailureAttempt()` call in an explicit `try { … } catch (Exception e) { LOG.error(…) }`, is **confirmed real**. Finding kept as minor. **Memora persistence:** create attempted; permission denied — best-effort record-keeping not available this run. Kept 1 agreed/verified finding (unique-to-A, verified). Dropped 0. All 23 prior-round findings confirmed closed by both reviewers. ### Blast Radius The diff introduces a net-new Spring Boot autoconfiguration module (outbox-publisher) that exports shared JPA entities, a transactional publisher, and a ShedLock-coordinated poller to all 7 aim2be JPA services. The exported autoconfiguration surface and `OutboxRecord` entity schema affect every consuming service that enables `im2be.outbox.enabled=true`, but the module itself is additive and does not modify any existing shared contracts or migration files. **BLAST_SCORE: 5/10** ### Risk Indicators | Indicator | Value | |---|---| | Sensitive functions | `OutboxPublisher.handleAsyncAck`, `OutboxPublisher.persistFailureAttempt`, `OutboxPublishCompletion.markSent`, `OutboxPublishCompletion.markFailureAttempt`, `OutboxPollerWorker.pollAndPublishPending`, `OutboxAutoConfiguration.outboxCircuitBreaker` | | Migration touched | — | | Test delta | — | | Dependency changes | — | ### CI status (head `5027e5d49d10`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/44/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/44/jobs/1) | ### Related PRs - PR-PLATFORM-2 (processed-kafka-events / consumer-side dedup) - PR-PLATFORM-3 (ApicurioCodec / Avro encoding) ### Findings (1) #### **[MINOR]** `completion.markSent()` exception inside `whenComplete` silently swallowed — no log, unlike `persistFailureAttempt` _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:320_ Verified at lines 306–362. `handleAsyncAck` opens a `try` block at line 316 with **only a `finally` clause** (lines 332–334, `span.end()`); there is no `catch`. If `completion.markSent()` at line 320 throws (e.g. `CannotCreateTransactionException`, any `DataAccessException`, or another unchecked exception from the `REQUIRES_NEW` transaction), the exception propagates out of `handleAsyncAck`, is re-thrown inside the `whenComplete` lambda, and is silently discarded by `CompletableFuture` machinery — no log entry, no metric, no alert. Contrast with `persistFailureAttempt` (lines 350–362), which wraps the equivalent `completion.markFailureAttempt()` call in an explicit `try { … } catch (Exception e) { LOG.error(…) }` block for exactly this reason. The poller provides eventual correctness (the PENDING row is picked up and marked SENT on the next tick), but operators have zero visibility into silent hot-relay bookkeeping failures. Add a defensive wrapper analogous to `persistFailureAttempt`: ```java try { completion.markSent(aggregateId, eventId, Instant.now()); } catch (Exception markSentError) { LOG.error("outbox.hot.publish unable to persist markSent eventId={} aggregateId={}: {}", eventId, aggregateId, markSentError.getMessage()); } ``` This mirrors the failure path's existing error-swallowing contract while restoring operator observability. ### Verdict **CONDITIONAL_APPROVE** --- <sub>hib-pr-reviewer • round 10 • 1 finding (1m) • 2026-05-27T02:03:00.895Z → 2026-05-27T02:04:32.287Z • posted-as: pr-reviewer-bot</sub> </details>
fix(outbox): R11 — symmetric markSent persist-with-log on async-ack success path (1 R10 MINOR)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 2s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
904b338de7
R10 verdict findings (kept=1):

(1) [MINOR, unique-to-A verified] OutboxPublisher.handleAsyncAck:320 — `completion.markSent(...)` was called directly inside the try/finally with NO catch. An exception from `markSent` (CannotCreateTransactionException, DataAccessException, etc.) would bubble through the `finally { span.end(); }`, exit handleAsyncAck, and then be silently discarded by `CompletableFuture.whenComplete` — the hot-relay would have NO visibility into markSent failures. Asymmetric with the failure path (`persistFailureAttempt`) which already wraps `completion.markFailureAttempt(...)` in try/catch(Exception) + LOG.error.

Fix: extracted `persistMarkSent(aggregateId, eventId)` helper that mirrors `persistFailureAttempt`'s shape — calls `completion.markSent(...)` inside try { … } catch (Exception persistError) { LOG.error(...) }. handleAsyncAck's success branch now calls `persistMarkSent(...)` instead of directly invoking completion.markSent. The catch broadens to Exception (not just DataAccessException) so sibling transaction-construction exceptions (CannotCreateTransactionException extends TransactionException, NOT DataAccessException) are also swallowed-with-log.

Verification:
- mvn -B -pl outbox-publisher test → 67 tests passed
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]

Ripple-effect inspection (rule 63):
- Behavior change is strictly additive: success-path markSent failures now produce an ERROR log line where previously they were silent. No existing semantics are altered.
- handleAsyncAck's `finally { span.end(); }` continues to fire regardless of the markSent outcome (the new try{} is inside persistMarkSent's own scope).
- The success branch's metrics.recordPublishLatency(...) call still runs after persistMarkSent returns — markSent failures no longer skip the latency observation (a minor improvement: the publish DID succeed at the Kafka level even if the DB write failed).

Superseded by round 12.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 11 — head 904b338de73d, base main, trigger synchronize

TL;DR: CONDITIONAL_APPROVE — kept 1 agreed finding (verified at HEAD), 0 dropped; sole remaining issue is a minor retry-budget mischarge in tryColdPublish when markSent throws after a successful Kafka send.

Summary

Round 11 Arbitration

Memora context: No prior run history found (first arbitration stored as memory ID 299). No reusable submodule patterns on record.

Reconciliation: Both A and B raised exactly one finding each, pointing at the same file (OutboxPollerWorker.java), the same line (214), and the same root cause: completion.markSent() and metrics.recordPublishLatency() at lines 214–217 are co-located inside the Kafka-send try block (lines 211–232). When the Kafka send at lines 212–213 succeeds but completion.markSent() subsequently throws (e.g. CannotCreateTransactionException), the exception falls through to the generic catch (Exception sendError) at line 229, which calls persistFailureOutcome(record, sendError) at line 230 — spuriously incrementing retries on a row whose message was already delivered.

Verification at HEAD confirmed:

  • Lines 211–232 of OutboxPollerWorker.java match the structure both reviewers described exactly.
  • completion.markSent() at line 214 and metrics.recordPublishLatency() at lines 215–217 are inside the same try block as the Kafka .get() call.
  • catch (Exception sendError) at line 229 → persistFailureOutcome(record, sendError) at line 230 covers all of these.
  • The hot-path contrast (persistMarkSent() helper at OutboxPublisher.java:350) was also verified: it has a dedicated catch (Exception persistError) that logs without calling persistFailureOutcome, so the hot path correctly isolates DB-layer failures from Kafka-delivery failures.

Outcome: 1 agreed finding kept; 0 unique-to-one findings to evaluate.

Blast Radius

The defect is confined to OutboxPollerWorker.tryColdPublish, a single package-private method in one class. It affects all outbox publishers using the cold-relay path, but the impact is limited to partial-DB-connectivity edge cases where markSent fails after a successful Kafka ack. No shared interfaces, no migrations, no exported API contracts are affected.

BLAST_SCORE: 3/10

Risk Indicators

Indicator Value
Sensitive functions tryColdPublish, persistFailureOutcome, completion.markSent, completion.markFailureAttempt
Migration touched
Test delta
Dependency changes

CI status (head 904b338de73d)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details

Findings (1)

[MINOR] completion.markSent() and metrics calls in tryColdPublish share the Kafka-send try block — DB/metrics exceptions incorrectly charge the retry budget on delivered rows

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:214

In tryColdPublish (lines 211–232), kafkaTemplate.send(...).get(sendTimeoutMs, MILLISECONDS) (lines 212–213), completion.markSent(...) (line 214), and metrics.recordPublishLatency(...) (lines 215–217) are all inside a single try block. The catch ladder at lines 219–231 has specific handlers for InterruptedException, TimeoutException, and ExecutionException, but the terminal catch (Exception sendError) at line 229 captures any remaining exception — including DataAccessException subtypes thrown by markSent when the DB connection is momentarily unavailable after a successful Kafka ack.

When that path is taken, persistFailureOutcome(record, sendError) at line 230 is called, which calls completion.markFailureAttempt() and increments retries on a row whose message was already delivered to Kafka. Under sustained partial DB connectivity (markSent fails every tick, markFailureAttempt intermittently succeeds), the retry counter creeps upward until retries ≥ maxRetries and the CASE-WHEN in markFailureAttempt transitions the row to FAILED — despite the message having been published maxRetries times.

The hot-relay path received this exact fix in round 10: OutboxPublisher.persistMarkSent() (line 350) wraps completion.markSent() in a dedicated catch (Exception persistError) that logs and returns normally, never calling persistFailureOutcome. The cold poller needs the same split:

// (1) Kafka send — own try/catch
try {
    kafkaTemplate.send(record.getTopic(), key, record.getPayloadBytes())
            .get(sendTimeoutMs, TimeUnit.MILLISECONDS);
} catch (InterruptedException ie) {
    throw ie;
} catch (TimeoutException | ExecutionException | RuntimeException sendError) {
    persistFailureOutcome(record, sendError);
    return false;
}

// (2) Post-ack persistence — own try/catch; DB failure does NOT charge retries
try {
    completion.markSent(record.getAggregateId(), record.getEventId(), Instant.now());
    metrics.recordPublishLatency(record.getTopic(), Duration.between(startedAt, Instant.now()));
} catch (Exception completionError) {
    LOG.error("outbox.cold.publish unable to persist markSent eventId={} aggregateId={}: {}",
            record.getEventId(), record.getAggregateId(), completionError.getMessage());
    // Row stays PENDING; next tick re-attempts markSent without consuming retry budget.
}
return true;  // Kafka ack = success regardless of DB state

This mirrors the hot-path fix exactly and prevents false-positive FAILED transitions caused by transient DB connectivity issues after successful Kafka delivery.

Verdict

CONDITIONAL_APPROVE


hib-pr-reviewer • round 11 • 1 finding (1m) • 2026-05-27T02:12:25.145Z → 2026-05-27T02:14:08.653Z • posted-as: pr-reviewer-bot

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 12._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 11** — head `904b338de73d`, base `main`, trigger `synchronize` **TL;DR:** CONDITIONAL_APPROVE — kept 1 agreed finding (verified at HEAD), 0 dropped; sole remaining issue is a minor retry-budget mischarge in `tryColdPublish` when `markSent` throws after a successful Kafka send. ### Summary ## Round 11 Arbitration **Memora context:** No prior run history found (first arbitration stored as memory ID 299). No reusable submodule patterns on record. **Reconciliation:** Both A and B raised exactly one finding each, pointing at the same file (`OutboxPollerWorker.java`), the same line (214), and the same root cause: `completion.markSent()` and `metrics.recordPublishLatency()` at lines 214–217 are co-located inside the Kafka-send `try` block (lines 211–232). When the Kafka send at lines 212–213 succeeds but `completion.markSent()` subsequently throws (e.g. `CannotCreateTransactionException`), the exception falls through to the generic `catch (Exception sendError)` at line 229, which calls `persistFailureOutcome(record, sendError)` at line 230 — spuriously incrementing `retries` on a row whose message was already delivered. **Verification at HEAD** confirmed: - Lines 211–232 of `OutboxPollerWorker.java` match the structure both reviewers described exactly. - `completion.markSent()` at line 214 and `metrics.recordPublishLatency()` at lines 215–217 are inside the same `try` block as the Kafka `.get()` call. - `catch (Exception sendError)` at line 229 → `persistFailureOutcome(record, sendError)` at line 230 covers all of these. - The hot-path contrast (`persistMarkSent()` helper at `OutboxPublisher.java:350`) was also verified: it has a dedicated `catch (Exception persistError)` that logs without calling `persistFailureOutcome`, so the hot path correctly isolates DB-layer failures from Kafka-delivery failures. **Outcome:** 1 agreed finding kept; 0 unique-to-one findings to evaluate. ### Blast Radius The defect is confined to `OutboxPollerWorker.tryColdPublish`, a single package-private method in one class. It affects all outbox publishers using the cold-relay path, but the impact is limited to partial-DB-connectivity edge cases where `markSent` fails after a successful Kafka ack. No shared interfaces, no migrations, no exported API contracts are affected. **BLAST_SCORE: 3/10** ### Risk Indicators | Indicator | Value | |---|---| | Sensitive functions | `tryColdPublish`, `persistFailureOutcome`, `completion.markSent`, `completion.markFailureAttempt` | | Migration touched | — | | Test delta | — | | Dependency changes | — | ### CI status (head `904b338de73d`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/45/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/45/jobs/1) | ### Findings (1) #### **[MINOR]** `completion.markSent()` and `metrics` calls in `tryColdPublish` share the Kafka-send try block — DB/metrics exceptions incorrectly charge the retry budget on delivered rows _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:214_ In `tryColdPublish` (lines 211–232), `kafkaTemplate.send(...).get(sendTimeoutMs, MILLISECONDS)` (lines 212–213), `completion.markSent(...)` (line 214), and `metrics.recordPublishLatency(...)` (lines 215–217) are all inside a single `try` block. The catch ladder at lines 219–231 has specific handlers for `InterruptedException`, `TimeoutException`, and `ExecutionException`, but the terminal `catch (Exception sendError)` at line 229 captures **any** remaining exception — including `DataAccessException` subtypes thrown by `markSent` when the DB connection is momentarily unavailable after a successful Kafka ack. When that path is taken, `persistFailureOutcome(record, sendError)` at line 230 is called, which calls `completion.markFailureAttempt()` and increments `retries` on a row whose message was **already delivered to Kafka**. Under sustained partial DB connectivity (markSent fails every tick, markFailureAttempt intermittently succeeds), the retry counter creeps upward until `retries ≥ maxRetries` and the CASE-WHEN in `markFailureAttempt` transitions the row to `FAILED` — despite the message having been published `maxRetries` times. The hot-relay path received this exact fix in round 10: `OutboxPublisher.persistMarkSent()` (line 350) wraps `completion.markSent()` in a dedicated `catch (Exception persistError)` that logs and returns normally, never calling `persistFailureOutcome`. The cold poller needs the same split: ```java // (1) Kafka send — own try/catch try { kafkaTemplate.send(record.getTopic(), key, record.getPayloadBytes()) .get(sendTimeoutMs, TimeUnit.MILLISECONDS); } catch (InterruptedException ie) { throw ie; } catch (TimeoutException | ExecutionException | RuntimeException sendError) { persistFailureOutcome(record, sendError); return false; } // (2) Post-ack persistence — own try/catch; DB failure does NOT charge retries try { completion.markSent(record.getAggregateId(), record.getEventId(), Instant.now()); metrics.recordPublishLatency(record.getTopic(), Duration.between(startedAt, Instant.now())); } catch (Exception completionError) { LOG.error("outbox.cold.publish unable to persist markSent eventId={} aggregateId={}: {}", record.getEventId(), record.getAggregateId(), completionError.getMessage()); // Row stays PENDING; next tick re-attempts markSent without consuming retry budget. } return true; // Kafka ack = success regardless of DB state ``` This mirrors the hot-path fix exactly and prevents false-positive `FAILED` transitions caused by transient DB connectivity issues after successful Kafka delivery. ### Verdict **CONDITIONAL_APPROVE** --- <sub>hib-pr-reviewer • round 11 • 1 finding (1m) • 2026-05-27T02:12:25.145Z → 2026-05-27T02:14:08.653Z • posted-as: pr-reviewer-bot</sub> </details>
fix(outbox): R12 — isolate tryColdPublish post-send DB/metrics from Kafka catches (1 R11 MINOR)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 2s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
c200f6362f
R11 verdict findings (kept=1):

(1) [MINOR, agreed A+B verified] OutboxPollerWorker.tryColdPublish — markSent + recordPublishLatency were co-located INSIDE the Kafka-send try block. If kafkaTemplate.send().get() succeeded but completion.markSent(...) subsequently threw (e.g. CannotCreateTransactionException, DataAccessException), the exception fell through to the generic `catch (Exception sendError)` → `persistFailureOutcome(record, sendError)` which spuriously bumped retries on a row whose message was already delivered. Same class of asymmetry that R10 caught on the hot path (and R11 fixed via the persistMarkSent helper).

Fix: split tryColdPublish into two phases:
- Phase 1 (Kafka send + specific catches): kafkaTemplate.send().get(timeout) inside the existing try with InterruptedException / TimeoutException / ExecutionException / generic Exception catches. All catches route to persistFailureOutcome (correct: the Kafka send itself failed).
- Phase 2 (post-send: markSent + metrics): only reached when the Kafka send succeeded. Each call gets its OWN try/catch:
  - completion.markSent(...) failure → LOG.error with note that "row remains PENDING; next poller tick will see the prior delivery as a duplicate and the consumer-side dedup guard will skip it" — does NOT call persistFailureOutcome.
  - metrics.recordPublishLatency(...) failure → LOG.warn (metric loss only, no row consequence).
- Returns `true` from both post-send paths because the Kafka ack stands.

The hot-path persistMarkSent helper (added in R11) follows the same shape — this R12 fix brings cold-path symmetry.

Verification:
- mvn -B -pl outbox-publisher test → 67 tests passed
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]

Ripple-effect inspection (rule 63):
- Existing tests for "kafkaFailureMarksFailureAttempt" / "sendTimeoutTreatedAsFailureAttempt" / "rowAtMaxRetriesTransitionsToFailed" / "rowBelowMaxRetriesIncrementsRetries" still pass because Phase 1's Kafka-send try block is unchanged. The behavior change is strictly that markSent failures NO LONGER route to persistFailureOutcome — a strict improvement, no test relied on the buggy path.
- The dedup-guard note in the LOG.error message is forward-looking — the processed-kafka-events module (PR-PLATFORM-2, already merged) implements compound-PK dedup; a re-delivery from the next poller tick is correctly idempotent at the consumer.

Superseded by round 13.

Show previous round

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 12 — head c200f6362fa6, base main, trigger synchronize

TL;DR: CONDITIONAL_APPROVE — kept 2 unique-to-one findings (both minor, both verified); no prior-round regressions; 25 historical findings confirmed resolved at HEAD.

Summary

Memora persist attempted (MCP tag-allowlist error on all attempts — server-side constraint). Reconciliation: both findings are unique-to-one; both verified against HEAD.

Finding A (verified): OutboxPollerWorkerTest.javapersistFailureSwallowsRepositoryException at line 259 confirmed; scanned through line 363 and found NO test covering the symmetric success path (markSent throws after kafkaTemplate.send().get() succeeds). The R11 production fix at OutboxPollerWorker.java:234–257 (separate try/catch blocks) is therefore unguarded by a regression test. KEPT.

Finding B (verified): All 5 cited call sites confirmed at HEAD — OutboxPublisher.java:354 and :383 pass persistError.getMessage() (String); OutboxPollerWorker.java:244, :255, :323 also pass .getMessage() as the final SLF4J format argument. SLF4J/Logback only auto-appends the stack trace when the last arg is the Throwable instance; using a String discards it entirely. KEPT.

Kept 2 (A=1 unique + B=1 unique), dropped 0. Both minor. Verdict: CONDITIONAL_APPROVE.

Blast Radius

The diff adds the full outbox-publisher library module — an exported auto-configuration surface consumed by all 7 aim2be JPA services. Changes to OutboxPublisher, OutboxPollerWorker, and OutboxAutoConfiguration affect every downstream consumer that sets im2be.outbox.enabled=true. The two remaining findings are minor (missing test + logging quality) and do not affect runtime correctness of the exported API.

BLAST_SCORE: 5/10

Risk Indicators

Indicator Value
Sensitive functions markSent, markFailureAttempt, persistMarkSent, persistFailureAttempt, persistFailureOutcome, tryColdPublish
Migration touched
Test delta
Dependency changes

CI status (head c200f6362fa6)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details

Findings (2)

[MINOR] R11 fix has no regression-guard test — tryColdPublish must return true and NOT call markFailureAttempt when markSent throws after a successful Kafka .get()

outbox-publisher/src/test/java/com/aim2be/platform/outbox/OutboxPollerWorkerTest.java:280

The R11 production fix (separating markSent + metrics into their own try/catch blocks after the Kafka send in tryColdPublish, lines 234–257 of OutboxPollerWorker.java) prevents a markSent DB exception from being routed to persistFailureOutcome, which would incorrectly decrement the row's retry budget even though the Kafka message was already delivered. This invariant is untested.

OutboxPollerWorkerTest already has persistFailureSwallowsRepositoryException (verified at lines 259–280) covering the analogous failure path (i.e., markFailureAttempt throwing returns false without propagation). The missing symmetric test for the success path:

@Test
void markSentFailureAfterSuccessfulSendReturnsTrueAndDoesNotIncrementRetries() throws Exception {
    final OutboxPollerWorker worker = newWorker();
    final OutboxRecord row = pendingRow();

    @SuppressWarnings("unchecked")
    final SendResult<byte[], byte[]> ok = (SendResult<byte[], byte[]>) mock(SendResult.class);
    when(kafkaTemplate.send(anyString(), any(byte[].class), any(byte[].class)))
            .thenReturn(CompletableFuture.completedFuture(ok));

    // markSent throws AFTER Kafka .get() succeeds — simulates a transient DB hiccup.
    org.mockito.Mockito.doThrow(
            new org.springframework.transaction.CannotCreateTransactionException("db-gone"))
            .when(completion).markSent(any(UUID.class), any(UUID.class), any());

    // Kafka send DID succeed — the method MUST return true.
    assertThat(worker.tryColdPublish(row)).isTrue();
    // The retry budget MUST NOT be charged for a delivered row.
    verify(completion, never()).markFailureAttempt(
            any(UUID.class), any(UUID.class), anyString(), anyInt());
}

Without this test, a future refactor that accidentally re-merges the post-send blocks back into the Kafka-send try would silently re-introduce the R11 regression — a row delivered to Kafka would have its retry budget decremented, eventually transitioning to FAILED despite successful delivery.

[MINOR] Five ERROR/WARN catch blocks discard stack traces — pass Throwable as the final SLF4J argument, not getMessage()

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:354

All five catch (Exception …) blocks that guard critical post-send persistence paths pass exception.getMessage() (a String) as the final SLF4J format argument, permanently discarding stack traces. Verified at HEAD:

  • OutboxPublisher.java:354 (persistMarkSent): persistError.getMessage()
  • OutboxPublisher.java:383 (persistFailureAttempt): persistError.getMessage()
  • OutboxPollerWorker.java:244 (tryColdPublish markSent catch): persistError.getMessage()
  • OutboxPollerWorker.java:255 (tryColdPublish metrics catch): metricsError.getMessage()
  • OutboxPollerWorker.java:323 (persistFailureOutcome catch): persistError.getMessage()

SLF4J/Logback appends the full stack trace automatically only when the last argument passed to the logger is the Throwable itself. Using getMessage() means that when these paths fire in production — a connection reset during markSent, a JPQL mapping failure, a Micrometer registry error — the only diagnostic is a one-line message with no frame information.

Fix: append the exception as a trailing argument (SLF4J treats a trailing Throwable specially even without a corresponding {} placeholder):

// Before
LOG.error("… eventId={} aggregateId={}: {}", eventId, aggregateId, persistError.getMessage());
// After — omit getMessage(), pass Throwable last
LOG.error("… eventId={} aggregateId={}", eventId, aggregateId, persistError);

Apply the same correction at OutboxPublisher.java:383, OutboxPollerWorker.java:244, OutboxPollerWorker.java:255, and OutboxPollerWorker.java:323.

Verdict

CONDITIONAL_APPROVE


hib-pr-reviewer • round 12 • 2 findings (2m) • 2026-05-27T02:22:05.364Z → 2026-05-27T02:24:03.898Z • posted-as: pr-reviewer-bot

<!-- hib-pr-reviewer collapsed --> > _Superseded by round 13._ <details> <summary>Show previous round</summary> ## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 12** — head `c200f6362fa6`, base `main`, trigger `synchronize` **TL;DR:** CONDITIONAL_APPROVE — kept 2 unique-to-one findings (both minor, both verified); no prior-round regressions; 25 historical findings confirmed resolved at HEAD. ### Summary Memora persist attempted (MCP tag-allowlist error on all attempts — server-side constraint). Reconciliation: both findings are unique-to-one; both verified against HEAD. **Finding A (verified):** `OutboxPollerWorkerTest.java` — `persistFailureSwallowsRepositoryException` at line 259 confirmed; scanned through line 363 and found NO test covering the symmetric success path (`markSent` throws after `kafkaTemplate.send().get()` succeeds). The R11 production fix at `OutboxPollerWorker.java:234–257` (separate try/catch blocks) is therefore unguarded by a regression test. KEPT. **Finding B (verified):** All 5 cited call sites confirmed at HEAD — `OutboxPublisher.java:354` and `:383` pass `persistError.getMessage()` (String); `OutboxPollerWorker.java:244`, `:255`, `:323` also pass `.getMessage()` as the final SLF4J format argument. SLF4J/Logback only auto-appends the stack trace when the last arg is the `Throwable` instance; using a String discards it entirely. KEPT. Kept 2 (A=1 unique + B=1 unique), dropped 0. Both minor. Verdict: CONDITIONAL_APPROVE. ### Blast Radius The diff adds the full outbox-publisher library module — an exported auto-configuration surface consumed by all 7 aim2be JPA services. Changes to `OutboxPublisher`, `OutboxPollerWorker`, and `OutboxAutoConfiguration` affect every downstream consumer that sets `im2be.outbox.enabled=true`. The two remaining findings are minor (missing test + logging quality) and do not affect runtime correctness of the exported API. **BLAST_SCORE: 5/10** ### Risk Indicators | Indicator | Value | |---|---| | Sensitive functions | `markSent`, `markFailureAttempt`, `persistMarkSent`, `persistFailureAttempt`, `persistFailureOutcome`, `tryColdPublish` | | Migration touched | — | | Test delta | — | | Dependency changes | — | ### CI status (head `c200f6362fa6`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/46/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/46/jobs/1) | ### Findings (2) #### **[MINOR]** R11 fix has no regression-guard test — `tryColdPublish` must return `true` and NOT call `markFailureAttempt` when `markSent` throws after a successful Kafka `.get()` _outbox-publisher/src/test/java/com/aim2be/platform/outbox/OutboxPollerWorkerTest.java:280_ The R11 production fix (separating `markSent` + `metrics` into their own `try/catch` blocks after the Kafka send in `tryColdPublish`, lines 234–257 of `OutboxPollerWorker.java`) prevents a `markSent` DB exception from being routed to `persistFailureOutcome`, which would incorrectly decrement the row's retry budget even though the Kafka message was already delivered. This invariant is untested. `OutboxPollerWorkerTest` already has `persistFailureSwallowsRepositoryException` (verified at lines 259–280) covering the analogous failure path (i.e., `markFailureAttempt` throwing returns `false` without propagation). The missing symmetric test for the success path: ```java @Test void markSentFailureAfterSuccessfulSendReturnsTrueAndDoesNotIncrementRetries() throws Exception { final OutboxPollerWorker worker = newWorker(); final OutboxRecord row = pendingRow(); @SuppressWarnings("unchecked") final SendResult<byte[], byte[]> ok = (SendResult<byte[], byte[]>) mock(SendResult.class); when(kafkaTemplate.send(anyString(), any(byte[].class), any(byte[].class))) .thenReturn(CompletableFuture.completedFuture(ok)); // markSent throws AFTER Kafka .get() succeeds — simulates a transient DB hiccup. org.mockito.Mockito.doThrow( new org.springframework.transaction.CannotCreateTransactionException("db-gone")) .when(completion).markSent(any(UUID.class), any(UUID.class), any()); // Kafka send DID succeed — the method MUST return true. assertThat(worker.tryColdPublish(row)).isTrue(); // The retry budget MUST NOT be charged for a delivered row. verify(completion, never()).markFailureAttempt( any(UUID.class), any(UUID.class), anyString(), anyInt()); } ``` Without this test, a future refactor that accidentally re-merges the post-send blocks back into the Kafka-send `try` would silently re-introduce the R11 regression — a row delivered to Kafka would have its retry budget decremented, eventually transitioning to `FAILED` despite successful delivery. #### **[MINOR]** Five ERROR/WARN catch blocks discard stack traces — pass `Throwable` as the final SLF4J argument, not `getMessage()` _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPublisher.java:354_ All five `catch (Exception …)` blocks that guard critical post-send persistence paths pass `exception.getMessage()` (a `String`) as the final SLF4J format argument, permanently discarding stack traces. Verified at HEAD: - `OutboxPublisher.java:354` (`persistMarkSent`): `persistError.getMessage()` - `OutboxPublisher.java:383` (`persistFailureAttempt`): `persistError.getMessage()` - `OutboxPollerWorker.java:244` (`tryColdPublish` markSent catch): `persistError.getMessage()` - `OutboxPollerWorker.java:255` (`tryColdPublish` metrics catch): `metricsError.getMessage()` - `OutboxPollerWorker.java:323` (`persistFailureOutcome` catch): `persistError.getMessage()` SLF4J/Logback appends the full stack trace automatically **only** when the last argument passed to the logger is the `Throwable` itself. Using `getMessage()` means that when these paths fire in production — a connection reset during `markSent`, a JPQL mapping failure, a Micrometer registry error — the only diagnostic is a one-line message with no frame information. **Fix:** append the exception as a trailing argument (SLF4J treats a trailing `Throwable` specially even without a corresponding `{}` placeholder): ```java // Before LOG.error("… eventId={} aggregateId={}: {}", eventId, aggregateId, persistError.getMessage()); // After — omit getMessage(), pass Throwable last LOG.error("… eventId={} aggregateId={}", eventId, aggregateId, persistError); ``` Apply the same correction at `OutboxPublisher.java:383`, `OutboxPollerWorker.java:244`, `OutboxPollerWorker.java:255`, and `OutboxPollerWorker.java:323`. ### Verdict **CONDITIONAL_APPROVE** --- <sub>hib-pr-reviewer • round 12 • 2 findings (2m) • 2026-05-27T02:22:05.364Z → 2026-05-27T02:24:03.898Z • posted-as: pr-reviewer-bot</sub> </details>
fix(outbox): R13 — 2 R12 MINOR (cold-path markSent regression test + SLF4J Throwable last-arg discipline)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 3s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
4775abfc2e
R12 verdict findings (kept=2):

(1) [MINOR, unique-to-A verified] OutboxPollerWorkerTest — the R12 production fix that splits tryColdPublish into Kafka-send (Phase 1) + post-send markSent+metrics (Phase 2) had NO regression test guarding the "Kafka send succeeds but markSent throws" path. A future revert or refactor could silently re-introduce the spurious retry-bump.

Fix: added `markSentFailureAfterSuccessfulSendDoesNotBumpRetries` test in OutboxPollerWorkerTest. Stubs kafkaTemplate.send to return a completed-OK future, stubs completion.markSent to throw CannotCreateTransactionException, asserts tryColdPublish returns true (Kafka ack stands) AND verify(completion, never()).markFailureAttempt(...) (no retries bump).

(2) [MINOR, unique-to-B verified] All persist-error LOG.{error,warn} sites passed `e.getMessage()` (String) as the last arg — SLF4J/Logback only auto-append the stack trace when the LAST argument is the Throwable instance. Using a String discards the stack trace entirely. Affected sites:
- OutboxPublisher.java:331 — kafka_error in handleAsyncAck
- OutboxPublisher.java:356 — persistMarkSent catch
- OutboxPublisher.java:383 — persistFailureAttempt catch
- OutboxPollerWorker.java:244 — markSent post-send catch (R12 fix)
- OutboxPollerWorker.java:255 — recordPublishLatency post-send catch (R12 fix)
- OutboxPollerWorker.java:323 — persistFailureOutcome outer catch

Fix: replaced `{}` placeholder + `e.getMessage()` arg with the bare Throwable instance at every site. SLF4J/Logback now print the full stack trace inline with the log line. Each log message body still includes the structured context (topic, eventId, aggregateId).

Verification:
- mvn -B -pl outbox-publisher test → 68 tests passed (1 new regression test)
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]

Ripple-effect inspection (rule 63):
- Log-format change is observable in operator logs: stack traces now print where previously only single-line messages did. This is a strict improvement for debugging.
- The new test exercises the exact Phase-2 path the R12 production fix introduced; it'd be the first to fail if Phase 2 regressed.

hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs)

Round 13 — head 4775abfc2e3c, base main, trigger synchronize

TL;DR: CONDITIONAL_APPROVE — kept 1 minor unique-to-A finding (verified at line 278); B returned no structured findings; no blocking issues.

Summary

Round 13 reconciliation — PR #4 (outbox-publisher full impl)

Memora context: No prior run memories found (tags blocked); created new entry (id=302) with round-13 summary.

Reviewer B: Returned findings: [] in their JSON (prose summary was a best-effort salvage from a truncated output). No structured findings to evaluate.

Reviewer A: 1 minor finding — metrics.recordFailure unguarded in persistFailureOutcome. This is unique-to-A and required verification.

Verification performed (Rule 2): Read OutboxPollerWorker.java lines 250–328 and 100–170.

  • Confirmed: Line 278 calls metrics.recordFailure(record.getTopic()) before the try block (which begins at line 292) that wraps completion.markFailureAttempt. If a metrics implementation throws from counter.increment(), markFailureAttempt is never called.
  • Confirmed propagation path: tryColdPublish (line 207) calls persistFailureOutcome inside catch blocks (lines 219, 222, 225). The per-row loop in pollAndPublishPending (lines 144–165) only catches InterruptedException — a non-interrupt exception escaping tryColdPublish propagates out of the for loop, past the outer try/finally, and aborts the entire tick.
  • Confirmed asymmetry: Success-path metrics.recordPublishLatency (lines 248–258) is wrapped in its own try/catch with a LOG.warn. The failure path has no equivalent guard.

Finding kept (1 of 1). All R1–R12 findings remain closed per the verified resolution table in Reviewer A's summary.

Blast Radius

The affected code path is confined to the cold-relay poller worker's failure-outcome persistence helper. Impact is limited to the outbox-publisher module's internal retry accounting; the hot-relay path and all other modules are unaffected.

BLAST_SCORE: 3/10

CI status (head 4775abfc2e3c)

Overall: ✗ failure

2 checks: 2 pending

Check State Link
im2be-platform-libs CI / mvn install (pull_request) pending details
im2be-platform-libs CI / mvn verify (main only) (pull_request) pending details

Findings (1)

[MINOR] metrics.recordFailure is unguarded in persistFailureOutcome — tick aborts if it throws, skipping markFailureAttempt

outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:278

In persistFailureOutcome, metrics.recordFailure(record.getTopic()) is called at line 278, before the try block (line 292) that protects completion.markFailureAttempt. If a custom MeterRegistry implementation throws from counter.increment(), the method unwinds without persisting the retry-count increment.

The exception then propagates through tryColdPublish's catch blocks (lines 219/222/225), up to the per-row loop in pollAndPublishPending. That loop only catches InterruptedException (lines 151–164); any other exception escapes the loop and aborts the entire tick, silently skipping all remaining batch rows.

Contrast with the success path (lines 248–258), where metrics.recordPublishLatency is symmetrically wrapped:

try {
    metrics.recordPublishLatency(record.getTopic(), Duration.between(startedAt, Instant.now()));
} catch (Exception metricsError) {
    LOG.warn("…", metricsError);
}

Fix: Move metrics.recordFailure(record.getTopic()) inside the existing try block before the markFailureAttempt call, or wrap it in its own try/catch consistent with the success-path pattern.

Verdict

CONDITIONAL_APPROVE


hib-pr-reviewer • round 13 • 1 finding (1m) • 2026-05-27T02:33:45.268Z → 2026-05-27T02:35:13.599Z • posted-as: pr-reviewer-bot

## hib-pr-reviewer review — PR #4 (affinity-intelligence-rework/im2be-platform-libs) **Round 13** — head `4775abfc2e3c`, base `main`, trigger `synchronize` **TL;DR:** CONDITIONAL_APPROVE — kept 1 minor unique-to-A finding (verified at line 278); B returned no structured findings; no blocking issues. ### Summary **Round 13 reconciliation — PR #4 (outbox-publisher full impl)** **Memora context:** No prior run memories found (tags blocked); created new entry (id=302) with round-13 summary. **Reviewer B:** Returned `findings: []` in their JSON (prose summary was a best-effort salvage from a truncated output). No structured findings to evaluate. **Reviewer A:** 1 minor finding — `metrics.recordFailure` unguarded in `persistFailureOutcome`. This is unique-to-A and required verification. **Verification performed (Rule 2):** Read `OutboxPollerWorker.java` lines 250–328 and 100–170. - **Confirmed:** Line 278 calls `metrics.recordFailure(record.getTopic())` *before* the `try` block (which begins at line 292) that wraps `completion.markFailureAttempt`. If a metrics implementation throws from `counter.increment()`, `markFailureAttempt` is never called. - **Confirmed propagation path:** `tryColdPublish` (line 207) calls `persistFailureOutcome` inside `catch` blocks (lines 219, 222, 225). The per-row loop in `pollAndPublishPending` (lines 144–165) only catches `InterruptedException` — a non-interrupt exception escaping `tryColdPublish` propagates out of the `for` loop, past the outer `try/finally`, and aborts the entire tick. - **Confirmed asymmetry:** Success-path `metrics.recordPublishLatency` (lines 248–258) is wrapped in its own try/catch with a `LOG.warn`. The failure path has no equivalent guard. **Finding kept (1 of 1).** All R1–R12 findings remain closed per the verified resolution table in Reviewer A's summary. ### Blast Radius The affected code path is confined to the cold-relay poller worker's failure-outcome persistence helper. Impact is limited to the `outbox-publisher` module's internal retry accounting; the hot-relay path and all other modules are unaffected. **BLAST_SCORE: 3/10** ### CI status (head `4775abfc2e3c`) **Overall: ✗ failure** 2 checks: 2 pending | Check | State | Link | |---|---|---| | im2be-platform-libs CI / mvn install (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/47/jobs/0) | | im2be-platform-libs CI / mvn verify (main only) (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-platform-libs/actions/runs/47/jobs/1) | ### Findings (1) #### **[MINOR]** `metrics.recordFailure` is unguarded in `persistFailureOutcome` — tick aborts if it throws, skipping `markFailureAttempt` _outbox-publisher/src/main/java/com/aim2be/platform/outbox/OutboxPollerWorker.java:278_ In `persistFailureOutcome`, `metrics.recordFailure(record.getTopic())` is called at line 278, **before** the `try` block (line 292) that protects `completion.markFailureAttempt`. If a custom `MeterRegistry` implementation throws from `counter.increment()`, the method unwinds without persisting the retry-count increment. The exception then propagates through `tryColdPublish`'s catch blocks (lines 219/222/225), up to the per-row loop in `pollAndPublishPending`. That loop only catches `InterruptedException` (lines 151–164); any other exception escapes the loop and aborts the entire tick, silently skipping all remaining batch rows. Contrast with the success path (lines 248–258), where `metrics.recordPublishLatency` is symmetrically wrapped: ```java try { metrics.recordPublishLatency(record.getTopic(), Duration.between(startedAt, Instant.now())); } catch (Exception metricsError) { LOG.warn("…", metricsError); } ``` **Fix:** Move `metrics.recordFailure(record.getTopic())` inside the existing `try` block before the `markFailureAttempt` call, or wrap it in its own try/catch consistent with the success-path pattern. ### Verdict **CONDITIONAL_APPROVE** --- <sub>hib-pr-reviewer • round 13 • 1 finding (1m) • 2026-05-27T02:33:45.268Z → 2026-05-27T02:35:13.599Z • posted-as: pr-reviewer-bot</sub>
fix(outbox): R14 — symmetric metrics.recordFailure guard in persistFailureOutcome (1 R13 MINOR)
Some checks failed
im2be-platform-libs CI / mvn install (pull_request) Failing after 3s
im2be-platform-libs CI / mvn verify (main only) (pull_request) Has been skipped
e85f37bd67
R13 verdict findings (kept=1):

(1) [MINOR, unique-to-A verified] OutboxPollerWorker.persistFailureOutcome — `metrics.recordFailure(record.getTopic())` at line 278 was called BEFORE the try block (which begins at line 292) that wraps `completion.markFailureAttempt`. If a metrics implementation throws from `counter.increment()`, markFailureAttempt is never called AND the exception propagates out of tryColdPublish, past the per-row loop's InterruptedException-only catch, aborting the entire poller tick.

Asymmetric with the success-path `metrics.recordPublishLatency` at ~line 250, which already has its own try/catch + LOG.warn.

Fix: wrapped `metrics.recordFailure(...)` in try/catch(Exception) with LOG.warn. A metrics-implementation throw is now contained — markFailureAttempt still runs, the tick continues, and operator gets a log line about the metrics failure.

Verification:
- mvn -B -pl outbox-publisher test → 68 tests passed
- mvn -B -pl outbox-publisher -am install -DskipITs → BUILD SUCCESS, zero [WARNING], zero [ERROR]
Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
affinity-intelligence-rework/im2be-platform-libs!4
No description provided.