- Java 98.1%
- Lua 1.9%
|
|
||
|---|---|---|
| .forgejo/workflows | ||
| apicurio-client | ||
| archunit-rules | ||
| error-event-publisher | ||
| outbox-publisher | ||
| outbox-test-fixtures | ||
| processed-kafka-events | ||
| redis-outbox-backend | ||
| .gitignore | ||
| CLAUDE.md | ||
| CODEOWNERS | ||
| LICENSE | ||
| pom.xml | ||
| README.md | ||
im2be-platform-libs
Stage B Wave PLATFORM v1.0 — shared Spring Boot 3.5.14 / Java 17 libraries for the 8 aim2be Java services.
Adopted as the substrate for the L-1 Full-Spec Library First decision per ADR-0014 D-2 plus CONT-3 Option A. The L-1 alternative (Strangler Fig family-first PoC) was rejected — see ADR-0014 §S-1-R4.
Status
This repo currently holds the v1.0 scaffold — module shells with
class signatures, auto-configuration wiring, and full Javadoc. Method
bodies throw UnsupportedOperationException("v1.0 scaffold — impl lands in PR-PLATFORM-N") where N is the implementing PR. Real logic ships in
follow-up PRs:
| PR | Scope |
|---|---|
| PR-PLATFORM-0 (this PR) | 5-module Maven scaffold, autoconfig stubs, ArchRule placeholder, test fixtures shape, CI |
| PR-PLATFORM-1 | outbox-publisher real impl — JPA entity, AFTER_COMMIT relay, @Scheduled poller, SENT transition semantics |
| PR-PLATFORM-2 | processed-kafka-events real impl — INSERT-OR-NOTHING dedup, week-aligned partitioning helper |
| PR-PLATFORM-3 | apicurio-client real impl — Apicurio primary, Confluent wire-format fallback |
| PR-PLATFORM-4 | outbox-test-fixtures real impl — @AutoConfigureOutbox, TestOutboxRecordCaptor |
| PR-PLATFORM-5 | archunit-rules real impl — EntityVersionParity with self-test (positive + negative fixture) |
| PR-PLATFORM-CI-1 | Forgejo Maven package registry deploy on tag push |
Module map
| Module | Purpose | Key deps |
|---|---|---|
outbox-publisher |
PG-backed transactional outbox. OutboxRecord JPA entity, OutboxPublisher AFTER_COMMIT relay, OutboxPollerWorker @Scheduled SENT transitions (per ADR-0014 D-4). Lua-Redis variant for identity-service lands in PR-PLATFORM-1-FOLLOWUP. |
spring-boot-starter-data-jpa, spring-kafka |
processed-kafka-events |
Consumer-side compound-PK dedup (consumer_scope_id, event_id, week_start) per ADR-0014 D-4. DedupGuard.tryClaim(...) / DedupGuard.tryClaimSameTx(...) are the public dedup API (see "Dedup delivery-guarantee" below). |
spring-boot-starter-data-jpa, spring-kafka |
apicurio-client |
SchemaRegistryClient abstraction with Apicurio primary impl + Confluent wire-format fallback path per ADR-0011 §B. Used by every Avro-emitting service. |
io.apicurio:apicurio-registry-client:2.5.11.Final |
outbox-test-fixtures |
@AutoConfigureOutbox test annotation + TestOutboxRecordCaptor assertion API. Per OQ-6 §1.5. |
spring-boot-starter-test (test scope), JUnit Jupiter |
archunit-rules |
EntityVersionParity.entitiesMustCarryVersion() — every @Entity carries @Version unless package-excluded (**.audit.**, **.replica.**) or class-excluded (@Immutable). Per ADR-0001 §1.4. NoBlanketCatch.noBlanketCatchOutsideBoundary() — no catch (Exception|Throwable|RuntimeException) outside a @RestControllerAdvice/@ControllerAdvice boundary (ADR-0013 §3, L0-T0 #7 OTEL sweep); inspects JavaCodeUnit.getTryCatchBlocks(); boundary matched by FQN string, direct or meta-annotated (no spring-web compile dep), so a composed @GlobalExceptionHandler stereotype still counts as a boundary. RuntimeException is flagged because the typed XxxError domain types are unchecked — catching it erases the signal just as catch (Exception) does. NoBlanketCatchArchTest.evaluate(classes) is the preferred single-scan entry point: it evaluates the rule ONCE and returns a WarnResult(count, report) record carrying both the ratchet count and the warn-mode report — avoiding the double evaluation of calling the two legacy accessors back-to-back. The legacy NoBlanketCatchArchTest.evaluateForWarning(classes) (warn-mode report; services log it, don't fail) and NoBlanketCatchArchTest.violationCount(classes) (stable violation count — ArchUnit FailureReport detail size, not a line count — for a ratchet assertion: pin the W1 baseline, assert violationCount(...) <= it, so existing debt is tolerated while new blanket catches fail) are retained for back-compat. Flip to a hard @ArchTest once typed. |
com.tngtech.archunit:archunit-junit5:1.3.0 |
error-event-publisher |
Durable cross-service error-event Kafka stream per ADR-0015. ErrorEventPublisher.publish(ErrorEvent) is a best-effort facade (never throws — observability must not break the caller); it delegates to ErrorEventOutboxWriter, which writes the raw protobuf observability.v1.ErrorEvent to im2be.error.event via the ADR-0014 outbox in a @Transactional(REQUIRES_NEW) boundary (so the error survives a rolling-back business tx — ADR-0015 D-3). Raw bytes, not Apicurio (D-2: registry-outage resilience); partition key = trace_id. Wired into each service's OtelErrorEvents.record(...) so one catch-site call emits span-event + counter + durable Kafka event. Enable via im2be.error-event.enabled=true (default off). |
com.aim2be:outbox-publisher, com.aim2be:proto-observability:1.0.0 |
Dedup delivery-guarantee — tryClaim vs tryClaimSameTx
DedupGuard exposes two propagation variants of the same INSERT-OR-NOTHING
claim. They share one core and differ only in @Transactional propagation;
that single difference flips the delivery guarantee on the business-failure
path. Pick per consumer:
tryClaim(...) (REQUIRES_NEW) |
tryClaimSameTx(...) (MANDATORY) |
|
|---|---|---|
| Claim transaction | Separate — commits independently of the caller. | The caller's — commits/rolls back atomically with it. |
| Caller business logic then fails | Claim is already committed → event is NOT redelivered → at-most-once-on-failure (the event is effectively dropped; no poison-message redelivery loop). | Claim rolls back WITH the business tx → event is redelivered and re-processed → at-least-once. |
| Needs an active caller tx? | No — opens its own. | Yes — fails fast with IllegalTransactionStateException if none (MANDATORY, not REQUIRED, is deliberate: it asserts the consumer is transactional rather than silently degrading to at-most-once — mirrors the OutboxPublisher.publish(...) fail-fast philosophy, ADR-0014 D-7). |
| Use when | Consumer is idempotent OR loss-tolerant — e.g. push notifications, where re-sending on every retry is worse than occasionally dropping one. | Every event MUST be processed at least once AND the consumer has a DLT + retry-limit. |
WARNING (
tryClaimSameTx): because the claim rolls back with the business transaction, a deterministically-failing ("poison") event is redelivered and re-fails forever unless the consumer caps redelivery. Callers oftryClaimSameTxMUST configure a Dead-Letter Topic plus a bounded retry/back-off (e.g. spring-kafkaDefaultErrorHandlerwith aDeadLetterPublishingRecovererand a boundedBackOff); otherwise the partition stalls on the first un-processable record.tryClaimhas no such requirement precisely because it never redelivers.
Both come in an explicit-scope overload and a no-scope overload (the latter
uses im2be.dedup.default-scope-id); per-pod broadcast consumers
(ADR-0014 §L-12) must use the explicit-scope overload with a pod-suffixed
scope.
Resolving the dedup event_id — KafkaHeaderUtils.resolveEventId(...)
KafkaHeaderUtils.resolveEventId(ConsumerRecord<?,?>) derives the
event_id that feeds the DedupGuard claim. It reads the event_id Kafka
header (decoded UTF-8, trimmed; last value wins) and falls back to the
deterministic topic:partition:offset coordinates when the header is absent,
null, or blank. This is the canonical extraction of the helper that all five
consumer services (user / family / notification / calendar / diary) previously
carried as a private copy in their KafkaConsumerService; consumer adoption
(replacing each private copy with this shared util) is a separate phase — the
services are not changed by the PR that introduces the util.
Outbox relay producer — dedicated raw-bytes KafkaTemplate
The outbox relay (OutboxPublisher AFTER_COMMIT + OutboxPollerWorker) sends
raw byte[] keys and values — protobuf payloads persisted verbatim, no
Apicurio envelope (ADR-0014 D-2). It therefore needs a ByteArraySerializer
producer, which is not what a consuming service's default KafkaTemplate
carries (the 7 services configure String/JSON serializers for their own domain
events).
OutboxAutoConfiguration#buildOutboxKafkaTemplate(...) builds that producer
internally — derived from the service's own spring.kafka.* config (bootstrap
servers, security, SSL bundle) via
KafkaProperties.buildProducerProperties(SslBundles), then forcing five
knobs: key + value ByteArraySerializer, acks=all,
enable.idempotence=true, and max.in.flight.requests.per.connection=5
(idempotence requires acks=all AND max.in.flight <= 5, so pinning all three
keeps a stronger-throughput service config from throwing at first send). The
durability knobs are forced (not
putIfAbsent-ed) because a durable error-event relay must never silently
inherit a weaker acks from a service's general-purpose producer — it owns its
at-least-once guarantee. (No-op for the 7 services, which already set
acks=all + idempotence; load-bearing only against a weaker config.)
Why built inside the bean method, not published as a
@Bean:KafkaAutoConfiguration'skafkaTemplate/kafkaProducerFactoryare@ConditionalOnMissingBean, so contributing a competingKafkaTemplateorProducerFactorybean would suppress the service's default producer and break every non-outbox Kafka send. Keeping the byte[] template private to the relay leaves the service's default producer untouched.
The underlying DefaultKafkaProducerFactory is built once and cached on the
auto-configuration singleton (outboxProducerFactory field), so the
outboxPublisher (AFTER_COMMIT relay) and outboxPollerWorker (@Scheduled
poller) bean methods share one broker connection pool rather than opening
two. Because the factory is not a managed bean it would never receive Spring's
DisposableBean callback, so a @PreDestroy (closeOutboxProducerFactory)
calls factory.destroy() at context shutdown — flushing buffered records and
releasing the sender thread + broker TCP connections (a clean rolling-deploy is
load-bearing for an at-least-once relay).
This closed a #332 silent-failure: the autoconfig previously injected a
KafkaTemplate<byte[], byte[]> bean, but with no byte[] producer present Spring
satisfied it from the service's String/JSON template by generic erasure → every
outbox row failed with SerializationException: Can't convert key of class [B to StringSerializer. OutboxPublisherIT now runs with a String-serializer
service default (mirroring production) + acks=1, proving the relay's own
template is independent of — and strengthens — the service's producer config.
Distributed-trace carry through the outbox (GC-23 (b))
The outbox stores-and-forwards, so without help a business event's trace would
end at the outbox write and the relay would start a fresh trace — breaking
end-to-end correlation (producing service → Kafka → consuming relay). The
outbox-publisher closes that gap with a W3C traceparent carry
(OutboxTraceContext):
- Capture at write.
OutboxPublisher.publish(...)captures the producing span's W3Ctraceparent(+tracestate) at outbox-write time — while the business span is current — and persists them on the row (trace_parent/trace_statecolumns onOutboxRecord). - Re-emit at publish. Both relays add the trace context as Kafka message
headers (
traceparent/tracestate): the AFTER_COMMIT hot relay nests aim2be.outbox.publish.hotspan under the restored business context and emits that span's context; the@Scheduledcold poller (which may run after a restart, with no live context) re-emits the persisted strings verbatim. - Continue at the consumer. A consuming relay that extracts the
traceparentheader (e.g.im2be-realtime-service's notification relay) continues the SAMEtrace_id→ one trace from business event to client.
W3C is used explicitly (W3CTraceContextPropagator), not the service's
configured propagator, so the traceparent header the polyglot consumer expects
is always emitted (ADR-0013). The carry is a no-op when no recording span is
active (OpenTelemetry.noop() / sampled-out): the persisted trace_parent is
NULL, no header is emitted, and downstream roots its own trace — no regression.
Schema migration for consuming services. The new
trace_parent/trace_statecolumns are added to theOutboxRecordentity. Local/k3d services usingspring.jpa.hibernate.ddl-auto=updatepick them up automatically. A service that owns an explicit Flyway/Liquibase migration foroutbox_records(dev/stage/prod) MUST addALTER TABLE outbox_records ADD COLUMN trace_parent VARCHAR(64), ADD COLUMN trace_state VARCHAR(512);(both nullable) before deploying the upgraded library.
Tech stack
- Java 17 (Spring Boot 3.5.x minimum)
- Spring Boot 3.5.14 (per L-21 harmonization target; OSS-supported to
2026-06-30 —
L-21-FOLLOWUPqueued to bump to 4.0.6 within 5 weeks) - Maven 3.9+
- JPA via Hibernate (Spring Boot defaults)
- spring-kafka (Spring Boot defaults)
- Apicurio Registry client 2.5.11.Final
- ArchUnit JUnit5 1.3.0
Dev commands
# Full build + tests (parent aggregator drives all 5 modules)
mvn install
# Per-module test runs
mvn -pl outbox-publisher test
mvn -pl processed-kafka-events test
mvn -pl apicurio-client test
mvn -pl outbox-test-fixtures test
mvn -pl archunit-rules test
# Compile-only quick check
mvn -B compile
# Install to local Maven repo (without running tests — for service-side
# experiments only; never skip tests on a real PR)
mvn -DskipTests=true install
Build MUST exit 0 with zero warnings — rule 62 (clean-compilation).
Distribution
Day-1: library is installed locally via mvn install; consuming services
list com.aim2be:im2be-platform-libs-parent as a dependencyManagement
import and per-module artefacts as plain <dependency> entries.
Tag-push-time: Forgejo's native Maven package registry receives the
deploy (configured in PR-PLATFORM-CI-1 — distributionManagement + a
deploy-plugin invocation in .forgejo/workflows/ci.yml). Versions follow
semver: MAJOR.MINOR.PATCH; breaking changes bump MAJOR and require an
ADR amendment.
Library governance
- Named owner:
@hibryda(per CODEOWNERS). - Named deputy: placeholder — operator will appoint when team grows.
- 48h PR-review SLA; after 48h any senior engineer may merge with 2 peer approvals (branch protection enforces 2-approval minimum).
- Cross-module API changes require an ADR amendment in
im2be-monobefore merge. - Version bumps are operator-driven via the meta-repo submodule-bump pattern (same playbook as PR-OPAQUE-4-FOLLOWUP). No Renovate Bot — library is internal-only.
Cross-references
- ADR-0001 §1.4 —
@Versionparity rule (archunit-rules) - ADR-0002 §3b — Avro ticket lifecycle (apicurio-client consumer)
- ADR-0011 §B — Apicurio + Confluent fallback (apicurio-client)
- ADR-0014 D-2 / D-4 — Outbox + record_version parity (outbox-publisher, processed-kafka-events)
- .planning/26-stage-b-outbox-parity.md OQ-2 / OQ-6 / OQ-7
License
Apache-2.0 (LICENSE file added in PR-PLATFORM-0-FOLLOWUP — same template as identity-service).