Class RedisDurabilityValidator
- All Implemented Interfaces:
org.springframework.beans.factory.InitializingBean
With appendfsync everysec, a crash loses up
to ~1s of acknowledged writes, and under any eviction policy other than
noeviction an OOM silently drops keys instead of failing the write
closed. This validator asserts that the store is configured for durable,
fail-closed writes, both at startup (InitializingBean) and periodically
(@Scheduled), so that a live CONFIG SET that weakens durability after
boot is detected. The two phases have different enforcement (reviewer R3): the
startup gate halts bootstrap on a FAIL_FAST violation; the periodic
check can only alert, because a @Scheduled method cannot abort a running app
(see periodicCheck()).
Required configuration:
- appendonly = yes: AOF persistence on (RDB-only would lose all writes since the last snapshot on crash).
- appendfsync ∈ {everysec, always}: no means the OS decides when to flush (unbounded loss window). When require-always-fsync=true (audit-grade), only always passes.
- maxmemory-policy = noeviction: turns an OOM into a fail-closed write error (correct only when the mint fails closed and alerts, which the enqueue script and this validator together ensure); any eviction policy could silently evict an un-relayed outbox entry.
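The three rules above can be sketched as a pure function over the values that CONFIG GET would return. This is only an illustration under stated assumptions: the class name DurabilityRulesSketch, the method violations, and the message strings are hypothetical and are not taken from the validator's actual source; the map stands in for the live Redis configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the documented API. */
final class DurabilityRulesSketch {

    /** Returns one message per violated rule; an empty list means the config passes. */
    static List<String> violations(Map<String, String> config, boolean requireAlwaysFsync) {
        List<String> found = new ArrayList<>();

        // Rule 1: AOF persistence must be on.
        if (!"yes".equals(config.get("appendonly"))) {
            found.add("appendonly must be yes");
        }

        // Rule 2: "no" never passes; "everysec" passes only when audit-grade fsync is not required.
        String fsync = config.get("appendfsync");
        boolean fsyncOk = "always".equals(fsync)
                || (!requireAlwaysFsync && "everysec".equals(fsync));
        if (!fsyncOk) {
            found.add(requireAlwaysFsync
                    ? "appendfsync must be always"
                    : "appendfsync must be everysec or always");
        }

        // Rule 3: any eviction policy other than noeviction could silently drop outbox entries.
        if (!"noeviction".equals(config.get("maxmemory-policy"))) {
            found.add("maxmemory-policy must be noeviction");
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed sample values; a real check would read them from the server.
        Map<String, String> cfg = Map.of(
                "appendonly", "yes",
                "appendfsync", "everysec",
                "maxmemory-policy", "allkeys-lru");
        System.out.println(violations(cfg, false));
        System.out.println(violations(cfg, true));
    }
}
```

Keeping the rules separate from the Redis call makes them testable without a server; a missing key is treated as a violation because the lookup returns null.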
On violation: RedisOutboxProperties.EnforcementMode.FAIL_FAST at
STARTUP throws (aborting bootstrap before any event is relayed against a
non-durable store); on the PERIODIC path it instead emits an explicit ERROR
alert (a throw cannot halt a live app; see periodicCheck()).
RedisOutboxProperties.EnforcementMode.WARN logs loudly and lets the app
run (dev only). Both modes increment
im2be_outbox_redis_durability_violations_total on every violation.
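A minimal sketch of that mode-by-phase behaviour follows. It is an assumption-laden illustration, not the validator's code: EnforcementSketch, Mode, Phase, DurabilityViolation, and violationsTotal are hypothetical stand-ins for RedisOutboxProperties.EnforcementMode, RedisOutboxDurabilityException, and the im2be_outbox_redis_durability_violations_total counter, and the "WARN" log level for WARN mode is assumed (the source only says it logs loudly).

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of the enforcement matrix; not the real validator. */
final class EnforcementSketch {

    enum Mode { FAIL_FAST, WARN }
    enum Phase { STARTUP, PERIODIC }

    /** Stand-in for RedisOutboxDurabilityException. */
    static final class DurabilityViolation extends RuntimeException {
        DurabilityViolation(String message) { super(message); }
    }

    /** Stand-in for the durability-violation counter metric. */
    static final AtomicLong violationsTotal = new AtomicLong();

    /**
     * Handles one detected violation. Throws only for FAIL_FAST at STARTUP;
     * otherwise logs and returns the level used.
     */
    static String onViolation(Mode mode, Phase phase, String detail) {
        violationsTotal.incrementAndGet(); // both modes count every violation

        if (mode == Mode.FAIL_FAST && phase == Phase.STARTUP) {
            // Only the startup gate can halt the app, by failing bean initialization.
            throw new DurabilityViolation(detail);
        }
        // FAIL_FAST on the periodic path alerts at ERROR instead of throwing;
        // WARN mode logs and continues (level name assumed).
        String level = (mode == Mode.FAIL_FAST) ? "ERROR" : "WARN";
        System.err.println(level + " redis durability violation: " + detail);
        return level;
    }

    public static void main(String[] args) {
        onViolation(Mode.WARN, Phase.STARTUP, "appendfsync=no");
        onViolation(Mode.FAIL_FAST, Phase.PERIODIC, "maxmemory-policy=allkeys-lru");
        System.out.println("violations counted: " + violationsTotal.get());
    }
}
```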
Doc-grounded note (rule 61). A WAIT-after-EVAL fsync barrier
was considered (design doc §7) but VERIFIED to be a no-op for AOF durability
in a standalone deployment (https://redis.io/docs/latest/commands/wait/
— WAIT is a replication barrier; with zero replicas it returns
immediately and does NOT flush the AOF). The appendfsync policy is the
only real single-node durability lever — hence this validator enforces it
rather than relying on a barrier (see
RedisOutboxProperties.Durability.isFsyncBarrierEnabled()).
-
Constructor Summary
Constructors
Constructor | Description
RedisDurabilityValidator(org.springframework.data.redis.core.StringRedisTemplate redis, RedisOutboxProperties.Durability config, RedisOutboxMetrics metrics) |
-
Method Summary
Modifier and Type | Method | Description
void | afterPropertiesSet() | Startup gate.
void | periodicCheck() | Periodic re-check.
-
Constructor Details
-
RedisDurabilityValidator
public RedisDurabilityValidator(org.springframework.data.redis.core.StringRedisTemplate redis, RedisOutboxProperties.Durability config, RedisOutboxMetrics metrics)
- Parameters:
redis - Redis template (used for CONFIG GET); never null
config - durability config (enforcement mode, always-fsync flag); never null
metrics - metrics binder (durability-violation counter); never null
-
-
Method Details
-
afterPropertiesSet
public void afterPropertiesSet()
Startup gate. In FAIL_FAST mode a violation throws here, aborting application bootstrap before any event is relayed against a non-durable store.
- Specified by:
afterPropertiesSet in interface org.springframework.beans.factory.InitializingBean
- Throws:
RedisOutboxDurabilityException - in FAIL_FAST mode when the store is not durably configured
-
periodicCheck
@Scheduled(fixedDelayString="${im2be.outbox.redis.durability.check-interval-ms:60000}")
public void periodicCheck()
Periodic re-check. Detects a live CONFIG SET that weakens durability after boot. Cadence is governed by im2be.outbox.redis.durability.check-interval-ms.
Does NOT halt a running app (reviewer R3). A @Scheduled method cannot abort the application: Spring's default scheduled-task ErrorHandler (LOG_AND_SUPPRESS_ERROR_HANDLER) logs and swallows any thrown exception and re-schedules the next tick. So in FAIL_FAST mode a post-boot violation is converted here into an explicit, actionable ERROR alert (paired with the im2be_outbox_redis_durability_violations_total metric) rather than thrown. The startup gate remains the hard fail-closed check; live regressions are alerted, and an operator restores durability (or restarts to re-trigger the startup gate). The exception is intentionally NOT propagated out of this method.
-