fix(user/v1): cross-provider link semantics + fault status (user-service #25 R2) #7

Merged
hibryda merged 1 commit from fix/provisioning-cross-provider-and-fault-status into main 2026-06-02 12:15:38 +02:00
Owner

Two comment-only contract fixes surfaced by user-service PR #25 R2 (impl corrected there):

  • RESOLVED_NEW_PROVIDER_LINK broadened to cover the cross-provider case (account already bound to a different provider): binding PRESERVED, no backfill, but still a security-sensitive new-provider access → MUST step-up. (#25 R2 BLOCKING was the impl returning RESOLVED_EXISTING here = step-up bypass = account-takeover; fixed in user-service.)
  • Persistence/Kafka fault → UNAVAILABLE (retryable), matching the impl; INTERNAL reserved for unexpected faults.
    buf clean; comment-only, no wire change.

R1 MINOR closure (reviewer-requested — documented per the reviewer's own "call that out in the PR description" guidance): the INTERNALUNAVAILABLE change has no current consumer to break. social-login-service is not yet a caller of UserProvisioningService — this contract is unreleased with zero shipped consumers; the caller is PR-2 (not started). There is no existing StatusCode.INTERNAL-specific branch in social-login to silently break. The contract already mandates the caller fail the OAuth login closed on the fault and never fall back to a locally-minted UUID (lines 60-63) — it does not instruct branching on a specific status code. PR-2's new caller will fail closed on any non-OK status (UNAVAILABLE included), not a code-specific branch, so the concern is moot by construction. Recorded for PR-2 in Memora #3701.

Two comment-only contract fixes surfaced by user-service PR #25 R2 (impl corrected there): - RESOLVED_NEW_PROVIDER_LINK broadened to cover the cross-provider case (account already bound to a different provider): binding PRESERVED, no backfill, but still a security-sensitive new-provider access → MUST step-up. (#25 R2 BLOCKING was the impl returning RESOLVED_EXISTING here = step-up bypass = account-takeover; fixed in user-service.) - Persistence/Kafka fault → UNAVAILABLE (retryable), matching the impl; INTERNAL reserved for unexpected faults. buf clean; comment-only, no wire change. --- **R1 MINOR closure (reviewer-requested — documented per the reviewer's own "call that out in the PR description" guidance):** the `INTERNAL`→`UNAVAILABLE` change has **no current consumer to break**. social-login-service is **not yet a caller** of `UserProvisioningService` — this contract is unreleased with **zero shipped consumers**; the caller is PR-2 (not started). There is no existing `StatusCode.INTERNAL`-specific branch in social-login to silently break. The contract already mandates the caller **fail the OAuth login closed on the fault and never fall back to a locally-minted UUID** (lines 60-63) — it does not instruct branching on a specific status code. PR-2's new caller will fail closed on **any** non-OK status (UNAVAILABLE included), not a code-specific branch, so the concern is moot by construction. Recorded for PR-2 in Memora #3701.
fix(user/v1): cross-provider link semantics + fault→UNAVAILABLE (user-service #25 R2)
Some checks failed
protobuf CI / scripts/check-import-direction.sh (pull_request) Successful in 4s
protobuf CI / buf lint (pull_request) Failing after 7s
protobuf CI / buf build (pull_request) Failing after 5s
4fec8ebcd4
Two doc fixes surfaced by user-service PR #25 R2 (the impl was corrected there; this aligns
the contract):
- RESOLVED_NEW_PROVIDER_LINK now covers BOTH email-fallback sub-cases: an unbound/email-only
  account is backfilled; an account already bound to a DIFFERENT provider has its binding
  PRESERVED (no overwrite — multi-provider merge out of scope) and nothing is backfilled.
  EITHER way it is a new-provider access event → MUST emit a security event + MUST step-up.
  (The earlier wording implied a backfill always happens, which is false for the
  cross-provider case; #25 R2 BLOCKING was the impl returning RESOLVED_EXISTING there,
  bypassing step-up = account-takeover — now fixed in user-service to return this outcome.)
- Persistence/Kafka fault documented as → UNAVAILABLE (retryable), matching the impl
  (DataAccessException → Status.UNAVAILABLE); INTERNAL is reserved for unexpected
  non-transient faults. (#25 R2 MAJOR — was stale at INTERNAL.)

buf lint + build clean; additive comment-only (no wire change). Re-synced to user-service.

hib-pr-reviewer review — PR #7 (affinity-intelligence-rework/im2be-protobuf)

Round 1 — head 4fec8ebcd4af, base main, trigger opened

TL;DR: CONDITIONAL_APPROVE — 1 minor finding kept this round.

Summary

Arbiter reconciled 1 (A) + 0 (B) → 1 findings.

CI status (head 4fec8ebcd4af)

Overall: ✗ failure

3 checks: 3 pending

Check State Link
protobuf CI / scripts/check-import-direction.sh (pull_request) pending details
protobuf CI / buf lint (pull_request) pending details
protobuf CI / buf build (pull_request) pending details

Findings (1)

[MINOR] INTERNAL → UNAVAILABLE fault status is a cross-service coordination dependency not addressed in this PR

user/v1/user_provisioning.proto:60

Changing the documented fault code from INTERNAL to UNAVAILABLE is semantically correct per gRPC status code semantics (UNAVAILABLE = transient / retryable; INTERNAL = unexpected non-transient). The natural gRPC → HTTP mapping also becomes consistent: UNAVAILABLE → HTTP 503 without custom overrides.

However, if the social-login-service caller has explicit StatusCode.INTERNAL handling on its fail-closed 503 path (e.g. a when (status.code == StatusCode.INTERNAL) branch), this proto-side specification change must be paired with a corresponding update in social-login before the user-service implementation changes are deployed, or the social-login fail-closed behavior silently breaks (it would no longer catch the transient fault and might fall through to a 500 or, worse, attempt a fallback).

Confirm that the coordinated social-login change is tracked in user-service PR #25 R2 or a follow-on. If social-login uses a blanket non-OK handler rather than code-specific branches, this concern is moot — call that out in the PR description to close the question.

Verdict

CONDITIONAL_APPROVE


hib-pr-reviewer • round 1 • 1 finding (1m) • 2026-06-02T09:58:36.294Z → 2026-06-02T09:59:59.255Z • posted-as: pr-reviewer-bot • model: auto • [bookkeeping fallback]

<!-- hib-pr-reviewer round:1 --> ## hib-pr-reviewer review — PR #7 (affinity-intelligence-rework/im2be-protobuf) **Round 1** — head `4fec8ebcd4af`, base `main`, trigger `opened` **TL;DR:** CONDITIONAL_APPROVE — 1 minor finding kept this round. ### Summary Arbiter reconciled 1 (A) + 0 (B) → 1 findings. ### CI status (head `4fec8ebcd4af`) **Overall: ✗ failure** 3 checks: 3 pending | Check | State | Link | |---|---|---| | protobuf CI / scripts/check-import-direction.sh (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-protobuf/actions/runs/29/jobs/0) | | protobuf CI / buf lint (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-protobuf/actions/runs/29/jobs/1) | | protobuf CI / buf build (pull_request) | ⏳ pending | [details](/affinity-intelligence-rework/im2be-protobuf/actions/runs/29/jobs/2) | ### Findings (1) #### **[MINOR]** INTERNAL → UNAVAILABLE fault status is a cross-service coordination dependency not addressed in this PR _user/v1/user_provisioning.proto:60_ Changing the documented fault code from `INTERNAL` to `UNAVAILABLE` is semantically correct per gRPC status code semantics (`UNAVAILABLE` = transient / retryable; `INTERNAL` = unexpected non-transient). The natural gRPC → HTTP mapping also becomes consistent: `UNAVAILABLE` → HTTP 503 without custom overrides. However, if the social-login-service caller has **explicit** `StatusCode.INTERNAL` handling on its fail-closed 503 path (e.g. a `when (status.code == StatusCode.INTERNAL)` branch), this proto-side specification change must be paired with a corresponding update in social-login before the user-service implementation changes are deployed, or the social-login fail-closed behavior silently breaks (it would no longer catch the transient fault and might fall through to a 500 or, worse, attempt a fallback). Confirm that the coordinated social-login change is tracked in user-service PR #25 R2 or a follow-on. If social-login uses a blanket non-OK handler rather than code-specific branches, this concern is moot — call that out in the PR description to close the question. ### Verdict **CONDITIONAL_APPROVE** --- <sub>hib-pr-reviewer • round 1 • 1 finding (1m) • 2026-06-02T09:58:36.294Z → 2026-06-02T09:59:59.255Z • posted-as: pr-reviewer-bot • model: auto • [bookkeeping fallback]</sub>
hibryda deleted branch fix/provisioning-cross-provider-and-fault-status 2026-06-02 12:15:38 +02:00
Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
affinity-intelligence-rework/im2be-protobuf!7
No description provided.