03 — Authentication

Proving who a principal is — humans and machines — and managing the tokens that carry that proof. Decision: ADR-0036; builds on ADR-0010. Closes OWASP API2 (Broken Authentication).

1. Human authentication

Method	Use	Notes
OIDC (Auth Code + PKCE)	SaaS + SSO	OAuth 2.1 baseline; pluggable IdP
SAML	enterprise SSO	for IdPs that require it
Password	self-host / fallback	argon2id hashing, breached-password check (HIBP k-anonymity), no composition-rule theater
MFA	all human logins	WebAuthn/passkeys preferred (phishing-resistant); TOTP fallback (not phishing-resistant)
Step-up auth	sensitive ops	re-prompt MFA for key management, external sharing, admin/role changes, billing
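
The breached-password check in the table relies on the k-anonymity range model: only the first five hex characters of the password's SHA-1 digest leave the server, and the remaining suffix is matched locally against the returned list. The sketch below shows that split and the local match. The function names are illustrative, not part of this design, and the HTTP call (a GET to `https://api.pwnedpasswords.com/range/<prefix>`) is left to the caller.

```python
import hashlib
from typing import Tuple


def hibp_prefix_and_suffix(password: str) -> Tuple[str, str]:
    """Split the uppercase SHA-1 hex digest into the 5-character prefix that is
    sent to the range endpoint and the 35-character suffix that stays local."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix: str, range_body: str) -> int:
    """Look up the suffix in a range response made of 'SUFFIX:COUNT' lines.
    Returns the reported count, or 0 when the suffix is absent."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0
```

A caller would reject (or warn on) any password whose count is above zero. SHA-1 is used here only because the range API is keyed on it; stored password hashes still use argon2id.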

Passkeys first: WebAuthn is phishing-resistant and removes the password attack surface for accounts that use it; passwords remain only as a hardened fallback.
Step-up decouples “logged in” from “allowed to do something dangerous”: a stolen session still can’t rotate keys or exfiltrate data through mass external sharing without re-authentication.
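
One way to picture the step-up rule is a freshness check on the last successful MFA, applied only to sensitive operations. This is a minimal sketch: the operation names, the `Session` shape, and the 5-minute window are all hypothetical, since this document does not fix them.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy values; the source does not specify a freshness window.
STEP_UP_MAX_AGE_SECONDS = 300
SENSITIVE_OPS = frozenset({"key.rotate", "share.external", "role.change", "billing.update"})


@dataclass
class Session:
    principal: str
    last_mfa_at: float  # Unix time of the most recent successful MFA


def requires_step_up(session: Session, operation: str, now: Optional[float] = None) -> bool:
    """True when the operation is sensitive and the last MFA is older than the window."""
    if operation not in SENSITIVE_OPS:
        return False
    current = time.time() if now is None else now
    return (current - session.last_mfa_at) > STEP_UP_MAX_AGE_SECONDS
```

With this shape, a valid but stale session can keep reading data, while a key rotation from the same session triggers a fresh MFA prompt.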

2. Machine authentication

Service accounts authenticate via a scoped API key (02/ADR-0035) or, preferably, workload OIDC (short-lived JWT/mTLS from the runtime — no static secret, ADR-0030).
mTLS between internal services (zero trust, platform/02).
API keys carry no interactive privileges: MFA-gated and admin-destructive actions require a step-up from a human session, which a key cannot perform, so a leaked key has a deliberately small blast radius.
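
The API key properties above (stored hashed, scoped, no interactive privileges) can be sketched as follows. The scope names and the `ApiKeyRecord` type are hypothetical. Plain SHA-256 is assumed to be acceptable here only because the key is a long random value, unlike a human-chosen password.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

# Hypothetical scope names for actions reserved for interactive, step-up sessions.
INTERACTIVE_ONLY = frozenset({"keys.rotate", "roles.change", "billing.update"})


@dataclass(frozen=True)
class ApiKeyRecord:
    key_hash: str            # hex SHA-256 of the key; the plaintext is never stored
    scopes: FrozenSet[str]


def hash_key(plaintext: str) -> str:
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def issue_key(scopes: Iterable[str]) -> Tuple[str, ApiKeyRecord]:
    """Return the plaintext key (shown to the caller once) and the record to store."""
    plaintext = secrets.token_urlsafe(32)
    return plaintext, ApiKeyRecord(hash_key(plaintext), frozenset(scopes))


def authorize(presented: str, record: ApiKeyRecord, required_scope: str) -> bool:
    """Allow only non-interactive scopes that the key was explicitly granted."""
    if required_scope in INTERACTIVE_ONLY:
        return False  # keys can never perform step-up-gated actions
    matches = hmac.compare_digest(hash_key(presented), record.key_hash)
    return matches and required_scope in record.scopes
```

Even if an interactive-only scope were granted to a key by mistake, `authorize` still refuses it, which is what keeps the blast radius of a leak small.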

3. Tokens & sessions (OAuth 2.1, RFC 9700)

Control	Policy
PKCE	mandatory for all clients (public and confidential) — kills code-interception
Grants	no implicit, no ROPC (removed in OAuth 2.1)
Redirect URIs	exact string match
Access token TTL	5–15 min for sensitive ops, 30–60 min general (values are local policy; RFC 9700 does not prescribe specific lifetimes)
Refresh tokens	rotation + reuse detection → on reuse, invalidate the entire token family and force re-auth
Sender-constraining	DPoP (public clients) / mTLS-bound (confidential) for high-assurance — a stolen token without the key is useless
JWT validation	verify `exp`, `nbf`, `aud`, `iss`, signature; reject `alg:none`; rotate signing keys (JWKS)
Web sessions	cookies `HttpOnly` + `Secure` + `SameSite`; idle and absolute timeout; rotate on privilege change
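
The refresh token row (rotation plus reuse detection, with family invalidation) can be illustrated with an in-memory store. This is a sketch under assumptions: the class and method names are hypothetical, and a real store would persist hashed tokens with expiry rather than keep plaintext in a dictionary.

```python
import secrets
from typing import Dict, Set


class RefreshTokenStore:
    """In-memory sketch: every token belongs to a family created at login."""

    def __init__(self) -> None:
        self._family_of: Dict[str, str] = {}  # every token ever issued -> family id
        self._active: Dict[str, str] = {}     # family id -> the single currently valid token
        self._revoked: Set[str] = set()       # families killed after reuse

    def start_family(self) -> str:
        """Called after a successful login; returns the first refresh token."""
        return self._issue(secrets.token_hex(8))

    def _issue(self, family: str) -> str:
        token = secrets.token_urlsafe(32)
        self._family_of[token] = family
        self._active[family] = token
        return token

    def rotate(self, presented: str) -> str:
        """Exchange the current token for a new one, or revoke the family on reuse."""
        family = self._family_of.get(presented)
        if family is None or family in self._revoked:
            raise PermissionError("unknown or revoked refresh token")
        if self._active.get(family) != presented:
            # An already-rotated token came back: assume theft and kill the family.
            self._revoked.add(family)
            self._active.pop(family, None)
            raise PermissionError("refresh token reuse detected; re-authentication required")
        return self._issue(family)
```

After a reuse event, neither the attacker's copy nor the legitimate client's newest token works, so both parties are pushed back to a full login.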

sequenceDiagram
    autonumber
    participant C as Client
    participant GW as Gateway
    participant IdP as OIDC IdP
    C->>GW: start login
    GW->>C: redirect (PKCE code_challenge, exact redirect_uri)
    C->>IdP: authenticate + MFA/passkey
    IdP-->>C: authorization code
    C->>GW: code + code_verifier (PKCE)
    GW->>IdP: exchange (verify challenge)
    IdP-->>GW: id_token + access (short TTL) + refresh
    GW-->>C: session (DPoP-bound, HttpOnly cookie)
    Note over GW: tenant_id derived from VERIFIED token → set as request context ([05])

4. Token propagation (authenticate once, carry context)

The gateway authenticates at the edge, derives tenant and principal context from the verified token, and propagates a signed internal auth context over mTLS (ADR-0010). Downstream services re-validate that context on every authz decision (04) and never trust client-supplied identity or tenant values.
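
As an illustration of a signed internal auth context, the sketch below has the gateway sign a small JSON payload and a downstream service verify it before use. The token format, the field names (`tenant_id`, `principal`, `exp`), and the use of a shared HMAC key are assumptions made for this example; the source does not specify the signature scheme, and an asymmetric signature would fit the same shape.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def sign_context(context: dict, key: bytes) -> str:
    """Gateway side: serialize the context and append an HMAC-SHA256 tag."""
    payload = base64.urlsafe_b64encode(json.dumps(context, sort_keys=True).encode("utf-8"))
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + tag


def verify_context(token: str, key: bytes, now: Optional[float] = None) -> dict:
    """Downstream side: check the tag first, then the expiry, then return the context."""
    payload, _, tag = token.partition(".")
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode("ascii"), tag.encode("utf-8")):
        raise PermissionError("bad signature on auth context")
    context = json.loads(base64.urlsafe_b64decode(payload))
    current = time.time() if now is None else now
    if context["exp"] <= current:
        raise PermissionError("auth context expired")
    return context
```

A downstream service would read the tenant and principal only from the dictionary returned by `verify_context`, never from request headers or body fields supplied by the client.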

5. Threats addressed & residual

Threat	Control	Residual
Credential stuffing / brute force	MFA + lockout + breached-password check + rate limit (09)	low (human factor)
Token theft / replay	short TTL + rotation + reuse-detect + DPoP	low
Code interception	PKCE + exact redirect	very low
Phishing	passkeys (phishing-resistant) + step-up	low-med
Leaked API key	hashed, scoped, rotatable; no interactive privileges (02)	medium, dropping to low with secret scanning

Residual reality: the human is the weakest link — passkeys + step-up + anomaly detection (07) shrink but never eliminate phishing/social risk.

References

OAuth 2.1: https://oauth.net/2.1/ · Security BCP (RFC 9700): https://www.rfc-editor.org/rfc/rfc9700 · DPoP (RFC 9449): https://www.rfc-editor.org/rfc/rfc9449
WebAuthn / passkeys: https://www.w3.org/TR/webauthn-2/
OWASP ASVS (authentication): https://owasp.org/www-project-application-security-verification-standard/