03 — Authentication
Proving who a principal is, human or machine, and managing the tokens that carry that proof. Decision: ADR-0036; builds on ADR-0010. Addresses OWASP API Security Top 10 API2 (Broken Authentication).
1. Human authentication
| Method | Use | Notes |
|---|---|---|
| OIDC (Auth Code + PKCE) | SaaS + SSO | OAuth 2.1 baseline; pluggable IdP |
| SAML | enterprise SSO | for IdPs that require it |
| Password | self-host / fallback | argon2id hashing, breach check (HIBP k-anonymity), no composition-rule theater |
| MFA | all human logins | WebAuthn/passkeys preferred (phishing-resistant); TOTP as fallback (not phishing-resistant) |
| Step-up auth | sensitive ops | re-prompt MFA for cryptographic key operations, external sharing, admin/role changes, billing changes |
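The breach check in the Password row can be sketched as follows. This is a minimal illustration of the k-anonymity idea behind the HIBP Pwned Passwords range API: only the first five hex characters of the SHA-1 digest are sent, and the returned `SUFFIX:COUNT` lines are matched locally. The function names are hypothetical, and the HTTP call itself is left out so the block stays self-contained.

```python
import hashlib


def hibp_range_query(password: str) -> tuple[str, str]:
    """Split the SHA-1 of a password into the 5-char prefix that is sent
    to the range API and the 35-char suffix that stays on our side."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix: str, range_response: str) -> int:
    """Return how often our suffix appears in a range response made of
    'SUFFIX:COUNT' lines, or 0 if it is absent."""
    for line in range_response.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate == suffix:
            return int(count)
    return 0


if __name__ == "__main__":
    prefix, suffix = hibp_range_query("password")
    # A real caller would fetch the range for `prefix` over HTTPS;
    # here the response is faked to show the local match.
    fake_response = "0000000000000000000000000000000000A:3\n" + suffix + ":42"
    print(prefix, breach_count(suffix, fake_response))
```

A non-zero count would be grounds to reject the password at registration or change time; the threshold is a policy choice not specified here.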
- Passkeys first: WebAuthn is phishing-resistant and removes the password attack surface; passwords remain a fallback, hardened.
- Step-up decouples “logged in” from “allowed to do something dangerous” — a stolen session still can’t rotate keys or exfiltrate via mass external share without re-auth.
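The step-up rule can be illustrated with a small freshness check. This is a sketch under stated assumptions: the operation names, the `Session` shape, and the 5-minute window are all hypothetical and not taken from ADR-0036.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Assumed policy values, for illustration only.
STEP_UP_WINDOW_SECONDS = 300
SENSITIVE_OPS = {"key.rotate", "share.external", "role.change", "billing.update"}


@dataclass
class Session:
    user_id: str
    last_mfa_at: float  # epoch seconds of the last successful MFA


class StepUpRequired(Exception):
    """Raised when a sensitive operation needs a fresh MFA prompt."""


def authorize(session: Session, operation: str, now: Optional[float] = None) -> None:
    """Allow ordinary operations on any valid session, but require a
    recent MFA for operations listed in SENSITIVE_OPS."""
    now = time.time() if now is None else now
    if operation in SENSITIVE_OPS and now - session.last_mfa_at > STEP_UP_WINDOW_SECONDS:
        raise StepUpRequired(operation)
```

With this shape, a stolen session cookie still passes `authorize(session, "doc.read")` but fails on `"key.rotate"` once the MFA timestamp is stale.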
2. Machine authentication
- Service accounts authenticate via a scoped API key (02/ADR-0035) or, preferably, workload OIDC: a short-lived JWT or mTLS credential issued by the runtime, so there is no static secret to leak (ADR-0030).
- mTLS between internal services (zero trust, platform/02).
- API keys carry no interactive privileges: MFA-gated and admin-destructive actions require a human session with step-up, so a leaked key has a deliberately small blast radius.
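A scoped API key check might look like the sketch below. The record layout, scope strings, and function names are assumptions for illustration; the actual key format is defined in 02/ADR-0035. The key is stored only as a hash, compared in constant time, and checked against an explicit scope set.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiKeyRecord:
    key_hash: str        # SHA-256 hex digest; the plaintext key is never stored
    scopes: frozenset    # operations this key may perform


def issue_api_key(scopes) -> tuple[str, ApiKeyRecord]:
    """Create a random key; return the plaintext once plus the record to store."""
    plaintext = secrets.token_urlsafe(32)
    key_hash = hashlib.sha256(plaintext.encode("ascii")).hexdigest()
    return plaintext, ApiKeyRecord(key_hash, frozenset(scopes))


def check_api_key(presented: str, record: ApiKeyRecord, required_scope: str) -> bool:
    """Accept only if the key matches (constant-time compare) and holds the scope."""
    presented_hash = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    if not hmac.compare_digest(presented_hash, record.key_hash):
        return False
    return required_scope in record.scopes
```

A fast hash is used here because the key is high-entropy random data; this differs from passwords, which need a slow hash such as argon2id.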
3. Tokens & sessions (OAuth 2.1, RFC 9700)
| Control | Policy |
|---|---|
| PKCE | mandatory for all clients (public and confidential) — kills code-interception |
| Grants | no implicit, no ROPC (removed in OAuth 2.1) |
| Redirect URIs | exact string match |
| Access token TTL | 5–15 min for sensitive ops, 30–60 min general (project policy values; RFC 9700 does not mandate specific lifetimes) |
| Refresh tokens | rotation + reuse detection → on reuse, invalidate the entire token family and force re-auth |
| Sender-constraining | DPoP (public clients) / mTLS-bound (confidential) for high-assurance — a stolen token without the key is useless |
| JWT validation | verify exp, nbf, aud, iss, signature; reject alg:none; rotate signing keys (JWKS) |
| Web sessions | cookies HttpOnly + Secure + SameSite; idle and absolute timeout; rotate on privilege change |
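The refresh-token row (rotation plus reuse detection) can be sketched with an in-memory store. This is an illustrative model only: the class and method names are hypothetical, and a real store would persist hashed tokens rather than plaintext.

```python
import secrets


class InvalidRefreshToken(Exception):
    """Token is unknown or belongs to a revoked family."""


class ReuseDetected(InvalidRefreshToken):
    """An already-rotated token was presented again."""


class RefreshTokenStore:
    def __init__(self):
        self._family_of = {}  # token -> family id (kept for old tokens too)
        self._active = {}     # family id -> the single currently valid token

    def issue(self) -> str:
        """Start a new token family at login."""
        return self._mint(secrets.token_hex(8))

    def _mint(self, family: str) -> str:
        token = secrets.token_urlsafe(32)
        self._family_of[token] = family
        self._active[family] = token
        return token

    def rotate(self, presented: str) -> str:
        """Exchange a refresh token for a new one; revoke the family on reuse."""
        family = self._family_of.get(presented)
        if family is None or family not in self._active:
            raise InvalidRefreshToken("unknown token or revoked family")
        if self._active[family] != presented:
            # An old token was replayed: invalidate the whole family,
            # which forces the legitimate user to re-authenticate.
            del self._active[family]
            raise ReuseDetected("refresh token reuse")
        return self._mint(family)
```

After a replay, even the newest token in the family is rejected, so neither the attacker nor the victim keeps a working refresh token.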
sequenceDiagram
autonumber
participant C as Client
participant GW as Gateway
participant IdP as OIDC IdP
C->>GW: start login
GW->>C: redirect (PKCE code_challenge, exact redirect_uri)
C->>IdP: authenticate + MFA/passkey
IdP-->>C: authorization code
C->>GW: authorization code (redirect callback)
GW->>IdP: exchange code + code_verifier (IdP verifies challenge)
IdP-->>GW: id_token + access (short TTL) + refresh
GW-->>C: session (DPoP-bound, httpOnly cookie)
Note over GW: tenant_id derived from VERIFIED token → set as request context ([05])
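The PKCE values in the flow above can be derived as in this sketch, which follows the S256 method from RFC 7636. The helper names are hypothetical; whichever party starts the flow generates the verifier, sends only the challenge in the authorization request, and reveals the verifier at the token exchange.

```python
import base64
import hashlib
import hmac
import secrets


def _b64url(data: bytes) -> str:
    """Base64url without padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def new_code_verifier() -> str:
    """Random high-entropy verifier (32 bytes -> 43 characters)."""
    return _b64url(secrets.token_bytes(32))


def code_challenge_s256(verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(ASCII(code_verifier)))."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def verify_pkce(verifier: str, stored_challenge: str) -> bool:
    """Authorization-server side check performed at the token exchange."""
    return hmac.compare_digest(code_challenge_s256(verifier), stored_challenge)
```

An attacker who intercepts the authorization code cannot redeem it, because the token endpoint also requires a verifier that hashes to the challenge stored with that code.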
4. Token propagation (authenticate once, carry context)
The gateway authenticates at the edge, derives tenant and principal context from the verified token, and propagates a signed internal auth context over mTLS (ADR-0010). Downstream services re-validate that context on every authorization decision (04) and never trust client-supplied identity or tenant values.
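One way to picture the signed internal auth context is a short-lived, integrity-protected claims blob. The sketch below uses HMAC-SHA256 with a shared key purely to stay self-contained; the wire format, claim names, and TTL are assumptions, and ADR-0010 may specify something different (for example an asymmetric signature, so that downstream services can verify but not mint contexts).

```python
import base64
import hashlib
import hmac
import json
import time


def sign_context(claims: dict, key: bytes, ttl_seconds: int = 60, now=None) -> str:
    """Gateway side: attach an expiry and sign the serialized claims."""
    now = time.time() if now is None else now
    body = dict(claims, exp=int(now) + ttl_seconds)
    payload = base64.urlsafe_b64encode(json.dumps(body, sort_keys=True).encode("utf-8"))
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + tag


def verify_context(token: str, key: bytes, now=None) -> dict:
    """Downstream side: check the signature first, then the expiry."""
    now = time.time() if now is None else now
    payload, _, tag = token.rpartition(".")
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag.encode("utf-8"), expected.encode("ascii")):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] < now:
        raise ValueError("expired context")
    return claims
```

A downstream service would read `tenant_id` only from the dictionary returned by `verify_context`, never from request headers or body fields supplied by the client.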
5. Threats addressed & residual
| Threat | Control | Residual |
|---|---|---|
| Credential stuffing / brute force | MFA + lockout + breached-password check + rate limit (09) | low (human factor) |
| Token theft / replay | short TTL + rotation + reuse-detect + DPoP | low |
| Code interception | PKCE + exact redirect | very low |
| Phishing | passkeys (phishing-resistant) + step-up | low-med |
| Leaked API key | hashed, scoped, rotatable; no interactive privileges (02) | medium → low with secret scanning |
Residual reality: the human is the weakest link — passkeys + step-up + anomaly detection (07) shrink but never eliminate phishing/social risk.
References
- OAuth 2.1: https://oauth.net/2.1/ · Security BCP (RFC 9700): https://www.rfc-editor.org/rfc/rfc9700 · DPoP (RFC 9449): https://www.rfc-editor.org/rfc/rfc9449
- WebAuthn / passkeys: https://www.w3.org/TR/webauthn-2/
- OWASP ASVS (authentication): https://owasp.org/www-project-application-security-verification-standard/