09 — Secrets Management
Task 4: design secrets management. The rule is absolute: no secret material in Git, ever. Git holds references; secrets live in a KMS/vault and are synced into the cluster. Decision in ADR-0030; ties to ADR-0014 envelope encryption.
1. Two approaches, and the one we choose
| Approach | Tools | Secret in Git? |
|---|---|---|
| Encrypted-in-Git | Sealed Secrets, SOPS | yes, encrypted |
| Reference-based | External Secrets Operator (ESO), Secrets Store CSI | no — only a reference |
Decision: ESO + cloud Secrets Manager / Vault as the production primary.
Git contains an ExternalSecret (a reference); the actual value never enters version
control; ESO syncs it into a Kubernetes Secret on the destination cluster (ArgoCD’s
recommended “populate on the cluster” model). SOPS (age/KMS) covers the few
bootstrap/config secrets that must exist before ESO is up. Sealed Secrets is the
self-host-friendly alternative (no external vault required, ADR-0012).
Why ESO over Sealed Secrets for SaaS: secrets stay in one managed source of truth, rotation is centralized (update in the manager → ESO re-syncs), it works across many clusters, and Git holds nothing sensitive. Sealed Secrets binds its sealing keys to a single cluster and becomes a rotation/migration headache at scale.
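For the few bootstrap secrets that SOPS covers, a repository-level `.sops.yaml` is one way to make sure only the sensitive fields are encrypted before commit. A minimal sketch, assuming AWS KMS as the key backend and a hypothetical `bootstrap/` directory layout; the key ARN is a placeholder, not a real key:

```yaml
# .sops.yaml (sketch; path layout and key ARN are assumptions)
creation_rules:
  - path_regex: ^bootstrap/.*\.enc\.yaml$
    # Encrypt only the value-bearing fields of a Kubernetes Secret manifest,
    # leaving names and metadata readable for review.
    encrypted_regex: ^(data|stringData)$
    kms: arn:aws:kms:us-east-1:111122223333:key/REPLACE-WITH-KEY-ID
```

For self-hosted setups without a cloud KMS, an `age` recipient could take the place of the `kms` entry.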
2. The ESO flow (no static credentials anywhere)
flowchart LR
classDef g fill:#fbcfe8,stroke:#be185d,color:#111827;
classDef k fill:#bbf7d0,stroke:#15803d,color:#111827;
classDef c fill:#c7d2fe,stroke:#3730a3,color:#111827;
classDef e fill:#fde68a,stroke:#b45309,color:#111827;
git[("GitOps repo<br/>ExternalSecret + SecretStore (refs only)")]:::g
eso["External Secrets Operator (in-cluster)"]:::e
wid["Workload Identity (IRSA / GKE WI / Azure WI)"]:::c
vault[("Cloud Secrets Manager / Vault")]:::c
ksec[("Kubernetes Secret (synced)")]:::k
pod["pod mounts secret / env"]:::k
git --> eso
eso -->|"auth via pod identity (no static creds)"| wid --> vault
vault -->|fetch values| eso --> ksec --> pod
eso -. "re-sync on rotation" .-> vault
- `SecretStore` declares the provider + auth (via workload identity: the pod assumes a cloud role, so no static credentials are needed to read secrets).
- `ExternalSecret` declares which keys to pull → ESO writes a K8s `Secret`.
- Both manifests are safe in Git (references, not values).
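The two manifests might look roughly like the sketch below. It assumes AWS Secrets Manager with IRSA as the provider; every name, namespace, region, and remote key is hypothetical, and the `apiVersion` depends on the installed ESO release (newer releases also serve `external-secrets.io/v1`).

```yaml
# SecretStore: where to read from and how to authenticate (no static keys).
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: app-secrets          # hypothetical
  namespace: app-prod        # hypothetical
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1      # assumed region
      auth:
        jwt:
          serviceAccountRef:
            name: eso-reader # ServiceAccount bound to a cloud role
---
# ExternalSecret: which remote keys to pull and which K8s Secret to write.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-db
  namespace: app-prod
spec:
  refreshInterval: 1h        # re-sync cadence; picks up rotated values
  secretStoreRef:
    name: app-secrets
    kind: SecretStore
  target:
    name: app-db             # name of the synced Kubernetes Secret
    creationPolicy: Owner
  data:
    - secretKey: password    # key inside the Kubernetes Secret
      remoteRef:
        key: prod/app/db     # hypothetical path in the secrets manager
        property: password
```

Neither manifest contains a secret value, only the remote key path, which is why both can be committed.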
3. Secret categories & how each is handled
| Category | Handling |
|---|---|
| App runtime (DB creds, provider keys, OIDC client secrets) | ESO ← Secrets Manager/Vault |
| Per-tenant data-encryption keys | KMS envelope encryption, app-side (ADR-0014) — not K8s Secrets |
| TLS certs | cert-manager + ACME (auto-issued/renewed), not hand-managed |
| etcd Secret encryption | KMS encryption provider for etcd (defense in depth) |
| ArgoCD repo/cluster creds | bootstrapped by OpenTofu via cloud secret / workload identity (10) |
| Registry / cloud auth (CI) | OIDC, no long-lived keys (07) |
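As an illustration of the TLS row, a cert-manager `Certificate` asks an issuer for a certificate and keeps the resulting Secret renewed. This is a sketch only: the hostname, namespace, and the `letsencrypt-prod` ClusterIssuer are assumed names, and the ACME issuer itself is not shown.

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: app-tls              # hypothetical
  namespace: app-prod        # hypothetical
spec:
  secretName: app-tls        # cert-manager writes the key pair into this Secret
  dnsNames:
    - app.example.com        # placeholder hostname
  issuerRef:
    name: letsencrypt-prod   # assumed ACME ClusterIssuer, defined elsewhere
    kind: ClusterIssuer
```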
4. Rotation & least privilege
- Rotation happens in the manager; ESO re-syncs on a refresh interval; apps reload (hot-reload or a rolling restart triggered by the Secret’s hash). No Git change, no redeploy of code.
- Least privilege: per-namespace/per-env `SecretStore`s scope which secrets a workload can read; prod and nonprod use separate stores/KMS keys so a nonprod compromise can’t read prod secrets.
- Auditability: the secret manager logs every access; combined with workload identity, every read is attributable.
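The prod/nonprod split could be expressed as one namespaced `SecretStore` per environment, each authenticating through a different ServiceAccount. A sketch under the assumption of AWS Secrets Manager with IRSA; all names are hypothetical. The manifests only select the identity; the actual boundary is the cloud IAM policy attached to each role.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: app-secrets
  namespace: app-prod
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: eso-reader-prod     # role allowed to read prod secrets only
---
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: app-secrets
  namespace: app-nonprod
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: eso-reader-nonprod  # separate role, no access to prod secrets
```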
5. Bootstrap chicken-and-egg
ESO needs to authenticate to the vault before it can sync secrets. Resolved by cloud workload identity provisioned by OpenTofu (10): the ESO pod’s identity is granted read access at cluster-creation time — so the first secret sync needs no pre-seeded credential. ArgoCD’s own bootstrap secret is likewise created by OpenTofu (or SOPS-encrypted in the bootstrap repo).
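On EKS, for example, the identity binding reduces to an annotated ServiceAccount. The sketch below assumes IRSA; the namespace, ServiceAccount name, account ID, and role name are placeholders, and the role plus its trust policy are assumed to be created by OpenTofu alongside the cluster.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-secrets       # assumed name of the ESO ServiceAccount
  namespace: external-secrets  # assumed install namespace
  annotations:
    # IRSA: pods using this ServiceAccount get a projected token that
    # AWS STS exchanges for the role's temporary credentials.
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/eso-secrets-reader
```

GKE Workload Identity follows the same pattern with the `iam.gke.io/gcp-service-account` annotation.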
6. Tradeoffs / Alternatives / Scaling
Tradeoffs. ESO requires running a secret manager (cost) + a controller (overhead), and adds a dependency on the manager’s availability. Worth it: secrets never touch Git, rotation is centralized, and it scales across clusters — the properties Sealed Secrets lacks.
Alternatives considered.
- Sealed Secrets: great for self-host / no-vault setups (chosen as the self-host alternative); cluster-bound keys + poor multi-cluster rotation make it weak for SaaS.
- SOPS-only: good for encrypted-in-Git config and bootstrap, but CLI-centric, with limited native GitOps support for runtime injection. Used for bootstrap, not as the primary.
- Plain K8s Secrets in Git (even base64): base64 is not encryption — forbidden.
- Vault Agent sidecar injection: powerful (dynamic/leased secrets) but heavier; ESO covers our needs; revisit for dynamic DB credentials later.
Scaling concerns.
- Many secrets × envs × clusters → ESO + a single managed source scales linearly; `ClusterSecretStore` for shared, `SecretStore` for scoped.
- Manager availability on the critical path → already-synced Secrets persist in the cluster (etcd), so a transient manager outage doesn’t break running pods (only blocks rotation/new pulls).
- Rotation blast radius → per-env keys + staged rotation.
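For the shared case, a `ClusterSecretStore` can be limited to selected namespaces so that "shared" does not mean "readable from everywhere". A sketch assuming AWS Secrets Manager; the store name, namespace label, and ServiceAccount are hypothetical.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: shared-platform-secrets        # hypothetical
spec:
  # Only ExternalSecrets in namespaces carrying this label may use the store.
  conditions:
    - namespaceSelector:
        matchLabels:
          secrets.example.com/shared: "true"   # hypothetical label
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets
            namespace: external-secrets  # cluster-scoped stores must name the namespace
```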
References
- External Secrets Operator: https://external-secrets.io/
- ArgoCD secret management guidance: https://argo-cd.readthedocs.io/en/stable/operator-manual/secret-management/
- cert-manager: https://cert-manager.io/ · SOPS: https://github.com/getsops/sops · Sealed Secrets: https://github.com/bitnami-labs/sealed-secrets