02 — Kubernetes Namespaces
Task 1: design Kubernetes namespaces. Namespaces are BitVault’s isolation and policy unit for environments and platform concerns — deliberately not for customers (tenant isolation is application-level RLS, ADR-0007).
1. Namespace taxonomy (per cluster)
| Namespace | Holds | Why isolated |
|---|---|---|
| platform | ArgoCD, Argo Rollouts, External Secrets Operator, cert-manager, ingress controller, OTel Collector, Velero, monitoring | privileged controllers; cluster-wide blast radius; strict RBAC |
| bitvault | app workloads: gateway, bitvaultd, workers | the product; HPA-scaled, stateless |
| bitvault-data | stateful operators + instances: CloudNativePG, Redis, NATS, OpenSearch | different lifecycle, storage, and backup posture; PVCs |
| (nonprod only) dev, staging, pr-<n> | full app (+ shared or per-env data) per environment | environment separation on a shared cluster (03) |
Why split bitvault from bitvault-data: stateless app and stateful data have opposite operational profiles (scale-to-zero vs never-lose-a-byte), different RBAC, different backup/restore (12), and different network exposure. Keeping them apart makes quotas, policies, and DR tractable.
2. Per-namespace controls (every namespace gets these)
| Control | Purpose |
|---|---|
| ResourceQuota | cap CPU/mem/storage/object counts per namespace → no env starves another |
| LimitRange | default requests/limits per container, plus max per container and pod → no unbounded pod |
| NetworkPolicy | default-deny ingress and egress; explicit allows only (§3) |
| PodSecurity admission | enforce the restricted profile (non-root, no privilege escalation, dropped caps) — matches the image design (01) |
| RBAC | least-privilege ServiceAccounts; humans via SSO groups; no cluster-admin in app namespaces |
| Labels | env, team, cost-center for policy selectors + chargeback |
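
As a rough sketch of how several of these controls might be expressed for the bitvault namespace, the manifests below combine the labels, the Pod Security admission setting, a ResourceQuota, and a LimitRange. The object names, label values, and every numeric value are illustrative assumptions, not BitVault's actual settings.

```yaml
# Namespace carrying policy/chargeback labels and Pod Security admission
# set to the "restricted" profile.
apiVersion: v1
kind: Namespace
metadata:
  name: bitvault
  labels:
    env: prod                # assumed value
    team: core               # assumed value
    cost-center: cc-0000     # assumed placeholder
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
---
# Caps on aggregate CPU, memory, storage, and object counts (values assumed).
apiVersion: v1
kind: ResourceQuota
metadata:
  name: namespace-quota      # hypothetical name
  namespace: bitvault
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    requests.storage: 100Gi
    pods: "100"
    persistentvolumeclaims: "10"
---
# Defaults for containers that omit requests/limits, plus a per-container max.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults   # hypothetical name
  namespace: bitvault
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      default:
        cpu: 500m
        memory: 512Mi
      max:
        cpu: "2"
        memory: 4Gi
```

The quota and the limit range work as a pair: once a ResourceQuota constrains CPU or memory, pods that declare no requests or limits for that resource are rejected, so the LimitRange defaults keep such pods admissible.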
3. Network policy (default-deny)
flowchart TB
classDef ext fill:#fecaca,stroke:#b91c1c,color:#111827;
classDef gw fill:#fde68a,stroke:#b45309,color:#111827;
classDef app fill:#bbf7d0,stroke:#15803d,color:#111827;
classDef data fill:#c7d2fe,stroke:#3730a3,color:#111827;
net(("Internet")):::ext
ing["Ingress controller (platform)"]:::gw
gw["gateway (bitvault)"]:::app
api["bitvaultd / workers (bitvault)"]:::app
pg[("Postgres / Redis / NATS / OpenSearch (bitvault-data)")]:::data
kms["Cloud KMS / Secrets / Object storage"]:::ext
net -->|"443 only"| ing -->|allow| gw -->|allow gRPC| api
api -->|allow| pg
api -->|"egress allow"| kms
net -. "presigned (bypasses cluster)" .-> kms
Rules: only the ingress controller is internet-reachable; gateway→bitvaultd
allowed; app→bitvault-data allowed; controlled egress to KMS/secrets/object
storage; everything else denied. Bulk file bytes never traverse the cluster
(presigned direct-to-storage, storage/ADR-0011).
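
The default-deny pattern and one explicit allow (gateway→bitvaultd) could be sketched roughly as follows. The `app: gateway` and `app: bitvaultd` pod labels, the policy names, and port 9090 are assumptions made for illustration; the source does not specify them.

```yaml
# Default-deny: selects every pod in the namespace and allows no traffic
# in either direction.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all            # hypothetical name
  namespace: bitvault
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Receiver side: bitvaultd accepts traffic from gateway pods only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-gateway-to-bitvaultd  # hypothetical name
  namespace: bitvault
spec:
  podSelector:
    matchLabels:
      app: bitvaultd                # assumed label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: gateway          # assumed label
      ports:
        - protocol: TCP
          port: 9090                # assumed gRPC port
---
# Sender side: because egress is also default-deny, gateway needs a
# matching egress allow toward bitvaultd.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-gateway-egress-bitvaultd  # hypothetical name
  namespace: bitvault
spec:
  podSelector:
    matchLabels:
      app: gateway                  # assumed label
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: bitvaultd        # assumed label
      ports:
        - protocol: TCP
          port: 9090                # assumed gRPC port
```

The remaining flows (app→bitvault-data, egress to KMS/secrets/object storage) would follow the same two-sided pattern, using a `namespaceSelector` for cross-namespace traffic. With default-deny egress, pods typically also need an explicit allow for DNS (port 53) before name resolution works.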
4. What namespaces are NOT for: tenants
A namespace per customer is impossible at BitVault’s scale (millions of tenants; namespaces cost etcd objects, controllers, and quota churn). Tenant isolation is enforced in the data layer (Postgres RLS + tenant-prefixed object keys, ADR-0007). Namespaces isolate environments and platform concerns, full stop. (A dedicated single-tenant cluster is an enterprise option, not the default.)
5. Platform tenancy via ArgoCD AppProjects
ArgoCD AppProjects (06) constrain which source repos,
destination clusters, and namespaces each Application may touch — so a
misconfigured app can’t deploy into platform or another env. This is the GitOps-side
guardrail complementing namespace RBAC.
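
A minimal AppProject along these lines might look like the sketch below. The project name, repository URL, and the choice of which resource kinds to block are assumptions for illustration. The project is placed in the platform namespace because that is where §1 puts ArgoCD.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: bitvault-app                # hypothetical project name
  namespace: platform               # namespace where ArgoCD runs (§1)
spec:
  description: App workloads; may deploy only into the app and data namespaces
  # Only this repository may be used as an Application source.
  sourceRepos:
    - https://git.example.com/bitvault/deploy.git   # hypothetical URL
  # Applications in this project may target only these namespaces on the
  # local cluster; platform and other environments are not listed.
  destinations:
    - server: https://kubernetes.default.svc
      namespace: bitvault
    - server: https://kubernetes.default.svc
      namespace: bitvault-data
  # Assumed policy: guardrail objects are owned by the platform team, so
  # Applications in this project may not create or change them.
  namespaceResourceBlacklist:
    - group: ""
      kind: ResourceQuota
    - group: ""
      kind: LimitRange
    - group: networking.k8s.io
      kind: NetworkPolicy
```

An Application in this project that names any other destination namespace, or a source repo outside `sourceRepos`, is rejected by ArgoCD before anything reaches the cluster.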
6. Tradeoffs / Alternatives / Scaling
Tradeoffs. Namespace-per-environment (nonprod) is cheaper than cluster-per-env but offers soft isolation (shared control plane/nodes); prod gets a dedicated cluster for hard isolation. We accept soft isolation for dev/staging where blast radius is tolerable.
Alternatives considered.
- Cluster-per-environment for all envs: strongest isolation, highest cost/ops. Reserved for prod (and enterprise dedicated). Rejected for dev/staging on cost.
- Namespace-per-tenant: impossible at scale (see §4). Rejected.
- vCluster (virtual clusters) for previews: attractive for strong per-PR isolation on shared nodes; a good future option for previews (03) if namespace-level preview isolation proves insufficient.
Scaling concerns.
- Quota tuning as workloads grow → quotas are values in the GitOps repo, reviewed like code.
- NetworkPolicy sprawl → a small set of reusable policy templates (default-deny + labeled allows), not per-pod bespoke rules.
- etcd object pressure from many preview namespaces → TTL + reap closed PRs (03).
References
- Pod Security Standards: https://kubernetes.io/docs/concepts/security/pod-security-standards/
- NetworkPolicy: https://kubernetes.io/docs/concepts/services-networking/network-policies/
- ArgoCD Projects: https://argo-cd.readthedocs.io/en/stable/user-guide/projects/