Kubernetes
Namespace Layout
Workloads are separated by environment and concern. PodSecurity admission is enforced at the namespace level.
| Namespace | Tenants / Contents | PodSecurity Profile |
|---|---|---|
| bitvault-system | Operators (ArgoCD, external-secrets, cert-manager, KEDA), shared infra controllers | privileged (operators require it); tightly RBAC scoped |
| bitvault-prod | Production app workloads (bitvaultd, bitvault-worker, bitvault-web) | restricted |
| bitvault-staging | Staging app workloads | restricted |
| bitvault-dev | Dev app workloads | restricted |
| bitvault-preview-pr-N | Per-PR ephemeral review workloads | restricted |
Operators that require elevated permissions are isolated in bitvault-system with tight RBAC. Application namespaces enforce the restricted profile with no exceptions.
PodSecurity
All application namespaces enforce the Kubernetes restricted PodSecurity Standard. This is set via namespace labels and backed by the built-in PodSecurity admission controller:
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/enforce-version: latest
pod-security.kubernetes.io/warn: restricted
pod-security.kubernetes.io/audit: restricted
The restricted profile requires:
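- Containers must not run as root (runAsNonRoot: true)
- Root filesystem must be read-only (readOnlyRootFilesystem: true); this is a BitVault rule layered on top of the profile, because the upstream restricted standard does not check it and PodSecurity admission will not reject a pod that omits it
- No privilege escalation (allowPrivilegeEscalation: false)
- Seccomp profile must be RuntimeDefault or Localhost
- All capabilities dropped; only NET_BIND_SERVICE may be added (and is not needed, since services bind on ports > 1024)
- hostNetwork, hostPID, and hostIPC are all forbidden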
These requirements are satisfied by the image design rules described in Containerization.
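To make the rules above concrete, the following Go sketch checks a container security context against them. The `SecurityContext` struct and the `violations` function are hypothetical, simplified stand-ins written for this illustration; they are not the Kubernetes API types, and this is not how the admission controller is implemented.

```go
package main

import "fmt"

// SecurityContext is a simplified, hypothetical stand-in for the
// container-level fields listed above. In the real Kubernetes API the
// boolean fields are pointers and allowPrivilegeEscalation must be set
// to false explicitly; a plain bool is used here for brevity.
type SecurityContext struct {
	RunAsNonRoot             bool
	ReadOnlyRootFilesystem   bool
	AllowPrivilegeEscalation bool
	SeccompProfile           string
	DropCapabilities         []string
	AddCapabilities          []string
}

func contains(list []string, want string) bool {
	for _, v := range list {
		if v == want {
			return true
		}
	}
	return false
}

// violations returns one message per rule that sc breaks.
func violations(sc SecurityContext) []string {
	var out []string
	if !sc.RunAsNonRoot {
		out = append(out, "runAsNonRoot must be true")
	}
	// Listed in this section; not part of the upstream restricted standard.
	if !sc.ReadOnlyRootFilesystem {
		out = append(out, "readOnlyRootFilesystem must be true")
	}
	if sc.AllowPrivilegeEscalation {
		out = append(out, "allowPrivilegeEscalation must be false")
	}
	if sc.SeccompProfile != "RuntimeDefault" && sc.SeccompProfile != "Localhost" {
		out = append(out, "seccomp profile must be RuntimeDefault or Localhost")
	}
	if !contains(sc.DropCapabilities, "ALL") {
		out = append(out, "capabilities must drop ALL")
	}
	for _, c := range sc.AddCapabilities {
		if c != "NET_BIND_SERVICE" {
			out = append(out, "capability not allowed: "+c)
		}
	}
	return out
}

func main() {
	compliant := SecurityContext{
		RunAsNonRoot:           true,
		ReadOnlyRootFilesystem: true,
		SeccompProfile:         "RuntimeDefault",
		DropCapabilities:       []string{"ALL"},
	}
	fmt.Println(len(violations(compliant)))         // 0
	fmt.Println(len(violations(SecurityContext{}))) // 4
}
```

A check of this kind could run in CI against rendered manifests, so that a non-compliant pod spec is caught before admission rejects it in the cluster.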
NetworkPolicies
By default, Kubernetes allows all pod-to-pod traffic, both within a namespace and across namespaces. BitVault overrides this with explicit NetworkPolicy resources:
- A default-deny policy is applied to every application namespace on creation, blocking all ingress and egress.
- Each service has an explicit NetworkPolicy granting only the ingress/egress paths it legitimately needs (e.g., bitvaultd may reach Postgres on port 5432; it may not reach the NATS management port).
- Egress to the object storage endpoint (MinIO/S3) is allowed only from bitvaultd and bitvault-worker; it is denied from bitvault-web.
- External egress (to cloud APIs, Vault, external KMS) is allowed only from designated pods via CIDR-scoped egress rules.
:::note mTLS
Service-to-service mTLS is enforced starting at P4 (service extraction phase) via a service mesh (e.g., Istio, Linkerd) or per-service cert-manager issued certificates. In P1–P3 (modular monolith phase), all inter-module calls are in-process; there is no lateral network traffic between modules to secure.
:::
NetworkPolicies are part of the Helm chart and are created alongside the workloads they govern.
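The default-deny plus allowlist behaviour in this section can be modelled in a few lines of Go. This is only a conceptual sketch: the `Flow` type, the `allowed` table, and the destination labels such as `object-storage` are hypothetical names chosen for illustration, and real enforcement is done by the cluster's network plugin from NetworkPolicy resources, not by application code.

```go
package main

import "fmt"

// Flow is one egress path: a source workload reaching a named destination.
type Flow struct{ From, To string }

// allowed holds the explicit egress grants. Anything absent from the map
// is denied, mirroring a default-deny policy with additive allow rules.
var allowed = map[Flow]bool{
	{"bitvaultd", "postgres:5432"}:        true,
	{"bitvaultd", "object-storage"}:       true,
	{"bitvault-worker", "object-storage"}: true,
}

// permitted reports whether an explicit grant exists for f.
func permitted(f Flow) bool { return allowed[f] }

func main() {
	fmt.Println(permitted(Flow{"bitvaultd", "postgres:5432"}))      // true
	fmt.Println(permitted(Flow{"bitvault-web", "object-storage"})) // false
}
```

The point of the model is that there is no deny rule anywhere: bitvault-web is blocked from object storage simply because no grant names it.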
Resource Management
Every container specifies both requests and limits. Unbounded containers are not permitted.
| Workload | Scaling Mechanism | Trigger |
|---|---|---|
| bitvaultd (stateless control plane) | HPA | CPU utilization + custom requests_per_second metric |
| bitvault-web (stateless SSR) | HPA | CPU + memory |
| bitvault-worker (async consumers) | KEDA ScaledObject | NATS JetStream consumer pending message count |
| PostgreSQL, Redis, NATS | Operator-managed | Operator-specific (replica count, shard count) |
PodDisruptionBudgets are defined for every stateful quorum and every stateless service:
- Stateless services (bitvaultd, bitvault-web): minAvailable: 1 (or 50% for larger deployments)
- NATS JetStream cluster: minAvailable: 2 (for a 3-node cluster)
- PostgreSQL HA: operator-managed PDB per the chosen operator (e.g., CloudNativePG)
This ensures that cluster node drain operations do not take down more replicas than the quorum can tolerate.
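The arithmetic behind these budgets is small enough to sketch. The helper names below are hypothetical and the replica counts other than the 3-node NATS cluster are illustrative assumptions; the rounding rule (a percentage minAvailable is rounded up to a whole pod) follows the documented Kubernetes behaviour.

```go
package main

import "fmt"

// minAvailableFromPercent converts a percentage into a pod count,
// rounding up to the next whole pod.
func minAvailableFromPercent(replicas, percent int) int {
	return (replicas*percent + 99) / 100
}

// allowedDisruptions is the number of healthy pods that may be evicted
// voluntarily without dropping below minAvailable.
func allowedDisruptions(healthy, minAvailable int) int {
	if d := healthy - minAvailable; d > 0 {
		return d
	}
	return 0
}

func main() {
	fmt.Println(allowedDisruptions(3, 2))                              // 1: 3-node NATS cluster
	fmt.Println(allowedDisruptions(1, 1))                              // 0: a single replica blocks drains
	fmt.Println(allowedDisruptions(7, minAvailableFromPercent(7, 50))) // 3: 50% of 7 rounds up to 4
}
```

The second case is worth noting: with minAvailable: 1, a stateless service running a single replica permits no voluntary evictions, so a node drain will wait until a second replica is scheduled elsewhere.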
Health Probes
Every component exposes two health endpoints on a dedicated port (default 8081, separate from the serving port):
| Probe | Endpoint | What It Checks |
|---|---|---|
| Liveness | GET /healthz | The process is alive and not deadlocked. Returns 200 if the Go runtime and basic goroutines are healthy. |
| Readiness | GET /readyz | The pod is ready to receive traffic. Checks DB connectivity, Redis reachability, and that the service has completed startup (e.g., migration applied, initial cache warm). |
The liveness probe has a generous initialDelaySeconds (60s for bitvaultd, which runs migrations on startup). The readiness probe is checked more frequently and removes a pod from the Service endpoints as soon as it fails (that is, after failureThreshold consecutive failures), preventing traffic from being routed to a degraded instance.
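As an illustration of the two endpoints, here is a minimal Go sketch using only the standard library. The `Health` type, the `Check` signature, and the check names `db` and `redis` are assumptions made for this example, not BitVault's actual code; the dependency checks are stubs where a real service would ping Postgres and Redis.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// Check reports whether one dependency is currently usable.
type Check func() error

var errStarting = errors.New("startup not complete")

// Health holds the readiness state for one process.
type Health struct {
	started atomic.Bool      // set once migrations and cache warm-up finish
	checks  map[string]Check // dependency checks run on every /readyz call
}

// Ready returns nil only when startup has finished and every check passes.
func (h *Health) Ready() error {
	if !h.started.Load() {
		return errStarting
	}
	for name, check := range h.checks {
		if err := check(); err != nil {
			return fmt.Errorf("%s: %w", name, err)
		}
	}
	return nil
}

// Mux wires both probe endpoints onto a dedicated mux.
func (h *Health) Mux() *http.ServeMux {
	mux := http.NewServeMux()
	// Liveness: being able to answer at all shows the process is not wedged.
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// Readiness: a 503 keeps the pod out of the Service endpoints.
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		if err := h.Ready(); err != nil {
			http.Error(w, err.Error(), http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

func main() {
	h := &Health{checks: map[string]Check{
		"db":    func() error { return nil }, // stand-in for a DB ping
		"redis": func() error { return nil }, // stand-in for a Redis ping
	}}
	// A real service would call http.ListenAndServe(":8081", h.Mux()).
	// The mux is exercised in-process here so the sketch terminates.
	probe := func(path string) int {
		rec := httptest.NewRecorder()
		h.Mux().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
		return rec.Code
	}
	fmt.Println(probe("/healthz"), probe("/readyz")) // 200 503
	h.started.Store(true)
	fmt.Println(probe("/healthz"), probe("/readyz")) // 200 200
}
```

Keeping liveness free of dependency checks is deliberate in this sketch: a database outage should take pods out of rotation through /readyz, not trigger restarts through /healthz.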
Graceful Shutdown
When Kubernetes sends SIGTERM to a pod (during a rolling update, node drain, or scale-down), the following sequence executes:
- Stop accepting new connections. The HTTP and gRPC servers stop accepting new TCP connections immediately.
- Drain in-flight requests. Active HTTP requests and gRPC streams are allowed to complete up to a configurable drain timeout (default: 30s).
- Finish or checkpoint work. For bitvault-worker, in-flight NATS message processing is completed; partially processed messages are NACKed for redelivery rather than dropped.
- Exit cleanly. The process exits with code 0.
A preStop lifecycle hook introduces a short sleep (5s) before the shutdown sequence begins, giving the Service endpoints controller time to propagate the pod removal and stop routing new requests before the drain starts.
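The HTTP part of this sequence maps closely onto Go's `http.Server.Shutdown`, which closes listeners first and then waits for active requests. The sketch below is one possible shape under stated assumptions: `serveUntil` and `drainTimeout` are hypothetical names, gRPC and NATS handling are omitted, and the 200ms timer in `main` exists only so the example terminates without an external signal.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// drainTimeout mirrors the documented default of 30s.
const drainTimeout = 30 * time.Second

// serveUntil serves HTTP on addr until stop is closed, then drains.
func serveUntil(stop <-chan struct{}, addr string, handler http.Handler, drain time.Duration) error {
	srv := &http.Server{Addr: addr, Handler: handler}
	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.ListenAndServe() }()

	select {
	case err := <-serveErr:
		return err // the listener failed before shutdown was requested
	case <-stop:
	}

	ctx, cancel := context.WithTimeout(context.Background(), drain)
	defer cancel()
	// Shutdown closes the listeners first (no new connections), then
	// waits for in-flight requests until ctx expires.
	if err := srv.Shutdown(ctx); err != nil {
		return fmt.Errorf("drain incomplete: %w", err)
	}
	if err := <-serveErr; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

func main() {
	// The kubelet sends SIGTERM once the preStop hook has returned.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM)
	defer stop()
	// Demo only: also stop after 200ms so this sketch terminates unaided.
	ctx, cancel := context.WithTimeout(ctx, 200*time.Millisecond)
	defer cancel()

	// Port 0 picks a free port for the demo; a real service uses a fixed one.
	if err := serveUntil(ctx.Done(), "127.0.0.1:0", http.NewServeMux(), drainTimeout); err != nil {
		fmt.Fprintln(os.Stderr, "shutdown:", err)
		os.Exit(1)
	}
	fmt.Println("drained cleanly") // returning from main exits with code 0
}
```

Note that `Shutdown` does not wait for hijacked connections such as WebSockets, so long-lived streams need their own drain logic.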
terminationGracePeriodSeconds is tuned per workload:
| Workload | terminationGracePeriodSeconds |
|---|---|
| bitvaultd | 60s |
| bitvault-web | 30s |
| bitvault-worker | 120s (long-running jobs) |
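Because the grace period clock starts when termination begins, the preStop sleep and the drain timeout both have to fit inside terminationGracePeriodSeconds. The Go sketch below works through that budget. It assumes every workload keeps the 5s preStop sleep and the default 30s drain timeout, which this page does not state per workload; the `Workload` type and `Headroom` method are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Workload captures the shutdown-related settings of one deployment.
type Workload struct {
	Name        string
	GracePeriod time.Duration // terminationGracePeriodSeconds
	PreStop     time.Duration // preStop sleep
	Drain       time.Duration // application drain timeout
}

// Headroom is what remains of the grace period after the preStop sleep and
// a full drain. A negative value means SIGKILL can arrive mid-drain.
func (w Workload) Headroom() time.Duration {
	return w.GracePeriod - w.PreStop - w.Drain
}

func main() {
	// Drain and preStop values are assumed defaults, not per-workload facts.
	workloads := []Workload{
		{"bitvaultd", 60 * time.Second, 5 * time.Second, 30 * time.Second},
		{"bitvault-web", 30 * time.Second, 5 * time.Second, 30 * time.Second},
		{"bitvault-worker", 120 * time.Second, 5 * time.Second, 30 * time.Second},
	}
	for _, w := range workloads {
		fmt.Printf("%-16s headroom=%v\n", w.Name, w.Headroom())
	}
	// Prints headroom of 25s, -5s, and 1m25s respectively.
}
```

Under these assumptions bitvault-web comes out 5s short, so its drain timeout would need to be configured to 25s or less for the drain to finish before the kubelet sends SIGKILL.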