Kubernetes

Namespace Layout

Workloads are separated by environment and concern. PodSecurity admission is enforced at the namespace level.

| Namespace | Tenants / Contents | PodSecurity Profile |
| --- | --- | --- |
| bitvault-system | Operators (ArgoCD, external-secrets, cert-manager, KEDA), shared infra controllers | privileged (operators require it); tightly RBAC scoped |
| bitvault-prod | Production app workloads (bitvaultd, bitvault-worker, bitvault-web) | restricted |
| bitvault-staging | Staging app workloads | restricted |
| bitvault-dev | Dev app workloads | restricted |
| bitvault-preview-pr-N | Per-PR ephemeral review workloads | restricted |

Operators that require elevated permissions are isolated in bitvault-system with tight RBAC. Application namespaces enforce the restricted profile with no exceptions.

PodSecurity

All application namespaces enforce the Kubernetes restricted PodSecurity Standard. This is set via namespace labels and backed by the built-in PodSecurity admission controller:

```yaml
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/enforce-version: latest
pod-security.kubernetes.io/warn: restricted
pod-security.kubernetes.io/audit: restricted
```

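The restricted profile requires, among other things, that containers run as a non-root user, disallow privilege escalation, drop all Linux capabilities, and use the RuntimeDefault or Localhost seccomp profile.

These requirements are satisfied by the image design rules described in Containerization.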
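
As a rough illustration, a pod spec fragment that passes the restricted profile might look like the sketch below. It sets the fields the profile checks at the pod and container level. The container name, image reference, and UID are placeholders for illustration, not values taken from the BitVault charts.

```yaml
# Pod spec fragment (illustrative; image and UID are assumptions).
spec:
  securityContext:
    runAsNonRoot: true            # profile requirement
    runAsUser: 65532              # hypothetical non-root UID
    seccompProfile:
      type: RuntimeDefault        # profile accepts RuntimeDefault or Localhost
  containers:
    - name: bitvaultd
      image: registry.example.com/bitvaultd:placeholder   # placeholder image
      securityContext:
        allowPrivilegeEscalation: false   # profile requirement
        readOnlyRootFilesystem: true      # not required by the profile; common hardening
        capabilities:
          drop: ["ALL"]                   # profile requirement
```

A pod that omits any of the required fields is rejected at admission time in a namespace labelled with enforce: restricted.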

NetworkPolicies

By default, Kubernetes allows all pod-to-pod traffic, both within and across namespaces. BitVault overrides this with explicit NetworkPolicy resources.
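
The exact policy set is chart-specific. A common pattern, shown here only as a sketch, is a namespace-wide default-deny policy combined with narrow allow rules. The policy names, pod labels, and port 8080 below are assumptions for illustration.

```yaml
# Deny all ingress and egress for every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all          # hypothetical name
  namespace: bitvault-prod
spec:
  podSelector: {}                 # empty selector matches all pods
  policyTypes:
    - Ingress
    - Egress
---
# Allow bitvault-web pods to reach bitvaultd on its serving port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-bitvaultd    # hypothetical name
  namespace: bitvault-prod
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bitvaultd        # assumed label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: bitvault-web   # assumed label
      ports:
        - protocol: TCP
          port: 8080              # assumed serving port
```

With egress denied by default, each workload also needs explicit egress rules for its dependencies, including DNS resolution.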

:::note mTLS
Service-to-service mTLS is enforced starting at P4 (service extraction phase) via a service mesh (e.g., Istio, Linkerd) or per-service cert-manager issued certificates. In P1–P3 (modular monolith phase), all inter-module calls are in-process; there is no lateral network traffic between modules to secure.
:::

NetworkPolicies are part of the Helm chart and are created alongside the workloads they govern.

Resource Management

Every container specifies both requests and limits. Unbounded containers are not permitted.
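
A minimal container fragment that satisfies this rule could look like the following. The CPU and memory figures are placeholders, not tuned recommendations.

```yaml
# Container fragment (values are illustrative assumptions).
containers:
  - name: bitvaultd
    resources:
      requests:          # used by the scheduler for placement
        cpu: 250m
        memory: 256Mi
      limits:            # hard ceiling enforced at runtime
        cpu: "1"
        memory: 512Mi
```

Requests also matter for autoscaling: HPA utilization targets are computed as a percentage of the container's requested CPU or memory.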

| Workload | Scaling Mechanism | Trigger |
| --- | --- | --- |
| bitvaultd (stateless control plane) | HPA | CPU utilization + custom requests_per_second metric |
| bitvault-web (stateless SSR) | HPA | CPU + memory |
| bitvault-worker (async consumers) | KEDA ScaledObject | NATS JetStream consumer pending message count |
| PostgreSQL, Redis, NATS | Operator-managed | Operator-specific (replica count, shard count) |
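
The two autoscaling mechanisms can be sketched as follows. Replica bounds, target values, the NATS monitoring endpoint, and the stream and consumer names are all assumptions for illustration. The requests_per_second metric also presumes that a custom metrics adapter is installed to expose it to the HPA.

```yaml
# HPA for bitvaultd: CPU utilization plus a per-pod custom metric.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: bitvaultd
  namespace: bitvault-prod
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: bitvaultd
  minReplicas: 3                  # assumed bounds
  maxReplicas: 12
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # assumed target
    - type: Pods
      pods:
        metric:
          name: requests_per_second
        target:
          type: AverageValue
          averageValue: "100"     # assumed target per pod
---
# KEDA ScaledObject for bitvault-worker: scale on JetStream consumer lag.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: bitvault-worker
  namespace: bitvault-prod
spec:
  scaleTargetRef:
    name: bitvault-worker
  minReplicaCount: 1              # assumed bounds
  maxReplicaCount: 20
  triggers:
    - type: nats-jetstream
      metadata:
        natsServerMonitoringEndpoint: "nats.bitvault-system.svc.cluster.local:8222"  # assumed address
        account: "$G"                   # default global account (assumed)
        stream: "example-stream"        # hypothetical stream name
        consumer: "example-consumer"    # hypothetical consumer name
        lagThreshold: "50"              # assumed pending-message target per replica
```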

PodDisruptionBudgets are defined for every stateful quorum and every stateless service. This ensures that node drain operations do not evict more replicas than the quorum can tolerate.
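
As a sketch, a budget for a three-member quorum and one for a stateless service might look like this. The names, labels, and thresholds are assumptions, and the namespace is omitted because the source does not say where the stateful components run.

```yaml
# Stateful quorum: with three members, losing one keeps a majority.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nats-pdb                  # hypothetical name
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: nats        # assumed label
---
# Stateless service: keep a minimum serving capacity during drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: bitvaultd-pdb             # hypothetical name
spec:
  minAvailable: 2                 # assumed threshold
  selector:
    matchLabels:
      app.kubernetes.io/name: bitvaultd   # assumed label
```

A drain blocks on the eviction of a pod whenever that eviction would violate the matching budget, and proceeds once replacement pods become ready elsewhere.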

Health Probes

Every component exposes two health endpoints on a dedicated port (default 8081, separate from the serving port):

| Probe | Endpoint | What It Checks |
| --- | --- | --- |
| Liveness | GET /healthz | The process is alive and not deadlocked. Returns 200 if the Go runtime and basic goroutines are healthy. |
| Readiness | GET /readyz | The pod is ready to receive traffic. Checks DB connectivity, Redis reachability, and that the service has completed startup (e.g., migration applied, initial cache warm). |

The liveness probe has a generous initialDelaySeconds (60s for bitvaultd, which runs migrations on startup). The readiness probe is checked more frequently and removes a pod from the Service endpoints immediately on failure, so that traffic is not routed to a degraded instance.
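
A container fragment wiring up both probes might look like the sketch below. The endpoints, port 8081, and the 60s initial delay come from the text above. The probe periods and failure thresholds are assumptions.

```yaml
# Container fragment for bitvaultd (periods and thresholds are assumptions).
containers:
  - name: bitvaultd
    ports:
      - name: health
        containerPort: 8081       # dedicated health port
    livenessProbe:
      httpGet:
        path: /healthz
        port: health
      initialDelaySeconds: 60     # leaves room for startup migrations
      periodSeconds: 20
      failureThreshold: 3         # restart only after repeated failures
    readinessProbe:
      httpGet:
        path: /readyz
        port: health
      periodSeconds: 5            # checked more often than liveness
      failureThreshold: 1         # a single failure marks the pod unready
```

Setting failureThreshold to 1 on the readiness probe is what makes removal from the endpoints take effect on the first failed check; the Kubernetes default of 3 would delay it by two more periods.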

Graceful Shutdown

When Kubernetes sends SIGTERM to a pod (during a rolling update, node drain, or scale-down), the following sequence executes:

  1. Stop accepting new connections. The HTTP and gRPC servers stop accepting new TCP connections immediately.
  2. Drain in-flight requests. Active HTTP requests and gRPC streams are allowed to complete up to a configurable drain timeout (default: 30s).
  3. Finish or checkpoint work. For bitvault-worker, in-flight NATS message processing is completed where possible; messages that are still partially processed at the deadline are NACKed for redelivery rather than dropped.
  4. Exit cleanly. The process exits with code 0.

A preStop lifecycle hook introduces a short sleep (5s) before the shutdown sequence begins, giving the Service endpoints controller time to propagate the pod removal and stop routing new requests before the drain starts.

terminationGracePeriodSeconds is tuned per workload:

| Workload | terminationGracePeriodSeconds |
| --- | --- |
| bitvaultd | 60s |
| bitvault-web | 30s |
| bitvault-worker | 120s (long-running jobs) |
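
Putting the preStop hook and the grace period together, a pod spec fragment for bitvaultd could be sketched as follows. The exec form assumes the image ships a sleep binary, which minimal images may not; recent Kubernetes versions also offer a native sleep action for lifecycle hooks.

```yaml
# Pod spec fragment (illustrative; assumes a sleep binary in the image).
spec:
  terminationGracePeriodSeconds: 60   # bitvaultd value from the table above
  containers:
    - name: bitvaultd
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "5"]   # lets endpoint removal propagate before SIGTERM
```

The grace period clock starts when termination begins, so it has to cover the preStop sleep plus the drain timeout. With a 30s grace period, for example, a 5s sleep leaves at most 25s for draining before the container is killed.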