ADR-0007 — Shared-DB multi-tenancy with tenant_id + Row-Level Security
- Status: Accepted
- Date: 2026-06-11 · Revised: 2026-06-12 (Architecture Freeze V1)
- Related: 01 R4, 08 §5, ADR-0004, ADR-0038
V1 Freeze (2026-06-12): Accepted. Blocker-4 follow-up: the unsafe phrase “session GUC” is replaced by a transaction-local
SET LOCAL context; the connection-pooling interaction is now decided in its own ADR-0038.
Context
BitVault is multi-tenant SaaS and self-hostable. Multi-tenancy is a security
boundary: a cross-tenant data leak is catastrophic and is one forgotten WHERE
tenant_id = ? away if isolation lives only in application code (R4). Options trade
isolation strength against cost/operability: shared schema (cheap, weakest
isolation), schema-per-tenant, database-per-tenant (strongest, costliest).
Decision
- v1: shared database, shared schema. Every owned row carries tenant_id (NOT NULL, FK to tenants).
- Isolation is enforced in the database via PostgreSQL Row-Level Security. Each request sets a tenant context inside its transaction via SET LOCAL (a transaction-local GUC setting that is discarded when the transaction ends, whether by commit or rollback, so a pooled connection carries no context between transactions; see ADR-0038). RLS policies filter every query by current_setting('app.tenant_id'). App-layer scoping remains as defense-in-depth, but the database is the boundary: a missing or wrong app-side filter cannot leak across tenants, and a query with no context set returns no rows (default-deny).
- Object storage uses tenant-prefixed keys (/{tenant_id}/{hash}) for storage-side isolation, per-tenant lifecycle, and quotas.
- Redis/OpenSearch keys/indices are tenant-namespaced.
- Documented escalation path (not built in v1): noisy or enterprise tenants →
schema-per-tenant → database-per-tenant. The
tenant_id-everywhere model makes this a routing/connection change, not a data-model rewrite.
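A minimal sketch of the per-request tenant context described above, written in Python against a PEP 249 (DB-API) connection. The helper name tenant_transaction is hypothetical, the %s placeholder style assumes a psycopg-like driver that is not in autocommit mode, and tenant ids are assumed to be UUIDs. It uses set_config(..., true), PostgreSQL's function form of SET LOCAL, which lets the tenant id be passed as a bind parameter rather than interpolated into the statement.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def tenant_transaction(conn, tenant_id):
    """Run one transaction with app.tenant_id set transaction-locally.

    Hypothetical helper: `conn` is any DB-API connection to PostgreSQL
    that opens a transaction implicitly on the first statement.
    """
    # Assumption: tenant ids are UUIDs. Malformed input fails here,
    # before anything is sent to the database.
    tenant = str(uuid.UUID(str(tenant_id)))
    cur = conn.cursor()
    try:
        # Third argument true = is_local: the value lasts only until the
        # end of the current transaction, like SET LOCAL.
        cur.execute("SELECT set_config('app.tenant_id', %s, true)", (tenant,))
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Request handlers would then run all queries inside `with tenant_transaction(conn, tenant_id) as cur:`, so the context is set once per transaction and never per query.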
Consequences
Positive
- Strong isolation at the lowest layer (RLS) — the right place for a security boundary (R4); a forgotten app-side filter cannot leak data.
- Cheapest to operate; one DB to back up and migrate; ideal for self-host.
- Smooth, pre-planned path to stronger isolation for tenants that need it.
Negative / costs
- RLS adds query-planning overhead and demands rigorous policy coverage + tests (a missing policy on a new table is a hole — caught by a CI check that every tenant-scoped table has a policy).
- “Noisy neighbor” risk under shared resources → mitigated by per-tenant rate limits/quotas now, and the escalation path later.
- Every transaction must set tenant context correctly; this is centralized in platform/db (transaction-local SET LOCAL, ADR-0038) so it cannot be forgotten per-query, and a CI lint forbids raw pool access that bypasses the wrapper.
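The policy-coverage check mentioned above could look roughly like the following sketch. The two catalog queries use the standard information_schema.columns and pg_policies views; the function names and the rule "a table is tenant-scoped if it has a tenant_id column" are assumptions for illustration, not the project's actual CI job.

```python
# Tables that carry a tenant_id column (assumed definition of "tenant-scoped").
TENANT_TABLES_SQL = """
SELECT table_schema, table_name
FROM information_schema.columns
WHERE column_name = 'tenant_id'
  AND table_schema NOT IN ('pg_catalog', 'information_schema')
"""

# Tables that have at least one RLS policy defined.
POLICY_TABLES_SQL = "SELECT DISTINCT schemaname, tablename FROM pg_policies"


def tables_missing_policy(tenant_tables, policy_tables):
    """Return tenant-scoped tables with no RLS policy, sorted for stable output."""
    return sorted(set(tenant_tables) - set(policy_tables))


def check_policy_coverage(conn):
    """Raise AssertionError listing any tenant-scoped table without a policy."""
    cur = conn.cursor()
    try:
        cur.execute(TENANT_TABLES_SQL)
        tenant_tables = [tuple(row) for row in cur.fetchall()]
        cur.execute(POLICY_TABLES_SQL)
        policy_tables = [tuple(row) for row in cur.fetchall()]
    finally:
        cur.close()
    missing = tables_missing_policy(tenant_tables, policy_tables)
    assert not missing, f"tables without an RLS policy: {missing}"
```

A fuller check would also confirm that row security is actually enabled on each table (pg_class.relrowsecurity), since a policy on a table without ENABLE ROW LEVEL SECURITY has no effect.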
Alternatives considered
- App-layer-only tenant scoping (no RLS): rejected; a single missing WHERE is a breach, which is unacceptable for a security boundary.
- Schema-per-tenant or DB-per-tenant from day one: rejected for v1 as operationally heavy (migrations × N tenants), bad for self-host, and premature; kept as the documented escalation.
- Separate tenant context service: unnecessary; tenant context is carried in the auth token and applied at the DB layer.
Verification
Two CI tests (see ADR-0038) guard the I3 invariant (08):
(1) a forged tenant_id in the application layer → RLS still blocks cross-tenant reads/writes;
(2) a pooled-connection bleed test across a transaction-mode pool → tenant B never observes tenant A’s rows, and a query without the context wrapper returns zero rows.
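A rough sketch of how the pooled-connection bleed test might be structured, in Python against a DB-API connection obtained through the pool. The table name documents, the helper names, and the assumption that the RLS policy yields zero rows (rather than an error) when app.tenant_id is unset are all illustrative. Ending each step with rollback() stands in for handing the server connection back to a transaction-mode pool. The connecting role must itself be subject to RLS (not a superuser, no BYPASSRLS), otherwise the test proves nothing.

```python
def visible_tenant_ids(conn, tenant_id=None):
    """Return the distinct tenant_ids one transaction can see in `documents`.

    With tenant_id=None no context is set, modelling a query that
    bypasses the context wrapper.
    """
    cur = conn.cursor()
    try:
        if tenant_id is not None:
            cur.execute(
                "SELECT set_config('app.tenant_id', %s, true)", (str(tenant_id),)
            )
        # `documents` is a hypothetical tenant-scoped table.
        cur.execute("SELECT DISTINCT tenant_id::text FROM documents")
        return {row[0] for row in cur.fetchall()}
    finally:
        conn.rollback()  # end the transaction; the local setting is discarded
        cur.close()


def check_no_pool_bleed(conn, tenant_a, tenant_b):
    """Assert that tenant context does not leak between transactions."""
    a, b = str(tenant_a), str(tenant_b)
    assert visible_tenant_ids(conn, a) <= {a}, "tenant A saw foreign rows"
    # Next transaction on the same client connection: B must not inherit A's context.
    assert visible_tenant_ids(conn, b) <= {b}, "tenant B saw foreign rows"
    # No context at all: default-deny means nothing is visible.
    assert visible_tenant_ids(conn) == set(), "rows visible without context"
```

The subset checks (<=) tolerate a tenant with no rows yet; seeding at least one row per tenant beforehand makes the test stricter, because an empty result could otherwise mask a misconfigured policy.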