ADR-0007 — Shared-DB multi-tenancy with `tenant_id` + Row-Level Security

Status: Accepted
Date: 2026-06-11 · Revised: 2026-06-12 (Architecture Freeze V1)
Related: 01 R4, 08 §5, ADR-0004, ADR-0038

V1 Freeze (2026-06-12): Accepted. Blocker-4 follow-up: the unsafe phrase “session GUC” is replaced by a transaction-local SET LOCAL context; the connection-pooling interaction is now decided in its own ADR-0038.

Context

BitVault is multi-tenant SaaS and self-hostable. Multi-tenancy is a security boundary: a cross-tenant data leak is catastrophic, and if isolation lives only in application code it is a single forgotten WHERE tenant_id = ? away (R4). The options trade isolation strength against cost and operability: shared schema (cheapest, weakest isolation), schema-per-tenant (in between), database-per-tenant (strongest, costliest).

Decision

v1: shared database, shared schema. Every owned row carries tenant_id (NOT NULL, FK to tenants).
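Isolation is enforced in the database via PostgreSQL Row-Level Security. Each request sets a tenant context inside its transaction via SET LOCAL (a transaction-local GUC that is discarded at commit or rollback, so a pooled connection carries no context between transactions; see ADR-0038); RLS policies filter every query by current_setting('app.tenant_id'). App-layer scoping remains as defense-in-depth, but the database is the boundary: an app bug such as a forgotten filter cannot leak across tenants, provided the application role is actually subject to RLS (not a superuser, not BYPASSRLS, and not the table owner unless FORCE ROW LEVEL SECURITY is set). A query with no context set returns no rows (default-deny); this requires policies to read the setting with the two-argument form current_setting('app.tenant_id', true), because the one-argument form raises an error when the parameter is unset.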
Object storage uses tenant-prefixed keys (/{tenant_id}/{hash}) for storage-side isolation, per-tenant lifecycle, and quotas.
Redis/OpenSearch keys/indices are tenant-namespaced.
Documented escalation path (not built in v1): noisy or enterprise tenants → schema-per-tenant → database-per-tenant. The tenant_id-everywhere model makes this a routing/connection change, not a data-model rewrite.
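
A minimal sketch of how the decision items above could fit together, not the actual platform/db implementation. The names rls_ddl, tenant_transaction, object_key and the policy name tenant_isolation are hypothetical, and tenant_id is assumed to be a uuid column. The wrapper uses set_config(..., true), the function form of SET LOCAL, because it accepts a bind parameter; any DB-API connection with autocommit off (for example psycopg2) is assumed.

```python
from __future__ import annotations

import re
from contextlib import contextmanager

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def rls_ddl(table: str) -> list[str]:
    """Render the RLS statements for one tenant-scoped table (hypothetical helper)."""
    if not _IDENT.match(table):
        raise ValueError(f"unsafe table name: {table!r}")
    # Assumption: tenant_id is uuid. The two-argument current_setting returns NULL
    # when the setting is absent, and NULLIF maps an empty string to NULL as well,
    # so the comparison is NULL and no row qualifies (default-deny).
    predicate = "tenant_id = NULLIF(current_setting('app.tenant_id', true), '')::uuid"
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY",
        # FORCE makes the policy apply to the table owner too.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY",
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING ({predicate}) WITH CHECK ({predicate})",
    ]


@contextmanager
def tenant_transaction(conn, tenant_id: str):
    """Run one transaction with a transaction-local tenant context."""
    cur = conn.cursor()
    try:
        # Third argument true = local to the current transaction, like SET LOCAL.
        cur.execute("SELECT set_config('app.tenant_id', %s, true)", (tenant_id,))
        yield cur
        conn.commit()
    except BaseException:
        conn.rollback()
        raise
    finally:
        cur.close()


def object_key(tenant_id: str, content_hash: str) -> str:
    """Tenant-prefixed object storage key in the /{tenant_id}/{hash} layout."""
    return f"/{tenant_id}/{content_hash}"
```

Under these assumptions, request handlers would only ever receive the cursor yielded by tenant_transaction, so every statement they issue runs after the context has been set and before it is discarded.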

Consequences

Positive

Strong isolation at the lowest layer (RLS) — the right place for a security boundary (R4); a forgotten app-side filter cannot leak data.
Cheapest to operate; one DB to back up and migrate; ideal for self-host.
Smooth, pre-planned path to stronger isolation for tenants that need it.

Negative / costs

RLS adds query-planning and per-row policy-evaluation overhead and demands rigorous policy coverage and tests: a missing policy on a new table is a hole, caught by a CI check that every tenant-scoped table has a policy.
“Noisy neighbor” risk under shared resources → mitigated by per-tenant rate limits/quotas now, and the escalation path later.
Every transaction must set tenant context correctly; this is centralized in platform/db (transaction-local SET LOCAL, ADR-0038) so it cannot be forgotten per-query, and a CI lint forbids raw pool access that bypasses the wrapper.
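
The policy-coverage check mentioned above could look roughly like the following sketch. It assumes tenant-scoped tables live in the public schema and are recognizable by a tenant_id column; the function names are hypothetical. The catalog objects it queries (information_schema, pg_class.relrowsecurity, pg_policies) are standard PostgreSQL.

```python
from __future__ import annotations

# Assumption: "tenant-scoped" means a base table in schema public with a tenant_id column.
TENANT_TABLES_SQL = """
    SELECT c.table_name
    FROM information_schema.columns c
    JOIN information_schema.tables t USING (table_schema, table_name)
    WHERE c.table_schema = 'public'
      AND c.column_name = 'tenant_id'
      AND t.table_type = 'BASE TABLE'
"""
# Tables with RLS switched on. Adding "AND relforcerowsecurity" would also
# require FORCE ROW LEVEL SECURITY, a stricter rule than the ADR states.
RLS_ENABLED_SQL = """
    SELECT relname FROM pg_class
    WHERE relnamespace = 'public'::regnamespace AND relrowsecurity
"""
POLICY_TABLES_SQL = "SELECT DISTINCT tablename FROM pg_policies WHERE schemaname = 'public'"


def unprotected_tables(tenant_tables, rls_enabled, with_policy) -> list[str]:
    """Tenant-scoped tables that lack RLS or have no policy at all."""
    return sorted(
        t for t in set(tenant_tables)
        if t not in rls_enabled or t not in with_policy
    )


def check_policy_coverage(cur) -> list[str]:
    """Query the catalogs through a DB-API cursor and return the offenders."""
    def names(sql: str) -> set[str]:
        cur.execute(sql)
        return {row[0] for row in cur.fetchall()}

    return unprotected_tables(
        names(TENANT_TABLES_SQL), names(RLS_ENABLED_SQL), names(POLICY_TABLES_SQL)
    )
```

A CI job would run check_policy_coverage against a freshly migrated database and fail the build when the returned list is non-empty.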

Alternatives considered

App-layer-only tenant scoping (no RLS): rejected — a single missing WHERE is a breach; unacceptable for a security boundary.
Schema-per-tenant or DB-per-tenant from day one: rejected for v1 — operationally heavy (migrations × N tenants), bad for self-host, premature; kept as the documented escalation.
Separate tenant context service: unnecessary; tenant context is carried in the auth token and applied at the DB layer.

Verification

Two CI tests (see ADR-0038) guard the I3 invariant (08): (1) a forged tenant_id in the application layer → RLS still blocks cross-tenant reads/writes; (2) a pooled-connection bleed test across a transaction-mode pool → tenant B never observes tenant A’s rows, and a query without the context wrapper returns zero rows.
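
The assertions of the bleed test (2) can be sketched independently of the database. In this illustrative sketch, assert_no_bleed is a hypothetical check that takes a query hook running one transaction on a pooled connection with a given tenant context (None meaning no context). StubPool is an in-memory stand-in, not PostgreSQL, included only so the sketch runs; its session_scoped flag mimics the rejected session-GUC behavior, where context survives the transaction.

```python
from __future__ import annotations

from typing import Callable, Optional, Sequence

# Hypothetical harness hook: run one transaction with the given tenant context
# (None = no context set) and return the visible rows.
Query = Callable[[Optional[str]], Sequence[dict]]


def assert_no_bleed(query: Query, tenant_a: str, tenant_b: str) -> None:
    """Tenant B never sees A's rows, and a context-free query sees nothing."""
    rows_a = query(tenant_a)
    assert rows_a and all(r["tenant_id"] == tenant_a for r in rows_a)
    # Next transaction on the same pool, different tenant.
    rows_b = query(tenant_b)
    assert all(r["tenant_id"] == tenant_b for r in rows_b)
    # No context: default-deny means zero rows, not the previous tenant's rows.
    assert list(query(None)) == []


class StubPool:
    """In-memory stand-in for a single pooled connection (illustration only)."""

    def __init__(self, rows: list[dict], session_scoped: bool = False):
        self.rows = rows
        self.session_scoped = session_scoped
        self._ctx: Optional[str] = None

    def query(self, tenant: Optional[str]) -> list[dict]:
        if tenant is not None:
            self._ctx = tenant
        visible = [r for r in self.rows if r["tenant_id"] == self._ctx]
        if not self.session_scoped:
            self._ctx = None  # transaction-local: discarded when the transaction ends
        return visible


rows = [{"tenant_id": "a", "v": 1}, {"tenant_id": "b", "v": 2}]
assert_no_bleed(StubPool(rows).query, "a", "b")  # transaction-local context passes
```

In the real CI test the hook would run against a transaction-mode pool in front of PostgreSQL; the stub only shows that the same assertions fail when context is session-scoped.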