ADR-0004 — PostgreSQL as the metadata source of truth
Context
A file is two writes — bytes (object store) and metadata (a row). They cannot be written atomically across two systems, so one must be authoritative. Getting this wrong yields orphaned blobs or dangling references (R2) — silent corruption, the worst class of bug for a storage product. We also need transactions, relational integrity (the namespace tree, versions, shares), Row-Level Security for tenant isolation (ADR-0007), and a transactional outbox (ADR-0006).
Decision
PostgreSQL is the single source of truth for all metadata: the namespace tree
(nodes), versions, blobs (the record of bytes, ref-counted), shares,
identity, the change journal, and the transactional outbox.
Object storage holds bytes only; an object with no committed Postgres row does not exist and is reclaimed by GC. The commit protocol (06 §4) writes the version, the blob refcount, and the outbox event in one transaction.
Postgres features we rely on: ACID transactions, foreign keys, RLS (ADR-0007),
jsonb (metadata/outbox payloads), partial indexes (outbox drainer), and
full-text search for v1 (ADR-0009).
Consequences
Positive
- Atomic commit of metadata + event closes the dual-write hole (R2).
- Strong consistency for the namespace (NFR-5); relational integrity for the tree.
- One well-understood, portable, self-hostable datastore for the core.
- RLS gives a database-level tenant boundary (ADR-0007).
Negative / costs
- Postgres becomes the scaling pivot of the control plane; we must plan read replicas and document the connection/IOPS ceiling (NFR-4). Mitigation: the data plane bypasses Postgres entirely (ADR-0011); derived reads go to Redis/OpenSearch.
- Large-scale write throughput eventually needs partitioning/sharding by tenant —
the
tenant_id-everywhere model keeps that a routing change (ADR-0007). - Requires migration discipline (forward-only, versioned).
Alternatives considered
- Object store as source of truth (metadata in object tags/sidecars): rejected — no transactions, no relational queries, weak/again-eventual consistency, painful listing/search.
- A NoSQL document store (e.g. Mongo/Dynamo): rejected — we need multi-row transactions (commit protocol), relational integrity (tree/versions/shares), and RLS; Postgres provides all three and is trivially self-hostable.
- Separate DB per context from day one: rejected for v1 — premature; one Postgres with per-context schemas/ownership suffices until a context is extracted.