05 — Service Boundaries & Data Ownership

Covers task 7. Maps bounded contexts (04) to modules (v1, inside the bitvaultd modular monolith) that are designed to be extracted into services later (09).

Read the “v1 form” column carefully: almost everything starts as an in-process module. The service decomposition is the target, reached by extraction with a forcing function, never by big-bang rewrite (ADR-0001).

Architecture Freeze V1 (2026-06-12): ownership updated so the change journal is owned and written by File & Metadata in the commit transaction (source of truth); Sync reads it and owns only cursors + conflicts (review §3.3, ADR-0008). Search is Postgres-FTS in V1 (OpenSearch deferred, ADR-0009); the event bus is in-process in V1 (NATS at P3, ADR-0006).


1. The golden rules of ownership

  1. One owner per piece of data. A module/service that owns a table is the only one that writes it. Others read via its API or via events — never by reaching into its tables. This is the rule that makes extraction possible later.
  2. Postgres is the source of truth; derived stores are disposable. Search, notifications, usage counters, and previews are rebuildable from events.
  3. Cross-boundary writes go through APIs (sync) or events (async). No shared mutable tables across owners. No distributed transactions (ADR-0006).
  4. The data plane (bytes) never flows through compute. Presigned URLs only (R5).
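Rules 1 and 3 can be illustrated with a minimal Go sketch. Everything named here (`Files`, `FilesModule`, `Sharing`, `Grant`) is hypothetical and not taken from the bitvaultd codebase; the maps merely stand in for Postgres tables. The point is the seam: Sharing holds only an interface to the owner, so the in-process implementation could later be swapped for a remote client without touching Sharing.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is the read model the owning module exposes to other modules.
type Node struct {
	ID   string
	Name string
}

// Files is a hypothetical API surface for the File & Metadata module.
// Callers depend on this interface, never on the module's tables.
type Files interface {
	GetNode(id string) (Node, error)
}

// ErrNotFound is returned when a node does not exist.
var ErrNotFound = errors.New("node not found")

// FilesModule is an in-process implementation. Its storage field is
// unexported, so code outside the owner cannot write to it.
type FilesModule struct {
	nodes map[string]Node // stands in for the nodes table
}

func NewFilesModule() *FilesModule {
	return &FilesModule{nodes: map[string]Node{}}
}

// CreateNode is the only write path to node data in this sketch.
func (f *FilesModule) CreateNode(id, name string) Node {
	n := Node{ID: id, Name: name}
	f.nodes[id] = n
	return n
}

// GetNode is the read path other modules are allowed to use.
func (f *FilesModule) GetNode(id string) (Node, error) {
	n, ok := f.nodes[id]
	if !ok {
		return Node{}, ErrNotFound
	}
	return n, nil
}

// Sharing owns its own data and reaches node data only through Files.
type Sharing struct {
	files  Files
	shares map[string][]string // nodeID -> grantee user IDs
}

func NewSharing(files Files) *Sharing {
	return &Sharing{files: files, shares: map[string][]string{}}
}

// Grant validates the node through the Files API, then writes only
// to the data that Sharing itself owns.
func (s *Sharing) Grant(nodeID, userID string) error {
	if _, err := s.files.GetNode(nodeID); err != nil {
		return fmt.Errorf("grant on %q: %w", nodeID, err)
	}
	s.shares[nodeID] = append(s.shares[nodeID], userID)
	return nil
}

func main() {
	files := NewFilesModule()
	files.CreateNode("n1", "report.pdf")
	sharing := NewSharing(files)
	fmt.Println(sharing.Grant("n1", "u42"))      // <nil>
	fmt.Println(sharing.Grant("missing", "u42")) // grant on "missing": node not found
}
```

Because `Sharing` depends on the `Files` interface rather than on `*FilesModule`, a gRPC-backed client satisfying the same interface would be a drop-in replacement at extraction time.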

2. Services / Modules

| Logical service | Owns (authoritative data) | Internal API (gRPC) | Sync deps | Async (events) | v1 form |
|---|---|---|---|---|---|
| API Gateway / BFF | nothing (stateless) | none (calls others) | all modules | | Module: HTTP server + router in bitvaultd |
| Identity | tenants, users, memberships, roles, api_tokens, sessions | Identity.* (authn, authz, token introspection) | Postgres | emits UserCreated, TenantSuspended | Module |
| File & Metadata | nodes, versions, node_metadata, tags, trash, change_journal, outbox | Files.* (create/commit/move/list/version/trash) | Identity, Storage, Postgres | writes change_journal(seq) + emits NodeChanged via outbox, in one commit tx | Module |
| Storage | blobs, multipart_uploads, provider config | Storage.* (presign, head, commit-blob, delete, GC) | object store | emits BlobCommitted, BlobOrphaned | Module + worker (GC/finalizer) |
| Sync | device_cursors, conflict_records | Sync.* (register device, pull deltas, push) | Files (reads journal), Storage | none (reads the journal; no event projection) | Module (first extraction candidate) |
| Sharing | shares, share_links, permissions | Sharing.* (grant, link, resolve-access) | Identity, Files | emits ShareCreated | Module |
| Search & Indexing | derived FTS index (Postgres-FTS in V1; OpenSearch deferred P3) | Search.* (query) | Postgres (PG-FTS) | consumes node/share events → index | Module + worker (early extraction candidate) |
| Notification & Events | subscriptions, webhook_endpoints, notifications, delivery state | Notify.* (subscribe, deliver) | Redis, SMTP | consumes domain events → fan-out | Module + worker |
| Billing & Metering | usage_meters, quotas, plans | Billing.* (check-quota, record-usage) | Postgres | consumes usage events | Module |
| Admin & Platform | feature_flags, config, audit_log (append-only) | Admin.* | Postgres | consumes all events → audit | Module |
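The File & Metadata and Sync rows describe a split that a small Go sketch may make concrete: one commit appends both the journal row and the outbox event, and Sync only reads entries with `seq` greater than a device cursor. This is an in-memory simulation under stated assumptions: the mutex stands in for the Postgres commit transaction, all type and method names (`FileStore`, `Commit`, `ReadSince`, `SyncModule`, `PullDeltas`) are hypothetical, and advancing the cursor at pull time is a simplification (a real protocol would likely wait for a client acknowledgement).

```go
package main

import (
	"fmt"
	"sync"
)

// JournalEntry is one row of the change journal, ordered by Seq.
type JournalEntry struct {
	Seq    int64
	NodeID string
	Op     string
}

// OutboxEvent is a pending domain event awaiting relay to the bus.
type OutboxEvent struct {
	Type   string
	NodeID string
}

// FileStore stands in for the tables owned by File & Metadata.
// The mutex plays the role of the commit transaction.
type FileStore struct {
	mu      sync.Mutex
	nextSeq int64
	journal []JournalEntry
	outbox  []OutboxEvent
}

// Commit appends the journal row and the outbox event under one lock,
// so a reader never observes one without the other.
func (s *FileStore) Commit(nodeID, op string) int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.nextSeq++
	s.journal = append(s.journal, JournalEntry{Seq: s.nextSeq, NodeID: nodeID, Op: op})
	s.outbox = append(s.outbox, OutboxEvent{Type: "NodeChanged", NodeID: nodeID})
	return s.nextSeq
}

// ReadSince is the read-only view Sync uses: up to limit entries
// with Seq greater than cursor, in journal order.
func (s *FileStore) ReadSince(cursor int64, limit int) []JournalEntry {
	s.mu.Lock()
	defer s.mu.Unlock()
	var out []JournalEntry
	for _, e := range s.journal {
		if e.Seq > cursor && len(out) < limit {
			out = append(out, e)
		}
	}
	return out
}

// SyncModule owns only per-device cursors; it never writes the journal.
type SyncModule struct {
	files   *FileStore
	cursors map[string]int64
}

// PullDeltas returns the next page for a device and advances its cursor
// to the last Seq returned.
func (m *SyncModule) PullDeltas(deviceID string, limit int) []JournalEntry {
	deltas := m.files.ReadSince(m.cursors[deviceID], limit)
	if n := len(deltas); n > 0 {
		m.cursors[deviceID] = deltas[n-1].Seq
	}
	return deltas
}

func main() {
	store := &FileStore{}
	syncMod := &SyncModule{files: store, cursors: map[string]int64{}}
	store.Commit("n1", "create")
	store.Commit("n1", "rename")
	store.Commit("n2", "create")
	fmt.Println(len(syncMod.PullDeltas("laptop", 2))) // 2
	fmt.Println(len(syncMod.PullDeltas("laptop", 2))) // 1
	fmt.Println(len(syncMod.PullDeltas("laptop", 2))) // 0
}
```

Sync needs no event projection in this shape: a device that has been offline simply resumes from its stored cursor.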

Workers are the async halves (GC, indexing, notification delivery, preview generation). In v1 they run as goroutine pools inside bitvaultd driven by the in-process event bus / outbox; they are the first things to become standalone deployments because their scaling profile (bursty, CPU-bound, retry-heavy) differs sharply from request-serving.
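As a rough illustration of workers running as goroutine pools behind an in-process bus, here is a minimal Go sketch. The `Bus` type, `RunPool`, and the event shape are assumptions made for this example, not the project's actual bus; the sketch also omits retries and outbox relay, which a real worker would need.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Event is a simplified, hypothetical domain event.
type Event struct {
	Type   string
	NodeID string
}

// Bus is a hypothetical in-process bus: one buffered channel per subscriber.
type Bus struct {
	mu   sync.RWMutex
	subs []chan Event
}

// Subscribe registers a subscriber and returns its receive-only channel.
func (b *Bus) Subscribe(buffer int) <-chan Event {
	ch := make(chan Event, buffer)
	b.mu.Lock()
	b.subs = append(b.subs, ch)
	b.mu.Unlock()
	return ch
}

// Publish delivers the event to every subscriber. It blocks while a
// subscriber's buffer is full, which applies backpressure to the publisher.
func (b *Bus) Publish(e Event) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	for _, ch := range b.subs {
		ch <- e
	}
}

// Close closes all subscriber channels so worker pools can drain and exit.
func (b *Bus) Close() {
	b.mu.Lock()
	defer b.mu.Unlock()
	for _, ch := range b.subs {
		close(ch)
	}
	b.subs = nil
}

// RunPool starts n goroutines that handle events until the channel closes.
// The returned WaitGroup lets the caller wait for the pool to finish.
func RunPool(n int, events <-chan Event, handle func(Event)) *sync.WaitGroup {
	wg := &sync.WaitGroup{}
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for e := range events {
				handle(e)
			}
		}()
	}
	return wg
}

func main() {
	bus := &Bus{}
	var indexed, notified int64
	// Two independent pools with different sizes, e.g. indexer and notifier.
	idx := RunPool(4, bus.Subscribe(16), func(Event) { atomic.AddInt64(&indexed, 1) })
	ntf := RunPool(2, bus.Subscribe(16), func(Event) { atomic.AddInt64(&notified, 1) })
	for i := 0; i < 10; i++ {
		bus.Publish(Event{Type: "NodeChanged", NodeID: fmt.Sprintf("n%d", i)})
	}
	bus.Close()
	idx.Wait()
	ntf.Wait()
	fmt.Println(atomic.LoadInt64(&indexed), atomic.LoadInt64(&notified)) // 10 10
}
```

Each pool is sized independently and consumes from its own channel, which is the property that makes these workers natural candidates for standalone deployment: the pool can move behind a network subscription without changing the handler.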


3. The three planes

A useful lens orthogonal to the service list:

```mermaid
flowchart LR
    subgraph CP["Control plane (strong consistency)"]
        GW[API Gateway]
        ID[Identity]
        FM[File & Metadata]
        SH[Sharing]
        ST[Storage: presign/commit]
    end
    subgraph DP["Data plane (bypasses compute)"]
        OBJ[(Object Store)]
    end
    subgraph AP["Async / derivation plane (eventual)"]
        SE[Search indexer]
        NO[Notifier]
        BI[Meter]
        AU[Audit]
        GC[GC / finalizer]
    end
    JRNL[(change journal<br/>source of truth)]
    SY[Sync: delta serve]
    Client -->|REST| GW
    GW --> ID & FM & SH & ST & SY
    Client -. presigned PUT/GET .-> OBJ
    FM -->|writes journal + outbox in one commit tx| JRNL
    SY -->|reads seq gt cursor| JRNL
    FM -->|outbox| BUS{{in-proc bus / NATS at P3}}
    ST -->|outbox| BUS
    BUS --> SE & NO & BI & AU & GC
    GC --> OBJ
```
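The derivation plane holds only state that can be thrown away and rebuilt from events. The Go sketch below shows that property for a search index, under loud assumptions: the in-memory inverted index is a stand-in for the Postgres-FTS index, and `NodeEvent`, `SearchIndex`, `Apply`, and `Rebuild` are hypothetical names invented for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// NodeEvent is a hypothetical, simplified domain event.
type NodeEvent struct {
	Seq     int64
	NodeID  string
	Name    string // current file name; ignored when Deleted is true
	Deleted bool
}

// SearchIndex is a derived store: token -> set of node IDs.
type SearchIndex struct {
	tokens map[string]map[string]bool
	names  map[string]string // nodeID -> indexed name, used to un-index on change
}

func NewSearchIndex() *SearchIndex {
	return &SearchIndex{tokens: map[string]map[string]bool{}, names: map[string]string{}}
}

// tokenize lowercases a name and splits it on common separators.
func tokenize(name string) []string {
	return strings.FieldsFunc(strings.ToLower(name), func(r rune) bool {
		return r == ' ' || r == '.' || r == '_' || r == '-'
	})
}

// Apply folds one event into the index: it removes the tokens of the
// previously indexed name, then indexes the new name unless deleted.
func (ix *SearchIndex) Apply(e NodeEvent) {
	if old, ok := ix.names[e.NodeID]; ok {
		for _, t := range tokenize(old) {
			delete(ix.tokens[t], e.NodeID)
		}
		delete(ix.names, e.NodeID)
	}
	if e.Deleted {
		return
	}
	for _, t := range tokenize(e.Name) {
		if ix.tokens[t] == nil {
			ix.tokens[t] = map[string]bool{}
		}
		ix.tokens[t][e.NodeID] = true
	}
	ix.names[e.NodeID] = e.Name
}

// Query returns the sorted IDs of nodes whose name contains the token.
func (ix *SearchIndex) Query(token string) []string {
	var ids []string
	for id := range ix.tokens[strings.ToLower(token)] {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

// Rebuild replays the event log, in order, into a fresh index.
func Rebuild(log []NodeEvent) *SearchIndex {
	ix := NewSearchIndex()
	for _, e := range log {
		ix.Apply(e)
	}
	return ix
}

func main() {
	log := []NodeEvent{
		{Seq: 1, NodeID: "n1", Name: "Q3 report.pdf"},
		{Seq: 2, NodeID: "n2", Name: "report-draft.docx"},
		{Seq: 3, NodeID: "n1", Name: "Q3 summary.pdf"}, // rename
		{Seq: 4, NodeID: "n2", Deleted: true},
	}
	ix := Rebuild(log)
	fmt.Println(ix.Query("summary")) // [n1]
	fmt.Println(ix.Query("report"))  // []
}
```

Dropping the index and calling `Rebuild` again yields the same answers, which is what lets a derived store be treated as disposable.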

4. Service boundary detail

API Gateway / BFF

Identity & Access

File & Metadata (the spine)

Storage

Sync (first extraction target)

Sharing

Search & Indexing (early extraction target)

Notification & Events

Billing & Metering

Admin & Platform


5. Extraction forcing-functions (when a module becomes a service)

Per ADR-0001, extraction is evidence-driven. A module graduates to a service only when one of these is demonstrated:

| Trigger | Likely first service |
|---|---|
| Async workload starves request latency (GC’s CPU, indexing bursts) | Storage worker, Search indexer |
| A component needs an independent scaling profile (many long-lived sync connections) | Sync |
| A component needs a different datastore lifecycle / can be optional | Search |
| Independent deploy cadence / blast-radius isolation needed | the noisiest module |
| A team takes ownership | that team’s context |

What is deliberately not a trigger: “microservices look good.” The whole point (see 01 §3.1) is that disciplined, justified extraction is the portfolio story.


6. Anti-patterns this boundary design forbids