Phases
Phase Overview
flowchart LR
classDef p0 fill:#e5e7eb,stroke:#6b7280,color:#111827;
classDef p1 fill:#fde68a,stroke:#b45309,color:#111827;
classDef p2 fill:#fed7aa,stroke:#c2410c,color:#111827;
classDef p3 fill:#bfdbfe,stroke:#1d4ed8,color:#111827;
classDef p4 fill:#bbf7d0,stroke:#15803d,color:#111827;
P0["P0 · Walking Skeleton\n1 binary, PG+MinIO\nupload/download + OTel"]:::p0
P1["P1 · Core Product\nnamespace, versions, sharing,\ncommit protocol, RLS, CLI"]:::p1
P2["P2 · Sync\nchange journal, deltas,\nconflict copies, web app"]:::p2
P3["P3 · Async Plane\nNATS+outbox, search,\nnotifications, GC worker"]:::p3
P4["P4 · Extraction & Scale\nsplit workers/sync, Helm full,\nHPA, load+chaos proof"]:::p4
P5["P5 · Breadth\nmore storage adapters,\npreviews, mobile, multi-region*"]:::p4
P0 --> P1 --> P2 --> P3 --> P4 --> P5
P0 — Walking Skeleton
Goal: the thinnest end-to-end path (authenticate → presigned upload → commit →
download) in a single bitvaultd binary, on Postgres + MinIO.
| Aspect | Detail |
|---|---|
| Build | internal/platform/* (config, db, server, OTel from here), Identity (minimal), Storage (MinIO adapter), File & Metadata (commit protocol), Gateway (REST↔gRPC) |
| Proves | Dual-write defense (no orphaned blobs); one upload = one connected trace across ≥ 3 spans |
| Dependency tier | Lite (Postgres + MinIO) |
| Artifact | Upload trace screenshot spanning Gateway → File → Storage; chaos test showing GC reclaims the orphan after a process kill between PUT and commit |
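
The dual-write defense above can be illustrated with a minimal in-memory sketch. It assumes one possible shape of the commit protocol: an intent row is recorded before the blob PUT, commit only succeeds if the blob exists, and a sweep reclaims anything left pending past a TTL. The names `Store`, `Begin`, `Put`, `Commit`, and `Sweep` are hypothetical and the maps stand in for Postgres and MinIO; the real protocol may differ.

```go
package main

// Upload states in the assumed commit protocol.
const (
	statePending   = "pending"   // intent recorded; blob may or may not exist
	stateCommitted = "committed" // metadata row references the blob
)

type upload struct {
	state     string
	startedAt int64 // Unix seconds
}

// Store is an in-memory stand-in for Postgres (uploads) and MinIO (blobs).
type Store struct {
	uploads map[string]*upload
	blobs   map[string][]byte
}

func NewStore() *Store {
	return &Store{uploads: map[string]*upload{}, blobs: map[string][]byte{}}
}

// Begin records the intent to upload before any bytes are written.
func (s *Store) Begin(key string, now int64) {
	s.uploads[key] = &upload{state: statePending, startedAt: now}
}

// Put simulates the client writing bytes through a presigned URL.
func (s *Store) Put(key string, data []byte) {
	s.blobs[key] = data
}

// Commit marks the upload committed only if both the intent row and the
// blob exist, so metadata never points at a missing blob.
func (s *Store) Commit(key string) bool {
	u, ok := s.uploads[key]
	if !ok {
		return false
	}
	if _, exists := s.blobs[key]; !exists {
		return false
	}
	u.state = stateCommitted
	return true
}

// Sweep removes blobs and intent rows for uploads that are still pending
// after ttl seconds, and returns the reclaimed keys.
func (s *Store) Sweep(now, ttl int64) []string {
	var reclaimed []string
	for key, u := range s.uploads {
		if u.state == statePending && now-u.startedAt > ttl {
			delete(s.blobs, key)
			delete(s.uploads, key)
			reclaimed = append(reclaimed, key)
		}
	}
	return reclaimed
}
```

A process kill between `Put` and `Commit` leaves a pending row plus a blob. The next `Sweep` after the TTL removes both, which is the behavior the chaos test is meant to demonstrate.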
P1 — Core Product
Goal: real file management for a single tenant, multi-user.
| Aspect | Detail |
|---|---|
| Build | Folders, move/copy/rename, versioning, trash/restore, RLS multi-tenancy (ADR-0007), Sharing (internal + public links), RBAC deny-by-default (ADR-0010), Go CLI, Postgres-FTS name and metadata search |
| Proves | Tenant isolation invariant: a cross-tenant access attempt is rejected at the DB RLS layer; all namespace operations are byte-free (file bytes move only via presigned URLs) |
| Dependency tier | Lite / Standard |
| Artifact | CLI demo screencast; cross-tenant isolation test running in CI |
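
As a rough illustration of deny-by-default RBAC, the sketch below allows an action only when an explicit grant matches tenant, user, and resource, and the granted role includes that action. Every type and name here (`Grant`, `Authorizer`, the roles and actions) is a hypothetical stand-in, not the model defined in ADR-0010. This is an application-layer check that would sit in front of the RLS policies, not replace them.

```go
package main

type Role string
type Action string

const (
	RoleViewer Role = "viewer"
	RoleEditor Role = "editor"
)

const (
	ActionRead  Action = "read"
	ActionWrite Action = "write"
)

// roleActions lists what each role may do. Anything absent is denied.
var roleActions = map[Role]map[Action]bool{
	RoleViewer: {ActionRead: true},
	RoleEditor: {ActionRead: true, ActionWrite: true},
}

// Grant gives one user one role on one resource inside one tenant.
type Grant struct {
	TenantID   string
	UserID     string
	ResourceID string
	Role       Role
}

// Authorizer holds the explicit grants; it has no "allow all" path.
type Authorizer struct {
	grants []Grant
}

func NewAuthorizer(grants ...Grant) *Authorizer {
	return &Authorizer{grants: grants}
}

// Allowed returns true only if some grant matches exactly and its role
// permits the action. The fall-through result is always deny.
func (a *Authorizer) Allowed(tenantID, userID, resourceID string, act Action) bool {
	for _, g := range a.grants {
		if g.TenantID == tenantID && g.UserID == userID &&
			g.ResourceID == resourceID && roleActions[g.Role][act] {
			return true
		}
	}
	return false
}
```

Because the tenant ID is part of every match, a grant in one tenant never authorizes access in another, even before the database-level policy is consulted.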
P2 — Synchronization
Goal: correct multi-device sync.
| Aspect | Detail |
|---|---|
| Build | Change journal (monotonic per-tenant sequence), device cursors, delta pull/push API (ADR-0024), conflict = conflicted copy (ADR-0008 / ADR-0026), three-tree reconciliation (ADR-0022), local SQLite sync DB on device (ADR-0023), file watching (ADR-0025), Next.js web app |
| Proves | Conflict harness: two offline devices edit the same file simultaneously → both versions survive, no silent data loss. This is the most important test in the project. |
| Dependency tier | Standard (Postgres + MinIO + Redis + in-proc event bus) |
| Artifact | Conflict-resolution test in CI; web UI walkthrough of the conflict flow |
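
The conflict rule and the journal can be sketched together in a few lines. This is a simplified model under stated assumptions: each push carries the version the device last saw, a stale base version lands in a conflicted copy instead of overwriting the head, and every write appends a journal entry with a monotonic per-tenant sequence. `Tenant`, `Push`, `Since`, and the conflicted-copy naming are hypothetical; the real design lives in the ADRs listed above.

```go
package main

import "fmt"

// file is the server-side head of one path.
type file struct {
	version int
	content string
}

// Change is one journal entry; Seq is monotonic within a tenant.
type Change struct {
	Seq  uint64
	Path string
}

// Tenant holds one tenant's namespace and its change journal.
type Tenant struct {
	files   map[string]*file
	journal []Change
}

func NewTenant() *Tenant {
	return &Tenant{files: map[string]*file{}}
}

// record appends a journal entry with the next sequence number.
func (t *Tenant) record(path string) {
	t.journal = append(t.journal, Change{Seq: uint64(len(t.journal)) + 1, Path: path})
}

// Push applies an edit that a device made against baseVersion and returns
// the path where the content landed. If the head has moved on since
// baseVersion, the edit is kept as a conflicted copy rather than dropped.
func (t *Tenant) Push(device, path string, baseVersion int, content string) string {
	current := 0
	if head, ok := t.files[path]; ok {
		current = head.version
	}
	target := path
	if baseVersion != current {
		target = fmt.Sprintf("%s (conflicted copy from %s)", path, device)
	}
	next := 1
	if f, ok := t.files[target]; ok {
		next = f.version + 1
	}
	t.files[target] = &file{version: next, content: content}
	t.record(target)
	return target
}

// Since returns the journal entries after cursor plus the new cursor,
// which is how a device would pull deltas.
func (t *Tenant) Since(cursor uint64) ([]Change, uint64) {
	if cursor > uint64(len(t.journal)) {
		cursor = uint64(len(t.journal))
	}
	out := append([]Change(nil), t.journal[cursor:]...)
	return out, uint64(len(t.journal))
}
```

When two devices both edit version 1 offline, whichever pushes first becomes version 2, and the second push is stored under the conflicted-copy name. Both edits survive and both show up in the journal for other devices to pull.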
:::note
P2’s journal is still fed in-process from the File context — NATS JetStream is not
required yet. The event bus interface (internal/platform/bus) is in place from
P0 with an in-proc implementation, so P3 is a config swap, not a rewrite.
:::
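
To make the "config swap, not a rewrite" claim concrete, here is a minimal sketch of what such a seam could look like. The actual interface in internal/platform/bus is not shown in this document, so `Bus`, `Handler`, `InProcBus`, and `NewBus` are assumptions for illustration. The in-proc implementation is synchronous and not safe for concurrent use.

```go
package main

import "fmt"

// Handler processes one event payload.
type Handler func(payload []byte) error

// Bus is the seam: callers depend on this interface, not on a transport.
type Bus interface {
	Publish(topic string, payload []byte) error
	Subscribe(topic string, h Handler)
}

// InProcBus delivers events synchronously to handlers in the same process.
type InProcBus struct {
	handlers map[string][]Handler
}

// Compile-time check that InProcBus satisfies Bus.
var _ Bus = (*InProcBus)(nil)

func NewInProcBus() *InProcBus {
	return &InProcBus{handlers: map[string][]Handler{}}
}

func (b *InProcBus) Subscribe(topic string, h Handler) {
	b.handlers[topic] = append(b.handlers[topic], h)
}

// Publish calls each subscriber in registration order and stops at the
// first error.
func (b *InProcBus) Publish(topic string, payload []byte) error {
	for _, h := range b.handlers[topic] {
		if err := h(payload); err != nil {
			return err
		}
	}
	return nil
}

// NewBus selects an implementation from a configuration value. Only the
// in-proc variant exists in this sketch; a JetStream-backed variant would
// be added as another case without touching callers.
func NewBus(kind string) (Bus, error) {
	switch kind {
	case "inproc":
		return NewInProcBus(), nil
	default:
		return nil, fmt.Errorf("bus kind %q is not implemented in this sketch", kind)
	}
}
```

Callers hold a `Bus`, so moving to a durable transport later changes only which case `NewBus` returns.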
P3 — Async Derivation Plane
Goal: introduce real eventing where it earns its place.
| Aspect | Detail |
|---|---|
| Build | NATS JetStream behind the existing bus interface (ADR-0006), transactional outbox drainer, OpenSearch content search (ADR-0009), notifications/webhooks, GC/finalizer worker (ADR-0019), usage metering, audit sink; idempotent consumers + DLQs |
| Proves | At-least-once delivery with idempotency; derived index is rebuildable from source of truth (no derived store blocks the control plane) |
| Dependency tier | Full (+ OpenSearch) |
| Artifact | End-to-end trace spanning REST → gRPC → NATS → indexer; “rebuild search index from the journal” demo |
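
The interaction between an outbox drainer and an idempotent consumer can be sketched as follows. This is an in-memory model, not the project's implementation: `Outbox`, `Indexer`, and `Rebuild` are hypothetical names, and a real drainer would read from a table written in the same transaction as the state change. The sketch shows why at-least-once delivery needs consumer-side deduplication, and how a derived index can be rebuilt by replaying the source events.

```go
package main

// Event is one outbox row. ID is the key consumers deduplicate on.
type Event struct {
	ID   string
	Path string
}

// Outbox stands in for an outbox table.
type Outbox struct {
	rows      []Event
	published map[string]bool
}

func NewOutbox() *Outbox {
	return &Outbox{published: map[string]bool{}}
}

func (o *Outbox) Add(e Event) {
	o.rows = append(o.rows, e)
}

// Drain publishes every row not yet marked as published. A row is marked
// only after publish succeeds, so a crash in between causes a redelivery
// on the next drain: at-least-once, never at-most-once.
func (o *Outbox) Drain(publish func(Event) error) error {
	for _, e := range o.rows {
		if o.published[e.ID] {
			continue
		}
		if err := publish(e); err != nil {
			return err
		}
		o.published[e.ID] = true
	}
	return nil
}

// Indexer is a derived store that counts distinct events per path.
type Indexer struct {
	seen  map[string]bool
	Count map[string]int
}

func NewIndexer() *Indexer {
	return &Indexer{seen: map[string]bool{}, Count: map[string]int{}}
}

// Handle applies an event once; duplicate deliveries are ignored.
func (ix *Indexer) Handle(e Event) error {
	if ix.seen[e.ID] {
		return nil
	}
	ix.Count[e.Path]++
	ix.seen[e.ID] = true
	return nil
}

// Rebuild replays the source events into a fresh index.
func Rebuild(source []Event) *Indexer {
	ix := NewIndexer()
	for _, e := range source {
		_ = ix.Handle(e) // Handle never fails in this sketch
	}
	return ix
}
```

In a durable setup the "seen" marker and the index update would need to be committed atomically; this sketch keeps both in memory and skips that concern.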
P4 — Extraction & Scale
Goal: turn modules into services with evidence, and prove horizontal scale.
| Aspect | Detail |
|---|---|
| Build | Extract workers (indexer/notifier/GC) and Sync into their own cmd/* binaries + deployments (each with a forcing-function ADR); Helm full profile; HPA, PDB, NetworkPolicies, mTLS between services; load tests validating NFR-3/4 SLOs; chaos tests for partial failure |
| Proves | Data plane scales independently of control plane; extraction changed deployment, not callers (callers already went through the gRPC API) |
| Dependency tier | Full + Kubernetes |
| Artifact | Before/after architecture diagram; load test results validating NFR SLOs; chaos test results for partial dependency failure |
:::note Extraction is the story
The modular monolith → extracted services migration is a primary deliverable. Each extraction ships with an ADR naming the forcing function and a before/after trace + load test. This is what a principal-level reviewer wants to see.
:::
P5 — Breadth
Goal: widen, now that the core is proven.
| Aspect | Detail |
|---|---|
| Build | Additional storage adapters (S3 → R2 → GCS → Azure, each passing the conformance suite — ADR-0005), previews/thumbnails worker, React Native mobile app on the now-stable /v1 API, optional multi-region groundwork |
| Dependency tier | Full |
| Gate | Stable /v1 API freeze (required before mobile client ships) |
Intentional Deferrals
Each row is a forcing function that unlocks the next layer. Building the right-hand column before the left-hand column exists is over-engineering.
| Until you have… | Don’t build… |
|---|---|
| A working commit protocol (P0) | NATS, OpenSearch, multiple services |
| Correct sync (P2) | Previews, mobile app, extra storage adapters |
| An event backbone + outbox (P3) | Extracted services |
| Traces + load tests (P4) | Multi-region, service mesh, operators-for-everything |
| A stable public API (P4) | The React Native mobile app |
Definition of Done
The project demonstrates principal-level work when all seven criteria pass:
- ✅ Sync conflict harness passes — no silent data loss.
- ✅ Dual-write chaos test passes — no orphaned blobs or dangling references.
- ✅ Cross-tenant isolation test passes at the Postgres RLS layer.
- ✅ One upload renders as one connected distributed trace across ≥ 3 components.
- ✅ Load test shows control-plane horizontal scaling + data-plane independence.
- ✅ At least one module extracted to a service with a documented forcing function and a before/after trace + load test.
- ✅ Every major decision has an ADR with consequences honestly stated (positive and negative).