02 — Client Architecture
Answers (with 05/07): How should BitVault sync work? The detailed architecture of the native sync engine — components, threading model, and data flow around the three-tree planner.
The engine is one reusable Go library embedded in the desktop daemon and the CLI, and exposed to the future mobile app via bindings. One engine, many shells.
1. Component architecture
flowchart TB
classDef in fill:#dbeafe,stroke:#1e40af,color:#111827;
classDef ctl fill:#c7d2fe,stroke:#3730a3,color:#111827;
classDef wk fill:#fed7aa,stroke:#c2410c,color:#111827;
classDef db fill:#bbf7d0,stroke:#15803d,color:#111827;
classDef net fill:#fde68a,stroke:#b45309,color:#111827;
fs[("Local filesystem")]:::in
subgraph OBSERVE["Observation (feeds Local tree)"]
watch["Watcher (inotify/FSEvents/RDCW)"]:::wk
scan["Scanner (authoritative rescan)"]:::wk
hash["Hasher (BLAKE3, CDC)"]:::wk
end
subgraph REMOTE["Remote feed (feeds Remote tree)"]
notify["Notification client (longpoll/stream)"]:::net
pull["Cursor delta puller"]:::net
end
subgraph CORE["Control plane (single-threaded, deterministic)"]
idx["Indexer<br/>(updates Local/Remote trees in DB)"]:::ctl
plan["Planner<br/>3-way merge → ops (pure)"]:::ctl
sched["Scheduler<br/>(queue, deps, priority, retry)"]:::ctl
conf["Conflict resolver"]:::ctl
end
subgraph XFER["Transfer workers"]
up["Uploader (negotiate + presigned PUT + commit)"]:::wk
down["Downloader (fetch chunks + reconstruct + atomic rename)"]:::wk
end
db[("Local SQLite DB<br/>3 trees · queue · cursor · chunk cache")]:::db
api["BitVault Gateway / Sync API"]:::net
store["Object storage (presigned)"]:::net
fs --> watch --> hash
fs --> scan --> hash
hash --> idx
notify --> pull --> idx
idx <--> db
idx --> plan --> sched
plan --> conf --> sched
sched <--> db
sched --> up & down
up --> api
up -. presigned .-> store
down -. presigned .-> store
down --> fs
pull <--> api
notify <--> api
| Component | Role | Thread |
|---|---|---|
| Watcher | OS-native change hints; emits raw events (04) | bg |
| Scanner | authoritative full/subtree rescan (truth) | bg pool |
| Hasher | BLAKE3 + CDC chunking of changed files | bg pool |
| Notification client | longpoll/stream “namespace advanced” (07) | bg |
| Cursor puller | fetch ordered remote delta since cursor | bg |
| Indexer | applies observations into Local/Remote trees in the DB | control |
| Planner | pure (R,L,S) → ops three-way merge (05) | control |
| Scheduler | durable queue: ordering, deps, priority, retry (10) | control |
| Conflict resolver | classify + materialize conflicted copies (09) | control |
| Uploader / Downloader | chunk transfer + atomic apply (08) | bg pool |
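The Downloader's "atomic apply" in the table above is commonly done by writing to a temporary file in the destination directory and renaming it into place. The following is a minimal sketch of that pattern, not the engine's actual code: `writeAtomic`, `roundTrip`, and the `.bitvault-tmp-*` prefix are hypothetical names, and it assumes the temp file and destination share a filesystem, where POSIX rename is atomic (guarantees on other platforms may be weaker).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeAtomic writes data to a temp file beside dst, flushes it, then renames
// it over dst so readers see either the old file or the complete new one.
func writeAtomic(dst string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(dst), ".bitvault-tmp-*") // hypothetical prefix
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; harmless after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush to disk before the file becomes visible
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), dst)
}

// roundTrip is a demo helper: write atomically into a scratch dir, read back.
func roundTrip(data []byte) ([]byte, error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	dst := filepath.Join(dir, "file.bin")
	if err := writeAtomic(dst, data); err != nil {
		return nil, err
	}
	return os.ReadFile(dst)
}

func main() {
	got, err := roundTrip([]byte("reconstructed chunks"))
	fmt.Println(string(got), err)
}
```

A real downloader would stream reconstructed chunks into the temp file rather than hold the whole payload in memory; the byte slice here only keeps the sketch short.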
2. Threading model: single-threaded control, parallel I/O
Following Dropbox’s Nucleus engine: all control/state logic runs on one thread (the indexer → planner → scheduler loop), while I/O, hashing, and transfers run on worker pools.
- Why single-threaded control: the planner and tree mutations become deterministic and race-free — no locks on the trees, no interleaving bugs, and the planner can be property-tested by feeding tree triples and asserting the ops (Dropbox’s CanopyCheck). Concurrency bugs are the dominant source of sync data-corruption incidents; we design them out.
- Why parallel I/O: hashing a 1 GB file or uploading 100 chunks must not block the control loop. Workers communicate results back to the control thread via channels; the control thread is the sole writer of tree state.
- For tests, worker operations can be serialized to make whole-engine runs reproducible.
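The split described above can be sketched in Go with a worker pool that reports results over a channel to a single goroutine owning the tree. This is an illustrative sketch under stated assumptions, not the engine's code: `hashJob`, `hashResult`, `fakeHash`, and `runEngine` are hypothetical names, the "tree" is a plain map, and the hasher is a stand-in for BLAKE3.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// hashJob and hashResult are hypothetical message types for this sketch.
type hashJob struct{ Path string }

type hashResult struct {
	Path   string
	Digest string
}

// fakeHash stands in for the real hasher; it only tags the path.
func fakeHash(path string) string { return "digest(" + path + ")" }

// runEngine starts `workers` goroutines (assumed >= 1) that hash in parallel.
// The calling goroutine plays the control thread and is the sole writer of tree.
func runEngine(paths []string, workers int) map[string]string {
	jobs := make(chan hashJob)
	results := make(chan hashResult)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				results <- hashResult{Path: j.Path, Digest: fakeHash(j.Path)}
			}
		}()
	}

	// Feed jobs, then close results once every worker has finished.
	go func() {
		for _, p := range paths {
			jobs <- hashJob{Path: p}
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()

	// Control loop: only this goroutine mutates the tree, so no lock is needed.
	tree := make(map[string]string)
	for r := range results {
		tree[r.Path] = r.Digest
	}
	return tree
}

func main() {
	tree := runEngine([]string{"a.txt", "b.txt", "c.txt"}, 2)
	keys := make([]string, 0, len(tree))
	for k := range tree {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Println(k, tree[k])
	}
}
```

Setting `workers` to 1 serializes the hashing, which is one simple way to get the reproducible test runs mentioned above.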
3. The control loop (described, not coded)
The control thread runs a simple, re-entrant loop:
- Drain observations — apply pending watcher/scan results into the Local tree and pending cursor deltas into the Remote tree (one DB transaction each).
- Plan — if any tree changed, run the planner over (Remote, Local, Synced) to produce a set of operations (05).
- Enqueue — persist new operations into the durable queue (dedup against existing).
- Schedule — hand ready operations (dependencies satisfied) to transfer workers.
- Commit progress — as operations complete, advance the Synced tree for those nodes in a transaction; surface conflicts/errors.
- Idle until a wake signal (watcher event, notification, retry timer, user action).
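Because the Plan step (step 2) recomputes operations from persisted state instead of replaying in-memory events, the loop is idempotent: a crash at any point resumes correctly on restart (re-drain, re-plan), and re-planned operations that are already queued are dropped by the Enqueue dedup. This is the operational payoff of “state, not activity” (ADR-0022).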
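The Enqueue and Schedule steps carry most of the idempotency argument, so a small model may help. The sketch below is hypothetical: `Op`, `Queue`, and the ID format are invented for illustration, and an in-memory map stands in for the durable SQLite queue. It shows that enqueuing the same plan twice (as happens after a crash and re-plan) adds nothing, and that an operation becomes ready only once its dependencies complete.

```go
package main

import "fmt"

// Op is a hypothetical operation record; ID doubles as the dedup key.
type Op struct {
	ID   string   // e.g. "upload:/docs/a.txt"
	Deps []string // IDs that must complete first
}

// Queue is an in-memory stand-in for the durable queue.
type Queue struct {
	ops   map[string]Op
	order []string        // insertion order, for stable scheduling
	done  map[string]bool // completed op IDs
}

func NewQueue() *Queue {
	return &Queue{ops: map[string]Op{}, done: map[string]bool{}}
}

// Enqueue stores only ops not already queued and reports how many were added.
func (q *Queue) Enqueue(ops []Op) int {
	added := 0
	for _, op := range ops {
		if _, ok := q.ops[op.ID]; ok {
			continue // already queued: re-planning is a no-op
		}
		q.ops[op.ID] = op
		q.order = append(q.order, op.ID)
		added++
	}
	return added
}

// Ready returns pending op IDs whose dependencies have all completed.
func (q *Queue) Ready() []string {
	var ready []string
	for _, id := range q.order {
		if q.done[id] {
			continue
		}
		ok := true
		for _, d := range q.ops[id].Deps {
			if !q.done[d] {
				ok = false
				break
			}
		}
		if ok {
			ready = append(ready, id)
		}
	}
	return ready
}

// Complete marks an op finished (where the real loop would advance Synced).
func (q *Queue) Complete(id string) { q.done[id] = true }

func main() {
	plan := []Op{
		{ID: "mkdir:/docs"},
		{ID: "upload:/docs/a.txt", Deps: []string{"mkdir:/docs"}},
	}
	q := NewQueue()
	fmt.Println(q.Enqueue(plan), q.Enqueue(plan)) // second call simulates a re-plan
	fmt.Println(q.Ready())
	q.Complete("mkdir:/docs")
	fmt.Println(q.Ready())
}
```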
4. Data flow summary
flowchart LR
classDef t fill:#fde68a,stroke:#b45309,color:#111827;
classDef a fill:#c7d2fe,stroke:#3730a3,color:#111827;
obs1["watcher+scan+hash"]:::a --> L["Local tree"]:::t
obs2["cursor delta pull"]:::a --> R["Remote tree"]:::t
L --> PL{{"Planner (pure)"}}:::a
R --> PL
S["Synced tree"]:::t --> PL
PL --> Q["durable op queue"]:::a
Q --> EX["execute: upload / download / rename / delete / conflict"]:::a
EX -->|success| S
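To make the planner box in the diagram above concrete, here is a minimal sketch of a pure three-way merge over content hashes. It is an assumption-laden illustration, not the planner specified in 05: the `Action` names, the `Tree` type (path to hash, with an empty string meaning "absent"), and the `planNode`/`Plan` functions are all hypothetical, and renames, directories, and metadata are ignored.

```go
package main

import (
	"fmt"
	"sort"
)

// Action is the hypothetical outcome of merging one path.
type Action string

const (
	None         Action = "none"
	MarkSynced   Action = "mark-synced" // both sides already agree; only Synced lags
	Upload       Action = "upload"
	Download     Action = "download"
	DeleteLocal  Action = "delete-local"
	DeleteRemote Action = "delete-remote"
	Conflict     Action = "conflict"
)

// Tree maps path -> content hash; a missing path reads as "" (absent).
type Tree map[string]string

// planNode decides one path by comparing each side against the Synced base.
func planNode(remote, local, synced string) Action {
	switch {
	case remote == local:
		if local == synced {
			return None
		}
		return MarkSynced
	case local == synced: // only the remote side changed
		if remote == "" {
			return DeleteLocal
		}
		return Download
	case remote == synced: // only the local side changed
		if local == "" {
			return DeleteRemote
		}
		return Upload
	default: // both sides changed, differently
		return Conflict
	}
}

// Plan is pure: it reads the three trees and returns ops, mutating nothing.
func Plan(remote, local, synced Tree) map[string]Action {
	ops := make(map[string]Action)
	seen := make(map[string]bool)
	for _, t := range []Tree{remote, local, synced} {
		for path := range t {
			if seen[path] {
				continue
			}
			seen[path] = true
			if a := planNode(remote[path], local[path], synced[path]); a != None {
				ops[path] = a
			}
		}
	}
	return ops
}

func main() {
	remote := Tree{"a.txt": "h1", "b.txt": "h9", "c.txt": "h3"}
	local := Tree{"a.txt": "h2", "b.txt": "h5", "d.txt": "h4"}
	synced := Tree{"a.txt": "h1", "b.txt": "h0", "c.txt": "h3"}
	ops := Plan(remote, local, synced)
	paths := make([]string, 0, len(ops))
	for p := range ops {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Println(p, ops[p])
	}
}
```

Because `Plan` depends only on its three inputs, it can be property-tested by generating tree triples and asserting on the resulting ops, which is the testing style the threading section refers to.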
5. Tradeoffs / Alternatives / Scaling
Tradeoffs. A single control thread caps control-logic throughput on one core — fine, because the heavy work (hash/transfer) is parallel and the control loop only manipulates metadata. The simplicity/determinism win dwarfs the cost.
Alternatives considered.
- Free-threaded engine (legacy Dropbox): maximal parallelism, but the source of the concurrency bugs that motivated the rewrite. Rejected.
- Event-sourced operation log instead of the three trees: replaying an op log is fragile across crashes/offline/reorder; the three-tree recompute is more robust (ADR-0022).
- Embed sync in the web app: impossible — browsers have no persistent FS watcher; web uses the API directly (README §4).
Scaling concerns.
- Many files (millions): the control loop handles metadata only; cost is in scanning/hashing (mitigated by watcher + fast-path stat, 04) and DB size (03).
- Many small changes (e.g. a build tree): events are coalesced/debounced (04); the planner batches; ignore-rules drop churn (e.g. node_modules, .git).
- Many devices per user: each runs an independent engine converging to the server; fan-out is the server’s concern (07).
- Huge files: streamed, chunked, resumable, never buffered whole (08).
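The "many small changes" mitigation can be illustrated with a small coalescing step. This sketch is hypothetical and deliberately simplified: `ignored` and `coalesce` are invented names, ignore rules match whole path segments only, and the time-based debounce window covered in 04 is omitted, so it only shows how one burst of raw events collapses into a set of dirty paths.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ignored reports whether any path segment matches an ignore rule.
func ignored(path string, rules map[string]bool) bool {
	for _, seg := range strings.Split(path, "/") {
		if rules[seg] {
			return true
		}
	}
	return false
}

// coalesce collapses a burst of raw events into one entry per dirty path,
// dropping ignored paths, and returns them sorted for deterministic handling.
func coalesce(events []string, rules map[string]bool) []string {
	dirty := make(map[string]bool)
	for _, p := range events {
		if !ignored(p, rules) {
			dirty[p] = true
		}
	}
	out := make([]string, 0, len(dirty))
	for p := range dirty {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

func main() {
	rules := map[string]bool{"node_modules": true, ".git": true}
	events := []string{
		"src/main.go",
		"node_modules/x/index.js",
		"src/main.go", // duplicate event for the same file
		".git/index",
		"README.md",
	}
	fmt.Println(coalesce(events, rules))
}
```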