ADR-0022 — Three-tree reconciliation (local / remote / synced)

Status: Accepted
Date: 2026-06-11
Related: sync/05 reconciliation, sync/01 prior art, ADR-0008

Context

A sync engine must decide, per file, whether a difference between device and cloud originated on the local side, the remote side, or both (a conflict). A two-way diff (local vs remote) cannot make that call: a file present only on the remote may mean “remote added X” or “local deleted X”, and that ambiguity is the classic cause of sync data loss. Representing sync as a queue of in-flight operations (the legacy Dropbox model) is not robust to crashes, reordering, or offline divergence.
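
The ambiguity can be shown in a few lines. This is an illustrative sketch only: the flat `dict` snapshots (node ID to content hash) and the `two_way_diff` helper are assumptions made for the example, not part of the engine.

```python
def two_way_diff(local, remote):
    """Return the IDs present on only one side. The cause is not recoverable."""
    return {
        "only_local": set(local) - set(remote),
        "only_remote": set(remote) - set(local),
    }

# History A: the remote created X; the device never had it.
history_a = two_way_diff(local={}, remote={"X": "h1"})

# History B: both sides had X, then the user deleted X on the device.
history_b = two_way_diff(local={}, remote={"X": "h1"})

# The diffs are identical, yet the correct actions are opposite:
# A should download X, B should delete X from the remote.
assert history_a == history_b
```

Whichever action a two-way engine picks, it is wrong for one of the two histories.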

Decision

Model client state as three trees and compute sync as a three-way merge (Dropbox Nucleus model):

Local (L): the last observed on-disk state.
Remote (R): the last known cloud state, obtained via cursor deltas (ADR-0024).
Synced (S): the last state on which L and R agreed; this is the merge base.

A pure planner consumes (R, L, S) and emits the operations that converge them; each completed op advances S. Nodes are keyed by stable ID, not path, so a rename or move updates a single node in O(1). The server’s change journal (ADR-0008) provides a total order, so per-file vector clocks are unnecessary (unlike P2P engines such as Syncthing).
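
The per-node decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's actual planner: the `Node` fields, the verdict names (`push_local`, `pull_remote`, `converged`, `conflict`), and the representation of a tree as a `dict` keyed by node ID are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass(frozen=True)
class Node:
    parent: Optional[str]   # parent node ID (None for the root)
    name: str
    content_hash: str

Tree = Dict[str, Node]      # stable node ID -> node; a missing key means "absent"

def classify(node_id: str, remote: Tree, local: Tree, synced: Tree) -> str:
    """Compare each side against the merge base S to find who changed."""
    r, l, s = remote.get(node_id), local.get(node_id), synced.get(node_id)
    if l == r:
        # Both sides agree; if they differ from S, only S needs to advance.
        return "in_sync" if l == s else "converged"
    if r == s:
        return "push_local"     # only the local side moved away from S
    if l == s:
        return "pull_remote"    # only the remote side moved away from S
    return "conflict"           # both sides changed, and differently

def plan(remote: Tree, local: Tree, synced: Tree) -> Dict[str, str]:
    """Pure function of (R, L, S): no I/O, no hidden state."""
    ids = sorted(set(remote) | set(local) | set(synced))
    verdicts = {i: classify(i, remote, local, synced) for i in ids}
    return {i: v for i, v in verdicts.items() if v != "in_sync"}

x = Node(parent="root", name="X", content_hash="h1")

# X is absent from S, so its presence on the remote is an addition.
added_remotely = plan(remote={"X": x}, local={}, synced={})

# X is in S, so its absence locally is a deletion to propagate.
deleted_locally = plan(remote={"X": x}, local={}, synced={"X": x})
```

With the same L and R, the two calls return `{"X": "pull_remote"}` and `{"X": "push_local"}` respectively; only S differs, and S is what resolves the add-versus-delete question.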

Consequences

Positive

The Synced base makes every per-node decision unambiguous (local-only / remote-only / both) → correct conflict detection, no lost-update ambiguity.
“State, not activity” → the planner is idempotent and re-entrant: re-run from any point (crash, offline, reorder) and get the right answer. Offline-for-weeks and offline-for-seconds use the same code path (re-plan).
The planner is a pure function → exhaustively property-testable (CanopyCheck-style).
Central total order avoids vector-clock complexity.
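
The idempotence and testability claims can be exercised with a small randomized check. This sketch uses only the standard-library `random` module rather than any particular property-testing framework. The flat snapshots (node ID to content hash), the op names, and the helpers `plan`, `apply_op`, and `random_tree` are assumptions made for the example.

```python
import random

def plan(remote, local, synced):
    """Pure planner: node ID -> action, derived from state only."""
    ops = {}
    for nid in set(remote) | set(local) | set(synced):
        r, l, s = remote.get(nid), local.get(nid), synced.get(nid)
        if l == r:
            if l != s:
                ops[nid] = "advance_base"   # sides agree, S is stale
        elif l == s:
            ops[nid] = "pull"               # only remote changed
        elif r == s:
            ops[nid] = "push"               # only local changed
        else:
            ops[nid] = "conflict"           # both changed differently
    return ops

def _set(tree, nid, value):
    """Write a value, treating None as 'absent'."""
    if value is None:
        tree.pop(nid, None)
    else:
        tree[nid] = value

def apply_op(op, nid, remote, local, synced):
    """Apply one op in place; a completed op advances S for that node."""
    if op == "conflict":
        return                              # left for conflict resolution
    if op == "pull":
        _set(local, nid, remote.get(nid))
    elif op == "push":
        _set(remote, nid, local.get(nid))
    _set(synced, nid, local.get(nid))

def random_tree(rng):
    return {f"n{i}": rng.choice("abc") for i in range(6) if rng.random() < 0.7}

rng = random.Random(0)
for _ in range(500):
    remote, local, synced = (random_tree(rng) for _ in range(3))
    first = plan(remote, local, synced)
    conflicts = {nid for nid, op in first.items() if op == "conflict"}

    # Simulate a crash: only some of the planned ops complete.
    for nid, op in first.items():
        if rng.random() < 0.5:
            apply_op(op, nid, remote, local, synced)

    # Re-plan from whatever state survived, then run to completion.
    for nid, op in plan(remote, local, synced).items():
        apply_op(op, nid, remote, local, synced)

    # Property: everything except the original conflicts has converged.
    leftover = plan(remote, local, synced)
    assert set(leftover) == conflicts
    assert all(op == "conflict" for op in leftover.values())
```

No recovery path is needed after the simulated crash. The second call to `plan` sees only current state, so work that already completed simply drops out of the plan.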

Negative / costs

Roughly 3× the per-node metadata of an op-queue (three snapshots), plus the work of updating S on every completed op. Accepted as the cost of correctness.
Requires a reliable merge base; if the local DB is lost, S resets to ∅ and the planner runs conservatively (ADR-0023).
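
One plausible reading of “conservatively” is sketched below. ADR-0023 defines the actual policy, and this example does not claim to reproduce it. The assumption here is that with S = ∅ the planner never infers a deletion: a node present on one side only is copied to the other, and differing content becomes a conflict. The function name `plan_without_base` and the op names are hypothetical.

```python
def plan_without_base(remote, local):
    """Plan with S = ∅: absence on one side is read as 'not yet copied'."""
    ops = {}
    for nid in sorted(set(remote) | set(local)):
        r, l = remote.get(nid), local.get(nid)
        if r == l:
            ops[nid] = "adopt_as_base"   # identical on both sides: rebuild S
        elif l is None:
            ops[nid] = "download"        # cannot prove a local delete
        elif r is None:
            ops[nid] = "upload"          # cannot prove a remote delete
        else:
            ops[nid] = "conflict"        # no base to tell which side is newer
    return ops

ops = plan_without_base(
    remote={"a": "h1", "b": "h2", "c": "h3"},
    local={"a": "h1", "b": "h9", "d": "h4"},
)
# No op ever removes data, at the cost of possibly resurrecting
# files that were deleted on one side while the base was lost.
assert "delete" not in ops.values()
```

Under this assumption the trade-off is explicit: losing S can bring back deleted files or produce extra conflicts, but it cannot silently destroy data.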

Alternatives considered

Two-way diff (no Synced base): ambiguous add-vs-delete → data loss. Rejected.
Operation-log replay: fragile across crash/offline/reorder. Rejected.
Per-file vector clocks / P2P merge (Syncthing): needed without a central authority; redundant here given the journal’s total order. Rejected.

Scaling

Dirty-set planning visits only changed nodes (not the whole tree); subtree ops execute in parallel; folder moves are O(1) via node-ID.
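
The O(1) folder move and the dirty set can be illustrated together. This is a sketch under assumptions: the `Tree` class, its `put`, `move`, `path`, and `take_dirty` methods, and the idea of deriving paths on demand by walking parent links are hypothetical, and parallel subtree execution is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

@dataclass
class Node:
    parent: Optional[str]   # parent node ID (None for the root)
    name: str

@dataclass
class Tree:
    nodes: Dict[str, Node] = field(default_factory=dict)
    dirty: Set[str] = field(default_factory=set)

    def put(self, nid: str, parent: Optional[str], name: str) -> None:
        self.nodes[nid] = Node(parent, name)
        self.dirty.add(nid)

    def move(self, nid: str, new_parent: Optional[str],
             new_name: Optional[str] = None) -> None:
        """Rename or re-parent one node; descendants are not touched."""
        node = self.nodes[nid]
        node.parent = new_parent
        if new_name is not None:
            node.name = new_name
        self.dirty.add(nid)

    def path(self, nid: Optional[str]) -> str:
        """Derive the path on demand by walking parent links."""
        parts = []
        while nid is not None:
            node = self.nodes[nid]
            parts.append(node.name)
            nid = node.parent
        return "/".join(reversed(parts))

    def take_dirty(self) -> Set[str]:
        """Hand the changed node IDs to the planner and reset the set."""
        dirty, self.dirty = self.dirty, set()
        return dirty

tree = Tree()
tree.put("root", None, "")
tree.put("d1", "root", "docs")
tree.put("f1", "d1", "a.txt")
tree.put("f2", "d1", "b.txt")
tree.take_dirty()                       # initial load has been planned

tree.move("d1", "root", "archive")      # rename the folder
moved_path = tree.path("f1")            # children follow implicitly
to_plan = tree.take_dirty()             # only the folder itself is dirty
```

After the move, `moved_path` is `/archive/a.txt` and `to_plan` contains only `d1`: the planner has one node to visit regardless of how many files the folder holds.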