ADR-0027 — Sync safety guards (atomic apply, self-write suppression, bulk-change brake)

Context

Sync failures that lose or corrupt data are catastrophic and, from the user’s view, irreversible; failures that merely delay or duplicate data are recoverable. Three hazards recur across sync products: partially written files exposed after a crash, an upload feedback loop triggered by the client’s own writes, and mass-deletion propagation, where a folder unmounts, ransomware encrypts a tree, or a buggy plan deletes everything, and the engine faithfully replicates the disaster to the cloud and every device.

Decision

Adopt three non-negotiable safety guards:

  1. Atomic local apply: downloads are written to a temp file, fsync’d, and atomically renamed into place; the DB records “synced” only after the rename (apply-then-record). A partial file is never visible; a crash leaves only a discardable temp.
  2. Self-write suppression: record (path, expected hash) before applying a download so the resulting watcher event is absorbed, not re-uploaded (ADR-0025).
  3. Bulk-change circuit breaker (SafetyHold): if a plan would delete or overwrite more than a threshold (e.g. > N files or > X% of the synced set), run sanity checks (is the volume mounted? does this look like mass encryption?), then pause and require user confirmation with a preview. The guard is backed by server trash + version history (storage/07), so even a confirmed mistake is recoverable within the retention window.
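
A minimal Python sketch of how guards 1 and 2 might compose on the download path. It is an illustration under stated assumptions, not the engine's actual code: SuppressionTable, atomic_apply, and the mark_synced callback are hypothetical names, SHA-256 stands in for whatever content hash the engine uses, and the directory fsync that would make the rename itself durable on POSIX is omitted for brevity.

```python
import hashlib
import os
import tempfile


class SuppressionTable:
    """Hypothetical in-memory record of writes the client is about to make."""

    def __init__(self):
        self._expected = {}  # path -> expected content hash

    def expect(self, path, content_hash):
        self._expected[path] = content_hash

    def discard(self, path):
        self._expected.pop(path, None)

    def absorb(self, path, observed_hash):
        """Return True (once) if a watcher event matches a recorded self-write."""
        if self._expected.get(path) == observed_hash:
            del self._expected[path]
            return True
        return False


def atomic_apply(path, data, suppress, mark_synced):
    """Write data to path via temp file + fsync + rename, then record 'synced'."""
    content_hash = hashlib.sha256(data).hexdigest()
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename never
    # crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".sync-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # contents reach disk before the rename
        # Guard 2: register the expected hash before the file becomes visible.
        suppress.expect(path, content_hash)
        # Guard 1: os.replace swaps the complete file into place in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Failure before the rename completed: drop the suppression entry
        # and the discardable temp file, then re-raise.
        suppress.discard(path)
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Apply-then-record: the DB callback runs only after the rename succeeded.
    mark_synced(path, content_hash)
    return content_hash
```

In this sketch a watcher handler would hash the changed file and call absorb(path, observed_hash); a True result means the event came from the client's own download and should not be queued for upload. A crash between the rename and mark_synced leaves a complete file that is not yet recorded, which a later reconciliation pass can detect by hash.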

Consequences

Positive

Negative / costs

Alternatives considered

Scaling

Threshold checks are O(plan size) on already-computed plans (cheap); the brake only engages on rare large destructive plans, so steady-state sync is unaffected.
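
The threshold check can be sketched as a single pass over the plan. This is a hedged illustration only: Action, needs_safety_hold, and the default limits (100 files, 25%) are hypothetical placeholders for the N and X% that the decision leaves open, and the mount and mass-encryption sanity checks are out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str  # e.g. "delete", "overwrite", "create" (assumed vocabulary)
    path: str


DESTRUCTIVE = {"delete", "overwrite"}


def needs_safety_hold(plan, synced_count, max_files=100, max_fraction=0.25):
    """Return True if the plan exceeds either destructive-change threshold."""
    # One linear pass over the already-computed plan: O(plan size).
    destructive = sum(1 for action in plan if action.kind in DESTRUCTIVE)
    if destructive > max_files:
        return True
    # Skip the fraction test for an empty synced set to avoid dividing by zero.
    if synced_count > 0 and destructive / synced_count > max_fraction:
        return True
    return False
```

For example, a plan with 10 deletions against 1,000 synced files passes both limits, while the same plan against 20 synced files trips the fraction limit and would be held for confirmation.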