ADR-0024 — Cursor-based delta pull + lightweight push notification
- Status: Accepted
- Date: 2026-06-11 · Revised: 2026-06-12 (Architecture Freeze V1)
- Related: sync/07 protocol, ADR-0008, ADR-0006, ADR-0003
V1 Freeze (2026-06-12): Accepted. Aligned with ADR-0008: the journal’s monotonic seq is assigned transactionally at commit (gap-free, totally ordered per tenant by construction), so the “ordered journal delta” this protocol pulls is authoritative without depending on event ordering. In V1 the lossy notification is fanned out by the in-process bus; NATS fan-out arrives with ADR-0006’s P3 tier.
Context
Clients must learn about remote changes promptly without polling constantly, and must be
able to resume from any point after arbitrary offline periods. The change source of truth
is the server change journal (ADR-0008), whose per-tenant seq is assigned in the
commit transaction — so it is gap-free and totally ordered, and a seq > cursor pull is
exact. The realtime delivery tier must scale to many devices and tolerate
dropped/duplicated signals without risking correctness.
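As a minimal sketch of why a seq > cursor pull is exact, the following in-memory model assigns seq inside the same critical section that appends the journal entry. The names Journal, commit, and changes_after are hypothetical, and the lock only stands in for the database commit transaction described in ADR-0008; it is not the actual storage implementation.

```python
import threading
from collections import defaultdict


class Journal:
    """In-memory stand-in for the server change journal (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()      # stands in for the commit transaction
        self._entries = defaultdict(list)  # tenant -> [(seq, change), ...]

    def commit(self, tenant, change):
        # seq is assigned in the same critical section that appends the entry,
        # so a reader can never see seq N+1 before seq N exists (no gaps).
        with self._lock:
            seq = len(self._entries[tenant]) + 1
            self._entries[tenant].append((seq, change))
            return seq

    def changes_after(self, tenant, cursor):
        # Exact delta: every entry with seq > cursor, in seq order.
        with self._lock:
            return [(s, c) for s, c in self._entries[tenant] if s > cursor]


journal = Journal()
for name in ("a", "b", "c"):
    journal.commit("tenant-1", {"op": "create", "node": name})
journal.commit("tenant-2", {"op": "create", "node": "x"})

delta = journal.changes_after("tenant-1", 1)
```

Because each tenant has its own sequence, a cursor for one tenant says nothing about another; the pull for tenant-1 above is unaffected by the commit to tenant-2.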
Decision
Two channels, separating a lossy signal from an authoritative pull (the Dropbox pattern):
- Notification (lossy, low-latency): a “namespace advanced past your cursor” signal over a gRPC server-stream / WebSocket, with a longpoll fallback that blocks until a change arrives and then returns after a random jitter delay (to avoid a thundering herd). Fanned out from the event bus (in-process in V1; NATS at P3, ADR-0006). Carries no content, so loss/dup is harmless.
- Cursor delta pull (authoritative): GetChanges(cursor) returns the ordered journal delta (node-keyed create/update/move/delete) + a new cursor. Idempotent and exactly resumable; the cursor is persisted in the local DB (ADR-0023).
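The split between the two channels can be sketched as follows. This is an illustrative model, not the protocol definition: Server, Client, get_changes, on_notification, the page limit, and the has_more flag are all assumptions made for the example, and an integer stands in for the opaque cursor. The point it shows is that the notification only triggers a pull, so a lost or duplicated signal cannot change the result.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    journal: list = field(default_factory=list)  # [(seq, change)], seq = index + 1

    def commit(self, change):
        self.journal.append((len(self.journal) + 1, change))

    def get_changes(self, cursor, limit=2):
        # Ordered delta after the cursor, plus the cursor to use next time.
        page = [e for e in self.journal if e[0] > cursor][:limit]
        new_cursor = page[-1][0] if page else cursor
        has_more = new_cursor < len(self.journal)
        return page, new_cursor, has_more


@dataclass
class Client:
    server: Server
    cursor: int = 0
    tree: dict = field(default_factory=dict)  # node_id -> path
    pulls: int = 0

    def on_notification(self):
        # The signal carries no content; it only says "go pull".
        has_more = True
        while has_more:
            page, new_cursor, has_more = self.server.get_changes(self.cursor)
            self.pulls += 1
            for _seq, change in page:
                self.apply(change)
            self.cursor = new_cursor  # a real client would persist this durably

    def apply(self, change):
        # Node-keyed and idempotent: re-applying an entry leaves the tree unchanged.
        if change["op"] == "delete":
            self.tree.pop(change["node"], None)
        else:  # create / update / move all set the node's current path
            self.tree[change["node"]] = change["path"]


server = Server()
client = Client(server)
server.commit({"op": "create", "node": "n1", "path": "/a.txt"})
server.commit({"op": "move", "node": "n1", "path": "/docs/a.txt"})
server.commit({"op": "create", "node": "n2", "path": "/b.txt"})

# Suppose the first two signals were dropped and the third arrived twice.
client.on_notification()
client.on_notification()
```

The duplicate notification costs one empty pull and nothing else. The move of n1 is a single journal entry keyed by node, so the client updates one path rather than deleting and recreating the node.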
Cursors are opaque, monotonic encodings of the journal position. A pull with an
expired or invalid cursor returns a reset (409 / FAILED_PRECONDITION); the client
then does a full list, rebuilds the Remote tree, and diffs. Uploads use the storage
commit protocol with a base version for optimistic-concurrency conflict detection.
Protobuf is the single contract (ADR-0003), with a REST mirror for third parties.
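One way the opaque cursor and the reset path could fit together is sketched below. The encoding (base64 over a small JSON object), the CursorReset exception, and the oldest_retained_seq / head_seq parameters are assumptions for illustration; the ADR only requires that cursors be opaque and monotonic and that an expired or invalid one yields a reset.

```python
import base64
import json


class CursorReset(Exception):
    """Stands in for the 409 / FAILED_PRECONDITION reset response."""


def encode_cursor(tenant, seq):
    # Opaque to clients: they store and echo the token, never parse it.
    raw = json.dumps({"t": tenant, "s": seq}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(token, tenant, oldest_retained_seq, head_seq):
    try:
        data = json.loads(base64.urlsafe_b64decode(token.encode()))
        seq, owner = data["s"], data["t"]
    except (ValueError, KeyError, TypeError):
        raise CursorReset("malformed cursor") from None
    if owner != tenant or not isinstance(seq, int) or seq > head_seq:
        raise CursorReset("invalid cursor")
    # The delta starts at seq + 1; if that entry was already trimmed,
    # the journal can no longer produce an exact delta.
    if seq + 1 < oldest_retained_seq:
        raise CursorReset("cursor older than journal retention")
    return seq


def plan_sync(token, tenant, oldest_retained_seq, head_seq):
    # Client-side view: either resume with a delta pull or fall back to
    # full list -> rebuild Remote tree -> diff.
    try:
        return ("delta", decode_cursor(token, tenant, oldest_retained_seq, head_seq))
    except CursorReset:
        return ("full_resync", head_seq)


fresh = plan_sync(encode_cursor("t1", 40), "t1", 10, 50)
stale = plan_sync(encode_cursor("t1", 3), "t1", 10, 50)
garbage = plan_sync("not-a-cursor", "t1", 10, 50)
```

With entries 10 through 50 retained, the cursor at 40 resumes normally, while the cursor at 3 and the malformed token both take the full-resync path.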
Consequences
Positive
- Realtime tier is cheap and failure-tolerant (signal is lossy; pull is the truth) → scales to many devices.
- Resumable, idempotent delta pull → correct across offline gaps and at-least-once delivery.
- Reset path bounds journal-retention requirements; jitter tames reconnect storms.
- Node-keyed deltas mean renames/moves are one entry, not delete+create storms.
Negative / costs
- Two channels to build/operate; a connection/notifier tier for stream fan-out.
- Cursor reset is an expensive full resync for long-offline devices (bounded by retention).
Alternatives considered
- Pure polling: simple/correct but high latency or load. Kept as fallback.
- Server pushes full content: couples correctness to delivery, huge fan-out, breaks offline. Rejected — push only the signal.
- P2P gossip / vector clocks: unnecessary with a central journal (ADR-0022).
Scaling
Stateless notifier tier subscribed to NATS by tenant; longpoll for devices that can’t hold streams; jitter on wake; client-side debounced pulls coalesce notification bursts; journal retention + reset path handle long-offline devices.
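The jitter and client-side debounce mentioned above might look like the sketch below. jittered_delay_ms and PullDebouncer are hypothetical helpers, the window and spread values are arbitrary, and time is passed in as integer milliseconds so the example stays deterministic.

```python
import random


def jittered_delay_ms(base_ms, spread_ms, rng=random):
    # Spread wake-ups over [base, base + spread] so devices that reconnect
    # together do not all pull at the same instant.
    return base_ms + rng.uniform(0, spread_ms)


class PullDebouncer:
    """Coalesces a burst of notifications into a single pull."""

    def __init__(self, window_ms):
        self.window_ms = window_ms
        self._deadline = None

    def notify(self, now_ms):
        # Each signal in a burst pushes the pull out by one quiet window.
        self._deadline = now_ms + self.window_ms

    def due(self, now_ms):
        # True exactly once per burst, after the quiet window has elapsed.
        if self._deadline is not None and now_ms >= self._deadline:
            self._deadline = None
            return True
        return False


debouncer = PullDebouncer(window_ms=500)
for t in (0, 100, 200):  # burst of three notifications
    debouncer.notify(t)
pulls = sum(debouncer.due(t) for t in (300, 600, 700, 1000))

delay = jittered_delay_ms(100, 50, random.Random(0))
```

Three notifications within 200 ms result in one pull at the 700 ms check. A pure debounce like this can be starved by a continuous stream of signals, so a real client would likely also cap the total wait.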