06 — Downloads & Reconstruction

Topic: download flows. Downloads are where the chunked/deduped/packed storage model meets the user’s expectation of “give me my file, fast.” The central tension: storage is optimized for dedup (many small chunks, possibly packed), but download wants few, large, sequential reads. This doc resolves it.

1. Resolution pipeline (version → bytes)

sequenceDiagram
    autonumber
    participant C as Client
    participant GW as Gateway
    participant SH as Sharing/Authz
    participant S as Storage Coordinator
    participant DB as Manifest + Chunk/Pack Index
    participant O as Object Store
    C->>GW: GET /v1/files/{id}/content (Range optional)
    GW->>SH: CheckAccess(principal, node, read)
    SH-->>GW: allow
    GW->>S: ResolveDownload(version, range?)
    S->>DB: content_hash → manifest → chunk refs → locations + tiers
    DB-->>S: [{chunk, pack_id|object, offset, len, tier}]
    alt any chunk on cold tier
        S-->>C: 202 Accepted (rehydrating) — notify when ready (§5)
    else all online
        S-->>C: plan = presigned GET(s) + ranges (or single URL)
        C->>O: GET bytes (direct, presigned, Range)
        O-->>C: bytes
        C->>C: BLAKE3-verify each chunk/range (04)
    end

Authz is resolved before any URL is issued; presigned URLs are scoped and short-TTL (ADR-0011). Bytes flow client ⇄ provider, not through our compute.
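The online/cold branch in the diagram can be sketched as follows. This is a minimal illustration, not the coordinator's actual API: `ChunkLocation`, `Resolution`, `resolve_download`, and the `"hot"`/`"cold"` tier labels are hypothetical names chosen to mirror the `[{chunk, pack_id|object, offset, len, tier}]` rows returned by the index.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ChunkLocation:
    chunk: str    # chunk hash
    obj: str      # pack_id or standalone object key
    offset: int   # byte offset of the chunk inside obj
    length: int   # chunk length in bytes
    tier: str     # "hot" or "cold" (assumed labels)


@dataclass(frozen=True)
class Resolution:
    rehydrating: bool                        # True -> answer 202 Accepted
    reads: Tuple[Tuple[str, int, int], ...]  # (obj, first_byte, last_byte)
    cold_chunks: Tuple[str, ...]             # chunks that need a restore


def resolve_download(locations: List[ChunkLocation]) -> Resolution:
    cold = tuple(loc.chunk for loc in locations if loc.tier == "cold")
    if cold:
        # Any cold chunk blocks the whole plan; no URLs are issued yet.
        return Resolution(True, (), cold)
    reads = tuple(
        (loc.obj, loc.offset, loc.offset + loc.length - 1) for loc in locations
    )
    return Resolution(False, reads, ())
```

Presigning is deliberately left out: in this sketch each `(obj, first_byte, last_byte)` triple would be turned into a presigned GET plus a `Range` header by whatever signing layer the provider offers.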

2. Three download shapes (chosen by how the file was stored + who’s asking)

File / client	How served	Reads
Small / whole-stored (≤ chunk threshold, 05)	single presigned GET	1
Large, chunked → smart client (CLI/sync/mobile)	client fetches only missing chunks, reconstructs locally per manifest	N (deduped)
Large, chunked → browser / simple	range reads over packs, reassembled (see §4)	N or streamed

The small-file fast path is why we don’t chunk small files (05 §7): the majority of downloads become a single direct GET with zero reconstruction.
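The table reduces to a two-question decision, sketched below. The names (`Shape`, `choose_shape`, `missing_chunks`) are hypothetical, and the chunk threshold is passed in as a parameter because its value is defined elsewhere (05), not here.

```python
from enum import Enum
from typing import Iterable, List


class Shape(Enum):
    SINGLE_GET = "single presigned GET"
    CLIENT_RECONSTRUCT = "client fetches missing chunks, reconstructs locally"
    RANGE_READS = "range reads over packs, reassembled"


def choose_shape(size_bytes: int, chunk_threshold: int, smart_client: bool) -> Shape:
    # Files at or below the threshold were stored whole, so one GET suffices.
    if size_bytes <= chunk_threshold:
        return Shape.SINGLE_GET
    return Shape.CLIENT_RECONSTRUCT if smart_client else Shape.RANGE_READS


def missing_chunks(manifest: Iterable[str], local: Iterable[str]) -> List[str]:
    """Chunk hashes a smart client still needs, in manifest order, each once."""
    seen = set(local)
    needed = []
    for chunk_hash in manifest:
        if chunk_hash not in seen:
            seen.add(chunk_hash)   # a repeated chunk is fetched only once
            needed.append(chunk_hash)
    return needed
```

For example, a client that already holds chunk `b` and is asked for manifest `[a, b, a, c]` fetches only `a` and `c`, which is where the "N (deduped)" read count in the table comes from.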

3. Reading packed chunks (range reads)

When a chunk lives inside a ~1 GiB pack (02, 11), we don’t download the pack — we issue a presigned GET on the pack object and the client sends a Range: bytes=offset-(offset+len-1) header for exactly that chunk. The Pack Index supplies (pack_id, offset, len).

Coalescing: if several needed chunks are contiguous within the same pack (common, because the packer groups chunks of the same object), they are fetched in one ranged GET spanning them — turning N requests into 1. This is a major download-efficiency lever and a reason packing helps reads, not just storage.
Verification: BLAKE3 verified streaming lets the client verify each chunk’s range independently even within a coalesced read (04).
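Coalescing can be sketched as a sort-and-merge over the Pack Index rows. This is an illustrative sketch under assumptions: `coalesce` and `range_header` are hypothetical helper names, and only exactly adjacent or overlapping chunks are merged (a production planner might also bridge small gaps).

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def coalesce(refs: Iterable[Tuple[str, int, int]]) -> List[Tuple[str, int, int]]:
    """Merge (pack_id, offset, length) refs into (pack_id, first, last) reads.

    Chunks that touch or overlap within the same pack become one ranged GET.
    """
    by_pack = defaultdict(list)
    for pack_id, offset, length in refs:
        by_pack[pack_id].append((offset, length))

    reads = []
    for pack_id, spans in by_pack.items():
        spans.sort()
        start, end = spans[0][0], spans[0][0] + spans[0][1]  # end is exclusive
        for offset, length in spans[1:]:
            if offset <= end:
                # Contiguous (or duplicate) chunk: extend the current read.
                end = max(end, offset + length)
            else:
                reads.append((pack_id, start, end - 1))
                start, end = offset, offset + length
        reads.append((pack_id, start, end - 1))
    return reads


def range_header(first: int, last: int) -> str:
    # HTTP byte ranges are inclusive on both ends.
    return f"bytes={first}-{last}"
```

The client still needs the original per-chunk `(offset, len)` pairs to slice a coalesced response back into chunks for verification and to place them in manifest order; this sketch only plans the reads.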

4. The reconstruction-location decision (browser problem)

Reassembling many chunks is easy for a smart client (it has the bytes locally and wants chunks anyway). A browser downloading a large chunked file is the hard case. Options:

Option	How	Bytes through our compute?	Verdict
Store small files whole	no reconstruction for the common case	no	✅ default; eliminates most of the problem
Service-worker reassembly	browser fetches chunks via presigned URLs, a service worker concatenates into the download stream	no	✅ preferred for large chunked files in modern browsers
Streaming reconstructor	a thin stateless endpoint streams pack ranges → client, concatenated server-side	yes (bounded, streamed)	⚠️ fallback only; rate-limited, CDN-fronted
Pre-materialized whole object	keep a coalesced whole copy for hot/large files	no (extra storage)	⚠️ for frequently browser-downloaded large files

Recommendation / tradeoff: default to store-small-whole + service-worker reassembly; use the streaming reconstructor only as a compatibility fallback (it reintroduces compute egress, so it is bounded and metered). This keeps the dedup storage model without paying reconstruction cost on the common path. (Magic Pocket likewise reconstructs files from blocks; the key is to keep that off the hot, compute-bound path.)
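Service-worker reassembly and the streaming reconstructor share the same core loop: emit chunk bytes in manifest order without buffering the whole file. The sketch below shows that loop in Python for illustration only (a real service worker would be JavaScript); `reassemble` and the `fetch_range` callback are hypothetical names, and chunk verification is omitted.

```python
from typing import Callable, Iterable, Iterator, Tuple


def reassemble(
    manifest: Iterable[Tuple[str, int, int]],
    fetch_range: Callable[[str, int, int], bytes],
) -> Iterator[bytes]:
    """Yield file bytes chunk by chunk, in manifest order.

    manifest: (object_key, offset, length) per chunk, in file order.
    fetch_range: returns bytes first..last (inclusive) of an object.
    """
    for obj, offset, length in manifest:
        data = fetch_range(obj, offset, offset + length - 1)
        if len(data) != length:
            # A short read would silently corrupt the output stream.
            raise IOError(f"short read from {obj}: got {len(data)}, want {length}")
        yield data
```

Because the function is a generator, memory use is bounded by one chunk at a time, which matches the "bounded, streamed" property asked of the fallback endpoint.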

5. Cold-tier reads (rehydration)

A chunk on an archival tier (Glacier/Archive/Coldline) is not immediately readable — recall takes minutes to hours (10).

flowchart TB
    classDef c fill:#fde68a,stroke:#b45309,color:#111827;
    classDef w fill:#fed7aa,stroke:#c2410c,color:#111827;
    req["Download hits a cold chunk"]:::c --> r["Initiate provider restore<br/>(mark chunk rehydrating)"]:::w
    r --> p["Poll / await restore callback"]:::w
    p --> ready["Chunk temporarily on a hot tier"]:::c
    ready --> serve["Issue presigned GET"]:::c

The API returns 202 with a job reference and notifies the user (event/webhook, 04 contexts) when the content is ready, rather than blocking the request.
Restore copies are temporary; lifecycle returns them to cold after a window.
Predictive rehydration (warming a version’s chunks when a user opens its folder) is a future optimization; the manifest already allows for it, because all of a version’s chunks are known up front.
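The 202-plus-job behaviour can be sketched as a small tracker that fires one notification once every cold chunk of a download has been restored. `RehydrationJob`, `chunk_restored`, and the `notify` callback are hypothetical names; how restore completion is detected (poll or provider callback) is outside this sketch.

```python
from typing import Callable, Iterable


class RehydrationJob:
    """Tracks the cold chunks one download is waiting on."""

    def __init__(self, job_id: str, cold_chunks: Iterable[str],
                 notify: Callable[[str], None]) -> None:
        self.job_id = job_id
        self.pending = set(cold_chunks)
        self.ready = False
        self._notify = notify  # e.g. emits the event/webhook

    def chunk_restored(self, chunk: str) -> bool:
        """Record a finished restore; notify once when nothing is pending."""
        self.pending.discard(chunk)
        if not self.pending and not self.ready:
            self.ready = True
            self._notify(self.job_id)
        return self.ready
```

Once `ready` is true, the download is resolved again and presigned GETs are issued against the temporary hot copies.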

6. Caching & CDN (content addressing’s payoff)

Immutable, content-addressed objects are the ideal cache citizens:

Perfect cache key: the hash is the key; no invalidation logic — content never changes under a key, so TTLs can be effectively infinite.
CDN in front of object storage for public links and hot reads, with signed CDN URLs; the CDN caches pack/chunk objects by key.
Edge dedup: because the same chunk underlies many files/versions, a cached hot chunk serves many logical downloads.
Authz vs cache tension: private content must not be cached unauthenticated → presigned/signed-URL TTLs + per-tenant cache keying; public-link content is freely cacheable. The two are kept distinct at the URL/permission layer.
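One way to keep the public and private cache namespaces distinct is to derive the cache key from the content hash plus an optional tenant. This is a sketch of the idea only; `cache_key` and the `public/` and `tenant/` prefixes are hypothetical, not an existing key scheme.

```python
from typing import Optional


def cache_key(content_hash: str, tenant_id: Optional[str] = None) -> str:
    """Cache key for an immutable, content-addressed object.

    Public-link content shares one namespace; private content is keyed
    per tenant so one tenant's cached bytes never answer another's request.
    """
    if tenant_id is None:
        return f"public/{content_hash}"
    return f"tenant/{tenant_id}/{content_hash}"
```

The cost of this choice is that edge dedup for private content only applies within a tenant: the same chunk requested by two tenants occupies two cache entries.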

7. Tradeoffs / Alternatives / Scaling

Tradeoffs. The manifest indirection adds one metadata lookup per download (cached, 08). Reconstruction adds client work for large chunked files — bounded by storing small files whole and coalescing pack ranges.

Alternatives.

Always reconstruct server-side: simplest client, but it incurs the R5 compute-egress cost at scale; rejected except as a fallback.
Never chunk (store every file whole): trivial downloads, but forfeits dedup and delta-sync — rejected; the small-file-whole threshold captures most of the download simplicity without losing dedup on the data that matters (large/edited/shared).
Client-side decryption for E2E: would move keys to clients and break CDN caching of plaintext — out of scope (ADR-0014).

Scaling concerns.

Request amplification: a naive chunked download = N provider GETs. Mitigated by pack-range coalescing (§3), CDN edge caching (§6), and storing small files whole.
Manifest read hotspots for popular files → cache manifests in Redis; manifests are immutable, so caching is safe and the TTL can be effectively infinite.
Cold-recall stampedes (many users open an archived dataset) → coalesce restore requests per chunk (one restore serves all waiters), rate-limit, and prefer tiering policies that keep likely-read data warm (10).
Egress cost is the dominant download cost at scale → CDN offload, in-region serving, and provider selection by egress price (Placement, 09).
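The cold-recall stampede mitigation (one restore serves all waiters) can be sketched as a single-flight map keyed by chunk. `RestoreCoalescer` and its `start_restore` callback are hypothetical; the sketch is in-memory and single-process, whereas a real deployment would need shared state and rate limiting.

```python
from typing import Callable, Dict, List


class RestoreCoalescer:
    """Issue at most one provider restore per chunk, however many jobs wait."""

    def __init__(self, start_restore: Callable[[str], None]) -> None:
        self._start_restore = start_restore
        self._waiters: Dict[str, List[str]] = {}  # chunk -> waiting job ids

    def request(self, chunk: str, job_id: str) -> bool:
        """Register a waiter; returns True if this call started the restore."""
        first = chunk not in self._waiters
        self._waiters.setdefault(chunk, []).append(job_id)
        if first:
            self._start_restore(chunk)
        return first

    def completed(self, chunk: str) -> List[str]:
        """Restore finished: return (and clear) the jobs to wake up."""
        return self._waiters.pop(chunk, [])
```

With this shape, a thousand users opening the same archived dataset produce one restore request per chunk rather than a thousand.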

References

HTTP Range requests: https://developer.mozilla.org/docs/Web/HTTP/Range_requests
S3 Glacier restore (rehydration latency): https://docs.aws.amazon.com/AmazonS3/latest/userguide/restoring-objects.html
Dropbox block reconstruction: https://dropbox.tech/infrastructure/inside-the-magic-pocket