10 — Tiered Storage & Lifecycle Policies

Topics: tiered storage, lifecycle policies. Storage cost is dominated by keeping rarely-accessed bytes on expensive hot media. Tiering moves data down the cost/latency curve as it cools; lifecycle policies automate the whole journey from upload to expiry.


1. The tiers (mapped to provider classes)

| Tier | Backing (per provider class) | Latency | Rel. $/GB | For |
|---|---|---|---|---|
| Hot | S3 Standard · GCS Standard · Azure Hot · MinIO SSD | ms | 1× (baseline) | active files, recent versions |
| Warm | S3 Standard-IA · GCS Nearline · Azure Cool | ms (higher $/req) | ~0.5× | infrequently accessed |
| Cold | S3 Glacier Flexible Retrieval · GCS Coldline | min–hr (restore) | ~0.2× | old versions, archives |
| Frozen | S3 Glacier Deep Archive · GCS Archive · Azure Archive | hrs (restore) | ~0.04× | compliance retention |

Tier availability is a provider capability (01): R2 has no archive tier, and MinIO tiers via remote ILM. The Lifecycle Engine proposes a transition only to a class that the pack's current provider supports; otherwise it migrates the pack to a provider that offers the target tier (09).
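The capability check described above can be sketched as follows. This is a minimal illustration rather than the engine's actual code: `CAPABILITIES`, `Action`, and `propose_transition` are hypothetical names, and the capability sets are placeholders (only "R2 has no archive tier" comes from the text).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical capability map: provider -> tiers it can store natively.
# The sets are illustrative; only "R2 has no archive tier" is from the text.
CAPABILITIES = {
    "s3": {"hot", "warm", "cold", "frozen"},
    "r2": {"hot", "warm"},
}


@dataclass(frozen=True)
class Action:
    kind: str      # "transition" (same provider) or "migrate" (cross-provider)
    provider: str  # provider that will hold the pack afterwards
    tier: str


def propose_transition(provider: str, target_tier: str) -> Optional[Action]:
    """Propose an in-place class change if supported, else a migration."""
    if target_tier in CAPABILITIES.get(provider, set()):
        return Action("transition", provider, target_tier)
    # Fall back to the first provider (in name order) that offers the tier.
    for candidate in sorted(CAPABILITIES):
        if target_tier in CAPABILITIES[candidate]:
            return Action("migrate", candidate, target_tier)
    return None  # no provider supports the tier; leave the pack in place
```

A real placement decision would also weigh cost, region, and egress (09); the name-ordered fallback here only keeps the sketch deterministic.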


2. Lifecycle policies (declarative, per tenant/folder)

A policy is a set of rules evaluated continuously against object/version metadata:

| Rule type | Example | Acts on |
|---|---|---|
| Tier transition | no access 30d → Warm; 90d → Cold; 365d → Frozen | packs/chunks |
| Promote on access | cold read → restore + mark Hot for a window | chunks |
| Version expiry | keep-last-N / time-based (07) | versions |
| Trash purge | hard-delete trashed nodes after 30d | nodes → GC |
| Incomplete-upload abort | abort staging/MPU after 7d | staging (05) |
| Retention floor (WORM) | never transition/delete before retain_until | overrides all above |

Policies are data (owned by Admin/Platform), evaluated by the Lifecycle Engine worker, which emits actions executed by the Migration worker (tiering) and GC (expiry/purge). Where a provider’s native lifecycle rules suffice (single-provider, simple age rules), we delegate to them to avoid moving bytes ourselves; we run our own engine for cross-provider, access-based, or finer-grained policies.
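To make "policies are data" concrete, here is a minimal sketch of how one tier-transition rule and the retention floor might be evaluated for a single pack. The names `PackMeta`, `TIER_RULES`, and `evaluate` are hypothetical; the thresholds reuse the 30d/90d/365d example above, and a real engine would load them per tenant or folder.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Thresholds from the example rule: 30d -> Warm, 90d -> Cold, 365d -> Frozen.
# Listed longest-first so the deepest matching tier is chosen.
TIER_RULES = [(365, "frozen"), (90, "cold"), (30, "warm")]
TIER_ORDER = ["hot", "warm", "cold", "frozen"]


@dataclass
class PackMeta:
    tier: str
    last_access: datetime
    retain_until: Optional[datetime] = None  # WORM floor, if any


def evaluate(pack: PackMeta, now: datetime) -> Optional[str]:
    """Return the tier to transition to, or None if no action is due."""
    # The retention floor overrides every other rule.
    if pack.retain_until is not None and now < pack.retain_until:
        return None
    idle = now - pack.last_access
    for days, tier in TIER_RULES:
        if idle >= timedelta(days=days):
            # Only move down the cost curve; promotion happens on access.
            if TIER_ORDER.index(tier) > TIER_ORDER.index(pack.tier):
                return tier
            return None
    return None
```

The function only decides; in the design above the resulting action would be handed to the Migration worker rather than executed inline.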


3. Tiering lifecycle (state view)

```mermaid
stateDiagram-v2
    [*] --> Hot: written
    Hot --> Warm: no access > N1 days
    Warm --> Cold: no access > N2 days
    Cold --> Frozen: no access > N3 days (compliance)
    Warm --> Hot: accessed
    Cold --> Rehydrating: read request ([06])
    Frozen --> Rehydrating: read request (hours)
    Rehydrating --> Hot: restored copy (temporary window)
    Hot --> Expired: version/retention policy ([07])
    Warm --> Expired: version/retention policy
    Cold --> Expired: version/retention policy
    Expired --> [*]: unreferenced → GC ([11])
    note right of Frozen
        retain_until (WORM) blocks
        Expired until the floor passes
    end note
```
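The same state view can be written as a transition table, which is convenient for validating actions before they are executed. This is a sketch under stated assumptions: the event names (`cooled`, `accessed`, `read`, `restored`, `expired`) and the `next_state` function are hypothetical labels for the edges in the diagram.

```python
# Allowed transitions, transcribed from the state diagram.
# Event names are illustrative labels for the diagram's edges.
TRANSITIONS = {
    ("hot", "cooled"): "warm",
    ("warm", "cooled"): "cold",
    ("cold", "cooled"): "frozen",
    ("warm", "accessed"): "hot",
    ("cold", "read"): "rehydrating",
    ("frozen", "read"): "rehydrating",
    ("rehydrating", "restored"): "hot",
    ("hot", "expired"): "expired",
    ("warm", "expired"): "expired",
    ("cold", "expired"): "expired",
}


def next_state(state: str, event: str, retention_active: bool = False) -> str:
    """Return the state after `event`, rejecting edges not in the diagram."""
    if event == "expired" and retention_active:
        return state  # retain_until (WORM) blocks expiry until the floor passes
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state!r} on {event!r}") from None
```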

4. Pack by temperature (the insight that makes tiering work with packing)

Packing (02, 11) aggregates chunks into ~1 GiB packs. If a pack mixes hot and cold chunks, it cannot be tiered efficiently: moving the pack to cold drags the hot chunks down with it (slow reads), while keeping it hot wastes money on the cold chunks.

Resolution: the packer groups chunks by access temperature and tenant, so a pack is homogeneously hot or cold. As chunks cool, the repacker moves cooled chunks out of hot packs into cold packs (this also reclaims dead space, 11). Tiering then operates cleanly at pack granularity. This co-design of packing + tiering is essential and easy to miss.
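A minimal sketch of temperature-aware packing, assuming each chunk already carries a temperature label (how that label is derived is out of scope here). `Chunk`, `plan_packs`, and `PACK_TARGET_BYTES` are hypothetical names, and the greedy fill stands in for whatever bin-packing the real packer uses.

```python
from collections import defaultdict
from dataclasses import dataclass

PACK_TARGET_BYTES = 1 << 30  # ~1 GiB target pack size


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    tenant: str
    temperature: str  # e.g. "hot" or "cold"; classification is assumed given
    size: int         # bytes


def plan_packs(chunks, target=PACK_TARGET_BYTES):
    """Group chunks by (tenant, temperature), then fill packs greedily."""
    groups = defaultdict(list)
    for chunk in chunks:
        groups[(chunk.tenant, chunk.temperature)].append(chunk)

    packs = []  # list of ((tenant, temperature), [chunk, ...])
    for key in sorted(groups):
        current, used = [], 0
        for chunk in groups[key]:
            # Close the current pack before it would exceed the target size.
            if current and used + chunk.size > target:
                packs.append((key, current))
                current, used = [], 0
            current.append(chunk)
            used += chunk.size
        if current:
            packs.append((key, current))
    return packs
```

Because every pack is built from a single `(tenant, temperature)` group, a later tier transition can act on the whole pack without dragging along chunks of a different temperature.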


5. Cost model & the recall tradeoff

Tiering trades lower storage cost for higher retrieval cost and latency.
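One way to reason about this tradeoff is a break-even read rate: the fraction of stored bytes read back per month above which the colder tier stops being cheaper. The sketch below is a simplified model, not a pricing reference. It assumes hot-tier retrieval is free and ignores request fees, minimum-storage-duration charges, and restore latency. All prices are placeholders in arbitrary relative units, and the function names are hypothetical.

```python
def monthly_cost(size_gb, storage_price, retrieval_price, read_fraction):
    """Storage plus expected retrieval cost for one month.

    read_fraction is the share of stored bytes read back per month.
    Prices are per GB, in the same (arbitrary) units.
    """
    return size_gb * (storage_price + read_fraction * retrieval_price)


def break_even_read_fraction(hot_price, cold_price, cold_retrieval_price):
    """Monthly read fraction above which the colder tier costs more.

    Assumes hot-tier retrieval is free, so the saving per GB is
    (hot_price - cold_price) and each GB read costs cold_retrieval_price.
    """
    return (hot_price - cold_price) / cold_retrieval_price
```

For example, with a relative hot price of 1.0, a cold price of 0.2 (the ratio from the tier table), and an assumed retrieval price of 2.0 per GB, the break-even is 0.4: a pack whose bytes are read back more than about 40% per month would be cheaper left hot under this model.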


6. Tradeoffs / Alternatives / Scaling

Tradeoffs. Access-based tiering needs access tracking (last-access timestamps), which adds metadata write load; we sample and batch these updates rather than writing on every read. Aggressive tiering cuts storage cost but raises the risk of recall cost and latency, so thresholds are tuned per tenant.
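The sample-and-batch approach to access tracking might look like the sketch below. `AccessTracker` and its parameters are hypothetical: reads are sampled at a configurable rate, repeated reads of the same chunk are coalesced to the newest timestamp, and the metadata store is written only when a batch fills or `flush()` is called.

```python
import random


class AccessTracker:
    """Buffers sampled last-access times and writes them in batches."""

    def __init__(self, flush, sample_rate=0.1, batch_size=100, rng=random.random):
        self._flush = flush              # callable taking {chunk_id: timestamp}
        self._sample_rate = sample_rate  # probability that a read is recorded
        self._batch_size = batch_size    # distinct chunks buffered before a write
        self._rng = rng                  # injectable for deterministic tests
        self._pending = {}

    def record(self, chunk_id, timestamp):
        if self._rng() >= self._sample_rate:
            return  # read not sampled; no metadata write
        # Coalesce: keep only the newest timestamp per chunk.
        prev = self._pending.get(chunk_id)
        if prev is None or timestamp > prev:
            self._pending[chunk_id] = timestamp
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self):
        """Write the buffered batch, if any, and start a new one."""
        if self._pending:
            batch, self._pending = self._pending, {}
            self._flush(batch)
```

Sampling makes last-access times approximate, which is usually acceptable when tier thresholds are measured in days; a periodic `flush()` bounds how stale the stored timestamps can get.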

Alternatives considered.

Scaling concerns.

References