10 — Tiered Storage & Lifecycle Policies

Topics: tiered storage, lifecycle policies. Storage cost is dominated by keeping rarely accessed bytes on expensive hot media. Tiering moves data down the cost/latency curve as it cools; lifecycle policies automate the whole journey from upload to expiry.

1. The tiers (mapped to provider classes)

Tier	Backing (per provider class)	Latency	Rel. $/GB	For
Hot	S3 Standard · GCS Standard · Azure Hot · MinIO SSD	ms	1×	active files, recent versions
Warm	S3 Standard-IA · GCS Nearline · Azure Cool	ms (higher $/req)	~0.5×	infrequently accessed
Cold	S3 Glacier Flexible Retrieval · GCS Coldline	min–hr restore (S3); ms (GCS)	~0.2×	old versions, archives
Frozen	S3 Glacier Deep Archive · GCS Archive · Azure Archive	hrs restore (S3, Azure); ms (GCS)	~0.04×	compliance retention

Tier availability is a provider capability (01): R2 has no archive tier, and MinIO tiers through ILM to a remote target. The Lifecycle Engine only proposes transitions to classes the placed provider supports; otherwise it migrates the pack to a provider that has the target tier (09).
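The capability check above can be sketched as a small planning function. This is a minimal illustration, not the engine's actual code: the provider names, the `CAPABILITIES` table, and `plan_transition` are all hypothetical, and a real capability table would come from the provider registry (01).

```python
from enum import Enum


class Tier(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"
    FROZEN = "frozen"


# Hypothetical capability table; provider names are placeholders, not real
# provider facts. Real entries would come from the capability registry (01).
CAPABILITIES = {
    "full_tiers": {Tier.HOT, Tier.WARM, Tier.COLD, Tier.FROZEN},
    "no_archive": {Tier.HOT, Tier.WARM},
}


def plan_transition(placed_on, target, capabilities=CAPABILITIES):
    """Decide how to reach `target` for a pack placed on `placed_on`.

    Returns ("transition", provider) when the placed provider supports the
    target class, ("migrate", provider) when another provider does, and
    ("skip", None) when no known provider offers the target tier.
    """
    if target in capabilities.get(placed_on, set()):
        return ("transition", placed_on)
    # Deterministic choice for the sketch: first supporting provider by name.
    for name in sorted(capabilities):
        if name != placed_on and target in capabilities[name]:
            return ("migrate", name)
    return ("skip", None)
```

Picking the first supporting provider by name is only a placeholder for whatever placement logic (09) would actually choose.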

2. Lifecycle policies (declarative, per tenant/folder)

A policy is a set of rules evaluated continuously against object/version metadata:

Rule type	Example	Acts on
Tier transition	no access 30d → Warm; 90d → Cold; 365d → Frozen	packs/chunks
Promote on access	cold read → restore + mark Hot for a window	chunks
Version expiry	keep-last-N / time-based (07)	versions
Trash purge	hard-delete trashed nodes after 30d	nodes → GC
Incomplete-upload abort	abort staging/MPU after 7d	staging (05)
Retention floor (WORM)	never transition/delete before `retain_until`	overrides all above

Policies are data (owned by Admin/Platform), evaluated by the Lifecycle Engine worker, which emits actions executed by the Migration worker (tiering) and GC (expiry/purge). Where a provider’s native lifecycle rules suffice (single-provider, simple age rules), we delegate to them to avoid moving bytes ourselves; we run our own engine for cross-provider, access-based, or finer-grained policies.
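The delegation decision described above could be expressed as a predicate over the shape of a policy. This is a sketch only: `PolicyShape` and `executor_for` are hypothetical names, and the three conditions simply restate the criteria in the paragraph (single provider, age-based only, no finer-grained rules).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyShape:
    providers: int       # number of providers the policy spans
    access_based: bool   # uses last-access or frequency, not just age
    per_version: bool    # finer-grained than whole-object age rules


def executor_for(policy: PolicyShape) -> str:
    """Pick who runs the policy: the provider's native rules or our engine."""
    simple = (
        policy.providers == 1
        and not policy.access_based
        and not policy.per_version
    )
    return "provider-native" if simple else "lifecycle-engine"
```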

3. Tiering lifecycle (state view)

stateDiagram-v2
    [*] --> Hot: written
    Hot --> Warm: no access > N1 days
    Warm --> Cold: no access > N2 days
    Cold --> Frozen: no access > N3 days (compliance)
    Warm --> Hot: accessed
    Cold --> Rehydrating: read request ([06])
    Frozen --> Rehydrating: read request (hours)
    Rehydrating --> Hot: restored copy (temporary window)
    Hot --> Expired: version/retention policy ([07])
    Warm --> Expired: version/retention policy
    Cold --> Expired: version/retention policy
    Expired --> [*]: unreferenced → GC ([11])
    note right of Frozen
        retain_until (WORM) blocks
        Expired until the floor passes
    end note

4. Pack by temperature (the insight that makes tiering work with packing)

Packing (02, 11) aggregates chunks into ~1 GiB packs. If a pack mixes hot and cold chunks, you cannot tier it efficiently: moving the pack to cold drags hot chunks down (slow reads), while keeping it hot wastes money on cold chunks.

Resolution: the packer groups chunks by access temperature and tenant, so a pack is homogeneously hot or cold. As chunks cool, the repacker moves cooled chunks out of hot packs into cold packs (this also reclaims dead space, 11). Tiering then operates cleanly at pack granularity. This co-design of packing + tiering is essential and easy to miss.
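A minimal sketch of temperature-aware packing, assuming each chunk already carries a temperature label. The `Chunk` type and `plan_packs` function are hypothetical, and the greedy fill is just one simple way to respect the ~1 GiB target.

```python
from collections import defaultdict
from dataclasses import dataclass

PACK_TARGET_BYTES = 1 << 30  # ~1 GiB, per the packing design (02, 11)


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    tenant: str
    temperature: str  # e.g. "hot" or "cold"; how it is derived is out of scope
    size: int         # bytes


def plan_packs(chunks, target=PACK_TARGET_BYTES):
    """Group chunks by (tenant, temperature), then fill packs greedily.

    Returns a list of ((tenant, temperature), [chunks]) so every pack is
    homogeneous in both tenant and temperature.
    """
    buckets = defaultdict(list)
    for chunk in chunks:
        buckets[(chunk.tenant, chunk.temperature)].append(chunk)

    packs = []
    for key, group in buckets.items():
        current, used = [], 0
        for chunk in group:
            # Close the current pack when adding this chunk would overflow it.
            if current and used + chunk.size > target:
                packs.append((key, current))
                current, used = [], 0
            current.append(chunk)
            used += chunk.size
        if current:
            packs.append((key, current))
    return packs
```

Because the grouping key includes temperature, a later tier transition can act on a whole pack without splitting it.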

5. Cost model & the recall tradeoff

Tiering trades storage cost for retrieval cost + latency:

Cold/Frozen are ~5–25× cheaper to store but charge per-GB retrieval and impose minutes-to-hours restore latency (06 §5), plus minimum storage durations (early-deletion fees).
Don’t tier too aggressively: moving data that will be read soon incurs restore fees + latency that dwarf the storage savings. Transitions are driven by observed access patterns (last-access, frequency), not just age, and respect minimum-duration economics.
Minimum object size for tiering: archival classes have per-object overhead/min sizes → another reason to tier packs (~1 GiB), never individual ~1 MiB chunks.
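
The tradeoff above can be checked with back-of-the-envelope arithmetic. The function below is a simplified model under stated assumptions: all prices are illustrative placeholders (not provider quotes), the early-deletion fee is modelled as paying the cold rate for the unused remainder of the minimum duration, and per-request charges are ignored.

```python
def net_saving_usd(size_gb, months_in_tier, expected_reads,
                   hot_price=0.023, cold_price=0.004,
                   retrieval_price=0.01, min_months=3):
    """Estimated saving from tiering a pack down, in USD (negative = loss).

    Prices are per GB (per month for storage); every default is a placeholder.
    """
    storage_saving = size_gb * (hot_price - cold_price) * months_in_tier
    recall_cost = expected_reads * size_gb * retrieval_price
    # Leaving the tier early still bills the rest of the minimum duration.
    early_exit_fee = size_gb * cold_price * max(0, min_months - months_in_tier)
    return storage_saving - recall_cost - early_exit_fee


# 1000 GB left untouched for a year: tiering pays off.
long_lived = net_saving_usd(1000, months_in_tier=12, expected_reads=0)
# Same data read twice and pulled back after one month: tiering loses money.
recalled_early = net_saving_usd(1000, months_in_tier=1, expected_reads=2)
```

With these placeholder prices the first case saves about 228 USD while the second loses about 9 USD, which is the "don't tier too aggressively" point in numbers.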

6. Tradeoffs / Alternatives / Scaling

Tradeoffs. Access-based tiering needs access tracking (last-access timestamps), which adds metadata write load; we sample and batch these updates rather than writing on every read. Aggressive tiering cuts storage cost but risks recall cost and latency, so it is tuned per tenant.
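
One way to sample and batch access tracking is sketched below. The `AccessTracker` class, its `sample_rate` parameter, and the flush-to-dict shape are assumptions for illustration; the point is that many reads collapse into at most one metadata write per pack per flush.

```python
import random


class AccessTracker:
    """Samples reads and coalesces them into one pending timestamp per pack."""

    def __init__(self, sample_rate=0.1, rng=None):
        self.sample_rate = sample_rate        # fraction of reads recorded
        self._rng = rng or random.Random()
        self._pending = {}                    # pack_id -> latest sampled access

    def record_read(self, pack_id, ts):
        # Drop most reads; an approximate last-access time is good enough.
        if self._rng.random() >= self.sample_rate:
            return
        previous = self._pending.get(pack_id)
        if previous is None or ts > previous:
            self._pending[pack_id] = ts

    def flush(self):
        """Return the coalesced batch for a single metadata write and reset."""
        batch, self._pending = self._pending, {}
        return batch
```

A caller would invoke `flush()` on a timer and write the returned mapping to the index in one batch.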

Alternatives considered.

Provider-native lifecycle only: zero engine to build, but limited to single-provider age-based rules; no cross-provider tiering, no access-based promotion, no per-version policy. Used where sufficient; insufficient as the whole story.
No tiering (everything hot): simplest, but at PB scale the cost is enormous; rejected for any non-trivial deployment.
Per-chunk tiering: too fine-grained (archival minimum object sizes, per-object metadata overhead) → tier at pack granularity instead.

Scaling concerns.

Policy evaluation over billions of objects → the Lifecycle Engine scans the (partitioned) index incrementally with watermarks, not a full sweep each run; candidates are surfaced by access-time/age indexes.
Access tracking write load → sampled + coalesced; exact last-access is not required, only “cooled enough”.
Restore stampedes → coalesced restores + keep-likely-read-warm policies (06).
Repack churn from temperature changes → hysteresis (only repack on sustained cooling) to avoid thrashing packs between tiers.

References

S3 storage classes & lifecycle: https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html
S3 Lifecycle transitions & constraints: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-transition-general-considerations.html
GCS storage classes: https://cloud.google.com/storage/docs/storage-classes
MinIO ILM / tiering: https://min.io/docs/minio/linux/administration/object-management/transition-objects-to-remote-tier.html