10 — Tiered Storage & Lifecycle Policies
Topics: tiered storage, lifecycle policies. Storage cost is dominated by keeping rarely-accessed bytes on expensive hot media. Tiering moves data down the cost/latency curve as it cools; lifecycle policies automate the whole journey from upload to expiry.
1. The tiers (mapped to provider classes)
| Tier | Backing (per provider class) | Latency | Rel. $/GB | For |
|---|---|---|---|---|
| Hot | S3 Standard · GCS Standard · Azure Hot · MinIO SSD | ms | 1× | active files, recent versions |
| Warm | S3 Standard-IA · GCS Nearline · Azure Cool | ms (higher $/req) | ~0.5× | infrequently accessed |
| Cold | S3 Glacier Flexible · GCS Coldline | min–hr restore (Glacier); ms (Coldline) | ~0.2× | old versions, archives |
| Frozen | S3 Glacier Deep Archive · GCS Archive · Azure Archive | hrs restore (S3, Azure); ms (GCS Archive) | ~0.04× | compliance retention |
Tier availability is a provider capability (01): R2 has no archive tier, and MinIO tiers through remote ILM. The Lifecycle Engine proposes transitions only to classes that the pack's current provider supports; otherwise it migrates the pack to a provider that has the target tier (09).
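The capability check could look roughly like the sketch below. The `CAPABILITIES` map, the `plan_transition` function, and the action tuples are hypothetical names chosen for illustration, and the tier sets are example values rather than an authoritative provider matrix.

```python
from typing import Dict, Set, Tuple

# Illustrative capability map (assumption): which tiers each provider can host.
# The source only states that R2 has no archive tier.
CAPABILITIES: Dict[str, Set[str]] = {
    "s3": {"hot", "warm", "cold", "frozen"},
    "azure": {"hot", "warm", "frozen"},
    "r2": {"hot", "warm"},
}

def plan_transition(
    provider: str,
    target_tier: str,
    capabilities: Dict[str, Set[str]] = CAPABILITIES,
) -> Tuple[str, str, str]:
    """Return (action, provider, tier) for a proposed tier change."""
    # Preferred path: the pack's current provider supports the target class.
    if target_tier in capabilities[provider]:
        return ("transition", provider, target_tier)
    # Otherwise pick another provider that has the tier (alphabetical order
    # here; a real placement decision would weigh cost, region, and so on).
    for candidate in sorted(capabilities):
        if target_tier in capabilities[candidate]:
            return ("migrate", candidate, target_tier)
    # No provider offers the tier: leave the pack where it is.
    return ("skip", provider, target_tier)
```

An in-place transition is always preferred because it avoids moving bytes between providers; migration is the fallback only when the capability is missing.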
2. Lifecycle policies (declarative, per tenant/folder)
A policy is a set of rules evaluated continuously against object/version metadata:
| Rule type | Example | Acts on |
|---|---|---|
| Tier transition | no access 30d → Warm; 90d → Cold; 365d → Frozen | packs/chunks |
| Promote on access | cold read → restore + mark Hot for a window | chunks |
| Version expiry | keep-last-N / time-based (07) | versions |
| Trash purge | hard-delete trashed nodes after 30d | nodes → GC |
| Incomplete-upload abort | abort staging/MPU after 7d | staging (05) |
| Retention floor (WORM) | never transition/delete before retain_until | overrides all above |
Policies are data (owned by Admin/Platform), evaluated by the Lifecycle Engine worker, which emits actions executed by the Migration worker (tiering) and GC (expiry/purge). Where a provider’s native lifecycle rules suffice (single-provider, simple age rules), we delegate to them to avoid moving bytes ourselves; we run our own engine for cross-provider, access-based, or finer-grained policies.
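As a rough sketch of how the Lifecycle Engine might evaluate the tier-transition rule together with the retention floor, consider the following. `PackMeta`, `TRANSITIONS`, and `evaluate` are hypothetical names; the 30/90/365-day thresholds are the example values from the table above. For simplicity the sketch returns the coldest tier whose threshold is met, whereas a real engine might step one tier at a time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple

@dataclass
class PackMeta:
    tier: str
    last_access: datetime
    retain_until: Optional[datetime] = None  # WORM floor, if any

# Idle thresholds in days, coldest first (example values from the table).
TRANSITIONS = [(365, "frozen"), (90, "cold"), (30, "warm")]
TIER_ORDER = ["hot", "warm", "cold", "frozen"]

def evaluate(meta: PackMeta, now: datetime) -> Optional[Tuple[str, str]]:
    """Return ("transition", tier) or None if no action applies."""
    # The retention floor overrides every other rule.
    if meta.retain_until is not None and now < meta.retain_until:
        return None
    idle = now - meta.last_access
    for days, target in TRANSITIONS:
        if idle >= timedelta(days=days):
            # Only ever move to a colder tier than the current one.
            if TIER_ORDER.index(target) > TIER_ORDER.index(meta.tier):
                return ("transition", target)
            return None
    return None
```

Because the rule set is plain data, changing thresholds per tenant or folder means swapping the `TRANSITIONS` list, not the evaluation code.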
3. Tiering lifecycle (state view)
stateDiagram-v2
[*] --> Hot: written
Hot --> Warm: no access > N1 days
Warm --> Cold: no access > N2 days
Cold --> Frozen: no access > N3 days (compliance)
Warm --> Hot: accessed
Cold --> Rehydrating: read request ([06])
Frozen --> Rehydrating: read request (hours)
Rehydrating --> Hot: restored copy (temporary window)
Hot --> Expired: version/retention policy ([07])
Warm --> Expired: version/retention policy
Cold --> Expired: version/retention policy
Expired --> [*]: unreferenced → GC ([11])
note right of Frozen
retain_until (WORM) blocks
Expired until the floor passes
end note
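The diagram can also be read as a transition table. The sketch below mirrors only the edges drawn between named states; the event names (`cooled`, `accessed`, `read`, `restored`, `expire`), `EDGES`, `step`, and `BlockedByRetention` are hypothetical, and timestamps are plain numbers for brevity.

```python
from typing import Dict, Optional, Tuple

class BlockedByRetention(Exception):
    """Raised when expiry is attempted before the retention floor passes."""

# (state, event) -> next state, copied from the edges in the diagram.
EDGES: Dict[Tuple[str, str], str] = {
    ("hot", "cooled"): "warm",
    ("warm", "cooled"): "cold",
    ("cold", "cooled"): "frozen",
    ("warm", "accessed"): "hot",
    ("cold", "read"): "rehydrating",
    ("frozen", "read"): "rehydrating",
    ("rehydrating", "restored"): "hot",
    ("hot", "expire"): "expired",
    ("warm", "expire"): "expired",
    ("cold", "expire"): "expired",
}

def step(
    state: str,
    event: str,
    retain_until: Optional[float] = None,
    now: Optional[float] = None,
) -> str:
    """Apply one event; reject edges the diagram does not contain."""
    # retain_until (WORM) blocks expiry until the floor has passed.
    if (
        event == "expire"
        and retain_until is not None
        and now is not None
        and now < retain_until
    ):
        raise BlockedByRetention(f"retained until {retain_until}")
    key = (state, event)
    if key not in EDGES:
        raise ValueError(f"no transition for {key}")
    return EDGES[key]
```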
4. Pack by temperature (the insight that makes tiering work with packing)
Packing (02, 11) aggregates chunks into ~1 GiB packs. If a pack mixes hot and cold chunks, you cannot tier it efficiently: moving the pack to cold drags the hot chunks down with it (slow reads), while keeping it hot wastes money on the cold chunks.
Resolution: the packer groups chunks by access temperature and tenant, so a pack is homogeneously hot or cold. As chunks cool, the repacker moves cooled chunks out of hot packs into cold packs (this also reclaims dead space, 11). Tiering then operates cleanly at pack granularity. This co-design of packing + tiering is essential and easy to miss.
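A minimal sketch of temperature-aware packing, assuming chunks arrive as `(chunk_id, tenant, idle_days, size_bytes)` tuples. The `temperature` thresholds and the `plan_packs` function are hypothetical; the real packer (02, 11) will have more inputs than idle time.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

PACK_TARGET_BYTES = 1 << 30  # ~1 GiB, the pack size quoted above

def temperature(idle_days: int, warm_after: int = 30, cold_after: int = 90) -> str:
    """Classify a chunk by idle time (threshold values are assumptions)."""
    if idle_days < warm_after:
        return "hot"
    if idle_days < cold_after:
        return "warm"
    return "cold"

Chunk = Tuple[str, str, int, int]  # (chunk_id, tenant, idle_days, size_bytes)

def plan_packs(
    chunks: Iterable[Chunk], target: int = PACK_TARGET_BYTES
) -> List[Tuple[Tuple[str, str], List[str]]]:
    """Group chunks by (tenant, temperature), then fill packs up to target."""
    buckets: Dict[Tuple[str, str], List[Tuple[str, int]]] = defaultdict(list)
    for chunk_id, tenant, idle_days, size in chunks:
        buckets[(tenant, temperature(idle_days))].append((chunk_id, size))

    packs = []
    for key, members in buckets.items():
        current: List[str] = []
        used = 0
        for chunk_id, size in members:
            # Close the pack when the next chunk would overflow the target.
            if current and used + size > target:
                packs.append((key, current))
                current, used = [], 0
            current.append(chunk_id)
            used += size
        if current:
            packs.append((key, current))
    return packs
```

Every pack produced this way carries a single `(tenant, temperature)` key, so a tier transition can be decided once per pack.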
5. Cost model & the recall tradeoff
Tiering trades storage cost for retrieval cost + latency:
- Cold/Frozen are ~5–25× cheaper to store but charge per-GB retrieval and impose minutes-to-hours restore latency (06 §5), plus minimum storage durations (early-deletion fees).
- Don’t tier too aggressively: moving data that will be read soon incurs restore fees + latency that dwarf the storage savings. Transitions are driven by observed access patterns (last-access, frequency), not just age, and respect minimum-duration economics.
- Minimum object size for tiering: archival classes have per-object overhead/min sizes → another reason to tier packs (~1 GiB), never individual ~1 MiB chunks.
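The tradeoff in the bullets above comes down to simple arithmetic. The sketch below compares keeping a pack hot against moving it to a colder class; the function name and all prices are hypothetical placeholders, not quotes from any provider's price list.

```python
def tiering_net_saving(
    size_gb: float,
    months_stored: float,
    hot_price: float,        # $/GB-month in the hot class
    cold_price: float,       # $/GB-month in the colder class
    retrieval_price: float,  # $/GB charged per restore
    expected_reads: float,   # full reads expected over the period
    min_months: float = 0.0, # minimum storage duration of the colder class
) -> float:
    """Positive result: tiering saves money. Negative: it costs more."""
    # Early deletion is billed as if the data stayed for the minimum duration.
    billed_months = max(months_stored, min_months)
    hot_cost = size_gb * hot_price * months_stored
    cold_cost = (
        size_gb * cold_price * billed_months
        + size_gb * retrieval_price * expected_reads
    )
    return hot_cost - cold_cost
```

With placeholder prices of 0.020 hot, 0.004 cold (the ~0.2× ratio from the tier table), and 0.03 per GB retrieved, 1,000 GB kept for 12 months with no reads saves about 192, while the same data read in full eight times loses about 48. A handful of recalls is enough to erase the storage saving.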
6. Tradeoffs / Alternatives / Scaling
Tradeoffs. Access-based tiering needs access tracking (last-access timestamps), which adds metadata write load; we sample and batch these updates rather than writing on every read. Aggressive tiering cuts storage cost but risks higher recall cost and latency, so thresholds are tuned per tenant.
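One way to sample and batch access tracking is sketched below. `AccessTracker`, its `sample_rate`, and the flush interface are hypothetical; the point is that many reads of the same pack collapse into at most one metadata write per flush.

```python
import random
from typing import Dict, Optional

class AccessTracker:
    """Sampled, coalesced last-access tracking (illustrative sketch)."""

    def __init__(self, sample_rate: float = 0.1, rng: Optional[random.Random] = None):
        self.sample_rate = sample_rate
        self.rng = rng or random.Random()
        # pack_id -> most recent sampled access time, not yet persisted
        self.pending: Dict[str, float] = {}

    def record_read(self, pack_id: str, ts: float) -> None:
        # Sampling: only a fraction of reads touch the tracker at all.
        if self.rng.random() < self.sample_rate:
            prev = self.pending.get(pack_id)
            # Coalescing: keep only the newest timestamp per pack.
            if prev is None or ts > prev:
                self.pending[pack_id] = ts

    def flush(self) -> Dict[str, float]:
        """Return the batch to persist in one write and reset the buffer."""
        batch, self.pending = self.pending, {}
        return batch
```

The recorded timestamp may lag the true last access, which is acceptable because the engine only needs to know that a pack has cooled enough, not the exact read time.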
Alternatives considered.
- Provider-native lifecycle only: zero engine to build, but limited to single-provider age-based rules; no cross-provider tiering, no access-based promotion, no per-version policy. Used where sufficient; insufficient as the whole story.
- No tiering (everything hot): simplest, but at PB scale the cost is enormous; rejected for any non-trivial deployment.
- Per-chunk tiering: too fine-grained (archival minimum sizes, per-object metadata overhead), so we tier at pack granularity.
Scaling concerns.
- Policy evaluation over billions of objects → the Lifecycle Engine scans the (partitioned) index incrementally with watermarks, not a full sweep each run; candidates are surfaced by access-time/age indexes.
- Access tracking write load → sampled + coalesced; exact last-access is not required, “cooled enough” is.
- Restore stampedes → coalesced restores + keep-likely-read-warm policies (06).
- Repack churn from temperature changes → hysteresis (only repack on sustained cooling) to avoid thrashing packs between tiers.
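The hysteresis in the last bullet could be as simple as requiring several consecutive "cold" observations before a chunk becomes a repack candidate. This is a sketch under that assumption; `CoolingHysteresis` and `required_streak` are hypothetical names and the streak length is arbitrary.

```python
from typing import Dict

class CoolingHysteresis:
    """Mark a chunk for repack only after sustained cooling."""

    def __init__(self, required_streak: int = 3):
        self.required_streak = required_streak
        # chunk_id -> consecutive evaluations in which the chunk looked cold
        self.streaks: Dict[str, int] = {}

    def observe(self, chunk_id: str, is_cold: bool) -> bool:
        """Record one evaluation; return True when the chunk should be repacked."""
        if not is_cold:
            # Any access resets the streak, so brief cooling never triggers a move.
            self.streaks.pop(chunk_id, None)
            return False
        streak = self.streaks.get(chunk_id, 0) + 1
        self.streaks[chunk_id] = streak
        return streak >= self.required_streak
```

A chunk that alternates between cold and warm readings never reaches the required streak, which is what keeps packs from thrashing between tiers.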
References
- S3 storage classes & lifecycle: https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html
- S3 Lifecycle transitions & constraints: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-transition-general-considerations.html
- GCS storage classes: https://cloud.google.com/storage/docs/storage-classes
- MinIO ILM / tiering: https://min.io/docs/minio/linux/administration/object-management/transition-objects-to-remote-tier.html