09 — Storage Federation & Placement

Topic: storage federation. Federation presents many providers, regions, buckets, and tiers as one logical store. A Placement Service decides where each unit of data lives, and a location map records where every unit currently is. Decision in ADR-0020.

1. What federation buys (and the one thing it requires)

Federation lets one BitVault deployment span MinIO + S3 + R2 + GCS + Azure across regions. It buys:

Data residency / sovereignty — pin a tenant’s bytes to a region/jurisdiction (GDPR).
Cost arbitrage — place cold data where storage is cheapest; serve where egress is cheapest.
Durability — optional cross-provider redundancy (04).
No vendor lock-in — drain a provider entirely (provider exit) because data is portable by content hash.
Latency — place hot data near users.

The one thing it requires: a reliable location map. There is no global provider namespace; the system must always know which provider/bucket/key holds a given chunk/pack. That map is the Chunk/Pack Index (08) — federation is fundamentally a metadata capability, with the abstraction (01) as the execution layer.

2. Placement granularity: per-pack (mostly), per-chunk (when standalone)

Where does location live?

Granularity	Location stored on	Pros	Cons
Per-chunk	every chunk row	maximal flexibility	location columns × 10^9 chunks = index bloat
Per-pack (default)	pack row; chunks inherit	one location row per ~hundreds of chunks	move = move whole pack
Per-tenant/bucket	tenant config	tiny metadata	coarse; can’t optimize hot/cold per object

Decision: location is recorded per-pack for packed chunks (the common case), and per-chunk only for standalone (hot/large/unpacked) chunks. This keeps the location map ~100× smaller than per-chunk-everywhere while retaining object-level flexibility for the data that needs it. Tiering/migration then operate at pack granularity (move a pack = relocate all its chunks with one index update).
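The inheritance rule can be sketched in a few lines. This is a minimal illustration, not the actual index schema: `Location`, `PackRow`, `ChunkRow`, `resolve`, and all provider, region, and bucket names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Location:
    provider: str
    region: str
    bucket: str


@dataclass
class PackRow:
    location: Location  # the single location row shared by every chunk in the pack


@dataclass
class ChunkRow:
    pack_id: Optional[str] = None        # set when the chunk lives inside a pack
    location: Optional[Location] = None  # set only for standalone chunks


def resolve(chunk: ChunkRow, packs: Dict[str, PackRow]) -> Location:
    """Standalone chunks carry their own location; packed chunks inherit the pack's."""
    if chunk.location is not None:
        return chunk.location
    if chunk.pack_id is None:
        raise LookupError("chunk has neither a pack nor a standalone location")
    return packs[chunk.pack_id].location


# Hypothetical data: two packed chunks and one standalone chunk.
packs = {"pack-1": PackRow(Location("s3", "eu-a", "bv-eu"))}
chunks = {
    "c1": ChunkRow(pack_id="pack-1"),
    "c2": ChunkRow(pack_id="pack-1"),
    "c3": ChunkRow(location=Location("r2", "eu-b", "bv-hot")),
}

# Moving the pack is one index update; both packed chunks follow it.
packs["pack-1"].location = Location("gcs", "eu-c", "bv-eu-cold")
```

After the last line, `resolve` returns the new location for `c1` and `c2` without touching their rows, while `c3` keeps its own.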

3. The Placement Service (policy → decision)

The Placement Service is a stateless policy engine. It is consulted on the write path when a new pack (or standalone chunk) needs a home, and by the migration worker when rebalancing.

flowchart TB
    classDef in fill:#dbeafe,stroke:#1e40af,color:#111827;
    classDef eng fill:#fde68a,stroke:#b45309,color:#111827;
    classDef out fill:#bbf7d0,stroke:#15803d,color:#111827;
    subgraph INPUTS["Placement inputs"]
      res["Tenant residency / sovereignty"]:::in
      dur["Durability class (1× / cross-provider)"]:::in
      cost["Provider cost: $/GB store + $/GB egress"]:::in
      lat["User region / latency"]:::in
      cap["Capacity & provider health"]:::in
      tier["Target tier (hot/warm/cold)"]:::in
    end
    eng["Placement Service<br/>(policy evaluation, default-deny on residency)"]:::eng
    INPUTS --> eng
    eng --> dec["Decision: {provider, region, bucket, tier, redundancy}"]:::out
    dec --> idx["Recorded in Pack/Chunk Index"]:::out

Residency is a hard constraint (default-deny): a tenant pinned to eu never has bytes placed outside eu, regardless of cost. Compliance trumps optimization.
Cost & latency are soft optimizations within the residency-allowed set.
Placement policy is config/data owned by Admin/Platform; the engine is stateless and cacheable.
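A minimal sketch of the "hard filter, then soft score" evaluation described above. The `Candidate` fields, the `place` function, the weights, and all prices and latencies are assumptions made for illustration; they are not the real policy schema or real provider prices.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Candidate:
    provider: str
    jurisdiction: str
    usd_per_gb_month: float
    latency_ms: float
    healthy: bool = True


class NoAllowedPlacement(Exception):
    """Raised instead of falling back to a non-compliant location."""


def place(
    allowed_jurisdictions: FrozenSet[str],
    candidates: Iterable[Candidate],
    cost_weight: float = 1.0,
    latency_weight: float = 0.001,
) -> Candidate:
    # Hard constraint first: anything outside the allowed set is never scored.
    # An empty allowed set admits nothing (default-deny).
    eligible = [
        c for c in candidates
        if c.healthy and c.jurisdiction in allowed_jurisdictions
    ]
    if not eligible:
        raise NoAllowedPlacement("no healthy provider satisfies residency")
    # Soft optimization: lowest weighted score among compliant candidates.
    return min(
        eligible,
        key=lambda c: cost_weight * c.usd_per_gb_month + latency_weight * c.latency_ms,
    )


# Illustrative numbers only, not real provider prices.
CANDIDATES = [
    Candidate("cheap-us", "us", usd_per_gb_month=0.004, latency_ms=20),
    Candidate("s3-eu", "eu", usd_per_gb_month=0.023, latency_ms=20),
    Candidate("minio-eu", "eu", usd_per_gb_month=0.010, latency_ms=25),
]
choice = place(frozenset({"eu"}), CANDIDATES)
```

Here `cheap-us` has the best score overall but is never considered for an eu-pinned tenant, so `choice` is an eu candidate. Raising an error when nothing is eligible, rather than picking the least-bad option, is what makes the constraint default-deny.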

4. Read routing & migration (content hash makes both safe)

Read routing: download resolve (06) reads the location from the index → picks the provider adapter → presigns. If cross-provider redundancy exists, route to the cheapest-egress / lowest-latency copy.

Online migration (rebalance, cost-optimize, provider exit, region move):

sequenceDiagram
    autonumber
    participant M as Migration Worker
    participant SRC as Source provider
    participant DST as Dest provider
    participant IX as Pack Index
    M->>SRC: read pack bytes
    M->>DST: write pack (same content)
    M->>DST: Head + BLAKE3 verify == content hash
    M->>IX: CAS location SRC→DST (dual-listed during cutover)
    Note over M,IX: reads now served from DST — SRC kept until grace
    M->>SRC: delete old pack (after grace, GC)

Migration is safe because content is addressed by hash: the destination copy is provably the same bytes (verified by hash), and the index flip is a compare-and-swap (CAS), so it cannot silently overwrite a concurrent update. Reads can be dual-sourced during cutover (try DST, fall back to SRC), so migration never causes a read miss. Provider exit = migrate every pack off a provider, then remove it.
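The copy, verify, CAS, and dual-sourced read steps can be sketched with in-memory stand-ins. Everything here is hypothetical: `Provider`, `PackIndex`, `migrate`, `finish`, and `read` are illustrative names, not the real adapter or index API. The design names BLAKE3; this sketch substitutes BLAKE2b from the standard library to stay dependency-free.

```python
import hashlib
from typing import Dict, List


def content_hash(data: bytes) -> str:
    # Stand-in hash: the design specifies BLAKE3, which is not in the Python
    # standard library, so this sketch uses BLAKE2b instead.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


class Provider:
    """In-memory stand-in for a provider adapter (hypothetical interface)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: Dict[str, bytes] = {}


class PackIndex:
    """Maps pack hash -> ordered list of provider names holding a copy."""

    def __init__(self) -> None:
        self.locations: Dict[str, List[str]] = {}

    def cas(self, pack: str, expected: List[str], new: List[str]) -> bool:
        # A real index would do this as one conditional update in the database.
        if self.locations.get(pack) != expected:
            return False
        self.locations[pack] = list(new)
        return True


def migrate(pack: str, src: Provider, dst: Provider, index: PackIndex) -> bool:
    dst.objects[pack] = src.objects[pack]        # copy the bytes
    if content_hash(dst.objects[pack]) != pack:  # verify the destination copy
        del dst.objects[pack]
        raise ValueError("destination copy does not match content hash")
    # Flip to DST but keep SRC listed as a fallback during the grace period.
    return index.cas(pack, [src.name], [dst.name, src.name])


def finish(pack: str, src: Provider, dst: Provider, index: PackIndex) -> bool:
    # After the grace period: drop SRC from the index first, then delete its bytes.
    if not index.cas(pack, [dst.name, src.name], [dst.name]):
        return False
    src.objects.pop(pack, None)
    return True


def read(pack: str, index: PackIndex, providers: Dict[str, Provider]) -> bytes:
    # Dual-sourced read: try listed copies in order, fall through on a miss.
    for name in index.locations[pack]:
        data = providers[name].objects.get(pack)
        if data is not None:
            return data
    raise KeyError(pack)
```

In this sketch the source bytes are deleted only after the index no longer lists the source, so a reader never follows an index entry to a copy that has already been removed. If the CAS in `migrate` loses a race, the function returns `False` and the unreferenced destination copy would be left for garbage collection.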

5. Egress & data-gravity (the cost that dominates at scale)

Never move bulk bytes cross-provider/cross-region on the hot path. Reads are served from a co-located/cheapest copy; maintenance (scrub, pack, migrate) runs co-located with the data (in-region/in-cluster compute) to minimize egress.
Egress asymmetry: ingress is usually free, while egress is expensive and varies widely between providers (R2 notably charges no egress fees). Placement weights egress price for read-heavy data and storage price for cold data.
CDN offloads repeat egress entirely (06 §6).
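One way to make the storage-versus-egress weighting concrete is a per-GB monthly cost of the form storage price plus reads-per-month times egress price. This is a simplified model assumed for illustration (it ignores request fees, minimum storage durations, and CDN hits), and the prices for providers A and B are made up.

```python
def monthly_usd_per_gb(store: float, egress: float, reads_per_month: float) -> float:
    """Storage price plus egress price times how often each GB is read out."""
    return store + reads_per_month * egress


def break_even_reads(store_a: float, egress_a: float,
                     store_b: float, egress_b: float) -> float:
    """Reads per month at which providers A and B cost the same."""
    if egress_a == egress_b:
        raise ValueError("equal egress prices: storage price alone decides")
    return (store_b - store_a) / (egress_a - egress_b)


# Hypothetical prices in USD per GB: A is cheap to store but charges egress;
# B costs more to store but has zero egress.
A = {"store": 0.010, "egress": 0.090}
B = {"store": 0.015, "egress": 0.000}


def cheaper(reads_per_month: float) -> str:
    cost_a = monthly_usd_per_gb(A["store"], A["egress"], reads_per_month)
    cost_b = monthly_usd_per_gb(B["store"], B["egress"], reads_per_month)
    return "A" if cost_a < cost_b else "B"
```

With these made-up numbers, data read out about once every hundred months is cheaper on A, while data read twice a month is far cheaper on B. The crossover sits at roughly 0.06 full reads per month, which is why cold and read-heavy data can land on different providers.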

6. Tradeoffs / Alternatives / Scaling

Tradeoffs. Federation adds a placement decision and a location-map update to every write, plus the migration machinery. The payoff (residency, cost, durability, no lock-in) is exactly the multi-cloud value proposition (G4); a single-provider deployment simply uses a trivial one-option policy and pays none of the complexity.

Alternatives considered.

Static per-tenant provider assignment (no engine): simplest; loses per-object cost/tier optimization and easy rebalancing. Kept as the degenerate policy for small/self-host deployments.
Hash-based deterministic placement (provider = f(hash)): needs no location map but makes residency, tiering, migration, and provider exit nearly impossible (you can’t move data without changing its address). Rejected — the location-map approach is strictly more flexible and migration-safe.
Provider-managed multi-region (e.g. S3 MRAP) only: ties us to one provider’s multi-region story; defeats multi-cloud. Used opportunistically, not as the model.
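To see why the hash-based alternative was rejected, consider the simplest possible f: hash modulo provider count. This is an assumed, naive choice of f for illustration; the provider names and the synthetic hashes are hypothetical.

```python
import hashlib
from typing import List


def provider_for(chunk_hash: str, providers: List[str]) -> str:
    """Deterministic placement: the provider is a pure function of the hash."""
    return providers[int(chunk_hash, 16) % len(providers)]


# Synthetic content hashes standing in for real chunks.
hashes = [hashlib.sha256(str(i).encode()).hexdigest() for i in range(1000)]
before = ["minio", "s3", "r2", "gcs", "azure"]
after = before[:-1]  # provider exit: drop "azure"

# Chunks that lived on a surviving provider but are now expected elsewhere.
collateral = [
    h for h in hashes
    if provider_for(h, before) != "azure"
    and provider_for(h, before) != provider_for(h, after)
]
```

Removing one provider changes the expected home of many chunks that were never on it, so a provider exit forces a much larger reshuffle than the data actually being evacuated. Consistent hashing would shrink that collateral, but the provider would still be dictated by the hash, leaving no room for a per-tenant residency pin or a per-pack tier decision.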

Scaling concerns.

Location-map size → per-pack granularity (§2) keeps it ~100× smaller than per-chunk.
Migration work must run co-located with the data and be throttled; at PB scale a provider exit is a long-running, resumable, verify-as-you-go campaign (watermarked like GC).
Placement decisions are cheap/stateless → no scaling concern; policy is cached.
Residency correctness is a compliance risk → enforced as a hard constraint with audit, and verified by a periodic job that asserts no pack violates its tenant’s residency (a test, not a hope).
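The periodic residency check could look roughly like the following. `PackRecord`, `residency_violations`, and the shape of the policy map are hypothetical; a real job would stream rows from the Pack Index rather than take an in-memory list.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class PackRecord:
    pack_id: str
    tenant: str
    regions: Tuple[str, ...]  # every region holding a copy, including mid-migration ones


def residency_violations(
    packs: Iterable[PackRecord],
    allowed: Dict[str, FrozenSet[str]],
) -> List[Tuple[str, Tuple[str, ...]]]:
    """Return (pack_id, offending_regions) for every pack outside its tenant's allowed set."""
    violations = []
    for pack in packs:
        # A tenant with no policy entry is allowed nowhere (default-deny).
        permitted = allowed.get(pack.tenant, frozenset())
        offending = tuple(r for r in pack.regions if r not in permitted)
        if offending:
            violations.append((pack.pack_id, offending))
    return violations
```

The job checks every listed copy, not just the primary, so a redundant or dual-listed copy in the wrong region is also reported. An empty result is the assertion the periodic job would make.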

References

AWS S3 cross-region / Multi-Region Access Points: https://docs.aws.amazon.com/AmazonS3/latest/userguide/MultiRegionAccessPoints.html
Cloudflare R2 zero egress: https://developers.cloudflare.com/r2/
GDPR data residency considerations: https://gdpr.eu/