09 — Storage Federation & Placement

Topic: storage federation. Federation presents many providers, regions, buckets, and tiers as one logical store. A Placement Service decides where each unit of data lives, and a location map records where it currently is. The decision is recorded in ADR-0020.


1. What federation buys (and the one thing it requires)

Federation lets one BitVault deployment span MinIO + S3 + R2 + GCS + Azure across regions. It buys residency control, cost optimization, durability, and freedom from provider lock-in.

The one thing it requires: a reliable location map. There is no global provider namespace; the system must always know which provider/bucket/key holds a given chunk/pack. That map is the Chunk/Pack Index (08) — federation is fundamentally a metadata capability, with the abstraction (01) as the execution layer.


2. Placement granularity: per-pack (mostly), per-chunk (when standalone)

Where does location live?

| Granularity | Location stored on | Pros | Cons |
|---|---|---|---|
| Per-chunk | every chunk row | maximal flexibility | location columns × 10^9 chunks = index bloat |
| Per-pack (default) | pack row; chunks inherit | one location row per ~hundreds of chunks | move = move whole pack |
| Per-tenant/bucket | tenant config | tiny metadata | coarse; can’t optimize hot/cold per object |

Decision: location is recorded per-pack for packed chunks (the common case), and per-chunk only for standalone (hot/large/unpacked) chunks. This keeps the location map ~100× smaller than per-chunk-everywhere while retaining object-level flexibility for the data that needs it. Tiering/migration then operate at pack granularity (move a pack = relocate all its chunks with one index update).
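
The two-level lookup described above can be sketched as follows. This is a minimal in-memory illustration, not the actual index schema: `Location`, `ChunkRow`, `LocationMap`, and their fields are hypothetical names chosen for the example. Packed chunks carry only a pack id and inherit the pack's location; standalone chunks carry their own.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Location:
    provider: str
    region: str
    bucket: str
    key: str


@dataclass
class ChunkRow:
    chunk_hash: str
    pack_id: Optional[str] = None        # set for packed chunks (the common case)
    location: Optional[Location] = None  # set only for standalone chunks


class LocationMap:
    """Hypothetical in-memory stand-in for the Chunk/Pack Index."""

    def __init__(self) -> None:
        self.packs: Dict[str, Location] = {}
        self.chunks: Dict[str, ChunkRow] = {}

    def resolve(self, chunk_hash: str) -> Location:
        row = self.chunks[chunk_hash]
        if row.pack_id is not None:
            # Packed chunk: inherit the location of its pack.
            return self.packs[row.pack_id]
        if row.location is None:
            raise LookupError(f"chunk {chunk_hash} has no recorded location")
        return row.location

    def move_pack(self, pack_id: str, new_location: Location) -> None:
        # One index update relocates every chunk in the pack.
        self.packs[pack_id] = new_location
```

Under this layout, a pack holding a few hundred chunks needs one location row instead of a few hundred, which is where the roughly 100× reduction comes from.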


3. The Placement Service (policy → decision)

A stateless policy engine consulted when a new pack (or standalone chunk) needs a home, and by the migration worker when rebalancing.

flowchart TB
    classDef in fill:#dbeafe,stroke:#1e40af,color:#111827;
    classDef eng fill:#fde68a,stroke:#b45309,color:#111827;
    classDef out fill:#bbf7d0,stroke:#15803d,color:#111827;
    subgraph INPUTS["Placement inputs"]
      res["Tenant residency / sovereignty"]:::in
      dur["Durability class (1× / cross-provider)"]:::in
      cost["Provider cost: $/GB store + $/GB egress"]:::in
      lat["User region / latency"]:::in
      cap["Capacity & provider health"]:::in
      tier["Target tier (hot/warm/cold)"]:::in
    end
    eng["Placement Service<br/>(policy evaluation, default-deny on residency)"]:::eng
    INPUTS --> eng
    eng --> dec["Decision: {provider, region, bucket, tier, redundancy}"]:::out
    dec --> idx["Recorded in Pack/Chunk Index"]:::out
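
A minimal sketch of the policy evaluation in the diagram, under stated assumptions: the types `Candidate`, `Request`, and `Decision`, the `score` weighting, and the single `available` flag (capacity and health combined) are all hypothetical, and the cost figures used with it would be illustrative rather than real prices. Residency is treated as default-deny: a candidate is eligible only if its region is explicitly allowed, and an empty allow-list allows nothing.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Candidate:
    provider: str
    region: str
    bucket: str
    tier: str            # "hot", "warm", or "cold"
    store_cost: float    # $/GB stored
    egress_cost: float   # $/GB egress
    latency_ms: float    # latency from the user's region
    available: bool      # capacity and provider health rolled into one flag


@dataclass(frozen=True)
class Request:
    allowed_regions: FrozenSet[str]  # tenant residency; empty set allows nothing
    tier: str
    cross_provider: bool             # durability class: False = 1x, True = cross-provider


@dataclass(frozen=True)
class Decision:
    provider: str
    region: str
    bucket: str
    tier: str
    redundancy: Tuple[Candidate, ...]


class PlacementDenied(Exception):
    pass


def score(c: Candidate) -> float:
    # Hypothetical weighting of cost and latency; lower is better.
    return c.store_cost + c.egress_cost + 0.001 * c.latency_ms


def place(req: Request, candidates: List[Candidate]) -> Decision:
    eligible = sorted(
        (c for c in candidates
         if c.region in req.allowed_regions and c.tier == req.tier and c.available),
        key=score,
    )
    if not eligible:
        raise PlacementDenied("no candidate satisfies residency, tier, and health")
    primary = eligible[0]
    extra: Tuple[Candidate, ...] = ()
    if req.cross_provider:
        # The redundant copy must live on a different provider.
        others = [c for c in eligible if c.provider != primary.provider]
        if not others:
            raise PlacementDenied("no second provider satisfies the policy")
        extra = (others[0],)
    return Decision(primary.provider, primary.region, primary.bucket, primary.tier, extra)
```

The function is pure (no stored state), which matches the stateless design: the same inputs always yield the same decision, and the caller is responsible for recording the result in the index.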

4. Read routing & migration (content hash makes both safe)

Read routing: download resolve (06) reads the location from the index → picks the provider adapter → presigns. If cross-provider redundancy exists, route to the cheapest-egress / lowest-latency copy.
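
Copy selection for a redundantly stored pack might look like the sketch below. The source does not say how egress cost and latency are traded off, so the ordering here (cheapest egress first, latency as tie-breaker, with an optional latency ceiling) is an assumption. `Copy`, `route_read`, and the adapter callables are hypothetical; a real adapter would call the provider's own presign API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Copy:
    provider: str
    bucket: str
    key: str
    egress_cost: float  # $/GB
    latency_ms: float   # latency from the requesting user's region


# An adapter takes (bucket, key) and returns a presigned URL.
Adapter = Callable[[str, str], str]


def route_read(copies: List[Copy], adapters: Dict[str, Adapter],
               max_latency_ms: Optional[float] = None) -> str:
    if not copies:
        raise LookupError("no recorded location for this object")
    # Prefer copies under the latency ceiling; fall back to all copies if none qualify.
    pool = [c for c in copies
            if max_latency_ms is None or c.latency_ms <= max_latency_ms] or copies
    best = min(pool, key=lambda c: (c.egress_cost, c.latency_ms))
    return adapters[best.provider](best.bucket, best.key)
```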

Online migration (rebalance, cost-optimize, provider exit, region move):

sequenceDiagram
    autonumber
    participant M as Migration Worker
    participant SRC as Source provider
    participant DST as Dest provider
    participant IX as Pack Index
    M->>SRC: read pack bytes
    M->>DST: write pack (same content)
    M->>DST: Head + BLAKE3 verify == content hash
    M->>IX: CAS location SRC→DST (dual-listed during cutover)
    Note over M,IX: reads now served from DST — SRC kept until grace
    M->>SRC: delete old pack (after grace, GC)

Migration is safe because content is addressed by hash: the destination copy is provably the same bytes (verify by hash), and the index flip is a CAS. Reads can be dual-sourced during cutover (try DST, fall back to SRC) so migration never causes a read miss. Provider exit = migrate every pack off a provider, then remove it.
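
The sequence above can be sketched with in-memory stand-ins. Several assumptions apply: providers are plain dicts, the pack id is taken to be the content hash, `PackIndex` is a hypothetical single-process index, and BLAKE2b from the standard library substitutes for BLAKE3 (which is not in Python's standard library). A real CAS would be an atomic conditional update in the index store.

```python
import hashlib
from typing import Dict, Tuple

Stores = Dict[str, Dict[str, bytes]]  # provider name -> {pack_id: bytes}


def content_hash(data: bytes) -> str:
    # Stand-in for BLAKE3, used only to keep the sketch dependency-free.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


class PackIndex:
    def __init__(self) -> None:
        self._loc: Dict[str, Tuple[str, ...]] = {}  # pack_id -> providers, preferred first

    def put(self, pack_id: str, providers: Tuple[str, ...]) -> None:
        self._loc[pack_id] = providers

    def get(self, pack_id: str) -> Tuple[str, ...]:
        return self._loc[pack_id]

    def cas(self, pack_id: str, expected: Tuple[str, ...], new: Tuple[str, ...]) -> bool:
        if self._loc.get(pack_id) != expected:
            return False
        self._loc[pack_id] = new
        return True


def migrate(pack_id: str, src: str, dst: str, stores: Stores, index: PackIndex) -> None:
    stores[dst][pack_id] = stores[src][pack_id]          # read from SRC, write to DST
    if content_hash(stores[dst][pack_id]) != pack_id:    # verify before flipping
        del stores[dst][pack_id]
        raise ValueError("destination copy failed hash verification")
    # Dual-list during cutover: DST preferred, SRC kept as fallback.
    # If the CAS loses, the DST copy is left for GC to collect.
    if not index.cas(pack_id, (src,), (dst, src)):
        raise RuntimeError("index changed concurrently; retry migration")


def finish(pack_id: str, src: str, dst: str, stores: Stores, index: PackIndex) -> None:
    # After the grace period: drop SRC from the index, then delete the old bytes.
    if index.cas(pack_id, (dst, src), (dst,)):
        stores[src].pop(pack_id, None)


def read(pack_id: str, stores: Stores, index: PackIndex) -> bytes:
    for provider in index.get(pack_id):                  # try DST, fall back to SRC
        data = stores[provider].get(pack_id)
        if data is not None and content_hash(data) == pack_id:
            return data
    raise LookupError("pack not found at any listed location")
```

Because `read` verifies the hash of whatever it fetches, a missing or corrupt destination copy during cutover degrades to a fallback read from the source rather than a miss.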


5. Egress & data-gravity (the cost that dominates at scale)


6. Tradeoffs / Alternatives / Scaling

Tradeoffs. Federation adds a placement decision and a location-map update to every write, plus the migration machinery. The payoff (residency, cost, durability, no lock-in) is exactly the multi-cloud value proposition (G4). A single-provider deployment uses a trivial one-option policy and pays none of this complexity.

Alternatives considered.

Scaling concerns.
