01 — Flagship Features (Deep Dive)

The six features that turn BitVault into a programmable, verifiable, governable platform. Each gets a technical sketch followed by four notes: Why it matters, Complexity, Dependencies, and Resume impact. The features compound (README §1).

Complexity scale: S (weeks) · M (1–2 mo) · L (quarter) · XL (multi-quarter).


1. WASM Plugin Runtime

Let users (and us) extend BitVault with sandboxed code that runs inside the platform — content processors, custom storage providers, custom auth, event handlers, policy hooks — written in any language, compiled to WebAssembly.

flowchart TB
    classDef h fill:#bbf7d0,stroke:#15803d,color:#111827;
    classDef p fill:#fde68a,stroke:#b45309,color:#111827;
    classDef c fill:#c7d2fe,stroke:#3730a3,color:#111827;
    host["BitVault host (Go)<br/>wazero runtime"]:::h
    subgraph SB["WASM sandbox (per invocation)"]
      plug["plugin module (.wasm)<br/>any language"]:::p
    end
    host -->|"instantiate + fuel/mem/time limits"| plug
    plug -->|"imports (capabilities only)"| cap["host functions:<br/>read_input · write_output ·<br/>scoped_http · scoped_kv · log · emit_event"]:::c
    cap --> host
    host -. "deny by default: no FS, no net, no syscalls" .-> plug
   
Why it matters: Turns BitVault from a closed app into a platform; every other programmable feature (Functions, transforms, DLP, custom providers) rides on it, which gives ecosystem leverage.
Complexity: L. Runtime embedding is M; the hard parts are the capability/host-function ABI, resource governance, and a clean PDK (plugin development kit).
Dependencies: Go host (ADR-0001); event system (08) for handler triggers; signing (ADR-0032).
Resume impact: Very high. “Designed a capability-based WASM plugin system with sandboxed execution and resource isolation” signals language-runtime and security depth that few engineers have.
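
The deny-by-default rule in the diagram can be illustrated without any WASM machinery. What follows is a minimal sketch under stated assumptions, not wazero's API: `Sandbox`, `Call`, the capability constants, and the integer fuel budget are hypothetical names chosen here. A real host would enforce the same check inside the functions it exports to the module.

```go
package main

import (
	"errors"
	"fmt"
)

// Capability names mirror the host functions listed in the diagram.
type Capability string

const (
	CapReadInput   Capability = "read_input"
	CapWriteOutput Capability = "write_output"
	CapScopedHTTP  Capability = "scoped_http"
	CapLog         Capability = "log"
)

var (
	ErrDenied    = errors.New("capability not granted")
	ErrOutOfFuel = errors.New("fuel exhausted")
)

// Sandbox is a hypothetical per-invocation context: a grant set plus a
// fuel budget. Anything not in the grant set is denied.
type Sandbox struct {
	granted map[Capability]bool
	fuel    int
}

func NewSandbox(fuel int, grants ...Capability) *Sandbox {
	g := make(map[Capability]bool, len(grants))
	for _, c := range grants {
		g[c] = true
	}
	return &Sandbox{granted: g, fuel: fuel}
}

// Call is the single choke point for host functions: it checks the grant,
// charges fuel, and only then runs the host-side implementation.
func (s *Sandbox) Call(c Capability, cost int, fn func() error) error {
	if !s.granted[c] {
		return ErrDenied
	}
	if cost > s.fuel {
		return ErrOutOfFuel
	}
	s.fuel -= cost
	return fn()
}

func main() {
	sb := NewSandbox(10, CapReadInput, CapLog)
	noop := func() error { return nil }
	fmt.Println(sb.Call(CapReadInput, 4, noop))  // granted
	fmt.Println(sb.Call(CapScopedHTTP, 1, noop)) // never granted, so denied
}
```

Routing every host function through one checked entry point keeps the security argument small: a plugin can reach only what the grant set names, and the budget bounds how much work it can request.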

2. BitVault Functions (event-driven compute)

S3-events-plus-Lambda, but yours: run a WASM function on storage events (“on upload to /invoices, OCR it and extract totals”). The marriage of the plugin runtime (§1) and the event system (08).

sequenceDiagram
    autonumber
    participant FM as File & Metadata
    participant BUS as Event bus (NATS, ADR-0006)
    participant FN as Functions runtime (WASM pool)
    participant ST as Storage
    FM->>BUS: NodeChanged (upload to /invoices)
    BUS->>FN: matching trigger fires
    FN->>FN: warm WASM instance + capability grant
    FN->>ST: read input chunk(s) (scoped)
    FN->>FN: run user function (OCR, extract)
    FN->>FM: write derived metadata / new file (scoped)
    FN->>BUS: emit result event (chainable)
   
Why it matters: The automation killer app and the reason developers stay: they can extend the product without us shipping every integration. Powers transforms, DLP, and custom workflows.
Complexity: L–XL. Builds on §1; adds the trigger router, warm-pool scheduler, idempotency/retry, and multi-tenant fairness.
Dependencies: §1 (runtime), event system (08), storage scoped access (ADR-0011), KEDA.
Resume impact: Very high. “Built a multi-tenant, event-driven serverless runtime on WASM with idempotent execution and autoscaling” is a systems headline.
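
The trigger router and the idempotency requirement can be sketched together. This is a minimal in-memory illustration, assuming events carry a unique ID and triggers match on a path prefix; `Event`, `Trigger`, `Router`, and `Dispatch` are hypothetical names, and the NATS subscription, warm pool, and retry logic are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a hypothetical shape for a NodeChanged event.
type Event struct {
	ID   string // unique per event; doubles as the idempotency key
	Path string
}

// Trigger binds a path prefix to a function name.
type Trigger struct {
	Prefix   string
	Function string
}

// Router matches events against triggers and skips redelivered events.
// It is not safe for concurrent use, and the seen-set would need durable
// storage to survive a restart.
type Router struct {
	triggers []Trigger
	seen     map[string]bool // "eventID|function" pairs already dispatched
}

func NewRouter(ts ...Trigger) *Router {
	return &Router{triggers: ts, seen: make(map[string]bool)}
}

// Dispatch returns the functions to invoke for ev. A function is returned
// at most once per event ID, so an at-least-once bus cannot run it twice.
func (r *Router) Dispatch(ev Event) []string {
	var out []string
	for _, t := range r.triggers {
		if !strings.HasPrefix(ev.Path, t.Prefix) {
			continue
		}
		key := ev.ID + "|" + t.Function
		if r.seen[key] {
			continue
		}
		r.seen[key] = true
		out = append(out, t.Function)
	}
	return out
}

func main() {
	r := NewRouter(Trigger{Prefix: "/invoices/", Function: "ocr-extract"})
	ev := Event{ID: "evt-1", Path: "/invoices/2024/acme.pdf"}
	fmt.Println(r.Dispatch(ev)) // first delivery: [ocr-extract]
	fmt.Println(r.Dispatch(ev)) // redelivery: []
}
```

The trailing slash in the prefix matters: `/invoices/` does not match a sibling such as `/invoices-archive/`, which a bare `/invoices` prefix would.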

3. Policy-as-code: Cedar + ReBAC

Replace ad-hoc ACLs with authorization as code: a verified policy engine for permissions and a relationship graph for sharing — plus the ability to prove properties about your policies.

flowchart TB
    classDef e fill:#fde68a,stroke:#b45309,color:#111827;
    classDef d fill:#bbf7d0,stroke:#15803d,color:#111827;
    req["request: principal · action · resource · context"]:::e --> cedar["Cedar engine (deny-by-default, formally verified)"]:::e
    rebac["ReBAC graph (Zanzibar/OpenFGA): owner/editor/viewer, group, folder inheritance"]:::e --> cedar
    attrs["attributes: tenant, tags, residency, classification"]:::e --> cedar
    cedar --> dec{"permit / forbid"}:::d
    cedar -.-> sim["policy simulation:<br/>what-if + 'prove no public access to /secret'"]:::d
   
Why it matters: Most file products have brittle, ad-hoc permissions. Verifiable policy-as-code plus a real sharing graph is enterprise-grade governance and a genuine differentiator.
Complexity: L. Integrating Cedar is M; the ReBAC graph, its consistency model (Zanzibar “zookies”), and simulation push it to L.
Dependencies: Identity/sharing contexts (04 bounded-contexts), ADR-0007/ADR-0010.
Resume impact: Very high. “Authorization via a formally-verified policy engine + a Zanzibar-style ReBAC graph, with policy simulation” is a rare, senior security/distributed-systems signal.
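
The decision rule in the diagram follows Cedar's documented semantics: a request is allowed only if at least one permit policy matches and no forbid policy matches. The sketch below models that rule and a folder-inheriting relationship check in plain Go. It is not the Cedar or OpenFGA API; `Policy`, `Graph`, `Has`, and `Authorize` are hypothetical names, and the folder hierarchy is assumed to be a tree.

```go
package main

import "fmt"

type Effect int

const (
	Permit Effect = iota
	Forbid
)

type Request struct {
	Principal, Action, Resource string
	Context                     map[string]string
}

// Policy pairs an effect with a condition over the request.
type Policy struct {
	Effect Effect
	When   func(Request) bool
}

// Graph is a toy ReBAC store: relation tuples plus parent-folder edges.
type Graph struct {
	Parent map[string]string          // resource -> parent folder
	Tuples map[string]map[string]bool // "resource#relation" -> principals
}

// Has reports whether principal holds relation on resource, directly or
// through an ancestor folder. The depth bound guards against cycles.
func (g *Graph) Has(principal, relation, resource string) bool {
	for r, depth := resource, 0; r != "" && depth < 64; r, depth = g.Parent[r], depth+1 {
		if g.Tuples[r+"#"+relation][principal] {
			return true
		}
	}
	return false
}

// Authorize is deny-by-default, and any matching forbid overrides permits.
func Authorize(policies []Policy, req Request) bool {
	permitted := false
	for _, p := range policies {
		if !p.When(req) {
			continue
		}
		if p.Effect == Forbid {
			return false
		}
		permitted = true
	}
	return permitted
}

func main() {
	g := &Graph{
		Parent: map[string]string{"/finance/q1.pdf": "/finance"},
		Tuples: map[string]map[string]bool{"/finance#viewer": {"alice": true}},
	}
	policies := []Policy{
		{Effect: Permit, When: func(r Request) bool {
			return r.Action == "read" && g.Has(r.Principal, "viewer", r.Resource)
		}},
		{Effect: Forbid, When: func(r Request) bool {
			return r.Context["classification"] == "secret"
		}},
	}
	req := Request{Principal: "alice", Action: "read", Resource: "/finance/q1.pdf"}
	fmt.Println(Authorize(policies, req)) // viewer inherited from /finance
	req.Context = map[string]string{"classification": "secret"}
	fmt.Println(Authorize(policies, req)) // forbid wins over the permit
}
```

Because forbid always wins, a guardrail such as "nothing classified secret is readable" holds no matter how many permit policies are added later. That is the kind of property policy simulation would check.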

4. End-to-End Encrypted Private Vaults

An opt-in, zero-knowledge vault tier where the server stores only ciphertext and cannot read the data — keys live client-side.

flowchart TB
    classDef c fill:#c7d2fe,stroke:#3730a3,color:#111827;
    classDef s fill:#fecaca,stroke:#b91c1c,color:#111827;
    file["file"]:::c --> ck["random per-file content key (CK)"]:::c
    ck --> enc["encrypt chunks client-side"]:::c --> server[("server stores ciphertext only")]:::s
    ck --> wrapU["wrap CK with user's public key"]:::c
    ck --> wrapR["wrap CK with each recipient's public key (sharing)"]:::c
    wrapU & wrapR --> kstore[("server stores wrapped keys (opaque)")]:::s
    recovery["recovery: key escrow / social recovery / passphrase-derived"]:::c
   
Why it matters: A credible zero-knowledge tier is a top-tier trust differentiator (Proton/Tresorit/Cryptomator territory) and the marquee security feature.
Complexity: XL. Covers applied cryptography, key management, sharing/revocation, recovery UX, and the feature-loss tradeoffs (a server that cannot read content cannot search or preview it). Easy to get subtly wrong.
Dependencies: KMS/envelope model (ADR-0014), storage chunking (storage/02), client SDK.
Resume impact: Very high (if done correctly). Applied crypto + key management + secure sharing is a standout, and it demonstrates the judgment to scope tradeoffs honestly.
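
The content-key flow in the diagram is ordinary envelope encryption, sketched here with Go's standard library. The document does not name algorithms, so AES-256-GCM for the data and RSA-OAEP for key wrapping are assumptions made for illustration; `Readers`, `Sealed`, `NewIdentity`, `Seal`, and `Open` are hypothetical names. The sketch seals a single chunk; a multi-chunk file would need a distinct nonce per chunk.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"errors"
	"fmt"
)

// Readers maps a reader ID to that reader's public key.
type Readers map[string]*rsa.PublicKey

// Sealed is everything the server would hold; none of it reveals plaintext.
type Sealed struct {
	Nonce      []byte
	Ciphertext []byte
	Wrapped    map[string][]byte // reader ID -> CK wrapped to that reader
}

// NewIdentity generates a client-side key pair (RSA-2048 is an assumption).
func NewIdentity() (*rsa.PrivateKey, error) {
	return rsa.GenerateKey(rand.Reader, 2048)
}

func newAEAD(ck []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(ck)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Seal encrypts plaintext under a fresh random content key (CK), then
// wraps the CK once per reader. Only wrapped keys leave the client.
func Seal(plaintext []byte, readers Readers) (*Sealed, error) {
	ck := make([]byte, 32) // AES-256 key
	if _, err := rand.Read(ck); err != nil {
		return nil, err
	}
	aead, err := newAEAD(ck)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	s := &Sealed{
		Nonce:      nonce,
		Ciphertext: aead.Seal(nil, nonce, plaintext, nil),
		Wrapped:    make(map[string][]byte, len(readers)),
	}
	for id, pub := range readers {
		w, err := rsa.EncryptOAEP(sha256.New(), rand.Reader, pub, ck, nil)
		if err != nil {
			return nil, err
		}
		s.Wrapped[id] = w
	}
	return s, nil
}

// Open unwraps the CK with the reader's private key and decrypts.
func Open(s *Sealed, id string, priv *rsa.PrivateKey) ([]byte, error) {
	w, ok := s.Wrapped[id]
	if !ok {
		return nil, errors.New("no wrapped key for this reader")
	}
	ck, err := rsa.DecryptOAEP(sha256.New(), rand.Reader, priv, w, nil)
	if err != nil {
		return nil, err
	}
	aead, err := newAEAD(ck)
	if err != nil {
		return nil, err
	}
	return aead.Open(nil, s.Nonce, s.Ciphertext, nil)
}

func main() {
	alice, err := NewIdentity()
	if err != nil {
		panic(err)
	}
	sealed, err := Seal([]byte("q1 totals"), Readers{"alice": &alice.PublicKey})
	if err != nil {
		panic(err)
	}
	pt, err := Open(sealed, "alice", alice)
	fmt.Println(string(pt), err)
}
```

Sharing with a new recipient means wrapping the same CK to one more public key, with no re-encryption of the data. Revocation is the hard direction: a recipient who once held the CK can still decrypt old ciphertext, so revoking access generally means re-encrypting under a new CK.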

5. S3-Compatible API

Expose BitVault as a drop-in S3 endpoint. Every tool that speaks S3 — aws-cli, boto3, rclone, Terraform, backup tools, data frameworks — works against BitVault unchanged.

flowchart LR
    classDef e fill:#dbeafe,stroke:#1e40af,color:#111827;
    classDef g fill:#fde68a,stroke:#b45309,color:#111827;
    classDef s fill:#bbf7d0,stroke:#15803d,color:#111827;
    tools["aws-cli · boto3 · rclone · Terraform · Spark"]:::e -->|"S3 REST + SigV4"| gw["S3 gateway:<br/>auth (SigV4) · bucket/object map · multipart"]:::g
    gw --> ns["map: bucket→space, key→node path"]:::g
    gw --> st["reuse storage subsystem<br/>(chunks, manifests, presign)"]:::s
    gw -.->|"governed by"| pol["Cedar policy + ReBAC (§3)"]:::g
   
Why it matters: Instant ecosystem: thousands of existing tools and integrations work on day one. The single highest adoption lever for developers, at modest build cost.
Complexity: M. SigV4 plus the common object/multipart subset is well-trodden; full S3 fidelity is a long tail, so scope to the commonly used 80%.
Dependencies: Storage subsystem (storage/), policy (07), API gateway (ADR-0003).
Resume impact: High. “Implemented an S3-compatible API (SigV4, multipart) over a custom storage engine” is concrete, recognizable, and protocol-level.
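
The core of the gateway's SigV4 check is the signing-key derivation, a fixed chain of HMAC-SHA256 steps over the date, region, and service. The sketch below shows that chain and a constant-time signature comparison. Building the canonical request and the string to sign is omitted, and the function names and sample values are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

func hmacSHA256(key []byte, msg string) []byte {
	m := hmac.New(sha256.New, key)
	m.Write([]byte(msg))
	return m.Sum(nil)
}

// SigningKey derives the SigV4 signing key. date is YYYYMMDD; for an S3
// endpoint the service is "s3".
func SigningKey(secret, date, region, service string) []byte {
	kDate := hmacSHA256([]byte("AWS4"+secret), date)
	kRegion := hmacSHA256(kDate, region)
	kService := hmacSHA256(kRegion, service)
	return hmacSHA256(kService, "aws4_request")
}

// Sign returns the hex signature over an already-built string to sign.
func Sign(key []byte, stringToSign string) string {
	return hex.EncodeToString(hmacSHA256(key, stringToSign))
}

// Verify recomputes the signature server-side and compares it in constant
// time, which is what the gateway would do with the stored secret.
func Verify(key []byte, stringToSign, presented string) bool {
	return hmac.Equal([]byte(Sign(key, stringToSign)), []byte(presented))
}

func main() {
	// All values below are placeholders, not real credentials.
	key := SigningKey("example-secret", "20240101", "us-east-1", "s3")
	sig := Sign(key, "example string to sign")
	fmt.Println(Verify(key, "example string to sign", sig))
	fmt.Println(Verify(key, "tampered string to sign", sig))
}
```

Because the date and region are folded into the key, a signature captured for one day or region does not verify for another.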

6. Verifiable Storage (CIDs + Merkle proofs)

Lean into the content-addressed core (storage/02): give every object a verifiable content identifier and let anyone prove a file is intact and unaltered — without trusting BitVault.

flowchart TB
    classDef c fill:#bbf7d0,stroke:#15803d,color:#111827;
    classDef p fill:#fde68a,stroke:#b45309,color:#111827;
    obj["object → CID (BLAKE3 Merkle root)"]:::c --> link["verifiable link: bitvault://<cid>"]:::p
    obj --> proof["Merkle inclusion proof per chunk"]:::p
    proof --> verify["client verifies bytes ↔ CID without trusting server"]:::c
    obj --> receipt["signed storage receipt + timestamp (transparency log)"]:::p
    receipt --> audit["tamper-evident: prove 'this file existed, unchanged, at time T'"]:::c
   
Why it matters: “Don’t trust, verify” storage is a genuinely novel angle for a file platform: provable integrity, compliance evidence, and a unique sharing primitive.
Complexity: M–L. The Merkle machinery already exists (BLAKE3 is itself a Merkle-tree hash); the work is the proof API, verifiable-link format, and the transparency log.
Dependencies: Content addressing + BLAKE3 (storage/02, ADR-0016), integrity (storage/04).
Resume impact: Very high. Merkle proofs, verifiable data structures, and transparency logs are deep, distinctive, and rarely seen among application engineers.
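
A per-chunk inclusion proof can be sketched as a generic binary Merkle tree. Two substitutions are made here and should be read as assumptions: SHA-256 stands in for BLAKE3 because BLAKE3 is not in Go's standard library, and the tree layout (odd nodes promoted unchanged, with leaf and interior prefixes in the style of RFC 6962) is illustrative rather than BitVault's actual CID format. `Step`, `Build`, and `Verify` are hypothetical names.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// Distinct prefixes keep leaf hashes and interior hashes from colliding.
func leafHash(chunk []byte) []byte {
	h := sha256.Sum256(append([]byte{0x00}, chunk...))
	return h[:]
}

func nodeHash(l, r []byte) []byte {
	buf := append([]byte{0x01}, l...)
	buf = append(buf, r...)
	h := sha256.Sum256(buf)
	return h[:]
}

// Step is one level of an inclusion proof.
type Step struct {
	Sibling []byte
	Left    bool // true if the sibling sits to the left of the running hash
}

// Build returns the Merkle root and the inclusion proof for chunks[idx].
// It returns nil, nil if idx is out of range.
func Build(chunks [][]byte, idx int) (root []byte, proof []Step) {
	if idx < 0 || idx >= len(chunks) {
		return nil, nil
	}
	level := make([][]byte, len(chunks))
	for i, c := range chunks {
		level[i] = leafHash(c)
	}
	for len(level) > 1 {
		var next [][]byte
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i]) // odd node: promote unchanged
				continue
			}
			if i == idx {
				proof = append(proof, Step{Sibling: level[i+1], Left: false})
			} else if i+1 == idx {
				proof = append(proof, Step{Sibling: level[i], Left: true})
			}
			next = append(next, nodeHash(level[i], level[i+1]))
		}
		idx /= 2
		level = next
	}
	return level[0], proof
}

// Verify recomputes the root from one chunk and its proof. The client
// needs only the trusted root, never the other chunks.
func Verify(chunk []byte, proof []Step, root []byte) bool {
	h := leafHash(chunk)
	for _, s := range proof {
		if s.Left {
			h = nodeHash(s.Sibling, h)
		} else {
			h = nodeHash(h, s.Sibling)
		}
	}
	return bytes.Equal(h, root)
}

func main() {
	chunks := [][]byte{[]byte("c0"), []byte("c1"), []byte("c2")}
	root, proof := Build(chunks, 1)
	fmt.Println(Verify(chunks[1], proof, root))      // intact chunk
	fmt.Println(Verify([]byte("evil"), proof, root)) // altered chunk
}
```

The proof grows with the logarithm of the chunk count, so a client can check one chunk of a large file against the CID by downloading only that chunk and a handful of sibling hashes.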

How the six compound

Together the six form a loop: programmable → governable → verifiable → adopted → more programmable. That loop is the moat a plain file-sharing app can’t replicate.