Rate Limiting & Abuse Prevention
This section covers availability and abuse defense. It addresses OWASP API4:2023 (Unrestricted Resource Consumption) and API6:2023 (Unrestricted Access to Sensitive Business Flows), and provides the availability half of tenant isolation.
:::tip Quota vs Rate Limit
Quotas are storage and capacity limits: bytes used vs bytes allowed, or invocations this billing period vs the plan ceiling. Rate limits are request-frequency limits: requests per second or per minute. Both are enforced independently. Hitting a quota returns a different error than hitting a rate limit.
:::
1. Layered Rate Limits
flowchart TB
classDef l fill:#fde68a,stroke:#b45309,color:#111827;
req["request"]:::l --> e["① Edge / CDN / WAF: per-IP, DDoS scrubbing, geo, bot detection"]:::l
e --> t["② Per-tenant: request rate + storage / egress quotas (noisy-neighbor isolation)"]:::l
t --> k["③ Per-user / per-API-key: scoped budgets"]:::l
k --> ep["④ Per-endpoint: auth, search, and expensive operations — stricter limits"]:::l
ep --> ok["allow / 429 Retry-After"]:::l
| Layer | Limits applied | What it stops |
|---|---|---|
| ① Edge / CDN / WAF | Per-IP rate, DDoS scrubbing, geo-block, bot fingerprinting | Volumetric floods, automated scanners |
| ② Per-tenant | Request rate + quotas (storage bytes, egress, function invocations, share count) | Noisy neighbor; tenant-level abuse |
| ③ Per-user / per-key | Scoped request budgets per principal | Single compromised credential or API key exhausting tenant quota |
| ④ Per-endpoint | Login, share-password check, search, transforms = stricter individual limits | Brute force; expensive-operation abuse |
Algorithm: token-bucket / sliding-window counters in Redis, updated atomically via Lua scripts for correctness under concurrency. Distributed across gateway replicas so limits are enforced cluster-wide, not per-instance.
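The token-bucket logic can be sketched in-process as follows. This is a minimal illustration, not the production implementation: the class name, field names, and numeric values are assumptions, and the real counters would live in Redis with the refill-and-consume step executed atomically inside a Lua script.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TokenBucket:
    """In-process token bucket; all names and values here are illustrative."""
    capacity: float      # maximum burst size, in tokens
    refill_rate: float   # tokens added per second
    tokens: float        # tokens currently available
    updated_at: float    # timestamp of the last refill, in seconds

    def try_acquire(self, now: float, cost: float = 1.0) -> Tuple[bool, float]:
        """Return (allowed, retry_after_seconds) for a request arriving at `now`."""
        # Refill for the elapsed time, never exceeding capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now

        if self.tokens >= cost:
            self.tokens -= cost
            return True, 0.0
        # Not enough tokens: report how long until `cost` tokens are available.
        return False, (cost - self.tokens) / self.refill_rate


# Burst of three requests against a bucket that holds two tokens.
bucket = TokenBucket(capacity=2, refill_rate=1.0, tokens=2, updated_at=0.0)
results = [bucket.try_acquire(now=0.0) for _ in range(3)]
```

The second element of the returned tuple is a natural source for the `Retry-After` value on a 429 response. Passing `now` in explicitly keeps the logic deterministic; a Redis-backed version would need a single agreed time source across gateway replicas.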
Gateway enforcement: rate-limit middleware runs before any request is routed to backend modules. A request that exceeds its limit never reaches business logic.
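A rough sketch of how the gateway-side layers (② to ④) might be evaluated before routing is shown below. Everything here is hypothetical: the key formats, the `X-Limited-By` header, the 60-second window, and the plain dictionary counter, which stands in for the Redis-backed token-bucket or sliding-window counters. Layer ① is omitted because it runs at the edge, ahead of the gateway.

```python
from typing import Callable, Dict, List, Tuple

WINDOW_SECONDS = 60  # assumed window length for this sketch

Request = Dict[str, str]

# (layer name, key builder), in evaluation order. Key formats are hypothetical.
GATEWAY_LAYERS: List[Tuple[str, Callable[[Request], str]]] = [
    ("tenant", lambda r: f"tenant:{r['tenant']}"),
    ("key", lambda r: f"key:{r['api_key']}"),
    ("endpoint", lambda r: f"endpoint:{r['tenant']}:{r['route']}"),
]


def enforce(
    request: Request, limits: Dict[str, int], counts: Dict[str, int]
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers); 429 means the request is never routed."""
    keys = [(name, build(request)) for name, build in GATEWAY_LAYERS]
    # Check every layer first so a denied request consumes no budget.
    for name, key in keys:
        if counts.get(key, 0) >= limits[name]:
            headers = {"Retry-After": str(WINDOW_SECONDS), "X-Limited-By": name}
            return 429, headers
    # All layers passed: charge each one, then hand off to routing.
    for _, key in keys:
        counts[key] = counts.get(key, 0) + 1
    return 200, {}


limits = {"tenant": 100, "key": 2, "endpoint": 50}
counts: Dict[str, int] = {}
req = {"tenant": "t1", "api_key": "k1", "route": "/search"}
responses = [enforce(req, limits, counts) for _ in range(3)]
```

In this run the third request is rejected by the per-key layer while the tenant and endpoint budgets are still largely unused. The check-then-charge sequence is not atomic in this form; in a distributed deployment both steps would need to happen in one atomic operation.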
2. Quota Enforcement
Per-tenant quotas cover: storage bytes, transfer / egress bytes, request counts per period, function invocations, and share count. Quotas serve two purposes simultaneously:
- Billing — usage counted against the plan ceiling.
- Availability isolation — a single tenant cannot exhaust shared infrastructure resources.
The Billing service checks quota synchronously at upload init: if the commit would exceed the tenant's quota, the request is refused before a presigned URL is issued. Quota accrual after successful commits is asynchronous.
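The upload-init check might look like the sketch below. The `reserved_bytes` field is an assumption added for illustration: because accrual is asynchronous, some form of reservation for in-flight uploads is one way to keep concurrent inits from jointly overshooting the ceiling. The source does not say whether the Billing service works this way.

```python
from dataclasses import dataclass


@dataclass
class TenantQuota:
    limit_bytes: int          # plan ceiling
    committed_bytes: int      # accrued asynchronously after successful commits
    reserved_bytes: int = 0   # hypothetical: approved uploads not yet committed


def check_upload_init(quota: TenantQuota, declared_size: int) -> bool:
    """Return True if a presigned URL may be issued for this upload."""
    projected = quota.committed_bytes + quota.reserved_bytes + declared_size
    if projected > quota.limit_bytes:
        return False  # refuse before any presigned URL is issued
    # Hold the space until the commit is accrued or the upload expires.
    quota.reserved_bytes += declared_size
    return True


quota = TenantQuota(limit_bytes=100, committed_bytes=60)
first = check_upload_init(quota, 30)    # fits: 60 + 0 + 30 <= 100
second = check_upload_init(quota, 20)   # refused: 60 + 30 + 20 > 100
```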
3. Targeted Abuse Defenses
| Abuse vector | Defense |
|---|---|
| Login brute force / credential stuffing | Exponential backoff + account lockout + CAPTCHA + breach-password check + device / anomaly signals |
| Share-password brute force | Strict per-link rate limit + lockout after N failures |
| ID / user enumeration | Uniform response bodies and timing (no “user not found” vs “wrong password” oracle); unguessable token IDs |
| Mass public-link creation (scripted) | Flow-level limits on share creation; step-up auth required above threshold |
| Bulk download / data exfiltration | Per-session and per-tenant egress rate limits; anomaly detection triggers step-up |
| Signup farms | CAPTCHA + email verification; per-IP signup rate limit |
| Expensive search queries | Per-tenant search request budget; query complexity limit |
| Upload floods | Per-tenant ingest rate limit; staging GC reclaims uncommitted blobs after TTL |
| Function / transform abuse | Dedicated stricter per-tenant budget; CPU/memory caps enforced in the sandbox |
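As one concrete case from the table, the backoff-plus-lockout defense for login and share-password attempts could be sketched like this. The base delay, cap, and lockout threshold are placeholder values, not figures from this design.

```python
from typing import Optional


def next_attempt_delay(
    failures: int,
    base_seconds: float = 1.0,
    cap_seconds: float = 300.0,
    lockout_after: int = 10,
) -> Optional[float]:
    """Seconds to wait before the next attempt, or None once locked out.

    `failures` is the number of consecutive failed attempts so far.
    All default values are illustrative.
    """
    if failures >= lockout_after:
        return None  # locked: require a reset or step-up flow instead
    if failures == 0:
        return 0.0
    # Delay doubles with each failure, bounded by the cap.
    return min(cap_seconds, base_seconds * 2 ** (failures - 1))


delays = [next_attempt_delay(n) for n in (0, 1, 4, 9, 10)]
```

Keying the failure counter by account and by share link, rather than by IP alone, keeps the limit effective against credential stuffing spread across many source addresses.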
4. The Data-Plane Gap
Presigned direct-to-storage transfers (ADR-0011) bypass gateway rate limiting — the bytes travel directly between the client and object storage. BitVault cannot rate-limit what it does not see. Mitigations:
- Short-TTL, narrowly scoped, effectively single-use presigned URLs (small blast radius per URL).
- Object-store-side per-tenant egress limits and bucket policies enforced by the storage provider.
- CDN rate limits where the object store is fronted by a CDN.
- Quota enforcement at issuance time — the gateway refuses to issue a presigned URL if the commit would exceed the tenant’s storage or egress quota.
This is a deliberate, documented tradeoff: reduced in-band control over the data plane in exchange for the scalability of not proxying bytes through compute.
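To illustrate what "short-TTL and scoped" means for a signed URL, the sketch below binds a signature to one method, one object key, and an expiry time. It is a generic HMAC scheme written for illustration only; it is not the signing algorithm of any particular object store, and the function names and payload layout are assumptions.

```python
import hashlib
import hmac


def sign(secret: bytes, method: str, object_key: str, expires_at: int) -> str:
    """Sign exactly one (method, object, expiry) combination."""
    payload = f"{method}\n{object_key}\n{expires_at}".encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(
    secret: bytes, method: str, object_key: str,
    expires_at: int, signature: str, now: int,
) -> bool:
    """Accept only an unexpired URL whose scope matches the signature."""
    if now >= expires_at:
        return False  # short TTL bounds how long a leaked URL is useful
    expected = sign(secret, method, object_key, expires_at)
    # Constant-time comparison avoids leaking the signature byte by byte.
    return hmac.compare_digest(expected, signature)


secret = b"demo-secret"  # placeholder; never hard-code real keys
sig = sign(secret, "PUT", "tenant-1/staging/blob-42", expires_at=1_000)
ok = verify(secret, "PUT", "tenant-1/staging/blob-42", 1_000, sig, now=900)
expired = verify(secret, "PUT", "tenant-1/staging/blob-42", 1_000, sig, now=1_000)
wrong_scope = verify(secret, "GET", "tenant-1/staging/blob-42", 1_000, sig, now=900)
```

Strict single use cannot be expressed in the signature alone; it would require server-side state (for example, a used-token record), which is consistent with the list above treating these URLs as only approximately single-use.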
5. DoS Resilience
- Edge / CDN / WAF absorbs volumetric attacks before they reach application servers.
- Services autoscale via HPA and KEDA on queue depth; new replicas serve additional load within minutes.
- Graceful degradation: non-critical load (thumbnail generation, search re-index, transform previews) is shed before core read/write operations are impacted.
- Fail-closed on auth: the system never fails open to “allow” under load. An overloaded auth check rejects the request rather than granting it.
- PodDisruptionBudgets ensure that autoscaler activity and rolling updates do not simultaneously remove too many replicas.
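Two of the behaviours above, graceful degradation and fail-closed auth, can be sketched as follows. The operation names, priority tiers, and the 0.8 load threshold are hypothetical; how load is measured is left open.

```python
from typing import Callable

# Hypothetical tiers: 0 = core, higher numbers are shed first under load.
PRIORITY = {
    "read": 0,
    "write": 0,
    "thumbnail": 2,
    "search_reindex": 2,
    "transform_preview": 2,
}


def should_shed(operation: str, load: float, threshold: float = 0.8) -> bool:
    """Shed non-core work once load (0.0 to 1.0) reaches the threshold."""
    # Unknown operations default to tier 1, so they are shed before core work.
    return PRIORITY.get(operation, 1) > 0 and load >= threshold


def authorize_fail_closed(check: Callable[[], bool]) -> bool:
    """Grant access only on an explicit True; any failure means deny."""
    try:
        return check() is True
    except Exception:
        # Timeouts and backend errors must never turn into an implicit allow.
        return False


def overloaded_check() -> bool:
    raise TimeoutError("auth backend overloaded")


shed_thumbnail = should_shed("thumbnail", load=0.9)
shed_read = should_shed("read", load=0.99)
granted = authorize_fail_closed(overloaded_check)
```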
6. Threats Addressed
| Threat | Control | Residual |
|---|---|---|
| Resource exhaustion (API4) | Layered limits + per-tenant quotas | Low |
| Sensitive-flow abuse (API6) | Flow limits + step-up + anomaly detection | Medium |
| Login brute force / credential stuffing | Backoff + lockout + CAPTCHA + MFA | Low |
| Data-plane abuse via presigned URLs | Short-TTL URLs + object-store-side limits | Medium (known gap — documented tradeoff) |
| Volumetric DoS | CDN / WAF + autoscale + graceful degradation | Low–medium |
| Noisy-neighbor availability impact | Per-tenant quotas + isolation | Low |