ADR-0031 — IaC with OpenTofu; the IaC↔GitOps boundary
- Status: Deferred
- Date: 2026-06-11
- Related: platform/10 infrastructure, ADR-0028
V1 Freeze (2026-06-12): Deferred. V1 runs local/Compose; no provisioned cloud substrate. Re-opens at P4.
Context
Something must provision the cloud substrate (clusters, networks, buckets, KMS, IAM/OIDC) and bootstrap GitOps. Two questions: which IaC tool, and where IaC stops and GitOps begins — managing the same in-cluster object with both is an anti-pattern (two reconcilers fighting).
Decision
- OpenTofu (the MPL-licensed, community-governed fork of Terraform) — keeps the full HCL/module ecosystem while avoiding the BSL licensing risk that matters for an OSS-friendly, self-hostable project. Drop-in for our needs.
- Boundary: OpenTofu provisions the substrate (VPC, K8s cluster + node pools, object buckets with versioning/object-lock, KMS keys, IAM + workload identity + GitHub OIDC, DNS) and bootstraps ArgoCD (install + point at the GitOps root app). ArgoCD owns everything inside the cluster thereafter (ADR-0028). We do not manage in-cluster app resources with OpenTofu.
- Remote state per environment with locking + encryption; no secrets in state where avoidable (ADR-0030). Modules + thin per-env stacks. Plan in CI (OIDC read role), apply gated (OIDC apply role), never blind applies; scheduled `plan` for drift.
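The scheduled drift check could be sketched roughly as follows. This is a minimal illustration, not the project's actual CI job: it assumes the `tofu` binary is on the PATH and that the job already holds the read-only OIDC role. The function names and outcome labels are hypothetical. It relies on the `-detailed-exitcode` flag of `tofu plan`, which exits with 0 when there are no changes, 1 on error, and 2 when the plan contains changes.

```python
import subprocess

# Exit codes of `tofu plan -detailed-exitcode`:
# 0 = plan succeeded with no changes, 1 = error, 2 = plan succeeded with changes.
# The labels on the right are hypothetical names chosen for this sketch.
PLAN_OUTCOMES = {0: "in-sync", 1: "error", 2: "drift"}


def classify_plan_exit(code: int) -> str:
    """Map a plan exit code to an outcome; unknown codes are treated as errors."""
    return PLAN_OUTCOMES.get(code, "error")


def scheduled_drift_check(stack_dir: str) -> str:
    """Run a plan (never an apply) in stack_dir and report whether it drifted."""
    result = subprocess.run(
        ["tofu", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=stack_dir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)
```

A scheduler would call `scheduled_drift_check` once per stack and alert on `"drift"` or `"error"`. Because only `plan` is invoked, the read role is sufficient and nothing is changed.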
Consequences
Positive
- Clean separation: each tool does what it’s best at (substrate vs in-cluster reconciliation); no controller fights.
- License-risk-free; large ecosystem; contributors already know HCL.
- DR is `tofu apply` + ArgoCD re-sync (ADR-0033).
- OIDC-gated plan/apply → no long-lived cloud creds in CI (ADR-0032).
Negative / costs
- Two systems + a bootstrap handoff to operate; state management discipline required.
Alternatives considered
- Terraform (BSL): functionally fine, but the license change is a real risk for this project; OpenTofu removes it.
- Pulumi: real languages, but HCL/declarative + ecosystem fit the substrate and lower the contributor bar.
- Crossplane (infra via K8s CRDs through ArgoCD): compelling for unifying the control loop; deferred (higher risk/novelty for v1) — a strong future option to shrink the IaC/GitOps seam.
- Manage K8s resources in OpenTofu: the two-reconcilers anti-pattern. Rejected.
Scaling
State split per env + per layer (network/cluster/data) bounds blast radius; a DR region is just another stack instantiation; pinned module versions tested in nonprod first.
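The state split can be pictured as one state object per (env, region, layer) combination. The sketch below only illustrates that layout: the key format, the `Stack` type, and the region names are assumptions, not a decided convention. Only the layer names come from this ADR.

```python
from dataclasses import dataclass
from itertools import product

LAYERS = ("network", "cluster", "data")  # layer names taken from this ADR


@dataclass(frozen=True)
class Stack:
    env: str
    region: str
    layer: str

    @property
    def state_key(self) -> str:
        # Hypothetical key layout: one isolated state object per stack,
        # so a bad apply can only touch one env/region/layer.
        return f"{self.env}/{self.region}/{self.layer}.tfstate"


def stacks_for(env: str, regions: list[str]) -> list[Stack]:
    """Enumerate every stack (and therefore every state object) for an env."""
    return [Stack(env, region, layer) for region, layer in product(regions, LAYERS)]


# Adding a DR region is the same modules instantiated once more.
primary_only = stacks_for("prod", ["region-a"])
with_dr = stacks_for("prod", ["region-a", "region-b"])
```

Under these assumptions, `primary_only` holds three stacks and `with_dr` holds six, each with its own state key. The DR region adds new stacks without touching the state of the existing ones.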