07 — CI/CD & Image Build Pipelines
Tasks 5 & 7: design image build pipelines and CI/CD workflows. GitHub Actions builds, tests, secures, and publishes artifacts — then hands off to GitOps. It does not deploy to clusters. Decision in ADR-0032.
1. The CI/CD boundary
CI’s job ends at “signed image pushed + GitOps PR opened.” Deployment is ArgoCD’s pull-based job (06). This split means GitHub Actions never holds a kubeconfig — the single most important CI security property.
2. Pipelines
flowchart TB
classDef pr fill:#dbeafe,stroke:#1e40af,color:#111827;
classDef mg fill:#fde68a,stroke:#b45309,color:#111827;
classDef rel fill:#bbf7d0,stroke:#15803d,color:#111827;
classDef out fill:#fbcfe8,stroke:#be185d,color:#111827;
subgraph PRP["PR pipeline (fast feedback)"]
l["lint + unit/integration tests (path-filtered)"]:::pr
b["build image (no push / ephemeral)"]:::pr
sc["scan (trivy) + helm lint + OPA policy"]:::pr
pv["spin ephemeral preview (PR generator)"]:::pr
end
subgraph MGP["Merge-to-main pipeline"]
mb["build multi-arch image"]:::mg
sb["SBOM (syft)"]:::mg
mscan["scan (trivy/grype) — gate"]:::mg
sign["cosign sign (keyless OIDC) + SLSA provenance"]:::mg
push["push by digest → registry"]:::mg
bump["open PR → GitOps repo (bump dev digest)"]:::out
end
subgraph RELP["Release pipeline (tag v*)"]
cl["changelog + GitHub Release"]:::rel
prom["promote digest → staging (GitOps PR)"]:::rel
chart["publish Helm chart (OCI) + Compose bundle (self-host)"]:::rel
end
PRP --> MGP --> RELP
- PR pipeline: quick gates + an ephemeral preview env. `id-token: write` is not granted on PRs (signing happens post-merge), a GitHub security constraint we respect.
- Merge pipeline: the supply-chain heart (§3): build, attest, sign, push, then open a GitOps PR (auto-merge for dev).
- Release pipeline: changelog, GitHub Release, promote the already-built digest to staging, publish self-host artifacts (08).
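The "bump dev digest" step at the end of the merge pipeline can be sketched as a small text rewrite. This is a minimal illustration, not the pipeline's actual tooling: the function name `bump_digest`, the `image: <repo>@sha256:<digest>` manifest layout, and the registry paths are assumptions made for the example.

```python
import re

# Matches "image: <repo>@sha256:<64 hex chars>"; the repo is captured so that
# only the entry for the requested image is rewritten.
_IMAGE_LINE = re.compile(r"(?P<prefix>image:\s*(?P<repo>[^\s@]+)@)sha256:[0-9a-f]{64}")
_DIGEST = re.compile(r"sha256:[0-9a-f]{64}")


def bump_digest(manifest: str, repo: str, new_digest: str) -> str:
    """Return the manifest text with the digest for `repo` replaced.

    Hypothetical helper: assumes images are pinned by digest, never by tag.
    """
    if not _DIGEST.fullmatch(new_digest):
        # Refuse tags such as "latest": the GitOps repo should only hold digests.
        raise ValueError(f"not a sha256 digest: {new_digest!r}")

    def _swap(match: re.Match) -> str:
        if match.group("repo") != repo:
            return match.group(0)  # a different image: leave it untouched
        return match.group("prefix") + new_digest

    return _IMAGE_LINE.sub(_swap, manifest)
```

The returned text would be committed on a branch in the GitOps repo and proposed as a PR. Rejecting anything that is not a digest keeps promotion tied to the exact artifact that was scanned and signed.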
3. Supply-chain security (built into every image)
flowchart LR
classDef a fill:#fde68a,stroke:#b45309,color:#111827;
classDef o fill:#bbf7d0,stroke:#15803d,color:#111827;
img["built image"]:::a --> sbom["SBOM (syft)"]:::a
img --> scan["vuln scan (trivy) — fail on critical"]:::a
oidc["GHA OIDC token (id-token: write)"]:::a --> fulcio["Fulcio: short-lived cert bound to workflow identity"]:::a
fulcio --> cosign["cosign sign (keyless)"]:::a
cosign --> rekor["Rekor transparency log"]:::o
img --> prov["SLSA provenance (in-toto attestation)"]:::a
sbom & cosign & prov --> reg[("registry: image + attestations")]:::o
reg --> verify["admission: verify signature + provenance before run (cluster)"]:::o
- Keyless signing: GitHub OIDC identity → Fulcio short-lived cert → cosign sign; the signature is logged in Rekor. No long-lived signing keys to steal. “Prove you are this workflow,” not “prove you have a key.”
- SBOM (syft) + vuln scan (trivy/grype, gate on criticals) + SLSA provenance (in-toto attestation of how/where it was built).
- Admission verification: the cluster runs only images with a valid signature + provenance (policy-controller / Kyverno), closing the loop (02).
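The "gate on criticals" step can be illustrated with a short sketch that reads a trivy JSON report and lists the blocking findings. It assumes the report was produced with trivy's JSON output format (a top-level `Results` list whose entries may carry a `Vulnerabilities` list); the function name `critical_findings` and the sample data are hypothetical.

```python
import json


def critical_findings(report_json: str, blocked=("CRITICAL",)) -> list[str]:
    """List "<target>: <vulnerability id>" for every finding at a blocked severity.

    Assumption: the input follows trivy's JSON report layout. Targets with no
    findings may omit "Vulnerabilities" or set it to null, so both are tolerated.
    """
    report = json.loads(report_json)
    findings = []
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in blocked:
                target = result.get("Target", "?")
                findings.append(f"{target}: {vuln.get('VulnerabilityID', '?')}")
    return findings


if __name__ == "__main__":
    # Made-up report used only to show the gate's behaviour.
    sample = json.dumps({
        "Results": [
            {"Target": "app (alpine)", "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "Severity": "CRITICAL"},
                {"VulnerabilityID": "CVE-0000-0002", "Severity": "LOW"},
            ]},
            {"Target": "go.sum"},
        ]
    })
    blocking = critical_findings(sample)
    for line in blocking:
        print("blocking:", line)
    raise SystemExit(1 if blocking else 0)
```

In practice trivy can enforce the same gate itself through its `--severity` and `--exit-code` options; a separate script like this is mainly useful when the pipeline also wants to annotate the PR or apply an allowlist.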
4. Cloud auth: OIDC, no long-lived secrets
The same OIDC mechanism authenticates GitHub Actions to the cloud (registry push, OpenTofu plan) by assuming a short-lived cloud IAM role — so there are no long-lived cloud keys in GitHub secrets. Combined with §3, the pipeline holds no durable credentials.
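What the cloud side checks before handing out a short-lived role can be sketched as a claim comparison. This is only an illustration of the trust condition: the cloud provider verifies the token's signature and evaluates the policy, not our code. The function name `trust_allows`, the repository `acme/platform`, and the audience value are hypothetical; the issuer URL and the `repo:<owner>/<repo>:ref:<ref>` subject format are those GitHub documents for Actions OIDC tokens.

```python
GITHUB_ISSUER = "https://token.actions.githubusercontent.com"


def trust_allows(claims: dict, repo: str, ref: str, audience: str) -> bool:
    """Return True if already-verified token claims match the trusted workflow.

    Assumption: `claims` comes from a token whose signature has been validated
    against GitHub's published keys. This sketch performs no cryptography.
    """
    return (
        claims.get("iss") == GITHUB_ISSUER
        and claims.get("aud") == audience
        # Pin both the repository and the ref so that only main-branch runs
        # can assume the role; PR runs carry a different subject.
        and claims.get("sub") == f"repo:{repo}:ref:{ref}"
    )
```

Pinning the subject to a specific ref is what keeps a PR-triggered run from obtaining push or plan credentials even though it uses the same workflow files.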
5. Monorepo CI ergonomics
- Path filters → only build the services a change touches (a web change doesn’t rebuild Go images).
- Reusable workflows + matrix builds (per service, per arch).
- Caching: Go build/module cache, pnpm store, BuildKit layer cache (registry-backed) → fast incremental builds.
- Hardening: least-privilege GITHUB_TOKEN, pinned action SHAs, dependency review, secret scanning, required status checks + branch protection.
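The path-filter idea can be sketched as a mapping from changed files to the services that need a rebuild. The directory layout, service names, and the function name `affected_services` below are assumptions for illustration; the real repository layout may differ.

```python
def affected_services(changed_files, service_paths):
    """Return the sorted names of services whose paths contain a changed file.

    `service_paths` maps a service name to the directory prefixes it depends on
    (hypothetical layout). A shared library is modeled by listing its directory
    under every service that imports it.
    """
    hit = set()
    for changed in changed_files:
        for service, prefixes in service_paths.items():
            for prefix in prefixes:
                root = prefix.rstrip("/")
                # Match the directory itself or anything beneath it, but not a
                # sibling that merely shares the prefix (apps/web vs apps/web2).
                if changed == root or changed.startswith(root + "/"):
                    hit.add(service)
                    break
    return sorted(hit)


# Hypothetical monorepo layout used only for this example.
SERVICE_PATHS = {
    "api": ["services/api", "libs/go"],
    "worker": ["services/worker", "libs/go"],
    "web": ["apps/web"],
}

if __name__ == "__main__":
    print(affected_services(["apps/web/src/index.tsx"], SERVICE_PATHS))
    print(affected_services(["libs/go/auth/token.go"], SERVICE_PATHS))
```

The resulting list would feed the build matrix, so a web-only change produces a one-entry matrix and skips the Go image builds entirely.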
6. Tradeoffs / Alternatives / Scaling
Tradeoffs. Full supply-chain (SBOM + scan + sign + provenance + verify) adds CI time and complexity. We treat it as non-negotiable for a credible production platform, and caching plus parallel jobs absorb most of the added time.
Alternatives considered.
- CI deploys directly (push CD): simpler but CI holds cluster creds + no drift control. Rejected for GitOps (ADR-0028).
- Long-lived registry/cloud keys in secrets: the thing OIDC eliminates. Rejected.
- Keyed cosign signing: a key to manage/rotate/leak; keyless (OIDC) preferred.
- Other CI (GitLab/Buildkite): equivalent; GitHub Actions chosen for repo proximity + first-class OIDC (ADR-0032).
Scaling concerns.
- Build throughput → self-hosted/larger runners (incl. arm64 for native multi-arch), aggressive caching, path-filtered matrices.
- Flaky-test drag → quarantine + required-checks discipline.
- Registry/egress cost → layer cache + small images (01).
References
- GitHub OIDC hardening: https://docs.github.com/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect
- Sigstore cosign: https://docs.sigstore.dev/cosign/signing/overview/
- SLSA: https://slsa.dev/
- syft (SBOM): https://github.com/anchore/syft
- trivy: https://github.com/aquasecurity/trivy