Deployment & Infra
Secrets at deploy: re-plumb a leaky deployment
Reading about leaked secrets is not the same as closing the holes yourself. Start from a deployment that does everything wrong — secret baked into the image, base64 Secret in git, password in an env var — and re-plumb it end to end until a leaked manifest, a stolen etcd backup, and a crash dump all reveal nothing usable.
Turn the unit’s threat model into a reproducible hardening loop: prove the original leaks, move the secret out of the image, encrypt it at rest and in git, switch injection to files, rotate without a restart, and verify each fix with evidence rather than assertion.
Take a small Kubernetes-deployed service whose secret is baked into the image, committed as a base64 Secret, and injected as an env var — and re-plumb it so the secret enters only at runtime, is encrypted at rest and in git, and rotates without a pod restart. Prove every leak is closed with a concrete command, not a claim.
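To make the base64 leak concrete, the sketch below shows why a committed Secret manifest counts as plaintext: the values under data are base64-encoded, not encrypted, so anyone holding the file can reverse them without a key. The manifest, the name db-credentials, the key DB_PASSWORD, and the value itself are all hypothetical. JSON stands in for YAML only to keep the example inside the Python standard library.

```python
import base64
import json

# Hypothetical manifest standing in for a Secret committed to git.
# Kubernetes accepts JSON manifests as well as YAML.
LEAKED_MANIFEST = json.dumps({
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "db-credentials"},
    "type": "Opaque",
    "data": {"DB_PASSWORD": base64.b64encode(b"hunter2-example").decode("ascii")},
})


def decode_secret_manifest(manifest_text: str) -> dict[str, str]:
    """Return every value under .data decoded from base64. No key is required."""
    manifest = json.loads(manifest_text)
    if manifest.get("kind") != "Secret":
        return {}
    return {
        key: base64.b64decode(value).decode("utf-8", errors="replace")
        for key, value in manifest.get("data", {}).items()
    }


if __name__ == "__main__":
    for key, value in decode_secret_manifest(LEAKED_MANIFEST).items():
        print(f"{key} = {value}")
```

Run as a script, this prints the recovered password in the clear. The same decode attempted against a Sealed Secret should yield nothing usable, which is the "after" half of the evidence.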
- A before/after leak table: for each of the three original leaks (image layer, base64 git Secret, env crash dump), the exact command that exposed it before and the same command showing nothing usable after.
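- Evidence that the production image contains no secret material in any layer (docker history --no-trunc output for the build instructions, plus a search of the exported layer contents for files that were copied in) and that etcd now stores the value as ciphertext (raw etcd read).
- Proof that the committed git artifact is a Sealed Secret that cannot be decrypted without the in-cluster controller's private key (or an ExternalSecret reference, which holds no secret value at all), shown by a failed decode attempt.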
- A rotation demonstration: new value applied, running pod serves it with no restart, with the measured time-to-effect recorded.
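The rotation deliverable depends on the application reading its secret from a mounted file rather than caching it at startup. A minimal sketch of that pattern follows, with a helper that measures time-to-effect. It assumes the Secret is mounted as a volume; the function names are hypothetical, not part of any library. Two Kubernetes behaviours matter when reading the numbers: the kubelet refreshes mounted Secret volumes on its periodic sync, so some delay is expected, and a Secret mounted through subPath does not receive updates at all.

```python
import time
from pathlib import Path


def read_secret(path: Path) -> str:
    """Read the secret from its mounted file on every call, so a rotated
    value is picked up without restarting the process."""
    return path.read_text(encoding="utf-8").strip()


def wait_for_rotation(path: Path, old_value: str,
                      timeout_s: float = 120.0,
                      interval_s: float = 1.0) -> tuple[str, float]:
    """Poll the file until its content differs from old_value.

    Returns (new_value, seconds_elapsed); raises TimeoutError if the value
    has not changed by the time timeout_s has passed.
    """
    start = time.monotonic()
    while True:
        current = read_secret(path)
        elapsed = time.monotonic() - start
        if current != old_value:
            return current, elapsed
        if elapsed >= timeout_s:
            raise TimeoutError(f"no rotation seen within {timeout_s}s")
        time.sleep(interval_s)
```

To record time-to-effect, one option is to start wait_for_rotation inside the running pod against the mount path (for example a hypothetical /etc/secrets/db-password), apply the new value, and log the elapsed seconds it returns alongside the unchanged pod restart count.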
- Add a CI secret-scanning gate (gitleaks/trufflehog) that fails the build if any plaintext secret or base64 Secret is committed, and show it catching a deliberately reintroduced leak.
- Replace the static secret with a Vault dynamic secret (short-lived per-app credential that auto-revokes) and show that a credential captured at time T stops working after its TTL.
- Tighten RBAC so only the workload's ServiceAccount can read the Secret, and prove a different ServiceAccount is denied — closing the misconfigured-RBAC leak path the lesson names.
- Write a one-page on-call runbook: the five places a secret leaks (image, base64 Secret, etcd, env/crash dump, git history), the triage command for each, and the fix-priority ladder from never-bake to dynamic credentials.
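For the CI gate, gitleaks or trufflehog remain the tools to use. The sketch below is only a narrow stand-in that illustrates the gate's pass/fail contract: it flags any YAML document that declares kind: Secret with a top-level data or stringData block, and it lets SealedSecret and ExternalSecret manifests through. The regex heuristics and the function names are assumptions made for illustration, not a substitute for a real scanner.

```python
import re
from pathlib import Path

# Heuristics (assumptions): a plain Secret has "kind: Secret" on its own line
# and a top-level "data:" or "stringData:" key. SealedSecret and
# ExternalSecret do not match the kind pattern.
KIND_SECRET = re.compile(r"^kind:\s*Secret\s*$", re.MULTILINE)
PAYLOAD_KEY = re.compile(r"^(data|stringData):", re.MULTILINE)
DOC_SEPARATOR = re.compile(r"^---\s*$", re.MULTILINE)


def is_plain_secret(document: str) -> bool:
    """True if a single YAML document looks like an unencrypted Secret."""
    return bool(KIND_SECRET.search(document) and PAYLOAD_KEY.search(document))


def find_plain_secrets(root: Path) -> list[str]:
    """Scan every .yaml/.yml file under root and list offending documents."""
    hits = []
    for path in sorted(root.rglob("*")):
        if path.suffix not in {".yaml", ".yml"} or not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for index, document in enumerate(DOC_SEPARATOR.split(text)):
            if is_plain_secret(document):
                hits.append(f"{path}: document {index}")
    return hits


def main(argv: list[str]) -> int:
    """Return 1 (fail the build) if any plain Secret manifest is found."""
    root = Path(argv[1]) if len(argv) > 1 else Path(".")
    hits = find_plain_secrets(root)
    for hit in hits:
        print(f"plain Secret manifest: {hit}")
    return 1 if hits else 0
```

A CI step would finish with sys.exit(main(sys.argv)) so that a non-zero return fails the job. Reintroducing the original base64 Secret on a branch and watching the job go red is the demonstration the stretch goal asks for.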
This is the loop you will run on every real deployment that touches a credential: prove where it leaks before you trust any fix, move it out of the image, enable encryption at rest and rewrite the existing Secrets so the copies already in etcd are encrypted too, make the committed artifact genuine ciphertext, switch injection to files to shrink the crash-dump surface, and rotate in place without a restart. Doing it once on a toy service turns “we use Secrets, so we are fine” into a threat model you can actually defend under audit.
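The reasoning behind the switch from env vars to files can be sketched in a few lines. Anything that snapshots the process environment, such as a crash reporter or a read of /proc/<pid>/environ on Linux, captures an env-injected secret verbatim, while file injection leaves only a path there. The crash_report function is a hypothetical stand-in for such a reporter, and the variable names and values are illustrative. A full memory dump can still contain a secret the process has loaded, which is why the surface shrinks rather than disappears.

```python
from pathlib import Path


def crash_report(environ: dict[str, str]) -> str:
    """Hypothetical crash reporter that attaches the whole environment."""
    return "\n".join(f"{key}={value}" for key, value in sorted(environ.items()))


def load_secret_from_file(path: Path) -> str:
    """File injection: the value is read on demand and never placed in the
    process environment."""
    return path.read_text(encoding="utf-8").strip()


# Env injection: the secret value itself is part of the environment.
ENV_STYLE = {"DB_PASSWORD": "s3cr3t-example"}

# File injection: only the location of the secret is in the environment.
FILE_STYLE = {"DB_PASSWORD_FILE": "/etc/secrets/db-password"}

if __name__ == "__main__":
    print("env injection report:\n" + crash_report(ENV_STYLE))
    print("file injection report:\n" + crash_report(FILE_STYLE))
```

The first report contains the password; the second contains only the mount path, which is useless without access to the pod's filesystem.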