Infrastructure as Code: build a drift-safe stack

Hands-on project — stand up a locked, versioned remote-state IaC stack, then deliberately induce a concurrency clash and a drift event and recover from each with evidence.

Level: Senior · Estimated time: 240 min

Reading about locked state and silent reverts is not the same as living through them. Stand up a small but real IaC stack with proper remote state, then deliberately walk it into the two incidents from the lesson — a concurrent-apply clash and an out-of-band drift — and recover from each the senior way, with evidence at every step.

Goal

Turn the unit’s mental model into muscle memory: configure a versioned, locked remote backend, prove the lock actually blocks a concurrent apply, induce real drift and resolve it deliberately instead of letting apply silently revert it, and keep secrets out of state.
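One way to prove the lock rather than assume it is to start two applies against the same backend at once and check that exactly one of them is refused. The sketch below assumes the `terraform` binary is on the PATH, that the working directory is already initialised against the locked remote backend, and that no `-lock-timeout` is set, so the losing run fails immediately instead of waiting. The function names are illustrative, and on a very small stack the two runs may not overlap at all, so a resource that takes a few seconds to create makes the clash more reliable.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# The message named in the acceptance criteria.
LOCK_ERROR = "Error acquiring the state lock"


def run_apply(workdir: str) -> str:
    """Run one apply and return stdout and stderr combined."""
    result = subprocess.run(
        ["terraform", "apply", "-auto-approve", "-input=false"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return result.stdout + result.stderr


def race_two_applies(workdir: str) -> list[str]:
    """Start two applies at the same time and collect both outputs."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(run_apply, workdir) for _ in range(2)]
        return [future.result() for future in futures]


def lock_blocked_one(outputs: list[str]) -> bool:
    """True when exactly one run was refused by the state lock."""
    return sum(LOCK_ERROR in output for output in outputs) == 1
```

Calling `lock_blocked_one(race_two_applies("."))` gives a yes/no answer, but the raw outputs are the evidence worth keeping for the write-up.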

Project

Objective

Build a small Terraform/OpenTofu (or Pulumi) stack with proper remote, versioned, locked state, then deliberately reproduce and recover from a concurrency clash and a drift event — proving each outcome with command output, not assertion.
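Since each outcome has to be proven with command output, it can help to capture every run the same way. The following is a minimal sketch of an evidence recorder, not part of any Terraform tooling: the `capture` helper, the log layout, and the label names are all assumptions made for illustration.

```python
import shlex
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path


def capture(label: str, cmd: list[str], log_dir: Path) -> tuple[int, Path]:
    """Run cmd and save its combined output and exit code to <log_dir>/<label>.log."""
    log_dir.mkdir(parents=True, exist_ok=True)
    started = datetime.now(timezone.utc).isoformat(timespec="seconds")
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep errors in order with normal output
        text=True,
    )
    output = result.stdout if result.stdout.endswith("\n") else result.stdout + "\n"
    log_path = log_dir / f"{label}.log"
    log_path.write_text(
        f"$ {shlex.join(cmd)}\n"
        f"# started {started}\n"
        f"{output}"
        f"# exit code {result.returncode}\n"
    )
    return result.returncode, log_path


# Hypothetical usage for the first two pieces of evidence:
#   capture("01-clean-apply", ["terraform", "apply", "-auto-approve"], Path("evidence"))
#   capture("02-noop-apply", ["terraform", "apply", "-auto-approve"], Path("evidence"))
```

Recording the exit code alongside the output matters here because the lock failure and a clean no-op can look similar at a glance but differ in exit status.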

Requirements

Acceptance criteria

Command output (not prose) showing: a clean apply, a no-op second apply, the 'Error acquiring the state lock' failure under concurrency, and a plan -refresh-only that reports the induced drift.
A short write-up of the drift resolution: which manual change you codified into config, which you let apply revert, and the reasoning for each — explicitly naming the silent-revert risk you avoided.
Evidence the backend is versioned and locked (backend config plus the object-version listing or the lock entry), and that the state file contains no plaintext secret.
A one-paragraph reflection connecting the exercise back to the unit: how the state file acted as both source of truth and hazard in your runs.
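The "no plaintext secret" criterion can be checked mechanically as well as by eye. The sketch below assumes a Terraform state file in the version 4 JSON layout (a top-level `resources` list whose entries carry `instances` with an `attributes` map) that has already been pulled and parsed, for example from `terraform state pull`. The key pattern is a rough heuristic chosen for illustration: it can produce false positives (an attribute such as `secret_arn`) and will miss secrets stored under innocuous names, so it supplements a manual review rather than replacing it.

```python
import re
from typing import Any, Iterator

# Heuristic only: attribute names that often hold secret material.
SUSPECT_KEY = re.compile(r"password|passwd|secret|token|private_key|api_key", re.IGNORECASE)


def _walk(value: Any, path: str) -> Iterator[tuple[str, Any]]:
    """Yield (dotted_path, leaf_value) for every leaf under value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from _walk(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from _walk(child, f"{path}[{index}]")
    else:
        yield path, value


def find_plaintext_secrets(state: dict) -> list[str]:
    """Return attribute paths whose name looks secret and whose value is a non-empty string."""
    findings = []
    for resource in state.get("resources", []):
        address = f"{resource.get('type')}.{resource.get('name')}"
        for instance in resource.get("instances", []):
            attributes = instance.get("attributes") or {}
            for path, value in _walk(attributes, address):
                leaf_name = path.rsplit(".", 1)[-1]
                if SUSPECT_KEY.search(leaf_name) and isinstance(value, str) and value:
                    findings.append(path)
    return findings
```

An empty result is supporting evidence, not proof. Attributes marked sensitive in configuration are still written to state in plaintext, so the scan is worth running even when plan output shows them redacted.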

Senior stretch

Add a scheduled drift-detection job (CI cron running plan -refresh-only) that opens an alert or PR when reality diverges from the declaration, so drift is reviewed before any apply silently resolves it.
Extract the stack into a reusable module with input variables and instantiate it twice (e.g. staging and prod) from the same source, proving reproducibility across environments.
Wire a CI pipeline with a concurrency group and a -lock-timeout so overlapping runs wait instead of failing, then simulate a crashed run that leaves a stale lock and document the safe recovery: confirm no run still holds the lock, force-unlock it by lock ID, then re-plan before any apply.
Convert one mutable resource to an immutable-replacement pattern (new image / create_before_destroy) and show how it shrinks the drift surface compared with in-place mutation.
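For the scheduled drift-detection stretch, the job needs something machine-readable to decide whether to alert. One possible approach is to save a refresh-only plan and summarise its JSON form. The sketch below assumes the JSON comes from `terraform show -json` on that saved plan and that it contains a `resource_drift` list whose entries have an `address` and a `change` with `before` and `after` values. Verify this against the JSON output format of your Terraform or OpenTofu version. The function names and the exit-code convention are illustrative.

```python
def summarize_drift(plan: dict) -> dict[str, list[str]]:
    """Map each drifted resource address to the top-level attributes that differ."""
    summary: dict[str, list[str]] = {}
    for entry in plan.get("resource_drift", []):
        change = entry.get("change", {})
        before = change.get("before") or {}  # value recorded in state
        after = change.get("after") or {}    # value found by the refresh
        changed = sorted(
            key
            for key in before.keys() | after.keys()
            if before.get(key) != after.get(key)
        )
        summary[entry["address"]] = changed
    return summary


def ci_exit_code(summary: dict[str, list[str]]) -> int:
    """Assumed convention for the cron job: 0 means clean, 2 means drift to review."""
    return 2 if summary else 0
```

In a CI cron job this could follow `terraform plan -refresh-only -out=drift.tfplan` and `terraform show -json drift.tfplan > drift.json`, with the summary posted to the alert or PR so a person makes the keep-or-revert decision before any apply runs.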

Recap

This is the loop you will run on every real IaC stack: put state in a versioned, locked remote backend before anything else, prove the lock by trying to break it, treat drift as a question you answer with plan -refresh-only and a deliberate keep-or-revert decision rather than a blind apply, and keep secrets out of state entirely. Doing it once on a toy stack — including breaking it on purpose and recovering with evidence — makes the production version muscle memory instead of a 2am surprise.
