Image layers and the build cache: order is everything
A one-line CSS tweak triggers a four-minute CI build. Every push does the same: npm install reruns from scratch, pulling 1,200 packages over the network, even though package.json never changed. Someone wrote COPY . . before RUN npm install, so editing any file invalidates the install layer. The team had been paying ~3 minutes per push for months — a cache that was never hitting because the Dockerfile asked the wrong question of it first.
A layer is a diff, and the cache key is content
A Docker image is not one blob. It is a stack of read-only layers presented through a union filesystem (overlay2 on Linux), so the running container sees one merged tree but the storage is a chain of diffs. Each RUN, COPY, and ADD in your Dockerfile produces one layer — the filesystem changes that step made, nothing more.
The build cache keys each layer on two things: the instruction string itself and the content it depends on. For COPY and ADD, Docker computes a checksum of the files being copied; if a byte changes, the checksum changes, and the cache misses. For RUN, the cache key is the literal command text — Docker does not inspect what the command actually does, which is its own trap (more below). The rule that follows is unforgiving: the moment one layer’s key changes, that layer and every layer after it rebuild. Cache is a prefix match. You keep it only up to the first miss.
That single fact drives every optimisation in this lesson. Your job as the author of a Dockerfile is to arrange instructions so the steps that rarely change sit early (deep in the cache prefix) and the steps that change on every commit — your source code — sit last.
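As a rough illustration of the keying rules above, here is a minimal sketch of a static-site image. The `config/` and `public/` directories and the `/opt/site` path are hypothetical names chosen for the example; the comments describe the cache behaviour this section explains.

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm-slim

# RUN: keyed on the command text. This string never changes,
# so the layer is reused no matter what the command would do today.
RUN mkdir -p /opt/site

# COPY: keyed on the instruction plus a checksum of everything in config/.
# Change one byte in config/ and this step misses.
COPY config/ /opt/site/config/

# COPY: keyed on the checksum of public/.
# If the step above missed, this one rebuilds too, even when public/
# is byte-for-byte identical, because the cache is a prefix match.
COPY public/ /opt/site/public/
```

If `config/` changes more often than `public/` in your project, swapping the two COPY lines would keep more of the prefix cached.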
The cardinal rule: order least- to most-changing
Here is the bug from the opening scenario, and the fix.
| Step | Bad order (cache busts every edit) | Good order (install stays cached) |
|---|---|---|
| 1 | COPY . . | COPY package.json package-lock.json ./ |
| 2 | RUN npm ci | RUN npm ci |
| 3 | — | COPY . . |
| Effect of editing one source file | Step 1 checksum changes → install reruns (minutes) | Only step 3 misses → install is a cache hit (seconds) |
In the good order, COPY package.json package-lock.json ./ only changes its checksum when your dependency manifest or lockfile changes. Edit a component, and steps 1 and 2 stay cached; only the final COPY . . and whatever follows rebuild. This is the single highest-leverage change you can make to a Dockerfile: teams commonly report CI build times dropping by around 70% from layer caching alone, because the multi-minute dependency install becomes a near-instant cache hit on the vast majority of commits.
The same principle generalises. Pin your base image and install OS packages early; they change monthly at most. Copy lockfiles and install dependencies next. Copy source and build last. The frequency gradient — rarely-changing at the bottom, every-commit at the top — is the whole game.
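Put together, the frequency gradient might look like the sketch below. It assumes a Node project with an `npm run build` script; the single OS package (`ca-certificates`) is only a placeholder for whatever your app needs, and the `--no-install-recommends` flag and apt list cleanup are common conventions added here, not something this lesson requires.

```dockerfile
# syntax=docker/dockerfile:1

# 1. Base image: changes only when you bump the tag.
FROM node:22 AS builder
WORKDIR /app

# 2. OS packages: change monthly at most.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# 3. Dependency manifest and install: change when dependencies change.
COPY package.json package-lock.json ./
RUN npm ci

# 4. Source and build: change on every commit, so they go last.
COPY . .
RUN npm run build
```

An everyday source edit misses only at step 4, so everything above it is served from cache.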
Why this works
RUN is cached on the command text, not its result. RUN apt-get update will happily reuse a months-old cached layer because the string never changed, so a later RUN apt-get install runs against a stale package index and can fail or pull outdated versions. The fix is to join them in one instruction: RUN apt-get update && apt-get install -y curl. Now they share a cache key and invalidate together. This is also why ARG placement matters: changing a build arg's value invalidates the cache for the RUN instructions that follow its declaration, so declare it as late as the build allows.
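A minimal sketch of both points, assuming a Debian-based image. `GIT_SHA` and the `/srv/VERSION` file are hypothetical names used only for illustration, and the `--no-install-recommends` flag and list cleanup are optional extras.

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm-slim

# Risky when split: the update layer keeps its cached, possibly stale
# index even after the install line below it is edited.
#   RUN apt-get update
#   RUN apt-get install -y curl

# Joined: one instruction, one cache key, so editing the package list
# also refreshes the index.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /srv
COPY . .

# Declared late: a new GIT_SHA value can only invalidate the steps
# from here down, not the package install above.
ARG GIT_SHA=unknown
RUN echo "$GIT_SHA" > /srv/VERSION
```

You would pass the value at build time with something like `docker build --build-arg GIT_SHA=$(git rev-parse --short HEAD) .`.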
Multi-stage: compile fat, ship slim
Your build toolchain — compilers, dev dependencies, node_modules full of build-time packages, apt caches — has no business in the image you run in production. It bloats the image, widens the attack surface, and slows every pull and deploy. Multi-stage builds solve this: you write several FROM stages in one Dockerfile, do the heavy work in a fat builder stage, then COPY --from=builder only the finished artifacts into a slim runtime stage. Only the last stage becomes the published image; the builder is discarded.
# syntax=docker/dockerfile:1

# Builder stage: the full toolchain and dev dependencies live only here.
FROM node:22 AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Drop dev dependencies so only runtime packages are copied below.
RUN npm prune --omit=dev

# Runtime stage: no shell or package manager; the image's entrypoint is node.
FROM gcr.io/distroless/nodejs22-debian12
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["dist/server.js"]

The numbers are dramatic. A single-stage Node image commonly lands at 380MB or more (the full node:22 base is about 1.1GB before your app is added); the multi-stage version on a slim or distroless base drops to roughly 60MB. For a Go binary the contrast is starker: a single-stage build near 180MB collapses to a roughly 3–12MB image on scratch or distroless/static, because the compiled binary needs no language runtime at all. One widely-cited example shrank an 843MB image to 12.1MB, a 98.6% reduction.
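For the Go case, a multi-stage build could look roughly like this. It is a sketch under stated assumptions: the module has a `go.sum` file, the main package lives at `./cmd/server` (a hypothetical path), and the binary needs no cgo, which is why `CGO_ENABLED=0` is set to get a static binary.

```dockerfile
# syntax=docker/dockerfile:1

# Builder stage: full Go toolchain.
FROM golang:1.22 AS builder
WORKDIR /src

# Module files first, so dependency downloads stay cached across source edits.
COPY go.mod go.sum ./
RUN go mod download

COPY . .
# Static binary: no libc is needed in the runtime image.
RUN CGO_ENABLED=0 go build -o /out/server ./cmd/server

# Runtime stage: only the binary is copied in.
FROM gcr.io/distroless/static-debian12
COPY --from=builder /out/server /server
ENTRYPOINT ["/server"]
```

Swapping the final `FROM` for `scratch` would shave a little more, at the cost of the CA certificates and timezone data that distroless/static includes.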
Distroless vs alpine: the runtime-base tradeoff
Once you have a slim runtime stage, the base you pick is a real decision. Alpine is tiny because it uses musl libc and BusyBox, and it ships a package manager — convenient when you need to add a tool. The catch: musl differs from glibc subtly enough to cause native-module breakage and the occasional DNS or performance surprise. Distroless ships only your app and its direct runtime dependencies — no shell, no package manager, no apt — which is excellent for attack surface but means you cannot docker exec a shell into it to debug, and you must get every runtime dependency in via COPY.
A compiled service must ship as a small, hardened production image. Pick the final-stage base.
.dockerignore and the secret that never dies
Two failures round out the picture. First, the build context: when you run docker build ., the entire directory is tarred and sent to the daemon. Without a .dockerignore, that includes .git, node_modules, build output, and .env files — slowing the upload and risking that COPY . . rakes secrets and junk into a layer. A .dockerignore listing node_modules, .git, *.log, and .env keeps the context lean and the copy clean.
Second, the trap that bites teams hardest: a secret added in one layer and removed in a later layer still lives in the image. Layers are immutable diffs. If you COPY id_rsa (or echo a token into a file) and then RUN rm id_rsa two lines later, the removal is just a new diff on top, and the original file is still sitting in the earlier layer. Anyone with the image can unpack that layer (for example from a docker save archive) and read the file, and docker history shows the command text of each step, including any token echoed inline. The delete is theatre. The correct tools are BuildKit secret mounts (RUN --mount=type=secret,id=token ... makes the secret available for that one instruction and writes it to no layer) or, failing that, a multi-stage build where the secret lives only in a discarded builder stage. Never rm a secret and assume it is gone.
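A secret mount for a private npm registry might be sketched as follows. The assumptions here are illustrative, not part of the lesson: the project's `.npmrc` reads the token from an `NPM_TOKEN` environment variable, and the token sits in a local file named `npm_token.txt`.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:22 AS builder
WORKDIR /app

# Assumes .npmrc contains a line that references ${NPM_TOKEN}.
COPY package.json package-lock.json .npmrc ./

# The secret is mounted at /run/secrets/token for this one RUN only.
# It is not written to the layer, and the recorded command text contains
# the unexpanded $(cat ...) rather than the token value.
RUN --mount=type=secret,id=token \
    NPM_TOKEN="$(cat /run/secrets/token)" npm ci

COPY . .
RUN npm run build
```

The build would then be invoked with `docker build --secret id=token,src=./npm_token.txt .`, which requires BuildKit.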
You edit one source file and rebuild. npm ci reruns every time. What is the most likely cause?
A token was COPYed into a layer, then deleted with RUN rm in the next instruction. Is it safe?
Order a Node Dockerfile's steps for maximum cache reuse (least- to most-frequently-changing):
- 1 FROM node:22 AS builder — pin the base image (changes rarely)
- 2 RUN apt-get update && apt-get install -y <os-deps> — OS packages (monthly at most)
- 3 COPY package.json package-lock.json ./ — dependency manifest (changes when deps change)
- 4 RUN npm ci — install dependencies (cached until the manifest changes)
- 5 COPY . . then RUN npm run build — source code (changes every commit)
- 01 Explain why putting COPY . . before RUN npm ci wrecks build times, and the exact reorder that fixes it.
- 02 Why does deleting a secret in a later layer not remove it, and what should you do instead?
An image is a stack of read-only layers over a union filesystem, and each RUN, COPY, or ADD instruction produces one layer keyed on the instruction text plus the content it touches. The cache is a prefix match, so the first instruction whose key changes rebuilds itself and every instruction after it, which makes instruction order the single biggest lever you have. Put rarely-changing steps first: pin the base, install OS packages, copy the lockfile and install dependencies, and only then copy source and build, so an everyday code edit keeps the expensive install as a cache hit. Use multi-stage builds to compile in a fat builder and COPY only artifacts into a slim distroless or alpine runtime, dropping images from hundreds of MB to tens. Keep .git, node_modules, and .env out of the build context with .dockerignore. And remember that a secret added then rm'd still lives in an earlier layer's diff: reach for BuildKit secret mounts or a discarded builder stage, because a later layer can never delete anything from an earlier one.