Caching

ETags: make revalidation survive the load balancer

Crux Hands-on project: add ETag revalidation to a multi-replica API, reproduce the per-node 304 failure behind a load balancer, fix it with a content-derived tag, and prove the bandwidth win with before/after numbers.
◷ 210 min

Reading about the per-node ETag bug is not the same as watching it ruin your egress graph. Stand up a small multi-replica API, add ETags the naive way, reproduce the all-200s regression behind a load balancer, then fix it with a content-derived tag and prove the bandwidth win with real numbers.

Goal

Turn the unit’s mental model into a reproducible loop: implement conditional requests correctly, demonstrate the failure mode that hides in single-server dev, fix it at the root, and verify the 304 rate and bytes-saved with before/after measurements.

Project
Objective

Build a small JSON API that serves a rarely-changing resource, add ETag-based conditional requests, reproduce the per-node 304 failure across multiple replicas behind a load balancer, fix it with a content-derived validator, and prove the bandwidth saving with before/after numbers.
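To make the objective concrete, here is a minimal sketch of the two kinds of validator the project contrasts. The function names, the inode and mtime values, and the sample body are all hypothetical, chosen only for illustration; the naive scheme is modelled loosely on metadata-based tags and is not taken from any particular server.

```python
import hashlib


def node_local_etag(inode: int, mtime: float, size: int) -> str:
    """Naive validator built from file metadata (assumed scheme, for illustration)."""
    return f'"{inode:x}-{int(mtime):x}-{size:x}"'


def content_etag(body: bytes) -> str:
    """Validator that is a pure function of the bytes being served."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


body = b'{"plan": "pro", "seats": 25}'

# Two hypothetical replicas holding identical bytes, deployed a few seconds apart.
replica_a = {"inode": 1048593, "mtime": 1_700_000_000.0}
replica_b = {"inode": 2097211, "mtime": 1_700_000_004.0}

naive_a = node_local_etag(replica_a["inode"], replica_a["mtime"], len(body))
naive_b = node_local_etag(replica_b["inode"], replica_b["mtime"], len(body))

# Same content, different tags: a client validated by A gets a 200 from B.
print("node-local tags agree:", naive_a == naive_b)

# The content-derived tag has no per-node input, so every replica computes the same value.
print("content tags agree:", content_etag(body) == content_etag(bytes(body)))
```

Truncating the SHA-256 digest to 32 hex characters is an arbitrary choice here; any stable hash of the exact response bytes would serve the same purpose.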

Requirements
Acceptance criteria
  • A before/after table: 304 rate, total bytes transferred over N polls, and bytes-per-poll for unchanged content — measured under the same load against the multi-replica setup, not single-server.
  • A captured wire log (curl -v or DevTools) showing the same content yielding a consistent ETag across replicas after the fix, where it differed before.
  • Conditional handling is correct: 304 carries no body, a real content change produces a 200 with a new ETag, and the client picks up the new version on the next poll.
  • A short write-up naming why the original tag was node-local, why the content-derived tag is load-balancer-safe, and what a 304 actually saved here (bytes, not the round-trip).
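The conditional-handling criterion above can be sketched as a framework-free function. This is an assumption-laden illustration, not the project's required implementation: `respond`, `weak_match`, and `content_etag` are hypothetical names, and splitting `If-None-Match` on commas is a simplification that only works because these tags never contain a comma.

```python
import hashlib
from typing import Dict, Optional, Tuple


def content_etag(body: bytes) -> str:
    """Strong validator derived only from the response bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def weak_match(header_value: str, current: str) -> bool:
    """If-None-Match uses weak comparison: a leading W/ is ignored on both sides."""
    def opaque(tag: str) -> str:
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    if header_value.strip() == "*":
        return True
    # Simplification: assumes no tag contains a comma.
    return any(opaque(c) == opaque(current) for c in header_value.split(","))


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET of the current body."""
    etag = content_etag(body)
    if if_none_match is not None and weak_match(if_none_match, etag):
        # 304 repeats the validator but carries no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Type": "application/json"}, body


v1 = b'{"version": 1}'
status1, headers1, payload1 = respond(v1, None)       # first poll: full 200
tag = headers1["ETag"]
status2, headers2, payload2 = respond(v1, tag)        # unchanged: 304, empty body
v2 = b'{"version": 2}'
status3, headers3, payload3 = respond(v2, tag)        # changed: 200 with a new tag
print(status1, status2, status3)
```

On the poll after the change, the client sends the new tag from `headers3` and goes back to receiving 304s, which is the "picks up the new version on the next poll" behaviour the criterion asks for.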
Senior stretch
  • Add weak vs strong handling: emit a strong ETag for the identity body, emit a distinct ETag for each Content-Encoding when you gzip, and demonstrate that an If-Range request carrying a weak validator is correctly refused a partial response: the server ignores Range and returns the full 200 instead of a 206.
  • Emit Last-Modified alongside the ETag and construct a sub-second double-update that fools If-Modified-Since into a stale 304 while If-None-Match stays correct — proving the one-second blind spot.
  • Layer Cache-Control max-age in front of the ETag and show the composition: the client skips the request entirely while the response is fresh, then falls back to revalidation once it goes stale. Capture the request count for each phase.
  • Add an on-call runbook: how to spot the all-200s regression in an egress/304-rate dashboard, the per-node ETag checklist (inode, mtime drift, per-process state, compression), and a one-command wire check to confirm cross-replica consistency.
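The one-second blind spot from the Last-Modified stretch item can be modelled without a server. The sketch below assumes two hypothetical updates 500 ms apart inside the same wall-clock second; `http_date` and `ims_says_not_modified` are illustrative helpers, and the timestamps and bodies are made up.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def content_etag(body: bytes) -> str:
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def http_date(dt: datetime) -> str:
    """HTTP dates carry whole seconds only, so sub-second detail is lost."""
    return format_datetime(dt.replace(microsecond=0), usegmt=True)


def ims_says_not_modified(if_modified_since: str, modified_at: datetime) -> bool:
    """Server-side If-Modified-Since check at one-second resolution."""
    return modified_at.replace(microsecond=0) <= parsedate_to_datetime(if_modified_since)


# Hypothetical double update within the same second.
t1 = datetime(2024, 1, 1, 12, 0, 0, 200_000, tzinfo=timezone.utc)
t2 = t1 + timedelta(milliseconds=500)
v1, v2 = b'{"price": 10}', b'{"price": 11}'

# The client fetched v1 between the two updates and stored both validators.
client_last_modified = http_date(t1)
client_etag = content_etag(v1)

# The server now holds v2, modified at t2.
stale_304 = ims_says_not_modified(client_last_modified, t2)  # date check cannot see the change
etag_304 = client_etag == content_etag(v2)                   # tag check does see it

print("If-Modified-Since would answer 304:", stale_304)
print("If-None-Match would answer 304:", etag_304)
```

In the real exercise the same effect has to be provoked on the wire, for example by writing the resource twice in quick succession and polling between the writes.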
Recap

This is the loop you will run on any real revalidation incident: implement the conditional handshake correctly, reproduce the per-node failure that single-server dev hides, fix it by making the ETag a pure function of the content so every replica agrees, and verify with before/after 304 rate and bytes-saved under identical load. Doing it once on a toy API makes the production diagnosis — ‘why did our 304 rate fall off a cliff after we scaled out?’ — immediate.
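As a rough mental model of the before/after comparison, the loop can be simulated in a few lines. This is a toy model, not a measurement: it assumes strict round-robin balancing, a single polling client, an unchanged resource, and it counts body bytes only (headers are ignored). `per_node_tag`, `content_tag`, and `run_polls` are hypothetical names, and the numbers it prints are no substitute for the table captured against the real multi-replica setup.

```python
import hashlib
from itertools import cycle

# Stand-in payload: valid JSON padded with trailing whitespace to 2048 bytes.
BODY = b'{"feature": "on"}'.ljust(2048)


def per_node_tag(replica_id: int, body: bytes) -> str:
    """Tag with a node-specific input, so replicas disagree on identical content."""
    return f'"{replica_id}-{len(body):x}"'


def content_tag(replica_id: int, body: bytes) -> str:
    """Tag derived from the bytes alone; replica_id is deliberately unused."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def run_polls(tag_fn, replicas: int = 3, polls: int = 300) -> dict:
    cached_tag = None
    hits_304 = 0
    bytes_sent = 0
    # Round-robin: each poll lands on the next replica in turn.
    for _, replica_id in zip(range(polls), cycle(range(replicas))):
        current = tag_fn(replica_id, BODY)
        if cached_tag == current:
            hits_304 += 1              # validator matched: 304, no body
        else:
            bytes_sent += len(BODY)    # mismatch: full 200, client stores the new tag
            cached_tag = current
    return {
        "rate_304": hits_304 / polls,
        "bytes": bytes_sent,
        "bytes_per_poll": bytes_sent / polls,
    }


before = run_polls(per_node_tag)
after = run_polls(content_tag)
print("before:", before)
print("after: ", after)
```

Under these assumptions every poll in the "before" run hits a replica whose tag differs from the cached one, which is the all-200s pattern; with sticky sessions or random balancing the real 304 rate would sit somewhere between the two extremes.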

