Caching
SWR: free-recall review
Retrieval beats re-reading. For each prompt, say or write a full answer from memory before you open the model answer — the effort of recall is what makes the freshness-versus-latency model stick.
Reconstruct the unit’s spine — the two windows, the latency trade, stale-if-error, single-flight, the client model, and the auth boundary — without looking back at the lesson.
- 01 Why does max-age alone produce a p99 sawtooth, and how does stale-while-revalidate flatten it?
- 02 Read the header max-age=60, stale-while-revalidate=300, stale-if-error=86400 as nested windows. What happens in each?
- 03 What is the background-refresh stampede, and how do you prevent it?
- 04 Why must staleness always be bounded, and what goes wrong if it is not?
- 05 How do client libraries like SWR and React Query implement the same pattern, and what two behaviours do they add over the raw header?
- 06 When is stale-while-revalidate the wrong tool, and what is the senior rule?
If you could reconstruct each answer from memory, you hold the unit’s spine. With max-age alone, every expiry makes some request wait on the origin, and those cache misses show up in p99. stale-while-revalidate moves the refresh out of the request path, which flattens the sawtooth. The layered header is three nested windows (fresh, stale-while-revalidate, stale-if-error). Single-flight plus jitter tames the background-refresh herd. Staleness must be hard-bounded so that a down origin cannot serve content of unknown age forever. The client libraries reproduce the model and add request deduplication (SWR’s dedupingInterval) and focus/reconnect revalidation. And stale-while-revalidate is never applied to auth.
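To check your answer to the nested-windows prompt, the header can be read mechanically. The following is a minimal sketch, not a real cache: the names parse_cache_control, classify, and HEADER are hypothetical, and it assumes both stale windows are counted from the moment the response stops being fresh (which is how RFC 5861 defines them). It also ignores quoted directive values and every directive other than the three in the prompt.

```python
from typing import Dict

# The example header from the review prompt (value of Cache-Control).
HEADER = "max-age=60, stale-while-revalidate=300, stale-if-error=86400"


def parse_cache_control(header: str) -> Dict[str, int]:
    """Collect the integer-valued directives of a Cache-Control value."""
    directives: Dict[str, int] = {}
    for part in header.split(","):
        name, sep, value = part.strip().partition("=")
        if sep and value.strip().isdigit():
            directives[name.strip().lower()] = int(value)
    return directives


def classify(age_s: int, header: str, origin_ok: bool = True) -> str:
    """Decide what a cache may do with a stored response of the given age.

    Assumption: the stale windows start when freshness ends, so with the
    example header SWR covers ages 60..359 and stale-if-error 60..86459.
    """
    d = parse_cache_control(header)
    max_age = d.get("max-age", 0)
    swr = d.get("stale-while-revalidate", 0)
    sie = d.get("stale-if-error", 0)

    if age_s < max_age:
        return "fresh"  # serve from cache, no origin contact
    staleness = age_s - max_age
    if staleness < swr:
        # Serve the stale copy now, refresh off the request path.
        return "serve-stale-and-revalidate"
    if not origin_ok and staleness < sie:
        # Origin failed, but the copy is still inside the hard bound.
        return "serve-stale-on-error"
    # Past every stale window: the request must wait on the origin,
    # and if the origin is down there is nothing safe left to serve.
    return "fetch-from-origin" if origin_ok else "error"
```

For example, classify(30, HEADER) falls in the fresh window, classify(120, HEADER) is served stale while a refresh runs in the background, and classify(86460, HEADER, origin_ok=False) is past the stale-if-error bound, so the cache stops serving the old copy rather than returning content of unknown age.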