Caching
Cache stampede: free-recall review
Retrieval beats re-reading. For each prompt, say or write a full answer from memory before you open the model answer — the effort of recall is what makes the mechanisms stick when you are the one paged at 3am.
Reconstruct the unit’s spine without looking back: why the burst is the failure, where single-flight and a lock each bound the herd, why XFetch is coordination-free, what SWR trades away, why negative caching matters, and how a stampede becomes a metastable failure.
- 01. Why does a cache make the failure mode worse than no cache at all, even though total request volume is lower?
- 02. Contrast in-process single-flight with a Redis distributed lock: what each bounds, what each costs, and the order to apply them.
- 03. How does XFetch achieve 'expected exactly one early refresh per TTL window' with no coordination, and when does it fail?
- 04. What does stale-while-revalidate guarantee at a TTL boundary, what does it trade away, and how does CDN request coalescing extend it?
- 05. What is negative caching, and what attack does its absence enable?
- 06. Describe the metastable failure that can follow an unmitigated stampede and why the system cannot self-recover.
If you reconstructed each answer from memory, you hold the unit’s spine: the burst is the failure (not the volume), single-flight bounds the per-process herd and a lock bounds the per-fleet herd, XFetch refreshes hot keys before expiry with no coordination (and fails on cold keys), SWR buys zero wait at the price of bounded staleness, negative caching closes the miss-storm amplification hole, and an unmitigated stampede can tip into a metastable retry storm that only an external kill signal can break.
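To check your answer on XFetch against something concrete, here is a minimal sketch of probabilistic early expiration. It assumes the commonly cited rule, refresh when `now - delta * beta * ln(u) >= expiry`, where `delta` is the last recompute duration and `u` is uniform on (0, 1]. The in-process dict `_store` and the names `should_refresh_early`, `xfetch`, `now_fn`, and `rng` are hypothetical and used only for illustration; a real deployment would keep the value, `delta`, and expiry in the shared cache.

```python
import math
import random
import time
from typing import Any, Callable, Dict, Tuple

# Hypothetical in-process store: key -> (value, delta_seconds, expiry_time).
_store: Dict[str, Tuple[Any, float, float]] = {}


def should_refresh_early(
    now: float,
    delta: float,
    expiry: float,
    beta: float = 1.0,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Each reader decides independently; no lock or shared state is consulted."""
    u = 1.0 - rng()  # random.random() is in [0, 1), so u is in (0, 1] and log(u) is defined
    # log(u) <= 0, so the subtracted term pushes "now" forward by a random amount
    # that scales with how expensive the recompute is (delta) and with beta.
    return now - delta * beta * math.log(u) >= expiry


def xfetch(
    key: str,
    ttl: float,
    recompute: Callable[[], Any],
    beta: float = 1.0,
    now_fn: Callable[[], float] = time.monotonic,
    rng: Callable[[], float] = random.random,
) -> Any:
    now = now_fn()
    entry = _store.get(key)
    if entry is None or should_refresh_early(now, entry[1], entry[2], beta, rng):
        start = now_fn()
        value = recompute()
        delta = now_fn() - start  # measured recompute cost feeds the next decision
        _store[key] = (value, delta, now_fn() + ttl)
        return value
    return entry[0]
```

The early-refresh decision only runs when a request arrives, so a key that receives no reads shortly before its expiry gets no early refresh. That is the cold-key failure named in the summary above.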