Caching
Caching capstone: design a multi-layer strategy
Reading about composed caches is not the same as designing one that holds under traffic. Take a real service with a mix of static, public, and per-user responses, design a coherent multi-layer caching strategy, and prove it with evidence at every step: hit rate up, origin load down, no stale content after an edit, no dogpile, and no per-user leak.
Turn the track’s mental model into a deliverable design and a working slice: assign each layer its owned data, write the Cache-Control policy per resource class, add validators and stampede protection, wire invalidation into the write path, and verify the whole stack composes under load.
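As a starting point for the per-resource policy, here is a minimal sketch of a policy table that renders one Cache-Control header per resource class. The class name `CachePolicy`, the `POLICIES` keys, and every TTL value are illustrative assumptions, not recommendations from the track; the only constraint it tries to respect is that the browser TTL (max-age) is shorter than the shared-cache TTL (s-maxage), so TTLs shrink outward.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CachePolicy:
    """One row of a per-resource policy table (all values are placeholders)."""

    shared: bool                    # may a CDN or reverse proxy store this?
    max_age: int = 0                # browser TTL in seconds
    s_maxage: Optional[int] = None  # shared-cache TTL in seconds
    immutable: bool = False         # safe only for content-hashed URLs
    swr: Optional[int] = None       # stale-while-revalidate window in seconds
    sie: Optional[int] = None       # stale-if-error window in seconds

    def header(self) -> str:
        """Render the Cache-Control header value for this policy."""
        if not self.shared:
            # Per-user responses: never storable by a shared cache.
            return "private, no-store"
        parts = ["public", f"max-age={self.max_age}"]
        if self.s_maxage is not None:
            parts.append(f"s-maxage={self.s_maxage}")
        if self.immutable:
            parts.append("immutable")
        if self.swr is not None:
            parts.append(f"stale-while-revalidate={self.swr}")
        if self.sie is not None:
            parts.append(f"stale-if-error={self.sie}")
        return ", ".join(parts)


# Hypothetical policy table for the three resource classes in this capstone.
POLICIES: Dict[str, CachePolicy] = {
    "static_asset": CachePolicy(shared=True, max_age=31536000, immutable=True),
    "public_read": CachePolicy(shared=True, max_age=60, s_maxage=300,
                               swr=30, sie=86400),
    "per_user": CachePolicy(shared=False),
}

if __name__ == "__main__":
    for name, policy in POLICIES.items():
        print(f"{name}: Cache-Control: {policy.header()}")
```

Keeping the table in one place makes a Cache-Control change reviewable as a single diff, which is also where a per-user leak is easiest to catch before it ships.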
Design and partially implement a multi-layer caching strategy (CDN edge + reverse proxy + application cache + DB) for a real service with at least three resource classes: a hashed static asset, a public cacheable read, and a per-user authenticated response. Then prove the strategy holds under load with measured before/after numbers.
- A before/after table of cache hit rate (per layer), origin request rate, and p99 latency, measured under the same load rather than estimated, showing origin load down and hit rate up.
- A demonstrated edit-to-visible test: change a product, and show the new value appears at the browser within the documented invalidation window, with the propagation order (Redis → proxy → CDN) evidenced in logs or purge responses.
- A demonstrated no-leak test: confirm the per-user authenticated response is never stored by a shared cache (e.g. the response carries Cache-Control: private or no-store, and the CDN/proxy reports a bypass/MISS for it).
- A demonstrated no-dogpile test: expire the hot key under concurrent load and show the origin sees one recompute (single-flight) or zero blocking readers (stale-while-revalidate, SWR), not N simultaneous queries.
- A one-page write-up of the ownership map and the per-resource policy table, naming which mechanism defends each failure mode (leak, stale-after-edit, dogpile, outage) and why.
- Add an on-call runbook: how to read a sudden origin-load spike (which layer's hit rate dropped), how to safely purge a tag without a full flush, and a checklist for shipping a Cache-Control change without leaking per-user data.
- Add cache observability: per-layer hit/miss/bypass counters and an age-vs-freshness panel, so you can see at a glance which layer served each request.
- Add a probabilistic early-expiry (XFetch-style) refresh on the hot key and compare its origin-load profile against the single-flight + SWR combination under the same load.
- Extend the invalidation to be tag-based end to end (surrogate keys at the CDN and tagged Redis keys) and show that one product:42 purge drops every dependent cached response (page, fragment, and API read) without touching unrelated keys.
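The tag-based purge in the last item can be sketched at the application layer as a reverse index from tag to keys. This is a minimal in-memory stand-in, assuming a plain dict in place of Redis; `TaggedCache`, its method names, and the cache keys are all hypothetical. A real deployment would keep the index in Redis sets and send the same tag to the CDN as a surrogate key, using whatever header and purge API that vendor provides.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


class TaggedCache:
    """In-memory stand-in for a tagged cache (assumption: not real Redis)."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}
        # Reverse index: tag -> set of cache keys that depend on it.
        self._keys_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def set(self, key: str, value: str, tags: Iterable[str]) -> None:
        """Store a value and record every tag it depends on."""
        self._values[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key: str) -> Optional[str]:
        return self._values.get(key)

    def purge_tag(self, tag: str) -> int:
        """Drop every key carrying this tag; return how many were removed."""
        keys = self._keys_by_tag.pop(tag, set())
        removed = 0
        for key in keys:
            # A key may already be gone if another of its tags was purged.
            if self._values.pop(key, None) is not None:
                removed += 1
        return removed


if __name__ == "__main__":
    cache = TaggedCache()
    cache.set("page:/products/42", "<html>...</html>", tags=["product:42"])
    cache.set("fragment:card:42", "<div>...</div>", tags=["product:42"])
    cache.set("api:/v1/products/42", '{"id": 42}', tags=["product:42"])
    cache.set("api:/v1/products/7", '{"id": 7}', tags=["product:7"])

    print("removed:", cache.purge_tag("product:42"))
    print("unrelated key still cached:", cache.get("api:/v1/products/7"))
```

The evidence for the stretch goal is the same shape as this demo: one purge call, a count of dependent entries dropped, and an unrelated key that still hits.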
This is the loop you will run when designing caching for any real service: assign ownership before tuning, write the Cache-Control policy per resource class with the s-maxage/max-age split and TTLs shrinking outward, make revalidation cheap with validators, defend the hot key with single-flight plus SWR, fail open with stale-if-error, and wire invalidation into the write path so a change propagates Redis → proxy → CDN in order. Then prove each property (hit rate, no leak, no stale-after-edit, no dogpile, survives outage) with measured numbers under identical load. Composing it once on a real service is what turns the production version into judgement rather than guesswork.
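For the hot-key defence, the single-flight half can be sketched in-process with a lock and an event: the first caller for a key runs the recompute, and every concurrent caller for the same key waits for that one result. `SingleFlight`, `_Call`, and `demo` are hypothetical names, and this covers only one process; several application instances would need a shared lock or request coalescing at the proxy, which this sketch does not attempt.

```python
import threading
import time
from typing import Any, Callable, Dict, List, Optional, Tuple


class _Call:
    """State shared between the leader and the callers waiting on it."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.result: Any = None
        self.error: Optional[Exception] = None


class SingleFlight:
    """Collapse concurrent calls for the same key into one execution."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight: Dict[str, _Call] = {}

    def do(self, key: str, fn: Callable[[], Any]) -> Any:
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._inflight[key] = call
        if leader:
            try:
                call.result = fn()  # the only origin hit for this key
            except Exception as exc:
                call.error = exc    # waiters see the same failure
            finally:
                with self._lock:
                    self._inflight.pop(key, None)
                call.done.set()     # wake every waiter
        else:
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result


def demo(readers: int = 20) -> Tuple[int, List[str]]:
    """Expire a hot key under concurrent readers; count origin recomputes."""
    flight = SingleFlight()
    release = threading.Event()
    origin_calls: List[int] = []
    results: List[str] = []

    def recompute() -> str:
        origin_calls.append(1)   # stands in for the expensive DB query
        release.wait(timeout=5)  # hold the flight open while readers pile up
        return "fresh-value"

    def reader() -> None:
        results.append(flight.do("product:42", recompute))

    threads = [threading.Thread(target=reader) for _ in range(readers)]
    for t in threads:
        t.start()
    time.sleep(0.3)              # let all readers reach the in-flight call
    release.set()
    for t in threads:
        t.join()
    return len(origin_calls), results


if __name__ == "__main__":
    calls, values = demo()
    print(f"origin recomputes: {calls}, readers served: {len(values)}")
```

The count of origin recomputes against the number of readers is the figure the no-dogpile test asks for; in the real test it would come from origin query logs rather than an in-process counter.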