Crux Read real cache-aside snippets (a key-design bug, a purge that races a reader, and a fixed-TTL stampede) and pick the highest-leverage fix a senior engineer would make first.
Invalidation bugs hide in the cache-key string and the order of two lines on the write path. Read each snippet, find the staleness gap it opens, and choose the fix a senior engineer would make first.
Goal
Practise the loop you run in every cache-consistency incident: read the key design and the write path, predict where stale data leaks in, and reach for the highest-leverage fix — not the biggest hammer.
Snippet 1 — the cache key
// GET /products?category=shoes&sort=price&page=2&utm_source=newsletter
function cacheKey(req) {
  return "products:" + req.url; // whole URL, query string and all
}

async function handler(req, res) {
  const k = cacheKey(req);
  let body = await redis.get(k);
  if (!body) {
    body = await db.queryProducts(req.query);
    await redis.set(k, body, "EX", 300);
  }
  res.send(body);
}
Quiz
The product catalogue rarely changes, yet hit rate is near zero and the origin is busy. What is wrong with this key, and what is the fix?
Heads-up A longer TTL would help a tiny bit, but the real defect is key fragmentation: identical results are cached under thousands of distinct keys, so almost every request misses regardless of TTL.
Heads-up A namespaced prefix is good practice and is not the problem. The fragmentation comes from folding tracking/ordering noise into the key, not from the prefix.
Heads-up Pipelining is a throughput micro-optimisation. It does nothing for a hit rate destroyed by encoding irrelevant query params into the cache key.
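The fix that the feedback above points to, building the key from only the params that change the response, might look like the minimal sketch below. The allow-list `KEY_PARAMS` is an assumption about which params affect this endpoint, and lower-casing values assumes they are case-insensitive. Neither comes from the original handler.

```javascript
// Hypothetical allow-list: the only query params assumed to change the response.
const KEY_PARAMS = ["category", "sort", "page"];

// Build a key from allow-listed params only, in a fixed (sorted) order,
// so tracking params and param ordering can no longer fragment the cache.
function cacheKey(query) {
  const parts = [];
  for (const name of [...KEY_PARAMS].sort()) {
    const value = query[name];
    if (value === undefined || value === "") continue; // skip absent params
    // Assumption: values are case-insensitive, so normalise them.
    const normalised = String(value).trim().toLowerCase();
    parts.push(name + "=" + encodeURIComponent(normalised));
  }
  return "products:" + parts.join("&");
}

const a = cacheKey({ category: "shoes", sort: "price", page: "2", utm_source: "newsletter" });
const b = cacheKey({ page: "2", utm_source: "twitter", sort: "price", category: "Shoes" });
console.log(a);       // products:category=shoes&page=2&sort=price
console.log(a === b); // true
```

Both requests now share one cache entry, even though their tracking params, param order, and letter case differ.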
Snippet 2 — the purge on write
async function updateUser(id, patch) {
  const row = await db.users.update(id, patch); // DB is now fresh
  await redis.del("user:" + id); // drop the cache entry
  return row;
}

// elsewhere, the read path (cache-aside):
async function getUser(id) {
  const k = "user:" + id;
  let u = await redis.get(k);
  if (!u) {
    u = await db.users.find(id); // may read a pre-update row
    await redis.set(k, u, "EX", 3600); // ...and write it back
  }
  return u;
}
Quiz
This passes every local test but reverts user edits in production for up to an hour. What is the defect and the cheapest production patch?
Heads-up Reordering the DEL and the update does not help: with the DEL first, a reader can still miss after the DEL and before the commit, read the old row, and repopulate the cache. The race window lies between the reader's database read and its SET, and moving the DEL does not close it.
Heads-up A shorter TTL only shrinks how long the stale value survives — it does not stop the racing reader from writing it. The fix targets the race (double-delete or leases), not the backstop.
Heads-up Retrying the DEL addresses the separate dual-write failure, not this race. Even a DEL that succeeds is undone by a reader's later SET, so a retry alone leaves the bug in place.
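The delayed double-delete named in the feedback above can be sketched roughly as follows. To keep it runnable without Redis or a database, `cache` and `db` are in-memory Maps standing in for both. `SECOND_DELETE_DELAY_MS` is a hypothetical value that would need tuning to exceed the slowest reader's read-then-SET window, and `demo` is a hypothetical helper that replays the race by hand.

```javascript
// In-memory stand-ins (assumptions) so the sketch runs on its own.
const cache = new Map();
const db = new Map([[1, { id: 1, name: "old" }]]);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical delay; assumed to be longer than a reader's DB read plus SET.
const SECOND_DELETE_DELAY_MS = 50;

async function updateUser(id, patch) {
  const row = { ...db.get(id), ...patch };
  db.set(id, row);            // 1. commit the write
  cache.delete("user:" + id); // 2. first delete, as in the original snippet
  // 3. second delete, scheduled to land after any racing reader's late SET
  setTimeout(() => cache.delete("user:" + id), SECOND_DELETE_DELAY_MS);
  return row;
}

// Replays the race: a slow reader fetched the old row before the update
// and writes it back to the cache only after the first delete has run.
async function demo() {
  const staleRow = db.get(1);                 // reader's pre-update DB read
  await updateUser(1, { name: "new" });       // writer commits and deletes
  cache.set("user:1", staleRow);              // reader's late SET
  const cachedAfterRace = cache.get("user:1").name;
  await sleep(SECOND_DELETE_DELAY_MS * 2);    // wait for the second delete
  const stillCached = cache.has("user:1");
  return { cachedAfterRace, stillCached };
}
```

Calling `demo()` resolves to `{ cachedAfterRace: "old", stillCached: false }`: the stale entry does get written, but the second delete clears it shortly afterwards instead of letting it live for the full TTL. An in-process timer is lost if the process dies before it fires, so a real deployment would likely schedule the second delete somewhere durable.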
Snippet 3 — the TTL
const TTL = 600; // 10 minutes, same for every entry

async function cacheConfig(tenantId) {
  const k = "config:" + tenantId;
  let cfg = await redis.get(k);
  if (!cfg) {
    cfg = await db.loadConfig(tenantId); // expensive: ~400ms, joins 5 tables
    await redis.set(k, cfg, "EX", TTL);
  }
  return cfg;
}
Quiz
After a fleet-wide deploy that warms many tenants' configs at once, latency p99 spikes hard every 10 minutes. What is happening and the first fix?
Heads-up The join cost is real but secondary — the periodic p99 spike is driven by synchronized expiry stampeding the origin. Jitter and single-flight remove the spike without a schema change.
Heads-up Eviction is not periodic like this; the 10-minute periodicity matches the fixed TTL exactly. This is synchronized expiry, fixed with jitter, not an eviction-policy change.
Heads-up A longer fixed TTL only stretches the interval between identical synchronized stampedes — the spike is just as sharp when it fires. Scattering expiries with jitter is what removes the spike.
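Jitter and single-flight, the two fixes named in the feedback above, could be combined along the lines of this sketch. The ±20% spread is an arbitrary illustrative value. The `cache` and `loadConfig` arguments are injected stand-ins for the snippet's `redis` and `db.loadConfig`, and `jitteredTtl`, `singleFlight`, and `inFlight` are hypothetical names. The single-flight map only collapses concurrent misses inside one process, and a fleet-wide guard would need a shared lock or lease.

```javascript
const BASE_TTL = 600; // seconds, as in the snippet
const JITTER = 0.2;   // hypothetical: spread expiries over +/-20%

// Returns a TTL drawn uniformly from [base * (1 - spread), base * (1 + spread)).
function jitteredTtl(base = BASE_TTL, spread = JITTER, rand = Math.random) {
  const factor = 1 + (rand() * 2 - 1) * spread;
  return Math.max(1, Math.round(base * factor));
}

const inFlight = new Map(); // key -> promise of the load currently running

// Runs loader at most once per key at a time; concurrent callers share the result.
function singleFlight(key, loader) {
  if (inFlight.has(key)) return inFlight.get(key);
  const p = Promise.resolve()
    .then(loader)
    .finally(() => inFlight.delete(key)); // clear on success or failure
  inFlight.set(key, p);
  return p;
}

// cache needs async get(key) and set(key, value, "EX", seconds).
async function cacheConfig(tenantId, cache, loadConfig) {
  const k = "config:" + tenantId;
  const hit = await cache.get(k);
  if (hit) return hit;
  return singleFlight(k, async () => {
    const cfg = await loadConfig(tenantId);
    await cache.set(k, cfg, "EX", jitteredTtl());
    return cfg;
  });
}
```

With the default spread, entries warmed in the same deploy expire anywhere between 480 and 720 seconds later rather than all at 600, and concurrent misses for one tenant within a process trigger a single load.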
Recap
Three classic invalidation bugs, all read straight from the code. First, a cache key that folds in irrelevant params shatters hit rate, so build keys from only the response-affecting params, normalised. Second, delete-on-write races a concurrent reader's repopulating SET; the cheapest patch is a delayed double-delete, with leases or write-through as the strong fix. Third, a fixed TTL warmed in one window stampedes the origin on every expiry, which jitter plus single-flight defuses. Read the key and the write path first, name the gap, then pick the smallest fix that closes it.