Crux Read real cache-aside snippets — lock-on-miss, request coalescing, XFetch — predict the behaviour, and pick the highest-leverage fix.
Stampede bugs hide in the cache-aside code itself: a lock with the wrong EX, a coalescer that registers the in-flight promise one line too late, an XFetch rule that fires too early. Read each snippet the way you would in review, then choose the fix a senior engineer makes first.
Goal
Practise reading the actual mitigation code — lock-on-miss, request coalescing, and probabilistic early expiration — and spotting the defect that a load test will eventually expose.
Snippet 1 — lock-on-miss
func get(ctx context.Context, key string) ([]byte, error) {
    if v, ok := cache.Get(key); ok {
        return v, nil
    }
    // miss: try to become the rebuilder
    locked := redis.SetNX(ctx, "lock:"+key, uuid, 30*time.Second).Val()
    if !locked { // someone else is rebuilding
        return rebuild(ctx, key) // <-- rebuild anyway
    }
    v, err := rebuild(ctx, key)
    if err == nil {
        cache.Set(key, v, 60*time.Second)
        redis.Del(ctx, "lock:"+key)
    }
    return v, nil
}
Quiz
The SetNX lock is acquired correctly, yet a load test still produces N concurrent DB rebuilds. Where is the bug?
Heads-up A long EX risks stale-lock delay on crash, but it does not cause N concurrent rebuilds. The defect is that lock-losers rebuild anyway instead of waiting.
Heads-up SET ... NX is atomic in Redis — exactly one caller wins. The failure is in what the losers do, not in the acquire.
Heads-up Set-then-Del is the correct order. Reordering does not change that every lock-loser still calls rebuild and floods the DB.
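One way to picture the fix is a loser that waits briefly and re-checks the cache instead of rebuilding. The sketch below is a minimal in-process simulation, not the lesson's Go code: plain dicts stand in for the cache and for Redis lock keys, and `set_nx`, `release`, `wait_step`, and `max_wait` are hypothetical names chosen for illustration. It also omits the lock's expiry, which a real Redis lock would need.

```python
import threading
import time
import uuid

cache = {}                      # stand-in for the shared cache
locks = {}                      # stand-in for Redis lock keys (no expiry here)
locks_guard = threading.Lock()  # makes set_nx/release atomic, like Redis would
stats = {"rebuilds": 0}
stats_guard = threading.Lock()

def set_nx(key, token):
    """Mimics SET key token NX: exactly one caller gets True."""
    with locks_guard:
        if key in locks:
            return False
        locks[key] = token
        return True

def release(key, token):
    """Delete the lock only if this caller still owns it."""
    with locks_guard:
        if locks.get(key) == token:
            del locks[key]

def rebuild(key):
    with stats_guard:
        stats["rebuilds"] += 1
    time.sleep(0.05)            # simulated slow DB query
    return "value-for-" + key

def get(key, wait_step=0.005, max_wait=1.0):
    if key in cache:
        return cache[key]
    token = uuid.uuid4().hex
    deadline = time.monotonic() + max_wait
    while True:
        if set_nx("lock:" + key, token):
            try:
                # Re-check: an earlier winner may have filled the cache.
                if key in cache:
                    return cache[key]
                value = rebuild(key)
                cache[key] = value
                return value
            finally:
                release("lock:" + key, token)
        # Lost the lock: wait, then re-check the cache instead of rebuilding.
        time.sleep(wait_step)
        if key in cache:
            return cache[key]
        if time.monotonic() >= deadline:
            return rebuild(key)  # last-resort fallback after max_wait
```

With twenty threads calling `get` for the same key, only the lock winner reaches `rebuild`; the others poll the cache until the value appears. The bounded `max_wait` is a design choice: it trades a possible extra rebuild for not blocking forever if the winner dies.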
Snippet 2 — request coalescing
const inflight = new Map(); // key -> Promise

async function getCoalesced(key) {
  const cached = await cache.get(key);
  if (cached !== null) return cached;
  const fresh = await rebuild(key); // (A) await the rebuild...
  inflight.set(key, fresh);         // (B) ...then record it
  const value = await fresh;
  cache.set(key, value, 60);
  inflight.delete(key);
  return value;
}
Quiz
This is meant to coalesce concurrent misses for the same key into one rebuild, but it never coalesces. What is wrong?
Heads-up JS is single-threaded for this code; Map access between awaits is fine. The bug is that the promise is registered after the await and the map is never consulted on entry.
Heads-up Missing jitter is a separate multi-key concern. This snippet fails to coalesce a single key because it awaits before registering and never checks the map.
Heads-up The delete runs after the value is cached, which is correct. The real defect is the await-before-register ordering that defeats coalescing entirely.
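The corrected ordering can be sketched in Python with asyncio, which has the same single-threaded, await-driven scheduling as the JavaScript snippet. This is an illustrative sketch under assumptions: the cache is a synchronous dict, and `get_coalesced`, `_rebuild_and_store`, and `stats` are hypothetical names. The two properties that matter are that the in-flight map is consulted on entry and that the task is registered with no await between the lookup and the registration.

```python
import asyncio

cache = {}
inflight = {}              # key -> asyncio.Task for the rebuild in progress
stats = {"rebuilds": 0}

async def rebuild(key):
    stats["rebuilds"] += 1
    await asyncio.sleep(0.01)   # simulated slow backend call
    return "value-for-" + key

async def _rebuild_and_store(key):
    try:
        value = await rebuild(key)
        cache[key] = value
        return value
    finally:
        inflight.pop(key, None)  # clear the entry even if the rebuild fails

async def get_coalesced(key):
    if key in cache:
        return cache[key]
    task = inflight.get(key)     # consult the map on entry
    if task is None:
        # Register before the first await so later callers can find the task.
        task = asyncio.create_task(_rebuild_and_store(key))
        inflight[key] = task
    return await task            # every caller awaits the same task
```

If the cache lookup itself is asynchronous, as in the snippet above, the in-flight check and registration still have to happen together after that lookup returns, with no await separating them.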
Snippet 3 — probabilistic early expiration (XFetch)
An operator wants fewer wasted early rebuilds on warm keys, so they raise beta from 1.0 to 4.0. What is the effect on a hot key vs a colder key?
Heads-up It is the inverse: a larger beta multiplies the left side, making it exceed ttl_remaining earlier, so the window opens sooner and fires more often.
Heads-up Both delta (the measured rebuild time) and beta scale the left side linearly, so beta directly controls how early the window opens. Changing it changes the firing time.
Heads-up Plain-TTL behaviour is the LOW-beta extreme, where the window collapses to the boundary. High beta refreshes earlier and more often.
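The heads-ups refer to a "left side" compared against ttl_remaining. A minimal sketch of that check, assuming the commonly described XFetch form delta * beta * -ln(u) >= ttl_remaining, where delta is the measured rebuild time and u is a uniform random draw, might look like the following. The function names and the `rand` parameter are hypothetical, added so the behaviour can be checked deterministically.

```python
import math
import random

def should_refresh_early(delta, beta, ttl_remaining, rand=random.random):
    """XFetch-style check: refresh when delta * beta * -ln(u) >= ttl_remaining."""
    u = 1.0 - rand()  # random.random() is in [0, 1); shift to (0, 1] so log is defined
    return delta * beta * -math.log(u) >= ttl_remaining

def refresh_probability(delta, beta, ttl_remaining):
    """Chance that a single call fires: -ln(u) is Exp(1), so P = exp(-ttl / (delta * beta))."""
    return math.exp(-ttl_remaining / (delta * beta))
```

Under these assumptions, with delta = 2 s and 10 s of TTL remaining, a single request triggers a refresh with probability of roughly 0.7% at beta = 1.0 and roughly 29% at beta = 4.0. Raising beta therefore opens the window sooner, and because each request rolls independently, a key with more traffic crosses the threshold earlier than a colder one.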
Snippet 4 — lock with a fencing-token write
def rebuild_and_write(key, my_token):
    value = rebuild(key)  # may take longer than the lock EX
    if redis.get("lock:" + key) != my_token:
        return  # we lost the lock; abort the write
    redis.set("cache:" + key, value, ex=60)
Quiz
The fencing check 'GET lock then SET cache' guards against a slow rebuild that outlived its lock. What residual race remains, and what closes it?
Heads-up The check is two separate round-trips. The lock can expire in the gap, so the write must be made atomic or carry a monotonic version to be race-free.
Heads-up Type handling is a trivial detail; assume it is correct. The real residual race is the non-atomic check-then-write window.
Heads-up The cache TTL is unrelated to the lock race. The residual defect is the non-atomic gap between reading the token and writing the value.
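One way to close the gap is to move the check and the write into a single Lua script, which Redis executes atomically so the lock cannot expire between the two steps. The sketch below assumes a redis-py style client whose `eval(script, numkeys, *keys_and_args)` method is available; `write_if_lock_held` and `FENCED_WRITE` are hypothetical names, and the key prefixes follow the snippet above.

```python
# Runs inside Redis as one atomic step: compare the lock token, then write.
FENCED_WRITE = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
  redis.call('SET', KEYS[2], ARGV[2], 'EX', ARGV[3])
  return 1
end
return 0
"""

def write_if_lock_held(client, key, my_token, value, ttl_seconds=60):
    """Return True if the cache write happened, False if the lock was lost.

    `client` is assumed to expose eval(script, numkeys, *keys_and_args),
    as redis-py clients do.
    """
    wrote = client.eval(
        FENCED_WRITE,
        2,                              # number of KEYS that follow
        "lock:" + key, "cache:" + key,  # KEYS[1], KEYS[2]
        my_token, value, ttl_seconds,   # ARGV[1], ARGV[2], ARGV[3]
    )
    return wrote == 1
```

A caller would run the rebuild first and then call `write_if_lock_held(client, key, my_token, value)` in place of the separate GET and SET. The alternative named in the heads-up, a monotonic version stored with the value, avoids depending on the lock at write time but needs the writer to compare versions in the same atomic step.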
Recap
Every stampede defence lives in code that is easy to get subtly wrong: a lock only helps if the losers wait and re-check rather than rebuild anyway; a coalescer only helps if the in-flight promise is registered synchronously before any await and consulted on entry; XFetch’s beta moves the early-refresh window the opposite way from most people’s intuition (higher beta means earlier, more frequent refreshes); and a fencing-token check is only safe when the read-then-write is atomic or backed by a monotonic version. Read the mitigation, trace two concurrent callers through it, and the bug usually shows itself before any load test does.