awesome-everything RU
↑ Back to the climb

Caching

Dogpile effect: the concurrency collision at the instant a hot key expires

Crux When a hot key expires, every in-flight request misses at the same instant and recomputes the same value in parallel — one expensive query becomes N. Single-flight locks, XFetch early expiry, and TTL jitter break the collision.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at junior altitude — the surface
◷ 16 min

3am page. The homepage feed query — a 200ms aggregation that normally runs maybe twice a minute — is suddenly running thousands of times a second. The database is at 100% CPU and the connection pool is exhausted. Nothing was deployed. Then someone reads the cache config: the feed key has a 5-minute TTL, and at 02:55:00 it expired. At that instant ~5000 requests were in flight, every one of them missed the same key, and all 5000 went to the database at once. One expired key, 5000 simultaneous identical queries.

Dogpile is the concurrency-at-expiry angle

There is a broader topic — cache stampede / thundering herd — about a missing key flooding the origin. The dogpile effect is the sharpest, most specific face of it: the collision that happens precisely at the moment of expiry on a single hot key. The hotter the key, the worse the dogpile, because hotness means many requests are concurrently in flight, and a TTL expiry is an event that hits all of them in the same millisecond.

Walk the timeline. The key is cached and serving thousands of reads per second straight from memory — the origin is idle. The TTL clock ticks to zero. Request #1 reads the cache, gets a miss, and starts recomputing. While it is still recomputing (it takes 200ms), requests #2 through #5000 arrive, each reads the cache, each also gets a miss — because #1 has not written the new value yet — and each also starts recomputing. The gap between “first miss” and “value rewritten” is the entire window in which the dogpile forms. A naive cache-aside read has no coordination across those concurrent misses, so the origin goes from idle to N-parallel in one tick.

The cost math is the whole problem

The reason the dogpile is dangerous is multiplicative, and the numbers are unforgiving. Take an expensive operation — say a 200ms database aggregation. At steady state behind a warm cache, it runs almost never. At the expiry instant, it runs once per concurrent request. One expensive query × N concurrent requests = origin meltdown.

Plug in real traffic. The Wikipedia framing: a page that takes 3 seconds to render at 10 requests per second means 30 processes recomputing simultaneously the moment it expires. Now scale it: 5000 requests/sec hitting a 200ms query means up to 5000 simultaneous database connections, each holding the query for 200ms — far past most pool limits (a Postgres pool is often 20–100 connections). The pool exhausts, healthy unrelated queries queue behind the herd, and a cache miss on one key takes the whole database down. The query cost was never the issue at steady state; the concurrency multiplier at the expiry instant is.

FixHow it breaks the collisionCost / failure mode a senior weighs
Single-flight lock (mutex)One request recomputes; the other N−1 wait or serve staleThe lock itself serializes; no timeout → deadlock if the recomputer crashes
XFetch (probabilistic early expiry)Each reader may recompute slightly before expiry, alone, so the key never expires under N readersNeeds to store delta (recompute cost); tune β; a little extra recompute work
Stale-while-revalidateServe the stale value instantly; one background job refreshesReaders briefly see stale data; needs a soft-TTL / hard-TTL split
TTL jitterRandomize each key’s TTL so a batch of keys never expires togetherSpreads keys apart but does nothing for a single hot key’s own herd

Single-flight: collapse N misses into one recompute

The first reflex is a lock. On a miss, the first request acquires a lock for that key and recomputes; everyone else who misses during the window blocks on the lock and, when it’s released, reads the freshly written value instead of recomputing. Go’s golang.org/x/sync/singleflight is the canonical implementation: call Do(key, fn) and if a call for that key is already in flight, additional callers wait and receive the original call’s result rather than re-executing fn. Reported impact is dramatic — Reddit cut database peak load ~3× with request coalescing on feed endpoints, and flooding a server with 100 concurrent requests for one key can deduplicate ~95% of the redundant expensive runs, turning many 3–5s waits into one 3–5s run shared by all callers.

The senior caveat: a local mutex only coalesces within one process. With 20 app instances behind a load balancer, each instance runs its own single-flight, so the origin still sees up to 20 concurrent recomputes — better than 5000, but not one. To get one recompute across the fleet you need a distributed lock (a Redis SET key val NX PX 30000, for example), which buys cross-process coordination at the cost of a network round-trip and a new dependency on the lock store’s availability.

Why this works

The reason “serve stale while one recomputes” is so attractive is that the dogpile only exists in the window between first-miss and value-rewritten. If readers never have to wait on that window — they get the previous value instantly and a single worker refreshes in the background — the collision has nothing to collide over. You trade a few seconds of staleness for never paying the herd. For a homepage feed, staleness of a few seconds is invisible; for a bank balance it is not. That domain judgement is the actual decision.

XFetch: never let the hot key expire under load

Locking fixes the symptom after the miss; probabilistic early expiration (XFetch, from the VLDB 2015 paper “Optimal Probabilistic Cache Stampede Prevention” by Vattani, Chierichetti, and Lowenstein) attacks the cause — it makes the key practically never expire while it is hot. Each reader, on a hit, rolls a die: it voluntarily recomputes early with a probability that rises as expiry approaches. The decision rule is:

time() − delta * beta * log(rand(0,1)) ≥ expiry

where delta is how long the last recompute took, beta is a tuning knob (the paper proves beta = 1 is effectively optimal), and log(rand(0,1)) draws from an exponential distribution. The elegance: a more expensive value (large delta) gets refreshed earlier and more eagerly, and because each reader decides independently, the first lucky reader recomputes alone — well before the TTL hits zero — and rewrites the key, so when the herd arrives the value is fresh and nobody else misses. The collision is dissolved because the expiry event no longer coincides with peak concurrency.

Senior failure modes: the cure that becomes the disease

Each fix introduces its own production hazard, and the dogpile postmortems are usually about the fix failing, not the original miss.

The recompute lock can become a serialization bottleneck. If the recompute is slow and the key is extremely hot, every reader now blocks on one lock holder for the full recompute duration — you have converted a parallelism storm into a latency cliff where p99 for that endpoint becomes the recompute time for everyone. Worse is a lock without a timeout: if the single recomputer crashes or hangs mid-flight while holding the lock, every reader blocks forever — a self-inflicted total outage where the database is healthy and idle but every request is deadlocked waiting on a lock that will never be released. Always set a lock TTL (and make it longer than the worst-case recompute, or the lock expires mid-recompute and you re-open the dogpile). TTL jitter is real but narrow: it stops a batch of keys cached together from expiring in the same second, but it does nothing for one individual hot key — that key still has exactly one expiry instant and its own herd waiting on it. Reaching for jitter to fix a single-hot-key dogpile is a common misdiagnosis.

Pick the best fit

A single hot key (the homepage feed, 200ms to recompute) takes the DB down every time its TTL expires under ~5000 concurrent req/s. Pick the primary fix.

Quiz

Why does the dogpile form even though only one key expired?

Quiz

You add a recompute lock but skip the lock timeout. What's the senior-level risk?

Order the steps

Order the timeline of a dogpile on one hot key:

  1. 1 The hot key is served from cache for thousands of req/s; the origin is idle
  2. 2 The TTL clock reaches zero and the key expires
  3. 3 Request #1 reads a miss and starts the 200ms recompute
  4. 4 Requests #2..N arrive, all read a miss (no value written yet) and each starts its own recompute
  5. 5 N parallel identical queries hit the origin at once; pool exhausts, p99 explodes
Recall before you leave
  1. 01
    Explain to a teammate why the dogpile is specifically a concurrency-at-expiry problem, not just 'the cache missed', and why a single hot key is the worst case.
  2. 02
    A junior adds a recompute lock and the next dogpile is worse, not better. What are the two ways the lock itself became the failure?
Recap

The dogpile effect is the concurrency collision at the exact instant a single hot key expires: every request in flight reads a miss in the same window — before the first recompute has written the value back — and they all stampede the origin in parallel. The danger is multiplicative, not about the query cost itself: one expensive operation times N concurrent requests is the meltdown, and a 200ms query at 5000 req/s becomes 5000 simultaneous connections that exhaust a 20–100 connection pool and take down healthy unrelated traffic with it. Three fixes attack it from different angles. A single-flight lock collapses the N concurrent misses into one recompute while the rest wait or serve stale — but a local mutex only coalesces per process (use a distributed lock across instances), and a lock without a timeout deadlocks every reader if the recomputer crashes. XFetch (probabilistic early expiration, VLDB 2015) makes the hot key never actually expire under load by having each reader independently roll to recompute slightly early, so the first lucky reader refreshes alone before the herd arrives. TTL jitter staggers a batch of keys cached together but does nothing for one individual hot key’s own herd. The senior judgement is matching the fix to the shape of the problem — and remembering that most dogpile postmortems are about the fix deadlocking or serializing, not the original miss.

Continue the climb ↑Dogpile: multiple-choice review
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources4
expand
  1. 01
  2. 02
  3. 03
  4. 04

Trademarks belong to their respective owners. Editorial reference only.