What a cache stampede is and why it makes things worse

A single TTL expiry under concurrent traffic forces every waiting request to rebuild the cache at once, turning the cache into a weapon against its own database.

A flash sale launches at noon. The homepage is cached with a 30-second TTL. At 12:00:30 the TTL expires, and 10,000 customers hit the database at once. The cache was supposed to prevent this.

The shape of the failure

A cache layer works by absorbing repeated reads. When a key is live, every request returns in microseconds without touching the database. The flaw appears at TTL expiry: the instant the key becomes stale, every concurrent request sees a miss simultaneously.
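As a rough illustration of the hit-then-miss behaviour at expiry, here is a minimal in-memory sketch. The `TTLCache` and `FakeClock` classes are hypothetical names invented for this example, not any real library's API, and the hand-advanced clock exists only so the expiry can be shown without waiting 60 seconds.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Minimal in-memory cache: each key expires a fixed time after it is set."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be shown without sleeping
        self._store: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: from now on every reader sees a miss
            return None
        return value


class FakeClock:
    """Hand-advanced clock, used only to make the example deterministic."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
cache = TTLCache(clock)
cache.set("homepage:v1", "<html>...</html>", ttl_seconds=60)

clock.now = 59.9
live = cache.get("homepage:v1")     # still a hit

clock.now = 60.0
expired = cache.get("homepage:v1")  # miss: the key has just expired
```

Nothing in `get` distinguishes the first reader after expiry from the thousandth: every caller that arrives before a new `set` receives the same `None`.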

Under low traffic this is fine: one request misses, rebuilds, stores the new value, and the next request hits. Under high traffic the gap between expiry and rewrite is no longer empty. All N requests that arrive between the moment the key expires and the moment the first rebuild writes the new value see the same miss. Each one independently runs the rebuild. The database, which the cache was shielding, now sees N parallel queries.
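The concurrent-miss behaviour described above can be reproduced in a few lines. This is a sketch under stated assumptions, not a production pattern: a plain dict stands in for the cache, `time.sleep` stands in for the expensive database query, and the hypothetical `run_stampede` helper uses a `threading.Barrier` purely to guarantee that every request has read the cache before any of them writes, so the result is the same on every run.

```python
import threading
import time


def run_stampede(concurrent_requests: int, rebuild_seconds: float = 0.05) -> int:
    """Return how many rebuilds run when all requests miss the same expired key."""
    cache: dict = {}  # an empty dict models the just-expired key
    rebuild_count = 0
    count_lock = threading.Lock()  # protects the counter only, not the rebuild
    # Holds each request after its cache read, so no request can write the new
    # value before all of them have observed the miss.
    all_have_missed = threading.Barrier(concurrent_requests)

    def rebuild() -> str:
        nonlocal rebuild_count
        with count_lock:
            rebuild_count += 1
        time.sleep(rebuild_seconds)  # stand-in for the expensive DB query
        return "fresh homepage"

    def handle_request() -> None:
        value = cache.get("homepage:v1")
        if value is None:  # miss: nothing stops this request from rebuilding
            all_have_missed.wait()
            cache["homepage:v1"] = rebuild()

    threads = [
        threading.Thread(target=handle_request)
        for _ in range(concurrent_requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rebuild_count


rebuilds = run_stampede(concurrent_requests=50)
print(rebuilds)  # 50: one rebuild per request, although one would have been enough
```

The naive read path has no way for a request to learn that another request is already rebuilding, which is why the rebuild count equals the request count.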

Phase	Cache state	DB load
Normal operation (60 s window)	Key live, TTL > 0	Near zero
Expiry instant	Key expired	N concurrent rebuild queries
After first rebuild writes	Key live again	Near zero

The total number of database queries is low, because the cache absorbed the reads for 60 s. But the peak concurrency at expiry equals the full unfiltered traffic rate for that one second. The database was sized for steady state behind a cache, not for a one-second burst at the traffic ceiling.

When the key is live the cache absorbs reads; at the expiry instant the herd passes straight through the missed cache and arrives at the origin simultaneously — that synchronized burst is the stampede.

Why longer TTL does not help

The intuitive fix is “set a longer TTL”. This does not fix the stampede; it only changes how often it happens. A 1-hour TTL means the herd arrives once per hour instead of once per minute. Each hourly stampede is at least as large, because the size of the burst is set by the request rate and the rebuild time, not by the TTL. The right fix changes what happens at expiry, not when expiry occurs.
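A back-of-the-envelope sketch makes the point concrete. It assumes a deliberately simple model in which the burst is the number of requests that arrive while one rebuild is running, and it uses illustrative figures of 5,000 requests per second and a 400 ms rebuild. The `stampede_profile` function is a hypothetical helper, and the model ignores second-order effects such as the database slowing down under load.

```python
def stampede_profile(ttl_seconds: int, rps: int, rebuild_ms: int) -> dict:
    """Burst size and frequency for one hot key under a simple arrival model."""
    return {
        # Requests that arrive while the first rebuild is still running.
        # Note that ttl_seconds does not appear in this term.
        "queries_per_stampede": rps * rebuild_ms // 1000,
        # Approximate number of expiries in one hour of steady traffic.
        "stampedes_per_hour": 3600 // ttl_seconds,
    }


one_minute_ttl = stampede_profile(ttl_seconds=60, rps=5000, rebuild_ms=400)
one_hour_ttl = stampede_profile(ttl_seconds=3600, rps=5000, rebuild_ms=400)

print(one_minute_ttl)  # {'queries_per_stampede': 2000, 'stampedes_per_hour': 60}
print(one_hour_ttl)    # {'queries_per_stampede': 2000, 'stampedes_per_hour': 1}
```

In this model the TTL changes only the second number. The database still has to survive the same burst each time the key expires.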

The bursty traffic shape is itself the problem

Without a cache, the database sees a steady 5,000 RPS. With a cache and TTL=60s, the database sees near 0 RPS for 59 seconds and then 5,000 RPS in one second. The total work is far lower — but the peak is identical to no-cache. The cache reshapes traffic from steady to bursty, and it is the burst, not the volume, that causes failures.

The cache flattens steady-state load to ~10 RPS, but the expiry burst spikes back to the full no-cache rate of 5,000 RPS — it is the peak, not the volume, that breaks the database.
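The contrast between total volume and peak load can be checked with a small per-second model. This is a sketch using the lesson's round numbers (5,000 RPS of traffic, about 10 background queries per second while the key is live, TTL of 60 s); the function names are hypothetical, and the whole expiry burst is assumed to land within a single second.

```python
TTL_SECONDS = 60
FULL_RATE = 5000  # requests per second reaching the site
BASELINE = 10     # background DB queries per second while the key is live


def db_load_without_cache(seconds: int) -> list:
    """Every request goes to the database, every second."""
    return [FULL_RATE] * seconds


def db_load_with_cache(seconds: int) -> list:
    """Seconds are numbered 1..seconds; the key expires at each multiple of the TTL."""
    return [
        FULL_RATE if t % TTL_SECONDS == 0 else BASELINE
        for t in range(1, seconds + 1)
    ]


no_cache = db_load_without_cache(120)
cached = db_load_with_cache(120)

print("total queries:", sum(no_cache), "vs", sum(cached))    # 600000 vs 11180
print("peak per second:", max(no_cache), "vs", max(cached))  # 5000 vs 5000
```

Over two minutes the cached system does under 2% of the work, yet its worst second is exactly as bad as running with no cache at all.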

A concrete timeline

T=0s — homepage:v1 cached with TTL=60s. Traffic: 5,000 RPS.
T=0s–59.9s — cache hits. DB sees ~10 QPS (health checks, etc.).
T=60.0s — key expires.
T=60.0s–60.4s — 2,000 requests arrive (5,000 RPS × 0.4s rebuild time). Each runs GET homepage:v1, gets nil, starts a 400ms rebuild. 2,000 parallel DB queries.
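T=60.4s — the first rebuild completes and writes the new value. The rebuilds already in flight finish over the following 400 ms, each overwriting the key with the same result. DB CPU falls.
T=60.4s onward — cache hits again until the rewritten key expires about 60 s later. Then the cycle repeats.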
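The 2,000 figure in the timeline can be reproduced by walking the arrivals one at a time. This is a sketch that assumes perfectly evenly spaced requests and a rebuild time that stays at 400 ms under load; `requests_missing_cache` is a hypothetical helper, and integer microseconds are used to avoid floating-point rounding at the boundary.

```python
def requests_missing_cache(rps: int, rebuild_ms: int) -> int:
    """Count requests that arrive after expiry but before the first rebuild writes.

    Assumes evenly spaced arrivals and an rps value that divides 1,000,000.
    """
    interval_us = 1_000_000 // rps      # gap between arrivals, in microseconds
    first_write_us = rebuild_ms * 1000  # when the first request finishes rebuilding
    misses = 0
    arrival_us = 0                      # first request arrives at the expiry instant
    while arrival_us < first_write_us:
        misses += 1                     # this request sees nil and starts a rebuild
        arrival_us += interval_us
    return misses


burst = requests_missing_cache(rps=5000, rebuild_ms=400)
print(burst)  # 2000
```

The loop stops at the first arrival that lands on or after the write, which is the first request to see a cache hit again.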

Quiz

What triggers a cache stampede?

Quiz

Why does increasing the cache TTL from 60 s to 1 hour not fix stampede?

Order the steps

Put the events of a cache stampede in order:

1 A hot cache key has TTL=60 s and receives 5,000 RPS of read traffic
2 Second 60 arrives: the cache key expires
3 5,000 concurrent requests in the next second all see a cache miss
4 All 5,000 requests run the same expensive backend rebuild independently
5 Database CPU saturates at 100%; renders queue, then time out
6 Half the requests succeed in writing the new value; the other half error to the user
7 For the next 60 s the cache absorbs traffic normally — until the next expiry

Complete the analogy

Fill in the blank: a cache stampede happens because many requests miss the cache _______, all at the same instant.

One expiry fans out to every in-flight request; each runs the same rebuild, so the DB sees N concurrent queries — the full unfiltered traffic rate in one burst.

Recall before you leave

01
Why can a cache make this failure mode worse than having no cache at all, even though the total query volume reaching the database is lower?
02
A homepage cached with a 60 s TTL receives 2,000 requests per second. The rebuild takes 400 ms. How many parallel DB queries does a stampede produce?

Recap

A cache stampede is not a server crash or a misconfiguration — it is the normal TTL mechanism combined with concurrent traffic. When a hot key expires, every in-flight request sees a miss and runs the rebuild independently. The database, sized for cached steady-state, sees a one-second burst equal to the full unfiltered traffic rate. Increasing the TTL only moves the burst in time; it does not prevent it. The next lesson covers the two simplest mitigations: the distributed lock and in-process single-flight.

Apply this

Put this lesson to work on a real build.

Cache stampede lab: Reproduce a thundering-herd cache miss under load, then kill it with single-flight and early-expiry recomputation.

URL shortener at scale: Build a URL shortener that survives real traffic, then run it: deploy it, watch it, and work the incident when one hot link melts your cache.