Caching CACHE · 03 · 03

XFetch: coordination-free probabilistic early expiration

Vattani et al.''''s XFetch algorithm refreshes the cache before TTL fires using a probabilistic decision rule — no locks, no coordination, expected exactly one early refresh per TTL window regardless of fleet size.

CACHE Middle ◷ 12 min

Level

FoundationsJuniorMiddleSenior

A lock-based mitigation adds a Redis round-trip to every cache miss — fine for ultra-hot keys, wasteful for the 99% of keys that rarely collide. XFetch achieves the same “only one rebuild per TTL window” guarantee with zero coordination and zero network cost.

The decision rule

Why does this matter to you? Every hot key you cache with a plain TTL is a future stampede waiting to happen at expiry. XFetch lets you eliminate that risk with a single probability check on every read — no infrastructure changes required.

Each cache read evaluates:

should_refresh = (-beta * delta * ln(rand())) >= time_to_expire

Where:

delta — typical rebuild duration in seconds (e.g., 0.4)
beta — tuning constant, default 1.0
rand() — uniform random in (0, 1)
time_to_expire — seconds until the current TTL fires

As time_to_expire approaches 0 (close to expiry), the threshold on the left (-beta * delta * ln(rand())) is increasingly likely to exceed it. Far from expiry the probability is near zero. Near expiry it climbs fast.

Result: the cache key gets refreshed before it expires. The first request to fire the probabilistic threshold runs the rebuild while the existing cached value is still served to everyone else. By the time TTL would have fired, the cache already has a fresh value.

Time before TTL	delta=0.4 s, beta=1.0	Probability of early refresh per read
10 s	—	~4%
2 s	—	~18%
0.4 s (= delta)	—	~63%
0.1 s	—	~98%

The exact numbers depend on the distribution of reads, but the trend is clear: the probability concentrates in the final few seconds before expiry.

Expected refreshes per TTL window = 1

The key property Vattani et al. prove: integrating the probability over the TTL window gives exactly one expected early refresh, regardless of fleet size or read rate. A fleet of 100 nodes reading the key 1,000 times per second does not produce 100,000 early refreshes — it still produces approximately 1. The exponential distribution has the property that the minimum of N independent draws is also exponential with rate N × lambda. As fleet size grows, the earliest reader fires the threshold sooner, but the total count remains ~1.

When XFetch works and when it does not

XFetch requires reads to be frequent enough that “some reader” hits the probabilistic threshold before TTL. The algorithm is transparent when reads happen, say, every 100 ms over a 60-second TTL — many chances to fire the threshold in the final seconds.

Failure case: cold or sporadic keys. A key read once per 30 seconds with TTL=60 s may have only one or two reads in the entire “early refresh” window (the final ~1 s). With small N, the probability of missing the threshold entirely is significant. The key expires normally and stampedes.

Rule of thumb: use XFetch on keys with ≥ 10 reads per expected-rebuild-duration window before TTL. For colder keys, fall back to a lock or SWR.

Read rate per key is the decision: XFetch needs many reads in the final seconds before TTL. Below ~10 reads per rebuild window, reach for a lock or SWR instead.

▸Why this works

Why the exponential distribution? It is the continuous-time memoryless distribution — the minimum of N independent exponential draws is still exponential. This means XFetch’s “exactly 1 expected early refresh” property holds no matter how many readers or nodes exist. Any non-exponential decision rule either requires more expected refreshes as fleet size grows (wasted work) or accepts a non-zero miss probability (stampede). The exponential is uniquely optimal among coordination-free algorithms, which is the core result of Vattani et al. (VLDB 2015).

Tuning beta

beta controls how early the refresh window opens:

beta < 1.0: earlier refreshes, wider window, more wasted early rebuilds on cold keys. Use when rebuild cost is very low and staleness must be minimised.
beta = 1.0 (default): the window opens when time_to_expire ≈ delta. Balanced.
beta > 1.0: later refreshes, narrower window, higher risk of missing the threshold on cold keys. Use only when rebuild is expensive and early staleness must be avoided.

Quiz

A cache key has TTL=60 s and rebuild time delta=0.4 s. At what approximate time before TTL does XFetch (beta=1.0) typically fire the first early refresh?

Quiz

A read-heavy cache has cold keys accessed once every 30 s (TTL=60 s). Why does XFetch perform poorly here?

Quiz

A 200-node fleet reads a hot key 50,000 times per second. With XFetch enabled, approximately how many early rebuilds happen per TTL window?

The refresh probability concentrates in the last few seconds before TTL. One reader fires the early rebuild at ~delta before expiry; all others keep hitting the still-valid value, so the cache is never empty.

Recall before you leave

01
Why does XFetch work without any cross-fleet coordination while distributed locks require explicit cross-node communication?
02
What does increasing beta above 1.0 do to XFetch's behaviour, and what is the cost?

Recap

XFetch (Vattani et al., VLDB 2015) solves the stampede problem for hot keys without coordination. Each cache read evaluates -beta * delta * ln(rand()) >= time_to_expire. As TTL approaches, probability climbs; on average exactly one reader per TTL window fires an early rebuild while all others continue hitting the still-valid cached value. The exponential distribution is uniquely optimal among coordination-free algorithms. XFetch fails for cold keys (sporadic reads may miss the narrow window). The practical rule: use XFetch for keys with many reads per rebuild-duration window before TTL, and fall back to a lock or stale-while-revalidate for everything else. Now when you add XFetch to a hot key, check the read rate first — if reads per rebuild-duration are below 10, reach for SWR or a lock instead.

Practice

Start at the top. Tasks go easiest → hardest: recall a fact, apply it to a case, then a senior-level stretch. Open one, attempt it, then reveal.

recallapplystretch0 of 5 done

Connected lessons

builds on

Lock and single-flight: bounding concurrent rebuildsmiddle

unlocks

Stale-while-revalidate and CDN request coalescingmiddle

deepens into

Stale-while-revalidate and CDN request coalescingmiddle

appears again in204

Something unclear?

Ask a question about this lesson. Questions are anonymous and go straight to the author to make the lesson better.

Apply this

Put this lesson to work on a real build.

Cache stampede labReproduce a thundering-herd cache miss under load, then kill it with single-flight and early-expiry recomputation.