awesome-everything RU
↑ Back to the climb

Caching

XFetch: coordination-free probabilistic early expiration

Crux Vattani et al.''''s XFetch algorithm refreshes the cache before TTL fires using a probabilistic decision rule — no locks, no coordination, expected exactly one early refresh per TTL window regardless of fleet size.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at middle altitude — in the sky
◷ 12 min

A lock-based mitigation adds a Redis round-trip to every cache miss — fine for ultra-hot keys, wasteful for the 99% of keys that rarely collide. XFetch achieves the same “only one rebuild per TTL window” guarantee with zero coordination and zero network cost.

The decision rule

Each cache read evaluates:

should_refresh = (-beta * delta * ln(rand())) >= time_to_expire

Where:

  • delta — typical rebuild duration in seconds (e.g., 0.4)
  • beta — tuning constant, default 1.0
  • rand() — uniform random in (0, 1)
  • time_to_expire — seconds until the current TTL fires

As time_to_expire approaches 0 (close to expiry), the threshold on the left (-beta * delta * ln(rand())) is increasingly likely to exceed it. Far from expiry the probability is near zero. Near expiry it climbs fast.

Result: the cache key gets refreshed before it expires. The first request to fire the probabilistic threshold runs the rebuild while the existing cached value is still served to everyone else. By the time TTL would have fired, the cache already has a fresh value.

Time before TTLdelta=0.4 s, beta=1.0Probability of early refresh per read
10 s~4%
2 s~18%
0.4 s (= delta)~63%
0.1 s~98%

The exact numbers depend on the distribution of reads, but the trend is clear: the probability concentrates in the final few seconds before expiry.

Expected refreshes per TTL window = 1

The key property Vattani et al. prove: integrating the probability over the TTL window gives exactly one expected early refresh, regardless of fleet size or read rate. A fleet of 100 nodes reading the key 1,000 times per second does not produce 100,000 early refreshes — it still produces approximately 1. The exponential distribution has the property that the minimum of N independent draws is also exponential with rate N × lambda. As fleet size grows, the earliest reader fires the threshold sooner, but the total count remains ~1.

When XFetch works and when it does not

XFetch requires reads to be frequent enough that “some reader” hits the probabilistic threshold before TTL. The algorithm is transparent when reads happen, say, every 100 ms over a 60-second TTL — many chances to fire the threshold in the final seconds.

Failure case: cold or sporadic keys. A key read once per 30 seconds with TTL=60 s may have only one or two reads in the entire “early refresh” window (the final ~1 s). With small N, the probability of missing the threshold entirely is significant. The key expires normally and stampedes.

Rule of thumb: use XFetch on keys with ≥ 10 reads per expected-rebuild-duration window before TTL. For colder keys, fall back to a lock or SWR.

Why this works

Why the exponential distribution? It is the continuous-time memoryless distribution — the minimum of N independent exponential draws is still exponential. This means XFetch’s “exactly 1 expected early refresh” property holds no matter how many readers or nodes exist. Any non-exponential decision rule either requires more expected refreshes as fleet size grows (wasted work) or accepts a non-zero miss probability (stampede). The exponential is uniquely optimal among coordination-free algorithms, which is the core result of Vattani et al. (VLDB 2015).

Tuning beta

beta controls how early the refresh window opens:

  • beta < 1.0: earlier refreshes, wider window, more wasted early rebuilds on cold keys. Use when rebuild cost is very low and staleness must be minimised.
  • beta = 1.0 (default): the window opens when time_to_expire ≈ delta. Balanced.
  • beta > 1.0: later refreshes, narrower window, higher risk of missing the threshold on cold keys. Use only when rebuild is expensive and early staleness must be avoided.
Quiz

A cache key has TTL=60 s and rebuild time delta=0.4 s. At what approximate time before TTL does XFetch (beta=1.0) typically fire the first early refresh?

Quiz

A read-heavy cache has cold keys accessed once every 30 s (TTL=60 s). Why does XFetch perform poorly here?

Quiz

A 200-node fleet reads a hot key 50,000 times per second. With XFetch enabled, approximately how many early rebuilds happen per TTL window?

Recall before you leave
  1. 01
    Why does XFetch work without any cross-fleet coordination while distributed locks require explicit cross-node communication?
  2. 02
    What does increasing beta above 1.0 do to XFetch's behaviour, and what is the cost?
Recap

XFetch (Vattani et al., VLDB 2015) solves the stampede problem for hot keys without coordination. Each cache read evaluates -beta * delta * ln(rand()) &gt;= time_to_expire. As TTL approaches, probability climbs; on average exactly one reader per TTL window fires an early rebuild while all others continue hitting the still-valid cached value. The exponential distribution is uniquely optimal among coordination-free algorithms. XFetch fails for cold keys (sporadic reads may miss the narrow window). The practical rule: use XFetch for keys with many reads per rebuild-duration window before TTL, and fall back to a lock or stale-while-revalidate for everything else.

Connected lessons
appears again in178
Continue the climb ↑Stale-while-revalidate and CDN request coalescing
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources3
expand
  1. 01
  2. 02
  3. 03

Trademarks belong to their respective owners. Editorial reference only.