
Distributed Systems

Quorums: build a tunable-consistency store

Crux: a hands-on project. Build a tiny N=3 replicated key-value store with tunable R/W, then empirically demonstrate the overlap guarantee, a W=1 lost write, and a sloppy-quorum stale read.
◷ 240 min

Reading that R + W > N forces overlap is not the same as watching a stale read appear the instant you tune below it. Build a small replicated store with R and W as dials, then make the guarantee — and each way it breaks — happen on demand, with evidence.

Goal

Turn the overlap invariant into something you can reproduce: implement quorum reads and writes over N=3 simulated replicas, prove R + W > N gives the latest write, then deliberately induce a W=1 lost write and a sloppy-quorum stale read and capture both.
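The overlap claim rests on a short counting argument, sketched here. The symbols S_W (the replicas that acknowledged the latest write) and S_R (the replicas a read consults) are names introduced for illustration, and the argument assumes both sets are drawn from the same N replicas, which is exactly the assumption a sloppy quorum breaks.

```latex
\begin{aligned}
|S_W \cap S_R| &= |S_W| + |S_R| - |S_W \cup S_R| \\
               &\ge W + R - N \qquad \text{since } |S_W| \ge W,\ |S_R| = R,\ |S_W \cup S_R| \le N \\
R + W > N      &\implies |S_W \cap S_R| \ge 1
\end{aligned}
```

For N=3, R=2 and W=2 give a lower bound of 1, so every read touches at least one replica holding the latest write. R=1 and W=2 give a lower bound of 0, so a read can miss the write entirely.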

Project
Objective

Build a minimal N=3 replicated key-value store with per-operation tunable R and W, and produce a test report that empirically demonstrates the R + W > N overlap guarantee plus two of its silent failure modes — a W=1 lost write and a sloppy-quorum stale read.
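One possible starting shape for the store is sketched below. It is not a prescribed design: the names `Replica`, `QuorumStore`, `always_latest`, and the `order` parameter are all hypothetical. It assumes a single coordinator (so a counter serves as the version), and it models the worst case in which a write reaches only its W acknowledging replicas and the rest lag indefinitely.

```python
import itertools


class Replica:
    """One simulated replica: an in-memory map of key -> (version, value)."""

    def __init__(self):
        self.up = True
        self.data = {}

    def put(self, key, version, value):
        # Keep only the newest version seen for each key.
        if version > self.data.get(key, (0, None))[0]:
            self.data[key] = (version, value)

    def get(self, key):
        return self.data.get(key, (0, None))


class QuorumStore:
    def __init__(self, n=3):
        self.replicas = [Replica() for _ in range(n)]
        self.clock = 0  # single coordinator, so a counter is enough for versions

    def _live(self, order):
        # `order` is the sequence of replica indices the coordinator tries.
        indices = order if order is not None else range(len(self.replicas))
        return [self.replicas[i] for i in indices if self.replicas[i].up]

    def write(self, key, value, w, order=None):
        live = self._live(order)
        if len(live) < w:
            raise RuntimeError("write quorum unavailable")
        self.clock += 1
        for replica in live[:w]:  # worst case: only W replicas ever get the write
            replica.put(key, self.clock, value)
        return self.clock

    def read(self, key, r, order=None):
        live = self._live(order)
        if len(live) < r:
            raise RuntimeError("read quorum unavailable")
        # Return the (version, value) pair with the highest version among R replies.
        return max((rep.get(key) for rep in live[:r]), key=lambda vv: vv[0])


def always_latest(r, w, n=3):
    """Try every write order and read order; True if the read never misses the latest write."""
    for w_order in itertools.permutations(range(n)):
        for r_order in itertools.permutations(range(n)):
            store = QuorumStore(n)
            store.write("k", "old", w=n)
            latest = store.write("k", "new", w=w, order=list(w_order))
            version, _ = store.read("k", r=r, order=list(r_order))
            if version != latest:
                return False
    return True


for r, w in [(2, 2), (1, 3), (3, 1), (1, 2), (1, 1)]:
    print(f"R={r} W={w} R+W={r + w} overlap={r + w > 3} always_latest={always_latest(r, w)}")
```

Enumerating every replica ordering, rather than sampling, keeps the overlap-holds scenario deterministic: a single failing ordering is enough to show that a given (R, W) pair can serve a stale read.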

Requirements
Acceptance criteria
  • An automated test suite where each scenario (overlap-holds, W=1-lost-write, sub-overlap-drift, sloppy-stale-read, handoff-converges) is a named, repeatable test asserting the exact returned version — not a manual demo.
  • A short report table: for each (R, W) pair tried, list R + W, whether R + W > 3, and the observed result (latest / possibly-stale / lost), matching theory to observation.
  • The W=1 lost-write test reliably reproduces data loss of an acknowledged write, proving the failure is a configuration consequence and not a bug in your store.
  • A one-paragraph write-up explaining, for each failure scenario, which guarantee was forfeited and the exact arithmetic (R + W vs N) or membership change (hint-holder outside read set) that caused it.
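The theory column of the report and the W=1 lost-write scenario could be sketched as follows. This is a minimal illustration with plain dictionaries standing in for replicas; `expected_outcome` and `run_w1_lost_write` are hypothetical names, and the replica failure is modelled as total loss of that replica's state.

```python
def expected_outcome(r, w, n=3):
    """Theory column for the report: strict overlap means a read must see the latest write."""
    return "latest" if r + w > n else "possibly-stale"


def run_w1_lost_write():
    # Three simulated replicas, each a map of key -> (version, value).
    replicas = [{"k": (1, "old")} for _ in range(3)]  # baseline written at W=3
    alive = [True, True, True]

    # W=1: the coordinator acks as soon as replica 0 has stored version 2.
    replicas[0]["k"] = (2, "new")
    acked_version = 2

    # Replica 0 dies, taking its state with it, before version 2 is copied anywhere else.
    alive[0] = False
    replicas[0].clear()

    # Even a read of every surviving replica cannot recover the acknowledged write.
    survivors = [rep["k"] for rep, up in zip(replicas, alive) if up]
    observed_version, observed_value = max(survivors, key=lambda vv: vv[0])
    return acked_version, observed_version, observed_value


acked, observed, value = run_w1_lost_write()
print(f"acked version {acked}, observed version {observed} ({value!r})")
```

Because the scenario contains no randomness, the acknowledged version is always absent after the failure, which is what lets the test attribute the loss to the W=1 setting rather than to a bug in the store.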
Senior stretch
  • Add read repair: when a quorum read sees disagreeing versions, push the newest to the stale replicas inline, and add a test showing the next read of the same key is consistent even at R=1.
  • Add a quorum-read latency probe: make one replica slow and show that a QUORUM (R=2) read's latency tracks the second-fastest replica, then implement request hedging (fire a duplicate after a delay, take the first) and show the p99 drop.
  • Add concurrent writers to the same key at W=2 and show overlap does NOT order them — both 'succeed' and you get sibling versions / last-write-wins — illustrating that overlap guarantees visibility, not serialisation.
  • Map your dials to a real store: write the equivalent Cassandra consistency levels (ONE / QUORUM / ALL) or DynamoDB ConsistentRead flags for each scenario, and note where managed defaults would have hidden the bug.
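For the read-repair stretch item, a minimal sketch might look like the following. The function name `read_with_repair` and its `order` parameter are hypothetical, replicas are plain dictionaries of key to (version, value), and this variant repairs only the replicas the read actually contacted.

```python
def read_with_repair(replicas, key, order, r):
    """Read `key` from the first r replicas in `order`, then repair the stale ones among them."""
    contacted = [replicas[i] for i in order[:r]]
    newest = max((rep.get(key, (0, None)) for rep in contacted), key=lambda vv: vv[0])
    for rep in contacted:
        if rep.get(key, (0, None))[0] < newest[0]:
            rep[key] = newest  # inline repair: overwrite the stale copy
    return newest


# A W=2 write reached replicas 0 and 1 and left replica 2 behind.
replicas = [{"k": (2, "new")}, {"k": (2, "new")}, {"k": (1, "old")}]

before = read_with_repair(replicas, "k", order=[2], r=1)     # stale read at R=1
quorum = read_with_repair(replicas, "k", order=[1, 2], r=2)  # sees disagreement, repairs replica 2
after = read_with_repair(replicas, "k", order=[2], r=1)      # same replica, now consistent

print(before, quorum, after)
```

One design consequence is worth testing: a quorum read that happens to contact only up-to-date replicas sees no disagreement and repairs nothing, so a lagging replica outside the read set stays stale until some read includes it.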
Recap

This is the experiment that turns the overlap invariant from a formula into intuition. Implement N=3 with tunable R and W, and watch R + W > N reliably serve the latest write across any single-node failure. Then tune below it and watch the silent failures appear on cue: a W=1 ack lost when its lone replica dies, a sub-overlap read picking the lagging replica, and a sloppy-quorum write hiding on a hint-holder outside the read set until hinted handoff converges it. Building it once, with the arithmetic mapped to the observed result, makes you the engineer who picks R and W deliberately instead of discovering the gap through a 3 a.m. stale-data ticket.
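The sloppy-quorum case is the least intuitive of the three, because R + W > N holds and the read is still stale. A minimal sketch of it follows, under stated assumptions: home replicas A, B, and C own the key, a single fallback node sits outside that set, and the function names (`sloppy_write`, `home_read`, `deliver_hints`, `run_scenario`) are hypothetical.

```python
def sloppy_write(home, fallback, hints, key, versioned, w, reachable):
    """Write to reachable home replicas; fill remaining acks from the fallback node with hints."""
    acks = 0
    for name, rep in home.items():
        if name in reachable:
            rep[key] = versioned
            acks += 1
    for name in home:
        if acks >= w:
            break
        if name not in reachable:
            fallback[key] = versioned              # stored outside the home set
            hints.append((name, key, versioned))   # remember who it was meant for
            acks += 1
    return acks >= w


def home_read(home, key, names):
    """Strict read: consult only the named home replicas and return the newest version."""
    return max((home[n][key] for n in names), key=lambda vv: vv[0])


def deliver_hints(home, hints):
    """Hinted handoff: push each held write to the home replica it was intended for."""
    while hints:
        name, key, versioned = hints.pop()
        if versioned[0] > home[name].get(key, (0, None))[0]:
            home[name][key] = versioned


def run_scenario():
    home = {name: {"k": (1, "old")} for name in "ABC"}  # baseline written at W=3
    fallback, hints = {}, []
    # Only A is reachable, so the second ack comes from the fallback node.
    acked = sloppy_write(home, fallback, hints, "k", (2, "new"), w=2, reachable={"A"})
    stale = home_read(home, "k", ["B", "C"])      # R=2, R+W=4 > 3, yet stale
    deliver_hints(home, hints)
    converged = home_read(home, "k", ["B", "C"])  # handoff has reached B
    return acked, stale, converged


print(run_scenario())
```

The write is acknowledged at W=2, but only one of those acks comes from a home replica, so the read set {B, C} and the home portion of the write set {A} do not intersect until the hint is delivered.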

