Distributed Systems

Quorums: config and code reading

Crux: Read real consistency-level configs and a quorum overlap check, predict whether each guarantees the latest write, and pick the highest-leverage fix.

Quorum bugs live in the consistency-level lines of your data-access code, not in the database. Read each config and snippet, do the R + W vs N arithmetic, and choose the fix a senior engineer would make first.

Goal

Practise the loop you run when a stale-read ticket lands: find the R and W in the code, check whether R + W > N actually holds for the keyspace’s replication factor, and reach for the fix that restores overlap rather than one that hides it.
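
The arithmetic in that loop can be sketched in a few lines. This is a minimal sketch rather than driver code: `replicas_for` and `overlaps` are hypothetical helper names, and the mapping assumes Cassandra-style level names in which QUORUM means a majority of the replication factor.

```python
def replicas_for(level: str, rf: int) -> int:
    """Replicas that must respond for a consistency level (illustrative mapping)."""
    counts = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": rf // 2 + 1,  # majority of the replication factor
        "ALL": rf,
    }
    return counts[level]


def overlaps(read_level: str, write_level: str, rf: int) -> bool:
    """True when every read set must intersect every write set: R + W > N."""
    return replicas_for(read_level, rf) + replicas_for(write_level, rf) > rf


# Example at RF = 5, where QUORUM resolves to 3 replicas
print(overlaps("QUORUM", "QUORUM", 5))  # True:  3 + 3 > 5
print(overlaps("ONE", "QUORUM", 5))     # False: 1 + 3 is not > 5
```

Resolving the level name to a count first keeps the check honest when the replication factor changes, because QUORUM is a different number at every RF.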

Snippet 1 — the read/write consistency split

from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

# keyspace created WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 3 }
# `session` is assumed to be an already-connected driver session for that keyspace
session.execute(
    SimpleStatement(
        "UPDATE accounts SET payout = %s WHERE id = %s",
        consistency_level=ConsistencyLevel.QUORUM,   # W = 2 at RF=3
    ),
    (new_payout, account_id),
)
row = session.execute(
    SimpleStatement(
        "SELECT payout FROM accounts WHERE id = %s",
        consistency_level=ConsistencyLevel.ONE,        # R = 1
    ),
    (account_id,),
).one()
Quiz

The write is QUORUM (W=2) and the read is ONE (R=1) at RF=3. Does this guarantee the read sees the latest payout, and what is the fix?

Snippet 2 — the overlap check helper

def guarantees_latest_read(n: int, r: int, w: int) -> bool:
    # returns True iff a read is guaranteed to see the latest committed write
    return r + w > n

# called for a new keyspace plan, RF = 4
print(guarantees_latest_read(n=4, r=2, w=2))   # what prints, and is it safe?
Quiz

At N=4 with R=2, W=2, what does guarantees_latest_read print, and is that the right answer?

Snippet 3 — the sloppy-quorum write path

def write_quorum(coordinator, key, value, n=3, w=2):
    home = coordinator.preference_list(key, n)        # the N designated replicas
    live = [node for node in home if node.is_reachable()]
    acks = [node.store(key, value) for node in live]
    if len(acks) >= w:
        return "OK (strict quorum)"
    # not enough designated replicas reachable -> sloppy quorum
    for sub in coordinator.next_healthy_ring_nodes():
        acks.append(sub.store_hint(key, value, intended_owner=...))
        if len(acks) >= w:
            return "OK (sloppy quorum)"               # acked, but where?
    return "FAIL"
Quiz

When this returns 'OK (sloppy quorum)' during a partition, what is the consistency consequence for a concurrent strict QUORUM read on the home replicas?
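
One way to check your prediction is to replay the scenario with a toy model. The sketch below is an assumption-laden simplification, not the snippet's real coordinator: `Node`, `sloppy_write`, and `strict_read` are hypothetical names, values carry an explicit version number, and the partition is modelled by simply passing each client the replicas it can reach.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    data: dict = field(default_factory=dict)   # key -> (version, value), served to reads
    hints: dict = field(default_factory=dict)  # hinted writes held for another owner


def sloppy_write(reachable_home, substitutes, key, versioned, w):
    """Ack once w nodes hold the write, counting hint-holding substitutes."""
    acks = 0
    for node in reachable_home:
        node.data[key] = versioned
        acks += 1
    for sub in substitutes:
        if acks >= w:
            break
        sub.hints[key] = versioned  # parked as a hint, not visible to reads
        acks += 1
    return acks >= w


def strict_read(reachable_home, key):
    """Return the newest version among the home replicas the reader can reach."""
    return max(node.data[key] for node in reachable_home)


a, b, c, d = (Node(n) for n in "ABCD")
for node in (a, b, c):  # A, B, C are the home replicas; D is a substitute
    node.data["payout"] = (1, "old")

# Partition: the writer reaches only A plus substitute D; the reader reaches B and C.
acked = sloppy_write([a], [d], "payout", (2, "new"), w=2)
seen = strict_read([b, c], "payout")
print(acked, seen)  # True (1, 'old')
```

The write is acknowledged with W = 2 and the read contacts R = 2 home replicas, yet the two sets share no node, because one of the acks came from outside the preference list.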

Snippet 4 — the DynamoDB read flag

# DynamoDB, replication is managed; you pick read consistency per request
resp = table.get_item(
    Key={"id": account_id},
    ConsistentRead=False,   # default: eventually consistent
)
Quiz

A balance read uses ConsistentRead=False. What does flipping it to True buy you, and what does it cost — in quorum terms?
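
The cost side of that question can be put into numbers. The sketch below assumes DynamoDB's published read-capacity accounting (items are metered in 4 KB units, and an eventually consistent read is charged half of a strongly consistent one); `read_capacity_units` is a hypothetical helper, not part of boto3.

```python
import math


def read_capacity_units(item_bytes: int, consistent_read: bool) -> float:
    """Capacity consumed by one GetItem under the assumed 4 KB metering rule."""
    blocks = max(1, math.ceil(item_bytes / 4096))  # item size rounded up to 4 KB units
    return blocks * (1.0 if consistent_read else 0.5)


print(read_capacity_units(1_000, consistent_read=False))  # 0.5
print(read_capacity_units(1_000, consistent_read=True))   # 1.0
print(read_capacity_units(9_000, consistent_read=True))   # 3.0
```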

Recap

Every quorum bug is read the same way: find R and W in the consistency-level lines, find N from the keyspace replication factor, and check R + W > N with a strict inequality. The classic traps sum to exactly N and silently permit stale reads: R=1 reads behind QUORUM writes at RF=3, and even-N splits such as R=2, W=2 at N=4. The fix that restores overlap (usually raising the read level) beats fixes that just hide it. A sloppy quorum collects acks from substitute hint-holders outside the read set, so its ‘OK’ suspends overlap during a partition. Managed stores expose the same dial as ConsistentRead, where a strongly consistent read consumes roughly twice the read capacity of an eventually consistent one.
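
Why the inequality has to be strict can be shown by brute force. The sketch below enumerates every possible write set and read set and looks for a pair with no replica in common; `find_disjoint_sets` is a hypothetical helper, and replicas are represented simply as the integers 0 to N-1.

```python
from itertools import combinations


def find_disjoint_sets(n: int, r: int, w: int):
    """Return one (write_set, read_set) pair sharing no replica, or None if none exists."""
    replicas = range(n)
    for write_set in combinations(replicas, w):
        for read_set in combinations(replicas, r):
            if not set(write_set) & set(read_set):
                return write_set, read_set
    return None


print(find_disjoint_sets(4, 2, 2))  # ((0, 1), (2, 3)): R + W == N, a stale read is possible
print(find_disjoint_sets(4, 3, 2))  # None: R + W > N, every read touches a written replica
print(find_disjoint_sets(3, 1, 2))  # ((0, 1), (2,)): ONE read behind a QUORUM write at RF=3
```

Any pair the function returns is a concrete counterexample: a read that can complete without contacting a single replica that acknowledged the write.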
