Distributed Systems
Raft: free-recall review
Retrieval beats re-reading. For each prompt, reconstruct the full mechanism from memory — out loud or on paper — before you open the model answer. The effort of recall is what makes the safety argument stick.
Reconstruct the unit’s spine without looking back: the quorum-overlap safety argument, what “committed” means, why the voting rule preserves history, how linearizable reads avoid a log write, and why membership change needs joint consensus.
- 01 Why does requiring a majority quorum for both elections and commits prevent split brain, and why specifically a majority rather than any fixed-size group?
- 02 What does “committed” mean in Raft, and why is a single-node fsync not enough?
- 03 How does the RequestVote log-completeness rule combine with quorum overlap to give Leader Completeness, and what is the extra current-term commit caveat?
- 04 Contrast ReadIndex and lease reads for serving linearizable reads, and state the correctness precondition that makes lease reads risky.
- 05 Why does changing cluster membership require joint consensus, and what does the joint phase actually enforce?
- 06 What is the minimum viable Raft dashboard, and what root cause does each metric point to?
If you could rebuild each answer from memory, you hold the unit’s spine. Majority quorums force any two quorums to overlap, so a committed (majority-persisted) entry survives every leader change. The log-completeness voting rule plus that overlap gives Leader Completeness, and the current-term commit rule closes the figure-8 hole. ReadIndex and lease reads both serve linearizable reads without a log write; lease reads reach sub-millisecond latency by accepting a correctness precondition on clock synchronization (bounded drift, typically maintained with NTP). Joint consensus keeps a membership change from splitting the cluster. Watch five metrics, and remember that real incidents nearly always trace to membership changes that bypass the protocol, disabled pre-vote, or a slow disk.
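To check your recalled answers against something executable, here is a minimal Python sketch of three of the rules above: quorum overlap, the RequestVote up-to-date comparison, and the joint-consensus quorum test. It is an illustration under stated assumptions, not code from any Raft implementation. All function names (`majority`, `all_quorums_overlap`, `candidate_is_up_to_date`, `has_joint_quorum`) are hypothetical, and logs are reduced to their last term and last index.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest group size that is more than half of n nodes."""
    return n // 2 + 1


def all_quorums_overlap(n: int, size: int) -> bool:
    """Brute-force check: do any two groups of `size` out of n nodes share a node?"""
    groups = [set(g) for g in combinations(range(n), size)]
    return all(a & b for a in groups for b in groups)


def candidate_is_up_to_date(cand_last_term: int, cand_last_index: int,
                            voter_last_term: int, voter_last_index: int) -> bool:
    """Voting rule sketch: grant the vote only if the candidate's log is at
    least as up to date as the voter's. Compare last terms first; on a tie,
    the longer (or equal-length) log wins."""
    if cand_last_term != voter_last_term:
        return cand_last_term > voter_last_term
    return cand_last_index >= voter_last_index


def has_joint_quorum(acks: set, old_cfg: set, new_cfg: set) -> bool:
    """Joint phase sketch: a decision needs a majority of the old
    configuration AND a majority of the new one, counted separately."""
    old_ok = len(acks & old_cfg) >= majority(len(old_cfg))
    new_ok = len(acks & new_cfg) >= majority(len(new_cfg))
    return old_ok and new_ok


if __name__ == "__main__":
    # Majority-sized groups always intersect; a fixed size of 2 out of 5 does not.
    print(all_quorums_overlap(5, majority(5)))
    print(all_quorums_overlap(5, 2))
    # A longer log with an older last term still loses the comparison.
    print(candidate_is_up_to_date(2, 9, 3, 1))
    # A majority of the new configuration alone is not enough in the joint phase.
    print(has_joint_quorum({"c", "d", "e"}, {"a", "b", "c"}, {"c", "d", "e"}))
```

The brute-force overlap check is deliberately naive: for a small cluster it enumerates every pair of groups, which makes the "why a majority" argument concrete. A fixed size of 2 in a 4- or 5-node cluster admits two disjoint groups, and each could elect its own leader.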