
Distributed Systems

Raft: build and break a cluster

Crux: a hands-on project. Build or operate a small Raft cluster, then drive it through partition, slow-disk, and membership-change scenarios, proving each safety and recovery property with captured evidence.
◷ 240 min

Reading the safety proof is not the same as watching a cluster truncate an uncommitted entry, halt a minority partition, and survive a node swap without losing a write. Stand up a small Raft cluster, drive it through the exact failure modes the unit described, and capture the evidence that each guarantee held.

Goal

Turn the unit’s mental model into a reproducible lab: replicate a log, force a leader election, prove a committed entry survives while an uncommitted one is discarded, demonstrate the CP halt under partition, and perform a safe membership change — each step backed by logs or metrics, not assertion.
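The "committed entry survives, uncommitted one is discarded" step can be sketched in a few lines before running it on a real cluster. The snippet below is a minimal, illustrative model of follower-side AppendEntries handling, assuming 1-based log indices as in the Raft paper; `Entry` and `append_entries` are hypothetical names for this sketch, not the API of hashicorp/raft or etcd.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    term: int
    command: str


def append_entries(log, prev_index, prev_term, entries):
    """Follower-side AppendEntries sketch (1-based indices).

    Returns (success, new_log). The request is rejected unless the
    follower holds an entry at prev_index whose term is prev_term.
    On success, any conflicting suffix is deleted before appending.
    """
    # Consistency check: the follower must agree on the preceding entry.
    if prev_index > len(log):
        return False, log
    if prev_index > 0 and log[prev_index - 1].term != prev_term:
        return False, log

    new_log = list(log)
    for offset, entry in enumerate(entries):
        index = prev_index + 1 + offset  # 1-based index of this entry
        if index <= len(new_log):
            if new_log[index - 1].term != entry.term:
                # Conflict: drop this entry and everything after it.
                del new_log[index - 1:]
                new_log.append(entry)
            # Same index and term: already present, nothing to do.
        else:
            new_log.append(entry)
    return True, new_log


# Old leader crashed after replicating "x=1" to a majority (committed)
# but before replicating "x=2" to anyone (uncommitted).
old_leader_log = [Entry(1, "x=1"), Entry(1, "x=2")]

# The term-2 leader replicates its own entry at index 2 to the rejoined node.
ok, repaired = append_entries(old_leader_log, prev_index=1, prev_term=1,
                              entries=[Entry(2, "x=3")])
print(ok, [e.command for e in repaired])  # True ['x=1', 'x=3']
```

In the lab, the equivalent evidence is a pair of log dumps from the rejoined node, one taken before and one after it receives the new leader's first AppendEntries.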

Project
Objective

Stand up a 3-node (then 5-node) Raft cluster — either a real one (embedded hashicorp/raft, etcd, or a from-scratch implementation) or a deterministic simulator — and demonstrate, with captured evidence, that it replicates a log, elects leaders correctly, preserves committed entries across crashes, halts the minority side under partition, and changes membership safely.
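Two of the behaviours named above, commit on majority replication and the minority halt under partition, both fall out of a single leader-side rule. The following is a rough sketch of that rule, assuming `match_index` includes the leader's own entry and that unreachable peers keep their last known value; the function names are illustrative only.

```python
def quorum(cluster_size):
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def advance_commit_index(commit_index, match_index, log_terms, current_term):
    """Leader-side commit rule sketch (1-based log indices).

    match_index: node id -> highest index known replicated on that node
                 (the leader's own log length is included).
    log_terms:   term of each entry in the leader's log.
    An index N is committed only if a majority holds it AND the entry
    at N was created in the leader's current term.
    """
    needed = quorum(len(match_index))
    for n in range(len(log_terms), commit_index, -1):
        replicated = sum(1 for m in match_index.values() if m >= n)
        if replicated >= needed and log_terms[n - 1] == current_term:
            return n
    return commit_index


# Majority side of a 5-node partition: A, B, C can reach each other.
majority_side = advance_commit_index(
    commit_index=1,
    match_index={"A": 3, "B": 3, "C": 3, "D": 1, "E": 1},
    log_terms=[1, 1, 2],
    current_term=2,
)

# Minority side: stale leader D only reaches E, so nothing new commits.
minority_side = advance_commit_index(
    commit_index=1,
    match_index={"D": 3, "E": 3, "A": 1, "B": 1, "C": 1},
    log_terms=[1, 1, 1],
    current_term=1,
)

print(majority_side, minority_side)  # 3 1
```

The minority leader still accepts and appends writes locally in this model; what it cannot do is acknowledge them, which is the CP halt the scenario log should show.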

Requirements
Acceptance criteria
  • A scenario log or recording for each of: replicated write applied identically on all nodes; clean election with a new term; uncommitted entry truncated after leader crash; committed entry surviving the same crash; minority halt + majority progress under partition.
  • Evidence (captured logs or metric panels) that each node's term never decreases, that commitIndex only advances once a majority's matchIndex covers an entry from the leader's current term, and that no two nodes ever applied different commands at the same index.
  • A membership-change transcript showing the cluster moving 3 to 5 one node at a time, with quorum size and tolerated-failure count updating correctly at each step (quorum 2, 3, 3 and tolerated failures 1, 1, 2 for 3, 4, 5 nodes) and no period where two disjoint majorities could exist.
  • A one-page write-up mapping each demonstrated behaviour back to the property that guarantees it: quorum overlap, Log Matching, Leader Completeness, the current-term commit rule, and the CP tradeoff.
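The membership criterion above rests on simple arithmetic plus a quorum-overlap argument, and both can be checked by brute force. The sketch below assumes plain majority quorums and single-letter node names chosen for illustration; it enumerates minimal majorities of an old and a new configuration and reports whether any pair is disjoint.

```python
from itertools import combinations


def quorum(n):
    """Smallest majority of an n-node cluster."""
    return n // 2 + 1


def tolerated_failures(n):
    """Nodes that can fail while a majority remains."""
    return n - quorum(n)


def minimal_majorities(members):
    """All smallest-possible majorities of a configuration."""
    return [set(c) for c in combinations(sorted(members), quorum(len(members)))]


def disjoint_majorities_possible(old, new):
    """True if some majority of `old` shares no node with some majority of `new`."""
    return any(a.isdisjoint(b)
               for a in minimal_majorities(old)
               for b in minimal_majorities(new))


three = {"A", "B", "C"}
four = three | {"D"}
five = four | {"E"}

for config in (three, four, five):
    n = len(config)
    print(n, "nodes: quorum", quorum(n), "tolerates", tolerated_failures(n))

print(disjoint_majorities_possible(three, four))  # False
print(disjoint_majorities_possible(four, five))   # False
print(disjoint_majorities_possible(three, five))  # True
```

The last line is the case the transcript must rule out: in a direct 3 to 5 jump, {A, B} is a majority of the old configuration and {C, D, E} a majority of the new one, so nodes that switch configurations at different moments could support two independent decisions.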
Senior stretch
  • Inject a slow-disk fault: throttle the leader's fsync past the heartbeat interval, far enough that followers start hitting their election timeout, and capture the resulting election flapping; then show that moving the WAL to fast storage (or raising the election timeout as a stopgap) restores stability.
  • Add snapshots: compact the log after N entries, take a follower far enough offline that the leader has compacted past its needs, and show InstallSnapshot bringing it back into sync — including the membership config inside the snapshot.
  • Add pre-vote and reproduce the rejoining-node disruption with it on vs off: show a long-partitioned node triggering a spurious election without pre-vote, and being silently rejected with it.
  • Add ReadIndex or lease reads and measure linearizable-read latency vs a naive no-op-commit read under a high read:write ratio; for lease reads, demonstrate the clock-skew failure mode by deliberately skewing one node's clock.
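For the pre-vote stretch item, it helps to have the expected on/off difference written down before capturing it. The model below is a deliberately simplified sketch: `Peer`, `handle_request_vote`, and `handle_pre_vote` are hypothetical names, votedFor tracking and the prospective term carried by a real pre-vote request are omitted, and the terms and log positions are made-up example values.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    term: int
    last_log_term: int
    last_log_index: int
    has_live_leader: bool  # heard a heartbeat within the election timeout
    is_leader: bool = False


def log_up_to_date(peer, cand_last_term, cand_last_index):
    """Raft's comparison: later last term wins, then longer log."""
    return (cand_last_term, cand_last_index) >= (peer.last_log_term,
                                                 peer.last_log_index)


def handle_request_vote(peer, cand_term, cand_last_term, cand_last_index):
    """Plain RequestVote: a higher term is adopted before the vote is decided."""
    if cand_term > peer.term:
        peer.term = cand_term
        peer.is_leader = False  # a leader steps down on seeing a higher term
    return cand_term == peer.term and log_up_to_date(
        peer, cand_last_term, cand_last_index)


def handle_pre_vote(peer, cand_last_term, cand_last_index):
    """Pre-vote: a dry run that never changes the peer's term or role."""
    return (not peer.has_live_leader) and log_up_to_date(
        peer, cand_last_term, cand_last_index)


def healthy_cluster():
    return [Peer(3, 3, 10, has_live_leader=True, is_leader=True),
            Peer(3, 3, 10, has_live_leader=True)]


# Pre-vote off: the partitioned node kept timing out and bumped its term to 9.
off = healthy_cluster()
votes_off = [handle_request_vote(p, 9, 2, 5) for p in off]

# Pre-vote on: the same node only asks whether it could win.
on = healthy_cluster()
votes_on = [handle_pre_vote(p, 2, 5) for p in on]

print(votes_off, [p.term for p in off], any(p.is_leader for p in off))
print(votes_on, [p.term for p in on], any(p.is_leader for p in on))
```

In both runs the stale node wins no votes. The difference to capture in the lab is the side effect: without pre-vote the healthy peers adopt term 9 and the leader steps down, forcing a fresh election, while with pre-vote the cluster's term and leader are untouched.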
Recap

This is the lab that converts the proof into reflexes. Once you have watched a cluster truncate an uncommitted entry, keep a committed one across the same crash, halt a minority partition while the majority commits, and grow membership one node at a time without a split-brain window, each backed by your own captured logs and metrics, the safety argument stops being an abstraction. Map every behaviour back to the property that guarantees it, and you will diagnose the production incident (a slow disk, disabled pre-vote, or a bypassed membership-change procedure) from the metrics in minutes instead of hours.

