Crux Read Raft RPC handlers, a state-machine step, and a real AppendEntries log, then pick the behaviour, the bug, or the highest-leverage fix a senior engineer makes first.
Raft bugs hide in four places: the AppendEntries consistency check, the commit-index advance, the vote-granting rule, and the state machine that applies the log. Read each snippet the way you would in a code review or an incident, then pick what a senior engineer flags first.
Goal
Practise reading the protocol where it actually lives — in RPC handlers and log lines — and spot the off-by-one, the missing check, and the divergence signature that separate a correct Raft from a corrupting one.
Snippet 1 — the AppendEntries consistency check
    // follower handling AppendEntries
    func (r *Raft) handleAppend(a AppendEntries) AppendReply {
        if a.Term < r.currentTerm {
            return AppendReply{Term: r.currentTerm, Success: false}
        }
        // BUG IS HERE: appends without the prevLog check
        r.log = append(r.log[:a.PrevLogIndex+1], a.Entries...)
        r.fsync()
        return AppendReply{Term: r.currentTerm, Success: true}
    }
Quiz
This follower handler is missing one check that Raft safety depends on. Which one, and what breaks without it?
Heads-up Concurrency hygiene matters, but the protocol-level defect is the absent prevLog match. Even single-threaded, this handler will splice onto a divergent prefix and break Log Matching.
Heads-up Equal-term AppendEntries are exactly the normal case from the current leader. Rejecting them would stop all replication. The defect is the missing prevLogIndex/prevLogTerm check.
Heads-up The slice r.log[:a.PrevLogIndex+1] blindly trusts the leader's index without confirming that the prefix matches. If the follower's entry at PrevLogIndex has a different term, the splice corrupts history. The prevLog check must gate the append.
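One way the missing gate could look, as a minimal sketch and not the lesson's own fix. The types, the sentinel entry at index 0, and the truncate-at-first-conflict loop are assumptions made for illustration; persistence is left as a comment.

```go
package main

import "fmt"

// Entry, AppendEntries, AppendReply and Raft are hypothetical minimal types
// for this sketch. log[0] is a sentinel entry with term 0.
type Entry struct {
	Term int
	Cmd  string
}

type AppendEntries struct {
	Term         int
	PrevLogIndex int
	PrevLogTerm  int
	Entries      []Entry
}

type AppendReply struct {
	Term    int
	Success bool
}

type Raft struct {
	currentTerm int
	log         []Entry
}

// handleAppend rejects stale terms, then gates the splice on the prevLog match.
func (r *Raft) handleAppend(a AppendEntries) AppendReply {
	if a.Term < r.currentTerm {
		return AppendReply{Term: r.currentTerm, Success: false}
	}
	r.currentTerm = a.Term
	// Consistency check: the follower must hold an entry at PrevLogIndex
	// whose term equals PrevLogTerm, otherwise the prefix may have diverged.
	if a.PrevLogIndex < 0 || a.PrevLogIndex >= len(r.log) ||
		r.log[a.PrevLogIndex].Term != a.PrevLogTerm {
		return AppendReply{Term: r.currentTerm, Success: false}
	}
	// Truncate only at the first conflicting entry, so a delayed older RPC
	// cannot chop off entries that already match.
	for i, e := range a.Entries {
		idx := a.PrevLogIndex + 1 + i
		if idx < len(r.log) && r.log[idx].Term == e.Term {
			continue
		}
		r.log = append(r.log[:idx:idx], a.Entries[i:]...)
		break
	}
	// A real implementation would persist the log here before replying.
	return AppendReply{Term: r.currentTerm, Success: true}
}

func main() {
	r := &Raft{currentTerm: 2, log: []Entry{{0, ""}, {1, "a"}, {1, "b"}}}
	// The leader claims term 2 at index 2, but the follower holds term 1 there.
	bad := r.handleAppend(AppendEntries{Term: 2, PrevLogIndex: 2, PrevLogTerm: 2,
		Entries: []Entry{{2, "x"}}})
	fmt.Println("mismatched prefix accepted:", bad.Success)
	// One index earlier the prefix matches, so the conflicting entry is replaced.
	ok := r.handleAppend(AppendEntries{Term: 2, PrevLogIndex: 1, PrevLogTerm: 1,
		Entries: []Entry{{2, "x"}}})
	fmt.Println("matching prefix accepted:", ok.Success, "term at index 2:", r.log[2].Term)
}
```

The rejection path leaves the log untouched, which is what lets the leader back off and retry from an earlier index.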
Snippet 2 — advancing the commit index
    // leader, after collecting matchIndex[] from followers
    func (r *Raft) advanceCommit() {
        for n := r.commitIndex + 1; n <= r.lastIndex(); n++ {
            count := 1 // count self
            for _, m := range r.matchIndex {
                if m >= n {
                    count++
                }
            }
            if count >= r.majority() {
                r.commitIndex = n // committed by majority
            }
        }
    }
Quiz
A majority has replicated entry n, so this code commits it. But the Raft paper adds one more condition before a leader may advance commitIndex to n. What is it, and why?
Heads-up Waiting for all nodes is never required and would destroy availability. The missing condition is the current-term rule, not a stronger quorum.
Heads-up Persisting commitIndex is an implementation detail; Raft treats commitIndex as volatile state that a restarted node re-learns from the leader. The protocol-level gap is the current-term commit rule.
Heads-up Majority replication alone is not sufficient for prior-term entries. Committing a replicated-but-prior-term entry directly is the classic figure-8 bug; the current-term guard is what prevents it.
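The current-term guard can be sketched as follows. This is an illustrative version under assumptions: the field names, the sentinel at log[0], a matchIndex slice that holds followers only, and the top-down scan are choices made for this example and are not taken from the snippet above.

```go
package main

import "fmt"

// Entry and Raft are hypothetical minimal types for this sketch.
type Entry struct{ Term int }

type Raft struct {
	currentTerm int
	commitIndex int
	log         []Entry // log[0] is a sentinel
	matchIndex  []int   // one slot per follower, leader excluded
	clusterSize int
}

func (r *Raft) lastIndex() int { return len(r.log) - 1 }
func (r *Raft) majority() int  { return r.clusterSize/2 + 1 }

// advanceCommit moves commitIndex to the highest index that is both
// replicated on a majority and written in the leader's current term.
// Earlier entries then commit indirectly, because they precede it in the log.
func (r *Raft) advanceCommit() {
	for n := r.lastIndex(); n > r.commitIndex; n-- {
		if r.log[n].Term != r.currentTerm {
			// Terms never decrease along the log, so everything below n
			// is from a prior term too and must not be committed directly.
			break
		}
		count := 1 // count self
		for _, m := range r.matchIndex {
			if m >= n {
				count++
			}
		}
		if count >= r.majority() {
			r.commitIndex = n
			return
		}
	}
}

func main() {
	// Five nodes; the term-4 leader holds a term-2 entry at index 2 that
	// two followers also hold, so it is on a majority.
	r := &Raft{currentTerm: 4, clusterSize: 5,
		log: []Entry{{0}, {1}, {2}}, matchIndex: []int{2, 2, 0, 0}}
	r.advanceCommit()
	fmt.Println("after prior-term majority:", r.commitIndex)

	// Once a term-4 entry reaches a majority, it and everything before it commit.
	r.log = append(r.log, Entry{4})
	r.matchIndex = []int{3, 3, 0, 0}
	r.advanceCommit()
	fmt.Println("after current-term majority:", r.commitIndex)
}
```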
This vote handler enforces one-vote-per-term but skips a check. What can a candidate now do, and which safety property fails?
Heads-up The votedFor check already prevents double voting within a term. The missing piece is the log up-to-date comparison, which protects committed history, not vote counting.
Heads-up Election Safety still holds: one vote per node per term plus majority means at most one winner per term. The gap here lets a stale-log candidate win, breaking Leader Completeness instead.
Heads-up Term comparison only orders messages; it says nothing about how complete the candidate's log is. The explicit lastLogTerm/lastLogIndex check is what guards completeness.
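A minimal sketch of a vote handler that keeps one-vote-per-term and adds the lastLogTerm/lastLogIndex comparison. All type and field names here are hypothetical, votedFor uses -1 to mean "no vote yet", log[0] is assumed to be a sentinel, and persistence is omitted.

```go
package main

import "fmt"

// Entry, RequestVote, VoteReply and Raft are hypothetical minimal types.
type Entry struct{ Term int }

type RequestVote struct {
	Term         int
	CandidateID  int
	LastLogIndex int
	LastLogTerm  int
}

type VoteReply struct {
	Term    int
	Granted bool
}

type Raft struct {
	currentTerm int
	votedFor    int     // -1 means no vote cast in currentTerm
	log         []Entry // log[0] is a sentinel
}

// handleVote grants a vote only if the voter has not voted for someone else
// this term AND the candidate's log is at least as up-to-date as its own.
func (r *Raft) handleVote(v RequestVote) VoteReply {
	if v.Term < r.currentTerm {
		return VoteReply{Term: r.currentTerm, Granted: false}
	}
	if v.Term > r.currentTerm {
		r.currentTerm = v.Term
		r.votedFor = -1
	}
	myIdx := len(r.log) - 1
	myTerm := r.log[myIdx].Term
	// Up-to-date rule: the later last term wins; equal terms compare length.
	upToDate := v.LastLogTerm > myTerm ||
		(v.LastLogTerm == myTerm && v.LastLogIndex >= myIdx)
	if (r.votedFor == -1 || r.votedFor == v.CandidateID) && upToDate {
		r.votedFor = v.CandidateID
		return VoteReply{Term: r.currentTerm, Granted: true}
	}
	return VoteReply{Term: r.currentTerm, Granted: false}
}

func main() {
	r := &Raft{currentTerm: 3, votedFor: -1, log: []Entry{{0}, {1}, {3}}}
	// The candidate has a higher term but its log ends in term 1: refuse.
	stale := r.handleVote(RequestVote{Term: 4, CandidateID: 7, LastLogIndex: 5, LastLogTerm: 1})
	// This candidate's log ends in term 3 at the same index: grant.
	fresh := r.handleVote(RequestVote{Term: 4, CandidateID: 8, LastLogIndex: 2, LastLogTerm: 3})
	fmt.Println("stale candidate granted:", stale.Granted)
	fmt.Println("fresh candidate granted:", fresh.Granted)
}
```

Note that the stale candidate's longer log does not help it: the last term is compared first, and index only breaks ties.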
Snippet 4 — a real AppendEntries log
    15:42:08 INFO raft: AppendEntries -> D failure (mismatch at idx=400100 term=12 vs follower idx=400100 term=11)
    15:42:09 INFO raft: AppendEntries -> D retry prevIdx=400050 (decremented)
    15:42:09 INFO raft: AppendEntries -> D failure (mismatch at idx=400050 term=12 vs follower idx=400050 term=10)
    15:42:10 INFO raft: AppendEntries -> D retry prevIdx=399000 (decremented)
    15:42:10 INFO raft: AppendEntries -> D failure (mismatch at idx=399000 term=12 vs follower idx=399000 term=8)
    15:42:11 WARN raft: D has diverged extensively; consider InstallSnapshot
Quiz
The leader keeps backing off nextIndex for follower D, one failed probe at a time. Reading this log, what is the diagnosis and the right operational response?
Heads-up Decrementing is the consistency check working correctly: it walks back to the last matching index. The problem is the size of the divergence, which calls for InstallSnapshot, not a code fix.
Heads-up Mismatched older terms are the normal signature of a long-offline follower, not an attack. The fix is to snapshot it back into sync, not to assume Byzantine behaviour.
Heads-up D is a lagging follower being repaired, not the cause of elections here. The right action is InstallSnapshot to end the linear backtracking.
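The choice between continued backtracking and a snapshot can be sketched as a small decision function. Everything here is hypothetical: the function name, the firstIndex parameter (lowest index the leader still retains after compaction), and the maxBacktrack cap, which stands in for whatever threshold produces the "diverged extensively" warning. The index values in main are illustrative and not taken from the log above.

```go
package main

import "fmt"

// repairAction is a hypothetical label for how the leader catches a follower up.
type repairAction string

const (
	sendEntries  repairAction = "AppendEntries"
	sendSnapshot repairAction = "InstallSnapshot"
)

// chooseRepair picks a repair path for one follower.
//   nextIndex    next log index the leader would send to the follower
//   firstIndex   lowest index still present in the leader's log (assumption)
//   lastIndex    leader's last log index
//   maxBacktrack hypothetical cap on how many entries are worth replaying
func chooseRepair(nextIndex, firstIndex, lastIndex, maxBacktrack int) repairAction {
	if nextIndex < firstIndex {
		// The entries the follower needs were compacted away, so the log
		// alone cannot repair it.
		return sendSnapshot
	}
	if lastIndex-nextIndex+1 > maxBacktrack {
		// Policy assumption: past this gap, shipping a snapshot is
		// preferred over further probing and replay.
		return sendSnapshot
	}
	return sendEntries
}

func main() {
	fmt.Println(chooseRepair(400090, 350000, 400100, 1000)) // small gap
	fmt.Println(chooseRepair(399000, 350000, 400100, 1000)) // gap above the cap
	fmt.Println(chooseRepair(100, 350000, 400100, 1000))    // entries already compacted
}
```

The first branch is a hard requirement once the log has been compacted; the second is a tunable policy, so the cap would be set from measured snapshot size and replay cost.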
Recap
Three handler checks and one log signature carry this lesson. AppendEntries must gate every splice on a prevLogIndex/prevLogTerm match, or logs diverge silently. A leader may directly commit only an entry from its own current term; counting replicas on a prior-term entry reopens the figure-8 overwrite. RequestVote must compare lastLogTerm/lastLogIndex, or a stale candidate wins and breaks Leader Completeness. And a long run of nextIndex back-offs against ever-older follower terms is the divergence signature that means switch to InstallSnapshot. Read the protocol where it lives, in the handlers and the logs: these are the exact lines a senior engineer checks first.