How Raft replicates a log entry and decides it is safe to commit

The AppendEntries RPC, the consistency check that keeps logs identical, the commit index, and the replicated state machine pattern built on top.

DIST Middle ◷ 14 min


A client sends SET x = 42 to a Raft cluster. The leader appends it to its log. But the leader alone cannot commit that write — it needs confirmation from enough followers that they have it too. If the leader crashes right after appending but before any follower responds, what happens to the write?

The log and AppendEntries

The Raft leader maintains an ordered log of entries. Each entry is a tuple: (term, index, command). When a client sends a write:

1. The leader appends the entry to its own log and fsyncs it to disk.
2. The leader sends an AppendEntries RPC to every follower, carrying the new entry plus prevIndex and prevTerm: the index and term of the entry immediately before the new one.
3. Each follower checks whether its log has an entry at prevIndex with term prevTerm. If yes, it appends the new entry and fsyncs. If no, it rejects with a mismatch reply.
4. Once the leader has acknowledgements from a majority of the cluster (counting itself), it marks the entry committed.
5. The leader piggybacks its commit index on the next AppendEntries or heartbeat. Followers apply entries up to that index to their state machines.

When you trace this sequence, notice that steps 1–3 only replicate the entry, and the commit happens at step 4. The client is not told the write succeeded until then. A leader crash between append and majority acknowledgement can therefore drop the entry, but it can never lose a write that was reported as successful.
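The follower side of the prevIndex/prevTerm check can be sketched as follows. This is a minimal in-memory sketch, not a full Raft implementation: the names Entry, Follower, and append_entries are hypothetical, the check of the leader’s term is omitted, and the fsync is only marked by a comment.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: int
    command: str


@dataclass
class Follower:
    # log[0] is a sentinel (term 0) so that real entries start at index 1.
    log: list = field(default_factory=lambda: [Entry(0, "")])

    def append_entries(self, prev_index, prev_term, entries):
        """Return True on success, False on a consistency mismatch."""
        # Reject if there is no entry at prev_index or its term differs.
        if prev_index >= len(self.log) or self.log[prev_index].term != prev_term:
            return False
        for offset, entry in enumerate(entries):
            idx = prev_index + 1 + offset
            # An existing entry with a different term conflicts:
            # drop it and everything after it.
            if idx < len(self.log) and self.log[idx].term != entry.term:
                del self.log[idx:]
            if idx >= len(self.log):
                self.log.append(entry)
        # A real follower would fsync here before replying.
        return True


f = Follower()
ok1 = f.append_entries(0, 0, [Entry(7, "SET x=41")])  # accepted
ok2 = f.append_entries(1, 7, [Entry(7, "SET x=42")])  # accepted
bad = f.append_entries(5, 7, [Entry(7, "SET y=1")])   # rejected: no entry at index 5
```

The rejected call leaves the log untouched, which is what lets the leader retry safely with an earlier prevIndex.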

Replication touches all 5 nodes, but commit only needs a majority of 3 fsync acks (including the leader's) — the entry is safe before the slow follower even replies.

Step	Who acts	What happens
1	Leader	Appends (term=7, idx=101, cmd=“SET x=42”), fsyncs
2	Leader	Sends AppendEntries to B, C, D, E with prevIdx=100, prevTerm=7
3	Followers B, C, D, E	Each checks log at idx=100 has term=7. Appends, fsyncs, replies success
4	Leader	Receives 2 successes (plus own = 3 of 5). Marks idx=101 committed
5	Leader	Applies “SET x=42” to its own state machine, replies success to client
6	Next heartbeat	Carries commitIndex=101 — followers apply “SET x=42” to state machine
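The arithmetic behind step 4 can be sketched as a small function that finds the highest index stored on a majority. The names majority, commit_index, and match_index are assumptions for illustration. Raft additionally requires that the entry at that index belongs to the leader’s current term before it is committed by counting replicas; that check is left out here.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def commit_index(match_index: dict, leader_last_index: int) -> int:
    """Highest log index stored on a majority, counting the leader."""
    indexes = sorted(list(match_index.values()) + [leader_last_index], reverse=True)
    # The value at position (majority - 1) is held by at least a majority of nodes.
    return indexes[majority(len(indexes)) - 1]


# Two followers have acked index 101; two are still at 100.
after_two_acks = commit_index({"B": 101, "C": 101, "D": 100, "E": 100}, 101)
# Only one follower has acked: leader + B is 2 of 5, not enough.
after_one_ack = commit_index({"B": 101, "C": 100, "D": 100, "E": 100}, 101)
```

With two follower acks the result is 101, matching the table; with one ack the commit index stays at 100.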

The consistency check: Log Matching

The prevIndex/prevTerm check is not bureaucracy — it is the mechanism that keeps every follower’s log identical to the leader’s. If a follower’s log diverges (because a previous leader wrote different entries before crashing), the mismatch reply tells the leader to back up and retry with an earlier prevIndex. The leader keeps decrementing until it finds the last point of agreement, then overwrites the follower’s divergent tail with its own.
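The back-up-and-retry loop can be sketched with logs reduced to lists of terms. This is a simplified model under stated assumptions: the function name repair_follower is hypothetical, index 0 is a sentinel with term 0 so the search always terminates, and the follower simply replaces everything after the matching point with the leader’s suffix.

```python
def repair_follower(leader_terms, follower_terms):
    """Return (repaired follower log, number of AppendEntries attempts).

    Both logs are lists of terms; index 0 is a sentinel with term 0.
    """
    follower = list(follower_terms)
    next_index = len(leader_terms)  # optimistic start: just past the leader's last entry
    attempts = 0
    while True:
        attempts += 1
        prev_index = next_index - 1
        prev_term = leader_terms[prev_index]
        if prev_index < len(follower) and follower[prev_index] == prev_term:
            # Agreement found: keep the shared prefix, take the leader's suffix.
            follower = follower[: prev_index + 1] + leader_terms[next_index:]
            return follower, attempts
        next_index -= 1  # mismatch: back up one entry and retry


leader = [0, 1, 1, 4, 4, 5]
diverged = [0, 1, 1, 2, 2, 2, 3]  # tail written under older leaders
repaired, attempts = repair_follower(leader, diverged)
```

In this example the leader backs up three times, finds agreement at index 2 on the fourth attempt, and the follower ends with a log identical to the leader’s.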

This converges to identical logs because any entry that was committed in a prior term was stored by a majority of nodes. A new leader must win votes from a majority, which overlaps with that prior majority, and a node only votes for a candidate whose log is at least as up to date as its own, so the new leader has those committed entries. Uncommitted entries left on the old leader’s disk may be overwritten. They were never acknowledged to any client, so no acknowledged write is lost.
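The overlap claim is easy to check by brute force for a 5-node cluster. The sketch below enumerates every majority and confirms that any two of them share at least one node; the node names and the helper all_majorities are hypothetical.

```python
from itertools import combinations


def all_majorities(nodes):
    """Every subset of nodes that contains more than half of them."""
    quorum = len(nodes) // 2 + 1
    return [
        set(combo)
        for size in range(quorum, len(nodes) + 1)
        for combo in combinations(nodes, size)
    ]


nodes = ["A", "B", "C", "D", "E"]
majorities = all_majorities(nodes)
# A non-empty intersection is truthy, so this checks every pair overlaps.
every_pair_overlaps = all(m1 & m2 for m1 in majorities for m2 in majorities)
```

Because the set that stored a committed entry and the set that elected the new leader are both majorities, at least one voter in the election holds the committed entry.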

The replicated state machine pattern

Raft does not care what commands mean. Your application defines the state machine (a key-value map, a config tree, a relation) and the commands that mutate it. Raft guarantees that every node applies every committed entry in the same order. As long as the commands are deterministic and every replica starts from the same initial state, every state machine ends up in the same state. This is the replicated state machine pattern: ordered replicated log + deterministic state machine = consistent distributed service.
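A minimal sketch of the pattern, assuming a key-value state machine with hypothetical SET and DEL commands. The class name KVStateMachine and the tuple layout of log entries are illustrative, not part of Raft itself.

```python
class KVStateMachine:
    """Applies committed log entries, in order, to a key-value map."""

    def __init__(self):
        self.data = {}
        self.last_applied = 0  # index of the last entry applied

    def apply_up_to(self, log, commit_index):
        # Apply each committed entry exactly once, in log order.
        while self.last_applied < commit_index:
            self.last_applied += 1
            op, key, value = log[self.last_applied]
            if op == "SET":
                self.data[key] = value
            elif op == "DEL":
                self.data.pop(key, None)


# Index 0 is an unused sentinel so entries start at index 1.
log = [None, ("SET", "x", 41), ("SET", "x", 42), ("DEL", "y", None)]

fast, slow = KVStateMachine(), KVStateMachine()
fast.apply_up_to(log, 3)   # learns commitIndex=3 in one step
slow.apply_up_to(log, 1)   # learns it in two steps
slow.apply_up_to(log, 3)
```

Both replicas end with the same map even though they learned the commit index at different times, because they applied the same entries in the same order.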

Important consequences:

Do not use NOW(), random(), or external API calls inside command handlers — these break determinism across replicas.
The state machine is application-specific; Raft is agnostic to it.
Reads for linearizability must go through the leader (covered in a later lesson); follower reads may return stale data.
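One common way to keep handlers deterministic is to resolve any non-deterministic value once, on the node that proposes the write, and store the result inside the command. The sketch below assumes a hypothetical expiring-key command; make_command and apply are illustrative names, not part of any Raft library.

```python
import time


def make_command(key, value, ttl_seconds, now=None):
    """Runs once, on the proposing node: the clock is read here, not in the handler."""
    now = time.time() if now is None else now
    return {"op": "SET", "key": key, "value": value, "expires_at": now + ttl_seconds}


def apply(state, command):
    """Runs on every replica: uses only the data carried in the command."""
    state[command["key"]] = (command["value"], command["expires_at"])
    return state


# A fixed 'now' keeps the example reproducible.
cmd = make_command("session", "abc", ttl_seconds=30, now=1000.0)
replica_1 = apply({}, cmd)
replica_2 = apply({}, cmd)
```

If apply read the clock itself, each replica would compute a slightly different expiry and the state machines would diverge.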

Leader + F1 + F2 = 3 of 5 — quorum reached. Entry 101 is committed before F3 even replies. The next heartbeat carries commitIndex=101 so followers apply the command to their state machines.

Quiz

A Raft follower receives AppendEntries with prevIndex=50, prevTerm=4, but its own log at index 50 has term 3. What does the follower do?


Order the steps

Put the AppendEntries consistency-check steps in order:

1 Leader picks the next log index to send to a follower
2 Leader sends AppendEntries with prevIndex, prevTerm, and the new entry
3 Follower checks: does my log have an entry at prevIndex with prevTerm?
4 If yes, follower appends and replies success
5 If no, follower replies mismatch with its conflicting index/term
6 Leader decrements nextIndex for this follower and retries with an earlier prevIndex
7 Eventually leader finds the last matching point; follower truncates its tail and accepts the leader's version


Recall before you leave

01 Why is a Raft log entry not committed the instant the leader writes it to disk?
02 A follower was offline for 2 minutes and missed 500 log entries. It comes back. How does it catch up?
03 Why must command handlers in a Raft state machine be deterministic?

Recap

The leader replicates writes via AppendEntries, which carries a prevIndex/prevTerm consistency check that forces every follower’s log to converge to the leader’s. An entry is committed when a majority has persisted it; the leader then broadcasts the commit index so followers can apply the entry to their state machines. This is the replicated state machine pattern: identical ordered logs plus deterministic handlers equals identical state on every replica. Uncommitted entries from a crashed leader are safe to discard because they were never acknowledged. The fsync on every append is often the dominant latency cost, which is why Raft deployments tend to favor low-latency dedicated storage such as local NVMe over shared cloud volumes. When you see a write hang or an “unable to commit” error, ask whether the majority received the entry or the leader crashed before the quorum ack. Those are different recoveries.


Apply this

Put this lesson to work on a real build.

Crash-safe key-value store with a WAL
Build a tiny on-disk KV store that survives a kill -9 mid-write by appending to a write-ahead log before touching the main file.