
Distributed Systems

CAP in practice: build and partition a replicated store

Crux: Hands-on project. Build a small replicated KV store, inject a partition, and demonstrate CP and AP behaviour from the same code, proving each stance with observed reads, writes, and reconciliation.
◷ 240 min

Reading that ‘a partition forces a CP-or-AP choice’ is not the same as watching your own writes diverge and reconciling them by hand. Build a tiny replicated key-value store, sever the link between replicas, and make the same system behave as CP and then as AP — with the request log as your evidence.

Goal

Turn the unit’s mental model into something you can run: prove that P is not optional, that the CP and AP stances are observable behaviours, that strong consistency costs latency even when healthy (PACELC), and that AP’s conflict-resolution tax is real.
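To make the PACELC point concrete before building anything, here is a minimal simulation sketch. It assumes a 3-replica cluster, a made-up local apply cost of 0.2 ms, and peer round-trip times drawn uniformly from 1 to 10 ms; the function name `simulate_write_latency_ms` and all numbers are illustrative, not measurements. A "weak" write returns after the local apply, while a "strong" write also waits for one remote ack so that a majority holds it.

```python
import random
import statistics


def simulate_write_latency_ms(rng, replicas=3, acks_required=1):
    """Latency of one write: local apply plus the wait for the needed acks."""
    local_ms = 0.2  # assumed cost of applying the write on the coordinator
    # Assumed round-trip times to the other replicas, fastest first.
    peer_rtts = sorted(rng.uniform(1.0, 10.0) for _ in range(replicas - 1))
    remote_acks = acks_required - 1  # the coordinator's own write counts as one ack
    if remote_acks == 0:
        return local_ms
    # The write completes when the slowest of the required peers has answered.
    return local_ms + peer_rtts[remote_acks - 1]


rng = random.Random(42)  # fixed seed so the run is repeatable
weak = [simulate_write_latency_ms(rng, acks_required=1) for _ in range(1000)]
strong = [simulate_write_latency_ms(rng, acks_required=2) for _ in range(1000)]

p50_weak = statistics.median(weak)
p50_strong = statistics.median(strong)
print(f"weak (EL)   p50 = {p50_weak:.2f} ms")
print(f"strong (EC) p50 = {p50_strong:.2f} ms")
```

No partition is involved here: the gap between the two medians comes entirely from waiting on a peer in the healthy state, which is the "else" half of PACELC. In the real project the same two numbers should come from measured requests, not a random generator.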

Project
Objective

Build a 3-node replicated key-value store you can run locally, inject a controllable network partition between replicas, and demonstrate — with logged reads, writes, and latencies — both a CP configuration and an AP configuration on the same code, then reconcile the AP divergence correctly.
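One possible starting point is an in-process toy that models the three replicas, the partition, and the two stances in a single class. This is a sketch under stated assumptions, not a reference design: the `Cluster` class, its `mode` flag, and the log tuple layout are all hypothetical, replication is synchronous and instantaneous, and nothing reconciles divergent values after the partition heals.

```python
class Cluster:
    """Toy in-process replicated store; `mode` is "CP" or "AP" (hypothetical labels)."""

    def __init__(self, names, mode):
        self.mode = mode
        self.data = {n: {} for n in names}  # node -> {key: (version, value)}
        self.blocked = set()                # unordered node pairs that cannot talk
        self.log = []                       # request log used as evidence

    def partition(self, side_a, side_b):
        self.blocked |= {frozenset((a, b)) for a in side_a for b in side_b}

    def heal(self):
        self.blocked.clear()

    def _reachable(self, origin):
        return [n for n in self.data
                if n == origin or frozenset((origin, n)) not in self.blocked]

    def _has_quorum(self, peers):
        return len(peers) >= len(self.data) // 2 + 1

    def write(self, origin, key, value):
        peers = self._reachable(origin)
        if self.mode == "CP" and not self._has_quorum(peers):
            # CP stance: refuse rather than risk divergence.
            self.log.append((origin, "write", key, value, "UNAVAILABLE"))
            return False
        version = 1 + max(self.data[n].get(key, (0, None))[0] for n in peers)
        for n in peers:
            self.data[n][key] = (version, value)
        self.log.append((origin, "write", key, value, "OK"))
        return True

    def read(self, origin, key):
        peers = self._reachable(origin)
        if self.mode == "CP" and not self._has_quorum(peers):
            self.log.append((origin, "read", key, None, "UNAVAILABLE"))
            return ("UNAVAILABLE", None)
        # CP reads the newest version across a quorum; AP answers from local state.
        sources = peers if self.mode == "CP" else [origin]
        _, value = max((self.data[n].get(key, (0, None)) for n in sources),
                       key=lambda pair: pair[0])
        self.log.append((origin, "read", key, value, "OK"))
        return ("OK", value)


def run(mode):
    cluster = Cluster(["a", "b", "c"], mode)
    cluster.write("a", "x", "v0")
    cluster.partition(["a", "b"], ["c"])       # c is the minority side
    cluster.write("a", "x", "from-majority")
    cluster.write("c", "x", "from-minority")
    reads = (cluster.read("a", "x"), cluster.read("c", "x"))
    cluster.heal()
    return cluster, reads


cp, cp_reads = run("CP")
ap, ap_reads = run("AP")
for entry in cp.log + ap.log:
    print(entry)
```

In the CP run the minority node refuses both the write and the read; in the AP run both sides answer and the same key holds two different values. A real build would replace the in-memory `blocked` set with an actual network fault (dropped packets between processes) and add the reconciliation step the acceptance criteria ask for.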

Requirements
Acceptance criteria
  • A request log (or transcript) showing the CP run: minority side errors/timeouts during the partition, majority serves, and no stale/divergent read is ever observed.
  • A request log showing the AP run: both sides accept writes during the partition, producing two divergent values for the same key — captured, not described.
  • A reconciliation transcript proving concurrent writes are merged (vector-clock siblings or CRDT merge), with an explicit demonstration that naive timestamp LWW would have dropped one of them.
  • A latency table: healthy-state p50 for the strong (EC, "else consistency") vs weak (EL, "else latency") settings, showing the consistency-for-latency trade PACELC predicts.
  • A README that correctly attributes each behaviour to the CAP/PACELC mechanism that caused it.
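The reconciliation criterion can be sketched with plain dictionaries as vector clocks. The helper names `dominates`, `merge_siblings`, and `lww`, the `(clock, value, wall_clock_ts)` tuple layout, and the sample cart values are all assumptions made for illustration. The sketch keeps every version that no other version strictly supersedes, then contrasts that with timestamp last-writer-wins on the same input.

```python
def dominates(a, b):
    """True if vector clock `a` has seen everything `b` has (pointwise a >= b)."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def merge_siblings(versions):
    """Drop versions strictly superseded by another; keep concurrent ones as siblings."""
    kept = []
    for i, (clock, value, ts) in enumerate(versions):
        superseded = any(
            j != i and dominates(other, clock) and not dominates(clock, other)
            for j, (other, _, _) in enumerate(versions)
        )
        if not superseded:
            kept.append((clock, value, ts))
    return kept


def lww(versions):
    """Naive last-writer-wins: trust the wall clock and keep a single version."""
    return max(versions, key=lambda version: version[2])


# (vector clock, value, wall-clock timestamp) gathered from all replicas after healing.
versions = [
    ({"a": 1}, "cart=[]", 90.0),                 # common ancestor
    ({"a": 2}, "cart=[book]", 100.0),            # written on the majority side
    ({"a": 1, "c": 1}, "cart=[pen]", 100.5),     # written on the minority side
]

siblings = [value for _, value, _ in merge_siblings(versions)]
lww_winner = lww(versions)[1]
print("siblings kept:", siblings)
print("LWW keeps only:", lww_winner)
```

The ancestor is dropped because both later clocks dominate it, but neither later clock dominates the other, so both survive as siblings for the application (or a CRDT merge) to combine. LWW on the same data silently discards the book write because its timestamp is half a second older.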
Senior stretch
  • Add an asymmetric partition (A reaches B, but B's replies to A are dropped) and show how it differs from a clean split — e.g. a node that keeps trying to act on a one-way view.
  • Inject a logical partition: pause one replica's process (SIGSTOP) for longer than the heartbeat timeout and show peers treating a healthy-but-slow node as failed, triggering a re-election or eviction.
  • Add a quorum-leader (Raft-style) mode with pre-vote and a tunable election timeout, and demonstrate how a too-low timeout under jittery latency manufactures false elections.
  • Expose per-request consistency (like Cassandra's CONSISTENCY level) and show a single client sliding the same key between CP-ish and AP-ish behaviour query by query.
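For the per-request consistency stretch, the arithmetic behind "sliding between CP-ish and AP-ish" can be sketched as follows. The level names ONE, QUORUM, and ALL are borrowed from Cassandra's vocabulary, but `acks_needed`, `quorums_overlap`, and `can_serve` are hypothetical helpers for the toy store, not any real client API.

```python
LEVELS = {
    "ONE": lambda n: 1,
    "QUORUM": lambda n: n // 2 + 1,
    "ALL": lambda n: n,
}


def acks_needed(level, n):
    """Number of replicas that must answer for a request at this level."""
    return LEVELS[level](n)


def quorums_overlap(n, write_level, read_level):
    """R + W > N: every read set shares at least one replica with every acked write set."""
    return acks_needed(read_level, n) + acks_needed(write_level, n) > n


def can_serve(level, n, reachable):
    """Whether a coordinator that reaches `reachable` replicas (itself included) can answer."""
    return reachable >= acks_needed(level, n)


n = 3
for write_level, read_level in [("QUORUM", "QUORUM"), ("ONE", "ONE"), ("ALL", "ONE")]:
    print(write_level, read_level, "overlap:", quorums_overlap(n, write_level, read_level))

# On the minority side of a 2/1 split, a coordinator reaches only itself.
for level in ("ONE", "QUORUM"):
    print(level, "served on minority side:", can_serve(level, n, reachable=1))
```

Overlap is a necessary condition for a read to see the latest acknowledged write, not a full consistency proof. The useful observation for the demo is that the same client gets an answer at ONE and an error at QUORUM from the same minority-side node, one request after the other.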
Recap

This is the loop behind every real distributed-systems design review: P is not a choice, so you decide CP-or-AP per partition by setting quorum and consistency knobs; you prove the stance with observed reads and writes, not adjectives; you pay PACELC’s latency tax whenever you demand consistency in the healthy state; and AP’s reconciliation is only safe with real merge logic, never wall-clock LWW. Building it once on a toy store turns the theorem into instinct.


Trademarks belong to their respective owners. Editorial reference only.