Distributed Systems
Clocks: multiple-choice review
Six questions that cut across the whole unit. Each mirrors a real decision you make when designing or debugging a distributed store: not a definition to recite, but a choice about how to order events when the wall clock lies.
Confirm you can connect clock skew, the choice between physical and logical clocks, Lamport vs vector semantics, happens-before, and TrueTime — the synthesis the lesson built toward.
A Cassandra cluster uses last-write-wins on client timestamps. One node's clock is 1.5s behind. A user's later update routed through that node vanishes after a refresh, with a 200 OK returned. What is the root cause?
You need conflict resolution in a replicated KV store that must never silently drop a concurrent update. Lamport timestamps or vector clocks?
Lamport's clock guarantees that if A happens-before B then L(A) < L(B). Why can't you use L(A) < L(B) to conclude that A happened before, and so could have caused, B?
Your vector clocks are correct, but a 200-node cluster now ships 200 counters of metadata with every write, and it grows as you add nodes. What is the real tradeoff you accepted?
What does Spanner's TrueTime expose, and what does commit-wait do with it?
A delete in an LWW store occasionally suppresses every real write to a key for days, even writes that arrive long after the delete. What is the mechanism?
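The skew and tombstone scenarios in the questions above can be reproduced with a toy last-write-wins register. This is a minimal sketch under stated assumptions, not Cassandra's actual merge logic: `Cell`, `lww_merge`, `skew_scenario`, and `tombstone_scenario` are hypothetical names, timestamps are plain integers in milliseconds, and ties simply keep the existing cell.

```python
from dataclasses import dataclass
from typing import Optional

DAY_MS = 86_400_000


@dataclass(frozen=True)
class Cell:
    """One versioned value. value=None stands in for a tombstone (delete)."""
    timestamp_ms: int
    value: Optional[str]


def lww_merge(current: Optional[Cell], incoming: Cell) -> Cell:
    """Keep whichever cell has the larger timestamp (assumption: ties keep current)."""
    if current is None or incoming.timestamp_ms > current.timestamp_ms:
        return incoming
    return current


def skew_scenario() -> Cell:
    # True time 10_000 ms: the first write is stamped by an accurate clock.
    cell = lww_merge(None, Cell(10_000, "v1"))
    # True time 10_500 ms: the later write is stamped by a clock 1.5 s behind,
    # so it carries 9_000 and loses the merge even though it was acknowledged.
    return lww_merge(cell, Cell(10_500 - 1_500, "v2"))


def tombstone_scenario() -> Cell:
    # A delete stamped two days in the future (assumed cause: a badly skewed clock).
    cell = lww_merge(None, Cell(10_000 + 2 * DAY_MS, None))
    # A real write one day later still has a smaller timestamp, so it loses too.
    return lww_merge(cell, Cell(10_000 + DAY_MS, "v3"))


if __name__ == "__main__":
    print(skew_scenario())
    print(tombstone_scenario())
```

In both scenarios the merge function behaves exactly as written; the loss comes from trusting the timestamps. The later write `v2` is discarded because its stamp is older, and `v3` stays hidden until real time passes the tombstone's future timestamp.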
The through-line: wall-clock time is not an ordering primitive, so LWW turns skew into silent data loss (and future-dated tombstones into days-long key suppression). Logical clocks order by causality instead. A Lamport clock costs O(1) metadata per event and, with a node-ID tie-break, yields a total order consistent with happens-before, but it cannot detect concurrency. Vector clocks keep one counter per node so that concurrent writes can be flagged, at the cost of O(n) metadata that must be pruned. TrueTime takes the third path: it exposes clock uncertainty as a bounded interval of half-width ε, and commit-wait holds each commit until its timestamp is guaranteed to be in the past, which yields external consistency. Choose the mechanism by what you must know: order, concurrency, or global truth.
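The difference between Lamport and vector clocks can be checked in a few lines. The sketch below is illustrative only: the function names, the node IDs `n1` and `n2`, and the dict-based vector representation are all assumptions made for this example, and pruning is not modeled.

```python
from typing import Dict

VectorClock = Dict[str, int]  # assumed representation: node ID -> counter


def lamport_tick(clock: int) -> int:
    """Local event or send: advance the single counter."""
    return clock + 1


def lamport_receive(clock: int, msg_ts: int) -> int:
    """Receive: jump past both the local counter and the message timestamp."""
    return max(clock, msg_ts) + 1


def vc_tick(vc: VectorClock, node: str) -> VectorClock:
    """Local event: advance only this node's own entry."""
    out = dict(vc)
    out[node] = out.get(node, 0) + 1
    return out


def vc_receive(local: VectorClock, remote: VectorClock, node: str) -> VectorClock:
    """Receive: take the element-wise maximum, then tick the receiver's entry."""
    keys = set(local) | set(remote)
    merged = {k: max(local.get(k, 0), remote.get(k, 0)) for k in keys}
    return vc_tick(merged, node)


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent' for a relative to b."""
    keys = set(a) | set(b)
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def demo() -> dict:
    # n2: write A is its first event.
    l_a, vc_a = lamport_tick(0), vc_tick({}, "n2")
    # n1: one unrelated local event, then write B. No messages exchanged yet.
    l_b = lamport_tick(lamport_tick(0))
    vc_b = vc_tick(vc_tick({}, "n1"), "n1")
    # n1 now receives A's message: event C causally follows A.
    l_c = lamport_receive(l_b, l_a)
    vc_c = vc_receive(vc_b, vc_a, "n1")
    return {
        "lamport": (l_a, l_b, l_c),
        "a_vs_b": compare(vc_a, vc_b),
        "a_vs_c": compare(vc_a, vc_c),
    }


if __name__ == "__main__":
    print(demo())
```

Here L(A) < L(B) holds although A and B never exchanged a message, so the Lamport values alone cannot separate "A before B" from "A concurrent with B". The vector comparison reports the two writes as concurrent and still reports A as before C. Each vector carries one entry per node that has written, which is the O(n) metadata cost.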