Databases
MVCC and isolation: diagnose bloat and a write-skew anomaly
Reading about runaway bloat and write skew is not the same as watching a table double overnight because someone left a transaction open, or seeing two transactions break an invariant and both commit. Build a small Postgres workload, drive it into both failures, then fix them with the unit’s levers — proving each step with system-view numbers.
Turn the unit’s mental model into a reproducible loop: instrument MVCC state, reproduce oldest-xmin bloat and a write-skew anomaly on a real database, diagnose each from the catalog views, fix with the minimal isolation and autovacuum change, and verify with before/after measurements.
On a real PostgreSQL instance, reproduce two MVCC failures — oldest-xmin bloat and a write-skew anomaly — then diagnose and fix both, proving each fix with measurements from pg_stat views, not assertions.
- A before/after table for the bloat scenario: dead-tuple ratio, table size, and oldest backend_xmin age — measured under the same workload, showing the ratio falling after the pin is released.
- The autovacuum 'not yet removable' log line captured, with the matching pid/slot identified from a catalog query (not guessed).
- A transcript of the write-skew anomaly committing under REPEATABLE READ, plus a transcript of SERIALIZABLE aborting one transaction with 40001 and the retry preserving the invariant.
- A one-paragraph write-up: which isolation strategy you would ship for the invariant and why, and which infrastructure setting (timeout, scale_factor) is the durable defence against the bloat failure.
- Add a one-page on-call runbook: the three queries to run when a table is bloating (backend_xmin, slot xmin, wraparound ages), how to read the autovacuum log, and the fix decision tree.
- Demonstrate the HOT path: add a non-indexed column, run an UPDATE-only loop, and show n_tup_hot_upd / n_tup_upd above 80%; then change fillfactor to 100 (or update an indexed column) and show the HOT ratio collapse and WAL per update roughly double.
- Push the write-skew fix under load: benchmark SERIALIZABLE vs RC + FOR UPDATE on a hot row at high concurrency, and chart the 40001 retry rate against throughput to show where SSI false positives dominate.
- Reproduce MultiXact growth: run a foreign-key-heavy workload (many child INSERTs against one parent) and watch age(datminmxid) advance, confirming shared locks pressure a separate wraparound clock from XIDs.
This is the loop you will run in every real MVCC incident: instrument the catalog views first, reproduce the failure deliberately, diagnose from backend_xmin / slot xmin / wraparound ages rather than guessing, fix with the minimal lever (release the pin, pick the weakest isolation level that protects the invariant, tune autovacuum per table), and verify with before/after numbers under the same workload. Doing it once on a toy database makes the production version — where a table is doubling and the pager is going off — muscle memory instead of panic.