Databases
Execution plans: multiple-choice review
Six questions that cut across the whole unit. Each one is a plan you would actually read in an incident — not a definition to recite, but a diagnosis to make from the output and a fix to choose under load.
Confirm you can read an EXPLAIN ANALYZE plan, trace a bad row estimate up through the join it broke, and pick the fix that addresses the cause rather than the symptom — the synthesis the seven lessons built toward.
A query is fast on staging and slow in production. Same schema, same indexes. EXPLAIN ANALYZE on the prod run shows: Nested Loop (actual time=4180..4192 rows=50 loops=1) -> Index Scan on orders (cost=0..3800000 rows=50) (actual time=12..3950 rows=520000 loops=1) -> Index Scan on customers_pkey (actual time=0.002..0.002 rows=1 loops=520000) What is the root cause?
A 1M-row table has an index on status. The query WHERE status = 'active' matches 400,000 rows (40% of the table). The planner picks Seq Scan instead of the index. Is this a bug, and why?
A predicate filters on WHERE country='US' AND region='CA' AND status='shipped'. The planner predicts 500 rows; the node actually emits 50,000. Single-column statistics are fresh. What fixes the estimate?
A parameterised query is fast for months, then P99 jumps to 4s while the mean stays at ~3ms. pg_stat_statements shows a large stddev_exec_time. EXPLAIN (GENERIC_PLAN) on the prepared form shows: Index Scan using idx_orders_created on orders Filter: ((workspace_id = $1) AND (status = $2)) What happened and what is the immediate fix?
A B2B dashboard degrades every Monday morning after a Sunday-night batch import of 20M rows, then recovers by Tuesday on its own. What is the cause and the durable fix?
EXPLAIN (ANALYZE, BUFFERS) on a Hash Join shows Hash Batches: 64 and Sort Method: external merge Disk: 450MB on a sort beneath it. Mean latency is high but reads are mostly shared hit. What is the bottleneck and the targeted fix?
The through-line across the unit is one reading discipline: at every plan node compare rows-estimated to actual rows, find where the gap opens, and trace it upward — a bad estimate cascades into wrong scan choices, wrong join algorithms, and wrong sort methods above it. Scan choice follows selectivity, join choice follows the outer-side estimate, and the durable fixes attack the cause: fresh statistics via ANALYZE, extended statistics for correlated columns, force_custom_plan for skewed prepared statements, and SSD-tuned cost constants. Forcing an algorithm or rebuilding an index treats the symptom; fixing the estimate corrects the whole plan.