Databases
Database track: multiple-choice synthesis
Six questions that cut across the whole track. Each is a moment in one product’s life — from a single table to a billion-row cluster — where two database concepts collide and you have to pick the lever that fixes the right one.
Confirm you can chain the track together: model the schema, choose the index the planner will actually use, read why a plan went wrong, pick an isolation level, size the pool, sequence a zero-downtime migration, and decide if sharding is even the answer.
A query filters orders by tenant and date and you add a composite index. Both plans below run on the same table; only the index column order differs. Why does the planner skip the index in plan A? ```sql -- query SELECT * FROM orders WHERE tenant_id = 42 AND created_at >= '2026-01-01'; -- A: CREATE INDEX ON orders (created_at, tenant_id); -> Seq Scan -- B: CREATE INDEX ON orders (tenant_id, created_at); -> Index Scan ```
A nightly batch runs one transaction that stays open for three hours. Over the same window an OLTP table that is heavily UPDATEd grows from 8 GB to 40 GB on disk even though the live row count barely changes. What links the long transaction to the bloat?
A reporting query was fast for months, then suddenly picked a nested-loop join over a hash join and went from 200 ms to 40 s after a data-import spike. EXPLAIN ANALYZE shows the planner expected 12 rows from one branch but got 2.1 million. Root cause and first durable fix?
You move the app behind PgBouncer in transaction mode to survive a connection storm. Immediately the app errors that prepared statements do not exist, and a SET search_path you ran at connect time no longer sticks. Why?
You must rename a hot column from `email` to `email_address` on a 500M-row table with zero downtime while old and new app code run side by side during a rolling deploy. Which sequence is correct?
A multi-tenant SaaS hits write limits on a single Postgres. The team proposes hash-sharding on a `country` column. Three tenants in one country drive 70% of traffic. What goes wrong, and what is the better shard key?
The track is one decision tree applied to a growing product: model the schema for integrity first, add the index whose leading column matches the access pattern, read the plan to catch row-estimate errors before they cascade, keep transactions short so MVCC can vacuum, size and mode the pool to survive connection storms, decompose every breaking change into expand-contract phases, and reach for sharding only after a high-cardinality, co-locating shard key is chosen and every cheaper lever is spent. Each failure mode resolves back to one question: which layer is the bottleneck, and what is the cheapest lever that fixes it?