awesome-everything RU
↑ Back to the climb

Databases

Sharding: multiple-choice review

Crux Multiple-choice synthesis across the sharding unit — shard-key choice, partitioning vs sharding, co-location, hot shards, cross-shard transactions, and online resharding.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at senior altitude — in orbit
◷ 13 min

Six questions that cut across the whole unit. Each mirrors a decision you make at the whiteboard or mid-incident — not a definition to recite, but a tradeoff to weigh when one Postgres is no longer enough.

Goal

Confirm you can connect shard-key choice, the partitioning-vs-sharding distinction, co-location, the hot-shard failure mode, cross-shard transactions, and online resharding — the synthesis the seven lessons built toward.

Quiz

A B2B SaaS shards by tenant_id. The top 5 tenants generate 60% of traffic. Pure hash sharding is proposed because it 'distributes evenly'. What is the flaw, and the senior fix?

Quiz

A team adds `PARTITION BY RANGE (created_at)` to their largest table and reports they have 'sharded the database'. A month later write throughput is unchanged. Why?

Quiz

On a tenant_id-sharded Citus cluster, an engineer adds `audit_log` distributed by `id` instead of `tenant_id`, then runs `SELECT … FROM orders o JOIN audit_log a ON a.order_id = o.id WHERE o.tenant_id = 42`. What happens?

Quiz

A skew alert fires: one shard is at 94% CPU, the other 31 average 18%, and pg_stat_statements attributes 62% of that shard's query time to one tenant. What is the correct immediate action?

Quiz

A refund transaction updates `orders` (tenant-scoped) and a global `ledger` table sharded by `account_id`, so the two rows live on different shards. Citus runs this with two-phase commit. What is the failure mode to monitor?

Quiz

To add capacity a team runs Citus 11.1+ online rebalancing. Mid-move the coordinator crashes and a teammate panics that the moving shard is corrupted. How do you read this, and what mechanism makes the answer true?

Recap

The through-line of the unit is one decision chain: pick a shard key that is selective, uniform, stable, and present at routing time; partition for pruning and retention but shard for throughput; co-locate every tenant-scoped table so 99% of joins stay single-node; expect hot shards on power-law tenants and answer them with tenant isolation, not rebalancing; keep transactions single-shard to dodge 2PC’s in-doubt risk; and lean on logical-replication-based online resharding — while remembering that a shard-key change is a months-long decision lived with for years.

Continue the climb ↑Sharding: free-recall review
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources3
expand
  1. 01
  2. 02
  3. 03

Trademarks belong to their respective owners. Editorial reference only.