awesome-everything RU
↑ Back to the climb

Data Engineering

OLTP vs OLAP: multiple-choice review

Crux Multiple-choice synthesis across the OLTP vs OLAP unit — row vs column layout, column pruning, compression, vectorization, the replica-analytics outage, and when to split stores.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at senior altitude — in orbit
◷ 12 min

Six questions that cut across the whole unit. Each one mirrors a decision you make in a real architecture review — not a definition to recite, but a tradeoff to weigh when the dashboard and the checkout path share a database.

Goal

Confirm you can connect physical layout, the column-store multipliers, the replica-analytics failure mode, and the store-split decision — the synthesis the overview built toward.

Quiz

A query runs SUM(total) over 40M orders rows that have 50 columns each. Why does a column store crush a row store on exactly this query?

Quiz

Service A runs point lookups by id at 5k req/s; Service B runs a column aggregate over the same data. Why can't one storage layout serve both well?

Quiz

On 100M-row ClickBench, ClickHouse stores the data in roughly 9 GiB where Postgres needs about 100 GiB. Why does columnar compress so much better, and why does that widen the scan-throughput gap rather than just saving disk?

Quiz

A growth dashboard runs a 90-day revenue aggregate against the prod Postgres read replica. Checkout p99 jumps from 40ms to 2s and replica lag climbs to 90s. What root cause does a senior name first?

Quiz

Product wants the 90-day revenue dashboard. Of these four options, which is the architecture a senior ships, and why?

Quiz

A team proposes an HTAP engine to serve both transactions and analytics on one system, arguing it removes the need to split stores. What is the senior framing?

Recap

The through-line of the unit is one decision: access pattern sets layout. Point lookups and small writes want a row store; full-table scans over a few columns want a column store, where pruning (read ~6% of the bytes), 5–10× compression, and vectorized batch execution stack into a 10×–1000× gap no index can close. The classic senior failure is analytics on the prod OLTP replica, where one scan evicts the hot buffer cache, contends for I/O, and drives replica lag. The fix is architectural — a separate columnar store fed by CDC or ETL — accepting a little data latency for isolated storage, caches, and failure domains. HTAP can manage the contention but never abolishes it.

Continue the climb ↑OLTP vs OLAP: free-recall review
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources3
expand
  1. 01
  2. 02
  3. 03

Trademarks belong to their respective owners. Editorial reference only.