Data Engineering DATA · 01 · 09

OLTP vs OLAP: query and plan reading

Read real query plans, DDL, and pipeline config across the OLTP vs OLAP unit, predict scan cost on row vs column storage, and pick the highest-leverage fix.

DATA Senior ◷ 14 min

Level

FoundationsJuniorMiddleSenior

The query plan and the DDL are where the workload split shows itself. Read each snippet, predict how it behaves on row vs column storage, and choose the fix a senior makes first.

Goal

Practise the loop you run in every data-architecture review: read the query, predict the bytes scanned, and reach for the layout or pipeline change — not a band-aid index — before the analytical workload poisons the transactional one.

Snippet 1 — the EXPLAIN on the row store

EXPLAIN (ANALYZE, BUFFERS)
SELECT country, SUM(total)
FROM orders                       -- 40M rows, 50 columns, row-store Postgres
WHERE created_at >= now() - interval '90 days'
GROUP BY country;

Seq Scan on orders  (cost=0.00..2.1M rows=38_000_000)
  Buffers: shared read=4_900_000   -- ~38 GB pulled through the cache
Planning Time: 0.2 ms
Execution Time: 31_482 ms

Quiz

The planner chose a Seq Scan and read ~38 GB even though the query only needs country and total. What does this tell you, and what is the right fix?

Snippet 2 — the schema split

-- OLTP: heavily normalized for cheap, consistent single-row writes
CREATE TABLE orders (
  id            bigint PRIMARY KEY,
  user_id       bigint REFERENCES users(id),
  country_id    smallint REFERENCES countries(id),
  total_cents   bigint,
  created_at    timestamptz
);

-- OLAP target: denormalized wide fact, country folded in as a low-cardinality column
CREATE TABLE orders_fact (
  order_id    UInt64,
  user_id     UInt64,
  country     LowCardinality(String),   -- dictionary-encoded
  total_cents Int64,
  created_at  DateTime
) ENGINE = MergeTree ORDER BY (created_at, country);

Quiz

Why is the OLTP table normalized with a country_id FK while the OLAP fact denormalizes country into a LowCardinality(String) column?

Snippet 3 — the pipeline config

# Debezium connector: stream Postgres WAL changes into Kafka, land in the warehouse
connector.class: io.debezium.connector.postgresql.PostgresConnector
plugin.name: pgoutput
table.include.list: public.orders,public.users,public.countries
snapshot.mode: initial          # backfill once, then stream changes
# downstream sink batches into orders_fact every 30s

Quiz

This CDC pipeline streams the WAL into a warehouse with a 30s sink batch. What does it buy, and what is the tradeoff you must accept?

Snippet 4 — the write-amplification trap

-- An engineer 'fixes' slow dashboards by writing orders straight into ClickHouse,
-- one INSERT per order, from the checkout hot path:
INSERT INTO orders_fact (order_id, user_id, country, total_cents, created_at)
VALUES (918273, 4412, 'DE', 4999, now());   -- called ~5000×/s, one row each

Quiz

Why does pushing single-row INSERTs from the checkout hot path straight into the column store backfire?

Recap

Read the artifacts and the layout tells you the cost: a Seq Scan reading tens of GB on a row store means a full-table aggregate that no index can fix — the fix is a column store that prunes. Normalized OLTP and denormalized low-cardinality OLAP facts are opposite-by-design, not a bug. CDC buys isolation at the price of bounded lag, and CDC alone does not make scans fast — the columnar target does. And single-row writes into a column store are pathological: keep writes in the row store, batch them into the column store. Diagnose from the plan and the layout, fix the layout, then verify the bytes scanned dropped.

Something unclear?

Ask a question about this lesson. Questions are anonymous and go straight to the author to make the lesson better.