awesome-everything

Data Engineering

ELT vs ETL: multiple-choice review

Crux: Multiple-choice synthesis across the ELT-vs-ETL unit: where the Transform runs, replayability, warehouse cost, idempotency, and the medallion contract.
◷ 13 min

Six questions that cut across the whole unit. Each one is a decision you actually make when designing a pipeline: not a definition to recite, but a tradeoff to weigh against cost, replay, and compliance.

Goal

Confirm you can connect where the Transform runs to its downstream consequences: replayability, the warehouse bill, schema discipline, and the idempotency that keeps a retry from doubling your data.

Quiz

What single architectural change in cloud warehouses (Snowflake, BigQuery) is the real reason the industry flipped from ETL to ELT?

Quiz

You discover a timezone bug in a transform that has shipped wrong numbers for six months. Under ELT with a medallion architecture, what is the fast, correct fix?

Quiz

A dbt model was set to full-refresh by default and scheduled hourly; it rebuilds a 2 TB fact table from scratch every run and the Snowflake bill jumped 40%. The output is correct. Where is the bug and what is the fix?

Quiz

A regulated fintech ingests payment events containing card PANs, and compliance forbids raw cardholder data from ever sitting in the analytics warehouse. Which pattern fits, and why is the modern ELT default wrong here?

Quiz

An EL tool retried a partially-succeeded load and your revenue fact table now shows inflated totals. What property was missing, and what is the durable design fix?

Quiz

Someone calls schema-on-read 'pure freedom — no schema to fight at load time.' What does the unit's framing say they are missing?

Recap

The through-line: where the Transform runs decides everything downstream. Decoupled storage and compute made landing raw data cheap, which buys replayability through the medallion contract (immutable bronze, cleaned silver, business-ready gold). But the T is now metered on the warehouse bill, so you go incremental by default. And because loaders retry, every load must be idempotent (for example, a merge on a unique_key), or a retry doubles your data. ELT is the default; ETL still wins when a hard rule says raw PII must never touch the warehouse.
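To make the idempotency point concrete, here is a minimal sketch in plain Python rather than any particular warehouse or EL tool. The `event_id` key, the `Event` shape, and both loader functions are hypothetical names chosen for illustration. It contrasts a blind append with a merge keyed on a unique identifier when the same batch is delivered twice.

```python
from typing import Dict, List, TypedDict


class Event(TypedDict):
    event_id: str  # hypothetical unique key for each payment event
    amount: float


def append_load(table: List[Event], batch: List[Event]) -> None:
    """Non-idempotent load: blindly appends, so a retry duplicates rows."""
    table.extend(batch)


def merge_load(table: Dict[str, Event], batch: List[Event]) -> None:
    """Idempotent load: upserts on event_id, so a retry overwrites in place."""
    for row in batch:
        table[row["event_id"]] = row


batch: List[Event] = [
    {"event_id": "e1", "amount": 100.0},
    {"event_id": "e2", "amount": 50.0},
]

# Simulate a loader that retries the same batch after a partial success.
appended: List[Event] = []
append_load(appended, batch)
append_load(appended, batch)  # retry
appended_total = sum(row["amount"] for row in appended)

merged: Dict[str, Event] = {}
merge_load(merged, batch)
merge_load(merged, batch)  # retry
merged_total = sum(row["amount"] for row in merged.values())

print(appended_total)  # inflated: every row was counted twice
print(merged_total)    # stable: each event_id appears exactly once
```

In a real warehouse the same idea would typically be expressed as a merge or upsert on the key column; the dictionary here only stands in for that behavior.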


Trademarks belong to their respective owners. Editorial reference only.