
Data Engineering

Data platform: free-recall review

Crux: Free-recall prompts spanning the whole data-engineering track. Answer each in your own words first, then reveal the model answer and compare.
◷ 14 min

Retrieval beats re-reading. For each prompt, reconstruct a full answer from memory before you open the model answer — the effort of pulling the whole track together is what fuses the separate stores into one mental model.

Goal

Reconstruct the track’s spine without looking back: why OLTP and OLAP split, where the transform runs, how Parquet prunes, what an MV trades, why the log is the source of truth, and how search and vectors divide relevance — and how the seams between them are designed.

Recall before you leave
  1. Why can't one storage layout serve both the OLTP checkout path and OLAP revenue scans, and what is the standard two-store answer?
  2. What does choosing ELT over ETL actually buy you, and what is the cost?
  3. How does Parquet skip data, and what code mistake silently defeats it?
  4. What does a materialized view trade, and how do you keep that tradeoff honest in production?
  5. Why is the append-only event log the source of truth in event sourcing, and how does current state relate to it?
  6. How do full-text search and vector search divide the relevance problem, and why do mature systems run both?
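
Once you have attempted the Parquet prompt, a toy model can help you check your answer. The sketch below is not the Parquet reader and uses no Parquet library. `RowGroup`, `scan_with_stats`, and `scan_with_function` are hypothetical names invented for illustration, and the data is made up. It assumes each row group carries min/max statistics for a column, and shows that a range predicate on the raw column can skip whole groups, while the same filter expressed through a function of the column cannot use those statistics in this model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RowGroup:
    """Toy stand-in for a row group: column values plus min/max statistics."""
    values: List[int]

    @property
    def min(self) -> int:
        return min(self.values)

    @property
    def max(self) -> int:
        return max(self.values)


def scan_with_stats(groups: List[RowGroup], lo: int, hi: int) -> Tuple[List[int], int]:
    """Range predicate on the raw column: skip groups whose [min, max] misses [lo, hi]."""
    groups_read = 0
    hits: List[int] = []
    for group in groups:
        if group.max < lo or group.min > hi:
            continue  # statistics prove no row can match, so the group is never read
        groups_read += 1
        hits.extend(v for v in group.values if lo <= v <= hi)
    return hits, groups_read


def scan_with_function(
    groups: List[RowGroup], fn: Callable[[int], int], target: int
) -> Tuple[List[int], int]:
    """Predicate on fn(column): the statistics describe the raw column, so read everything."""
    groups_read = 0
    hits: List[int] = []
    for group in groups:
        groups_read += 1  # nothing can be ruled out without evaluating fn on each row
        hits.extend(v for v in group.values if fn(v) == target)
    return hits, groups_read


# Three sorted row groups covering 1..100, 101..200, and 201..300.
groups = [RowGroup(list(range(start, start + 100))) for start in (1, 101, 201)]

pruned_hits, pruned_reads = scan_with_stats(groups, 101, 200)
full_hits, full_reads = scan_with_function(groups, lambda v: (v - 1) // 100, 1)

print(pruned_reads, full_reads)  # groups read by each scan
print(pruned_hits == full_hits)  # both scans return the same rows
```

Both scans return the same rows, but the second one touches every row group. Real engines can sometimes rewrite simple expressions back into a range on the raw column, so treat this as the worst case rather than a guarantee.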
Recap

If you reconstructed each answer, you hold the track’s spine: OLTP and OLAP split because no single layout wins both access patterns; ELT buys replay by retaining the raw data; Parquet prunes via footer statistics unless a function wrapped around the column hides it from the reader; a materialized view trades freshness for read speed and needs a declared freshness SLA; the event log is the source of truth and current state is a fold over it; and full-text search plus vector search divide lexical from semantic relevance. Above all, each store is correct for its own job. The system as a whole stays correct only when you design the contract, delivery guarantee, freshness, and reconciliation at every seam between them.
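
The idea that state is a fold over the event log can be made concrete with a few lines of code. This is a minimal sketch under stated assumptions: the `Event` type, the `"deposited"` and `"withdrawn"` event kinds, and the account names are all hypothetical and chosen only for illustration. It shows current state being derived by replaying an append-only list of events through a pure `apply_event` function.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    """Immutable fact appended to the log (hypothetical shape)."""
    kind: str      # assumed kinds: "deposited" or "withdrawn"
    account: str
    amount: int


def apply_event(state: Dict[str, int], event: Event) -> Dict[str, int]:
    """Pure step function: previous state plus one event gives the next state."""
    balance = state.get(event.account, 0)
    if event.kind == "deposited":
        balance += event.amount
    elif event.kind == "withdrawn":
        balance -= event.amount
    else:
        raise ValueError(f"unknown event kind: {event.kind}")
    # Return a new dict so earlier states are never mutated.
    return {**state, event.account: balance}


def replay(log: List[Event]) -> Dict[str, int]:
    """Current state is a left fold of apply_event over the log, starting from empty."""
    return reduce(apply_event, log, {})


# The append-only log is the only thing stored; balances are derived from it.
log = [
    Event("deposited", "a", 100),
    Event("deposited", "b", 50),
    Event("withdrawn", "a", 30),
]

print(replay(log))      # state after the full log
print(replay(log[:2]))  # state as of an earlier point, from the same log
```

Because the state is derived, replaying a prefix of the log reconstructs the state at any earlier point, and a bug in `apply_event` can be fixed and the state rebuilt without losing data.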


Trademarks belong to their respective owners. Editorial reference only.