Data Engineering DATA · 04 · 10

Materialized views: own the refresh

Hands-on project — build a slow aggregate dashboard, materialize it, compare full-CONCURRENTLY vs pg_ivm refresh under load, bound the staleness window, and prove the tradeoffs with before/after numbers.

DATA Senior ◷ 240 min

Level

FoundationsJuniorMiddleSenior

Reading about refresh strategy is not the same as feeling the lock, the write-path tax, and the silent staleness for yourself. Build a real slow dashboard, materialize it three ways, drive it under load, and let the measurements decide which strategy fits — with evidence at every step.

Goal

Turn the unit’s decision sequence into a reproducible experiment: materialize an expensive aggregate, compare full vs concurrent vs incremental refresh under read and write load, bound the staleness window, and verify each tradeoff with before/after numbers — not intuition.

Project

0 of 8

Objective

Take a deliberately slow aggregate query over a large fact table, materialize it, and empirically compare full REFRESH, REFRESH CONCURRENTLY, and pg_ivm incremental maintenance under realistic read+write load — then ship the strategy that fits a stated staleness budget, with monitoring that makes a failed refresh visible.

Requirements

Acceptance criteria

A before/after table: plain-view read latency, MV read latency, plain-refresh read-block duration, concurrent-refresh duration, and per-write cost with vs without pg_ivm — all measured under the same data and load, not estimated.
A one-paragraph recommendation that picks one strategy for a stated staleness budget and write rate, citing your own numbers (e.g. 'tolerate 5 minutes, 5k inserts/sec → scheduled CONCURRENTLY, because incremental added Xms/write').
Evidence that CONCURRENTLY does not block reads while plain refresh does, captured from the reader-polling test.
A demonstrated alert: a deliberately failed refresh fires the staleness alert while reads keep returning the old snapshot — proving the failure-open mode and that you caught it.

Senior stretch

Add a one-page on-call runbook: how to detect a stalled refresh, the CONCURRENTLY transaction-drain trap, the per-write cost of pg_ivm, and a decision tree from staleness budget + write rate to strategy.
Compare against a streaming materialized view: load the same aggregate into ClickHouse as an MV-on-insert (or Materialize) and show how it removes the refresh job — and what the POPULATE / cross-block-aggregation catch looks like in practice.
Add a CI gate that runs the reader-polling test and fails the build if a refresh blocks reads beyond a threshold (catching an accidental drop of CONCURRENTLY or the unique index).
Model the cost curve: vary the write rate and chart where incremental maintenance stops being cheaper than scheduled full refresh, finding the crossover point for your hardware.

Recap

This is the loop you will run whenever someone says “just materialize it”: confirm the read is expensive and repeated, measure the speedup, then choose refresh by measuring its cost — the plain-refresh lock window, the concurrent-refresh duration and transaction-drain stall, and the per-write tax of incremental maintenance — against the consumer’s staleness budget and the table’s write rate. Bound the staleness window, monitor the refresh job so a silent failure is visible, and let before/after numbers, not intuition, pick the strategy. Doing it once on a real dataset makes the production design review muscle memory.

Something unclear?

Ask a question about this lesson. Questions are anonymous and go straight to the author to make the lesson better.