awesome-everything RU
↑ Back to the climb

Data Engineering

Materialized views: SQL and refresh reading

Crux Read real SQL — REFRESH CONCURRENTLY setup, pg_ivm immv creation, and a staleness-window check — and pick the behaviour or the highest-leverage fix a senior would make first.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at senior altitude — in orbit
◷ 14 min

The refresh statement and the SQL around it are where materialized-view problems are actually born. Read each snippet, predict what the database will do, then choose the fix a senior reaches for first.

Goal

Practise the loop you run when reviewing an MV: read the DDL and the refresh path, predict the lock and staleness behaviour, and reach for the structural fix before the tuning knob.

Snippet 1 — the CONCURRENTLY prerequisite

CREATE MATERIALIZED VIEW daily_revenue AS
SELECT date_trunc('day', created_at) AS day,
       region,
       sum(amount) AS revenue
FROM orders
GROUP BY 1, 2;

-- nightly job:
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_revenue;
Quiz

The nightly job fails immediately with 'cannot refresh materialized view concurrently' / a unique-index error. What is missing, and what exactly must it be?

Snippet 2 — incremental maintenance with pg_ivm

CREATE EXTENSION IF NOT EXISTS pg_ivm;

-- turn the aggregate into an incrementally-maintained view
SELECT create_immv('daily_revenue_ivm',
  $$ SELECT date_trunc('day', created_at) AS day,
            region,
            sum(amount) AS revenue
     FROM orders
     GROUP BY 1, 2 $$);
Quiz

After create_immv, a teammate notices that every INSERT into orders got measurably slower. Is this a bug, and what is happening?

Snippet 3 — detecting staleness

-- ops wants an alert if the MV has not refreshed recently.
-- they query the view's own data for the newest day:
SELECT max(day) AS newest_day FROM daily_revenue;
-- alert fires only if newest_day < current_date;
Quiz

Why is alerting on max(day) inside the view an unreliable staleness check, and what should you track instead?

Snippet 4 — the refresh that never finishes

-- a long-running analytics session is open:
BEGIN;
SELECT ... FROM orders JOIN ... ;   -- runs for 20+ minutes, still open

-- meanwhile, the scheduler fires:
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_revenue;  -- appears to hang
Quiz

The CONCURRENTLY refresh sits and does nothing while the analytics session is open. What is the mechanism, and what is the durable fix?

Recap

The refresh path is where MV problems live: CONCURRENTLY needs a unique index on plain view columns or it errors; pg_ivm’s create_immv moves cost onto every write via triggers, so a per-write slowdown is the design, not a bug; staleness must be measured from the refresh job’s own success and timestamp, never from the view’s data content; and a CONCURRENTLY refresh stalls behind long open transactions, so bound them or it never catches up. Read the DDL and the refresh, predict the lock and staleness behaviour, then fix the structure before reaching for a knob.

Continue the climb ↑Materialized views: own the refresh
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources2
expand
  1. 01
  2. 02

Trademarks belong to their respective owners. Editorial reference only.