awesome-everything RU
↑ Back to the climb

Databases

Partial, expression, and covering indexes

Crux Three index modifiers that change what a B-tree costs and what it can answer: partial restricts to a row subset, expression indexes a derived value, INCLUDE adds projection columns without affecting the sort key.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at middle altitude — in the sky
◷ 14 min

A tasks table has 100M rows. 80% have status = 'done' and are never queried by the dashboard. The full B-tree on (workspace_id, created_at) is 2 GB and every insert touches it. A partial index WHERE status IN ('open','in_progress') is 400 MB, 5× cheaper to maintain, and the dashboard queries run just as fast — because it only indexes the rows the dashboard actually needs.

Partial indexes

A partial index is a B-tree (or any index type) built only over rows matching a WHERE predicate.

CREATE INDEX ON orders(user_id) WHERE status = 'pending';

This indexes only pending orders — typically a small fraction of the table. The result: smaller index, faster lookup on the hot subset, cheaper write overhead (the index is only touched when status = 'pending').

For the planner to use a partial index, the query’s WHERE clause must logically imply the index’s predicate. A query WHERE user_id = 42 AND status = 'pending' can use the index. A query WHERE user_id = 42 alone cannot (it would need to match all statuses, but the index only covers pending rows).

Production use cases:

  • Soft-delete pattern: WHERE deleted_at IS NULL — only active rows are indexed; historical deleted rows are excluded.
  • Hot-subset filters: a task table where 95% of rows are status = 'done' — index only 'open' and 'in_progress'.
  • Feature flags: WHERE feature_x_enabled = true — index only rows where the flag is active.

Partial unique indexes are a powerful variant. CREATE UNIQUE INDEX ON users (email) WHERE deleted_at IS NULL enforces “email is unique among non-deleted users.” This lets you soft-delete a user and later re-register the same email — impossible with a simple UNIQUE constraint on the email column.

Expression indexes

An expression index indexes a function of one or more columns, not the columns themselves.

CREATE INDEX ON users (LOWER(email));

This lets WHERE LOWER(email) = 'alice@x.com' use the index. Without it, LOWER(email) is evaluated on every row, defeating any index on the email column.

Common patterns:

  • Case-insensitive search: LOWER(email), UPPER(code).
  • Date truncation: DATE_TRUNC('day', created_at) for group-by-day queries.
  • JSON path extraction: (data->>'event_type') — indexes one field inside a JSONB column.
  • Computed values: COALESCE(first_name, '') || ' ' || COALESCE(last_name, '') for name-search.

The query must use the exact expression for the planner to recognize the index. WHERE LOWER(email) = ? uses the index. WHERE email = ? does not (unless the email is already lowercase and a regular index exists).

Covering indexes with INCLUDE

A regular B-tree index stores key columns in both internal nodes and leaf pages. The INCLUDE clause (Postgres 11+) adds extra columns to the leaf pages only — they are not part of the sort key, but they are stored in the leaf, available for the query.

When a query’s SELECT projection needs only columns that are present in the index (either as key columns or INCLUDE columns), Postgres can answer the query without fetching the row from the heap at all — this is an index-only scan. It is 10–100× faster than an index scan for hot reads because it eliminates the heap fetch entirely.

CREATE INDEX CONCURRENTLY idx_orders_workspace_pending_recent
ON orders (workspace_id, created_at DESC)
WHERE status = 'pending'
INCLUDE (id, total_cents);

This index:

  • Filters by workspace_id (leading column, tenant scope).
  • Sorts by created_at DESC (serves the ORDER BY without a Sort step).
  • Partial WHERE status = 'pending' — only covers ~20% of the table.
  • INCLUDE (id, total_cents) — the typical dashboard projection is covered, no heap fetch needed.
Indexes: size, build, and write cost
B-tree size, 8-byte key, 100M rows
~1 GB
B-tree depth, 100M rows
~4 levels
B-tree lookup time
~5-50 μs
Index build (CREATE INDEX CONCURRENTLY)
~10-30 min / 100M rows
Index write overhead per INSERT/UPDATE
~5-50 μs per index touched
Typical index overhead vs table size
~10-30% per index
Index-only scan vs index scan speedup
~10-100x on hot reads
Partial index size, 5%-matching predicate
~5% of full B-tree
ModifierWhat it doesWhen to use
Partial (WHERE)Index only rows matching predicateHot subset is <20% of table; or unique constraint on non-deleted rows
ExpressionIndex a computed valueQuery filters on LOWER(col), DATE_TRUNC, JSON field, or other function
INCLUDEStore extra columns in leaf pages without affecting sortQuery projects columns beyond the key; enables index-only scan
Why this works

Why are partial indexes underused in practice? Because they require the developer to know the predicate before writing the query. Many teams add indexes by looking at slow queries after the fact, not by thinking about data distribution at schema design time. The discipline: when a new hot query always filters by a specific status, flag, or soft-delete condition, build the partial index at the same time the feature ships. The cost is near-zero at feature time; the benefit is 10–20× cheaper index maintenance across the table’s lifetime.

Pick the right index for a hot dashboard query

1/3
Quiz

A query: WHERE LOWER(email) = 'alice@x.com'. Which index does the planner use?

Quiz

A table has 10M rows, 95% with status = 'done'. A dashboard queries only WHERE status = 'open'. Which is the best index strategy?

Recall before you leave
  1. 01
    Why can a partial unique index enforce a constraint that a plain UNIQUE constraint on a column cannot?
  2. 02
    What two conditions must hold for Postgres to use an index-only scan?
  3. 03
    A query filters on DATE_TRUNC('day', created_at) = '2026-01-15'. There is a plain index on created_at. Does the planner use it?
Recap

Three modifiers turn a plain B-tree into a targeted tool. Partial indexes (WHERE predicate) restrict coverage to matching rows — 5% coverage means 5% of full-index size and write cost. Expression indexes cover computed values like LOWER(email) or (data->>'field') — the query must use the exact same expression. INCLUDE columns (Postgres 11+) store projection-only columns in the leaf pages without affecting the sort key, enabling index-only scans that never touch the heap. Used together — partial + composite + INCLUDE — a well-designed index can cover an entire hot dashboard query with near-zero heap I/O.

Connected lessons
appears again in174
Continue the climb ↑Index types: GIN, GiST, BRIN, Hash, Bloom, and HOT updates
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources3
expand
  1. 01
  2. 02
  3. 03

Trademarks belong to their respective owners. Editorial reference only.