awesome-everything RU
↑ Back to the climb

Databases

The leading-column rule and composite index design

Crux A multi-column B-tree is sorted by its first column first. Queries that skip the leading column cannot use it. Designing composites around this rule is the central skill in Postgres index engineering.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at middle altitude — in the sky
◷ 16 min

A team creates an index on (workspace_id, status, created_at) for a dashboard. Queries filter by workspace_id alone — fast. Queries filter by workspace_id and status — still fast. A new analytics query filters by status alone — sequential scan, 8 seconds. The index exists and is correct. The query is wrong for this index. The difference is the leading-column rule.

B-tree internals: why order matters

Postgres’s default index type is a B-tree (balanced search tree). Each internal node is a disk page (8KB) containing a sorted list of keys and pointers to child pages. Leaf nodes contain the actual key values and pointers (TIDs) to the heap rows.

A B-tree on a single column (a) is sorted globally by a. To answer WHERE a = 42, Postgres walks the tree from the root, taking at most ~4 hops for a 100M-row table (B-tree depth ≈ 4 levels with fanout ~200–400 keys per page), then reads the matching leaf entries.

A multi-column B-tree on (a, b, c) is sorted primarily by a, then by b within equal a values, then by c within equal (a, b) pairs. It accelerates:

  • Queries filtering on a alone: scan the leading subtree.
  • Queries filtering on a and b: narrow further.
  • Queries filtering on a, b, c: narrowest possible, best case.
  • Queries with a plus a range on b (e.g., WHERE a = 1 AND b > 100): still uses the index.
  • Queries with a plus ORDER BY b: the index already provides that sort order.

It does NOT accelerate:

  • Queries filtering only on b, only on c, or only on b, c — the index provides no ordering entry point for these; a full scan of the index is required, which is almost always worse than a seq scan.
Query filterIndex (a, b, c) helps?Why
WHERE a = ?YesLeading column — narrows by a
WHERE a = ? AND b = ?YesPrefix (a, b)
WHERE a = ? AND b > ?YesRange on second column after equality on first
WHERE a = ? ORDER BY bYesIndex already sorted by b within a
WHERE b = ?Nob is not the leading column
WHERE b = ? AND c = ?NoNeither leading

Designing composite indexes for real query patterns

The leading-column rule means composite index design follows the queries, not the table. The discipline:

  1. List the top-N hot queries for a table.
  2. Group them by which column is always present in the filter.
  3. For each group, design a composite with that always-present column as the leading column, then add secondary columns in decreasing selectivity order.
  4. Use ORDER BY columns as trailing key columns — they serve both the filter and the sort without an extra Sort step.

Example: a dashboard query WHERE workspace_id = $1 AND status = 'pending' ORDER BY created_at DESC LIMIT 50. The always-present column is workspace_id (every query is tenant-scoped). The secondary filter is status. The sort is created_at DESC.

Best composite: (workspace_id, created_at DESC) WHERE status = 'pending' — the leading column matches the always-present filter; the sort column is second; the partial WHERE clause is handled separately (covered in lesson 03). An index (workspace_id, status, created_at) would also work but is a different shape.

One composite vs two single-column indexes

A composite (a, b) is one physical structure. It costs roughly 30–50% more than (a) alone in size and write overhead. Two single-column indexes (a) and (b) cost roughly the sum of their sizes and write overheads separately.

A composite wins when the dominant query always filters by the leading column. Two single-column indexes win when queries filter by a often AND by b often, never together — in that case no composite serves both, and the planner can combine them via a Bitmap Index Scan.

Senior rule: design the composite for the dominant query first. Add a second index only if a separate hot query needs a different leading column and the Bitmap And plan is not fast enough.

Trace it
1/3

A team migrates a legacy events table to add proper indexes. Walk the decisions.

1
Step 1 of 3
Step 1: how do you identify which indexes are needed?
2
Locked
Step 2: how do you design composites?
3
Locked
Step 3: how do you deploy without downtime?
Quiz

A table has an index on (region, status). Which query can use this index?

Quiz

A query filters on (region, status) where region has 5 values and status has 4. Which composite index order is better?

Order the steps

Order the composite index design steps:

  1. 1 List the top-N hot queries for the table
  2. 2 Identify which column is always present in the WHERE clause
  3. 3 Use that always-present column as the leading column
  4. 4 Add secondary filter columns in selectivity order
  5. 5 Add ORDER BY columns as trailing key columns to avoid extra Sort steps
  6. 6 Verify with EXPLAIN ANALYZE that the planner uses the index and eliminates Sort nodes
Recall before you leave
  1. 01
    Explain in detail why the leading-column rule exists, and what production patterns work around it.
  2. 02
    When does a composite index win over two single-column indexes, and when do two single-column indexes win?
  3. 03
    A query is SELECT * FROM events WHERE project_id = $1 ORDER BY created_at DESC LIMIT 50. Design the best single index.
Recap

A composite B-tree on (a, b, c) is sorted by a globally, b within each a-group, c within each (a, b)-group. The engine navigates from the root using this sort order, so the leading column must appear in the filter — otherwise the index offers no starting point. This is the most violated rule in production indexing. Design composites by listing hot queries, identifying the always-present filter column as the leader, adding secondary columns in selectivity order, and matching ORDER BY with trailing key columns. A well-designed composite typically replaces 3–5 single-column indexes at lower total write cost.

Connected lessons
appears again in174
Continue the climb ↑Partial, expression, and covering indexes
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources3
expand
  1. 01
  2. 02
  3. 03

Trademarks belong to their respective owners. Editorial reference only.