awesome-everything RU
↑ Back to the climb

Databases

Index types: GIN, GiST, BRIN, Hash, Bloom, and HOT updates

Crux Beyond B-tree: when GIN, GiST, BRIN, Hash, and Bloom are the right tool, how HOT updates reduce index write overhead, why fillfactor matters, and why fresh statistics matter more than the index itself.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at middle altitude — in the sky
◷ 18 min

A team adds CREATE INDEX ON events(payload) on a JSONB column and watches query time drop from 4 seconds to 4 seconds. B-tree cannot index inside JSONB. GIN can. Knowing which index type matches which data shape is what separates a working index from a wasted hour and a wasted gigabyte.

Why B-tree is not enough

B-tree indexes one value per row and answers equality, range, and sort queries on that value. When a row contains multiple values (JSONB keys, array elements, text lexemes) or when the “less than” relation is not defined (geometric shapes, IP ranges), B-tree has no useful sort order to walk. The other index types fill those gaps.

Index typeBest forTrade
B-treeEquality, range, sort on ordered types~10-30% of table size; 5-50 µs/write
GINJSONB containment, arrays, full-text2-5x size; slow on high-write JSONB
GiSTGeospatial (PostGIS), range types, nearest-neighbourExtension-dependent size; complex update
BRINHuge insert-only tables with natural orderingTiny size; only works on physically-ordered data
HashEquality on very wide keysNo range; usually B-tree is better
BloomMulti-column equality, unpredictable filter combosApproximate (false positives); heap re-check required

GIN: inverted index for composite types

Generalized Inverted Index. Where B-tree stores one value per row, GIN stores many values per row — every key in a JSONB document, every element in an array, every lexeme in a tsvector. Internally, GIN is a B-tree of indexed values, each pointing to a posting list (or posting tree for large lists) of TIDs that contain that value.

Use cases:

  • JSONB containment: WHERE payload @> '{"event_type":"login"}' — does this document contain this sub-document?
  • Array overlap: WHERE tags && ARRAY['premium','active']
  • Full-text search: WHERE to_tsvector('english', body) @@ to_tsquery('postgres')

GIN indexes are typically 2-5x the size of the indexed column data. On high-write JSONB columns, each row insert updates every indexed key — so fastupdate mode defers those updates to a background process, giving faster writes at the cost of slightly stale index entries until the buffer is flushed. Production discipline: measure GIN write cost after adding it to a write-heavy column; if insert latency spikes, consider fastupdate or narrow to an expression index on known-hot JSONB fields.

GiST: generic search tree

For data that lacks a global sort order: geospatial shapes, IP ranges, timestamp ranges. GiST is extensible — each non-leaf node stores a “predicate” (e.g., a bounding box) that summarises its subtree. Queries prune subtrees by testing the predicate before descending.

Most engineers encounter GiST via PostGIS:

CREATE INDEX ON locations USING GIST (geom);
-- enables: WHERE ST_DWithin(geom, ST_MakePoint(-74, 40.7), 1000)

Range types (int4range, tsrange) also use GiST by default. Nearest-neighbour queries (ORDER BY geom <-> point LIMIT 10) require GiST.

BRIN: tiny index for huge ordered tables

Block Range INdex. Stores min and max values per block range (default: 128 pages, ~1 MB of heap). To answer a query, Postgres reads the BRIN metadata, identifies candidate ranges, and scans only those blocks. BRIN is approximate — it narrows the heap scan but cannot eliminate it.

BRIN is only useful when data is physically ordered by the indexed column. The classic case: a time-series table where rows are inserted in timestamp order.

-- 10 billion rows, 8 TB table
CREATE INDEX ON events USING BRIN (created_at);
-- Result: ~few MB index; range query reads only the relevant 128-page blocks

For randomly distributed data, BRIN provides no speedup — every block range’s min-max spans the full range, and Postgres must scan all blocks.

Hash indexes

Equality only; no range support; no ORDER BY. Prior to Postgres 10, Hash indexes were not WAL-logged and could not survive crashes — avoid them in older installations. Post-10 they are stable. The narrow use case: equality lookup on very wide keys (e.g., a 1000-character JSON string) where B-tree’s size would be excessive. In practice, B-tree handles this well enough that Hash indexes remain rare. Default to B-tree unless you have a specific, measured reason.

Bloom indexes for multi-column equality

The bloom extension provides a probabilistic filter index. A Bloom index on (a, b, c, d, e) accelerates any query that filters by any subset of those columns — without the leading-column rule. The trade: it is approximate. Every match is a candidate that must be re-verified against the heap (false positives exist). Use case: tables with many columns where queries combine different filter sets unpredictably, making it impractical to build 2^N composite B-tree indexes.

HOT updates and fillfactor

When a row is UPDATEd, Postgres creates a new tuple version (MVCC) and must update every index referencing that row — even indexes whose columns did not change. This is write amplification: one logical update fans out to N index updates.

HOT (Heap-Only Tuple) eliminates index updates when both conditions hold:

  1. No indexed column changed in this update.
  2. The new tuple fits on the same heap page as the old one (there is free space).

When HOT fires, only the heap page changes — index entries still point to the old TID, which chains to the new version. The result: dramatically lower write overhead on update-heavy workloads.

Fillfactor controls how full each page is at insert time (default: 100% for tables, 90% for B-tree indexes). Setting fillfactor to 70–80% on an update-heavy table leaves room on pages for HOT updates:

ALTER TABLE orders SET (fillfactor = 70);
-- Re-fill pages: CLUSTER or VACUUM FULL (offline) or gradual via natural inserts

The trade is initial storage: lower fillfactor means more pages, larger table. Production teams measure update frequency and tune fillfactor on hot tables — a frequently overlooked lever.

Write cost accounting

Every index adds ~5-50 µs per write per index touched. For a table with 10 indexes, each INSERT costs 50-500 µs in index overhead beyond the heap write. Under a write-heavy workload (10,000 inserts/second), that is 0.5-5 ms of overhead per second of clock time. GIN indexes on JSONB columns with many keys can cost 50-200 µs per write — higher than any B-tree.

Senior teams model index budget per table: given write-throughput target, how many indexes can the table afford? Tools: pg_stat_statements for baseline latency, pgbench for synthetic write benchmarks under new index candidates.

Index size vs shared_buffers

Postgres caches hot pages in shared_buffers (typically 25% of server RAM). Large indexes compete with heap pages for cache space. A 5 GB GIN index on a JSONB column can evict hot heap pages, causing queries that were fast from cache to become disk-bound. Symptom: pg_stat_database.blks_hit / blks_read ratio drops after adding a new index. Mitigation: use partial indexes, expression indexes on known-hot fields, or drop unused indexes to reclaim cache.

Why ANALYZE matters more than the index itself

The planner’s choice of which index to use — or whether to use one at all — depends on table statistics collected by ANALYZE (run automatically by autovacuum, manually when needed). Stale statistics cause the planner to under- or over-estimate how many rows a filter matches, leading to wrong plan choices.

Production discipline:

  • Run ANALYZE after bulk inserts, large UPDATEs, or schema changes affecting distributions.
  • Tune autovacuum_analyze_scale_factor lower on tables with skewed data (default 0.2 = 20% change triggers analyze; set 0.02 for hot OLTP tables).
  • Use CREATE STATISTICS for correlated columns (e.g., country, city) so the planner does not assume independence.

A “wrong index choice” diagnosis is often actually “stale stats.”

Index type costs
B-tree size, 8-byte key, 100M rows
~1 GB
GIN size vs indexed column data
2-5x larger
BRIN size, 10B-row time-series
~few MB
Hash index use case in production
rare (B-tree usually better)
HOT update savings (no indexed column changed)
eliminates N index updates
GIN write overhead on high-write JSONB
~50-200 µs per row
B-tree write overhead per index
~5-50 µs per write
fillfactor default (table / B-tree)
100% / 90%
fillfactor for update-heavy HOT-friendly tables
70-80%
ANALYZE trigger (autovacuum_analyze_scale_factor default)
20% of rows changed
Quiz

A table stores geolocation data and the hot query is: WHERE ST_DWithin(geom, $1, 500). Which index type is correct?

Quiz

HOT updates skip index maintenance. What is the requirement for a HOT update to trigger?

Order the steps

Match the scenario to the correct index type — order from best match to worst match for the scenario: 'search JSONB documents by key presence'.

  1. 1 GIN — designed for multi-value types; answers @> and ? operators natively
  2. 2 Expression B-tree on (payload->>'key') — works for equality on a known field, but not general containment
  3. 3 B-tree on payload — indexes the whole document as opaque; useless for containment queries
  4. 4 BRIN on payload — only useful for min/max on ordered data; irrelevant here
Why this works

Why does Postgres ship six index types instead of one universal index? Because the fundamental data structures are incompatible. B-tree requires a total order (every value is comparable with less-than). JSONB documents have no total order. Geometric shapes require a spatial predicate, not a linear key. Time-series blocks need range summaries. Trying to fit all of these into B-tree would produce either a useless or astronomically expensive structure. The design choice — multiple specialized types — keeps each index tight and appropriate for its data shape.

Recall before you leave
  1. 01
    A colleague suggests adding a GIN index to a JSONB audit_log column that receives 5,000 inserts per second. What are the write-cost implications, and what alternative might work?
  2. 02
    Explain when BRIN is dramatically better than B-tree and when it is useless.
  3. 03
    What is write amplification in the context of indexes, and what mechanisms reduce it?
Recap

Postgres ships six index types. B-tree covers equality, range, and sort on ordered types — 95% of production use. GIN indexes multiple values per row (JSONB keys, array elements, tsvector lexemes) using an inverted structure; it is 2-5x larger and slower to write than B-tree. GiST handles geospatial and range types where no linear sort order exists. BRIN stores min-max per block range and is only useful when data is physically ordered (time-series inserts). Hash is equality-only and rarely preferred over B-tree. Bloom is a probabilistic multi-column equality index that avoids the leading-column rule at the cost of false positives.

HOT updates skip index maintenance when no indexed column changed and the heap page has space — tune fillfactor to 70-80% on update-heavy tables to keep HOT firing. GIN write cost on high-write JSONB columns can reach 50-200 µs per insert; narrow to expression indexes on known-hot fields when write overhead is the bottleneck. ANALYZE must be current for the planner to pick the right index; stale statistics are the most common cause of “unexpected Seq Scan despite existing index.”

Connected lessons
appears again in174
Continue the climb ↑Index-only scans, the Visibility Map, and INCLUDE
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources3
expand
  1. 01
  2. 02
  3. 03

Trademarks belong to their respective owners. Editorial reference only.