
Backend Architecture

Pool sizing: why bigger is not faster

Crux: The instinct is to raise the pool size until the database stops being the bottleneck, but past a small number throughput drops, because a connection is a real backend process competing for cores, locks, and memory. The right size comes from a small formula, not a big guess.

A team sees the database is the bottleneck, so they raise the connection pool from 20 to 100, expecting roughly five times the throughput. Throughput drops. Latency climbs, CPU on the database spikes, and the work the server actually finishes per second goes down. They raised the one number that looked like a throttle and made everything worse. The pool was not too small — it was now far larger than the number of queries the database hardware can run at once, and every extra connection past that point is not parallelism, it is contention.

More connections is not more parallelism

It feels obvious that more connections means more concurrent work. It does not, and the reason is physical. A database server has a fixed number of CPU cores and a fixed number of disk spindles (or SSD queues). At any instant it can only truly execute as many queries as it has cores; everything else is waiting. A connection that holds a query is not magically running — it is competing for a core to run on.

So when you open 100 connections on an 8-core box, you do not get 100 queries running at once. You get 8 running and 92 fighting over those 8 cores, plus all the overhead of switching between them. The pool size sets how many queries can be in flight; the hardware sets how many can actually progress. When the first number greatly exceeds the second, the gap is pure waste.
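The gap between "in flight" and "actually progressing" can be put in a few lines. This is a deliberately crude sketch: it assumes every connection holds a CPU-bound query and that one core runs one query at a time, and the function name `split_in_flight` is made up for illustration.

```python
def split_in_flight(connections: int, cores: int) -> tuple[int, int]:
    """Return (running, waiting) for a given number of busy connections.

    Simplifying assumption: every connection holds a CPU-bound query,
    so at most one query per core makes progress at any instant.
    """
    running = min(connections, cores)
    waiting = connections - running
    return running, waiting


for pool_size in (8, 20, 100):
    running, waiting = split_in_flight(pool_size, cores=8)
    print(f"pool={pool_size:>3}  running={running}  waiting={waiting}")
```

Real workloads mix in I/O waits, so the true picture is less stark, but the shape holds: past the core count, every added connection lands in the waiting column.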

The three taxes of an oversized pool

Each connection past the useful count does not sit quietly. It costs you in three ways at once:

  • Context switching. The OS time-slices many connections across few cores. Each switch costs scheduler time and evicts the outgoing query's working data from CPU caches, so the next query starts cold. With 100 active connections on 8 cores, the box spends a growing share of CPU just changing which query is running instead of running queries.
  • Lock and latch contention. Concurrent transactions contend for row locks, and the database’s own internal latches (buffer pool, WAL, lock tables) are shared. More concurrent connections means more contention on those shared structures — and contention is super-linear, so it gets worse faster than the connection count grows.
  • Memory. In Postgres each connection is a forked backend process holding ~2–3 MB of baseline memory, and worse, work_mem is a limit per sort or hash operation rather than per connection, so a single query can use several multiples of it. A sort-heavy workload at work_mem = 16 MB across 100 connections can consume gigabytes (100 × 16 MB is already 1.6 GB at one sort per query); the same workload at 10 connections uses a tenth of that.
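The memory tax is easy to estimate on paper. The sketch below is a rough upper bound, not a measurement: `ops_per_query` (how many sort or hash steps one query runs at once) is a hypothetical parameter you would read off your own query plans, and the 3 MB baseline is taken from the top of the range quoted above.

```python
MB = 1024 * 1024


def worst_case_backend_memory(connections: int, work_mem_mb: int,
                              ops_per_query: int, baseline_mb: int = 3) -> int:
    """Upper-bound memory in bytes if every connection is busy at once.

    Assumes each connection runs one query, and each query runs
    `ops_per_query` operations that each use the full work_mem.
    """
    per_connection_mb = baseline_mb + work_mem_mb * ops_per_query
    return connections * per_connection_mb * MB


for pool_size in (10, 100):
    total = worst_case_backend_memory(pool_size, work_mem_mb=16, ops_per_query=4)
    print(f"pool={pool_size:>3}  worst case = {total // MB} MB")
```

Actual usage is usually lower, because operations only take what they need up to the limit. The point of the estimate is that the ceiling scales linearly with pool size.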

The formula: small, derived from hardware

HikariCP’s well-known guidance turns this into a starting formula:

connections = (core_count × 2) + effective_spindle_count

The intuition: while one query waits on disk I/O, another can use the freed CPU core — so you want roughly twice the cores to keep them busy through I/O stalls, plus one per independent disk that can serve a seek in parallel. A 4-core server with one SSD lands at (4 × 2) + 1 = 9 connections. Not 90. The number that maximises throughput is shockingly small, and measured benchmarks back it: a pool of ~10 often beats a pool of 100 on the same hardware, finishing more queries per second at lower latency.
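Written as code, the heuristic is a one-liner. The function name `starting_pool_size` is an illustrative choice, not part of HikariCP, and the result is only a first value to measure against.

```python
def starting_pool_size(core_count: int, effective_spindle_count: int) -> int:
    """Starting pool size: (core_count * 2) + effective_spindle_count."""
    if core_count < 1 or effective_spindle_count < 0:
        raise ValueError("core_count must be >= 1 and spindles >= 0")
    return core_count * 2 + effective_spindle_count


# The 4-core, single-SSD example from the text, counting the SSD as one spindle.
print(starting_pool_size(core_count=4, effective_spindle_count=1))  # 9
```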

Why this works

Why does a pool of 10 beat a pool of 100 when the workload clearly has more than 10 things to do? Because the extra 90 are not doing work — they are queuing inside the database instead of outside it, and queuing inside is worse. When 100 connections hit an 8-core box, the database accepts all 100 and time-slices them, paying context-switch and lock-contention tax on every one, so each individual query runs slower and the total finishes later. When 10 connections hit the same box, the database runs them near full speed and the other requests wait in the pool’s queue — cheaply, in your application’s memory, costing nothing on the database. Same total work, but the small pool lets the database run at its efficient operating point and pushes the unavoidable waiting to the cheapest place to wait. Bounding the pool is the same bounded-concurrency lesson as the async unit: you queue the overflow somewhere cheap rather than dumping it on the expensive resource.
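Queuing outside the database can be sketched with a semaphore standing in for the pool. This is a toy model, not a real pool: `run_bounded` is a hypothetical helper, and `asyncio.sleep` is a placeholder for the query. It only demonstrates that 100 requests never put more than `pool_size` queries in flight.

```python
import asyncio


async def run_bounded(total_requests: int, pool_size: int) -> int:
    """Run fake queries with at most `pool_size` in flight.

    Returns the highest number of queries observed in flight at once.
    """
    slots = asyncio.Semaphore(pool_size)  # stands in for the connection pool
    in_flight = 0
    peak = 0

    async def handle_request() -> None:
        nonlocal in_flight, peak
        async with slots:  # overflow waits here, in application memory
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.001)  # placeholder for the real query
            in_flight -= 1

    await asyncio.gather(*(handle_request() for _ in range(total_requests)))
    return peak


peak = asyncio.run(run_bounded(total_requests=100, pool_size=10))
print(f"peak queries in flight: {peak}")
```

The 90 requests that do not get a slot cost only a suspended coroutine each. The database never sees them until a slot frees up.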

Sizing is finding the saturation point

The formula is a starting point, not a final answer — real workloads have a mix of CPU-bound and I/O-bound queries, and the right number is found by measurement. The method is to raise concurrency while watching throughput and latency: throughput rises, then flattens (the saturation point), then falls as contention takes over. The best pool size sits at or just below that flat top — the smallest pool that reaches peak throughput. Past it you are buying latency with no throughput in return, exactly the queueing-knee behaviour from the throughput lesson, now driven by the pool you chose.
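Once you have load-test results, picking "the smallest pool that reaches peak throughput" is mechanical. In the sketch below the helper name, the 3% tolerance, and every number in `observed` are invented for illustration; substitute your own measurements.

```python
def smallest_pool_at_peak(measurements: dict[int, float],
                          tolerance: float = 0.03) -> int:
    """Return the smallest pool size within `tolerance` of the best throughput.

    `measurements` maps pool size -> measured queries per second.
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    peak = max(measurements.values())
    threshold = peak * (1 - tolerance)
    return min(size for size, qps in measurements.items() if qps >= threshold)


# Made-up load-test results, shaped like the rise / plateau / fall described above.
observed = {4: 2100.0, 8: 3900.0, 12: 4700.0, 16: 4750.0,
            24: 4600.0, 50: 3800.0, 100: 2500.0}
print(smallest_pool_at_peak(observed))  # 12
```

The tolerance exists because measurements are noisy: a pool that is within a few percent of the best run is effectively at the plateau, and choosing the smallest such pool keeps latency down.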

| Pool size (8-core box) | What actually happens | Throughput | Latency |
| 4 (too small) | Cores sit idle waiting on I/O | Below peak | Low but capacity wasted |
| ~10–16 (right) | Cores busy, minimal contention | Peak | Lowest at peak |
| 50 | Heavy context switching + lock contention | Below peak | Climbing |
| 100 | Database thrashes scheduling, not running | Well below peak | High, unstable |

Quiz

  1. A team raises the pool from 20 to 100 on an 8-core database to get more throughput, but throughput drops and latency climbs. Why?
  2. Using the HikariCP heuristic, what is a sensible starting pool size for a 4-core database with a single SSD?
  3. Why is per-connection memory a real constraint when sizing a Postgres pool, beyond the baseline process cost?

Recall before you leave
  1. Why does raising the pool size past a small number reduce throughput instead of increasing it?
  2. What is the HikariCP sizing formula and what is the reasoning behind it?
  3. How do you actually find the right pool size for a real workload, and why is a small pool better even when there is more work to do?
Recap

The instinct to fix a database bottleneck by enlarging the pool is backwards: a connection is a real backend process competing for a fixed number of cores, so a pool far larger than the hardware can run turns extra connections into pure contention — context switching across too few cores, super-linear lock and latch contention on shared internal structures, and per-connection memory where work_mem is multiplied per operation across every connection. HikariCP’s (cores × 2) + spindles puts a 4-core, one-SSD box at about 9 connections, and a pool of ~10 routinely beats a pool of 100 on identical hardware because the small pool lets the database run at its efficient point while the overflow waits cheaply in the pool’s queue rather than thrashing inside the engine. The right size is the smallest pool that reaches peak throughput, found by raising concurrency until throughput flattens. Knowing the size answers how many connections to keep — the next lesson asks what happens to a request when they are all busy: the wait queue, the acquisition timeout, and how that timeout becomes a deliberate latency dial.


Trademarks belong to their respective owners. Editorial reference only.