Pool sizing: why bigger is not faster

The instinct is to raise the pool size until the database stops being the bottleneck — but past a small number throughput drops, because a connection is a real backend process competing for cores, locks, and memory. The right size is a small formula, not a big guess.

A team sees the database is the bottleneck, so they raise the connection pool from 20 to 100, expecting roughly five times the throughput. Throughput drops. Latency climbs, CPU on the database spikes, and the work the server actually finishes per second goes down. They raised the one number that looked like a throttle and made everything worse. The pool was not too small — it was now far larger than the number of queries the database hardware can run at once, and every extra connection past that point is not parallelism, it is contention.

More connections is not more parallelism

It feels obvious that more connections means more concurrent work. It does not, and the reason is physical. A database server has a fixed number of CPU cores and a fixed number of disk spindles (or SSD queues). At any instant it can only truly execute as many queries as it has cores; everything else is waiting. A connection that holds a query is not magically running — it is competing for a core to run on.

So when you open 100 connections on an 8-core box, you do not get 100 queries running at once. You get 8 running and 92 fighting over those 8 cores, plus all the overhead of switching between them. The pool size sets how many queries can be in flight; the hardware sets how many can actually progress. When the first number greatly exceeds the second, the gap is pure waste.
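The arithmetic in that paragraph can be sketched in a few lines. This is a deliberately crude model, assuming one query occupies one core at any instant and ignoring I/O waits; the function name `split_connections` is made up for illustration.

```python
def split_connections(active_connections: int, cores: int) -> tuple[int, int]:
    """Return (running, waiting) at a single instant.

    Simplifying assumption: one query occupies one core, I/O waits are ignored.
    """
    running = min(active_connections, cores)
    waiting = active_connections - running
    return running, waiting


print(split_connections(100, 8))  # (8, 92)
print(split_connections(10, 8))   # (8, 2)
```

Both pools keep all 8 cores occupied in this model; the larger one only adds 90 more waiters for the scheduler to juggle.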

The three taxes of an oversized pool

Each connection past the useful count does not sit quietly. It costs you in three ways at once:

Context switching. The OS time-slices many connections across few cores. Each switch flushes CPU caches and costs scheduler time. With 100 active connections on 8 cores, the box spends a growing share of CPU just changing which query is running instead of running queries.
Lock and latch contention. Concurrent transactions contend for row locks, and the database’s own internal latches (buffer pool, WAL, lock tables) are shared. More concurrent connections means more contention on those shared structures — and contention is super-linear, so it gets worse faster than the connection count grows.
Memory. In Postgres each connection is a forked backend process holding ~2–3 MB of baseline memory, and worse, work_mem is a limit per sort or hash operation rather than per connection, so a single query can use several multiples of it. A sort-heavy query at work_mem = 16 MB across 100 connections can claim gigabytes under load; the same workload at 10 connections cannot.
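The memory tax is the easiest of the three to put a number on. The sketch below is a rough upper bound, not a measurement: Postgres treats work_mem as a limit that each sort or hash operation may grow to, so real usage depends on the query plans. The function name, the `ops_per_query` parameter, and the 3 MB default (taken from the top of the ~2–3 MB range above) are assumptions for illustration.

```python
def worst_case_memory_mb(
    connections: int,
    work_mem_mb: int,
    ops_per_query: int = 1,   # assumed number of sort/hash operations per query
    baseline_mb: int = 3,     # assumed per-backend baseline, upper end of ~2-3 MB
) -> int:
    """Upper bound on memory if every connection runs a sort-heavy query at once."""
    per_connection = baseline_mb + work_mem_mb * ops_per_query
    return connections * per_connection


print(worst_case_memory_mb(100, 16))                   # 1900
print(worst_case_memory_mb(10, 16))                    # 190
print(worst_case_memory_mb(100, 16, ops_per_query=4))  # 6700
```

Under these assumptions the 100-connection pool can claim roughly 1.9 GB with one sort per query and 6.7 GB with four, while the 10-connection pool stays under 200 MB for the single-sort case.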

Together, these three taxes mean that every connection past the hardware ceiling is not neutral: it is a drag on every other connection. Overlook any one of them and you might think “a few extra connections can’t hurt”; overlook all three and you end up like the team in the opening example, buying latency with no throughput in return.

Each connection past (cores × 2) + spindles is not neutral — it taxes every other connection on cores, locks, and memory at once.

The formula: small, derived from hardware

HikariCP’s well-known guidance turns this into a starting formula:

connections = (core_count × 2) + effective_spindle_count

The intuition: while one query waits on disk I/O, another can use the freed CPU core — so you want roughly twice the cores to keep them busy through I/O stalls, plus one per independent disk that can serve a seek in parallel. A 4-core server with one SSD lands at (4 × 2) + 1 = 9 connections. Not 90. The number that maximises throughput is shockingly small, and measured benchmarks back it: a pool of ~10 often beats a pool of 100 on the same hardware, finishing more queries per second at lower latency.
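As a quick check on the arithmetic, the formula can be written as a one-line function. The name `starting_pool_size` is hypothetical; the result is a starting point to measure from, not a tuned value.

```python
def starting_pool_size(core_count: int, effective_spindle_count: int) -> int:
    """Starting pool size: (core_count * 2) + effective_spindle_count."""
    if core_count < 1 or effective_spindle_count < 0:
        raise ValueError("core_count must be >= 1 and spindle count >= 0")
    return core_count * 2 + effective_spindle_count


print(starting_pool_size(4, 1))  # 9
print(starting_pool_size(8, 1))  # 17
```

For an 8-core box with one SSD the same formula gives 17, which is best read as a ceiling to begin measuring under rather than a target.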

Why this works

Why does a pool of 10 beat a pool of 100 when the workload clearly has more than 10 things to do? Because the extra 90 are not doing work — they are queuing inside the database instead of outside it, and queuing inside is worse. When 100 connections hit an 8-core box, the database accepts all 100 and time-slices them, paying context-switch and lock-contention tax on every one, so each individual query runs slower and the total finishes later. When 10 connections hit the same box, the database runs them near full speed and the other requests wait in the pool’s queue — cheaply, in your application’s memory, costing nothing on the database. Same total work, but the small pool lets the database run at its efficient operating point and pushes the unavoidable waiting to the cheapest place to wait. Bounding the pool is the same bounded-concurrency lesson as the async unit: you queue the overflow somewhere cheap rather than dumping it on the expensive resource.
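The "queue outside, not inside" idea can be sketched with a semaphore standing in for the pool. This is a toy model, not a real driver: `asyncio.sleep` is a placeholder for the query, and the names `run_bounded` and `handle_request` are made up for illustration.

```python
import asyncio


async def run_bounded(total_requests: int, pool_size: int) -> tuple[int, int]:
    """Run fake queries with at most pool_size in flight.

    Returns (completed, peak_in_flight).
    """
    slots = asyncio.Semaphore(pool_size)  # stands in for the pool's connections
    in_flight = 0
    peak = 0
    completed = 0

    async def handle_request() -> None:
        nonlocal in_flight, peak, completed
        async with slots:               # overflow waits here, in application memory
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.001)  # placeholder for the real query
            in_flight -= 1
            completed += 1

    await asyncio.gather(*(handle_request() for _ in range(total_requests)))
    return completed, peak


completed, peak = asyncio.run(run_bounded(total_requests=100, pool_size=10))
print(completed, peak)  # 100 10
```

All 100 requests finish, but the stand-in database never sees more than 10 at once; the other 90 wait on the semaphore, which costs the database nothing.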

Sizing is finding the saturation point

The formula is a starting point, not a final answer — real workloads have a mix of CPU-bound and I/O-bound queries, and the right number is found by measurement. The method is to raise concurrency while watching throughput and latency: throughput rises, then flattens (the saturation point), then falls as contention takes over. The best pool size sits at or just below that flat top — the smallest pool that reaches peak throughput. Past it you are buying latency with no throughput in return, exactly the queueing-knee behaviour from the throughput lesson, now driven by the pool you chose.

Pool size (8-core box)	What actually happens	Throughput	Latency
4 (too small)	Cores sit idle waiting on I/O	Below peak	Low but capacity wasted
~10–16 (right)	Cores busy, minimal contention	Peak	Lowest at peak
50	Heavy context switching + lock contention	Below peak	Climbing
100	Database thrashes scheduling, not running	Well below peak	High, unstable
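Once a load test has produced throughput per pool size, picking "the smallest pool that reaches peak throughput" is mechanical. The sketch below assumes measurements arrive as a mapping of pool size to queries per second; the 2% tolerance and the numbers in `illustrative_qps` are invented to mimic the rise, flat top, and fall described above, and are not benchmark results.

```python
def smallest_pool_at_peak(measurements: dict[int, float], tolerance: float = 0.02) -> int:
    """Return the smallest pool size whose throughput is within
    `tolerance` (as a fraction) of the best observed throughput."""
    if not measurements:
        raise ValueError("need at least one measurement")
    peak = max(measurements.values())
    threshold = peak * (1 - tolerance)
    return min(size for size, qps in measurements.items() if qps >= threshold)


# Made-up numbers shaped like the curve in the text; not real measurements.
illustrative_qps = {4: 2100.0, 8: 3600.0, 12: 4150.0, 16: 4200.0, 50: 3300.0, 100: 1900.0}
print(smallest_pool_at_peak(illustrative_qps))  # 12
```

The tolerance encodes the "at or just below the flat top" judgment: a pool of 12 is preferred over 16 here because it gives up about 1% of throughput for fewer connections competing inside the database.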

Quiz

A team raises the pool from 20 to 100 on an 8-core database to get more throughput, but throughput drops and latency climbs. Why?

Using the HikariCP heuristic, what is a sensible starting pool size for a 4-core database with a single SSD?

Why is per-connection memory a real constraint when sizing a Postgres pool, beyond the baseline process cost?

[Figure: throughput bands by pool size, top to bottom: 100 (thrashing), 50 (contention), ~10–16 (optimal), 4 (under-use).]

The optimal band sits near (cores × 2) + spindles, which is about 9 for a 4-core / 1-SSD box. Above it, every extra connection adds context-switching and lock-contention overhead that shrinks throughput.

Recall before you leave

01
Why does raising the pool size past a small number reduce throughput instead of increasing it?
02
What is the HikariCP sizing formula and what is the reasoning behind it?
03
How do you actually find the right pool size for a real workload, and why is a small pool better even when there is more work to do?

Recap

The instinct to fix a database bottleneck by enlarging the pool is backwards: a connection is a real backend process competing for a fixed number of cores, so a pool far larger than the hardware can run turns extra connections into pure contention — context switching across too few cores, super-linear lock and latch contention on shared internal structures, and per-connection memory where work_mem is multiplied per operation across every connection. HikariCP’s (cores × 2) + spindles puts a 4-core, one-SSD box at about 9 connections, and a pool of ~10 routinely beats a pool of 100 on identical hardware because the small pool lets the database run at its efficient point while the overflow waits cheaply in the pool’s queue rather than thrashing inside the engine. The right size is the smallest pool that reaches peak throughput, found by raising concurrency until throughput flattens. Now when you see a team celebrating a “fix” of raising the pool size and wondering why things got slower, you know where to look: the saturation point was already crossed. Knowing the size answers how many connections to keep — the next lesson asks what happens to a request when they are all busy: the wait queue, the acquisition timeout, and how that timeout becomes a deliberate latency dial.
