Databases DB · 05 · 03

Pool sizing: the (cores × 2) + spindles formula and the two-layer stack

Postgres throughput peaks at (cores × 2) + spindle-count active backends; client-pool size per worker follows concurrent-requests × db-calls × avg-ms / budget-ms; the layered math keeps both in bounds.

DB Middle ◷ 12 min

Level

FoundationsJuniorMiddleSenior

A team adds PgBouncer and sets pool_size = 200 — matching the application’s total client count. P99 rises instead of falling. Adding more backends to Postgres made things slower. The formula for sizing a pool is not “one connection per client”.

Why more Postgres backends does not mean more throughput

Each Postgres backend is an OS process. Past a small multiple of the CPU core count, additional active backends stop adding throughput and start costing it: more context switches, more contention on shared-memory structures (proc array during snapshot acquisition, lock table), more kernel scheduling overhead.

The HikariCP-popularised formula for the optimal active backend count on the Postgres side:

pool_size = (CPU cores × 2) + effective_spindle_count

cores × 2 covers CPU-bound parallelism plus some hyper-threading benefit
effective_spindle_count adds budget for I/O parallelism when storage can absorb multiple outstanding reads

On modern NVMe with a well-cached working set, spindle count is effectively 0 — the formula reduces to cores × 2. On a 16-core Postgres server with data in RAM: pool_size ≈ 32, not 200.

This is the PgBouncer pool_size (or default_pool_size), not the number of application workers.

The formula lands on 24–48 backends; matching client count overshoots ~10x and degrades Postgres throughput.

Scenario	Formula result	Notes
8-core, NVMe, data in RAM	(8 × 2) + 0 = 16	+50% headroom → pool_size = 24
16-core, NVMe, data in RAM	(16 × 2) + 0 = 32	+50% headroom → pool_size = 48
8-core, HDD RAID, cold data	(8 × 2) + 6 = 22	+50% → pool_size = 33
Naive “match client count”	200–1000	Postgres throughput degrades; do not do this

Client-pool sizing per worker

The client-side pool (node-postgres, HikariCP, asyncpg) is sized differently — it is about in-worker concurrency, not Postgres core count. When you look at a slow API and see every query waiting on a pool checkout, the formula below tells you exactly how many connections that worker needs:

worker_pool_size ≈ concurrent_requests × db_calls_per_request × avg_query_ms / request_budget_ms

Example: 100 concurrent requests per worker, 2 DB calls each at 5 ms, 50 ms total request budget:

= 100 × 2 × 5 / 50 = 20 connections

With 10 workers: 200 total client connections to PgBouncer — cheap (PgBouncer uses ~2 KB per client connection). PgBouncer multiplexes them onto the ~24-backend pool.

A concrete sizing worked example

Size a pool for a Node.js API behind PgBouncer in transaction mode

1/3

Key PgBouncer knobs

pool_size (per-database backend pool target, default 20) — set this to (cores × 2) + spindles + ~50% headroom. This is the ceiling of real Postgres backends PgBouncer will maintain.

reserve_pool_size — extra backends PgBouncer can spin up under burst load (default 0; set ~25% of pool_size).

max_client_conn — total inbound client connections PgBouncer will accept (default 100; set to thousands — client connections are cheap).

server_reset_query = DISCARD ALL — SQL run against a backend before returning it to the pool; clears leaked session state.

query_wait_timeout — how long a client can wait for a backend before PgBouncer rejects it (default 120 s; production: 10–30 s for fast fail).

Together these five knobs form a safety envelope: pool_size controls the Postgres-side load, reserve_pool_size handles spikes, max_client_conn prevents the pooler itself from being overwhelmed, server_reset_query prevents state leaks between clients, and query_wait_timeout makes exhaustion fail fast and loud rather than silently queuing. Omit any one and a different failure mode surfaces under load.

▸Why this works

Why does the formula use cores × 2 rather than cores × 1? One CPU core can handle slightly more than one active backend because Postgres backends spend some time waiting on I/O or locks rather than consuming CPU continuously. The ×2 factor is a practical average across mixed CPU + I/O workloads; for purely CPU-bound queries (complex aggregations, JIT) the multiplier should be closer to 1.

Pool sizing reference numbers

Formula: optimal backend count: (cores × 2) + spindles
Typical production pool_size (8-core): 24 (16 + 50% headroom)
Client-pool per worker formula: concurrent × calls × avg_ms / budget_ms
PgBouncer memory per client conn: ~2 KB
Postgres backend memory: ~5–10 MB
reserve_pool_size recommendation: ~25% of pool_size
max_client_conn typical: 1,000–10,000

Quiz

A Postgres server has 16 cores and all data fits in RAM (NVMe). What is the correct starting point for PgBouncer pool_size?

Quiz

A Node.js worker handles 100 concurrent requests, each making 2 DB calls at 5 ms average, with a 50 ms request budget. What is the correct client pool max?

Client-pool size follows a concurrency formula; the backend pool follows the CPU-bound formula. PgBouncer multiplexes the wide left onto the narrow right.

Recall before you leave

01
Why is `(cores × 2) + spindles` a formula for Postgres-side active backends rather than for client count?
02
What PgBouncer config values control the backend pool and the client-side cap, and how should each be set?
03
When would the (cores × 2) formula need upward adjustment, and when downward?

Recap

Postgres throughput peaks at around (cores × 2) + spindle_count active backends. Beyond that, context switches and shared-memory contention degrade throughput per backend. Set PgBouncer pool_size to this target plus ~50% burst headroom — on a typical 8-core NVMe server, that is 24 backends, not 200. Each application worker’s client pool is sized separately from a concurrency formula: concurrent_requests × db_calls × avg_ms / budget_ms. The two-layer stack (client pool per worker → PgBouncer transaction-mode pool → Postgres) multiplexes thousands of clients onto the small backend count that actually saturates the CPU without overloading it. Now when you see pool exhaustion under load, the formula gives you a number to verify against — if you are already above (cores × 2) + 50%, the answer is not “raise pool_size” but “find what is holding connections open”.

Practice

Start at the top. Tasks go easiest → hardest: recall a fact, apply it to a case, then a senior-level stretch. Open one, attempt it, then reveal.

recallapplystretch0 of 7 done

Connected lessons

builds on

PgBouncer session, transaction, and statement modesmiddle

unlocks

Pool exhaustion and idle-in-transaction: the 3 AM failure modemiddle

deepens into

appears again in287

Something unclear?

Ask a question about this lesson. Questions are anonymous and go straight to the author to make the lesson better.