Backend Architecture BE · 04 · 09

Pooling: code and config reading

Read real pool config, handler code, and a Postgres log line, predict the failure, and pick the highest-leverage fix. Four cases: sizing math, a connection leak, an idle-in-transaction hold, and a timeout config.

BE Senior ◷ 14 min

Pool incidents are diagnosed in config files, handler code, and the database log — not in the abstract. Read each snippet, predict the failure it produces under load, and choose the fix a senior engineer would make first.

Goal

Practise the loop every pooling incident runs: read the config and the hot path, predict where connections starve, leak, or go stale, and reach for the structural fix before reaching for a bigger number.

Snippet 1 — sizing the pool

# 4 vCPU Postgres, one SSD. Service handles ~2000 req/s.
# A panicked engineer set this after a "DB is the bottleneck" alert:
spring.datasource.hikari.maximum-pool-size: 200
spring.datasource.hikari.minimum-idle: 200

Quiz

On this 4-vCPU, one-SSD database, what should maximum-pool-size actually be, and why is 200 harmful?
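
As a rough sketch of the sizing arithmetic, the snippet below applies the (cores x 2) + spindles heuristic to this database and cross-checks it against the request rate. The function names and the 4 ms average connection hold time are assumptions made for illustration, not measurements from this service.

```javascript
// Starting-point heuristic: connections = (cores * 2) + effective spindles.
// It is a first guess to load-test against, not a guarantee.
function suggestedPoolSize(cores, spindles) {
  return cores * 2 + spindles;
}

// Little's law: connections busy on average = arrival rate * time each is held.
// holdMs is an ASSUMED average; measure it in the real service.
function busyConnections(requestsPerSecond, holdMs) {
  return (requestsPerSecond * holdMs) / 1000;
}

const poolSize = suggestedPoolSize(4, 1); // 4 vCPU, one SSD counted as one spindle
const inUse = busyConnections(2000, 4);   // assumed 4 ms per request
console.log({ poolSize, inUse });         // { poolSize: 9, inUse: 8 }
```

Under that assumed hold time, about 8 connections are busy at any moment, so a pool near 9 can plausibly carry 2000 req/s. A pool of 200 on 4 cores mostly adds context switching and contention, and minimum-idle: 200 keeps every one of those connections open even when traffic is low.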

Snippet 2 — the handler

async function getUser(id) {
  const conn = await pool.acquire();
  const rows = await conn.query("SELECT * FROM users WHERE id = $1", [id]);
  conn.release();           // returns connection to the pool
  return rows[0];
}

Quiz

This handler runs fine in testing but the pool drains to empty over hours in production, fixed only by a restart. What is the defect and the fix?
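
One way to close the error path is to move the release into a finally block. The sketch below assumes the same acquire/query/release shape as the handler above; FakePool is a hypothetical in-memory stand-in so the example runs without a database, and getUser takes the pool as a parameter only to keep the sketch self-contained.

```javascript
// Hypothetical stand-in pool: counts free slots instead of opening sockets.
class FakePool {
  constructor(size) {
    this.free = size;
  }
  async acquire() {
    if (this.free === 0) throw new Error("pool exhausted");
    this.free -= 1;
    return {
      // Simulate a failing query for negative ids.
      query: async (sql, params) => {
        if (params[0] < 0) throw new Error("query failed");
        return [{ id: params[0] }];
      },
      release: () => {
        this.free += 1;
      },
    };
  }
}

async function getUser(pool, id) {
  const conn = await pool.acquire();
  try {
    const rows = await conn.query("SELECT * FROM users WHERE id = $1", [id]);
    return rows[0];
  } finally {
    conn.release(); // runs whether the query resolved or threw
  }
}
```

In the original handler a rejected query skips release(), so each failed request permanently removes one connection from the pool. Tests rarely exercise the failing path, which is why the drain only shows up after hours of production traffic.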

Snippet 3 — the Postgres log line

LOG:  duration: 612000.244 ms  state: idle in transaction
DETAIL:  process 48213 has been idle in transaction for 00:10:12
HINT:  Sessions idle in transaction hold their connection and any locks.

Quiz

Several backends show 'idle in transaction' for minutes and the pool keeps exhausting. What is happening, and what is the right defence?
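
A minimal sketch of the structural defence, under assumptions: do slow, unrelated work before the transaction opens, and keep BEGIN to COMMIT as short as possible. The names makeFakePool, withTransaction, chargeOrder, and fetchQuote are hypothetical, as is the orders table; the fake pool only records statements so the example runs without Postgres.

```javascript
// Hypothetical in-memory pool that records SQL instead of talking to Postgres.
function makeFakePool() {
  const log = [];
  const conn = {
    query: async (sql) => {
      log.push(sql);
      return [];
    },
    release: () => {
      log.push("-- released");
    },
  };
  return { log, acquire: async () => conn };
}

// Holds a connection only for the duration of the transaction itself.
async function withTransaction(pool, work) {
  const conn = await pool.acquire();
  try {
    await conn.query("BEGIN");
    const result = await work(conn);
    await conn.query("COMMIT");
    return result;
  } catch (err) {
    // Best-effort rollback; swallow its failure so the original error surfaces.
    await conn.query("ROLLBACK").catch(() => {});
    throw err;
  } finally {
    conn.release();
  }
}

async function chargeOrder(pool, orderId, fetchQuote) {
  // Slow external call happens BEFORE the transaction, holding no connection.
  const quote = await fetchQuote(orderId);
  return withTransaction(pool, async (conn) => {
    await conn.query("UPDATE orders SET total = $1 WHERE id = $2", [quote, orderId]);
    return quote;
  });
}

// Server-side backstop (a real Postgres setting, value here is an assumption):
//   SET idle_in_transaction_session_timeout = '30s';
```

The server setting only terminates offenders after the fact. Moving the unrelated work outside the transaction is what stops connections and locks from being held in the first place.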

Snippet 4 — the timeout config

spring.datasource.hikari.connection-timeout: 30000   # 30 s, the default
spring.datasource.hikari.max-lifetime: 28800000      # 8 h
# Backing MySQL has wait_timeout = 28800 (8 h, default)

Quiz

Two problems hide in this config that will bite under a slow dependency and after idle periods. Which fix addresses both correctly?
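
The two checks can be written out as arithmetic. The sketch below is a hypothetical config linter, not part of HikariCP; the 5 s acquisition ceiling and 30 s lifetime margin are assumed thresholds chosen for illustration, and the field names are invented for the example.

```javascript
// Thresholds are illustrative assumptions; tune them to your latency budget.
const ASSUMED_LIMITS = { maxAcquireWaitMs: 5000, lifetimeMarginMs: 30000 };

function checkPoolTimeouts(cfg, limits = ASSUMED_LIMITS) {
  const problems = [];
  // A long acquisition wait lets requests queue behind a slow dependency.
  if (cfg.connectionTimeoutMs > limits.maxAcquireWaitMs) {
    problems.push("connection-timeout too long: callers queue instead of failing fast");
  }
  // The pool must retire connections before the server closes them.
  const serverLimitMs = cfg.dbWaitTimeoutSec * 1000;
  if (cfg.maxLifetimeMs > serverLimitMs - limits.lifetimeMarginMs) {
    problems.push("max-lifetime does not undercut wait_timeout: stale sockets possible");
  }
  return problems;
}

const asShipped = { connectionTimeoutMs: 30000, maxLifetimeMs: 28800000, dbWaitTimeoutSec: 28800 };
const tightened = { connectionTimeoutMs: 3000, maxLifetimeMs: 1800000, dbWaitTimeoutSec: 28800 };

console.log(checkPoolTimeouts(asShipped).length); // 2
console.log(checkPoolTimeouts(tightened).length); // 0
```

The tightened values are one plausible choice rather than a recommendation: a 3 s acquisition wait fails fast under a slow dependency, and a 30 min max-lifetime sits far below the 8 h wait_timeout.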

Recap

Every pooling incident is read in config, code, and logs. A pool sized at 200 on 4 cores is contention, not concurrency: the (cores x 2) + spindles heuristic gives (4 x 2) + 1 = 9. A release() outside try/finally leaks a connection on every error path and drains the pool over hours. ‘Idle in transaction’ in the log means a connection is being held inside an open transaction across unrelated work, keeping its locks. A 30 s acquisition timeout lets requests pile up behind a slow dependency, and a max-lifetime that does not undercut the DB’s wait_timeout leaves the pool handing out sockets the server has already closed. Diagnose from the evidence, fix structurally (right size, guaranteed return, tight transactions, bounded waits), then re-measure.
