Crux: Read real LB configs, a balancing snippet, a PROXY-protocol header, and a graceful-shutdown handler, then pick the behaviour or the highest-leverage fix.
◷ 14 min
LB behaviour is decided in config files and shutdown handlers, not in slide diagrams. Read each snippet, predict what it does under load, and choose the fix a senior engineer would make first.
Goal
Practise the loop you run on every LB incident: read the upstream config or the handler, predict the failure mode, and reach for the change that actually fixes it.
Snippet 1 — the nginx upstream block
upstream api {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    server 10.0.0.3:8080;
}

server {
    location / {
        proxy_pass http://api;
    }
}
Quiz
Request cost varies widely and one backend periodically GC-pauses. With this default config, what happens, and what is the highest-leverage one-line change?
Heads-up: The nginx default is round-robin, not least_conn. With no directive it is blind to load and keeps feeding the paused backend.
Heads-up: The block is valid; omitting the directive simply selects the round-robin default. The problem is the default being load-blind, not a syntax error.
Heads-up: Adding capacity does not stop round-robin from routing into the paused node. The fix is a load-aware policy, not raw capacity.
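To see why the default is load-blind, here is a minimal Python sketch (not nginx itself) that replays the scenario from the question: three backends, one of which pauses for 100 ticks. The `simulate` function, the tick model, and the arrival rate are assumptions made for illustration; only the two policies being compared come from the text above.

```python
def simulate(policy, ticks=300, pause=range(100, 200), arrivals_per_tick=2):
    """Return the worst backlog seen on backend 0, which pauses during `pause`.

    Hypothetical model: each backend finishes one request per tick
    unless it is paused; backlog is measured at the end of each tick.
    """
    active = [0, 0, 0]          # in-flight requests per backend
    rr_counter = 0              # round-robin cursor
    worst = 0
    for t in range(ticks):
        for _ in range(arrivals_per_tick):
            if policy == "round_robin":
                target = rr_counter % 3          # ignores load entirely
                rr_counter += 1
            else:                                # "least_conn"
                target = min(range(3), key=active.__getitem__)
            active[target] += 1
        for k in range(3):
            paused = (k == 0 and t in pause)
            if active[k] > 0 and not paused:
                active[k] -= 1                   # one completion per tick
        worst = max(worst, active[0])
    return worst


rr_backlog = simulate("round_robin")
lc_backlog = simulate("least_conn")
print("round-robin backlog on paused node:", rr_backlog)
print("least-conn backlog on paused node:", lc_backlog)
```

Round-robin keeps handing the paused node its share, so its backlog grows for the whole pause; the load-aware policy stops choosing it as soon as its in-flight count rises. In nginx the load-aware policy corresponds to adding the `least_conn;` directive inside the `upstream api` block.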
Snippet 2 — power-of-two-choices, by hand
import random

def pick_backend(backends):
    a, b = random.sample(backends, 2)  # two distinct, uniform at random
    return a if a.active_conns <= b.active_conns else b
Quiz
This is the core of power-of-two-choices. Why does sampling exactly two — rather than scanning all N for the true minimum — matter in production?
Heads-up: True least-connections scans all N for the global minimum, which costs O(N) per pick and synchronizes every client onto the same recovered node. Sampling two avoids both problems.
Heads-up: Sampling two is not just cheaper: it achieves ~12% imbalance versus ~100% for pure random, and it desynchronizes the herd, which scanning all N does not.
Heads-up: It samples two DISTINCT backends without replacement; comparing a backend to itself would be a no-op and defeat the comparison step.
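The balancing effect of the second sample can be checked with a small experiment. The sketch below is an illustrative model, not a production balancer: `max_load`, the backend count, and the request count are assumptions, and it counts cumulative assignments (connections never close), which is the classic balls-into-bins setting.

```python
import random


def max_load(n_backends, n_requests, choices, seed=0):
    """Assign requests to backends and return the heaviest backend's count.

    choices=1 is pure random placement; choices=2 is power-of-two-choices:
    sample two distinct backends and take the less loaded one.
    """
    rng = random.Random(seed)            # seeded so runs are repeatable
    load = [0] * n_backends
    for _ in range(n_requests):
        candidates = rng.sample(range(n_backends), choices)
        target = min(candidates, key=load.__getitem__)
        load[target] += 1
    return max(load)


random_max = max_load(100, 10_000, choices=1)
p2c_max = max_load(100, 10_000, choices=2)
print("mean load per backend: 100")
print("max load, pure random:", random_max)
print("max load, two choices:", p2c_max)
```

Each pick touches only two counters regardless of pool size, which is why the cost stays constant as N grows.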
A raw-TCP backend (not HTTP) needs the real client IP for rate limiting. The LB prepends a PROXY protocol v1 line to the connection, carrying the addresses 203.0.113.195 and 198.51.100.7 in that order. Which statement is correct?
Heads-up: X-Forwarded-For is an HTTP header (application layer); PROXY protocol is a transport-layer preamble that works even for non-HTTP protocols like PostgreSQL or MQTT.
Heads-up: The format is `PROXY <proto> <src> <dst> <sport> <dport>`: 203.0.113.195 is the source (client) and 198.51.100.7 is the destination the proxy connected on.
Heads-up: The PROXY header is not signed. The backend must only accept it from known trusted-proxy source IPs; a directly reachable backend can still be fed a forged line.
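A backend-side reader for this preamble might look like the sketch below. It is a minimal illustration, assuming the text (v1) format described above: `parse_proxy_v1`, `TRUSTED_PROXIES`, the LB address 10.0.0.10, and both port numbers in the sample line are hypothetical, and the `PROXY UNKNOWN` form of the protocol is deliberately not handled.

```python
import ipaddress

# Hypothetical: the only peer allowed to send a PROXY line is the LB itself.
TRUSTED_PROXIES = {ipaddress.ip_address("10.0.0.10")}


def parse_proxy_v1(line: bytes, peer_ip: str):
    """Return (client_ip, client_port) from a PROXY v1 line, else raise ValueError."""
    # The header is unsigned, so trust is decided by who sent it.
    if ipaddress.ip_address(peer_ip) not in TRUSTED_PROXIES:
        raise ValueError("PROXY header from untrusted peer")
    if not line.endswith(b"\r\n"):
        raise ValueError("missing CRLF terminator")
    parts = line[:-2].decode("ascii").split(" ")
    if len(parts) != 6 or parts[0] != "PROXY" or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed PROXY v1 header")
    _, _, src, dst, sport, _dport = parts
    ipaddress.ip_address(src)   # raises ValueError on a bad address
    ipaddress.ip_address(dst)
    return src, int(sport)


# Addresses are from the lesson; the ports are made up for this example.
SAMPLE = b"PROXY TCP4 203.0.113.195 198.51.100.7 51234 5432\r\n"
client_ip, client_port = parse_proxy_v1(SAMPLE, peer_ip="10.0.0.10")
print(client_ip, client_port)
```

Checking the peer address before parsing is the design choice that matters: a line arriving from anywhere other than a known proxy is rejected without being read as a source of client identity.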
This handler cooperates with connection draining. What does srv.Shutdown do, and what is the one production gap to watch?
Heads-up: Shutdown is graceful: it stops accepting and waits for in-flight requests up to the timeout. Force-close is what you AVOID by calling it.
Heads-up: Both matter. If the app exits before the LB stops routing, clients get resets; if the LB drain timeout is shorter than this context, the LB force-closes mid-request anyway. They must be aligned.
Heads-up: Shutdown respects the context. At 25 s it returns the context's error, and any connections still open are cut when the process then exits or calls Close. Without the timeout a stuck stream would hang the deploy indefinitely.
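The alignment rule can be written down as a pre-deploy check. This is a sketch under stated assumptions: `drain_gaps` and its three parameters are hypothetical names, and the only number taken from the lesson is the 25 s shutdown context.

```python
def drain_gaps(app_drain_s, lb_drain_s, longest_request_s):
    """Return a list of mismatches between the app's and the LB's drain windows.

    app_drain_s:       timeout passed to the app's graceful shutdown
    lb_drain_s:        how long the LB keeps draining connections open
    longest_request_s: longest-lived request or stream expected in production
    """
    problems = []
    if lb_drain_s < app_drain_s:
        # The LB gives up first and force-closes requests the app would have finished.
        problems.append("LB drain timeout is shorter than the app's shutdown window")
    if app_drain_s < longest_request_s:
        # The app's own timeout fires while legitimate requests are still running.
        problems.append("app shutdown window is shorter than the longest request")
    return problems


print(drain_gaps(app_drain_s=25, lb_drain_s=10, longest_request_s=20))
print(drain_gaps(app_drain_s=25, lb_drain_s=30, longest_request_s=20))
```

An empty list means the LB outlasts the app's window and the app's window outlasts the longest request, which is the ordering the heads-up notes above call for.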
Recap
LB problems live in config and handlers. The nginx default is round-robin, so a load-aware policy such as least_conn (or nginx's power-of-two-choices form, random two least_conn) is the fix for a load-blind pool. Power-of-two-choices is two random samples plus one comparison: O(1) and herd-desynchronizing, not merely a cheaper least-connections. The PROXY protocol preamble restores the client IP for non-HTTP protocols as well as HTTP, but must be trusted only from known proxy IPs. And graceful shutdown (close the listener, drain in-flight requests, enforce a timeout) only works when the app's drain window and the LB's drain timeout are sized together for the longest-lived connection.