Crux Read real WebSocket frame bytes, a broadcast send loop, a reconnect routine, and an Nginx block — predict the behaviour and pick the fix a senior engineer makes first.
◷ 14 min
Real-time bugs are diagnosed in frame dumps, send loops, reconnect routines, and proxy config — not in the spec. Read each artifact, predict what it does under load, and choose the fix a senior engineer reaches for first.
Goal
Practise the loop you run in every WebSocket incident: read the wire bytes or the hot path, predict where it breaks, and apply the highest-leverage fix before adding servers or knobs.
Snippet 1 — a frame on the wire
Frame bytes (hex): 89 04 70 69 6E 67
Quiz
The server received these bytes from a client. What kind of frame is it, and what must the server do?
Heads-up Opcode 0x1 is text; 0x9 is ping. The low nibble of 0x89 is 0x9, a control frame the protocol layer answers with a pong — it never reaches the application as data.
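Heads-up MASK=0 is a violation for any client-to-server frame: RFC 6455 requires clients to mask every frame they send, control frames included. The larger signal here is still the opcode, which marks a ping; the teaching point is the required pong reply. A strictly compliant server would also reject the unmasked frame and close the connection.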
Heads-up Close is opcode 0x8; 0x9 is ping. The correct response is a pong, not a teardown — closing on a ping would drop healthy connections during keepalive.
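The nibble reading above can be checked mechanically. The following is a minimal sketch, not a full parser: it assumes a short frame (payload length of 125 bytes or less) held in a plain array, and the names parseFrame and buildPong are hypothetical helpers introduced for illustration.

```javascript
// Decode the 2-byte header of a short WebSocket frame (payload <= 125 bytes).
function parseFrame(bytes) {
  const fin = (bytes[0] & 0x80) !== 0;    // high bit of byte 0
  const opcode = bytes[0] & 0x0f;         // low nibble of byte 0
  const masked = (bytes[1] & 0x80) !== 0; // high bit of byte 1
  const length = bytes[1] & 0x7f;         // low 7 bits of byte 1
  if (length > 125) {
    throw new Error("extended payload lengths are not handled in this sketch");
  }
  let payload;
  if (masked) {
    // A masked frame carries a 4-byte key before the payload; XOR to unmask.
    const key = bytes.slice(2, 6);
    payload = bytes.slice(6, 6 + length).map((b, i) => b ^ key[i % 4]);
  } else {
    payload = bytes.slice(2, 2 + length);
  }
  return { fin, opcode, masked, length, payload };
}

// A pong is FIN=1 with opcode 0xA and echoes the ping payload.
// Server-to-client frames are sent unmasked.
function buildPong(payload) {
  return [0x8a, payload.length, ...payload];
}

const frame = parseFrame([0x89, 0x04, 0x70, 0x69, 0x6e, 0x67]);
const pong = buildPong(frame.payload);
```

Here frame.opcode comes out as 0x9 (ping), frame.masked as false, and the payload decodes to the ASCII text "ping"; the pong reuses those four payload bytes behind a 0x8A header.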
Snippet 2 — the broadcast loop
function broadcast(clients, message) {
  for (const ws of clients) {
    ws.send(message); // enqueue, returns immediately
  }
}
// called every 10 ms with a 100 KB payload, 10k clients
Quiz
A few hundred clients are on slow links. With this loop, what happens under sustained load, and what is the highest-leverage fix?
Heads-up send() in this style does not block; it buffers and returns. That is what lets one slow client's queue grow without bound — bounded memory requires an explicit high-water-mark check.
Heads-up Fast clients drain in milliseconds and hold almost nothing. The unbounded growth is entirely on the slow clients whose queues never empty.
Heads-up A bigger kernel buffer delays the cliff by a few KB per socket; the application queue is where megabytes pile up. The fix is capping the application queue, not enlarging the kernel one.
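To put numbers on it: 100 KB every 10 ms is 100 sends per second, roughly 10 MB/s queued per client, so a client that drains more slowly than that accumulates megabytes within seconds. A high-water-mark check might look like the sketch below. It assumes each client exposes a bufferedAmount property (as the browser WebSocket API and the Node ws library do); the 1 MiB threshold, the onSlow callback, and the name broadcastBounded are illustrative assumptions, not part of the original loop.

```javascript
// Assumed cap: tune to payload size and latency budget.
const HIGH_WATER_MARK = 1024 * 1024; // 1 MiB of unsent bytes per client

function broadcastBounded(clients, message, highWaterMark = HIGH_WATER_MARK, onSlow = () => {}) {
  const stats = { sent: 0, skipped: 0 };
  for (const ws of clients) {
    // bufferedAmount: bytes accepted by send() but not yet written to the network.
    if (ws.bufferedAmount > highWaterMark) {
      stats.skipped++;
      onSlow(ws); // hypothetical hook: disconnect, downgrade, or just log
      continue;   // do not enqueue more data behind an already deep queue
    }
    ws.send(message);
    stats.sent++;
  }
  return stats;
}
```

Whether a slow client is skipped for one tick or disconnected outright is a policy choice; either way the queue per client stays bounded at roughly the threshold plus one message.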
Snippet 3 — the reconnect routine
A deploy drops 5 million clients at once and they all run this routine. What goes wrong, and what is the one-line fix?
Heads-up The cap protects the server by bounding retry frequency; lowering or raising it does not address the core defect, which is that all 5 M clients fire at the same instants.
Heads-up Math.min with 60000 bounds the value long before any overflow; the delays are correct numbers. The defect is that they are identical across clients — no jitter.
Heads-up A tight loop would hammer the server far harder. setTimeout with backoff is correct; it just needs jitter added so clients desynchronize.
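The routine under discussion is not reproduced here. Going by the hints above (setTimeout, and Math.min with a 60000 ms cap), it presumably computes a deterministic exponential delay. The sketch below shows such a delay next to a full-jitter variant; the 1000 ms base, the function names, and the connect callback are assumptions made for illustration.

```javascript
const BASE_MS = 1000;  // assumed base delay
const CAP_MS = 60000;  // the 60 s cap referred to above

// Deterministic backoff: every client gets the same delay for a given attempt,
// so clients dropped together also retry together.
function backoffDelay(attempt) {
  return Math.min(CAP_MS, BASE_MS * 2 ** attempt);
}

// Full jitter: pick uniformly in [0, backoffDelay(attempt)) so retries spread out.
// `random` is injectable to make the function testable.
function jitteredDelay(attempt, random = Math.random) {
  return random() * backoffDelay(attempt);
}

// Hypothetical wiring: `connect` is whatever function opens the socket.
function scheduleReconnect(connect, attempt) {
  return setTimeout(() => connect(attempt + 1), jitteredDelay(attempt));
}
```

The one-line change is the multiplication by a random factor: the cap and the exponential growth stay as they were, but two clients that disconnected at the same instant no longer share a retry instant.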
Snippet 4 — the Nginx block
WebSocket upgrades succeed, but connections silently die after about a minute of inactivity in production. What is missing from this block?
Heads-up Those headers are exactly what makes the upgrade work — removing them breaks the handshake. The missing piece is the read/send timeout, which governs idle long-lived connections.
Heads-up The scheme of the backend is unrelated to the idle-timeout failure; ws works over an http:// upstream behind TLS termination. The 60 s death is the default proxy_read_timeout.
Heads-up The symptom (death after ~60 s of silence) is the textbook Nginx default proxy_read_timeout closing quiet connections; it is a proxy-config gap, not a client bug.
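The block in question is not reproduced here. As a rough sketch of what a corrected location might contain, assuming a hypothetical upstream named ws_backend, a /ws/ path, and a one-hour timeout chosen only for illustration:

```nginx
location /ws/ {
    proxy_pass http://ws_backend;            # hypothetical upstream name
    proxy_http_version 1.1;                  # upgrade requires HTTP/1.1 to the backend
    proxy_set_header Upgrade $http_upgrade;  # pass the upgrade request through
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 3600s;                # default is 60s; closes connections with no backend data
    proxy_send_timeout 3600s;                # default is 60s; applies to writes toward the backend
}
```

Raising the timeout is usually paired with periodic ping frames sent at an interval shorter than the timeout, so that a connection that really is dead still gets detected and cleaned up.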
Recap
Every WebSocket incident is read in artifacts: a frame’s opcode nibble tells you ping (answer with pong) versus data; an unbounded send loop OOMs on slow clients unless a high-water mark caps the queue; a reconnect routine without jitter turns a deploy into a thundering herd; and an Nginx block without a raised proxy_read_timeout silently kills idle connections at 60 s. Diagnose from the artifact, apply the top-of-ladder fix, then confirm under the same load.