WebSocket: code and config reading

Read real WebSocket frame bytes, a broadcast send loop, a reconnect routine, and an Nginx block — predict the behaviour and pick the fix a senior engineer makes first.

NET Senior ◷ 14 min

Real-time bugs are diagnosed in frame dumps, send loops, reconnect routines, and proxy config — not in the spec. Read each artifact, predict what it does under load, and choose the fix a senior engineer reaches for first.

Goal

Practise the loop you run in every WebSocket incident: read the wire bytes or the hot path, predict where it breaks, and apply the highest-leverage fix before adding servers or knobs.

Snippet 1 — a frame on the wire

Frame bytes (hex): 89 04 70 69 6E 67

Quiz

A client received these bytes from the server. What kind of frame is it, and what must the client do?
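
To check your reading, it can help to decode the two header bytes by hand or with a few lines of code. The sketch below follows the base framing layout in RFC 6455, section 5.2; the helper name `decodeFrameHeader` and the `OPCODES` table are illustrative, not part of any library. It only handles short frames (payload length under 126) and reads the payload directly only when the MASK bit is clear.

```javascript
// Opcode values defined by RFC 6455; the table name is an assumption for this sketch.
const OPCODES = {
  0x0: "continuation", 0x1: "text", 0x2: "binary",
  0x8: "close", 0x9: "ping", 0xa: "pong",
};

// Decode the first two bytes of a WebSocket frame.
function decodeFrameHeader(bytes) {
  return {
    fin: (bytes[0] & 0x80) !== 0,     // top bit of byte 0: final fragment
    opcode: bytes[0] & 0x0f,          // low nibble of byte 0: frame type
    masked: (bytes[1] & 0x80) !== 0,  // top bit of byte 1: masking key follows
    length: bytes[1] & 0x7f,          // 0..125 inline; 126/127 mean extended length
  };
}

const frame = [0x89, 0x04, 0x70, 0x69, 0x6e, 0x67];
const header = decodeFrameHeader(frame);

// Unmasked short frame: the payload starts right after the two header bytes.
const payload = String.fromCharCode(...frame.slice(2, 2 + header.length));

console.log(OPCODES[header.opcode], header, JSON.stringify(payload));
```

The opcode nibble is 0x9, so this is a ping with the 4-byte payload "ping", and the receiver is expected to answer with a pong (opcode 0xA) carrying the same payload. Note that the MASK bit is clear, which RFC 6455 permits only on frames sent by a server; a client-originated frame would carry a 4-byte masking key after the length byte.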

Snippet 2 — the broadcast loop

function broadcast(clients, message) {
  for (const ws of clients) {
    ws.send(message);            // enqueue, returns immediately
  }
}
// called every 10 ms with a 100 KB payload, 10k clients

Quiz

A few hundred clients are on slow links. With this loop, what happens under sustained load, and what is the highest-leverage fix?
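
For scale: 100 KB every 10 ms is 100 sends per second, roughly 10 MB/s per client, and a slow link cannot drain that, so each slow client's send queue grows without bound. One possible shape for a bounded version is sketched below. It assumes each `ws` object exposes `bufferedAmount` (as the browser WebSocket API and the Node `ws` package do); the names `broadcastBounded`, `HIGH_WATER_MARK`, and `onSlowClient` are hypothetical, and the 1 MiB limit is an arbitrary placeholder to tune.

```javascript
// Assumed per-client cap on bytes queued by send() but not yet written out.
const HIGH_WATER_MARK = 1024 * 1024; // 1 MiB, placeholder value

// Send to every client whose queue is below the cap; hand the rest to a policy hook.
// Returns the number of clients the message was actually queued for.
function broadcastBounded(clients, message, onSlowClient) {
  let sent = 0;
  for (const ws of clients) {
    if (ws.bufferedAmount > HIGH_WATER_MARK) {
      // Policy decision lives in the caller: skip this tick, drop to a
      // lower-rate feed, or disconnect the client.
      onSlowClient(ws);
      continue;
    }
    ws.send(message); // still asynchronous, but the queue is now bounded
    sent++;
  }
  return sent;
}
```

A caller might pass `(ws) => ws.close(1013, "too slow")` to shed slow clients, or a no-op to simply skip them for that tick. Which policy is right depends on whether clients can tolerate missed messages.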

Snippet 3 — the reconnect routine

let attempt = 0;
function reconnect() {
  const delay = Math.min(1000 * 2 ** attempt, 60000);
  setTimeout(() => { attempt++; connect(); }, delay);
}

Quiz

A deploy drops 5 million clients at once and they all run this routine. What goes wrong, and what is the one-line fix?
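
Because every client computes the same delay for the same attempt number, the routine above retries in synchronized waves. A common remedy is to randomize the delay, for example with the "full jitter" variant, which picks a uniform random delay between zero and the exponential ceiling. The sketch below is one way to write it; `backoffDelay`, `BASE_MS`, `CAP_MS`, and the injectable `random` parameter are names introduced here for illustration, and `connect` is passed in as a hypothetical callback. It also assumes the caller resets `attempt` to 0 after a successful open.

```javascript
const BASE_MS = 1000;   // first-retry ceiling, matching the original routine
const CAP_MS = 60000;   // upper bound on the ceiling, matching the original routine

// Full jitter: uniform random delay in [0, min(base * 2^attempt, cap)).
// `random` is injectable so the function can be tested deterministically.
function backoffDelay(attempt, random = Math.random) {
  const ceiling = Math.min(BASE_MS * 2 ** attempt, CAP_MS);
  return random() * ceiling;
}

let attempt = 0;

// Schedule the next connection attempt; returns the timer handle.
function reconnect(connect) {
  const delay = backoffDelay(attempt);
  attempt++;
  return setTimeout(connect, delay);
}
```

With jitter, the clients dropped by a deploy spread their first retries across the whole first second instead of arriving in the same instant, and later retries spread across progressively wider windows.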

Snippet 4 — the Nginx location block

location /ws {
  proxy_pass http://backend;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection "Upgrade";
}

Quiz

WebSocket upgrades succeed, but connections silently die after about a minute of inactivity in production. What is missing from this block?
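
Nginx closes a proxied connection when the upstream sends nothing for `proxy_read_timeout`, which defaults to 60 s. A block along the following lines raises that limit. The one-hour values are placeholders, not recommendations, and `proxy_http_version 1.1` is shown because nginx's documented WebSocket proxying example includes it; the original block may inherit it from an enclosing context.

```nginx
location /ws {
  proxy_pass http://backend;
  proxy_http_version 1.1;                  # as in nginx's WebSocket proxying example
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection "Upgrade";
  proxy_read_timeout 3600s;                # assumed value; default is 60s
  proxy_send_timeout 3600s;                # assumed value; default is 60s
}
```

Raising the timeout treats the symptom at the proxy. Application-level ping/pong at an interval shorter than the timeout keeps the connection active and also detects peers that have gone away, so the two are often used together.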

Recap

Every WebSocket incident is read in artifacts: a frame’s opcode nibble tells you ping (answer with pong) versus data; an unbounded send loop runs the server out of memory when clients are slow, unless a high-water mark caps each client's queue; a reconnect routine without jitter turns a deploy into a thundering herd; and an Nginx block without a raised proxy_read_timeout silently kills idle connections after the default 60 s. Diagnose from the artifact, apply the top-of-ladder fix, then confirm under the same load.
