WebSocket: the HTTP upgrade handshake

How a normal HTTP request becomes a persistent bidirectional message pipe in one round-trip — and what each header in the upgrade does.

NET Junior ◷ 10 min

You open Discord. A friend sends you a message. Your browser shows it instantly, with no page reload and no polling. That instant delivery flows over a persistent, bidirectional connection, something plain request-response HTTP cannot provide. WebSocket gives both sides the ability to send at any moment, without asking first.

The problem HTTP cannot solve

HTTP is request-response: the client asks, the server answers, then the conversation is over. For live chat, stock tickers, multiplayer games, or collaborative editing — anything where the server needs to push data without waiting for a new client request — HTTP is structurally wrong. Each HTTP request adds at minimum one full round-trip of latency before any data can flow back.

WebSocket solves this by converting an existing HTTP connection into a persistent bidirectional channel where either side can send at any time.

The cost contrast is the whole reason WebSocket exists: HTTP pays a round-trip per server push, WebSocket pays one upgrade then pushes for free.

The metaphor

With HTTP you send letters at the post office: you write one, mail it, and wait for a reply. With WebSocket you pick up a telephone: once connected, both sides talk freely without taking turns.

The upgrade handshake step by step

When you see random 502s on a WebSocket endpoint, the culprit is almost always here — a proxy that never learned to say “yes” to what follows. Knowing exactly what each header does is what lets you diagnose the problem in two minutes instead of two days.

The connection starts as a normal HTTP/1.1 request. The client signals that it wants to switch protocols:

GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

Each header has a precise purpose:

Upgrade: websocket — declares the target protocol.
Connection: Upgrade — tells intermediate proxies this is a protocol switch, not a standard keepalive.
Sec-WebSocket-Key — 16 random bytes, base64-encoded. The server uses it to prove it intentionally processed the upgrade (not a cached HTTP response — see the Inset below).
Sec-WebSocket-Version: 13 — specifies RFC 6455, the only version used in practice.
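
The request above can be assembled in a few lines. This is a minimal sketch, not a full client: the function name build_upgrade_request is hypothetical, and the host and path are the example values from the request shown earlier.

```python
import base64
import os
from typing import Tuple


def build_upgrade_request(host: str, path: str) -> Tuple[str, bytes]:
    """Return (key, raw_request) for a WebSocket upgrade attempt."""
    # 16 random bytes, base64-encoded: always 24 ASCII characters.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    # HTTP header lines end with CRLF; an empty line ends the header block.
    raw = ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
    return key, raw


key, raw = build_upgrade_request("example.com", "/chat")
print(raw.decode("ascii"))
```

The key is returned alongside the request because the client needs it again later, to check the server's Sec-WebSocket-Accept value.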

The server validates the request (GET method, HTTP/1.1, valid headers) and replies:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
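
The validation step that precedes this response can be sketched roughly as follows. The function name upgrade_error and the error strings are hypothetical, and a real server would check more (for example the Host header and the Origin policy); this only covers the checks named above.

```python
import base64
import binascii
from typing import Dict, Optional


def upgrade_error(request_line: str, headers: Dict[str, str]) -> Optional[str]:
    """Return None if the request is an acceptable upgrade, else a reason."""
    parts = request_line.split()
    if len(parts) != 3:
        return "malformed request line"
    method, _target, version = parts
    if method != "GET":
        return "method must be GET"
    # Assumption of this sketch: accept only HTTP/1.1, as in the text.
    if version != "HTTP/1.1":
        return "HTTP version must be 1.1"

    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    if h.get("upgrade", "").lower() != "websocket":
        return "Upgrade header must be 'websocket'"
    # Connection may carry several comma-separated tokens.
    tokens = [t.strip().lower() for t in h.get("connection", "").split(",")]
    if "upgrade" not in tokens:
        return "Connection header must include 'Upgrade'"
    try:
        raw_key = base64.b64decode(h.get("sec-websocket-key", ""), validate=True)
    except (binascii.Error, ValueError):
        return "Sec-WebSocket-Key is not valid base64"
    if len(raw_key) != 16:
        return "Sec-WebSocket-Key must decode to 16 bytes"
    if h.get("sec-websocket-version") != "13":
        return "Sec-WebSocket-Version must be 13"
    return None


request_headers = {
    "Host": "example.com",
    "Upgrade": "websocket",
    "Connection": "Upgrade",
    "Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ==",
    "Sec-WebSocket-Version": "13",
}
print(upgrade_error("GET /chat HTTP/1.1", request_headers))
```

Only when a check like this passes does the server go on to send the 101 reply.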

The Sec-WebSocket-Accept value is computed as:

base64(SHA-1(Sec-WebSocket-Key + "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"))

The fixed string (258EAFA5-…) is a magic GUID defined in the RFC. After the 101 response, both sides stop speaking HTTP. The TCP connection is now a raw WebSocket stream.
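
The formula translates directly into code. A minimal sketch using only the standard library; the function name compute_accept is hypothetical, and the key is the sample value from the request shown earlier.

```python
import base64
import hashlib

# The magic GUID from RFC 6455, as quoted in the formula above.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    # Concatenate the key exactly as received (the base64 text,
    # not the decoded 16 bytes), then hash and re-encode.
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The printed value matches the Sec-WebSocket-Accept header in the 101 response above.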

WebSocket handshake facts

HTTP round-trips for the WebSocket upgrade: 1 (GET and 101 on the existing HTTP connection; 0 extra connections)
HTTP status code for upgrade success: 101 Switching Protocols
RFC defining WebSocket: RFC 6455 (2011)
Sec-WebSocket-Version in production: 13 (only version used)
Frame header overhead (small server→client frame): 2 bytes
Message latency after handshake: can be as low as 0–5 ms; bounded by one-way network delay, with no per-message request
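
To make the 2-byte figure concrete, here is a rough sketch of how a small server→client text frame is laid out. The bit layout (FIN flag, opcode, 7-bit length) comes from RFC 6455 rather than from this lesson, the function name is hypothetical, and the sketch deliberately handles only payloads of up to 125 bytes.

```python
def encode_small_server_text_frame(text: str) -> bytes:
    """Encode one unfragmented, unmasked text frame with a 2-byte header."""
    payload = text.encode("utf-8")
    if len(payload) > 125:
        # Longer payloads need an extended length field (more header bytes).
        raise ValueError("this sketch only handles payloads up to 125 bytes")
    # Byte 0: FIN=1 (0x80) combined with opcode 0x1 (text) gives 0x81.
    # Byte 1: MASK=0 (server frames are not masked) plus the payload length.
    return bytes([0x81, len(payload)]) + payload


frame = encode_small_server_text_frame("Hello")
print(frame.hex())                 # 810548656c6c6f
print(len(frame) - len("Hello"))   # 2
```

Frames sent by the client carry an additional 4-byte masking key, which is why the 2-byte figure applies to the server→client direction only.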

One scenario end to end

You open Discord in your browser:

Browser does TCP + TLS to gateway.discord.gg.
Browser sends the HTTP Upgrade request above.
Server replies 101. Both sides discard HTTP parsing.
From now on, Discord’s server pushes message frames to your browser the moment a friend sends something. Your browser sends frames (typing indicators, messages) in the other direction without opening a new connection.
When you close the tab, a close frame is exchanged and the TCP connection ends.
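
Steps 2 and 3 of this scenario can be run locally without a network. The following is a sketch under stated assumptions: socket.socketpair() stands in for the TCP + TLS connection to the gateway, both roles run in one process, and all helper names are hypothetical. It stops once the handshake is verified and does not exchange message frames or a close frame.

```python
import base64
import hashlib
import os
import socket

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_for(key: str) -> str:
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def read_head(sock: socket.socket) -> str:
    """Read until the blank line that ends an HTTP header block."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed before headers ended")
        data += chunk
    return data.decode("ascii")


def header(head: str, name: str) -> str:
    """Return one header value (case-insensitive name), or '' if absent."""
    for line in head.split("\r\n")[1:]:
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == name.lower():
            return value.strip()
    return ""


def run_handshake() -> bool:
    client, server = socket.socketpair()  # stand-in for step 1
    try:
        # Step 2: the client sends the upgrade request.
        key = base64.b64encode(os.urandom(16)).decode("ascii")
        client.sendall((
            "GET /chat HTTP/1.1\r\nHost: example.com\r\n"
            "Upgrade: websocket\r\nConnection: Upgrade\r\n"
            f"Sec-WebSocket-Key: {key}\r\n"
            "Sec-WebSocket-Version: 13\r\n\r\n"
        ).encode("ascii"))

        # Step 3: the server reads it and replies 101 with the derived value.
        request = read_head(server)
        accept = accept_for(header(request, "Sec-WebSocket-Key"))
        server.sendall((
            "HTTP/1.1 101 Switching Protocols\r\n"
            "Upgrade: websocket\r\nConnection: Upgrade\r\n"
            f"Sec-WebSocket-Accept: {accept}\r\n\r\n"
        ).encode("ascii"))

        # The client checks status and Accept before dropping HTTP parsing.
        response = read_head(client)
        return (response.startswith("HTTP/1.1 101")
                and header(response, "Sec-WebSocket-Accept") == accept_for(key))
    finally:
        client.close()
        server.close()


print(run_handshake())  # True
```

After this point, a real client and server would stop parsing HTTP on the socket and start reading and writing WebSocket frames.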

▸Why this works

Why the Sec-WebSocket-Accept computation defends against cache poisoning. In the early web, malicious JavaScript on site-a.com could open a TCP connection through an intermediate proxy and craft bytes that looked like a valid HTTP response. If the proxy cached those bytes, other users’ requests could be answered with malicious content. The Sec-WebSocket-Key + GUID + SHA-1 exchange closes that hole on the handshake side: the browser, not the script, generates a fresh random key for every connection, and only an endpoint that actually performed the WebSocket computation on that key can return the matching Accept value. A proxy cannot pass the check by replaying a cached response, because it never performed the computation for this key.
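
The client-side half of that check is small. In this sketch (the function name accept_matches and the second key are illustrative assumptions), a response that was genuine for one key is rejected when it is replayed against a different key, which is the situation a caching intermediary would create.

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_matches(sent_key: str, received_accept: str) -> bool:
    """True only if the Accept value was derived from the key we sent."""
    digest = hashlib.sha1((sent_key + WS_GUID).encode("ascii")).digest()
    expected = base64.b64encode(digest).decode("ascii")
    return received_accept.strip() == expected


# A response computed for this exact key passes.
original_key = "dGhlIHNhbXBsZSBub25jZQ=="
genuine_accept = "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
print(accept_matches(original_key, genuine_accept))   # True

# The same response replayed to a connection with a different key fails.
later_key = base64.b64encode(b"0123456789abcdef").decode("ascii")
print(accept_matches(later_key, genuine_accept))      # False
```

A client that sees a mismatch is expected to fail the connection instead of treating the socket as a WebSocket stream.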

Quiz

Why does WebSocket need an HTTP Upgrade request instead of just opening a raw TCP connection to a new port?

Quiz

After a WebSocket upgrade completes with 101 Switching Protocols, what is the HTTP protocol used for?

Order the steps

Order the WebSocket handshake steps:

1 Client sends HTTP GET with Upgrade: websocket and Sec-WebSocket-Key
2 Server validates the request and computes Sec-WebSocket-Accept
3 Server replies with 101 Switching Protocols and the Accept header
4 Both sides drop HTTP; the connection becomes bidirectional WebSocket
5 Either side can now send frames at any time

Complete the analogy

Fill in the blank: WebSocket is like _______ where both sides can talk freely without taking turns.

One HTTP round-trip switches the connection: after the 101, HTTP is gone and either side sends binary frames at any moment.

Recall before you leave

01 In one sentence: why doesn't a server just use repeated HTTP requests to push data to the client?
02 What does the 101 Switching Protocols response signal, and what happens on the wire immediately after?
03 Why is Sec-WebSocket-Key concatenated with a fixed GUID before hashing, rather than hashed directly?

Recap

HTTP is pull-only: every data delivery requires the client to ask first. WebSocket breaks that constraint by upgrading a normal HTTP/1.1 connection into a full-duplex channel. The upgrade takes exactly one HTTP round-trip: a GET with Upgrade: websocket and a random Sec-WebSocket-Key, answered by 101 Switching Protocols and a SHA-1-derived Sec-WebSocket-Accept. After the 101 response, no HTTP is spoken; both sides exchange compact binary frames, and either can initiate. The handshake adds zero extra TCP connections and zero extra TLS sessions. Message latency after the handshake can be as low as 0–5 ms, bounded by one-way network delay rather than by a per-message request. Now when you see a WebSocket connection drop randomly after 60 seconds of silence, you know to look at the proxy’s idle timeout, not at your application code.
