HTTP streaming, SSE, WebSockets, and gRPC
REST APIs return a complete response body. But live dashboards, chat, real-time prices, and AI token streams need something different: data arriving in pieces, or bidirectional. HTTP stretches to cover these cases — through chunked transfer, SSE, WebSocket upgrades, and gRPC trailers — but each choice carries distinct tradeoffs.
Chunked transfer encoding
HTTP/1.1 supports Transfer-Encoding: chunked for streaming a response whose total length is not known upfront. The server sends chunks, each prefixed with its size in hexadecimal, until a zero-length chunk closes the body:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: text/plain
5\r\n
Hello\r\n
5\r\n
World\r\n
0\r\n
\r\n
This allows the server to begin sending data before it has computed the entire response, which is useful for generated HTML, streaming AI completions, and log tailing. HTTP/2 and HTTP/3 do not use chunked encoding; their DATA frames carry variable-length bodies natively, so no extra framing layer is needed.
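The framing described above can be sketched in a few lines. This is a minimal illustration, not a production parser: the function names `encode_chunked` and `decode_chunked` are hypothetical, and the decoder assumes a complete body with no trailer fields.

```python
from typing import Iterable, Iterator


def encode_chunked(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Yield each non-empty part as one chunk, then the terminating chunk."""
    for part in parts:
        if not part:
            continue  # a zero-length chunk would end the body early
        # The chunk size is written in hexadecimal, followed by CRLF.
        yield b"%x\r\n" % len(part) + part + b"\r\n"
    # Zero-length chunk plus the final CRLF (no trailer fields in this sketch).
    yield b"0\r\n\r\n"


def decode_chunked(body: bytes) -> bytes:
    """Reassemble a complete chunked body, ignoring any chunk extensions."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension.
        size_field = body[pos:line_end].split(b";", 1)[0]
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            return bytes(out)
        out += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
```

Because the encoder is a generator, a server could write each chunk to the socket as soon as the corresponding part is produced, which is the whole point of the encoding.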
Server-Sent Events (SSE)
SSE (Content-Type: text/event-stream) is a long-lived HTTP response that delivers events one-way from server to browser, where JavaScript reads them through the EventSource API. The browser sends one GET request; the server keeps the connection open and pushes events:
HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
data: {"temp": 72.4}\n\n
data: {"temp": 72.6}\n\n
SSE is built on standard HTTP, passes through most proxies without a protocol upgrade (a proxy that buffers responses can still delay events), and works on HTTP/2, where it rides one stream among many and does not block other requests. The browser's EventSource client reconnects automatically if the connection drops. The limitation: SSE is server-to-client only, so the client cannot push data back on the same stream and must open a separate request.
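To make the event framing concrete, here is a small sketch of writing and reading the text/event-stream format. The helper names `format_sse` and `parse_sse` are hypothetical, and the parser is simplified: it assumes "\n" line endings and a complete stream, and it handles only the data, event, and id fields.

```python
from typing import Dict, Iterator, List, Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialise one event; a blank line terminates it."""
    lines: List[str] = []
    if event is not None:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    # A multi-line payload needs one "data:" line per payload line.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"


def parse_sse(stream: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per event from a complete event-stream string."""
    event: Dict[str, str] = {}
    data_lines: List[str] = []
    for line in stream.split("\n"):
        if line == "":
            # Blank line: dispatch the event if it carried any data.
            if data_lines:
                event["data"] = "\n".join(data_lines)
                yield event
            event, data_lines = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, often sent as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if field == "data":
            data_lines.append(value)
        elif field in ("event", "id"):
            event[field] = value
```

In a browser none of the parsing is needed, since EventSource does it; a sketch like this is mainly relevant for non-browser clients or for tests.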
SSE is the right choice for: live dashboards, notification streams, AI token streaming (text/event-stream is how ChatGPT-style UIs stream tokens). WebSocket is the right choice for: bidirectional real-time (chat, collaborative editing, multiplayer game state).
WebSockets
WebSockets (RFC 6455) start as HTTP/1.1 and upgrade to a binary bidirectional frame protocol:
- Client sends GET /ws HTTP/1.1 with Upgrade: websocket and Connection: Upgrade.
- Server replies 101 Switching Protocols.
- The TCP connection is now a WebSocket tunnel: frames (text or binary) flow in both directions.
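As part of this handshake, RFC 6455 also has the client send a Sec-WebSocket-Key header, and the server proves it understood the upgrade by echoing a derived Sec-WebSocket-Accept value. The sketch below shows that derivation; the function names and the plain-dict representation of headers are assumptions made for illustration, and a real server would validate more (for example Sec-WebSocket-Version).

```python
import base64
import hashlib
from typing import Dict, Optional

# Fixed GUID defined by RFC 6455 for computing the accept value.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Base64 of SHA-1 over the client's key concatenated with the GUID."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(headers: Dict[str, str]) -> Optional[str]:
    """Return the 101 response for a valid upgrade request, else None."""
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}
    wants_upgrade = "upgrade" in h.get("connection", "").lower()
    if h.get("upgrade", "").lower() != "websocket" or not wants_upgrade:
        return None
    key = h.get("sec-websocket-key")
    if not key:
        return None
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(key)}\r\n"
        "\r\n"
    )
```

The key used in the test below is the sample value from RFC 6455 itself, so the expected accept string can be checked against the specification.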
WebSockets opened this way do not fit HTTP/2 multiplexing: HTTP/2 has no Upgrade mechanism, and a connection that has switched to the WebSocket protocol no longer carries HTTP at all, so each WebSocket needs its own TCP connection. The fix is RFC 8441 (WebSocket over HTTP/2): when the server has advertised support through the SETTINGS_ENABLE_CONNECT_PROTOCOL setting, a client sends an extended CONNECT request with :protocol set to websocket, which creates a single HTTP/2 stream that behaves like a WebSocket tunnel. This lets one HTTP/2 connection multiplex many WebSocket connections alongside regular requests.
- SSE, direction: Server → Client only
- SSE, protocol upgrade required: None (plain HTTP)
- WebSocket, direction: Bidirectional
- WebSocket, HTTP/2 support: RFC 8441 extended CONNECT
- gRPC, transport: HTTP/2 (always)
- gRPC, bidirectional streaming: Yes (4 modes: unary, server-streaming, client-streaming, bidi)
gRPC: trailers and HTTP/2 dependency
gRPC is a high-performance RPC framework that layers on HTTP/2. Key mechanics:
- gRPC encodes requests and responses as Protocol Buffers (binary serialised structs) in HTTP/2 DATA frames.
- Each gRPC call is one HTTP/2 stream. Multiple concurrent RPCs multiplex on one connection.
- Trailers carry the result: gRPC status and message arrive in a trailing HEADERS frame after the DATA frames (body). The gRPC status code is separate from HTTP’s own status code: a gRPC server normally sends 200 OK at the HTTP level, and the actual result (OK, NOT_FOUND, etc.) comes in the grpc-status trailer.
gRPC-Web exists because browsers cannot read HTTP trailers reliably. gRPC-Web rewrites the server’s trailing headers as a final length-prefixed frame in the response body, which browser code can read (the body is base64-encoded only in the grpc-web-text variant). A proxy (Envoy, gRPC-Web proxy) translates between real gRPC (trailers) and gRPC-Web (body-embedded trailers).
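A sketch of how a client might split a gRPC-Web response body into messages and trailers follows. It assumes the binary (not base64 text) variant and the commonly documented layout: each frame is a 1-byte flag, a 4-byte big-endian length, then the payload, with the flag's most significant bit marking the trailer frame. The names `frame` and `parse_grpc_web_body` are hypothetical.

```python
import struct
from typing import Dict, List, Tuple

TRAILER_FLAG = 0x80  # most significant bit of the flag byte marks trailers


def frame(payload: bytes, flags: int = 0) -> bytes:
    """Prefix a payload with a 1-byte flag and a 4-byte big-endian length."""
    return struct.pack(">BI", flags, len(payload)) + payload


def parse_grpc_web_body(body: bytes) -> Tuple[List[bytes], Dict[str, str]]:
    """Split a complete body into message payloads and a trailer dict."""
    messages: List[bytes] = []
    trailers: Dict[str, str] = {}
    pos = 0
    while pos < len(body):
        flags, length = struct.unpack_from(">BI", body, pos)
        pos += 5
        payload = body[pos:pos + length]
        pos += length
        if flags & TRAILER_FLAG:
            # Trailers are encoded like an HTTP/1.1 header block.
            for line in payload.decode("ascii").split("\r\n"):
                if line:
                    name, _, value = line.partition(":")
                    trailers[name.strip().lower()] = value.strip()
        else:
            messages.append(payload)  # serialised Protocol Buffers message
    return messages, trailers
```

The caller would then treat `trailers["grpc-status"]` as the real outcome of the call, regardless of the 200 OK seen at the HTTP level.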
Compression
Servers compress response bodies via Content-Encoding: gzip|br|zstd. Brotli (br) compresses ~20% better than gzip on text, at comparable decompression speed. Servers choose based on the client’s Accept-Encoding header.
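The negotiation step can be sketched as follows. This is a simplified illustration: `parse_accept_encoding` and `choose_encoding` are hypothetical names, the server preference order (br, zstd, gzip) is an assumption, and returning None stands for sending the body uncompressed.

```python
from typing import Dict, Optional, Sequence


def parse_accept_encoding(header: str) -> Dict[str, float]:
    """Map each listed coding to its q-value (default 1.0)."""
    weights: Dict[str, float] = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, *params = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in params:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        weights[name.lower()] = q
    return weights


def choose_encoding(header: str,
                    supported: Sequence[str] = ("br", "zstd", "gzip")) -> Optional[str]:
    """Pick the supported coding with the highest q; ties go to server order."""
    weights = parse_accept_encoding(header)
    best, best_q = None, 0.0
    for name in supported:
        # A coding not listed explicitly falls back to the "*" wildcard, if any.
        q = weights.get(name, weights.get("*", 0.0))
        if q > best_q:
            best, best_q = name, q
    return best
```

A response compressed this way should also carry Vary: Accept-Encoding so that caches do not serve a Brotli body to a client that only accepts gzip.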
Security concern, the CRIME and BREACH attacks: both exploit compression under HTTPS to leak secrets by observing how the encrypted message size changes as the attacker injects guesses alongside the secret. CRIME targeted TLS- and SPDY-level compression of requests and is mitigated by disabling compression at that layer. BREACH targets Content-Encoding compression of response bodies. The mitigation for BREACH: disable compression for responses that include both attacker-controlled input and sensitive data in the same body. Static asset compression is safe (precomputed, no user input). Dynamic API responses mixing session tokens with query-reflected content should either skip compression or use randomised padding.
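The size side channel is easy to reproduce locally. The sketch below is a deliberately coarse illustration, not an attack tool: the secret, the page template, and the name `compressed_length` are all made up, and it compares one whole correct guess against one whole wrong guess, whereas a real attack recovers the secret a byte at a time.

```python
import zlib

# Hypothetical secret embedded in every response body.
SECRET = "csrf_token=Zq7Lx2Vb9KdT4mWp"


def compressed_length(reflected_input: str) -> int:
    """Size of the compressed body when the page reflects the given input."""
    body = (
        f"<input type='hidden' value='{SECRET}'>"
        f"<p>You searched for: {reflected_input}</p>"
    )
    return len(zlib.compress(body.encode("utf-8"), 9))


# A guess equal to the secret is replaced by a single back-reference to the
# earlier copy, so the body compresses better than with a wrong guess of the
# same length. TLS hides the content but not this difference in length.
matching = compressed_length("csrf_token=Zq7Lx2Vb9KdT4mWp")
non_matching = compressed_length("csrf_token=Hn3Rj8Fy5GcU1sEa")
leaks = matching < non_matching
```

Skipping compression removes the signal entirely; random padding only adds noise that an attacker has to average out with more requests.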
HTTP-level rate limiting
Servers return 429 Too Many Requests with Retry-After: <seconds> to throttle abusive clients. Common algorithms:
- Token bucket: bucket refills at a fixed rate; burst allowed up to capacity. Best for bursty-but-average-rate workloads.
- Leaky bucket: requests drain at a fixed rate; excess is queued or dropped. Smooths bursts.
- Fixed window: count requests per time window (e.g. 100/minute). Vulnerable to burst at window boundary.
- Sliding window: count requests over the last N seconds. More accurate, more memory.
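As one example of the algorithms listed above, here is a minimal token bucket. It is a single-process sketch under stated assumptions: the class and method names are hypothetical, there is no locking, and the clock is injectable only so the behaviour can be tested deterministically.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated = clock()

    def _refill(self) -> None:
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; otherwise reject the request."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def retry_after(self, cost: float = 1.0) -> float:
        """Seconds until `cost` tokens will be available (0 if already there)."""
        self._refill()
        return max(0.0, (cost - self.tokens) / self.rate)
```

A server would typically keep one bucket per client key (API token or IP address) and, on a rejected request, round `retry_after()` up to whole seconds for the Retry-After header of the 429 response.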
CDNs (Cloudflare, Fastly) rate-limit at the edge before requests reach origin, which makes them the first line of defence. Origin servers add a second layer. Clients should respect Retry-After and use exponential back-off with jitter to avoid thundering-herd retries.
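On the client side, the retry delay might be computed along these lines. This is a sketch with assumed defaults (`base`, `cap`) and a hypothetical function name; it applies random jitter over the exponential window and treats a numeric Retry-After as a floor. It only understands the delta-seconds form of Retry-After and ignores the HTTP-date form.

```python
import random
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 0.5, cap: float = 60.0,
                rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    rng = rng or random.Random()
    # Exponential window: base, 2*base, 4*base, ... limited by cap.
    window = min(cap, base * (2 ** attempt))
    # Pick a random point in the window so that clients throttled at the
    # same moment do not all retry at the same moment.
    jitter = rng.uniform(0.0, window)
    if retry_after is not None and retry_after.strip().isdigit():
        # Never retry earlier than the server asked; add jitter on top.
        return float(retry_after.strip()) + jitter
    return jitter
```

Adding the jitter on top of the server's value, rather than using the value exactly, is a design choice: it keeps a crowd of clients that all received the same Retry-After from returning in one synchronised wave.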
Why does gRPC use HTTP/2 trailers for status rather than the HTTP status code?
When should you choose SSE over WebSocket?
Order the steps in a WebSocket handshake over HTTP/1.1:
- 1 Client sends GET /ws HTTP/1.1 with Upgrade: websocket and Sec-WebSocket-Key
- 2 Server replies 101 Switching Protocols with Sec-WebSocket-Accept
- 3 HTTP protocol ends on this TCP connection
- 4 WebSocket bidirectional frame protocol begins on the same TCP connection
- 5 Both client and server can now send frames to each other at any time
Why this works
Why gRPC-Web rewrites trailers into the body. Browser XHR and fetch() APIs cannot read HTTP trailers: the API simply doesn’t expose them. Trailers are headers that arrive after the response body, and the browser’s HTTP client hides them. gRPC-Web solves this by encoding trailers as a final framed message in the body itself, whose leading flag byte has its most significant bit set to distinguish it from response data. A JavaScript gRPC-Web library parses this final message to extract grpc-status. Server-side, an Envoy or nginx-grpc-web proxy translates between real gRPC (with trailers) and gRPC-Web (with body-embedded trailers).
- 01 What is the difference between SSE and WebSocket and when should you pick each?
- 02 Why do CRIME and BREACH attacks target HTTPS+compression and what is the mitigation?
- 03 How does RFC 8441 let WebSockets coexist with HTTP/2 multiplexing?
HTTP streaming extends beyond single request-response via chunked transfer encoding (variable-length bodies), SSE (long-lived server-to-client text event streams), WebSocket (bidirectional framed messages via 101 upgrade, or RFC 8441 extended CONNECT on HTTP/2), and gRPC (binary RPCs multiplexed on HTTP/2, with RPC status in a trailing HEADERS frame). gRPC-Web proxies rewrite trailers into body frames for browser compatibility. Compression (Brotli/gzip via Content-Encoding) saves bandwidth but requires care on responses mixing attacker-controlled input with sensitive data (CRIME/BREACH). Rate limiting via 429 + Retry-After uses algorithms such as token bucket or sliding window; CDN edge rate limiting is the first line of defence.