
Networking & Protocols

HTTP/3 in production: QUIC internals, fallback, and observability

Crux: Running HTTP/3 in production means understanding QUIC congestion control in userspace, ALPN-based version negotiation, silent fallback mechanics, and the metrics that reveal when UDP is being blocked.

Your edge service switches on HTTP/3. One morning the ALPN dashboard shows h3 share collapsing from 40% to 5%. No errors in the application logs. Users aren’t calling in. What just broke? This is an HTTP/3 production incident — and the diagnosis path looks nothing like a TCP outage.

TCP HOL blocking: the precise mechanism

HTTP/2’s antagonist is TCP’s delivery contract. TCP guarantees in-order delivery of a byte stream. When a segment carrying bytes 50,000–50,099 is lost, TCP buffers all subsequently-arriving bytes until the missing segment is retransmitted. The kernel’s TCP receive buffer fills; no bytes advance to the application. At the HTTP/2 layer, stream #1 may use bytes 50,000–50,099, stream #2 may use bytes 50,100–50,199 — both stall while the retransmit completes.

The arithmetic: at 1% packet loss and 100 ms RTT, you expect roughly one loss event per 100 packets, and each event stalls delivery for at least one RTT (~100 ms) while the retransmission arrives. With 50 concurrent streams on the connection, that stall affects all 50. On high-loss links (cellular at 0.5–2%, satellite at 2–5%), HTTP/2 can underperform HTTP/1.1 with 6 parallel TCP connections, because parallel connections have independent loss recovery domains.
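
The loss arithmetic above can be checked with a few lines of Python. This is a rough sketch that assumes packet losses are independent, which real links with bursty loss do not guarantee; the function name and parameters are illustrative.

```python
def loss_stats(packets: int, loss_rate: float) -> tuple:
    """Return (expected losses, probability of at least one loss) for a burst.

    Assumes each packet is lost independently with probability loss_rate.
    """
    expected = packets * loss_rate
    p_any = 1.0 - (1.0 - loss_rate) ** packets
    return expected, p_any


expected, p_any = loss_stats(packets=100, loss_rate=0.01)
print(f"expected losses: {expected:.2f}, P(at least one): {p_any:.3f}")
# prints: expected losses: 1.00, P(at least one): 0.634
```

Under this model a 100-packet response hits at least one loss about 63% of the time, and on a single TCP connection every such loss stalls all multiplexed streams.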

QUIC stream independence: precise mechanism

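QUIC numbers packets rather than byte offsets of one unified stream, and a packet number is never reused: retransmitted data travels in a new packet with a new number. ACK frames report exactly which packet numbers arrived, and the sender maps each lost packet back to the stream frames it carried.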

When packet N is lost:

  1. QUIC retransmits the specific frames from streams A and C that were in packet N.
  2. Streams B, D, and E — whose frames were in other packets — continue sending and receiving.
  3. The application layer for stream B sees no interruption.
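
The difference between the two delivery models can be sketched as a toy comparison. The packet layout below is hypothetical, and the model ignores retransmission timing and the fact that a stream only stalls from the lost offset onward.

```python
from typing import Dict, List, Set

# Hypothetical layout: packet number -> stream IDs whose frames it carries.
PACKETS: Dict[int, List[str]] = {
    1: ["A", "B"],
    2: ["A", "C"],  # this is the packet we will lose
    3: ["B", "D"],
    4: ["E"],
}


def stalled_streams_quic(packets: Dict[int, List[str]], lost: int) -> Set[str]:
    """Only streams with frames inside the lost packet wait for a retransmit."""
    return set(packets[lost])


def stalled_streams_tcp(packets: Dict[int, List[str]], lost: int) -> Set[str]:
    """With one ordered byte stream, everything at or after the gap waits."""
    stalled: Set[str] = set()
    for number, streams in packets.items():
        if number >= lost:
            stalled.update(streams)
    return stalled


print(sorted(stalled_streams_quic(PACKETS, lost=2)))  # ['A', 'C']
print(sorted(stalled_streams_tcp(PACKETS, lost=2)))   # ['A', 'B', 'C', 'D', 'E']
```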

Cost: QUIC frame headers add overhead (stream ID, offset, length per frame — ~8–10 bytes each, vs TCP’s segment with no per-stream overhead). Per-connection state is heavier. But on lossy paths, the win is decisive.

HPACK vs QPACK: the ordering dependency

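HPACK (RFC 7541) addresses its static and dynamic tables through one index space. Indices 1–61 are the static table, and the dynamic table starts at 62, where 62 is always the most recently inserted entry and older entries shift to higher indices. Every insertion therefore renumbers the dynamic table, so a HEADERS frame that references index 62 decodes correctly only if the decoder has applied exactly the same insertions, in the same order, as the encoder had when it wrote the reference.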

On TCP: frames arrive in order — the HEADERS frame adding index 62 always arrives before HEADERS frames referencing it. The ordering constraint is invisible.

On QUIC: streams deliver frames independently. HEADERS frame on stream #3 might arrive before HEADERS frame on stream #1 that would have added index 62. If stream #3’s HEADERS references index 62, decoding fails or blocks.

QPACK (RFC 9204) solution:

  • Encoder stream: a dedicated unidirectional QUIC stream carries dynamic table insertions. This stream delivers table updates in order.
  • Decoder stream: carries acknowledgements of which table entries and field sections have been processed.
  • HEADERS frames: each encoded field section carries a Required Insert Count, the number of insertions the decoder must have received before it can decode. An encoder that references only entries already acknowledged on the decoder stream never causes blocking. It may reference unacknowledged entries only for as many streams as the peer allows via SETTINGS_QPACK_BLOCKED_STREAMS (default 0).

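Result: when the encoder references only acknowledged entries, a HEADERS frame on one stream is never blocked by traffic on another stream. When the encoder accepts the risk for better compression, blocking is confined to the streams that took it and capped by the peer's setting. QUIC's stream independence is preserved at the header-compression layer.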
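
A toy model of the decoder side shows when a field section can be decoded. It is a sketch of the Required Insert Count idea from RFC 9204, not a QPACK implementation: the class and method names are invented for illustration, and table contents, eviction, and the wire encoding are all omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyQpackDecoder:
    """Counts dynamic-table insertions received on the encoder stream."""

    insert_count: int = 0
    # (stream_id, required_insert_count) for field sections still waiting
    blocked: List[Tuple[int, int]] = field(default_factory=list)

    def on_field_section(self, stream_id: int, required_insert_count: int) -> str:
        """A field section decodes only once enough insertions have arrived."""
        if required_insert_count <= self.insert_count:
            return "decoded"
        self.blocked.append((stream_id, required_insert_count))
        return "blocked"

    def on_encoder_stream_insert(self) -> List[int]:
        """Apply one insertion and return the stream IDs that become decodable."""
        self.insert_count += 1
        ready = [s for s, need in self.blocked if need <= self.insert_count]
        self.blocked = [(s, need) for s, need in self.blocked if need > self.insert_count]
        return ready


decoder = ToyQpackDecoder()
print(decoder.on_field_section(stream_id=7, required_insert_count=0))  # decoded
print(decoder.on_field_section(stream_id=3, required_insert_count=1))  # blocked
print(decoder.on_encoder_stream_insert())                              # [3]
```

An encoder that only references acknowledged entries never triggers the "blocked" branch, which is the conservative behaviour described above.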

HTTP/3 production metrics (April 2026)
  • Global HTTP/3 traffic share: 21.1% (Cloudflare Radar)
  • Websites advertising HTTP/3: 38.8% (W3Techs)
  • Google, Meta, Cloudflare, Akamai h3 share: >50% of their traffic
  • Typical enterprise firewall UDP/443 block rate: 5–15% of corporate paths
  • HTTP/3 → HTTP/2 fallback RTT cost: 1 extra RTT on first visit
  • 0-RTT data replay risk window: duration of session ticket validity

Version negotiation: ALPN and the fallback chain

For HTTP/1.1 and HTTP/2: ALPN (RFC 7301) is a TLS extension. During the TLS handshake, the client sends a list of supported protocols (["h2", "http/1.1"]). The server picks one. No extra round-trip — version negotiation is free inside the handshake.

For HTTP/3: The negotiation is two-stage because QUIC/UDP is a different transport:

  1. First connection uses TCP-based HTTP/1.1 or /2.
  2. Server response includes Alt-Svc: h3=":443"; ma=86400 (or an HTTPS/SVCB DNS record). The browser stores this hint.
  3. Subsequent connections to this origin: browser tries QUIC/UDP first, with TLS ALPN selecting h3.
  4. If UDP/443 is blocked (QUIC handshake fails): browser marks this origin as “h3 unavailable” for some cooldown period and falls back to TCP-based HTTP/2.
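
The hint in step 2 is a small structured header. The following is a minimal parsing sketch under simplifying assumptions: it splits naively on commas and semicolons, ignores percent-encoding and parameters other than ma, and uses invented names (AltService, parse_alt_svc). The 24-hour default for a missing ma comes from RFC 7838.

```python
import re
from typing import List, NamedTuple


class AltService(NamedTuple):
    protocol: str   # ALPN token, e.g. "h3"
    authority: str  # "[host]:port", e.g. ":443"
    max_age: int    # seconds the client may cache the hint


def parse_alt_svc(value: str) -> List[AltService]:
    """Parse a simple Alt-Svc header value into a list of alternatives."""
    if value.strip() == "clear":
        return []
    services = []
    for entry in value.split(","):
        parts = [p.strip() for p in entry.split(";")]
        match = re.fullmatch(r'([^=\s]+)="([^"]*)"', parts[0])
        if not match:
            continue  # skip anything this simplified parser cannot read
        max_age = 86400  # RFC 7838 default when "ma" is absent
        for param in parts[1:]:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "ma" and raw.strip().isdigit():
                max_age = int(raw.strip())
        services.append(AltService(match.group(1), match.group(2), max_age))
    return services


print(parse_alt_svc('h3=":443"; ma=86400'))
# [AltService(protocol='h3', authority=':443', max_age=86400)]
```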

The fallback is silent — users see no error, but operators must monitor ALPN distribution metrics. A spike in h3→h2 fallback is the signal of a network-path change blocking UDP/443.
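
One way to turn the ALPN distribution into an alert is to compare the current h3 share against a baseline window. This is only a sketch: the counter shape, the function names, and the 50% relative-drop threshold are assumptions to tune against your own traffic.

```python
from typing import Dict


def h3_share(counts: Dict[str, int]) -> float:
    """Fraction of connections that negotiated h3 in a window."""
    total = sum(counts.values())
    return counts.get("h3", 0) / total if total else 0.0


def h3_collapsed(
    baseline: Dict[str, int],
    current: Dict[str, int],
    max_relative_drop: float = 0.5,  # assumed threshold, not a standard value
) -> bool:
    """True when the h3 share fell by more than max_relative_drop vs baseline."""
    base = h3_share(baseline)
    if base == 0.0:
        return False  # nothing to compare against
    return h3_share(current) < base * (1.0 - max_relative_drop)


baseline = {"h3": 400, "h2": 450, "http/1.1": 150}  # 40% h3
current = {"h3": 50, "h2": 800, "http/1.1": 150}    # 5% h3
print(h3_collapsed(baseline, current))  # True
```

Using a relative drop rather than an absolute share keeps the check meaningful for origins whose normal h3 share is low.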

Connection coalescing

Multiple origins sharing the same edge server IP and covering TLS certificate can share one HTTP/2 or HTTP/3 connection — connection coalescing. Example: cdn.example.com and api.example.com both resolve to the same IP, and the edge certificate covers both SANs (Subject Alternative Names). Chrome and Firefox coalesce: one QUIC connection serves requests to both hostnames.

This saves handshake cost on pages that load from multiple origins on the same CDN. The pitfall: if the certificate’s SAN list does not include one of the hostnames (misconfiguration), the browser opens a separate connection — a silent latency regression. Monitor per-origin connection counts; a sudden rise signals a coalescing failure.
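
The coalescing condition described above (same IP, certificate covers the hostname) can be expressed as a small check. Treat it as a sketch of that rule only: real browsers apply additional conditions that differ between HTTP/2 and HTTP/3, and the function names here are invented.

```python
from typing import Iterable, Set


def san_matches(hostname: str, san: str) -> bool:
    """Match a hostname against one SAN entry, with single-label wildcards."""
    hostname, san = hostname.lower(), san.lower()
    if san.startswith("*."):
        # A wildcard covers exactly one left-most label.
        head, _, tail = hostname.partition(".")
        return bool(head) and tail == san[2:]
    return hostname == san


def can_coalesce(
    new_host: str,
    new_host_ips: Set[str],
    connection_ip: str,
    cert_sans: Iterable[str],
) -> bool:
    """True if new_host may reuse an existing connection to connection_ip."""
    same_ip = connection_ip in new_host_ips
    covered = any(san_matches(new_host, san) for san in cert_sans)
    return same_ip and covered


sans = ["cdn.example.com", "api.example.com"]
print(can_coalesce("api.example.com", {"203.0.113.42"}, "203.0.113.42", sans))  # True
print(can_coalesce("img.example.com", {"203.0.113.42"}, "203.0.113.42", sans))  # False
```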

Congestion control in QUIC userspace

TCP congestion control (CUBIC, BBR, Reno) runs in the operating system kernel. QUIC implements congestion control in userspace — inside the application or QUIC library.

Implications:

  • Flexibility: operators can tune or switch algorithms per connection, per application, or per traffic class without a kernel change.
  • CPU cost: congestion control computations run in the process, competing with application code for CPU time. At high connection counts, this is measurable.
  • Observability: each QUIC connection must export its own congestion window, loss rate, and RTT measurements. Kernel TCP exports these via /proc/net/tcp and ss; QUIC requires per-connection telemetry hooks.
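
To make "congestion control in userspace" concrete, here is a heavily simplified NewReno-style controller in the spirit of RFC 9002. It is a sketch, not a usable implementation: it omits the recovery period, persistent congestion, and pacing, and the 1200-byte datagram size is an assumption.

```python
from dataclasses import dataclass

MAX_DATAGRAM_SIZE = 1200  # bytes; assumed for this sketch


@dataclass
class ToyNewReno:
    """Per-connection congestion state held by the QUIC library, not the kernel."""

    cwnd: int = 10 * MAX_DATAGRAM_SIZE  # initial window of 10 datagrams
    ssthresh: float = float("inf")      # no threshold until the first loss

    def on_ack(self, acked_bytes: int) -> None:
        if self.cwnd < self.ssthresh:
            # Slow start: grow by the number of bytes acknowledged.
            self.cwnd += acked_bytes
        else:
            # Congestion avoidance: roughly one datagram per window of acks.
            self.cwnd += MAX_DATAGRAM_SIZE * acked_bytes // self.cwnd

    def on_loss(self) -> None:
        # Halve the window, but never go below two datagrams.
        self.ssthresh = self.cwnd // 2
        self.cwnd = max(int(self.ssthresh), 2 * MAX_DATAGRAM_SIZE)


cc = ToyNewReno()
cc.on_ack(1200)
print(cc.cwnd)  # 13200
cc.on_loss()
print(cc.cwnd)  # 6600
```

Because this state lives in the process, the same object is also the natural place to export cwnd, loss, and RTT telemetry per connection.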

The IETF qlog format (JSON-based per-connection event log) is the deep-debug tool for QUIC. For production, Prometheus/OTel exporters aggregate QUIC metrics across connections.

0-RTT replay: the operational risk

0-RTT resumption (via TLS session tickets) allows the browser to send an HTTP request in the very first QUIC packet, before the handshake completes. This saves one RTT (~50–300 ms on wide-area links).

The risk: 0-RTT data is replayable by a network adversary. An attacker who captures the client's first flight can re-send it, causing the server to process the request a second time. For safe methods such as GET this is usually harmless, provided the handler really has no side effects. For POST (create an order, send a payment), it is catastrophic.

Production requirements:

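  • Server must detect and reject replayed 0-RTT data. RFC 9001 § 9.2 describes the replay risk for QUIC, and RFC 8446 § 8 describes the TLS 1.3 defenses (single-use tickets, ClientHello recording, freshness checks).
  • Servers should respond 425 Too Early (RFC 8470) to non-safe methods received in 0-RTT, so the client retries after the handshake completes.
  • Anti-replay state must be bounded; on large distributed deployments, a shared cache (for example Redis) is needed to detect replays across edge nodes.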
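
The requirements above can be sketched as two small pieces: a method gate that answers 425 Too Early, and a bounded single-use cache. Both are illustrative only. The names, the ticket_id key, and the in-memory store are assumptions; a multi-node deployment would need the shared cache mentioned above, and a real server gets the early-data flag from its TLS or QUIC stack.

```python
import time
from collections import OrderedDict
from typing import Optional

SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def early_data_status(method: str, in_early_data: bool) -> Optional[int]:
    """Return 425 if the request must wait for the handshake, else None."""
    if in_early_data and method.upper() not in SAFE_METHODS:
        return 425
    return None


class BoundedReplayCache:
    """Single-use tracking for 0-RTT tickets with a size cap and expiry."""

    def __init__(self, max_entries: int, ttl_seconds: float) -> None:
        self._seen: "OrderedDict[bytes, float]" = OrderedDict()
        self._max = max_entries
        self._ttl = ttl_seconds

    def first_use(self, ticket_id: bytes, now: Optional[float] = None) -> bool:
        """True if early data may be accepted for this ticket."""
        now = time.monotonic() if now is None else now
        # Entries share one TTL, so the oldest entry always expires first.
        while self._seen:
            _, expiry = next(iter(self._seen.items()))
            if expiry > now:
                break
            self._seen.popitem(last=False)
        if ticket_id in self._seen:
            return False  # replay: refuse early data
        if len(self._seen) >= self._max:
            return False  # cache full: fail closed, client falls back to 1-RTT
        self._seen[ticket_id] = now + self._ttl
        return True


print(early_data_status("POST", in_early_data=True))  # 425
print(early_data_status("GET", in_early_data=True))   # None
```

Refusing early data when the cache is full costs one round-trip, whereas evicting entries to make room would silently reopen the replay window.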

Trace it

Diagnose: a service's HTTP/3 traffic share dropped from 40% to 5% overnight.

  1. What is the first metric to check?
  2. ALPN shows h3 went from 40% to 5% and h2 absorbed the difference. What likely happened?
  3. tcpdump on an edge node shows incoming UDP/443 packets at near-zero. This confirms the hypothesis. How do you locate the block?
  4. While the network team coordinates the fix, what do you do customer-side?

Debug this

Observability dashboard: HTTP/3 fallback spike detection

log
# Prometheus metrics from CDN edge, April 2026

## ALPN protocol distribution (5-minute rolling average)
tls_alpn_protocol{protocol="h3"} 21.1%     # Normal baseline
tls_alpn_protocol{protocol="h2"} 51.3%
tls_alpn_protocol{protocol="http/1.1"} 27.6%

## Fallback rate (h3 attempted but fell back to h2)
http3_to_http2_fallback_rate 0.8%          # Baseline: normal ISP loss/blocking

## At 2026-04-15 14:35 UTC, alert fires: fallback rate spikes
http3_to_http2_fallback_rate 18.2%         # 23x normal
tls_alpn_protocol{protocol="h3"} 1.3%      # Collapsed from 21.1%

## Geolocation breakdown shows the spike is concentrated:
fallback_rate{region="US-Northeast", ISP="AS7922"} 67.4%
fallback_rate{region="US-Midwest"} 2.1%
fallback_rate{region="Europe"} 1.9%

## Network diagnostic from edge
$ mtr --udp --port 443 route-to-AS7922-customer
hop 5: (AS7922 transit edge) UDP packets: 0% loss [OK]
hop 6: (AS7922 DDoS scrubber, new rule: 2026-04-15 14:00 UTC)
     UDP packets: 100% loss [BLOCKED]
hop 7+: unreachable

An HTTP/3 fallback spike appears for one ISP's traffic (AS7922, US-Northeast) but nowhere else. MTR shows UDP reaching hop 5 cleanly. Where is the block and what is the likely cause?

Debug this

curl -v output to a misconfigured HTTP/3 server

log
$ curl --http3-only -v https://api.example.com/health
* Trying 203.0.113.42:443...
* QUIC handshake fail: protocol violation: invalid initial packet
* Closing connection
curl: (35) QUIC handshake fail: protocol violation: invalid initial packet

$ curl -v https://api.example.com/health
* Connected to api.example.com (203.0.113.42) port 443
* ALPN: offers h2,http/1.1
* ALPN: server accepted h2
< HTTP/2 200 OK
< alt-svc: h3=":443"; ma=86400
< server: nginx/1.27.0
< x-quic-version: draft-29
< content-type: application/json
{"status":"ok"}

HTTP/3 explicitly fails with 'protocol violation: invalid initial packet' but the server advertises h3 via Alt-Svc. What is the misconfiguration, and what is the fix?

Which RFC?

Which RFC defines the QUIC transport protocol (not HTTP/3 itself, not TLS-over-QUIC)?

Design challenge

You are deploying HTTP/3 for a global API serving 10k req/s from 50+ countries, expecting 5–15% of paths to block UDP/443. Design the HTTP/3 rollout strategy.

  • Some corporate and carrier networks block UDP/443 silently.
  • You cannot break existing HTTP/1.1 and HTTP/2 clients.
  • Mobile users on cellular are a priority target for HTTP/3's loss isolation.
  • SRE team needs to detect UDP-blocking incidents within 5 minutes.

Recall before you leave

  1. Explain why QPACK exists separately from HPACK.
  2. An engineer says HTTP/3 always beats HTTP/2 because of QUIC streams. Where is the flaw?
  3. What is connection coalescing and what breaks it?

Recap

HTTP/3 in production is fundamentally about UDP path reliability and operational observability. QUIC stream independence means one lost packet stalls only the streams whose data it carried, but this requires QUIC's packet-based loss tracking (not TCP's byte-stream model) and QPACK's encoder/decoder control streams (not HPACK's ordered dynamic table). QUIC congestion control runs in userspace: more tunable but more CPU-intensive. HTTP/3 is discovered through Alt-Svc or HTTPS/SVCB DNS records, and clients silently fall back to HTTP/2 if UDP/443 is blocked, so the ALPN distribution metric is the signal. Connection coalescing saves handshake cost across origins sharing the same IP and cert SAN. 0-RTT saves a round-trip but requires server-side anti-replay and must be restricted to safe methods.
