Vary header and cache keys

How CDN cache keys are built, why Vary: User-Agent destroys hit rates, and why a missing Vary: Cookie can serve one user's account page to another.

Your CDN hit rate is 5% despite setting max-age=3600. The dashboard shows thousands of unique cache entries per URL. Somewhere in the origin response headers there is a Vary line that is silently fragmenting every URL into thousands of separate cache keys, one per unique combination of request header values. This is one of the most common CDN misconfigurations in production.

How the cache key is built

A CDN cache stores responses indexed by a cache key. The default key is (URL, HTTP method, Host). The Vary response header extends the key with additional request header dimensions:

Cache-Control: public, max-age=3600
Vary: Accept-Encoding, Accept-Language

This tells the cache: maintain separate entries for each distinct value of Accept-Encoding and Accept-Language. A request for /page with Accept-Language: en and another with Accept-Language: ru create two separate cache entries — both correct language versions are served from cache.
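The key construction described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular CDN's implementation: the function name `cache_key`, the host `example.com`, and the choice to key an absent header as an empty string are all assumptions made for the example.

```python
from typing import Mapping, Tuple


def cache_key(method: str, host: str, url: str,
              request_headers: Mapping[str, str], vary: str) -> Tuple:
    """Build a cache key: the base dimensions plus one entry per header named in Vary."""
    # Header names are case-insensitive, so compare them in lower case.
    headers = {name.lower(): value.strip() for name, value in request_headers.items()}
    varied = sorted(name.strip().lower() for name in vary.split(",") if name.strip())
    # Assumption for this sketch: a header missing from the request is keyed as "".
    extra = tuple((name, headers.get(name, "")) for name in varied)
    return (method.upper(), host.lower(), url) + extra


vary = "Accept-Encoding, Accept-Language"
en = cache_key("GET", "example.com", "/page",
               {"Accept-Encoding": "gzip", "Accept-Language": "en"}, vary)
ru = cache_key("GET", "example.com", "/page",
               {"Accept-Encoding": "gzip", "Accept-Language": "ru"}, vary)
en_again = cache_key("GET", "example.com", "/page",
                     {"accept-encoding": "gzip", "accept-language": "en"}, vary)

print(en == ru)        # False: two languages, two cache entries
print(en == en_again)  # True: same header values, same entry
```

Every header added to Vary adds one more element to the tuple, which is why the number of possible keys per URL grows with the number of distinct values each header can take.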

RFC 9111 §4.1 (which replaces RFC 7234 §4.1) formalises this: a cache MUST NOT reuse a stored response that carries a Vary header unless every request header it names has the same value in the original request and in the new one.
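The reuse rule can also be read as a predicate over two requests: the one that produced the stored response and the one arriving now. The sketch below is a simplified reading of that rule; the function name `vary_allows_reuse` is hypothetical, and it compares header values after trimming whitespace only, whereas real caches may normalise values further. It also encodes two details of the specification: a Vary value of `*` never matches, and a header that is absent in one request only matches if it is absent in the other.

```python
from typing import Mapping


def vary_allows_reuse(vary: str,
                      stored_request_headers: Mapping[str, str],
                      new_request_headers: Mapping[str, str]) -> bool:
    """Return True if the stored response may be reused for the new request."""
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    if "*" in names:
        # "Vary: *" always fails to match, so the stored response is never reused.
        return False
    stored = {k.lower(): v.strip() for k, v in stored_request_headers.items()}
    new = {k.lower(): v.strip() for k, v in new_request_headers.items()}
    # dict.get returns None for a missing header, so absent only matches absent.
    return all(stored.get(name) == new.get(name) for name in names)


print(vary_allows_reuse("Accept-Language",
                        {"Accept-Language": "en"}, {"Accept-Language": "en"}))  # True
print(vary_allows_reuse("Accept-Language",
                        {"Accept-Language": "en"}, {"Accept-Language": "ru"}))  # False
print(vary_allows_reuse("*", {}, {}))                                           # False
```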

The Vary footgun

Vary works correctly for Accept-Encoding (gzip vs brotli — ~5 distinct values) and Accept-Language (~10–50 values per site). But some headers have thousands or millions of distinct values:

Vary header	Distinct values	Effect
`Accept-Encoding`	~5 (gzip, br, identity…)	Safe — small fragmentation
`Accept-Language`	~10–50 per typical site	Usually fine
`User-Agent`	Thousands (every browser × version × OS)	Cache destroyed
`Cookie`	Per-user (millions)	Effectively uncacheable
`Authorization`	Per authenticated user	Effectively uncacheable

Vary: User-Agent is the classic footgun: developers add it to serve different HTML for mobile vs. desktop. The result is one cache entry per unique User-Agent string: every Chrome release, every iOS version, every Safari build. The cache fills up with entries that are almost never reused, and the hit rate plummets to near zero. When you see a CDN hit rate below 20% on a “static” page, check the Vary header first; a high-cardinality dimension that crept in is a very common cause.
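One common way out, sketched here as an assumption rather than something this lesson prescribes, is to collapse the User-Agent into a small device class at the edge and vary on that instead. The function `device_class` and the `"mobi"` substring heuristic are illustrative only; a header such as `X-Device-Class` that carries the result to the origin would be a hypothetical name, and real CDNs expose their own device-detection features.

```python
def device_class(user_agent: str) -> str:
    """Collapse a raw User-Agent string into one of two coarse classes."""
    # Heuristic assumption: mobile browsers usually include "Mobile" or "Mobi".
    return "mobile" if "mobi" in user_agent.lower() else "desktop"


user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1",
    "Mozilla/5.0 (Linux; Android 14; Pixel 8) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36",
]

raw_variants = set(user_agents)                               # what Vary: User-Agent keys on
normalised_variants = {device_class(ua) for ua in user_agents}  # what a device class keys on

print(len(raw_variants), len(normalised_variants))  # 4 2
```

With the raw header, every new browser release adds a cache entry; with the normalised class, the number of entries per URL stays at two no matter how many browsers visit.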

Vary header impact on cache hit rate

No Vary (or Vary: Accept-Encoding only): 90–98% hit rate (normal operation)
Vary: Accept-Language added (10 values): 80–90% — small fragmentation
Vary: User-Agent added: <10% — near-zero effective caching
Vary: Cookie added: ~0% — per-user entries never reused
Vary: Authorization added: ~0% — per-token entries never reused

Hit rate tracks the inverse of header cardinality: low-cardinality dimensions stay safe, but User-Agent and Cookie shatter one URL into near-unique keys that are never reused.
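The relationship between cardinality and hit rate can be shown with a toy model. The sketch below assumes a cache that never expires or evicts, and traffic spread evenly (round-robin) over the header's distinct values; both are simplifications, so the numbers illustrate the trend and are not meant to reproduce the figures above.

```python
def simulated_hit_rate(total_requests: int, distinct_values: int) -> float:
    """Hit rate for one URL when requests cycle over `distinct_values` variants."""
    seen = set()
    hits = 0
    for i in range(total_requests):
        variant = i % distinct_values  # round-robin over the header's values
        if variant in seen:
            hits += 1          # this variant is already cached
        else:
            seen.add(variant)  # first request for this variant is a miss
    return hits / total_requests


for cardinality in (5, 50, 5_000, 10_000):
    print(cardinality, simulated_hit_rate(10_000, cardinality))
```

In this model each distinct value costs exactly one miss, so 10,000 requests over 5 values miss 5 times, while 10,000 requests over 10,000 values never hit at all.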

Cache key surprises

Cookies are NOT in the cache key by default

This is the most security-relevant Vary fact. Cookies are not part of the default cache key. If an endpoint reads a session cookie and returns personalised content:

Missing Cache-Control: private → CDN caches the personalised response
Missing Vary: Cookie → different users with different cookies get the same cached response

Result: one user’s account page is served to another user — a data leak.

Rule of thumb: for any endpoint that reads a cookie, set Cache-Control: private. Only use Vary: Cookie if you need CDN caching for cookie-differentiated content (rare, complex, requires a cookie whose value is stable and low-cardinality).
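The rule of thumb can be turned into a small check for a route's response headers. This is a sketch under stated assumptions: the function name `shared_cache_leak_risk` and the `reads_cookie` flag are hypothetical, and the check treats a response with no Cache-Control at all as risky, because some caches apply heuristic freshness to such responses.

```python
from typing import Mapping


def shared_cache_leak_risk(reads_cookie: bool, response_headers: Mapping[str, str]) -> bool:
    """Return True if a cookie-dependent response could be shared between users."""
    if not reads_cookie:
        return False
    headers = {k.lower(): v.lower() for k, v in response_headers.items()}
    # Directive names only, e.g. "max-age=3600" becomes "max-age".
    directives = {d.strip().split("=")[0]
                  for d in headers.get("cache-control", "").split(",") if d.strip()}
    vary = {n.strip() for n in headers.get("vary", "").split(",") if n.strip()}
    if directives & {"private", "no-store"}:
        return False  # shared caches will not store the response
    if "cookie" in vary or "*" in vary:
        return False  # the cookie is part of the key (or the response is never reused)
    return True


print(shared_cache_leak_risk(True, {"Cache-Control": "public, max-age=3600"}))   # True
print(shared_cache_leak_risk(True, {"Cache-Control": "private, max-age=0"}))     # False
print(shared_cache_leak_risk(True, {"Cache-Control": "public, max-age=60",
                                    "Vary": "Cookie"}))                          # False
```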

Edge cases

Cache poisoning via unkeyed headers. James Kettle’s research (PortSwigger, 2018) documented how CDNs can be tricked into caching a malicious response. If the origin reads a request header (e.g. X-Forwarded-Host) to build the response body, but that header is NOT included in the CDN’s cache key (not in Vary), an attacker can send a crafted request with X-Forwarded-Host: evil.com — origin returns a page with evil.com links, CDN caches it by URL only, and serves the poisoned page to all subsequent users. Mitigation: keep CDN and origin tightly aligned on which headers are key-significant; strip unrecognised forwarding headers at the CDN edge; use cache-key normalisation to drop non-significant query params.
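The poisoning sequence can be reproduced with a toy cache and origin. Everything in this sketch is hypothetical: the `origin` function, the `UrlOnlyCache` class, and the hosts are stand-ins to show the mechanism, not a model of any real CDN.

```python
from typing import Dict, Mapping


def origin(url: str, headers: Mapping[str, str]) -> str:
    """Hypothetical origin that trusts X-Forwarded-Host when building links."""
    host = headers.get("X-Forwarded-Host", "example.com")
    return f'<a href="https://{host}/login">Sign in</a>'


class UrlOnlyCache:
    """Toy cache keyed by URL only, so X-Forwarded-Host is an unkeyed input."""

    def __init__(self, strip_forwarding_headers: bool) -> None:
        self.strip_forwarding_headers = strip_forwarding_headers
        self.store: Dict[str, str] = {}

    def get(self, url: str, headers: Mapping[str, str]) -> str:
        if self.strip_forwarding_headers:
            # Mitigation: drop the untrusted header before it reaches the origin.
            headers = {k: v for k, v in headers.items() if k.lower() != "x-forwarded-host"}
        if url not in self.store:
            self.store[url] = origin(url, headers)
        return self.store[url]


vulnerable = UrlOnlyCache(strip_forwarding_headers=False)
vulnerable.get("/home", {"X-Forwarded-Host": "evil.com"})  # attacker primes the cache
victim_page = vulnerable.get("/home", {})                  # victim sends a normal request

hardened = UrlOnlyCache(strip_forwarding_headers=True)
hardened.get("/home", {"X-Forwarded-Host": "evil.com"})
safe_page = hardened.get("/home", {})

print("evil.com" in victim_page)  # True: the poisoned entry is served to the victim
print("evil.com" in safe_page)    # False: the header never reached the origin
```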

Quiz

What does Vary: Accept-Encoding actually do in a CDN cache?

Quiz

What is the practical effect of Vary: Authorization on a shared CDN cache?

Order the steps

Order these Vary header values from safest (least cache fragmentation) to most dangerous (most fragmentation):

1 Vary: Accept-Encoding (~5 distinct values: gzip, br, deflate, zstd, identity)
2 Vary: Accept (content-type negotiation, ~5–10 values)
3 Vary: Accept-Language (~10–50 values per typical site)
4 Vary: User-Agent (thousands of unique values across browsers and versions)
5 Vary: Cookie (potentially millions of unique values, one per active session)

The cache key = URL + method + Host + every request header named in Vary. Vary: Accept-Language makes one entry per language (safe, ~10–50). Vary: User-Agent makes one entry per browser×version×OS string: thousands of near-unique keys that are rarely reused, collapsing the hit rate.

Recall before you leave

01 Explain why Vary: Accept-Encoding is required for an endpoint that serves gzipped responses to some clients and uncompressed to others.
02 A news site's article page has Vary: User-Agent to serve different layouts for mobile vs. desktop. Cache hit rate is 3%. What is the root cause and how do you fix it?
03 Why are cookies not in the default CDN cache key, and what is the security risk this creates?

Recap

The CDN cache key is built from the URL, HTTP method, Host, and any request headers listed in the response’s Vary header. Vary: Accept-Encoding is safe, with only ~5 possible values. Vary: Accept-Language is usually safe, with ~10–50 values. Vary: User-Agent is catastrophic: thousands of unique browser×version×OS combinations each become a separate cache entry that is rarely reused, collapsing hit rates to near zero. Cookies are NOT part of the default key, which is an efficiency choice that creates a security risk: endpoints that serve personalised content without Cache-Control: private can leak one user’s response to another. Audit every route that reads cookies or auth headers and mark it private or no-store. When you ship a new endpoint, make the Vary audit part of the definition-of-done: low-cardinality headers only, and private on anything that touches a session.
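A Vary audit of this kind can start as a very small check in a test suite or deploy script. The sketch below is one possible shape for it: the function name `risky_vary_headers` and the `HIGH_CARDINALITY` set are assumptions, with the set seeded from the high-cardinality headers named in this lesson.

```python
from typing import List

# Headers this lesson identifies as high-cardinality; extend for your own traffic.
HIGH_CARDINALITY = {"user-agent", "cookie", "authorization"}


def risky_vary_headers(vary: str) -> List[str]:
    """Return the entries of a Vary header value that would fragment a shared cache."""
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    # "*" is flagged too, because it prevents any reuse of the stored response.
    return [name for name in names if name in HIGH_CARDINALITY or name == "*"]


print(risky_vary_headers("Accept-Encoding, Accept-Language"))  # []
print(risky_vary_headers("Accept-Encoding, User-Agent"))       # ['user-agent']
```

An empty list means the route varies only on headers outside the deny-list; any other result is a prompt to revisit the route before it ships.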
