Edge workers and edge-side composition

How V8 isolates at the CDN edge run custom logic with 2–5 ms cold starts, and how ESI and HTMLRewriter compose responses from independently cached fragments.

A product page needs a globally-cached page body (10-minute TTL), real-time personalised pricing (no cache, edge-fetched per user), and a shared comment count (60-second TTL). If you send all three to origin per request, p95 latency is 300 ms. If you compose the page at the edge from three independently-cached fragments, it drops to 30 ms. That is the promise of edge-side composition.

Edge workers: the compute model

Every major CDN now runs user code at the edge — not just static files:

Cloudflare Workers: V8 isolates. Cold start 2–5 ms p99. 128 MiB memory limit. 50 ms of CPU time per request (extendable with Workers Unbound); this is a limit on CPU time, not wall-clock time, so time spent awaiting network I/O does not count against it. Thousands of customer functions share one V8 process, each in its own isolate, so per-function startup overhead is negligible.
Fastly Compute@Edge: WebAssembly. Cold start similar to Workers.
AWS Lambda@Edge: full Node.js/Python runtime. Cold start 400–600 ms at some regional POPs — a 100× penalty over Workers for latency-sensitive paths.
Vercel Fluid Compute (2026): V8-based, persistent warm instances, concurrent request handling within a single isolate. 99.37% zero-cold-start ratio. 1.2×–5× faster than Workers for heavy SSR (template rendering, large page assembly). 50 ms wall-time budget.

Edge worker runtime comparison (2026)

Cloudflare Workers cold start (p99): 2–5 ms
Lambda@Edge cold start (some POPs): 400–600 ms
Vercel Fluid Compute zero-cold-start ratio: 99.37% of requests
Workers CPU time budget: 50 ms per request
Workers memory limit: 128 MiB
Workers isolate model: Thousands of customers per V8 process

Use cases for edge workers

Workers sit in the request path and can do anything a proxy can do with code:

Auth at edge: validate JWT or session token at the edge POP without a round-trip to origin. Reject invalid requests before they ever hit your servers.
A/B test routing: read a user cookie or experiment ID and rewrite the request URL to /variant-a/ or /variant-b/ at the edge.
Geo-redirect: Workers can read the user’s country code (from Cloudflare’s CF-IPCountry request header) and redirect / to /en/ or /de/ without origin involvement.
Request/response mutation: add security headers (Strict-Transport-Security, X-Content-Type-Options) to every response without modifying origin.
Dynamic routing: fan out to multiple microservices, merge responses, return combined JSON — all within the edge POP, not a central region.
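The A/B and geo-redirect cases in the list above can be sketched as pure functions that a Worker's fetch handler would call before forwarding or redirecting the request. This is a minimal sketch under stated assumptions, not Cloudflare's API: the cookie name `exp`, the helper names, and the country-to-locale table are hypothetical; only the `CF-IPCountry` header and the `/variant-a/`, `/variant-b/`, `/en/`, `/de/` paths come from the text.

```typescript
// Hypothetical helpers: the cookie name "exp" and the locale table are
// illustrative assumptions, not part of any CDN API.

// Parse a Cookie header ("a=1; b=2") into a name -> value map.
function parseCookies(header: string | null): Map<string, string> {
  const out = new Map<string, string>();
  if (!header) return out;
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq < 0) continue; // skip malformed pairs
    out.set(part.slice(0, eq).trim(), part.slice(eq + 1).trim());
  }
  return out;
}

// A/B routing: prefix the path with the variant named in the "exp" cookie.
// Missing or unknown values fall back to variant A.
function variantPath(cookieHeader: string | null, pathname: string): string {
  const variant =
    parseCookies(cookieHeader).get("exp") === "b" ? "variant-b" : "variant-a";
  return `/${variant}${pathname}`;
}

// Geo-redirect: map a country code (for example the CF-IPCountry value)
// to a locale root. Returns null when the path is not "/".
const LOCALE_BY_COUNTRY: Record<string, string> = { DE: "/de/", AT: "/de/" };

function geoRedirect(country: string | null, pathname: string): string | null {
  if (pathname !== "/") return null;
  return LOCALE_BY_COUNTRY[(country ?? "").toUpperCase()] ?? "/en/";
}
```

Keeping the decisions in pure functions means the routing rules can be unit-tested without an edge runtime; the handler itself only reads the headers and applies the result.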

Together these use cases shift logic from a centralised origin to a distributed edge layer. Without auth at the edge, every bot probe still burns an origin thread; without mutation at the edge, every response header fix requires an origin deploy.
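Response mutation is the simplest of these to show in code. The sketch below uses only the standard Fetch API types (`Headers`, `Response`) that Workers expose; the function names and the `max-age` value for `Strict-Transport-Security` are assumptions chosen for illustration.

```typescript
// Header values are illustrative assumptions; tune max-age for your site.
const SECURITY_HEADERS: Array<[string, string]> = [
  ["Strict-Transport-Security", "max-age=31536000; includeSubDomains"],
  ["X-Content-Type-Options", "nosniff"],
];

// Return a copy of the origin headers with the security headers set.
function withSecurityHeaders(origin: Headers): Headers {
  const out = new Headers(origin);
  for (const [name, value] of SECURITY_HEADERS) out.set(name, value);
  return out;
}

// Wrap an origin response: same body stream and status, hardened headers.
function secureResponse(res: Response): Response {
  return new Response(res.body, {
    status: res.status,
    statusText: res.statusText,
    headers: withSecurityHeaders(res.headers),
  });
}
```

In a Worker, `secureResponse` would be applied to the result of the origin `fetch` just before returning it, so the origin application never has to know about the headers.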

The 50 ms CPU budget

Workers cap CPU time at 50 ms per request. The cap counts time spent executing code, not time spent waiting on subrequests or KV reads. This rules out large-model ML inference, resizing large images, and heavy database queries whose results must be processed in the worker. It leaves room for JWT validation (~1 ms), URL rewriting (~0.1 ms), a KV lookup (~3 ms, mostly I/O wait), and simple HTML mutation (~5 ms). Design workers as thin routing, auth, and mutation layers, not compute-heavy backends.
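To make the arithmetic concrete, the estimates from the paragraph above can be summed and compared with the budget. This is a back-of-the-envelope sketch: the step names and both helper functions are hypothetical, and the per-step figures are the rough estimates quoted in the text, not measurements.

```typescript
// Per-step estimates in milliseconds, copied from the text above.
const STEP_COST_MS: Record<string, number> = {
  jwtValidation: 1,
  urlRewrite: 0.1,
  kvLookup: 3,
  htmlMutation: 5,
};

const BUDGET_MS = 50;

// Sum the estimated cost of the named steps; unknown steps are an error
// so that a typo cannot silently count as free.
function estimateMs(steps: string[]): number {
  return steps.reduce((total, step) => {
    const cost = STEP_COST_MS[step];
    if (cost === undefined) throw new Error(`no estimate for step: ${step}`);
    return total + cost;
  }, 0);
}

// True when the estimated total stays within the budget.
function fitsBudget(steps: string[], budgetMs: number = BUDGET_MS): boolean {
  return estimateMs(steps) <= budgetMs;
}
```

Running all four steps in one request comes to roughly 9 ms by these estimates, which leaves most of the budget as headroom for a thin routing layer.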

Why this works

Why V8 isolates instead of containers. A traditional serverless function (Lambda) runs each function in its own container or microVM, and spinning one up can take 400–600 ms. V8 isolates are lightweight sandboxes within one V8 process: each isolate has its own heap and shares no mutable state with its neighbours, but all of them share the process’s startup cost. In Cloudflare’s model, thousands of customer functions share one running V8 process, so cold-starting a new isolate costs 2–5 ms, not 400 ms. The isolation boundary is enforced in software by the V8 runtime rather than by OS process boundaries, a trade-off CDNs accept for running untrusted customer code at this density.

The isolate-versus-container choice accounts for most of the gap: roughly 5 ms versus 600 ms at p99 is a penalty on the order of 100×, and that is what disqualifies Lambda@Edge for latency-sensitive request paths.

TLS resumption at the edge

A full TLS 1.3 handshake costs 1 RTT (~10 ms to a nearby edge POP). Session resumption via TLS 1.3 PSK (pre-shared keys) skips the certificate exchange and signature verification on reconnection, and with 0-RTT early data the client can send its request in the first flight. CDNs can share session-ticket encryption keys across POPs: if a user connects to POP A, then reconnects via POP B (mobile network switch), POP B can decrypt the ticket the client presents, so no full handshake is needed. This matters most on mobile, where handoffs between networks are frequent.

Edge-side composition (ESI and HTMLRewriter)

ESI (Edge Side Includes) was submitted to the W3C as a Note in 2001 and is still in production at Akamai. HTML responses contain <esi:include src="/fragment/nav" /> placeholders that the edge replaces with cached fragment responses before sending the page to the browser. Each fragment has its own cache key and TTL: the shared nav caches for 6 hours; a breaking-news banner caches for 30 seconds.
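The include-and-replace step can be sketched as a small function. This is a toy model of what an ESI processor does, not Akamai's implementation: the `fragments` map stands in for the edge cache, the function name is hypothetical, and treating a missing fragment as an empty string is an assumption of this sketch (real processors have their own error-handling rules).

```typescript
// Matches <esi:include src="..."/> with optional whitespace before "/>".
// Only the self-closing form with a double-quoted src is handled here.
const ESI_INCLUDE = /<esi:include\s+src="([^"]+)"\s*\/>/g;

// Replace each include tag with the cached fragment body for its src.
// A replacer function is used so "$" in fragment HTML is never treated
// as a replacement pattern.
function expandEsi(html: string, fragments: Map<string, string>): string {
  return html.replace(
    ESI_INCLUDE,
    (_tag: string, src: string) => fragments.get(src) ?? "",
  );
}
```

Because each `src` is resolved independently, the nav fragment and the banner fragment can live under different cache keys with different TTLs while the page template stays the same.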

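Modern replacement: Cloudflare Workers + HTMLRewriter API. The worker fetches the HTML response and uses HTMLRewriter’s streaming HTML parser to inject fragments at elements matched by CSS selectors. Because the worker runs arbitrary code while the response streams through, this is more flexible than ESI and does not need to buffer the whole document.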
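An HTMLRewriter element handler is an object with an `element` method that receives each matched element. The sketch below shows one such handler; the class name, the `AppendableElement` interface, and the `div#price` selector in the comment are assumptions for illustration, and the interface declares only the one method of the real element object that this handler uses.

```typescript
// Minimal structural type for the element passed to a handler.
// The real HTMLRewriter element offers more methods than this.
interface AppendableElement {
  append(content: string, options?: { html?: boolean }): unknown;
}

// Handler that appends a pre-fetched fragment inside the matched element.
class FragmentInjector {
  private readonly fragmentHtml: string;

  constructor(fragmentHtml: string) {
    this.fragmentHtml = fragmentHtml;
  }

  element(el: AppendableElement): void {
    // html: true inserts the string as markup instead of escaping it.
    el.append(this.fragmentHtml, { html: true });
  }
}

// In a Worker this would be wired up roughly as:
//   new HTMLRewriter()
//     .on("div#price", new FragmentInjector(priceHtml))
//     .transform(originResponse);
```

Typing the handler against a small interface rather than the runtime's element class keeps it testable outside the edge runtime with a plain object that records the calls.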

The unifying idea: assemble the page at the edge from independently-cached pieces. A product page:

Page chrome (header, footer, nav): max-age=21600 (6 hours)
Product description: max-age=600 (10 minutes)
Price: fetched fresh from a regional pricing service on every request (~30 ms)
Personalised recommendation widget: fetched from KV store keyed by user-id (~3 ms)

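Result: because the fragment fetches run in parallel, the slowest one (the ~30 ms price fetch) sets the pace, and the full page is assembled at the edge in ~35 ms, with no origin round-trip for 80% of the bytes.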
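The assembly step can be sketched with the fragment sources injected as functions, so the same code works whether a source is an edge-cache hit, a KV read, or a fresh fetch. Everything named here is hypothetical: the `PageSources` shape, the `{{body}}` placeholder in the chrome, and the 5 ms assembly figure (borrowed from the HTML-mutation estimate earlier in the lesson) are assumptions of this sketch.

```typescript
// A fragment source resolves to an HTML string. In a Worker these would
// wrap cache lookups, KV reads, or fetch() calls.
type FragmentSource = () => Promise<string>;

interface PageSources {
  chrome: FragmentSource;          // long-TTL edge cache
  description: FragmentSource;     // 10-minute edge cache
  price: FragmentSource;           // fetched fresh per request
  recommendations: FragmentSource; // KV keyed by user id
}

// Start every fetch at once so the wait is the slowest fetch,
// not the sum of all of them.
async function composePage(src: PageSources): Promise<string> {
  const [chrome, description, price, recs] = await Promise.all([
    src.chrome(),
    src.description(),
    src.price(),
    src.recommendations(),
  ]);
  const body = `${description}${price}${recs}`;
  // Replacer function: keeps "$" in prices from being read as a pattern.
  return chrome.replace("{{body}}", () => body);
}

// Rough critical path for parallel fetches: slowest fetch plus assembly.
function criticalPathMs(fetchMs: number[], assemblyMs: number): number {
  return Math.max(0, ...fetchMs) + assemblyMs;
}
```

With the text's figures, `criticalPathMs([30, 3], 5)` gives 35, matching the ~35 ms estimate; fetching the same fragments one after another would instead add the fetch times together.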

Trace it

An e-commerce checkout page has mandatory edge-computed tax and shipping rates. The edge worker fetches from a regional microservice on cache miss. Optimise for cold-load scenarios.

Step 1: what is the critical path from user request to response?

Step 2: tax/shipping rates cannot be cached (vary by postcode). But you can parallelise. How?

Step 3: edge worker memory is 128 MiB. You're storing all regional rate tables in KV to avoid the fetch. Why might this cause trouble?

Step 4: you've optimised the worker to 15 ms wall-time. How do you confirm users benefited?

Debug this

Edge worker performance degradation alert

$ curl -w "@format.txt" https://api.example.com/checkout
  time_starttransfer: 450ms
  time_connect: 12ms
  time_tls: 2ms
  time_firstbyte: 428ms

Edge diagnostics via worker:
Server-Timing: edge-worker=185ms; regional-service=240ms; kv-lookup=3ms

Worker logs (2026-05-15):
handler_start=0ms
kv_fetch=3ms
regional_service_start=5ms
regional_service_timeout=185ms
handler_end=188ms
total_wall_time=188ms (budget: 50ms × 4 extensions used)

Edge worker is using 4× the wall-time budget and users see 450 ms responses. What is the bottleneck?

Pick the best fit

An e-commerce site needs to serve product pages globally with millisecond-precision price updates. Pick the cache strategy.

Edge-side composition: the worker pulls each fragment from the tier that fits its freshness — chrome from the long-lived edge cache, the price fresh from origin per request, recommendations from KV by user-id — and stitches them into one response without a full origin round-trip for most bytes.

Recall before you leave

01. Why does Cloudflare Workers achieve a 2–5 ms cold start while Lambda@Edge takes 400–600 ms?
02. Describe edge-side composition for a news page that has shared chrome, a breaking-news banner (must be fresh within 30 s), and an article body (5 min staleness OK).
03. What is the 50 ms budget in Cloudflare Workers, and what does it exclude?

Recap

Edge workers run custom code at CDN POPs using V8 isolates (Cloudflare Workers: 2–5 ms p99 cold start) or WebAssembly. They enable auth validation, A/B routing, geo-redirect, request/response mutation, and real-time fragment fetching without origin round-trips. The 50 ms CPU budget restricts compute-heavy tasks; those belong in regional functions. Edge-side composition (ESI or Workers + HTMLRewriter) assembles responses from independently cached fragments (page chrome cached for hours, personalised or real-time data fetched fresh per request), combining the cache efficiency of static assets with the freshness of dynamic data. Lambda@Edge’s 400–600 ms cold start disqualifies it for latency-sensitive paths; Vercel Fluid Compute achieves 99.37% zero-cold-start by keeping warm isolates. Cross-POP TLS session resumption amortises reconnection costs on mobile. When you design a product page that needs both shared content and per-user data, ask whether edge composition can serve the shared parts from cache and fetch only the delta from origin: most of the bytes can be cached even when the user context cannot.
