
Browser & Frontend Runtime

Stage costs and the renderer process model

Crux: What drives the cost of each pipeline stage, how the renderer process splits work across threads, and why the main thread is the bottleneck.

Your page parses fast on your M2 MacBook. It crawls on a mid-range Android. The bottleneck is rarely just “the CPU is slow”: it is which thread is doing which work, and how much of that work sits on the one thread that cannot be parallelised.

The renderer process model

A modern Chromium-family browser typically runs each tab in its own renderer process. Rendering work is split across three kinds of threads inside that process, plus one separate process:

  • Main thread — runs HTML parsing, CSSOM construction, style, layout, paint setup, and your JavaScript. It is single-threaded by design: DOM, CSSOM, and JS state can only be mutated by one execution context at a time, otherwise consistency would be impossible.
  • Compositor thread — assembles the layer tree and decides which layers need new bitmaps.
  • Raster worker threads — rasterise individual tiles in parallel.
  • GPU process — takes rasterised tiles, uploads them as textures, and runs the final composite-and-display.

Firefox uses a similar split (Quantum CSS for parallel style, WebRender for GPU-driven compositing); Safari/WebKit splits work across its WebContent process and the GPU process. Names differ; the architectures rhyme.

Renderer process internals (Chromium)

  • Main thread: Parse HTML → CSSOM → Style → Layout → Paint setup, plus your JavaScript
  • Compositor thread: assembles the layer tree, decides dirty tiles
  • Raster workers (N): rasterise tiles in parallel
  • GPU process: uploads textures, composites and displays

Why the main thread is the bottleneck

Any work you put on the main thread (parsing a JSON blob, deserialising a Redux state, running a layout, executing a click handler) competes for the same frame budget: 16.67 ms at 60 Hz. The compositor and raster threads exist precisely so that rendering work can leave the main thread and run in parallel.
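To make the frame budget concrete, here is a minimal TypeScript sketch. The helper names `frameBudgetMs` and `planSlices`, and the per-task cost estimates, are hypothetical and exist only for illustration. The sketch also assumes the whole frame is available to your code; in practice the browser needs part of it for its own rendering work, so the usable budget is smaller.

```typescript
// Frame budget in milliseconds for a given display refresh rate.
function frameBudgetMs(refreshHz: number): number {
  if (refreshHz <= 0) throw new RangeError("refreshHz must be positive");
  return 1000 / refreshHz;
}

// Greedily group estimated task costs (ms) into slices that each fit
// inside budgetMs. A single task larger than the budget still gets a
// slice of its own; it will overrun the frame no matter how it is grouped.
function planSlices(costsMs: number[], budgetMs: number): number[][] {
  const slices: number[][] = [];
  let current: number[] = [];
  let used = 0;
  for (const cost of costsMs) {
    // Close the current slice before it would exceed the budget.
    if (current.length > 0 && used + cost > budgetMs) {
      slices.push(current);
      current = [];
      used = 0;
    }
    current.push(cost);
    used += cost;
  }
  if (current.length > 0) slices.push(current);
  return slices;
}

// Hypothetical costs: four 5 ms tasks, one 20 ms task, one 3 ms task.
// With a 16 ms budget this yields [[5, 5, 5], [5], [20], [3]].
const plannedSlices = planSlices([5, 5, 5, 5, 20, 3], 16);
```

Each slice would run in its own main-thread task, with control handed back to the browser in between. The 20 ms task shows the limit of this approach: work that cannot be split has to move off the main thread or get cheaper.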

That is the architectural justification for transform/opacity animations being “free”: they reach the GPU without touching the bottleneck thread.

Stage-by-stage cost drivers

Each stage has typical levers that blow up its cost.

Parse HTML scales with document bytes. A 500 KB SSR-rendered page parses faster than a 2 MB one, simply because there is less to walk. Synchronous <script> tags block the parser until they finish downloading and executing — modern best practice puts defer or async on every external script not strictly needed for first paint.

CSSOM cost grows with stylesheet bytes and rule count. An unused 800-rule CSS framework wastes parse time even if zero rules actually match anything on the page.

Style calc cost is roughly DOM size × selectors. A 5 000-node DOM with a 2 000-rule stylesheet is up to 10 million selector-match checks in the naive worst case. Most of those checks are skipped via a Bloom filter, but :has(), descendant combinators with no ancestor anchor, and universal selectors defeat the filter and cost more.
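The arithmetic behind that figure can be sketched as follows. Both function names are hypothetical, and `rejectRate` is an illustrative parameter standing in for the share of checks the filter skips; it is not a measured value and no engine exposes it.

```typescript
// Naive worst case: every rule is tested against every DOM node.
function selectorCheckUpperBound(domNodes: number, rules: number): number {
  return domNodes * rules;
}

// Hypothetical model: a fraction of checks (rejectRate, 0..1) is skipped
// by fast rejection before full selector matching runs.
function checksAfterFastReject(
  domNodes: number,
  rules: number,
  rejectRate: number
): number {
  if (rejectRate < 0 || rejectRate > 1) {
    throw new RangeError("rejectRate must be between 0 and 1");
  }
  return Math.round(selectorCheckUpperBound(domNodes, rules) * (1 - rejectRate));
}

// 5 000 nodes × 2 000 rules = 10 000 000 checks in the worst case.
const worstCase = selectorCheckUpperBound(5000, 2000);

// If 90 % were rejected early (an assumed figure), 1 000 000 would remain.
const remaining = checksAfterFastReject(5000, 2000, 0.9);
```

The model treats every selector as equally filterable, which the text says is not true: selectors that defeat the filter push the real cost back toward the worst case.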

Layout cost is roughly DOM depth × box dependencies. A deeply nested flexbox with auto sizing forces multiple measure passes; a flat grid with explicit cell sizes is one pass.

Paint cost is painted area × paint op count. box-shadow with a large blur radius and filter properties (blur, drop-shadow) are paint-heavy because each pixel requires multi-pixel sampling.

Composite cost is layer count × layer pixel area. The cheap stages are cheap by orders of magnitude, but only if a change does not invalidate an upstream stage: once layout is dirty, paint and composite must re-run too.
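How a property change maps onto pipeline stages can be sketched as a lookup table. This is a deliberately simplified model, not browser behaviour read from any API: the table entries and the names `STAGES_BY_PROPERTY`, `stagesFor`, and `mainThreadStages` are assumptions made for illustration. The `transform` and `opacity` rows additionally assume the element already has its own compositor layer and the animation is driven by the compositor.

```typescript
type Stage = "style" | "layout" | "paint" | "composite";

// Simplified, hypothetical table: which stages re-run when a property changes.
const STAGES_BY_PROPERTY = new Map<string, readonly Stage[]>([
  ["top", ["style", "layout", "paint", "composite"]],
  ["width", ["style", "layout", "paint", "composite"]],
  ["color", ["style", "paint", "composite"]],
  ["box-shadow", ["style", "paint", "composite"]],
  // Assumes an existing compositor layer and a compositor-driven animation.
  ["transform", ["composite"]],
  ["opacity", ["composite"]],
]);

// Stages that re-run for a property, according to the table above.
function stagesFor(property: string): readonly Stage[] {
  const stages = STAGES_BY_PROPERTY.get(property);
  if (stages === undefined) {
    throw new Error(`no table entry for property: ${property}`);
  }
  return stages;
}

// In this model, compositing is the only stage that runs off the main thread.
function mainThreadStages(property: string): Stage[] {
  return stagesFor(property).filter((stage) => stage !== "composite");
}

// mainThreadStages("top")       -> ["style", "layout", "paint"]
// mainThreadStages("color")     -> ["style", "paint"]
// mainThreadStages("transform") -> []
```

The shape of the table is the point: the earlier in the pipeline a property lands, the more main-thread stages it drags along on every frame.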

Why this works

Why is the DOM single-threaded at all? Because letting two execution contexts write the same DOM node concurrently would require fine-grained locking across every DOM, CSSOM, and JS object, and would still leave subtle race windows open. Desktop UI toolkits reached the same conclusion: Java’s Swing, for example, confines UI updates to a single event dispatch thread. The single-thread rule is a deliberate correctness trade-off, not an oversight.

Quiz

You change a div's `top` property in a rAF loop. Which pipeline stages re-run per frame?

Quiz

You change `transform: translateX(...)` on a div that already has its own compositor layer. Which stages run on the main thread?

Trace it

DevTools Performance panel shows a 28 ms frame. Inside: 1 ms Parse HTML, 2 ms Recalculate Style, 18 ms Layout, 4 ms Paint, 1 ms Composite Layers, 2 ms idle. The page is scrolling a list of 5000 chat messages. Where is the time?

  1. Layout dominates at 18 ms. Most likely cause: every visible chat row is re-measured because something high in the DOM tree changed width.
  2. Paint dominates at 4 ms.
  3. Composite at 1 ms.
  4. Idle time at 2 ms means the page is starving.
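The first step of reading such a trace, finding where the time goes, can be sketched in a few lines. The timings are the ones quoted in the question; the `StageTiming` type and the helper names are hypothetical, and the 60 Hz budget is an assumption about the display.

```typescript
interface StageTiming {
  stage: string;
  ms: number;
}

// Timings from the 28 ms frame described above.
const frame: StageTiming[] = [
  { stage: "Parse HTML", ms: 1 },
  { stage: "Recalculate Style", ms: 2 },
  { stage: "Layout", ms: 18 },
  { stage: "Paint", ms: 4 },
  { stage: "Composite Layers", ms: 1 },
  { stage: "Idle", ms: 2 },
];

// Sum of all entries in the frame.
function totalMs(timings: StageTiming[]): number {
  return timings.reduce((sum, t) => sum + t.ms, 0);
}

// The single entry with the largest duration.
function dominantStage(timings: StageTiming[]): StageTiming {
  if (timings.length === 0) throw new Error("no timings given");
  return timings.reduce((max, t) => (t.ms > max.ms ? t : max));
}

// How far the frame overruns a budget (0 if it fits).
function overBudgetMs(timings: StageTiming[], budgetMs: number): number {
  return Math.max(0, totalMs(timings) - budgetMs);
}

const total = totalMs(frame);            // 28
const worst = dominantStage(frame);      // { stage: "Layout", ms: 18 }
const share = worst.ms / total;          // about 0.64
const overrun = overBudgetMs(frame, 1000 / 60); // about 11.3 ms at 60 Hz
```

Layout alone exceeds a 60 Hz frame budget here, so no amount of tuning in the other stages can bring this frame back under it.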
Compute it

A DOM has 5000 nodes. The stylesheet has 2000 rules. Roughly how many selector-match checks does style calc perform?

Recall before you leave
  1. What four threads/processes does Chromium use to render a tab?
  2. Why is the DOM confined to a single thread?
  3. What is the cost driver for style recalculation?
Recap

Rendering in Chromium has four players: the main thread, the compositor thread, and the raster workers inside the renderer process, plus the separate GPU process. Five of the six pipeline stages run on the single main thread, the same thread as your JavaScript, so every long task competes with rendering. Stage costs are predictable: parse scales with bytes, CSSOM with stylesheet bytes and rule count, style calc with DOM × rules, layout with DOM depth × box dependencies, paint with area × ops, composite with layer count × pixel area. Composite-only animations (transform, opacity) skip the main thread entirely; that is why they are an order of magnitude cheaper than layout-triggering ones.
