
Production observability: LoAF, INP, and the full attack surface

LoAF and INP in production telemetry, off-main-thread scroll, display locking, reduced-motion, Web Workers, Service Workers, CI testing, and the complete render-pipeline attack surface.



Your page scores 95 on Lighthouse in the office. For a user on a Pixel 4a in Jakarta over a 3G connection, INP is 340 ms. These are two different measurements of two different things: a lab score on your hardware and field latency on theirs. Only one of them matters to the user.

LoAF and INP: two complementary signals

PerformanceLongAnimationFrameTiming (the Long Animation Frames API, LoAF; shipped in Chromium 123 in March 2024 after an origin trial that began in 2023) reports any animation frame that took longer than 50 ms, with a breakdown of where the time went: per-script durations, forced style and layout time, and when rendering started. It is a diagnostic tool: it tells you what ran long, not what the user felt.

INP (Interaction to Next Paint) measures the time from a user input (click, key press, tap) to the next paint after its event handlers have run. INP became a Core Web Vital in March 2024, replacing FID. The thresholds are assessed at p75: up to 200 ms is good, and above 500 ms is poor. An INP over 200 ms most often traces to one of two pipeline problems:

A long JS task in the input handler that delays the next frame (rAF callbacks and rendering wait for it)
A forced synchronous layout caused by reading geometry after a style or DOM write inside the handler

LoAF gives you the data to attribute these in production; INP gives you the metric the user feels. Together they close the loop from production telemetry back to specific pipeline diagnostics.
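As a rough illustration of that attribution step, the sketch below takes one LoAF-shaped entry and decides which bucket dominated. The field names (duration, renderStart, scripts, forcedStyleAndLayoutDuration) follow the Long Animation Frames API; the local interfaces, the attributeLoAF helper, and the three-bucket split are assumptions made for this example, not part of the API.

```typescript
// Local shapes that mirror only the LoAF fields this sketch reads.
interface ScriptTimingLike {
  duration: number;                     // time spent in this script entry (ms)
  forcedStyleAndLayoutDuration: number; // forced sync style/layout inside it (ms)
}

interface LoAFLike {
  startTime: number;
  duration: number;
  renderStart: number; // 0 when the frame did no rendering work
  scripts: ScriptTimingLike[];
}

type Cause = "script" | "forced-layout" | "render";

interface Attribution {
  cause: Cause;
  scriptMs: number;
  forcedLayoutMs: number;
  renderMs: number;
}

// Hypothetical helper: split one long frame into three buckets and name the largest.
// Assumes forced layout time is already counted inside each script's duration.
function attributeLoAF(entry: LoAFLike): Attribution {
  const forcedLayoutMs = entry.scripts.reduce(
    (sum, s) => sum + s.forcedStyleAndLayoutDuration, 0);
  const scriptMs = entry.scripts.reduce((sum, s) => sum + s.duration, 0) - forcedLayoutMs;
  const renderMs = entry.renderStart > 0
    ? entry.startTime + entry.duration - entry.renderStart
    : 0;

  let cause: Cause = "script";
  let max = scriptMs;
  if (forcedLayoutMs > max) { cause = "forced-layout"; max = forcedLayoutMs; }
  if (renderMs > max) { cause = "render"; }
  return { cause, scriptMs, forcedLayoutMs, renderMs };
}

// Browser wiring, shown as a comment because it needs a LoAF-capable browser;
// `report` is a placeholder for your own telemetry call:
//   new PerformanceObserver((list) => {
//     for (const e of list.getEntries()) report(attributeLoAF(e as unknown as LoAFLike));
//   }).observe({ type: "long-animation-frame", buffered: true });
```

In production you would typically send the attribution alongside the INP sample for the same interaction, so the metric and its cause arrive together.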

INP diagnosis path

INP > 200 ms — user perceives sluggish interactions

↓ investigate with

LoAF — which frame, what dominated (JS / layout / render)

↓ maps to

Long JS task in handler → yield with scheduler.yield() where the browser supports it, or split the work with setTimeout

Forced sync layout in handler → two-pass read/write batch

INP measures the latency the user felt; LoAF attributes the slow frame back to a pipeline cause. Each cause maps to one targeted fix — yielding for a long JS task, a two-pass batch for a forced layout. This closes the loop from production telemetry to a code change.
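Both fixes can be sketched in a few lines. The version below is a minimal illustration under stated assumptions: yieldToMain, processInChunks, readThenWrite, and the Measurable interface are hypothetical names, the 8 ms slice budget is arbitrary, and scheduler.yield() is used only when the runtime exposes it, with setTimeout as the fallback.

```typescript
// Yield to the event loop so pending input handling and rendering can run.
function yieldToMain(): Promise<void> {
  const sched = (globalThis as any).scheduler;
  if (sched && typeof sched.yield === "function") {
    return sched.yield();
  }
  return new Promise<void>((resolve) => setTimeout(resolve, 0));
}

// Fix 1: split a long task into slices and yield whenever a slice exceeds budgetMs.
// Returns how many times it yielded. `now` is injectable so the logic can be tested.
async function processInChunks<T>(
  items: T[],
  work: (item: T) => void,
  budgetMs = 8,
  now: () => number = () => Date.now(),
): Promise<number> {
  let yields = 0;
  let sliceStart = now();
  for (const item of items) {
    work(item);
    if (now() - sliceStart >= budgetMs) {
      await yieldToMain();
      yields++;
      sliceStart = now();
    }
  }
  return yields;
}

// Fix 2: two-pass batch. All geometry reads happen before any write,
// so the browser is never forced to lay out in the middle of the loop.
interface Measurable {
  measure(): number;          // e.g. read offsetHeight
  apply(value: number): void; // e.g. write style.height
}

function readThenWrite(items: Measurable[], compute: (measured: number) => number): void {
  const measured = items.map((item) => ({ item, value: item.measure() })); // pass 1: reads
  for (const m of measured) {
    m.item.apply(compute(m.value));                                        // pass 2: writes
  }
}
```

The interleaved form (read, write, read, write) is what triggers a forced layout on every iteration; the two-pass form pays for at most one layout before the reads.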

Off-main-thread scroll

You might think scroll is always “free” because users experience it as instant. When you find out it isn’t, understanding why it fell back to the main thread tells you exactly how to fix it.

Browsers have shipped compositor-thread scrolling for a decade: when the user scrolls a normal page, the compositor translates the viewport on the GPU without touching the main thread. A page can scroll smoothly even while a long JS task runs.

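The catch: a wheel, touchstart, or touchmove listener attached non-passively (without {passive: true}) forces the compositor to wait for the main thread before it can scroll, because the handler might call preventDefault(). The scroll event itself is not cancelable, so scroll listeners do not block scrolling, although heavy work inside them still competes for the main thread. Always pass passive: true to wheel, touchstart, and touchmove listeners unless you actually need preventDefault. Chromium already treats these listeners as passive by default when they are registered on window, document, or body, and DevTools warns when a non-passive listener delays scrolling.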
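One way to make the passive default hard to forget is a small wrapper around addEventListener. This is a sketch only: listen, ListenerTargetLike, and the needsPreventDefault flag are hypothetical names, and the set below covers the listener types whose events can cancel scrolling.

```typescript
// Minimal structural type so the helper works with any EventTarget-like object.
interface ListenerTargetLike {
  addEventListener(
    type: string,
    listener: (event: unknown) => void,
    options?: { passive?: boolean },
  ): void;
}

// Event types whose handlers can cancel scrolling via preventDefault().
const SCROLL_BLOCKING = new Set(["wheel", "touchstart", "touchmove"]);

// Registers the listener as passive for scroll-blocking events unless the
// caller states explicitly that it needs preventDefault().
function listen(
  target: ListenerTargetLike,
  type: string,
  listener: (event: unknown) => void,
  needsPreventDefault = false,
): void {
  const options = SCROLL_BLOCKING.has(type)
    ? { passive: !needsPreventDefault }
    : undefined;
  target.addEventListener(type, listener, options);
}

// In a page: listen(document, "touchmove", onTouchMove);
```

Making the opt-out explicit means a non-passive listener shows up in code review as a deliberate choice rather than an omission.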

Display locking and content-visibility

The mechanism behind content-visibility: auto is “display locking”: the browser skips rendering work for the contents of the locked subtree (no style calc, no layout, no paint) and sizes the element from a placeholder intrinsic size. When the element comes close to the viewport, which the browser tracks internally much as an IntersectionObserver would, it unlocks and renders.

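Numbers, as an illustration: a 10 000-row table (built from block-level rows, since content-visibility has no effect on internal table boxes such as <tr>) that previously cost 1 200 ms of style + layout can render in ~10 ms with content-visibility: auto, because only the rows near the viewport are rendered. Actual savings depend on row complexity and device.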

Trade-off: scrolling into a previously locked region briefly pauses to render it. If rows are expensive, you see a single-frame stutter at the unlock boundary. Pair with contain-intrinsic-size to give the browser a realistic placeholder size so scroll position is not jumpy.
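Applied from script, the pairing might look like the sketch below. The lockOffscreenRow helper, the StyledLike type, and the 48 px estimate are assumptions for illustration; the two CSS properties and the "auto <length>" value form are standard. In a stylesheet the same two declarations would simply go on the row selector.

```typescript
// Structural type covering the one style method this sketch needs.
interface StyledLike {
  style: { setProperty(name: string, value: string): void };
}

// Display-lock a row while it is off-screen and give it a placeholder size.
// "auto <length>" asks the browser to use the estimate until the row has been
// rendered once, then to remember the last rendered size instead.
function lockOffscreenRow(row: StyledLike, estimatedHeightPx: number): void {
  row.style.setProperty("content-visibility", "auto");
  row.style.setProperty("contain-intrinsic-size", `auto ${estimatedHeightPx}px`);
}

// In a page (row height is a guess you should measure for your own content):
//   document.querySelectorAll<HTMLElement>(".row").forEach((r) => lockOffscreenRow(r, 48));
```

The closer the estimate is to the real row height, the less the scrollbar thumb moves as rows unlock.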

Reduced-motion as a render-budget escape valve

The reduced-motion preference is set in the OS by users who experience motion sickness or want lower power consumption; pages detect it with @media (prefers-reduced-motion: reduce). Beyond accessibility, it is a render-budget escape valve: any compositor-driven animation can be replaced by an instant state change when reduced-motion is on, freeing the compositor of per-frame work. Battery-constrained users on a budget Android phone can see a meaningful battery saving.
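For animations driven from script, the same decision can be made once before starting the animation. The sketch below assumes a hypothetical planTransition helper and a 300 ms default; only the media query string and the matches property come from the platform.

```typescript
// Structural types so the function can be exercised without a browser.
interface MediaQueryListLike { matches: boolean }
type MatchMediaLike = (query: string) => MediaQueryListLike;

interface MotionPlan {
  animate: boolean;
  durationMs: number;
}

// Decide between an animated transition and an instant state change.
function planTransition(matchMedia: MatchMediaLike, fullDurationMs = 300): MotionPlan {
  const reduced = matchMedia("(prefers-reduced-motion: reduce)").matches;
  return reduced
    ? { animate: false, durationMs: 0 }             // jump straight to the end state
    : { animate: true, durationMs: fullDurationMs };
}

// In a page, wrap matchMedia so it keeps its `window` receiver:
//   const plan = planTransition((q) => window.matchMedia(q));
```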

Web Workers and main-thread offload

When the main thread hits its ceiling, the only way to make headroom is to move work off it. Web Workers execute JS on a separate thread without DOM access; serialisation via postMessage costs roughly 1 ms/MB (a rule of thumb that varies with device and data shape), which is worthwhile for CPU-heavy tasks (large JSON parse, compression, cryptography, markup processing).

OffscreenCanvas gives direct canvas API access from a worker, bypassing the main thread entirely. SharedArrayBuffer + Atomics provide shared memory and synchronisation primitives between threads for high-frequency data (audio, real-time sensor feeds); SharedArrayBuffer requires the page to be cross-origin isolated. The entry cost is high (structured cloning, message-passing architecture), but the performance ceiling is proportionally higher: a 4-core phone with a busy main thread still has 3 idle cores most applications never use.
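The serialisation figure above turns into a quick back-of-the-envelope check for whether a task is worth moving. The sketch below is a simplification under loud assumptions: estimateOffload is a hypothetical helper, the default of 1 ms/MB is the lesson's rough figure rather than a measurement, and the main thread is assumed to pay for serialising the request and deserialising the response.

```typescript
interface OffloadInput {
  requestMB: number;     // payload posted to the worker
  responseMB: number;    // result posted back
  cpuMs: number;         // measured cost of doing the work on the main thread
  cloneMsPerMB?: number; // structured-clone cost; rough rule of thumb
}

interface OffloadEstimate {
  mainThreadMsInline: number;
  mainThreadMsOffloaded: number;
  worthwhile: boolean;
}

// Compare main-thread time when the work runs inline vs in a worker.
function estimateOffload(input: OffloadInput): OffloadEstimate {
  const { requestMB, responseMB, cpuMs, cloneMsPerMB = 1 } = input;
  const mainThreadMsOffloaded = (requestMB + responseMB) * cloneMsPerMB;
  return {
    mainThreadMsInline: cpuMs,
    mainThreadMsOffloaded,
    worthwhile: mainThreadMsOffloaded < cpuMs,
  };
}
```

For example, an 80 ms JSON parse with a 2 MB request and a 0.5 MB response leaves about 2.5 ms on the main thread under these assumptions, while a 5 ms task that ships 10 MB each way gets slower, not faster.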

Service Worker and the first-paint pipeline

Service Workers are not part of the render pipeline, but they critically affect its input. A Service Worker that answers requests from cache (cache-first for static assets, with network-first reserved for API calls) can deliver first paint in about 50 ms on return visits instead of about 500 ms: 3 frames vs 30 frames at 60 fps.

Key constraint: the Service Worker script itself runs on a separate thread, but its startup (when it intercepts the first request) has a small latency cost. Keep the Service Worker thin and fast; a 200 ms startup on the fetch interception delays the first HTML byte and the entire parse pipeline with it.
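The two strategies named above are small enough to sketch in full. To keep the example runnable outside a Service Worker, the cache and the network are passed in as plain functions and response bodies are plain strings; Lookup, Network, Store, pickStrategy, and the "/api/" prefix are all assumptions for illustration. In a real worker these would map onto caches.match, fetch, and cache.put inside a fetch event's respondWith.

```typescript
type Lookup = (url: string) => Promise<string | undefined>;
type Network = (url: string) => Promise<string>;
type Store = (url: string, body: string) => Promise<void>;

// Route by path: API calls want fresh data, everything else is treated as static.
function pickStrategy(pathname: string): "cache-first" | "network-first" {
  return pathname.startsWith("/api/") ? "network-first" : "cache-first";
}

// Cache-first: answer from cache when possible, touch the network only on a miss.
async function cacheFirst(url: string, lookup: Lookup, network: Network, store: Store): Promise<string> {
  const cached = await lookup(url);
  if (cached !== undefined) return cached;
  const fresh = await network(url);
  await store(url, fresh);
  return fresh;
}

// Network-first: prefer fresh data, fall back to the cached copy when offline.
async function networkFirst(url: string, lookup: Lookup, network: Network, store: Store): Promise<string> {
  try {
    const fresh = await network(url);
    await store(url, fresh);
    return fresh;
  } catch (err) {
    const cached = await lookup(url);
    if (cached !== undefined) return cached;
    throw err; // nothing cached either: surface the network failure
  }
}
```

Keeping the routing decision a pure function of the path is one way to keep the worker thin, which matters given the startup cost described above.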

Performance CI: catching regressions before merge

Budgets hold only when regressions are caught before merge. A realistic CI pipeline has three levels:

Lighthouse CI in headless Chrome on every PR: gates on absolute lab metrics (LCP < 2.5 s, CLS < 0.1, and TBT as the lab proxy for INP, because a default Lighthouse navigation run performs no interactions) and blocks merge on regression.
Synthetic benchmarks on critical scenarios (open page, scroll list 1000×, click 5 buttons): measure p95 frame duration and p95 LoAF duration, and compare against the baseline branch.
RUM (real user monitoring) on production: sends INP, LCP, CLS percentiles to Datadog or similar, and alerts when p75 INP crosses 200 ms on any user segment (country, device, app version).

Without all three levels, a render regression lands in production, lives there silently for months, and is discovered when a user complaint finally makes someone look.
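The percentile checks behind the second and third levels are simple enough to sketch. The code below assumes a nearest-rank percentile, a hypothetical 10% tolerance against the baseline branch, and helper names (percentile, gateAgainstBaseline, inpAlert) invented for this example; your benchmark runner or RUM vendor may compute percentiles differently.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[Math.min(rank, sorted.length) - 1];
}

interface GateResult {
  p95: number;
  baselineP95: number;
  regressed: boolean;
}

// Synthetic level: fail the PR when p95 frame duration exceeds the baseline
// branch by more than the tolerance (10% here is an arbitrary starting point).
function gateAgainstBaseline(candidateMs: number[], baselineMs: number[], tolerance = 0.1): GateResult {
  const p95 = percentile(candidateMs, 95);
  const baselineP95 = percentile(baselineMs, 95);
  return { p95, baselineP95, regressed: p95 > baselineP95 * (1 + tolerance) };
}

// RUM level: alert when p75 INP for a segment crosses the 200 ms threshold.
function inpAlert(inpSamplesMs: number[], thresholdMs = 200): boolean {
  return percentile(inpSamplesMs, 75) > thresholdMs;
}
```

Running inpAlert per segment (country, device, app version) rather than on the global population is what surfaces a regression that only affects budget devices.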

Profiling on real hardware vs DevTools throttling. The “4× CPU slowdown” in DevTools Performance is an approximation, not a replacement for real hardware. An M2 MacBook at 4× slowdown still has a different memory hierarchy, a different GPU, and a different thermal throttling profile than a Pixel 6a. Profile on a physical mid-range Android at least once a week using Chrome for Android + remote debugging, and compare frame durations against the throttled desktop run. A >30% difference means you have a mobile regression invisible on desktop.

Edge cases

Layer squashing is the compositor’s answer to overlap-induced layer explosion. When many adjacent non-animating elements are promoted (due to overlap with a single animated layer), the compositor squashes them into a single shared “squashed layer” bitmap. This reduces GPU memory at the cost of a larger single bitmap — if one element in the squash changes, the entire squashed layer must repaint. The squashing heuristic is not user-controllable; the only remedy is to isolate animated layers from non-animating neighbours so the overlap rule does not trigger.

Design challenge

Design the scrolling behaviour for a virtualised chat list that holds 50 000 messages and must hit 60 fps on a mid-range Android phone.

Frame budget: 16.67 ms. Realistic main-thread budget after browser overhead: ~10 ms.
Layout must not depend on off-screen rows.
Composite-only path during scroll. Layout and paint allowed only when new rows enter the viewport.
GPU memory: assume 200 MB available. Layer count must stay below 30 at any time.
Resize handler must not loop reads and writes (no forced reflow).

Quiz

A click handler runs in 80 ms. The user perceives a delay before the UI updates. Which Core Web Vital measures this and what does the pipeline have to do with it?

Quiz

A touchmove listener is attached without `{passive: true}`. What performance regression does this cause?

Recall before you leave

01. What does LoAF report, and how does it differ from INP?
02. Why do non-passive wheel/touchmove listeners cause jank?
03. Name the three CI levels for catching render regressions before production.

Recap

LoAF attributes long frames; INP measures the interaction latency the user feels. An INP above the 200 ms “good” threshold at p75 usually traces to a long JS task or a forced sync layout in the input handler. Non-passive wheel and touch listeners make the compositor wait for the main thread; passive: true restores compositor-thread scroll. content-visibility: auto display-locks off-screen subtrees, dropping a 1 200 ms layout to ~10 ms in the example above. @media (prefers-reduced-motion) is both an accessibility requirement and a render-budget optimisation. Web Workers offload CPU-heavy JS from the main thread; Service Workers serve first paint from cache. CI needs three layers: Lighthouse per PR, synthetic p95 LoAF on critical paths, and RUM alerts on production INP percentiles. The complete attack surface (parser-blocking scripts, oversized CSS, complex selectors, deep flex layouts, paint-heavy filters, layer overflow, layout thrash, non-passive listeners, tasks >50 ms, and long input handlers) maps one-to-one onto specific pipeline stages. Now when a user on a budget Android in Jakarta reports a sluggish tap, you have a telemetry loop: INP gives you the interaction, LoAF gives you the frame, the stage-cost map gives you the fix.


