Browser & Frontend Runtime
CLS: why layout shifts happen and how to stop them
An article loads fast. The user starts reading the second paragraph. Then an ad slot above the article fills in and shoves everything down by 300 px. The user was mid-sentence; now they are mid-paragraph-of-the-wrong-article. They did not tap anything. The page moved by itself. That is CLS — and it is the easiest of the three vitals to prevent once you understand the one rule behind all four classic causes.
How the score is calculated.
Each time a visible element changes position between two frames without a user-initiated cause, the browser records a layout shift. One shift’s score is:
impact fraction × distance fraction
Impact fraction: the combined area touched by all shifted elements, counting both their old and new positions, as a fraction of the viewport. If those positions together cover half the viewport, impact fraction is 0.5. Distance fraction: the largest distance any element moved, as a fraction of the viewport's larger dimension (its height on a typical portrait phone screen). A shift of 20% of viewport height on such a screen gives 0.2.
So a single shift that touches half the viewport and moves its elements by 20% of viewport height scores 0.5 × 0.2 = 0.1, exactly on the “good” border.
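The arithmetic above can be sketched in a few lines of TypeScript. This is only an illustration: the names `ShiftInput`, `layoutShiftScore`, and `GOOD_THRESHOLD` are hypothetical, and the function assumes both fractions have already been measured as numbers between 0 and 1.

```typescript
// Hypothetical helper: both inputs are fractions in the range 0..1.
interface ShiftInput {
  impactFraction: number;   // share of the viewport affected by the shift
  distanceFraction: number; // largest move, as a share of the viewport
}

// "Good" threshold taken from the text (0.1 at the 75th percentile).
const GOOD_THRESHOLD = 0.1;

function layoutShiftScore(input: ShiftInput): number {
  return input.impactFraction * input.distanceFraction;
}

// The worked example from the text: half the viewport, moved by 20%.
const exampleScore = layoutShiftScore({ impactFraction: 0.5, distanceFraction: 0.2 });
console.log(exampleScore.toFixed(2));        // prints "0.10"
console.log(exampleScore <= GOOD_THRESHOLD); // prints true
```

Because the score is a product, shrinking either factor helps: a small element that jumps far and a large element that nudges slightly can score the same.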
CLS is not a lifetime sum of all shifts. It is the sum of shifts in the worst session window: a cluster of shifts where each shift is within 1 second of the previous, and the entire cluster spans at most 5 seconds. This change from the original algorithm was deliberate — the old sum unfairly penalised long-lived pages and infinite scroll.
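A minimal sketch of the session-window grouping described above, assuming shifts arrive as plain objects with a timestamp and a score. The names `Shift` and `worstSessionWindow` are hypothetical, and the exact boundary handling (strictly less than 1 s and 5 s) is an assumption of this sketch rather than something the text specifies.

```typescript
interface Shift {
  time: number;  // ms since the page started loading
  score: number; // impact fraction × distance fraction
}

const MAX_GAP_MS = 1000;    // a shift must follow the previous one within 1 s
const MAX_WINDOW_MS = 5000; // a window may span at most 5 s

function worstSessionWindow(shifts: Shift[]): number {
  const sorted = shifts.slice().sort((a, b) => a.time - b.time);
  let worst = 0;
  let windowSum = 0;
  let windowStart = 0;
  let previousTime = 0;
  let windowOpen = false;

  for (const shift of sorted) {
    const continuesWindow =
      windowOpen &&
      shift.time - previousTime < MAX_GAP_MS &&
      shift.time - windowStart < MAX_WINDOW_MS;

    if (continuesWindow) {
      windowSum += shift.score;
    } else {
      // Start a new window with this shift.
      windowSum = shift.score;
      windowStart = shift.time;
      windowOpen = true;
    }
    previousTime = shift.time;
    worst = Math.max(worst, windowSum);
  }
  return worst;
}

// Two bursts: one early (0.05 + 0.04), one later (0.2 + 0.02).
const sampleShifts: Shift[] = [
  { time: 100, score: 0.05 },
  { time: 600, score: 0.04 },
  { time: 3000, score: 0.2 },
  { time: 3500, score: 0.02 },
];
const lifetimeSum = sampleShifts.reduce((sum, s) => sum + s.score, 0);
console.log(worstSessionWindow(sampleShifts).toFixed(2)); // prints "0.22"
console.log(lifetimeSum.toFixed(2));                      // prints "0.31"
```

In this sample the reported value is the second burst alone, not the sum of both, which is the difference between the session-window model and the older lifetime sum.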
- Shift score: impact fraction × distance fraction
- CLS reported: worst 5 s session window sum
- Shift exclusion window after input: 500 ms
- Good threshold (p75): ≤ 0.1
The 500 ms exclusion — what CLS does not punish.
Shifts within 500 ms of a user interaction are excluded. Opening an accordion, expanding a dropdown, clicking a “show more” button — the resulting movement is expected by the user and does not count toward CLS. CLS only punishes movement the user did not cause and did not expect.
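In browsers that expose the Layout Instability API, each `layout-shift` performance entry carries a `hadRecentInput` flag that marks shifts falling inside this exclusion window. The sketch below shows how a measurement script might skip those entries. The `LayoutShiftLike` type and the `unexpectedShiftTotal` function are hypothetical names, and the function returns a plain sum for illustration rather than the session-window value.

```typescript
// Minimal shape of a layout-shift entry, limited to the fields used here.
interface LayoutShiftLike {
  startTime: number;       // ms, when the shift happened
  value: number;           // the shift's score
  hadRecentInput: boolean; // true if the shift followed recent user input
}

// Sum only the shifts the user did not cause.
function unexpectedShiftTotal(entries: LayoutShiftLike[]): number {
  return entries
    .filter((entry) => !entry.hadRecentInput)
    .reduce((sum, entry) => sum + entry.value, 0);
}

const observed: LayoutShiftLike[] = [
  { startTime: 1200, value: 0.3, hadRecentInput: true },   // accordion opened by a click
  { startTime: 4000, value: 0.08, hadRecentInput: false }, // ad slot filled in on its own
];
console.log(unexpectedShiftTotal(observed).toFixed(2)); // prints "0.08"

// In a browser, entries could be collected roughly like this:
// new PerformanceObserver((list) => {
//   const entries = list.getEntries() as unknown as LayoutShiftLike[];
//   console.log(unexpectedShiftTotal(entries));
// }).observe({ type: "layout-shift", buffered: true });
```

The large accordion shift contributes nothing, while the smaller unprompted ad shift is counted in full.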
Four classic causes and their fixes.
- Images without dimensions. An <img> with no width and height attributes has a 0-height box until the bytes arrive. When the image loads, the browser learns its intrinsic size, reruns layout, and every element below jumps down. Fix: always set width and height HTML attributes (modern browsers derive aspect-ratio from them automatically and reserve a correctly-proportioned box). CSS can still make the image fluid with width: 100%; height: auto. The attributes supply the ratio, the CSS supplies responsive sizing.
- Ads, embeds, and iframes injected into unreserved space. An ad slot that renders 250 px of content into a container with no reserved height shoves everything below it down. Fix: wrap every ad and embed slot in a container with a min-height equal to the largest expected ad creative.
- Web font reflow. A fallback font with different line metrics renders first; when the web font loads, the browser reflows text: characters are wider or narrower, lines rebreak, elements below shift. Fix: use size-adjust and the ascent-override / descent-override font descriptors to make the fallback metrics match the web font; or use font-display: optional to only apply the web font on repeat visits.
- Dynamically injected content above existing content. A cookie banner, notification bar, or chat widget inserted above the page body at runtime pushes everything down. Fix: insert it into pre-reserved space (a container with a known height in the layout), or render it as an overlay (fixed/absolute) so it does not participate in flow layout at all.
The unifying principle: reserve space for anything whose final size is not known at parse time, or ensure it never participates in flow layout.
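To make "reserve space" concrete, the sketch below computes the two reservations mentioned in the fixes: the box height a browser can reserve for a fluid image once it knows the width and height attributes, and a min-height for an ad slot. Both function names are hypothetical, and the numbers in the example are made up for illustration.

```typescript
// Height to reserve for an image rendered at `renderedWidth` CSS pixels,
// given the width and height HTML attributes (which supply the ratio).
function reservedImageHeight(
  attrWidth: number,
  attrHeight: number,
  renderedWidth: number
): number {
  if (attrWidth <= 0 || attrHeight <= 0) {
    // No usable ratio: nothing can be reserved before the bytes arrive.
    return 0;
  }
  return renderedWidth * (attrHeight / attrWidth);
}

// min-height for an ad container: the tallest creative it may receive.
function adSlotMinHeight(expectedCreativeHeights: number[]): number {
  return expectedCreativeHeights.reduce((max, h) => Math.max(max, h), 0);
}

// A 1600×900 image displayed 400 px wide keeps a 225 px box from the start.
console.log(reservedImageHeight(1600, 900, 400)); // prints 225
// A slot that may show 90, 250, or 280 px creatives reserves 280 px.
console.log(adSlotMinHeight([90, 250, 280]));     // prints 280
```

Reserving for the tallest creative trades some empty space when a shorter ad renders for a guarantee that content below the slot never moves.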
Animation pitfall.
Animating layout properties — top, left, height, width, margin — does generate layout shifts and can hurt CLS, even if the movement looks intentional to a developer. The fix is to animate transform instead (translateY, scale), which is composited and never triggers layout. A shift caused by a CSS animation is still a shift unless it follows a user interaction within 500 ms.
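As a rough illustration of the swap, the sketch below builds transform-based keyframes for a vertical slide and includes a small check that flags keyframes touching the layout properties named above. The names `KeyframeLike`, `animatesLayout`, and `slideDownKeyframes` are hypothetical, and the property list covers only the ones mentioned in the text.

```typescript
type KeyframeLike = Record<string, string | number>;

// Layout properties called out in the text; not an exhaustive list.
const LAYOUT_PROPERTIES = ["top", "left", "height", "width", "margin"];

// True if any keyframe animates one of the listed layout properties.
function animatesLayout(keyframes: KeyframeLike[]): boolean {
  return keyframes.some((frame) =>
    Object.keys(frame).some((prop) => LAYOUT_PROPERTIES.indexOf(prop) !== -1)
  );
}

// Same visual movement expressed with transform instead of `top`.
function slideDownKeyframes(distancePx: number): KeyframeLike[] {
  return [
    { transform: "translateY(0px)" },
    { transform: `translateY(${distancePx}px)` },
  ];
}

const topBased: KeyframeLike[] = [{ top: "0px" }, { top: "40px" }];
console.log(animatesLayout(topBased));               // prints true
console.log(animatesLayout(slideDownKeyframes(40))); // prints false

// In a browser, the keyframes could be passed to the Web Animations API:
// element.animate(slideDownKeyframes(40), { duration: 200, easing: "ease-out" });
```

The element ends up in the same visual position either way; the difference is that the transform version leaves the layout box where it was, so nothing around it is recorded as shifting.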
Why this works
The session window model replaced the original “lifetime sum” model because long-lived pages and infinite scroll were penalised for shifts that happened five minutes after load — well outside the user’s reading context. The session window focuses CLS on bursts of bad behavior: a cluster of shifts during an ad reload, or a batch of images loading without dimensions, rather than the accumulated cost of a site someone browses for an hour. CLS is now more representative of what a user actually notices.
- 01 How is one layout shift's score calculated, and what is the CLS session window?
- 02 Name the four classic CLS causes and the fix for each.
- 03 A CSS animation moves an element using 'top' and the page fails its CLS audit. How do you fix it without removing the animation?
CLS scores the worst burst of unexpected layout movement: one shift’s score is impact fraction × distance fraction, and CLS reports the worst 5-second session window. Shifts within 500 ms of a user interaction are excluded — CLS only punishes movement the user did not cause. The four classic causes are images without dimensions, ads and embeds in containers with no reserved height, web font reflow when the fallback metrics differ, and dynamically injected content above existing flow. The fix for all four follows one rule: anything whose final size is unknown at parse time must have space reserved before it arrives, or it must use overlay positioning so it does not participate in flow layout. Animating layout properties (top, left, height) generates shifts — animate transform instead.