Browser & Frontend Runtime WEB · 03 · 05

Orinoco GC: parallel scavenger, concurrent marking, and write barriers

How V8's Orinoco garbage collector separates the young and old generations, runs minor GC in parallel, moves major-GC marking to background threads, and uses write barriers to stay correct while JS executes.

WEB Middle ◷ 14 min

A Node.js service runs fine for a week, then a 5 ms pause starts appearing every 60 seconds and shows up in p99 latency. Nothing in the application code changed; the heap simply crossed an old-generation GC threshold. Understanding Orinoco is the difference between guessing and diagnosing.

The heap layout

Before you reach for --max-old-space-size to fix a memory problem, understand what you’re actually tuning: two completely different GC machines with different pause profiles.

The V8 heap is divided into two generations:

Young generation — newly allocated objects, typically 1–8 MB. Collected frequently (minor GC).
Old generation — long-lived objects, can be hundreds of MB. Collected infrequently (major GC).

Objects start in young gen. Those that survive two minor GCs are promoted to old gen. This generational design exploits the observation that most objects die young — minor GC can be fast because the live set is small.

Generational hypothesis: most objects die young. The young generation (1–8 MB) is collected often and cheaply by the copying Scavenger (<1 ms). Survivors of two minor GCs are promoted to the large old generation, collected rarely by Mark-Sweep-Compact (the major GC).
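To see the two generations on a running process, Node's built-in `v8` module reports per-space statistics. The following is a minimal sketch, assuming a Node.js runtime; the helper name `heapSpaces` and the MB rounding are illustrative choices, not part of the lesson. In the output, `new_space` is the young generation and `old_space` holds the bulk of the old generation.

```typescript
import { getHeapSpaceStatistics } from "node:v8";

// Convert bytes to MB, rounded to one decimal place.
const toMB = (bytes: number): number =>
  Math.round((bytes / 1024 / 1024) * 10) / 10;

// One row per V8 heap space (new_space, old_space, and others).
// `heapSpaces` is a hypothetical helper name used only in this sketch.
function heapSpaces(): { name: string; usedMB: number; sizeMB: number }[] {
  return getHeapSpaceStatistics().map((s) => ({
    name: s.space_name,
    usedMB: toMB(s.space_used_size),
    sizeMB: toMB(s.space_size),
  }));
}

for (const row of heapSpaces()) {
  console.log(row.name + ": used " + row.usedMB + " MB of " + row.sizeMB + " MB");
}
```

Sampling this periodically shows which generation is actually growing before you change any heap-size flag.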

Minor GC: the parallel Scavenger

The scavenger uses a copying collector with two semispaces:

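From-space holds every young object allocated since the last scavenge, live or dead; to-space is empty.
On collection, V8 walks the roots (stack, globals, registers, plus recorded old-to-new references), copies reachable young objects to to-space, and updates pointers to the new locations.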
Dead objects are abandoned in from-space.
Survivors of two minor GCs are promoted to old generation.
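The cycle above can be modelled in a few lines. This is a toy, single-threaded sketch, not V8's implementation: the names `YoungObj`, `ToyHeap`, `allocate`, and `scavenge` are all hypothetical, object identity stands in for "copy and update pointers", and the caller is assumed to pass every root (including any old-to-new references).

```typescript
// Toy model of a semispace scavenge with age-based promotion.
interface YoungObj {
  id: string;
  refs: YoungObj[]; // outgoing references
  age: number;      // scavenges survived so far
}

interface ToyHeap {
  young: Set<YoungObj>; // plays the role of from-space
  old: Set<YoungObj>;   // old generation
}

const PROMOTION_AGE = 2; // "survives two minor GCs"

function allocate(heap: ToyHeap, id: string, refs: YoungObj[] = []): YoungObj {
  const obj: YoungObj = { id, refs, age: 0 };
  heap.young.add(obj); // new objects always start in the young generation
  return obj;
}

function scavenge(heap: ToyHeap, roots: YoungObj[]): void {
  const toSpace = new Set<YoungObj>();
  const seen = new Set<YoungObj>();
  const worklist = roots.slice();

  while (worklist.length > 0) {
    const obj = worklist.pop() as YoungObj;
    // Skip objects already handled, and old objects (a scavenge never traces them).
    if (seen.has(obj) || !heap.young.has(obj)) continue;
    seen.add(obj);
    obj.age += 1;
    if (obj.age >= PROMOTION_AGE) heap.old.add(obj); // promote
    else toSpace.add(obj);                           // "copy" to to-space
    worklist.push(...obj.refs);
  }
  // Everything not reached is abandoned; to-space becomes the new from-space.
  heap.young = toSpace;
}

const toyHeap: ToyHeap = { young: new Set(), old: new Set() };
const b = allocate(toyHeap, "b");
const a = allocate(toyHeap, "a", [b]);
allocate(toyHeap, "garbage");
scavenge(toyHeap, [a]); // a and b survive once; "garbage" is dropped
allocate(toyHeap, "temp");
scavenge(toyHeap, [a]); // a and b survive a second time and are promoted
console.log(Array.from(toyHeap.old).map((o) => o.id)); // the promoted objects
```

Note that the cost of `scavenge` depends only on the reachable young objects, never on how much garbage was allocated, which is why the pause stays short.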

The scavenger has been parallel since Orinoco’s first phase (V8 6.2, 2017): multiple worker threads share the work via dynamic work-stealing. Main-thread pauses are ~1ms because the young-gen live set is small and parallelism keeps wall-clock time low.

Major GC: Mark-Compact with concurrent marking

Old-generation collection traces all live objects, sweeps dead ones, and compacts survivors to reduce fragmentation. Pre-Orinoco this was a stop-the-world operation lasting hundreds of ms on large heaps.

Orinoco’s key innovation is concurrent marking: background threads walk the heap while JavaScript executes on the main thread. The main thread still pays for a brief final-marking pause (single-digit ms) and for compaction, which is parallel across worker threads but still blocking; sweeping of dead memory runs concurrently in the background. Result: main-thread pause reduced ~50% on WebGL-heavy workloads.
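As a reference point for what "trace, then sweep" means, here is a toy stop-the-world mark-sweep over a small object graph. It is a sketch under simplifying assumptions: single-threaded, no compaction, and every name (`OldObj`, `makeOldObj`, `collect`) is hypothetical rather than a V8 internal.

```typescript
interface OldObj {
  id: string;
  refs: OldObj[];
  marked: boolean;
}

function makeOldObj(id: string, refs: OldObj[] = []): OldObj {
  return { id, refs, marked: false };
}

// Mark phase: flag everything reachable from the roots.
function mark(roots: OldObj[]): void {
  const worklist = roots.slice();
  while (worklist.length > 0) {
    const obj = worklist.pop() as OldObj;
    if (obj.marked) continue;
    obj.marked = true;
    worklist.push(...obj.refs);
  }
}

// Sweep phase: keep marked objects, drop the rest, reset marks for the next cycle.
function sweep(heap: OldObj[]): OldObj[] {
  const survivors = heap.filter((obj) => obj.marked);
  for (const obj of survivors) obj.marked = false;
  return survivors;
}

function collect(heap: OldObj[], roots: OldObj[]): OldObj[] {
  mark(roots);
  return sweep(heap);
}

const y = makeOldObj("y");
const x = makeOldObj("x", [y]);
const rootObj = makeOldObj("root", [x]);
// p and q reference each other but are unreachable from the root.
const p = makeOldObj("p");
const q = makeOldObj("q", [p]);
p.refs.push(q);

const liveIds = collect([rootObj, x, y, p, q], [rootObj]).map((o) => o.id).join(",");
console.log(liveIds); // prints "root,x,y"
```

The `mark` loop is the part whose cost grows with the live old generation, which is why moving it off the main thread matters most.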

Orinoco GC numbers

Minor GC (Scavenger) pause: <1 ms typical
Concurrent marking pause reduction: ~50% on WebGL
Orinoco parallel scavenger shipped: V8 6.2 (2017)
Young gen heap size: 1–8 MB
Promotion threshold: survives 2 minor GCs
GC pause goal at 60fps: <16.6 ms per frame

Write barriers

For concurrent marking to stay correct while JS simultaneously mutates the heap, V8 uses write barriers: short code sequences emitted at pointer writes that could otherwise create a reference from a black (already-marked) object to a white (not-yet-visited) object behind the marker's back.

V8's marking barrier is Dijkstra-style (an insertion barrier): when the main thread stores a reference to a white object while marking is in progress, the barrier shades that newly written object grey and pushes it onto the marking worklist, so the marker will still visit it. This preserves the strong tri-color invariant, “no black object points to a white object”, even as the main thread races ahead. The write barrier costs roughly 3–5 cycles per write; V8 invests heavily in keeping it cheap.

Modern V8 uses tri-color marking (white/grey/black) with hybrid concurrent + incremental scheduling. Incremental marking splits the main thread's share of marking into small steps interleaved with JS execution, so marking does not show up as one long pause.
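The black-to-white hazard is easiest to see in a toy model. The sketch below interleaves a step-wise marker with a "mutator" write and uses an insertion-style barrier (shade the newly stored target grey), which is one way to close the gap. It is an illustration only: `GcObject`, `Marker`, `writeField`, and `runScenario` are hypothetical names, and real barriers are emitted by the compiler rather than called as functions.

```typescript
type Color = "white" | "grey" | "black";

interface GcObject {
  id: string;
  color: Color;
  fields: { [name: string]: GcObject | null };
}

function makeGcObject(id: string, fields: { [name: string]: GcObject | null } = {}): GcObject {
  return { id, color: "white", fields };
}

class Marker {
  private worklist: GcObject[] = [];

  constructor(roots: GcObject[]) {
    for (const r of roots) this.shade(r);
  }

  // White -> grey: the object is now queued for the marker.
  shade(obj: GcObject): void {
    if (obj.color === "white") {
      obj.color = "grey";
      this.worklist.push(obj);
    }
  }

  // Process up to `budget` grey objects; returns true once marking is finished.
  step(budget: number): boolean {
    while (budget-- > 0 && this.worklist.length > 0) {
      const obj = this.worklist.pop() as GcObject;
      for (const key of Object.keys(obj.fields)) {
        const target = obj.fields[key];
        if (target) this.shade(target);
      }
      obj.color = "black";
    }
    return this.worklist.length === 0;
  }
}

// A field store as the mutator would perform it while marking is active.
function writeField(marker: Marker, holder: GcObject, key: string,
                    value: GcObject | null, useBarrier: boolean): void {
  holder.fields[key] = value;
  if (useBarrier && value) marker.shade(value); // insertion barrier
}

// Returns the final color of `c`, which stays reachable from the root throughout.
function runScenario(useBarrier: boolean): Color {
  const c = makeGcObject("c");
  const a = makeGcObject("a", { c });
  const root = makeGcObject("root", { a });
  const marker = new Marker([root]);

  marker.step(1);                                  // root is black, a is grey
  writeField(marker, root, "c", c, useBarrier);    // black object now points at white c
  writeField(marker, a, "c", null, useBarrier);    // the only other path to c is removed
  while (!marker.step(1)) { /* keep marking */ }
  return c.color;
}

console.log("with barrier:", runScenario(true));     // prints "with barrier: black"
console.log("without barrier:", runScenario(false)); // prints "without barrier: white"
```

Without the barrier, `c` ends the cycle white and would be swept even though the root still references it; that is the corruption the barrier exists to prevent.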

Order the steps

Order the steps of a parallel minor GC (Scavenger) cycle:

1 Allocation triggers young-gen heap full
2 Main thread initiates Scavenger, worker threads join
3 Walk roots: stack, globals, registers — find live young objects
4 Copy live objects from from-space to to-space using work-stealing
5 Update pointers across heap to point to new locations
6 Objects that survived two GC cycles are promoted to old gen
7 From-space is declared empty, to-space becomes the new from-space

Trace it

A Node.js process consumes 4GB RAM after a week of uptime. Trace the GC pattern.

Step 1: 4GB after a week means slow leak, not crash. What V8 observable will show the trend?

Step 2: heap-after is steadily climbing 50MB/day. What kind of leak?

Step 3: how to find the leak source?

Step 4: leak found — an LRU cache without max size. Fix?

Step 5: prevent recurrence?
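For the fix asked about in step 4, one possible shape is a cache with a hard entry limit. This is a minimal sketch, not the lesson's reference answer: `BoundedLru` and `maxEntries` are hypothetical names, and it relies on the fact that a JavaScript `Map` iterates keys in insertion order.

```typescript
class BoundedLru<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    if (maxEntries < 1) throw new RangeError("maxEntries must be >= 1");
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}

const lru = new BoundedLru<string, number>(2);
lru.set("a", 1);
lru.set("b", 2);
lru.get("a");    // touch "a" so "b" becomes the eviction candidate
lru.set("c", 3); // exceeds the limit, evicts "b"
console.log(lru.size); // prints 2
```

Because evicted entries become unreachable, the old generation stops growing with uptime, and the heap-after-GC trend flattens.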

Quiz

Orinoco's scavenger reduces young-gen GC pause times via parallel work-stealing. Why is old-gen Mark-Compact still the harder problem?

Quiz

Why does concurrent marking need write barriers?

Why this works

Why is there a generational split at all? Most allocated objects die quickly — a temporary buffer, a promise, a React element per render. Collecting only the young generation (which is small) is much cheaper than scanning the whole heap. The old generation only needs collection infrequently. This “generational hypothesis” is empirically true for most programs and is why generational GCs became the industry standard in the 1990s.

Recall before you leave

01 Describe the minor GC (Scavenger) cycle in V8.
02 How does concurrent marking in Orinoco avoid corrupting the heap?
03 What is the most common cause of a slow Node.js memory leak, and how do you find it?

Recap

Orinoco is V8’s generational garbage collector. New objects land in the young generation (~1–8 MB); survivors of two minor GC cycles are promoted to old generation. Minor GC uses a copying scavenger that runs in parallel across worker threads — pauses are typically under 1ms. Major GC (old gen) uses mark-compact; concurrent marking moves the object-graph traversal to a background thread while JS runs, reducing main-thread pauses by ~50% on WebGL-heavy workloads. Write barriers at every property write keep the concurrent marker correct when the JS thread modifies references mid-cycle. Incremental marking further amortises work across JS execution. The fundamental tradeoff: 3–5 cycles per write (barrier overhead) in exchange for eliminating hundreds-of-millisecond stop-the-world pauses that were the norm before Orinoco. Now when you see periodic p99 spikes every N seconds in a Node service, your first question should be: is the old-generation GC cycle crossing a threshold? Check --trace-gc before reaching for a cache or a load balancer.
