
Performance

What makes a hot path: symptom vs cause

Crux: A wide flame-graph frame names where time accumulates, not why. The same leaf can hide five different bottlenecks, each demanding a different fix.

The profile is done. One flame-graph frame is wide. Two engineers want to switch template engines. A third engineer asks: “Wide from what? CPU work, allocations, lock contention, or a syscall?” Only one of those four has “switch template engines” as the right fix.

What a hot path is

A hot path is a sequence of calls the program spends most of its time in. The profile shows it as a stack of wide frames climbing from a leaf back to a top-level entry. The leaf names a function; the question is why that function is expensive.

Modern hardware turns the same “1 second of CPU” into very different problems depending on what the CPU was actually doing: executing instructions, waiting for memory, waiting for a lock, waiting for a syscall to return. The diagnosis decides which family of fix applies.

Applying the wrong fix to the right hotspot is the second most common waste in performance work — after optimising the wrong hotspot entirely (covered in the profile-first unit).

The waiting room metaphor

A doctor’s waiting room is full. That tells you the room is busy — not why. Are patients waiting for the doctor, the lab, paperwork, or parking? Each has a different fix: more doctors, faster lab, fewer forms, more parking.

A wide flame-graph frame is the same: the room is full; ask what people are waiting on inside.

| Wide frame shows | What it actually means | Where to look next |
| --- | --- | --- |
| High self-time in user function | Function does real CPU work | Inspect the algorithm or data layout |
| Wide GC frames near leaf | Caller allocates a lot | Switch to allocation profile |
| Wide in wall-clock, narrow in CPU | Function waits (lock or syscall) | Capture off-CPU or mutex profile |
| Interpreter frame where JIT should be | JIT deoptimised and fell back to baseline | Stabilise object shapes / types |
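The table can be read as a small decision procedure. The sketch below is one hedged way to write it down in Go. The `Observation` type, its fields, and the 0.5 and 0.3 thresholds are all assumptions made for illustration; no profiler exposes exactly this structure.

```go
package main

import "time"

// Observation is a hypothetical summary of one wide frame, gathered by hand
// from several profiles. It is not a real profiler API.
type Observation struct {
	Wall        time.Duration // wall-clock time attributed to the frame
	CPU         time.Duration // on-CPU time attributed to the frame
	GCShare     float64       // fraction (0..1) of the frame's CPU spent in GC frames beneath it
	Interpreted bool          // frame ran in the interpreter where JIT-compiled code was expected
}

// NextStep maps an observation to the "where to look next" column of the table.
// The thresholds are illustrative assumptions, not tuned values.
func NextStep(o Observation) string {
	switch {
	case o.Wall > 0 && float64(o.CPU)/float64(o.Wall) < 0.5:
		// Wide in wall-clock, narrow in CPU: the function is mostly waiting.
		return "capture off-CPU or mutex profile"
	case o.GCShare > 0.3:
		// GC work dominates: the caller allocates a lot.
		return "switch to allocation profile"
	case o.Interpreted:
		// The JIT gave up on this code.
		return "stabilise object shapes / types"
	default:
		// Real CPU work in the function itself.
		return "inspect the algorithm or data layout"
	}
}
```

The waiting check comes first on purpose: a frame that is mostly off-CPU makes the CPU-side signals (GC share, interpreter frames) much less informative.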

Bea and Sven: one frame, two readings

Bea finds processOrder at 35% CPU and wants to rewrite the loop. Sven looks closer: most of that 35% is in runtime.scanobject (Go garbage-collector work) called from inside the loop. The loop allocates heavily, so the fix is to allocate less, for example by reusing objects through sync.Pool, not to write a new algorithm.

The flame graph showed the symptom. The cause was one level deeper.
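A minimal sketch of the kind of change Sven's reading points to, assuming the loop builds a temporary buffer per order. The `Order` type and `renderOrder` function are hypothetical stand-ins; the source does not show what processOrder actually allocates.

```go
package main

import (
	"bytes"
	"sync"
)

// bufPool hands out reusable buffers so the loop does not allocate
// a fresh one for every order.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// Order is a hypothetical stand-in for whatever processOrder iterates over.
type Order struct {
	ID    string
	Items []string
}

// renderOrder formats one order as "ID,item,item,..." using a pooled buffer.
func renderOrder(o Order) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold a previous caller's data
	defer bufPool.Put(buf)

	buf.WriteString(o.ID)
	for _, it := range o.Items {
		buf.WriteByte(',')
		buf.WriteString(it)
	}
	// String copies the bytes out, so returning the buffer to the pool is safe.
	return buf.String()
}
```

Whether pooling helps depends on what the allocation profile shows. If the garbage comes from somewhere other than short-lived buffers, a pool will not move the GC frames.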

A scenario: regex on every request

A search endpoint shows regex.test as a wide leaf. Two engineers want to switch regex engines. A third looks at the parent: the regex is recompiled on every request because the pattern is built from a template string inside the handler. The fix is to compile once at startup and reuse the compiled regex. The leaf pointed at the right area; the bug was in how the caller used the regex, not in the regex engine.
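The scenario names regex.test, which suggests JavaScript. The sketch below shows the same caller-side bug and fix in Go, for consistency with the sync.Pool example. The pattern and function names are assumptions, since the source does not show the real ones.

```go
package main

import "regexp"

// skuPattern is a hypothetical pattern; the real one is not given in the scenario.
const skuPattern = `^[A-Z]{3}-\d{4}$`

// matchPerRequest is the slow shape: the pattern is recompiled on every call,
// so compilation cost shows up under the regex frames in the profile.
func matchPerRequest(q string) bool {
	re, err := regexp.Compile(skuPattern)
	if err != nil {
		return false
	}
	return re.MatchString(q)
}

// skuRe is compiled once at package initialisation.
// A *regexp.Regexp is safe for concurrent use by multiple goroutines.
var skuRe = regexp.MustCompile(skuPattern)

// matchCompiledOnce is the fixed shape: every request reuses skuRe.
func matchCompiledOnce(q string) bool {
	return skuRe.MatchString(q)
}
```

Both functions return the same answers. Only the cost per call differs, which is why the bug cannot be found by reading the regex frames alone.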

Why this works

The leaf is the dashboard warning light: it says “something is wrong here.” The fix may be inside the function (rewrite the algorithm), in the caller (don’t call so often), in a callee (real cost one level down), or in the surrounding context (allocate less, lock less, fewer syscalls). Senior engineers read the whole neighbourhood, not just the leaf.

Quiz

A flame graph shows a wide leaf frame. What is the FIRST question to ask?

Quiz

Why is 'wide frame = bottleneck' an incomplete reading of a flame graph?

Order the steps

Order the steps of attacking a hot path the senior way:

  1. Open the profile and find the widest leaf frame by self-time
  2. Read the parent chain: is the leaf called from one path or many?
  3. Classify the work: CPU instructions, allocation, cache miss, lock wait, syscall, or JIT deopt
  4. Form one hypothesis about the fix that matches the classification
  5. Apply ONLY that change in isolation
  6. Capture a new profile under the same load and diff against baseline
  7. Verify both the local hotspot shrank AND the headline metric improved
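
The last two steps amount to a two-sided check. Here is a hedged sketch, assuming each run is summarised by hand into a `Snapshot`; the type, its fields, and the verdict strings are hypothetical, and p99 latency is only one possible headline metric.

```go
package main

// Snapshot is a hypothetical summary of one profiling run under a fixed load.
type Snapshot struct {
	HotspotShare float64 // fraction (0..1) of samples in the target frame
	P99Millis    float64 // headline metric, here p99 latency in milliseconds
}

// Verdict compares a candidate run against the baseline.
// minGain is the smallest relative improvement in the headline metric
// that counts as real (for example 0.05 for 5%); choosing it is an assumption.
func Verdict(base, cand Snapshot, minGain float64) string {
	hotspotShrank := cand.HotspotShare < base.HotspotShare
	headlineBetter := cand.P99Millis <= base.P99Millis*(1-minGain)

	switch {
	case hotspotShrank && headlineBetter:
		return "keep"
	case hotspotShrank:
		// The frame got narrower but users see no gain: the cost moved elsewhere.
		return "hotspot moved: re-profile"
	case headlineBetter:
		// The metric improved but not through this frame: the hypothesis is unconfirmed.
		return "improved for another reason: re-check the hypothesis"
	default:
		return "revert"
	}
}
```

For Go profiles, the diff itself can be produced with `go tool pprof -diff_base=old.prof new.prof`; the function above only encodes the decision made after reading that diff.
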
Complete the analogy

Fill in the blank: a wide flame-graph frame names the _______; the cause may sit one level above (in the caller), one level below (in a callee), or in what the function is actually doing.

Recall before you leave
  1. In one paragraph: why is naming the hot function not enough, and what else do you need to read from the profile before you can fix it?
  2. Give two concrete examples where the fix is in the caller rather than in the wide leaf itself.
Recap

A hot path is the sequence of calls where the program spends most of its time. The flame graph’s wide leaf names the function, but the cause may be in the caller (too many calls), a callee (real cost one level down), or in what the function does (CPU work vs allocation vs waiting). The diagnosis question — which of the five shapes is this hotspot — must precede the fix choice. The next lessons cover each of the five shapes and their fix families.


Trademarks belong to their respective owners. Editorial reference only.