Amdahl's law and self-time: the ceiling on every speedup you can ship

Amdahl's law converts “should I optimise this?” into a number. Self-time vs cum-time tells you whether to fix a function or walk down to what it calls.

A team spends three weeks rewriting a hash function for a 10x local speedup. The deployed service is 1.05x faster. Amdahl’s law would have told them the answer before they started — if they had checked the profile first.

Amdahl’s law: the ceiling on every speedup

If a section of code takes fraction p of total execution time, and you make that section s times faster, the total application speedup is:

total_speedup = 1 / ((1 − p) + p/s)

The ceiling case — making the section infinitely fast — gives:

max_speedup = 1 / (1 − p)

A function that is 10% of total time can never give more than 1.11x total speedup, even if you replace it with a no-op. A function that is 80% of total time gives up to 5x if you eliminate it, and 1.67x if you halve its run time.

This law converts “should I optimise this?” into a quantitative question. Profile, see that function X is 12% of execution, and accept up-front that even a perfect rewrite caps your win at 13.6%. If the ceiling is below your bar, do not start. If function Y is 70% of execution, a 2x speedup of Y gives 1.54x total — worth the engineering time.

Fraction of total time (p)	Max total speedup (s → ∞)	Total speedup if section made 2x faster
10%	1.11x	1.05x
50%	2.00x	1.33x
80%	5.00x	1.67x
90%	10.0x	1.82x
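The rows above can be reproduced with a few lines of Python. This is a minimal sketch of the two formulas; the function names `amdahl_speedup` and `max_speedup` are illustrative choices, not part of any library.

```python
import math


def amdahl_speedup(p: float, s: float) -> float:
    """Total speedup when a fraction p of run time is made s times faster."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    remaining = (1.0 - p) + p / s  # new run time as a fraction of the old
    return math.inf if remaining == 0.0 else 1.0 / remaining


def max_speedup(p: float) -> float:
    """Ceiling: the section becomes infinitely fast (s grows without bound)."""
    return amdahl_speedup(p, math.inf)


# One line per table row: fraction, ceiling, speedup at s = 2.
for p in (0.10, 0.50, 0.80, 0.90):
    print(f"{p:.0%}\t{max_speedup(p):.2f}x\t{amdahl_speedup(p, 2):.2f}x")
```

Calling `max_speedup(0.12)` and `amdahl_speedup(0.70, 2)` gives the 13.6% ceiling and the 1.54x figure quoted in the text before the table.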

Wall-clock vs CPU time vs self-time

Before you can trust a profile number, you need to know which kind of time it is measuring — otherwise you will optimise the wrong thing even with the right tool in hand.

A profile reports several different time measurements, and confusing them is a common reading error.

Wall-clock time — what the user feels: how long the request took from start to end. A request at 500 ms wall-clock that only used 50 ms of CPU is 90% waiting (on disk, network, locks, GC).
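CPU time — how long the CPU spent actually executing the program's instructions. In a single-threaded request it is at most the wall-clock time; summed across several cores it can exceed it.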
Self-time — time spent in a function’s own code, not including time spent in functions it called.
Cumulative time (cum-time) — includes time in callees.
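The gap between wall-clock and CPU time is easy to observe directly. The sketch below uses Python's standard `time.perf_counter()` (wall-clock) and `time.process_time()` (CPU time of the current process); the helper `measure` and the workload `mostly_waiting` are hypothetical names, and the `sleep` call merely stands in for a disk or network wait.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) for one call of fn."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def mostly_waiting():
    time.sleep(0.05)                    # stand-in for I/O: wall-clock only
    sum(i * i for i in range(10_000))   # a little real computation


wall, cpu = measure(mostly_waiting)
print(f"wall={wall * 1000:.1f} ms  cpu={cpu * 1000:.1f} ms  "
      f"waiting={(1 - cpu / wall):.0%}")
```

Exact numbers vary by machine, but the CPU figure should be a small fraction of the wall-clock figure, which is the signature of a request that is mostly waiting.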

Reading a profile: if cum-time is large but self-time is small, the function is a routing/dispatch layer — the slow work is in something it calls; walk down. If self-time is large, the function is itself doing the work; that is where the fix lives.

Confusing the two leads to optimising a wrapper while the real cost sits in a callee — or vice versa.
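As one concrete illustration, Python's built-in `cProfile` reports self-time as `tottime` and cum-time as `cumtime`. The sketch below assumes Python 3.9 or later (for `pstats.Stats.get_stats_profile()`); `wrapper` and `busy_work` are hypothetical functions invented for the demonstration.

```python
import cProfile
import pstats


def busy_work(n):
    """Does the real computation: large self-time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def wrapper(n):
    """Only dispatches: tiny self-time, large cum-time."""
    return busy_work(n)


def profile_wrapper(n=1_000_000):
    profiler = cProfile.Profile()
    profiler.enable()
    wrapper(n)
    profiler.disable()
    # func_profiles maps function name -> FunctionProfile (tottime, cumtime, ...)
    profiles = pstats.Stats(profiler).get_stats_profile().func_profiles
    return profiles["wrapper"], profiles["busy_work"]


w, b = profile_wrapper()
print(f"wrapper:   self={w.tottime:.3f}s  cum={w.cumtime:.3f}s")
print(f"busy_work: self={b.tottime:.3f}s  cum={b.cumtime:.3f}s")
```

In this setup `wrapper` shows a cum-time at least as large as `busy_work` while its self-time stays near zero, so the reading is "walk down", not "fix the wrapper".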

Why this works

“Optimising 10% of run time to zero” is the classic example of an Amdahl trap. Even eliminating that code entirely gives only 1.11x, which is often too small a win to justify a risky or breaking change. The profile identifies which fractions are large enough to be worth attacking. Without it, you risk spending a week on 1.05x wins.

Amdahl's law and profile-first economics

Amdahl's law max speedup: 1 / (1 − p) where p = fraction of total time being sped up
Making a section that takes 10% of run time free: 1.11x total max
Making a section that takes 50% of run time free: 2.0x total max
Making a section that takes 90% of run time free: 10.0x total max
Typical sampling profiler overhead: 0.5-5% CPU

Apply Amdahl's law to decide which optimisation is worth doing

Weigh fraction of total time before magnitude of local speedup: 2x on 50% (1.33x) beats 4x on 30% (1.29x).
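The comparison above generalises to ranking any list of candidate optimisations by their total effect. In this sketch the first two candidates come from the sentence above; the third ("10x on 5%") is a made-up entry added for contrast, and the tuple layout is an arbitrary choice.

```python
def amdahl_speedup(p, s):
    """Total speedup when a fraction p of run time is made s times faster."""
    return 1.0 / ((1.0 - p) + p / s)


# (label, fraction of total time p, local speedup s)
candidates = [
    ("4x on 30%", 0.30, 4.0),
    ("2x on 50%", 0.50, 2.0),
    ("10x on 5%", 0.05, 10.0),  # hypothetical: big local win, tiny fraction
]

# Rank by total speedup, not by the size of the local speedup.
ranked = sorted(candidates,
                key=lambda c: amdahl_speedup(c[1], c[2]),
                reverse=True)

for label, p, s in ranked:
    print(f"{label}: {amdahl_speedup(p, s):.2f}x total")
```

The largest local speedup ends up last in the ranking because it applies to the smallest fraction of run time.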

Quiz

A profile shows function X has CUM-time 80% but SELF-time 2%. What does this mean and where do you look for the fix?

Quiz

A microbenchmark says a new hash function is 10x faster than the old one. When integrated, the application is only 1.05x faster. Most likely explanation?

handler()  cum-time 100% (all attributed), self-time 2%: fix NOT here
└ parseBody() (callee) 18%
└ query() (callee) 80%: walk down to here

cum-time = self-time + Σ callees. Tiny self-time but huge cum-time means the fix lives in a callee, not the handler.
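The identity and the "walk down" rule can be sketched on the call tree from the diagram. The `Frame` class and `walk_down` helper are hypothetical, and the sketch assumes `parseBody()` and `query()` are leaves whose listed percentages are entirely self-time.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    name: str
    self_pct: float                      # time in the function's own code
    callees: list = field(default_factory=list)

    def cum_pct(self) -> float:
        """cum-time = self-time + sum of the callees' cum-time."""
        return self.self_pct + sum(c.cum_pct() for c in self.callees)


def walk_down(frame: Frame) -> Frame:
    """Follow the hottest callee until the frame's own self-time dominates."""
    while frame.callees:
        hottest = max(frame.callees, key=lambda c: c.cum_pct())
        if frame.self_pct >= hottest.cum_pct():
            break
        frame = hottest
    return frame


handler = Frame("handler", 2.0, [
    Frame("parseBody", 18.0),
    Frame("query", 80.0),
])

print(handler.cum_pct())        # 2 + 18 + 80
print(walk_down(handler).name)  # the frame where the fix most likely lives
```

A real profiler's call tree is deeper and the same function can appear under several callers, so treat this as a model of the reading procedure and not as a profiler replacement.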

Recall before you leave

01. Walk through Amdahl's law with an example: function X takes 30% of execution time. You make X 5x faster. What is the total application speedup and what fraction of execution does X now consume?
02. What is the difference between self-time and cum-time, and what action does each reading suggest?

Recap

Amdahl’s law gives the total speedup, total_speedup = 1 / ((1 − p) + p/s), and with it a ceiling of 1 / (1 − p) as s grows without bound. The profile provides p, the fraction of total time the target function occupies. Without that number, you cannot decide whether an optimisation is worth engineering time. A function at 10% of total time can never deliver more than 1.11x speedup even if made free; one at 80% can deliver up to 5x. Self-time vs cum-time resolves where to look: large self-time means fix the function directly; large cum-time with tiny self-time means walk down to the callee where the real cost sits. When you read profiler output, check p first: if the ceiling is below your target, move on to the next hotspot.
