
Performance

Amdahl’s law and self-time: the ceiling on every speedup you can ship

Crux: Amdahl’s law converts “should I optimise this?” into a number. Self-time vs cum-time tells you whether to fix a function or walk down to what it calls.

A team spends three weeks rewriting a hash function for a 10x local speedup. The deployed service is 1.05x faster. Amdahl’s law would have told them the answer before they started — if they had checked the profile first.

Amdahl’s law: the ceiling on every speedup

If a section of code takes fraction p of total execution time, and you make that section s times faster, the total application speedup is:

total_speedup = 1 / ((1 − p) + p/s)

The ceiling case — making the section infinitely fast — gives:

max_speedup = 1 / (1 − p)
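Both formulas follow from writing out the new run time. A short derivation, using T for the original total run time (a symbol introduced here for illustration, not taken from the text above):

```latex
\begin{align*}
T_{\text{new}} &= (1 - p)\,T + \frac{p\,T}{s} \\
\text{total\_speedup} &= \frac{T}{T_{\text{new}}} = \frac{1}{(1 - p) + p/s} \\
\text{max\_speedup} &= \lim_{s \to \infty} \frac{1}{(1 - p) + p/s} = \frac{1}{1 - p}
\end{align*}
```

The untouched part (1 − p) never shrinks, which is why it alone sets the ceiling.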

A function that is 10% of total time can never give more than 1.11x total speedup, even if you replace it with a no-op. A function that is 80% of total time has a 5x ceiling, and gives 1.67x if you halve its run time.
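The two formulas are small enough to keep as helper functions. The sketch below is one possible way to write them in Python; the function names `amdahl_speedup` and `max_speedup` are illustrative choices, not part of any library.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Total speedup when a fraction p of run time is made s times faster."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


def max_speedup(p: float) -> float:
    """Ceiling on total speedup: the section is made to cost nothing."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    # Print the ceiling and the "2x faster" case for a few fractions.
    for p in (0.10, 0.50, 0.80, 0.90):
        ceiling = max_speedup(p)
        doubled = amdahl_speedup(p, 2)
        print(f"p={p:.0%}  ceiling={ceiling:.2f}x  2x faster={doubled:.2f}x")
```

Running it for a handful of fractions is a quick way to check any speedup claim before committing engineering time.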

This law converts “should I optimise this?” into a quantitative question. Profile, see that function X is 12% of execution, and accept up-front that even a perfect rewrite caps your win at 13.6%. If the ceiling is below your bar, do not start. If function Y is 70% of execution, a 2x speedup of Y gives 1.54x total — worth the engineering time.
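The go/no-go check described above can be sketched as code. The helper names `worth_starting` and `required_local_speedup`, and the idea of expressing the team's bar as a target total speedup, are assumptions made for illustration. The second function rearranges Amdahl's formula to ask how much faster the section itself must become to hit the target.

```python
import math


def max_speedup(p: float) -> float:
    """Ceiling on total speedup for a section taking fraction p of run time."""
    return 1.0 / (1.0 - p)


def worth_starting(p: float, bar: float) -> bool:
    """True only if even a perfect rewrite could beat the bar."""
    return max_speedup(p) > bar


def required_local_speedup(p: float, target: float) -> float:
    """Local speedup s needed on fraction p to reach `target` total speedup.

    Solves target = 1 / ((1 - p) + p / s) for s.
    Returns math.inf when the target lies beyond the ceiling.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if target <= 1.0:
        return 1.0  # no speedup needed
    denom = 1.0 / target - (1.0 - p)
    if denom <= 0.0:
        return math.inf  # unreachable, whatever you do to this section
    return p / denom


if __name__ == "__main__":
    print(worth_starting(0.12, 1.20))          # ceiling is about 1.136x
    print(required_local_speedup(0.70, 1.50))  # local speedup needed on Y
```

An infinite result is the quantitative form of "do not start": no amount of work on that section reaches the target.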

Fraction of total time (p) | Max total speedup (s → ∞) | Total speedup if made 2x faster
10%                        | 1.11x                     | 1.05x
50%                        | 2.00x                     | 1.33x
80%                        | 5.00x                     | 1.67x
90%                        | 10.0x                     | 1.82x

Wall-clock vs CPU time vs self-time

A profile reports several different time measurements and confusing them is the most common reading error.

  • Wall-clock time: what the user feels, that is, how long the request took from start to end. A request at 500 ms wall-clock that only used 50 ms of CPU is 90% waiting (on disk, network, locks, GC).
  • CPU time: how long the CPU was actually executing your code. On a single thread it is at most the wall-clock time; summed across threads or cores it can exceed it.
  • Self-time: time spent in a function’s own code, not including time spent in functions it called.
  • Cumulative time (cum-time): self-time plus the time spent in everything the function called.
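The gap between wall-clock and CPU time is easy to observe directly. The sketch below uses Python's standard `time.perf_counter` (wall-clock) and `time.process_time` (CPU time of the current process, which does not advance during sleep). The helper `measure` and the workload `mostly_waiting` are made up for illustration, with `time.sleep` standing in for a disk or network wait.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) for one call of fn."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def mostly_waiting():
    time.sleep(0.2)                    # stands in for disk, network or lock waits
    sum(i * i for i in range(50_000))  # a small amount of real computation


if __name__ == "__main__":
    wall, cpu = measure(mostly_waiting)
    waiting_share = 1.0 - cpu / wall
    print(f"wall={wall * 1000:.0f} ms  cpu={cpu * 1000:.0f} ms  waiting={waiting_share:.0%}")
```

When the waiting share is high, a CPU profile covers only a small slice of what the user experiences, so speeding up the computation has a low Amdahl ceiling. Note that `time.process_time` covers every thread in the process, so in a multi-threaded program the CPU figure can exceed the wall-clock one.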

Reading a profile: if cum-time is large but self-time is small, the function is a routing/dispatch layer — the slow work is in something it calls; walk down. If self-time is large, the function is itself doing the work; that is where the fix lives.

Confusing the two leads to optimising a wrapper while the real cost sits in a callee — or vice versa.
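A minimal way to see both numbers side by side is Python's built-in `cProfile`, which reports self-time as `tottime` and cum-time as `cumtime`. In the sketch below, `wrapper` and `heavy` are toy functions invented for illustration, and `profile_self_and_cum` is a hypothetical helper that reads the raw entries from `pstats.Stats`.

```python
import cProfile
import pstats


def heavy():
    """Does the actual work: large self-time."""
    total = 0
    for i in range(300_000):
        total += i * i
    return total


def wrapper():
    """Pure dispatch layer: tiny self-time, large cum-time."""
    return heavy()


def profile_self_and_cum(fn):
    """Profile one call of fn; return {function_name: (self_time, cum_time)}."""
    profiler = cProfile.Profile()
    profiler.enable()
    fn()
    profiler.disable()
    stats = pstats.Stats(profiler)
    result = {}
    # Each entry maps (file, line, name) -> (primitive calls, calls, tottime, cumtime, callers).
    for (_file, _line, name), (_cc, _nc, tottime, cumtime, _callers) in stats.stats.items():
        result[name] = (tottime, cumtime)
    return result


if __name__ == "__main__":
    times = profile_self_and_cum(wrapper)
    for name in ("wrapper", "heavy"):
        self_time, cum_time = times[name]
        print(f"{name:8s} self={self_time:.4f}s  cum={cum_time:.4f}s")
```

`wrapper` shows up with a cum-time at least as large as that of `heavy` but almost no self-time, which is the signal to walk down to the callee instead of rewriting the wrapper.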

Why this works

“Optimising 10% of code to zero” is the classic example of an Amdahl trap. Even eliminating that code entirely gives only 1.11x — less than most teams’ minimum threshold for a breaking change. The profile identifies which fractions are large enough to be worth attacking. Without it, you risk spending a week on 1.05x wins.
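The same formula can be run backwards as a sanity check after the fact: given a measured local speedup and the observed total speedup, it yields the fraction of run time the optimised section must have occupied. The function name `implied_fraction` is an illustrative choice, and the rearrangement assumes the rest of the program was unchanged.

```python
def implied_fraction(total_speedup: float, local_speedup: float) -> float:
    """Fraction p implied by an observed total speedup and a local speedup.

    Rearranges 1 / total = (1 - p) + p / s into
    p = (1 - 1 / total) / (1 - 1 / s).
    """
    if local_speedup <= 1.0:
        raise ValueError("local_speedup must be greater than 1")
    if not 1.0 <= total_speedup <= local_speedup:
        raise ValueError("total_speedup must lie between 1 and local_speedup")
    return (1.0 - 1.0 / total_speedup) / (1.0 - 1.0 / local_speedup)


if __name__ == "__main__":
    # A 10x local win that shows up as 1.05x overall.
    p = implied_fraction(total_speedup=1.05, local_speedup=10.0)
    print(f"the optimised section was about {p:.1%} of total time")
```

For a 10x local speedup that lands as 1.05x overall, the implied fraction is a little over 5%, a number a profile would have shown before any work started.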

Amdahl's law and profile-first economics
Amdahl's law max speedup: 1 / (1 − p), where p = fraction of total time sped up
Making a section at 10% of run time free: 1.11x total max
Making a section at 50% of run time free: 2.0x total max
Making a section at 90% of run time free: 10.0x total max
Typical sampling profiler overhead: 0.5-5% CPU

Apply Amdahl's law to decide which optimisation is worth doing

Quiz

A profile shows function X has CUM-time 80% but SELF-time 2%. What does this mean and where do you look for the fix?

Quiz

A microbenchmark says a new hash function is 10x faster than the old one. When integrated, the application is only 1.05x faster. Most likely explanation?

Recall before you leave
  1. Walk through Amdahl's law with an example: function X takes 30% of execution time. You make X 5x faster. What is the total application speedup and what fraction of execution does X now consume?
  2. What is the difference between self-time and cum-time, and what action does each reading suggest?
Recap

Amdahl’s law gives the total speedup, total_speedup = 1 / ((1 − p) + p/s), and its ceiling, 1 / (1 − p). The profile provides p, the fraction of total time the target function occupies. Without that number, you cannot decide whether an optimisation is worth engineering time. A function at 10% of total time can never deliver more than 1.11x speedup even if made free; one at 80% can deliver up to 5x. Self-time vs cum-time resolves where to look: large self-time means fix the function directly; large cum-time with tiny self-time means walk down to the callee where the real cost sits.
