awesome-everything RU
↑ Back to the climb

Performance

Cache vs big-O: multiple-choice review

Crux Multiple-choice synthesis across the cache-vs-big-O unit — memory hierarchy, cache lines, false sharing, branch prediction, data layout, and reading a perf profile.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at senior altitude — in orbit
◷ 13 min

Six questions that cut across the whole unit. None is a definition to recite — each is the judgement call you make when a “fast” algorithm loses to a “slow” one and big-O has nothing to say about it.

Goal

Confirm you can connect the memory hierarchy, cache-line behaviour, false sharing, branch prediction, and data layout into one diagnostic instinct: read the profile, name the constant-factor cause, pick the matching fix.

Quiz

A linked list of 1M nodes iterates ~17x slower than an array of 1M ints, though both are O(N). Which statement best explains the gap?

Quiz

A team flips a matrix-multiply inner loop from row-major to column-major order on a 10 000×10 000 double matrix and sees a ~9x slowdown. The arithmetic is identical. Root cause?

Quiz

A counter array is made lock-free with atomics and gets slower than the locked version as threads scale. perf shows IPC 0.4 and a 72% cache-miss rate. Diagnosis?

Quiz

Sorting an array before summing the elements above a threshold yields a 2–3x speedup, despite the extra O(N log N) sort. Why does it pay back?

Quiz

A hot ML loop runs 5x slower than a reference C++ port; perf shows IPC 0.8 and 25% L3 miss rate. Someone proposes adding SIMD intrinsics. What is the correct first move?

Quiz

perf stat on a hot loop reports IPC 0.40, branch-miss 25%, and LLC-miss 47%. You can fix one thing first. Which, and why?

Recap

The unit’s through-line is one decision tree: big-O sizes how cost grows with N, but the memory hierarchy sets the constant — and that constant swings ~100x with layout. Contiguous beats pointer-chasing; loop order must match storage order; per-thread writes need their own cache line; unpredictable branches want sorting or branchless code; one-field-at-a-time work wants SoA. When a profile is in front of you, read miss rates and IPC first, name the constant-factor cause, and apply the matching fix — measure, never guess.

Continue the climb ↑Cache vs big-O: free-recall review
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources3
expand
  1. 01
  2. 02
  3. 03

Trademarks belong to their respective owners. Editorial reference only.