Performance
Cache vs big-O: free-recall review
Retrieval beats re-reading. For each prompt, say or write a full answer from memory before you open the model answer — the effort of recall is what fixes the mental model, not the recognition of a right-looking option.
Reconstruct the unit’s spine without looking back: why allocation pattern beats operation count, how the cache line creates false sharing, when sorting beats branchless code, why SoA enables SIMD, and the perf-stat loop that ties it together.
- 01Why can an O(N) scan of a contiguous array beat an O(log N) tree traversal in production, and where does that stop being true?
- 02Explain false sharing: how the MESI protocol turns independent per-thread writes into a performance bug, and how you detect and fix it.
- 03When does sorting (O(N log N)) beat a branchless rewrite for a data-dependent branch, and when is branchless the better tool?
- 04Why does SoA enable SIMD where AoS forces gather/scatter, and what is the trade-off?
- 05Beyond cache hit rate, what two constraints limit a memory-bound loop, and how do you tell them apart from cache misses?
- 06State the performance-engineering loop for this unit and which metric points to which class of fix.
If you reconstructed each answer from memory, you hold the unit’s spine: the memory hierarchy sets a ~100x constant that big-O ignores; cache lines are 64-byte ownership units that turn neighbouring writes into false sharing; sorting and branchless are two tools for the same unpredictable-branch problem; SoA unlocks SIMD and bandwidth that AoS blocks; and the perf-stat loop maps each metric to its matching fix. Measure first, name the constant-factor cause, fix layout or branches, re-measure.