Performance capstone: reading profiles, traces, and stats

Read real artifacts from across the track — a pprof profile, a hot-path snippet, an N+1 query log, and a bundle report — predict the dominant cost, and pick the highest-leverage fix.

PERF Senior ◷ 14 min

Every unit in the track leaves a different artifact behind: a profile, a hot path, a query log, a bundle report. The senior skill is reading the artifact, naming the dominant cost, and reaching for the right layer — not the most familiar one.

Goal

Practise the cross-track loop on real evidence: read the artifact, locate the cost that dominates, and choose the fix at the layer where that cost lives — before touching any knob.

Artifact 1 — the CPU profile

(pprof) top5 -cum
      flat  flat%   sum%        cum   cum%
         0     0%     0%      8.10s 92.0%  net/http.(*conn).serve
     0.02s  0.2%   0.2%      7.40s 84.1%  app.(*Handler).Search
     0.05s  0.6%   0.8%      6.90s 78.4%  app.scoreAll
     6.10s 69.3%  70.1%      6.40s 72.7%  app.cosineSim   <-- flat
     0.30s  3.4%  73.5%      0.55s  6.2%  encoding/json.Marshal

Quiz

Reading this top5 -cum output, where is the time actually spent, and what does Amdahl's law say about optimising json.Marshal?
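To put numbers on the Amdahl question, the sketch below plugs two figures from Artifact 1 into the standard formula: the 6.2% cumulative share of encoding/json.Marshal and the 69.3% flat share of app.cosineSim. The function names amdahl and amdahlLimit are illustrative, and the 2x speedup for cosineSim is an assumed figure chosen only for comparison.

```go
package main

import "fmt"

// amdahl returns the overall speedup when a fraction p of total time
// is made s times faster: 1 / ((1 - p) + p/s).
func amdahl(p, s float64) float64 {
	return 1 / ((1 - p) + p/s)
}

// amdahlLimit is the ceiling as s grows without bound: 1 / (1 - p).
func amdahlLimit(p float64) float64 {
	return 1 / (1 - p)
}

func main() {
	const jsonShare = 0.062   // encoding/json.Marshal, cum% from Artifact 1
	const cosineShare = 0.693 // app.cosineSim, flat% from Artifact 1

	// Best case for json.Marshal: its cost drops to zero.
	fmt.Printf("json.Marshal removed entirely: at most %.2fx\n", amdahlLimit(jsonShare))

	// Assumed, modest improvement to the real hotspot.
	fmt.Printf("cosineSim made 2x faster: %.2fx\n", amdahl(cosineShare, 2))
}
```

Under these assumptions, deleting json.Marshal outright caps the overall gain at roughly 1.07x, while halving cosineSim's own time gives about 1.53x.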

Artifact 2 — the hot path

// app.scoreAll — called once per search request, ~10k candidates
func scoreAll(q []float32, candidates []Doc) []Result {
    var results []Result                    // nil slice
    for _, d := range candidates {
        v := append([]float32{}, d.Vector...) // fresh copy per candidate
        results = append(results, Result{d.ID, cosineSim(q, v)})
    }
    return results
}

Quiz

Given Artifact 1 pointed here, what is the highest-leverage fix in this loop, and which is a distraction?
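As one possible shape for the rewritten loop, the sketch below drops the per-candidate copy and sizes the result slice once. It assumes cosineSim only reads its inputs, so passing d.Vector directly is safe. The Doc and Result field types and the cosineSim body are stand-ins, since the lesson does not show them.

```go
package main

import (
	"fmt"
	"math"
)

// Doc and Result are assumed shapes; Artifact 2 shows only ID and Vector.
type Doc struct {
	ID     int
	Vector []float32
}

type Result struct {
	ID    int
	Score float32
}

// cosineSim is a stand-in body. It assumes a and b have equal length
// and does not modify either slice.
func cosineSim(a, b []float32) float32 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return float32(dot / (math.Sqrt(na) * math.Sqrt(nb)))
}

func scoreAll(q []float32, candidates []Doc) []Result {
	// One allocation, sized up front, instead of repeated growth from nil.
	results := make([]Result, 0, len(candidates))
	for _, d := range candidates {
		// No defensive copy: the vector is only read.
		results = append(results, Result{ID: d.ID, Score: cosineSim(q, d.Vector)})
	}
	return results
}

func main() {
	docs := []Doc{
		{ID: 1, Vector: []float32{1, 0}},
		{ID: 2, Vector: []float32{0, 1}},
	}
	fmt.Println(scoreAll([]float32{1, 0}, docs))
}
```

Whether this moves the needle depends on how much of the time under scoreAll is allocation rather than arithmetic inside cosineSim, so it is worth re-profiling after the change rather than assuming the gain.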

Artifact 3 — the query log

SELECT id, total FROM orders WHERE user_id = $1   -- 1 result set, 50 rows
SELECT name FROM customers WHERE id = $1   -- params: 11
SELECT name FROM customers WHERE id = $1   -- params: 12
SELECT name FROM customers WHERE id = $1   -- params: 13
... (47 more identical-shape queries) ...
-- total: 51 queries, 1 request

Quiz

This log is one request. Name the pattern and the fix that removes it at the source.
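One way to collapse the 50 per-row lookups into a single round-trip is to collect the customer IDs first and fetch them in one statement. The helper below is a hypothetical sketch: batchCustomerQuery is not from the lesson, it assumes the caller already holds the IDs seen as params in the log (11, 12, 13, ...), and it only builds the SQL text and arguments rather than talking to a database.

```go
package main

import (
	"fmt"
	"strings"
)

// batchCustomerQuery builds one query covering all distinct customer IDs,
// in place of one query per ID. Table and column names follow Artifact 3;
// the helper itself is hypothetical.
func batchCustomerQuery(ids []int64) (string, []any) {
	seen := make(map[int64]bool, len(ids))
	placeholders := make([]string, 0, len(ids))
	args := make([]any, 0, len(ids))
	for _, id := range ids {
		if seen[id] {
			continue // the same customer may appear on several orders
		}
		seen[id] = true
		args = append(args, id)
		placeholders = append(placeholders, fmt.Sprintf("$%d", len(args)))
	}
	if len(args) == 0 {
		return "", nil // nothing to fetch; the caller should skip the query
	}
	query := "SELECT id, name FROM customers WHERE id IN (" +
		strings.Join(placeholders, ", ") + ")"
	return query, args
}

func main() {
	q, args := batchCustomerQuery([]int64{11, 12, 13, 11})
	fmt.Println(q)
	fmt.Println(args)
}
```

With this shape the request issues 2 queries instead of 51. A JOIN between orders and customers in the first query is another option that reaches 1 query; which is preferable depends on the schema and the data-access layer in use.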

Artifact 4 — the bundle report

Route /product/[id]   first-load JS
  framework chunk ..............  48.2 KB
  page chunk ...................  19.4 KB
  moment + moment-tz ...........  71.9 KB   <-- date formatting
  lodash (full) ................  72.3 KB   <-- used: groupBy, debounce
  ------------------------------------------
  total (gzip) ................. 211.8 KB    budget: 130 KB  ❌ OVER by 81.8 KB

Quiz

This route is 82 KB over budget. What is the cost these bytes impose, and what is the highest-leverage trim?
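Before choosing a trim, it can help to check what each candidate removal does against the budget. The sketch below uses only the sizes in Artifact 4 and treats each library as dropped entirely, which is a best-case bound: any replacement (for example per-function imports instead of full lodash, or built-in date formatting instead of moment) adds some bytes back. The chunk type and overBudget helper are illustrative names.

```go
package main

import "fmt"

// chunk sizes are in tenths of a KB (gzip) so the arithmetic stays exact.
type chunk struct {
	name string
	size int
}

// overBudget returns how far the remaining chunks exceed budget after
// dropping the named ones entirely. A negative result means headroom.
func overBudget(chunks []chunk, budget int, drop ...string) int {
	dropped := make(map[string]bool, len(drop))
	for _, n := range drop {
		dropped[n] = true
	}
	total := 0
	for _, c := range chunks {
		if !dropped[c.name] {
			total += c.size
		}
	}
	return total - budget
}

func main() {
	// Figures copied from the bundle report for /product/[id].
	chunks := []chunk{
		{"framework", 482},
		{"page", 194},
		{"moment", 719},
		{"lodash", 723},
	}
	const budget = 1300

	scenarios := [][]string{nil, {"moment"}, {"lodash"}, {"moment", "lodash"}}
	for _, drop := range scenarios {
		over := overBudget(chunks, budget, drop...)
		fmt.Printf("drop %v: %+.1f KB vs budget\n", drop, float64(over)/10)
	}
}
```

On these numbers, removing either library alone still leaves the route over budget (by 9.9 KB without moment, 9.5 KB without lodash), so both need trimming. Removing both leaves 62.4 KB of headroom for whatever replaces them.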

Recap

Four artifacts, one discipline. A top -cum profile distinguishes the root frame (huge cumulative, no leverage) from the real hotspot (high flat); the hot-path snippet shows the allocation waste the profile pointed to; the query log exposes N+1 by its repeated single-row shape; the bundle report turns bytes into download and device-CPU cost. In each case the fix lives at the cost's own layer: eliminate the allocation, batch the round-trips, trim the bytes. Turning a knob one layer away from where the evidence points is not a fix. The next time you open a profile, a query log, or a bundle report, you know which column to read first and which layer to reach for before you touch a single line of code.
