Performance
GC: tame a death-spiral
Reading about death-spirals is not the same as pulling a service out of one. Build a small allocation-heavy server, drive it into GC trouble, and apply the unit’s fix ladder until the numbers come back — with evidence at every step.
Turn the unit’s mental model into a reproducible engineering loop: instrument allocation and GC, diagnose the hotspot from a profile, reduce allocations, defend the memory bound, and verify the fix with before/after metrics.
Take a deliberately allocation-heavy HTTP service (your own or the starter below) and, without switching collectors, bring its GC CPU share under 5% and its p99 request latency under your target, proving each step with measurements.
- A before/after table of alloc rate, GC CPU %, p99 GC pause, and p99 request latency, all measured under the same load, not estimated.
- The allocation profile clearly shows the top hotspots shrinking after the fix (re-profiled, not assumed).
- GC CPU share holds under ~5%, and the death-spiral signature is gone from the GODEBUG=gctrace=1 output at sustained load.
- A one-paragraph write-up naming the lever used for each hotspot and why it ranked above tuning the collector.
- Add a one-page on-call runbook: quick triage from the four panels, common allocation causes for your runtime, the fix-priority ladder, and a verification checklist.
- Add an allocation-driven DoS guard — request-body size limit and result-size cap — and show the service stays bounded under an oversized-payload flood.
- Add a CI gate that load-tests a canary, diffs the allocation profile against main, and fails the build if any function's allocation share grows more than 20%.
- Repeat the experiment on a second runtime (e.g. add a JVM or Node version) and compare how the same allocation pattern manifests under a different collector.
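For the allocation-driven DoS guard among the extensions above, one possible shape in Go is sketched here. The limits (1 MiB body, 1000 results), the query type, and the handler name handleSearch are assumptions made for illustration. http.MaxBytesReader and http.MaxBytesError are standard library (the latter since Go 1.19).

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const (
	maxBodyBytes = 1 << 20 // hypothetical request-body cap: 1 MiB
	maxResults   = 1000    // hypothetical result-size cap
)

// query is an illustrative request shape; the real service's will differ.
type query struct {
	Limit int `json:"limit"`
}

// clampLimit bounds the caller-requested result count so a single request
// cannot force an arbitrarily large allocation.
func clampLimit(n int) int {
	if n <= 0 || n > maxResults {
		return maxResults
	}
	return n
}

func handleSearch(w http.ResponseWriter, r *http.Request) {
	// Reads past maxBodyBytes fail instead of buffering the whole payload.
	r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)

	var q query
	if err := json.NewDecoder(r.Body).Decode(&q); err != nil {
		var tooBig *http.MaxBytesError
		if errors.As(err, &tooBig) {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}

	results := make([]int, 0, clampLimit(q.Limit)) // allocation bounded by the cap
	for i := 0; i < cap(results); i++ {
		results = append(results, i)
	}
	json.NewEncoder(w).Encode(results)
}

func main() {
	// Simulate one oversized payload in-process and print the response status.
	body := `{"limit": 5, "pad": "` + strings.Repeat("a", 2*maxBodyBytes) + `"}`
	req := httptest.NewRequest(http.MethodPost, "/search", strings.NewReader(body))
	rec := httptest.NewRecorder()
	handleSearch(rec, req)
	fmt.Println("oversized body status:", rec.Code)
}
```

To show the service stays bounded, run the oversized-payload flood against the real server while watching heap size and alloc rate. The in-process check in main only confirms that the guard rejects a single large request.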
This is the loop you will run in every real GC incident: instrument first, diagnose from a profile, fix at the top of the ladder (eliminate allocations first, then pool, then tune the collector, and only then switch collectors), defend the memory bound with GOMEMLIMIT, and verify with before/after numbers under identical load. Doing it once on a toy service makes the production version muscle memory.
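As a rough illustration of the top two rungs and the memory bound, here is a small Go sketch. The functions formatNaive, formatAppend, and renderPooled are hypothetical examples, not code from the unit, and the 512 MiB figure is an arbitrary placeholder. debug.SetMemoryLimit (Go 1.19+) is the programmatic counterpart of the GOMEMLIMIT environment variable.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/debug"
	"strconv"
	"sync"
)

// formatNaive allocates on every call: Sprintf builds a fresh string each time.
func formatNaive(id int, name string) string {
	return fmt.Sprintf("%d:%s", id, name)
}

// formatAppend is the "eliminate" rung: it appends into a caller-owned
// buffer, so a hot loop can reuse one backing array across calls.
func formatAppend(dst []byte, id int, name string) []byte {
	dst = strconv.AppendInt(dst, int64(id), 10)
	dst = append(dst, ':')
	return append(dst, name...)
}

// bufPool is the "pool" rung: used when the allocation cannot be removed
// outright but the scratch buffer can be recycled between requests.
var bufPool = sync.Pool{New: func() any { return new(bytes.Buffer) }}

func renderPooled(id int, name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)
	buf.WriteString(strconv.Itoa(id))
	buf.WriteByte(':')
	buf.WriteString(name)
	return buf.String() // String copies, so the buffer is safe to return to the pool
}

func main() {
	// Soft memory limit; 512 MiB is a placeholder, size it to your container.
	prev := debug.SetMemoryLimit(512 << 20)
	fmt.Println("previous memory limit:", prev)

	scratch := make([]byte, 0, 64)
	scratch = formatAppend(scratch[:0], 7, "gc")
	fmt.Println(formatNaive(7, "gc"), string(scratch), renderPooled(7, "gc"))
}
```

Whether a given hotspot belongs on the eliminate rung or the pool rung is something the allocation profile should decide. Re-profile after each change rather than assuming the rewrite helped.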