Observability
RED and USE: free-recall review
Retrieval beats re-reading. For each prompt, say or write a full answer from memory before you open the model answer. The effort of recall, not a second look at the material, is what fixes the unit in your head.
Reconstruct the unit’s spine without looking back: why RED and USE are a pair, what saturation really measures, why PSI replaced load average, where the golden signals extend RED, and how cardinality is bounded while still bridging to traces.
- 01 Why do senior engineers run RED and USE together rather than either one alone?
- 02 What does Saturation measure, and why is it more diagnostic than Utilization?
- 03 What is PSI, what do 'some' and 'full' mean, and why did it replace load average for saturation?
- 04 What does the golden signals' Saturation add that RED's three letters miss? Give a concrete example.
- 05 What is cardinality, why is one unbounded label catastrophic, and which labels are safe?
- 06 How do you jump from a p99 metric spike to the specific slow request without an unbounded label?
If you reconstructed each answer from memory, you hold the unit’s spine: RED and USE are a pair because one names the user-visible symptom and the other the resource cause; saturation beats utilization because waiting work, not busy-time, is what users feel; PSI replaced load average as the core-count-independent, wall-clock saturation signal; the golden signals add service-level Saturation that RED alone misses; and cardinality is bounded by labelling only route/method/status while exemplars carry the bridge to traces. That is the discipline you run on every page.
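To make the PSI point concrete, here is a minimal sketch of reading the 'some' and 'full' lines that Linux exposes under /proc/pressure/. The function name `parse_psi` and the sample values are illustrative assumptions, not part of the unit; only the line format (avg10, avg60, avg300, total) follows the kernel's PSI interface.

```python
from typing import Dict

def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse the contents of a /proc/pressure/<resource> file.

    Each line starts with 'some' or 'full', followed by key=value fields:
    avg10/avg60/avg300 are percentages of wall-clock time spent stalled
    over the last 10, 60 and 300 seconds; total is cumulative stall time
    in microseconds.
    """
    result: Dict[str, Dict[str, float]] = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result

# Invented sample in the kernel's format (not real measurements).
SAMPLE = """some avg10=12.50 avg60=4.20 avg300=1.10 total=123456
full avg10=3.00 avg60=1.00 avg300=0.25 total=45678
"""

pressure = parse_psi(SAMPLE)
# 'some': at least one task was stalled on the resource.
# 'full': all non-idle tasks were stalled at the same time.
print(pressure["some"]["avg10"], pressure["full"]["avg10"])
```

On a Linux host with PSI enabled, the same function could be fed the text of /proc/pressure/cpu, /proc/pressure/memory or /proc/pressure/io. Because the averages are shares of wall-clock time, they read the same on a 4-core and a 64-core machine, which is the property load average lacks.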