Observability
Structured logging: multiple-choice review
Six questions that cut across the whole unit. Each one is a decision you make while a service is on fire at 03:00 — not a definition to recite, but a tradeoff between the bill, the alert, the audit, and the investigation.
Confirm you can connect the schema contract, the level ladder, sampling economics, PII and injection defence, trace correlation, and audit separation — the synthesis the individual lessons built toward.
Two services on one team emit JSON logs, but one calls the field http.response.status_code and the other calls it status. Both are valid JSON. Why does this still break on-call?
A retry helper logs ERROR on every one of its 5 attempts, including the 3 that ultimately succeed. The on-call channel is now noise and a real ERROR was missed last week. What is the correct fix and why?
A single service handling 1000 req/s and emitting 1 KB JSON log lines costs ~$260/month in ingest. What is the first lever to cut that bill by ~90% without losing the ability to investigate failures?
A handler interpolates user-submitted comment text directly into a log message string. A user submits a comment containing a newline followed by a forged ERROR JSON object. What class of failure is this, and what is the structural fix?
A Node service reports that ~5% of log lines carry trace_id = '00000000000000000000000000000000'. How do you read this?
A team stores operational logs and audit logs (logins, role grants, regulated-data access) in one index with a 30-day retention policy. Why is this a compliance failure, not just a tidiness issue?
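The ~$260/month figure in the ingest-cost question can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is not from the unit: it assumes one 1 KB log line per request, decimal units, a 30-day month, and a hypothetical ingest price of $0.10 per GB, which is the rate that happens to reproduce the quoted bill. The function names and the retained_fraction parameter are illustrative only.

```python
# Assumptions (not from the source): one 1 KB log line per request,
# decimal units (1 KB = 1,000 bytes, 1 GB = 10**9 bytes), a 30-day month,
# and a hypothetical ingest price of $0.10 per GB.
REQUESTS_PER_SECOND = 1_000
BYTES_PER_LINE = 1_000
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30
PRICE_PER_GB_USD = 0.10


def monthly_ingest_gb(retained_fraction: float = 1.0) -> float:
    """Gigabytes ingested per month when only a fraction of lines is shipped."""
    bytes_per_month = (
        REQUESTS_PER_SECOND * BYTES_PER_LINE * SECONDS_PER_DAY * DAYS_PER_MONTH
    )
    return bytes_per_month * retained_fraction / 10**9


def monthly_cost_usd(retained_fraction: float = 1.0) -> float:
    """Monthly ingest bill in USD at the assumed per-GB price."""
    return round(monthly_ingest_gb(retained_fraction) * PRICE_PER_GB_USD, 2)


if __name__ == "__main__":
    print(monthly_ingest_gb())     # full volume, in GB
    print(monthly_cost_usd())      # full bill at the assumed price
    print(monthly_cost_usd(0.10))  # bill if ~90% of the volume is cut
```

Under these assumptions the service ships roughly 2.6 TB of logs a month, so a ~90% reduction in shipped volume brings the bill to roughly a tenth of the original. A real vendor's pricing (tiers, indexing, retention charges) will differ.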
The unit’s through-line is one pipeline of decisions: a stable schema (OTel field names) makes logs queryable and joinable; the level ladder routes volume to storage and severity to humans; sampling keeps the bill proportional to incidents while always retaining WARN/ERROR; PII discipline and CWE-117 injection defence are first-class security concerns solved at the source and the collector; trace_id makes a log line a node in the observability graph instead of an island; and audit logs share the JSON schema but need their own pipeline, retention, immutability, and access. Every answer resolves back to treating the log line as an API contract your on-call and your auditor both depend on.
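Two of the decisions in that pipeline can be sketched in a few lines of Python: level-aware sampling that always retains WARN/ERROR, and emitting user text as a JSON field value rather than interpolating it into the line. This is a minimal illustration under stated assumptions, not the unit's reference implementation. The function names (should_keep, render), the "level", "message", and "comment" field names, and the sample rate are hypothetical; only trace_id and http.response.status_code come from the text above.

```python
import json
import random

# Levels that bypass sampling entirely (per the summary: always retain WARN/ERROR).
ALWAYS_KEEP = {"WARN", "ERROR"}


def should_keep(level: str, sample_rate: float, rng: random.Random) -> bool:
    """Keep every WARN/ERROR; keep other levels with probability sample_rate."""
    if level in ALWAYS_KEEP:
        return True
    return rng.random() < sample_rate


def render(level: str, message: str, trace_id: str, **fields: object) -> str:
    """Serialize one event as a single JSON line.

    User-supplied text is passed as a field value, so json.dumps escapes
    newlines and quotes instead of letting them start a forged log line.
    """
    record = {"level": level, "message": message, "trace_id": trace_id, **fields}
    return json.dumps(record)


if __name__ == "__main__":
    rng = random.Random(0)
    hostile_comment = 'nice post\n{"level": "ERROR", "message": "forged"}'
    events = [
        ("INFO", "comment received", {"comment": hostile_comment}),
        ("ERROR", "upstream failed", {"http.response.status_code": 502}),
    ]
    for level, message, fields in events:
        # A hypothetical 10% sample rate for non-WARN/ERROR events.
        if should_keep(level, 0.10, rng):
            print(render(level, message, "4bf92f3577b34da6a3ce929d0e0e4736", **fields))
```

The sampling decision here is made per line for simplicity; a production setup might instead decide per trace so that all lines sharing a trace_id are kept or dropped together, and might apply the same rules at the collector as a second line of defence.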