Why structured logs exist: the diary vs the spreadsheet

Free-text logs look readable and become un-queryable at scale. Structured logs are a spreadsheet — every line is a record with addressable fields.

OBS Junior ◷ 8 min

The pager fires at 02:00. You have 10 million log lines and five minutes. If logs are free-text, you write a regex and hope. If logs are structured, you write one query and know.

Diary versus spreadsheet

Free-text logs are a diary. Structured logs are a spreadsheet. Both record what happened, but only one lets you ask “show me every payment failure last week grouped by upstream provider” without reading every entry by hand.

The diary is fine when you have ten entries. At ten million entries it is unusable: fields are inconsistent, formats drift between services, and every new question requires writing a new regex.

The spreadsheet has rules: every row has the same columns, every column has a type, every value goes in the right cell.

Format	Example	Query for “status 503 from payment”
Free-text	[ERROR] checkout: gateway timeout (payment, status 503)	Regex, substring match, misses variants
Structured JSON	{"level":"error","upstream":"payment","status":503}	Single indexed-field query, milliseconds
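
The contrast in the table can be sketched in a few lines of Python. The sample lines, the regex, and the field names are illustrative assumptions, not output from a real system: three wordings of the same failure, one regex written against the first wording, and one field filter over the structured form.

```python
import json
import re

# Hypothetical free-text lines: the same event in three wordings.
free_text = [
    "[ERROR] checkout: gateway timeout (payment, status 503)",
    "[ERROR] checkout: payment upstream returned HTTP 503",
    "[ERROR] checkout: 503 from payment-gateway after 3 retries",
]

# A regex written against the first wording only.
pattern = re.compile(r"\(payment, status 503\)")
regex_hits = [line for line in free_text if pattern.search(line)]

# The same three events as structured records (field names assumed).
structured = [
    '{"level":"error","upstream":"payment","status":503}',
    '{"level":"error","upstream":"payment","status":503}',
    '{"level":"error","upstream":"payment","status":503}',
]
records = [json.loads(line) for line in structured]
field_hits = [
    r for r in records
    if r["upstream"] == "payment" and r["status"] == 503
]

# The regex silently misses the two variant wordings; the field filter does not.
print(len(regex_hits), len(field_hits))
```

A real backend would run the field filter against an index rather than a Python list, but the failure mode of the regex is the same at any scale.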

Why the cost compounds

The cost of a log line is paid in three places: at write time (CPU + RAM for serialization), in transit (network egress), and at the backend (ingest GB + indexed-event count + retention bytes). Structured logs are not free — JSON serialization costs CPU — but they pay back at query time.

A free-text line “user 42 failed checkout because gateway returned 503” requires regex and substring search to filter. The same data as JSON {"level":"error","user_id":42,"route":"/checkout","upstream":"payment-gateway","upstream_status":503,"trace_id":"abc..."} is a single indexed-field query with millisecond response time over weeks of data.

The discipline: pick a schema, populate it consistently, and treat the log line as an API your future on-call self will read at 03:00 with no context.
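
One way to hold a schema steady is to check records against it in tests or CI. The sketch below is a minimal illustration; the field list and the `schema_violations` helper are assumptions made for this example, not a standard.

```python
# Assumed minimal schema: field name -> expected Python type.
REQUIRED_FIELDS = {
    "timestamp": str,
    "level": str,
    "service.name": str,
    "trace_id": str,
    "message": str,
}


def schema_violations(record: dict) -> list[str]:
    """Return human-readable problems; an empty list means the record conforms."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            problems.append(
                f"wrong type for {name}: {type(record[name]).__name__}"
            )
    return problems


good = {
    "timestamp": "2026-01-01T02:00:00+00:00",
    "level": "error",
    "service.name": "checkout",
    "trace_id": "abc123",
    "message": "gateway timeout",
}
bad = {k: v for k, v in good.items() if k != "trace_id"}

print(schema_violations(good))
print(schema_violations(bad))
```

Running a check like this against sample output from each service is a cheap way to catch format drift before it reaches the backend.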

One event, two encodings. Free text is an opaque string you must regex; structured JSON is named fields (timestamp, level, trace_id, msg, fields) the backend indexes and filters in milliseconds.

The triage scenario

Bea gets paged: the error rate on checkout is up. She queries the log backend with service:checkout AND level:error AND @timestamp:[now-15m TO now] and gets 240 matching events, each a JSON record. She facets by the upstream field: 220 of 240 errors come from payment-gateway. She facets by upstream_status: all 220 are HTTP 503. Diagnosis in 30 seconds.

Sven pulls one trace_id from a failing log line and opens the trace to see the gateway call timing out. If the logs were free-text, Bea would have written a regex, missed cases with different wording, and lost ten minutes.
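
The faceting steps in this scenario can be approximated locally. This is a sketch with a handful of made-up events standing in for the backend's query result; a real log backend would do the filtering and counting server-side against its index.

```python
import json
from collections import Counter

# Hypothetical events standing in for the last 15 minutes of checkout logs.
raw_lines = [
    '{"service":"checkout","level":"error","upstream":"payment-gateway","upstream_status":503,"trace_id":"t1"}',
    '{"service":"checkout","level":"error","upstream":"payment-gateway","upstream_status":503,"trace_id":"t2"}',
    '{"service":"checkout","level":"error","upstream":"inventory","upstream_status":500,"trace_id":"t3"}',
    '{"service":"checkout","level":"error","upstream":"payment-gateway","upstream_status":503,"trace_id":"t4"}',
    '{"service":"checkout","level":"info","trace_id":"t5"}',
    '{"service":"checkout","level":"error","upstream":"payment-gateway","upstream_status":503,"trace_id":"t6"}',
]
events = [json.loads(line) for line in raw_lines]

# Step 1: filter, the equivalent of service:checkout AND level:error.
errors = [
    e for e in events
    if e["service"] == "checkout" and e["level"] == "error"
]

# Step 2: facet by upstream and take the dominant one.
by_upstream = Counter(e["upstream"] for e in errors)
top_upstream, _ = by_upstream.most_common(1)[0]

# Step 3: facet that upstream's errors by status code.
by_status = Counter(
    e["upstream_status"] for e in errors if e["upstream"] == top_upstream
)

# Step 4: pick one trace_id to open in the tracing tool.
sample_trace_id = next(
    e["trace_id"] for e in errors if e["upstream"] == top_upstream
)

print(by_upstream, by_status, sample_trace_id)
```

None of these steps needs to know how the message was worded, which is what makes the 30-second diagnosis possible.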

Why this works

JSON is the de-facto encoding for structured logs in 2026, ahead of XML, protobuf, and custom binary formats. It is the lowest common denominator: humans can still read it in a terminal, mainstream logging backends parse it out of the box, and most languages serialize it with a standard or widely used library. JSON Lines (one JSON object per line, newline-delimited) is the canonical format: append-friendly, streamable, parseable line by line, and easy to work with using grep and jq when you are SSH’d into a box.
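
A minimal sketch of writing and reading JSON Lines with Python's standard library follows. The `write_event` and `read_events` helpers are names invented for this example, and an in-memory buffer stands in for an append-mode log file.

```python
import io
import json


def write_event(stream, event: dict) -> None:
    # One compact JSON object per line. json.dumps escapes newlines inside
    # string values, so a multi-line message still occupies a single line.
    stream.write(json.dumps(event, separators=(",", ":")) + "\n")


def read_events(stream):
    # Parse line by line so the whole file never has to fit in memory.
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


buffer = io.StringIO()  # stands in for open("app.log", "a")
write_event(buffer, {"level": "error", "upstream": "payment", "status": 503})
write_event(buffer, {"level": "info", "msg": "line one\nline two"})

buffer.seek(0)
events = list(read_events(buffer))
print(events)
```

Because each record is terminated by exactly one newline, appending is safe and line-oriented tools can process the file without a JSON-aware splitter.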

Order the steps

Order the fields you should put on every production log line, in roughly priority order:

1 timestamp (ISO-8601 UTC, the most basic)
2 level (DEBUG / INFO / WARN / ERROR)
3 service.name (which service emitted this)
4 trace_id and span_id (the join key to traces)
5 message (human-readable summary)
6 event-specific fields (route, status_code, user_segment, ...)
7 resource attributes (host, region, version — typically set once)
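
The field list above could be emitted with Python's standard logging module roughly as follows. This is a sketch under assumptions: the `JsonFormatter` class, the `fields` key passed through `extra`, and the sample values are all invented for illustration, and a production setup would more likely use an existing structured-logging library.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with a fixed set of fields."""

    def __init__(self, service_name: str):
        super().__init__()
        # Resource-style attribute: set once, stamped on every line.
        self.service_name = service_name

    def format(self, record: logging.LogRecord) -> str:
        event = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "service.name": self.service_name,
            # Values passed via `extra=` become attributes on the record.
            "trace_id": getattr(record, "trace_id", None),
            "span_id": getattr(record, "span_id", None),
            "message": record.getMessage(),
        }
        # Event-specific fields, passed under the assumed key "fields".
        event.update(getattr(record, "fields", {}))
        return json.dumps(event)


stream = io.StringIO()  # stands in for stdout or a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter("checkout"))

logger = logging.getLogger("checkout-example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the example's output out of the root logger

logger.error(
    "gateway timeout",
    extra={
        "trace_id": "abc123",
        "span_id": "def456",
        "fields": {"route": "/checkout", "status_code": 503},
    },
)

emitted = json.loads(stream.getvalue())
print(emitted)
```

Setting the service name on the formatter rather than on each call mirrors the note that resource attributes are typically set once.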

Complete the analogy

Fill in the blank: a structured log is a _______, where every row has the same columns, every column has a type, and every value goes in the right cell.

Recall before you leave

01 In two sentences, why is JSON the de-facto encoding for structured logs in 2026?
02 What is the cost of being unstructured, and when does it show up?
03 Why does the structured log treat trace_id as a required field, not an optional one?

Recap

Structured logs are JSON events with a stable schema: every line is a record with the same shape (timestamp, level, service.name, trace_id, message, and event-specific fields). Free-text logs record the same data as sentences, which is readable at small scale and un-queryable at million-line scale because every query needs a regex. The key payback is at query time: indexed JSON fields answer “show me all 503s from payment-gateway in the last 15 minutes” in one query with millisecond latency. The cost is paid at write time (CPU and RAM for serialization), in transit (network egress), and at the backend (ingest volume, indexed-event count, and retention). The discipline is treating the log schema as an API contract: consistent across services, populated on every line.
