Crux: Read a traceparent header, a broken async-propagation snippet, a producer/consumer inject pair, and a tail-sampling config, then pick the behaviour and the highest-leverage fix.
Propagation bugs live in headers, code, and config — not in prose. Read each artefact the way you would in an incident, then choose the fix a senior engineer would reach for first.
Goal
Practise the diagnostic loop of every propagation incident: parse the header on the wire, spot where context is silently dropped, and read the collector config that decides which traces survive.
This header is well-formed. What does the trailing 00 tell the receiving service to do by default?
Heads-up The leading 00 is the version; the trailing 00 is trace-flags. Version 00 is simply the current format — no downgrade. The question is about the sampled flag.
Heads-up The parent-id here is 00f067aa0ba902b7, not zero. The trailing 00 is the trace-flags byte: sampled bit clear.
Heads-up traceparent encodes no span count, and a clear sampled flag means the upstream already decided to drop. Overriding it to 100% would produce a partial trace.
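The field-by-field reading above can be sketched as a small parser. This is a minimal illustration, not a library API: `parseTraceparent` is a hypothetical helper, it only handles the four-field version 00 layout, and the trace-id in the sample header is assumed (it is the example value from the W3C specification); only the parent-id and the trailing 00 come from the question.

```javascript
// Hypothetical helper: splits a version-00 traceparent into its four fields.
function parseTraceparent(header) {
  const match = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (!match) return null;

  const [, version, traceId, parentId, flags] = match;

  // Version ff and all-zero ids are invalid according to the W3C spec.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) {
    return null;
  }

  return {
    version,
    traceId,
    parentId,
    // Bit 0 of trace-flags is the sampled flag.
    sampled: (parseInt(flags, 16) & 0x01) === 0x01,
  };
}

// Assumed sample header: the trace-id is illustrative, the parent-id and
// the trailing 00 match the values discussed in the question.
const parsed = parseTraceparent(
  '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-00'
);
console.log(parsed);
```

Here `parsed.sampled` is false, so a receiver that honours the parent's decision would not record the trace by default.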
Snippet 2 — the deferred work
const { context, trace } = require('@opentelemetry/api');

app.post('/checkout', async (req, res) => {
  // request span is active here
  setTimeout(() => {
    writeAuditLog(req.body); // emits a span
  }, 200);
  res.status(202).end();
});
The audit-log span shows up as its own root trace, disconnected from /checkout. What is the cause and the minimal fix?
Heads-up The boundary is setTimeout, not the function's sync/async nature. Even a sync writeAuditLog inside the timer runs on a new stack with no context. The fix is context.bind, not removing async.
Heads-up inject/extract is for cross-process carriers. The audit work runs in the same process; the lost context is restored with context.bind, not header injection.
Heads-up Response ordering does not affect the deferred callback's context. The span is orphaned because the timer callback lost the active context, regardless of when the response is sent.
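To see why binding helps, here is a toy model of an "active context" that is only valid during a synchronous call, which is the situation described above when the context manager in use does not follow the timer. Everything in it (`withContext`, `bind`, the `deferred` queue) is a hypothetical stand-in, not OpenTelemetry's implementation. With the real API, the corresponding fix in the snippet would be to wrap the timer callback as `context.bind(context.active(), callback)`.

```javascript
// Toy model only: a stack-scoped active context, assumed for illustration.
let active = null; // stand-in for context.active()

function withContext(ctx, fn) {
  const previous = active;
  active = ctx;
  try {
    return fn();
  } finally {
    active = previous; // the context is gone once the synchronous call returns
  }
}

// Capture a context now and reinstate it whenever the callback finally runs.
function bind(ctx, fn) {
  return (...args) => withContext(ctx, () => fn(...args));
}

const deferred = []; // stand-in for the timer queue
const seen = [];

withContext({ traceId: 'checkout-trace' }, () => {
  // Unbound callback: reads whatever is active when it runs later.
  deferred.push(() => seen.push(active));
  // Bound callback: carries the request context with it.
  deferred.push(bind(active, () => seen.push(active)));
});

// The "timer" fires after the request handler has returned.
deferred.forEach((callback) => callback());
console.log(seen);
```

The first entry in `seen` is null (the orphan case) and the second is the request context, which is the difference between a new root trace and a child of /checkout.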
The producer injects correctly, yet consumer spans are still orphans. Where is the bug?
Heads-up Kafka record headers (since 0.11) are the standard carrier and are not stripped. The producer side is correct; the consumer simply never extracts.
Heads-up context.bind revives in-process context, but the consumer process started with none — the context lives in the message headers. It must be extracted, not bound.
Heads-up Both sides use the same composite propagator. The round-trip fails because extract is missing on the consumer, not because of a propagator mismatch.
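The round-trip can be sketched with hand-rolled helpers. These `inject` and `extract` functions are hypothetical simplifications that handle only `traceparent`; a real service would call `propagation.inject` and `propagation.extract` from `@opentelemetry/api` with the message headers as the carrier. The sketch also assumes that header values arrive on the consumer as Buffers, as is common with Node Kafka clients.

```javascript
// Hypothetical producer-side helper: writes the span context into the carrier.
function inject(spanContext, headers) {
  const flags = spanContext.sampled ? '01' : '00';
  headers.traceparent = `00-${spanContext.traceId}-${spanContext.spanId}-${flags}`;
  return headers;
}

// Hypothetical consumer-side helper: reads the parent context back out.
function extract(headers) {
  const raw = headers.traceparent;
  if (raw === undefined) return null; // nothing to extract: spans become roots
  const [, traceId, parentId, flags] = String(raw).split('-');
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 0x01) === 0x01 };
}

// Producer: inject before sending.
const producerSpan = {
  traceId: '4bf92f3577b34da6a3ce929d0e0e4736', // assumed example ids
  spanId: '00f067aa0ba902b7',
  sampled: true,
};
const message = { value: 'order-42', headers: inject(producerSpan, {}) };

// Wire: simulate header values being delivered as Buffers (an assumption).
const received = {
  value: message.value,
  headers: Object.fromEntries(
    Object.entries(message.headers).map(([key, val]) => [key, Buffer.from(val)])
  ),
};

// Consumer: the step that is missing in the buggy version.
const parent = extract(received.headers);
console.log(parent);
```

If the consumer skips the `extract` call, the headers are still present on the record but nothing reads them, so every consumer span starts a new trace.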
This tail-sampling config OOMed the collector during a traffic spike. What is the single most important missing line?
Heads-up A latency policy is desirable but it filters at decision time; it does not bound buffered memory. The missing safeguard against OOM is the num_traces cap.
Heads-up Raising decision_wait holds traces longer and increases RAM — it makes the OOM worse. 30s is a sane window; the missing piece is num_traces.
Heads-up The load-balancing exporter is an exporter on the upstream tier, not a processor here, and it addresses span routing across replicas — not the single-instance OOM, which num_traces bounds.
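The effect of a trace cap can be illustrated with a toy buffer. This is a sketch under stated assumptions, not the collector's code: `TraceBuffer` is a hypothetical class, and evicting the oldest trace first is an assumed policy chosen to show how a `numTraces` limit keeps the number of buffered traces constant during a spike.

```javascript
// Toy model of a tail-sampling buffer with a hard cap on buffered traces.
class TraceBuffer {
  constructor(numTraces) {
    this.numTraces = numTraces;
    this.traces = new Map(); // Map keeps insertion order, oldest first
    this.evicted = 0;
  }

  addSpan(traceId, span) {
    let spans = this.traces.get(traceId);
    if (!spans) {
      // New trace: make room first if the cap has been reached.
      if (this.traces.size >= this.numTraces) {
        const oldest = this.traces.keys().next().value;
        this.traces.delete(oldest);
        this.evicted += 1;
      }
      spans = [];
      this.traces.set(traceId, spans);
    }
    spans.push(span);
  }
}

// A spike of ten distinct traces against a cap of three.
const buffer = new TraceBuffer(3);
for (let i = 0; i < 10; i += 1) {
  buffer.addSpan(`trace-${i}`, { name: 'op' });
}
console.log(buffer.traces.size, buffer.evicted);
```

Without the cap, the map would grow with every new trace ID for the whole decision window. Note that this bounds the number of traces, not the number of spans per trace, so very large traces still need their own safeguard.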
Recap
Every propagation incident is read in artefacts. The trace-flags byte tells you whether a well-formed header was sampled, and a clear flag means drop the whole trace rather than export a fragment. An unwrapped setTimeout orphans in-process work until you context.bind it. A Kafka round-trip needs inject on the producer and the matching extract on the consumer. A tail-sampling config without num_traces keeps buffering traces with no upper bound until the collector OOMs. Parse the header, find the dropped context, read the config, then fix at the boundary, not at the dashboard.