
Observability

Trace propagation: code and header reading

Crux: Read a traceparent header, a broken async-propagation snippet, a producer/consumer inject pair, and a tail-sampling config, then pick the behaviour and the highest-leverage fix.
◷ 14 min

Propagation bugs live in headers, code, and config — not in prose. Read each artefact the way you would in an incident, then choose the fix a senior engineer would reach for first.

Goal

Practise the diagnostic loop of every propagation incident: parse the header on the wire, spot where context is silently dropped, and read the collector config that decides which traces survive.

Snippet 1 — the header on the wire

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-00
Quiz

This header is well-formed. What does the trailing 00 tell the receiving service to do by default?
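
To make the four fields of the header concrete, here is a minimal sketch of a parser for the version-00 layout. The helper name parseTraceparent is hypothetical and not part of any library; the sketch only handles the four-field form shown above and rejects anything else.

```javascript
// Hypothetical helper: splits a version-00 traceparent into its four fields.
// Layout: version "-" trace-id (32 hex) "-" parent-id (16 hex) "-" trace-flags (2 hex)
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header) {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return null;

  const [, version, traceId, parentId, flags] = match;

  // Version ff and all-zero ids are invalid.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) {
    return null;
  }

  return {
    version,
    traceId,
    parentId,
    flags,
    // Bit 0 of trace-flags is the "sampled" flag set by the caller.
    sampled: (parseInt(flags, 16) & 0x01) === 0x01,
  };
}

const parsed = parseTraceparent(
  '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-00'
);
console.log(parsed);
```

For the header in Snippet 1, the parsed object has sampled set to false, which is the bit the quiz is asking about.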

Snippet 2 — the deferred work

const { context, trace } = require('@opentelemetry/api');

app.post('/checkout', async (req, res) => {
  // request span is active here
  setTimeout(() => {
    writeAuditLog(req.body);   // emits a span
  }, 200);
  res.status(202).end();
});
Quiz

The audit-log span shows up as its own root trace, disconnected from /checkout. What is the cause and the minimal fix?
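
The mechanism behind this kind of orphan can be sketched without any tracing library. The code below is a toy model, not OpenTelemetry: withContext, bind, defer, and timerQueue are hypothetical stand-ins, and the model assumes a context manager that does not follow timer callbacks on its own. It shows that a deferred callback sees whatever context is active when it runs, unless the context was captured when the callback was scheduled.

```javascript
// Toy context manager: one "active" slot, restored when the scope exits.
let activeContext = { traceId: undefined }; // root (empty) context

function withContext(ctx, fn) {
  const previous = activeContext;
  activeContext = ctx;
  try {
    return fn();
  } finally {
    activeContext = previous;
  }
}

// Capture ctx now, re-activate it whenever the wrapped function runs.
function bind(ctx, fn) {
  return (...args) => withContext(ctx, () => fn(...args));
}

// Stand-in for the timer queue: callbacks run after the handler has returned.
const timerQueue = [];
const defer = (fn) => timerQueue.push(fn);

const seen = [];
const writeAuditLog = (label) =>
  seen.push({ label, traceId: activeContext.traceId });

// "Request handler": the request context is active only inside this scope.
withContext({ traceId: '4bf92f3577b34da6a3ce929d0e0e4736' }, () => {
  defer(() => writeAuditLog('unbound'));                    // loses the context
  defer(bind(activeContext, () => writeAuditLog('bound'))); // keeps it
});

// The handler has returned; now the deferred work fires.
timerQueue.forEach((task) => task());
console.log(seen);
```

In the OpenTelemetry JavaScript API the corresponding call is context.bind(context.active(), fn), so the change to Snippet 2 would be to pass the bound callback to setTimeout instead of the bare arrow function.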

Snippet 3 — producer and consumer

// producer
const headers = {};
propagator.inject(context.active(), headers);
await producer.send({ topic: 'orders', messages: [{ value: body, headers }] });

// consumer
await consumer.run({
  eachMessage: async ({ message }) => {
    // headers are available as message.headers
    await handleOrder(message.value);   // emits spans
  },
});
Quiz

The producer injects correctly, yet consumer spans are still orphans. Where is the bug?
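
A round-trip only links up when the consumer reads back what the producer wrote. The sketch below models that with hypothetical stand-ins (inject, extract, and handleOrder are illustrative functions, not a library API) and assumes header values arrive on the consumer side as Buffers, as some Kafka clients deliver them.

```javascript
// Hypothetical stand-ins for a propagator's inject/extract pair.
function inject(ctx, carrier) {
  const flags = ctx.sampled ? '01' : '00';
  carrier.traceparent = `00-${ctx.traceId}-${ctx.spanId}-${flags}`;
}

function extract(carrier) {
  const raw = carrier.traceparent;
  if (raw === undefined) return null;
  // String() also decodes a Buffer header value to text.
  const [, traceId, spanId, flags] = String(raw).split('-');
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 0x01 };
}

// Producer side: write the active context into the message headers.
const headers = {};
inject(
  { traceId: '4bf92f3577b34da6a3ce929d0e0e4736', spanId: '00f067aa0ba902b7', sampled: true },
  headers
);

// What the consumer receives (header values modelled as Buffers).
const message = {
  value: Buffer.from('{"orderId":1}'),
  headers: { traceparent: Buffer.from(headers.traceparent) },
};

// Consumer side: a span is either a child of the extracted parent or a new root.
function handleOrder(msg, parent) {
  return {
    name: 'handleOrder',
    traceId: parent ? parent.traceId : 'new-root-trace',
    parentSpanId: parent ? parent.spanId : null,
  };
}

const orphan = handleOrder(message, null);                     // headers ignored
const linked = handleOrder(message, extract(message.headers)); // headers extracted
console.log({ orphan, linked });
```

With the OpenTelemetry API the consumer-side shape is propagation.extract(context.active(), carrier) followed by context.with(parentContext, () => handleOrder(...)), so that spans emitted inside the handler pick up the extracted parent.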

Snippet 4 — the collector config

processors:
  tail_sampling:
    decision_wait: 30s
    policies:
      - name: errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: probabilistic
        type: probabilistic
        probabilistic:
          sampling_percentage: 1
Quiz

This tail-sampling config OOMed the collector during a traffic spike. What is the single most important missing line?
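
A rough way to reason about this config is that every trace is held for about decision_wait before a policy decides its fate, so the buffer scales with arrival rate times 30 seconds. The sketch below does that arithmetic. The traffic figures, spans per trace, and bytes per span are made-up assumptions for illustration, and estimateBuffer is a hypothetical helper; real span sizes vary widely.

```javascript
// Back-of-envelope estimate of what a tail sampler holds while it waits.
function estimateBuffer({ tracesPerSecond, decisionWaitSeconds, spansPerTrace, bytesPerSpan }) {
  // Traces that arrive during one decision window are all in memory at once.
  const tracesInFlight = tracesPerSecond * decisionWaitSeconds;
  const bytes = tracesInFlight * spansPerTrace * bytesPerSpan;
  return { tracesInFlight, mebibytes: bytes / (1024 * 1024) };
}

// Assumed workload: 20 spans per trace, roughly 1 KiB per span.
const shape = { decisionWaitSeconds: 30, spansPerTrace: 20, bytesPerSpan: 1024 };

const steady = estimateBuffer({ tracesPerSecond: 500, ...shape });
const spike = estimateBuffer({ tracesPerSecond: 5000, ...shape });

console.log('steady:', steady.tracesInFlight, 'traces,', Math.round(steady.mebibytes), 'MiB');
console.log('spike: ', spike.tracesInFlight, 'traces,', Math.round(spike.mebibytes), 'MiB');
```

Under these assumptions a tenfold spike means ten times as many traces in flight, which is why a cap on buffered traces (the num_traces setting the quiz is pointing at) needs to be sized against peak traffic and the collector's memory rather than against a quiet day.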

Recap

Every propagation incident is read in artefacts. The trace-flags byte tells you whether a well-formed header was sampled, and a clear flag means drop the whole trace rather than export a fragment. An unwrapped setTimeout orphans in-process work until you context.bind it. A Kafka round-trip needs inject on the producer and the matching extract on the consumer. A tail-sampling config with no num_traces sized against peak traffic and decision_wait can buffer more traces than the collector has memory for, until it OOMs. Parse the header, find the dropped context, read the config, then fix at the boundary, not at the dashboard.
