
Observability

The OTel Collector: receivers, processors, exporters, and deployment patterns

Crux: The Collector is a YAML-configured pipeline in which receivers accept telemetry, processors transform it in flight, and exporters ship it to backends. Three deployment patterns dominate: agent, gateway, and agent-to-gateway.

A new backend requirement lands: route traces to vendor-A, logs to vendor-B, and redact PII from both, starting next Monday. Without a Collector, that is three separate application deploys across 50 teams. With a Collector, it is one YAML change and one Collector restart.
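
As a rough sketch of what that single YAML change could look like, the fragment below defines one traces pipeline and one logs pipeline that share a redaction rule. The exporter names, the endpoints, and the user.email key are placeholders chosen for illustration, not real vendor settings.

```yaml
# Hypothetical sketch: endpoints and attribute keys are placeholders.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:
  # One redaction rule, reused by both pipelines.
  attributes/redact:
    actions:
      - key: user.email
        action: delete

exporters:
  otlp/vendor_a:
    endpoint: vendor-a.example.com:4317   # assumed endpoint
  otlp/vendor_b:
    endpoint: vendor-b.example.com:4317   # assumed endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, attributes/redact]
      exporters: [otlp/vendor_a]
    logs:
      receivers: [otlp]
      processors: [batch, attributes/redact]
      exporters: [otlp/vendor_b]
```

Because the attributes processor works on traces and logs alike, the same named instance can appear in both pipelines, which keeps the redaction policy in one place.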

Receivers, processors, exporters

The Collector is a YAML-configured pipeline with three stages:

Receivers accept incoming telemetry:

  • otlp — gRPC (port 4317) and HTTP (port 4318), the primary receiver for OTel-instrumented services
  • filelog — tails log files from the filesystem (useful for legacy apps that write logs to files, or to stdout that the container runtime captures into files)
  • prometheus — scrapes /metrics endpoints (bridges Prometheus exporters into OTel)
  • kafka — reads telemetry from Kafka topics
  • Vendor-specific receivers for non-OTel data
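
To make the receiver list concrete, here is a minimal sketch that configures three of them side by side. The log path, job name, and scrape target are hypothetical, and filelog and prometheus ship in the contrib distribution of the Collector rather than the core one.

```yaml
receivers:
  # Primary receiver for OTel-instrumented services.
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

  # Tails files written by an app that cannot emit OTLP.
  filelog:
    include: [/var/log/legacy-app/*.log]     # hypothetical path
    start_at: end                            # skip history, read new lines only

  # Scrapes an existing /metrics endpoint.
  prometheus:
    config:
      scrape_configs:
        - job_name: legacy-exporter          # hypothetical job name
          scrape_interval: 30s
          static_configs:
            - targets: ["legacy-app:9100"]   # hypothetical target
```

A receiver only becomes active once a pipeline under service.pipelines lists it.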

Processors transform records in-flight:

  • memory_limiter — refuses new records when the Collector's memory use crosses a configured limit; this greatly reduces the risk of an OOM kill
  • batch — groups records into efficient batches (fewer network round trips to exporters)
  • resource — adds service.name and other Resource attributes (e.g., inject deployment.environment=production)
  • attributes — redacts, renames, adds fields; used for PII scrubbing and Semantic Convention enforcement
  • tail_sampling — keeps/drops traces based on the full trace context (errors, latency, business criteria) — covered in the next lesson
  • transform — general-purpose transformations via OTTL (OpenTelemetry Transformation Language)
  • k8sattributes — enriches spans/logs with Kubernetes pod, namespace, node, and deployment metadata
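
A short sketch of three processors from the list above may help. The attribute values, the http.url key, and the token pattern are illustrative assumptions; k8sattributes additionally needs RBAC permissions to read pod metadata, which are not shown here.

```yaml
processors:
  # Stamp every record with the environment it came from.
  resource:
    attributes:
      - key: deployment.environment
        value: production
        action: upsert

  # Attach Kubernetes metadata looked up from the sending pod.
  k8sattributes:
    extract:
      metadata:
        - k8s.pod.name
        - k8s.namespace.name
        - k8s.node.name
        - k8s.deployment.name

  # OTTL statement that masks a token inside a span attribute.
  transform:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          - 'replace_pattern(attributes["http.url"], "token=[^&]+", "token=REDACTED")'
```

The attributes processor is enough for deleting a whole key; transform is the tool to reach for when only part of a value must change.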

Exporters ship records to backends:

  • otlp — to another OTel-aware backend (another Collector, Grafana Tempo, Jaeger, etc.)
  • datadog — vendor-specific
  • prometheusremotewrite — to Prometheus or Grafana Mimir
  • loki — to Grafana Loki for logs
  • Vendor-specific exporters for New Relic, Honeycomb, Splunk, Elastic, etc.
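
As an illustration of the exporter side, the sketch below declares two OTLP exporters and one remote-write exporter. The vendor hostname, the x-api-key header, the VENDOR_API_KEY variable, and the Mimir address are assumptions made for the example.

```yaml
exporters:
  # In-cluster Tempo over plaintext gRPC.
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true

  # A hosted OTLP backend, authenticated with a header.
  otlp/vendor:
    endpoint: otlp.vendor.example.com:4317   # hypothetical host
    headers:
      x-api-key: ${env:VENDOR_API_KEY}       # hypothetical header and env var

  # Metrics to a Prometheus-compatible remote-write endpoint.
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push  # assumed address
```

The name after the slash (tempo, vendor) is a free-form label that lets one config hold several instances of the same exporter type.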

A minimal production-grade Collector YAML — one pipeline for traces:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80
    spike_limit_percentage: 25
  batch:
    timeout: 10s
    send_batch_size: 512
  attributes:
    actions:
      - key: user.email
        action: delete

exporters:
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true  # plaintext gRPC; only acceptable on a trusted in-cluster network

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, attributes]
      exporters: [otlp/tempo]

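This accepts OTLP on 4317/4318, sets a hard memory limit at 80% of available memory (with the 25% spike allowance, the limiter starts refusing new data at roughly 55%, well before OOM), batches spans (a batch is sent once 512 spans accumulate or 10 s elapse, whichever comes first), deletes the user.email attribute from every span, and exports OTLP to a Tempo backend.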
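
To see what those percentages mean in practice, the annotated fragment below works through the numbers for a Collector container with an assumed 2 GiB (2048 MiB) memory limit. The send_batch_max_size value is an addition for illustration and is not part of the config above.

```yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80         # hard limit: 0.80 * 2048 MiB, about 1638 MiB
    spike_limit_percentage: 25   # spike allowance: 0.25 * 2048 MiB = 512 MiB
    # soft limit = hard limit - spike allowance, about 1126 MiB.
    # Above the soft limit the processor refuses incoming data;
    # above the hard limit it also forces garbage collection.

  batch:
    timeout: 10s                 # flush after 10 s even if the batch is small
    send_batch_size: 512         # flush as soon as 512 spans are buffered
    send_batch_max_size: 1024    # assumed cap; larger batches are split
```

send_batch_size is a trigger rather than a ceiling, so a burst can produce a larger batch unless send_batch_max_size is also set.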

Three deployment patterns

Agent — a Collector runs as a sidecar (one per pod) or DaemonSet (one per node). Receives telemetry from the local application, does minimal processing (resource enrichment, batching), exports directly to the backend. Cheapest in network hops, hardest to run tail sampling (each agent sees only one node’s traffic).

Gateway — applications export directly to a central pool of Collector replicas that does heavy processing (tail sampling, redaction, multi-backend routing). Easier to scale processing centrally, but every span crosses the network from the app to the gateway.
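
A gateway configuration might look roughly like the sketch below: heavy processing in one place and fan-out to two backends. The sampling thresholds and the vendor endpoint are assumptions, and tail sampling policies are treated properly in the next lesson.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80
    spike_limit_percentage: 25
  tail_sampling:
    decision_wait: 10s             # how long to wait for a trace to complete
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: keep-slow
        type: latency
        latency:
          threshold_ms: 500        # assumed latency threshold
  batch:

exporters:
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true
  otlp/vendor_a:
    endpoint: vendor-a.example.com:4317   # hypothetical endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, tail_sampling, batch]
      exporters: [otlp/tempo, otlp/vendor_a]
```

Listing two exporters in one pipeline sends every sampled trace to both backends.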

Agent-to-gateway (the production-canonical pattern) — a DaemonSet agent on every node does minimal local processing and forwards via OTLP to a centralised gateway pool that handles tail sampling, redaction, and routing. The agent adds Kubernetes metadata (pod, namespace, node) via the k8sattributes processor before forwarding.

| Pattern | Collector location | Tail sampling possible? | Best for |
| --- | --- | --- | --- |
| Agent only | DaemonSet per node | No: each agent sees only one node | Simple setups, head sampling only |
| Gateway only | Central pool | Yes: sees all traffic (with several replicas, spans of one trace must still reach the same replica) | Small fleets, simple topologies |
| Agent-to-gateway | DaemonSet + central gateway | Yes: gateway sees full traces via sticky routing | Production Kubernetes, all sizes |

In the agent-to-gateway pattern, the agent’s loadbalancing exporter routes by trace_id hash to ensure all spans of a trace land on the same gateway replica — necessary for tail sampling, covered in the next lesson.
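
The agent side of that arrangement can be sketched as follows. The headless Service hostname is a hypothetical name, and the example relies on the loadbalancing exporter's default OTLP port of 4317.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80
    spike_limit_percentage: 25
  k8sattributes:      # defaults; enriches with pod, namespace, node metadata
  batch:

exporters:
  loadbalancing:
    routing_key: traceID          # same trace ID always maps to the same gateway replica
    protocol:
      otlp:
        tls:
          insecure: true          # plaintext, in-cluster only
    resolver:
      dns:
        # Hypothetical headless Service that resolves to every gateway pod IP.
        hostname: otel-gateway-headless.observability.svc.cluster.local

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, k8sattributes, batch]
      exporters: [loadbalancing]
```

The DNS resolver needs a headless Service so that it sees individual pod IPs; an ordinary ClusterIP Service would hide the replicas behind a single address and defeat the hashing.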

Why this works

Why does the Collector deserve to be a separate process from the application, even though the SDK could export to backends directly? Three reasons. (1) Buffering: when the backend slows or fails, the Collector queues and retries (in memory by default, and on disk when a persistent queue is configured) so the application is not blocked. A direct SDK export would either drop telemetry (data loss) or back up into the application's request handler (latency regression). (2) Policy: tail sampling, redaction, and multi-backend routing belong outside application code, so platform engineers can update YAML without coordinating a redeploy across 50 teams. (3) Heterogeneous fleet: different services run different languages and SDK versions. The Collector normalises everything to OTLP and applies uniform policy regardless of upstream.
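
The buffering point can be sketched in config. The fragment below enables retries and a disk-backed sending queue on an exporter; the directory, queue size, and retry timings are assumed values, and file_storage is an extension from the contrib distribution.

```yaml
extensions:
  file_storage:
    directory: /var/lib/otelcol/queue   # assumed path; must exist and be writable

exporters:
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_elapsed_time: 300s    # give up on a batch after 5 minutes of retries
    sending_queue:
      enabled: true
      queue_size: 5000          # assumed size, counted in batches
      storage: file_storage     # persist the queue to disk instead of memory

service:
  extensions: [file_storage]
```

Without the storage line the queue lives in memory only, so anything still queued is lost if the Collector restarts during a backend outage.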

Quiz

A team's OTel Collector is refusing spans during a traffic spike (otelcol_processor_refused_spans is non-zero). Which processor is most likely responsible?

Quiz

Why is the agent-to-gateway pattern the production-canonical deployment for Kubernetes?

Order the steps

Order the Collector pipeline stages a span passes through in a standard trace pipeline:

  1. OTLP receiver accepts the span batch from the application SDK
  2. memory_limiter checks current RAM and drops if over threshold
  3. k8sattributes adds pod, namespace, and node metadata
  4. batch groups spans for efficient export
  5. attributes redacts PII fields
  6. OTLP exporter sends the batch to the backend
Recall before you leave
  1. What does the memory_limiter processor do and why must it come before other processors in the pipeline?
  2. Why does the agent-to-gateway pattern use a loadbalancing exporter on the agent tier instead of a simple round-robin?
  3. Name three processors and explain the order in which they should appear in a production traces pipeline.
Recap

The OTel Collector is a YAML-configured pipeline with three stages: receivers (otlp, filelog, prometheus, kafka) that accept incoming telemetry; processors (memory_limiter, batch, resource, attributes, k8sattributes, tail_sampling, transform) that transform records in-flight; and exporters (otlp, datadog, prometheusremotewrite, loki) that ship records to backends. memory_limiter must come first — it drops records cheaply before expensive processing when the Collector is under memory pressure. Three deployment patterns: agent (DaemonSet per node, simple, no tail sampling), gateway (central pool, tail sampling possible), and agent-to-gateway (the production-canonical Kubernetes pattern — DaemonSet agents enrich with Kubernetes metadata and forward via trace_id-hashed routing to a central gateway that runs tail sampling and multi-backend routing). The Collector’s value is decoupling policy (redaction, routing, sampling) from instrumentation (application code) — update YAML, not code.


Trademarks belong to their respective owners. Editorial reference only.