Queues, Streams, Eventing QUE · 08 · 09

Queues capstone: code and config reading

Read real pipeline config and consumer code — outbox plus CDC, offset-commit order, dedup, and DLQ handling — predict the behaviour, and pick the fix a senior would make first.

QUE Senior ◷ 14 min

Level

FoundationsJuniorMiddleSenior

The pipeline’s bugs do not live in prose — they live in a transaction boundary, a commit order, a missing unique constraint, and a connector config. Read each snippet across the order pipeline and choose the fix a senior engineer reaches for first.

Goal

Practise reading the seams of an event-driven pipeline the way you read an incident: spot where atomicity, ordering, idempotency, or DLQ handling is silently broken, and name the highest-leverage correction.

Snippet 1 — the outbox write

-- order handler, one DB transaction
BEGIN;
  INSERT INTO orders (id, customer_id, status, total_cents)
    VALUES ('ord_9f', 'cust_3a', 'placed', 4200);
  INSERT INTO outbox (id, aggregate_id, type, payload, created_at)
    VALUES ('evt_7c', 'ord_9f', 'order.placed', '{"order_id":"ord_9f"}', now());
COMMIT;
-- separately, a relay later:
--   SELECT * FROM outbox WHERE sent = false ...
--   publish to Kafka, then UPDATE outbox SET sent = true

Quiz

What guarantee does this structure give, and what must downstream consumers do as a result?

Snippet 2 — outbox plus CDC interaction

// Debezium outbox-event-router connector
{
  "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
  "table.include.list": "public.outbox",
  "transforms": "outbox",
  "transforms.outbox.type":
    "io.debezium.transforms.outbox.EventRouter",
  "transforms.outbox.route.by.field": "aggregate_id",
  "slot.name": "order_outbox_slot",
  "heartbeat.interval.ms": "0"
}

Quiz

This connector tails the outbox table and routes events to a Kafka topic keyed by aggregate_id. Two settings interact badly with a low-traffic deployment plus a stalled consumer. Which is the real production hazard?

Snippet 3 — the consumer commit order

for msg in consumer:                      # at-least-once Kafka consumer
    event = parse(msg.value)
    consumer.commit()                     # commit offset FIRST
    charge_payment(event.order_id,        # then do the side effect
                   event.amount_cents)

Quiz

This payment consumer commits the offset before charging. What is the failure mode, and what is the correct ordering?

Snippet 4 — DLQ handling

def handle(event):
    try:
        process(event)                       # idempotent side effect
    except Exception:
        send_to_dlq(event)                    # any failure -> DLQ
        consumer.commit()

Quiz

This handler sends every failure straight to the DLQ on the first exception. What goes wrong, and what is the senior fix?

Recap

Every seam of the pipeline is read in code and config: the outbox makes two INSERTs atomic but the relay’s publish-then-mark is at-least-once, so consumers must dedupe; a CDC connector keyed by aggregate_id preserves per-order ordering, but heartbeat.interval.ms = 0 with no slot cap is a disk-fill outage waiting on a stall; committing the offset before the side effect is at-most-once and silently loses charges, so commit after processing and stay idempotent; and DLQ-ing on the first exception buries poison messages under recoverable transients, so retry with a budget and quarantine only on exhaustion. Read the transaction boundary, the commit order, the unique constraint, and the connector config — that is where the pipeline’s correctness actually lives. Now when you review a pull request touching any of these four things, you know you are not reviewing style — you are reviewing whether charges get lost, doubled, or silently blocked.

Something unclear?

Ask a question about this lesson. Questions are anonymous and go straight to the author to make the lesson better.