The transactional outbox: ending the dual-write between DB and broker
An order is placed. The handler does two things: INSERT the order row, then kafka.publish("OrderPlaced"). Both succeed in the happy path for months. Then one night the broker has a 400ms blip; the publish call hangs and the pod gets OOM-killed mid-call. The order row is committed and visible — the customer was charged — but OrderPlaced never landed. Fulfillment never hears about it. The package never ships. There is no error in any log, because nothing threw. Reconciliation finds it a week later, by hand.
Why the dual write has no safe order
You have two systems with no shared transaction: your database and your message broker. Any handler that must update one and notify the other is doing a dual write, and there is no ordering of those two calls that survives a crash in the gap between them.
Write the row, then publish: crash after commit, before publish, and you have lost an event. The state changed but nobody downstream knows. Publish, then write the row: crash after publish, before commit, and you have a phantom event — consumers react to an order that was rolled back and never existed. Wrap both in a try/catch and “compensate” on failure, and you have just written a tiny, buggy distributed-transaction coordinator that itself can crash between steps. Two-phase commit across a DB and Kafka would close the gap, but Kafka doesn’t offer XA participation you’d want in production, and 2PC’s blocking coordinator is a worse problem than the one you started with.
The dual write is not a bug you can code around inside the handler. It is the absence of atomicity across two stores.
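To make the gap concrete, here is a minimal Python sketch that simulates both orderings. It assumes an in-memory SQLite database in place of the real one and a plain list in place of Kafka; the `Crash` exception and both handler functions are hypothetical names for illustration, and a real crash is a killed process rather than an exception.

```python
import sqlite3

class Crash(Exception):
    """Stands in for the process dying (e.g. an OOM kill) between two steps."""

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
    return db

def write_then_publish(db, broker, order_id, crash_in_gap):
    db.execute("INSERT INTO orders VALUES (?, 'placed')", (order_id,))
    db.commit()                       # the row is now durable
    if crash_in_gap:
        raise Crash()                 # the process dies before the publish
    broker.append(("OrderPlaced", order_id))

def publish_then_write(db, broker, order_id, crash_in_gap):
    broker.append(("OrderPlaced", order_id))   # consumers can already see this
    db.execute("INSERT INTO orders VALUES (?, 'placed')", (order_id,))
    if crash_in_gap:
        db.rollback()                 # uncommitted work dies with the process
        raise Crash()
    db.commit()

def run(handler):
    """Run one handler with a crash in the gap; return (rows, events)."""
    db, broker = make_db(), []
    try:
        handler(db, broker, "o-1", crash_in_gap=True)
    except Crash:
        pass
    rows = db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    return rows, len(broker)

lost_event = run(write_then_publish)      # (1, 0): a row with no event
phantom_event = run(publish_then_write)   # (0, 1): an event with no row
```

Neither result can be repaired from inside the handler, because the handler is the thing that died.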
One transaction, one source of truth
The outbox pattern collapses the dual write into a single write. In the same local database transaction that changes the business state, you also INSERT a row into an outbox table describing the event. Because both writes are in one ACID transaction against one database, they commit together or roll back together — atomically, for free, with no coordinator.
BEGIN;
INSERT INTO orders (id, status, ...) VALUES (...);
INSERT INTO outbox (id, aggregate, type, payload, created_at)
VALUES (uuid, 'order', 'OrderPlaced', '{...}', now());
COMMIT;
After commit, the truth that “an OrderPlaced event must be delivered” is now durable in the same database as the order itself. Publishing to the broker becomes a separate, retryable step performed by a relay (a.k.a. message relay / dispatcher) that reads unsent outbox rows and pushes them to Kafka. The handler’s job is done at COMMIT; it never talks to the broker at all. You have traded an impossible atomic dual-write for a trivially-atomic single write plus an asynchronous, crash-safe pump.
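The all-or-nothing behavior can be checked with a small Python sketch. It assumes SQLite in place of the production database; the `sent` column, the `place_order` function, and the `fail_before_commit` flag are hypothetical additions for illustration.

```python
import json
import sqlite3
import uuid

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE outbox ("
        " id TEXT PRIMARY KEY, aggregate TEXT, type TEXT, payload TEXT,"
        " created_at TEXT DEFAULT CURRENT_TIMESTAMP,"
        " sent INTEGER NOT NULL DEFAULT 0)"   # assumed flag for the relay
    )
    return db

def place_order(db, order_id, fail_before_commit=False):
    """Write the order and its event in ONE local transaction."""
    with db:  # commits on success, rolls back if the body raises
        db.execute(
            "INSERT INTO orders (id, status) VALUES (?, 'placed')", (order_id,)
        )
        db.execute(
            "INSERT INTO outbox (id, aggregate, type, payload)"
            " VALUES (?, 'order', 'OrderPlaced', ?)",
            (str(uuid.uuid4()), json.dumps({"order_id": order_id})),
        )
        if fail_before_commit:
            raise RuntimeError("simulated failure before COMMIT")

def counts(db):
    orders = db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    events = db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
    return orders, events

db = make_db()
place_order(db, "o-1")
after_success = counts(db)    # (1, 1): order and event committed together

try:
    place_order(db, "o-2", fail_before_commit=True)
except RuntimeError:
    pass
after_failure = counts(db)    # still (1, 1): both inserts rolled back
```

The failed call leaves neither an order nor an event behind, which is the property the dual write could not provide.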
Two relay flavors: polling vs log-tailing
The relay reads the outbox in one of two ways, and choosing between them is the main design decision in this pattern.
Polling publisher — a worker runs SELECT ... WHERE sent = false ORDER BY created_at LIMIT N on an interval, publishes each row, then marks it sent. Dead simple, works on any database, no extra infrastructure. The costs: every poll is load on your primary OLTP database whether or not there’s work, and end-to-end latency is bounded by your interval. Intervals of 100ms–1s are typical; pushing to 25–50ms to chase freshness multiplies query load, and at 10–30 relay replicas that constant polling becomes a real burden on the primary.
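A single polling pass might look like the following Python sketch. It assumes SQLite in place of the primary database and a `publish` callback in place of a Kafka producer; `poll_once`, the `seq` column (standing in for `created_at` so the order is deterministic), and the batch size are hypothetical choices for illustration.

```python
import sqlite3

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE outbox ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " type TEXT, payload TEXT, sent INTEGER NOT NULL DEFAULT 0)"
    )
    return db

def poll_once(db, publish, batch_size=100):
    """One polling pass: read unsent rows in order, publish, then mark sent."""
    rows = db.execute(
        "SELECT seq, type, payload FROM outbox"
        " WHERE sent = 0 ORDER BY seq LIMIT ?",
        (batch_size,),
    ).fetchall()
    for seq, event_type, payload in rows:
        publish(event_type, payload)    # may raise; the row then stays unsent
        db.execute("UPDATE outbox SET sent = 1 WHERE seq = ?", (seq,))
        db.commit()                     # mark only after a successful publish
    return len(rows)

db = make_db()
db.executemany(
    "INSERT INTO outbox (type, payload) VALUES (?, ?)",
    [("OrderPlaced", "o-1"), ("OrderPlaced", "o-2"), ("OrderPlaced", "o-3")],
)
db.commit()

published = []
record = lambda event_type, payload: published.append(payload)
first_pass = poll_once(db, record, batch_size=2)    # publishes o-1, o-2
second_pass = poll_once(db, record, batch_size=2)   # publishes o-3
idle_pass = poll_once(db, record, batch_size=2)     # nothing to do, still a query
```

A real relay would call `poll_once` in a loop and sleep for the interval between passes. The idle pass is where the load comes from: each replica issues one query per interval whether or not there is work, so 30 replicas polling every 25ms send 1,200 queries per second to the primary.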
Change Data Capture (CDC) — instead of querying the table, a tool like Debezium tails the database’s write-ahead log (the WAL in Postgres) and emits an event the moment the outbox INSERT is committed. Latency drops to single-digit milliseconds, there is zero polling load on the table, and you get the log’s natural commit ordering. The price is operational: you now run and monitor a CDC connector, manage replication slots, and reason about WAL retention. CDC is where the next unit picks up.
| Dimension | Polling publisher | CDC / log-tailing |
|---|---|---|
| End-to-end latency | Bounded by interval (100ms–1s) | Single-digit ms |
| Load on primary DB | Constant query traffic, even when idle | Reads the WAL, near-zero table load |
| Ordering | By ORDER BY; lost with parallel relays | Natural commit order from the log |
| Operational cost | A cron-like worker; trivial | A connector, replication slots, WAL retention |
| Best when | Modest volume, latency tolerant, no new infra | High throughput, low latency, ordering matters |
The guarantee is at-least-once — so consumers must be idempotent
The outbox closes the lost-event gap, but it does not give you exactly-once delivery, and believing it does is the failure mode that bites in production. The relay does two non-atomic things: publish to the broker, then mark the row sent. If it crashes after the broker accepted the message but before the UPDATE commits, the row still looks unsent — so on restart it publishes the same event again. You get at-least-once: every event is delivered one or more times.
That means the contract for everyone downstream is non-negotiable: consumers must be idempotent. Stamp each outbox row with a stable event id, carry it through the broker, and have consumers dedupe on it — typically an inbox table of processed ids (kept for a few hours to a couple of days) or a unique constraint on the side effect. Processing OrderPlaced twice must charge the card once. Without idempotency, the outbox doesn’t make your system reliable; it makes it reliably double-fire.
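One way to sketch the inbox approach in Python is shown below. It assumes the consumer keeps its dedupe table in the same SQLite database as the side effect; the `inbox` and `charges` tables and the `handle_order_placed` function are hypothetical, and the `charges` row only stands in for the real side effect.

```python
import sqlite3

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE inbox (event_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE charges (order_id TEXT, amount_cents INTEGER)")
    return db

def handle_order_placed(db, event_id, order_id, amount_cents):
    """Apply the effect once per event id. Returns True if it was applied."""
    try:
        with db:  # the dedupe record and the effect commit together
            db.execute("INSERT INTO inbox (event_id) VALUES (?)", (event_id,))
            db.execute(
                "INSERT INTO charges VALUES (?, ?)", (order_id, amount_cents)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: this id is already in the inbox

db = make_db()
first = handle_order_placed(db, "evt-1", "o-1", 4999)    # applied
second = handle_order_placed(db, "evt-1", "o-1", 4999)   # skipped as a duplicate
charge_count = db.execute("SELECT COUNT(*) FROM charges").fetchone()[0]
```

The primary key on `event_id` does the deduplication, and putting both inserts in one transaction means a crash cannot record the id without the effect or the effect without the id.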
Why this works
Couldn’t the relay mark the row sent first, then publish? No — that just moves the gap and turns a duplicate into a lost event: crash after the UPDATE and before the publish, and the row looks done but nothing was ever sent. The only orderings available are publish-then-mark (duplicates, recoverable) or mark-then-publish (loss, unrecoverable). At-least-once is the strictly safer choice, which is why idempotent consumers are the load-bearing half of the pattern, not an optional nicety.
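The two orderings can be compared with a short Python simulation. It assumes a dict standing in for one durable outbox row and a list standing in for the broker; `Crash`, `relay_pass`, and `deliveries` are hypothetical names for illustration.

```python
class Crash(Exception):
    """Stands in for the relay process dying between its two steps."""

def publish(row, broker):
    broker.append(row["id"])

def mark_sent(row, broker):
    row["sent"] = True   # modeled as already committed, so it survives the crash

def relay_pass(row, broker, order, crash_between):
    """Run one relay attempt on a single outbox row."""
    if row["sent"]:
        return           # the relay only looks at unsent rows
    first, second = order
    first(row, broker)
    if crash_between:
        raise Crash()
    second(row, broker)

def deliveries(order):
    """Crash once between the steps, restart, and count broker deliveries."""
    row, broker = {"id": "evt-1", "sent": False}, []
    try:
        relay_pass(row, broker, order, crash_between=True)
    except Crash:
        pass
    relay_pass(row, broker, order, crash_between=False)   # after restart
    return len(broker)

publish_then_mark = deliveries((publish, mark_sent))   # 2: a duplicate
mark_then_publish = deliveries((mark_sent, publish))   # 0: the event is lost
```

A count of 2 is something an idempotent consumer can absorb; a count of 0 leaves nothing to recover from.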
The operational tax: bloat, ordering, and competing relays
Three things will eventually page someone:
- Table bloat. The outbox grows with every event. Sent rows must be reaped or the table (and its indexes) balloons into the hundreds of thousands of rows, where polling queries degrade. Don’t DELETE huge ranges in one statement (it fights inserts for locks); delete sent rows in small batches, or partition the outbox by day so cleanup is a metadata-only DROP PARTITION instead of index-thrashing row deletes.
- Ordering. A single relay preserves created_at order. The moment you scale to multiple relays for throughput, two workers can publish concurrently and events arrive out of order. If a consumer needs per-aggregate order, key by aggregate id so all events for one order go to the same partition and one worker.
- Competing relays. Run two relay replicas naively and both SELECT the same unsent rows and double-publish. The fix is SELECT ... FOR UPDATE SKIP LOCKED (Postgres / MySQL 8.0+): each worker grabs and locks a disjoint batch, skipping rows another worker already holds, so replicas scale out without stepping on each other.
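Of the three, batched reaping is the easiest to sketch in a self-contained way. The Python example below assumes SQLite, which has no SKIP LOCKED or partitioning, so it illustrates only the small-batch delete; `reap_sent`, the `seq` column, and the batch size are hypothetical choices for illustration.

```python
import sqlite3

def reap_sent(db, batch_size=1000):
    """Delete sent rows in small batches; returns the total number deleted."""
    total = 0
    while True:
        with db:  # one short transaction per batch
            cur = db.execute(
                "DELETE FROM outbox WHERE seq IN ("
                " SELECT seq FROM outbox WHERE sent = 1"
                " ORDER BY seq LIMIT ?)",
                (batch_size,),
            )
        total += cur.rowcount
        if cur.rowcount < batch_size:   # a short batch means nothing is left
            return total

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE outbox (seq INTEGER PRIMARY KEY, sent INTEGER NOT NULL)")
db.executemany(
    "INSERT INTO outbox (seq, sent) VALUES (?, ?)",
    [(i, 1 if i < 2500 else 0) for i in range(2510)],   # 2500 sent, 10 unsent
)
db.commit()

deleted = reap_sent(db, batch_size=1000)   # batches of 1000, 1000, 500
remaining = db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
```

Each batch commits on its own, so locks are held only briefly and inserts from order handlers can interleave between batches. Unsent rows are never touched.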
An order service must update the DB and notify fulfillment via Kafka. A crash in the gap loses the event. Pick the design.
The relay publishes a message to the broker, then crashes before it can UPDATE the row to sent. What happens on restart?
You scale the polling relay to three replicas for throughput. What must change so they don't double-publish the same rows?
Order the lifecycle of one event through the outbox pattern:
- 1 In one local transaction, INSERT the business row AND an outbox row, then COMMIT
- 2 The relay reads unsent outbox rows (polling on an interval, or tailing the WAL via CDC)
- 3 The relay publishes each event to the broker
- 4 The relay marks the row sent (the gap here makes delivery at-least-once)
- 5 The idempotent consumer dedupes on the event id and applies the effect once
- 01 Explain to a teammate why writing the DB and publishing to Kafka in the same handler is unsafe, and how the outbox fixes it.
- 02 Why is the outbox at-least-once rather than exactly-once, and what does that force on the rest of the system?
A handler that updates your database and publishes to a broker is doing a dual write across two systems with no shared transaction, and no ordering of those calls survives a crash in the gap — you either lose an event or emit a phantom one. The transactional outbox removes the dual write entirely: in the same local transaction that changes business state, you INSERT an outbox row, so both commit atomically against one database with no coordinator. A separate relay then ships outbox rows to the broker, either by polling on a 100ms–1s interval (simple, but adds latency and constant DB load) or by tailing the WAL with CDC like Debezium (single-digit-ms latency and natural ordering, at the cost of running a connector). Because the relay publishes then marks sent in two non-atomic steps, delivery is at-least-once, so consumers must dedupe on a stable event id. The operational tax is real — reap or partition the outbox to fight bloat, key by aggregate to preserve order, and use FOR UPDATE SKIP LOCKED so competing relays don’t double-publish — but in exchange you get an event pipeline that never silently loses a write.