Crux Read real SQL and handler snippets — an outbox schema, a relay claim query, and a dual-write bug — predict the behaviour, and pick the highest-leverage fix.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at senior altitude — in orbit
◷ 14 min
The schema, the relay query, and the handler are where outbox correctness is won or lost. Read each snippet, predict how it behaves under a crash or under load, then choose the fix a senior engineer would make first.
Goal
Practise the loop you run in every outbox review: read the SQL and the handler, find the gap a crash exposes, and reach for the fix that closes it without inventing a distributed transaction.
Snippet 1 — the handler
async function placeOrder(db, broker, order) { await db.query("INSERT INTO orders (id, status) VALUES ($1, 'placed')", [order.id]); await broker.publish("OrderPlaced", order); // separate system, separate call}
Quiz
Completed
This handler is the dual write the unit exists to kill. Where is the gap, and what is the minimal fix?
Heads-up An OOM kill or power loss between the two calls throws nothing to catch, and you can't roll back a committed order from a dead process. try/catch can't compensate a crash in the gap — that's why the outbox moves the publish out of the handler entirely.
Heads-up That converts a lost event into a phantom event — crash after publish and before commit, and consumers act on an order that rolled back. Reordering two non-atomic calls never closes the gap.
Heads-up Retries live in the handler's memory and die with the pod, and there's no durable record that a publish was owed. The outbox makes that record durable in the DB so a relay can retry after restart.
Snippet 2 — the outbox schema
CREATE TABLE outbox ( id uuid PRIMARY KEY, aggregate text NOT NULL, event_type text NOT NULL, payload jsonb NOT NULL, created_at timestamptz NOT NULL DEFAULT now(), sent_at timestamptz -- NULL until the relay marks it sent);CREATE INDEX outbox_unsent ON outbox (created_at) WHERE sent_at IS NULL;
Quiz
Completed
Why is the partial index `WHERE sent_at IS NULL` the right shape for the relay's workload?
Heads-up This index is on the outbox table and doesn't touch the orders table at all. It targets the relay's read of unsent rows, not the business write.
Heads-up An index changes query speed, not delivery semantics. Delivery is still at-least-once because publish and mark-sent are two non-atomic steps; the index just keeps the poll cheap.
Heads-up A full index also covers the millions of sent rows the relay never queries, so it grows large and costs more to maintain on every write. The partial index covers only the rows the relay reads.
Snippet 3 — the relay claim query
BEGIN;SELECT id, event_type, payload FROM outbox WHERE sent_at IS NULL ORDER BY created_at LIMIT 100 FOR UPDATE SKIP LOCKED;-- publish each row to the broker, then:UPDATE outbox SET sent_at = now() WHERE id = ANY($claimed_ids);COMMIT;
Quiz
Completed
Three relay replicas run this exact query concurrently. What does FOR UPDATE SKIP LOCKED buy you, and what guarantee still does NOT hold?
Heads-up SKIP LOCKED only prevents two replicas from grabbing the same rows in one pass. The publish-then-UPDATE gap still exists per replica, so a crash there replays the batch — delivery stays at-least-once.
Heads-up Without SKIP LOCKED the second replica blocks on the locked rows, serialising the relays and killing the throughput you scaled out to get. SKIP LOCKED is what makes parallel relays actually parallel.
Heads-up Across three replicas publishing concurrently, global order is lost despite the ORDER BY. Per-aggregate order needs keying by aggregate id so one aggregate's events go to one partition.
Snippet 4 — the consumer
async function onOrderPlaced(db, event) { // event.id is the stable outbox row id, carried through the broker await db.query("INSERT INTO orders_processed (event_id) VALUES ($1)", [event.id]); await chargeCard(event.order);}
Quiz
Completed
Delivery is at-least-once, so this consumer can receive the same event twice. What is wrong, and how do you make it idempotent?
Heads-up Recording the id isn't dedupe unless the column is unique and you act on the conflict. As written, a duplicate just inserts another row and the card is charged again.
Heads-up Charging first makes the double-charge more likely, not less. The fix is a unique constraint plus conflict handling so the side effect runs exactly once per event id.
Heads-up The outbox relay is at-least-once by construction (publish-then-mark gap); no broker setting changes that end-to-end. Idempotency on a stable event id is the consumer's responsibility.
Recap
Every outbox review reads the same way in code: the bare handler shows the dual-write gap a crash exposes; the schema’s partial index keeps the relay’s unsent-row poll cheap as the table grows; the claim query’s FOR UPDATE SKIP LOCKED lets replicas scale out without double-publishing, yet the publish-then-mark gap keeps delivery at-least-once; and the consumer closes the loop with a unique event id plus ON CONFLICT DO NOTHING so a replay charges the card exactly once. Find the gap, make the intent durable, claim rows disjointly, dedupe downstream.