Queues, Streams, Eventing QUE · 01 · 03

Consumer-side dedup: the cheapest path to exactly-once processing

How to wrap side effects and dedup inserts in one DB transaction, why INSERT must come before the side effect, and the Stripe Idempotency-Key pattern for external APIs.


You added a dedup check to your payment consumer: a quick SELECT before charging. Duplicates dropped from dozens per week to zero. Then, three weeks later, connection-pool exhaustion made the check fail silently, and a charge ran twice. The check was outside the transaction. One line in the wrong place.

The naive dedup pattern and why it fails

Before you write a single line of dedup logic, ask yourself: is this check inside the same transaction as the side effect? If not, the window you think you closed is still open.

The first instinct is a two-step check:

SELECT 1 FROM processed WHERE msg_id = 'msg-7a3f';
-- if found: skip
-- if not found: call Stripe, then INSERT into processed

This fails under concurrent delivery. Two consumers receive the same message simultaneously (possible during a rebalance or after a visibility timeout). Both run the SELECT at the same moment, both see “not found”, both call Stripe. Race condition. Two charges.
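The race can be reproduced deterministically by interleaving the two consumers' steps by hand. The following is a minimal sketch, assuming SQLite (via Python's standard sqlite3 module) as a stand-in for the real database; `charge_card`, `already_processed`, and `mark_processed` are hypothetical helpers, and `charge_card` only records the call instead of contacting a payment provider.

```python
import sqlite3

# Hypothetical stand-in for the payment provider: records each charge.
charges = []

def charge_card(msg_id):
    charges.append(msg_id)

def already_processed(conn, msg_id):
    row = conn.execute(
        "SELECT 1 FROM processed WHERE msg_id = ?", (msg_id,)
    ).fetchone()
    return row is not None

def mark_processed(conn, msg_id):
    conn.execute("INSERT INTO processed (msg_id) VALUES (?)", (msg_id,))
    conn.commit()

conn = sqlite3.connect(":memory:")
# No UNIQUE constraint, matching the naive SELECT-then-act design.
conn.execute("CREATE TABLE processed (msg_id TEXT)")

# Consumers A and B both receive msg-7a3f; both check before either acts.
a_seen = already_processed(conn, "msg-7a3f")  # A: not found
b_seen = already_processed(conn, "msg-7a3f")  # B: not found

if not a_seen:
    charge_card("msg-7a3f")
    mark_processed(conn, "msg-7a3f")
if not b_seen:
    charge_card("msg-7a3f")
    mark_processed(conn, "msg-7a3f")

print(len(charges))  # 2: the same message was charged twice
```

Both checks complete before either consumer records the message, so neither check can see the other's work.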

The correct pattern: INSERT-first, single transaction

The fix: put the dedup INSERT and the side effect in one atomic DB transaction, and INSERT first:

BEGIN;
  INSERT INTO processed (msg_id, created_at)
  VALUES ('msg-7a3f', now());
  -- on UNIQUE violation: ROLLBACK and skip
  -- if insert succeeded: do the side effect
  UPDATE orders SET status = 'paid' WHERE id = 'O-123';
COMMIT;

On a unique-constraint violation, the transaction rolls back — the side effect never runs. On commit, both the record and the side effect are written together. No crash window between them.

The key property: if the consumer crashes after the DB transaction commits but before it acks the broker, the broker redelivers. The next consumer tries to INSERT msg-7a3f again, hits the unique constraint, rolls back, and acks the broker. The side effect was already done once; the duplicate is silently discarded.
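As a rough illustration of the INSERT-first transaction and of the redelivery case, here is a sketch that again assumes SQLite through Python's sqlite3 module rather than the production database. The `handle` function and its return values are hypothetical; the table and column names follow the SQL above, with `msg_id` declared as the primary key so that it is unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed (msg_id TEXT PRIMARY KEY, created_at TEXT)")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders VALUES ('O-123', 'new')")
conn.commit()

def handle(conn, msg_id, order_id):
    """Return 'processed' or 'duplicate'; the caller acks the broker either way."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Dedup row first: a duplicate fails here, before any side effect.
            conn.execute(
                "INSERT INTO processed (msg_id, created_at) "
                "VALUES (?, datetime('now'))",
                (msg_id,),
            )
            # Side effect, inside the same transaction.
            conn.execute(
                "UPDATE orders SET status = 'paid' WHERE id = ?", (order_id,)
            )
    except sqlite3.IntegrityError:
        return "duplicate"
    return "processed"

first = handle(conn, "msg-7a3f", "O-123")
second = handle(conn, "msg-7a3f", "O-123")  # simulated broker redelivery
print(first, second)  # processed duplicate
```

The caller acks the broker in both cases: a duplicate is not an error, it is a message whose work is already committed.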

The whole correctness argument in one frame: SELECT-then-act leaves a race window open under concurrent delivery; INSERT-first inside one transaction closes it with the UNIQUE constraint.

Transaction structure: INSERT dedup row first

1 BEGIN transaction
2 INSERT INTO processed (msg_id), enforced by the UNIQUE constraint
2a UNIQUE violation? → ROLLBACK. Log “duplicate skipped”. Ack broker.
3 Perform side effect (UPDATE orders, send email job, etc.)
4 COMMIT transaction
5 Ack the broker, which removes the message from the queue

Together, steps 1–4 are what make the pattern correct: without the UNIQUE constraint (step 2), concurrent duplicates race through; without the transaction wrapping both INSERT and side effect (steps 2–4), a crash between them leaves the message marked processed but the work undone, and the next redelivery skips it permanently.
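The second failure mode (dedup row committed, work undone) can be shown side by side with the correct structure. This is a sketch under stated assumptions: SQLite via Python's sqlite3 stands in for the database, the `Crash` exception simulates a process dying (a real crash would simply leave the open transaction uncommitted), and `split_commits`, `one_transaction`, and `run` are hypothetical names.

```python
import sqlite3

class Crash(Exception):
    """Simulated process crash (hypothetical, for illustration only)."""

def fresh_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (msg_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO orders VALUES ('O-123', 'new')")
    conn.commit()
    return conn

def split_commits(conn, msg_id, crash=False):
    # Broken variant: the dedup row is committed on its own.
    try:
        conn.execute("INSERT INTO processed (msg_id) VALUES (?)", (msg_id,))
        conn.commit()
    except sqlite3.IntegrityError:
        conn.rollback()
        return "duplicate"
    if crash:
        raise Crash()
    conn.execute("UPDATE orders SET status = 'paid' WHERE id = 'O-123'")
    conn.commit()
    return "processed"

def one_transaction(conn, msg_id, crash=False):
    # Correct variant: dedup row and side effect commit or roll back together.
    try:
        with conn:
            conn.execute("INSERT INTO processed (msg_id) VALUES (?)", (msg_id,))
            if crash:
                raise Crash()
            conn.execute("UPDATE orders SET status = 'paid' WHERE id = 'O-123'")
    except sqlite3.IntegrityError:
        return "duplicate"
    return "processed"

def run(handler):
    conn = fresh_db()
    try:
        handler(conn, "msg-7a3f", crash=True)  # first delivery crashes midway
    except Crash:
        pass
    result = handler(conn, "msg-7a3f")         # broker redelivers
    status = conn.execute(
        "SELECT status FROM orders WHERE id = 'O-123'"
    ).fetchone()[0]
    return result, status

print(run(split_commits))    # ('duplicate', 'new'): work skipped permanently
print(run(one_transaction))  # ('processed', 'paid'): redelivery finishes the job
```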

The gate checks each message's idempotency key against the seen-keys store: first-seen keys are processed and recorded; already-seen keys are dropped.

External APIs: the Stripe Idempotency-Key

The transaction trick only works when the side effect is a DB write inside the same transaction. What about external API calls such as Stripe, SES, or Twilio? You can issue an HTTP call while a Postgres transaction is open, but the database cannot roll that call back, so it is never part of the atomic unit.

The pattern for external APIs: pass an Idempotency-Key header derived from the message ID.

POST /v1/charges
Idempotency-Key: msg-7a3f

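Stripe stores the key and the first response for at least 24 hours. If you call Stripe again with the same key (because the broker redelivered), Stripe returns the cached response without charging the card again. PayPal and Square support the same idea under different names (PayPal uses a PayPal-Request-Id header, Square an idempotency_key field in the request body), so check each provider's documentation for the exact name and retention period.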
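To make the server-side behavior concrete, here is a small in-memory model of a provider that honors idempotency keys. It is a hypothetical stand-in, not Stripe's implementation or client library: `FakePaymentAPI`, `create_charge`, and the `ch_N` ids are invented for illustration, and key expiry is deliberately not modeled.

```python
import itertools

class FakePaymentAPI:
    """In-memory model of a provider that honors idempotency keys.

    Hypothetical stand-in for illustration; keys never expire here.
    """

    def __init__(self):
        self._responses = {}            # idempotency key -> first response
        self._ids = itertools.count(1)
        self.cards_charged = 0

    def create_charge(self, amount, idempotency_key):
        if idempotency_key in self._responses:
            # Repeated key: replay the stored response, charge nothing.
            return self._responses[idempotency_key]
        self.cards_charged += 1
        response = {"id": f"ch_{next(self._ids)}", "amount": amount}
        self._responses[idempotency_key] = response
        return response

api = FakePaymentAPI()
first = api.create_charge(2000, idempotency_key="msg-7a3f")
retry = api.create_charge(2000, idempotency_key="msg-7a3f")  # after redelivery
print(first == retry, api.cards_charged)  # True 1
```

Deriving the key from the message ID is what makes this work across redeliveries: a key generated fresh on each attempt would look like a new request every time.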

When the external API supports idempotency keys, the pattern is:

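1 INSERT a pending row in its own transaction and commit it: INSERT INTO payment_intents (msg_id, status) VALUES ('msg-7a3f', 'pending'), with msg_id UNIQUE. This is the intent log.
2 Call the external API with Idempotency-Key = msg_id.
3 On success: UPDATE the row to status = 'completed' and store the returned charge_id.

If the consumer crashes between steps 2 and 3, the row is still pending when the broker redelivers. The consumer finds the existing pending row, re-calls Stripe with the same key (Stripe returns the cached response, including the original charge_id), then completes the UPDATE. No double charge. If the row is already completed, the consumer skips the call and acks.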
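The intent-log flow and the crash-then-redeliver case can be sketched end to end. Assumptions: SQLite via Python's sqlite3 stands in for the database (its INSERT OR IGNORE plays the role that INSERT ... ON CONFLICT DO NOTHING would play in Postgres), `call_provider` is a hypothetical fake that replays the first result for a repeated key, and `Crash` simulates the consumer dying after the API call.

```python
import sqlite3

class Crash(Exception):
    """Simulated crash between the API call and the UPDATE."""

charges_made = []   # what the hypothetical provider actually charged
_cache = {}         # idempotency key -> charge id

def call_provider(idempotency_key):
    # Hypothetical provider: replays the first result for a repeated key.
    if idempotency_key not in _cache:
        _cache[idempotency_key] = f"ch_{len(_cache) + 1}"
        charges_made.append(idempotency_key)
    return _cache[idempotency_key]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payment_intents ("
    " msg_id TEXT PRIMARY KEY, status TEXT NOT NULL, charge_id TEXT)"
)
conn.commit()

def handle(msg_id, crash_after_call=False):
    # Step 1: record the intent; a redelivery keeps the existing row.
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO payment_intents (msg_id, status)"
            " VALUES (?, 'pending')",
            (msg_id,),
        )
    status = conn.execute(
        "SELECT status FROM payment_intents WHERE msg_id = ?", (msg_id,)
    ).fetchone()[0]
    if status == "completed":
        return "duplicate"
    # Step 2: call the provider with the message ID as the idempotency key.
    charge_id = call_provider(idempotency_key=msg_id)
    if crash_after_call:
        raise Crash()
    # Step 3: mark the intent completed and store the charge id.
    with conn:
        conn.execute(
            "UPDATE payment_intents SET status = 'completed', charge_id = ?"
            " WHERE msg_id = ?",
            (charge_id, msg_id),
        )
    return "completed"

try:
    handle("msg-7a3f", crash_after_call=True)  # crash between steps 2 and 3
except Crash:
    pass
second = handle("msg-7a3f")   # redelivery: same key, cached charge
third = handle("msg-7a3f")    # later duplicate: row already completed
print(second, third, len(charges_made))  # completed duplicate 1
```

The local row does not prevent the second API call after a crash; the idempotency key does. The row exists so that the consumer knows whether it still has work to finish.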

Quiz

Why must the dedup INSERT be in the same transaction as the side effect?

Quiz

A consumer uses Stripe's Idempotency-Key but no local DB dedup. The Stripe call succeeds, then the consumer crashes before acking. On redelivery, what happens?

Order the steps

Order the steps of a correct idempotent consumer wrapping an external payment API:

1 Receive msg-7a3f from broker
2 BEGIN DB transaction
3 INSERT INTO payment_intents (msg_id, status) VALUES ('msg-7a3f', 'pending'), with msg_id UNIQUE
4 COMMIT the pending row
5 Call Stripe with Idempotency-Key=msg-7a3f
6 UPDATE payment_intents SET status = 'completed', charge_id = 'ch_abc123' WHERE msg_id = 'msg-7a3f'
7 Ack the broker — message removed from queue

Recall before you leave

01
What is the crash window that makes SELECT-then-act unsafe for dedup?
02
If the consumer crashes after the DB COMMIT but before acking the broker, what happens on redelivery?
03
What is the Stripe Idempotency-Key TTL and what happens after it expires?

Recap

Consumer-side dedup is the cheapest path to effectively exactly-once processing: maintain a processed-messages table with a UNIQUE constraint on message ID, BEGIN a transaction, INSERT the dedup row first, perform the side effect, COMMIT. A UNIQUE violation on redelivery rolls back the entire transaction, so the side effect never re-runs. For external APIs that live outside the DB transaction, derive an Idempotency-Key from the message ID and pass it with every call. Stripe caches the first response for at least 24 hours, and PayPal and Square offer equivalent mechanisms under their own names and retention rules, which makes retries safely idempotent across the broker boundary. Now when you review a consumer that touches Stripe or any payment API, your first question should be: is the Idempotency-Key derived from the message ID, and is it passed on every call path including retries?



