awesome-everything RU
↑ Back to the climb

Backend Architecture

Concurrency and cache architecture for idempotency at scale

Crux Naive SELECT-then-INSERT is not atomic. Postgres ON CONFLICT or advisory locks prevent race conditions. At high throughput, a Redis Cluster hot-path with Postgres cold-path fallback is the production pattern.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at senior altitude — in orbit
◷ 16 min

Two requests with the same idempotency key arrive at the server simultaneously — a real scenario when a mobile client fires two retries before the first acknowledgment. A naive SELECT-then-INSERT lets both through. Both charge the customer.

The race condition in naive implementations

A common first implementation:

-- Not atomic!
SELECT * FROM idempotency_keys WHERE key = $1;
-- Time passes here. A second request slips in.
INSERT INTO idempotency_keys (key, fingerprint, status, expires_at)
  VALUES ($1, $2, 'in_progress', NOW() + interval '24 hours');

Between the SELECT and the INSERT, a second concurrent request can execute the same SELECT (seeing no row), then also INSERT. Both succeed. Both start processing. Both charge the customer.

Fix 1: Postgres INSERT … ON CONFLICT DO NOTHING RETURNING

INSERT INTO idempotency_keys (key, fingerprint, status, expires_at)
  VALUES ($1, $2, 'in_progress', NOW() + interval '24 hours')
  ON CONFLICT (key) DO NOTHING
  RETURNING *;

If the RETURNING set is empty, this request lost the race. It then re-reads the row:

  • If status=in_progress → return 409 Conflict.
  • If status=completed → return the cached response.

The INSERT is atomic at the database level. No race.

Fix 2: Postgres advisory locks

SELECT pg_advisory_xact_lock(hashtext($1));
-- Now safe to SELECT-then-INSERT

pg_advisory_xact_lock holds a session-level exclusive lock for the transaction duration. Only one connection can hold the lock for a given hash at a time.

Tradeoff: hashtext is 64-bit — collisions are possible at ~10⁻¹⁰ probability per 10M keys per day. Acceptable for most workloads; security-critical paths use a full unique index instead. The lock must be held for the entire request including any external API calls (tens to hundreds of milliseconds) — watch pg_locks for pool exhaustion during incidents.

ApproachAtomic?Collision riskNotes
SELECT then INSERTNoN/AUnsafe — race window
INSERT ON CONFLICT DO NOTHING RETURNINGYesNoneRecommended for most cases
pg_advisory_xact_lockYes10⁻¹⁰ (hash collision)Watch pool under slow external calls

Scaling the cache: Redis Cluster

A single Postgres primary bottlenecks at roughly 5–10k writes/sec on commodity hardware. At 50k req/sec, the idempotency write becomes the bottleneck.

Redis Cluster with SETNX:

SETNX idempotency:{key} {fingerprint}:{status} EX {ttl_seconds}

SETNX is atomic set-if-not-exists. Redis Cluster shards by key hash, spreading load across nodes. Each node handles ~50–100k SETNX/sec.

Durability risk: Redis defaults to async fsync. A crash within milliseconds of a write can lose the entry. For a payment API, this means the key is gone and the next retry is treated as new — potential double charge.

Two-tier cache: the production pattern

Payment APIs that hold legal liability for double charges use a hybrid:

  1. Hot path: RedisSETNX on the key. If it succeeds (new key), process and write to Postgres asynchronously via the outbox. If it conflicts (existing key), read Redis for the cached response.
  2. Cold path: Postgres — authoritative record. If Redis misses (rare, on crash), fall through to the Postgres table to recover the authoritative response.
Request → Redis SETNX
          ├─ New: process + store in Postgres async + cache in Redis
          ├─ Conflict: return cached response from Redis
          └─ Redis miss: read from Postgres → populate Redis → return

Stripe and Square use this hybrid in production. Pure Redis is acceptable for non-financial workloads (signups, analytics events) where a rare duplicate is logging noise rather than a compliance event.

Why this works

Why not DynamoDB for the idempotency cache? DynamoDB ConditionExpression handles atomic insert-or-noop natively and scales horizontally without sharding effort. At 50k/sec the cost is roughly $3k/month at standard pricing, and p99 latency is ~5–10ms. Good if you are already on AWS and want zero operational overhead; expensive and adds latency otherwise. The Redis + Postgres hybrid is cheaper and faster at comparable scale.

Quiz

Two requests with the same idempotency key arrive simultaneously. The naive SELECT-then-INSERT implementation runs. What is the worst-case outcome?

Quiz

At 50k requests/sec, why does Redis Cluster outperform a single Postgres primary for idempotency key lookups?

Quiz

A Redis Cluster node crashes milliseconds after a SETNX write. The key is lost. What is the consequence for a payment API using Redis as the sole idempotency store?

Recall before you leave
  1. 01
    Why is SELECT-then-INSERT not safe for idempotency key creation, and what is the correct Postgres fix?
  2. 02
    Describe the two-tier idempotency cache architecture used by production payment APIs.
  3. 03
    What is the tradeoff of using pg_advisory_xact_lock instead of ON CONFLICT for idempotency key serialization?
Recap

Concurrent idempotency key creation requires an atomic operation. Postgres INSERT ... ON CONFLICT (key) DO NOTHING RETURNING * is the standard fix — it is a single statement with no race window. A request that finds RETURNING empty lost the race and re-reads the row to return 409 or replay. At throughput above ~10k writes/sec, a Redis Cluster hot-path with SETNX handles the load; a Postgres cold-path fallback provides durability for financial workloads. In-memory-only dedup is never acceptable behind a load balancer.

Connected lessons
appears again in179
Continue the climb ↑Observability, production failures, and global-scale design
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources3
expand
  1. 01
  2. 02
  3. 03

Trademarks belong to their respective owners. Editorial reference only.