Rate limiting: build a distributed limiter that actually holds

Hands-on project — build a distributed, atomic token-bucket limiter in Redis, prove it holds the real limit across multiple nodes, and ship a correct 429 contract.

Level: Senior · Time: 240 min

Reading about the per-node-counter lie is not the same as watching your limit silently triple in production. Build a limiter, run it behind a load balancer across several nodes, then attack it the way a real client does — across the window boundary and across every node at once — and prove the limit holds.

Goal

Turn the unit’s mental model into a working artifact: a distributed token-bucket limiter whose counter lives in Redis, whose check is atomic, and whose 429 response is a correct contract — verified with a load test that defeats a naive per-node limiter.
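
As a starting point, the token-bucket arithmetic can be modelled as a pure function before any Redis is involved. This is a minimal sketch, not the lesson's reference implementation: the names `BucketState` and `take` are hypothetical, and the clock is passed in explicitly so the behaviour can be tested deterministically.

```python
from dataclasses import dataclass


@dataclass
class BucketState:
    tokens: float      # tokens currently available
    updated_at: float  # time of the last refill, in seconds


def take(state: BucketState, now: float, capacity: float,
         refill_per_sec: float, cost: float = 1.0) -> tuple[bool, BucketState]:
    """Refill for the elapsed time, then try to spend `cost` tokens."""
    elapsed = max(0.0, now - state.updated_at)
    # Continuous refill: tokens drip back in proportion to elapsed time,
    # capped at the bucket's capacity.
    tokens = min(capacity, state.tokens + elapsed * refill_per_sec)
    if tokens >= cost:
        return True, BucketState(tokens - cost, now)
    return False, BucketState(tokens, now)


if __name__ == "__main__":
    # Capacity 5, refill 1 token/s: a burst of 7 requests at t=0.
    state = BucketState(tokens=5.0, updated_at=0.0)
    results = []
    for _ in range(7):
        ok, state = take(state, now=0.0, capacity=5.0, refill_per_sec=1.0)
        results.append(ok)
    print(results)
    # Two seconds later, some tokens have dripped back in.
    ok, state = take(state, now=2.0, capacity=5.0, refill_per_sec=1.0)
    print(ok, state.tokens)
```

Keeping the decision a pure function of (state, now) makes the later move into Redis mostly a question of where the state lives and how the read-decide-update is made atomic.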

Project

Objective

Build a rate-limiting middleware that enforces a real per-API-key limit across at least 3 app nodes behind a load balancer, using an atomic Redis token bucket, and prove with a load test that the global limit holds and the 429 contract is correct.
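
One possible shape for the atomic check is sketched below: a Lua script that performs the whole read-decide-update inside Redis, plus a thin Python wrapper. Treat it as an illustration under stated assumptions rather than the required solution. The key prefix `ratelimit:`, the hash fields `tokens` and `ts`, and the names `TOKEN_BUCKET_LUA`, `Decision`, and `check` are hypothetical. The script assumes Redis 5 or later (so that calling `TIME` before a write is permitted), and `client` is assumed to expose redis-py's `eval(script, numkeys, *keys_and_args)`.

```python
from typing import Any, NamedTuple

# Lua runs inside Redis as one atomic unit: no other command can interleave
# between the read (HMGET), the decision, and the write (HSET).
TOKEN_BUCKET_LUA = """
local capacity = tonumber(ARGV[1])
local rate     = tonumber(ARGV[2])   -- tokens per second
local cost     = tonumber(ARGV[3])

-- Use the Redis server clock so app nodes with skewed clocks agree.
local t   = redis.call('TIME')
local now = tonumber(t[1]) + tonumber(t[2]) / 1000000

local state  = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local tokens = tonumber(state[1])
local ts     = tonumber(state[2])
if tokens == nil or ts == nil then
  tokens = capacity   -- first request for this key: start with a full bucket
  ts = now
end

tokens = math.min(capacity, tokens + math.max(0, now - ts) * rate)

local allowed = 0
local retry_after = 0
if tokens >= cost then
  tokens = tokens - cost
  allowed = 1
else
  retry_after = math.ceil((cost - tokens) / rate)
end

redis.call('HSET', KEYS[1], 'tokens', tokens, 'ts', now)
-- Idle buckets expire once they would have refilled completely anyway.
redis.call('EXPIRE', KEYS[1], math.ceil(capacity / rate) + 1)
return {allowed, math.floor(tokens), retry_after}
"""


class Decision(NamedTuple):
    allowed: bool
    remaining: int
    retry_after: int  # delta-seconds; 0 when the request is allowed


def check(client: Any, api_key: str, capacity: int, rate: float,
          cost: int = 1) -> Decision:
    """One round trip: the whole read-decide-update happens inside Redis."""
    allowed, remaining, retry_after = client.eval(
        TOKEN_BUCKET_LUA, 1, f"ratelimit:{api_key}", capacity, rate, cost)
    return Decision(bool(allowed), int(remaining), int(retry_after))
```

With redis-py installed, a call might look like `check(redis.Redis(), "key-123", capacity=100, rate=100 / 60)`. Sending the full script on every request is the simplest option; redis-py's `register_script` is a common way to avoid that.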

Requirements

Acceptance criteria

A before/after result: under the cross-node load test, the naive per-node limiter admits roughly N times the configured limit (N being the number of nodes), while the Redis token bucket admits within a small tolerance of the configured limit under the identical test.
A captured 429 response showing the status, Retry-After, and the three RateLimit-* headers (RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset), with RateLimit-Reset expressed in delta-seconds.
Evidence the check is atomic: a concurrency test (many parallel requests for one key) shows no over-admission beyond capacity, and the Lua script is shown handling the whole read-decide-update.
A short write-up: which algorithm and per-key dimension you chose and why, the measured global limit across nodes, and how jitter plus continuous refill defend against the thundering herd.
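
The before/after criterion above can be previewed in-process before running the real load test. The sketch below is a toy simulation, not a substitute for the measurement: the names `Bucket` and `admitted` are hypothetical, the load balancer is assumed to be round-robin, and the burst is treated as instantaneous so refill is ignored.

```python
from itertools import cycle


class Bucket:
    """Token bucket during an instantaneous burst, so refill is ignored."""

    def __init__(self, capacity: int) -> None:
        self.tokens = capacity

    def try_take(self) -> bool:
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False


def admitted(buckets: list[Bucket], requests: int) -> int:
    """Round-robin `requests` over `buckets`, the way a load balancer spreads
    them over nodes, and count how many are let through."""
    nodes = cycle(buckets)
    return sum(next(nodes).try_take() for _ in range(requests))


if __name__ == "__main__":
    LIMIT, NODES, BURST = 100, 3, 1000

    # Naive: every node keeps its own counter.
    naive = admitted([Bucket(LIMIT) for _ in range(NODES)], BURST)

    # Shared: every node consults the same counter (the role Redis plays).
    shared_bucket = Bucket(LIMIT)
    shared = admitted([shared_bucket] * NODES, BURST)

    print(f"naive per-node: {naive}  shared: {shared}")
    # prints "naive per-node: 300  shared: 100"
```

The simulation only shows why the per-node version multiplies the limit by the node count. The acceptance evidence still has to come from the real load test against real nodes.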

Senior stretch

Add a sliding-window-log variant for one sensitive endpoint (e.g. login) and compare its memory footprint per key against the token bucket under the same load.
Make the limiter's behavior when Redis is unreachable an explicit choice (fail open vs. fail closed), measure the latency cost of the Redis round trip on the request path, and add a local in-process fallback with a documented tradeoff.
Implement tiered per-key quotas (free vs paid) read from a config or a header, and show a paid key gets a larger burst and rate under the same load test.
Add an allocation-style abuse guard: a per-key daily quota on top of the per-minute rate, and show a client that exhausts the daily quota is held even while under the per-minute rate.
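
For the sliding-window-log stretch item, an in-process sketch of the algorithm may help before porting it to Redis. The class name `SlidingWindowLog` is hypothetical, and the clock is passed in explicitly for testability. A Redis version would typically keep the timestamps in a sorted set, which is an assumption about the design and not something this lesson prescribes.

```python
from collections import deque


class SlidingWindowLog:
    """Admit at most `limit` requests in any trailing `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self.log: deque[float] = deque()  # timestamps of admitted requests

    def allow(self, now: float) -> bool:
        # Drop timestamps that have slid out of the trailing window.
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False


if __name__ == "__main__":
    login = SlidingWindowLog(limit=3, window=10.0)
    print([login.allow(t) for t in (0.0, 1.0, 2.0, 3.0)])
    # The request at t=0 is exactly 10 s old at t=10, so it no longer counts.
    print(login.allow(10.0))
    # Memory per key grows with the limit: one timestamp per admitted request,
    # versus two numbers (tokens, last refill time) for a token bucket.
    print(len(login.log))
```

The per-key footprint is the point of the comparison: the log stores up to `limit` timestamps, while the token bucket stores a fixed pair of values regardless of the limit.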

Recap

This is the limiter you will actually ship. The counter lives in Redis so all nodes share it. The check is one atomic Lua script so concurrent requests cannot double-spend. The algorithm (token bucket) matches bursty real traffic. The rejection is a contract: 429 with Retry-After in delta-seconds, the RateLimit-* headers, and jitter so the reset edge does not stampede. Building it once, and watching the naive version fail the cross-node load test, turns the per-node-counter lie into something you will never ship by accident.
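
The rejection contract described above could be assembled roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the jitter is applied server-side to Retry-After (the lesson does not say where jitter should live, and client-side jitter is equally plausible), and `max_jitter` is an illustrative default, not a recommended value.

```python
import math
import random
from typing import Optional


def rate_limit_headers(limit: int, remaining: int, reset_after: float) -> dict[str, str]:
    """RateLimit-* headers; RateLimit-Reset is delta-seconds, not a timestamp."""
    return {
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": str(max(0, remaining)),
        "RateLimit-Reset": str(math.ceil(reset_after)),
    }


def too_many_requests(limit: int, retry_after: float, max_jitter: float = 2.0,
                      rng: Optional[random.Random] = None) -> tuple[int, dict[str, str]]:
    """Build the status code and headers of a 429 rejection."""
    rng = rng or random.Random()
    # Clients rejected at the same moment are told slightly different waits,
    # so they do not all retry on the same reset edge.
    jittered = retry_after + rng.uniform(0.0, max_jitter)
    headers = rate_limit_headers(limit, remaining=0, reset_after=retry_after)
    # Rounding up means a client is never told to retry before a token exists.
    headers["Retry-After"] = str(math.ceil(jittered))
    return 429, headers


if __name__ == "__main__":
    status, headers = too_many_requests(limit=100, retry_after=3.2)
    print(status)
    for name, value in headers.items():
        print(f"{name}: {value}")
```

Jitter is only ever added, never subtracted, so the advertised wait stays at or above the time the bucket actually needs to refill.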
