
Status codes that actually matter in production

A status code is a contract three machines read before any human does: caches, dashboards, and retry logic. Getting the class wrong — or hiding errors in a 200 body — corrupts all three at once.


A payments gateway started failing upstream during a Friday deploy. Every dashboard stayed green: success rate 100%, error rate 0%, no alerts. The on-call engineer found out from customer tweets, not from monitoring. The cause was one line — the gateway caught upstream failures and returned 200 OK with { "error": "upstream timeout" } in the body. The metrics counted HTTP status, so a total outage looked like perfect health for forty minutes.

The classes are a routing decision, not trivia

Memorising that 404 means “not found” is junior trivia. The senior view is that the first digit is a routing instruction for machines that never read your response body: a cache, a load balancer, an APM agent, a client retry loop. They branch on the class, and they branch immediately.

When you pick the wrong class, you are not just wrong in a documentation sense — you are actively misdirecting every automated system downstream.

2xx — it worked; the response is cacheable per its headers, count it as success.
3xx — go somewhere else; follow the Location or use your cache.
4xx — you, the client, sent something wrong; do not retry the same request, it will fail identically.
5xx — I, the server, failed; the request may have been valid, so a retry might work.

That 4xx/5xx split is the load-bearing one. It is the difference between “stop, you broke it” and “wait, I broke it, try again.” Get the class wrong and every machine downstream makes the wrong call — the cache stores a failure, the retry loop hammers a request that can never succeed, the dashboard miscounts the outage.
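
To see why the class matters to a machine, here is a minimal sketch of how a status-based dashboard might compute an error rate. The function names `status_class` and `error_rate` are made up for illustration, and real APM agents are more elaborate, but the principle is the same: only the status is counted, never the body.

```python
from collections import Counter


def status_class(status: int) -> str:
    """Return the class label ("2xx", "4xx", ...) for an HTTP status code."""
    if not 100 <= status <= 599:
        raise ValueError(f"not an HTTP status code: {status}")
    return f"{status // 100}xx"


def error_rate(statuses: list[int]) -> float:
    """Fraction of responses a status-only dashboard would count as server errors."""
    counts = Counter(status_class(s) for s in statuses)
    total = sum(counts.values())
    return counts["5xx"] / total if total else 0.0


# Ten failed upstream calls, reported two different ways.
hidden = [200] * 10               # errors tunnelled through a 200 body
honest = [200] * 4 + [504] * 6    # failures reported with their real status

print(error_rate(hidden))  # the outage is invisible
print(error_rate(honest))  # the outage shows up
```

The body of the hidden responses could say anything; this code never looks at it, and neither does a typical dashboard.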

2xx is not one thing: 200 vs 201 vs 202 vs 204

The success class carries real information a senior uses to shape the API contract.

200 OK — done, here is the result. Synchronous, complete.
201 Created — a resource now exists; return a Location header pointing at it. This is the correct answer to a successful POST /orders, not a bare 200.
202 Accepted — “I took your request, but it is not done yet.” The work is queued or async. The body should hand back a way to poll status. Returning 200 here is a lie: the client thinks the order shipped when it is still sitting in a queue.
204 No Content — success, and there is deliberately nothing to send back (a DELETE, or a PUT whose result the client does not need echoed). The client knows not to expect or parse a body.

The 202-vs-200 distinction is where async systems leak. If your endpoint enqueues work to Kafka and returns immediately, 202 tells the client “poll me.” 200 tells the client “we are finished” — and now its UI shows a confirmed order that exists nowhere yet.
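
A framework-free sketch of that choice, assuming a hypothetical `POST /orders` handler that returns a `(status, headers, body)` tuple. The route, the `status_url` field, and the function names are illustrative assumptions, not any particular framework's API.

```python
import uuid


def create_order(payload: dict, run_async: bool) -> tuple[int, dict, dict]:
    """Return (status, headers, body) for a hypothetical POST /orders."""
    order_id = str(uuid.uuid4())
    if run_async:
        # The work is only queued: say 202 and hand back a way to poll.
        body = {"status": "queued", "status_url": f"/orders/{order_id}/status"}
        return 202, {}, body
    # The work is finished and a resource exists: 201 plus its Location.
    return 201, {"Location": f"/orders/{order_id}"}, {"id": order_id, **payload}


def delete_order(order_id: str) -> tuple[int, dict, None]:
    """A successful delete with nothing to echo back: 204 and no body."""
    return 204, {}, None
```

The point of the sketch is that the status is chosen by what has actually happened on the server at response time, not by whether the handler returned without raising.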

The 4xx vocabulary: 400 vs 422 vs 409 vs 412, and 401 vs 403

This is where most APIs are sloppy, and where a precise client can give a useful error message instead of a generic “something went wrong.”

Code	Means	What the client should do
`400` Bad Request	Malformed — bad JSON, missing required field, unparseable	Fix the request structure; never retry as-is
`422` Unprocessable Content	Parsed fine, but values fail business rules (bad email, qty < 0)	Fix the values; show field-level errors
`409` Conflict	Clashes with current resource state (duplicate, version race)	Re-read state, resolve, resubmit
`412` Precondition Failed	`If-Match`/`If-Unmodified-Since` evaluated false	Refetch the entity (the `ETag` moved); retry with new precondition
`401` Unauthorized	Not authenticated — no/expired/invalid credentials	Log in or refresh the token, then retry
`403` Forbidden	Authenticated, but not allowed to do this	Stop — no credential change fixes it
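
The 409 and 412 rows can be sketched with an in-memory store. This is a simplified illustration: the ETag here is a bare version counter stored as a string, `If-Match: *` is not handled, and `put_document` and `create_user` are hypothetical names.

```python
from typing import Optional


def put_document(store: dict, doc_id: str, body: dict, if_match: Optional[str]) -> int:
    """Return the status for a conditional PUT against an in-memory store."""
    current = store.get(doc_id)
    if current is None:
        return 404
    if if_match is not None and if_match != current["etag"]:
        # The precondition evaluated false: the client's copy is stale.
        return 412
    store[doc_id] = {"etag": str(int(current["etag"]) + 1), "body": body}
    return 204


def create_user(emails: set, email: str) -> int:
    """Return 409 when the new resource clashes with existing state."""
    if email in emails:
        return 409
    emails.add(email)
    return 201
```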

RFC 9110 draws 400 as syntax and 422 as semantics: 400 is “I could not parse this,” 422 is “I parsed it and the values are wrong.” The boundary is fuzzy and many APIs collapse both into 400, but the distinction is worth keeping — it tells the client whether to fix the shape or the data, which is the difference between a developer error and a user error.
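
One way to keep the two apart is to validate in two passes: shape first, values second. The sketch below assumes a hypothetical order payload with `qty` and `email` fields; the field names and error format are illustrative only.

```python
import json


def validate_order(raw: str) -> tuple[int, dict]:
    """Return (status, body) for a hypothetical order payload."""
    # Pass 1: shape. Anything that fails here is a 400.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(data, dict) or "qty" not in data or "email" not in data:
        return 400, {"error": "expected an object with 'qty' and 'email'"}

    # Pass 2: values. The request parsed, so failures here are a 422
    # with field-level detail the client can show to a user.
    field_errors = {}
    if type(data["qty"]) is not int or data["qty"] <= 0:
        field_errors["qty"] = "must be a positive integer"
    if not isinstance(data["email"], str) or "@" not in data["email"]:
        field_errors["email"] = "must be an email address"
    if field_errors:
        return 422, {"errors": field_errors}
    return 200, data
```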

The 401 vs 403 pair is constantly swapped. 401 means “I do not know who you are” — the fix is authentication. 403 means “I know exactly who you are, and you still cannot do this” — no token refresh helps. Returning 401 for an authorization failure sends clients into a pointless re-login loop.

Why this works

There is a security wrinkle on 404 vs 403. A strict 403 Forbidden on a record the user is not allowed to see confirms the record exists, which is an information leak. For sensitive resources (another user’s invoice by id), seniors often return 404 Not Found instead, so an attacker enumerating ids cannot tell “no such record” from “exists but not yours.” You give up the more precise status in exchange for not leaking existence, and RFC 9110 explicitly allows a server to answer 404 for this reason.
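
The 401/403/404 decision fits in a few lines. This is a sketch under simple assumptions: ownership is the only permission rule, and `authorize` and `hide_existence` are hypothetical names.

```python
from typing import Optional


def authorize(user_id: Optional[str], owner_id: Optional[str],
              hide_existence: bool = False) -> int:
    """Status for reading a record; owner_id is None when no such record exists."""
    if user_id is None:
        return 401  # we do not know who you are
    if owner_id is None:
        return 404  # there is genuinely no such record
    if user_id != owner_id:
        # We know who you are and you may not see this. Optionally
        # answer 404 so the response does not confirm the record exists.
        return 404 if hide_existence else 403
    return 200
```

Note the order of the checks: authentication is settled before anything about the record is revealed.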

5xx: whose fault, and what the client does next

Ask yourself: if a monitoring alert fires on this endpoint, which team should own it — the API consumer or the API operator? The answer is embedded in the status code. The 5xx class is where retry strategy lives, and the sub-codes attribute blame.

500 Internal Server Error — your application threw and did not handle it. The bug is in your code.
502 Bad Gateway — a proxy/load balancer got a malformed or no response from the upstream it called. The gateway is fine; something behind it is broken.
503 Service Unavailable — the server is intentionally not serving: overloaded, draining, in maintenance. This is the one 5xx that should carry a Retry-After header.
504 Gateway Timeout — the gateway waited for the upstream and gave up. The request may have completed on the backend even though the client saw a timeout — which is exactly why blindly retrying a non-idempotent call here is dangerous.
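
Applied to the gateway from the opening story, the fix is to translate upstream failures into the matching 5xx instead of a 200. A minimal sketch, assuming the upstream call signals failure with Python's built-in `TimeoutError` and `ConnectionError`; the function name, the `draining` flag, and the 30-second `Retry-After` value are illustrative assumptions.

```python
def gateway_response(call_upstream, draining: bool = False) -> tuple[int, dict, dict]:
    """Return (status, headers, body), never hiding a failure behind a 200."""
    if draining:
        # Intentionally not serving: tell clients when to come back.
        return 503, {"Retry-After": "30"}, {"error": "service draining"}
    try:
        body = call_upstream()
    except TimeoutError:
        return 504, {}, {"error": "upstream timeout"}
    except ConnectionError:
        return 502, {}, {"error": "bad response from upstream"}
    except Exception:
        # Anything else is an unhandled bug on this side.
        return 500, {}, {"error": "internal error"}
    return 200, {}, body
```

The error body is the same one the broken gateway sent. The only change is the status line, and that is the part the dashboards read.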

The senior reflex: 5xx is retryable (the request may have been valid and the failure transient), 4xx is not (the request is wrong and will fail identically) — with 429 as the explicit retryable exception in the 4xx range.

429 and Retry-After: backoff the server controls

429 Too Many Requests says you hit a rate limit. The correct response carries a Retry-After header — either a number of seconds (Retry-After: 30) or an HTTP date. The server knows its own rate-limit window; honour it. Only fall back to client-side exponential backoff with jitter (1s, 2s, 4s, 8s…) when no Retry-After is present. Google Cloud Storage, OpenAI, and Stripe all document this exact ordering: read Retry-After first, then back off with jitter.
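
A sketch of that ordering on the client side. It handles both Retry-After forms and falls back to exponential backoff with full jitter (a random delay between zero and the exponential ceiling), which is one common jitter variant and an assumption here. The `base` and `cap` defaults are illustrative.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay(retry_after: Optional[str], attempt: int,
                now: Optional[datetime] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next attempt (attempt counts from 0)."""
    if retry_after is not None:
        value = retry_after.strip()
        if value.isascii() and value.isdigit():
            return float(value)  # delay-seconds form, e.g. "30"
        try:
            when = parsedate_to_datetime(value)  # HTTP-date form
        except (TypeError, ValueError):
            when = None  # unparseable header: fall through to backoff
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            now = now or datetime.now(timezone.utc)
            return max(0.0, (when - now).total_seconds())
    # No usable Retry-After: exponential ceiling of 1s, 2s, 4s, 8s... with jitter.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

A caller would pass `response.headers.get("Retry-After")` from whichever HTTP client it uses and sleep for the returned number of seconds.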

The retry rule that ties it together, and the one that causes real incidents when broken: retry 5xx and 429; never resend any other 4xx unchanged, and never blindly retry a non-idempotent request after a timeout. GET, PUT, and DELETE are idempotent, so retrying them is safe. POST is not. If a POST /charge times out (504) and the client retries, the charge may run twice. The fix is an idempotency key: payment APIs such as Stripe and Square have clients send a unique key per logical operation, so the server deduplicates retries instead of double-charging.
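
On the server side, deduplication by idempotency key can be sketched as a lookup before the side effect. This is a toy, in-memory illustration and not how any particular provider implements it: a real service would persist keys atomically, expire them, and handle concurrent requests. The class and field names are hypothetical.

```python
class ChargeService:
    """Toy charge endpoint that deduplicates retries by idempotency key."""

    def __init__(self) -> None:
        self._results: dict = {}
        self.charges_made = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A key we have seen before means this is a retry:
        # replay the stored result instead of charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"charge_id": f"ch_{self.charges_made}", "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


service = ChargeService()
first = service.charge("order-1234-attempt", 500)
retry = service.charge("order-1234-attempt", 500)  # e.g. after a 504
```

The client generates the key once per logical operation and reuses it on every retry of that operation, so a retry after a timeout returns the original charge rather than creating a second one.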

Retryability is two questions, not one: is the request idempotent, and is the failure transient? Only the idempotent + 5xx/429 cell is a blind retry; a POST on a 5xx/429 needs an idempotency key, never a naked retry.
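
Those two questions can be written down as a small decision function. A sketch only: the return labels are made up for illustration, and the set of idempotent methods follows the HTTP definition (GET, HEAD, PUT, DELETE, OPTIONS).

```python
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}


def retry_decision(method: str, status: int, has_idempotency_key: bool = False) -> str:
    """Decide what a client should do with a failed response."""
    # Question 1: is the failure transient?
    transient = status == 429 or 500 <= status <= 599
    if not transient:
        return "do-not-retry"  # the same request will fail identically
    # Question 2: is repeating the request safe?
    if method.upper() in IDEMPOTENT_METHODS or has_idempotency_key:
        return "retry-with-backoff"
    return "reconcile"  # e.g. look up whether the charge already exists
```

For example, `retry_decision("POST", 504)` returns `"reconcile"`, while the same call with `has_idempotency_key=True` returns `"retry-with-backoff"`.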

Pick the best fit

A client POSTs to /charge, the request times out as 504, and no idempotency key was sent. The charge may or may not have gone through. What does a correct client do?

Quiz

An endpoint enqueues work to Kafka and returns before the work is done. What should it return?

Quiz

A user is logged in but tries to delete another tenant's record. Which status is correct?

Order the steps

A client gets a failed response. Order the decisions that determine whether and how to retry:

1 Read the status class: is it 4xx (client error) or 5xx/429 (retryable)?
2 If 4xx and not 429: do not retry — the request is wrong, it will fail identically
3 If 5xx or 429: check whether the request is idempotent or carries an idempotency key
4 If idempotent/keyed: honour Retry-After if present, else exponential backoff with jitter
5 If non-idempotent and unkeyed (e.g. a POST after a 504): do not blindly retry — reconcile instead

Caches, dashboards, and retry loops branch on the class before any human reads the body — so 4xx means stop, 5xx/429 mean maybe retry.

Recall before you leave

01 Explain why returning 200 OK with an error object in the body is dangerous, and what it breaks downstream.
02 Walk through the retry decision for a failed request, including the idempotency trap.

Recap

A status code is a contract three machines read before any human does (caches, monitoring, and retry logic), and the first digit is the instruction they branch on. 2xx is not monolithic: 201 with a Location for creation, 202 for accepted-but-async work, 204 for deliberately empty success. The 4xx vocabulary carries real meaning a precise client uses: 400 for malformed syntax, 422 for values that fail business rules, 409 for state conflicts, 412 for failed preconditions, 401 for “who are you” versus 403 for “not allowed,” with 404 sometimes substituted for 403 to avoid leaking a resource’s existence. The 5xx codes attribute fault: 500 your bug, 502 a bad upstream, 503 intentional unavailability that should carry Retry-After, 504 a timeout where the work may secretly have completed. Retry 5xx and 429; never resend any other 4xx unchanged, never blindly retry a non-idempotent request after a timeout, and reach for an idempotency key so retries cannot double-charge. The cardinal sin is tunnelling an error through a 200 body: it blinds your dashboards, poisons your caches, and silences the retries that would have saved the request. Now when you see a dashboard showing 100% success during a fire, the first thing you check is whether someone is hiding errors in a 200 body.


Apply this

Put this lesson to work on a real build.

Mini OAuth 2.0 + PKCE login: Implement the authorization-code + PKCE flow end to end against a real provider, so you understand every redirect and token instead of trusting a library.
Distributed rate limiter: Build a token-bucket limiter that holds across many app instances by keeping the counter in Redis, not in process memory.