GraphQL N+1: batch and harden an API

Hands-on project — build a GraphQL API that exhibits N+1, prove it with resolver/SQL counts, batch it with DataLoader, harden it against query-shape attacks, and verify with before/after numbers.

Track: API · Level: Senior · Time: 240 min

Reading about N+1 is not the same as watching a database log fire 51 times for one HTTP request (1 query for the post list, then 1 more per post for each of 50 authors) and then driving it back to 2. Build a small GraphQL API on deliberately nested data, measure the storm, batch it with DataLoader, then harden it against the query shapes DataLoader cannot touch, with evidence at every step.
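To make the 51-versus-2 arithmetic concrete, here is a minimal sketch that counts round trips against an in-memory stand-in for the database. Everything in it (`CountingDb`, `naiveQueryCount`, `batchedQueryCount`, the one-author-per-post data) is a hypothetical illustration, and the batched path is written by hand rather than through the DataLoader library, so it shows only the query shape you are aiming for.

```typescript
type Author = { id: number; name: string };
type Post = { id: number; authorId: number };

// Hypothetical in-memory stand-in for a SQL database that counts round trips.
class CountingDb {
  queries = 0;
  private authors: Author[] = [];
  private posts: Post[] = [];

  constructor(postCount: number) {
    // One distinct author per post, so deduplication cannot hide any query.
    for (let i = 1; i <= postCount; i++) {
      this.authors.push({ id: i, name: `author-${i}` });
      this.posts.push({ id: i, authorId: i });
    }
  }

  // Stands in for: SELECT * FROM posts LIMIT ?
  selectPosts(limit: number): Post[] {
    this.queries++;
    return this.posts.slice(0, limit);
  }

  // Stands in for: SELECT * FROM authors WHERE id = ?
  selectAuthorById(id: number): Author | undefined {
    this.queries++;
    return this.authors.filter((a) => a.id === id)[0];
  }

  // Stands in for: SELECT * FROM authors WHERE id IN (...)
  selectAuthorsByIds(ids: number[]): Author[] {
    this.queries++;
    return this.authors.filter((a) => ids.indexOf(a.id) !== -1);
  }
}

// Naive shape: the author lookup runs once per post.
function naiveQueryCount(postCount: number): number {
  const db = new CountingDb(postCount);
  const posts = db.selectPosts(postCount);
  for (const post of posts) db.selectAuthorById(post.authorId);
  return db.queries;
}

// Batched shape: collect every key first, then issue one IN (...) query.
function batchedQueryCount(postCount: number): number {
  const db = new CountingDb(postCount);
  const posts = db.selectPosts(postCount);
  db.selectAuthorsByIds(posts.map((p) => p.authorId));
  return db.queries;
}

const naive = naiveQueryCount(50);     // 51: 1 for posts + 50 for authors
const batched = batchedQueryCount(50); // 2: 1 for posts + 1 for authors
```

In the real project the counter would live in your SQL driver or query logger, so the numbers come from the harness and not from a model like this one.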

Goal

Turn the unit’s mental model into a reproducible loop: reproduce N+1 and prove it with SQL/resolver counts, fix it with a correct per-request DataLoader, defend the API against depth, complexity, and alias attacks, and verify the whole thing with before/after numbers.
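Why "per-request" is part of a correct fix can be seen with a toy memoizing loader. This is a sketch under stated assumptions: `TinyLoader`, `usersTable`, `fetchUser`, and `createLoaders` are hypothetical names, and `TinyLoader` mimics only the per-key cache that makes loader scope matter. It does no batching and is not the DataLoader library.

```typescript
type User = { id: number; name: string };

// Hypothetical minimal memoizing loader: caches each key for the life of the instance.
class TinyLoader {
  private cache = new Map<number, User>();
  private fetchRow: (id: number) => User;

  constructor(fetchRow: (id: number) => User) {
    this.fetchRow = fetchRow;
  }

  load(id: number): User {
    const cached = this.cache.get(id);
    if (cached !== undefined) return cached;
    const row = this.fetchRow(id);
    this.cache.set(id, row);
    return row;
  }
}

// Stand-in for a users table.
const usersTable = new Map<number, string>([[1, "Ada"]]);
const fetchUser = (id: number): User => ({ id, name: usersTable.get(id) ?? "unknown" });

// Wrong: one loader at module scope, shared by every request.
const sharedLoader = new TinyLoader(fetchUser);

// Right: a factory the context builder would call once per request.
const createLoaders = () => ({ user: new TinyLoader(fetchUser) });

// Request 1 reads the row, then the row changes in the database.
sharedLoader.load(1);
createLoaders().user.load(1);
usersTable.set(1, "Ada Lovelace");

// Request 2: the shared loader serves the stale row, a fresh loader does not.
const staleName = sharedLoader.load(1).name;         // "Ada"
const freshName = createLoaders().user.load(1).name; // "Ada Lovelace"
```

The same shared cache is what would let one tenant's row reach another tenant's request, which is the leak the acceptance test is meant to expose.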

Project

Objective

Build (or take) a GraphQL API over a relational schema with nested lists, reproduce its N+1 storm under load, collapse it with DataLoader, and add query-shape defences — proving each step with measured SQL counts, resolver counts, and latency.

Requirements

Acceptance criteria

A before/after table: total SQL queries, resolver call counts, and p99 latency for the same 50-post query under the same load — measured from your harness, not estimated.
DataLoaders are instantiated per request in the context factory (demonstrated), and a test proves a module-scope instance would leak across two simulated tenants or serve a stale row.
The batch-contract tests pass: the dedup test shows a single SQL round trip for an author referenced twice, and the out-of-order-rows test still maps every author to the right key.
Three crafted attack queries (a deep recursive query, a 5-level first:100 query, and a 1000-alias document) are each rejected at validation with the defence that caught them named.
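The 5-level first:100 query in the last criterion is worth working through, because it can slip under a depth limit while still being enormous. The sketch below uses a hypothetical simplified selection tree (`Selection`, `depthOf`, `worstCaseCost`, and `nestedLists` are illustrative names, not a GraphQL library API) and made-up limits. A real validation rule would walk the parsed query AST instead.

```typescript
// Hypothetical simplified selection tree: `first` is the requested page size
// (1 for a single-object field), `children` are the nested selections.
type Selection = { name: string; first: number; children: Selection[] };

// Depth of the deepest path, counting this node as level 1.
function depthOf(node: Selection): number {
  return 1 + Math.max(0, ...node.children.map((child) => depthOf(child)));
}

// Worst-case number of objects resolved: each of the `first` items at this
// level counts once, plus everything selected beneath it.
function worstCaseCost(node: Selection): number {
  const perItem = node.children.reduce((sum, child) => sum + worstCaseCost(child), 1);
  return node.first * perItem;
}

// Build a straight chain of `levels` list fields, each asking for `first` items.
function nestedLists(levels: number, first: number): Selection {
  let node: Selection = { name: `level${levels}`, first, children: [] };
  for (let level = levels - 1; level >= 1; level--) {
    node = { name: `level${level}`, first, children: [node] };
  }
  return node;
}

const attack = nestedLists(5, 100);
const attackDepth = depthOf(attack);      // 5
const attackCost = worstCaseCost(attack); // 10_101_010_100

// Hypothetical limits; real values come from measuring legitimate client queries.
const MAX_DEPTH = 8;
const MAX_COST = 10_000;
const rejectedBy =
  attackDepth > MAX_DEPTH ? "depth" : attackCost > MAX_COST ? "complexity" : null;
// rejectedBy is "complexity": depth 5 passes the depth limit, the cost does not.
```

The cost is 100 + 100² + 100³ + 100⁴ + 100⁵, which is why the defence has to multiply page sizes down the tree and not just add up field counts. Alias counting is a separate check and is not covered by this sketch.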

Senior stretch

Split the schema into two Apollo Federation subgraphs (posts and users), confirm the router batches the cross-subgraph refs into one _entities call, and show __resolveReference still needs its own DataLoader to avoid intra-subgraph N+1.
Implement resolver lookahead for one deep path (posts { author { profile } }) by reading the info AST and issuing a single JOIN; measure it against three stacked DataLoader trips and note the isolation tradeoff.
Add a multi-tenant column and prove tenant isolation belongs in the SQL filter inside the batch function, not just in per-request scope — write a test that fails with scope-only isolation and passes with the tenant_id filter.
Wire a CI gate that runs the 50-post query against a canary and fails the build if any type.field resolver call count grows beyond a baseline (catching an N+1 regression before it ships).
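The comparison at the heart of the CI gate can be sketched as a pure function over two count maps. `ResolverCounts`, `findRegressions`, and all the numbers below are hypothetical. Treating a field that is missing from the baseline as allowing zero calls is one possible policy, not the only one.

```typescript
// Resolver call counts keyed by "Type.field".
type ResolverCounts = Record<string, number>;

// Return one message per field whose call count exceeds its baseline.
function findRegressions(baseline: ResolverCounts, current: ResolverCounts): string[] {
  const regressions: string[] = [];
  Object.keys(current)
    .sort()
    .forEach((field) => {
      const allowed = baseline[field] ?? 0; // assumption: unknown fields allow zero calls
      const observed = current[field] ?? 0;
      if (observed > allowed) {
        regressions.push(`${field}: ${observed} calls, baseline ${allowed}`);
      }
    });
  return regressions;
}

// Made-up numbers for illustration only.
const baseline: ResolverCounts = { "Query.posts": 1, "Post.author": 50, "User.profile": 50 };
const canary: ResolverCounts = { "Query.posts": 1, "Post.author": 50, "User.profile": 2500 };

const regressions = findRegressions(baseline, canary);
// regressions → ["User.profile: 2500 calls, baseline 50"]
```

A build script would run the 50-post query against the canary, collect the counts from your resolver instrumentation, and exit non-zero when the returned list is not empty.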

Recap

This is the loop you run on every real GraphQL performance incident: build the evidence harness first (SQL and resolver counts), reproduce and measure the N+1, fix it with a per-request DataLoader written to the order-and-shape contract, prove correctness with dedup and out-of-order tests, then layer the query-shape defences DataLoader cannot replace — depth, multiplicative complexity, alias caps. Verify with before/after numbers under identical load. Doing it once on a toy API makes the production diagnosis muscle memory.
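The order-and-shape contract named above is small enough to sketch in isolation. A DataLoader batch function must return an array of the same length as the keys it was given, in the same order, and SQL gives no such ordering guarantee for an IN (...) query. The helper names here (`alignRowsToKeys`, `dedupeKeys`) are hypothetical. `dedupeKeys` only imitates the effect of the loader's own per-key cache, which the library handles for you.

```typescript
type Author = { id: number; name: string };

// Re-align rows to the order of `keys`. A key with no row gets an Error in its
// slot, so the result always has exactly keys.length entries.
function alignRowsToKeys(keys: readonly number[], rows: Author[]): (Author | Error)[] {
  const byId = new Map<number, Author>();
  rows.forEach((row) => byId.set(row.id, row));
  return keys.map((key) => byId.get(key) ?? new Error(`author ${key} not found`));
}

// Keep the first occurrence of each key, preserving order.
function dedupeKeys(keys: readonly number[]): number[] {
  return keys.filter((key, index) => keys.indexOf(key) === index);
}

// Keys requested in one order, rows returned in another, one row missing.
const requestedKeys = [3, 1, 2];
const rowsFromSql: Author[] = [
  { id: 1, name: "Ada" },
  { id: 3, name: "Grace" },
];
const aligned = alignRowsToKeys(requestedKeys, rowsFromSql);
// aligned → [Grace (id 3), Ada (id 1), Error for id 2]

const uniqueKeys = dedupeKeys([7, 7, 9]); // [7, 9]
```

An out-of-order-rows test feeds shuffled rows like `rowsFromSql` into the batch function and asserts that each slot still matches its key.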
