
Batch function contracts: ordering, shapes, errors

The batchLoadFn must return values in the same order as the input keys, handle missing rows and per-key errors explicitly, and support one-to-one, one-to-many, and count shapes.


Sven wires DataLoader and the page is fast — for most requests. Occasionally posts show the wrong author. The SQL query ran correctly and returned the right rows. The bug is in the batch function: it returned rows in the order Postgres chose, not the order DataLoader expected.

This lesson covers the three contract rules that DataLoader trusts you to uphold — break any one of them and the bugs show up as wrong data in production, not as errors.

The ordering contract

batchLoadFn(keys: Array<K>): Promise<Array<V | Error>> must return an array of the same length as keys, with values at matching positions. If keys = [7, 9, 3] then the return must be [row7, row9, row3] — in that order. When you write a batch function, ask yourself: am I returning database rows in input-key order, or in whatever order the DB chose?

Postgres, like most databases, does not guarantee row order for a WHERE id IN (...) query that has no ORDER BY. If the query returns [row9, row7, row3] and you return that array directly, DataLoader resolves:

key 7 → row9 (wrong)
key 9 → row7 (wrong)
key 3 → row3 (correct by coincidence)

This is a silent bug: no error is thrown, tests may pass, and production serves wrong authors on intermittent queries.

The fix: build a Map from the results, then walk the input keys.

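async function batchAuthors(ids) {
  // db.query is assumed to resolve to an array of row objects;
  // with node-postgres, read result.rows instead.
  const rows = await db.query(
    'SELECT id, name FROM users WHERE id = ANY($1)', [ids]
  );
  // Index rows by id; r.id must have the same type as the keys (number vs string)
  const map = new Map(rows.map(r => [r.id, r]));
  // Walk ids in input order; return null for missing rows
  return ids.map(id => map.get(id) ?? null);
}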
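To see the reorder step in isolation, here is a minimal sketch that runs without a database. fakeDb is a hypothetical stand-in (an assumption for illustration, not part of DataLoader or any driver) whose query resolves to rows in a deliberately different order from the keys.

```javascript
// Hypothetical in-memory stand-in for the database (an assumption for this sketch).
// It returns matching rows in table order, which is NOT the order the keys arrive in.
const fakeDb = {
  users: [
    { id: 3, name: 'Cleo' },
    { id: 7, name: 'Ada' },
    { id: 9, name: 'Bo' },
  ],
  async query(ids) {
    return this.users.filter(u => ids.includes(u.id));
  },
};

// Naive version: passes the rows through in database order.
async function batchAuthorsNaive(ids) {
  return fakeDb.query(ids);
}

// Contract-respecting version: index rows by id, then walk the input keys.
async function batchAuthorsOrdered(ids) {
  const rows = await fakeDb.query(ids);
  const byId = new Map(rows.map(r => [r.id, r]));
  return ids.map(id => byId.get(id) ?? null);
}
```

Called with keys [7, 9, 3], the naive version puts user 3 in the slot for key 7, while the ordered version returns users 7, 9, 3 in that order and null for any id that has no row.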

Handling missing rows

If a key has no matching row in the database, map.get(id) returns undefined. Without the ?? null fallback, that slot resolves the .load() Promise to undefined, which reaches the resolver as the field value. GraphQL treats undefined like null, so for a non-nullable field this raises an execution error at runtime ("Cannot return null for non-nullable field") and nulls out the nearest nullable parent.

The correct approach depends on the field’s nullability:

Nullable field: return null for missing rows.
Non-nullable field that should always exist: return new Error('User not found: ' + id).

Returning an Error in a slot causes DataLoader to reject only that specific .load() Promise, not the other loads in the batch. This is the opposite of throwing from the batch function (or returning a rejected Promise), which rejects every .load() Promise in that batch.

Same failure, opposite blast radius: a per-slot Error fails one field surgically; a throw fails the whole batch.
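A minimal sketch of the per-slot Error pattern for a non-nullable field follows. The usersById table is a hypothetical stand-in for a query result, and batchRequiredUsers is an illustrative name, not something defined elsewhere in this lesson.

```javascript
// Hypothetical lookup table standing in for a users query (assumption for this sketch).
const usersById = new Map([
  [1, { id: 1, name: 'Ada' }],
  [3, { id: 3, name: 'Cleo' }],
]);

// Non-nullable field: a missing row becomes an Error in that key's slot only.
// The array keeps the same length and order as ids.
async function batchRequiredUsers(ids) {
  return ids.map(id =>
    usersById.get(id) ?? new Error('User not found: ' + id)
  );
}
```

For keys [1, 2, 3] this resolves to a three-element array whose middle slot is an Error instance, so only the load for key 2 would be rejected.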

One-to-many shape

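For a field like Post.tags, the loader key is the post ID and the value is an array of tags. Posts with no tags must still get an empty array in their slot. If you return undefined there, the resolver receives undefined where it expects a list; if you drop the slot entirely, the result is shorter than the keys and DataLoader rejects the whole batch with a TypeError.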

async function batchTags(postIds) {
  const rows = await db.query(
    'SELECT post_id, tag FROM tags WHERE post_id = ANY($1)', [postIds]
  );
  // Pre-fill every key with an empty array
  const map = new Map(postIds.map(id => [id, []]));
  rows.forEach(r => map.get(r.post_id).push(r.tag));
  return postIds.map(id => map.get(id));
}
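The grouping step can also be pulled out into a pure function so it is testable without a database. This is a sketch under the assumption that rows have the { post_id, tag } shape used above; groupTagsByPost is a hypothetical helper name.

```javascript
// Pure grouping step, separated from the query so it can be tested in isolation.
// `rows` is assumed to look like the query result above: { post_id, tag }.
function groupTagsByPost(postIds, rows) {
  // Pre-fill every requested key with its own empty array.
  const byPost = new Map(postIds.map(id => [id, []]));
  for (const row of rows) {
    // Skip rows for ids that were not requested instead of crashing on undefined.
    byPost.get(row.post_id)?.push(row.tag);
  }
  // Walk the keys in input order so slot i belongs to postIds[i].
  return postIds.map(id => byPost.get(id));
}
```

With postIds [10, 11, 12] and rows only for posts 10 and 12, the slot for post 11 is an empty array rather than undefined.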

Count shape

For Post.likeCount, the batch query runs SELECT post_id, COUNT(*) ... GROUP BY post_id and maps each key back to its count, defaulting to 0 for posts with no likes (they produce no row at all).
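A sketch of the count shape, following the same Map-then-walk pattern as the other loaders. The likes table name and the like_count alias are assumptions, and db is passed in as a parameter that is assumed to resolve query(sql, params) to an array of rows.

```javascript
// Hypothetical `db` parameter: assumed to expose query(sql, params) resolving to an array of rows.
// The `likes` table and `like_count` alias are illustrative names.
async function batchLikeCounts(db, postIds) {
  const rows = await db.query(
    'SELECT post_id, COUNT(*) AS like_count FROM likes ' +
      'WHERE post_id = ANY($1) GROUP BY post_id',
    [postIds]
  );
  // Some drivers return COUNT(*) as a string (bigint), so normalise to a number.
  const counts = new Map(rows.map(r => [r.post_id, Number(r.like_count)]));
  // Posts with no likes produce no row, so default to 0 instead of undefined.
  return postIds.map(id => counts.get(id) ?? 0);
}
```

Passing db as an argument keeps the function easy to exercise with a stub; in a real loader it would more likely be captured from the request context.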

Shape	Key	Return per key	Example loader
One-to-one	entity ID	row or null	authorLoader
One-to-many	parent ID	array (possibly empty)	tagsLoader
Count	parent ID	integer (possibly 0)	likeCountLoader

maxBatchSize

When a single GraphQL document references 5000 unique authors, one WHERE id IN (...) with 5000 IDs can be slower than five queries of 1000, because very long ID lists add parse and planning overhead in Postgres. Set options.maxBatchSize (typical: 500–1000) to split large batches automatically, and measure against your own workload before settling on a value.
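The effect of the option can be pictured as chunking the key list. The sketch below is a conceptual illustration only, not DataLoader's implementation; chunkKeys is a hypothetical helper.

```javascript
// Conceptual illustration only: with maxBatchSize set, the batch function is called
// several times with smaller key arrays, roughly as if the keys were chunked like this.
function chunkKeys(keys, maxBatchSize) {
  const chunks = [];
  for (let i = 0; i < keys.length; i += maxBatchSize) {
    chunks.push(keys.slice(i, i + maxBatchSize));
  }
  return chunks;
}

// With the library itself the option is passed at construction time, for example:
//   const authorLoader = new DataLoader(batchAuthors, { maxBatchSize: 1000 });
```

Each chunk still has to satisfy the ordering contract on its own: the batch function sees only that chunk's keys and must return exactly that many slots.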

Why this works

Why does DataLoader let you return an Error per slot instead of throwing? Because throwing rejects every pending .load() Promise in that batch: every resolver waiting on it gets a rejection, even those whose rows came back successfully. Per-slot errors give surgical failure: the one field with a missing row returns an error, and all others resolve normally. For a nullable field, Apollo Server turns that resolver error into a partial GraphQL response with one null field and one entry in errors.
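The difference in blast radius can be modelled in a few lines. This is a simplified model written for illustration (an assumption, not DataLoader's source): settleSlots and failWholeBatch are hypothetical names.

```javascript
// Simplified model: each slot settles its own promise.
// An Error value rejects that slot; anything else resolves it.
function settleSlots(values) {
  return values.map(v =>
    v instanceof Error ? Promise.reject(v) : Promise.resolve(v)
  );
}

// A thrown batch error carries no per-key slots, so every key in the batch
// receives the same rejection.
function failWholeBatch(keys, error) {
  return keys.map(() => Promise.reject(error));
}
```

Feeding both results to Promise.allSettled shows one rejected slot among fulfilled ones in the first case and nothing but rejections in the second.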

Quiz

A batch function returns rows in the order the database returned them, without building a lookup map. What is the symptom?

Quiz

A one-to-many batch function for Post.tags omits empty arrays for posts with no tags. What breaks?

Quiz

The batch function throws an unhandled exception. Which .load() Promises are rejected?

DB rows come back unordered; the Map re-keys them so results[i] matches keys[i]. Same length, same order.

Recall before you leave

01 Why must batchLoadFn return values in the same order as the input keys?
02 What is the difference between returning an Error in a slot vs throwing from batchLoadFn?
03 For a one-to-many loader, what must you return for a parent key that has no children?

Recap

The batch function contract has three rules: same length as input keys, values in the same order as input keys, and a value (or Error) for every key. Violating order silently returns wrong rows because DataLoader trusts position. Violating length is caught: DataLoader rejects every load in the batch with a TypeError. Leaving a one-to-many slot undefined instead of an empty array hands the resolver undefined where it expects a list. Return per-slot Errors for known-bad rows so other resolvers in the same batch succeed; throw only when the entire batch cannot proceed. When you write or review a batch function, run through the three rules before merging: a missing Map reorder will not show up in tests until a specific key ordering happens to expose it.


