
DataLoader mechanics: tick-boundary batching

DataLoader waits for the current event-loop tick to finish, then fires a single batched query covering every key queued across all resolvers, turning N per-item SQL calls into one.


After wiring DataLoader, Sven’s 51-query page drops to 2 queries, without changing the schema or the client request. The GraphQL document and the resolver signatures stay exactly the same. Only the way each resolver fetches its data changes.

By the end of this lesson you’ll understand the exact mechanism that turns 50 isolated resolver calls into a single SQL query — and the one mistake that silently leaks tenant data across requests.

What DataLoader does

DataLoader is a small library, originally extracted from Facebook’s GraphQL implementation and maintained under the GraphQL Foundation. It provides new DataLoader(batchLoadFn, options) where batchLoadFn(keys: Array<K>): Promise<Array<V | Error>> is your batched fetcher.
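A batch function matching that signature can be sketched as follows. This is a minimal illustration, not code from the lesson's project: USERS, findUsersByIds, and batchUsers are hypothetical names, and the in-memory array stands in for a real users table.

```javascript
// Hypothetical in-memory table standing in for a real `users` table.
const USERS = [
  { id: 9, name: "Sven" },
  { id: 7, name: "Bea" },
];

// Stand-in for `SELECT id, name FROM users WHERE id IN (...)`.
// Like a real database, it returns rows in its own order and omits missing IDs.
async function findUsersByIds(ids) {
  return USERS.filter((user) => ids.includes(user.id));
}

// Batch function: one lookup for all keys, results re-aligned to key order.
async function batchUsers(keys) {
  const rows = await findUsersByIds(keys);
  const byId = new Map(rows.map((row) => [row.id, row]));
  // Same length and order as `keys`; a missing row becomes an Error
  // for that position only, so the other keys still get their values.
  return keys.map((id) => byId.get(id) ?? new Error(`User ${id} not found`));
}
```

Passed as new DataLoader(batchUsers), a function shaped like this lets each loader.load(id) receive the row at its own position, regardless of the order the database returned.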

Resolvers call loader.load(key) and receive a Promise. Under the hood — this is the part worth understanding before you touch any production DataLoader config:

Each .load(key) queues the key into an internal array. No query runs yet.
When the current synchronous JavaScript turn ends, and the promise callbacks that were already pending have run, DataLoader takes all queued keys and calls batchLoadFn once with the full array.
batchLoadFn runs one WHERE id IN (...) query and returns an array of values with the same length and the same order as the input keys.
DataLoader resolves each waiting Promise with its value, or rejects it if the entry at that position is an Error.

The tick boundary is the natural join point: GraphQL’s execution engine finishes walking one level of the query synchronously. All 50 Post.author resolver calls queue their IDs during that level-traversal. DataLoader’s microtask fires after, when every ID has been collected.
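The queue-then-dispatch steps above can be sketched in a few lines. This is a simplified model, not the real DataLoader source: TinyLoader and demo are hypothetical names, there is no cache, and queueMicrotask stands in for the library's own scheduler, which differs in detail.

```javascript
// Minimal sketch of tick-boundary batching (NOT the real DataLoader code).
class TinyLoader {
  constructor(batchLoadFn) {
    this.batchLoadFn = batchLoadFn;
    this.queue = []; // pending { key, resolve, reject } entries
  }

  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first load in a turn schedules exactly one dispatch.
      if (this.queue.length === 1) {
        queueMicrotask(() => this.dispatch());
      }
    });
  }

  async dispatch() {
    const batch = this.queue;
    this.queue = [];
    // Deduplicate keys so each ID appears once in the batch.
    const keys = [...new Set(batch.map((entry) => entry.key))];
    try {
      const values = await this.batchLoadFn(keys);
      const byKey = new Map(keys.map((key, i) => [key, values[i]]));
      for (const { key, resolve, reject } of batch) {
        const value = byKey.get(key);
        if (value instanceof Error) reject(value);
        else resolve(value);
      }
    } catch (error) {
      // A failed batch rejects every waiting caller.
      for (const { reject } of batch) reject(error);
    }
  }
}

// Three loads in one synchronous turn produce a single batch call.
async function demo() {
  const names = new Map([[7, "Bea"], [9, "Sven"]]);
  const calls = [];
  const loader = new TinyLoader(async (keys) => {
    calls.push(keys);
    return keys.map((id) => names.get(id));
  });
  const results = await Promise.all([
    loader.load(7),
    loader.load(9),
    loader.load(7),
  ]);
  return { calls, results };
}
```

In this sketch, nothing is fetched while the three load calls run; the single dispatch happens only once the synchronous turn has finished, which is the behavior the tick boundary relies on.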

Why a fixed time window would be worse

A fixed 5 ms buffer fails in one of two ways: it adds up to 5 ms of latency to every batch, even when all keys were queued immediately, or under load it fires before every resolver has run and captures only some of the IDs. The tick boundary dispatches at the earliest moment when all IDs for the current level are queued, with zero unnecessary wait.

Event-loop tick (GraphQL resolves level 2):
  Post.author(post1) → loader.load(7)   // queued
  Post.author(post2) → loader.load(9)   // queued
  Post.author(post3) → loader.load(7)   // dedup: already queued

Microtask (tick ends):
  batchLoadFn([7, 9])
  → SELECT id, name FROM users WHERE id IN (7, 9)
  → resolves: 7→Bea, 9→Sven

All three Post.author Promises resolve: post1→Bea, post2→Sven, post3→Bea

Deduplication inside one request

If loader.load(7) is called twice in the same request, DataLoader returns the same Promise both times, and the key appears only once in the batch. This is useful when the same entity is referenced from multiple paths in one document (e.g. a post’s author and a comment’s author resolving to the same user). Opt out by passing { cache: false } in the options if you need fresh reads within one request (rare: read-after-write patterns).
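The same-Promise behavior can be modeled with a small memoizing wrapper. This is a sketch of the idea only, not DataLoader's implementation: makeCachedLoad and fetchOne are hypothetical names, and batching is left out to keep the caching visible.

```javascript
// Sketch of per-key Promise memoization (a model, not the library's code).
function makeCachedLoad(fetchOne) {
  const cache = new Map(); // key -> Promise, lives as long as this loader
  function load(key) {
    if (!cache.has(key)) {
      cache.set(key, fetchOne(key)); // first call for this key starts the fetch
    }
    return cache.get(key); // later calls reuse the same Promise
  }
  // Dropping a key forces the next load to fetch again.
  load.clear = (key) => cache.delete(key);
  return load;
}

let fetchCount = 0;
const loadUser = makeCachedLoad(async (id) => {
  fetchCount += 1;
  return { id };
});

const first = loadUser(7);
const second = loadUser(7);
const samePromise = first === second; // true: one fetch, one shared Promise
```

The real library also exposes loader.clear(key), which is often a narrower tool than disabling the cache entirely when only one key needs a fresh read after a write.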

The per-request instance rule

A DataLoader instance is a cache. Its lifetime must be the request, not the server process. A module-scope DataLoader shared across all requests:

Returns stale data: Request A loads user 7, the row is updated, Request B loads user 7 from the cache and gets the pre-update row.
Leaks across tenants: Request A and B belong to different tenants. Tenant B sees Tenant A’s cached row for the same ID.

The discipline: instantiate DataLoaders inside the request-context factory (the function Apollo Server calls per request), attach them to context, and let GC reclaim them when the request ends.

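import DataLoader from 'dataloader';

// Context factory: Apollo Server calls this once per request,
// so every request gets fresh loaders (and fresh caches).
// batchAuthors and batchTags are the batch functions, defined elsewhere.
const context = async ({ req }) => ({
  loaders: {
    author: new DataLoader(batchAuthors),
    tags:   new DataLoader(batchTags),
  },
});

// Resolver map: reads the loader from context, never from module scope.
const resolvers = {
  Post: {
    author: (post, _args, ctx) => ctx.loaders.author.load(post.authorId),
  },
};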

Same loader, opposite safety: per-request scope is correct; a global singleton is a shared cache that leaks tenant data and serves stale rows.
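The tenant leak can be reproduced with a toy cache. The sketch below is an assumption-laden model rather than DataLoader itself: ROWS, makeLoader, and the two handler functions are hypothetical, and the cache is keyed by user ID only, as a loader's cache would be.

```javascript
// Hypothetical data: the same user ID exists in two tenants.
const ROWS = {
  tenantA: { 7: "Bea (tenant A)" },
  tenantB: { 7: "Bo (tenant B)" },
};

// Minimal caching loader, keyed by user ID only.
function makeLoader() {
  const cache = new Map();
  return (tenant, id) => {
    if (!cache.has(id)) {
      cache.set(id, Promise.resolve(ROWS[tenant][id]));
    }
    return cache.get(id);
  };
}

// Bug: one loader at module scope, shared by every request.
const globalLoad = makeLoader();
async function handleWithGlobalLoader(tenant) {
  return globalLoad(tenant, 7);
}

// Fix: a fresh loader, and therefore a fresh cache, per request.
async function handleWithRequestLoader(tenant) {
  const load = makeLoader();
  return load(tenant, 7);
}

async function compare() {
  const leaked = [
    await handleWithGlobalLoader("tenantA"),
    await handleWithGlobalLoader("tenantB"), // served from tenant A's cache entry
  ];
  const isolated = [
    await handleWithRequestLoader("tenantA"),
    await handleWithRequestLoader("tenantB"),
  ];
  return { leaked, isolated };
}
```

In this model the second global request returns tenant A's row, because the cache never sees the tenant. The per-request version cannot leak, since its cache is discarded with the request.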

Warning

A global DataLoader saves memory in theory. In practice it leaks data across tenants and serves stale rows. Apollo’s docs are explicit: “DataLoader instances are per-request — if you use a DataLoader in your data source, ensure you create a new instance with every request.”

Quiz

loader.load(key) is called twice with the same key in the same request. What happens?

Quiz

Why does DataLoader batch on the event-loop tick boundary instead of a fixed 5 ms window?

Complete the analogy

Fill in the blank: DataLoader gathers all .load() calls made in the same event-loop _______, then fires one batch.

All .load() calls in one event-loop tick coalesce into a single SQL query. 51 trips collapse to 2.

Recall before you leave

01 Why must a DataLoader instance be created per request, not once at server start?
02 When does DataLoader fire the batch function relative to the resolver calls?

Recap

DataLoader moves the fetch out of each resolver and into a per-request batch. Every resolver calls loader.load(id) and gets a Promise. DataLoader holds all queued IDs until the event-loop tick ends, then fires one WHERE id IN (...) query for the full set. Keys are deduplicated within the batch, so loader.load(7) called from two different paths in one document fires one SQL lookup, not two. The instance must be created per request — a global instance leaks tenant data and returns stale rows. Now when you see a DataLoader in a codebase, your first question should be: is this instantiated in the context factory, or is it a module-level singleton? The answer determines whether you have a correctness bug waiting to surface in production.

