Federation and lookahead: batching beyond DataLoader

Crux: Apollo Federation gives you cross-subgraph batching for free via _entities; you still owe the intra-subgraph DataLoader. Lookahead joins collapse multi-level nesting at the cost of resolver isolation.

A federated graph routes a query for 50 posts and their authors across the posts subgraph and the users subgraph. The supergraph router batches the 50 user lookups into one _entities call automatically. Inside the users subgraph, a naive __resolveReference implementation runs once per representation and queries the database 50 times. Federation handed you the network batch for free; you still owe the database batch yourself.

How Apollo Federation batches across subgraphs

In a federated schema, each subgraph that owns an entity type exposes a __resolveReference(representation) resolver. When a query crosses subgraph boundaries — say the posts subgraph returns Post.author: User and the users subgraph owns User — the supergraph router:

  1. Gathers all User references that arose while resolving the query (e.g. 50 posts each referencing an author).
  2. Sends one _entities(representations: [_Any!]!) call to the users subgraph with all 50 representations.
  3. Waits while the users subgraph calls __resolveReference once for each representation, then merges the returned entities into the response.

The cross-subgraph batch is automatic and free. The router does the work.
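
As a rough illustration of step 2, the sketch below builds the kind of request body a router could send to the users subgraph. The _entities field and the _Any scalar come from the Federation subgraph contract; the function name buildEntitiesRequest, the selected field name, and the sample ids are assumptions made for this example, and a real router generates this operation from its query plan.

```javascript
// Hypothetical helper: turn the authors referenced by a list of posts into
// one _entities request for the users subgraph.
function buildEntitiesRequest(posts) {
  return {
    // One operation, regardless of how many users are referenced.
    query: `query ($representations: [_Any!]!) {
      _entities(representations: $representations) {
        ... on User { name }
      }
    }`,
    variables: {
      // Each representation carries the type name plus the entity's key fields.
      representations: posts.map((post) => ({
        __typename: 'User',
        id: post.authorId,
      })),
    },
  };
}

// 50 sample posts, each pointing at an author (made-up ids).
const posts = Array.from({ length: 50 }, (_, i) => ({ id: i + 1, authorId: 100 + i }));
const request = buildEntitiesRequest(posts);
console.log(request.variables.representations.length); // 50
```

The point is the shape: fifty references travel in one network call, and the subgraph receives them as a list.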

What you still owe inside __resolveReference

The naive implementation:

// users subgraph: naive, N+1 inside _entities
const resolvers = {
  User: {
    __resolveReference: async (rep) => {
      return db.users.findById(rep.id); // One query per representation
    },
  },
};

With 50 representations, this fires 50 SQL queries inside the single _entities call. You have to add DataLoader:

// users subgraph: correct
const resolvers = {
  User: {
    // ctx.loaders.user is a DataLoader created for the current request
    __resolveReference: (rep, ctx) => ctx.loaders.user.load(rep.id),
  },
};

The loader batches all 50 .load(rep.id) calls from the 50 __resolveReference invocations into one WHERE id IN (...) query.

Supergraph router:
  posts subgraph → 50 posts, each with authorId
  → one _entities([{__typename:"User", id:7}, ...x50]) call to users subgraph

users subgraph (naive):
  __resolveReference({id:7})  → SELECT * FROM users WHERE id=7   (1 SQL)
  __resolveReference({id:9})  → SELECT * FROM users WHERE id=9   (1 SQL)
  ... × 50                                                       (50 SQL)

users subgraph (with DataLoader):
  __resolveReference({id:7})  → loader.load(7)   (queued)
  __resolveReference({id:9})  → loader.load(9)   (queued)
  ... × 50                    → batchFn([7,9,...]) → 1 SQL
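
To make the queueing concrete, here is a minimal sketch of the mechanism. TinyLoader is a simplified stand-in written for this example, not the dataloader package; it only shows how load calls made in the same tick end up in one batch function call. The in-memory usersTable, the stats counter, and createContext are likewise assumptions for illustration.

```javascript
// Simplified stand-in for a batching loader: keys requested in the same tick
// are collected and handed to the batch function in a single call.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;
    this.queue = [];
  }

  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // Schedule one flush when the first key of a new batch arrives.
      if (this.queue.length === 1) queueMicrotask(() => this.flush());
    });
  }

  async flush() {
    const batch = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(batch.map((item) => item.key));
      batch.forEach((item, i) => item.resolve(values[i]));
    } catch (err) {
      batch.forEach((item) => item.reject(err));
    }
  }
}

// Hypothetical in-memory table standing in for the users database.
const usersTable = new Map([
  [7, { id: 7, name: 'Ada' }],
  [9, { id: 9, name: 'Lin' }],
]);

// Build the loaders once per request and pass them through the context.
function createContext() {
  const stats = { sqlCount: 0 };
  const batchUsers = async (ids) => {
    stats.sqlCount += 1; // stands in for SELECT * FROM users WHERE id IN (...)
    // Return results in the same order as the requested ids.
    return ids.map((id) => usersTable.get(id) ?? null);
  };
  return { stats, loaders: { user: new TinyLoader(batchUsers) } };
}

const resolvers = {
  User: {
    __resolveReference: (rep, ctx) => ctx.loaders.user.load(rep.id),
  },
};

async function demo() {
  const ctx = createContext();
  const reps = [{ __typename: 'User', id: 7 }, { __typename: 'User', id: 9 }];
  const users = await Promise.all(
    reps.map((rep) => resolvers.User.__resolveReference(rep, ctx)),
  );
  return { users, sqlCount: ctx.stats.sqlCount };
}

demo().then((result) => console.log(result.sqlCount)); // 1
```

Two details carry over to the real library: the batch function must return results in the same order as the keys it received, and loaders are usually created per request because DataLoader also caches values on the instance.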

Lookahead: collapsing multi-level nesting

DataLoader fixes the “N queries for one relation” shape: N lookups at a single level collapse into one batched query. If the client asks for posts { author { profile { bio } } }, there are still three levels of resolution. DataLoader batches each level independently, so the request costs three round trips, not one.
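
The count can be checked with a toy simulation. Everything in the sketch below is assumed for illustration: the in-memory tables, the helper names, and the round-trip counter stand in for a real database and for per-level batching.

```javascript
// Hypothetical in-memory tables standing in for a database.
const tables = {
  posts: [
    { id: 1, authorId: 7 },
    { id: 2, authorId: 9 },
    { id: 3, authorId: 7 },
  ],
  users: [
    { id: 7, profileId: 70 },
    { id: 9, profileId: 90 },
  ],
  profiles: [
    { id: 70, bio: 'writes about APIs' },
    { id: 90, bio: 'likes databases' },
  ],
};

let roundTrips = 0;

// Each helper call counts as one trip to the database.
function selectAll(table) {
  roundTrips += 1;
  return tables[table];
}

function selectWhereIdIn(table, ids) {
  roundTrips += 1;
  const wanted = new Set(ids);
  return tables[table].filter((row) => wanted.has(row.id));
}

// Perfect batching at every level still needs one trip per level.
function biosLevelByLevel() {
  roundTrips = 0;
  const posts = selectAll('posts'); // level 1: posts
  const users = selectWhereIdIn('users', posts.map((p) => p.authorId)); // level 2: authors
  const profiles = selectWhereIdIn('profiles', users.map((u) => u.profileId)); // level 3: profiles
  return { bios: profiles.map((p) => p.bio), roundTrips };
}

console.log(biosLevelByLevel().roundTrips); // 3
```

Each level depends on ids produced by the level above it, which is why batching alone cannot merge the trips.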

Resolver lookahead reads the info argument’s GraphQL AST inside the top-level resolver to see which child fields the client requested, then joins them in one SQL query.

const resolvers = {
  Query: {
    posts: (_, __, ctx, info) => {
      // selectionSetContains is a helper you write yourself; it is not part of graphql-js
      const wantsAuthor = selectionSetContains(info, 'author');
      const query = wantsAuthor
        ? 'SELECT posts.*, users.name AS author_name FROM posts JOIN users...'
        : 'SELECT * FROM posts';
      return db.query(query);
    },
  },
};
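
The helper selectionSetContains used above is not defined in the lesson. One possible sketch follows, assuming the info argument has the graphql-js shape (fieldNodes and fragments). It checks only the direct children of the field being resolved and ignores @skip and @include directives, so treat it as a starting point. The field builder and the hand-built info object exist only so the example runs without a GraphQL server.

```javascript
// Hypothetical helper: does the field being resolved directly select `fieldName`?
function selectionSetContains(info, fieldName) {
  const visit = (selectionSet) => {
    if (!selectionSet) return false;
    return selectionSet.selections.some((node) => {
      if (node.kind === 'Field') return node.name.value === fieldName;
      // Inline fragments and named fragments can hide the field one step down.
      if (node.kind === 'InlineFragment') return visit(node.selectionSet);
      if (node.kind === 'FragmentSpread') {
        const fragment = info.fragments[node.name.value];
        return fragment ? visit(fragment.selectionSet) : false;
      }
      return false;
    });
  };
  // fieldNodes holds every occurrence of the current field in the operation.
  return info.fieldNodes.some((fieldNode) => visit(fieldNode.selectionSet));
}

// Tiny builder for hand-made AST nodes, used only in this example.
const field = (name, selections) => ({
  kind: 'Field',
  name: { kind: 'Name', value: name },
  selectionSet: selections ? { kind: 'SelectionSet', selections } : undefined,
});

// Equivalent to resolving `posts` in: { posts { title author { name } } }
const info = {
  fieldNodes: [field('posts', [field('title'), field('author', [field('name')])])],
  fragments: {},
};

console.log(selectionSetContains(info, 'author')); // true
console.log(selectionSetContains(info, 'name')); // false
```

The second call returns false because name sits under author, not directly under posts; a deeper lookahead would need to walk the nested selection sets as well.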

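Some tools automate this by translating the selection set into a SQL plan: Pothos does it through its Prisma plugin, and Hasura compiles the whole GraphQL query to SQL. With TypeGraphQL you can do it by hand, building the join with TypeORM’s SelectQueryBuilder inside a @Query resolver.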

Why this works

The tradeoff: lookahead resolvers know about the entire subtree. The top-level resolver is no longer isolated, because it changes behaviour depending on which child fields the client selected. This gives up per-field isolation, one of GraphQL’s key design properties. Use lookahead only when measurement shows that DataLoader at each level still leaves noticeable latency.

Quiz

A federated supergraph receives a query that references 50 users across two subgraphs. What does the router do, and what does the users subgraph still need?

Quiz

What is the main tradeoff of resolver lookahead compared to DataLoader?

Recall before you leave
  1. Why does Apollo Federation's _entities batching not eliminate the need for DataLoader inside subgraphs?
  2. When would you choose lookahead over DataLoader?
Recap

Apollo Federation handles cross-subgraph batching at the router: all user references in a query become one _entities call to the users subgraph. Inside the subgraph, __resolveReference is still called once per representation, and DataLoader is still required to batch those into one SQL query. For queries that traverse multiple levels of nesting, lookahead reads the client’s AST selection set at the top level and issues one joined SQL query — collapsing three DataLoader trips into one at the cost of per-field isolation.

Trademarks belong to their respective owners. Editorial reference only.