Composing LLM apps: free-recall review

Free-recall prompts spanning the whole AI/LLM track. Answer each from memory first, then reveal the model answer and compare, focusing on the seams between layers.

AI Senior ◷ 14 min

Retrieval beats re-reading. For each prompt, reconstruct a full answer from memory — across the whole track, not one layer — before you open the model answer. The effort of recall is what makes the seam-level reasoning stick.

Goal

Reconstruct the track’s spine: how caching, RAG, streaming, tool calls, agent loops, and evals compose — and where each pair of correct layers breaks at the seam.

Recall before you leave

01. Why can a RAG-backed assistant with a long static system prompt still see near-zero cache hit rate in prod, and how do you fix it?
02. Why is a streamed turn a state machine rather than a token pipe, and what happens if you ignore that?
03. Why are token-budget alerts not the same as budget enforcement for an agent, and what does enforcement look like?
04. Why does a green offline eval suite still let a retrieval regression ship, and how do you close the gap?
05. State the 'bug lives in the seam' thesis and how it changes how you debug a composed LLM app.
06. Walk the order you'd debug a composed assistant whose cost tripled and answers feel slower after shipping.

Recap

If you could reconstruct each answer from memory, you hold the track’s spine. Caching is a byte-for-byte prefix match, so dynamic RAG context belongs after the cache breakpoint. The stream is a state machine in which tool_use is a transition and every tool-call id must be paired with its result. Agent loops need enforced step and dollar caps, not alerts. Evals must run the live retrieval path, or retrieval regressions ship green. The meta-lesson over all of it: the bug lives in the seam, so trace one real request end to end and model the whole flow. When you hit an unexpected cost spike or a silent regression after shipping, look at the boundary first, not the component.
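
To make the first point of the recap concrete, here is a minimal sketch of why prompt layout decides how much of a request can be served from cache. It calls no real provider API. `STATIC_SYSTEM`, `context_first`, `static_first`, and the sample requests are all hypothetical names and data invented for illustration, and the "cache" is modelled only as the number of leading bytes two requests have in common.

```python
# Hypothetical illustration: models a prefix cache as "shared leading bytes".
# No provider SDK is used; all names and sample data are made up.

STATIC_SYSTEM = "You are a support assistant. Follow the policy below.\n" * 20


def shared_prefix_len(a: bytes, b: bytes) -> int:
    """Return how many leading bytes of a and b are identical."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def context_first(chunks: list[str], question: str) -> bytes:
    # Layout under test: retrieved chunks come before the static prompt,
    # so the very first bytes change from request to request.
    return ("\n".join(chunks) + "\n" + STATIC_SYSTEM + question).encode("utf-8")


def static_first(chunks: list[str], question: str) -> bytes:
    # Layout under test: the static prompt comes first and the dynamic
    # RAG context follows it, where a cache breakpoint would sit.
    return (STATIC_SYSTEM + "\n".join(chunks) + "\n" + question).encode("utf-8")


# Two requests whose retrieved context differs.
req_a = (["Refunds take 5 days."], "How long do refunds take?")
req_b = (["Shipping is free over $50."], "Is shipping free?")

bad = shared_prefix_len(context_first(*req_a), context_first(*req_b))
good = shared_prefix_len(static_first(*req_a), static_first(*req_b))
static_len = len(STATIC_SYSTEM.encode("utf-8"))

print(f"context-first shared prefix: {bad} bytes")
print(f"static-first shared prefix:  {good} bytes (static prompt is {static_len})")
```

In this toy model the context-first layout shares no leading bytes between the two requests, while the static-first layout shares the entire static prompt. Real cache behaviour (minimum cacheable length, expiry, where breakpoints may be placed) depends on the provider and is not modelled here.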
