
Prompt caching: free-recall review

Free-recall prompts across the prompt-caching unit. Answer each in your own words first, then reveal the model answer and compare.

AI Senior ◷ 14 min

Retrieval beats re-reading. For each prompt, say or write a full answer from memory before you open the model answer — the effort of recall is what makes the mechanism stick.

Goal

Reconstruct the unit’s core mechanisms — token-for-token prefix matching, the write/read economics, TTL choice, the ordering rule, and silent prefix poisoning — without looking back at the lesson.

Recall before you leave

01. Why is prompt caching positional rather than semantic, and what does that imply for prompt design?
02. Walk through the write/read economics and how the break-even falls out.
03. How does the TTL work, when does the 1-hour tier earn its 2x premium, and how do you reason about the break-even between tiers?
04. What is the minimum cacheable length, and what is the dangerous thing about hitting it?
05. Explain silent prefix poisoning: how a single careless edit multiplies the input bill by 10 without raising any error.
06. What are the cache breakpoints, and why do people stack them on a long layered prompt?
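
Once you have answered prompts 01 and 05 from memory, a toy model can help you check your reasoning. The sketch below is an illustration, not any provider's implementation: whitespace-split words stand in for real tokens, and `shared_prefix_len` is a hypothetical helper that counts how many leading tokens two requests have in common. A real cache only reuses content up to a declared breakpoint, but the positional rule is the same.

```python
def shared_prefix_len(cached: list[str], incoming: list[str]) -> int:
    """Count leading tokens that match exactly, stopping at the first difference."""
    n = 0
    for a, b in zip(cached, incoming):
        if a != b:
            break
        n += 1
    return n


# Assumption: whitespace "tokens" stand in for real tokenizer output.
stable = "You are a support agent . Follow the policy below .".split()
question = "How do I reset my password ?".split()

# Stable content first, volatile content last: the whole stable block is reusable.
good_first = stable + question
good_second = stable + "Where is my invoice ?".split()

# A volatile timestamp at position zero changes the very first token,
# so nothing after it can match, even though the rest is identical.
bad_first = ["[2024-01-01T10:00]"] + stable + question
bad_second = ["[2024-01-01T10:01]"] + stable + question

reusable_good = shared_prefix_len(good_first, good_second)
reusable_bad = shared_prefix_len(bad_first, bad_second)

print(f"stable-first prompt: {reusable_good} of {len(stable)} stable tokens reusable")
print(f"timestamp-first prompt: {reusable_bad} tokens reusable")
```

Note that the two poisoned requests are semantically almost identical, yet the reusable prefix is empty. That is the sense in which matching is positional rather than semantic.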

Recap

If you could reconstruct each answer from memory, you hold the unit’s spine. Matching is positional and token-for-token from position zero, so stable content goes first, volatile content goes last, and the breakpoint sits on the final unchanging block. A cache write costs 1.25x the base input price once and each read costs 0.1x, so caching pays for itself on the first re-read inside the TTL: 5 minutes by default, 1 hour for bursty traffic with longer gaps. Below the model’s minimum cacheable length nothing is cached, and no error tells you so. The typical production failure mode is prefix poisoning near token zero, visible only in the usage block. So the next time you are about to add anything to a system prompt (a timestamp, a feature flag, a new tool), stop and ask: does this go before or after the breakpoint?
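
The break-even in the recap can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the two multipliers quoted above and assumes every request lands inside a single TTL window; the function names are illustrative, and costs are expressed relative to sending the prefix once at the base input price.

```python
WRITE_MULT = 1.25  # one-time cache write, relative to base input price (from the recap)
READ_MULT = 0.10   # each cache read, relative to base input price (from the recap)


def cached_cost(requests: int) -> float:
    """Relative prefix cost with caching: one write, then reads for the rest."""
    if requests <= 0:
        return 0.0
    return WRITE_MULT + READ_MULT * (requests - 1)


def uncached_cost(requests: int) -> float:
    """Relative prefix cost without caching: full price on every request."""
    return 1.0 * max(requests, 0)


def break_even_requests() -> int:
    """Smallest number of requests at which caching is strictly cheaper."""
    n = 1
    while cached_cost(n) >= uncached_cost(n):
        n += 1
    return n


for n in (1, 2, 5):
    print(f"{n} request(s): cached {cached_cost(n):.2f} vs uncached {uncached_cost(n):.2f}")
print("caching wins from request", break_even_requests())
```

A single request is more expensive with caching (1.25 versus 1.00), while two requests already come out ahead (1.35 versus 2.00), which is why the first re-read inside the TTL is the break-even point.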
