Vector search: free-recall review

Free-recall prompts across the vector-search unit — answer each from memory first, then reveal the model answer and compare.

Retrieval beats re-reading. For each prompt, say or write a full answer from memory before you open the model answer — the effort of recall is what makes the material stick.

Goal

Reconstruct the unit’s spine — the recall–latency–memory triangle, why recall fails silently, HNSW vs IVF vs IVF-PQ, metric choice, post-filtering, and hybrid search — without looking back at the lesson.

Recall before you leave

01. Why is a low-recall ANN index so dangerous in production, and how do you actually detect it?
02. Describe the recall–latency–memory triangle and which knob moves which axis in HNSW.
03. When do you choose HNSW, IVFFlat, or IVF-PQ, and what does each cost?
04. How do cosine, dot product, and L2 relate, and how do you choose a metric?
05. Why does metadata filtering interact badly with ANN, and how does pgvector address it?
06. What is hybrid search, when do you need it, and how does Reciprocal Rank Fusion combine results?
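
Once you have attempted prompt 01, it can help to see what "detecting" low recall looks like in practice. The following is a minimal sketch, not the lesson's own code, of computing recall@k against an exact brute-force baseline. The function names `exact_top_k` and `recall_at_k`, the synthetic vectors, and the simulated ANN result are all assumptions made for illustration; a real check would compare your index's output on sampled production queries.

```python
import math
import random


def exact_top_k(query, vectors, k):
    """Brute-force baseline: ids of the k nearest vectors by L2 distance."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate result also returned."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)


# Synthetic data (assumption): 200 random 8-dimensional vectors and one query.
random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(200)]
query = [random.random() for _ in range(8)]

truth = exact_top_k(query, vectors, k=10)

# Stand-in for an ANN result that silently missed 3 of the true neighbours.
# `far_ids` are hypothetical ids taken from outside the true top 10.
far_ids = [i for i in range(len(vectors)) if i not in truth][:3]
approx = truth[:7] + far_ids

print(recall_at_k(approx, truth))  # 0.7
```

The approximate result still contains ten plausible-looking ids and raises no error, which is why the gap only shows up when you compare against the exact answer.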

Recap

If you could reconstruct each answer from memory, you hold the unit’s spine: recall fails silently so you measure recall@k against an exact baseline; the recall–latency–memory triangle is dialed by M, ef_construction, and ef_search; HNSW is the default while IVF-PQ is the memory-driven escape hatch; the metric must match the model and the index opclass; selective filters collapse post-filtered recall until you use iterative scan; and hybrid BM25 + vector with rank fusion is the answer whenever exact tokens matter.
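
As a concrete reminder of the rank-fusion step, here is a minimal sketch of Reciprocal Rank Fusion, assuming two already-ranked lists of document ids. The function name `reciprocal_rank_fusion`, the example ids, and the constant `k=60` (a commonly used default, not a value taken from this lesson) are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked id lists: each list adds 1 / (k + rank) per id, with rank starting at 1."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results: BM25 ranks the exact-token match first,
# while the vector search ranks a semantically similar document first.
bm25_hits = ["doc_sku", "doc_a", "doc_b"]
vector_hits = ["doc_a", "doc_c", "doc_sku"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['doc_a', 'doc_sku', 'doc_c', 'doc_b']
```

Because the fusion uses only ranks, the BM25 and vector scores never need to be put on a common scale; documents that appear in both lists rise above those that appear in only one.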
