Data Engineering
Vector search: free-recall review
Retrieval beats re-reading. For each prompt, say or write a full answer from memory before you open the model answer — the effort of recall is what makes the material stick.
Reconstruct the unit’s spine — the recall–latency–memory triangle, why recall fails silently, HNSW vs IVF vs IVF-PQ, metric choice, post-filtering, and hybrid search — without looking back at the lesson.
- 01 Why is a low-recall ANN index so dangerous in production, and how do you actually detect it?
- 02 Describe the recall–latency–memory triangle and which knob moves which axis in HNSW.
- 03 When do you choose HNSW, IVFFlat, or IVF-PQ, and what does each cost?
- 04 How do cosine, dot product, and L2 relate, and how do you choose a metric?
- 05 Why does metadata filtering interact badly with ANN, and how does pgvector address it?
- 06 What is hybrid search, when do you need it, and how does Reciprocal Rank Fusion combine results?
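Once you have attempted prompt 06 from memory, the sketch below may help you check the fusion step. It is a minimal illustration of Reciprocal Rank Fusion, not any particular library's implementation: each document scores the sum of 1 / (k + rank) over the ranked lists it appears in. The function name `reciprocal_rank_fusion`, the sample document ids, and the default `k=60` (a commonly used constant) are assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: list of lists, each ordered best-first (e.g. BM25 and vector hits).
    k: smoothing constant; 60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based, so the top hit of each list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order (sorted is stable).
    return sorted(scores, key=lambda doc_id: -scores[doc_id])


# Hypothetical result lists from a keyword retriever and a vector retriever.
bm25_hits = ["a", "b", "c"]
vector_hits = ["c", "a", "d"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['a', 'c', 'b', 'd']
```

Only ranks enter the formula, so the raw BM25 and similarity scores never need to be put on a common scale, which is the usual reason for fusing by rank rather than by score.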
If you could reconstruct each answer from memory, you hold the unit’s spine: recall fails silently, so you measure recall@k against an exact baseline; the recall–latency–memory triangle is dialed by M, ef_construction, and ef_search; HNSW is the default while IVF-PQ is the memory-driven escape hatch; the metric must match both the embedding model and the index opclass; highly selective filters collapse recall under post-filtering unless you enable iterative scan; and hybrid BM25 + vector search with rank fusion is the answer whenever exact tokens matter.
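Measuring recall@k against an exact baseline can be sketched as follows. This is a minimal sketch that assumes you have already collected, for each query, the ids returned by the ANN index and the ids returned by an exact (brute-force) search. The function name `recall_at_k` and the sample ids are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Mean over queries of |ANN top-k ∩ exact top-k| / |exact top-k|.

    approx_ids: per-query lists of ids returned by the ANN index, best-first.
    exact_ids:  per-query lists of ids from an exact search, best-first.
    """
    if len(approx_ids) != len(exact_ids):
        raise ValueError("need one ANN result list per exact result list")
    if not exact_ids:
        raise ValueError("need at least one query")
    total = 0.0
    for approx, exact in zip(approx_ids, exact_ids):
        truth = set(exact[:k])
        if not truth:
            raise ValueError("exact baseline returned no neighbours for a query")
        found = set(approx[:k])
        total += len(truth & found) / len(truth)
    return total / len(exact_ids)


# Hypothetical ids for two queries with k = 4.
exact = [[1, 2, 3, 4], [5, 6, 7, 8]]
approx = [[1, 2, 3, 9], [5, 6, 10, 11]]
print(recall_at_k(approx, exact, k=4))  # 0.625
```

Here the first query recovers 3 of 4 true neighbours and the second recovers 2 of 4, so the mean is 0.625. Both queries still return four plausible-looking results, which is why the drop goes unnoticed without a baseline.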