Algorithms from zero
Top-K and the k-th largest element
You have a stream of numbers. You want to know the k largest. Naive answer: wait for all numbers, sort them, take the last k. But that is O(n log n) time and O(n) space—you store everything. The insight: you only care about the top k, so keep a min-heap of size k. Push each number. When the heap exceeds k elements, pop the smallest (the minimum). At the end, your heap holds exactly the k largest, and its root is the boundary—the k-th largest element. Time: O(n log k). Space: O(k). When k is small, this crushes sorting.
After this lesson you can solve the k-largest problem using a min-heap of size k: push each element and evict the minimum when the heap grows beyond k. You understand why this works: the heap always holds the k candidates, and the root is the k-th largest. You can explain the time complexity O(n log k) and space complexity O(k), and why this beats sorting O(n log n) when k ≪ n. You can recognize the “k-th largest” and “top k frequency” patterns on LeetCode and other platforms, and apply the same heap-based approach. You can code it in one pass (no sorting phase), and trace the algorithm on a concrete array step by step.
The problem: find the k largest (or k smallest)
You are given an array of numbers. Your task: find the k largest elements. For example:
- Array: [3, 2, 1, 5, 6, 4], k = 2 → answer: [5, 6]
- Array: [3, 2, 3, 1, 2, 4, 5, 5, 6], k = 4 → answer: [5, 5, 6, 4] (or [6, 5, 5, 4], order may vary)
Naive approach: sort
Sort the array and take the last k elements. Time: O(n log n), space: O(1) if in-place or O(n) with external storage.
Problem: you must sort the entire array, even though you only care about the top k. When k is small, this wastes work.
The heap insight: keep a size-k heap
Maintain a min-heap of size exactly k. This heap will always contain the k largest elements seen so far.
Algorithm:
- For each element in the array:
- Push the element into the min-heap.
- If the heap size exceeds k, pop the minimum.
- After processing all elements, the heap contains exactly the k largest.
- The root of the min-heap is the k-th largest element (the smallest of the top k).
Why min-heap, not max-heap? Because you want to evict the smallest candidate when you have too many. A min-heap’s root is the smallest, so you pop it immediately.
Why does this work?
At any point, the heap holds the k largest elements seen so far. When a new element comes in:
- If it is larger than the root, it belongs in the top k, so you remove the current minimum (root) and insert the new element.
- If it is smaller than the root, it is not in the top k, so you ignore it.
After processing all n elements, the heap holds the k largest globally. The root is the boundary: it is the largest element among the top k that could be evicted, i.e., the k-th largest.
Example: find the 2 largest in [3, 2, 1, 5, 6, 4]
Min-heap of size k=2:
- Push 3: heap = [3]. Size = 1 < k, continue.
- Push 2: heap = [2, 3]. Size = 2 = k, continue.
- Push 1: heap = [1, 3, 2]. Size = 3 > k, pop min (1). heap = [2, 3].
- Push 5: heap = [2, 3, 5]. Size = 3 > k, pop min (2). heap = [3, 5].
- Push 6: heap = [3, 5, 6]. Size = 3 > k, pop min (3). heap = [5, 6].
- Push 4: heap = [4, 5, 6]. Size = 3 > k, pop min (4). heap = [5, 6].
Final heap: [5, 6]. The k largest are 5 and 6. ✓
Symmetric problem: find the k smallest
If you want the k smallest elements, use a max-heap of size k. The logic is identical: when the heap exceeds k, pop the maximum (the root). At the end, the heap holds the k smallest, and the root is the k-th smallest.
Common pattern: k-th largest element
Often, the problem asks only for the k-th largest element, not all k. After building the heap, return heap.peek() (the root)—it is the k-th largest.
Common pattern: top k frequency
Given an array, find the k most frequent elements. Solution:
- Count frequency of each element (hash map).
- Build a min-heap of size k on frequencies (not elements).
- Evict the least frequent when the heap exceeds k.
- The heap contains the k most frequent elements.
This is a direct application of the same pattern.
Top-k using a min-heap
1
function topK(arr, k) {
2
// Min-heap (priority queue with smaller values having higher priority)
3
const minHeap = new MinPriorityQueue();
4
5
// Push each element and evict the min if heap exceeds k
6
for (const num of arr) {
7
minHeap.insert(num, num); // insert(value, priority)
8
if (minHeap.size() > k) {
9
minHeap.extractMin(); // Remove the smallest
10
}
11
}
12
13
// Extract all elements from heap (they are in heap order, not sorted)
14
const result = [];
15
while (!minHeap.isEmpty()) {
16
result.push(minHeap.extractMin());
17
}
18
return result;
19
}
20
21
// Example
22
const arr = [3, 2, 1, 5, 6, 4];
23
const k = 2;
24
const topTwoLargest = topK(arr, k);
25
console.log(topTwoLargest); // [5, 6] (order may vary; both are in the heap)
- L2 A min-heap (priority queue with min at root). Size will stay ≤ k.
- L5 For each element in the array.
- L6 Push the element with priority = value (smaller priority = higher priority).
- L7 If heap size exceeds k, the smallest element (the root) is removed.
- L12 Extract all remaining elements. Since it is a heap (not a sorted array), the order may not be strictly ascending, but all k largest are present.
Finding the k-th largest element
1
const arr = [3, 2, 1, 5, 6, 4];
2
const k = 2;
3
const minHeap = new MinPriorityQueue();
4
for (const num of arr) {
5
minHeap.insert(num);
6
if (minHeap.size() > k) minHeap.extractMin();
7
}
Top-K using a min-heap of size k
For each of the n elements in the array:
- Insert: O(log k) (heap has at most k+1 elements before pop).
- ExtractMin (when heap exceeds k): O(log k).
Total: n iterations × O(log k) per iteration = O(n log k) time.
Space: The heap holds at most k elements, so O(k) space.
Comparison with sorting
If you sort the entire array: O(n log n) time, O(1) auxiliary space (if in-place).
Top-K with a heap: O(n log k) time, O(k) space.
When is the heap approach better?
- If k ≪ n (k is much smaller than n), then log k ≪ log n, so O(n log k) ≪ O(n log n).
- If k is close to n, sorting may be simpler and faster in practice.
Example:
- n = 1,000,000, k = 10: O(n log k) = O(1,000,000 × 3.3) ≈ 3.3 million operations. O(n log n) = O(1,000,000 × 20) ≈ 20 million operations. Heap is faster.
- n = 1,000, k = 500: O(n log k) = O(1,000 × 9) ≈ 9,000. O(n log n) = O(1,000 × 10) ≈ 10,000. Similar; sorting may be simpler.
Space-time tradeoff
Sorting uses O(n) space (or O(1) with in-place tricks). Top-K uses O(k) space. When k is much smaller than n, the heap saves memory.
To find the k largest elements, should you use a min-heap or a max-heap of size k?
What is the k-th largest element in the context of the top-K heap algorithm?
Given arr = [3, 2, 1, 5, 6, 4] and k = 2, what is the k-th largest element?
What is the time complexity of finding the k largest elements using a min-heap?
You want to find the 5 largest elements in an array of 1,000,000 numbers. Approximately how many more operations does sorting take compared to the heap approach? (Use log₂(1,000,000) ≈ 20 and log₂(5) ≈ 2.3.)
The 'top k frequent' problem asks: given an array, find the k most frequent elements. How would you apply the heap strategy?
lesson.inset.undefined
Why top-K matters. Real-world systems often ask: give me the top 10 search results, or the top 100 users by engagement, or the top k rarest errors. Sorting the entire dataset every time is wasteful. A top-K heap lets you answer these queries efficiently, even on large datasets and streams. Streaming use case: imagine a sensor emits readings continuously. You want to know the 100 largest readings seen so far. You cannot store all readings (memory-limited). A size-100 heap lets you maintain the top 100 online, processing each reading in O(log 100) time.
lesson.inset.undefined
Using a max-heap instead of a min-heap. If you keep a max-heap of size k to find the k largest, the largest element is at the root—but you want to evict the smallest candidate when the heap exceeds k. A max-heap forces you to search the entire heap for the minimum, which is O(k), not O(log k). Min-heap keeps the smallest candidate (the evictee) at the root, so one comparison tells you if a new element belongs in the top k.
lesson.inset.undefined
k equals n or k is larger than n. If k ≥ n, every element is in the top k, so just return the entire array. If k = 0, return an empty list. If k is 1, you are finding the single maximum, which is O(n) with a heap (one pass, one insert per element, size never exceeds 1) or O(n) with a linear scan.
Explain the top-K heap algorithm. Why do you use a min-heap of size k instead of a max-heap or sorting?
To find the k largest elements, maintain a min-heap of size k. For each element in the array, push it and evict the minimum when the heap exceeds k. The final heap contains exactly the k largest, and its root is the k-th largest element. Time: O(n log k). Space: O(k). This beats sorting (O(n log n)) when k ≪ n. Symmetric logic applies to k smallest (use a max-heap). The same pattern solves “top k frequent elements”: count frequencies, apply the heap strategy on frequency values. Next: lesson 05 extends this into k-way merge, where you maintain a size-k heap of streams, pulling the next minimum from the stream whose head is smallest.