awesome-everything RU
↑ Back to the climb

Algorithms from zero

Why sort?

Crux Sorting is preprocessing: spend O(n log n) once to unlock O(log n) binary search, O(n) duplicate detection, O(n) median finding, and O(n) closest-pair detection. Learn when the cost is worth it.
◷ 16 min

You have unsorted data and a hard problem: find duplicates, search for a value, locate the median, or find two items closest in value. Each looks difficult and expensive. But what if you arranged the data first? A sorted array is like alphabetizing a phonebook—suddenly hard problems become trivial. The cost: one O(n log n) sort. The payoff: unlock O(log n) search, instant duplicate detection, linear-time median finding, and more.

Goal

After this lesson you understand what “sorted” means, why sorting acts as preprocessing that unlocks faster algorithms, what problems become easy after sorting (binary search, duplicate detection, finding median and closest pairs), and when sorting is worth the upfront cost versus when it is not.

The idea

What does “sorted” mean?

An array is sorted (or in sorted order) when each element is less than or equal to the next. For an array of numbers [3, 1, 4, 1, 5, 9, 2, 6]:

  • Sorted in ascending order: [1, 1, 2, 3, 4, 5, 6, 9]
  • Sorted in descending order: [9, 6, 5, 4, 3, 2, 1, 1]

The order is fixed; you can rely on it.

The big idea: sorting as preprocessing

Sorting is a preprocessing step — you do it once, then gain benefits for all future operations. Think of it like organizing a library: spend time sorting books by title (O(n log n)), and afterward every search is instant (O(log n)). Do it once, benefit forever.

What problems become easy after sorting?

1. Binary search (finding a value)

Unsorted array: To find if a value exists, you must scan the array linearly. Worst case: O(n).

Sorted array: Use binary search. Cut the search space in half each step. Worst case: O(log n).

For an array of 1 million elements:

  • Linear search: ~500,000 comparisons (average)
  • Binary search: ~20 comparisons

That is a 25,000× speedup.

2. Duplicate detection (does the array have repeats?)

Unsorted array: Compare each element to every other. Nested loops. O(n²).

function hasDuplicateUnsorted(arr) {
  for (let i = 0; i < arr.length; i++) {
    for (let j = i + 1; j < arr.length; j++) {
      if (arr[i] === arr[j]) return true;
    }
  }
  return false;  // O(n²)
}

Sorted array: Walk through once. If two neighbors are equal, there is a duplicate. O(n).

function hasDuplicateSorted(arr) {
  for (let i = 0; i < arr.length - 1; i++) {
    if (arr[i] === arr[i + 1]) return true;
  }
  return false;  // O(n)
}

Cost: O(n log n) to sort once. Benefit: O(n) instead of O(n²).

3. Finding the median (middle value)

Unsorted array: Find the n/2-th smallest element. Requires a complex algorithm (median-of-medians, or sorting). O(n) average, but tricky.

Sorted array: The median is at index n/2. Direct access. O(1).

4. Finding the closest pair (two values with minimum difference)

Unsorted array: Compare each pair to every other pair. O(n²).

Sorted array: Walk neighbors. The closest pair must be adjacent in sorted order. O(n).

The cost-benefit decision

Sorting always costs O(n log n). The question is: is it worth it?

Sort first if:

  • You will do many queries (search, duplicate check, median) on the same data. Each query is faster, and the sorting cost amortizes.
  • The sorted order itself solves your problem (e.g., finding closest pairs, finding the median).
  • You are in a read-heavy scenario: write data once, read many times.

Do not sort if:

  • You need only one operation (one search, one check) on the data. O(n log n) sort + O(log n) search is not better than O(n) linear search.
  • The data is already sorted (or nearly sorted). Use insertion sort—O(n).
  • Sorting destroys information you need (e.g., indices). Keep the unsorted version.
The code

Let us code the duplicate-detection example, showing the dramatic difference.

O(n²) without sorting:

function hasDuplicateNaive(arr) {
  // Compare each element to all others
  for (let i = 0; i < arr.length; i++) {
    for (let j = i + 1; j < arr.length; j++) {
      if (arr[i] === arr[j]) {
        return true;  // Found a duplicate
      }
    }
  }
  return false;  // No duplicates
}

O(n log n) with sorting:

function hasDuplicateSorted(arr) {
  // First, sort the array
  arr.sort((a, b) => a - b);
  
  // Now check neighbors only
  for (let i = 0; i < arr.length - 1; i++) {
    if (arr[i] === arr[i + 1]) {
      return true;  // Found a duplicate
    }
  }
  return false;  // No duplicates
}

Let us run both on different sizes:

Output
 
Step-by-step trace

Let us trace duplicate detection side by side. Array: [5, 2, 5, 8, 1].

Unsorted (O(n²)):

1 for (let i = 0; i < arr.length; i++) {
2 for (let j = i + 1; j < arr.length; j++) {
3 if (arr[i] === arr[j]) return true;
4 }
5 }

We found the match after 2 comparisons—lucky. Worst case: no duplicates. We would compare all pairs: 1 + 2 + 3 + 4 = 10 comparisons for just 5 elements. That is O(n²).

Sorted (O(n log n)):

After sorting [5, 2, 5, 8, 1] → [1, 2, 5, 5, 8]:

1 arr.sort(...);
2 for (let i = 0; i < arr.length - 1; i++) {
3 if (arr[i] === arr[i + 1]) return true;
4 }

Only 1 comparison after sorting (plus O(n log n) sort). For a large array with no duplicates, we would scan the sorted array once: 4 comparisons, not 10.

Complexity

Here is the complexity comparison for duplicate detection:

ApproachTotal costWhen to use
Nested loop (no sort)O(n²)Only small arrays (n ≤ ~100)
Sort + neighbor checkO(n log n)Medium+ arrays, especially if you need order later

For different array sizes:

nO(n²)O(n log n)
10010,000 ops~700 ops
1,0001,000,000 ops~10,000 ops
10,000100,000,000 ops~130,000 ops
100,00010,000,000,000 ops~1,700,000 ops

The pattern holds for all sorting-unlocked problems: Finding median, finding closest pair, merging sorted arrays. Once sorted, the rest is linear or logarithmic.

Practice 0 / 5

You have an unsorted array and need to find if a specific value exists. Is it better to sort first or scan linearly?

You sort an array in O(n log n), then search it 100 times with binary search (O(log n) per search). What is the total cost?

What is the time complexity of detecting duplicates in a sorted array by walking neighbors?

When is it NOT worth sorting before solving a problem?

You need to find the median (middle value) of an unsorted array. After sorting, what is the time to find the median?

Edge cases

Sorting in place vs. creating a copy. If you sort the original array, you lose the original order. Sometimes you need both—the sorted order for querying and the original indices. In that case, create a copy of the array to sort, or use an array of indices to avoid modifying the original. This costs extra memory, but preserves information.

Common mistake

Oversorting. Do not sort “just to be safe” for a single operation. If you will search once or check for duplicates once, linear time is cheaper than sorting. Only sort when you reuse the sorted order multiple times, or when the problem itself requires order (median, closest pair, etc.).

Check yourself
Quiz

You must detect duplicates in an array of 100,000 elements. Will you use O(n²) nested loops or O(n log n) sort + scan neighbors?

Recap

What is sorting? Arranging elements in non-decreasing (or non-increasing) order.

Why sort? Sorting is preprocessing: spend O(n log n) once to unlock faster solutions afterward.

What problems become easy?

  1. Search: O(n log n) sort → O(log n) binary search. Total: O(n log n).
  2. Duplicate detection: O(n log n) sort → O(n) neighbor scan. Total: O(n log n).
  3. Median: O(n log n) sort → O(1) direct access. Total: O(n log n).
  4. Closest pair: O(n log n) sort → O(n) neighbor check. Total: O(n log n).

When to sort:

  • You will perform multiple queries or operations on the data.
  • The problem requires sorted order (median, closest pair, merging).
  • You are in a read-heavy scenario: write once, read many times.

When NOT to sort:

  • You do only one operation. O(n) linear is cheaper than O(n log n) sort + O(log n) search.
  • The data is already sorted. Use a faster sort (insertion sort) or skip sorting.

The core insight: Sorting is not magic; it costs O(n log n). But it unlocks O(log n) search, O(n) duplicate detection, and O(n) median finding. That is the tradeoff. Pay once, benefit many times.

Next, you will learn specific sorting algorithms—simple sorts (bubble, insertion, selection) that are slow, and fast sorts (merge sort, quicksort) that are efficient.

Continue the climb ↑Simple sorts
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources3
expand
  1. 01
  2. 02
  3. 03

Trademarks belong to their respective owners. Editorial reference only.