Algorithms from zero ALG · 10 · 01

What is dynamic programming?

DP solves a problem by combining answers to overlapping subproblems, storing each answer once. Two required properties: overlapping subproblems and optimal substructure. It turns exponential recursion (like naive Fibonacci) into linear.

ALG ◷ 25 min

Already know this unit? Take a 1-minute quick check →

You already know recursion: a function calls itself on a smaller input and trusts the answer comes back. But watch what happens with Fibonacci. To get fib(5), you compute fib(4) and fib(3). But fib(4) computes fib(3) again. And fib(3) computes fib(2) twice. The same little problems get solved over and over—the work explodes. Dynamic programming is the fix: solve each small problem once, write the answer down, and reuse it. The exponential recursion collapses into a straight line.

Goal

After this lesson you understand what dynamic programming (DP) is: combining answers to overlapping subproblems, storing each answer once instead of recomputing it. You can name the two properties a problem must have for DP to apply—overlapping subproblems and optimal substructure—and explain each. You can see in the Fibonacci recursion tree why naive recursion repeats work (exponential) while DP does not (linear). You can tell DP apart from divide-and-conquer (whose subproblems do not overlap) and from greedy (which commits to one local choice without exploring combinations). You can recognize DP-shaped problems by their signals: “count the ways,” “min/max over choices,” “can you reach.” The next two lessons teach the two ways to implement DP: memoization (lesson 02) and tabulation (lesson 03).

The idea

What is dynamic programming?

Dynamic programming (DP) is a technique for solving a problem by breaking it into smaller subproblems, solving each subproblem once, and storing its answer so you never compute it twice. When the same subproblem comes up again, you look up the stored answer instead of redoing the work.

That is the whole idea: solve once, remember, reuse. DP is recursion plus memory.

The two required properties

DP only helps when the problem has both of these:

Overlapping subproblems. The naive recursive solution solves the same subproblem many times. If every subproblem were unique, there would be nothing to remember and DP would not help.
Optimal substructure. An optimal answer to the whole problem can be built from optimal answers to its subproblems. If gluing together the best pieces does not give the best whole, you cannot combine subproblem answers, and DP does not apply.

If either property is missing, DP is the wrong tool.

Property 1: overlapping subproblems (the motivation)

Look at naive recursive Fibonacci. fib(n) = fib(n-1) + fib(n-2), with fib(0) = 0 and fib(1) = 1. Draw the call tree for fib(5):

                       fib(5)
              /                      \
          fib(4)                   fib(3)
         /      \                 /      \
     fib(3)    fib(2)         fib(2)    fib(1)
     /    \     /   \          /   \
 fib(2) fib(1) fib(1) fib(0) fib(1) fib(0)
  /  \
fib(1) fib(0)

Count the repeats: fib(3) is computed 2 times, fib(2) 3 times, fib(1) 5 times, fib(0) 3 times. The tree branches into two at every step, so its size roughly doubles each level. That is exponential growth—the same handful of subproblems, recomputed again and again.

Property 2: optimal substructure

Fibonacci also has optimal substructure: the answer for n is built directly from the answers for n-1 and n-2. There is only one way to compute fib(n) from its parts, so any correct fib(n-1) and fib(n-2) give the correct fib(n). For optimization problems (shortest path, longest subsequence), this property says: the best whole solution contains best solutions to its subparts.

The DP fix

There are only n+1 distinct Fibonacci subproblems: fib(0), fib(1), …, fib(n). The recursion tree has thousands of nodes, but they answer the same handful of questions. DP computes each of those n+1 answers exactly once and reuses them. Exponential work becomes linear:

Naive recursion:  computes fib(k) many times   -> ~2^n calls
Dynamic prog.:    computes fib(k) exactly once  -> n+1 distinct answers

DP vs divide-and-conquer

Both split a problem into subproblems. The difference is whether the subproblems overlap.

Divide-and-conquer (merge sort, binary search): subproblems are disjoint. Merge sort splits the array into two halves that share no elements. Nothing is recomputed, so there is nothing to cache—plain recursion is already efficient.
Dynamic programming: subproblems overlap. The same subproblem reappears across different branches, so caching its answer is what makes DP fast.

Rule of thumb: if caching subproblem answers would save work, it is DP; if every subproblem is unique, it is divide-and-conquer.

DP vs greedy

Both can solve optimization problems, but they choose differently.

Greedy commits to the single best-looking choice at each step and never reconsiders. It is fast, but it only finds the global optimum on problems where local choices are guaranteed safe.
Dynamic programming considers all ways to combine subproblem answers and keeps the best. It explores the combinations greedy skips, so it solves problems where no single local rule is always correct—at the cost of more work.

Recognizing a DP problem

A problem is often DP when you see:

“Count the ways” to do something (number of paths, number of decodings).
“Find the min / max” over a set of choices (minimum coins, longest increasing subsequence).
“Can you reach / is it possible” to hit a target (subset sum, word break).

…and the problem has a small state: each subproblem is described by a few numbers (an index, a remaining capacity), so the number of distinct subproblems is small enough to store. Small state + overlapping subproblems = DP.

The fib(5) call tree. Plain nodes are unique subproblems; accent nodes are duplicates the naive recursion recomputes from scratch — fib(3) twice, fib(2) three times. DP computes each distinct subproblem once, stores it, and reuses it, collapsing every accent group into one computation.

The code

Naive recursive Fibonacci

This is the textbook recursion. It is correct, but it recomputes the same subproblems—the cost of ignoring overlap.

1 function fib(n) {

2 // Base cases: fib(0) = 0, fib(1) = 1

3 if (n <= 1) {

4 return n;

5 }

6 // Recursive case: two branches, each re-solving overlapping subproblems

7 return fib(n - 1) + fib(n - 2);

8 }

L2 Base cases stop the recursion: fib(0)=0 and fib(1)=1.
L7 Two recursive calls. fib(n-1) and fib(n-2) both recompute fib(n-2), fib(n-3), ... — this is the overlap.

Watching the work explode

Add a counter that increments on every call, then run fib for a few sizes. The call count is exactly 2·fib(n+1) − 1, which grows exponentially. Notice how the count jumps wildly even though the answer grows smoothly.

Output

The answer fib(30) is only 832040, but it took 2,692,537 calls to get there. Add 10 to n and the call count multiplies by more than 100. That is the exponential blowup overlapping subproblems cause. (Lesson 02 stores each answer once and brings fib(30) down to about 30 distinct computations.)

Step-by-step trace

Let us trace the left side of the fib(5) tree and watch the same subproblems reappear. Each frame shows which subproblem the recursion is currently asking for, and whether we have seen it before.

1 function fib(n) {

2 if (n <= 1) return n;

3 return fib(n - 1) + fib(n - 2);

4 }

Complexity

Naive recursion: O(2^n) time (more precisely O(φ^n))

Each call to fib(n) makes two more calls, so the recursion tree branches in two at almost every node. The number of calls to compute fib(n) is exactly 2·fib(n+1) − 1. Since fib(n) itself grows like φ^n (φ ≈ 1.618, the golden ratio), the call count grows like O(φ^n) ≈ O(1.618^n), which is exponential. People often round this up to O(2^n) because the tree is nearly a full binary tree of depth n. Either way: every extra +1 to n roughly multiplies the work.

Space for naive recursion is O(n): the call stack only goes as deep as the longest chain (fib(n) → fib(n-1) → … → fib(0)), even though the total number of calls is exponential.

n      naive calls (2*fib(n+1)-1)
5      15
10     177
20     21891
30     2692537      <- 2.7 million calls for a 6-digit answer

Dynamic programming: O(n) time

There are only n+1 distinct subproblems: fib(0) through fib(n). DP computes each exactly once and stores it, so the total work is O(n) time. Space is O(n) to hold the n+1 stored answers (lesson 03 shows how to shrink this to O(1) by keeping only the last two values).

            time        space
naive       O(2^n)      O(n) stack
DP          O(n)        O(n)  (down to O(1) possible)

The transformation from O(2^n) to O(n) is the entire payoff of DP. You spend a little memory to remember answers, and exponential work collapses to linear. This is why people say DP “turns exponential into polynomial.”

Practice 0 / 6

What are the two properties a problem must have for dynamic programming to apply?

Why is naive recursive Fibonacci slow?

What is the key difference between dynamic programming and divide-and-conquer?

How does dynamic programming differ from a greedy algorithm?

Which problem phrasing is a strong signal that dynamic programming may apply?

There are only n+1 distinct Fibonacci subproblems (fib(0) through fib(n)). How many distinct subproblems are there for fib(10)?

▸Why this works

Why store answers instead of being clever about the math? There is a closed-form formula for Fibonacci, but most DP problems have no formula. DP is general: any problem with overlapping subproblems and optimal substructure can be sped up by remembering answers, even when no neat equation exists. The technique, not the formula, is what transfers to coin change, edit distance, knapsack, and longest common subsequence.

▸Common mistake

Mistake: thinking every recursive problem is a DP problem. DP only helps when subproblems overlap. Merge sort is recursive but its halves are disjoint, so caching saves nothing — that is divide-and-conquer, not DP. Before reaching for DP, check: does the naive recursion solve the same subproblem more than once? If not, DP adds memory cost for no speedup.

▸Edge cases

When optimal substructure fails. DP needs the best whole answer to be buildable from best subproblem answers. Longest simple path in a general graph breaks this: the best path through a vertex is not made of best paths to its parts, because reusing a vertex is forbidden and the pieces can conflict. When optimal substructure does not hold, DP gives wrong answers — overlap alone is not enough.

▸More practice

A recognition drill. For each problem, ask three questions: (1) Does a naive recursion repeat subproblems? (2) Is the best whole built from best parts? (3) Is the state small (a few numbers)? If all three are yes, it is DP. Try it on: number of paths in a grid (yes), minimum coins for an amount (yes), and sorting a list (no — no overlap). The next lessons turn a recognized DP problem into working code two ways: memoization (02) and tabulation (03).

Check yourself

Quiz

Explain what dynamic programming is, the two properties a problem must have for it to apply, and how DP differs from divide-and-conquer and from greedy.

Recap

Dynamic programming (DP) solves a problem by combining answers to its subproblems, computing each subproblem once and storing the result to reuse. Solve once, remember, reuse.

Two required properties:

Overlapping subproblems: the naive recursion re-solves the same subproblem many times (Fibonacci’s fib(3), fib(2) appear again and again). This is what caching fixes.
Optimal substructure: the optimal whole answer is built from optimal answers to subproblems.

The Fibonacci payoff: naive recursion is O(2^n) (precisely O(φ^n)) because the call tree branches in two and repeats work — 2.7 million calls for fib(30). There are only n+1 distinct subproblems, so DP computes each once for O(n) time.

DP vs divide-and-conquer: divide-and-conquer subproblems are disjoint (merge sort) — nothing to cache. DP subproblems overlap — caching is what makes it fast.

DP vs greedy: greedy commits to one local choice and never looks back; DP weighs all combinations of subproblem answers and keeps the best.

Recognizing DP: “count the ways,” “min/max over choices,” “can you reach” — plus a small state describing each subproblem.

Next: lesson 02 implements DP top-down with memoization (recursion + a cache), and lesson 03 implements it bottom-up with tabulation (fill a table with a loop).

Now when you see a problem whose naive recursion re-solves the same subproblem repeatedly, that is your signal: check the two properties, name the state, and reach for DP before writing another exponential search.

Practice

Start at the top. Tasks go easiest → hardest: recall a fact, apply it to a case, then a senior-level stretch. Open one, attempt it, then reveal.

recallapplystretch0 of 4 done

Something unclear?

Ask a question about this lesson. Questions are anonymous and go straight to the author to make the lesson better.