Base CS from zero
Debugging as reasoning
A program does the wrong thing. The beginner’s instinct is to start changing lines —
flip a comparison, add a + 1, move a statement — run it again, and hope. This is
guessing. It sometimes appears to work, and it teaches you nothing, because you never
learned why the bug existed.
There is a completely different way to do this, and it rests on one fact you have known since Unit 03: the machine is deterministic. The same instruction, on the same state, always does exactly the same thing. Given the same state, a computer never does something "at random" or only "sometimes." That single fact turns debugging from guessing into reasoning: a disciplined process you can actually rely on.
After this lesson you can explain why the determinism of the machine makes debugging a matter of reasoning rather than guessing, define a debugging hypothesis as a specific testable claim about which step diverged from expected, and describe the three core moves of testing a hypothesis: narrow the range, read the trace, and check state.
The machine is deterministic. Recall the fetch-decode-execute cycle from Unit 03: the CPU fetches the instruction the program counter points to, decodes it, and executes it. That cycle has a property worth stating sharply: it is deterministic. Given the same instruction and the same machine state — the same register values, the same memory cells — it always produces exactly the same result. Every time. There is no luck in it.
This means a bug is not random. If your program does the wrong thing, it does the wrong thing for a reason — some specific instruction, on some specific state, produced a result you did not expect. And because the machine is deterministic, that reason is fixed: it will produce the same wrong result every time you run it with the same inputs. The bug is sitting still, waiting to be found. It is not hiding; it is just somewhere you have not looked yet.
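A minimal sketch of this in Python, assuming a hypothetical buggy_max function that is not part of the lesson. It is meant only to show that the same input produces the same wrong answer on every run.

```python
def buggy_max(values):
    """Hypothetical buggy function: meant to return the largest value."""
    best = 0  # bug: wrong starting value when every input is negative
    for value in values:
        if value > best:
            best = value
    return best

# Run the same function on the same input several times and collect
# the distinct results in a set.
results = {buggy_max([-5, -2, -9]) for _ in range(5)}
print(results)  # {0}: one wrong answer, identical on every run
```

The correct answer would be -2. The function returns 0 every time, which is what makes the bug something you can go and find.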
Therefore debugging is reasoning, not guessing. Guessing changes the program and hopes the symptom disappears. It treats the bug as mysterious. But the bug is not mysterious — it is a determined consequence of the code and the input. So debugging should be the opposite of guessing: a process of figuring out, by reasoning, exactly which step went wrong.
The model to use is the one from Unit 07’s lesson on tracing a program. There you followed a program step by step, knowing the program counter and every variable value at each line, and you could predict each next step exactly. Debugging applies that same skill to a broken program. You know what the program should do at each step. You find out what it actually did. The bug is at the first step where those two diverge.
A hypothesis is a specific, testable claim. Reasoning needs something concrete to reason about. That concrete thing is a hypothesis: a specific, testable claim about which step diverged from what you expected.
“Something is wrong with the loop” is not a hypothesis — it is too vague to test. “The loop runs one time too many, so the last iteration uses an index past the end of the array” is a hypothesis: it names a step, predicts a concrete observation, and can be confirmed or refuted by looking. A good hypothesis always makes a prediction sharp enough that a single check will either support it or kill it. If a hypothesis cannot be wrong, it cannot help you.
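One way to picture the difference is to write the sharp hypothesis as a single check. The sketch below is in Python and assumes a hypothetical visited_indexes loop standing in for the suspect code; it records the indexes instead of reading the array, so that the check itself cannot crash.

```python
def visited_indexes(items):
    """Hypothetical loop under suspicion; returns the indexes it visits."""
    visited = []
    index = 0
    while index <= len(items):  # suspected off-by-one: <= instead of <
        visited.append(index)
        index += 1
    return visited

items = ["a", "b", "c"]
visited = visited_indexes(items)

# The hypothesis predicts one concrete observation: some visited index
# is past the last valid index, which is len(items) - 1.
hypothesis_supported = max(visited) > len(items) - 1
print(visited, hypothesis_supported)  # [0, 1, 2, 3] True
```

The vague version, "something is wrong with the loop," offers nothing that could be written as a check like this.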
Testing move one: narrow the range. You rarely know which line is wrong at the start. But you can usually split the program into a part you have checked and a part you have not. Pick a point in the middle. Ask: at this point, is the state still correct?
If yes, the bug is after this point — the first half is cleared. If no, the bug is at or before this point — the second half is cleared. Either way, half the remaining code is eliminated by one check. Repeat, and the suspect range shrinks fast: a program of hundreds of steps narrows to the single guilty step in a handful of checks. This is the same halving idea you will meet again as binary search — here it is applied to locating a bug.
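The halving can be sketched in Python under some stated assumptions: the program is modeled as a list of step functions, the state is known to be correct before any step runs and wrong after the last one, and once the state goes wrong it stays wrong. The names first_bad_step and state_ok are hypothetical, introduced only for this illustration.

```python
def first_bad_step(steps, initial, state_ok):
    """Return (position of the first bad step, number of checks made).

    Assumes the state is correct before any step runs, wrong after all
    steps have run, and never becomes correct again once it is wrong.
    """
    def state_after(count):
        # Determinism lets us replay the first `count` steps at will.
        state = initial
        for step in steps[:count]:
            state = step(state)
        return state

    low, high = 0, len(steps)  # correct after `low` steps, wrong after `high`
    checks = 0
    while high - low > 1:
        mid = (low + high) // 2
        checks += 1
        if state_ok(mid, state_after(mid)):
            low = mid   # the first `mid` steps are cleared
        else:
            high = mid  # the bug is at or before step number `mid`
    return high - 1, checks

# 16 steps that should each add 1; the step at position 10 adds 2 instead.
steps = [lambda s: s + 1] * 16
steps[10] = lambda s: s + 2

position, checks = first_bad_step(
    steps, 0, state_ok=lambda count, state: state == count
)
print(position, checks)  # 10 4
```

Sixteen steps narrow to one in four checks (16, 8, 4, 2, 1), and no step was edited along the way.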
Testing move two: read the trace. When the bug shows up as a raised exception, you do not have to narrow blindly, because the runtime already did the work. The stack trace from lesson 02 is a snapshot of the call stack at the moment of the throw. Its top line names the exact function and line where the exception surfaced; the lines below name the chain of calls that led there. (Some runtimes, Python among them, print the same stack in the opposite order, with the throw site last.)
Reading the trace collapses the search instantly: instead of “somewhere in the program,” you start at “this function, this line.” The trace is evidence the machine handed you for free. Reasoning still continues from there — the throw site is where the error surfaced, not always where the cause lives — but the trace gives reasoning a precise starting point instead of a blank page.
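As an illustration, here is a small Python sketch that reads the throw site and the call chain out of a trace. The functions parse_price and load_prices are hypothetical. Note one Python-specific detail: its tracebacks list the most recent call last, so the frame this lesson calls the top of the stack is the last entry in the list.

```python
import traceback

def parse_price(text):
    return float(text)  # raises ValueError if text is not numeric

def load_prices(raw_items):
    prices = []
    for item in raw_items:
        prices.append(parse_price(item))
    return prices

try:
    load_prices(["10", "20", "oops"])
except ValueError as error:
    frames = traceback.extract_tb(error.__traceback__)
    call_chain = [frame.name for frame in frames]
    throw_site = frames[-1]  # innermost frame: where the exception surfaced
    print("surfaced in:", throw_site.name, "at line", throw_site.lineno)
    print("call chain:", " -> ".join(call_chain))
```

The trace points at parse_price, but the bad value "oops" came in through load_prices. The throw site is where reasoning starts, and the call chain shows where to look next.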
Testing move three: check state. A hypothesis predicts a concrete fact about the program’s state — a variable’s value, an array’s contents, which branch was taken. To test it, you go and look at that fact at that exact moment in the run. You inspect the value.
This is where debugging becomes an experiment. Suppose correct behavior means i is 3 at
line 20, and your hypothesis is that the run goes wrong there, so that i is actually
something else. You check: at line 20, what is i actually? If it is 3, the hypothesis is
refuted: that step was fine, look elsewhere. If it is 4, the hypothesis is supported: you
have found a step where reality diverged from expectation, and the bug is at or just
before it. Each check either kills a hypothesis or sharpens it. You are not changing the
program; you are interrogating it. Only once reasoning has located the exact guilty step
do you change a line, and by then you are fixing a known bug, not guessing.
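A check of this kind can be sketched in Python as follows. The function index_of_first_over and the observed dictionary are hypothetical. Recording the value stands in for what a debugger breakpoint or a temporary print would show; it inspects the run without altering its result.

```python
def index_of_first_over(values, limit, observed):
    """Hypothetical function: index of the first value greater than limit."""
    i = 0
    while i < len(values) and values[i] <= limit:
        i += 1
    i += 1
    observed["i"] = i  # the hypothesis is about the value of i at this point
    return i

observed = {}
index_of_first_over([5, 8, 12, 40, 7], 30, observed)

expected_i = 3  # 40 is the first value over 30, and it sits at index 3
hypothesis_supported = observed["i"] != expected_i
print(observed["i"], hypothesis_supported)  # 4 True
```

Expectation says 3 and the run says 4, so the divergence is at or just before the inspected point. In this sketch that singles out the extra increment after the loop.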
Debugging a sum that comes out too small.
A function should add the 4 numbers in prices = [10, 20, 30, 40] and return 100. It
returns 60. No exception is raised — the program runs and gives a wrong answer.
1. Form a hypothesis. The result 60 is exactly 10 + 20 + 30 — the first three
numbers, with the last one missing. Hypothesis: the loop stops one iteration too early
and never adds prices[3]. This is specific and testable: it predicts the loop runs 3
times, not 4.
2. Narrow the range. The function has setup, a loop, and a return. The setup
(total = 0) is trivially correct. The return just hands back total. The loop is the
only place total changes, so the suspect range is the loop. Reasoning has cleared two of
the function's three parts without changing a line.
3. Check state. Test the hypothesis by inspecting the loop variable. A correct loop would
run 4 times, with the index taking values 0, 1, 2, 3; the hypothesis predicts it runs only
3 times and stops after index 2. You check the index on each iteration and observe:
0, 1, 2, and then the loop exits. It ran 3 times. The hypothesis is supported: the index
never reached 3, so prices[3] was never added.
4. Locate the exact step and fix. The divergence is the loop’s stop condition. It says
index < 3 when it should say index < 4 (or index < prices.length). Now — and only
now — you change one line. You are fixing a bug you have proven is there, not guessing.
Run again: the loop runs 4 times, the result is 100.
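The worked example can be replayed as a short Python sketch. The lesson does not show the function's source, so total_buggy and total_fixed are assumed reconstructions, and len(prices) plays the role of prices.length.

```python
def total_buggy(prices):
    total = 0
    index = 0
    seen = []  # indexes the loop actually visits, recorded for inspection
    while index < 3:  # stop condition hard-coded, as in the worked example
        seen.append(index)
        total += prices[index]
        index += 1
    return total, seen

def total_fixed(prices):
    total = 0
    index = 0
    while index < len(prices):  # the one changed line
        total += prices[index]
        index += 1
    return total

prices = [10, 20, 30, 40]
buggy_result, seen = total_buggy(prices)
print(buggy_result, seen)   # 60 [0, 1, 2]
print(total_fixed(prices))  # 100
```

Using len(prices) rather than a literal 4 keeps the stop condition correct if the list ever changes length.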
No line was changed until reasoning had pinned down the exact guilty step. That is the whole method.
Why this works
Why is guessing so tempting — and so bad? Guessing is tempting because changing a line is fast and occasionally the symptom vanishes. But “the symptom vanished” is not “the bug is fixed.” A guess can hide a bug instead of removing it — make the visible failure go away while the real defect stays, ready to resurface with different input. Reasoning costs more upfront but ends with you understanding the bug, which is the only state from which you can be sure it is actually gone.
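To see how a guess can hide a bug, consider this Python sketch of one plausible guess at the sum example: the result is short by 40, so add the fourth number by hand. The function total_guess_patched is hypothetical.

```python
def total_guess_patched(prices):
    total = 0
    index = 0
    while index < 3:          # the real defect is still here
        total += prices[index]
        index += 1
    return total + prices[3]  # guessed patch: add the "missing" number

print(total_guess_patched([10, 20, 30, 40]))      # 100: the symptom is gone
print(total_guess_patched([10, 20, 30, 40, 50]))  # 100: the true sum is 150
```

The original input now passes, yet the stop condition is as wrong as before, and a five-item list exposes it again.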
Common mistake
A common mistake is to treat the stack trace’s top line as the bug itself and “fix” it there. The trace top is where the error surfaced — but the cause can be earlier. A function might throw “not a number” because a caller passed it bad data, and the real defect is in the caller. The trace gives reasoning its starting point; it does not end the reasoning. Follow the chain until you find the step where state first went wrong.
The machine is deterministic: the same instruction on the same state always gives the same result. Run a buggy program twice with the same input. How many different results can the bug produce?
Which of these is a real hypothesis? Type 1 for 'something is wrong with the loop', or 2 for 'the loop stops one iteration early so prices[3] is never added'.
A program has 16 steps. You debug by narrowing the range, halving the suspect region each time. After how many checks is the suspect range down to a single step? (16 -> 8 -> 4 -> 2 -> 1)
An exception is raised. Which line of the stack trace gives the function and line where the exception surfaced — the starting point for your reasoning? Type 1 for the top line, 2 for the bottom line.
Correct behavior means the variable i is 3 at line 20. Your hypothesis is that the run diverges there, so that i is not 3. You check the actual run and find i is 3 at line 20. Is the hypothesis supported or refuted? Type 1 for supported, 0 for refuted.
Why is debugging a matter of reasoning rather than guessing?
Debugging is reasoning, not guessing — and the reason is that the machine is deterministic: the fetch-decode-execute cycle always produces the same result for the same state, so a bug is a fixed, repeatable consequence of the code and input, not a random event. Because the bug sits still, you can find it by reasoning. The unit of reasoning is a hypothesis: a specific, testable claim about which step diverged from what you expected — sharp enough that one check confirms or refutes it. You test a hypothesis with three moves: narrow the range (split the program, check the midpoint, discard the cleared half), read the trace (the stack trace’s top line hands you the throw site for free), and check state (inspect the actual variable values at the moment the hypothesis predicts a fact about them). Only once reasoning has pinned the exact guilty step do you change a line — fixing a known bug, with understanding, instead of guessing.