Base CS from zero CS · 11 · 03

Undefined behaviour

A defined error is detected and reported: the machine raises an exception. Undefined behaviour gives no guaranteed result, as when a program reads past the end of an array or overflows a signed integer; an unnoticed undefined value in JavaScript spreads in the same silent way. No error is raised; the program continues with garbage.


The last two lessons painted a reassuring picture: something goes wrong, the machine notices, raises an exception, prints a stack trace, and you know exactly what happened. That is the good kind of wrong.

There is a worse kind. Sometimes a program does something the language never promised a result for — and instead of stopping, it keeps running with whatever bits happen to be lying around. No exception. No stack trace. No crash. The program produces an answer; the answer is garbage; and nothing tells you. This lesson is about that second kind of wrong — undefined behaviour — and why it is far more dangerous than any exception.

Goal

After this lesson you can distinguish a defined error (detected and reported by the machine) from undefined behaviour (no guaranteed result), give concrete examples of undefined behaviour grounded in the machine model, and explain precisely why undefined behaviour is more dangerous than a defined error: the failure is silent.

A defined error is detected and reported. In the last two lessons, every failure was defined: the language specifies what happens. Divide by zero, bad type conversion, calling a method on a value that has none — for each, the rule says “raise an exception.” The machine detects the condition and reports it.

The crucial property of a defined error is that you find out. An exception is raised, control flow stops, a stack trace is printed. The failure is loud. Even a crash is, in this sense, a success: the machine caught the problem and told you about it. A defined error is a failure the language has a defined, reported answer for.
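As a minimal sketch of what "loud" means, the JavaScript below triggers a defined error on purpose. The variable `price` and the `report` object are hypothetical names introduced only for this illustration. Calling a method on `undefined` is a case the language specifies: it throws a `TypeError`, which can be caught and inspected.

```javascript
// Hypothetical example: `price` should hold a number but holds undefined.
const price = undefined;

let report = null;
try {
  // The language defines this case: calling a method on undefined throws.
  price.toFixed(2);
} catch (err) {
  // The failure is loud: it carries a type, a message, and a stack trace.
  report = {
    name: err.name,
    message: err.message,
    hasStack: typeof err.stack === "string",
  };
}

console.log(report.name); // TypeError
```

Without the `try`/`catch`, the same exception would stop the program and print the trace, which is the behaviour the paragraph above describes.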

Undefined behaviour: the language guarantees no result. Undefined behaviour (often shortened to UB) is the opposite. It is a program operation for which the language specification gives no guaranteed result at all. The language does not say “raise an exception.” It does not say “return zero.” It does not say anything. The operation is simply outside the set of things the language promises to define.

When a program performs an undefined-behaviour operation, anything is permitted to happen: it might produce a plausible-looking wrong number, it might produce a different wrong number on the next run, it might corrupt unrelated data, it might appear to work for years. The language has washed its hands of the outcome. Critically — and this is the whole danger — no error is raised. The program does not stop. It continues, carrying whatever garbage the operation produced.

Example one: reading past the end of an array. Recall from Unit 09 that an array is a run of contiguous cells, and an index is an offset from the array’s base address. Asking for index 5 of a 3-element array means computing the address of a cell that is not part of the array.

In some languages this is a defined error: the runtime checks the index against the array’s length and raises an exception if it is out of range, a defined, reported failure. In a low-level language with no such check, it is undefined behaviour: nothing is guaranteed, and what typically happens is that the machine computes the out-of-range address and reads whatever bits sit in that cell, leftover data from something else entirely. No check fires. No exception. The program reads a meaningless value and treats it as a real array element.

Example two: integer overflow. Recall from Unit 01 that a number is stored in a fixed number of bits, say 8 bits, which as an unsigned value can hold 0 through 255. Now add 1 to a value of 255. The true result, 256, needs a 9th bit. There is no 9th bit. The carry out of the top bit is lost and the stored value becomes 0. A signed 8-bit type, which holds -128 through 127, has the same problem at its upper limit: where wrapping is defined, 127 + 1 is stored as -128. This is integer overflow.

In a language that defines overflow, the result is specified — it wraps around predictably, or an exception is raised. In a language where signed overflow is undefined behaviour, the standard guarantees nothing: the program might wrap, might produce a different value, and — because the compiler is allowed to assume overflow never happens — might even have whole sections of code optimised away on that assumption. The arithmetic looks normal in the source; the running program does something the source never suggested.
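The defined case can be observed directly. The sketch below uses JavaScript typed arrays, whose elements are stored in a fixed number of bits and whose stores are specified to wrap around. The variable names are illustrative only. This shows overflow where the language defines the result; it does not reproduce undefined behaviour, for which no output could be promised.

```javascript
// Typed arrays store each element in a fixed number of bits.
const unsignedCell = new Uint8Array(1); // one 8-bit cell, range 0..255
unsignedCell[0] = 255;
unsignedCell[0] += 1; // true result 256 needs a 9th bit; the stored value wraps
console.log(unsignedCell[0]); // 0

const signedCell = new Int8Array(1); // one 8-bit cell, range -128..127
signedCell[0] = 127;
signedCell[0] += 1; // true result 128 does not fit; the stored value wraps
console.log(signedCell[0]); // -128
```

Note that neither addition raises an exception. The wrap is predictable here only because the language specifies it.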

Example three: relying on undefined in JavaScript. JavaScript does check array bounds — reading index 5 of a 3-element array does not crash; it gives the special value undefined. That is defined. The undefined-behaviour-style danger is what happens when your code then uses that undefined without noticing.

undefined + 1 is NaN (“not a number”). NaN spreads silently through every calculation it touches: NaN * 2 is NaN, NaN > 0 is false. No exception is raised at any step. Your program keeps running, computing, and storing NaN where a real number should be. It is a wrong answer that propagates quietly, exactly the way a true undefined-behaviour result propagates. The failure is silent even though each individual step is “defined.”
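The chain described above can be run as written. In this sketch, `priceList` and the helper `readPrice` are hypothetical names, not part of any library. The helper shows one possible way to turn the silent case into a loud one by checking the value before using it.

```javascript
const priceList = [10, 20, 30];

// Out-of-range read: defined in JavaScript, yields undefined.
const missing = priceList[5];

// Each step below is defined, and none of them throws.
const withFee = missing + 1;    // NaN
const doubled = withFee * 2;    // NaN
const isPositive = doubled > 0; // false

console.log(missing, withFee, doubled, isPositive);

// Hypothetical helper: make the failure loud by checking before use.
function readPrice(list, index) {
  const value = list[index];
  if (value === undefined) {
    throw new RangeError(`no price at index ${index}`);
  }
  return value;
}
```

With the helper, `readPrice(priceList, 5)` throws a `RangeError` at the point of the bad read instead of letting NaN travel into later results.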

Why undefined behaviour is more dangerous than a defined error. A defined error is loud: it raises an exception, stops control flow, prints a trace. You discover it immediately, often the first time you run the code, and the trace points you straight at the line. Undefined behaviour is silent: no exception, no stop, no trace. The program produces a wrong result and carries it forward — into a stored value, a saved file, a displayed total.

Silence is the danger. A loud failure is found and fixed. A silent failure can survive testing, ship to users, corrupt real data, and only surface much later — far from the line that actually caused it, with no trace pointing back. A defined error tells you the truth immediately; undefined behaviour lets a lie travel.

[Figure: four contiguous cells labelled [0], [1], [2], [3]; cell [3] is highlighted.]

A 3-element array in contiguous cells: valid indices 0, 1, 2. Index 3 (highlighted) points one cell past the end. A defined error checks the bound and raises an exception. Undefined behaviour reads cell [3] anyway, taking whatever leftover bits are there, with no error raised.

Worked example

The same out-of-range read, two languages, two outcomes.

A program holds a 3-element array prices = [10, 20, 30] (valid indices 0, 1, 2) and a bug computes the index as 3 — one past the end.

Language A — defined error. The runtime stores the array’s length alongside it. Before reading, it checks: is 3 less than the length 3? No. The runtime raises an exception: Error: index out of range. Control flow stops. A stack trace is printed pointing at the exact line. The programmer sees the failure on the first test run and fixes the index. Loud. Found. Fixed.

Language B — undefined behaviour. The runtime does no bounds check. It computes the address of cell [3] — base address plus 3 element-sizes — and reads it. That cell was never part of the array; it holds leftover bits, say the number 48291, from some unrelated data. The program reads 48291 as if it were a real price. No exception. No trace. The program continues, perhaps adding 48291 into a total, perhaps saving it. Silent. Shipped. Corrupting data weeks later.

Same bug, same line. The defined error turned it into a five-minute fix; the undefined behaviour turned it into a production incident with no trace to follow.
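The two outcomes can be modelled in a few lines. This is a simulation of the machine model written in JavaScript, not real undefined behaviour: the `memory` buffer, the `pricesArray` descriptor, and both read functions are hypothetical, and the leftover value 48291 is taken from the example above. In a language where the read is truly undefined, not even this result would be guaranteed.

```javascript
// Simulated memory: a flat run of cells. Cells 0..2 hold the array;
// cell 3 holds unrelated leftover data.
const memory = new Uint32Array([10, 20, 30, 48291]);
const pricesArray = { base: 0, length: 3 }; // hypothetical array descriptor

// Language A: check the index against the stored length before reading.
function checkedRead(mem, arr, index) {
  if (index < 0 || index >= arr.length) {
    throw new RangeError("index out of range");
  }
  return mem[arr.base + index];
}

// Language B: no check; compute the address and read whatever is there.
function uncheckedRead(mem, arr, index) {
  return mem[arr.base + index];
}

const buggyIndex = 3; // one past the end

let outcomeA;
try {
  outcomeA = checkedRead(memory, pricesArray, buggyIndex);
} catch (err) {
  outcomeA = err.message; // the bug is reported
}

const outcomeB = uncheckedRead(memory, pricesArray, buggyIndex); // silently wrong

console.log(outcomeA); // index out of range
console.log(outcomeB); // 48291
```

The only difference between the two functions is one comparison, which is the check that Language A pays for on every read and Language B leaves out.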

Why this works

Why would a language ever leave behaviour undefined? Speed. A bounds check is an extra comparison before every array read; an overflow check is extra work after every addition. Low-level languages built for maximum performance leave these checks out and declare the unchecked cases undefined — the programmer promises never to trigger them, and in exchange the machine code runs with no checking overhead. It is a deliberate trade: raw speed in return for the loss of the safety net. Higher-level languages usually make the opposite choice and keep the checks.

Common mistake

A common mistake is “my program ran and gave an answer, so it must be correct.” Undefined behaviour breaks that assumption completely. A program that hit undefined behaviour also runs and also gives an answer — the answer is just wrong, with nothing to flag it. “Produced an answer” and “produced the right answer” are different claims; undefined behaviour is precisely the gap between them.

Practice

A failure that the machine detects and reports by raising an exception is which kind? Type 1 for a defined error, 2 for undefined behaviour.

An 8-bit unsigned value holds 255. The program adds 1. The true result 256 needs a 9th bit that does not exist. What value is stored (the result wraps to)?

An array has 4 elements, valid indices 0 to 3. The index that is OUT of bounds by exactly one (one cell past the end) is which?

Undefined behaviour is more dangerous than a defined error for one main reason. Type 1 if the reason is 'the failure is silent — no error is raised', or 2 if the reason is 'it always crashes the computer'.

In JavaScript, reading index 9 of a 3-element array gives the value undefined (not a crash). Then undefined + 1 is computed. The result is NaN. Does evaluating undefined + 1 raise an exception? Type 1 for yes, 0 for no.

Check yourself

Quiz

What makes undefined behaviour more dangerous than a defined error?

Recap

A defined error is a failure the language has a specified, reported answer for: the machine detects the condition and raises an exception, control flow stops, a stack trace is printed, and you find out immediately. Undefined behaviour is an operation the language guarantees no result for, such as reading past the end of an array in a low-level language or signed integer overflow. Letting an undefined or NaN value spread through a JavaScript calculation is defined at every step, yet it fails in the same silent way. The defining danger of undefined behaviour is that it is silent: no exception is raised, the program does not stop, no trace is printed. It carries a garbage result forward as if it were correct. A loud, defined failure gets found and fixed; a silent, undefined failure can ship to users and corrupt real data long before anyone notices. “The program produced an answer” never means “the answer is right.” Now when you see a program behave strangely without crashing (wrong totals, corrupted records, intermittent nonsense), reach for undefined behaviour as the first hypothesis, not the last.
