
Compilation vs interpretation

Two strategies to run high-level code. A compiler translates the whole program to machine code ahead of time. An interpreter reads and carries out the program statement by statement at run time. Each strategy has tradeoffs in speed, startup, and portability. JIT combines both.


You now know that high-level programs cannot run on a CPU directly — they must be translated into machine code first. But how and when does that translation happen?

There are two fundamentally different answers. In the first approach, a compiler reads your entire source file before the program ever runs, translates every line into machine code, and saves the result in a binary file. You then run that binary file, and the CPU executes the machine code directly — no translation happens at run time at all. This is how C, C++, Rust, and Go work.

In the second approach, an interpreter reads your source file at run time, statement by statement. For each statement it figures out what that statement means and carries it out, then moves to the next. No separate binary file is ever produced. This is roughly how early Ruby worked, and it is a useful first mental model for Python too, even though CPython first compiles source to bytecode (see Edge cases).

The choice between these two strategies involves real tradeoffs: startup time, raw speed, ease of debugging, and portability. Understanding both — and the hybrid just-in-time (JIT) approach used by modern JavaScript engines, the Java Virtual Machine, and Python’s PyPy — is essential for understanding how the programs you write actually run.

Goal

After this lesson you can define compiler and interpreter, explain the ahead-of-time vs run-time translation distinction, describe the tradeoffs between the two strategies, and give a brief explanation of what a JIT compiler does.

The compiler: translate everything, then run. A compiler is a program that takes your source file as input and produces a machine-code file as output. This happens completely before the program runs — the process is called ahead-of-time (AOT) compilation. The compiled binary file contains the raw CPU instructions for a specific platform; no interpreter or compiler is needed to run it.

The compiler’s job is more complex than the assembler’s one-to-one translation. The compiler must:

1. Parse the source code — turn the text into a structured representation of the program’s meaning (an abstract syntax tree).
2. Analyse the code — check types, resolve names, detect errors.
3. Optimise — rearrange and improve the code to run faster, without changing its meaning (reorder instructions, eliminate dead code, unroll loops, etc.).
4. Generate machine code — emit the CPU instructions for the target platform.

Together these four steps mean the compiler has done all the hard thinking — type-checking, optimising, encoding — before a single instruction runs. The final result is a binary file. When you run ./myprogram, the operating system loads that binary directly into memory and points the program counter at the first instruction. The CPU runs it immediately with no translation overhead.
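The four steps can be sketched in a few lines of Python. This is a toy, not a real compiler: it handles only `+` and `*` expressions, the "machine code" is a made-up stack-machine instruction list (`PUSH`, `LOAD`, `ADD`, `MUL` are hypothetical names), and the function names `fold`, `generate` and `compile_expr` are invented for illustration. Python's standard `ast` module does the parsing.

```python
import ast

ALLOWED = (ast.BinOp, ast.Constant, ast.Name, ast.Add, ast.Mult, ast.Load)

def fold(node):
    """Optimise: replace constant sub-expressions with their computed value."""
    if isinstance(node, ast.BinOp):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
            if isinstance(node.op, ast.Add):
                return ast.Constant(value=left.value + right.value)
            return ast.Constant(value=left.value * right.value)
        return ast.BinOp(left=left, op=node.op, right=right)
    return node

def generate(node, out):
    """Generate code: emit instructions for a hypothetical stack machine."""
    if isinstance(node, ast.Constant):
        out.append(("PUSH", node.value))
    elif isinstance(node, ast.Name):
        out.append(("LOAD", node.id))
    else:  # BinOp: operands first, then the operator
        generate(node.left, out)
        generate(node.right, out)
        out.append(("ADD",) if isinstance(node.op, ast.Add) else ("MUL",))
    return out

def compile_expr(source, known_names):
    tree = ast.parse(source, mode="eval").body            # 1. parse
    for n in ast.walk(tree):                              # 2. analyse
        if not isinstance(n, ALLOWED):
            raise SyntaxError("only names, constants, + and * are supported")
        if isinstance(n, ast.Name) and n.id not in known_names:
            raise NameError(f"undefined name: {n.id}")
    return generate(fold(tree), [])                       # 3. optimise, 4. generate

print(compile_expr("(a + b) * c", {"a", "b", "c"}))
print(compile_expr("(3 + 4) * 5", set()))                 # [('PUSH', 35)]
```

When every operand is a constant, the whole expression folds down to a single `PUSH 35` before anything runs, which is the kind of work an optimising compiler does ahead of time.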

The interpreter: translate and run, one statement at a time. An interpreter is a program that reads source code at run time and carries out each statement as it encounters it. No separate machine-code file is produced. The source file is the program that the user delivers; the interpreter is the engine that runs it.

For each statement, the interpreter roughly does:

1. Read and parse the next statement from the source file.
2. Decide what that statement means (what operation it describes).
3. Execute the corresponding action directly (by running the interpreter’s own machine code to carry out the operation).
4. Move to the next statement.

This cycle repeats until the program ends or an error stops it. Notice that without step 2 — deciding what a statement means — the interpreter cannot act; skip it and every statement would be a no-op. The interpreter never writes a machine-code binary. Instead, the interpreter itself is a compiled binary (written in C or another language), and it carries out your program by running its own machine code in response to what your statements say.
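The read, decide, execute, move-on cycle can be sketched as a tiny interpreter. The sketch below assumes a made-up mini-language of `name = expression` lines with only `+` and `*`; `run` and `evaluate` are hypothetical names, and Python's `ast` module is borrowed purely to parse each line. Real interpreters are organised differently, but the loop has the same shape.

```python
import ast

def evaluate(node, variables):
    """Decide what an expression means and carry it out immediately."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        if node.id not in variables:
            raise NameError(f"name {node.id!r} is not defined")
        return variables[node.id]
    if isinstance(node, ast.BinOp):
        left = evaluate(node.left, variables)
        right = evaluate(node.right, variables)
        if isinstance(node.op, ast.Add):
            return left + right
        if isinstance(node.op, ast.Mult):
            return left * right
    raise SyntaxError("unsupported expression")

def run(source):
    """Interpret the program one statement at a time; no output file is written."""
    variables = {}
    for line in source.splitlines():                   # 4. move to the next statement
        if not line.strip():
            continue
        stmt = ast.parse(line.strip()).body[0]         # 1. read and parse
        if not (isinstance(stmt, ast.Assign)           # 2. decide what it means
                and isinstance(stmt.targets[0], ast.Name)):
            raise SyntaxError(f"unsupported statement: {line!r}")
        value = evaluate(stmt.value, variables)        # 3. execute
        variables[stmt.targets[0].id] = value
    return variables

program = """
a = 3
b = 4
c = 5
result = (a + b) * c
"""
print(run(program)["result"])   # 35
```

Nothing here ever emits instructions for the program being run. The only code the CPU executes is that of `run` and `evaluate` themselves (and, underneath them, the Python interpreter).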

Tradeoffs: speed, startup, and portability.

Speed. Compiled programs typically run faster once they are running. The compiler had time to analyse the whole program, apply global optimisations, and produce machine code tailored to the target CPU. An interpreter must do translation work for every statement at run time, adding overhead to every operation.

Startup and portability. A compiled binary starts as soon as the operating system has loaded it, whereas an interpreted program must first launch the interpreter, which then reads and parses the source. The binary, however, is specific to one platform (machine code for x86-64 will not run on ARM), so you must compile separately for each target. An interpreted program ships as source only: the same source runs on any platform that has an interpreter installed, with no recompilation. This is why Python scripts or JavaScript files can be shared as plain text and run on any machine that has the language installed.

Error detection. A compiler sees the whole program before running any of it, so it can catch errors that span multiple lines, such as type mismatches, calls to undefined functions, and variables used before assignment. An interpreter typically discovers such errors only when it reaches the problematic line at run time, so a bug on a rarely executed path can go unnoticed for a long time.
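A short Python illustration of the late-detection problem, with a deliberately misspelled name. The function `report` and its parameters are invented for this example. In standard Python the mistake goes unreported until the faulty branch actually executes, although separate static tools such as linters can flag it earlier.

```python
def report(total, verbose):
    if verbose:
        return "total is " + totl      # typo: 'totl' is never defined
    return "total is " + str(total)

# The faulty line is never reached, so the program appears to work.
print(report(35, verbose=False))

# Only when execution reaches the bad line does the error surface.
try:
    report(35, verbose=True)
except NameError as err:
    print("found only at run time:", err)
```

A compiler for a statically checked language would usually reject the equivalent program outright, because the undefined name is visible without running anything.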

Development speed. Interpreted languages typically have a tighter edit-run-debug loop. You change a line and immediately re-run the interpreter — no compile step. Compiled languages require re-running the compiler, which for large codebases can take seconds to minutes.

When you choose a language for a new service, these tradeoffs are real decisions: a Go or Rust binary starts cold in milliseconds; a Python service must first start its interpreter and load its modules, but ships without a compile step. Neither is universally right: the right choice depends on which cost matters more in your situation.

▸Why this works

Why does the interpreter itself not need to be interpreted? The interpreter is a program just like any other. It was written in a high-level language (usually C), compiled to machine code for the host platform, and distributed as a compiled binary. When you run python3 script.py, your OS loads the Python interpreter binary into memory and runs it. The interpreter binary’s machine code then reads script.py and carries out its statements. At no point does the CPU execute your Python source text directly — the CPU only ever runs machine code, which in this case is the interpreter’s own machine code. Your Python code controls what the interpreter does, not how the CPU runs.

Just-in-time (JIT) compilation: the hybrid. Modern language runtimes — JavaScript engines (V8, SpiderMonkey), the Java Virtual Machine (JVM), and Python’s PyPy — use a third strategy called just-in-time (JIT) compilation. JIT starts by interpreting (or running bytecode through a fast virtual machine), and then — at run time, while the program is already running — identifies which parts of the code execute most frequently (called hot spots) and compiles just those parts to native machine code on the fly.

The advantage: you get the fast startup and portability of an interpreter (no ahead-of-time compile step), plus the run-time speed of compiled code for the hot paths that matter most. The JIT compiler can also use information that is only available at run time, such as the actual types of the values a variable holds, and so can sometimes produce better machine code for a hot path than an ahead-of-time compiler could.

The disadvantage: there is a warm-up period — the first time a function runs, it is interpreted (slow). After it has been called enough times, the JIT compiles it. Peak performance is not reached until after warm-up. This is why Node.js applications can take a moment to reach full speed, and why benchmark results for JavaScript improve as the code runs longer.
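The interpret-first, compile-when-hot idea can be imitated in plain Python. This is only a sketch of the control flow, under loud assumptions: `TieredExpression`, `walk` and `HOT_THRESHOLD` are invented names, the threshold of 3 is arbitrary, and the "compiled" tier is Python bytecode from the built-in `compile()`, not native machine code as in a real JIT.

```python
import ast

HOT_THRESHOLD = 3   # hypothetical; real engines use larger, adaptive thresholds

def walk(node, env):
    """Slow tier: re-examine the syntax tree on every single call."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return walk(node.left, env) + walk(node.right, env)
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult):
        return walk(node.left, env) * walk(node.right, env)
    raise SyntaxError("unsupported expression")

class TieredExpression:
    def __init__(self, source):
        self.source = source
        self.tree = ast.parse(source, mode="eval").body
        self.calls = 0
        self.compiled = None            # filled in once the expression is hot

    def __call__(self, **env):
        self.calls += 1
        if self.compiled is None and self.calls > HOT_THRESHOLD:
            # Hot spot detected: translate once, reuse on every later call.
            self.compiled = compile(self.source, "<hot>", "eval")
        if self.compiled is not None:
            return eval(self.compiled, {"__builtins__": {}}, env)   # fast tier
        return walk(self.tree, env)                                 # slow tier

f = TieredExpression("(a + b) * c")
for _ in range(5):
    print(f(a=3, b=4, c=5), "compiled" if f.compiled else "interpreted")
```

The first three calls take the slow tier and the later ones reuse the translated form, which mirrors the warm-up behaviour described above: same results throughout, with the translation cost paid once after the code proves to be hot.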

▸Edge cases

Bytecode as an intermediate step. Many languages (Python, Java, Kotlin) do not interpret source text directly or compile all the way to native machine code immediately. Instead, they compile to bytecode: a compact, portable binary format designed for a specific virtual machine (the Python virtual machine, the JVM). The bytecode is smaller and faster to execute than plain source text, but still machine-independent. The virtual machine then either interprets the bytecode or applies JIT compilation to it. This two-stage approach separates portability (bytecode can be shipped anywhere a VM exists) from performance (the VM or JIT handles the final translation to native code).
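You can look at this intermediate step in standard Python using the built-in `compile()` function and the `dis` module. The snippet below is a small exploration, not a description of the VM's internals, and the exact instruction names it prints differ between Python versions.

```python
import dis

source = "result = (a + b) * c"

# Stage 1: source text -> code object holding bytecode (no native code yet).
code = compile(source, "<demo>", "exec")
print(type(code.co_code).__name__, "of length", len(code.co_code))

# List the bytecode instructions the Python virtual machine will interpret.
for instr in dis.get_instructions(code):
    print(instr.opname, instr.argrepr)

# Stage 2: the virtual machine executes the bytecode.
namespace = {"a": 3, "b": 4, "c": 5}
exec(code, namespace)
print(namespace["result"])   # 35
```

The instruction listing refers to names such as `a` and `result` rather than CPU registers or addresses, which is what keeps the bytecode independent of the underlying machine.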

[Figure: source.c (input) → compiler (step 1) → program.exe (binary) → CPU runs (step 2)]

Compiler path (top): the source file is translated to a machine-code binary before the program runs. The CPU loads and runs the binary directly. No translation happens at run time.

Worked example

Comparing what happens when you run the same program, compiled vs interpreted.

Program to compute: result = (a + b) * c, where a=3, b=4, c=5. Expected result: 35.

Compiled path (C):

1. Before running: The C compiler reads the source file, parses the expression, optimises it, and emits machine code. For a simple expression like this, the compiler may reduce the entire computation to a constant (3+4)*5 = 35, stored directly in the binary. The binary lands on disk.
2. At run time: The OS loads the binary. The CPU fetches the instruction that loads the constant 35 into a register. Done. The translation happened entirely in step 1; step 2 has zero translation overhead.

Interpreted path (Python):

1. At run time: The Python interpreter starts. It reads the line result = (a + b) * c. It looks up a (3), b (4), adds them → 7. It looks up c (5), multiplies → 35. It stores 35 in the variable result. Every single step was a look-up, decode, and execute performed by the interpreter’s own machine code. The interpreter did not produce any machine code for your program.
2. The interpreter moves to the next line. If the program is long, this overhead accumulates. In CPython (the standard Python implementation), interpreted Python is typically 10–100× slower than compiled C for CPU-bound computations.

JIT path (JavaScript in V8):

1. First call: V8 interprets the JavaScript quickly (with minimal translation).
2. After many calls: V8’s JIT detects that the function is hot. It compiles the function to native machine code (x86-64 on a typical desktop or server) on the fly. Subsequent calls run that machine code, often approaching the speed of ahead-of-time compiled code.
3. The warm-up cost was paid once; the long-running benefit is near-compiled performance.

Practice

A compiler translates the source program to machine code. When does this translation happen — before the program runs, or during the program's run? Type 1 for 'before' or 2 for 'during'.

An interpreter translates and executes the source code. When does it translate each statement — before the program starts, or at run time? Type 1 for 'before' or 2 for 'at run time'.

Which strategy typically produces faster run-time execution for long-running CPU-intensive programs: compilation or interpretation? Type 1 for 'compilation' or 2 for 'interpretation'.

A compiled C program produces a binary for x86-64. Can that binary run directly on an ARM CPU without recompilation? Type 1 for yes, 0 for no.

A JIT compiler detects which parts of a program run most frequently and compiles only those parts to native machine code at run time. What are those frequently-executed parts commonly called? Type 1 for 'hot spots' or 2 for 'cold paths'.

Check yourself

Quiz

What is the fundamental difference between how a compiler and an interpreter translate a high-level program?

Recap

There are two fundamental strategies for running a high-level program on a CPU. A compiler translates the entire source file into machine code ahead of time (before the program runs), producing a platform-specific binary that the CPU can execute directly. An interpreter reads source statements and executes them at run time, one at a time, without producing a machine-code binary.

The tradeoffs: compiled programs typically run faster (the CPU executes pre-translated machine code with no translation overhead) but must be recompiled for each target CPU. Interpreted programs are more portable (the same source runs anywhere a suitable interpreter exists) and have no compilation step, but run slower because of per-statement translation overhead.

Just-in-time (JIT) compilation is a hybrid: the program starts interpreted (or via a compact bytecode virtual machine), and the JIT compiler detects frequently executed hot spots and compiles only those portions to native machine code at run time, achieving near-compiled performance after a warm-up period. Modern JavaScript engines (V8), the Java JVM, and PyPy all use JIT. Now when you see a Node.js service taking a few seconds to reach full throughput, or a Java application that is slow on first request, you will recognise JIT warm-up as a likely cause: not a bug, but the cost of choosing startup flexibility over ahead-of-time compilation.

