Base CS from zero
From source to running program
Across this unit you have learned four ideas in sequence. Assembly language gives machine code readable names (and the assembler translates them, one to one). High-level languages let one statement stand for many machine instructions and are portable across CPUs (but need a translator). Compilers translate a whole program to machine code before it runs; interpreters translate and execute statement by statement at run time. The runtime is the support machinery — garbage collector, call stack, standard library, VM — that runs alongside your code.
Now it is time to connect all four of those ideas into a single continuous chain. When you save a source file and click Run — or type a command in a terminal — what actually happens, step by step, until the CPU is executing your logic? Where does the journey begin, and where does it end?
This lesson traces that full journey for a tiny program. By the end, you will have seen the path that every program — no matter how large — follows from the moment you finish writing it to the moment the CPU executes its first instruction.
After this lesson you can describe every stage of the source-to-execution pipeline in order (source text → translation → linking → loading → runtime startup → CPU execution), explain the role of the linker and the OS loader, and trace a small program completely through the pipeline.
Stage 1: you write source text. It begins with you. You create a text file — a
.ts file, a .py file, a .c file. The contents are human-readable characters: letters,
numbers, punctuation, whitespace. The CPU cannot run this text. It is the starting
material, not the finished product.
The source file encodes your intent using a programming language’s vocabulary and grammar. It is the highest-level representation of your program — the form that humans author and read. Everything that follows is a series of transformations that ultimately produce the bit patterns the CPU can execute.
Stage 2: translation. The source text must be converted to machine code. Which mechanism performs this conversion depends on the language:
- Compiled language (C, Rust, Go): A compiler reads the source file, parses it, checks types, optimises, and emits machine code into an object file (.o or .obj). The object file contains machine code for the functions defined in that source file, plus a list of external symbols the file uses but does not define (like functions from other files or the standard library).
- Assembled language (hand-written assembly): The assembler reads the source file, looks up each mnemonic in a table, and emits machine code into an object file. One mnemonic, one instruction.
- Interpreted language (Python, Ruby): There is no ahead-of-time translation step. The interpreter loads and executes the source file at run time (possibly compiling to bytecode first as an internal optimisation). For the purposes of this pipeline, the “source text” is what the interpreter receives at run time.
- Bytecode language (Java, Kotlin, C#): A compiler reads the source and emits bytecode (a compact, portable intermediate binary) into a .class file (Java) or a .dll/.exe file (C#). The bytecode is later run by a VM (JVM, CLR).
Stage 3: linking. A real program almost never consists of a single object file. It
calls library functions (from the standard library and third-party packages), and those
functions live in separate object files. The linker is the program that combines all
the object files (and library archives) into a single executable file. It resolves all
the external symbol references: if your code calls printf, the linker finds the compiled
printf machine code in the C standard library archive, includes it in the output, and
patches the call address so the call goes to the right place.
After linking, the executable is a single binary file that contains all the machine code
the program needs (statically linked) or a file that names shared libraries to be loaded
at run time (dynamically linked). Dynamic linking is more common: the printf machine code
lives in the shared library libc.so on disk, and the executable just contains a reference
to it. The OS loads libc.so at run time when the program starts.
For interpreted languages, there is no explicit linking step — the interpreter resolves module imports at run time.
Stage 4: loading. When you run the executable — by double-clicking or typing its name in a terminal — the operating system’s loader takes over. The loader:
1. Reads the executable file from disk.
2. Creates a new process: an isolated execution environment with its own virtual address space (the OS’s illusion of private memory).
3. Copies the program’s machine-code bytes from the executable file into memory cells in the process’s address space, starting at the address the linker designated.
4. Loads any required shared libraries into the process’s address space.
5. Sets up the call stack in memory (a region reserved for function call frames).
6. Sets the program counter to the address of the program’s first instruction.
7. Transfers control: the CPU starts fetching from that address.
After step 7, the CPU is running. The loader’s job is done.
Stage 5: runtime startup and your code. Before your main function (or your top-level
script) runs, the runtime performs its own initialisation:
- For C: the startup code (entered at the symbol _start and traditionally supplied by a file called crt0) gathers argc and argv, initialises the C library, and then calls main.
- For Python: the CPython interpreter starts, initialises the Python VM, imports the standard library modules needed by your script, and begins executing your script from the first statement.
- For Java: the JVM starts, loads the bytecode for your class, initialises the runtime (GC, JIT compiler, class loader), and calls your main method.
- For Node.js: the V8 engine starts, the Node.js runtime initialises the event loop and standard library, and begins executing your JavaScript file.
Once the runtime is initialised, your code begins running. Every function call, every variable access, every allocation from this point on is handled jointly by your code and the runtime.
Stage 6: the CPU’s fetch-decode-execute loop — forever. From the moment the program counter is set to the first instruction, the CPU runs its loop without pause:
1. Fetch: read the instruction bytes at the address in the program counter from memory.
2. Decode: interpret the bit pattern (which opcode? which registers? which address?).
3. Execute: carry out the operation (add, load, store, jump, compare…).
4. Advance PC: move the program counter to the next instruction (or to a jump target).
5. Go back to step 1.
The CPU does not know or care that the instruction was once a line of TypeScript or Python. By the time it reaches the CPU, it is just a bit pattern in a memory cell. The CPU runs its loop. The program’s logic unfolds as a consequence of the specific bit patterns that the translation pipeline placed in memory.
The loop ends when the program calls the OS “exit” system call, when the process is killed externally, or when an unhandled exception crashes it.
Why this works
Why is the pipeline split into so many stages? Each stage has a distinct role that the others cannot do. The compiler understands language semantics but does not know memory addresses (the linker resolves those). The linker combines object files but does not create processes (the OS loader does that). The loader puts code in memory but does not initialise the language runtime (the runtime startup code does). The CPU runs instructions but does not understand your source text. Splitting the pipeline into stages allows each tool to do one job well and be reused across projects.
Tracing a tiny C program through the complete pipeline.
Source file add.c:
    #include <stdio.h>
    int main(void) {
        int a = 3;
        int b = 4;
        int result = a + b;
        printf("Result: %d\n", result);
        return 0;
    }
Stage 1 — Source text. You save add.c. It is a little over a hundred bytes of ASCII text. The CPU cannot run it.
Stage 2 — Compilation. You run gcc -c add.c -o add.o. The compiler:
- Parses the source into an AST.
- Determines that a, b, and result are local integer variables (allocated on the call stack, not the heap).
- Emits x86-64 machine code for the body: instructions to push a stack frame, store the constants 3 and 4 in the stack slots for a and b, add them, call printf, clean up the frame, return.
- Records an unresolved reference to printf (defined in libc, not in add.c).
- Writes the machine code and the symbol table to add.o.
Stage 3 — Linking. You run gcc add.o -o add. The linker:
- Takes add.o and the C standard library.
- Finds the definition of printf in libc.
- In a dynamically-linked build, records that printf comes from libc.so.6 and patches the call site so that at run time the dynamic linker will resolve it.
- Emits the final executable add.
Stage 4 — Loading. You run ./add. The OS loader:
- Reads the ELF header of add to find the address and size of the code section.
- Creates a process, maps the code section into memory (say, starting at address 0x401000).
- Maps libc.so.6 into the process’s address space.
- Sets up a call stack at a high virtual address.
- Sets PC to the address of _start, the runtime startup entry point (say, 0x401020, a little before main).
Stage 5 — Runtime startup. The C runtime startup code runs: it sets up argc/argv,
calls global constructors (none in this case), then calls main. The PC is now at the
first instruction of main.
Stage 6 — Fetch-decode-execute. The CPU runs:
- Instruction at 0x401040: push the stack frame (allocate space for a, b, result).
- Instruction at 0x401044: move the constant 3 into the memory cell for a on the stack.
- Instruction at 0x401048: move the constant 4 into the memory cell for b on the stack.
- Instructions from 0x40104c: load a into register eax, load b into edx, execute ADD eax, edx → eax = 7.
- Instruction at 0x401054: store eax (7) into the stack cell for result.
- Instructions from 0x401058: load the format string address and the value 7, call printf. The runtime resolves the printf address (via the dynamic linker) and jumps there.
- Inside printf, the runtime formats “Result: 7\n” and makes an OS write system call to send it to stdout.
- printf returns. main executes return 0. The runtime calls exit(0). The OS terminates the process and frees its memory.
From a few lines of C source text to the terminal printing “Result: 7”: that is the full pipeline.
The pipeline has six main stages. Which stage converts source text into machine-code object files? Type 1 for 'translation (compilation/assembly)' or 2 for 'loading'.
The linker resolves external symbol references between object files. If your code calls printf but printf is defined in the C standard library, which stage makes the program actually able to call printf at run time? Type 1 for 'compilation' or 2 for 'linking'.
The OS loader creates a new process and copies program bytes from disk into memory. After the loader finishes, which register determines which instruction the CPU runs first? Type 1 for 'the program counter' or 2 for 'a general-purpose register'.
In the fetch-decode-execute loop, the CPU fetches instruction bytes from memory. By the time a Python statement reaches the CPU as machine code (via CPython's interpreter), does the CPU know it originated from Python source text? Type 1 for yes, 0 for no.
For interpreted languages like Python, is there an explicit linking stage that combines object files before the program runs? Type 1 for yes, 0 for no.
What is the role of the OS loader in the source-to-execution pipeline?
Every program follows the same six-stage pipeline from source text to CPU execution:
- Source text — you write human-readable code in a file.
- Translation — a compiler, assembler, or (at run time) an interpreter converts the source into machine code or bytecode.
- Linking — the linker combines object files and resolves references to library functions, producing a complete executable (not present for interpreted languages).
- Loading — the OS loader reads the executable from disk, creates a process, copies the machine-code bytes into memory, and sets the program counter to the first instruction.
- Runtime startup — the language runtime initialises (GC, VM, standard library) and calls the program’s entry point.
- Fetch-decode-execute — the CPU runs its loop: fetch instruction bytes from the address in the program counter, decode the opcode and operands, execute the operation, advance the program counter, repeat.
By the time the CPU executes any instruction, the source text is gone — replaced by raw bit patterns in memory cells. The CPU does not know whether it is running Python, C, or assembly: it runs its loop regardless. The full meaning of the program unfolds from the specific patterns that the translation pipeline placed in memory.