From machine code to a language: code and mapping reading
Crux Read assembly and high-level snippets, trace a source-to-machine-code mapping, and distinguish compilation, interpretation, and JIT by what each one produces and when.
The clearest way to tell these layers apart is to look at real code and ask: how many machine instructions does this become, and when does the translation happen? Read each snippet and pick the answer a careful engineer would give.
Goal
Practise reading the mapping between layers: assembly to machine code (one-to-one), a high-level statement to many instructions, and how compiling, interpreting, and JIT-compiling differ in what is produced and when.
On the toy CPU each instruction is 2 bytes. How many machine instructions and how many bytes does the assembler emit for these four lines, and why?
Heads-up The assembler does not add hidden instructions. One-to-one means each line is exactly one machine instruction; the ADD line is a single ADD instruction, with the operands already in R0 and R1 from the LOADs.
Heads-up An assembler does not optimise or fold instructions — that is a compiler's job. The assembler substitutes mnemonic for opcode one-to-one, so four lines stay four instructions.
Heads-up Byte count is set by the instruction encoding, not the length of the mnemonic text. Each instruction is 2 bytes here, so four instructions are 8 bytes.
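The one-to-one rule can be sketched with a toy assembler. Everything specific here is an assumption for illustration: the four-line listing (two LOADs, an ADD, and a STORE), the opcode numbers, and the layout of each 2-byte instruction (opcode and register in the first byte, address or second register in the second) are hypothetical and are not the lesson's actual toy CPU encoding.

```python
# Hypothetical opcode table for a toy CPU; the numeric values are made up.
OPCODES = {"LOAD": 0x1, "ADD": 0x2, "STORE": 0x3}


def parse_operand(token):
    """Turn 'R1' into 1 and a literal such as '0x10' into 16."""
    if token.startswith("R"):
        return int(token[1:])
    return int(token, 0)


def assemble_line(line):
    """Encode one assembly line as exactly one 2-byte instruction."""
    mnemonic, first, second = line.replace(",", " ").split()
    opcode = OPCODES[mnemonic]
    # Assumed layout: high nibble = opcode, low nibble = register,
    # second byte = memory address or second register.
    return bytes([(opcode << 4) | parse_operand(first), parse_operand(second)])


def assemble(source):
    """One input line in, one machine instruction out; nothing is added or folded."""
    lines = [ln for ln in source.strip().splitlines() if ln.strip()]
    return [assemble_line(ln) for ln in lines]


# Assumed listing: the operands reach R0 and R1 via the two LOADs.
PROGRAM = """
LOAD R0, 0x10
LOAD R1, 0x11
ADD R0, R1
STORE R0, 0x12
"""

instructions = assemble(PROGRAM)
machine_code = b"".join(instructions)
print(len(instructions), "instructions,", len(machine_code), "bytes")
```

Under these assumptions the byte count depends only on the fixed 2-byte encoding: renaming LOAD to a longer mnemonic would change the source text but not the emitted bytes.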
Snippet 2 — one high-level statement
const total = price * quantity + shipping;
Quiz
The three variables are already in memory. Roughly how many machine instructions does this single TypeScript line become, and who decides which ones?
Heads-up Source lines are not machine instructions. A high-level statement is one-to-many: this single expression becomes loads, a multiply, an add, and a store. Only assembly is one-to-one.
Heads-up The programmer chose none of them — that is the abstraction. The compiler selects the instructions, allocates registers, and resolves addresses; you wrote only the intent.
Heads-up The CPU has no instruction for an arbitrary expression. It runs the individual loads, multiply, add, and store the compiler emitted.
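To make the one-statement-to-many-instructions mapping concrete, here is a minimal sketch of a lowering step. It is not how a real TypeScript toolchain or JavaScript engine works: the instruction names (LOAD, MUL, ADD, STORE), the naive register choice, and the `lower` helper are all hypothetical. The expression is parsed with Python's `ast` module only because `price * quantity + shipping` happens to be valid syntax in both languages.

```python
import ast

# Hypothetical toy-CPU mnemonics for the two operators this sketch supports.
OPERATORS = {ast.Mult: "MUL", ast.Add: "ADD"}


def lower(expression, target):
    """Lower a small arithmetic expression into toy instructions."""
    code = []

    def emit(node, reg):
        # Leaves the value of `node` in register R<reg>.
        if isinstance(node, ast.Name):
            code.append(f"LOAD R{reg}, {node.id}")
        elif isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            emit(node.left, reg)
            emit(node.right, reg + 1)
            code.append(f"{OPERATORS[type(node.op)]} R{reg}, R{reg + 1}")
        else:
            raise ValueError("unsupported expression")

    emit(ast.parse(expression, mode="eval").body, 0)
    code.append(f"STORE R0, {target}")
    return code


instructions = lower("price * quantity + shipping", "total")
for instruction in instructions:
    print(instruction)
```

In this sketch the single source line turns into six toy instructions, and the author of the line picked none of them; the lowering code chose the order, the registers, and where the result is stored.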
Snippet 3 — three ways to run the same program
A) gcc add.c -o add && ./add   # produces a native binary, then runs it
B) python3 add.py              # interpreter reads and runs the source
C) node add.js                 # interpret first, JIT-compile hot functions
Quiz
Which statement correctly distinguishes A, B, and C by what gets produced and when?
Heads-up Only A writes a native binary ahead of time. The interpreter in B produces no machine-code binary for your program; C produces native code only at run time, in memory, for hot functions.
Heads-up C is not a pure interpreter. It interprets first, then JIT-compiles frequently executed functions to native machine code at run time, often approaching compiled speed after warm-up.
Heads-up A compiles once; ./add then runs the existing binary with zero translation overhead. The interpreted path B is the one that pays translation cost on every run and every statement.
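The "interpret first, compile what gets hot" behaviour of C can be imitated in a few lines. This is only a simulation under stated assumptions: the stack-machine instructions, the `HOT_THRESHOLD` value, and the `TinyJit` class are hypothetical, and a generated Python function stands in for native machine code. Real engines such as the one in Node.js are far more elaborate.

```python
HOT_THRESHOLD = 3  # hypothetical: calls before a function counts as "hot"

# total = price * quantity + shipping, for a made-up stack machine.
PROGRAM = [("push", "price"), ("push", "quantity"), ("mul", None),
           ("push", "shipping"), ("add", None)]


def interpret(program, env):
    """Decode and execute every instruction again on each call."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(env[arg])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right if op == "mul" else left + right)
    return stack.pop()


def compile_program(program):
    """Translate once into a Python function (a stand-in for native code)."""
    symbols = {"mul": "*", "add": "+"}
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(f"env[{arg!r}]")
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(f"({left} {symbols[op]} {right})")
    return eval(f"lambda env: {stack.pop()}")


class TinyJit:
    def __init__(self, program):
        self.program = program
        self.calls = 0
        self.compiled = None  # kept in memory only, never written to disk

    def run(self, env):
        self.calls += 1
        if self.compiled is None and self.calls > HOT_THRESHOLD:
            self.compiled = compile_program(self.program)
        if self.compiled is not None:
            return "compiled", self.compiled(env)
        return "interpreted", interpret(self.program, env)


env = {"price": 4, "quantity": 3, "shipping": 5}
jit = TinyJit(PROGRAM)
history = [jit.run(env) for _ in range(5)]
for mode, value in history:
    print(mode, value)
```

The first three calls pay the decoding cost on every instruction; from the fourth call on, the translated function is reused. Nothing is produced ahead of time and nothing is saved as a file, which is the contrast with path A.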
Snippet 4 — bytecode in between
javac App.java  ->  App.class (bytecode, not native machine code)
java App        ->  the JVM loads App.class and runs the bytecode
Quiz
What is App.class, and why ship bytecode instead of either source text or a native binary?
Heads-up Bytecode is not native machine code — the CPU cannot run it directly. The JVM reads the bytecode and interprets or JIT-compiles it to native code. That indirection is what makes the same .class portable across CPUs.
Heads-up It is not source text; javac has already translated the source into bytecode, a compact binary for the JVM. The original Java text is not what the JVM loads.
Heads-up The opposite — bytecode is CPU-independent. Any machine with a JVM can run the same .class; the JVM bridges to that machine's native code.
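The same idea can be poked at without a JDK, as a loose analogy: CPython also translates source into bytecode for its own virtual machine before running it. The sketch below is about CPython bytecode, not JVM bytecode, and the exact opcodes differ between Python versions, so no listing is shown as expected output. The variable values are made up.

```python
import dis

source = "total = price * quantity + shipping"

# Translate the source text into a code object holding bytecode for the
# CPython virtual machine. These bytes are not native machine code.
code_object = compile(source, "<example>", "exec")
bytecode = code_object.co_code

# Opcode names as the VM sees them; the exact list varies by Python version.
opnames = [instruction.opname for instruction in dis.get_instructions(code_object)]

# The VM, not the CPU, executes the bytecode.
namespace = {"price": 4, "quantity": 3, "shipping": 5}
exec(code_object, namespace)

print(type(bytecode).__name__, len(bytecode), "bytes of bytecode")
print(opnames)
print("total =", namespace["total"])
```

As with App.class, what is handed to the virtual machine is neither the original text nor something the CPU can run directly; the VM bridges the gap on whatever machine it is installed on.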
Recap
Reading the layers comes down to two questions. How many machine instructions? Assembly is one-to-one; a high-level statement is one-to-many, and the compiler picks the instructions. When does translation happen, and what is produced? A compiler emits a native binary ahead of time; a pure interpreter emits no native binary for your program and translates each statement at run time; a JIT interprets first and compiles hot functions to native code on the fly; and bytecode is a portable middle layer that the VM interprets or JIT-compiles. Match the snippet to the mechanism and the rest follows.