
Fetch–Decode–Execute

The CPU runs one loop forever: fetch the next instruction from memory, decode what it means, execute it, then advance the program counter and repeat. Every program ever run on any CPU is this loop.


A CPU is powered on. A program is loaded into memory. Now what?

The CPU does not “think.” It does not “decide” to run anything. It simply starts a loop and never stops. The loop has three steps, repeated billions of times per second:

1. Fetch the next instruction from memory.
2. Decode what that instruction means.
3. Execute it.

Then go back to step 1. That is the entire mechanism by which every program in history — from the first batch-processing jobs in the 1950s to the AI model running on today’s hardware — has been executed. The CPU is a machine that runs this loop.

Goal

After this lesson you can describe the fetch–decode–execute cycle step by step, explain what the program counter is and how it advances, and explain why the cycle is truly a loop that never stops while the CPU is powered.

The program counter: keeping track of where we are. Before the loop can start, the CPU needs to know where in memory the first instruction lives. It keeps track of this with a special register called the program counter (often abbreviated PC, and also called the instruction pointer on x86-64 CPUs).

A register is a tiny, ultra-fast storage slot physically inside the CPU — you will learn about registers fully in the next lesson. For now: the program counter is a register that holds one value, which is the memory address of the next instruction to execute.

At power-on, the program counter is set to a fixed starting address defined by the chip’s design, often called the reset vector. The code that runs from there is firmware; the operating system is loaded only later. From that moment on, the CPU updates the program counter automatically after every instruction, driving the loop forward.

Step 1 — Fetch. The CPU reads the instruction stored at the memory address held in the program counter.

Concretely:

The CPU puts the value of the program counter onto the address bus (the wires connecting CPU to memory).
Memory returns the bytes stored at that address.
Those bytes are loaded into a special internal register called the instruction register (IR), where the CPU holds the current instruction while it works on it.

The instruction is now inside the CPU. Memory is not involved again until the next fetch (or until a LOAD/STORE instruction explicitly accesses memory).
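The fetch step can be sketched in a few lines of Python. This is a toy model, not how any particular chip is built: memory is assumed to be a flat array of bytes, every instruction is assumed to be 4 bytes long, and the names fetch, memory, pc and ir are chosen for illustration.

```python
# Toy model of the fetch step. Assumptions: memory is a flat byte array
# and every instruction is exactly 4 bytes long.
INSTRUCTION_SIZE = 4


def fetch(memory: bytearray, pc: int) -> bytes:
    """Return the instruction bytes stored at the address held in pc."""
    return bytes(memory[pc:pc + INSTRUCTION_SIZE])


memory = bytearray(256)
# Place an arbitrary 4-byte instruction at address 100.
memory[100:104] = bytes([0x01, 0x00, 0xC8, 0x00])

pc = 100                # program counter: address of the next instruction
ir = fetch(memory, pc)  # instruction register: holds the fetched bytes
print(ir.hex())         # 0100c800
```

Note that fetch only copies bytes. It does not look at what they mean; that is the job of the next step.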

Step 2 — Decode. The CPU examines the bytes it just fetched. The first part of the bytes is the opcode — the operation code that identifies which instruction this is (ADD, LOAD, JUMP, etc.). The remaining bytes are the operands.

Decoding is done by a dedicated circuit inside the CPU called the control unit. The control unit reads the opcode and produces a set of control signals that tell the rest of the CPU what to do: which functional unit to activate, where to route the operands, where to put the result.

From the programmer’s perspective, decode is invisible. But it is the step that turns a raw bit pattern into a specific action.
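As a rough software analogy, decoding amounts to splitting the fetched bytes into an opcode and operands, then looking the opcode up in a table. The sketch below assumes a made-up encoding in which the first byte is the opcode and the other three bytes are operands; the opcode numbers are hypothetical and do not belong to any real instruction set.

```python
from typing import NamedTuple

# Hypothetical opcode numbers; a real instruction set defines its own.
OPCODES = {0x01: "LOAD", 0x02: "STORE", 0x03: "ADD", 0x04: "JUMP"}


class Decoded(NamedTuple):
    name: str        # which instruction this is
    operands: tuple  # the remaining bytes, interpreted later by execute


def decode(instruction: bytes) -> Decoded:
    """Split raw instruction bytes into an opcode name and its operands."""
    opcode, *operands = instruction
    if opcode not in OPCODES:
        raise ValueError(f"unknown opcode {opcode:#04x}")
    return Decoded(OPCODES[opcode], tuple(operands))


decoded = decode(bytes([0x01, 0x00, 0xC8, 0x00]))
print(decoded.name, decoded.operands)  # LOAD (0, 200, 0)
```

In hardware there is no dictionary lookup: the control unit is a circuit whose outputs are the control signals. The table here only stands in for that mapping from bit pattern to action.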

Step 3 — Execute. The CPU carries out the action specified by the decoded instruction. What “execute” means depends on the instruction type:

For an arithmetic instruction (e.g., ADD): the control unit routes the operands to the Arithmetic Logic Unit (ALU), the ALU computes the result, and the result is written into the destination register.
For a LOAD instruction: the CPU sends the memory address to the address bus, waits for memory to return the value, and stores it in the destination register.
For a STORE instruction: the CPU sends both the memory address and the value to memory, which stores it.
For a JUMP instruction: the CPU writes the jump target address into the program counter, which will change where the next fetch comes from.
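The four cases above can be sketched as one Python function that branches on the instruction type. This is an illustration under simplifying assumptions: registers are a list, memory is a dict from address to value, and the execute function and its operand layout are invented for this example. The function returns a jump target when the instruction is a JUMP and None otherwise.

```python
def execute(name, operands, regs, memory):
    """Carry out one decoded instruction.

    Returns the jump target for a JUMP, or None for every other
    instruction (meaning: this instruction does not redirect the PC).
    """
    if name == "ADD":
        dest, src_a, src_b = operands
        regs[dest] = regs[src_a] + regs[src_b]  # the ALU's job
    elif name == "LOAD":
        dest, address = operands
        regs[dest] = memory[address]            # memory -> register
    elif name == "STORE":
        src, address = operands
        memory[address] = regs[src]             # register -> memory
    elif name == "JUMP":
        (target,) = operands
        return target                           # new program counter value
    else:
        raise ValueError(f"unknown instruction {name}")
    return None


regs = [0, 0, 0, 0]
memory = {200: 7, 201: 3}
execute("LOAD", (0, 200), regs, memory)
execute("LOAD", (1, 201), regs, memory)
execute("ADD", (2, 0, 1), regs, memory)
print(regs)                                    # [7, 3, 10, 0]
print(execute("JUMP", (500,), regs, memory))   # 500
```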

Advancing the program counter. After execution, the program counter must be updated so the next fetch gets the next instruction.

For most instructions (everything except a JUMP), the CPU automatically adds the size of the current instruction to the program counter. On many architectures instructions are a fixed size (e.g., 4 bytes each on 64-bit ARM). On others they vary (x86-64 instructions range from 1 to 15 bytes). Either way, after a non-jump instruction, the program counter points to the instruction that immediately follows the current one in memory. Many real CPUs perform this increment during the fetch step rather than after execute; the effect is the same.

For a JUMP instruction, the program counter is set to the jump target address during the execute step. The next fetch will come from that address, which may be far from the current location. A conditional JUMP does this only when its condition holds; otherwise the program counter advances as usual. This is how if statements and loops work at the hardware level.

Together, the PC update and the JUMP override give the CPU its only two modes: march forward by default, or leap to a target on command. Without the first, sequential code would be impossible; without the second, there would be no conditionals or loops.
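Those two modes fit in a single small function. The sketch below is illustrative only; the name next_pc and its parameters are assumptions, not part of any real CPU interface.

```python
def next_pc(pc, instruction_size, jump_target=None):
    """Return the address of the next instruction to fetch.

    Default: march forward by the size of the current instruction.
    Override: if the instruction was a jump, go to its target instead.
    """
    if jump_target is not None:
        return jump_target
    return pc + instruction_size


print(next_pc(100, 4))                   # 104  (fixed 4-byte instruction)
print(next_pc(0x1000, 3))                # 4099 (a 3-byte variable-length instruction)
print(next_pc(108, 4, jump_target=500))  # 500  (a jump overrides the default)
```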

▸Why this works

Why is it a loop and not a tree or a graph? The CPU has no built-in notion of “subroutine call” or “function return” at the loop level. Those are higher-level patterns built on top of the same JUMP instructions. Every function call is implemented by a JUMP to the function’s first instruction, with some bookkeeping to remember where to jump back. Every if statement is a conditional JUMP. Every while loop is a conditional JUMP back to the top. The loop is the one primitive; everything else is a pattern on top of it.
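To see a loop emerge from nothing but a conditional jump, here is a minimal interpreter sketch. Everything in it is an assumption made for illustration: the instructions SET, DEC, JUMP_IF_NONZERO and HALT are invented, instructions are stored already decoded as tuples, each one occupies a single address, and HALT exists only so the simulation can end.

```python
def run(program):
    """Run a toy program; return the final registers and the PC trace."""
    regs = [0] * 4
    pc = 0
    trace = []                       # every address the PC visited
    while True:
        instruction = program[pc]    # fetch
        op, *args = instruction      # decode
        trace.append(pc)
        if op == "HALT":             # hypothetical: lets the demo stop
            return regs, trace
        next_pc = pc + 1             # default: march forward
        if op == "SET":
            regs[args[0]] = args[1]
        elif op == "DEC":
            regs[args[0]] -= 1
        elif op == "JUMP_IF_NONZERO":
            if regs[args[0]] != 0:
                next_pc = args[1]    # override: leap to the target
        else:
            raise ValueError(f"unknown instruction {op}")
        pc = next_pc


# Equivalent to: r0 = 3; do { r0 -= 1 } while (r0 != 0)
program = [
    ("SET", 0, 3),               # address 0
    ("DEC", 0),                  # address 1: the loop body
    ("JUMP_IF_NONZERO", 0, 1),   # address 2: back to 1 while R0 != 0
    ("HALT",),                   # address 3
]
regs, trace = run(program)
print(trace)    # [0, 1, 2, 1, 2, 1, 2, 3]
print(regs[0])  # 0
```

The trace shows addresses 1 and 2 visited three times: the program's loop is just the program counter being sent backwards.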

Fetch → Decode → Execute ↺

The fetch–decode–execute cycle: three stages, repeated forever. After execute, the final arrow leads back to fetch.

Worked example

Tracing one full cycle.

Suppose memory contains these three instructions at addresses 100, 104, and 108 (each instruction is 4 bytes long):

Address 100: LOAD R0, 200 (read the value at memory address 200 into register R0)
Address 104: LOAD R1, 201 (read the value at memory address 201 into register R1)
Address 108: ADD R2, R0, R1 (add R0 and R1, store in R2)

Program counter starts at 100.

Cycle 1:

Fetch: Read bytes at address 100. Instruction register receives LOAD R0, 200.
Decode: Control unit recognises opcode LOAD, identifies destination R0, identifies source memory address 200.
Execute: CPU reads memory address 200, receives some value (say, 7). CPU stores 7 in R0.
PC advance: PC = 100 + 4 = 104.

Cycle 2:

Fetch: Read bytes at address 104. IR receives LOAD R1, 201.
Decode: Opcode LOAD, destination R1, source address 201.
Execute: CPU reads memory address 201, receives value (say, 3). CPU stores 3 in R1.
PC advance: PC = 104 + 4 = 108.

Cycle 3:

Fetch: Read bytes at address 108. IR receives ADD R2, R0, R1.
Decode: Opcode ADD, destination R2, sources R0 and R1.
Execute: ALU computes R0 + R1 = 7 + 3 = 10. Result stored in R2.
PC advance: PC = 108 + 4 = 112.

The CPU will now fetch whatever instruction is at address 112, and the cycle continues. R0 = 7, R1 = 3, R2 = 10 — the addition is complete.
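The trace above can be reproduced with a short simulation. The byte encoding is an assumption made for this sketch (one opcode byte followed by three operand bytes, with made-up opcode numbers); the addresses, values and register names come from the example. The loop is limited to three cycles because the example defines nothing at address 112.

```python
# Hypothetical encoding: [opcode, operand, operand, operand], 4 bytes each.
LOAD, ADD = 0x01, 0x02
INSTRUCTION_SIZE = 4

memory = bytearray(256)
memory[100:104] = bytes([LOAD, 0, 200, 0])  # LOAD R0, 200
memory[104:108] = bytes([LOAD, 1, 201, 0])  # LOAD R1, 201
memory[108:112] = bytes([ADD, 2, 0, 1])     # ADD R2, R0, R1
memory[200] = 7                             # data read by the first LOAD
memory[201] = 3                             # data read by the second LOAD

regs = [0, 0, 0]  # R0, R1, R2
pc = 100

for cycle in range(1, 4):
    ir = memory[pc:pc + INSTRUCTION_SIZE]   # fetch into the instruction register
    opcode, a, b, c = ir                    # decode: opcode plus three operand bytes
    if opcode == LOAD:                      # execute
        regs[a] = memory[b]
    elif opcode == ADD:
        regs[a] = regs[b] + regs[c]
    else:
        raise ValueError(f"unknown opcode {opcode}")
    pc += INSTRUCTION_SIZE                  # PC advance
    print(f"cycle {cycle}: PC={pc} regs={regs}")

# cycle 1: PC=104 regs=[7, 0, 0]
# cycle 2: PC=108 regs=[7, 3, 0]
# cycle 3: PC=112 regs=[7, 3, 10]
```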

▸Common mistake

A common mistake is thinking the CPU “knows” what program it is running or “understands” the overall goal. It does not. The CPU executes the current instruction, advances the program counter, and executes the next one. It has no view beyond the current cycle. The apparent intelligence of a running program comes entirely from the arrangement of the instructions in memory — the CPU itself is just a loop.

Practice

The program counter holds the memory address of the next instruction to execute. Before any instruction runs, what does the program counter hold?

During which step does the CPU read the instruction from memory?

During which step does the control unit identify the opcode?

Every instruction is 4 bytes long. After the non-jump instruction at address 200 executes, what value does the program counter hold?

A JUMP instruction targets address 500. After it executes, the program counter holds what value?

Check yourself

Quiz

What is the role of the program counter in the fetch–decode–execute cycle?

Recap

The CPU runs the fetch–decode–execute cycle forever. In the fetch step it reads the instruction at the address held in the program counter (PC). In the decode step the control unit identifies the opcode and operands. In the execute step the appropriate hardware carries out the action: the ALU for arithmetic, the memory bus for LOAD/STORE, or a write to the PC for JUMP. After a non-jump instruction the PC advances by the instruction’s byte size. After a JUMP the PC is set to the target address. The cycle then restarts. Every program ever run, including every function, every loop, and every if statement, is a sequence of instructions driven by this same loop. So when your code is stuck in an infinite loop, you know what is happening in the silicon: the fetch–decode–execute cycle is still running, but the program counter keeps revisiting the same instructions.


Apply this

Put this lesson to work on a real build.

Tiny Stack VM: Build the machine that runs the machine. You define a small bytecode instruction set, write an assembler that turns readable mnemonics into bytes, and write the interpreter loop that fetches, decodes, and executes them one at a time. By the end you will have demystified the whole stack between a line of source and a running program, because you wrote every layer of it yourself.