Base CS from zero
Collections in memory
You have a clean picture of an array of numbers: one contiguous block of equal-size cells. But real programs rarely store plain numbers. They store a list of users, a list of orders, an object whose fields are themselves objects. What does that look like in memory?
Here the array model seems to break. An object can be large and can grow; it cannot be packed flat into a 4-byte array cell. So how does an array hold a hundred objects, each of a different size, and still keep its tidy equal-size grid?
The answer is the most important idea in this unit, and it comes straight from Unit 06: the array cells do not hold the objects. They hold references — addresses that point at the objects, which live elsewhere. This lesson follows that idea to its conclusion: any nested data structure is really a graph of small cells linked by references.
After this lesson you can explain why an array or object of objects stores references rather than the objects themselves, describe what each cell of such a collection actually holds, and explain why nested data forms a graph of cells linked by references.
Recall: value cells and reference cells. From Unit 06, lesson 04, two kinds of thing can sit in a cell. A small fixed-size value — like a number — fits directly in the cell. A larger structure — an object or an array — does not fit; instead the cell holds a reference: a number that is the address of where that structure actually lives. A reference is a cell pointing at another place in memory.
So a cell never literally contains a big object. It either contains a small value outright, or it contains the address of a big thing stored elsewhere. Keep both cases in mind for the rest of this lesson.
An array of numbers: values, contiguous. When an array holds numbers, each cell holds
the number itself — a small fixed-size value. The picture from lessons 01 and 02 holds
exactly: one contiguous block of equal-size cells, each cell containing a real value, and
base + i × element_size lands on it.
This is the simplest and tightest case: the data is the contiguous block. Nothing points anywhere else.
An array of objects: references, contiguous. Now suppose the array holds objects — say a list of users. An object does not fit in one small cell. So the array cannot store the objects inside its block.
Instead, each cell of the array holds a reference — the address of one object that lives somewhere else in memory. The array block stays exactly as before: contiguous, equal-size cells, because every reference is the same fixed size (an address is just a number). What changed is what the cells contain: not values, but addresses.
So an “array of objects” is two layers. Layer one: the contiguous array block of
equal-size reference cells — arr[i] still reaches its cell in constant time. Layer two:
the actual objects, scattered around memory wherever there was room, each pointed at by one
reference in the array. Reading users[2].name is two hops: compute the address of
users[2], read the reference there, follow it to the object, then look up the name
field inside that object.
An object of objects: the same, with named cells. The same thing happens with a nested
object. If a field’s value is itself an object, the field’s slot does not hold that whole
object — it holds a reference to it. company.address is a field whose slot holds the
address of a separate address object. Whether the container is an array (positional
cells) or an object (named cells), the rule is identical: a cell holds a small value
directly, or it holds a reference to a larger thing.
Nested data is a graph of cells. Follow this all the way. An object can have a field that references another object, which has a field that references a third object, which references the first. An array can hold references to objects that hold references to arrays.
The result is not a flat block and not a simple list. It is a graph: a set of cells, where some cells hold plain values and some cells hold references that point to other cells. To “have” a big nested data structure is to hold one reference — the entry point — and from there the program walks reference to reference, cell to cell, reaching the rest. Every list of records, every tree, every nested object you will ever build is, underneath, this: small cells, some holding values, some holding addresses, wired together by references.
Why this works
Why not just store the objects inside the array block? Because objects are not all the
same size, and they can grow after they are created. An array block depends on every cell
being the same fixed size — that is what makes base + i × element_size work. If cells
had to hold whole objects of varying size, the constant-time index formula would collapse.
References solve this exactly: a reference is always the same small fixed size no matter
how large the object it points to. The array keeps its uniform grid; the variable-size
objects live elsewhere. Indirection — pointing instead of containing — is what lets a
fixed-shape container hold variable-shape contents.
Reading users[1].age in an array of objects.
users is an array of 3 user objects. The array block has base address 1000; each cell
holds a reference, and a reference is 4 bytes. The objects live elsewhere: user object 0
at address 7000, object 1 at address 7100, object 2 at address 7200. We want
users[1].age.
Step 1 — Reach the array cell. Use the index formula from lesson 02:
address = base + index × element_size = 1000 + 1 × 4 = 1004. The cell users[1] is at
address 1004.
Step 2 — Read the reference. The program reads the cell at 1004. It does not find a user object there — it finds a reference: the value 7100, the address of user object 1.
Step 3 — Follow the reference. The program goes to address 7100, where user object 1 actually lives. This is the second hop — the act of “following a reference to the thing it points at”.
Step 4 — Look up the field. Now at the object, the program uses the key age (lesson
03) to find the age field’s slot inside the object, and reads its value.
Two layers, two hops: an arithmetic hop into the contiguous array, then a reference hop out to the object. Every “array of objects” access works this way.
Common mistake
A common mistake is to imagine an array of objects as one giant contiguous block with the whole objects packed end to end inside it. That is not the layout. The array block holds only references — small, equal-size addresses. The objects themselves are separate, scattered wherever memory had room. The array is contiguous; the objects it refers to are not. Confusing “the array is contiguous” with “the objects are contiguous” leads to wrong reasoning about how nested data sits in memory.
An array holds 5 numbers. Each cell holds the number itself (a value), not a reference. Type 1 for yes, 0 for no.
An array holds 5 objects. What does each cell of the array block hold: type 0 for the whole object, type 1 for a reference (address) pointing to the object.
An array of objects has base address 1000 and each reference cell is 4 bytes. Using base + index x element_size, at what address is the reference cell for element 2?
Reading users[1].name in an array of objects takes how many hops: one arithmetic hop into the array, then one reference hop out to the object. Total hops?
A reference is always the same fixed size no matter how large the object it points to. Because of this, an array of references can keep its equal-size contiguous grid. Type 1 for yes, 0 for no.
How does an array of objects actually sit in memory?
A cell holds either a small fixed-size value directly, or a reference — the address of a larger structure stored elsewhere. An array of numbers is the simple case: each cell holds a value, and the data is the contiguous block. But an array of objects — or an object of objects — cannot pack variable-size objects into its equal-size grid. Instead each cell holds a reference, and the objects live scattered elsewhere; the container stays a contiguous, equal-size grid because every reference is the same fixed size. Reaching a value in such a collection takes two hops: one into the contiguous container, then one following the reference out to the object. Carried all the way, nested data is a graph of cells: small cells, some holding values, some holding references, wired together — and to hold a nested structure is just to hold one reference into that graph.