What a relation is: tables, rows, keys, and constraints

The core vocabulary of the relational model: relations, tuples, candidate keys, and why declarative constraints are the model's enduring win.

DB Junior ◷ 12 min

A team stores “user has many tags” as a comma-separated column. Six months later someone asks “how many users have tag X?” and every row must be parsed. A relational design answers that question with an indexed lookup. The difference is the model you start with.
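The contrast can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a schema from the lesson: the table names users_csv and user_tags, the index name, and the sample rows are all assumptions, and the join table is simplified to hold the tag text directly.

```python
import sqlite3

def count_users_with_tag(tag):
    """Return (count from the CSV column, count from the join table) for one tag."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- Shortcut design (hypothetical): all tags packed into one text column.
        CREATE TABLE users_csv (id INTEGER PRIMARY KEY, tags TEXT NOT NULL);
        INSERT INTO users_csv VALUES (1, 'sql,go'), (2, 'go'), (3, 'sql,nosql');

        -- Relational design (hypothetical): one row per (user, tag) pair.
        CREATE TABLE user_tags (
            user_id INTEGER NOT NULL,
            tag     TEXT    NOT NULL,
            PRIMARY KEY (user_id, tag)
        );
        CREATE INDEX user_tags_by_tag ON user_tags (tag);
        INSERT INTO user_tags VALUES
            (1, 'sql'), (1, 'go'), (2, 'go'), (3, 'sql'), (3, 'nosql');
    """)
    # CSV column: wrap both sides in commas so 'sql' does not match 'nosql',
    # then pattern-match the text of every row.
    csv_count = db.execute(
        "SELECT COUNT(*) FROM users_csv "
        "WHERE ',' || tags || ',' LIKE '%,' || ? || ',%'",
        (tag,),
    ).fetchone()[0]
    # Join table: a plain equality predicate that the index on tag can serve.
    join_count = db.execute(
        "SELECT COUNT(*) FROM user_tags WHERE tag = ?", (tag,)
    ).fetchone()[0]
    db.close()
    return csv_count, join_count

print(count_users_with_tag("sql"))
```

Both designs return the same count here; the difference is that the first has to parse text in every row while the second is an equality test on an indexed column.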

What a relation is

Edgar Codd’s 1970 paper defined the foundations of every SQL database since. In ten minutes you will know why the comma-separated column breaks and what to do instead. The vocabulary is small:

Relation — a set of tuples sharing the same shape. The table is the SQL implementation.
Tuple — a single row; every tuple in a relation has the same attributes.
Attribute — a column; each draws values from a domain (a type).
Candidate key — a minimal subset of attributes that uniquely identifies every tuple. A table may have several candidate keys; one is designated the primary key.

Concept	SQL term	What it means
Relation	Table	A set of rows sharing one shape
Tuple	Row	One record
Attribute	Column	One named typed value per row
Domain	Type	The set of valid values for an attribute
Candidate key	PRIMARY KEY / UNIQUE	A minimal row identifier
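As a small sketch of the vocabulary above, the following uses Python's sqlite3 module with a hypothetical users table that has two candidate keys: id, chosen as the primary key, and email, declared UNIQUE. The table, column names, and sample values are assumptions made for illustration.

```python
import sqlite3

def try_insert(db, row):
    """Attempt one insert; return 'ok' or the engine's refusal message."""
    try:
        db.execute("INSERT INTO users (id, email, name) VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,   -- candidate key chosen as the primary key
        email TEXT NOT NULL UNIQUE,  -- second candidate key, declared UNIQUE
        name  TEXT NOT NULL          -- ordinary attribute, duplicates allowed
    )
""")

results = [
    try_insert(db, (1, "ada@example.com", "Ada")),
    try_insert(db, (2, "bob@example.com", "Ada")),  # same name: accepted
    try_insert(db, (1, "cy@example.com", "Cy")),    # duplicate id: refused
    try_insert(db, (3, "ada@example.com", "Ann")),  # duplicate email: refused
]
for outcome in results:
    print(outcome)
```

Either key alone identifies a row, which is what makes both of them candidate keys; name does not, so the engine accepts two rows that share it.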

Why constraints are the model’s enduring win

When you inherit a codebase and find data inconsistencies that “shouldn’t be possible,” the cause is almost always missing constraints. The relational model does not just store data; it refuses bad data. Constraints are declarative rules the engine checks on every insert, update, or delete:

PRIMARY KEY — uniquely identifies each row; implies NOT NULL and UNIQUE.
FOREIGN KEY — a column referencing a primary or unique key in another table; the engine refuses orphan rows.
NOT NULL — this attribute must always have a value.
UNIQUE — this column or column set has no duplicates across rows.
CHECK — an arbitrary boolean expression evaluated on every write; e.g. CHECK (amount >= 0).

Without constraints you have a key-value store with SQL syntax. Application bugs can commit bad data. With constraints, the engine refuses on behalf of every caller — application code does not have to remember the rules.
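One way to watch the engine refuse bad data is the sketch below, which trips each of the five constraint kinds once. The accounts and payments tables are hypothetical, and the code assumes SQLite through Python's sqlite3 module. Two SQLite specifics to keep in mind: foreign keys are enforced only after PRAGMA foreign_keys = ON, and the PRIMARY KEY case here is a duplicate key rather than a NULL.

```python
import sqlite3

# Hypothetical schema that uses all five constraint kinds.
SCHEMA = """
CREATE TABLE accounts (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE payments (
    id         INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES accounts(id),
    amount     INTEGER NOT NULL CHECK (amount >= 0)
);
"""

def rejected(db, sql):
    """Return True if the engine refuses the statement, False if it accepts it."""
    try:
        db.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

def demo():
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    db.executescript(SCHEMA)
    db.execute("INSERT INTO accounts VALUES (1, 'ada@example.com')")
    outcome = {
        "PRIMARY KEY": rejected(db, "INSERT INTO accounts VALUES (1, 'new@example.com')"),
        "UNIQUE": rejected(db, "INSERT INTO accounts VALUES (2, 'ada@example.com')"),
        "NOT NULL": rejected(db, "INSERT INTO accounts VALUES (3, NULL)"),
        "FOREIGN KEY": rejected(db, "INSERT INTO payments VALUES (1, 999, 10)"),
        "CHECK": rejected(db, "INSERT INTO payments VALUES (2, 1, -5)"),
        "valid row": rejected(db, "INSERT INTO payments VALUES (3, 1, 10)"),
    }
    db.close()
    return outcome

for rule, refused in demo().items():
    print(rule, "refused" if refused else "accepted")
```

No application-side validation appears anywhere in the sketch; every refusal comes from the declared schema.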

The metaphor

A relational schema is a library catalogue. Each table is a drawer (books, authors, loans). Each row is a card with the same fields. A loan card has member-id and book-id pointing at other drawers (foreign keys). The librarian (the engine) refuses to file a loan card if the book-id does not exist. The system stays coherent without humans re-checking.

A practical scenario

Sven (Origin server) wants a “favourites” feature for a marketplace. Otto (Origin database) asks the shape question. Sven says “user has many favourites.” Otto reaches for the model: a users table, an items table, and a favourites table with (user_id, item_id) as the primary key plus two foreign keys. A few lines of DDL and the feature is structurally correct: no duplicates, no orphans, queryable both ways. A JSON-array shortcut breaks the moment you ask “who favourited this item.”
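Otto's design might look roughly like the sketch below, written against SQLite through Python's sqlite3 module. The name and title columns, the sample rows, and the helper function names are assumptions added for illustration; only the three tables, the composite primary key, and the two foreign keys come from the scenario.

```python
import sqlite3

def build():
    """Create the three tables and a few sample rows (all values hypothetical)."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
    db.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
        CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE favourites (
            user_id INTEGER NOT NULL REFERENCES users(id),
            item_id INTEGER NOT NULL REFERENCES items(id),
            PRIMARY KEY (user_id, item_id)      -- one favourite per (user, item)
        );
        INSERT INTO users VALUES (1, 'Sven'), (2, 'Otto');
        INSERT INTO items VALUES (10, 'Lamp'), (11, 'Desk');
        INSERT INTO favourites VALUES (1, 10), (1, 11), (2, 10);
    """)
    return db

def items_for_user(db, user_id):
    """Direction 1: which items did this user favourite?"""
    rows = db.execute(
        "SELECT i.title FROM favourites f JOIN items i ON i.id = f.item_id "
        "WHERE f.user_id = ? ORDER BY i.title", (user_id,))
    return [title for (title,) in rows]

def users_for_item(db, item_id):
    """Direction 2: who favourited this item?"""
    rows = db.execute(
        "SELECT u.name FROM favourites f JOIN users u ON u.id = f.user_id "
        "WHERE f.item_id = ? ORDER BY u.name", (item_id,))
    return [name for (name,) in rows]

def is_refused(db, user_id, item_id):
    """Return True if the engine rejects this favourite."""
    try:
        db.execute("INSERT INTO favourites VALUES (?, ?)", (user_id, item_id))
        return False
    except sqlite3.IntegrityError:
        return True

db = build()
print(items_for_user(db, 1))   # favourites of user 1
print(users_for_item(db, 10))  # users who favourited item 10
print(is_refused(db, 1, 10))   # duplicate favourite
print(is_refused(db, 1, 999))  # orphan: item 999 does not exist
```

The same favourites table serves both directions of the question, which is the property the JSON-array shortcut loses.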

Another team stores addresses as comma-separated text on the user row. Six months later marketing asks “how many users in Oregon?” and every text field must be parsed. A year later finance asks “sales by state” and hits the same problem on every order. A relational design (an addresses table with structured columns) costs a few extra schema lines and, with an index on the state column, can be orders of magnitude cheaper to query on a large table.

Pick the JSON-array shortcut and queries such as “who favourited this item” scan every row; pick the join table and, with suitable indexes, they become point lookups that can be orders of magnitude cheaper on a large table, with the engine guaranteeing no duplicates and no orphans.
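The scan-versus-lookup difference can be inspected rather than taken on trust. The sketch below asks SQLite for its query plan through Python's sqlite3 module; the table and index names are hypothetical, foreign keys are left out to keep the focus on access paths, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

def query_plan(db, sql):
    """Return the 'detail' text of SQLite's EXPLAIN QUERY PLAN rows as one string."""
    rows = db.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Shortcut design (hypothetical): favourites stored as JSON text, e.g. '[10, 11]'.
    CREATE TABLE users_json (id INTEGER PRIMARY KEY, favourites TEXT NOT NULL);

    -- Join-table design, with a second index for the item-to-users direction.
    CREATE TABLE favourites (
        user_id INTEGER NOT NULL,
        item_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, item_id)
    );
    CREATE INDEX favourites_by_item ON favourites (item_id, user_id);
""")

# "Who favourited item 10?" against each design. The LIKE pattern is also
# imprecise (it would match item 100), which is a further cost of the shortcut.
shortcut_plan = query_plan(
    db, "SELECT id FROM users_json WHERE favourites LIKE '%10%'")
join_plan = query_plan(
    db, "SELECT user_id FROM favourites WHERE item_id = 10")

print("JSON-array shortcut:", shortcut_plan)
print("join table:         ", join_plan)
```

In SQLite's plan output, SCAN means every row of the table is visited, while SEARCH means an index is used to go straight to the matching rows.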

Why this works

The cost of the relational model is the constraint check at write time and the discipline of designing the schema before you write code. Senior engineers pay that cost knowingly; new teams either pay it late (denormalize first, regret later) or skip it (treat the database as a key-value store and accumulate years of integrity debt). This lesson is about why paying it up front is almost always cheaper across the lifetime of a system.

Order the steps

Order the steps to design a schema for 'user has many addresses':

1 Identify the entities: User, Address
2 For each entity, pick a primary key (id BIGSERIAL or uuid)
3 Identify the relationship: one user has many addresses (1:N)
4 Add a user_id column to addresses with REFERENCES users(id)
5 Add NOT NULL on the FK (every address must belong to a user)
6 Add an index on addresses(user_id) so 'list addresses for user X' is fast
7 Decide ON DELETE: CASCADE or RESTRICT
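Taken in that order, the seven steps could produce a schema like the sketch below. It assumes SQLite through Python's sqlite3 module, so INTEGER PRIMARY KEY stands in for BIGSERIAL or uuid, and it picks ON DELETE CASCADE for step 7 purely for illustration. The city and state columns, the index name, and the sample rows are assumptions.

```python
import sqlite3

def open_db():
    """Build the users/addresses schema and load hypothetical sample rows."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    db.executescript("""
        -- Steps 1-2: one table per entity, each with a primary key.
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE addresses (
            id      INTEGER PRIMARY KEY,
            -- Steps 3-5: the 1:N link is a NOT NULL foreign key on the "many" side.
            -- Step 7: deleting a user also deletes that user's addresses.
            user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
            city    TEXT NOT NULL,
            state   TEXT NOT NULL
        );
        -- Step 6: index the foreign key so "addresses for user X" is fast.
        CREATE INDEX addresses_by_user ON addresses (user_id);

        INSERT INTO users VALUES (1, 'Ada'), (2, 'Bob');
        INSERT INTO addresses VALUES
            (1, 1, 'Portland', 'OR'), (2, 1, 'Salem', 'OR'), (3, 2, 'Boise', 'ID');
    """)
    return db

def cities_for_user(db, user_id):
    rows = db.execute(
        "SELECT city FROM addresses WHERE user_id = ? ORDER BY id", (user_id,))
    return [city for (city,) in rows]

def users_in_state(db, state):
    return db.execute(
        "SELECT COUNT(DISTINCT user_id) FROM addresses WHERE state = ?", (state,)
    ).fetchone()[0]

def address_count(db):
    return db.execute("SELECT COUNT(*) FROM addresses").fetchone()[0]

db = open_db()
print(cities_for_user(db, 1))
print(users_in_state(db, "OR"))
db.execute("DELETE FROM users WHERE id = 1")  # cascades to Ada's two addresses
print(address_count(db))
```

With RESTRICT instead of CASCADE, the DELETE would be refused while addresses still reference the user; which behaviour is right depends on whether addresses should outlive their owner.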

Quiz

What is a primary key in a relational table?

Quiz

What does a foreign key declaration buy you?

Complete the analogy

Fill in the blank: the database engine refuses bad inserts because of declared _______ — rules like NOT NULL, UNIQUE, FOREIGN KEY, CHECK.

The composite PK (user_id, item_id) blocks duplicates; the two foreign keys point at real parent rows, so the engine refuses any orphan favourite.

Recall before you leave

01 In two sentences, why is storing 'user has many tags' as a JSON array column usually worse than a tags table plus a user_tags join table?
02 Name the five constraint kinds a relational engine enforces and state what each one does.
03 What is the difference between a candidate key and a primary key?

Recap

The relational model defines data as sets of typed tuples sharing a fixed shape (a relation). Tables, rows, and columns are the SQL implementations of relations, tuples, and attributes. Every row is identified by a candidate key — one is the primary key. Foreign keys link tables and let the engine refuse orphan rows. The five constraint kinds (PK, FK, NOT NULL, UNIQUE, CHECK) are the declarative guarantees that move correctness from every application to the database boundary. The cost is schema-design discipline before code; the payoff is correctness every caller gets for free. Now when you see a “user has many X” requirement, you will reach for a join table with two foreign keys before you reach for a JSON column.

Apply this

Put this lesson to work on a real build.

Mini CRUD API
Build your first real backend: a tiny HTTP API that creates, reads, updates, and deletes notes, backed by SQLite so the data survives a restart. You go from a one-line 'hello' server to a small service that validates input and stores rows, one honest step at a time.