Algorithms from zero ALG · 09 · 04

Connected components: count islands and flood fill

A connected component is a maximal set of mutually reachable vertices in an undirected graph. Loop over every vertex, launch BFS/DFS from each unvisited one, increment a counter. The grid ''''number of islands'''' is the same flood-fill idea. O(V+E); grid O(rows*cols).

ALG ◷ 26 min

You learned DFS and BFS. Both start from one vertex and visit everything reachable from it. But here is the catch from the last two lessons: a single traversal only reaches one “piece” of the graph. If the graph has disconnected parts—two separate clusters of friends with no link between them—one BFS will never cross the gap. So how do you find every separate piece and count them? The trick is small and powerful: loop over all vertices, and every time you hit a vertex you have not visited yet, launch a fresh BFS or DFS and bump a counter. Each launch swallows one whole connected component. The same pattern solves the famous interview problem “number of islands”: a grid of land and water is just a graph in disguise, and each island is a connected component.

Goal

After this lesson you can define a connected component in an undirected graph (a maximal set of mutually reachable vertices). You can write the count-components algorithm: loop over every vertex, run BFS or DFS from each unvisited one, increment a counter per launch. You understand why each traversal marks exactly one whole component visited, so vertices are never double-counted. You can solve the grid variant “number of islands” / flood fill by treating a grid as an implicit graph with 4-directional neighbors. You know the complexity is O(V + E) for an explicit graph and O(rows * cols) for a grid. You can also state the directed-graph caveat: in a directed graph the right notions are weakly and strongly connected components, and strongly connected components need a different algorithm (deferred). Next lesson goes deeper into graph structure.

The idea

What is a connected component?

In an undirected graph, a connected component is a maximal set of vertices where every vertex can reach every other vertex by following edges. “Maximal” means you cannot add any more vertices to the set and keep it connected—the component already contains everything reachable.

A graph that is one single piece has exactly one component. A graph split into separate clusters has one component per cluster. An isolated vertex (no edges) is a component all by itself.

Example: a graph with three components

Component A      Component B    Component C

  0 --- 1            3              5 --- 6
  |    /             |
  2 --/              4

  {0, 1, 2}        {3, 4}         {5, 6}

Vertices 0, 1, 2 are all mutually reachable, so they form one component. 3 and 4 form a second. 5 and 6 form a third. There is no edge bridging A, B, and C, so this graph has 3 connected components.

Adjacency list (undirected, so each edge appears in both directions):

0: [1, 2]
1: [0, 2]
2: [0, 1]
3: [4]
4: [3]
5: [6]
6: [5]

The count-components algorithm

The insight: one BFS (or DFS) from a vertex visits its entire component and nothing else. So:

Create one shared visited set for the whole graph.
Set count = 0.
For each vertex v from 0 to V-1:
- If v is not visited:
  - Increment count (you found a new component).
  - Run BFS or DFS from v. This marks every vertex in v’s component visited.
Return count.

The shared visited set is the key. When you later reach a vertex that was already swallowed by an earlier traversal, you skip it—so you never launch a second traversal inside the same component.

Worked trace on the example

visited = {}, count = 0

v = 0: not visited. count = 1. BFS from 0 marks {0, 1, 2}.
v = 1: visited. skip.
v = 2: visited. skip.
v = 3: not visited. count = 2. BFS from 3 marks {3, 4}.
v = 4: visited. skip.
v = 5: not visited. count = 3. BFS from 5 marks {5, 6}.
v = 6: visited. skip.

Result: count = 3.

Each traversal launch lines up with exactly one component, so the counter equals the number of components.

The grid variant: number of islands

A grid of cells is a graph in disguise—an implicit graph. You do not build an adjacency list; the neighbors are computed on the fly. Given a grid where 1 is land and 0 is water, count the islands. An island is a group of 1 cells connected horizontally or vertically (4 directions: up, down, left, right—not diagonally).

Each land cell (r, c) has up to 4 neighbors: (r-1, c), (r+1, c), (r, c-1), (r, c+1). A neighbor counts only if it is inside the grid and is also land. This is exactly the same count-components pattern:

For each cell (r, c):
- If it is land and not visited:
  - Increment the island counter.
  - Flood fill from (r, c): BFS or DFS over connected land, marking each visited.

“Flood fill” is just BFS/DFS on the grid—the name comes from paint-bucket tools that spread color across a connected region.

In the grid above: one island is 0; the other is 3. Total: 2 islands.

Directed graphs: a quick caveat

Connected components as described apply to undirected graphs. In a directed graph, edges are one-way, so “reachable” is not symmetric: A might reach B without B reaching A. Two refined notions appear:

Weakly connected: connected if you ignore edge directions (treat every directed edge as undirected). You can find weakly connected components with the exact algorithm above, run on the undirected version.
Strongly connected: every vertex reaches every other following directions both ways. Finding strongly connected components needs a dedicated algorithm (Kosaraju’s or Tarjan’s), which we defer to a later lesson.

For this lesson, stay undirected: the simple loop-and-traverse pattern is all you need.

Three components = three maximal connected groups. No edge crosses a dashed boundary, so a BFS or DFS launched inside one cluster reaches only that cluster. The outer loop hits 0 (launch 1), then 3 (launch 2), then 5 (launch 3) — every other vertex is already visited when reached, so count lands on exactly 3.

The code

Counting connected components (BFS inside)

1 function countComponents(V, adj) {

2 const visited = new Set();

3 let count = 0;

5 // Try every vertex as a potential new starting point

6 for (let v = 0; v < V; v++) {

7 if (!visited.has(v)) {

8 count++; // found a brand-new component

9 bfs(v, adj, visited); // mark its whole component visited

10 }

11 }

13 return count;

14 }

16 function bfs(start, adj, visited) {

17 const queue = [start];

18 visited.add(start);

19 while (queue.length > 0) {

20 const u = queue.shift();

21 for (const w of adj[u]) {

22 if (!visited.has(w)) {

23 visited.add(w);

24 queue.push(w);

25 }

26 }

27 }

28 }

L2 One shared visited set for the entire graph (not per-component).
L3 Component counter starts at 0.
L6 Loop over every vertex 0..V-1, so no component is missed.
L7 If v was not reached by any earlier traversal, it opens a new component.
L8 Increment once per component, not once per vertex.
L9 This traversal marks v's entire component visited, so its other vertices are skipped later.
L19 Mark-on-enqueue: each vertex enters the queue exactly once.
L22 Examine each neighbor; enqueue only the unvisited ones.

Running it on the 3-component graph

Output

Number of islands (DFS flood fill on a grid)

1 function numIslands(grid) {

2 const rows = grid.length;

3 const cols = grid[0].length;

4 let count = 0;

6 function flood(r, c) {

7 // Stop at the grid edge or on water/visited cells

8 if (r < 0 || r >= rows || c < 0 || c >= cols) return;

9 if (grid[r][c] !== 1) return;

11 grid[r][c] = 0; // sink this cell so it is never revisited

12 flood(r - 1, c); // up

13 flood(r + 1, c); // down

14 flood(r, c - 1); // left

15 flood(r, c + 1); // right

16 }

18 for (let r = 0; r < rows; r++) {

19 for (let c = 0; c < cols; c++) {

20 if (grid[r][c] === 1) {

21 count++; // a fresh island

22 flood(r, c); // erase the whole island

23 }

24 }

25 }

27 return count;

28 }

L8 Bounds check: a neighbor off the grid is not a valid cell.
L9 Stop on water (0) or already-sunk land—this is the visited check.
L11 Marking visited by overwriting land with water; no separate set needed.
L12 Recurse into the 4 orthogonal neighbors (no diagonals).
L20 Scan every cell; each unsunk land cell opens a new island.
L21 Count once per island, then flood-fill erases the rest of it.

Running number of islands

Output

Step-by-step trace

1 adj = { 0:[1,2], 1:[0,2], 2:[0,1], 3:[4], 4:[3], 5:[6], 6:[5] };

2 visited = new Set();

3 count = 0;

4 for (v = 0; v < 7; v++) { if (!visited.has(v)) { count++; bfs(v); } }

Complexity

Explicit graph: time O(V + E)

It looks like the outer loop plus a traversal per vertex might cost more, but it does not. The outer for loop runs V times. The if (!visited.has(v)) check is O(1) each, so the loop overhead alone is O(V). The traversals, taken together, visit each vertex exactly once and examine each edge exactly once across the entire run—because the shared visited set stops any vertex from being entered twice. So all the BFS/DFS work sums to O(V + E), not O(V) traversals each costing O(V + E).

Total: O(V + E).

Example: a graph with 7 vertices and 5 undirected edges (10 directed entries in the adjacency list) runs in O(7 + 10) ≈ O(17) work, no matter how the components are split.

Grid: time O(rows * cols)

For the islands problem, let R = rows, C = cols. The double loop visits all R * C cells. Flood fill, summed over all islands, touches each cell a constant number of times (once to sink it, plus up to 4 neighbor checks). So the total is O(R * C).

Here the grid is an implicit graph with V = R * C cells and at most E ≈ 2 * R * C edges (each cell has up to 4 neighbors, each edge shared by 2 cells). So O(V + E) = O(R * C), matching the general formula.

Space: O(V) (grid: O(rows * cols))

The visited set holds up to V vertices: O(V). The BFS queue or DFS stack can hold up to V vertices in the worst case: O(V).

For the grid, if you sink cells in place (overwrite 1 with 0) you need no extra visited set—but recursion depth for DFS flood fill can reach O(R * C) in the worst case (a grid that is one long snake of land), which is the recursion stack cost. An iterative BFS flood fill caps queue size at O(R * C) too.

Why the shared visited set matters

Wrong (a fresh visited set per traversal):
  Outer loop hits v=1 after the component {0,1,2} was already counted.
  With a per-traversal visited set, v=1 looks "unvisited" again.
  -> count increments wrongly; the same component is counted 3 times.

Right (one shared visited set):
  After BFS from 0, visited = {0,1,2}.
  v=1 and v=2 are seen as visited, so they are skipped.
  -> each component is counted exactly once.

Practice 0 / 6

In an undirected graph, what is a connected component?

When counting components, why must the visited set be shared across all traversals (not reset per traversal)?

In the 'number of islands' problem, two land cells belong to the same island when they are...

What is the time complexity of counting connected components on a graph with V vertices and E edges?

Connected components (as taught here) apply to undirected graphs. For a directed graph, which notion needs a separate, more advanced algorithm?

A 4x5 grid (4 rows, 5 columns) is given for the number-of-islands problem. In terms of total cells, the time complexity is O(rows * cols). How many cells is that?

▸Why this works

Why one traversal equals one component. A BFS or DFS launched from vertex v visits exactly the set of vertices reachable from v—no more, no less. In an undirected graph, “reachable from v” is precisely v’s connected component. So a single launch covers one whole component, and the shared visited set ensures the next launch only happens when you reach a vertex in a brand-new component. That alignment is why the launch counter equals the component count.

▸Common mistake

Counting per vertex instead of per launch. A common bug is to increment the counter every time you visit a vertex, or to reset the visited set inside the loop. Increment only when you start a fresh traversal from an unvisited vertex, and keep one shared visited set for the entire run. Otherwise you count vertices, not components.

▸Edge cases

Isolated vertices and empty grids. A vertex with no edges is its own component (the traversal from it visits only itself, then stops). Count it like any other. For grids: an all-water grid has 0 islands; a single land cell is 1 island; the loop and the bounds checks handle empty rows and ragged input as long as you guard grid[0].length when the grid is non-empty.

▸More practice

BFS or DFS—your choice. The count-components pattern works identically with BFS (queue) or DFS (recursion or stack); both visit one whole component per launch. For deep, narrow shapes, recursive DFS risks a deep call stack—prefer iterative BFS/DFS or raise the stack limit. For wide shapes, the BFS queue can grow large. The component count is the same either way.

Check yourself

Quiz

Explain what a connected component is, how the count-components algorithm works, why the shared visited set is essential, how the grid 'number of islands' problem maps onto it, and the complexity.

Recap

A connected component of an undirected graph is a maximal set of mutually reachable vertices.

The algorithm: keep one shared visited set, loop over every vertex, and for each unvisited vertex increment a counter and launch BFS or DFS. Each launch marks exactly one whole component visited, so the counter equals the number of components.

Number of islands / flood fill: a grid is an implicit graph with 4-directional neighbors; flood-fill each unvisited land cell and count one island per launch—the identical pattern.

Directed caveat: this is for undirected graphs. Directed graphs have weakly connected components (run this on the undirected version) and strongly connected components (need Kosaraju’s or Tarjan’s, deferred).

Time: O(V + E) for an explicit graph, O(rows * cols) for a grid. Space: O(V).

Now when you encounter a “count islands,” “number of provinces,” or “find all connected clusters” problem, you will recognize it instantly as the outer-loop-plus-traversal pattern—and know to share one visited set across all launches.

Practice

Start at the top. Tasks go easiest → hardest: recall a fact, apply it to a case, then a senior-level stretch. Open one, attempt it, then reveal.

recallapplystretch0 of 5 done

Something unclear?

Ask a question about this lesson. Questions are anonymous and go straight to the author to make the lesson better.