Backend Architecture BE · 02 · 05

DI as a testing seam: fakes, mocks, and the boundary that matters

The whole point of injecting dependencies is the seam it creates: a place to substitute a test double. But there are two kinds of double with opposite purposes, and a common testing failure is mocking everything until the test asserts the implementation instead of the behavior.

BE Senior ◷ 16 min


A team is proud of their OrderService test suite: 100% coverage, every dependency mocked, all green. Then a refactor that changes nothing about behavior — splitting one repository method into two — turns forty tests red. The tests were not checking that orders get placed. They were checking that repo.save was called exactly once with exactly these arguments. The seam DI gave them was real; they just pointed it at the wrong thing.

The seam is the payoff

Everything in this unit — constructor injection, the composition root, abstractions instead of new — pays off here. Because OrderService receives a PaymentGateway rather than constructing a StripeClient, a test can pass in a substitute. That substitute is a test double, and the injection point is the seam: the join where production wiring is swapped for test wiring. No seam, no isolated unit test. This is why “is it testable?” and “are the dependencies injected?” are nearly the same question.
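A minimal sketch of that seam, assuming a hypothetical `PaymentGateway` interface with a single synchronous `charge` method (a real gateway would be asynchronous; the names and the return shape are illustrative, not taken from any actual codebase). The constructor is the only place the gateway enters, so a test can hand in whatever implementation it likes.

```typescript
// Hypothetical port that OrderService depends on (an assumption for illustration).
interface PaymentGateway {
  charge(amountCents: number): { ok: boolean };
}

class OrderService {
  private readonly payment: PaymentGateway;

  // The constructor parameter is the seam: the caller decides which gateway to pass.
  constructor(payment: PaymentGateway) {
    this.payment = payment;
  }

  placeOrder(amountCents: number): "placed" | "declined" {
    return this.payment.charge(amountCents).ok ? "placed" : "declined";
  }
}

// Test wiring: plain object literals stand in for the real gateway.
const approvingGateway: PaymentGateway = { charge: () => ({ ok: true }) };
const decliningGateway: PaymentGateway = { charge: () => ({ ok: false }) };

const placedResult = new OrderService(approvingGateway).placeOrder(1999);
const declinedResult = new OrderService(decliningGateway).placeOrder(1999);
```

Production wiring would pass a Stripe-backed implementation of the same interface to the same constructor; `OrderService` itself does not change.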

Two doubles, opposite purposes

The word “mock” is used loosely for every substitute, but the distinction is the whole lesson:

A stub / fake stands in for a dependency and provides state. A fake UserRepository backed by an in-memory Map behaves like the real thing: you save a user, you can read it back. Your assertions check the result — the order ended up persisted, the returned total is correct.
A mock is programmed with expectations about calls. It asserts that payment.charge(amount) was called once with this argument. Your assertions check the interaction, not the outcome.

The first verifies what the system did; the second verifies how it did it. Both are legitimate, but they fail differently, and the opening scenario is what happens when you use mocks for something a fake should have covered. When you reach for a mock, ask yourself: is the call itself the observable effect? If yes, a mock is right. If the effect is the state the system ends up in, use a fake and assert the outcome.

The same DI seam, two opposite jobs: a test built on a fake asserts the resulting state and survives refactors, while a test built on a mock asserts the calls and breaks on any change to call shape. Pick by whether the call itself is the observable effect.
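The two kinds of double can be sketched side by side. This is an illustration under assumptions: `registerAndCharge` is a hypothetical function under test, the interfaces are invented for the example, and the "mock" is a hand-rolled recording double rather than one generated by a mocking library.

```typescript
interface User { id: string; email: string; }

interface UserRepository {
  save(user: User): void;
  findById(id: string): User | undefined;
}

interface PaymentGateway {
  charge(amountCents: number): void;
}

// Fake: a working in-memory implementation that really holds state.
class InMemoryUserRepository implements UserRepository {
  private readonly users = new Map<string, User>();
  save(user: User): void { this.users.set(user.id, user); }
  findById(id: string): User | undefined { return this.users.get(id); }
}

// Hand-rolled stand-in for a mock: it records calls so the test can check them.
class RecordingPaymentGateway implements PaymentGateway {
  readonly calls: number[] = [];
  charge(amountCents: number): void { this.calls.push(amountCents); }
}

// Hypothetical code under test.
function registerAndCharge(
  repo: UserRepository,
  payment: PaymentGateway,
  user: User,
  amountCents: number,
): void {
  repo.save(user);
  payment.charge(amountCents);
}

const userRepo = new InMemoryUserRepository();
const gateway = new RecordingPaymentGateway();
registerAndCharge(userRepo, gateway, { id: "u1", email: "a@example.com" }, 500);

// State check through the fake: can the saved user be read back?
const storedUser = userRepo.findById("u1");
// Interaction check through the recording double: how was charge called?
const chargeCalls = gateway.calls;
```

The state check would keep passing if `registerAndCharge` saved the user some other way; the interaction check is tied to `charge` being called exactly once with `500`.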

Classicist vs London, and why it matters

This is the classicist vs mockist (“London school”) split. Mockists mock every collaborator and assert interactions, so each unit is tested in total isolation. Classicists use real objects or fakes for collaborators they own, and reserve mocks for awkward boundaries. The practical consequence is coupling to structure: a fully-mocked test knows the exact call shape of its dependency, so any refactor that preserves behavior but changes call shape breaks the test. That is the forty-red-tests bug. Tests that assert through state survive refactors because they only care about the observable result.

Why this works

Why do interaction tests break on refactors that change nothing? Because a mock expectation is an assertion about the implementation. expect(repo.save).toHaveBeenCalledTimes(1) encodes “the production code calls save exactly once.” Split that into two saves inside a transaction — identical behavior, identical final state — and the expectation is now false even though nothing a user could observe changed. The test was measuring the code’s internal moves, not its output. State-based tests don’t have this problem: they ask “after running, is the order persisted and the total right?”, which is invariant under any refactor that preserves behavior. Mocks are not wrong — they are the right tool for verifying an effect you cannot observe through state, like “an email was sent” — but every mock is a small bet that this particular call shape is part of the contract.
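The failure mode described above can be reproduced in a few lines. This sketch assumes a hypothetical `Order` shape and two hypothetical versions of the same operation; the transaction is omitted for brevity, and the repository counts its own `save` calls so that both styles of check can be shown without a mocking library.

```typescript
interface Order { id: string; total: number; status: "draft" | "placed"; }

// In-memory repository that also counts saves (illustrative, not a library API).
class CountingOrderRepository {
  private readonly orders = new Map<string, Order>();
  saveCalls = 0;

  save(order: Order): void {
    this.saveCalls += 1;
    this.orders.set(order.id, { ...order });
  }

  get(id: string): Order | undefined { return this.orders.get(id); }
}

// Version 1: a single save.
function placeOrderV1(repo: CountingOrderRepository, id: string, total: number): void {
  repo.save({ id, total, status: "placed" });
}

// Version 2: behavior-preserving refactor that saves twice.
function placeOrderV2(repo: CountingOrderRepository, id: string, total: number): void {
  repo.save({ id, total, status: "draft" });
  repo.save({ id, total, status: "placed" });
}

function isPlacedWithTotal(order: Order | undefined, total: number): boolean {
  return order !== undefined && order.status === "placed" && order.total === total;
}

const repoV1 = new CountingOrderRepository();
placeOrderV1(repoV1, "o1", 4200);
const repoV2 = new CountingOrderRepository();
placeOrderV2(repoV2, "o1", 4200);

// State-based check: identical verdict for both versions.
const v1OutcomeOk = isPlacedWithTotal(repoV1.get("o1"), 4200);
const v2OutcomeOk = isPlacedWithTotal(repoV2.get("o1"), 4200);

// Interaction-based check ("save was called exactly once"): only version 1 passes.
const v1SavedOnce = repoV1.saveCalls === 1;
const v2SavedOnce = repoV2.saveCalls === 1;
```

Both versions leave the order placed with the right total, so the state check cannot tell them apart. The call-count check fails for version 2 even though no observable outcome changed.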

Mock at the boundary, fake what you own

The discipline that avoids over-mocking: mock at the edges of your system, use real objects or fakes inside it. Code you own and control — domain services, your own repositories — can be wired together with real instances or in-memory fakes, so tests exercise actual collaboration. The things worth mocking are the boundaries you do not control or cannot afford in a test: the payment gateway, the email sender, the clock, the third-party HTTP call. These are exactly the dependencies where you want to assert “we called Stripe with this amount” because the call itself is the externally-visible effect. The seam is most valuable precisely at the system boundary — which is also where DI matters most.
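One way this rule might look in test wiring, as a sketch under assumptions: `OrderRepository`, `EmailSender`, and `Clock` are hypothetical interfaces, the service is simplified and synchronous, and the email double is hand-rolled. The repository is code the team owns, so it gets a fake; the email sender and the clock are boundaries, so they get a recording double and a fixed value.

```typescript
interface Order { id: string; customerEmail: string; placedAt: Date; }

interface OrderRepository {
  save(order: Order): void;
  findById(id: string): Order | undefined;
}
interface EmailSender { send(to: string, subject: string): void; }
interface Clock { now(): Date; }

// Owned code: replaced by a fake that keeps real state.
class InMemoryOrderRepository implements OrderRepository {
  private readonly byId = new Map<string, Order>();
  save(order: Order): void { this.byId.set(order.id, order); }
  findById(id: string): Order | undefined { return this.byId.get(id); }
}

class OrderService {
  private readonly repo: OrderRepository;
  private readonly email: EmailSender;
  private readonly clock: Clock;

  constructor(repo: OrderRepository, email: EmailSender, clock: Clock) {
    this.repo = repo;
    this.email = email;
    this.clock = clock;
  }

  place(id: string, customerEmail: string): void {
    this.repo.save({ id, customerEmail, placedAt: this.clock.now() });
    this.email.send(customerEmail, `Order ${id} confirmed`);
  }
}

// Boundaries: a fixed clock and an email double that records what was sent.
const fixedClock: Clock = { now: () => new Date("2024-01-01T00:00:00Z") };
const sentEmails: Array<{ to: string; subject: string }> = [];
const recordingEmail: EmailSender = {
  send: (to, subject) => { sentEmails.push({ to, subject }); },
};

const orderRepo = new InMemoryOrderRepository();
new OrderService(orderRepo, recordingEmail, fixedClock).place("o1", "a@example.com");

// Outcome asserted through state; the email asserted through the recorded call.
const savedOrder = orderRepo.findById("o1");
```

The order is checked by reading it back from the fake, while the email, which leaves no state inside the system, is checked through the recorded call.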

The seam also buys you orders of magnitude in test speed, which is why it changes how a team works. An in-memory fake repository backed by a Map resolves a save-and-read in single-digit microseconds; the same test against a real Postgres — even a local one — pays connection setup plus a network round-trip, typically 1–10ms each, plus per-test cleanup. That gap compounds: a suite of 500 unit tests on fakes finishes in well under a second and runs on every save; the same 500 against a real database is a multi-second-to-minute job you run once before pushing. DI is what lets the bulk of tests stay on fast fakes while a thin layer of integration tests exercises the real boundary — you get a millisecond inner loop and still verify the wiring. The tradeoff to stay honest about: fakes can drift from the real dependency’s behavior (an in-memory Map won’t enforce a unique constraint or surface a deadlock), so the fast fakes do not replace a small set of real-boundary tests — they let you afford to write far more of the cheap ones.
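The drift mentioned above is easy to demonstrate. The sketch below assumes a hypothetical unique constraint on `email` in the real database; both classes are illustrative. A naive Map-backed fake keyed by id accepts a duplicate email without complaint, while a stricter fake has to reimplement the rule by hand, and nothing guarantees it stays in sync with the real schema.

```typescript
interface User { id: string; email: string; }

// Naive fake: keyed by id only, so two users with the same email are both accepted.
class NaiveUserRepository {
  private readonly users = new Map<string, User>();
  save(user: User): void { this.users.set(user.id, user); }
  count(): number { return this.users.size; }
}

// Stricter fake: imitates an assumed UNIQUE(email) constraint by hand.
class UniqueEmailUserRepository {
  private readonly users = new Map<string, User>();

  save(user: User): void {
    this.users.forEach((existing) => {
      if (existing.email === user.email && existing.id !== user.id) {
        throw new Error(`duplicate email: ${user.email}`);
      }
    });
    this.users.set(user.id, user);
  }

  count(): number { return this.users.size; }
}

const first: User = { id: "u1", email: "a@example.com" };
const second: User = { id: "u2", email: "a@example.com" };

const naiveRepo = new NaiveUserRepository();
naiveRepo.save(first);
naiveRepo.save(second); // accepted, unlike a database with the assumed constraint

const strictRepo = new UniqueEmailUserRepository();
strictRepo.save(first);
let duplicateRejected = false;
try {
  strictRepo.save(second);
} catch (_err) {
  duplicateRejected = true;
}
```

A test written against the naive fake would pass code that the real database rejects, which is the gap the small set of real-boundary tests exists to cover.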

Over-mocking is a design smell

When a unit test needs ten mocks to construct the subject, the test is not the problem — the design is. A class that requires ten collaborators is doing too much, and the painful test is the messenger. The reflex of a senior engineer is to read test pain as feedback about coupling, not as a reason to reach for more mocking machinery. Hard-to-test usually means hard-to-change.

Double	Provides	You assert	Breaks on
Fake / stub	Realistic state	The result/outcome	Behavior change only
Mock	Recorded expectations	The interaction (calls)	Any call-shape change
Real object	Actual behavior	The result/outcome	Behavior change only

Quiz

A behavior-preserving refactor splits one `repo.save()` into two saves inside a transaction, and dozens of tests go red. What does this reveal about those tests?

Quiz

Which dependency is the best candidate to replace with a mock that asserts the call, rather than a fake that provides state?

Quiz

A unit test needs ten mocks just to instantiate the class under test. What is the senior reading of this pain?

The seam is the constructor. The fake asserts state (order saved); the mock asserts the call (charge invoked once with this amount). Using a mock where a fake suffices couples the test to call shape and breaks on any structural refactor.

Recall before you leave

01. What is the test seam, and how does DI create it?
02. What is the difference between a fake/stub and a mock, and how do they fail differently?
03. What is the “mock at the boundary, fake what you own” rule, and why does over-mocking signal a design problem?

Recap

The seam that dependency injection creates is the entire reason testability and injection are the same conversation: the injection point is where production wiring gives way to a test double. But “double” hides a fork. A fake or stub supplies realistic state and lets assertions check the outcome, so it only breaks when behavior truly changes; a mock records call expectations and asserts interactions, so it breaks on any refactor that alters call shape — the cause of a behavior-preserving change turning dozens of tests red. The classicist discipline keeps tests robust: mock the boundaries you do not own (payment, email, clock, external HTTP), where the call itself is the visible effect, and wire real objects or fakes for the code you control, asserting through state. And when a test needs ten mocks just to stand the subject up, the pain is the design talking — too many collaborators, too much responsibility. With the seam understood, the last lesson turns to what a real DI container does in production: resolution graphs, circular dependencies, eager startup, and when not to use one at all. Now when you see a refactor turn dozens of tests red despite no behavior change, you know the diagnosis: those tests were asserting call shapes, not outcomes — and the fix is to swap the mocks for fakes and assert through state.


