
Backend Architecture

DI as a testing seam: fakes, mocks, and the boundary that matters

Crux: The whole point of injecting dependencies is the seam it creates: a place to substitute a test double. But there are two kinds of double with opposite purposes, and the most common testing failure is mocking everything until the test asserts the implementation instead of the behavior.

A team is proud of their OrderService test suite: 100% coverage, every dependency mocked, all green. Then a refactor that changes nothing about behavior — splitting one repository method into two — turns forty tests red. The tests were not checking that orders get placed. They were checking that repo.save was called exactly once with exactly these arguments. The seam DI gave them was real; they just pointed it at the wrong thing.

The seam is the payoff

Everything in this unit — constructor injection, the composition root, abstractions instead of new — pays off here. Because OrderService receives a PaymentGateway rather than constructing a StripeClient, a test can pass in a substitute. That substitute is a test double, and the injection point is the seam: the join where production wiring is swapped for test wiring. No seam, no isolated unit test. This is why “is it testable?” and “are the dependencies injected?” are nearly the same question.
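A minimal sketch of that seam, under stated assumptions: the lesson names OrderService, PaymentGateway, and StripeClient but not their shapes, so the `charge` signature, the `placeOrder` method, and the two substitute gateways below are hypothetical and chosen only for illustration.

```typescript
// Hypothetical abstraction: the lesson does not specify PaymentGateway's methods.
interface PaymentGateway {
  charge(amountCents: number): { ok: boolean };
}

// Stand-in for the production implementation. A real one would call a payment
// API over the network, which is exactly what a unit test wants to avoid.
class StripeClient implements PaymentGateway {
  charge(_amountCents: number): { ok: boolean } {
    throw new Error("network call: not available in a unit test");
  }
}

class OrderService {
  private readonly payment: PaymentGateway;

  // The constructor parameter is the seam: the caller decides which gateway is used.
  constructor(payment: PaymentGateway) {
    this.payment = payment;
  }

  placeOrder(amountCents: number): "placed" | "declined" {
    return this.payment.charge(amountCents).ok ? "placed" : "declined";
  }
}

// Test wiring: a substitute goes in where production would pass a StripeClient.
const alwaysApproves: PaymentGateway = { charge: () => ({ ok: true }) };
const alwaysDeclines: PaymentGateway = { charge: () => ({ ok: false }) };

const placedResult = new OrderService(alwaysApproves).placeOrder(1999);
const declinedResult = new OrderService(alwaysDeclines).placeOrder(1999);
```

If OrderService instead called `new StripeClient()` internally, neither substitute could be passed in, and the only way to run `placeOrder` would be through the real network call.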

Two doubles, opposite purposes

The word “mock” is used loosely for every substitute, but the distinction is the whole lesson:

  • A stub / fake stands in for a dependency and provides state. A fake UserRepository backed by an in-memory Map behaves like the real thing: you save a user, you can read it back. Your assertions check the result — the order ended up persisted, the returned total is correct.
  • A mock is programmed with expectations about calls. It asserts that payment.charge(amount) was called once with this argument. Your assertions check the interaction, not the outcome.

The first verifies what the system did; the second verifies how it did it. Both are legitimate, but they fail differently, and the forty red tests in the opening story are what happens when you use mocks for something a fake should have covered.
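The two kinds of double can be sketched side by side. This is an illustration under assumptions: the `User` shape, the `UserRepository` methods, and `registerUser` are hypothetical, and the recording class is a hand-rolled stand-in for what a mocking library would generate.

```typescript
interface User { id: string; name: string }

// Hypothetical repository contract, assumed for this sketch.
interface UserRepository {
  save(user: User): void;
  findById(id: string): User | undefined;
}

// Fake: a working in-memory implementation. Tests assert on the resulting state.
class InMemoryUserRepository implements UserRepository {
  private readonly users = new Map<string, User>();
  save(user: User): void { this.users.set(user.id, user); }
  findById(id: string): User | undefined { return this.users.get(id); }
}

// Hand-rolled mock: it stores nothing useful and only records how it was called.
class RecordingUserRepository implements UserRepository {
  readonly saveCalls: User[] = [];
  save(user: User): void { this.saveCalls.push(user); }
  findById(_id: string): User | undefined { return undefined; }
}

// Hypothetical code under test.
function registerUser(repo: UserRepository, id: string, name: string): void {
  repo.save({ id, name });
}

// State-based check: what ended up stored?
const fakeRepo = new InMemoryUserRepository();
registerUser(fakeRepo, "u1", "Ada");
const storedName = fakeRepo.findById("u1")?.name;

// Interaction-based check: how was the dependency called?
const mockRepo = new RecordingUserRepository();
registerUser(mockRepo, "u1", "Ada");
const saveCallCount = mockRepo.saveCalls.length;
```

The fake lets the test read the user back, so the assertion is about the outcome. The recording double can only answer questions about calls, so any assertion written against it is about the interaction.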

Classicist vs London, and why it matters

This is the classicist vs mockist (“London school”) split. Mockists mock every collaborator and assert interactions, so each unit is tested in total isolation. Classicists use real objects or fakes for collaborators they own, and reserve mocks for awkward boundaries. The practical consequence is coupling to structure: a fully-mocked test knows the exact call shape of its dependency, so any refactor that preserves behavior but changes call shape breaks the test. That is the forty-red-tests bug. Tests that assert through state survive refactors because they only care about the observable result.

Why this works

Why do interaction tests break on refactors that change nothing? Because a mock expectation is an assertion about the implementation. expect(repo.save).toHaveBeenCalledTimes(1) encodes “the production code calls save exactly once.” Split that into two saves inside a transaction — identical behavior, identical final state — and the expectation is now false even though nothing a user could observe changed. The test was measuring the code’s internal moves, not its output. State-based tests don’t have this problem: they ask “after running, is the order persisted and the total right?”, which is invariant under any refactor that preserves behavior. Mocks are not wrong — they are the right tool for verifying an effect you cannot observe through state, like “an email was sent” — but every mock is a small bet that this particular call shape is part of the contract.
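The refactor described above can be reproduced in a few lines. This is a sketch under assumptions: the `Order` shape and both `placeOrder` versions are hypothetical, the transaction is omitted, and one repository class plays both roles (it keeps state like a fake and counts calls like a mock) so the two styles of assertion can be compared on the same run.

```typescript
interface Order { id: string; total: number; status: "draft" | "placed" }

// Keeps state like a fake and counts calls like a mock, for comparison only.
class CountingOrderRepository {
  saveCalls = 0;
  private readonly rows = new Map<string, Order>();

  save(order: Order): void {
    this.saveCalls += 1;
    this.rows.set(order.id, { ...order });
  }

  find(id: string): Order | undefined { return this.rows.get(id); }
}

// Before the refactor: a single save.
function placeOrderV1(repo: CountingOrderRepository, id: string, total: number): void {
  repo.save({ id, total, status: "placed" });
}

// After the refactor: two saves that leave the same final state.
function placeOrderV2(repo: CountingOrderRepository, id: string, total: number): void {
  repo.save({ id, total, status: "draft" });
  repo.save({ id, total, status: "placed" });
}

const repoV1 = new CountingOrderRepository();
placeOrderV1(repoV1, "o1", 4200);

const repoV2 = new CountingOrderRepository();
placeOrderV2(repoV2, "o1", 4200);

// State assertion: the persisted order is identical for both versions.
const sameOutcome =
  JSON.stringify(repoV1.find("o1")) === JSON.stringify(repoV2.find("o1"));

// Interaction assertion "save was called exactly once": holds only before the refactor.
const v1CalledOnce = repoV1.saveCalls === 1;
const v2CalledOnce = repoV2.saveCalls === 1;
```

Here `sameOutcome` and `v1CalledOnce` are true while `v2CalledOnce` is false: a test written against the call count goes red after the refactor, and a test written against the stored order does not.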

Mock at the boundary, fake what you own

The discipline that avoids over-mocking: mock at the edges of your system, use real objects or fakes inside it. Code you own and control — domain services, your own repositories — can be wired together with real instances or in-memory fakes, so tests exercise actual collaboration. The things worth mocking are the boundaries you do not control or cannot afford in a test: the payment gateway, the email sender, the clock, the third-party HTTP call. These are exactly the dependencies where you want to assert “we called Stripe with this amount” because the call itself is the externally-visible effect. The seam is most valuable precisely at the system boundary — which is also where DI matters most.
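One way this rule might look in a test setup, as a sketch only: the `PaymentGateway` and `Clock` shapes, the `Order` fields, and this version of `OrderService` are assumptions made for illustration. The gateway and clock are replaced with doubles because they are boundaries; the repository is an in-memory fake because it is code the team owns.

```typescript
// Boundaries the system does not control (hypothetical shapes).
interface PaymentGateway { charge(amountCents: number): void }
interface Clock { now(): Date }

interface Order { id: string; amountCents: number; placedAt: Date }

// Owned code: an in-memory fake that behaves like a real repository.
class InMemoryOrderRepository {
  private readonly rows = new Map<string, Order>();
  save(order: Order): void { this.rows.set(order.id, order); }
  find(id: string): Order | undefined { return this.rows.get(id); }
}

class OrderService {
  private readonly repo: InMemoryOrderRepository;
  private readonly payment: PaymentGateway;
  private readonly clock: Clock;

  constructor(repo: InMemoryOrderRepository, payment: PaymentGateway, clock: Clock) {
    this.repo = repo;
    this.payment = payment;
    this.clock = clock;
  }

  placeOrder(id: string, amountCents: number): void {
    this.payment.charge(amountCents);
    this.repo.save({ id, amountCents, placedAt: this.clock.now() });
  }
}

// Boundary doubles: the gateway records its calls, the clock is pinned.
const charges: number[] = [];
const recordingGateway: PaymentGateway = {
  charge: (amountCents) => { charges.push(amountCents); },
};
const fixedClock: Clock = { now: () => new Date("2024-01-01T00:00:00Z") };

const orderRepo = new InMemoryOrderRepository();
new OrderService(orderRepo, recordingGateway, fixedClock).placeOrder("o1", 2500);

// Interaction check at the boundary, state check inside the system.
const savedOrder = orderRepo.find("o1");
```

The charge is asserted as a call because the call is the externally visible effect; the order is asserted through what the repository returns, so the test does not care how many times `save` ran.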

Over-mocking is a design smell

When a unit test needs ten mocks to construct the subject, the test is not the problem — the design is. A class that requires ten collaborators is doing too much, and the painful test is the messenger. The reflex of a senior engineer is to read test pain as feedback about coupling, not as a reason to reach for more mocking machinery. Hard-to-test usually means hard-to-change.

Double       | Provides              | You assert              | Breaks on
Fake / stub  | Realistic state       | The result/outcome      | Behavior change only
Mock         | Recorded expectations | The interaction (calls) | Any call-shape change
Real object  | Actual behavior       | The result/outcome      | Behavior change only

Quiz

A behavior-preserving refactor splits one `repo.save()` into two saves inside a transaction, and dozens of tests go red. What does this reveal about those tests?

Quiz

Which dependency is the best candidate to replace with a mock that asserts the call, rather than a fake that provides state?

Quiz

A unit test needs ten mocks just to instantiate the class under test. What is the senior reading of this pain?

Recall before you leave
  1. What is the test seam, and how does DI create it?
  2. What is the difference between a fake/stub and a mock, and how do they fail differently?
  3. What is the 'mock at the boundary, fake what you own' rule and why does over-mocking signal a design problem?

Recap

The seam that dependency injection creates is the entire reason testability and injection are the same conversation: the injection point is where production wiring gives way to a test double. But “double” hides a fork. A fake or stub supplies realistic state and lets assertions check the outcome, so it only breaks when behavior truly changes; a mock records call expectations and asserts interactions, so it breaks on any refactor that alters call shape — the cause of a behavior-preserving change turning dozens of tests red. The classicist discipline keeps tests robust: mock the boundaries you do not own (payment, email, clock, external HTTP), where the call itself is the visible effect, and wire real objects or fakes for the code you control, asserting through state. And when a test needs ten mocks just to stand the subject up, the pain is the design talking — too many collaborators, too much responsibility. With the seam understood, the last lesson turns to what a real DI container does in production: resolution graphs, circular dependencies, eager startup, and when not to use one at all.

