
Backend Architecture

Circuit breakers: code and config reading

Crux: Read real breaker config and call-site code, predict the state transitions and trip behaviour, and pick the fix a senior engineer would make first.

A breaker’s behaviour is decided in its config and at its call site. Read each snippet, predict what the breaker actually does under load, and choose the change a senior engineer would make first.

Goal

Practise the loop you run on every breaker review: read the config and the call wrapping, predict the state transitions and trip behaviour, and spot the misconfiguration that makes the breaker fire wrong — or not at all.

Snippet 1 — the state machine

// resilience4j: walk a single breaker through one incident
CircuitBreaker cb = registry.circuitBreaker("payments");

// 1. closed: calls pass, failures counted
cb.executeSupplier(() -> paymentClient.charge(req));   // ok
cb.executeSupplier(() -> paymentClient.charge(req));   // throws -> counted

// ... failure rate crosses failureRateThreshold ...
// breaker transitions: CLOSED -> OPEN, starts waitDurationInOpenState

cb.executeSupplier(() -> paymentClient.charge(req));   // <- what happens here?
Quiz

The breaker has just transitioned CLOSED to OPEN and the cooldown timer is running. The next executeSupplier call arrives. What happens, and what is the dependency's load right now?
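
One way to check the prediction is a deliberately tiny model of the OPEN state. This is a sketch in plain Java, not resilience4j itself: `TinyBreaker` and `RejectedException` are hypothetical names, and the model has no failure counting, cooldown, or HALF_OPEN. In resilience4j the rejection surfaces as `CallNotPermittedException`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class OpenStateModel {
    enum State { CLOSED, OPEN }

    // Hypothetical stand-in for the library's rejection exception.
    static class RejectedException extends RuntimeException {
        RejectedException() { super("breaker is OPEN, call not attempted"); }
    }

    // Deliberately tiny: no failure counting, no cooldown, no HALF_OPEN.
    static class TinyBreaker {
        State state = State.CLOSED;

        <T> T execute(Supplier<T> call) {
            if (state == State.OPEN) {
                throw new RejectedException(); // fail fast: the supplier is never invoked
            }
            return call.get();
        }
    }

    public static void main(String[] args) {
        AtomicInteger dependencyCalls = new AtomicInteger();
        Supplier<String> charge = () -> {
            dependencyCalls.incrementAndGet(); // counts calls that reach the dependency
            return "charged";
        };

        TinyBreaker cb = new TinyBreaker();
        cb.execute(charge);       // CLOSED: passes through to the dependency
        cb.state = State.OPEN;    // pretend the failure rate just crossed the threshold

        try {
            cb.execute(charge);   // OPEN: rejected before the supplier runs
        } catch (RejectedException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("calls that reached the dependency: " + dependencyCalls.get());
    }
}
```

The point of the model is the ordering inside `execute`: the state check comes before the supplier, so a rejected call costs the caller an exception and costs the dependency nothing. Call sites therefore need a fallback or an error path for the rejection.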

Snippet 2 — the threshold config

CircuitBreakerConfig.custom()
    .slidingWindowType(SlidingWindowType.COUNT_BASED)
    .slidingWindowSize(100)
    .failureRateThreshold(50)        // trip at 50% failures
    .minimumNumberOfCalls(100)       // ...but only after 100 calls
    .build();
// endpoint: /admin/report  ->  ~6 requests per minute
Quiz

This config is applied to a low-traffic admin endpoint serving roughly 6 requests/minute. The dependency behind it goes hard-down. How does the breaker behave, and why?
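
The arithmetic behind this snippet is worth doing explicitly. The sketch below is plain Java with hypothetical names (`TripLatency`, `timeToRecord`) and assumes a perfectly steady 6 requests per minute, every one of which fails once the dependency is down. It covers two cases: a fresh breaker that must first reach the floor, and a breaker whose window is already full of successes, where failures must displace enough of them to reach the threshold.

```java
import java.time.Duration;

public class TripLatency {
    // Time for `calls` calls to be recorded at a steady request rate.
    static Duration timeToRecord(int calls, double requestsPerMinute) {
        return Duration.ofSeconds(Math.round(calls * 60 / requestsPerMinute));
    }

    public static void main(String[] args) {
        double rpm = 6.0;               // from Snippet 2: ~6 requests per minute
        int minimumNumberOfCalls = 100; // from Snippet 2
        int slidingWindowSize = 100;    // from Snippet 2
        int failureRateThreshold = 50;  // percent, from Snippet 2

        // Fresh breaker: the failure rate is not evaluated until the floor is met.
        Duration floor = timeToRecord(minimumNumberOfCalls, rpm);

        // Window already full of successes: failures must fill half of it.
        int failuresNeeded = slidingWindowSize * failureRateThreshold / 100;
        Duration warm = timeToRecord(failuresNeeded, rpm);

        System.out.println("fresh breaker, floor met after   " + floor.getSeconds() + " s");
        System.out.println("full window, threshold hit after " + warm.getSeconds() + " s");
    }
}
```

Under these assumptions the fresh breaker waits 1000 s (about 17 minutes) and the warm one 500 s (about 8 minutes) before it can trip, and every caller hits the dead dependency in the meantime.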

Snippet 3 — the half-open probe

CircuitBreakerConfig.custom()
    .waitDurationInOpenState(Duration.ofSeconds(10))
    .permittedNumberOfCallsInHalfOpenState(10)
    // automaticTransitionFromOpenToHalfOpenEnabled = false (default)
    .build();
// dependency recovered at second 4; breaker tripped at second 0
// no calls arrive between second 0 and second 30
Quiz

The dependency recovered at second 4 and the cooldown is 10 s, but no calls arrive between second 0 and second 30. When does this breaker actually leave OPEN, probe the dependency, and start admitting traffic again?
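
The timeline can be sketched as a small model. This is plain Java rather than resilience4j, the names (`HalfOpenTimeline`, `halfOpenAt`, `firstProbeAt`) are hypothetical, and it assumes the first call after the quiet period arrives at second 30. It separates two events that are easy to conflate: when the state leaves OPEN, and when a probe call first reaches the dependency.

```java
public class HalfOpenTimeline {
    // Second at which the breaker leaves OPEN for HALF_OPEN in this simplified model.
    // automaticTransition=false: the first call after the cooldown triggers the transition.
    // automaticTransition=true: the timer alone triggers it.
    static long halfOpenAt(long trippedAt, long waitSeconds,
                           long firstCallAfterCooldown, boolean automaticTransition) {
        long cooldownEnds = trippedAt + waitSeconds;
        if (firstCallAfterCooldown < cooldownEnds) {
            throw new IllegalArgumentException("call must arrive after the cooldown ends");
        }
        return automaticTransition ? cooldownEnds : firstCallAfterCooldown;
    }

    // Second at which a probe first reaches the dependency: a probe always needs a caller.
    static long firstProbeAt(long firstCallAfterCooldown) {
        return firstCallAfterCooldown;
    }

    public static void main(String[] args) {
        // Values from Snippet 3: tripped at 0 s, 10 s cooldown; first call assumed at 30 s.
        System.out.println("manual transition:    HALF_OPEN at " + halfOpenAt(0, 10, 30, false) + " s");
        System.out.println("automatic transition: HALF_OPEN at " + halfOpenAt(0, 10, 30, true) + " s");
        System.out.println("first probe reaches the dependency at " + firstProbeAt(30) + " s");
    }
}
```

In this model, enabling the automatic transition changes what the state and metrics report during the idle period, but not when the dependency is first probed. The breaker also needs `permittedNumberOfCallsInHalfOpenState` probe calls (10 in the snippet) before it decides whether to close, which takes a while on a quiet endpoint.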

Snippet 4 — the timeout interplay

// breaker counts failures, but the call has no time budget
Supplier<Resp> guarded = CircuitBreaker
    .decorateSupplier(cb, () -> recsClient.fetch(req));  // can hang 30s

// during the incident recsClient.fetch never errors -- it just hangs
Resp r = guarded.get();
Quiz

During the incident recsClient.fetch stops erroring and instead hangs for 30 s per call. The breaker is wired but never trips. What is the single highest-leverage fix?
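
A time budget is what turns a hang into something a breaker can count. The sketch below shows the idea with the JDK only (Java 9+ for `orTimeout`). `TimeBudget`, `callWithBudget`, and `hangingFetch` are hypothetical names, and the 2 s sleep stands in for the 30 s hang. In a real service the same effect usually comes from a client-level request timeout or from resilience4j's `TimeLimiter`, with the breaker wrapped around the timed call.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class TimeBudget {
    // Runs `call` asynchronously and waits at most `budget` for the result.
    // On expiry, join() throws CompletionException whose cause is TimeoutException.
    static <T> T callWithBudget(Supplier<T> call, Duration budget) {
        return CompletableFuture.supplyAsync(call)
                .orTimeout(budget.toMillis(), TimeUnit.MILLISECONDS)
                .join();
    }

    // Hypothetical stand-in for recsClient.fetch during the incident: it hangs, never errors.
    static String hangingFetch() {
        try {
            Thread.sleep(2_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "late response";
    }

    public static void main(String[] args) {
        try {
            callWithBudget(TimeBudget::hangingFetch, Duration.ofMillis(200));
        } catch (CompletionException e) {
            // This exception is what a surrounding breaker would record as a failure.
            boolean timedOut = e.getCause() instanceof TimeoutException;
            System.out.println("hang surfaced as a failure, timed out: " + timedOut);
        }
    }
}
```

Note that `orTimeout` releases the caller but does not interrupt the work still running on the pool thread, so a timeout on the underlying client is still needed to free the connection itself.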

Recap

Breaker behaviour is read in the config and at the call site. An OPEN breaker short-circuits every call immediately, so callers behind it send the dependency no load. A minimum-volume floor sized for high traffic silently blocks tripping on a low-traffic endpoint; use a smaller floor or a time-based window. With automatic transition disabled, the breaker probes only when a call arrives after the cooldown, never on the timer alone, so an idle breaker stays open until traffic returns. A breaker with no timeout is blind to a hang, because only a time budget converts a hang into a countable failure. Diagnose from the config and the wrapping, then fix the one setting that makes the breaker fire correctly.

