awesome-everything RU
↑ Back to the climb

Engineering Practice

Flag debt and rollout discipline

Crux Trunk-based trades branch debt for flag debt: a release flag left on after rollout is a long-lived branch hiding in an if, and N stale flags make 2^N untested configs. The discipline that makes it pay off is cleanup — delete the branch on merge, the flag on rollout.
Your altitude — climbing toward senior
ZeroJuniorMiddleSenior
You are at senior altitude — in orbit
◷ 17 min

A team that adopted trunk-based two years ago opens their flag dashboard: 400 flags, and nobody can say which still matter. A “new checkout” flag has been at 100% for a year — but the old code path is still there behind the else, untested, occasionally hit by a config typo. An incident traces to two flags whose combination nobody ever tried. They escaped merge hell and walked straight into a different prison: the flags they never deleted. Trunk-based didn’t remove the debt. It changed its shape.

The debt didn’t vanish — it changed shape

The previous lessons made flags the hero: they decouple deploy from release and make daily integration possible. The senior caveat is that every flag is also a branch in your runtime. A long-lived git branch splits your codebase into divergent histories; a long-lived flag splits your running system into divergent execution paths. You moved the fork from version control into an if statement, which is harder to see and harder to delete, because it’s live in production and someone might be depending on it.

So a release flag that outlives its feature is not neutral leftover — it is a long-lived branch in disguise, with the same drift problem reappearing as untested code paths. The old path behind the else rots: nobody runs it, nobody updates it, but it’s still reachable. Trunk-based only delivers its benefit if you remove flags as aggressively as you remove branches: delete the branch on merge, delete the flag on full rollout. Cleanup isn’t hygiene you do when you have time; it’s the other half of the trade that makes the whole practice net-positive.

Why stale flags explode combinatorially

The cost of flag debt isn’t linear in the number of flags — it’s exponential. Each independent boolean flag doubles the number of possible runtime configurations of your system. With 3 flags you have 2³ = 8 configurations; with 10 flags, 1,024; with 20, over a million. You test a handful of those paths and ship; the rest are live but unverified. Most incidents from flag debt are interaction bugs: flag A is fine, flag B is fine, but the state where both are on (or one on and one half-ramped) was never exercised and does something wrong. This is the same 2^N untested-configuration problem that stale branches would cause if you kept them all alive — it just hides inside conditionals instead of in git branch -a.

This is why the lifetime taxonomy from lesson 3 is load-bearing. Release flags must be short-lived precisely because they’re the ones that silently become permanent. The mitigation is lifecycle discipline: give each release flag an owner and an expiry, track flag age the way you track branch age, fail the build or alert when a release flag outlives its expected life, and make “remove the flag and the dead path” a required closing step of every rollout — not an optional follow-up ticket that never gets prioritized.

Active release flagsPossible runtime configs (2^N)Realistically testedInteraction-bug surface
38A fewSmall, manageable
101,024Still a fewMost configs never run
20> 1,000,000A vanishing fractionInteraction bugs are inevitable

Progressive delivery, and an honest read of the data

Done right, the flag lifecycle is also a delivery technique: progressive delivery is the discipline of releasing through controlled, measured stages — internal → 1% → canary cohort → 100% — with automated guardrails that halt or roll back the ramp if error rate, latency, or business metrics regress. The flag is the actuator; the metrics are the controller. This is the operational maturity trunk-based is building toward: release as a closed feedback loop, not a one-way push.

It’s also worth being precise about what the famous numbers prove. DORA’s research consistently ranks trunk-based development among the strongest correlates of elite delivery — the 2024 report (39,000+ respondents) found only ~19% of teams reach elite, and elite performers deploy on the order of 182× more frequently with 127× faster lead times and far lower change-failure rates, and are markedly more likely to practice trunk-based. But correlation runs through the prerequisites: trunk-based only delivers when the fast green gate, the flags, and the cleanup discipline are all present. Adopt the branching model alone and you get the failure modes from these five lessons — merge-free drift if branches linger, a red trunk if the gate is missing, or a combinatorial flag swamp if cleanup is skipped. The lever isn’t the branch model; it’s the whole disciplined system around it.

Why this works

The clean symmetry of this unit: a long-lived branch and a stale flag are the same bug in two locations. Both are forks that should have been temporary, both accumulate untested divergence, and both impose a 2^N cost if you let them pile up — branches in your history, flags in your runtime. Trunk-based development is, at root, a single discipline applied to both: keep forks short-lived and delete them the moment they’ve served their purpose. Get that discipline and the DORA numbers follow; skip it and you’ve just relocated the mess.

Pick the best fit

A feature has been at 100% rollout and stable for two weeks. The release flag is still in the code. What should happen?

Quiz

Why is a release flag left on after full rollout described as 'a long-lived branch hiding in an if'?

Quiz

Why is flag debt described as exponential rather than linear in the number of flags?

Order the steps

Order a disciplined release flag's full lifecycle:

  1. 1 Create the release flag with an owner and an expiry date
  2. 2 Ship the work to trunk dark behind the off flag, integrating daily
  3. 3 Progressively ramp 1% → 100% with metric guardrails that can auto-halt
  4. 4 Hold at stable 100% briefly to confirm no regression
  5. 5 Delete the flag and the dead old path as the closing step of the rollout
Recall before you leave
  1. 01
    Explain the claim that trunk-based 'trades branch debt for flag debt' and why that trade is only worth it with cleanup discipline.
  2. 02
    What does DORA's data actually prove about trunk-based, and what's the honest causal story?
Recap

Trunk-based development doesn’t eliminate debt — it changes its shape from branch debt to flag debt, because every flag is a branch in your runtime: a release flag left on after full rollout is a long-lived branch hiding in an if, with the old path rotting untested but still reachable. The cost is exponential, since each independent boolean flag doubles the runtime configurations — 20 flags is over a million states, of which you test a handful — so interaction bugs (A fine, B fine, both-on never tried) become inevitable, the same 2^N untested-configuration problem stale branches would cause, relocated into conditionals. The discipline that makes the whole practice net-positive is cleanup run as the closing step of the rollout: give release flags owners and expiries, track flag age like branch age, and delete the flag and dead path the moment a feature is stable at 100% — ideally as the end of a progressive-delivery ramp where metrics guard each step and can auto-halt. And the DORA numbers — ~19% elite, 182× deployment frequency, 127× faster lead times — correlate with trunk-based only when the fast green gate, the flags, and the cleanup discipline are all present; the lever was never the branching model alone, but the disciplined system around it that keeps every fork, in history or in runtime, short-lived.

Connected lessons
Continue the climb ↑Trunk-based: multiple-choice review
shortcuts expand
search
K
prev piece
k
next piece
j
cycle tier
t
this menu
?
sources4
expand
  1. 01
  2. 02
  3. 03
  4. 04

Trademarks belong to their respective owners. Editorial reference only.