Deployment & Infra DEP · 02 · 01

Compose vs Kubernetes: choosing the right orchestration weight

Compose orchestrates containers on one host with a tiny YAML; Kubernetes is a distributed control plane that self-heals, schedules across nodes, and rolls out with health gating — at a real operational cost. The mistake is buying that cost before you have the scale to justify it.

DEP Junior ◷ 16 min

A team of three ships an API and reaches for a managed Kubernetes cluster on day one, “to be ready to scale.” Six months later they have one node, eleven services nobody can keep straight, and a YAML directory bigger than the app. A bad node drain takes the whole product down for forty minutes because nobody understood pod disruption budgets. They never had a scaling problem — they imported a future one, and paid the on-call tax with no traffic to justify it.

By the end of this lesson you’ll know exactly which signals justify the switch — and which ones are just anxiety dressed up as architecture.

Two tools, two jobs

Docker Compose and Kubernetes both “run containers,” and that shared verb hides how different they are. Compose is a single-host process manager: one docker-compose.yml declares your services, networks, and volumes, and docker compose up starts them all on this machine. It is fast to learn, fast to start, and trivial to debug, because the whole stack is one file plus docker compose logs. Its overhead is tiny, on the order of 50MB, because Compose is only a client for the Docker Engine already on the host; there is no cluster to run.

Kubernetes is a different animal: a distributed control plane that runs across machines. You don’t start containers; you declare a desired state (3 replicas of this image, behind this service, with these health checks) and the control plane works continuously to make reality match. That continuous matching is the whole product, and it costs you a real control plane: roughly 2GB of resident overhead before your app runs a single container.

The reconciliation loop is what you’re paying for

When you look at a Kubernetes bill and wonder what you’re actually buying, the answer is this one mechanism.

The single mechanism that separates Kubernetes from Compose is the reconciliation loop. Controllers watch the cluster, compare actual state against the desired state stored in etcd, and close the gap: continuously, event-driven, forever. A ReplicaSet controller’s only job is “keep N pods running”: kill a pod and it notices and creates a replacement. Lose a whole node and, once the control plane marks the node unhealthy and evicts its pods, the controller creates replacements and the scheduler places them on healthy nodes. That is self-healing, and Compose simply does not have it. Compose’s restart: always restarts a container on the same host; if the host dies, everything on it dies with it.
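The loop described above can be sketched as a toy simulation. This is not Kubernetes code: Cluster, reconcile, and the least-loaded placement rule are hypothetical names and simplifications chosen for illustration, and a real controller reacts to watch events rather than being called by hand.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Set


@dataclass
class Cluster:
    """Toy cluster: maps each healthy node name to the pods running on it."""
    nodes: Dict[str, Set[str]] = field(default_factory=dict)

    def running_pods(self) -> int:
        return sum(len(pods) for pods in self.nodes.values())


_pod_ids = count(1)  # hypothetical pod-name generator


def reconcile(cluster: Cluster, desired_replicas: int) -> List[str]:
    """One pass of the loop: compare actual against desired, act on the gap."""
    actions: List[str] = []
    while cluster.running_pods() < desired_replicas:
        if not cluster.nodes:
            actions.append("no healthy nodes: cannot schedule")
            break
        # Toy scheduler: place the new pod on the least-loaded healthy node.
        target = min(cluster.nodes, key=lambda n: len(cluster.nodes[n]))
        pod = f"pod-{next(_pod_ids)}"
        cluster.nodes[target].add(pod)
        actions.append(f"started {pod} on {target}")
    return actions


cluster = Cluster(nodes={"node-a": set(), "node-b": set()})
reconcile(cluster, desired_replicas=3)   # initial convergence to 3 pods
del cluster.nodes["node-a"]              # a whole node is lost
actions = reconcile(cluster, desired_replicas=3)
print(actions)
print(cluster.running_pods())
```

After the node is removed, the second reconcile call recreates the missing pods on the surviving node. Run the same sketch with a single node and delete it, and reconcile can only report that nothing is schedulable, which is the position a lone Compose host is in when it fails.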

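The same loop powers rolling updates with health gating. A Kubernetes Deployment brings up new pods, waits for their readiness probe to pass before the Service routes traffic to them, and only then tears down the old ones. With a meaningful readiness probe, a bad release never receives traffic: the rollout stalls while the old pods keep serving, which is what makes zero-downtime rollout possible. Rollback is one command (kubectl rollout undo), but it is not automatic; a Deployment on its own never reverts a failed rollout, and automating that takes extra tooling. Compose restarts a service in place; for the window between stop and healthy-start, that service is down. Compose has no health gate on deploys and no rollout history to roll back to.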
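The health gate can be sketched as a toy simulation. The function name rolling_update, the pod naming scheme, and the one-at-a-time replacement order are assumptions made for illustration; a real Deployment controls the pace through its rolling-update settings. In this sketch a failed gate simply stops the rollout and leaves the remaining old pods serving.

```python
from typing import Callable, List, Tuple


def rolling_update(
    old_pods: List[str],
    new_version: str,
    is_ready: Callable[[str], bool],
) -> Tuple[List[str], bool]:
    """Replace pods one at a time; an old pod is removed only after its
    replacement reports ready. Returns (serving_pods, completed)."""
    serving = list(old_pods)
    for i, old in enumerate(old_pods):
        candidate = f"{new_version}-{i}"
        if not is_ready(candidate):
            # Gate failed: stop here. The pods already serving keep serving.
            return serving, False
        serving.append(candidate)  # new pod joins before the old one leaves
        serving.remove(old)
    return serving, True


old = ["v1-0", "v1-1", "v1-2"]

good_pods, good_done = rolling_update(old, "v2", lambda pod: True)
bad_pods, bad_done = rolling_update(old, "v2", lambda pod: False)

print(good_pods, good_done)
print(bad_pods, bad_done)
```

The healthy release ends with only v2 pods serving. The broken release never passes the gate, so the three v1 pods are still the ones serving when the rollout stops.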

Capability	Docker Compose	Kubernetes
Scheduling scope	Single host only	Many nodes, the scheduler places pods
Node failure	Everything on that host dies	Self-heals — pods rescheduled elsewhere
Deploys	Restart in place; brief downtime	Rolling update gated on readiness probe
Autoscaling	None (manual only: bigger host or same-host replicas)	Horizontal autoscaling on metrics
Overhead	~50MB, no control plane	~2GB control plane before your app
Learning curve	Minutes; one YAML file	Weeks; pods, services, CNI, RBAC, YAML sprawl

The complexity tax, and who actually pays it

Before you add Kubernetes to a new project, ask yourself: what does my current traffic actually demand? The answer usually reframes the decision entirely.

Everything Kubernetes gives you arrives bundled with a tax: a control plane to keep alive, a cluster network (CNI) to understand, RBAC, ingress, secrets management, and the famous YAML sprawl. Surveys consistently put operational complexity as the top Kubernetes pain point — around 70% of users name it — and clusters routinely sit at 30–50% utilization, so a large share of what you provision is waste. For a small team the dollar figure is brutal: a credible production setup runs into the low five figures per month, and the real cost is the engineer-hours that go to keeping the cluster healthy instead of shipping features.
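The utilization figures above translate directly into idle spend. A minimal sketch of the arithmetic, assuming a hypothetical bill of 10,000 per month (a number picked only to match "low five figures", not taken from any real invoice):

```python
def idle_spend(monthly_bill: float, utilization_pct: int) -> float:
    """Portion of the bill that pays for capacity sitting unused."""
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization_pct must be between 0 and 100")
    return monthly_bill * (100 - utilization_pct) / 100


HYPOTHETICAL_BILL = 10_000  # assumed figure, for illustration only

for pct in (30, 50):
    idle = idle_spend(HYPOTHETICAL_BILL, pct)
    print(f"{pct}% utilized -> ${idle:,.0f} idle per month")
```

At 30–50% utilization, between half and 70% of what you provision does no work, before counting any engineer-hours.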

The senior tradeoff is therefore not “which is better” but capability vs complexity tax, weighed against your actual scale. Kubernetes’ capabilities only pay off when you genuinely need multi-node scheduling, zero-downtime rollout under real traffic, or autoscaling. Below that, you are paying for self-healing across machines you don’t have and for a scheduler that places pods on a single node. The classic failure, the one the team of three in the opening story fell into, is buying the tax before the scale exists to justify it.

Why this works

“We’ll need it to scale eventually” is the line that buys Kubernetes too early. The honest counter: a single beefy host with Compose can serve a surprising amount of traffic, and migrating to Kubernetes later is a known, bounded project. Importing its complexity now is a permanent tax with an uncertain payoff date. Start at the weight your traffic earns; upgrade when a real ceiling — not a hypothetical one — is in sight.

Where the line actually is — and the middle ground

There are honest signals that you’ve outgrown Compose. The first is the single-host ceiling: Compose scales vertically (a bigger box) or by same-host replicas, and a single machine has a hard limit — when you need to spread load across machines, Compose can’t. The second is availability: when a single host going down is no longer acceptable, you need scheduling that survives a lost node. The third is operational needs Compose lacks: zero-downtime rolling deploys gated on health, and horizontal autoscaling on load. Hit one of those for real and the tax starts to pay for itself.

Crucially, it’s not a binary. Between “one Compose host” and “self-managed Kubernetes” sit options that give you multi-node resilience without the full operational weight: Docker Swarm (multi-host Compose-like syntax, far simpler than k8s), HashiCorp Nomad, and managed PaaS / serverless containers like Cloud Run, AWS App Runner, ECS, or Azure Container Apps that run your containers and handle scaling without you owning a control plane. A managed Kubernetes service (EKS/GKE/AKS) removes the control-plane operations but not the conceptual complexity — you still own pods, networking, and the YAML.

Pick the best fit

A 4-person startup runs an API + Postgres + Redis + a worker, all on one rented server, modest traffic. Pick the orchestration weight.

Quiz

Which capability is the real reason to move from Compose to Kubernetes — the thing Compose fundamentally cannot do?

Quiz

What does the Kubernetes reconciliation loop give you that Compose's restart: always does not?

Order the steps

Order the questions a senior asks before reaching for Kubernetes:

1 Does the workload fit on one host with room to grow? If yes, Compose is enough
2 Is a single host going down unacceptable, or do I need to spread load across machines?
3 Do I need zero-downtime rolling deploys gated on health, or horizontal autoscaling?
4 If yes to those — can a middle ground (Swarm, Nomad, managed PaaS) meet the need with less tax?
5 Only if the scale and ops needs are real and the middle ground doesn't fit → Kubernetes

Together these questions form a graduated filter: most workloads fail out at step one or two and never need to reach step five. Skip straight to “what does Kubernetes give us?” and you’ve already made the mistake the opening story describes.
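The five questions can be written down as a small decision function. This is one possible reading of the list, not a rule from any tool: the Workload fields and the pick_orchestrator name are hypothetical, and each boolean stands for a judgment you still have to make honestly.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    fits_one_host_with_headroom: bool      # step 1
    host_outage_unacceptable: bool         # step 2
    needs_gated_deploys_or_autoscaling: bool  # step 3
    middle_ground_fits: bool               # step 4


def pick_orchestrator(w: Workload) -> str:
    """Walk the questions in order and stop at the first one that settles it."""
    real_signals = (
        not w.fits_one_host_with_headroom
        or w.host_outage_unacceptable
        or w.needs_gated_deploys_or_autoscaling
    )
    if not real_signals:
        return "compose"
    if w.middle_ground_fits:
        return "middle ground (Swarm, Nomad, or managed PaaS)"
    return "kubernetes"


# A small team on one rented server with modest traffic.
small_team = Workload(
    fits_one_host_with_headroom=True,
    host_outage_unacceptable=False,
    needs_gated_deploys_or_autoscaling=False,
    middle_ground_fits=True,
)
print(pick_orchestrator(small_team))
```

Kubernetes is the last branch on purpose: it is reached only when a real signal exists and nothing lighter covers it.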

Controllers watch, compare actual against desired, and act to close the gap — continuously, forever. That self-healing loop is what you pay for over Compose, whose restart: always cannot survive a lost host.

Recall before you leave

01 Explain to a teammate why a 3-person team adopting self-managed Kubernetes on day one is usually a mistake, and what to do instead.
02 What concrete signals tell you you've genuinely outgrown Docker Compose?

Recap

Compose and Kubernetes both run containers, but they answer different questions. Compose is a single-host process manager — one YAML, minutes to learn, ~50MB overhead, perfect for local dev and small single-node deploys, but with no multi-node scheduling, no self-healing across machines, no health-gated rollouts, and no autoscaling. Kubernetes is a distributed control plane whose reconciliation loop continuously drives reality toward a declared desired state, giving self-healing, scheduling across nodes, zero-downtime rolling updates, and horizontal autoscaling — at the cost of a real complexity tax: a control plane, CNI, RBAC, YAML sprawl, on-call burden, and five-figure monthly bills. The senior decision is capability vs complexity weighed against actual scale. The classic failure is buying the tax too early; the opposite failure is ignoring Compose’s single-host ceiling too long. Between them sit Swarm, Nomad, and managed PaaS. Pick the weight your traffic earns, and migrate when a real ceiling — not a hypothetical — comes into view. Now when you see a team reach for Kubernetes on day one, you’ll know the question to ask: which of the three honest signals have you actually hit?
