
Deployment & Infra

Kubernetes objects: the reconciliation loop behind every Pod, Service, and rollout

Crux: Kubernetes is declarative. You submit desired state and controllers reconcile actual state toward it forever. The object hierarchy (Pod → ReplicaSet → Deployment) plus Services and probes is why a rollout heals itself, or 500s when you forget a readiness probe.

A team ships a routine version bump. The rollout looks green in kubectl get pods — new pods are Running. But the on-call dashboard lights up: error rate spikes to 4% for ninety seconds on every deploy, then settles. It happens every single rollout. The cause is one missing field: the Deployment has no readiness probe. The instant a pod’s container process starts, Kubernetes marks it Ready and the Service begins routing live traffic to it — while the app is still loading config, warming a connection pool, and JIT-compiling. Requests land on a process that isn’t actually serving yet, and they 500.

Declarative, not imperative: the reconciliation loop

The single idea that makes Kubernetes coherent is this: you do not tell the cluster what to do, you tell it what you want to be true. You submit a desired state — “I want 5 replicas of this image” — and the cluster’s controllers run a loop forever, comparing actual state against desired state and taking corrective action to close the gap. This is the reconciliation loop, and it is why Kubernetes is self-healing. A node dies and three pods vanish? The ReplicaSet controller observes actual=2, desired=5, and creates three more. You never wrote “restart on failure” — you declared an invariant, and a controller enforces it.
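The node-failure example above can be sketched as a single compare-then-act pass. This is a toy model, not the ReplicaSet controller's actual code: `reconcile`, `new_name`, and the `web-N` pod names are hypothetical, and pods are plain strings in a list.

```python
import itertools

def reconcile(desired, pods, new_name):
    """One pass of a count-based controller: compare, then act on the gap."""
    gap = desired - len(pods)
    if gap > 0:
        # Too few pods: create exactly the missing number.
        pods.extend(new_name() for _ in range(gap))
    elif gap < 0:
        # Too many pods: delete the extras.
        del pods[desired:]
    return pods

counter = itertools.count(1)
new_name = lambda: f"web-{next(counter)}"  # hypothetical naming scheme

pods = reconcile(5, [], new_name)    # initial convergence: 5 pods
pods = pods[:2]                      # a node dies and three pods vanish
pods = reconcile(5, pods, new_name)  # controller sees actual=2, desired=5
print(pods)  # ['web-1', 'web-2', 'web-6', 'web-7', 'web-8']
```

Nothing in the sketch says "restart on failure"; the replacement pods fall out of the count comparison alone.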

This is the deepest reason you never kubectl run a bare Pod in production. A bare Pod is imperative: nothing owns it, nothing reconciles it. If the node it lives on fails or is drained, the Pod is gone for good, with no controller to recreate it. The same loop that heals your Deployment ignores the orphan Pod because no controller holds it as a desired-state target.

Why this works

The loop must be idempotent: running it twice with the same desired state must not double anything. That is why controllers compare-then-act instead of just acting. A controller that “creates a pod when it sees a Deployment” would create another pod on every loop tick, without bound; a controller that “ensures pod count equals replicas” converges and then does nothing. Idempotence is what lets the loop run continuously without thrashing.
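To make the contrast concrete, here is a minimal sketch of the two controller styles run for four loop ticks. Both function names are illustrative, and a pod is just the string "pod".

```python
def act_on_sight(pods, replicas):
    """Non-idempotent: creates pods every time it runs, regardless of state."""
    return pods + ["pod"] * replicas

def ensure_count(pods, replicas):
    """Idempotent: compares first, then creates only the missing pods."""
    missing = max(replicas - len(pods), 0)
    return pods + ["pod"] * missing

naive, converged = [], []
for _tick in range(4):
    naive = act_on_sight(naive, 3)
    converged = ensure_count(converged, 3)

print(len(naive), len(converged))  # 12 3
```

The first controller grows by 3 on every tick; the second reaches 3 on the first tick and every later tick is a no-op.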

The workload hierarchy: Pod → ReplicaSet → Deployment

Three nested objects, each adding one capability:

  • Pod — the smallest deployable unit. One or more containers that share a network namespace (same IP, talk over localhost) and can share volumes. Pods are ephemeral: a replacement Pod gets a new IP (a container restart inside the same Pod keeps it), and Pods are designed to be thrown away and replaced, never repaired.
  • ReplicaSet — keeps exactly N identical pods alive. Its whole job is the count: it watches its pods and reconciles toward replicas: N. You almost never create one directly.
  • Deployment — manages ReplicaSets to enable rollouts and rollback. When you change the image, the Deployment creates a new ReplicaSet and scales it up while scaling the old one down, governed by maxSurge and maxUnavailable (both default 25%). The old ReplicaSet sticks around at scale 0, which is exactly how kubectl rollout undo works — it scales the previous ReplicaSet back up.
Object     | Owns           | The one job it adds
Pod        | Containers     | Run containers together (shared net + volumes); ephemeral
ReplicaSet | Pods           | Keep exactly N copies alive (reconcile the count)
Deployment | ReplicaSets    | Rollouts + rollback (swap ReplicaSets gradually)
Service    | (selects Pods) | Stable IP + DNS over a changing set of pod IPs
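The way a Deployment swaps ReplicaSets under maxSurge and maxUnavailable can be sketched as a small simulation. It is a simplified model under stated assumptions: every new pod is treated as Ready as soon as it is created, and `resolve` and `rolling_update` are hypothetical helper names. The rounding follows the documented Kubernetes behavior: a percentage maxSurge rounds up, a percentage maxUnavailable rounds down.

```python
import math

def resolve(value, replicas, round_up):
    """Turn an int or an 'NN%' string into an absolute pod count."""
    if isinstance(value, str) and value.endswith("%"):
        raw = replicas * int(value[:-1]) / 100
        return math.ceil(raw) if round_up else math.floor(raw)
    return int(value)

def rolling_update(replicas, max_surge="25%", max_unavailable="25%"):
    """Return the (old, new) ReplicaSet sizes after each rollout step."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0 or new < replicas:
        # Scale up the new ReplicaSet: total pods may not exceed replicas + surge.
        new += min(replicas - new, replicas + surge - (old + new))
        # Scale down the old one: pods may not drop below replicas - unavailable.
        # Simplification: every new pod counts as Ready immediately.
        old -= min(old, old + new - (replicas - unavailable))
        steps.append((old, new))
    return steps

print(rolling_update(4))  # [(4, 0), (2, 1), (0, 3), (0, 4)]
```

The old ReplicaSet ends at size 0 rather than being deleted, which matches the rollback mechanism described above.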

Services and the label-selector glue

Pods are ephemeral and their IPs change constantly, so nothing should ever talk to a Pod IP directly. A Service gives you a stable virtual IP and a DNS name (my-svc.my-namespace.svc.cluster.local) that stays fixed while the pods behind it churn. The magic that wires a Service to its pods is labels and selectors: the Service declares selector: { app: web }, and Kubernetes maintains an Endpoints (or EndpointSlice) object listing the IPs of every Pod whose labels match. kube-proxy then load-balances traffic across those endpoints. Labels are the universal glue across Kubernetes — Deployments use the same mechanism to know which pods they own.
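The selector-to-endpoints wiring can be sketched in a few lines. This is a toy model of equality-based matching only; the pod records, IPs, and function names are made up for illustration, and the "ready" flag stands in for the Pod's Ready condition.

```python
def matches(selector, labels):
    """True when every key/value in the selector appears in the pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())

def endpoints_for(selector, pods):
    """IPs of the ready pods whose labels satisfy the selector."""
    return [p["ip"] for p in pods if p["ready"] and matches(selector, p["labels"])]

pods = [
    {"ip": "10.0.0.4", "labels": {"app": "web", "tier": "frontend"}, "ready": True},
    {"ip": "10.0.0.7", "labels": {"app": "web"}, "ready": False},  # still warming up
    {"ip": "10.0.0.9", "labels": {"app": "db"}, "ready": True},
]

print(endpoints_for({"app": "web"}, pods))  # ['10.0.0.4']
```

Extra labels on a pod do not prevent a match; a missing or different value for any selector key does.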

The three core Service types are a layered escalation of exposure, and Ingress sits in front of them as an L7 router:

  • ClusterIP (default) — a virtual IP reachable only inside the cluster. Internal service-to-service traffic.
  • NodePort — opens a static port (default range 30000–32767) on every node’s IP, forwarding to the ClusterIP. Crude external access; rarely the right answer for production HTTP.
  • LoadBalancer — provisions a cloud load balancer pointing at the NodePort. The traffic chain is: external client → cloud LB → node:NodePort → ClusterIP → Pod. One LB per Service, which gets expensive fast.
  • Ingress — not a Service type but an L7 HTTP router in front of Services. One LoadBalancer feeds an Ingress controller (NGINX, Traefik), which routes by host and path to many ClusterIP Services. This is how you expose dozens of apps behind a single external IP and TLS cert.
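Host-and-path routing of the kind an Ingress controller performs can be sketched as follows. This is a simplified illustration, not any controller's real algorithm: it matches prefixes segment by segment and prefers the longest match, and the hostnames and Service names are invented.

```python
def segments(path):
    """Split a URL path into its non-empty segments."""
    return [s for s in path.split("/") if s]

def route(rules, host, path):
    """Pick the backend Service for a request, preferring the longest prefix."""
    wanted = segments(path)
    best = None
    for rule in rules:
        prefix = segments(rule["path"])
        # Segment-wise comparison: /api matches /api/orders but not /apiary.
        if rule["host"] == host and wanted[:len(prefix)] == prefix:
            if best is None or len(prefix) > len(segments(best["path"])):
                best = rule
    return best["service"] if best else None

rules = [
    {"host": "shop.example.com", "path": "/", "service": "storefront"},
    {"host": "shop.example.com", "path": "/api", "service": "shop-api"},
    {"host": "blog.example.com", "path": "/", "service": "blog"},
]

print(route(rules, "shop.example.com", "/api/orders"))  # shop-api
print(route(rules, "shop.example.com", "/apiary"))      # storefront
```

Three ClusterIP Services share one entry point here, which is the cost argument for Ingress over one LoadBalancer per Service.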

ConfigMaps, Secrets, and the production failure

Config and credentials live in their own objects so you don’t bake them into the image: ConfigMap for non-sensitive config, Secret for credentials (base64-encoded, and not encrypted at rest unless you enable encryption). Both can be injected as env vars or mounted as files. One sharp edge: changing a ConfigMap does not trigger a rollout. Env vars keep the old values until the pod restarts (mounted files refresh only after a delay), so teams hash the config into a pod-template annotation to force a new ReplicaSet on change.
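The hash-into-annotation trick can be sketched like this. It is an illustration under assumptions: the annotation key `checksum/config` is a common convention rather than anything Kubernetes requires, and the helper names and image tags are hypothetical. The point is only that a changed config changes the pod template, and a changed pod template is what makes a Deployment create a new ReplicaSet.

```python
import hashlib
import json

def config_checksum(data):
    """Stable SHA-256 over the config contents (keys sorted for determinism)."""
    canonical = json.dumps(data, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

def pod_template(image, config_data):
    """Pod template fragment carrying the config hash as an annotation."""
    return {
        "metadata": {"annotations": {"checksum/config": config_checksum(config_data)}},
        "spec": {"containers": [{"name": "web", "image": image}]},
    }

before = pod_template("web:1.4", {"LOG_LEVEL": "info"})
after = pod_template("web:1.4", {"LOG_LEVEL": "debug"})
same = pod_template("web:1.4", {"LOG_LEVEL": "info"})

# Same image, different config: the template differs, so a rollout would start.
print(before != after, before == same)  # True True
```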

Now the failure from the opening scenario, mechanically. A Service routes to a pod the moment that pod is Ready. Without a readiness probe, “ready” means only “the container process started”, not “the app can serve requests.” So during every rollout, the Service adds the new pod to its endpoints while the app is still warming up, and a slice of traffic 500s until it finishes. The fix is a readiness probe (an HTTP GET /healthz, a TCP check, or an exec) that returns success only when the app is genuinely serving. A failing readiness probe pulls the pod out of the Service endpoints: no traffic until it passes.

This is distinct from a liveness probe, which restarts a container that has wedged. The classic outage: using a liveness probe where you needed readiness, so a slow-starting app gets killed and restarted in a loop instead of just being held out of rotation. Probe defaults are aggressive (periodSeconds: 10, timeoutSeconds: 1, failureThreshold: 3), and a slow /healthz under load will flap; for slow boots, use a startupProbe to hold the other probes off until the app has started.
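A rough simulation shows why the readiness probe removes the rollout errors. The assumptions are deliberately crude: one request per second reaches the new pod once it is routable, probes fire exactly every `period` seconds starting at `initial_delay` (real kubelets add jitter), and one probe success is enough to mark the pod Ready. The function names are hypothetical.

```python
def first_ready_second(warmup, period=10, initial_delay=0):
    """Second at which the first readiness probe succeeds."""
    t = initial_delay
    while t < warmup:  # the probe fails while the app is still warming up
        t += period
    return t

def failed_requests(warmup, rollout_window, readiness_probe):
    """Count requests that hit the pod before the app can serve."""
    # Without a probe the pod is Ready, and therefore routable, from second 0.
    routable_from = first_ready_second(warmup) if readiness_probe else 0
    return sum(1 for t in range(routable_from, rollout_window) if t < warmup)

print(failed_requests(20, 60, readiness_probe=False))  # 20
print(failed_requests(20, 60, readiness_probe=True))   # 0
```

With the default 10-second period, a pod that warms in 25 seconds would join the endpoints at second 30: the probe trades a slightly later start for zero failed requests.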

Pick the best fit

A stateless HTTP API takes ~20s to warm (config load + connection pool + cache prime) before it can serve. You need zero-error rollouts. What do you configure?

Quiz

A node reboots and takes three of your pods with it. Your Deployment requested 5 replicas. What happens, and why?

Quiz

How does a Service know which pods to send traffic to?

Order the steps

Order what happens during a Deployment image change (a rolling update):

  1. You apply the new image; the Deployment records a new desired state
  2. The Deployment creates a new ReplicaSet for the new image
  3. New RS scales up and old RS scales down, bounded by maxSurge / maxUnavailable
  4. Each new pod passes its readiness probe before the Service routes traffic to it
  5. Old ReplicaSet reaches scale 0 but is kept, so rollout undo can scale it back up
Recall before you leave
  1. Explain why a Deployment self-heals after a node failure but a bare Pod does not.
  2. Walk through exactly how a missing readiness probe causes 500s during a rollout, and how the probe fixes it.
Recap

Kubernetes is declarative: you submit a desired state and controllers run a reconciliation loop forever, comparing actual against desired and acting to close the gap — which is what makes the cluster self-healing and why you never run a bare, unowned Pod in production. The workload hierarchy stacks one capability per layer: a Pod runs containers that share network and volumes but is ephemeral; a ReplicaSet keeps exactly N pods alive by reconciling the count; a Deployment manages ReplicaSets to roll out and roll back, swapping them gradually under maxSurge and maxUnavailable. Services give a stable virtual IP and DNS over the churning pod IPs, wired by label selectors into an Endpoints list, escalating from ClusterIP to NodePort to LoadBalancer, with Ingress doing L7 routing in front. ConfigMaps and Secrets externalize config and credentials. And the production lesson that ties it together: a Service routes to a pod the instant it’s Ready, so without a readiness probe every rollout sends traffic to apps that have started but can’t yet serve — and they 500 until they warm up.


Trademarks belong to their respective owners. Editorial reference only.