Backend Architecture
Graceful shutdown: free-recall review
Retrieval beats re-reading. For each prompt, say or write a full answer from memory before you open the model answer — the effort of recall is what makes the shutdown sequence stick when you need it during a 2 a.m. deploy.
Reconstruct the unit’s spine — the termination sequence, the deregistration race, reverse-dependency teardown, the deadline budget, requeue safety, and fleet coordination — without looking back at the lessons.
- 01 Walk the Kubernetes pod termination sequence, and explain why SIGTERM and SIGKILL are categorically different.
- 02 What is the PID 1 trap, and what are the two fixes?
- 03 Describe the deregistration race and the two-lever fix; why fail readiness but keep liveness passing?
- 04 Why does server.close() alone hang, and what is reverse dependency order?
- 05 Why is the grace period a budget, and what is the disposition for long requests versus background jobs?
- 06 Why does requeue demand idempotency, and why doesn't a fleet of perfect per-instance shutdowns guarantee a zero-downtime deploy?
If you could reconstruct each answer from memory, you hold the unit’s spine:
- The termination sequence ends in an uncatchable SIGKILL, so finish your work during the SIGTERM window, and make sure SIGTERM actually reaches a handler in PID 1.
- The deregistration race means you fail readiness and wait out propagation before you stop accepting.
- Teardown runs in reverse dependency order with datastores last, bounded by a guardian timeout.
- The grace period is a budget, so reject long requests and requeue jobs; requeue is at-least-once, so it demands idempotency.
- Zero-downtime deploy is the fleet-level orchestration (surge before drain, deregister before terminate, jitter the closes) that arranges individually correct shutdowns so they never collide.
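As a memory aid for the teardown part of this recap, here is a minimal TypeScript sketch of reverse-dependency teardown bounded by a guardian timeout. The names `Step`, `withDeadline`, `shutdown`, and `budgetMs` are hypothetical and come from no library, and each `close` function stands in for whatever your server, queue consumer, or datastore client actually exposes. Treat it as one possible shape under those assumptions, not the unit's reference implementation.

```typescript
// Hypothetical types and names, for illustration only.
type Step = { name: string; close: () => Promise<void> };
type Outcome = "done" | "timeout";

// Guardian timeout: resolves to "timeout" if `work` has not settled within `ms`.
async function withDeadline(work: Promise<void>, ms: number): Promise<Outcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const guard = new Promise<Outcome>((resolve) => {
    timer = setTimeout(() => resolve("timeout"), ms);
  });
  try {
    return await Promise.race([work.then((): Outcome => "done"), guard]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Steps are listed in startup order (dependencies first), so teardown
// walks the list backwards: the datastore that started first closes last.
async function shutdown(
  startupOrder: Step[],
  budgetMs: number,
): Promise<{ closed: string[]; outcome: Outcome }> {
  const closed: string[] = [];
  const teardown = (async () => {
    for (const step of [...startupOrder].reverse()) {
      await step.close();
      closed.push(step.name);
    }
  })();
  const outcome = await withDeadline(teardown, budgetMs);
  // Return a snapshot: on timeout the teardown may still be running.
  return { closed: [...closed], outcome };
}
```

In a real service you would call something like `shutdown([db, queue, http], budgetMs)` from the SIGTERM handler, after failing readiness and waiting out propagation, with `budgetMs` set below the platform's grace period. On a `"timeout"` outcome the process would typically exit non-zero on its own terms rather than wait for SIGKILL.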