

TTL, caching, and DNS propagation

Crux: TTL is permission, not a command. This lesson covers its operational impact on freshness, load, and the myth of DNS propagation.

You updated a DNS record and your co-worker can see the change. You cannot. An hour later you finally see it. Your manager calls this “DNS propagation” and says to wait 24–48 hours. That framing is technically wrong and operationally expensive. Understanding what TTL actually is — permission, not a command — changes how you manage DNS records in production.

What TTL means

Every DNS record carries a TTL (Time To Live), a number in seconds. When a resolver caches an answer, it counts down from TTL to zero. At zero the cached record expires, and the next lookup for that name triggers a fresh query. The authoritative server does not push updates; caches expire and pull the new value on their own schedule.
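The countdown can be sketched as a toy cache. This is a minimal illustration, not how any particular resolver is implemented; the class name TtlCache and the injectable clock are assumptions made for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Toy resolver cache: entries expire on their own; nothing is pushed to it."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # name -> (value, absolute expiry time in seconds)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, name: str, value: str, ttl: int) -> None:
        """Store an answer together with the moment its TTL runs out."""
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name: str) -> Optional[str]:
        """Return the cached value, or None once the TTL has counted down."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]
            return None  # the caller must now re-query upstream
        return value
```

Note that the cache has no method for the authoritative side to call: the only way an entry leaves is by expiring.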

This has two operational consequences:

  1. You cannot force a cached value to disappear. Once a TTL is published and cached, you must wait for that TTL to count down everywhere.
  2. “DNS propagation” is a misleading term. Nothing propagates. All that happens is that distributed caches expire one by one. There is no wave; there is no timeline. A cache that was populated 1 minute before your change will hold the old value for (TTL - 1 minute) longer.
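The (TTL - age) arithmetic from the second point, as a small sketch. The function name remaining_stale_seconds is hypothetical and the inputs are illustrative.

```python
def remaining_stale_seconds(ttl: int, cached_age: int) -> int:
    """Seconds a cache keeps serving the old value after the record changes.

    ttl:        TTL the record was published with, in seconds
    cached_age: how long before the change the cache was populated, in seconds
    """
    if ttl < 0 or cached_age < 0:
        raise ValueError("ttl and cached_age must be non-negative")
    # An entry older than its TTL has already expired, so nothing stale remains.
    return max(ttl - cached_age, 0)


# A cache populated 1 minute before the change, with a 1-day TTL:
print(remaining_stale_seconds(ttl=86400, cached_age=60))  # 86340
```

Every cache has its own cached_age, which is why different users see the change at different times.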

High TTL vs low TTL

| TTL | Benefit | Cost |
|---|---|---|
| High (86400 s = 1 day) | Few queries; fast cache hits; authoritative server handles less load | Stale data lingers up to 1 day after a change |
| Low (60 s) | Stale data clears in 1 minute after a change | More queries; higher load on authoritative; cache cannot serve stale during outage |

Cache hit rate under steady load (10 queries/hour):

| TTL | Cache hit rate |
|---|---|
| 60 s | ~17% |
| 300 s | ~50% |
| 3600 s | ~90% |
| 86400 s (1 day) | ~99% |

Planned-change SOP: lower TTL to 60 s one week before, change, raise back.
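
The lesson does not state the model behind the hit-rate figures above. One simple model that lands in the same range assumes queries arrive at random (a Poisson process) and that the TTL clock starts on each cache miss; under those assumptions the hit rate is λT / (1 + λT), where λ is the query rate and T the TTL. The sketch below is that assumed model only, so its output is close to the figures above but not identical.

```python
def poisson_hit_rate(queries_per_hour: float, ttl_seconds: float) -> float:
    """Expected cache hit rate for one name under an assumed Poisson query stream.

    Each miss starts a TTL window; on average lam * ttl further queries land
    inside that window (hits) before the next miss.
    """
    lam = queries_per_hour / 3600.0       # queries per second
    expected_hits = lam * ttl_seconds     # hits per miss
    return expected_hits / (1.0 + expected_hits)


for ttl in (60, 300, 3600, 86400):
    print(f"TTL = {ttl:>5} s  hit rate = {poisson_hit_rate(10, ttl):.1%}")
```

Whatever the exact model, the shape is the same: hit rate climbs steeply at first and then flattens, so raising TTL from 60 s to 3600 s matters far more than raising it from 3600 s to a day.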

Planned migration SOP

The standard operational pattern for a planned DNS change:

  1. One week before: Lower TTL to 60 seconds. Wait for the old high TTL to expire everywhere (max wait = old TTL).
  2. Change day: Update the record. Caches pick up the new value within about 60 s, and a bad change can be rolled back just as quickly, because the maximum cache age is now 60 s.
  3. After stabilisation: Raise TTL back to 3600 s or higher.

Skipping step 1 means old caches can serve stale data for up to the old TTL (e.g., 24 hours) after your change.
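
The timing in the steps above can be sketched as follows. The function names and the example date are hypothetical; the logic only restates the rule that the old TTL must fully expire before the change.

```python
from datetime import datetime, timedelta


def earliest_safe_change(ttl_lowered_at: datetime, old_ttl_seconds: int) -> datetime:
    """Earliest moment at which no cache can still hold the old high-TTL record."""
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


def worst_case_staleness(old_ttl_seconds: int, low_ttl_seconds: int,
                         lowered_and_waited: bool) -> int:
    """Longest time a cache may serve the old value after the record changes."""
    # Without step 1 (or without waiting it out), the old TTL still governs.
    return low_ttl_seconds if lowered_and_waited else old_ttl_seconds


lowered = datetime(2026, 5, 6, 9, 0)  # illustrative date
print(earliest_safe_change(lowered, 86400))                       # 2026-05-07 09:00:00
print(worst_case_staleness(86400, 60, lowered_and_waited=True))   # 60
print(worst_case_staleness(86400, 60, lowered_and_waited=False))  # 86400
```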

Negative caching (RFC 2308)

DNS caches do not only store positive answers. They also cache negative responses:

  • NXDOMAIN (name does not exist) — cached for min(SOA.MINIMUM, SOA.TTL), typically 1–3 hours.
  • NODATA (name exists but no record of that type) — same cache duration.
  • SERVFAIL: cached briefly per RFC 9520, for at least 1 second and no longer than 5 minutes. Short enough to not amplify an outage, long enough to prevent a tight retry loop.
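
The negative-cache duration for NXDOMAIN and NODATA reduces to a one-line rule. A minimal sketch follows; the optional resolver_cap parameter is an assumption added for illustration, reflecting that a resolver may apply its own upper bound.

```python
from typing import Optional


def negative_cache_ttl(soa_minimum: int, soa_ttl: int,
                       resolver_cap: Optional[int] = None) -> int:
    """Seconds a resolver may cache an NXDOMAIN or NODATA answer (RFC 2308).

    soa_minimum:  the MINIMUM field of the zone's SOA record
    soa_ttl:      the TTL of the SOA record itself
    resolver_cap: hypothetical local upper bound set by the resolver operator
    """
    ttl = min(soa_minimum, soa_ttl)
    if resolver_cap is not None:
        ttl = min(ttl, resolver_cap)
    return ttl


print(negative_cache_ttl(soa_minimum=3600, soa_ttl=86400))  # 3600
```

In practice this means a name that was queried before it existed can stay invisible for the negative TTL even after you create the record.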

Negative caching is essential for performance: without it, every query for a non-existent subdomain would hit the authoritative server every time. Without SERVFAIL caching, misconfigured clients that retry in a tight loop could flood authoritative servers into collapse.

Quiz

What does TTL actually tell downstream resolvers?

Quiz

You change an A record on your authoritative server. A resolver cached the old value 10 minutes ago with TTL=3600. How long before that resolver serves the new value?

SOA record and zone authority

Every zone has exactly one SOA (Start of Authority) record at the apex. Its fields govern replication and negative caching:

  • SERIAL: incremented on every zone change. Secondaries compare their serial to the primary’s; if lower, they pull an update.
  • REFRESH: how often secondaries poll without a NOTIFY (typically 1–24 hours).
  • RETRY: poll interval when REFRESH fails.
  • EXPIRE: how long a secondary serves stale data when the primary is unreachable (often 1 week).
  • MINIMUM: negative-cache TTL (RFC 2308). The actual negative TTL is min(SOA.MINIMUM, SOA.TTL).

Common ops mistake: decrementing SERIAL. The convention is YYYYMMDDNN format (2026051301 = 2026-05-13, change #01). Manual edits that decrement SERIAL break replication silently — secondaries skip the “older” zone.
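
A small helper can make the YYYYMMDDNN convention hard to get wrong. This is a sketch under the convention described above; the function name next_serial is hypothetical, and it simply refuses to ever return a value lower than or equal to the current serial.

```python
from datetime import date


def next_serial(current: int, today: date) -> int:
    """Next SOA serial in YYYYMMDDNN style that is strictly greater than current."""
    first_of_day = int(today.strftime("%Y%m%d")) * 100 + 1  # YYYYMMDD01
    # If the current serial is already at or past today's first value
    # (several edits today, or a clock that is behind), just add one.
    candidate = max(current + 1, first_of_day)
    if candidate >= 2**32:
        raise OverflowError("SOA serial is an unsigned 32-bit field")
    return candidate


print(next_serial(2026051207, date(2026, 5, 13)))  # 2026051301
print(next_serial(2026051301, date(2026, 5, 13)))  # 2026051302
```

The max() is the important part: even if the date on the editing machine is wrong, the serial still moves forward, so secondaries never see an "older" zone.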

Order the steps

Order the recommended steps for a planned DNS migration (changing an A record IP):

  1. Identify current TTL (e.g. 86400 s)
  2. Lower TTL to 60 s; wait for old TTL to expire everywhere
  3. Update the A record to the new IP
  4. Monitor for errors; verify new IP is resolving correctly
  5. Raise TTL back to 3600 s or higher

Serve-stale (RFC 8767), the DNS counterpart of HTTP's stale-while-revalidate. When an upstream authoritative is unreachable but a cache entry exists past its TTL, a resolver may serve the stale answer for up to 1–3 days (configurable) while attempting a refresh in the background. This dramatically improves availability during authoritative outages at the cost of serving slightly stale data. Unbound supports this with the serve-expired: yes option. The trade-off: if a zone was intentionally removed rather than just unavailable, serving stale answers hides the deletion from users longer than the TTL would suggest.
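
The stale-serving decision from RFC 8767 described above can be sketched as a single function. This is a simplification: the refresh here is synchronous rather than in the background, and the names answer, refresh, and MAX_STALE_SECONDS are assumptions for the example, not any resolver's API.

```python
from typing import Callable, Optional, Tuple

MAX_STALE_SECONDS = 86400  # assumed limit of 1 day, within the 1-3 day range


def answer(cached: Optional[Tuple[str, float]], now: float,
           refresh: Callable[[], Optional[str]],
           max_stale: float = MAX_STALE_SECONDS) -> Optional[str]:
    """Pick the value to return for one lookup.

    cached:  (value, expires_at) or None if nothing is cached
    refresh: queries upstream; returns None when the authoritative is unreachable
    """
    if cached is not None and now < cached[1]:
        return cached[0]              # still within TTL: normal cache hit
    fresh = refresh()
    if fresh is not None:
        return fresh                  # upstream reachable: use the new answer
    if cached is not None and now - cached[1] <= max_stale:
        return cached[0]              # upstream down: fall back to the stale entry
    return None                       # nothing usable: the lookup fails
```

The stale entry is only ever a fallback: as soon as the upstream answers again, the fresh value wins.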

Browser DNS cache

Browsers maintain their own DNS cache, separate from the OS stub resolver and the upstream recursive resolver. Chrome caches DNS entries for approximately 1 minute regardless of the record’s actual TTL — intentionally short enough to forget potentially malicious answers, long enough to avoid re-resolving every link on a page. Firefox follows a similar policy.

Clearing the OS-level DNS cache (sudo resolvectl flush-caches on Linux with systemd-resolved, or sudo systemd-resolve --flush-caches on older releases; sudo killall -HUP mDNSResponder on macOS) does not clear the browser’s cache. To force a full re-resolution, restart the browser or visit chrome://net-internals/#dns and clear the host cache.

Browsers also pre-warm DNS via <link rel="dns-prefetch" href="//cdn.example.com"> — resolving names before the user clicks a link so the lookup latency is hidden.

Recall before you leave
  1. Operational impact of high TTL vs low TTL: name one benefit and one cost of each.
  2. You observe dig @1.1.1.1 example.com A returning 80 ms on every query. What does this suggest?
  3. What is negative caching and why is it important?
Recap

TTL is a maximum hold time for downstream caches, not a validity guarantee from the authoritative server. DNS has no push mechanism: “propagation” is just distributed caches expiring one by one. To manage a planned change safely, lower the TTL well before the change so the worst-case staleness window is short. Negative responses — NXDOMAIN, NODATA, and SERVFAIL — are also cached, governed by SOA.MINIMUM and RFC 9520 respectively. The SOA record controls zone replication: SERIAL increments signal secondaries to pull updates, and decrementing SERIAL silently breaks replication. Browsers maintain their own DNS cache independent of the OS resolver; clearing the OS cache does not affect the browser’s cache.

