Networking & Protocols

DNSSEC: chain of trust and validation failure

Crux: How DNSSEC signs every RRset from root to leaf, what breaks when the DS-KSK link snaps, and why a KSK rollover without updating the parent DS silently takes your site offline for roughly 30% of users.

Your DNS record looks correct in every authoritative tool. But 30% of users cannot reach your site: they see DNS_PROBE_FINISHED_NXDOMAIN. The other 70% are fine. The difference is not which ISP they use; it is whether their resolver validates DNSSEC. You rolled a KSK last week and forgot to update the DS record in the parent zone, which you submit through your registrar. The chain snapped.

What DNSSEC does and does not do

DNSSEC authenticates DNS responses: it proves that the A record for bank.example came from the legitimate authoritative server and was not forged in transit. It does not encrypt DNS messages — an eavesdropper can still see which domain you queried; that is what DoH and DoT address. DNSSEC validates data; encryption hides queries.

Key roles: ZSK and KSK

Every signed zone publishes two types of keys:

  • ZSK (Zone Signing Key) — signs the actual data RRsets (A, MX, TXT records, etc.). Rotated frequently (every 1–3 months in practice).
  • KSK (Key Signing Key) — signs only the DNSKEY RRset that publishes the ZSK. Rotated rarely (every 1–3 years). Its hash is published in the parent zone as a DS (Delegation Signer) record.

The split exists so ZSK rotation is cheap (no parent update needed), while KSK rotation — which requires parent coordination — is rare.
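
To make the DS-to-KSK link concrete, here is a minimal sketch of how a DS digest and a key tag are derived from a DNSKEY record, following the layout in RFC 4034 (DNSKEY RDATA is flags, protocol, algorithm, public key; the DS digest hashes the owner name plus that RDATA). The public key bytes below are made-up placeholders, not real keys, and the helper names are illustrative.

```python
import hashlib
import struct


def name_to_wire(name: str) -> bytes:
    """Encode a domain name in canonical (lowercase) DNS wire format."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"


def dnskey_rdata(flags: int, protocol: int, algorithm: int, public_key: bytes) -> bytes:
    """Build DNSKEY RDATA: 2-byte flags, 1-byte protocol, 1-byte algorithm, key."""
    return struct.pack("!HBB", flags, protocol, algorithm) + public_key


def key_tag(rdata: bytes) -> int:
    """Key tag checksum over the DNSKEY RDATA (RFC 4034, Appendix B)."""
    acc = 0
    for i, byte in enumerate(rdata):
        acc += byte if i & 1 else byte << 8
    acc += (acc >> 16) & 0xFFFF
    return acc & 0xFFFF


def ds_digest_sha256(owner: str, rdata: bytes) -> str:
    """DS digest type 2: SHA-256 over owner name (wire format) + DNSKEY RDATA."""
    return hashlib.sha256(name_to_wire(owner) + rdata).hexdigest()


# Placeholder keys: flags 257 marks a KSK, protocol is always 3,
# algorithm 13 is ECDSA P-256. The key bytes are NOT real public keys.
old_ksk = dnskey_rdata(257, 3, 13, bytes([0, 1, 2, 3]))
new_ksk = dnskey_rdata(257, 3, 13, bytes([4, 5, 6, 7]))

print("old KSK tag:", key_tag(old_ksk))
print("old DS digest:", ds_digest_sha256("example.com", old_ksk))
print("new DS digest:", ds_digest_sha256("example.com", new_ksk))
```

The two digests differ, which is the whole point: the parent's DS pins one specific KSK, so replacing the KSK without replacing the DS leaves the parent vouching for a key the child no longer uses.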

The chain of trust

Every validating resolver carries one preconfigured trust anchor: the fingerprint of the root zone’s KSK. Validation walks downward:

  1. Resolver holds root KSK fingerprint as trust anchor.
  2. Root publishes DNSKEY (root KSK + root ZSK), signed by the root KSK. Resolver verifies the root KSK matches the trust anchor and that its signature covers the DNSKEY RRset.
  3. Root ZSK signs the .com DS record. Resolver verifies the .com DS with root ZSK.
  4. .com publishes DNSKEY (.com KSK + ZSK), signed by the .com KSK. Resolver verifies .com KSK matches the DS hash from step 3.
  5. .com ZSK signs the example.com DS record. And so on, down to the leaf zone.
  6. Leaf zone’s ZSK signs the A record. RRSIG travels with the answer.
  7. If every step verifies → set the AD (Authentic Data) flag on the answer. If any step fails → mark BOGUS → return SERVFAIL.
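
The walk above can be sketched as a toy model. This is not real cryptography: keys are placeholder byte strings, a "DS" is just the SHA-256 of those bytes, and the signatures made by each ZSK are assumed to verify. The `Zone` class and `validate` function are hypothetical names used only for illustration.

```python
import hashlib
from dataclasses import dataclass, field


def digest(key: bytes) -> str:
    """Toy stand-in for a DS digest: SHA-256 over placeholder key bytes."""
    return hashlib.sha256(key).hexdigest()


@dataclass
class Zone:
    name: str
    dnskeys: list          # keys published in the zone's DNSKEY RRset
    dnskey_signer: bytes   # key whose RRSIG covers the DNSKEY RRset
    child_ds: dict = field(default_factory=dict)  # child name -> DS digest


def validate(chain: list, trust_anchor: str) -> str:
    """Walk root -> leaf. ZSK signatures over data are assumed to verify."""
    expected = trust_anchor
    for zone, next_zone in zip(chain, chain[1:] + [None]):
        # Find the published key that the trust anchor (or parent DS) vouches for.
        entry = next((k for k in zone.dnskeys if digest(k) == expected), None)
        # That key must exist and must be the one signing the DNSKEY RRset.
        if entry is None or entry != zone.dnskey_signer:
            return "BOGUS"
        if next_zone is not None:
            expected = zone.child_ds.get(next_zone.name)
            if expected is None:
                # Simplification: a real resolver needs signed proof that no DS exists.
                return "INSECURE"
    return "SECURE"


anchor = digest(b"root-ksk")
root = Zone(".", [b"root-ksk", b"root-zsk"], b"root-ksk",
            {"com.": digest(b"com-ksk")})
com = Zone("com.", [b"com-ksk", b"com-zsk"], b"com-ksk",
           {"example.com.": digest(b"leaf-ksk-old")})
leaf = Zone("example.com.", [b"leaf-ksk-old", b"leaf-zsk"], b"leaf-ksk-old")

# Same leaf after a KSK swap, with the parent DS left untouched.
rolled = Zone("example.com.", [b"leaf-ksk-new", b"leaf-zsk"], b"leaf-ksk-new")

healthy = validate([root, com, leaf], anchor)
broken = validate([root, com, rolled], anchor)
print(healthy, broken)
```

Note that the only thing that changed between the two runs is the leaf zone's key; the parent is identical, and that is exactly why the second walk fails.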
DNSSEC chain of trust

  • Trust anchor: root KSK fingerprint (preconfigured in every validating resolver)
  • ZSK purpose: signs zone data (A, MX, TXT, …)
  • KSK purpose: signs only the DNSKEY RRset
  • DS record: parent zone holds hash of child's KSK
  • RRSIG: signature over one RRset, returned alongside that RRset
  • BOGUS result: SERVFAIL returned to client

The KSK rollover failure mode

The most common DNSSEC incident:

  1. Operator generates a new KSK in the child zone.
  2. Operator updates DNSKEY at the child — new KSK now signs the DNSKEY RRset.
  3. Operator forgets to update the DS record in the parent zone (normally submitted through the registrar).
  4. Result: parent’s DS hashes the old KSK; child’s RRSIG over the DNSKEY RRset is produced by the new KSK.
  5. Validating resolvers fetch the DS from the parent and the DNSKEY RRset from the child. The RRSIG verifies against the new KSK, but no DS at the parent matches that key, so the zone has no trusted entry point. Chain broken → BOGUS → SERVFAIL.

Only resolvers that validate DNSSEC (such as Cloudflare 1.1.1.1, Google 8.8.8.8, and Quad9) return SERVFAIL. Non-validating ISP resolvers never check the signatures and serve the answer normally. This splits traffic: roughly 30–40% of global users (those on validating resolvers) cannot reach the site; the rest can.
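
To see why ordering matters, here is a small sketch that replays two rollover timelines against a single parent-child link. It is a simplified model: keys are placeholder bytes, the DS is a plain SHA-256 of them, and resolver caching and TTLs are ignored. The "safe" timeline resembles the double-signature KSK rollover described in RFC 6781; the names `link_ok` and `run` are illustrative.

```python
import hashlib


def ds(key: bytes) -> str:
    """Toy DS digest over placeholder key bytes."""
    return hashlib.sha256(key).hexdigest()


def link_ok(parent_ds: set, child_dnskeys: set, dnskey_signers: set) -> bool:
    """True if some published key matches a parent DS AND signed the DNSKEY RRset."""
    return any(ds(k) in parent_ds and k in dnskey_signers for k in child_dnskeys)


OLD, NEW = b"old-ksk", b"new-ksk"

# Each step: (label, DS set at parent, DNSKEY set at child, keys signing DNSKEY)
broken_rollover = [
    ("before rollover",                 {ds(OLD)}, {OLD},      {OLD}),
    ("new KSK swapped in, DS untouched", {ds(OLD)}, {NEW},      {NEW}),
]
safe_rollover = [
    ("before rollover",                 {ds(OLD)}, {OLD},      {OLD}),
    ("publish NEW, sign with both",     {ds(OLD)}, {OLD, NEW}, {OLD, NEW}),
    ("parent DS switched to NEW",       {ds(NEW)}, {OLD, NEW}, {OLD, NEW}),
    ("retire OLD after old DS expires", {ds(NEW)}, {NEW},      {NEW}),
]


def run(timeline: list) -> list:
    """Evaluate the link at every step of a rollover timeline."""
    return [(label, link_ok(p, keys, signers)) for label, p, keys, signers in timeline]


for name, timeline in (("broken", broken_rollover), ("safe", safe_rollover)):
    for label, ok in run(timeline):
        print(f"{name:6} | {label:34} | {'valid' if ok else 'BOGUS'}")
```

In the safe timeline there is never a moment when the parent's DS points at a key the child has stopped publishing and signing with. In practice each step also has to wait out the relevant TTLs, which this sketch does not model.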

Trace it

A senior engineer is paged: bank.example is offline for ~30% of users worldwide. Trace the diagnosis.

  1. dig +trace bank.example from the operator's machine succeeds. What does that tell us?
  2. drill -D bank.example shows the AD flag unset and RRSIG validation fails. What broke?
  3. Why only ~30% of users?
  4. Immediate mitigation?
  5. Post-mortem fix?

Debug this

dig output during DNSSEC outage — diagnose

$ dig @1.1.1.1 +dnssec api.bank.example
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 41832
;; flags: qr rd ra; QUERY: 1, ANSWER: 0

$ dig @8.8.8.8 +dnssec api.bank.example
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 8821

$ dig @1.1.1.1 +cd api.bank.example     # +cd = checking disabled
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1284
;; flags: qr rd ra cd; QUERY: 1, ANSWER: 1
api.bank.example.   60   IN   A   203.0.113.42

Both validating resolvers return SERVFAIL but +cd returns a normal A record. Root cause and customer impact?
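
The comparison in that log can be turned into a quick triage rule. The sketch below parses the status field out of dig header lines and classifies the pair of results; the function names and the category labels are made up for illustration, and the rule is a heuristic rather than a full diagnosis.

```python
import re

STATUS_RE = re.compile(r"status:\s*([A-Z]+)")


def rcode_from_header(header_line: str) -> str:
    """Extract the response code from a dig ';; ->>HEADER<<-' line."""
    match = STATUS_RE.search(header_line)
    if match is None:
        raise ValueError("no status field in header line")
    return match.group(1)


def diagnose(validating_rcode: str, cd_rcode: str) -> str:
    """Compare a normal query with a +cd (checking disabled) query
    sent to the same validating resolver."""
    if validating_rcode == "SERVFAIL" and cd_rcode == "NOERROR":
        # Data is reachable; only validation rejects it.
        return "dnssec-validation-failure"
    if validating_rcode == "SERVFAIL" and cd_rcode == "SERVFAIL":
        # Fails even without validation: look at the authoritative servers.
        return "resolution-failure"
    if validating_rcode == cd_rcode:
        return "no-dnssec-signal"
    return "inconclusive"


# Header lines copied from the log above.
normal = ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 41832"
with_cd = ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1284"

print(diagnose(rcode_from_header(normal), rcode_from_header(with_cd)))
```

SERVFAIL on a normal query plus NOERROR with +cd is the signature of a broken chain: the record exists and is served, but the resolver refuses to hand it over once validation is switched on.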

NSEC vs NSEC3: denial of existence

DNSSEC must also prove that a name does not exist — otherwise an attacker could delete a record and claim “there is no A record for bank.example.” The mechanism:

  • NSEC: chains zone names alphabetically. “Between alpha.example.com and zebra.example.com nothing exists.” Easy to verify but lets an attacker walk every NSEC record to enumerate the entire zone.
  • NSEC3 (RFC 5155): hashes names with a salt before chaining. Adjacent entries in the chain are hash-adjacent, not alphabetically adjacent. Prevents trivial zone enumeration; an offline dictionary attack against the collected hashes is still possible but costly.
  • NSEC5 (proposed/experimental): uses verifiable random functions; not widely deployed.

Operational guidance: use NSEC3 for zones where enumeration matters (corporate intranets, customer subdomains); NSEC is fine for public zones where the contents are already known.
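
For a feel of what "hash-adjacent" means, here is a minimal sketch of the NSEC3 owner-name hash as RFC 5155 defines it: SHA-1 over the wire-format name plus salt, re-hashed `iterations` more times, then encoded in base32hex. The salt and iteration count are the ones from the RFC's example zone; the other names are arbitrary.

```python
import base64
import hashlib

# Map the standard base32 alphabet onto base32hex (RFC 4648 "extended hex").
_B32_TO_B32HEX = bytes.maketrans(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
    b"0123456789ABCDEFGHIJKLMNOPQRSTUV",
)


def name_to_wire(name: str) -> bytes:
    """Encode a domain name in canonical (lowercase) DNS wire format."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"


def nsec3_hash(name: str, salt_hex: str, iterations: int) -> str:
    """NSEC3 hash: H(name || salt), then `iterations` extra rounds of H(prev || salt)."""
    salt = bytes.fromhex(salt_hex)
    value = hashlib.sha1(name_to_wire(name) + salt).digest()
    for _ in range(iterations):
        value = hashlib.sha1(value + salt).digest()
    return base64.b32encode(value).translate(_B32_TO_B32HEX).decode("ascii").lower()


names = ["alpha.example", "mail.example", "zebra.example"]
hashed = {n: nsec3_hash(n, "aabbccdd", 12) for n in names}

# The NSEC3 chain is ordered by hash, not by name.
for name in sorted(names, key=hashed.get):
    print(hashed[name], name)
```

An attacker who walks the chain collects only the hashes and must guess candidate names offline to reverse them. That is slower than reading NSEC records directly, but guessable labels such as www or mail fall quickly, so NSEC3 raises the cost of enumeration rather than eliminating it.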

Quiz

Why does a KSK rollover that omits a DS update at the parent cause SERVFAIL only for ~30% of users?

Order the steps

Order the DNSSEC validation walk for a signed answer:

  1. Receive RRSIG over the answer; the zone's DNSKEY is needed to verify it
  2. Fetch the DNSKEY RRset for the zone; verify the answer's RRSIG with the ZSK
  3. Check that the DNSKEY RRset carrying the ZSK is signed by the zone's KSK
  4. Compare hash of KSK against DS record at parent zone
  5. Verify the RRSIG over the parent's DS, repeating zone by zone up to the root
  6. Compare root KSK against locally configured trust anchor
  7. If every link verifies, set the AD (Authentic Data) flag and return the answer
Why this works

The 2018 root KSK rollover. The root zone was signed in 2010, and its trust anchor (KSK-2010) then stayed unchanged for eight years. In October 2018 ICANN executed the first-ever root KSK rollover (KSK-2010 → KSK-2017). Preparation took over three years because any validating resolver that did not pick up the new trust anchor before rollover day would fail to validate the root itself and start returning SERVFAIL for essentially every query. ICANN monitored resolver populations via trust anchor signaling data and, in September 2017, postponed the rollover from its original date of 11 October 2017 after detecting too many resolvers that still trusted only KSK-2010; it finally took place a year later, on 11 October 2018. RFC 5011 defines automatic trust anchor updates: the root zone publishes the new KSK alongside the old for at least 30 days so compliant resolvers add the new key before the old is retired. ICANN has since generated a successor key, KSK-2024, and its published timeline places the next root KSK rollover in October 2026. The lesson: DNSSEC operational complexity scales with the size of the population depending on your zone.
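
The 30-day rule can be sketched as a small state machine. This is a simplified reading of the RFC 5011 add hold-down timer: it ignores revocation, the remove hold-down, and TTL-based refresh intervals, and it assumes the caller only passes in key ids from a DNSKEY RRset that already validated against a currently trusted key. The class name, key ids, and the start date are all illustrative.

```python
from datetime import datetime, timedelta

ADD_HOLD_DOWN = timedelta(days=30)


class TrustAnchorStore:
    """Tracks trusted keys plus new keys waiting out the add hold-down."""

    def __init__(self, trusted):
        self.trusted = set(trusted)
        self.pending = {}  # key id -> time the key was first seen

    def observe(self, signed_keys: set, now: datetime) -> None:
        """Record one validated look at the zone's DNSKEY RRset."""
        # A pending key that disappears loses its progress and starts over.
        for key_id in list(self.pending):
            if key_id not in signed_keys:
                del self.pending[key_id]
        # New keys start the timer; keys seen for 30+ days become trusted.
        for key_id in signed_keys - self.trusted:
            first_seen = self.pending.setdefault(key_id, now)
            if now - first_seen >= ADD_HOLD_DOWN:
                self.trusted.add(key_id)
                del self.pending[key_id]


start = datetime(2017, 7, 1)  # arbitrary date, for illustration only
store = TrustAnchorStore({"KSK-2010"})
published = {"KSK-2010", "KSK-2017"}

for day in (0, 10, 29, 30):
    store.observe(published, start + timedelta(days=day))
    print(f"day {day:2}: trusted = {sorted(store.trusted)}")
```

A resolver that never runs this kind of update logic, or whose stored anchors are reset on every restart, keeps trusting only the old key. That was the population ICANN was worried about when it delayed the rollover.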

2026 CA/Browser Forum requirement

CA/Browser Forum Ballot SC-085v2 (effective March 15, 2026) requires all publicly-trusted certificate authorities to validate DNSSEC when present during Domain Control Validation and CAA checks. If your domain has a misconfigured DNSSEC chain — expired signatures, broken delegation, wrong DS — the CA will refuse to issue or renew a TLS certificate until the chain is fixed. This does not mandate DNSSEC activation; zones that have never been signed are unaffected. But domains that signed their zones must now keep the chain valid or lose the ability to get new certificates.

Recall before you leave
  1. What does DNSSEC validate: the A record data, the fact that a name exists, or both?
  2. What is the operational difference between ZSK and KSK, and why does the split exist?
  3. What does the +cd flag do in dig, and why is it useful during a DNSSEC incident?
Recap

DNSSEC authenticates DNS responses using a chain of cryptographic signatures from the root zone’s trust anchor down to individual records. Two key types serve distinct roles: the ZSK signs data RRsets and rotates frequently; the KSK signs the DNSKEY RRset and rotates rarely, with its hash stored as a DS record in the parent zone. A break in the chain — mismatched DS after a KSK rollover, expired RRSIG, wrong delegation — causes validating resolvers to return SERVFAIL and mark the answer BOGUS. Non-validating resolvers still serve the data, which is why a DNSSEC misconfiguration typically breaks 30–40% of users (those on Cloudflare, Google, Quad9) while the rest are unaffected. NSEC and NSEC3 authenticate denial of existence; NSEC3’s hashed chaining prevents trivial zone enumeration. Since March 2026, CAs must validate DNSSEC chains during certificate issuance — a broken chain now also blocks TLS certificate renewal.
