Networking & Protocols
MTU and fragmentation
A developer deploys a service behind a VPN. Small requests work fine. Large requests — a file upload, a big JSON payload — silently hang. No error message. The server receives nothing. The problem is a mismatch between packet size and the maximum the link can carry, combined with a firewall blocking the error message that would have fixed it.
What MTU is
Maximum Transmission Unit (MTU) is the largest IP packet (header + payload) that a link can carry without fragmentation. Ethernet’s standard MTU is 1500 bytes. Different link technologies have different MTUs:
- Ethernet (standard): 1500 bytes
- PPPoE (DSL): 1492 bytes
- Wi-Fi (802.11): ~1500 bytes
- Jumbo frames (10G/40G): 9000 bytes
- IPv6 minimum required MTU: 1280 bytes
- VPN tunnels (WireGuard, IPSec): 1400–1450 bytes
The Path MTU is the smallest MTU of any link on the path from source to destination. A packet larger than the Path MTU cannot cross the path intact: it is either fragmented along the way (IPv4 with the DF bit clear) or dropped.
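As a minimal sketch of that definition, the Path MTU is just the minimum over the per-link MTUs. The hop list below is a hypothetical path made up for illustration, not a measurement.

```python
def path_mtu(link_mtus):
    """Return the Path MTU: the smallest MTU among the links on the path."""
    if not link_mtus:
        raise ValueError("a path needs at least one link")
    return min(link_mtus)


# Hypothetical path: Ethernet LAN -> PPPoE uplink -> VPN tunnel -> Ethernet
hops = [1500, 1492, 1420, 1500]
print(path_mtu(hops))  # 1420
```

One narrow link anywhere on the path is enough to lower the limit for the whole connection.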
IPv4 fragmentation
When an IPv4 packet arrives at a router whose egress link has a smaller MTU, the router splits the packet into fragments:
- The router copies the identification field to all fragments.
- It sets the “more fragments” flag on all but the last.
- Each fragment carries an offset into the original packet.
- The destination reassembles fragments that share the same identification, source address, destination address, and protocol.
Fragmentation is expensive: any one lost fragment forces retransmission of the whole original datagram. Each fragment is routed independently. Fragmentation at routers adds latency and CPU work.
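The splitting steps above can be sketched as follows. This is an illustrative model, not a real router implementation: it assumes a 20-byte IPv4 header with no options, and the `Fragment` class and `fragment` function are hypothetical names. One detail the sketch makes visible is that the offset is stored in 8-byte units, so every fragment except the last must carry a payload that is a multiple of 8 bytes.

```python
from dataclasses import dataclass

IPV4_HEADER_LEN = 20  # assumption: no IP options


@dataclass
class Fragment:
    ident: int            # identification, copied from the original packet
    offset_units: int     # offset into the original payload, in 8-byte units
    more_fragments: bool  # MF flag: set on every fragment except the last
    payload_len: int      # payload bytes carried by this fragment


def fragment(ident, payload_len, egress_mtu):
    """Split a payload of payload_len bytes to fit an egress link MTU."""
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_payload = (egress_mtu - IPV4_HEADER_LEN) // 8 * 8
    if max_payload <= 0:
        raise ValueError("egress MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_payload, payload_len - offset)
        is_last = offset + size >= payload_len
        fragments.append(Fragment(ident, offset // 8, not is_last, size))
        offset += size
    return fragments


# A 1500-byte packet (1480 bytes of payload) meeting a 1400-byte link.
for frag in fragment(ident=0x1234, payload_len=1480, egress_mtu=1400):
    print(frag)
```

With these numbers the payload is split into 1376 bytes and 104 bytes. The first fragment is smaller than the 1380 bytes the link could carry because of the 8-byte alignment.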
The DF (Don’t Fragment) bit. When DF is set in the IPv4 header Flags field, a router must not fragment the packet. If the packet doesn’t fit the egress MTU, the router drops it and sends ICMP “fragmentation needed” (type 3, code 4) back to the source. This enables PMTUD.
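A small sketch of how the DF bit sits in the header and how it changes the router's choice. The flags and fragment offset share one 16-bit field at bytes 6 and 7 of the IPv4 header; the function names and the returned strings are hypothetical, chosen only for illustration.

```python
import struct

DF_MASK = 0x4000      # Don't Fragment
MF_MASK = 0x2000      # More Fragments
OFFSET_MASK = 0x1FFF  # fragment offset, in 8-byte units


def parse_flags_and_offset(ipv4_header: bytes):
    """Decode the flags/fragment-offset field (bytes 6-7) of an IPv4 header."""
    (word,) = struct.unpack("!H", ipv4_header[6:8])
    return {
        "df": bool(word & DF_MASK),
        "mf": bool(word & MF_MASK),
        "offset_bytes": (word & OFFSET_MASK) * 8,
    }


def forward_decision(total_len, df, egress_mtu):
    """What a router does with a packet of total_len bytes on this egress link."""
    if total_len <= egress_mtu:
        return "forward"
    if df:
        return "drop, send ICMP fragmentation needed"
    return "fragment"


# Minimal fake 20-byte header with only the DF bit set in bytes 6-7.
header = bytes(6) + struct.pack("!H", DF_MASK) + bytes(12)
fields = parse_flags_and_offset(header)
print(fields)
print(forward_decision(1500, fields["df"], egress_mtu=1400))
```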
IPv6 — no router fragmentation
IPv6 routers never fragment. If a packet is too large, the router drops it and sends ICMPv6 “Packet Too Big” back to the sender. The sender must reduce the packet size and retransmit. This moves complexity to the sender and keeps router processing simple.
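Every IPv6 link must support an MTU of at least 1280 bytes, so a sender can always send packets of up to 1280 bytes to any destination without discovery. To send anything larger, the sender needs Path MTU Discovery. RFC 8200 strongly recommends implementing it; a minimal implementation may instead limit itself to 1280-byte packets.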
Path MTU Discovery (PMTUD)
PMTUD discovers the smallest MTU on the path. The sender sets DF=1 (IPv4) and sends full-sized packets. When a router drops one because it’s too large, it sends ICMP “packet too big” (or “fragmentation needed”) back. That message carries the MTU of the link the packet could not cross; the sender lowers its packet size to that value and retries.
PMTUD black holes occur when a firewall or middlebox silently drops ICMP. The sender never receives the “packet too big” signal. It keeps transmitting full-size packets, which keep getting dropped. The connection stalls — small requests work (they fit), large ones hang forever.
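The difference between working PMTUD and a black hole can be modelled in a few lines. This is a toy simulation under stated assumptions: the path is a hypothetical list of link MTUs, the ICMP message is assumed to report the MTU of the link that rejected the packet, and `send` and `discover_path_mtu` are illustrative names rather than any real API.

```python
def send(size, link_mtus):
    """Return None if the packet crosses every link, else the MTU of the
    first link that is too small (what the ICMP message would report)."""
    for mtu in link_mtus:
        if size > mtu:
            return mtu
    return None


def discover_path_mtu(link_mtus, initial_size, icmp_filtered=False, max_tries=10):
    """Simulate a DF=1 sender. Returns the size that got through, or None
    if the sender is still stuck after max_tries (a black hole)."""
    size = initial_size
    for _ in range(max_tries):
        reported = send(size, link_mtus)
        if reported is None:
            return size          # delivered end to end
        if icmp_filtered:
            continue             # no ICMP arrives: retry at the same size
        size = reported          # ICMP arrived: shrink to the reported MTU
    return None


path = [1500, 1492, 1420]  # hypothetical path
print(discover_path_mtu(path, 1500))                      # 1420
print(discover_path_mtu(path, 1500, icmp_filtered=True))  # None
```

In the filtered case the sender never changes its packet size, which is the stall described above. A packet that already fits the path is unaffected, which is why small requests keep working.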
Symptoms of a PMTUD black hole:
- Small HTTP requests succeed; large payloads or files hang.
- ping -M do -s 1450 destination fails with no response.
- traceroute shows the path fine, but TCP sessions stall at large sizes.
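When reading a ping test like the one above, note that the -s value is the ICMP payload size, not the packet size. For IPv4 the packet on the wire is 28 bytes larger (20-byte IP header without options plus 8-byte ICMP header). A small helper, with hypothetical function names, makes the arithmetic explicit:

```python
IPV4_HEADER = 20  # assumption: no IP options
ICMP_HEADER = 8


def ping_payload_for_mtu(mtu):
    """Largest ping -s value whose IPv4 packet still fits the given MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def packet_size_for_ping_payload(payload):
    """Total IPv4 packet size produced by a given ping -s value."""
    return payload + ICMP_HEADER + IPV4_HEADER


print(ping_payload_for_mtu(1500))          # 1472
print(packet_size_for_ping_payload(1450))  # 1478
```

So a 1450-byte payload produces a 1478-byte packet: it fits standard Ethernet but not a tunnel with an MTU of, say, 1420.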
Fixes for PMTUD black holes
TCP MSS clamping: the MSS (Maximum Segment Size) is negotiated in the TCP handshake. A router on the path (VPN gateway, PPPoE terminator) rewrites the MSS option in SYN packets to a safe value (e.g. 1452 = 1500 - 20 IP - 20 TCP - 8 PPPoE). Endpoints never negotiate a segment size that would create an oversized packet. This is the pragmatic fix for most VPN and PPPoE deployments.
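The clamp value follows directly from the header arithmetic. The sketch below assumes 20-byte IPv4 and TCP headers with no options; `safe_mss` and `clamp_syn_mss` are hypothetical helper names, and real gateways do the rewrite in the packet-processing path rather than in application code.

```python
def safe_mss(link_mtu, encap_overhead, ip_header=20, tcp_header=20):
    """Largest TCP segment that fits once encapsulation overhead is added."""
    return link_mtu - ip_header - tcp_header - encap_overhead


def clamp_syn_mss(advertised_mss, clamp_to):
    """What the gateway writes into the SYN: never raise the MSS, only lower it."""
    return min(advertised_mss, clamp_to)


pppoe_mss = safe_mss(1500, encap_overhead=8)
print(pppoe_mss)                       # 1452
print(clamp_syn_mss(1460, pppoe_mss))  # 1452: lowered to the safe value
print(clamp_syn_mss(1400, pppoe_mss))  # 1400: already small enough, unchanged
```

Because the rewrite happens in the SYN, the endpoints settle on the smaller segment size before any data is sent and never depend on an ICMP message arriving.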
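PLPMTUD (RFC 4821): Packetisation-Layer PMTUD. The sender probes with increasingly large TCP segments; if a probe is not acknowledged, it backs off to the last size that worked. It does not rely on ICMP at all, so ICMP filtering cannot break it. Support varies by stack and it is not always on by default; on Linux, for example, it is controlled by the net.ipv4.tcp_mtu_probing sysctl, which defaults to off.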
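A rough sketch of the probing idea follows. RFC 4821 leaves the search strategy to the implementation; this example uses a binary search between a size assumed to work and an upper bound, which is one possible strategy and not necessarily what any given stack does. `probe_ok` is a hypothetical callback standing in for "send a probe of this size and see whether it is acknowledged".

```python
def probe_path_mtu(probe_ok, low=1280, high=1500):
    """Find the largest packet size in [low, high] that gets acknowledged.

    Assumes a packet of size `low` always gets through and that any size
    above the path limit is never acknowledged.
    """
    if probe_ok(high):
        return high
    # Invariant: `low` is known to work, `high` is known to fail.
    while high - low > 1:
        mid = (low + high) // 2
        if probe_ok(mid):
            low = mid    # probe acknowledged: search larger sizes
        else:
            high = mid   # probe lost: back off
    return low


# Hypothetical path that silently drops anything above 1420 bytes.
print(probe_path_mtu(lambda size: size <= 1420))  # 1420
```

The only signal used is whether a probe was acknowledged, so the search works the same whether or not ICMP is filtered. A real implementation also has to tell a lost probe apart from ordinary congestion loss, which this sketch ignores.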
Open ICMP policies: allow ICMP type 3 (destination unreachable) and ICMPv6 type 2 (packet too big) through firewalls. Filtering all ICMP is a common but incorrect “security” policy that breaks PMTUD and traceroute.
Diagnose: packets are silently dropped when their size exceeds 1400 bytes.
Why does IPv6 forbid router-level fragmentation?
What is a PMTUD black hole?
- 01 You run tcpdump and see TCP retransmissions every ~1 second, no loss detected, the connection runs through a VPN tunnel. What is a likely PMTUD-related root cause?
- 02 Why does IPv6 require the sender to perform PMTUD, while IPv4 routers can fragment on the fly?
- 03 What is TCP MSS clamping and why is it used instead of relying on PMTUD?
MTU is the largest IP packet a link can carry. Ethernet standard is 1500 bytes; VPN tunnels reduce it further by adding header overhead. IPv4 routers can fragment an oversized packet into smaller pieces, but any lost fragment forces retransmission of the entire datagram. IPv6 bans router fragmentation entirely: if a packet is too large, the router sends ICMPv6 “Packet Too Big” and the sender must shrink. Path MTU Discovery relies on these ICMP signals: when a firewall silently drops ICMP, the sender never learns to shrink, and large connections stall (PMTUD black hole). Pragmatic fixes: TCP MSS clamping (rewrites the MSS option in SYNs to a safe value, immune to ICMP filtering) and PLPMTUD (RFC 4821, probes via TCP segments).