Networking & Protocols
IP security and operations
A site goes offline for one ISP. You can still reach it from your machine. The customer reports their browser times out completely. A BGP looking-glass query shows someone else is announcing your prefix. You have 15 minutes to understand what happened and fix it. This is IP security — not the TLS layer above, but the routing substrate underneath.
IP spoofing and why it matters
Nothing in IP prevents a sender from writing an arbitrary source address. A host can claim to be 1.2.3.4 and the network forwards the packet. This enables:
- Reflection/amplification DDoS: the attacker sends small UDP requests to an open DNS or NTP server with the source IP spoofed to the victim’s address, and the server sends its much larger response to the victim. Amplification factors: DNS up to ~60×, NTP monlist up to ~4,670×, Memcached up to ~51,000×. At the Memcached upper bound, 1 Gbps of spoofed requests would in theory become 51 Tbps at the victim; in practice the reflectors’ own capacity caps the total well below that.
- Evading IP-based rate limiting: flood from randomised source IPs.
- Bypassing firewall rules that trust specific source IPs.
Key figures
| Metric | Figure |
|---|---|
| BCP 38 adoption (egress filtering) | ~60% of ASes |
| RPKI ROV adoption (Tier-1 ISPs) | ~85%+ in 2026 |
| DNS amplification factor | up to 60× |
| NTP monlist amplification | up to 4,670× |
| Cloudflare DDoS record (2024) | ~5.6 Tbps |
| BGP hijack incidents (2024) | multiple high-profile cases |
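The amplification arithmetic can be sketched in a few lines. This is a rough illustration only: the factors are the upper bounds quoted in this section, and the function name `reflected_gbps` is made up for the example.

```python
# Upper-bound amplification factors quoted in this section.
# Real attacks usually achieve less than these maxima.
AMPLIFICATION_FACTORS = {
    "dns": 60,
    "ntp_monlist": 4670,
    "memcached": 51000,
}


def reflected_gbps(attacker_gbps: float, protocol: str) -> float:
    """Theoretical ceiling on traffic reaching the victim, in Gbps."""
    return attacker_gbps * AMPLIFICATION_FACTORS[protocol]


if __name__ == "__main__":
    for proto in AMPLIFICATION_FACTORS:
        ceiling = reflected_gbps(1, proto)
        print(f"{proto}: 1 Gbps spoofed -> up to {ceiling:,.0f} Gbps at the victim")
```

The result is a ceiling, not a forecast: it ignores how much bandwidth the reflectors themselves can actually emit.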
BCP 38 and uRPF: defeating spoofing at the source
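BCP 38 (RFC 2827, “Network Ingress Filtering”) recommends that ISPs filter traffic at the customer edge: a packet is forwarded only if its source IP belongs to address space that customer is authorised to use. From the customer network’s point of view this is egress filtering, which is the term used in this section. A packet with source 192.0.2.1 should not leave a network that only owns 203.0.113.0/24.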
Global BCP 38 adoption is ~60%. The remaining 40% of ASes leak spoofed traffic, enabling amplification attacks at scale.
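The BCP 38 check itself is simple, as the following minimal sketch suggests. It assumes a single hypothetical customer whose only authorised prefix is 203.0.113.0/24; `ALLOWED_PREFIXES` and `permit_source` are illustrative names, not part of any router API.

```python
import ipaddress

# Hypothetical customer: authorised to source traffic only from this prefix.
ALLOWED_PREFIXES = [ipaddress.ip_network("203.0.113.0/24")]


def permit_source(src: str, allowed=ALLOWED_PREFIXES) -> bool:
    """Return True if the packet's source address falls in an authorised prefix."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in allowed)


if __name__ == "__main__":
    print(permit_source("203.0.113.7"))  # legitimate customer address
    print(permit_source("192.0.2.1"))    # spoofed source, should be dropped
```

On real routers this check is typically expressed as an ACL or as uRPF rather than as application code.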
uRPF (Unicast Reverse Path Forwarding) is the data-plane mechanism. When a packet arrives on an interface, the router checks: “would I route a packet back to the source IP via this same interface?” If not, the source is spoofed — drop it. uRPF in strict mode requires the return path to use the same interface; loose mode just requires the source to be routable. Deployed at customer-facing router ports, not in the core.
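The strict and loose checks can be sketched against a toy forwarding table. Everything here is an assumption made for illustration: the prefixes, the interface names (`cust0`, `cust1`, `uplink0`) and the helpers `lookup` and `urpf_accept`.

```python
import ipaddress

# Toy FIB: prefix -> outgoing interface (illustrative values only).
FIB = {
    ipaddress.ip_network("203.0.113.0/24"): "cust0",
    ipaddress.ip_network("198.51.100.0/24"): "cust1",
    ipaddress.ip_network("0.0.0.0/0"): "uplink0",
}


def lookup(addr, fib=FIB):
    """Longest-prefix match; returns the outgoing interface or None."""
    ip = ipaddress.ip_address(addr)
    matches = [net for net in fib if ip in net]
    if not matches:
        return None
    return fib[max(matches, key=lambda net: net.prefixlen)]


def urpf_accept(src, in_iface, mode="strict", fib=FIB):
    """Apply the uRPF test to a packet with source `src` arriving on `in_iface`."""
    route_iface = lookup(src, fib)
    if route_iface is None:
        return False  # source is not routable at all
    if mode == "loose":
        return True   # loose: any route to the source is enough
    return route_iface == in_iface  # strict: return path must use the same interface


if __name__ == "__main__":
    print(urpf_accept("203.0.113.9", "cust0"))           # matches its own port
    print(urpf_accept("198.51.100.9", "cust0"))          # belongs to cust1: dropped
    print(urpf_accept("198.51.100.9", "cust0", "loose")) # routable: accepted
```

Note that in this sketch a default route makes every source "routable", so loose mode accepts everything. That is one reason strict mode is preferred on single-homed customer ports, where routing is symmetric.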
BGP hijacks and RPKI
BGP hijack: a rogue AS announces a prefix it does not own. Example: AS12345 announces 203.0.113.0/24 even though it belongs to AS64512. Neighbouring ASes may accept the announcement (BGP has no built-in authentication). Traffic to that prefix reroutes to AS12345. The legitimate owner’s services become unreachable or traffic is intercepted.
RPKI (Resource Public Key Infrastructure): a cryptographic system that binds IP prefixes to AS numbers.
- A prefix owner creates a Route Origin Authorisation (ROA): “Only AS64512 may announce 203.0.113.0/24.” The ROA is signed and published at the owner’s RIR (RIPE, ARIN, etc.).
- Routers configured for Route Origin Validation (ROV) fetch the signed ROA data from trusted repositories and mark BGP routes as valid, invalid, or not-found.
- ROV-enabled routers drop announcements whose origin AS does not match the ROA: invalid routes are filtered.
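The three-way classification can be sketched as below, following the origin-validation logic of RFC 6811. The `Roa` record and the `validate` function are simplified stand-ins for illustration, not the interface of any real validator.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Roa:
    prefix: str       # prefix covered by the ROA
    max_length: int   # longest prefix length the origin may announce
    asn: int          # authorised origin AS


def validate(route_prefix: str, origin_asn: int, roas) -> str:
    """Classify a BGP route as 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # A ROA covers the route if the route sits inside the ROA prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by some ROA but matched by none: invalid. No covering ROA: not-found.
    return "invalid" if covered else "not-found"


if __name__ == "__main__":
    roas = [Roa("203.0.113.0/24", 24, 64512)]
    print(validate("203.0.113.0/24", 64512, roas))  # legitimate origin
    print(validate("203.0.113.0/24", 12345, roas))  # hijack by AS12345
    print(validate("198.51.100.0/24", 64512, roas)) # no ROA exists
```

The `max_length` field matters: with the ROA above, a more-specific announcement such as 203.0.113.0/25 is invalid even when it comes from AS64512.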
Deployment: Tier-1 ISPs have largely deployed ROV by 2026; smaller ISPs lag. Cloudflare’s “Is BGP safe yet?” tracker measures coverage. Until deployment is near-universal, hijacks will keep occurring, at a rate of roughly one a month.
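RPKI gotcha: ROAs have expiry dates, just like TLS certificates. If you let a ROA expire, your prefix falls back to not-found. Most ROV deployments still accept not-found routes, so the usual cost is lost hijack protection rather than an outage. But if another ROA still covers the prefix (for example, a less-specific one naming a different origin AS), your announcement becomes invalid and ROV-enabled ISPs will drop it. Monitor ROA expiry like any certificate.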
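A monitoring check for ROA expiry might look like the sketch below. It assumes the ROA list has already been exported from your RIR portal or validator into dictionaries with `prefix` and `not_after` keys. That record shape and the name `expiring_roas` are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expiring_roas(roas, now, warn_days=30):
    """Return (prefix, days_left) for ROAs expiring within warn_days, soonest first.

    Already-expired ROAs are included with a negative days_left.
    """
    cutoff = now + timedelta(days=warn_days)
    due = [
        (roa["prefix"], (roa["not_after"] - now).days)
        for roa in roas
        if roa["not_after"] <= cutoff
    ]
    return sorted(due, key=lambda item: item[1])


if __name__ == "__main__":
    now = datetime(2026, 1, 1, tzinfo=timezone.utc)
    roas = [
        {"prefix": "203.0.113.0/24", "not_after": datetime(2026, 1, 11, tzinfo=timezone.utc)},
        {"prefix": "198.51.100.0/24", "not_after": datetime(2026, 6, 1, tzinfo=timezone.utc)},
    ]
    for prefix, days_left in expiring_roas(roas, now):
        print(f"ROA for {prefix} expires in {days_left} days")
```

Wiring the output into the same alerting path used for TLS certificate expiry keeps the two checks from drifting apart.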
A senior NetOps engineer is paged: a customer reports their site at example.com is unreachable from one ISP. Trace the diagnosis.
DDoS at the IP layer
Volumetric attacks flood the target with packets, exceeding link capacity or kernel-state limits. Defence:
- Tier-1 scrubbers (Cloudflare Magic Transit, Akamai Prolexic, AWS Shield Advanced): sit upstream, advertise BGP routes pulling traffic through scrubbing centres, drop attack traffic, re-inject legitimate.
- Anycast absorption: distributing the prefix across hundreds of POPs means each absorbs a fraction of the attack volume.
- Source validation (BCP 38) at the source ISP eliminates spoofed amplification packets at origin — the single biggest unilateral fix but adoption is uneven.
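To make the anycast point concrete, here is a rough sketch of how an attack divides across POPs. Attack traffic lands at whichever POP is closest to its sources, so the split is rarely even. The POP names, shares and capacities below are invented for the example.

```python
def saturated_pops(attack_gbps, share_by_pop, capacity_gbps_by_pop):
    """Return the POPs whose share of the attack exceeds their capacity.

    share_by_pop holds relative weights (they need not sum to 1).
    """
    total = sum(share_by_pop.values())
    return sorted(
        pop
        for pop, share in share_by_pop.items()
        if attack_gbps * share / total > capacity_gbps_by_pop[pop]
    )


if __name__ == "__main__":
    # Hypothetical 5.6 Tbps attack spread unevenly over three POPs.
    shares = {"fra": 5, "sin": 3, "gru": 2}
    capacity = {"fra": 3000, "sin": 1500, "gru": 1500}
    print(saturated_pops(5600, shares, capacity))
```

In this invented scenario only one POP is pushed past capacity, which is the case where operators would shift or withdraw that POP's announcement.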
IP fragmentation as a security risk. Some firewalls inspect only the first fragment, letting attackers hide a malicious payload in subsequent fragments. Modern firewalls reassemble before inspection, but reassembly buffers consume memory and are themselves an attack surface. IPv6 forbids fragmentation by routers, which narrows this surface, but sending hosts can still fragment, so firewalls still have to handle IPv6 fragments.
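The difference between first-fragment inspection and reassembly can be shown with a toy model. This sketch simplifies heavily: fragments are plain dictionaries, offsets are in bytes rather than the 8-byte units real IP uses, and `SIGNATURE` is a placeholder pattern.

```python
SIGNATURE = b"ATTACK"  # placeholder for whatever pattern the firewall looks for


def naive_inspect(fragments) -> bool:
    """Look only at the fragment with offset 0, as a first-fragment-only firewall would."""
    first = next((f for f in fragments if f["offset"] == 0), None)
    return first is not None and SIGNATURE in first["data"]


def reassemble(fragments):
    """Rebuild the payload in offset order; return None on gaps, overlaps or a missing tail."""
    ordered = sorted(fragments, key=lambda f: f["offset"])
    payload = b""
    for frag in ordered:
        if frag["offset"] != len(payload):
            return None  # gap or overlap: refuse to guess
        payload += frag["data"]
    if not ordered or ordered[-1]["more"]:
        return None      # final fragment has not arrived
    return payload


def reassembling_inspect(fragments) -> bool:
    """Inspect the full payload after reassembly."""
    payload = reassemble(fragments)
    return payload is not None and SIGNATURE in payload


if __name__ == "__main__":
    # The signature is split across the fragment boundary.
    frags = [
        {"offset": 0, "more": True, "data": b"xxxxxATT"},
        {"offset": 8, "more": False, "data": b"ACK"},
    ]
    print(naive_inspect(frags))         # misses the split signature
    print(reassembling_inspect(frags))  # finds it after reassembly
```

A real firewall would also cap the number and lifetime of pending reassembly buffers, since holding fragments is exactly the memory cost described above.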
Operational observability tools
| Tool | What it tells you |
|---|---|
| `traceroute` / `mtr` | Path + per-hop loss/latency |
| `tcpdump` | Raw packet capture |
| `ip route show` | Linux routing table |
| `ip -s link` | Interface stats: drops, errors |
| BGP looking-glass | Route table from inside an ISP |
| RIPE Atlas | Crowd-sourced global measurement (13k+ probes) |
| sFlow/NetFlow/IPFIX | Sampled traffic analytics |
Looking-glass servers (e.g. lg.he.net for Hurricane Electric) expose BGP and traceroute from inside ISP networks — essential for diagnosing whether your prefix is announced correctly from a remote AS’s perspective.
The data-plane revolution
Modern routers do not use kernel IP forwarding for line-rate work. Programmable ASICs (Tofino, Broadcom Trident) implement forwarding in hardware at terabits per second. Software data planes move packet processing out of the kernel’s general-purpose network stack for flexibility at high throughput: DPDK moves it into userspace, while eBPF/XDP runs it in a hook in the driver’s receive path, before the kernel stack sees the packet.
AWS’s Nitro cards offload network virtualisation entirely from VM hosts. When you operate a load balancer or firewall at meaningful scale, performance depends on data-plane architecture as much as on the routing protocol — and bugs in the data plane manifest as silent packet loss rather than obvious crashes.
mtr output showing packet loss — diagnose
$ mtr -rwc 50 dest.example.com
HOST: my-host Loss% Snt Last Avg Best Wrst StDev
1.|-- 192.168.1.1 0.0% 50 1.2 1.3 1.0 2.5 0.3
2.|-- isp-edge.example.net 0.0% 50 8.1 8.5 7.9 9.8 0.4
3.|-- core-rtr-1.tier1.net 0.0% 50 12.4 12.7 12.1 13.5 0.3
4.|-- core-rtr-2.tier1.net 0.0% 50 25.8 26.2 25.5 27.0 0.3
5.|-- peering.transit.net 14.0% 50 55.2 56.8 54.0 92.5 10.2
6.|-- ??? 100.0% 50 0.0 0.0 0.0 0.0 0.0
7.|-- dest.example.com 0.0% 50 71.3 72.0 70.5 75.1 0.8
Hop 5 shows 14% loss + high jitter, hop 6 shows 100% loss, but hop 7 shows 0%. What is happening?
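One way to reason about a report like this is to compare each intermediate hop with the destination. Loss that does not carry through to the final hop usually means the router is rate-limiting or ignoring the probes it must answer itself, while still forwarding traffic normally. The sketch below applies that rule to the report above; the parser is a simplified assumption about the `mtr -rw` text layout, and `cosmetic_loss_hops` is an illustrative name.

```python
import re

# Matches lines such as "  5.|-- peering.transit.net 14.0% 50 ..."
HOP_RE = re.compile(r"^\s*(\d+)\.\|--\s+(\S+)\s+([\d.]+)%")

REPORT = """\
HOST: my-host Loss% Snt Last Avg Best Wrst StDev
1.|-- 192.168.1.1 0.0% 50 1.2 1.3 1.0 2.5 0.3
2.|-- isp-edge.example.net 0.0% 50 8.1 8.5 7.9 9.8 0.4
3.|-- core-rtr-1.tier1.net 0.0% 50 12.4 12.7 12.1 13.5 0.3
4.|-- core-rtr-2.tier1.net 0.0% 50 25.8 26.2 25.5 27.0 0.3
5.|-- peering.transit.net 14.0% 50 55.2 56.8 54.0 92.5 10.2
6.|-- ??? 100.0% 50 0.0 0.0 0.0 0.0 0.0
7.|-- dest.example.com 0.0% 50 71.3 72.0 70.5 75.1 0.8
"""


def parse_loss(report: str):
    """Return a list of (hop_number, host, loss_percent) tuples."""
    hops = []
    for line in report.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2), float(match.group(3))))
    return hops


def cosmetic_loss_hops(hops):
    """Hops whose loss exceeds the destination's loss, so it is not end-to-end loss."""
    if not hops:
        return []
    final_loss = hops[-1][2]
    return [hop for hop, _host, loss in hops[:-1] if loss > final_loss]


if __name__ == "__main__":
    print(cosmetic_loss_hops(parse_loss(REPORT)))
```

Loss that starts at one hop and persists on every later hop, including the destination, points to a real forwarding problem at or just before that hop.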
What does BCP 38 (RFC 2827) recommend?
Which RFC defines the IPv6 base packet format including the 40-byte fixed header?
Why this works
Why RPKI deployment is slow despite hijacks. RPKI requires: (1) every prefix owner to register ROAs at their RIR; (2) every router-operator to fetch RPKI data, validate, and drop invalids — operational overhead + risk of false positives; (3) the RPKI infrastructure itself to stay available (a Cloudflare RPKI outage in 2022 caused routes to flap). The chicken-and-egg: partial deployment helps, but full benefit requires near-universal adoption. Tier-1 ISPs now largely deploy ROV; smaller ISPs lag. The trajectory is clear — RPKI will eventually be the default — but “eventually” has been five years and counting.
- 01 Explain why RPKI exists and why deployment has been slow.
- 02 How does DNS amplification work and why does BCP 38 prevent it?
- 03 An mtr trace shows 0% loss at hop 7 (the destination) but 100% loss at hop 6. What is almost certainly true about hop 6?
IP has no built-in source authentication — any host can spoof any source IP — and no route authentication — any AS can announce any prefix. BCP 38 egress filtering blocks spoofed packets at ~60% of ISPs, preventing amplification attacks where an attacker’s forged UDP packets bounce large responses to victims. RPKI binds prefixes to AS numbers via signed ROAs; ROV-enabled routers drop BGP announcements from unauthorised ASes. Deployment is incomplete, so BGP hijacks occur regularly. DDoS mitigation layers anycast (distribute attack across POPs), upstream scrubbers (Cloudflare Magic Transit, AWS Shield), and BCP 38. Modern router data planes use ASICs or eBPF/XDP for line-rate forwarding; bugs manifest as silent packet loss. Operational tools: mtr for path + loss, tcpdump for raw capture, BGP looking-glass for remote route views, RIPE Atlas for global measurement.