
Networking & Protocols

Proxy intercepts and security gates: rate limiters, WAF, mTLS

Crux: Before a request reaches origin processing, it passes through reverse-proxy health routing, a token-bucket rate limiter, WAF signature matching, and optional mTLS service-mesh authentication. Each gate adds latency and operational complexity.

A botnet sends 10,000 requests per second from 10,000 different IPs. Your per-IP rate limiter sees 1 request per second from each IP, well under the limit. Your WAF sees legitimate-looking browser fingerprints. The origin server is about to be overwhelmed. The security gates are running, but each one is running alone instead of as part of a layered defence. This lesson examines each gate, its cost, and why none of them works without the others.

Gate 1: Reverse proxy health routing

Before Bea’s SYN packet reaches origin, it may arrive at Patty — a reverse proxy or CDN edge. Patty makes a routing decision:

Cache hit path. Patty checks her cache: “Do I have the response for this URL with matching Vary headers?” If yes, she responds immediately without forwarding to origin. TCP + TLS were negotiated against Patty’s edge server (~20 ms RTT); origin never sees the request.
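The cache-hit check above can be sketched as a lookup keyed on the URL plus the request headers named in Vary. This is a minimal illustration, not how any particular CDN works: the class name `EdgeCache` and its methods are hypothetical, and it assumes one Vary set per URL.

```python
from typing import Dict, Optional, Tuple


class EdgeCache:
    """Toy edge cache keyed by URL plus the request headers named in Vary."""

    def __init__(self) -> None:
        # url -> header names the stored responses vary on (assumed one set per URL)
        self._vary: Dict[str, Tuple[str, ...]] = {}
        # (url, values of the varying headers) -> response body
        self._store: Dict[Tuple[str, Tuple[str, ...]], bytes] = {}

    @staticmethod
    def _values(headers: Dict[str, str], names: Tuple[str, ...]) -> Tuple[str, ...]:
        # Header names are case-insensitive, so compare on lowercase.
        lowered = {k.lower(): v for k, v in headers.items()}
        return tuple(lowered.get(n, "") for n in names)

    def put(self, url: str, request_headers: Dict[str, str], vary: str, body: bytes) -> None:
        names = tuple(sorted(n.strip().lower() for n in vary.split(",") if n.strip()))
        self._vary[url] = names
        self._store[(url, self._values(request_headers, names))] = body

    def get(self, url: str, request_headers: Dict[str, str]) -> Optional[bytes]:
        names = self._vary.get(url)
        if names is None:
            return None  # never cached: miss, forward to origin
        return self._store.get((url, self._values(request_headers, names)))
```

A request with `Accept-Encoding: br` misses a response stored for `gzip` when the origin sent `Vary: Accept-Encoding`, which is the "matching Vary headers" condition.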

Cache miss path. Patty checks origin health. Health checks run out-of-band every few seconds — a periodic GET /health or TCP probe to each origin server. If origin is healthy, Patty forwards. If unhealthy (failed check, process down, 5xx rate too high), Patty routes to a backup server in the pool. This happens mid-flight — a SYN arriving during a failover may be routed to a different server than the one the previous SYN reached.
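A rough sketch of the cache-miss routing decision follows. The names `Origin`, `record_probe`, and `pick_origin` are hypothetical, and the rule "three consecutive failed probes marks a server unhealthy" is an assumption chosen for illustration, not something the lesson specifies.

```python
from dataclasses import dataclass
from typing import List, Optional

FAILURE_THRESHOLD = 3  # assumed: consecutive failed probes before a server is unhealthy


@dataclass
class Origin:
    name: str
    consecutive_failures: int = 0

    def healthy(self) -> bool:
        return self.consecutive_failures < FAILURE_THRESHOLD


def record_probe(origin: Origin, ok: bool) -> None:
    """Update state from one out-of-band health probe (e.g. GET /health)."""
    origin.consecutive_failures = 0 if ok else origin.consecutive_failures + 1


def pick_origin(pool: List[Origin]) -> Optional[Origin]:
    """Return the first healthy server in pool order, or None if all are down."""
    for origin in pool:
        if origin.healthy():
            return origin
    return None
```

Because probes update state independently of request handling, two requests arriving moments apart can be routed to different servers, which matches the mid-flight failover described above.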

Connection draining. When a server is being removed from the pool (deploy, scale-down), Patty stops sending new connections to it but lets existing connections finish their response. The draining window (typically 30–60 s) ensures in-flight requests complete cleanly before the server is taken offline.
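Connection draining can be modelled as a small state machine: stop accepting new requests, then wait for in-flight work or a timeout. The sketch below is an assumption-laden illustration; `DrainingBackend` and its method names are hypothetical, and the 30 s default simply mirrors the lower end of the window mentioned above.

```python
import time
from typing import Callable, Optional


class DrainingBackend:
    """Tracks in-flight requests and whether the backend may be removed."""

    def __init__(self, name: str, drain_timeout_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.name = name
        self.drain_timeout_s = drain_timeout_s
        self.in_flight = 0
        self._clock = clock
        self._drain_started: Optional[float] = None

    def accepts_new(self) -> bool:
        return self._drain_started is None

    def start_request(self) -> bool:
        """Return False if the backend is draining and must not get new work."""
        if not self.accepts_new():
            return False
        self.in_flight += 1
        return True

    def finish_request(self) -> None:
        self.in_flight -= 1

    def begin_drain(self) -> None:
        if self._drain_started is None:
            self._drain_started = self._clock()

    def safe_to_remove(self) -> bool:
        """True once draining has begun and work is done or the window expired."""
        if self._drain_started is None:
            return False
        timed_out = self._clock() - self._drain_started >= self.drain_timeout_s
        return self.in_flight == 0 or timed_out
```

The timeout acts as a backstop: a stuck connection cannot block a deploy forever, at the price of cutting off requests that outlive the window.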

Gate | Latency | What it blocks | What it misses
Proxy cache check | <1 ms | Cacheable traffic, served from the edge | Non-cacheable and mutation endpoints
Per-IP rate limiter | <1 ms | Single-IP floods, naive scrapers | Distributed botnets (1 req/IP/s)
WAF signature match | 1–5 ms | SQLi, XSS, CVE signatures | Novel attacks, encrypted payloads
mTLS client cert verify | 20–40 ms | Unauthenticated services in a service mesh | Compromised but valid cert holder

Gate 2: Rate limiter (token bucket)

The rate limiter enforces quotas: “This IP may send N requests per second.” Implementation: token bucket algorithm. Each IP has a bucket with capacity C. Every second, T tokens are added (up to C). Each request consumes one token. When tokens are exhausted, new requests are dropped or returned 429 Too Many Requests.
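A minimal token bucket might look like the sketch below. One deliberate difference from the description above: instead of adding T tokens on a one-second tick, it refills continuously in proportion to elapsed time, which is a common variant with the same long-run rate. The class name and the injectable `clock` parameter are illustrative choices.

```python
import time
from typing import Callable


class TokenBucket:
    """Per-client bucket: capacity C, refilled at refill_per_s tokens per second."""

    def __init__(self, capacity: float, refill_per_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self._clock = clock
        self._tokens = float(capacity)  # start full so an initial burst is allowed
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; False means respond with 429."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_s)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

In a real limiter there would be one bucket per client key (for example, a dictionary keyed by IP), with idle buckets evicted to bound memory.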

Token bucket vs leaky bucket. Token bucket allows bursts up to capacity C. Leaky bucket drains at a constant rate, preventing bursts. Most API rate limiters use token bucket because it allows bursty-but-honest clients (a browser loading 30 subresources simultaneously) without penalising them.
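To make the contrast concrete, here is a sketch of a leaky bucket used as a queue: requests are forwarded at a constant rate no matter how they arrive. The function name `leaky_bucket_departures` and the `queue_limit` parameter are hypothetical; arrivals are assumed to be sorted timestamps in seconds.

```python
from typing import List, Optional


def leaky_bucket_departures(arrivals: List[float], rate_per_s: float,
                            queue_limit: int) -> List[Optional[float]]:
    """Return the forwarding time for each arrival, or None if it was dropped."""
    interval = 1.0 / rate_per_s
    scheduled: List[float] = []          # departure times of accepted requests
    result: List[Optional[float]] = []
    for t in arrivals:
        # Accepted requests that have not yet departed at time t are still queued.
        waiting = sum(1 for d in scheduled if d > t)
        if waiting >= queue_limit:
            result.append(None)          # queue full: drop
            continue
        earliest = scheduled[-1] + interval if scheduled else t
        departure = max(t, earliest)
        scheduled.append(departure)
        result.append(departure)
    return result
```

Five requests arriving together at t=0 with a rate of 2/s leave at 0.0, 0.5, 1.0, 1.5, and 2.0 s. A token bucket with capacity 5 would forward all five immediately, which is why it suits the browser-subresource burst described above.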

The distributed botnet problem. A botnet with 10,000 IPs, each sending 0.9 req/s, sends 9,000 req/s total. Per-IP limit of 2 req/s passes all of them. Defence: add a global concurrent-connection limit or adaptive concurrency limiting at origin — when in-flight requests exceed the server’s capacity, reject new ones with fast 503 regardless of the per-IP quota.
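The global limit can be sketched as a counter of in-flight requests that is checked before any work starts. This is a fixed-threshold version; an adaptive limiter would adjust `max_in_flight` from observed latency, which is not shown. The names `ConcurrencyLimiter` and `handle` are illustrative.

```python
import threading
from typing import Callable


class ConcurrencyLimiter:
    """Caps total in-flight requests, independent of any per-IP quota."""

    def __init__(self, max_in_flight: int) -> None:
        self._max = max_in_flight
        self._in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        with self._lock:
            if self._in_flight >= self._max:
                return False
            self._in_flight += 1
            return True

    def release(self) -> None:
        with self._lock:
            self._in_flight -= 1


def handle(limiter: ConcurrencyLimiter, work: Callable[[], None]) -> int:
    """Run work under the limiter; return an HTTP-style status code."""
    if not limiter.try_acquire():
        return 503  # fast rejection: no origin work is done
    try:
        work()
        return 200
    finally:
        limiter.release()
```

The rejection path does no I/O and holds the lock only briefly, so shedding load stays cheap even when the server is saturated.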

Jitter on limit resets. If every client's quota refills at the same instant (say, on each whole second), the clients that were throttled all retry together, and a thundering herd hits the origin right after each refill. Solution: randomise the refill offset per client (add ±10% jitter to the refill window). Resets then stagger across clients, smoothing origin load.
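The jitter rule can be written as a one-line calculation. In this sketch, `jittered_window` is a hypothetical helper, and the uniform distribution is an assumption; the lesson only specifies the ±10% range.

```python
import random
from typing import Optional


def jittered_window(base_s: float, jitter_fraction: float = 0.10,
                    rng: Optional[random.Random] = None) -> float:
    """Return a refill window of base_s scaled by a random factor in [1-j, 1+j]."""
    if rng is None:
        rng = random.Random()
    return base_s * (1.0 + rng.uniform(-jitter_fraction, jitter_fraction))


# Each client draws its own window, so refills spread over 0.9 s to 1.1 s.
windows = [jittered_window(1.0, rng=random.Random(seed)) for seed in range(5)]
```

Drawing the offset once per client (rather than per request) keeps each client's long-run rate predictable while still desynchronising clients from one another.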

Gate 3: WAF — Web Application Firewall

The WAF inspects application-layer content. Two modes:

Signature-based (rule mode). OWASP ModSecurity Core Rule Set (CRS) defines patterns: SQL injection (union select), XSS (<script>), path traversal (../), CVE-specific payloads. Each request is matched against hundreds of rules. Running at PL1–PL2 (Paranoia Level 1-2) balances false-positives against coverage. PL4 (paranoid) blocks legitimate traffic that happens to match aggressive patterns.
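Signature matching can be illustrated with a few regular expressions. These three patterns are toy stand-ins written for this example; they are not taken from the Core Rule Set, whose real rules are far more thorough. The sketch decodes the query string first, since attackers commonly URL-encode payloads.

```python
import re
from typing import List
from urllib.parse import unquote_plus

# Illustrative signatures only; real rule sets contain hundreds of hardened rules.
RULES = {
    "sqli-union-select": re.compile(r"\bunion\b\s+(all\s+)?\bselect\b", re.IGNORECASE),
    "xss-script-tag": re.compile(r"<\s*script\b", re.IGNORECASE),
    "path-traversal": re.compile(r"\.\./"),
}


def match_signatures(raw_query: str) -> List[str]:
    """Return the names of all rules that match the URL-decoded query string."""
    decoded = unquote_plus(raw_query)
    return [name for name, pattern in RULES.items() if pattern.search(decoded)]
```

For example, `match_signatures("file=..%2F..%2Fetc%2Fpasswd")` returns `["path-traversal"]` because the encoded slashes are decoded before matching. Each extra rule is another pattern to evaluate per request, which is where the per-request latency comes from.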

Anomaly-based. Baseline normal request shape (rate, user-agent, header structure, payload entropy). Deviations are scored. An IP sending 100 req/min with a headless Chrome fingerprint that suddenly sends 3,000 req/min with identical headers is flagged as a bot. Anomaly detection is harder to bypass by studying the rules, because there is no fixed signature to avoid; an attacker has to mimic the baseline traffic shape instead.
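One simple way to score a rate deviation is a z-score against the recent baseline. This is only a sketch of the idea for a single feature (request rate); the function name, the floor on the standard deviation, and any flagging threshold are assumptions, and production systems combine many features.

```python
from statistics import mean, pstdev
from typing import Sequence


def anomaly_score(baseline_rates: Sequence[float], current_rate: float) -> float:
    """Return how many standard deviations current_rate sits above the baseline."""
    mu = mean(baseline_rates)
    # Floor sigma at 1.0 (an assumption) so a perfectly flat baseline cannot divide by zero.
    sigma = max(pstdev(baseline_rates), 1.0)
    return (current_rate - mu) / sigma


baseline = [95.0, 100.0, 105.0, 100.0]   # req/min observed for one IP
normal = anomaly_score(baseline, 102.0)  # small positive score
spike = anomaly_score(baseline, 3000.0)  # very large score
```

With this baseline, a jump to 3,000 req/min scores in the hundreds, while 102 req/min scores below 1, so a threshold of a few standard deviations would separate them cleanly.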

WAF cost. 1–5 ms per request for a rule-based WAF running at the edge. Justification: eliminates broad classes of application attacks that would otherwise require expensive application-layer code paths or DB query defenses.

Gate 4: mTLS — mutual TLS for service-to-service auth

Ordinary TLS authenticates only the server (Bea trusts Sven via Cara’s certificate). mTLS authenticates both parties. During the TLS handshake, Sven sends a CertificateRequest message. Bea replies with her client certificate and a CertificateVerify signature proving she holds the matching private key. Sven verifies the certificate chain against a trusted CA.
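On the server side, requiring a client certificate is a small configuration change. The sketch below uses Python's standard `ssl` module; the function name and the file-path parameters are placeholders, and the paths are optional here only so the snippet can run without real key material.

```python
import ssl
from typing import Optional


def make_mtls_server_context(cert_file: Optional[str] = None,
                             key_file: Optional[str] = None,
                             client_ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side context that demands and verifies a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the server request a client certificate
    # and abort the handshake if none is presented or it fails verification.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file is not None:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)  # server identity
    if client_ca_file is not None:
        ctx.load_verify_locations(cafile=client_ca_file)           # CA that signs clients
    return ctx
```

A real deployment would pass all three files; without a CA loaded, every client certificate fails verification, which is the safe default.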

Where mTLS matters. Internal microservices in a zero-trust network. If you have a payment service and a user service on the same Kubernetes cluster, mTLS ensures the user service cannot impersonate the payment service — even if an attacker gains network access inside the cluster.

SPIFFE. The SPIFFE/SPIRE framework automates certificate issuance for workloads: each service gets a short-lived certificate (an X.509-SVID) from a SPIRE server. Certificates rotate automatically (e.g., hourly). mTLS is enforced by the service mesh's sidecar proxy (Istio, Linkerd), so application code does not handle TLS directly.
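A SPIFFE identity is a URI of the form `spiffe://trust-domain/path`, carried in the certificate's URI SAN. The sketch below shows how a service might authorise a peer once the mesh has verified the certificate. The helper names, the trust domain `example.org`, and the workload paths are hypothetical, and the validation is deliberately looser than the full SPIFFE ID rules.

```python
from typing import Set, Tuple
from urllib.parse import urlparse


def parse_spiffe_id(uri: str) -> Tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, path); raise ValueError if malformed."""
    parsed = urlparse(uri)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a SPIFFE ID: {uri!r}")
    if parsed.query or parsed.fragment:
        raise ValueError("SPIFFE IDs must not carry a query or fragment")
    return parsed.netloc, parsed.path


def is_authorized(peer_id: str, trust_domain: str, allowed_paths: Set[str]) -> bool:
    """Allow only peers from our trust domain whose workload path is on the allow-list."""
    try:
        domain, path = parse_spiffe_id(peer_id)
    except ValueError:
        return False
    return domain == trust_domain and path in allowed_paths
```

Authorising on the identity rather than on network location is what lets the user service reject a caller that merely sits on the same cluster network.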

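mTLS cost. The client certificate exchange rides in the existing handshake flights (Sven's CertificateRequest, then Bea's Certificate and CertificateVerify), so it adds no round trip of its own. The added cost is a larger handshake plus an extra signature and chain verification; this lesson budgets 20–40 ms of added first-connection latency. On reused or resumed connections (keep-alive, session resumption), the certificate exchange is skipped and the added cost is effectively zero. Operational cost: certificate distribution, rotation, and revocation must be automated, because doing this manually for hundreds of services is impractical.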

Edge cases

mTLS does not protect against a compromised-but-valid certificate holder. If an attacker steals a valid client certificate together with its private key, mTLS trusts them. Defence: short certificate lifetimes (1 hour via SPIFFE) plus revocation checking such as OCSP stapling. With hour-long certificates, a stolen credential is usable for at most an hour before it expires, which sharply narrows the attacker's window.

Trace it

Trace a request from a normal user and a botnet IP through all four gates.

  1. Normal user: TCP SYN arrives at Patty's edge. First check?
  2. Cache miss. Rate limiter check for normal user (2 req/s IP limit)?
  3. WAF check for normal user?
  4. Botnet IP: 10,000 IPs each sending 1 req/s. Per-IP rate limiter?
  5. Global concurrency limit at origin. In-flight requests exceed threshold. What happens?
  6. mTLS for an internal API call (payment service → user service). What extra step?
Quiz

A botnet sends 1 request/second from 10,000 different IPs. Your per-IP rate limit is 5 req/s. How do you defend without blocking legitimate users?

Quiz

Why do mTLS certificates need to be short-lived (e.g., 1 hour) in a zero-trust service mesh?

Recall before you leave
  1. What is connection draining, and why is it needed during a rolling deploy?
  2. Why does adding jitter to rate-limit token-refill timing reduce origin load spikes?
  3. What does SPIFFE add over manually managing mTLS certificates?
Recap

Before a request reaches origin, it passes through four gates: CDN cache lookup (eliminates the request entirely if cached), per-IP rate limiter using a token bucket (stops single-IP floods), WAF signature matching (blocks known attack patterns), and optional mTLS client certificate verification (prevents unauthenticated service-to-service calls in a zero-trust mesh). Each gate adds under 5 ms except mTLS (20–40 ms on first connection). Distributed botnets defeat per-IP rate limits by spreading load — the defence is a global adaptive concurrency limiter at origin that rejects excess in-flight requests with fast 503 regardless of per-IP quota. SPIFFE automates mTLS certificate issuance and rotation, making short-lived hourly certs operationally practical for hundreds of microservices.


Trademarks belong to their respective owners. Editorial reference only.