awesome-everything

Networking & Protocols

Bufferbloat and congestion

Crux: Oversized buffers at the network edge trade packet drops for latency. A saturated link queues hundreds of milliseconds of data, breaking interactive apps while throughput numbers stay green.

Someone on the team is on a video call and their voice keeps freezing — but the speed test says 300 Mbps, green across the board. Meanwhile a cloud backup is uploading in the background. The throughput is fine. The latency is not. That gap has a name.

The buffer that ate your latency

A network buffer exists to smooth bursts: when packets arrive faster than the link drains them, they wait in a queue instead of being dropped. That is healthy in moderation. Bufferbloat is the failure that comes from too much of it.

The edge of the Internet is where the fast side (your gigabit LAN) meets the slow side (the metered uplink): DOCSIS cable modems, LTE basebands, home routers. That mismatch is the bottleneck, and vendors over-provisioned the buffer at exactly that point. A modem can hold 100–500 ms of buffered data. Under saturation, every packet that arrives behind a full queue waits the full depth of that buffer.

The symptom is unmistakable once you know it: idle ping is 20 ms, then someone starts an upload and ping climbs to 300–500 ms. Throughput looks perfect — the link is fully used — but every interactive packet (a DNS lookup, a SYN, a video frame) sits behind the bulk transfer. Calls stutter, games lag, web pages stall, all while the speed test reads green.
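To put numbers on the buffer depth described above: the drain time of a full buffer is its size in bits divided by the link rate. The sketch below assumes a hypothetical 20 Mbit/s uplink and illustrative buffer sizes; neither figure comes from a specific modem.

```python
def queue_delay_ms(buffer_bytes: int, link_rate_bps: int) -> float:
    """Milliseconds needed to drain a full buffer at the given link rate."""
    return buffer_bytes * 8 * 1000 / link_rate_bps


if __name__ == "__main__":
    uplink_bps = 20_000_000  # assumption: hypothetical 20 Mbit/s uplink
    for size in (50_000, 250_000, 1_000_000):  # illustrative buffer sizes, bytes
        delay = queue_delay_ms(size, uplink_bps)
        print(f"{size:>9} B buffer -> {delay:6.1f} ms of delay when full")
```

Under these assumptions a 1 MB buffer holds 400 ms of data, which sits inside the 100–500 ms range quoted above.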

Why TCP needs the drop

The deeper cause is a mismatch between buffers and congestion control. Loss-based TCP — CUBIC, the Linux default for years — has no other way to learn the link is full. It keeps increasing its sending window until a packet is dropped, reads that drop as “the pipe is full,” and backs off.

A correctly sized buffer fills and overflows early, while the queue is still shallow, so TCP gets its congestion signal while latency is low. An oversized buffer absorbs the packet instead. TCP sees no loss, assumes there is room, and pushes more data, which the giant buffer also absorbs. The window inflates until the buffer is finally, completely full, hundreds of milliseconds deep. Vendors thought “never drop a packet” was the safe choice. It is the exact opposite: by hiding the congestion signal, over-buffering guarantees the queue runs deep.
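The interaction between a loss-based sender and buffer size can be sketched with a toy model. This is a deliberately simplified Reno-style additive-increase, multiplicative-decrease loop with one step per round trip, not CUBIC's actual growth function; the function name and all parameter values are illustrative assumptions.

```python
def peak_queue_delay_ms(bdp_packets: int, buffer_packets: int,
                        base_rtt_ms: float, rounds: int = 5000) -> float:
    """Simulate one loss-based flow; return the worst queueing delay it causes."""
    cwnd = 1.0
    peak = 0.0
    for _ in range(rounds):
        # Packets beyond what the pipe itself holds must wait in the buffer.
        queue = max(0.0, cwnd - bdp_packets)
        if queue > buffer_packets:
            # Buffer overflow: the drop is the congestion signal, so back off.
            cwnd = max(1.0, cwnd / 2)
            continue
        # The link drains bdp_packets per base RTT, so delay scales with queue.
        peak = max(peak, queue / bdp_packets * base_rtt_ms)
        cwnd += 1.0  # no loss seen: keep probing for more bandwidth
    return peak


if __name__ == "__main__":
    for buf in (25, 500, 2000):  # buffer sizes in packets, illustrative
        print(buf, "packets ->", peak_queue_delay_ms(100, buf, 20.0), "ms peak")
```

In this model the sender always grows until the buffer overflows, so the peak delay is set by the buffer size alone: the bigger the buffer, the later the signal and the deeper the queue.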

Bufferbloat quick reference
Edge buffer depth (DOCSIS / LTE): 100–500 ms of queued data
Idle vs saturated ping (no AQM): 20 ms → 300–500 ms
Saturated ping with fq_codel / CAKE: stays under ~30 ms
CUBIC over GEO (~600 ms RTT): starves; multiple windows are sent before the loss signal returns
BBR throughput recovery on GEO: most of link capacity, loss-independent
LEO RTT (Starlink, ~550 km orbit): ~50–60 ms; CUBIC behaves as on a terrestrial link

Why it persists

Bufferbloat has been understood since the 1980s, yet it is still everywhere. Three reasons:

  1. Wrong incentive. Vendors treated packet drops as a defect and over-provisioned buffers to “avoid drops” — the wrong fix, because the drop is the signal.
  2. Invisible metric. Residential users measure throughput, not latency under load. A speed test runs on an otherwise-idle link and never sees the bloat. The problem only surfaces when an interactive app fights a bulk transfer.
  3. Deployment lag. The cure ships in firmware. It needs the router to run Smart Queue Management (SQM), and a lot of carrier-issued kit never gets that update.
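The "invisible metric" in point 2 is easy to compute once you have ping samples from an idle link and from the same link under a bulk transfer. The helper below is a minimal sketch; the function name and the sample values are hypothetical and only loosely modelled on the 20 ms versus 300–500 ms pattern described earlier.

```python
from statistics import median


def latency_under_load(idle_rtts_ms, loaded_rtts_ms):
    """Compare median RTT on an idle link with median RTT during a bulk transfer."""
    idle = median(idle_rtts_ms)
    loaded = median(loaded_rtts_ms)
    return {"idle_ms": idle, "loaded_ms": loaded, "added_ms": loaded - idle}


if __name__ == "__main__":
    # Hypothetical samples: pings taken before and during a large upload.
    idle = [17, 18, 19]
    loaded = [400, 420, 450]
    print(latency_under_load(idle, loaded))
```

The added latency, not the throughput figure, is the number that tells you whether the link is bloated.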

The fix itself is not exotic.

Active Queue Management

Active Queue Management (AQM) — marketed as Smart Queue Management on home routers — drops or ECN-marks packets early and fairly, before the queue grows deep, restoring the congestion signal TCP needs.

  • fq_codel (RFC 8290): flow-queued CoDel. CoDel (“Controlled Delay”) watches how long packets dwell in the queue, not how many there are; once the dwell time has stayed above a target (5 ms by default) for a full interval (100 ms by default), it starts dropping, and it drops more often the longer the delay persists. The flow-queue half hashes flows into separate sub-queues so one bulk upload cannot starve a latency-sensitive flow.
  • PIE (RFC 8033): Proportional Integral controller Enhanced. Estimates queue delay and drops with a probability tuned to hold delay near a target. A PIE variant (DOCSIS-PIE, RFC 8034) is the AQM mandated for DOCSIS 3.1 cable modems, so it ships inside modern cable modems.
  • CAKE: Common Applications Kept Enhanced. fq_codel plus built-in bandwidth shaping, per-host fairness, and DOCSIS/ATM framing compensation. It is the SQM that home-router projects such as OpenWrt most often reach for.
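The CoDel idea from the first bullet can be sketched as a small state machine that looks only at each packet's time in queue. This is a simplified illustration, assuming the default 5 ms target and 100 ms interval; the class and method names are hypothetical, and it leaves out details of the real algorithm such as ECN marking and the drop-count reuse when re-entering the dropping state.

```python
import math


class CoDelSketch:
    """Simplified CoDel drop decision, driven by packet sojourn time."""

    def __init__(self, target_ms: float = 5.0, interval_ms: float = 100.0):
        self.target = target_ms
        self.interval = interval_ms
        self.first_above = None  # deadline after which sustained delay triggers drops
        self.dropping = False
        self.drop_next = 0.0
        self.count = 0

    def should_drop(self, now_ms: float, sojourn_ms: float) -> bool:
        """Call at dequeue with the packet's time in queue; True means drop it."""
        if sojourn_ms < self.target:
            # Delay is back under target: leave the dropping state.
            self.first_above = None
            self.dropping = False
            return False
        if self.first_above is None:
            # First packet above target: start the one-interval clock.
            self.first_above = now_ms + self.interval
            return False
        if not self.dropping:
            if now_ms >= self.first_above:
                # Delay stayed above target for a full interval: start dropping.
                self.dropping = True
                self.count = 1
                self.drop_next = now_ms + self.interval / math.sqrt(self.count)
                return True
            return False
        if now_ms >= self.drop_next:
            # Still too slow: drop again, with a shrinking gap between drops.
            self.count += 1
            self.drop_next += self.interval / math.sqrt(self.count)
            return True
        return False


if __name__ == "__main__":
    codel = CoDelSketch()
    # A standing queue with 50 ms sojourn time, one dequeue every 10 ms.
    drops = [t for t in range(0, 300, 10) if codel.should_drop(float(t), 50.0)]
    print(drops)  # drops land at t = 100, 200, 280 ms: the gap shrinks
```

Note that the queue length never appears in the decision; only dwell time does, which is why the same logic works across very different link speeds.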

The shared idea: cap queueing delay, not queue length. Under full saturation, a link with CAKE or fq_codel keeps ping under ~30 ms instead of letting it balloon to 500 ms.

Why this works

Why SQM has to shape below line rate. AQM can only manage a queue it actually owns. If the bottleneck buffer lives inside the ISP’s modem, your router’s queue is never the one that fills — packets sail through your router and pile up downstream where you have no control. So SQM deliberately shapes egress to ~90–95% of the real uplink rate. That moves the bottleneck back into your router, where fq_codel or CAKE governs it. You trade a sliver of peak throughput for a queue you can actually discipline — almost always the right trade for an interactive household.
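As a rough illustration of the shaping rule above, the sketch below derives a shaped rate from a measured uplink rate and formats the corresponding `tc` invocation for CAKE. The interface name `eth0`, the 40 Mbit/s figure, and both helper names are assumptions for the example; on OpenWrt the same setting is usually made through the SQM package rather than by hand.

```python
def shaped_rate_kbit(measured_kbit: int, percent: int = 90) -> int:
    """Shape to a percentage of the measured uplink (integer math, kbit/s)."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in 1..100")
    return measured_kbit * percent // 100


def cake_command(interface: str, measured_kbit: int, percent: int = 90) -> str:
    """Build (but do not run) a tc command that installs CAKE as the root qdisc."""
    rate = shaped_rate_kbit(measured_kbit, percent)
    return f"tc qdisc replace dev {interface} root cake bandwidth {rate}kbit"


if __name__ == "__main__":
    # Assumption: a measured 40 Mbit/s uplink on a hypothetical interface eth0.
    print(cake_command("eth0", 40_000))
```

Starting at 90% and raising the percentage while watching loaded ping is a reasonable way to find the highest rate at which your router still owns the queue.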

When the buffer is not the problem: BBR vs CUBIC

Congestion control also breaks on long-RTT paths, and there the answer is a different algorithm rather than a different queue.

LEO satellite (Starlink, ~550 km orbit, ~50–60 ms total RTT) behaves like a terrestrial link — loss-based CUBIC works fine. GEO satellite (~36,000 km orbit, ~600 ms RTT) does not. With a 600 ms feedback loop, by the time a loss signal travels back to the sender, multiple full windows have already been transmitted. Loss-based control reacts far too late; CUBIC ramps slowly and never fills the pipe — it starves.
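The difference between the two orbits shows up in the bandwidth-delay product (BDP), the amount of data that must be in flight to keep the link full. The sketch below assumes a hypothetical 50 Mbit/s link for both cases and uses the RTT figures from the paragraph above.

```python
def bdp_bytes(rate_bps: int, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the pipe."""
    return rate_bps * rtt_ms / 8000  # /1000 for ms to s, /8 for bits to bytes


if __name__ == "__main__":
    rate = 50_000_000  # assumption: hypothetical 50 Mbit/s link
    for name, rtt in (("LEO", 55), ("GEO", 600)):
        bdp = bdp_bytes(rate, rtt)
        print(f"{name}: RTT {rtt} ms -> {bdp / 1e6:.2f} MB in flight to fill the link")
```

At the same rate, the GEO path needs roughly eleven times more data in flight than the LEO path, and every congestion signal about that data arrives 600 ms after the fact.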

BBR (Bottleneck Bandwidth and Round-trip propagation time) sidesteps loss entirely. It actively probes the path’s bandwidth and minimum RTT, builds a model of the bottleneck, and paces sends to that model. Because it does not wait for a drop, the 600 ms feedback delay no longer cripples it, and BBR recovers most of the GEO link’s capacity. The same loss-independence makes BBR strong on any lossy path — cellular, congested Wi-Fi.
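The model-based approach can be sketched as two filters: a windowed maximum over recent delivery-rate samples and a running minimum over RTT samples. This is a heavily simplified illustration, not the real BBR state machine; the class name, the sample window of 10, and the gain values are assumptions chosen for the example, and real BBR also ages out its minimum RTT and cycles its pacing gain.

```python
from collections import deque


class BbrModelSketch:
    """Toy path model: bottleneck bandwidth and minimum RTT, no loss input."""

    def __init__(self, bw_window: int = 10):
        self.bw_samples = deque(maxlen=bw_window)  # recent delivery rates, bytes/s
        self.min_rtt_s = float("inf")

    def on_ack(self, delivered_bytes: int, interval_s: float, rtt_s: float) -> None:
        """Feed one measurement: bytes delivered over an interval, plus an RTT sample."""
        self.bw_samples.append(delivered_bytes / interval_s)
        self.min_rtt_s = min(self.min_rtt_s, rtt_s)

    def btl_bw(self) -> float:
        """Estimated bottleneck bandwidth: max of recent delivery-rate samples."""
        return max(self.bw_samples, default=0.0)

    def bdp_bytes(self) -> float:
        """Estimated bandwidth-delay product; 0 until a sample arrives."""
        if not self.bw_samples:
            return 0.0
        return self.btl_bw() * self.min_rtt_s

    def pacing_rate(self, gain: float = 1.0) -> float:
        """Send rate: the bandwidth estimate scaled by a gain."""
        return gain * self.btl_bw()

    def inflight_cap(self, cwnd_gain: float = 2.0) -> float:
        """Upper bound on data in flight, as a multiple of the BDP estimate."""
        return cwnd_gain * self.bdp_bytes()


if __name__ == "__main__":
    model = BbrModelSketch()
    # Hypothetical samples from a GEO-like path.
    model.on_ack(3_125_000, 0.5, 0.62)
    model.on_ack(2_500_000, 0.5, 0.60)
    print(model.btl_bw(), model.min_rtt_s, model.bdp_bytes())
```

There is no loss handler anywhere in the sketch: the send rate follows measured delivery rate and RTT, so a late loss signal on a 600 ms path has nothing to delay.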

These two threads meet at the cell tower: mobile operators deploy CAKE-style AQM at the radio cell so the buffer for users sharing one cell stays disciplined. The pattern to remember: BBR + AQM + small buffers wins on high-latency or lossy paths; CUBIC + a sanely sized buffer is fine on terrestrial wired links.

Quiz

A modem buffers 300 ms of data and 'never drops a packet.' Why does this make latency worse, not better?

Pick the best fit

A household runs video calls that stutter whenever a cloud backup uploads. Pick the fix.

Trace it

A remote worker reports video calls freeze every afternoon. Speed tests look perfect. Trace the diagnosis.

Step 1: the speed test reads 300 Mbps down / 40 Mbps up, green. What does that miss?
Step 2: idle ping to 8.8.8.8 is 18 ms. Start a large upload and re-ping. Ping climbs to 420 ms. What is happening?
Step 3: the router supports SQM. How do you configure it?
Step 4: after enabling CAKE, re-run the saturated ping test. What should you see, and what did you trade?

Recall before you leave
  1. Explain bufferbloat and why it persists despite being a known problem since the 1980s.
  2. What does Active Queue Management do differently from a plain FIFO buffer? Name three AQM algorithms.
  3. Why does loss-based CUBIC starve over a GEO satellite link while BBR recovers most of the throughput?
Recap

Bufferbloat is oversized buffering at the network edge — DOCSIS modems, LTE basebands, home routers hold 100–500 ms of queued data. Loss-based TCP like CUBIC needs an early packet drop to sense congestion; a giant buffer absorbs the drop instead, so TCP keeps inflating its window until ping balloons from 20 ms to 300–500 ms while throughput still looks green. It persists because vendors chased “no drops,” users measure throughput not latency, and the fix ships in firmware. Active Queue Management — fq_codel (RFC 8290), PIE (RFC 8033), CAKE — drops or marks early and fairly, capping queueing delay under ~30 ms even at saturation; SQM shapes below line rate to own the bottleneck queue. On long-RTT paths the answer is a different algorithm: BBR probes bandwidth and RTT instead of waiting for loss, recovering throughput on GEO satellite where CUBIC starves.


Trademarks belong to their respective owners. Editorial reference only.