
AI / LLM Integration

Agents: code and loop reading

Crux: Read real agent-loop snippets (tool dispatch, step caps, context trimming, error recovery), predict the behaviour, and pick the highest-leverage fix.

The agent loop is where the money is spent and the bugs hide. Read each snippet the way you would read it in a code review on a service that bills per token, then choose the fix a senior makes first.

Goal

Practise the loop you run on every agent: read the control flow, predict where it runs forever, overflows, or thrashes, and reach for the structural guard before blaming the model.

Snippet 1 — the loop with one exit

def run_agent(task, tools):
    messages = [SYSTEM, {"role": "user", "content": task}]
    while True:
        resp = model(messages, tools)          # THINK
        if not resp.tool_calls:
            return resp.content                 # only exit: model stops calling tools
        messages.append(resp)
        for call in resp.tool_calls:
            result = dispatch(call, tools)       # ACT
            messages.append(result)              # OBSERVE
Quiz

This loop is correct ReAct but unsafe for production. What is the single most important guard to add, and why that one first?
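
One possible shape for the guard, as a minimal sketch rather than a definitive implementation: the same loop with three independent exits besides "model stops". The names `Budget` and `run_bounded`, the dict-shaped response with a `tokens` field, and the `model` / `dispatch` callables are all assumptions made for illustration. The system message is omitted for brevity.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    # Hypothetical limits; tune per service.
    max_steps: int = 20
    max_seconds: float = 60.0
    max_tokens: int = 50_000


def run_bounded(task, model, dispatch, budget=Budget(), clock=time.monotonic):
    """Run the loop until the model stops or any budget is exhausted.

    Returns (content, reason) so callers can tell a finished run
    from a run that was cut off.
    """
    messages = [{"role": "user", "content": task}]
    deadline = clock() + budget.max_seconds
    tokens_used = 0
    for _ in range(budget.max_steps):            # hard step cap
        if clock() >= deadline:                  # wall-clock budget
            return None, "timeout"
        resp = model(messages)                   # THINK
        tokens_used += resp.get("tokens", 0)     # assumed usage field
        if not resp.get("tool_calls"):
            return resp["content"], "done"
        if tokens_used > budget.max_tokens:      # token budget
            return None, "token_budget"
        messages.append(resp)
        for call in resp["tool_calls"]:
            messages.append(dispatch(call))      # ACT + OBSERVE
    return None, "step_cap"
```

Returning the reason alongside the content is a design choice: it keeps "the agent finished" and "the agent was stopped" distinguishable in telemetry.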

Snippet 2 — the step cap that hides a bug

for step in range(MAX_STEPS):          # MAX_STEPS = 100
    resp = model(messages, tools)
    if not resp.tool_calls:
        return resp.content
    messages.append(resp)
    for call in resp.tool_calls:
        messages.append(dispatch(call, tools))
# fell out of the loop: cap hit
return "Sorry, I couldn't complete that."
Quiz

Telemetry shows ~30% of runs fall through and return the apology. The cap is doing its job — so what does a senior conclude?
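
Before changing the cap, it helps to see what the capped runs were doing. The sketch below is one hypothetical diagnostic: given the tool calls of a run that hit the cap, it reports how many were exact repeats. The function name `summarize_capped_run` and the `(name, args)` tuple format are assumptions, not part of the snippet above.

```python
import json
from collections import Counter


def summarize_capped_run(tool_calls):
    """Summarise the (name, args) tool calls of a run that hit the step cap.

    A large gap between total and distinct calls suggests the agent was
    looping on the same call rather than making slow progress.
    """
    # Canonical JSON makes argument dicts hashable and order-insensitive.
    keys = [(name, json.dumps(args, sort_keys=True)) for name, args in tool_calls]
    counts = Counter(keys)
    most_repeated = None
    if counts:
        (name, args_json), n = counts.most_common(1)[0]
        most_repeated = {"tool": name, "args": json.loads(args_json), "count": n}
    return {
        "total_calls": len(keys),
        "distinct_calls": len(counts),
        "most_repeated": most_repeated,
    }
```

Logging this summary next to the apology turns a silent fall-through into something a reviewer can triage.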

Snippet 3 — context trimming

def trim(messages, budget=8000):
    # keep most recent messages until we're under the token budget
    kept = []
    total = 0
    for m in reversed(messages):
        total += count_tokens(m)
        if total > budget:
            break
        kept.append(m)
    return list(reversed(kept))
Quiz

This keeps the loop under the window, but it has a failure mode that surfaces on long tasks. What is it, and what is the fix?
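
A minimal sketch of one alternative, assuming the first two messages are the system prompt and the task: always keep that head, fill the rest of the budget with the most recent messages, and replace whatever was dropped with a short summary. `trim_pinned`, the `pinned` and `summarize` parameters, and the word-counting `count_tokens` stand-in are hypothetical.

```python
def count_tokens(message):
    # Crude stand-in for a real tokenizer: one token per word. Assumption only.
    return len(str(message["content"]).split())


def trim_pinned(messages, budget=8000, pinned=2, summarize=None):
    """Keep the first `pinned` messages, then as many recent ones as fit."""
    head, tail = messages[:pinned], messages[pinned:]
    remaining = budget - sum(count_tokens(m) for m in head)
    kept = []
    for m in reversed(tail):                 # newest first
        cost = count_tokens(m)
        if cost > remaining:
            break
        remaining -= cost
        kept.append(m)
    kept.reverse()                           # restore chronological order
    dropped = tail[:len(tail) - len(kept)]
    middle = []
    if dropped and summarize is not None:
        # The summary is not counted against the budget in this sketch,
        # so `summarize` should return something short.
        middle = [{"role": "user", "content": summarize(dropped)}]
    return head + middle + kept
```

In practice `summarize` might be a cheap model call or a running notes field. Here it is left as a callable so the trimming logic stays testable on its own.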

Snippet 4 — error recovery

for step in range(MAX_STEPS):
    resp = model(messages, tools)
    if not resp.tool_calls:
        return resp.content
    messages.append(resp)
    for call in resp.tool_calls:
        try:
            result = dispatch(call, tools)
        except ToolError as e:
            result = {"role": "tool", "content": f"Error: {e}"}  # feed error back
        messages.append(result)
Quiz

Feeding the error back lets the model adapt — but a load test shows runs where the same tool fails identically dozens of times. What guard closes the gap?
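
One way to close that gap, sketched under assumptions: count failures per identical call (same tool, same arguments) and refuse to execute the call again once a small limit is reached. `RepeatGuard`, its `call` method, and the `tools` dict of plain callables are hypothetical. `ToolError` is redefined here only so the block runs on its own.

```python
import json
from collections import Counter


class ToolError(Exception):
    """Stand-in for the ToolError used in the snippet above."""


class RepeatGuard:
    def __init__(self, max_identical_failures=2):
        self.max_identical_failures = max_identical_failures
        self.failures = Counter()

    def call(self, name, args, tools):
        # Identical calls share a key regardless of argument order.
        key = (name, json.dumps(args, sort_keys=True))
        if self.failures[key] >= self.max_identical_failures:
            # Do not execute again; tell the model why so it can change course.
            return {
                "role": "tool",
                "content": (
                    f"Refused: {name} already failed {self.failures[key]} times "
                    "with these arguments. Try a different approach."
                ),
            }
        try:
            return {"role": "tool", "content": str(tools[name](**args))}
        except ToolError as e:
            self.failures[key] += 1
            return {"role": "tool", "content": f"Error: {e}"}  # feed error back
```

The guard lives for one run. A stricter variant could abort the run outright instead of returning a refusal message, which trades adaptability for a tighter bound on spend.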

Recap

Every agent incident is read in the loop. A single ‘model stops’ exit is unsafe, so add a hard step cap plus a wall-clock and token budget. A cap that fires routinely is a seatbelt catching a real bug, not a limit to raise. Naive newest-first trimming evicts the oldest messages, which are the system prompt and the task, so the agent forgets its job; pin the instructions and summarise the middle. Error feedback without a per-tool retry or dedup cap lets the model repeat the same failing call until the step cap stops it, paying for tokens on every attempt. Read the control flow, find the unbounded path, add the structural guard, then re-run under load to confirm.
