AI / LLM Integration AI · 06 · 10

Agents: build a bounded, observable agent

Hands-on project — build a bounded, observable ReAct agent with real guardrails, then drive it into a runaway loop and prove the guards hold while measuring task success.

AI Senior ◷ 240 min

Level

FoundationsJuniorMiddleSenior

Reading about the $12 trace is not the same as keeping it from happening. Build a small ReAct agent over a couple of tools, deliberately push it into the failure modes from the unit — runaway loop, identical-call thrash, context overflow — and prove that your guardrails catch every one while you still solve the task.

Goal

Turn the unit’s mental model into a reproducible engineering loop: build the agent, instrument the loop so its cost and behaviour are observable, add the independent termination exits and input/output guardrails, then evaluate task success and prove the guards hold under adversarial input.

Project

0 of 7

Objective

Build a ReAct agent over 2–3 real tools that completes a multi-step task, is fully observable per loop iteration, and cannot run away: it always terminates within a hard step, wall-clock, and token budget, never thrashes on an identical call, and never silently drops its own instructions — proven with traces and a task-success eval.

Requirements

Acceptance criteria

A loop trace for one normal run and one adversarial run, each showing per-iteration tokens (cumulative growth visible) and the exact exit reason that fired.
A before/after comparison: the same agent WITHOUT guards (or with the cap removed) runs away or thrashes on the adversarial input; WITH guards it terminates bounded within the step/wall-clock/token budget.
An eval table: solve rate over the 8–12 tasks, plus mean steps, mean tokens, and mean cost per task — measured, not estimated.
Evidence the context guard works: a long run that stays under the window with the system/task messages still present at the final step (no silent truncation of instructions).
A one-paragraph write-up: which exit defends which failure mode, why a step cap alone is insufficient, and one task where scripting beats the agent.

Senior stretch

Add a dynamic step/turn limit that adapts to a task's estimated difficulty instead of a fixed cap, and measure the cost/solve-rate tradeoff against the fixed cap.
Add an on-call runbook: how to read the loop trace, the signatures of runaway / thrash / overflow, and the guard to check or tighten for each.
Add a prompt-injection guardrail test: feed a tool result that tries to override the system instructions and show the agent ignores it; log the attempt.
Implement a summarise-the-middle memory step and compare task success and cost against naive trimming on the same long-horizon tasks.

Recap

This is the loop you run on every real agent: build the bare ReAct engine, make every iteration observable so the quadratic growth is visible, then add the independent exits — natural stop, step cap, wall-clock/token budget, dedup — so no single missing door becomes a runaway. Pin the instructions and trim the middle so the agent never forgets its task; guard tool inputs and treat tool output as untrusted data; and evaluate task success with mean steps, tokens, and cost so ‘it works’ is a number, not a vibe. Doing it once on a toy agent makes the production version muscle memory.

Something unclear?

Ask a question about this lesson. Questions are anonymous and go straight to the author to make the lesson better.