AI / LLM Integration
Tool calls: code and schema reading
Tool-calling bugs live in the schema, the dispatcher, and the loop — not in the model. Read each snippet the way you would in review, then choose the fix a senior engineer would make first.
Practise the diagnosis loop you run on every agent: read the tool schema and the loop code, predict how the model and your handler will behave, and reach for the fix that closes the trust boundary or stops the runaway.
Snippet 1 — the tool schema
{
"name": "cancel_order",
"description": "Cancel an order.",
"input_schema": {
"type": "object",
"properties": {
"order_id": { "type": "string" },
"reason": { "type": "string" }
}
}
}
A mutating cancel_order tool ships with this schema. What is the single biggest weakness, and the highest-leverage fix?
Snippet 2 — the parallel dispatcher
results = []
for block in response.tool_use_blocks: # may be several this turn
output = TOOLS[block.name](**block.input) # run sequentially
results.append(tool_result(block.id, output))
send(messages + results)
When the model emits three independent tool_use blocks in one turn, what does this dispatcher get right and what does it leave on the table?
Snippet 3 — the loop with no guard
while True:
resp = model.create(messages=messages, tools=TOOLS)
if resp.stop_reason != "tool_use":
break
for b in resp.tool_use_blocks:
out = run_tool(b.name, b.input) # may raise / hang
messages.append(tool_result(b.id, out))
messages.append(resp.message)
This loop runs in production against client tools. Which two defects will bite first under failure, and how do you fix them?
Snippet 4 — the validator
def handle(block):
args = block.input
try:
validated = ToolArgs.model_validate(args) # pydantic: shape + types
except ValidationError as e:
return tool_result(block.id, f"invalid arguments: {e}", is_error=True)
return tool_result(block.id, run(validated))
This handler schema-validates with Pydantic and returns errors as a tool_result. For a mutating tool against a multi-tenant database, what is still missing?
Every tool-calling bug is read in the schema, the dispatcher, the loop, or the validator. A loose schema (no required, no enum, empty description) lets the model guess and gives you nothing to validate. A sequential dispatcher is correct but forfeits the parallel-call latency win for independent calls. while True with no per-tool timeout is a runaway and a stall waiting to happen — cap iterations and time-box each tool. And Pydantic shape-validation is necessary but not sufficient for a mutating call: add the authorization and existence check, and return every rejection as a tool_result so the model can self-correct.