Reliability, Cost & Safety of Agents

Agents introduce failure modes and cost structures qualitatively different from single LLM calls.

Why Agents Are Harder to Trust

Compounding errors: An error in step 3 of a 10-step loop contaminates steps 4 through 10. A small hallucination early can cascade into a completely wrong final output.
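A rough way to quantify the compounding effect is to treat each step as succeeding independently with the same probability. That independence is a simplifying assumption (real agent errors are often correlated), and the function name and the 95% figure below are illustrative, not measured values.

```python
def chain_success_probability(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


# Illustrative numbers only: 95% per-step reliability over a 10-step loop.
print(f"{chain_success_probability(0.95, 10):.2f}")  # prints 0.60
```

Under these assumptions, a step that is right 95% of the time still yields a fully correct 10-step run only about 60% of the time.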

Unpredictable tool chains: The model decides which tools to call. In edge cases, those decisions may be wrong in ways that are hard to anticipate. An agent with file-write access and incorrect context about which file to modify can cause real damage.

Hallucinated tool calls: Models can generate plausible-looking calls with incorrect arguments, such as filenames that don't exist or API parameters with made-up values. Always validate the tool name and its arguments before executing a call.
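One possible shape for that validation step is sketched below. The tool names, the `TOOL_SCHEMAS` registry, and the `validate_tool_call` helper are all hypothetical. A real agent framework would typically supply its own schema mechanism. The sketch checks that the tool exists, that the arguments match the expected names and types, and that a requested file really exists inside the allowed workspace.

```python
from pathlib import Path
from typing import Any

# Hypothetical tool registry: tool name -> {argument name: expected type}.
TOOL_SCHEMAS: dict[str, dict[str, type]] = {
    "read_file": {"path": str},
    "search": {"query": str, "max_results": int},
}


def validate_tool_call(name: str, args: dict[str, Any], workspace: Path) -> list[str]:
    """Return a list of problems; an empty list means the call passed every check."""
    schema = TOOL_SCHEMAS.get(name)
    if schema is None:
        return [f"unknown tool: {name!r}"]

    problems: list[str] = []
    for key in sorted(schema.keys() - args.keys()):
        problems.append(f"missing argument: {key}")
    for key in sorted(args.keys() - schema.keys()):
        problems.append(f"unexpected argument: {key}")
    for key, expected in schema.items():
        if key in args and not isinstance(args[key], expected):
            problems.append(f"{key} should be {expected.__name__}")

    # For file tools, confirm the path stays inside the workspace and exists.
    if name == "read_file" and isinstance(args.get("path"), str):
        root = workspace.resolve()
        target = (root / args["path"]).resolve()
        if not target.is_relative_to(root):
            problems.append("path escapes the workspace")
        elif not target.is_file():
            problems.append("file does not exist")
    return problems
```

The agent loop would run the tool only when the returned list is empty, and could otherwise feed the problems back to the model as an error message.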

The Cost Reality

A 10-step task at 2K input tokens + 500 output tokens per step (20K input and 5K output tokens in total) costs roughly $0.25 at typical 2026 API pricing. At 10,000 runs/day, that's $2,500/day.
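The arithmetic can be sketched as follows. The per-million-token prices here are hypothetical, chosen only so the result lands on the ~$0.25 figure, so substitute your provider's actual rates. The sketch also assumes a flat 2K input per step, whereas in practice input often grows as conversation history accumulates.

```python
def cost_per_task(
    steps: int,
    input_tokens_per_step: int,
    output_tokens_per_step: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Dollar cost of one task, given per-million-token prices."""
    input_cost = steps * input_tokens_per_step * input_price_per_mtok / 1_000_000
    output_cost = steps * output_tokens_per_step * output_price_per_mtok / 1_000_000
    return input_cost + output_cost


# Hypothetical prices (assumptions, not quoted from any provider).
per_task = cost_per_task(
    steps=10,
    input_tokens_per_step=2_000,
    output_tokens_per_step=500,
    input_price_per_mtok=5.0,
    output_price_per_mtok=30.0,
)
per_day = per_task * 10_000  # 10,000 runs/day

print(f"${per_task:.2f} per task, ${per_day:,.0f} per day")
# prints: $0.25 per task, $2,500 per day
```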

Cost controls:
- Hard maximum iteration count (e.g., 15 steps)
- Cheaper models for lower-stakes subtasks
- Cache tool results for recurring queries
- Alert on anomalous cost-per-task
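Three of these controls (the step cap, result caching, and a cost alert) can be sketched in a few lines. Everything here is illustrative: `run_agent`, `CachedTool`, `is_cost_anomalous`, and the 3x alert threshold are hypothetical names and values, and `step` stands in for whatever function performs one iteration of your agent loop.

```python
from typing import Callable

MAX_STEPS = 15  # hard cap, matching the example figure in the list above


class StepLimitExceeded(RuntimeError):
    """Raised when the agent has not finished within the step budget."""


def run_agent(step: Callable[[int], bool], max_steps: int = MAX_STEPS) -> int:
    """Call step(i) until it returns True; return the number of steps used."""
    for i in range(max_steps):
        if step(i):
            return i + 1
    raise StepLimitExceeded(f"no result after {max_steps} steps")


class CachedTool:
    """Wrap a tool so repeated identical queries are served from memory."""

    def __init__(self, tool: Callable[[str], str]) -> None:
        self._tool = tool
        self._cache: dict[str, str] = {}
        self.calls = 0  # number of real (uncached) tool invocations

    def __call__(self, query: str) -> str:
        if query not in self._cache:
            self.calls += 1
            self._cache[query] = self._tool(query)
        return self._cache[query]


def is_cost_anomalous(task_cost: float, baseline_cost: float, factor: float = 3.0) -> bool:
    """Flag a task whose cost exceeds the baseline by more than `factor` times."""
    return task_cost > baseline_cost * factor
```

An in-memory cache like this only suits tools whose results are stable for the life of the process. Results that go stale would need an expiry policy.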

Safety: Minimal Footprint Principle

Anthropic's guidance: agents should request only the permissions they need, prefer reversible actions over irreversible ones, and confirm before consequential steps.

Reversible first — Move files to temp before deleting. Write alongside before overwriting.

Least privilege — Read-only DB connection unless writes are genuinely needed.

Confirmation gates — Sending emails, deploying code, submitting forms require explicit user confirmation.
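A minimal sketch of two of these ideas, reversible deletion and a confirmation gate, might look like the following. The action names, `soft_delete`, and `run_action` are hypothetical. The `confirm` callback is an assumption standing in for however your application asks the user, such as a CLI prompt or a UI dialog.

```python
import shutil
from pathlib import Path
from typing import Callable

# Hypothetical names for the consequential actions listed above.
CONSEQUENTIAL_ACTIONS = {"send_email", "deploy_code", "submit_form"}


def soft_delete(path: Path, trash_dir: Path) -> Path:
    """Move a file into trash_dir instead of deleting it, so it can be restored.

    Simplification: a file with the same name already in trash_dir may be
    overwritten; a real implementation would add a unique suffix.
    """
    trash_dir.mkdir(parents=True, exist_ok=True)
    destination = trash_dir / path.name
    shutil.move(str(path), str(destination))
    return destination


def run_action(name: str, action: Callable[[], str], confirm: Callable[[str], bool]) -> str:
    """Run action(), but ask for confirmation first if it is consequential."""
    if name in CONSEQUENTIAL_ACTIONS and not confirm(f"Allow the agent to run {name}?"):
        return "declined"
    return action()
```

For example, `run_action("send_email", send, confirm=ask_user)` would call `send` only after `ask_user` returns `True`, while a low-stakes action such as a read-only lookup passes straight through.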

How Claude Code Does It

Claude Code asks permission before editing files not explicitly pointed to, shows diffs before applying changes, requires confirmation before shell commands, and refuses destructive operations without explicit approval. Build these instincts into your own agents.
