Concepts
Model & delegation discipline
Run agents the way you'd run a team: assign the cheapest worker who can do the job well, keep the org chart shallow, and don't let anyone quietly promote themselves.
An over-powered model on a trivial task wastes money; an under-powered one on a judgment task produces confident garbage. The discipline protects both cost and quality.
Pick the cheapest tier that preserves quality#
Match the model tier to the kind of thinking the task needs, not to how important the task feels.
| Tier | Use it for |
|---|---|
| Cheapest / fastest (e.g. Haiku) | Mechanical work with no strong judgment: inventories, searches, counts, simple extraction, direct comparisons, repetitive cleanup, format conversions. |
| Mid (e.g. Sonnet) | Research, code exploration, reading repos, diagnosis, synthesis, judgment-bearing writing, and tasks that combine several sources. |
| Top (e.g. Opus) | Only real planning, conflicts between sources, complex trade-offs, architecture, product decisions, or cross-cutting changes with real impact. |
Key idea
Delegation limits#
- The cheapest tier never spawns its own subagents. If a mechanical task needs to delegate, it was scoped wrong — return it to the parent.
- Maximum depth: 2 levels (parent → subagent → one more). Deeper nesting loses context and accountability faster than it buys parallelism.
- No self-escalation. If a subagent decides it needs a smarter model, it does not upgrade itself — it returns to the parent with what it found and why. The parent decides.
- A subagent cannot approve, confirm, or execute in a hot zone. Authorization always returns to the human (or the parent acting under a human's approval). See the authorization model.
- Delegation does not replace reading the source. A summary from a subagent is an input, not the ground truth. Validate against the real files before acting.
The tool ladder#
Reach for the lightest tool that can do the job well, and only climb when it genuinely can't:
- Simple public pages → a plain fetch.
- Dynamic pages, pages behind a login, or pages needing interaction → a browser-driving tool.
- PDFs → extract the text first. Use a heavy visual/layout tool only when the layout itself carries meaning.
- Local repos → prefer fast search (
rg/grep), per-folder inventories, and selective reads before loading large amounts of context.
Encapsulate repetition#
If the same pattern shows up several times — the same multi-step lookup, the same cleanup, the same report — stop repeating it by hand and burning context. Turn it into a reusable tool, script, or documented procedure, then call that. Manual repetition is both a cost leak and an error source.
In one line#
Cheapest model that can do it well; shallow delegation (max depth 2, no self-escalation, no subagent approvals); lightest tool first; script anything you do more than a couple of times.