When to use sub-agents

Give the main agent more than three parallel asks in one turn and it starts dropping things. Big work goes to sub-agents.

A sub-agent is a temporary helper the main agent spawns to do a specific piece of work. The main agent stays free to talk to you while the sub-agent works in the background. This is a power feature. New users do not need it. After a few weeks, you will need it constantly.

Why sub-agents matter

The classic mistake: ask your agent to do a long task, it starts working, and now you can't talk to it for an hour because it's busy. A sub-agent would have done that same work in the background while you kept the conversation going. Two problems sub-agents solve:

Blocking. When the main agent is doing a long task itself, you cannot talk to it. A sub-agent runs in parallel, so the main agent stays responsive.
Context bloat. Every long task leaves residue in the main agent's context. Sub-agents have their own context, do their thing, return only the result. The main agent stays clean.
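A minimal sketch of the context-bloat point, not a real agent framework: the `Agent` class and `run_sub_agent` function are hypothetical stand-ins. It only shows that the intermediate reads pile up in the sub-agent's throwaway context while the main agent keeps a single result line.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for an agent: just a name and a context log."""
    name: str
    context: list[str] = field(default_factory=list)


def run_sub_agent(task: str, pages: list[str]) -> str:
    """Do the noisy work in a throwaway context and return only a summary."""
    sub = Agent(name="sub-agent")
    sub.context.append(f"task: {task}")
    for page in pages:
        # Every intermediate read lands in the sub-agent's context, not the main one.
        sub.context.append(f"read: {page}")
    return f"{task}: reviewed {len(pages)} pages"


main = Agent(name="main")
result = run_sub_agent("summarize onboarding docs", ["intro", "setup", "faq"])
main.context.append(result)  # the main agent keeps one line, not the residue

print(main.context)
```

The sub-agent's context is discarded when the function returns, which is the whole point: the residue never reaches the main agent.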

The orchestrator pattern

The most opinionated take in this manual is a rule you give your agent word for word: "You are an orchestrator only. Your time is valuable, so you must never perform any task directly. Always spawn a sub-agent for every task. For complex tasks, break them into multiple sub-agents running in parallel. Your only job is to delegate, monitor, and synthesize results. Never write code, send messages, search, read files, or call APIs yourself; always hand it to a sub-agent. Save this as a permanent rule."

Make sure the agent saves that as an always-loaded rule. The main agent then becomes pure orchestration: every actual unit of work spawns a sub-agent. You stay in conversation with the orchestrator the whole time and never talk to the sub-agents; the orchestrator does that. This solves the blocking problem for as long as the rule holds, because an orchestrator that never does the work itself is always free to respond.
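Here is a rough sketch of why delegating keeps the orchestrator responsive, using Python's `asyncio`. The `sub_agent` and `orchestrator` functions are hypothetical, and the sleep stands in for a long job; a real agent runtime will look different.

```python
import asyncio


async def sub_agent(task: str, seconds: float) -> str:
    """Hypothetical sub-agent: the sleep stands in for a long-running job."""
    await asyncio.sleep(seconds)
    return f"done: {task}"


async def orchestrator() -> list[str]:
    log: list[str] = []
    # Delegate instead of doing the work inline; create_task returns immediately.
    job = asyncio.create_task(sub_agent("rebuild the report", 0.05))
    # The orchestrator is still free to answer while the job runs.
    log.append("reply: started the report, what else do you need?")
    log.append(f"job finished before reply? {job.done()}")
    # Later, collect the result and fold it into the conversation.
    log.append(await job)
    return log


for line in asyncio.run(orchestrator()):
    print(line)
```

The reply goes out before the job has finished, which is the behavior the orchestrator rule is meant to buy you.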

Don't name or specialize them

A popular trend: build a "team" of named, specialized sub-agents. A "Researcher Agent." A "Writer Agent." A "Reviewer Agent."

Mostly a fad. Use generic sub-agents on cheap models, each doing one job and returning the result. No personas. No long-running specialists. The only specialization that earns its keep is model choice (next section).

Cheap gatherers, smart thinker

A sub-agent does not need to be on the smartest model. Most of them are doing dumb work — pull this email thread, read this file, fetch this CRM record. That work belongs on a cheap model. Your gatherers don't need to think. They pull data, check timestamps, scan inboxes, read databases. A cheap model handles that for pennies. Your main agent — the expensive one — only touches results when it's time to actually reason, draft, or decide. Don't pay the expensive model to read a spreadsheet. Rule of thumb: if a script could do the same job, use the cheap model. If it needs an opinion, use the smart one.
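The rule of thumb can be written down as a tiny routing function. This is only an illustration: the model names and the set of "script-like" task kinds are made up, so swap in whatever your setup actually uses.

```python
CHEAP_MODEL = "cheap-model"  # hypothetical model identifiers
SMART_MODEL = "smart-model"

# Hypothetical task kinds that a plain script could handle just as well.
GATHERING_KINDS = {"fetch", "read", "scan", "lookup"}


def pick_model(task_kind: str) -> str:
    """Script-like work goes to the cheap model; anything needing judgment goes smart."""
    return CHEAP_MODEL if task_kind in GATHERING_KINDS else SMART_MODEL


for kind in ["fetch", "read", "draft", "decide"]:
    print(kind, "->", pick_model(kind))
```

Unknown task kinds fall through to the smart model here, a deliberately cautious default for this sketch.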

A worked example

The canonical "cheap gatherers, smart thinker" pattern: spawn a bunch of sub-agents on the cheap model in parallel, each gathering one piece of info, all bringing raw data back to the orchestrator. The orchestrator (on the smart model) reasons over the combined data and produces the final output. A real version: drafting a proposal. One sub-agent fetches the sales call transcript. Another fetches the chat thread. Another fetches the email thread. All three return raw data, with no processing. The main agent uses everything to write a context-aware proposal. Once it works, tell the agent: "save that as a skill for all future proposals." Next time you ask for a proposal, it runs that pattern automatically.
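The proposal flow can be sketched in a few lines of `asyncio`. Everything here is a placeholder: `gather_source` pretends to be a cheap-model gatherer and `draft_proposal` pretends to be the smart-model step, with string formatting where the real reasoning would go.

```python
import asyncio


async def gather_source(source: str) -> tuple[str, str]:
    """Hypothetical cheap-model gatherer: returns raw data, no processing."""
    await asyncio.sleep(0.01)  # stands in for a tool call
    return source, f"<raw {source}>"


async def draft_proposal(sources: list[str]) -> str:
    # Fan out: one gatherer per source, all running at once.
    pairs = await asyncio.gather(*(gather_source(s) for s in sources))
    raw = dict(pairs)
    # Only at this point would the smart model get involved, with all raw data in hand.
    return "Proposal based on: " + ", ".join(raw[s] for s in sources)


proposal = asyncio.run(
    draft_proposal(["call transcript", "chat thread", "email thread"])
)
print(proposal)
```

Note that the gatherers return data untouched; the only step that would need the expensive model is the final one.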

Tell it to fan out

If multiple things can run at once, say so explicitly: "spawn sub-agents in parallel to..." That phrasing tells the agent to fan out instead of running serially. Big speedups on anything that touches more than two tools.
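To see where the speedup comes from, here is a toy timing comparison. The `call_tool` function is hypothetical and its fixed sleep stands in for tool latency, so the numbers only illustrate the shape of the difference, not real-world gains.

```python
import asyncio
import time


async def call_tool(name: str) -> str:
    """Hypothetical tool call; the sleep stands in for network latency."""
    await asyncio.sleep(0.05)
    return f"{name}: ok"


async def run_serial(tools: list[str]) -> list[str]:
    # One after another: total time is roughly the sum of the calls.
    return [await call_tool(t) for t in tools]


async def run_parallel(tools: list[str]) -> list[str]:
    # Fanned out: total time is roughly the slowest single call.
    return list(await asyncio.gather(*(call_tool(t) for t in tools)))


def timed(runner, tools: list[str]) -> tuple[list[str], float]:
    """Run one of the runners above and return its results with elapsed seconds."""
    start = time.perf_counter()
    results = asyncio.run(runner(tools))
    return results, time.perf_counter() - start


tools = ["email", "crm", "calendar"]
serial_results, serial_secs = timed(run_serial, tools)
parallel_results, parallel_secs = timed(run_parallel, tools)
print(f"serial {serial_secs:.2f}s, parallel {parallel_secs:.2f}s")
```

Both runs return the same results in the same order; only the waiting overlaps.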

Drift on long sessions

One pitfall with the orchestrator pattern: in long sessions, the agent forgets it is supposed to be an orchestrator and starts doing tasks itself. The fix: a scheduled task that re-injects the rule. Once a week, the agent re-reads its own orchestrator rule and confirms it's still the default. Drift solved.
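One way to picture the scheduled re-injection, as a sketch only: `is_due` and `reinject` are hypothetical helpers, the rule text is abbreviated, and how your agent actually stores always-loaded rules will depend on the tool you use.

```python
from datetime import datetime, timedelta

# Abbreviated placeholder; use the full orchestrator rule in practice.
ORCHESTRATOR_RULE = "You are an orchestrator only."


def is_due(last_run: datetime, now: datetime,
           interval: timedelta = timedelta(weeks=1)) -> bool:
    """True once at least one interval has passed since the last re-injection."""
    return now - last_run >= interval


def reinject(context: list[str], rule: str) -> list[str]:
    """Return the context with the rule first, adding it back if it dropped out."""
    return [rule] + [line for line in context if line != rule]


context = ["user: draft the proposal", "agent: writing it myself..."]
if is_due(datetime(2024, 1, 1), datetime(2024, 1, 8)):
    context = reinject(context, ORCHESTRATOR_RULE)

print(context[0])
```

Putting the rule back at the top, rather than appending it, mirrors the idea that it should be the default the agent reads first.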

