Here's a pattern we keep seeing: someone creates a single agent, gives it Sonnet, writes a vague task, and hopes it figures things out. It runs for 10 minutes, burns through credits, and produces something half-right.
Then someone else breaks the same job into three Haiku agents in a workflow. Each one has a specific task, specific tools, and a focused prompt. The whole thing runs in a fraction of the time, costs less, and produces better results.
The second approach wins every time.
Why smaller models work better for agents
A well-prompted Haiku agent with a narrow task doesn't need to be smart. It needs to be reliable. When the prompt says "extract all URLs from this page and output them as JSON," there's one right answer. Haiku nails it. Sonnet also nails it — but costs more to get the same result.
The bigger model helps when the task is ambiguous. But ambiguity in agent tasks is usually a prompt problem, not a model problem. If your agent needs Sonnet to understand what to do, the prompt isn't specific enough.
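To make "one right answer" concrete, here's what the URL-extraction task pins down. This is a deterministic sketch in plain Python, not what the agent runs internally; the point is that the task's output contract is unambiguous, so any reliable model (or no model at all) produces the same result:

```python
import json
import re

def extract_urls(page_text: str) -> str:
    """Narrow, well-defined task: find all URLs, output them as JSON."""
    urls = re.findall(r"https?://[^\s\"'<>]+", page_text)
    # Deduplicate and sort so the output is fully deterministic.
    return json.dumps(sorted(set(urls)))

page = 'See <a href="https://example.com/a">A</a> and https://example.org/b too.'
print(extract_urls(page))  # → ["https://example.com/a", "https://example.org/b"]
```

When the prompt is this specific, there's nothing left for a bigger model to figure out.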
The cost math
Haiku is roughly 1/30th the cost of Sonnet per token. A three-step workflow of Haiku agents costs less than a single Sonnet run doing the same work. And because each agent has a narrower scope, each step tends to use fewer tokens, so the gap widens further.
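A quick back-of-the-envelope version of that math, using illustrative per-token prices (the specific numbers and token counts below are hypothetical; real prices vary by model version):

```python
# Hypothetical prices in dollars per 1M tokens, chosen to reflect a ~30x gap.
HAIKU_PRICE = 0.25
SONNET_PRICE = 7.50

def run_cost(price_per_million: float, tokens: int) -> float:
    """Cost of a run at the given per-million-token price."""
    return price_per_million * tokens / 1_000_000

# One broad Sonnet run vs three narrow Haiku steps. The narrower scope means
# each Haiku step uses fewer tokens than the monolithic run.
sonnet_total = run_cost(SONNET_PRICE, 60_000)
haiku_total = sum(run_cost(HAIKU_PRICE, t) for t in (12_000, 15_000, 8_000))

print(f"Sonnet: ${sonnet_total:.2f}")   # → Sonnet: $0.45
print(f"Haiku x3: ${haiku_total:.4f}")  # → Haiku x3: $0.0088
```

Under these assumptions the three-step Haiku workflow comes in around 50x cheaper: the price gap compounds with the token savings from narrower scope.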
Real example: a competitor research workflow.
- Step 1: Haiku agent scrapes a list of competitor URLs from search results
- Step 2: Haiku agent visits each URL and extracts key data points
- Step 3: Haiku agent summarises findings into a structured report
Three focused tasks. Each one straightforward enough that Haiku handles it without breaking a sweat. Total cost: a fraction of what a single Sonnet agent would burn trying to do all three at once.
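The shape of that workflow can be sketched as a simple pipeline, where each step is one narrow agent and its output feeds the next. The functions below are hypothetical stand-ins for the model calls (in a real workflow each would be a Haiku agent with its own focused prompt and tools):

```python
from typing import Callable

def scrape_competitor_urls(query: str) -> list[str]:
    # Step 1: a Haiku agent would search and return competitor URLs.
    return ["https://example.com/rival-a", "https://example.com/rival-b"]

def extract_data_points(urls: list[str]) -> list[dict]:
    # Step 2: a Haiku agent would visit each URL and pull out key fields.
    return [{"url": u, "pricing": "unknown"} for u in urls]

def summarise(points: list[dict]) -> str:
    # Step 3: a Haiku agent would turn the structured data into a report.
    return f"Compared {len(points)} competitors."

# Output flows from one step to the next; no step needs to know the others exist.
steps: list[Callable] = [scrape_competitor_urls, extract_data_points, summarise]
result = "competitor pricing"
for step in steps:
    result = step(result)

print(result)  # → Compared 2 competitors.
```

Each function only has to be reliable at its one job, which is exactly the regime where a small model holds up.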
When to use Sonnet
Sonnet earns its cost when the task genuinely requires reasoning across complex, ambiguous inputs. Code review where context matters. Analysis that requires judgement. Tasks where the "right answer" depends on nuance.
But those tasks are rarer than you'd think. Most agent work is structured: fetch this, parse that, format the output. Haiku territory.
Workflows make this practical
The reason people default to one big Sonnet agent is that breaking work into steps used to be tedious: you'd have to create each agent, wire them together, and handle passing output from one step to the next.
Svortie workflows handle that. Describe the job, the system creates specialised agents for each step, and output flows from one to the next. You get the cost benefits of small models without the coordination overhead.
The principle
When you reach for the model picker, start with Haiku. If the task is well-defined — and it should be — Haiku will handle it. Save Sonnet for the jobs that genuinely need it.
Compose cheap specialists. Not expensive generalists.