zerotoclaude

Claude isn't one model; it's a family. The three names you'll see most are Opus, Sonnet, and Haiku. Each has a different sweet spot for capability, speed, and price. Picking the right one for a given task is a small decision, but it compounds: over a month of work, the cost and time savings add up to real money and real hours.

The three tiers, in plain English

Opus

The biggest, smartest model. Best at deep reasoning, complex refactors, multi-step planning, and any task where being a little wrong costs a lot. Slowest and most expensive per token. Worth it for high-stakes work, prohibitive for cheap repetitive work.

Sonnet

The middle ground. Most of what you'll do day-to-day in Claude Code is well served by Sonnet — code edits, bug fixes, small feature work, code reading. Substantially cheaper and faster than Opus, while still very capable. The default choice unless you know you need otherwise.

Haiku

The smallest and fastest. Good for high-volume, well-bounded tasks: filtering a list, formatting text, classifying inputs, bulk operations that don't need much judgment. Cheap enough to run constantly, fast enough that you don't notice the wait.

Match the model to the task, not the user

New users tend to pick "the best model for me" and stick with it for everything. The right move is to match the model to the task. A formatting pass on a list of strings is a Haiku job even if you're a senior architect. A subtle refactor across twenty files is an Opus job even if you're on a tight budget.

How to choose, in practice

A rough decision tree:

Is the task small, well-bounded, repetitive? Haiku.
Is it a typical code task — read a few files, make a change, verify? Sonnet.
Is it a complex investigation, a tricky refactor, or architectural reasoning? Opus.
Are you not sure? Sonnet. Sonnet is the right default; you can escalate later if it struggles.
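
The decision tree above can be written down as a tiny function. This is only a sketch: the Task class, its two fields, and the lowercase tier names are made up for illustration, and a real task rarely reduces to two booleans.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a unit of work; the fields are illustrative."""
    bounded_and_repetitive: bool = False  # e.g. formatting, filtering, classifying
    needs_deep_reasoning: bool = False    # e.g. tricky refactor, architecture


def choose_model(task: Task) -> str:
    """Return a tier name, checking the questions in the same order as the tree."""
    if task.bounded_and_repetitive:
        return "haiku"
    if task.needs_deep_reasoning:
        return "opus"
    # Typical code tasks and "not sure" both land on the default.
    return "sonnet"


if __name__ == "__main__":
    print(choose_model(Task(bounded_and_repetitive=True)))
    print(choose_model(Task(needs_deep_reasoning=True)))
    print(choose_model(Task()))
```

The useful part is the last line of the function: anything that doesn't clearly belong to Haiku or Opus falls through to Sonnet.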

Switching mid-session

You can change models within a session — useful when you start with one and realise it's not the right fit. A common pattern: start in Sonnet for the bulk of the work, switch to Opus when you hit a thorny problem, switch back when the hard part is solved.

Switching is a one-line command (/model opus or similar). The conversation context carries over; only the model behind the curtain changes.

Subagents and model choice

Subagents inherit the parent's model by default, but you can override per spawn. This is where real cost optimisation lives. A common setup:

Main session: Sonnet. Talking to you, planning, steering.
Research subagents: Haiku. Wide reads of many files, summary back. Fast and cheap.
Code-review or architecture subagents: Opus. Want the careful eye.

A workflow built this way can cost a fraction of the same workflow run entirely on Opus, usually with little quality loss, because the bulk of the tokens (the wide reads) go through the cheapest model.
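
To see why the mixed setup is cheaper, here is a back-of-the-envelope comparison. Every number in it is invented: the prices are placeholders rather than real pricing, the token volumes are guesses, and it uses a single price per token instead of separate input and output rates.

```python
# Hypothetical prices in dollars per million tokens. NOT real pricing.
PRICE_PER_MTOK = {"haiku": 1.0, "sonnet": 3.0, "opus": 15.0}

# Hypothetical token volume for each role in one workflow run.
WORKLOAD_TOKENS = {
    "main_session": 400_000,
    "research_subagents": 2_000_000,
    "review_subagent": 200_000,
}

# The setup described above, next to the same workflow run entirely on Opus.
MIXED = {
    "main_session": "sonnet",
    "research_subagents": "haiku",
    "review_subagent": "opus",
}
ALL_OPUS = {role: "opus" for role in WORKLOAD_TOKENS}


def workflow_cost(assignment: dict[str, str]) -> float:
    """Total cost in dollars for a role-to-model assignment."""
    return sum(
        WORKLOAD_TOKENS[role] / 1_000_000 * PRICE_PER_MTOK[model]
        for role, model in assignment.items()
    )


if __name__ == "__main__":
    print(f"mixed:    ${workflow_cost(MIXED):.2f}")
    print(f"all opus: ${workflow_cost(ALL_OPUS):.2f}")
```

With these made-up numbers the mixed setup comes to about $6.20 against $39.00 for all-Opus. The exact ratio depends entirely on your own prices and volumes, but the shape holds whenever the research reads dominate the token count.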

The large context variants

Some models offer a large-context variant — a version of the same model with a much bigger context window (e.g., 1M tokens). Useful for:

Reading enormous codebases in one shot.
Working with very long log files or transcripts.
Tasks that legitimately need to keep many files in mind at once.

Large-context variants typically cost more per token. The right move is to use them only when you genuinely need the window — not as a default. Most tasks fit fine in the standard context.
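
A quick way to sanity-check whether you need the bigger window is to estimate the token count of what you plan to load. The sketch below rests on two assumptions that you should replace with your own: a standard window of 200,000 tokens and a crude average of four characters per token. Neither is an exact figure, and the headroom parameter is just a rule of thumb for leaving space for the conversation itself.

```python
# Assumptions for illustration, not official figures.
STANDARD_WINDOW_TOKENS = 200_000
CHARS_PER_TOKEN = 4  # crude average; real tokenisation varies by content


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def needs_large_context(texts: list[str], headroom: float = 0.5) -> bool:
    """True if the inputs would use more than `headroom` of the standard window."""
    total = sum(estimate_tokens(t) for t in texts)
    return total > STANDARD_WINDOW_TOKENS * headroom


if __name__ == "__main__":
    small_change = ["x" * 4_000]       # roughly 1,000 tokens
    huge_log = ["x" * 1_000_000]       # roughly 250,000 tokens
    print(needs_large_context(small_change))
    print(needs_large_context(huge_log))
```

If the check says no, stay on the standard context. If it says yes, it is still worth asking whether a Haiku subagent could summarise the material first.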

Fast mode

Some Claude Code setups offer a fast mode on Opus: same model, faster output speed. Useful when latency is the bottleneck rather than capability. Toggle it on for interactive sessions where you want quicker turnaround; toggle it off when you're running unattended (nobody is waiting on the output, so the speedup doesn't help, and fast mode may be billed at a higher rate).

Reading your usage

Anthropic's dashboard shows your token consumption by model. Once a week or so, glance at it. Surprises usually mean one of two things:

A scheduled task or background agent is running more often than you intended.
You're using a more expensive model for a class of work where a cheaper one would do.

Both are easy to fix once you notice. The cost of not noticing is the "wait, why is my bill this big" surprise nobody enjoys.
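
If you can export your usage, a few lines of code will surface both kinds of surprise. The rows below are invented, and the field names (model, source, tokens) are hypothetical; a real export will have its own format.

```python
from collections import defaultdict

# Hypothetical usage rows; a real export will have its own columns and names.
USAGE = [
    {"model": "sonnet", "source": "interactive", "tokens": 900_000},
    {"model": "haiku", "source": "research-subagent", "tokens": 1_500_000},
    {"model": "opus", "source": "nightly-job", "tokens": 2_400_000},
    {"model": "opus", "source": "interactive", "tokens": 200_000},
]


def tokens_by(rows, key):
    """Sum tokens per distinct value of `key` (for example "model" or "source")."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["tokens"]
    return dict(totals)


def biggest(rows, key):
    """Return the (value, tokens) pair that consumed the most tokens."""
    totals = tokens_by(rows, key)
    return max(totals.items(), key=lambda item: item[1])


if __name__ == "__main__":
    print(biggest(USAGE, "model"))
    print(biggest(USAGE, "source"))
```

In this made-up data the top model is Opus and the top source is a nightly job, which is exactly the "scheduled task on an expensive model" pattern worth a second look.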

Caching is free money

Repeated context, like the contents of CLAUDE.md re-read every session, is often cached by the API, dramatically reducing cost on the repeated portion. Caching works on the unchanged start of the prompt, so structure long-running sessions with the stable context first, followed by the variable work. The model isn't magic, but the billing is generous about repetition.
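
Some rough arithmetic shows how much the repeated portion matters. The price and both multipliers below are assumptions chosen for illustration, so check current pricing before relying on them. The sketch also assumes the cache stays warm for every request after the first.

```python
# Hypothetical numbers, for illustration only.
INPUT_PRICE_PER_MTOK = 3.0     # dollars per million uncached input tokens
CACHE_WRITE_MULTIPLIER = 1.25  # assumed surcharge on the request that fills the cache
CACHE_READ_MULTIPLIER = 0.1    # assumed discount for tokens served from the cache


def input_cost(stable_tokens: int, variable_tokens: int,
               requests: int, cached: bool) -> float:
    """Input cost in dollars for `requests` calls (at least one) sharing a stable prefix."""
    per_token = INPUT_PRICE_PER_MTOK / 1_000_000
    variable = variable_tokens * requests * per_token
    if not cached:
        return stable_tokens * requests * per_token + variable
    # First request writes the prefix to the cache; later requests read it back.
    first = stable_tokens * per_token * CACHE_WRITE_MULTIPLIER
    rest = stable_tokens * (requests - 1) * per_token * CACHE_READ_MULTIPLIER
    return first + rest + variable


if __name__ == "__main__":
    args = dict(stable_tokens=50_000, variable_tokens=2_000, requests=100)
    print(f"uncached: ${input_cost(**args, cached=False):.2f}")
    print(f"cached:   ${input_cost(**args, cached=True):.2f}")
```

Under these assumptions, a hundred requests with a 50,000-token stable prefix cost about $15.60 uncached and about $2.27 cached. The saving only appears if the stable part really is identical and comes first.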

Don't let cost paralysis kill velocity

On the other side: don't become so cost-conscious that you avoid using Claude Code for valuable work. If Opus costs, say, thirty dollars to plan and execute a tricky refactor, that is a fraction of the value of the refactor done well, and a tiny fraction of the cost of a human afternoon spent on it. Spend confidently where the work warrants. Be efficient where it doesn't.

What to take with you

Haiku for bulk, bounded work. Sonnet as the default. Opus for hard, high-stakes thinking.
You can switch models within a session. Often the right move is Sonnet → Opus for one hard chunk, then back.
Use subagents on Haiku for wide research; keep the main session on Sonnet and reserve Opus for the critical reviewer or architecture subagent.
Large-context variants only when you actually need the extra window. Don't default to them.
Check usage weekly. Surprises usually mean an over-eager scheduler or a model mismatch.

Models and cost