Claude Sonnet 4.6: Flagship Performance at Mid-Tier Pricing
TL;DR
- Sonnet 4.6 matches Opus 4.6 on computer use (72.5% vs 72.7%) and beats it on office tasks, scoring a higher GDPval-AA Elo (1633)
- Price: $3/$15 per million tokens, vs $5/$25 for Opus 4.6. Three months ago, that performance cost $15/$75 with Opus 4.5
- Computer use went from 14.9% to 72.5% in 16 months — nearly a 5x improvement
- 1 million token context window in beta. Now the default model in claude.ai and Claude Code
- If you’re paying for the most expensive model “just in case,” it’s time to audit your stack
Twelve days ago, Anthropic launched Opus 4.6 and software stocks shed $285 billion in market cap. Yesterday, without anyone flinching, they launched something arguably more disruptive: a model that does nearly the same thing at a fraction of the cost.
Claude Sonnet 4.6 isn’t an incremental update. It’s proof that frontier AI no longer needs to be expensive.
The numbers that matter
Let’s cut to the comparison:
| Benchmark | Sonnet 4.6 | Opus 4.6 | GPT-5.2 |
|---|---|---|---|
| SWE-bench Verified (code) | 79.6% | 80.8% | ~75% |
| OSWorld (computer use) | 72.5% | 72.7% | — |
| GDPval-AA Elo (office tasks) | 1633 | <1633 | — |
| Input price (per 1M tokens) | $3 | $5 | $1.75 |
| Output price (per 1M tokens) | $15 | $25 | $14 |
Read that table again. On real-world office tasks — drafting reports, analyzing documents, organizing information — Sonnet 4.6 outperforms the model that costs 67% more. On code and computer use, the gap with Opus is less than two percentage points.
The price of intelligence is collapsing
To put this in perspective:
- Opus 4.5 (late 2025): $15 input / $75 output — the frontier price benchmark
- Opus 4.6 (February 2026): $5 input / $25 output — a 67% price cut
- Sonnet 4.6 (February 2026): $3 input / $15 output — equivalent performance, 40% cheaper than Opus 4.6
In less than three months, the cost of frontier intelligence dropped 80%. We went from $75 per million output tokens to $15 for the same level of performance. This isn’t a trend. It’s a collapse.
And the AI cost curve keeps accelerating. What costs $15 today will probably cost $5 in six months.
Computer Use: from promise to reality
The stat that generates the fewest headlines but matters the most: the ability to use a computer like a human.
In October 2024, Claude scored 14.9% on OSWorld. Yesterday, Sonnet 4.6 scored 72.5%. That's nearly a 5x improvement in 16 months.
What does that mean in practice? A $3/$15 model can:
- Navigate web applications
- Fill out forms
- Click buttons, type in fields
- Execute multi-step workflows on your screen
The AI agent that operates your computer is no longer science fiction. And you don’t need the most expensive model to make it work.
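Here's what that looks like in practice with the Anthropic Python SDK. Treat this as a minimal sketch: the model ID, tool version, and beta flag below are my assumptions, patterned on earlier computer-use betas, so check the current docs before relying on them.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Model ID, tool version, and beta flag are assumptions based on earlier
# computer-use releases (e.g. computer_20250124); verify against current docs.
response = client.beta.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=[{
        "type": "computer_20250124",
        "name": "computer",
        "display_width_px": 1280,
        "display_height_px": 800,
    }],
    messages=[{"role": "user", "content": "Open the invoice form and fill in the totals."}],
    betas=["computer-use-2025-01-24"],
)

# The model replies with tool_use blocks (screenshot, click, type, ...);
# your harness executes each action and feeds the result back in a loop.
for block in response.content:
    if block.type == "tool_use":
        print(block.input)  # e.g. {"action": "screenshot"} or {"action": "left_click", ...}
```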
One million tokens of context
Sonnet 4.6 ships with a 1 million token context window in beta. To put that in perspective: roughly 750,000 words, or about 3,000 pages of text.
There's fine print: past 200,000 input tokens, pricing rises to $6 input / $22.50 output. Even with that surcharge, output still undercuts Opus's $25, and you only pay it on requests that actually exceed 200K.
For RAG applications, large document analysis, or code review of big repositories, this matters. You no longer need to reach for Opus or Gemini just for context length.
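Opting in is roughly a one-line change, assuming the 1M-context beta keeps the header-flag pattern of earlier rollouts. The flag and model ID below are my guesses, not confirmed for 4.6:

```python
import anthropic

client = anthropic.Anthropic()

with open("big_repo_dump.txt") as f:   # hypothetical ~800K-token file
    haystack = f.read()

# Beta flag borrowed from the earlier 1M-context rollout; confirm for 4.6.
# Pricing note: past 200K input tokens you pay $6/$22.50 instead of $3/$15.
response = client.beta.messages.create(
    model="claude-sonnet-4-6",         # assumed model ID
    max_tokens=2048,
    betas=["context-1m-2025-08-07"],
    messages=[{
        "role": "user",
        "content": f"{haystack}\n\nList the breaking changes across this codebase.",
    }],
)
print(response.content[0].text)
```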
When to use Sonnet 4.6 vs Opus vs Haiku
Here’s the practical guide no press release will give you:
| Use case | Recommended model | Why |
|---|---|---|
| Chatbot, FAQ, classification | Haiku 4.5 ($0.50/$3) | Don’t bring a bazooka to a knife fight |
| Code, analysis, documents, agents | Sonnet 4.6 ($3/$15) | 95%+ of Opus performance for much less |
| Extreme scientific reasoning, research | Opus 4.6 ($5/$25) | That 1-2% matters when precision is critical |
| High-volume low-value tasks | Haiku 4.5 + batch ($0.25/$1.50) | 50% discount on batch API |
The rule is simple: start with Sonnet 4.6. Only scale up to Opus if you can prove you need that extra 1-2% of performance. And step down to Haiku for anything that doesn’t require complex reasoning.
If you’re serious about managing AI costs, this table should be your starting point.
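In code, that rule can be a dumb lookup table. A hypothetical router, with model IDs assumed rather than confirmed:

```python
# Hypothetical task-tier router implementing the table above.
# Model IDs are assumptions; map them to whatever your provider actually exposes.
MODEL_BY_TIER = {
    "simple":   "claude-haiku-4-5",    # chatbot, FAQ, classification
    "standard": "claude-sonnet-4-6",   # code, analysis, documents, agents
    "critical": "claude-opus-4-6",     # long reasoning chains where 1-2% matters
}

def pick_model(tier: str = "standard") -> str:
    """Start at Sonnet; escalate only when you've measured the need."""
    return MODEL_BY_TIER.get(tier, MODEL_BY_TIER["standard"])

# High-volume, low-value work: route to "simple" and submit via the Batch API
# for the additional 50% discount mentioned in the table.
```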
The pattern you should recognize
This has happened before:
- GPT-4 was the premium model. GPT-4o matched it for less. GPT-4.1 mini did it for pennies.
- Opus 4.5 was the flagship. Opus 4.6 beat it at a third of the price. Now Sonnet 4.6 matches it for even less.
Today’s flagship is tomorrow’s mid-tier. Always.
The implication for businesses: don’t architect your system around a specific model. Design for model swappability. What costs you $15/$75 today will cost $3/$15 tomorrow, and in a year there’ll be something better for $1/$5.
If your AI vendor locks you into a specific model with long contracts, you’re overpaying. Guaranteed.
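Swappability doesn't require a framework. At a minimum, never hard-code the model ID: read it from configuration so the next price drop is a one-line change. A minimal sketch, where the env var name and fallback ID are my choices:

```python
import os
import anthropic

# Swapping models should be a config change, not a code change.
MODEL_ID = os.environ.get("LLM_MODEL_ID", "claude-sonnet-4-6")  # assumed default ID

client = anthropic.Anthropic()

def complete(prompt: str) -> str:
    """All call sites go through here, so nothing hard-codes a model name."""
    response = client.messages.create(
        model=MODEL_ID,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text
```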
Sonnet 4.6 as the Claude Code default
For those who code: Sonnet 4.6 is now the default model in Claude Code. Claude Code already generates over $1 billion in annual revenue and has become the most widely adopted development tool in the ecosystem.
That Anthropic chose Sonnet as the default over Opus says a lot: they trust the performance is sufficient for 95% of use cases. And they’d rather developers use more (at a low price) than less (at a high price).
What this actually means for your budget
Do the math. If your company runs an internal chatbot with 10,000 daily interactions:
| Model | Estimated monthly cost |
|---|---|
| Opus 4.5 (previous) | ~€8,000-12,000 |
| Opus 4.6 | ~€3,000-5,000 |
| Sonnet 4.6 | ~€2,000-3,500 |
| Haiku 4.5 | ~€400-800 |
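Those ranges depend heavily on traffic shape. Here's the back-of-the-envelope math that lands in the same ballpark, with per-interaction token counts that are my assumptions, not measurements:

```python
# Rough monthly chatbot cost, reproducing the ballpark of the table above.
# Assumptions (mine): 10,000 interactions/day, ~1,200 input and ~300 output
# tokens per interaction, 30 days, 1 USD ~= 0.92 EUR. Prices from this article.
PRICES_PER_M = {
    "Opus 4.5":   (15.00, 75.00),
    "Opus 4.6":   (5.00, 25.00),
    "Sonnet 4.6": (3.00, 15.00),
    "Haiku 4.5":  (0.50, 3.00),
}

INTERACTIONS = 10_000 * 30            # per month
IN_TOK, OUT_TOK = 1_200, 300          # per interaction (assumed)
USD_TO_EUR = 0.92

for model, (p_in, p_out) in PRICES_PER_M.items():
    usd = INTERACTIONS * (IN_TOK * p_in + OUT_TOK * p_out) / 1_000_000
    print(f"{model:>10}: ~€{usd * USD_TO_EUR:,.0f}/month")
```

With those assumptions, every model lands inside the ranges above; plug in your own token counts before deciding.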
For most enterprise use cases, Sonnet 4.6 is the obvious choice. Performance is practically identical to the flagship, and the savings show up on your bottom line.
If your company hasn’t started with AI FinOps, now is the time. Not because prices are going up, but because they’re dropping so fast you might be overpaying without realizing it.
My take
I’ve been using Claude as my primary tool since 2024. I’ve gone through Opus 4.5, Opus 4.6, and now Sonnet 4.6. My conclusion: for 90% of my work — code, analysis, technical writing — Sonnet 4.6 is indistinguishable from Opus.
Where do I notice the difference? On very long chains of reasoning, where Opus maintains better coherence across many steps. But those cases account for the other 10% of my real-world usage.
The reality is that the "premium tax" on always buying the most expensive model is evaporating. And that's good for everyone except those whose business model was built on selling the expensive version.
If you’re still unsure which major LLM is right for you, the answer in 2026 is more nuanced than ever. But one thing is clear: price is no longer the barrier.
Keep exploring
- Claude Opus 4.6: The Model That Crashed the Stock Market - The big sibling that triggered a market earthquake two weeks ago
- FinOps for AI: How to Stop Bleeding Money on Inference Costs - Practical guide to controlling what you spend on LLMs
- ChatGPT vs Gemini vs Claude: 2026 Comparison - The big three, head to head