Claude
Claude
Text & Chatfreemium
4.6

Claude Fable 5 Review: The Most Capable Public Model Yet — for a Premium

Anthropic's Claude Fable 5 is the most capable model the public can use today — topping SWE-Bench Pro and excelling at vision and long tasks. It is also the priciest major model and ships with hard safety guardrails. Our first-look verdict.

Pros

  • Best agentic-coding result to date (80.3 on SWE-Bench Pro)
  • State-of-the-art vision plus stronger long-horizon autonomy
  • Mythos-class capability finally available to everyone
  • Robust guardrails that fall back to Opus 4.8 rather than fail

Cons

  • Most expensive major model at $10 in / $50 out per million tokens
  • Safety classifiers can block legitimate security research
  • Mandatory 30-day retention even on zero-retention plans
  • Bundled free access ends June 22 then costs usage credits

A first-look review based on Anthropic's launch data, the published system card, and early third-party testing — not extended hands-on use. We will revisit it after living with the model.

Claude Fable 5, released June 9, is the first model from Anthropic's powerful "Mythos" class that anyone can actually use. It is, on paper and in early reports, the most capable generally available model on the market — and also the most expensive, hedged by some of the strictest safety guardrails any frontier lab has shipped. The result is a model that is easy to admire and harder to recommend universally.

Capabilities

On agentic coding — the workload most teams care about — Fable 5 is in a class of its own. It scores 80.3 on SWE-Bench Pro, a wide gap over Claude Opus 4.8 (69.2) and GPT-5.5 (58.6), and it is the first model to clear 90% on Hex's analytical benchmark. Anthropic also positions it as state-of-the-art for vision, and its long-horizon autonomy is real: one external researcher reproduced a frontier physics result in 36 hours using about a third of the reasoning tokens GPT-5.5 needed over four days.

FABLE 5 vs THE FIELDFable 5Opus 4.8GPT-5.5Gemini 3.1 ProSWE-Bench Proagentic coding80.369.258.654.2FrontierCodehard coding29.313.45.7Humanity's Last Examreasoning, no tools59.049.841.444.4Blueprint-Bench 2spatial reasoning38.614.536.226.5AutomationBenchtool use17.415.512.99.6GDP.pdfknowledge-work vision29.822.524.916.7Higher is better. Cyber and biology benchmarks omitted — Fable's guardrails route those to Opus 4.8. Source: Anthropic.
Fable 5 vs the field across six benchmarks.
FRONTIERCODE — ACCURACY vs COSTDiamond subset: the hardest 50 of 150 tasks · more reasoning costs more but scores higher0102030$2$5$10$20Mean cost per task (USD, log scale)Score (%)lowmedhighxhighmaxlowmedhighxhighmaxlowmedhighxhighFable 5Opus 4.8GPT-5.5Point positions approximate the published chart; xhigh values match Anthropic's table. Source: Anthropic.
On the hardest FrontierCode tasks, Fable 5 keeps climbing as you spend on reasoning — Opus 4.8 and GPT-5.5 plateau.

Price

Capability this high is not cheap. Fable 5 lists at $10 per million input tokens and $50 per million output — less than half the old Mythos Preview price, but roughly double Opus 4.8 and the priciest of the major frontier models. For high-value, low-volume work (hard debugging, research, complex agents) the economics make sense; for high-volume chat or batch jobs, the output price adds up fast and Opus 4.8 is often the smarter default.

PRICE · $ PER MILLION TOKENSFable 5Opus 4.8Input$10$5Output$50$25About 2x Claude Opus 4.8 — the priciest of the major frontier models
About twice the price of Opus 4.8 — the priciest of the majors.

Safety and access

Fable is the guardrailed edition of Mythos. Three classifiers — cyber, biology and chemistry, and model distillation — block high-risk requests and quietly hand them to Opus 4.8; Anthropic says over 95% of sessions never trigger a fallback, and an external bug bounty logged 1,000+ hours without finding a universal jailbreak. The catch: a mandatory 30-day data-retention window applies to all Mythos-class traffic, even for customers on zero-retention contracts. It is available today via the Claude API as claude-fable-5 and free on Pro, Max, Team and seat-based Enterprise plans through June 22, after which it draws on usage credits.

SAFETY GUARDRAILSOffensive cyberBiology & chemistryModel distillationClaude Opus 4.8Blocked, then answered by Opus 4.8 — over 95% of sessions never hit a guardrail
High-risk requests are blocked and routed to Opus 4.8.
CYBER ADVERSARIAL ROBUSTNESSoffensive-cyber task success under automated red-teaming · lower is saferOpus 4.6no cyber safeguards83.2%Opus 4.772.7%Opus 4.856.6%Fable 55.4%Lower attack-success rate is safer. Fable 5's guardrails cut successful attacks to 5.4%. Source: Anthropic.
Fable 5's guardrails cut offensive-cyber attack success to 5.4%.
OFFENSIVE CYBER EVALS — MYTHOS 5 vs FABLE 5success rate % · Fable blocks every categoryMythos 5 (restricted)Fable 5 (public)Firefox88.40.0OSS-Fuzz24.00.0CyberGym83.80.0CyScenarioBench38.70.0Mythos 5 can complete these tasks; public Fable 5 refuses — 0.0 across the board. Source: Anthropic.
Mythos 5 can run these attacks; public Fable 5 scores 0.0 on all four.
MISALIGNED BEHAVIORautomated alignment assessment · score 1-10 · lower is betterSonnet 4.62.81Mythos Preview1.90Opus 4.82.05Mythos 52.06Lower is better. Mythos 5 — Fable's parent model — is as aligned as Opus 4.8. System card section 6.2.3.1.
Mythos 5's alignment matches Opus 4.8 and beats Sonnet 4.6.

Who it's for

Fable 5 is the obvious pick for engineering teams that want the best agentic coding available, for vision-heavy workflows, and for long, multi-step tasks where its autonomy pays for itself. It is a poor fit for offensive-security and red-team work (the guardrails are designed to stop exactly that), for organizations bound by strict zero-retention requirements, and for cost-sensitive, high-volume deployments where Opus 4.8 delivers most of the value at half the price.

Verdict

Claude Fable 5 resets the bar for what a publicly available model can do, and Anthropic deserves credit for shipping its most powerful system with the dangerous edges genuinely fenced off rather than simply withholding it. The premium price and the hard guardrails keep it from being a default for everyone — but for the work it is built for, nothing else comes close right now. An outstanding, if expensive and opinionated, release.