Microsoft Aims MAI-Thinking-1 Straight at Claude: a 35B Reasoning Model It Says Beats Sonnet 4.6 and Matches Opus 4.6 on Code

At Build 2026, Microsoft’s AI Superintelligence Team unveiled MAI-Thinking-1, its first in-house reasoning model — a sparse Mixture-of-Experts design with 35B active parameters (~1T total) and a 256K context window. Microsoft says human raters prefer it to Claude Sonnet 4.6 and that it matches Opus 4.6 on the SWE-Bench Pro coding benchmark, and it pointedly trained the model from scratch with zero distillation on commercially licensed data — no OpenAI involved.

At its Build 2026 developer conference, Microsoft’s AI Superintelligence Team unveiled MAI-Thinking-1, the company’s first in-house reasoning model — and it did not pick a subtle benchmark to plant its flag against. The pitch is explicitly framed around Anthropic’s Claude. MAI-Thinking-1 is a sparse Mixture-of-Experts model with roughly 35 billion active parameters out of about one trillion total, paired with a 256,000-token context window that Microsoft says is enough to read a 600-page document in a single pass.

The headline numbers are aimed at the parts of the market Claude currently owns. Microsoft says independent human evaluations run by Surge preferred MAI-Thinking-1 to Claude Sonnet 4.6 in blind tests, and that on SWE-Bench Pro — one of the hardest agentic coding benchmarks — it scores around 53%, putting it level with Claude Opus 4.6. On the math and multi-step reasoning side it posts 97.0% on AIME 2025 and 94.5% on AIME 2026. The obvious caveat: these are Microsoft’s own figures, and no independent peer review has landed yet, so the comparisons remain claims rather than confirmed results.

The training story is as much a part of the message as the scores. Microsoft says MAI-Thinking-1 was trained from scratch “with zero distillation on enterprise grade, clean and commercially licensed data” — and, pointedly, without any data distilled from third-party models, including OpenAI’s GPT series. For a company whose AI strategy has been synonymous with OpenAI for years, that line is doing strategic work: it lets Microsoft sell enterprises a model with a clean, auditable data provenance that does not depend on a partner-slash-rival.

MAI-Thinking-1 is available now in private preview on Microsoft Foundry, with function calling, multi-layered instruction following and Chat Completions API compatibility so existing code can target it with minimal changes. It is one of seven new in-house models Microsoft showed at Build — alongside MAI-Image-2.5, MAI-Transcribe 1.5, MAI-Voice-2 and the MAI-Code-1 coding model — and Microsoft stressed that “developer choice doesn’t stop at our catalog,” noting the MAI models are also offered through Fireworks AI, Baseten and OpenRouter.

Strategically, this is the clearest signal yet of Microsoft’s shift from OpenAI reseller to multi-vendor platform. The company is not walking away from OpenAI — the partnership and the GPT models remain front and center in Foundry — but it now wants to own a credible frontier-grade option of its own, sitting inside Azure’s governance and security stack, so that enterprise developers have genuine model choice without leaving the platform. It dovetails with the rest of Microsoft’s Build narrative: an agent-first runtime, a control plane for orchestrating agents, and increasingly its own silicon to run them on.

Why aim at Claude specifically? Because in the enterprise, Claude — and Claude Code in particular — has become the default for serious coding and agentic work, exactly the high-value workloads Microsoft most wants flowing through Foundry rather than out to a competitor. A 35B-active model that can credibly claim Opus-4.6-class coding at a fraction of the token cost is a direct attempt to undercut that default on price-performance. Whether the claim survives contact with independent benchmarks is the question everyone — Anthropic included — will be waiting to see answered.

Microsoft Aims MAI-Thinking-1 Straight at Claude: a 35B Reasoning Model It Says Beats Sonnet 4.6 and Matches Opus 4.6 on Code

Want AI news before everyone else?

Related Articles

SpaceXAI's Grok 4.5 Bets on Efficiency Over the Crown — Opus-Class Coding at 4.2× Fewer Tokens

PrismML's Bonsai 27B Squeezes a 27-Billion-Parameter Model Onto Your Phone — at 3.9GB, Fully Offline

Mira Murati's Thinking Machines Releases Inkling — a 975B Open-Weight, Multimodal Model Built to Be Customized