Google Launches Gemini 3.1 Pro: Tops ARC-AGI-2 Benchmark with 2M Token Context and Agentic Coding

Google's Gemini 3.1 Pro posts a verified 77.1% on the ARC-AGI-2 reasoning benchmark, which Google describes as more than double its predecessor's reasoning performance, and arrives with a 2-million-token context window, multi-step agentic coding, and availability across the Gemini API, Vertex AI, and NotebookLM.

Google has released Gemini 3.1 Pro in preview, positioning it as its most capable reasoning model to date. The model lands at the top of the ARC-AGI-2 benchmark, which evaluates an AI's ability to solve logic patterns it has never encountered before, with a verified score of 77.1%. Google described this as more than double the reasoning performance of Gemini 3 Pro and on par with the best scores posted by any competing model.

Developers can access Gemini 3.1 Pro through the Gemini API, Google AI Studio, Gemini CLI, Android Studio, and Google Antigravity. Enterprise customers can reach it via Vertex AI and Gemini Enterprise. Consumers will find it in the Gemini app and NotebookLM, though higher usage limits are reserved for the Google AI Pro and Ultra subscription tiers. The model is launching in preview while Google validates it ahead of general availability.

Gemini 3.1 Pro is designed for complex, multi-step tasks: generating code-based animations, building interactive dashboards through API configuration, creating website-ready animated SVGs, and developing immersive 3D experiences with hand-tracking capabilities. The model's 2-million-token context window — shared with the Gemini 3.1 Ultra variant — allows it to process over 1,500 pages of text or hours of video in a single session. These capabilities place it in direct competition with OpenAI's GPT-5.4 and Anthropic's Claude Opus 4.7 for enterprise developers and research workflows.
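To put the context-window figures above in perspective, here is a rough back-of-envelope sketch. Only the 2-million-token window and the 1,500-page figure come from the article; the tokens-per-page densities, the reserved-token budget, and the helper names `implied_tokens_per_page` and `max_pages` are illustrative assumptions, not anything Google has published.

```python
# Back-of-envelope context-window budgeting.
# The 2,000,000-token window and the 1,500-page figure come from the article;
# every tokens-per-page density below is an illustrative assumption.

CONTEXT_WINDOW_TOKENS = 2_000_000
ARTICLE_PAGE_FLOOR = 1_500


def implied_tokens_per_page(window: int = CONTEXT_WINDOW_TOKENS,
                            pages: int = ARTICLE_PAGE_FLOOR) -> float:
    """Average tokens per page at which `pages` pages exactly fill the window."""
    return window / pages


def max_pages(tokens_per_page: float, reserved_tokens: int = 0,
              window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Whole pages that fit after setting aside tokens for prompt and output."""
    if tokens_per_page <= 0:
        raise ValueError("tokens_per_page must be positive")
    usable = max(window - reserved_tokens, 0)
    return int(usable // tokens_per_page)


if __name__ == "__main__":
    print(f"Break-even density: {implied_tokens_per_page():.0f} tokens/page")
    for density in (500, 800, 1_300):  # hypothetical page densities
        pages = max_pages(density, reserved_tokens=50_000)
        print(f"{density} tokens/page -> {pages} pages")
```

At any density below roughly 1,333 tokens per page, more than 1,500 pages fit in the window, which is consistent with the "over 1,500 pages" claim. Actual capacity depends on the tokenizer and on how much of the window is reserved for instructions and output.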

The Gemini 3.1 launch also includes the Gemma 4 open models, available through Google AI Studio and the Gemini API. Together, the Gemini 3.1 wave marks Google's most significant AI update since Gemini 3 launched earlier in the year. As of mid-April 2026, Gemini 3.1 Ultra and GPT-5.4 Pro are tied at 57 points on the Artificial Analysis Intelligence Index, with Gemini 3.1 Ultra pulling ahead on GPQA Diamond at 94.3%. Those results keep Google at the frontier of AI reasoning benchmarks heading into the second half of 2026.
