OpenAI Unveils Jalapeño, Its First Custom AI Chip
OpenAI unveiled its first in-house chip on June 24: Jalapeño, an LLM-inference ASIC co-built with Broadcom. It went from initial design to tape-out in about nine months with help from OpenAI's own models, is slated for initial deployment by the end of 2026, and targets substantially better performance-per-watt.
OpenAI unveiled its first custom chip on June 24, an AI accelerator named Jalapeño that it built with Broadcom. Billed as an "Intelligence Processor," the chip is a reticle-sized ASIC designed specifically to run large language models in response to user queries (inference) rather than to train new ones. It marks OpenAI's move from being purely a buyer of AI hardware to designing the silicon at the bottom of its own stack.
The division of labor pairs each company's strengths. OpenAI architected the accelerator around its own view of how LLM inference should work, while Broadcom handled the silicon implementation, networking and connectivity that turn a design into a deployable system. Broadcom's chief executive Hock Tan and president Charlie Kawwas hand-delivered the first chip to OpenAI's Sam Altman and Greg Brockman, a bit of theater underscoring how strategically important both sides consider the tie-up, which was first announced in October 2025.
Two details stand out. First, speed: OpenAI says the accelerator went from initial design to manufacturing tape-out in roughly nine months, an unusually fast cycle for custom silicon, and one the company says was accelerated by using its own AI models to help with the chip design itself. Second, efficiency: early testing reportedly shows Jalapeño delivering "substantially better" performance-per-watt than current state-of-the-art parts. That metric matters most when inference, not training, is the cost that scales with every user.
"We have a deep understanding of the workload," Brockman said, framing the effort as a hunt for "specific workloads that are underserved" and how to "build something that will be able to accelerate what's possible." OpenAI describes working "across the stack" — co-optimizing the chip, kernels, memory, networking and deployment — to make its models "faster, more reliable, and more affordable." Jalapeño is set for initial deployment by the end of 2026 and will expand from there, part of a gigawatt-scale buildout of OpenAI-designed accelerators with Broadcom.
The strategic message is hard to miss: OpenAI wants to own more of its destiny rather than depend entirely on Nvidia GPUs for the compute that runs ChatGPT. Custom inference silicon, tuned to OpenAI's exact workloads, is a bet that vertical integration can drive down the cost of serving models at scale — the same playbook Google has run with its TPUs and Amazon with Trainium, now extended to the company with the largest consumer AI footprint.