OpenAI has announced its first custom-built chip. Named Jalapeño, it was designed and built in partnership with Broadcom, and it marks OpenAI’s most direct move yet into hardware. The company even used its own AI models to help design it.
The chip is still in testing, but early results suggest it delivers significantly better performance per watt than current alternatives. OpenAI has been tight-lipped about specifics, but the emphasis on efficiency is telling. Running AI at scale is expensive, and even modest gains in power efficiency translate into serious savings when you’re serving millions of users every day.
The Broadcom partnership was first announced in October, though rumors about OpenAI building its own chips had been circulating long before that. The motivation is straightforward: OpenAI relies heavily on Nvidia GPUs, and that dependence is costly. Custom silicon gives the company more control over its costs and its roadmap.
It’s a path other tech giants have already taken. Google has its TPUs, Amazon has Trainium and Inferentia, and both companies use those chips to run AI workloads more cheaply than off-the-shelf Nvidia hardware allows. OpenAI is following the same logic, just later in the game.
Jalapeño is built specifically for inference: running an already-trained AI model each time a user sends a request. Training, by contrast, is the far more computationally intensive process of building a model in the first place. OpenAI’s announcement highlighted the chip’s low operating cost when running real-time coding models, which points to products like Codex as an early use case.
For now, heavy-duty tasks like pre-training will almost certainly still run on Nvidia hardware. But inference is where the ongoing, day-to-day costs pile up. Every time someone uses ChatGPT or one of OpenAI’s APIs, that’s an inference request. Reducing the cost of each one, even slightly, adds up fast at OpenAI’s scale.
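To make the scale argument concrete, here is a rough back-of-the-envelope sketch. Every number in it is a hypothetical placeholder: OpenAI has not published its request volume, its per-request cost, or Jalapeño's actual efficiency gain, and the function name is invented for illustration.

```python
def annual_inference_savings(requests_per_day: float,
                             cost_per_request_usd: float,
                             reduction_fraction: float) -> float:
    """Estimate yearly savings from a fractional cut in per-request cost."""
    if not 0.0 <= reduction_fraction <= 1.0:
        raise ValueError("reduction_fraction must be between 0 and 1")
    daily_cost = requests_per_day * cost_per_request_usd
    return daily_cost * reduction_fraction * 365


# Hypothetical inputs, NOT figures from OpenAI:
# one billion requests a day, a tenth of a cent each, a 5% cost reduction.
savings = annual_inference_savings(
    requests_per_day=1_000_000_000,
    cost_per_request_usd=0.001,
    reduction_fraction=0.05,
)
print(f"Estimated savings: ${savings:,.0f} per year")
```

Under those made-up assumptions, a 5% cut in per-request cost works out to roughly $18 million a year. The real figures could be very different, but the shape of the math is why small efficiency gains matter at this scale.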
OpenAI president Greg Brockman laid out the company’s thinking on its in-house podcast after the Broadcom deal was announced. “We have a deep understanding of the workload,” he said. “We’ve really been looking for specific workloads that are underserved, and asking how can we build something that will be able to accelerate what’s possible?”
The chip also fits into a larger strategy OpenAI has been building quietly. The company is no longer just an AI research lab that sells API access. It’s building:
- Frontier AI models
- Consumer and developer products on top of those models
- Data centers to run everything
- Now, purpose-built chips to power it all
OpenAI put this plainly in its announcement: “OpenAI is not only developing frontier models or building products on top of them; it is designing the infrastructure underneath them: chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and product experience.”
The goal, as the company framed it, is full-stack optimization. When the same organization controls the model, the software, and the chip it runs on, each layer can be tuned to work better with the others. Apple takes a similar vertically integrated approach with its own silicon, and it has paid off in both performance and power efficiency.
Whether Jalapeño delivers on that promise at production scale remains to be seen. But the direction OpenAI is heading is clear. The company wants less exposure to Nvidia’s pricing, more control over its own infrastructure, and a tighter grip on the economics of running AI. A custom chip is one of the most direct ways to get all three.




