PJFP.com

Pursuit of Joy, Fulfillment, and Purpose


  • OpenAI and Broadcom Unveil Jalapeño, a Custom LLM Inference Chip to Cut Compute Costs and Reduce Nvidia Dependence

    On Wednesday, June 24, 2026, OpenAI and Broadcom pulled the wrapper off Jalapeño, a custom silicon accelerator that OpenAI is calling its first “Intelligence Processor” and its first real move into designing the hardware underneath its own models. Broadcom President and CEO Hock Tan and President Charlie Kawwas physically handed the wafer to OpenAI CEO Sam Altman and President and Co-Founder Greg Brockman, a staged moment meant to signal that the ChatGPT maker is no longer just a models-and-products company but is now reaching all the way down to the chip. Jalapeño is purpose-built for large language model inference, the compute-intensive job of actually serving answers to users rather than training the model in the first place. OpenAI plans to deploy it at gigawatt scale by the end of 2026 as the first step in a multi-generation platform built with Broadcom and Canadian electronics manufacturer Celestica. You can read the announcement straight from the source in OpenAI’s official post.

    TLDR

    OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom AI chip: an ASIC designed from a blank slate specifically for LLM inference rather than training, manufactured by TSMC, and integrated by Celestica into server systems that only OpenAI will use. OpenAI claims the chip went from initial design to manufacturing tape-out in just nine months, which it calls the fastest ASIC development cycle ever in high-performance advanced semiconductors, accelerated in part by using its own AI models to design the silicon. Engineering samples are already running ML workloads in the lab, including GPT-5.3-Codex-Spark, and OpenAI says early testing shows performance per watt “substantially better” than current state-of-the-art. That claim is self-reported and not yet independently verified, with a full technical report promised in the coming months. Broadcom CEO Hock Tan told Reuters the chip matches Nvidia’s Blackwell and Google’s TPUs, while OpenAI frames the launch as part of a flywheel in which it owns the full stack from chip to model to product. The chip slots into a broader infrastructure strategy targeting 10 gigawatts of custom accelerator capacity between 2026 and 2029, with deployments alongside Microsoft and other partners. The Decoder reported Microsoft is expected to buy 40 percent of the chips, a guarantee Broadcom reportedly demanded to secure the first phase. The move is widely read as OpenAI diversifying away from Nvidia, continuing a procurement spree that already includes AWS Trainium, AMD, and Cerebras, as inference quietly becomes the company’s real cost center.

    Thoughts

    The single most important word in this announcement is “inference,” and it is the word doing the heavy lifting. Training a frontier model is a capital expense that happens in bursts. Inference is the bill that arrives every single day, forever, scaling linearly with usage. Every ChatGPT reply, every Codex task, every API call, every agent step is an inference event, and as OpenAI’s product surface explodes that recurring cost is the thing that actually threatens the unit economics. A custom chip aimed squarely at inference is therefore not a vanity project or a research flex. It is OpenAI attacking the largest variable cost in its business at the root, trying to bend its cost-per-token curve below what it pays renting Nvidia GPUs. If Jalapeño lands anywhere near its claims, the payoff is not faster benchmarks, it is gross margin.
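    To make the training-versus-inference argument concrete, here is a minimal sketch of the arithmetic. Every number in it is a hypothetical placeholder chosen for round results, not a figure from OpenAI or Broadcom, and the `CostModel` class and `annual_savings` helper are names invented for this illustration. It assumes, as the paragraph above does, that inference cost scales linearly with token volume.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Toy cost model. All values are hypothetical placeholders."""

    training_cost: float             # one-time spend, in dollars
    cost_per_million_tokens: float   # recurring serving cost, in dollars
    tokens_per_day: float            # daily inference volume

    def daily_inference_cost(self) -> float:
        # Linear in usage: twice the tokens means twice the bill.
        return self.tokens_per_day / 1_000_000 * self.cost_per_million_tokens

    def days_until_inference_exceeds_training(self) -> float:
        # How long before cumulative serving cost passes the one-time training cost.
        return self.training_cost / self.daily_inference_cost()


def annual_savings(model: CostModel, cost_reduction: float) -> float:
    """Yearly savings if per-token cost falls by `cost_reduction` (a fraction from 0 to 1)."""
    if not 0.0 <= cost_reduction <= 1.0:
        raise ValueError("cost_reduction must be between 0 and 1")
    return model.daily_inference_cost() * 365 * cost_reduction


# Hypothetical inputs: $100M training run, $2 per million tokens, 1 trillion tokens per day.
baseline = CostModel(
    training_cost=100_000_000,
    cost_per_million_tokens=2.0,
    tokens_per_day=1_000_000_000_000,
)

print(baseline.daily_inference_cost())                   # 2000000.0
print(baseline.days_until_inference_exceeds_training())  # 50.0
print(annual_savings(baseline, 0.25))                    # 182500000.0
```

    Under these made-up inputs the serving bill overtakes the training bill in 50 days, which is why a modest cut in cost per token can matter more than a faster training run.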

    The performance-per-watt claim, though, deserves the most skeptical reading in the room. OpenAI says Jalapeño will deliver performance per watt “substantially better” than current state-of-the-art, but it has not finalized the numbers, has not said which chips it tested against, on what tasks, or under what conditions, and the full technical report is somewhere in the indefinite “coming months.” These are self-reported figures from a company with an enormous interest in convincing the market it has a credible alternative to Nvidia. Hock Tan’s line that the chip is “as good as” Blackwell and Google’s TPUs is a CEO talking his own book in an interview, not a measured result. The honest posture is to treat the figures as marketing until the technical report lands. A chip running engineering samples in a lab at target frequency is real progress, but it is a very long way from a chip that holds those numbers across a production fleet under messy real-world load.

    OpenAI left the most revealing detail out of its own press release: the report, via The Decoder, that Broadcom demanded Microsoft guarantee it will buy 40 percent of the chips to secure the first phase. That single sentence tells you who is actually carrying the risk. Building gigawatt-scale custom silicon is brutally capital-intensive, and Broadcom is not willing to commit manufacturing capacity on the strength of OpenAI’s demand alone. It wants a balance sheet behind the order, and Microsoft, OpenAI’s largest backer, is the balance sheet. That detail quietly reframes the whole “OpenAI owns the stack” narrative. OpenAI may design the chip, but the deployment is underwritten by Microsoft’s purchasing commitment, which means Microsoft also gets leverage and supply security out of an OpenAI-branded part. Ownership of the design is not the same as ownership of the risk.

    The flywheel framing is genuinely interesting and probably the most defensible strategic claim OpenAI is making. OpenAI says it used its own models to accelerate parts of the chip design and optimization, compressing a normally multi-year ASIC cycle into nine months. If that is even partly true, it is a meaningful loop: the models help design the chips, the chips run the models more cheaply, the cheaper models drive more usage and revenue, and the revenue funds the next chip. That is a compounding advantage that is hard for a pure hardware vendor to replicate and hard for a pure software lab to replicate. The catch is that nine months from design to tape-out is a claim about speed, not about whether the resulting chip is actually competitive in volume. Fast tape-out and great silicon are different achievements, and the industry has seen plenty of chips that taped out quickly and underwhelmed in production.

    Strip away the “Intelligence Processor” branding and this is a playbook we have already watched run three times. Google built TPUs, Amazon built Trainium and Inferentia, Meta built MTIA, and all of them turned to Broadcom or Marvell for the design IP that is hard to replicate in-house. OpenAI is doing the same thing with the same partner, just later and louder. The diversification arc is unmistakable: OpenAI was one of the biggest Nvidia GPU buyers on earth, and in the span of a year it has signed deals for AWS Trainium, AMD accelerators, and Cerebras inference hardware, then unveiled its own custom ASIC. Nvidia is not in trouble, since demand still vastly outstrips supply, but the era where the largest AI labs were captive single-vendor customers is clearly ending. The most intriguing wildcard is OpenAI’s own line that Jalapeño is “designed with flexibility to work with all LLMs.” That is not how you describe a chip you intend to keep entirely to yourself. It hints, however faintly, at an OpenAI that could one day rent out inference infrastructure the way it now rents models, which would put it in direct competition with the very cloud providers it currently depends on.

    Key Takeaways

    • OpenAI and Broadcom unveiled Jalapeño on Wednesday, June 24, 2026, OpenAI’s first custom AI chip and its first piece of in-house silicon after years focused on models and products.
    • The chip is branded an “Intelligence Processor” and described as the first AI accelerator in a multi-generation compute platform the two companies are building together.
    • Jalapeño is purpose-built for large language model inference, the compute-intensive work of generating responses and serving answers to users, and explicitly not for training.
    • Inference is OpenAI’s recurring cost center: every ChatGPT conversation, coding request, image generation, and agent action relies on it, making it one of the highest ongoing costs in the business.
    • Broadcom President and CEO Hock Tan and President Charlie Kawwas physically delivered the first wafer to OpenAI CEO Sam Altman and President Greg Brockman.
    • OpenAI designed the chip from scratch around its understanding of LLM fundamentals, informed by its roadmap of models, kernels, serving systems, and product needs.
    • Jalapeño is described as a blank-slate design for modern LLM inference, not a general-purpose accelerator adapted from earlier AI workloads.
    • The chip is shaped by the systems OpenAI runs daily across ChatGPT, Codex, the API, and future agentic products, while also being designed to work with current and future LLMs across the industry.
    • The stated performance goal is to combine the throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, suiting it for interactive LLM products at scale.
    • OpenAI frames this as its full-stack advantage: it designs frontier models, builds products on top of them, and now designs the chip architecture, kernels, memory systems, networking, scheduling, and deployment systems underneath.
    • OpenAI claims Jalapeño went from initial design to manufacturing tape-out in just nine months.
    • The companies call it what they believe to be the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors, against a backdrop of typically multi-year timelines.
    • OpenAI used its own AI models to accelerate parts of the chip design and optimization process, which it credits for the speed.
    • OpenAI frames the result as a flywheel: the same models served to users help improve the infrastructure that runs future models, lowering compute cost across the industry.
    • Engineering samples of Jalapeño are already running ML workloads in the lab at production target frequency and power.
    • Among the workloads running on the samples is OpenAI’s GPT-5.3-Codex-Spark model.
    • GPT-5.3-Codex-Spark currently runs on Cerebras hardware, which also specializes in inference, per The Decoder.
    • OpenAI says early testing shows Jalapeño will deliver performance per watt “substantially better” than current state-of-the-art hardware.
    • That performance-per-watt claim is self-reported and lacks independent verification; OpenAI has not said which chips it tested against, on what tasks, or under what conditions.
    • OpenAI says it is still measuring final performance and has promised a detailed technical report in the coming months.
    • The architecture reduces data movement and balances compute, memory, and networking resources to push realized utilization much closer to theoretical peak performance.
    • Jalapeño is an ASIC, a type of chip that experts say is less flexible than Nvidia’s GPUs but cheaper and tailorable to specific AI tasks.
    • Broadcom contributes silicon implementation and networking technologies, including its Tomahawk networking silicon, to bring the platform to large-scale production.
    • Canadian electronics manufacturer Celestica provides board, rack, and system integration expertise and will build the server systems.
    • The chips are manufactured by Taiwan’s TSMC, the world’s leading advanced semiconductor foundry, after OpenAI sent over the design.
    • Both the chips and the Celestica-built server systems will be used only by OpenAI, not sold to outside customers.
    • OpenAI plans to deploy Jalapeño at gigawatt scale by the end of 2026, with expansion in the years ahead, as the first step in a multi-generation plan.
    • Hock Tan said gigawatt-scale data center deployment will happen with Microsoft and other partners beginning in 2026.
    • The Decoder reported Microsoft is expected to buy 40 percent of the chips, with Broadcom reportedly demanding Microsoft guarantee that share to secure the first phase.
    • Broadcom CEO Hock Tan told Reuters that Jalapeño is as good as Nvidia’s Blackwell chips and the TPUs designed by Alphabet’s Google.
    • In October 2025, after 18 months of working together, OpenAI and Broadcom went public with plans to develop and deploy racks of OpenAI-designed chips starting in late 2026; CNBC framed the unveiling as coming eight months after that deal.
    • The prior OpenAI-Broadcom plan ultimately aimed at 10 gigawatts of custom AI accelerator capacity, with deployments expected between 2026 and 2029.
    • Estimates suggest OpenAI’s broader infrastructure plans could eventually involve around 26 gigawatts of computing capacity across custom chips, Nvidia hardware, and other accelerators.
    • OpenAI has been one of the biggest buyers of Nvidia’s GPUs since kickstarting the generative AI boom in 2022, but explosive demand has pushed it to seek other sources of advanced silicon.
    • Earlier in 2026 OpenAI struck a deal with Amazon Web Services that includes use of AWS Trainium chips, and has also signed agreements with AMD and with Cerebras, which held its IPO in May.
    • The move is widely characterized as OpenAI diversifying away from and reducing dependence on Nvidia while creating an alternative to its GPUs.
    • OpenAI’s stated goals with the chip are to reduce costs, improve energy efficiency, secure long-term computing supply, and gain more control over the infrastructure powering its services.
    • Broadcom shares climbed about 2 percent following the announcement, are up roughly 10 percent year-to-date in 2026, and have multiplied almost sevenfold since the end of 2022.
    • To build in-house chips, Meta, Amazon, and Google have turned to firms like Broadcom and Marvell for design services and IP that are hard to replicate internally; Reuters first reported OpenAI was exploring its own chip in 2023, and sources told Reuters in April 2026 that Anthropic is weighing its own AI chip.
    • Broadcom’s margin on custom AI chips is currently lower than on products like networking switches due to AI-driven high-bandwidth memory demand; Tan said SK Hynix and Samsung Electronics supply Broadcom with memory chips.

    Detailed Summary

    A blank-slate chip built only for inference

    Jalapeño is OpenAI’s first so-called Intelligence Processor, and the company is emphatic that it is not a repurposed general-purpose accelerator. It was designed from a blank slate specifically for modern large language model inference, the job of crunching data to answer a user’s query rather than the separate, bursty work of training a model. OpenAI says it designed the chip from scratch around its own deep understanding of LLM fundamentals, informed by its roadmap of models, kernels, serving systems, and product needs, drawing on the systems it runs every day across ChatGPT, Codex, the API, and future agentic products. The stated objective is to fuse the raw power and throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, which would make Jalapeño particularly well suited to interactive products used at scale. Notably, OpenAI also says the chip is designed with flexibility to work with all LLMs across the industry, not only its own, a claim that sits a little oddly next to its plan to keep the hardware entirely in-house.

    The full-stack flywheel and AI designing its own silicon

    OpenAI is selling Jalapeño as proof of a full-stack advantage. The argument is that because OpenAI now develops frontier models, builds products on top of them, and designs the infrastructure underneath them, including chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and the product experience, every layer can be optimized around the same goal of making its models faster, more reliable, and cheaper. OpenAI describes this as a flywheel: better infrastructure drives compute efficiency, which enables better training and serving, which powers more capable models, which become better products, which drive more usage and revenue, which funds the next generation of infrastructure. The most striking piece of that loop is that OpenAI used its own AI models to accelerate parts of the chip’s design and optimization. The company’s framing is direct: if AI can help engineers design better chips faster, it can lower the cost of compute across the industry. That self-referential loop is the part of the announcement that is genuinely novel rather than a rerun of an existing hyperscaler playbook.

    Nine-month tape-out and the partner stack

    OpenAI claims it took roughly nine months to go from initial design to manufacturing tape-out, and calls this what it believes to be the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors, against an industry norm measured in years. It credits deep software-hardware co-development, Broadcom’s silicon implementation expertise, and the use of its own models to compress the schedule. The work is split across a clear partner stack: OpenAI provides the architecture and AI-specific requirements, Broadcom contributes silicon implementation and networking technology, including its Tomahawk networking silicon, and Celestica handles boards, racks, and system integration, building the actual server systems. Once the design was complete, OpenAI sent it to TSMC in Taiwan, the world’s leading advanced foundry, for manufacturing. Crucially, both the chips and the systems built around them are for OpenAI’s exclusive use; they are not products being sold to outside customers.

    Performance claims that nobody can check yet

    OpenAI says early testing shows Jalapeño will deliver performance per watt substantially better than current state-of-the-art hardware, with an architecture that reduces data movement and balances compute, memory, and networking to push realized utilization much closer to theoretical peak. Hardware program lead Richard Ho said the team optimized around the kernels, memory movement, networking, and serving patterns that matter most for frontier models, and that the chip will execute key workloads close to the hardware’s theoretical limits. He told Reuters he expects it to perform well on all kinds of future LLM iterations. The important caveat is that none of this is verifiable yet. OpenAI is still measuring final performance, has not finalized the numbers, and has not disclosed which chips it benchmarked against, on what tasks, or under what conditions, with the technical report only promised in the coming months. As The Decoder put it bluntly, these are self-reported numbers, unverifiable for now, that should not be taken at face value. Broadcom CEO Hock Tan’s separate claim to Reuters that the chip is as good as Nvidia’s Blackwell and Google’s TPUs is similarly an unverified assertion from an interested party.
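    For readers unfamiliar with the metrics in dispute, the sketch below shows how performance per watt and utilization are usually computed. The function names are invented for this illustration, and the sample measurements are hypothetical, since OpenAI has published no figures. It also assumes throughput is measured in tokens per second, which is one common choice for inference but not something the announcement specifies.

```python
def perf_per_watt(tokens_per_second: float, power_watts: float) -> float:
    """Throughput delivered per watt of power drawn (tokens per second per watt)."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return tokens_per_second / power_watts


def joules_per_token(tokens_per_second: float, power_watts: float) -> float:
    """Energy spent per generated token; watts divided by tokens/s gives joules/token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


def utilization(achieved_flops: float, peak_flops: float) -> float:
    """Fraction of the theoretical peak that a real workload actually reaches."""
    if peak_flops <= 0:
        raise ValueError("peak_flops must be positive")
    return achieved_flops / peak_flops


def relative_gain(candidate: float, baseline: float) -> float:
    """Fractional improvement of candidate over baseline (0.5 means 50 percent better)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return candidate / baseline - 1.0


# Hypothetical measurements. A fair comparison needs the same model,
# batch size, sequence lengths, and latency target on both chips.
baseline_ppw = perf_per_watt(tokens_per_second=10_000, power_watts=1_000)
candidate_ppw = perf_per_watt(tokens_per_second=12_000, power_watts=800)

print(relative_gain(candidate_ppw, baseline_ppw))
print(joules_per_token(tokens_per_second=12_000, power_watts=800))
print(utilization(achieved_flops=6.0e14, peak_flops=1.0e15))
```

    The comment on comparison conditions is the crux: a ratio like this means little until the baseline chip, the workload, and the test conditions are disclosed, which is exactly what the promised technical report would need to supply.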

    Gigawatts, Microsoft’s 40 percent, and who carries the risk

    Jalapeño is the opening move in a much larger infrastructure buildout. Initial deployment is targeted for the end of 2026 at gigawatt scale, expanding over multiple generations. Tan said the gigawatt-scale data centers will come online with Microsoft and other partners beginning in 2026. The deal traces back to October 2025, when, after 18 months of collaboration, OpenAI and Broadcom went public with plans to deploy racks of OpenAI-designed chips, ultimately aiming for 10 gigawatts of custom accelerator capacity with deployments expected between 2026 and 2029. Broader estimates put OpenAI’s total infrastructure ambition at around 26 gigawatts across custom chips, Nvidia hardware, and other accelerators. The detail that cuts through the optimism comes from The Decoder: Microsoft is expected to buy 40 percent of the chips, and Broadcom reportedly demanded that Microsoft guarantee that purchase to secure the first phase. That guarantee shows that the financial risk of this buildout is not OpenAI’s alone; it rests heavily on its largest backer’s balance sheet.
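    The gigawatt figures above are easier to weigh side by side. The sketch below uses only the numbers reported in this piece (10 gigawatts of custom capacity, roughly 26 gigawatts overall, a 40 percent Microsoft share, deployments from 2026 through 2029) and adds two loudly labeled assumptions: that the build-out is spread evenly across those four years, and that a share of chips maps directly onto a share of capacity. Neither assumption comes from the reporting, and the 1-gigawatt first phase in the example is hypothetical because the phase size has not been disclosed.

```python
CUSTOM_TARGET_GW = 10.0          # OpenAI-Broadcom custom accelerator target, per the article
TOTAL_ESTIMATE_GW = 26.0         # estimated total across custom, Nvidia, and other hardware
MICROSOFT_CHIP_SHARE = 0.40      # share Microsoft is reportedly expected to buy
DEPLOY_YEARS = range(2026, 2030) # 2026 through 2029 inclusive


def fraction(part: float, whole: float) -> float:
    """Return part / whole, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


def guaranteed_gw(phase_gw: float, share: float = MICROSOFT_CHIP_SHARE) -> float:
    """Capacity covered by the purchase guarantee.

    ASSUMPTION: a share of chips equals the same share of gigawatts.
    """
    return phase_gw * share


# Custom silicon as a slice of the total estimated build-out.
custom_fraction = fraction(CUSTOM_TARGET_GW, TOTAL_ESTIMATE_GW)

# ASSUMPTION: an even pace across the deployment window.
average_gw_per_year = CUSTOM_TARGET_GW / len(DEPLOY_YEARS)

print(f"{custom_fraction:.1%}")   # 38.5%
print(average_gw_per_year)        # 2.5
print(guaranteed_gw(1.0))         # 0.4, for a hypothetical 1 GW first phase
```

    Even on these rough terms, custom chips would account for well under half of the estimated total, which fits the article's point that Nvidia and other suppliers remain central to the plan.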

    The Nvidia diversification arc and Broadcom’s windfall

    Jalapeño is the clearest signal yet of OpenAI loosening its dependence on Nvidia. OpenAI has been one of the biggest buyers of Nvidia GPUs since it kickstarted the generative AI boom in 2022, but demand has exploded past what any single vendor can supply. Within 2026 alone, OpenAI has struck a deal with AWS that includes Trainium chips, signed agreements with AMD and with Cerebras, which held its IPO in May, and now rolled out its own ASIC. The pattern mirrors what Meta, Amazon, and Google already did, all of them leaning on firms like Broadcom and Marvell for design IP that is hard to build in-house, and Anthropic is reportedly weighing the same move, per sources who spoke to Reuters in April 2026. Broadcom is the obvious beneficiary, with shares up about 2 percent on the news, up roughly 10 percent in 2026, and up nearly sevenfold since the end of 2022. Even so, Tan noted that the AI-driven surge in high-bandwidth memory demand makes Broadcom’s margin on custom AI chips lower than on products like networking switches, with SK Hynix and Samsung Electronics supplying the memory.

    Notable Quotes

    “The world is moving to a compute-powered economy.”

    Greg Brockman, President and Co-Founder of OpenAI, framing the launch as a broad economic shift

    “Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems. By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access.”

    Greg Brockman, President and Co-Founder of OpenAI, on the full-stack rationale for building its own chip

    “Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers.”

    Richard Ho, who leads OpenAI’s hardware program, describing the chip as purpose-built rather than adapted

    “We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits.”

    Richard Ho, who leads OpenAI’s hardware program, on the architecture’s optimization targets and early performance

    “It will be performant on, we think, all kind of future iterations of LLMs.”

    Richard Ho, OpenAI hardware chief, to Reuters on the chip’s forward compatibility with future models

    “Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI.”

    Hock Tan, President and CEO, Broadcom, on the scale of the infrastructure commitment

    “This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026.”

    Hock Tan, President and CEO, Broadcom, on the multi-generation plan and 2026 gigawatt-scale deployment with Microsoft

    “The goal is to combine the power and throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, making Jalapeño well suited for interactive LLM products at scale.”

    OpenAI, in the press release, stating the performance objective for the chip

    “These are self-reported numbers that haven’t been finalized. Take them with a grain of salt.”

    Maximilian Schreiner, The Decoder, on the unverified performance-per-watt claim

    Jalapeño is a real chip running real workloads in a lab, but the gap between an engineering sample and a profitable production fleet is exactly where this story will be decided over the next year. The most important numbers, the performance-per-watt figures that justify the whole effort, remain self-reported and unverified until OpenAI publishes its technical report. Read OpenAI’s full announcement for the primary-source details.

    Related Reading

    • OpenAI, the chip’s designer and the primary source of the announcement and quotes.
    • Broadcom, the co-developer providing silicon implementation and Tomahawk networking.
    • Celestica, which builds the boards, racks, and server systems around the Jalapeño chip.
    • ASIC (application-specific integrated circuit), the category Jalapeño belongs to: a custom chip built for one task, unlike a general-purpose GPU.
    • Nvidia Blackwell, the Nvidia architecture Broadcom’s CEO claims Jalapeño matches.