PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: Microsoft

Jonathan Ross on Groq’s $20 Billion NVIDIA Deal, Faster Inference, and Why Asking the Right Questions Wins the AI Age
Jonathan Ross, the founder of Groq and the inventor of Google’s Tensor Processing Unit (TPU), sits down with David Senra (host of the Founders podcast) to walk through Groq’s roughly $20 billion partnership with NVIDIA and the decade of near-death struggle that preceded it. You can watch the full conversation here. Ross, now a senior executive at NVIDIA following the deal, is unusually candid about being one of the world’s worst leaders when he started, about coming three weeks from running out of money, and about the single contrarian bet (that faster inference would make AI both faster and smarter) that almost everyone, including his own engineers, told him was pointless.

TLDW

Ross explains the structure of the NVIDIA deal (a call to Jensen Huang about buying 100,000 GPUs turned, in three weeks, into NVIDIA’s largest deal by nearly 3x) and why pairing Groq’s LPU with the GPU defeats the many different bottlenecks inside an LLM, much as a logistics network uses both 18-wheelers and delivery vans. He unpacks the AlphaGo moment that revealed faster inference makes models smarter, the shift from the information age (answering questions) to the AI age (asking the right questions), and a leadership philosophy built on autonomy, one brutally clear priority (25 million tokens per second on a challenge coin), and giving people the fewest constraints so they can surprise you. He shares hard-won lessons from Jensen and NVIDIA (the least political large org he has seen, no secret one-on-ones), his concepts of reality quotient and the dominant game, return on luck and the GitHub opportunity he let his team talk him out of, intentional leadership (“I intend to do this”), the Groq bonds that traded salary for equity and saved the company, hiring for negatives instead of positives, and loss bias and manufactured discontent. He closes with a case for radical optimism: code is becoming free, software creation is being democratized like literacy, and education should stop teaching kids to answer questions and start teaching them to ask.

Thoughts

The technical spine of this interview is a genuinely counterintuitive claim: you can make a model smarter by making it faster. Ross’s proof is the AlphaGo anecdote, where the exact same model, ported from GPUs to his TPU, saw its Elo rating jump by hundreds of points and beat the world champion, because more compute per unit of time let it search deeper and surface moves like the famous Move 37 that were too far down the tree to find otherwise. Once you internalize that inference speed is not a convenience but a capability multiplier, the entire Groq thesis, and the logic of the NVIDIA deal, snaps into focus. The industry spent years treating fast inference as a nice-to-have. Ross treated it as the whole game, and was nearly alone in doing so for a very long time.

The most transferable material is the leadership arc, precisely because Ross is willing to say he was bad at it. His core insight is that there is no single correct way to lead, any more than there is one way to invest, and the founder’s first job is to know which way is true to them. Ross is a delegator who hires autonomous people and gives them a single, poetically compressed objective, then gets out of the way. The reason that matters is subtle: if you over-constrain the goal, your team can never surprise you with a better answer than the one you already had, which means they can never actually innovate. The Kelly Johnson line Senra offers (“extreme performance often comes from one brutally clear priority”) is the same idea from the Skunk Works side. A challenge coin that reads “25 million tokens per second” is not a slogan, it is a mechanism that lets every engineer connect their work to one dominant game.

Two ideas deserve to be lifted out and used directly. The first is intentional leadership, borrowed from David Marquet’s submarine turnaround: replace “should I do this?” with “I intend to do this.” Asking for opinions invites pessimism and hands your most timid people a veto. Declaring intent still lets someone shout “the hatch is open” when it truly matters, but it stops the reflexive no. Ross traces years of stalled progress to the simple error of asking instead of declaring. The second is his inversion of hiring: hire for negatives, not positives. Growing talent means showing people the path, so you emphasize positives. Selecting talent means screening people out, so you hunt for the disqualifying negatives, because one person’s negative trait infects the whole team. Most founders, Ross included for years, are clever enough to talk themselves into any candidate. A versioned “people spec” and a deliberate loss-averse posture are the antidote.

The Groq bonds story is the emotional center and a small masterpiece of change management. Facing a layoff list that would have killed the company (because the people slated to be cut were exactly the ones needed to make the product work at all), Ross instead asked the team to trade salary for equity, framed with World War II war-bond imagery. Eighty percent participated, half went to statutory minimum wage, and attrition actually fell. His phrase for why is “put everyone’s hands on the steering wheel.” Passengers fear a windy road; drivers feel in control. It is a reminder that morale under existential stress is often a function of agency, not comfort, and that the Phil Knight move of converting employee sacrifice into ownership is a recurring pattern in company survival stories for a reason.

Where the conversation turns almost spiritual is manufactured discontent. Ross observes that the entrepreneurs in a room of successful people were the least happy with their wealth, and that this very dissatisfaction was the fuel that kept them building. His own current discontent is stark and worth sitting with: the world does not have enough compute, and if it takes an extra year to cure cancer or slow aging because of that shortage, he considers it his fault. Whether or not you accept the moral weight he assigns himself, the mechanism is instructive. Edwin Land wrote “300 people died today” on the whiteboard while inventing anti-glare technology. A concrete, human cost attached to delay is a far more durable motivator than a revenue target. Paired with his closing optimism about code becoming free and software creation democratizing like literacy, it makes for one of the more clear-eyed and yet hopeful founder conversations in recent memory.

Key Takeaways
- The NVIDIA deal began as a request to buy about 100,000 GPUs; Jensen saw what Groq had built pairing GPUs and LPUs and decided to make it available to all NVIDIA customers, closing what Ross calls the firm’s biggest deal by nearly 3x in roughly three weeks from first call to wired money.
- GPUs and LPUs are complementary: inside an LLM’s decoder layer, the GPU is better at the compute-bound attention portion and the LPU is better at the memory-throughput-bound weights, so combining them defeats bottlenecks across the whole performance curve, like using both 18-wheelers and last-mile vans.
- As AI increasingly talks to AI, speed dominates, because agents kick off other agents and compound; a human tolerates a one-second wait, but AI is just sitting there idle.
- Agentic micro payments will make the number of payments skyrocket, but payments infrastructure is not yet built for AI operating inside an allocated budget.
- Ross prototypes cutting-edge ideas as personal hobby projects first, then brings them to work; his personalized “daily brief” evolved from long text into headlines he can interrogate with follow-up questions, like the game of 20 questions.
- The information age rewarded answering questions; the AI age rewards asking the right ones, as everyone shifts from individual contributor to leader of AI, and good leaders ask the question no one else did.
- There is no single right way to lead, just as there are many ways to invest; the founder’s job is to know themselves and pick the leadership form that is true to them (inspiration versus fear, control versus delegation).
- Ross was, by his own account, one of the world’s worst leaders at the start, which cost Groq three to four years; his fix was to define one goal simple enough to fit on a challenge coin: 25 million tokens per second.
- The fewer constraints you give a person (or an AI agent), the more freedom they have to surprise you with a better solution; over-constraining the goal makes real innovation impossible.
- Lessons from Jensen and NVIDIA: it is the least political large organization Ross has seen, Jensen never runs secret one-on-ones (tell everyone at once, copy everyone on email), and the whole strategy reduces to “what does the customer actually need?”
- Jensen manages around 60 direct reports, each smarter than him in their own domain, which he offers as the model for orchestrating AI agents that may be smarter than you.
- Asking a sharp question that makes an expert say “I didn’t think of that” is a universal founder skill (it appears in every Bezos book) and can be honed.
- Confidence, not competence, was Ross’s early bottleneck: shadowing a leader of 2,000 people, he realized he would have made the same decisions, and acting with confidence made people follow his direction without changing the decisions themselves.
- The better and more creative your people, the harder they are to manage; running 450 highly creative scientists felt more like managing 5,000.
- Reality quotient (RQ), distinct from IQ, is the ability to recognize reality and, in its extreme form, to choose the dominant game; MySpace optimized accounts signed up while Facebook optimized monthly active users and won.
- The first principle of change management is to make it feel like it is not a change; people who seem fine with change are usually anchored to something that did not change.
- Return on luck (from Jim Collins): the most successful companies do not get more lucky breaks, they seize the ones they get; Ross let his team talk him out of powering GitHub’s LLMs on Groq chips, then vowed never again.
- People adopt fast inference only when they experience it personally; an Anthropic demo three months before ChatGPT drew no reaction because the answers were not the audience’s own, and Groq later went viral off a fast-LLM video posted on X.
- Great innovators often experience a problem before others do; the future is already here, just not evenly distributed, and Ross saw fast inference’s value first because of AlphaGo.
- Intentional leadership (from David Marquet’s USS Santa Fe turnaround): say “I intend to do this” instead of asking for an opinion, which stops reflexive pessimism while still letting people flag a real problem.
- Groq bonds: three weeks from running out of money, Ross swapped a layoff for a war-bond-style salary-for-equity exchange; 80% participated, about half took statutory minimum wage, and it bought roughly two months of runway.
- “Put everyone’s hands on the steering wheel”: participation in saving the company cut attrition to under 10% during the crisis, echoing Phil Knight converting employee loans into Nike equity.
- West Coast VCs behave like lemmings (one pass triggers all passes), while East Coast VCs run independent analysis; the herd missed what became NVIDIA’s biggest deal ever, a live example of the Keynesian beauty contest.
- For the first time, top startups are not starved for cash, so putting in more money is no longer an advantage even though investors still behave as if it is.
- Hiring flip: move from hiring for positives (how you grow talent) to hiring for negatives (how you select talent), because one negative trait poisons the team; write a versioned “people spec” like a product spec.
- Loss bias (a loss feels roughly six times more painful than an equal gain) can be a hiring signal: Ross looks for people who “book the win early,” treating any missed improvement as a loss.
- Poetic design (maximum meaning in minimal expression, “every word matters”) was a positive on the people spec; its negative is maximalist, cluttered design.
- Michael Jordan manufactured pressure by taunting opponents so a loss would be humiliating, forcing superhuman performance (per his trainer Tim Grover), a deliberate version of throwing your keys over the fence.
- Manufactured discontent (David Ogilvy’s “divine discontent”): the best entrepreneurs never rest on wins; the least happy people with their wealth were the ones who kept building.
- Ross’s discontent today is the world’s lack of compute; he treats every delayed medical breakthrough as partly his responsibility, the way Edwin Land wrote a daily death count on the whiteboard while fighting headlight glare.
- Software has run on “code rationing” because code was expensive to write, enforced by “no engineers” (engineers whose job is to say no); as the marginal cost of code approaches zero, you just implement, experience, and re-implement.
- AI democratizes software creation like the alphabet democratized literacy: Ross’s executive assistant now builds working apps, and individual founders with taste but no coding background will create valuable companies.
- Education should be revamped around asking questions and solving real community problems; if a kid can look up or prompt the answer, the assignment taught nothing, but making them ask the right questions to get AI to solve a real problem does.

Detailed Summary

The $20 Billion NVIDIA Deal and Why LPUs and GPUs Belong Together

The deal’s most striking feature is speed: the idea was first floated on a call roughly three weeks before the money was in the bank. Groq had been integrating GPUs and LPUs and went to Jensen Huang wanting to buy about 100,000 GPUs to deploy themselves. Jensen saw the combined system and decided it should be offered to all of NVIDIA’s customers. The technical logic is that processing an LLM token involves many matrix multiplies with different bottlenecks, some compute-constrained (better on the GPU, especially the attention portion) and some memory-throughput-constrained (better on the LPU, applying the trained weights). There is no single perfect architecture, so putting the two together defeats bottlenecks across the whole curve. Ross adds that as AI talks to AI, speed becomes everything, because agents spawn agents and compound exponentially.
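
The bottleneck argument above can be sketched with a simple roofline estimate, in which each stage takes as long as the slower of its compute time and its memory time. Everything in this sketch is an assumption made for illustration: the two chips (`compute_heavy`, `bandwidth_heavy`), their peak figures, and the per-stage workloads are invented numbers, not specifications of any GPU or LPU, and the split of stages simply follows the summary's framing. Transfer cost between chips is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    peak_flops: float     # peak floating-point operations per second
    mem_bandwidth: float  # peak bytes per second from memory


@dataclass(frozen=True)
class Stage:
    name: str
    flops: float        # arithmetic work in the stage
    bytes_moved: float  # memory traffic in the stage


def stage_time(chip: Chip, stage: Stage) -> float:
    """Roofline estimate: the slower of compute time and memory time."""
    return max(stage.flops / chip.peak_flops,
               stage.bytes_moved / chip.mem_bandwidth)


def bottleneck(chip: Chip, stage: Stage) -> str:
    """Label which resource limits this stage on this chip."""
    compute_s = stage.flops / chip.peak_flops
    memory_s = stage.bytes_moved / chip.mem_bandwidth
    return "compute-bound" if compute_s >= memory_s else "memory-bound"


# All numbers below are made up for illustration, not vendor specifications.
compute_heavy = Chip("compute_heavy", peak_flops=1e15, mem_bandwidth=2e12)
bandwidth_heavy = Chip("bandwidth_heavy", peak_flops=2e14, mem_bandwidth=4e13)
chips = [compute_heavy, bandwidth_heavy]

stages = [
    Stage("attention", flops=4e12, bytes_moved=2e9),       # many FLOPs per byte
    Stage("apply_weights", flops=2e11, bytes_moved=1e11),  # few FLOPs per byte
]

# Route each stage to whichever chip finishes it sooner.
routing = {s.name: min(chips, key=lambda c: stage_time(c, s)).name for s in stages}
paired_s = sum(min(stage_time(c, s) for c in chips) for s in stages)
single_s = {c.name: sum(stage_time(c, s) for s in stages) for c in chips}

for s in stages:
    for c in chips:
        print(f"{s.name:14s} on {c.name:16s}: "
              f"{stage_time(c, s) * 1e3:7.2f} ms ({bottleneck(c, s)})")
print("routing:", routing)
print(f"paired: {paired_s * 1e3:.2f} ms; single-chip: "
      + ", ".join(f"{n}={t * 1e3:.2f} ms" for n, t in single_s.items()))
```

With these invented numbers, the compute-bound stage lands on the chip with more raw compute, the memory-bound stage lands on the chip with more bandwidth, and the paired total is lower than either chip running both stages alone. A real system would also have to pay for moving data between the two chips, which this sketch leaves out.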

Asking Questions, Daily Briefs, and the Shift to Leading AI

Ross builds cutting-edge tools as personal hobby projects before bringing them to work, including a personalized “daily brief” that functions like a presidential daily brief. He redesigned it from long text into headlines he can interrogate, because interactivity, like 20 questions, distills straight to what you actually care about. This grounds one of his signature ideas: success in the information age meant answering questions, but success in the AI age means asking the right questions. As people move from individual contributors to leaders of AI, the skill that matters is the leader’s skill of asking the question everyone else missed or was afraid to raise, since the question you ask determines the output you get.

Knowing Your Leadership Style and the Challenge Coin

Ross frames leadership like investing: the first principle is simply having followers, but there are infinite valid styles. New founders fail by copying advice that is not true to them. Ross is a natural delegator (he has not held a driver’s license since his teens because he would rather think than control the car) who hires unusually autonomous people. Early on this backfired badly, because he entrusted people who needed direction, and he calls himself one of the world’s worst early leaders, a gap that cost Groq years. His breakthrough was distilling the mission onto a challenge coin reading “25 million tokens per second,” which let everyone connect their work to one dominant game. He references David Marquet’s Turn the Ship Around later, but the coin embodies Kelly Johnson’s Skunk Works principle that extreme performance comes from one brutally clear priority, plus the rule that fewer constraints give people more room to surprise you, turning a team from Superman into the Avengers.

Lessons from Jensen: Killing Politics and Serving the Customer

Working at NVIDIA taught Ross how much further he could have pushed lessons he half-learned at Groq. NVIDIA is, in his experience, the least political large organization anywhere, and a big reason is that Jensen never tells different people different things in private one-on-ones. When you address a room, everyone hears the same message; separate conversations breed side cliques. Ross’s practical rules: hold big meetings for anything you want a group to know, and copy everyone on email so no one can route politics through you. The other Jensen lesson is to stop playing 3D chess and just ask what the customer needs, tell them only what you believe and can support, and refuse to sell them something they do not need. Senra notes he has covered roughly 19 ideas from The Nvidia Way on his Founders podcast, and Jensen’s line that he already manages 60 reports smarter than him is the template for managing AI agents.

Reality Quotient, the Dominant Game, and Change Management

Groq hired for reality quotient, not just IQ, because plenty of very smart people construct elaborate stories disconnected from reality. In its extreme form, RQ is the ability to choose the dominant game, the way Facebook’s focus on monthly active users beat MySpace’s focus on accounts signed up. The founder’s job is to help everyone connect their activity to that dominant game (for Groq, tokens per second), then manage the change. Ross’s first principle of change management is to make it feel like it is not a change: nobody likes change, and people who tolerate it well are usually focused on something that stayed constant. If your team is anchored to the dominant goal, a new tactic does not feel like change; if they are anchored to a narrow task, it does.

Return on Luck, the AlphaGo Insight, and the GitHub Miss

From Jim Collins’s Great by Choice, Ross took the idea that winners seize luck better, not that they get more of it. He experienced it first-hand with AlphaGo: after a DeepMind team asked whether his TPU was as fast as rumored (he said yes, Ghostbusters-style), porting the identical model from GPUs to TPUs pushed its Elo rating from around 3,200 to roughly 3,900 and it crushed the world champion. As Thinking, Fast and Slow by Daniel Kahneman frames it, more compute lets the model virtually play out more moves and occasionally find a better second-best line, which is how the famous Move 37 surfaced. Faster thinking is smarter thinking. Yet Ross also let his own engineers talk him out of powering GitHub’s LLMs on Groq chips, twice, because they focused on why it could not be done rather than why it could. He eventually did the math himself, hit the numbers, and learned to stop inviting that pessimism.
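
The link between speed and search depth can be illustrated with a toy model. This is not how AlphaGo works (it combined neural networks with Monte Carlo tree search); it is a minimal sketch that assumes a uniform game tree and asks how many plies ahead a fixed time budget can reach at two evaluation speeds. The function `reachable_depth` and every constant below are hypothetical.

```python
def reachable_depth(eval_budget: int, branching: int) -> int:
    """Deepest level whose positions all fit within the evaluation budget.

    Level d of a uniform tree has branching**d positions; this counts how many
    levels can be reached before a single level exceeds the budget.
    """
    depth, positions = 0, 1
    while positions * branching <= eval_budget:
        positions *= branching
        depth += 1
    return depth


SECONDS_PER_MOVE = 30         # hypothetical time budget per move
BRANCHING = 10                # hypothetical effective branching factor after pruning
SLOW_EVALS_PER_SEC = 1_000    # hypothetical slower hardware
FAST_EVALS_PER_SEC = 100_000  # hypothetical hardware 100x faster

slow_depth = reachable_depth(SLOW_EVALS_PER_SEC * SECONDS_PER_MOVE, BRANCHING)
fast_depth = reachable_depth(FAST_EVALS_PER_SEC * SECONDS_PER_MOVE, BRANCHING)
print(f"slow hardware looks {slow_depth} plies ahead, fast hardware {fast_depth}")
```

Because the tree grows exponentially, a large speedup buys only a few extra plies, but those plies are exactly where a move that looks poor at shallow depth can turn out to be strong.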

Selling Speed and Intentional Leadership

Customers could not grasp fast inference until they felt it. Ross recalls an Anthropic demo three months before ChatGPT that drew no reaction, because seeing someone else’s answer appear is not magical, but getting your own question answered instantly is. So Groq simply put fast inference online, and it went viral after someone posted a video of a blazing-fast LLM on X (Ross noticed his own demo slowing in Norway because usage had skyrocketed). The deeper fix for internal resistance came from Turn the Ship Around, David Marquet’s account of turning the USS Santa Fe from worst to best in nuclear readiness by replacing command-and-control with intentional leadership. Saying “I intend to do this” rather than “should I?” stops people from reflexively supplying negative opinions, while still letting someone shout “the hatch is open” when there is a genuine problem.

Groq Bonds: Three Weeks From Zero

With three weeks of cash left and a layoff list on the table, Ross realized the cuts targeted exactly the people needed to finish an unprecedented compiler and reach the critical mass where the product would even work. Layoffs would not save the company; only reducing burn without losing people could. So Groq held an all-hands, put up World War II war-bond imagery, and launched “Groq bonds,” an exchange of salary for equity. Ross expected heavy attrition; instead 80% participated and about half dropped to statutory minimum wage, real pain for engineers used to six-figure salaries. It bought closer to two months of runway. His framing, “put everyone’s hands on the steering wheel,” explains why attrition actually fell below 10%: drivers feel more in control than passengers, and it echoes Phil Knight in Shoe Dog converting employee loans into Nike equity on the edge of collapse.
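
The runway arithmetic behind that trade can be sketched in a few lines. The interview summary gives no dollar figures, so this works in arbitrary units and rests on loud assumptions: that "two months" means roughly eight weeks of total runway, that burn is constant, and that no new cash arrives. The helper names are hypothetical.

```python
def runway_weeks(cash: float, weekly_burn: float) -> float:
    """Weeks until cash runs out at a constant burn rate."""
    return cash / weekly_burn


def burn_cut_needed(current_weeks: float, target_weeks: float) -> float:
    """Fraction of burn to remove so the same cash lasts target_weeks."""
    return 1 - current_weeks / target_weeks


cash = 3.0      # arbitrary units: three weeks of cash at the old burn
old_burn = 1.0  # one unit per week, by construction
cut = burn_cut_needed(current_weeks=3, target_weeks=8)  # assumes 8 weeks ~ two months
new_burn = old_burn * (1 - cut)

print(f"burn must fall by {cut:.1%}")
print(f"runway: {runway_weeks(cash, old_burn):.0f} -> "
      f"{runway_weeks(cash, new_burn):.0f} weeks")
```

Under those assumptions, stretching three weeks into about eight requires removing well over half of the weekly burn, which gives a sense of how deep the salary cuts had to go.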

Hiring for Negatives, Loss Bias, and Manufactured Discontent

Ross was good at spotting smart, talented people but kept hiring ones who caused organizational problems, because he could always talk himself into a candidate. Watching a sharp head of HR screen people out, he realized he had been hiring wrong: growing talent means showing positives, but selecting talent means hunting for disqualifying negatives, since one bad trait spreads to the whole team. He formalized a versioned “people spec” with positives like return on luck and poetic design, each paired with a negative. He also hired for loss bias, the fact that a loss feels roughly six times more painful than an equal gain, seeking people who “book the win early.” That competitive, pressure-seeking wiring links to Michael Jordan manufacturing humiliation stakes (per Tim Grover in Relentless) and to David Ogilvy’s divine discontent. Ross’s own manufactured discontent today is the world’s shortage of compute, which he frames in life-and-death terms.

The Optimistic Close: Free Code and Universal Software Literacy

Ross ends on aggressive optimism. Software has long run on “code rationing” because code was expensive to write, policed by “no engineers” whose job is to say no. As the marginal cost of code approaches zero, the workflow flips to implement, experience, then re-implement. More important is accessibility: just as alphabets and universal education turned reading and writing from a scribe’s monopoly into a question of quality, AI is making software creation universal. His executive assistant now builds working apps, and a wave of individual founders with taste but no coding background will create valuable companies. The corollary for education is to stop teaching kids to answer questions and start teaching them to ask, revamping curricula around real community problems where the point is asking the right questions to get AI to solve something that matters.

Notable Quotes

“Success in the information age was about being able to answer questions. Success in the AI age will be about being able to ask the right questions.”
Jonathan Ross, on the fundamental shift AI creates

“The fewer constraints that you give someone, the more freedom they have to solve the problem, and the more freedom they have to surprise you with the solution.”
Jonathan Ross, on leading creative teams

“Being able to think faster makes you think smarter.”
Jonathan Ross, on why faster inference produces more capable models

“There are plenty of really smart people who wouldn’t recognize reality if it tapped them on the shoulder.”
Jonathan Ross, defining reality quotient versus IQ

“If you express intentional leadership, you say, ‘I intend to do this.’ People don’t tend to offer their opinion, but if it’s very wrong and there’s a reason, they will push back.”
Jonathan Ross, on the lesson from Turn the Ship Around

“When people are passengers in a car, they’re more nervous about a windy road or a scary road. But when they’re the driver, they feel more in control.”
Jonathan Ross, on why Groq bonds kept the team together

“The biggest flip in my hiring was when I went from looking for positives, which is what you do when you’re trying to grow talent, to looking for negatives, which is what you do when you’re trying to select talent.”
Jonathan Ross, on inverting his approach to hiring

“If it takes us an extra year to cure cancer because we don’t have enough compute, that’s my fault.”
Jonathan Ross, on the discontent that drives him today

Watch the full conversation between Jonathan Ross and David Senra here on YouTube.

Related Reading
- Groq, the company Ross founded and the maker of the LPU behind the fast-inference story and the NVIDIA partnership.
- AlphaGo versus Lee Sedol (Wikipedia), the match, including Move 37, that showed Ross how much faster hardware raises a model’s capability.
- The Keynesian Beauty Contest (Wikipedia), the dynamic Ross uses to explain why West Coast VCs herded past what became NVIDIA’s biggest deal.
- Zero to One by Peter Thiel, the source of the first-principles thinking Ross applied to the contrarian bet on fast inference.
- Founders podcast by David Senra, the host’s biography-driven show, source of the Jensen, Michael Jordan, and Edwin Land ideas referenced throughout.

July 7, 2026

OpenAI and Broadcom Unveil Jalapeño, a Custom LLM Inference Chip to Cut Compute Costs and Reduce Nvidia Dependence
OpenAI and Broadcom pulled the wrapper off Jalapeño on Wednesday, June 24, 2026, a custom silicon accelerator that OpenAI is calling its first “Intelligence Processor” and its first real move into designing the hardware underneath its own models. Broadcom President and CEO Hock Tan and President Charlie Kawwas physically handed the wafer to OpenAI CEO Sam Altman and President and Co-Founder Greg Brockman, a staged moment meant to signal that the ChatGPT maker is no longer just a models-and-products company but is now reaching all the way down to the chip. Jalapeño is purpose-built for large language model inference, the compute-intensive job of actually serving answers to users rather than training the model in the first place, and OpenAI plans to deploy it at gigawatt scale by the end of 2026 as the first step in a multi-generation platform built with Broadcom and Canadian electronics manufacturer Celestica. You can read the announcement straight from the source in OpenAI’s official post.

TLDR

OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom AI chip, an ASIC designed from a blank slate specifically for LLM inference rather than training, manufactured by TSMC and integrated by Celestica into server systems that only OpenAI will use. OpenAI claims the chip went from initial design to manufacturing tape-out in just nine months, what it calls the fastest ASIC development cycle ever in high-performance advanced semiconductors, accelerated in part by using its own AI models to design the silicon. Engineering samples are already running ML workloads in the lab, including GPT-5.3-Codex-Spark, and OpenAI says early testing shows performance per watt “substantially better” than current state-of-the-art, a self-reported and not yet independently verified claim with a full technical report promised in the coming months. Broadcom CEO Hock Tan told Reuters the chip matches Nvidia’s Blackwell and Google’s TPUs, framing the launch as part of a flywheel where OpenAI owns the full stack from chip to model to product. The chip slots into a broader infrastructure strategy targeting 10 gigawatts of custom accelerator capacity between 2026 and 2029 with deployments alongside Microsoft and other partners, and The Decoder reported Microsoft is expected to buy 40 percent of the chips, a guarantee Broadcom reportedly demanded to secure the first phase. The move is widely read as OpenAI diversifying away from Nvidia, continuing a procurement spree that already includes AWS Trainium, AMD, and Cerebras, as inference quietly becomes the company’s real cost center.

Thoughts

The single most important word in this announcement is “inference.” Training a frontier model is a capital expense that happens in bursts. Inference is the bill that arrives every single day, forever, scaling linearly with usage. Every ChatGPT reply, every Codex task, every API call, every agent step is an inference event, and as OpenAI’s product surface explodes, that recurring cost is the thing that actually threatens the unit economics. A custom chip aimed squarely at inference is therefore not a vanity project or a research flex. It is OpenAI attacking the largest variable cost in its business at the root, trying to bend its cost-per-token curve below what it pays renting Nvidia GPUs. If Jalapeño lands anywhere near its claims, the payoff is not faster benchmarks but gross margin.
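
The contrast between a one-off training bill and a recurring inference bill can be made concrete with a small sketch. All figures are invented, unitless placeholders chosen for easy arithmetic; none come from OpenAI, Broadcom, or Nvidia, and the 25 percent cost reduction for a custom chip is purely an assumption.

```python
def daily_inference_cost(millions_of_tokens_per_day: float,
                         cost_per_million_tokens: float) -> float:
    """Recurring cost: scales linearly with tokens served."""
    return millions_of_tokens_per_day * cost_per_million_tokens


def days_to_match_training(training_cost: float,
                           millions_of_tokens_per_day: float,
                           cost_per_million_tokens: float) -> float:
    """Days until cumulative inference spend equals the one-off training spend."""
    return training_cost / daily_inference_cost(
        millions_of_tokens_per_day, cost_per_million_tokens)


# Hypothetical, unitless placeholders.
TRAINING_COST = 1_000.0
TOKENS_PER_DAY_M = 5.0    # millions of tokens served per day
RENTED_COST_PER_M = 4.0   # cost per million tokens on rented hardware
CUSTOM_COST_PER_M = 3.0   # assumed 25% cheaper on a custom chip

rented_daily = daily_inference_cost(TOKENS_PER_DAY_M, RENTED_COST_PER_M)
custom_daily = daily_inference_cost(TOKENS_PER_DAY_M, CUSTOM_COST_PER_M)
breakeven_days = days_to_match_training(
    TRAINING_COST, TOKENS_PER_DAY_M, RENTED_COST_PER_M)
yearly_saving = (rented_daily - custom_daily) * 365

print(f"inference spend matches training spend after {breakeven_days:.0f} days")
print(f"yearly saving from the cheaper chip: {yearly_saving:.0f}")
```

Because the daily cost is linear in tokens served, doubling usage doubles both the bill and the saving from a lower cost per token, which is why the per-token figure matters more as the product surface grows.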

The performance-per-watt claim, though, deserves the most skeptical reading in the room. OpenAI says Jalapeño will deliver performance per watt “substantially better” than current state-of-the-art, but it has not finalized the numbers, has not said which chips it tested against, on what tasks, or under what conditions, and the full technical report is somewhere in the indefinite “coming months.” These are self-reported figures from a company with an enormous interest in convincing the market it has a credible alternative to Nvidia. Hock Tan’s line that the chip is “as good as” Blackwell and Google’s TPUs is a CEO talking his own book in an interview, not a measured result. The honest posture is to treat the figures as marketing until the technical report lands. A chip running engineering samples in a lab at target frequency is real progress, but it is a very long way from a chip that holds those numbers across a production fleet under messy real-world load.

OpenAI left the most revealing detail out of its own press release: the report, via The Decoder, that Broadcom demanded Microsoft guarantee it will buy 40 percent of the chips to secure the first phase. That single sentence tells you who is actually carrying the risk. Building gigawatt-scale custom silicon is brutally capital-intensive, and Broadcom is not willing to commit manufacturing capacity on the strength of OpenAI’s demand alone. It wants a balance sheet behind the order, and Microsoft, OpenAI’s largest backer, is the balance sheet. That detail quietly reframes the whole “OpenAI owns the stack” narrative. OpenAI may design the chip, but the deployment is underwritten by Microsoft’s purchasing commitment, which means Microsoft also gets leverage and supply security out of an OpenAI-branded part. Ownership of the design is not the same as ownership of the risk.

The flywheel framing is genuinely interesting and probably the most defensible strategic claim OpenAI is making. OpenAI says it used its own models to accelerate parts of the chip design and optimization, compressing a normally multi-year ASIC cycle into nine months. If that is even partly true, it is a meaningful loop: the models help design the chips, the chips run the models more cheaply, the cheaper models drive more usage and revenue, and the revenue funds the next chip. That is a compounding advantage that is hard for a pure hardware vendor to replicate and hard for a pure software lab to replicate. The catch is that nine months from design to tape-out is a claim about speed, not about whether the resulting chip is actually competitive in volume. Fast tape-out and great silicon are different achievements, and the industry has seen plenty of chips that taped out quickly and underwhelmed in production.

Strip away the “Intelligence Processor” branding and this is a playbook we have already watched run three times. Google built TPUs, Amazon built Trainium and Inferentia, Meta built MTIA, and all of them turned to Broadcom or Marvell for the design IP that is hard to replicate in-house. OpenAI is doing the same thing with the same partner, just later and louder. The diversification arc is unmistakable: OpenAI was one of the biggest Nvidia GPU buyers on earth, and in the span of a year it has signed deals for AWS Trainium, AMD accelerators, and Cerebras inference hardware, and now its own custom ASIC. Nvidia is not in trouble, demand still vastly outstrips supply, but the era where the largest AI labs were captive single-vendor customers is clearly ending. The most intriguing wildcard is OpenAI’s own line that Jalapeño is “designed with flexibility to work with all LLMs.” That is not how you describe a chip you intend to keep entirely to yourself. It hints, however faintly, at an OpenAI that could one day rent out inference infrastructure the way it now rents models, which would put it in direct competition with the very cloud providers it currently depends on.

Key Takeaways
- OpenAI and Broadcom unveiled Jalapeño on Wednesday, June 24, 2026, OpenAI’s first custom AI chip and its first piece of in-house silicon after years focused on models and products.
- The chip is branded an “Intelligence Processor” and described as the first AI accelerator in a multi-generation compute platform the two companies are building together.
- Jalapeño is purpose-built for large language model inference, the compute-intensive work of generating responses and serving answers to users, and explicitly not for training.
- Inference is OpenAI’s recurring cost center: every ChatGPT conversation, coding request, image generation, and agent action relies on it, making it one of the highest ongoing costs in the business.
- Broadcom President and CEO Hock Tan and President Charlie Kawwas physically delivered the first wafer to OpenAI CEO Sam Altman and President Greg Brockman.
- OpenAI designed the chip from scratch around its understanding of LLM fundamentals, informed by its roadmap of models, kernels, serving systems, and product needs.
- Jalapeño is described as a blank-slate design for modern LLM inference, not a general-purpose accelerator adapted from earlier AI workloads.
- The chip is shaped by the systems OpenAI runs daily across ChatGPT, Codex, the API, and future agentic products, while also being designed to work with current and future LLMs across the industry.
- The stated performance goal is to combine the throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, suiting it for interactive LLM products at scale.
- OpenAI frames this as its full-stack advantage: it designs frontier models, builds products on top of them, and now designs the chip architecture, kernels, memory systems, networking, scheduling, and deployment systems underneath.
- OpenAI claims Jalapeño went from initial design to manufacturing tape-out in just nine months.
- The companies call it what they believe to be the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors, against a backdrop of typically multi-year timelines.
- OpenAI used its own AI models to accelerate parts of the chip design and optimization process, which it credits for the speed.
- OpenAI frames the result as a flywheel: the same models served to users help improve the infrastructure that runs future models, lowering compute cost across the industry.
- Engineering samples of Jalapeño are already running ML workloads in the lab at production target frequency and power.
- Among the workloads running on the samples is OpenAI’s GPT-5.3-Codex-Spark model.
- GPT-5.3-Codex-Spark currently runs on Cerebras hardware, which also specializes in inference, per The Decoder.
- OpenAI says early testing shows Jalapeño will deliver performance per watt “substantially better” than current state-of-the-art hardware.
- That performance-per-watt claim is self-reported and lacks independent verification; OpenAI has not said which chips it tested against, on what tasks, or under what conditions.
- OpenAI says it is still measuring final performance and has promised a detailed technical report in the coming months.
- The architecture reduces data movement and balances compute, memory, and networking resources to push realized utilization much closer to theoretical peak performance.
- Jalapeño is an ASIC, which experts say is less flexible than Nvidia’s GPUs but less expensive and can be tailored to specific AI tasks.
- Broadcom contributes silicon implementation and networking technologies, including its Tomahawk networking silicon, to bring the platform to large-scale production.
- Canadian electronics manufacturer Celestica provides board, rack, and system integration expertise and will build the server systems.
- The chips are manufactured by Taiwan’s TSMC, the world’s leading advanced semiconductor foundry, after OpenAI sent over the design.
- Both the chips and the Celestica-built server systems will be used only by OpenAI, not sold to outside customers.
- OpenAI plans to deploy Jalapeño at gigawatt scale by the end of 2026, with expansion in the years ahead, as the first step in a multi-generation plan.
- Hock Tan said gigawatt-scale data center deployment will happen with Microsoft and other partners beginning in 2026.
- The Decoder reported Microsoft is expected to buy 40 percent of the chips, with Broadcom reportedly demanding Microsoft guarantee that share to secure the first phase.
- Broadcom CEO Hock Tan told Reuters that Jalapeño is as good as Nvidia’s Blackwell chips and the TPUs designed by Alphabet’s Google.
- In October 2025, after 18 months of working together, OpenAI and Broadcom went public with plans to develop and deploy racks of OpenAI-designed chips starting in late 2026; CNBC framed the unveiling as coming eight months after that deal.
- The prior OpenAI-Broadcom plan ultimately aimed at 10 gigawatts of custom AI accelerator capacity, with deployments expected between 2026 and 2029.
- Estimates suggest OpenAI’s broader infrastructure plans could eventually involve around 26 gigawatts of computing capacity across custom chips, Nvidia hardware, and other accelerators.
- OpenAI has been one of the biggest buyers of Nvidia’s GPUs since kickstarting the generative AI boom in 2022, but explosive demand has pushed it to seek other sources of advanced silicon.
- Earlier in 2026 OpenAI struck a deal with Amazon Web Services that includes use of AWS Trainium chips, and has also signed agreements with AMD and with Cerebras, which held its IPO in May.
- The move is widely characterized as OpenAI diversifying away from and reducing dependence on Nvidia while creating an alternative to its GPUs.
- OpenAI’s stated goals with the chip are to reduce costs, improve energy efficiency, secure long-term computing supply, and gain more control over the infrastructure powering its services.
- Broadcom shares climbed about 2 percent following the announcement, are up roughly 10 percent year-to-date in 2026, and have multiplied almost sevenfold since the end of 2022.
- To build in-house chips, Meta, Amazon, and Google have turned to firms like Broadcom and Marvell for design services and IP that are hard to replicate internally; Reuters first reported OpenAI was exploring its own chip in 2023, and sources told Reuters in April 2026 that Anthropic is weighing its own AI chip.
- Broadcom’s margin on custom AI chips is currently lower than on products like networking switches due to AI-driven high-bandwidth memory demand; Tan said SK Hynix and Samsung Electronics supply Broadcom with memory chips.
Detailed Summary

A blank-slate chip built only for inference

Jalapeño is OpenAI’s first so-called Intelligence Processor, and the company is emphatic that it is not a repurposed general-purpose accelerator. It was designed from a blank slate specifically for modern large language model inference, the job of crunching data to answer a user’s query rather than the separate, bursty work of training a model. OpenAI says it designed the chip from scratch around its own deep understanding of LLM fundamentals, informed by its roadmap of models, kernels, serving systems, and product needs, drawing on the systems it runs every day across ChatGPT, Codex, the API, and future agentic products. The stated objective is to fuse the raw power and throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, which would make Jalapeño particularly well suited to interactive products used at scale. Notably, OpenAI also says the chip is designed with flexibility to work with all LLMs across the industry, not only its own, a claim that sits a little oddly next to its plan to keep the hardware entirely in-house.

The full-stack flywheel and AI designing its own silicon

OpenAI is selling Jalapeño as proof of a full-stack advantage. The argument is that because OpenAI now develops frontier models, builds products on top of them, and designs the infrastructure underneath them, including chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and the product experience, every layer can be optimized around the same goal of making its models faster, more reliable, and cheaper. OpenAI describes this as a flywheel: better infrastructure drives compute efficiency, which enables better training and serving, which powers more capable models, which become better products, which drive more usage and revenue, which funds the next generation of infrastructure. The most striking piece of that loop is that OpenAI used its own AI models to accelerate parts of the chip’s design and optimization. The company’s framing is direct: if AI can help engineers design better chips faster, it can lower the cost of compute across the industry. That self-referential loop is the part of the announcement that is genuinely novel rather than a rerun of an existing hyperscaler playbook.

Nine-month tape-out and the partner stack

OpenAI claims it took roughly nine months to go from initial design to manufacturing tape-out, and calls this what it believes to be the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors, against an industry norm measured in years. It credits deep software-hardware co-development, Broadcom’s silicon implementation expertise, and the use of its own models to compress the schedule. The work is split across a clear partner stack: OpenAI provides the architecture and AI-specific requirements, Broadcom contributes silicon implementation and networking technology, including its Tomahawk networking silicon, and Celestica handles boards, racks, and system integration, building the actual server systems. Once the design was complete, OpenAI sent it to TSMC in Taiwan, the world’s leading advanced foundry, for manufacturing. Crucially, both the chips and the systems built around them are for OpenAI’s exclusive use; they are not products being sold to outside customers.

Performance claims that nobody can check yet

OpenAI says early testing shows Jalapeño will deliver performance per watt substantially better than current state-of-the-art hardware, with an architecture that reduces data movement and balances compute, memory, and networking to push realized utilization much closer to theoretical peak. Hardware program lead Richard Ho said the team optimized around the kernels, memory movement, networking, and serving patterns that matter most for frontier models, and that the chip will execute key workloads close to the hardware’s theoretical limits. He told Reuters he expects it to be performant on all kinds of future LLM iterations. The important caveat is that none of this is verifiable yet. OpenAI is still measuring final performance, has not finalized the numbers, and has not disclosed which chips it benchmarked against, on what tasks, or under what conditions, with the technical report only promised in the coming months. As The Decoder put it bluntly, these are self-reported numbers, unverifiable for now, that should not be taken at face value. Broadcom CEO Hock Tan’s separate claim to Reuters that the chip is as good as Nvidia’s Blackwell and Google’s TPUs is similarly an unverified assertion from an interested party.

Gigawatts, Microsoft’s 40 percent, and who carries the risk

Jalapeño is the opening move in a much larger infrastructure buildout. Initial deployment is targeted for the end of 2026 at gigawatt scale, expanding over multiple generations. Tan said the gigawatt-scale data centers will come online with Microsoft and other partners beginning in 2026. The deal traces back to October 2025, when, after 18 months of collaboration, OpenAI and Broadcom went public with plans to deploy racks of OpenAI-designed chips, ultimately aiming for 10 gigawatts of custom accelerator capacity with deployments expected between 2026 and 2029. Broader estimates put OpenAI’s total infrastructure ambition at around 26 gigawatts across custom chips, Nvidia hardware, and other accelerators. The detail that cuts through the optimism comes from The Decoder: Microsoft is expected to buy 40 percent of the chips, and Broadcom reportedly demanded that Microsoft guarantee that purchase to secure the first phase. That guarantee shows that the financial risk of this buildout is not OpenAI’s alone; it rests heavily on its largest backer’s balance sheet.

The Nvidia diversification arc and Broadcom’s windfall

Jalapeño is the clearest signal yet of OpenAI loosening its dependence on Nvidia. OpenAI has been one of the biggest buyers of Nvidia GPUs since it kickstarted the generative AI boom in 2022, but demand has exploded past what any single vendor can supply. Within 2026 alone, OpenAI has struck a deal with AWS that includes Trainium chips, signed agreements with AMD and with Cerebras, which held its IPO in May, and now rolled out its own ASIC. The pattern mirrors what Meta, Amazon, and Google already did, all of them leaning on firms like Broadcom and Marvell for design IP that is hard to build in-house, and Anthropic is reportedly weighing the same move, per sources who spoke to Reuters in April 2026. Broadcom is the obvious beneficiary, with shares up about 2 percent on the news, up roughly 10 percent in 2026, and up nearly sevenfold since the end of 2022. Even so, Tan noted that the AI-driven surge in high-bandwidth memory demand makes Broadcom’s margin on custom AI chips lower than on products like networking switches, with SK Hynix and Samsung Electronics supplying the memory.
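For a rough sense of scale, the “nearly sevenfold since the end of 2022” figure can be turned into an implied compound annual growth rate. The sketch below is only illustrative: it assumes a multiple of exactly 7, a start date of December 31, 2022, and an end date of the June 24, 2026 announcement. Those exact inputs and the function name are assumptions, not figures from the reporting.

```python
from datetime import date


def implied_annual_growth(multiple: float, start: date, end: date) -> float:
    """Return the compound annual growth rate implied by a price multiple."""
    if multiple <= 0 or end <= start:
        raise ValueError("need a positive multiple and an end date after start")
    # Convert the holding period to years (365.25 days per year on average).
    years = (end - start).days / 365.25
    return multiple ** (1.0 / years) - 1.0


# Assumed inputs: "almost sevenfold" rounded to 7x, measured from the
# end of 2022 to the day of the announcement.
ASSUMED_MULTIPLE = 7.0
START = date(2022, 12, 31)
END = date(2026, 6, 24)

rate = implied_annual_growth(ASSUMED_MULTIPLE, START, END)
print(f"Implied compound growth: {rate:.0%} per year")
```

The result is sensitive to the assumed dates and to how “almost sevenfold” is rounded, so it is best read as an order-of-magnitude figure rather than a reported return.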

Notable Quotes

“The world is moving to a compute-powered economy.”
Greg Brockman, President and Co-Founder of OpenAI, framing the launch as a broad economic shift

“Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems. By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access.”
Greg Brockman, President and Co-Founder of OpenAI, on the full-stack rationale for building its own chip

“Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers.”
Richard Ho, who leads OpenAI’s hardware program, describing the chip as purpose-built rather than adapted

“We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits.”
Richard Ho, who leads OpenAI’s hardware program, on the architecture’s optimization targets and early performance

“It will be performant on, we think, all kind of future iterations of LLMs.”
Richard Ho, OpenAI hardware chief, to Reuters on the chip’s forward compatibility with future models

“Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI.”
Hock Tan, President and CEO, Broadcom, on the scale of the infrastructure commitment

“This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026.”
Hock Tan, President and CEO, Broadcom, on the multi-generation plan and 2026 gigawatt-scale deployment with Microsoft

“The goal is to combine the power and throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, making Jalapeño well suited for interactive LLM products at scale.”
OpenAI, in the press release, stating the performance objective for the chip

“These are self-reported numbers that haven’t been finalized. Take them with a grain of salt.”
Maximilian Schreiner, The Decoder, on the unverified performance-per-watt claim

Jalapeño is a real chip running real workloads in a lab, but the gap between an engineering sample and a profitable production fleet is exactly where this story will be decided over the next year. The most important numbers, the performance-per-watt figures that justify the whole effort, remain self-reported and unverified until OpenAI publishes its technical report. Read OpenAI’s full announcement here.

Related Reading
- OpenAI, the chip’s designer and the primary source of the announcement and quotes.
- Broadcom, the co-developer providing silicon implementation and Tomahawk networking.
- Celestica, which builds the boards, racks, and server systems around the Jalapeño chip.
- ASIC (application-specific integrated circuit), what Jalapeño is, a custom chip built for one task unlike a general-purpose GPU.
- Nvidia Blackwell, the Nvidia architecture Broadcom’s CEO claims Jalapeño matches.
June 24, 2026
Lloyd Blankfein on the 3 Sectors Where He Puts His Money Now: Big Tech, Energy, and Financial Services, Day Trading From an iPad, and the Warren Buffett Handshake That Backed Goldman in 2008
Lloyd Blankfein spent almost 40 years at Goldman Sachs, the last dozen as its chairman and chief executive, and he still trades almost every day from an iPad. In this wide ranging conversation on the My First Million podcast, the former Goldman boss lays out exactly where he is putting his own money right now, why a supportive spouse beats nearly any investment, how Warren Buffett wired five billion dollars into Goldman on a handshake during the 2008 crisis, and why he reads medieval history to stay calm about the present. It is part stock picking, part risk philosophy, and part a frank accounting of money, marriage, and the scars of growing up in the projects.

TLDW

Blankfein says he is roughly 98 percent in risky assets, almost all equities, and concentrated in three sectors he knows cold: big tech, energy, and financial services. His personal book leans heavily into single stocks over ETFs, weighted toward the big hyperscalers and a few second tier names, and he trades daily, alone, from an iPad and a phone, using calls and texts as his research network. Yet the advice he gives a normal investor is the boring opposite: a diversified S&P 500 fund like VOO, more risk when you are young because you will outlive your mistakes, the same thing Warren Buffett would tell you. The conversation ranges across the 2008 Buffett investment in Goldman, the cost of trying to legislate risk out of markets, the thin margin between the best and the rest, luck and the myth of the genius, why reputation is the real contract on Wall Street, why a supportive spouse is the highest return asset he knows, the money anxiety he carried out of a Brooklyn housing project, the dignity of a 500 dollar financial aid check, giving with a warm hand versus a cold one, the dangers of gamified investing, the big misses like SpaceX and early cellular, the obituary test a senior partner once gave him, and why reading history keeps the present in proportion.

Thoughts

The most useful tension in this interview is the gap between what Blankfein practices and what he preaches. He tells young people to buy a diversified S&P 500 index fund, he holds VOO himself, and he calls the host’s plain 90 percent stocks and 10 percent bonds split sensible. Then he admits his own portfolio is something like 90 percent single stocks that he trades by hand every day. The honest read is that his edge is not a transferable tip. It is a 40 year information network of phone calls and a tolerance for risk that most people neither have nor should want. The replicable lesson is the boring half, not the day trading half.

The most contrarian idea here is not a stock pick, it is his defense of risk itself. His argument that regulators trying to prevent the hundred year storm also forfeit the 99 normal years of growth in between is a serious claim about the price of safety, and it travels far beyond Wall Street. The same goes for his point that a good risk manager sometimes has to push people to take more risk, not less. The moment after a loss, when everyone goes gun-shy, is exactly when the best operators lean back in. That is an uncomfortable thing for a former bank CEO to say out loud, and it is the part of the conversation most worth sitting with.

The Warren Buffett story is a master class in what actually moves markets, and it is not cash. Goldman did not need the five billion dollars. Blankfein says the money was almost irrelevant because the firm already had money. What it could not manufacture was confidence, and Buffett’s name supplied it. The handshake, the commitment with no paperwork, the line about Blankfein worrying enough for the both of them, all point to the same thing. At the top, reputation is the collateral. His aside that most trades are never written down, because anyone who reneges will never eat lunch in this town again, is the same idea wearing street clothes.

Quietly, the personal finance thread may be the most valuable part for a normal listener. A former Goldman CEO saying that a supportive partner is more game changing than any investment, that a bad marriage is financially worse than being lonely, and that he has not paid a bill in over 40 years because his wife runs the household economy, is a reminder that household stability is itself an asset class. The 500 dollar financial aid check he still remembers half a century later, and his “give with your warm hand” philosophy, reframe wealth as something measured by how it feels to give and to receive, not just by the size of a pie chart.

Finally, the history obsession is not a side hobby, it is his risk model. Reading about the Black Plague, the McCarthy era, and the Vietnam draft is how he keeps the present in proportion. His Mark Twain line, that history does not repeat but it rhymes, is the direct antidote to the “in this economy” defeatism he and the host both complain about. For an investor, that long view is close to the whole game. It is what lets you hold through the drawdowns that scare everyone else out of the market.

Key Takeaways
- Blankfein estimates he is about 98 percent in risky assets, with roughly 95 of those 98 points in equities, and the rest spread thin. He invests in risky assets because, in his words, that is what is fun for him.
- Within his equities, he is heavily tilted toward single stocks rather than ETFs. He frames it as roughly a quarter to a third in ETFs and the rest in single names, and concedes it could be as lopsided as 90 percent single stocks because picking names is what he enjoys.
- The three sectors he has concentrated in for years are big tech, energy, and financial services, and he says his outperformance comes from where he focused, not from any special genius.
- On tech he owns the big hyperscalers, the Googles, Microsofts, and Nvidias of the world, plus a tier just below them, naming Oracle and Larry Ellison as an example of a slightly riskier second tier name. He thinks in categories, not fixed tickers, because he changes positions constantly.
- He says he has a background in trading energy, which is why energy is a core sleeve, and he knows financial services from the inside after almost 40 years at Goldman, so those are natural areas of edge.
- He still owns a lot of Goldman Sachs stock, out of affection for the firm he spent his career building.
- He is bullish on big tech and plans to stay bullish until it stops going up. His foreseeable future, he jokes, lasts until he finishes the conversation and checks the screen again.
- He trades every single day, alone, with no team. He does it from an iPad and a phone, not a computer, and treats the market like background music rather than a job.
- His research is human, not algorithmic. He chats and texts with people, then calls them because he is tired of fixing typos, and he reads the New York Post, the Wall Street Journal, the New York Times, the Financial Times, and Bloomberg.
- The advice he gives ordinary investors is deliberately boring and different from his own behavior: hold a diversified equity portfolio like an S&P 500 fund, with VOO as his own example, and tilt more aggressively when you are young because you have time to outlive mistakes.
- He notes that broad indexes are already heavily weighted toward tech because of market cap, so a plain index gives meaningful tech exposure, and a tech focused ETF on top can add a disproportionate tilt for believers.
- He calls the host’s simple 90 percent index and 10 percent bonds allocation sensible, and says this is essentially the same advice Warren Buffett would give a normal person.
- The older you get, the more conservative you should become, shifting from maximizing gains toward not losing what you have. Young people can afford more risk precisely because they will outlive their errors.
- During the 2008 financial crisis, Warren Buffett invested about five billion dollars in Goldman through a preferred stock structure, essentially on a phone call and a handshake, with no demand for due diligence.
- Buffett’s real value was confidence, not capital. Goldman already had money, but it had lost the confidence of the market while peers were failing. Buffett’s name signaled the firm was a good investment being beaten down by circumstances that would reverse.
- Buffett asked for a verbal commitment that Goldman would not sell shares before he did, and declined to put it in writing. He waved off the worry with the line that five billion dollars going bad would not even be a bad hurricane for Berkshire, an insurer.
- Most trading is done on reputation, not paper. Blankfein says people buy and sell bonds worth enormous sums without written contracts, relying on probity, because anyone who reneges will never eat lunch in this town again.
- On risk and regulation, he argues you cannot legislate risk away. Trying to prevent the hundred year storm also forgoes the 99 years of growth in between, and a good risk manager sometimes has to encourage people to take risk, not suppress it.
- The best traders have resilience. They bounce back, focus on new information rather than the past, and adapt quickly instead of staying gun-shy after a loss.
- The difference between someone who is really good and someone who cannot make it is small. He compares it to a golf tournament won by one stroke with six people tied for second, and notes much of life is winner take all at razor thin margins.
- Luck matters enormously. He became Goldman CEO partly because his predecessor was nominated to be Treasury Secretary, a reference to Hank Paulson, and the timing of opportunities is often out of your control.
- He is skeptical of the word genius. He says he can usually see how successful people do what they do, with Elon Musk as a rare exception, and that powerful people are more normal, more insecure, and more flawed than outsiders assume.
- On democratized investing, he thinks apps that make markets accessible are good in their own terms, but gamifying trading with confetti and high fives can mask real danger for people who can lose more than they can afford.
- He has missed plenty. He thought SpaceX was overpriced at a 100 billion dollar valuation, now discussed at close to 1.75 trillion dollars, and passed on early cellular because he could not imagine why anyone would carry a bulky phone when payphones existed. He says he missed far more than he got.
- He frames a supportive spouse as more game changing than almost any investment, and warns that a bad marriage, with custody fights and property settlements, is financially and personally worse than being lonely.
- He has not paid a bill in over 40 years. His wife Laura, a former lawyer he says now chairs Barnard College, runs a bill paying service and manages the household economy. He generates the money, she distributes it.
- He grew up in an East New York, Brooklyn housing project, the son of a postal worker, and carried money anxiety well into his 30s. He recalls buying a vacation home that cost more than all their savings, with his wife unable to make the math work until they remembered the down payment.
- A 500 dollar financial aid check, handed to him without shame as a college freshman around 1971, shaped his philosophy on giving. He learned it is not enough to give people what they need, you have to give it in a way that feels dignified.
- He embraces the idea of giving “with your warm hand, not your cold hand,” meaning giving while you are alive so you can experience the joy, which connects to the spirit of the book Die With Zero.
- He admits ambivalence about giving to his kids, the strange feeling of resenting that they have what he provided, and notes the heavy burden carried by children of prominent people who must prove they earned their place.
- He describes himself as wired for anxiety, inherited from his father, and says looking around corners for what could go wrong actually suited a career in a risky business with a big balance sheet.
- When he made partner, a senior partner gave him rules of the road, including avoiding misconduct, being conservative on taxes, setting up a charitable foundation, and living so that no more than three of the nine paragraphs in his eventual obituary would be about Goldman. He says he stayed too long to pass that test.
- He reads history as a discipline, favoring Barbara Tuchman, Robert Caro’s The Power Broker, Ron Chernow, Rick Atkinson, and Stephen Ambrose. His core belief, borrowed from Mark Twain, is that history does not repeat but it rhymes, which is why he would not bet against America.
Detailed Summary

The three sectors he actually invests in

The headline answer to where the former Goldman CEO is putting his money is simple: big tech, energy, and financial services. He says he has been focused on those three areas for a long time, and that his outperformance is a function of where he aimed rather than any unusual investing gift. Energy is natural because he has a background trading it. Financial services is natural because he spent nearly 40 years inside the industry. Tech is where he is most heavily concentrated, and he expects to stay there, arguing that we are on the threshold of large changes in technology. He owns the major hyperscalers by category, the Googles, Microsofts, and Nvidias, plus a tier just below, offering Oracle and Larry Ellison as a polite example of a slightly riskier second tier name. He is careful to say he thinks in categories rather than fixed tickers because he changes his positions all the time.

How the portfolio is really built: single stocks over ETFs

Asked to describe his portfolio as a pie chart, Blankfein says he is about 98 percent in risky assets, with roughly 95 of those points in equities. He pushes back on the idea that index funds are safe, pointing out that a diversified equity ETF is still equities and still risky, just spread out, and very different from debt or short term money markets. Within his equity sleeve he leans into single stocks, framing it as somewhere between a quarter and a third in ETFs and the rest in individual names, and conceding it might be as extreme as 10 percent ETFs and 90 percent single stocks. The reason is preference, not theory. Picking and trading names is what he likes to do, and he is honest that this is a hobby pursued by a professional, not a model for someone investing for a living.
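The pie chart arithmetic is easy to lose in prose, so here is a minimal sketch of how those rough figures translate into shares of the whole portfolio. The inputs (98 points risky, 95 points equities, and ETF shares of one third, one quarter, or 10 percent of the equity sleeve) are the approximate numbers quoted above; the function name and scenario labels are illustrative assumptions, and Blankfein gave no exact breakdown.

```python
def equity_split(equity_points: float, etf_share: float) -> dict[str, float]:
    """Split the equity sleeve into ETFs and single stocks.

    Values are percentage points of the total portfolio.
    """
    if not 0.0 <= etf_share <= 1.0:
        raise ValueError("etf_share must be between 0 and 1")
    etf_points = equity_points * etf_share
    return {
        "etfs": round(etf_points, 2),
        "single_stocks": round(equity_points - etf_points, 2),
    }


# Approximate figures from the interview, in points out of 100.
RISKY_POINTS = 98.0
EQUITY_POINTS = 95.0

# Hypothetical scenario labels covering the range he describes.
scenarios = {
    "one third in ETFs": 1 / 3,
    "one quarter in ETFs": 0.25,
    "10 percent in ETFs": 0.10,
}

breakdown = {
    label: equity_split(EQUITY_POINTS, share)
    for label, share in scenarios.items()
}
other_risky_points = RISKY_POINTS - EQUITY_POINTS  # risky assets outside equities
non_risky_points = 100.0 - RISKY_POINTS            # everything else

for label, split in breakdown.items():
    print(f"{label}: ETFs {split['etfs']}, single stocks {split['single_stocks']}")
print(f"other risky assets: {other_risky_points}, non-risky: {non_risky_points}")
```

Under these assumptions, single stocks land somewhere between roughly 63 and 86 points of the total portfolio, which is why the split looks so different from the index fund advice he gives everyone else.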

How he actually trades: an iPad, a phone, and a network

He trades every day, by himself, with no team. There is no Bloomberg terminal and no desk of analysts. He uses an iPad and a phone, and admits it takes discipline not to glance at his screen mid conversation. The market, he says, is like music playing in the background while he does other things. His information edge is relational. People text him, he texts back, and then he calls because he is tired of fixing typos with what he calls his fat fingers. He follows general and business news, reads a stack of newspapers starting with the New York Post, and treats companies like little stories, almost like gossip. He even notes, with some delight, that he still watches commercials on Netflix, a small window into a frugality that never fully left him.

The advice he gives young investors, and what Buffett would say

For a normal person, his counsel is the opposite of his own behavior. He would hold a diversified portfolio of equities like an S&P 500 fund, naming the SPY and VOO tickers and saying he personally uses VOO. Because of the importance of technology, he might add a tech oriented ETF for extra tilt, while noting the broad index is already tech heavy by market cap. He endorses the host’s plain 90 percent index and 10 percent bonds split as sensible and says it mirrors what Warren Buffett would advise. His one piece of age based guidance is that younger investors should accept more risk through equities, because they have time to recover, while older investors should grow more conservative and focus on not losing what they have rather than maximizing returns.

The Warren Buffett handshake that backed Goldman in 2008

The most cinematic story in the conversation is Buffett’s roughly five billion dollar investment in Goldman during the financial crisis, structured as a preferred stock that sits between a loan and equity. Blankfein describes a deal done largely on trust. When he offered to walk Buffett through everything he was worried about, Buffett replied that he knew Lloyd well enough to know he worried enough for the both of them. Buffett also asked, verbally and without writing, for a commitment that Goldman would not sell shares before he did. Blankfein is clear that the cash itself was almost irrelevant, since Goldman had money. What the firm lacked was the confidence of a frightened market, and Buffett’s willingness to invest before things improved supplied exactly that signal. Buffett, he stresses, was acting for his own shareholders, not as a rescuer, which is precisely what made the vote of confidence credible.

Why you cannot legislate risk out of the system

Reflecting on the post crisis regulatory push to make sure 2008 never happened again, Blankfein makes a careful argument about the price of safety. Once you are in the business of taking risk, anything can happen, and trying to legislate it away has a hidden cost. You may think you are protecting the world from the hundred year storm, but you also forgo the 99 years of growth in between. He extends this inside the firm too. After a period of big losses, partners had become gun-shy and were talking themselves out of every idea. A good risk manager, he argues, sometimes has to promote risk taking rather than repress it, because without risk there is no growth, no entrepreneurship, and no progress. The flip side is real: take risk and there is a meaningful chance you fail and lose other people’s money, which is a terrible outcome. But the alternative, never risking anything, buys comfort at the cost of ever moving forward.

Small margins, big outcomes, and the role of luck

Asked what separated the traders who could not outperform from the rest, Blankfein says the gap between the very good and those who cannot make it is surprisingly small. He likens it to a golf tournament decided by a single stroke with six players tied for second, and to acting, where the best performer gets every role and the second best waits tables. Much of life, he says, is winner take all at tiny margins. Luck compounds this. He freely credits fortune for his own rise, noting he became CEO in part because his predecessor was tapped to be Treasury Secretary. He is also skeptical of the genius label. He can usually see how accomplished people do what they do, with Elon Musk a rare exception, and insists the powerful are more normal, more insecure, and more driven by their flaws than outsiders imagine.

Reputation is the real contract

A recurring theme is that the financial world runs on reputation more than paperwork. Blankfein notes that most of what traders do is not written down. People buy and sell bonds and other instruments that settle days later, relying on probity rather than signed contracts, because anyone who lies or reneges will never eat lunch in this town again. He references the casual texts between Elon Musk and Larry Ellison around the Twitter acquisition as proof that big does not mean complicated. There are big things that are simple and little things that are complicated. Documentation is good when execution is far off, but when a deal will be performed in two days, dotting every i is often pointless. The point is not that documents do not matter, it is that trust and reputation are the load bearing structure.

A supportive spouse as the highest return asset

The conversation turns personal when both men agree that a supportive partner may be the single most game changing factor in a life, more than any investment. Blankfein adds the inverse warning: a bad marriage, with breakups, custody battles, and property settlements, is worse than loneliness. He credits his wife Laura, a former big firm lawyer he says now chairs Barnard College, with handling everything when his career moved the family overseas, from the car to the house to the kids’ schooling, while he took the visible victory laps at work. He has not paid a bill in over 40 years. Laura manages a bill paying service and runs the household finances. As he puts it, he is in charge of generating the money and she is in charge of distributing it. The host contrasts this with his own monthly money meetings with his wife, a discipline he picked up from a personal finance author friend.

Money scars, the 500 dollar check, and giving with a warm hand

Blankfein grew up in an East New York housing project, the son of a postal worker who had earlier lost a job, in a household where money for rent was scarce. He calls himself an urban hick who barely left Brooklyn as a kid. That scarcity left a mark that lasted into his 30s. He tells the story of buying a small beach house that cost more than all their savings, and of his wife driving 30 miles while failing to make the closing math work, until they realized she had forgotten to count the 10 percent down payment. The most resonant memory is a 500 dollar financial aid check handed to him as a freshman around 1971, made out on the spot by a clerk with a generosity of spirit that let him receive it without shame. That experience shaped a lifelong view that giving well means preserving dignity, and he now co chairs a financial aid campaign at his university. It also connects to his embrace of the idea of giving with your warm hand rather than your cold hand, giving while alive so you can feel the joy, the same spirit as the book Die With Zero. He is candid about a strange ambivalence, the way he can resent that his kids enjoy what he himself gave them.

Robinhood, confetti, and the misses

On apps like Robinhood, Blankfein takes a balanced view. Democratizing investing and making assets accessible is good in its own terms, and advertising can pull people toward markets they would otherwise ignore. But if you make trading too much like a video game, with confetti and high fives, you can mask the danger and lure people who cannot afford losses into losing more than they can bear. He is equally frank about his own misses. He thought SpaceX was overpriced at a 100 billion dollar valuation, a figure now discussed near a trillion and three quarters. He passed on early cellular because he could not imagine why anyone would carry a bulky phone with payphones everywhere. His blunt summary is that he missed far more than he got, and that nobody is great at predicting the future.

The obituary test, thick skin, and staying too long

When Blankfein made partner, a senior partner assigned to acculturate new partners gave him rules of the road: avoid anything that would today be called misconduct, be rigorous and conservative on taxes, set up and actually use a charitable foundation, and keep enough balance that, if your obituary runs nine paragraphs, no more than three are about Goldman. Blankfein says he failed that last test by staying too long, even titling his memoir around the firm. He also reflects on having a thick skin, recalling unflattering press and concluding that he could take a punch, a trait not everyone has and one he did not know he possessed until he was tested. He is careful to say this does not make people who cannot take a punch bad, just differently wired.

Why he reads history: it rhymes

The final stretch is a love letter to reading history. Blankfein favors Barbara Tuchman, whose A Distant Mirror he has read twice and whose Guns of August he calls fantastic and influential, along with Robert Caro’s The Power Broker on Robert Moses, Ron Chernow’s biographies, Rick Atkinson’s Revolution series, and Stephen Ambrose’s Undaunted Courage. He describes rereading the Robert Moses book after 40 years of trying to get things done and finding his appreciation for the achievements rise, even as the flaws stayed the same, because he had changed. He ties history directly to markets through the Mark Twain line that history does not repeat but it rhymes. Patterns recur, every generation maximizes its own crises and minimizes resolved ones, and reading about the black plague, the McCarthy era, or the Vietnam draft is how he stays calm. His conclusion, echoing a sentiment often attributed to Buffett, is that he would not bet against America, a country he describes as mostly good and able to improve.

Notable Quotes

“I invest in risky assets. That’s what’s fun for me.”
Lloyd Blankfein, describing his own portfolio, which he says is roughly 98 percent risky assets

“It’s been good to be bullish on big tech, and I’ll stop being bullish on it when it stops going up.”
Lloyd Blankfein, on why he stays concentrated in technology

“I’m not at a computer. I don’t have a computer. I have an iPad.”
Lloyd Blankfein, on how he day trades every day, alone and with no team

“To me, the market is like music. It’s out there. It’s going on.”
Lloyd Blankfein, on why trading daily feels like a hobby rather than work

“Look, $5 billion if it all goes bad, that’s not even a bad hurricane on the East Coast.”
Warren Buffett to Lloyd Blankfein, waving off the risk of his 2008 investment in Goldman Sachs

“The difference between somebody who’s really, really good and somebody who can’t make it is not that great.”
Lloyd Blankfein, on the thin margin between the best and the rest

“You may think you’re protecting the world from the hundred-year storm, but you’re also going to forgo the 99 years of in between when there was growth.”
Lloyd Blankfein, on the cost of trying to legislate risk out of markets after 2008

“I’m in charge of generating the money, and she’s in charge of distributing it.”
Lloyd Blankfein, on his 40-plus-year marriage to Laura and why he has not paid a bill in decades

“History doesn’t repeat, but to paraphrase Mark Twain, it rhymes.”
Lloyd Blankfein, on why reading history keeps the present in proportion

Watch the full conversation with Lloyd Blankfein on the My First Million podcast here.

Related Reading
- Lloyd Blankfein (Wikipedia). Background on the former Goldman Sachs chairman and CEO whose investing views anchor the conversation.
- My First Million podcast. The show where this interview took place, for the full back catalog of investor and founder conversations.
- Berkshire Hathaway. Primary source on Warren Buffett’s company, which made the roughly five billion dollar Goldman investment in 2008.
- Vanguard S&P 500 ETF (VOO). The diversified index fund Blankfein names as the sensible core holding for a normal investor.
- Die With Zero by Bill Perkins. The book behind the give with your warm hand, not your cold hand philosophy discussed near the end.
June 16, 2026
Thomas Laffont of Coatue on the $4 Trillion AI IPO Wave: SpaceX, Anthropic, OpenAI, and Why the New Unicorn Economy Is Healthier
Thomas Laffont, co-founder of the $55 billion hedge fund Coatue Management, made his All-In Podcast premiere with a data-dense walk through what he calls a once-in-a-generation moment for the unicorn economy. In front of Chamath Palihapitiya, Jason Calacanis, David Sacks, and David Friedberg, he argued that a roughly $4 trillion wave of private value is about to hit the public markets, led by SpaceX, Anthropic, and OpenAI, and that the new AI-driven unicorn economy is actually healthier than the one that came before it. You can watch the full presentation and Q&A on YouTube.

TLDW

Laffont presents Coatue’s slide deck on the state of the unicorn economy and argues it has rebalanced after the excesses of 2021. The average unicorn is up about 70 percent since September 2024, AI keeps taking a bigger share of all fundraising, and the model has shifted from many small unicorns to fewer companies each raising far more, with funding per unicorn up roughly 5x since 2021. He introduces a “Magnificent 8” private index (SpaceX, Stripe, Anthropic, Databricks, Revolut, ByteDance, Anduril, and more) worth nearly $4 trillion that has crushed the public Mag 7, then shows that exits are finally thawing as SpaceX heads to an IPO in weeks and Anthropic confidentially files its S-1. He lays out Coatue’s “CODE” framework for why SpaceX gets more valuable the more it launches, a counterintuitive finding that the odds of a 10x actually rise as companies get bigger (31 percent for $100 billion-plus centicorns), the explosive revenue ramp of OpenAI and Anthropic past Workday, ServiceNow, Adobe, Salesforce, and now the hyperscalers, a three-pillar map of where AI revenue comes from (consumer, ads, enterprise), and the AI memory thesis. The Q&A with Chamath and Calacanis digs into the power law, K-shaped outcomes, whether these valuations are disconnected from reality, the public market as the great antiseptic, and what happens when trillions in private value finally recycles back through GPs and LPs.

Thoughts

The most useful idea in the talk is not the $4 trillion headline, it is the cohort-health chart. Laffont splits unicorns into eras and shows that the pre-2021 cohort was healthy, roughly 80 percent had raised again or exited 20 quarters after minting, while the giant 2021 ZIRP cohort of 479 companies is stuck with under 20 percent doing either. That single comparison reframes the whole AI boom. The bullish read is that the 2024 AI cohort is small, concentrated, and cash-generative, so it looks more like the healthy pre-ZIRP group than the 2021 hangover. The bearish read is that we are watching the same movie with bigger numbers, and the test only comes when these companies face public markets. Laffont is honest that we do not yet know which cohort the AI class resembles, and that intellectual humility is what makes the deck credible rather than promotional.

The SpaceX “CODE” framework is the sharpest analytical move of the presentation. Most people would assume a launch business gets cheaper per launch as it scales. Laffont shows the opposite, the market pays more per launch as cadence rises, and explains it as a phase change in business quality: from one-time government launch revenue, to a single recurring-revenue constellation, to multiple constellations, to a platform with optional upside in space data centers, the moon, and Mars. It is a clean way to think about any company that climbs from a project business to a platform business, and it applies far beyond rockets. The lesson for investors is that valuation can rationally expand even as unit economics look like they should compress, because the nature of the revenue underneath is changing.

The counterintuitive 10x odds finding deserves more attention than it got in the room. Conventional wisdom says the bigger you are, the harder it is to grow, so a $100 billion company should be less likely to 10x than a $10 billion one. Coatue’s data says the reverse: centicorns have a 31 percent shot at a 10x, far higher than the 8 percent a unicorn has at becoming a decacorn. Laffont’s explanation is a filtering mechanism, every step up validates a compounding advantage and durability of earnings, so survivors are increasingly the kind of business that keeps compounding. This is essentially a quantitative restatement of quality investing, and it is the intellectual backbone of the LP strategy the besties tease out, just buy whoever reaches $100 billion and hold.

Where the argument gets genuinely contested is valuation, and the panel does not let it slide. The pushback that “these are not fake companies” is true and important, OpenAI and Anthropic are growing faster than any software company in history, and Anthropic reportedly had a profitable month. But growth and reality do not settle the question of price when you are paying 50 to 100 times revenue for trillion-dollar private companies, as Bill Ackman pointed out earlier in the day. Laffont’s answer is the most grounded thing he says all session: the public market is the great antiseptic, it will not care about anyone’s slide deck, and he wants to see these names withstand short sellers and skeptics. That is the right posture. The deck is a thesis, not a verdict, and the verdict arrives roughly six months and one day after the IPOs, once passive flows and supply have washed through.

The closing thread, that almost every sector is being transformed at once and we still do not have superintelligence, is the part worth sitting with. The risk in a presentation this bullish is treating the trend as destiny. The value is in the framing tools Laffont hands you, cohort health, phase-change business quality, the filtering odds, the three revenue pillars, and the antiseptic of public scrutiny. Use those to interrogate each name rather than to buy the index on faith, and the talk earns its premiere billing.

Key Takeaways
- Coatue Management is one of the most successful hedge funds of the last two decades with about $55 billion under management, and is raising roughly another billion dollars specifically to invest in AI.
- The unicorn economy is up about 70 percent on average since September 2024, and the public market has made a similar move up over the same period.
- The unicorn economy’s share of the NASDAQ rose significantly after 2015 but has plateaued in recent years, reflecting strong performance from public companies.
- AI keeps increasing its wallet share of all venture fundraising, multiple years in a row now.
- The composition of funding has changed. The unicorn “factory” peaked in the ZIRP era of 2021 and has normalized at a much lower level since.
- Funding per unicorn has increased roughly 5x since 2021. There are fewer unicorns, and each one is raising more.
- Cohort health, pre-ZIRP group: of about 73 unicorns, 20 quarters after minting roughly 80 percent had either raised a new round or exited, which is healthy.
- Cohort health, 2021 group: of about 479 unicorns, 20 quarters in, fewer than 20 percent had exited or raised again. Far larger cohort, far worse outcomes.
- The open question is which cohort the new 2024 AI cohort will resemble.
- Funding is concentrating: the top 10 companies capture a large share, and it is a small number of AI companies, not all of them, with Anthropic and OpenAI raising massive rounds.
- Laffont proposes a “Magnificent 8” private index: SpaceX, Stripe, Anthropic, Databricks, Revolut, ByteDance, Anduril, and more, spanning internet, AI, fintech, and space tech.
- That private index represents almost $4 trillion of value and has crushed the traditional public Mag 7, with almost every name outperforming.
- Exits are thawing. 2026 is on a good trend for cash returned versus consumed, not quite 2021 levels, with half a year still to go.
- That trend does not yet include three imminent liquidity events, among them SpaceX (IPO expected in weeks) and Anthropic (confidentially filed its S-1), whose combined value could exceed the prior decade of exits combined.
- The ecosystem is far more balanced than when Laffont first presented at the 2024 All-In Summit, when it was consuming much more cash than it returned.
- OpenAI and Anthropic revenue growth is unlike anything previously seen. Starting from January 2025, they passed Workday, then ServiceNow, then Adobe, then Salesforce, and are now bigger than Google Cloud and Azure.
- On current forecasts, that revenue could pass AWS by the end of the year and exceed all of Microsoft by 2028.
- Hyperscalers are not sitting still. The largest companies in the world are funding the disruption, investing unprecedented sums to enable the ChatGPT moment.
- The SpaceX “CODE” framework: the number one driver correlated to SpaceX’s valuation is cadence of launches, and valuation per launch rises as launches increase.
- Why per-launch value rises: business quality improves through phases, pre-constellation (one-time government revenue), initial ramp (one recurring-revenue constellation), scale (multiple constellations), and platform (space data centers, moon and Mars optionality).
- Anthropic in particular is scaling like no company seen across the PC, internet, or mobile eras.
- Counterintuitive 10x odds: a unicorn has about an 8 percent chance of becoming a decacorn, a decacorn has 8 to 13 percent odds of reaching $100 billion, but a centicorn ($100 billion-plus) has a 31 percent chance of a 10x.
- Value creation has accelerated. It typically takes years to go from $500 billion to $1 trillion in market cap, yet recently three companies did it in one year and two did it in a matter of weeks.
- Cerebras is the counterexample of slow success: years of dark periods and no new capital developing its technology, then a massive OpenAI contract that quintupled the company’s value ahead of its IPO.
- Semiconductors are on a generational run, with the sector dramatically outperforming the index since the 2024 All-In Summit.
- AI memory thesis: the more an AI system knows about you, the more useful it is, so memory per user could quintuple, which helps explain recent moves in memory companies.
- Where the revenue is: the AI ecosystem is roughly $140 billion today, about $300 billion this year, and is expected to double in 2027.
- Three revenue pillars: consumer (subscribers times ARPU), ads (about a quarter of Meta and Google ads are AI-enabled today, heading toward 100 percent and roughly $150 billion), and enterprise (tools like Claude Code and Codex inside businesses).
- Disruption is hitting every sector: software, telco (Starlink-powered global phone calls), semis, energy (data centers reshaping Pennsylvania’s grid), auto (Ferrari’s electric and autonomous stumble), and consumer (GLP-1s reshaping food, alcohol, and wellness).
- Final takeaways: the new unicorn economy is healthier thanks to AI, winners are compounding faster so the cost of not owning a winner is higher than ever, disruption is everywhere, and we do not even have superintelligence yet.
- In the Q&A, both Anthropic and OpenAI publicly say they want to be public, and big outcomes now look likely to become liquid within roughly a 12-month window.
- The valuation pushback: these are not fake companies, they generate substantial revenue at scale and grow faster than anything before, and Anthropic reportedly even had a profitable month.
- The public market is framed as the great equalizer and antiseptic, but with passive buying the true price discovery may not land on day one, more like six months and a day after listing.
- A floated LP strategy: wait for whoever reaches $100 billion and concentrate capital there as the least brittle, quickest-return bet, tempered by the warning that valuations are disconnecting from any historical metric (50x to 100x revenue).
- An open risk: with so much capital, OpenAI and Anthropic could rationally start a price war, the way ride-sharing and food-delivery players once did, though heavy infrastructure spend complicates it.

Detailed Summary

The unicorn economy has rebalanced after 2021

Laffont opens by reframing a market many assume is frothy. The average unicorn is up about 70 percent since September 2024, and the public market has tracked a similar climb, so private and public value are moving together rather than diverging. The unicorn economy’s share of the NASDAQ rose sharply after 2015 and then plateaued, which he reads as a sign of how strong public companies have become. Underneath the headline, the structure of funding has changed. The 2021 ZIRP era was a unicorn factory that minted enormous numbers of companies, and that machine has since normalized to a much lower level. The result is a barbell: fewer new unicorns, but each raising far more, with funding per unicorn up roughly 5x since 2021. AI sits at the center of this, taking a steadily larger share of all venture dollars for several years running.

Cohort health is the real story

The deck’s most important slide measures the health of the ecosystem by cohort. The pre-ZIRP cohort, about 73 unicorns, looks healthy: 20 quarters after becoming unicorns, roughly 80 percent had either raised a new round or exited. The 2021 cohort tells the opposite story. It is enormous, about 479 unicorns, and 20 quarters in, fewer than 20 percent had raised again or exited. That contrast sets up the central question of the talk. A new 2024 cohort of AI companies is forming, and no one yet knows whether it will resemble the healthy pre-ZIRP group or the bloated, stuck 2021 group. Laffont’s framing leans optimistic because the AI cohort is small and concentrated, but he is careful not to declare the answer.
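
To make the contrast concrete, here is a minimal sketch that converts the cohort percentages into approximate company counts. The cohort sizes and rates are the ones quoted above. The function name is hypothetical, and the 2021 rate uses 20 percent as an upper bound, since the talk only says "fewer than 20 percent."

```python
from typing import Tuple


def cohort_progress(size: int, progressed_rate: float) -> Tuple[int, int]:
    """Return (progressed, stuck) company counts for a cohort, where
    'progressed' means raised a new round or exited by quarter 20."""
    progressed = round(size * progressed_rate)
    return progressed, size - progressed


# Pre-ZIRP cohort: about 73 unicorns, roughly 80 percent progressed.
pre_zirp = cohort_progress(73, 0.80)

# 2021 cohort: about 479 unicorns; 0.20 is an upper bound, so the
# progressed count is a ceiling and the stuck count is a floor.
zirp_2021 = cohort_progress(479, 0.20)

print("pre-ZIRP (progressed, stuck):", pre_zirp)
print("2021     (progressed, stuck):", zirp_2021)
```

Even at the generous upper bound, the 2021 cohort leaves well over 300 companies that have neither raised nor exited, against roughly 15 in the pre-ZIRP group.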

The Magnificent 8 and a $4 trillion private index

Funding is not just flowing to AI, it is flowing to a handful of AI names, with the top 10 capturing a large share and Anthropic and OpenAI raising the biggest rounds. From this concentration Laffont builds a private index he half-jokingly calls the Magnificent 8, a number he expects to shrink as companies go public. The named members span sectors: SpaceX, Stripe, Anthropic, Databricks, Revolut, ByteDance, and Anduril, among others, covering internet, AI, fintech, and space tech. He says he would be comfortable owning that index for the next decade-plus. Collectively it represents almost $4 trillion of value and has outperformed the public Mag 7, with nearly every constituent beating that benchmark.

Exits are thawing and a wall of liquidity is coming

One of Laffont’s recurring concerns at past summits has been balance: the unicorn economy is great at consuming cash, but a healthy ecosystem must also return it. On that score 2026 is trending well, not quite 2021, but solid with half a year left. Crucially, that figure does not yet include three imminent events, among them SpaceX, which is expected to go public within weeks, and Anthropic, which confidentially filed its S-1 the day of the talk. Adding those up, just a few companies could deliver more liquidity than the prior ten years combined. The takeaway is that the ecosystem that was dangerously out of balance in 2024 is now meaningfully more balanced, and improving.

The revenue ramp past the hyperscalers

The growth rates of OpenAI and Anthropic, Laffont argues, are unlike anything previously seen. Charting from January 2025, the leading AI labs passed Workday, then ServiceNow, then Adobe by year end, then Salesforce by January, and are now bigger than Google Cloud and Azure. On forecast, that revenue could surpass AWS by the end of the year and exceed all of Microsoft by 2028. He stresses that the hyperscalers are not passive bystanders, they are actively funding the disruption, pouring unprecedented capital into enabling the change that began with the ChatGPT moment.

The SpaceX CODE framework

Laffont devotes real time to how Coatue thinks about SpaceX. The single factor most correlated with SpaceX’s valuation is cadence of launches, which is intuitive for a launch business. The surprise is that valuation per launch has risen rather than fallen as cadence climbed. His explanation, the CODE framework, is that the quality of the business model improves the more SpaceX launches. In phase one, pre-constellation, you are simply proving rockets, with a few government customers and lumpy, unpredictable one-time revenue. In the initial ramp you stand up a constellation, which is an end market and a recurring-revenue business that grows with every satellite and subscriber. At scale you operate multiple constellations, and Laffont expects companies, governments, and militaries to want to own their own. Ultimately it becomes a platform, with new businesses layered on top, from space data centers to the optionality of the moon and Mars.

Counterintuitive odds and the speed of value creation

Coatue bucketed companies and asked the odds of a 10x within each. A unicorn has roughly an 8 percent chance of becoming a decacorn. A decacorn has 8 to 13 percent odds of reaching $100 billion. But a centicorn, $100 billion or more, has a 31 percent chance of a 10x, counting both public and private companies. The bigger you are, the better your odds, which inverts intuition. Laffont pairs this with the sheer speed of recent value creation. Going from $500 billion to $1 trillion in market cap normally takes years, yet three companies did it in a single year and two did it in a matter of weeks. He also offers Cerebras as the patient counterexample, a chip company that endured years of dark periods and no new capital before a massive OpenAI contract quintupled its value ahead of IPO, part of a broader generational run for semiconductors.
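
One way to see the filtering effect is to chain the stage odds together. The sketch below multiplies the probabilities quoted above, under the simplifying assumption that each step is independent. That assumption is ours, not Coatue's, and the helper name is hypothetical.

```python
def chained_odds(*steps: float) -> float:
    """Multiply per-stage probabilities, assuming independent stages."""
    result = 1.0
    for p in steps:
        result *= p
    return result


P_UNICORN_TO_DECACORN = 0.08             # about 8 percent
P_DECACORN_TO_CENTICORN = (0.08, 0.13)   # quoted range of 8 to 13 percent
P_CENTICORN_10X = 0.31                   # 31 percent for $100 billion-plus

# Odds that a freshly minted unicorn eventually reaches $100 billion.
low = chained_odds(P_UNICORN_TO_DECACORN, P_DECACORN_TO_CENTICORN[0])
high = chained_odds(P_UNICORN_TO_DECACORN, P_DECACORN_TO_CENTICORN[1])

print(f"unicorn to centicorn: {low:.2%} to {high:.2%}")
print(f"centicorn 10x, once there: {P_CENTICORN_10X:.0%}")
```

On these numbers only about 0.6 to 1 percent of unicorns would ever reach centicorn status, but the ones that do face much better odds on the next 10x, which is the filter Laffont describes.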

AI memory and where the revenue actually comes from

A throughline from the day’s other speakers is that the more an AI knows about you, the more useful it is, from your restaurant preferences to your work context. Laffont turns that into a thesis: memory per user could quintuple based on what these systems require, which helps explain recent moves in memory companies. He then tackles the most contested question, where is the revenue. He sizes the AI ecosystem at about $140 billion today, roughly $300 billion this year, and doubling in 2027, built on three pillars. Consumer is subscribers times ARPU. Ads are the pillar people forget, with about a quarter of Meta and Google ads already AI-enabled and penetration heading toward 100 percent, a roughly $150 billion opportunity. Enterprise is the breakthrough category, exemplified by tools like Claude Code and Codex operating inside businesses.
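
The sizing logic above reduces to a few multiplications. This is a rough sketch, not Coatue's model: the subscriber count and ARPU are hypothetical placeholders chosen only to show the formula, and the monthly ARPU convention is an assumption. The $300 billion figure and the one-quarter ad penetration come from the talk as summarized.

```python
def consumer_revenue(subscribers: int, monthly_arpu: float) -> float:
    """Consumer pillar: subscribers times ARPU, annualized.
    Treating ARPU as a monthly figure is an assumption."""
    return subscribers * monthly_arpu * 12


# Hypothetical inputs, for illustration only (not from the talk).
example_subscribers = 50_000_000
example_monthly_arpu = 20.0
example_consumer = consumer_revenue(example_subscribers, example_monthly_arpu)

# Ecosystem sizing from the talk, in billions of dollars.
ecosystem_this_year = 300.0
ecosystem_2027 = ecosystem_this_year * 2  # "doubling in 2027"

# Ads pillar: about a quarter AI-enabled today, heading toward 100 percent.
ads_penetration_today = 0.25
ads_penetration_multiple = 1.0 / ads_penetration_today

print(f"example consumer revenue: ${example_consumer / 1e9:.0f}B per year")
print(f"implied 2027 ecosystem: ${ecosystem_2027:.0f}B")
print(f"AI-enabled ad share can grow {ads_penetration_multiple:.0f}x")
```

The point of the exercise is that the consumer and ads pillars are driven by simple levers (subscriber count, price, penetration), while the enterprise pillar is harder to reduce to a single formula.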

Every sector is being transformed at once

What makes this era different, Laffont says, is that nearly every sector is being transformed simultaneously. Software is obvious, but look at telco, where he believes Starlink will soon power a device that lets you make a phone call anywhere on earth, attacking the global telco and broadband profit pool with a better product. Compute is driving massive change in semis, data centers are reshaping the energy equation in places like Pennsylvania, and the auto business is being upended, as Ferrari’s stumble introducing electric and autonomous technology showed. In consumer, GLP-1 drugs are profoundly changing consumption of food and alcohol and the broader focus on wellness. His takeaways close the loop: the new unicorn economy is healthier thanks to AI, winners are compounding faster so the cost of missing them is higher than ever, disruption is everywhere, and superintelligence has not even arrived yet.

The Q&A: power law, valuation, and the public market test

Chamath and Jason Calacanis press Laffont on what this means for allocators. The recurring theme is the power law and K-shaped outcomes, with gains consolidating into a small number of companies. The positive side, Laffont notes, is that outcomes are enormous and increasingly liquid within a 12-month window, and both Anthropic and OpenAI say they want to be public. The hard part is valuation. The besties cite Bill Ackman’s framing that investors are making venture bets on trillion-dollar companies at 50 to 100 times revenue. Laffont’s pushback is that these are not fake companies, they generate substantial revenue at scale and grow faster than anything before, and Anthropic reportedly had a profitable month. But he embraces the discipline ahead: the public market is the great antiseptic and will not care about anyone’s presentation, though with heavy passive buying, true price discovery may take roughly six months and a day rather than landing on day one. Asked whether the compounding is a market inefficiency or survivor bias, he declines to over-read a small sample, noting that Anthropic before Claude Code was a completely different company than after. The conversation closes on what happens when trillions recycle from GPs to LPs, the case for simply owning whoever crosses $100 billion, the risk of everyone crowding into three names, and the possibility of an eventual OpenAI versus Anthropic price war.
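
The valuation debate is easier to follow with the multiple turned around. Assuming a round $1 trillion valuation purely for illustration, this sketch shows the revenue implied by the 50x to 100x range cited in the Q&A. The function name is hypothetical.

```python
def implied_revenue(valuation: float, revenue_multiple: float) -> float:
    """Annual revenue implied by a valuation and a price-to-revenue multiple."""
    return valuation / revenue_multiple


VALUATION = 1_000_000_000_000  # illustrative round number: $1 trillion

for multiple in (50, 100):
    revenue = implied_revenue(VALUATION, multiple)
    print(f"{multiple}x revenue implies ${revenue / 1e9:.0f}B in annual revenue")
```

At those multiples a trillion-dollar price tag rests on roughly $10 billion to $20 billion of revenue, which is why the panel treats growth rate, not current revenue, as the thing being priced.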

Notable Quotes

“So we have fewer unicorns that are each raising more.”
Thomas Laffont, summarizing how funding per unicorn has risen roughly 5x since 2021

“The reason is that the quality of SpaceX’s business model increases the more you launch.”
Thomas Laffont, explaining the CODE framework and why valuation per launch rises with cadence

“The winners are compounding faster than ever, which means the costs of not being in a winner are higher than ever.”
Thomas Laffont, on the central risk of a power-law market

“And by the way, we don’t even have super intelligence yet.”
Thomas Laffont, closing his takeaways on how early the transformation still is

“These are companies generating substantial revenue at scale that are growing faster than anything we’ve ever seen.”
Thomas Laffont, pushing back on the idea that AI valuations rest on fake companies

“It will be the great antiseptic. It will not care about my presentation.”
Thomas Laffont, on the public market as the ultimate test for SpaceX, OpenAI, and Anthropic

“Anthropic pre-Claude Code was a completely different company than post-Claude Code.”
Thomas Laffont, on why he won’t over-read a small sample of hyper-compounders

“The power law rules our lives. All the great gains are being consolidated into small numbers of companies.”
An All-In host, framing the Q&A on concentration in private markets

This is a curated set of highlights. To hear the full presentation, the slide walkthrough, and the complete Q&A with Chamath and Jason Calacanis, watch the full conversation here.

Related Reading
- Coatue Management. Primary source for Thomas Laffont’s firm and the technology investing strategy behind the deck.
- The All-In Podcast. The show and summit where Laffont made this premiere presentation.
- Power law (Wikipedia). Background on the distribution Laffont and the hosts say governs venture and public-market returns.
- The Magnificent Seven (Wikipedia). The public-market benchmark Laffont’s private “Magnificent 8” index is measured against.
- Cerebras Systems. The AI chipmaker Laffont cites as the slow-grind IPO that was eventually transformed by a major OpenAI contract.
June 4, 2026
Bill Ackman on Investment Strategy, What the Market Is Missing, and How AI Breaks Businesses
Bill Ackman, founder and CEO of Pershing Square, joined the All-In Podcast for a conversation about how his investment approach has shifted toward permanent, long-term ownership, why he believes the highest-quality companies are being left behind by a market chasing the new new thing, and how AI is raising the risk of disruption for almost every business. He also lays out his plan to turn Howard Hughes into a Berkshire Hathaway-style compounding machine built on insurance. You can watch the full conversation here. Below is a structured breakdown of the ideas, the stories, and the frameworks he uses to underwrite a business.

TLDW

Ackman explains how his philosophy evolved from a smaller, more liquid activist toward concentrated, permanent ownership of durable, non-disruptible businesses, with much of his activism now playing out on X rather than in the boardroom. He tells the origin story of his first big trade, Wendy’s and the Tim Hortons spin-off, and explains why a large long-term shareholder on a board is an antidote to short-term markets. On AI, he argues that this is the greatest era in history to build a company, which means the risk of being disrupted has gone up enormously, and that the market is mispricing high-quality compounders like Microsoft, Meta, and Amazon while crowding into chips, semiconductors, and energy. He works through the SaaS question and why niche software is more at risk than platforms, how he underwrites SpaceX, xAI, OpenAI, Anthropic, and Palantir like late-stage venture bets using a people, opportunity, context, deal framework, and why founder-led companies have an edge in making radical calls. The back half covers his Howard Hughes plan to copy Buffett’s insurance-float model, the role of cost of capital and reflexivity in markets, the meme-stock era, going direct on social media, and the three different ways an investor can put money to work with Pershing Square.

Thoughts

The most useful idea in the interview is the way Ackman reframes disruption as the central investing problem of the AI era. His point is that the same forces making this the best time in history to start a company, meaning near-unlimited compute, capital, and talent, also raise the odds that any given incumbent gets disrupted. That reframes the word quality. It is no longer mostly about margins and moats. It becomes about non-disruptibility, which is a much higher bar than most quality investors were using a decade ago, and it is why he says most of his research time now goes into assessing that single risk.

The what-the-market-is-missing thesis is classic contrarian Ackman. Arguing that Microsoft, Meta, and Amazon are the new old-fashioned, undervalued names while capital piles into semiconductors and energy is a direct echo of 2000, when Berkshire Hathaway bottomed precisely because money was chasing internet stocks. It is worth keeping in mind that he owns all three, so the call is also his book. The durable signal here is the framework, not the specific tickers: capital reliably chases the new new thing, and genuinely high-quality businesses get left behind during those rotations.

The Howard Hughes plan is the most concrete bet in the conversation. Copying Buffett’s insurance-float playbook, short-term treasuries for policyholder money and equities for the surplus, onto a discounted real-estate holding company is elegant. The hard part is exactly what Ackman flags about insurance as an industry: the best investors go to hedge funds, not insurers, so most insurance companies only ever manage the liability side well. Pershing Square’s edge is that Ackman can both write the business and invest the float, which is the same reason it worked for Buffett. The framing of going from a four billion dollar company to a trillion over fifty years is a statement of intent, not a forecast, and should be read that way.
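
To put the four-billion-to-a-trillion framing in perspective, the implied compound annual growth rate can be worked out directly. What follows is a minimal sketch, not Ackman's own math: the function name `implied_cagr` is hypothetical, and the inputs are only the round figures quoted above (a roughly four billion dollar starting value, a one trillion dollar target, fifty years).

```python
def implied_cagr(start_value: float, end_value: float, years: float) -> float:
    """Return the constant annual growth rate that turns start_value into end_value."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        raise ValueError("start_value, end_value, and years must all be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Round figures from the interview summary; illustrative assumptions only.
rate = implied_cagr(4e9, 1e12, 50)
print(f"Implied annual growth: {rate:.1%}")
```

Under those assumptions the required rate works out to a little under twelve percent a year, sustained for five decades, which is why the figure reads better as a statement of ambition than as a forecast.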

Underneath all of it sits cost of capital and reflexivity. His observation that a higher stock price literally makes a company more valuable, because it lowers the cost of capital and creates acquisition currency, is the mechanism behind both Elon Musk’s empire and the meme-stock era he is wary of. Going direct on X is the same lever pointed at himself: communicate the vision, lower your own cost of capital, and make the bet easier for other people to place. It is a coherent worldview in which narrative and balance sheet continuously feed each other, and it explains a lot of his behavior over the last few years.

Key Takeaways
- The biggest change in Ackman’s approach over time is an appreciation for business quality, meaning long-term, durable, protected, non-disruptible growth as the most important factor.
- He says he is as activist as ever, but more of it now happens on X than in the traditional corporate context.
- His first big investment was Wendy’s, which owned Tim Hortons. The simple thesis was to buy Wendy’s, spin off Tim Hortons, and double the money.
- Early on no one returned his calls, so he had Steve Schwarzman’s Blackstone write a fairness opinion and filed it publicly. The company spun off Tim Hortons six weeks later. The CEO, who had since been fired with a large exit package, later thanked him.
- Reputation compounds. Where Pershing Square once had to bang down the door, companies now sometimes tweet a welcome when it buys a stake.
- A large long-term shareholder on a board is a counterweight to short-term markets, letting management test ideas privately and pursue initiatives that hurt the next few quarters of earnings.
- Pershing Square owns Microsoft, Meta, and Amazon. Ackman argues you are either invested in AI directly or indirectly, or it is a threat, so you have to understand it.
- The hardest and most important job for a concentrated investor is judging the risk of disruption, and that risk has risen dramatically.
- This is the greatest era in history to build a business because of near-unlimited access to compute, capital, and talent, which is exactly why the probability of being disrupted has gone up enormously.
- Markets bring their eye to the new new thing, currently chips, semiconductors, and energy, while high-quality companies get left behind.
- He draws an analogy to 2000, when Berkshire Hathaway traded at one of its lowest valuations because everyone chased internet stocks. He sees a similar dynamic around Amazon, Meta, and Microsoft today.
- On the SaaS question, he worries more about a Salesforce than a platform like Microsoft, because niche software charging high per-seat or per-year prices is most exposed, while low-priced platforms are safer.
- Any software company today has to be as AI-enabled as possible, or risk losing the monopolistic pricing it once enjoyed.
- His famous March 2020 CNBC appearance was an attempt to reach President Trump and argue for a short shutdown, paired with the view that stocks were incredibly cheap and worth buying.
- He describes valuation as a tether on the market: when prices stretch too high they snap back, and when they get too cheap the same rubber band pulls valuations up. Calling that out publicly can trigger a psychological reset.
- His recent bullish call came because stocks of really high-quality companies had gotten crazy cheap on fundamentals, meaning the present value of the cash they generate.
- He underwrites high-multiple names like SpaceX as venture investments using a framework from business school: people, opportunity, context, deal.
- On SpaceX, people and opportunity are one of one, the context is incredible, and Starlink plus near-monopoly low-cost launch make it strategically valuable. The complicated part is the deal, meaning the valuation. He invested via an SPV after Ron Baron’s nudge, and also invested in xAI.
- He treats OpenAI, Anthropic, and Palantir as late-stage venture bets that have proven they can generate real revenue, and says OpenAI should do a better job communicating how it thinks about its enormous capital commitments.
- Every CEO in America is asking how to use AI, how it applies to their business, and how it is a threat. It is top of mind and boards open every meeting with it.
- He has not seen much enterprise AI success yet, citing a McKinsey study that 95 percent of enterprise initiatives fail and the rise of the forward deployed engineer as the hot role bridging promise and ROI. Pershing Square itself uses AI mainly for legal, compliance, and back-office work.
- Founder-led companies have an advantage because founders have the authority and the economic stake to make radical calls, while the average S&P 500 CEO has a roughly three to four year tenure and is incentivized not to make mistakes.
- He cites Mark Zuckerberg buying Instagram and WhatsApp as the kind of shocking-at-the-time calls that a founder with a track record can make.
- Ben Graham’s enduring lesson is that a stock is an interest in a business, not a piece of paper, but Graham mostly invested in liquidations and cash-rich shells, and made most of his money on GEICO.
- Most of Buffett’s value at Berkshire came from owning insurance operations and focusing on the asset side of the balance sheet, not just the liability side.
- Insurance is hard to copy because top investors do not go to work for insurers. Buffett owned half his company and was a great investor, which is why it worked.
- Howard Hughes came out of the General Growth bankruptcy and owns master-planned cities like Summerlin, with 26,000 acres in the Las Vegas area, comparable to the Irvine Company that built roughly a hundred billion dollars of wealth for Donald Bren.
- The plan is to reinvest the cash Howard Hughes generates into insurance, put policyholder float in short-term treasuries and the surplus in common stocks, and build a compounding machine over fifty years, buying it at roughly sixty cents on the dollar.
- A company must earn a return above its cost of capital for the stock to rise. Elon Musk has kept his companies’ cost of capital extremely low, and a SpaceX IPO near a 1.75 trillion dollar valuation could be one of the lowest cost of equity capital transactions ever.
- Markets have changed less because of Ackman and more because of figures like Ryan Cohen and GameStop, where a stock can trade well above its value on personality and an army of followers.
- Higher valuations are reflexive: a rising stock price lowers cost of capital and creates currency to issue stock and acquire businesses, which is part of how Elon built Tesla.
- There are three ways to invest with Pershing Square: the management company itself (a royalty on compounding assets with no capex), PSUS (a portfolio of best ideas trading at an 18 percent discount), and Howard Hughes (a bet on building the next Berkshire). A dollar invested 22 years ago has grown roughly 27 to 28 times, net of fees.
- Going direct on X, with 2.2 million followers, lets him communicate his vision and lower the friction for others to back his bets, even as his very long tweets have become a running meme.

Detailed Summary

From activist trades to permanent capital

Ackman frames the evolution of his career as a steady move toward business quality. As a smaller, more liquid investor early on, he did not have to think as long-term. As Pershing Square became a bigger, more concentrated investor, durable growth became the dominant factor in every decision. He insists he is still as activist as ever, but a lot of that energy has shifted to X, where he can argue a position publicly rather than only inside a boardroom. The best investments, he notes, are the ones where you do not need to join the board and do anything at all.

The Wendy’s and Tim Hortons origin story

One of Pershing Square’s first investments was Wendy’s, which owned the Canadian coffee and donut chain Tim Hortons. The value of Tim Hortons alone was greater than the entire value of Wendy’s, so the idea was simple: buy Wendy’s, spin off Tim Hortons, and double the money. Ackman bought ten percent of the company and could not get the CEO to return a single call, so he had a contact at Blackstone, with Steve Schwarzman’s sign-off, write a fairness opinion on what Wendy’s would be worth after a spin-off, filed it publicly, and watched the spin-off happen six weeks later. The CEO eventually called back to thank him, having been fired but rewarded with a large exit package. Over the years that scrappy approach gave way to a reputation that now opens doors on its own.

Why a long-term shareholder on the board matters

The core problem of being a public company, in Ackman’s telling, is the short-term nature of markets and analysts, when a good business should be run in the context of years and even decades. A large, supportive shareholder on the board gives management a place to test ideas before exposing them to the public and a credible voice willing to back initiatives that hurt earnings for a few quarters. That is the value-add he believes a constructive activist can bring to a mature public company, as opposed to a startup where the best outcome is simply to own a great business and stay out of the way.

AI and the rising risk of disruption

For a concentrated, long-term investor, the most challenging task is judging the risk that two people from Stanford in a garage build something that destroys your thesis. Ackman argues that risk has climbed dramatically because this is the greatest era in history to build a company, with near-unlimited access to compute, capital, and talent. The paradox is that the conditions that make building easier also make incumbents more fragile, so the bulk of his research now centers on assessing how disruptible a business really is.

What the market is missing

Investors bring their attention to the new new thing, currently chips, semiconductors, and energy, which leaves high-quality companies behind. Ackman compares the moment to 2000, when Berkshire Hathaway traded at one of its lowest valuations ever because capital was chasing internet stocks. He sees an echo today in how Amazon, Meta, and Microsoft are treated as old-fashioned, and he considers them undervalued on fundamentals, where value is the present value of the cash a business generates over its life. His recent bullish call, like his March 2020 appearance, came because stocks of really high-quality companies had simply gotten too cheap.

The SaaS question and AI-enabled software

On the so-called SaaS apocalypse, Ackman says it is a company-by-company analysis. He worries more about something like Salesforce than about a low-priced platform. The companies most at risk are those that extracted near-monopolistic profits by charging a high annual price for a niche product, because AI lowers the barrier to replicating that functionality. A platform where the average customer pays a small amount per seat, like Microsoft, is far less exposed. The takeaway for any software company is to become as AI-enabled as it possibly can.

Underwriting SpaceX, xAI, and the AI labs like venture

For the highest-multiple private companies, Ackman uses a venture lens and a framework a business school professor taught him: people, opportunity, context, deal. SpaceX scores as one of one on people and opportunity, with an incredible context and, through Starlink and a near-monopoly in low-cost launch, a strategic position that makes even Amazon a likely customer. The complicated variable is the deal, meaning the valuation, and he admits he has not done all the math, having invested through an SPV after Ron Baron encouraged him, along with a position in xAI. He treats OpenAI, Anthropic, and Palantir as late-stage venture bets that have proven real revenue, and argues OpenAI in particular should communicate more clearly how it justifies capital commitments that vastly exceed current revenue.

Founder-led companies and the authority to act

Ackman agrees that founder-led companies have a structural advantage in a fast-changing environment. The average S&P 500 CEO has a tenure of roughly three to four years, a small economic stake, and an incentive not to make a career-ending mistake. A founder is betting an entire life and reputation, has the authority of a major voting and economic position, and has usually made several hard, contrarian calls that turned out right. He points to Mark Zuckerberg’s acquisitions of Instagram and WhatsApp, which looked shocking at the time, as exactly the kind of decision a founder with a track record can make and a hired manager often cannot.

Howard Hughes as Berkshire Hathaway 2.0

Ackman points to a detailed financial history of Berkshire Hathaway showing that the vast majority of Buffett’s value creation came from owning insurance and focusing on the asset side of the balance sheet, not just the liability side. Insurance is hard to replicate because skilled investors join hedge funds rather than insurers, but Buffett owned half his company and was a great investor. Pershing Square is applying the same idea to Howard Hughes, a company created out of the General Growth bankruptcy that owns master-planned cities such as Summerlin, with 26,000 acres around Las Vegas, in the spirit of the Irvine Company that made Donald Bren roughly a hundred billion dollars. The plan is to reinvest the company’s cash into insurance, place policyholder float in short-term treasuries and the surplus in common stocks, avoid issuing new stock, following Buffett’s example, and compound for fifty years, all bought at around sixty cents on the dollar.

Cost of capital, reflexivity, and going direct

A company only creates value when it earns above its cost of capital, which is why Howard Hughes, seen as a high-cost-of-capital real-estate business, has long traded at a discount, and why Ackman is repurposing its assets into a higher-returning model. He highlights how reflexive markets are: a higher stock price itself makes a company more valuable by lowering its cost of capital and creating currency to raise money and acquire businesses, a lever Elon Musk used to build Tesla. He attributes real market change less to himself and more to figures like Ryan Cohen and GameStop, where personality and a following can lift a stock far above its value. His own going-direct strategy on X, with 2.2 million followers and famously long posts, is the same mechanism applied to communicating a vision and lowering friction for investors. He closes by laying out three ways to invest with Pershing Square: the management company as a royalty on compounding assets, the PSUS portfolio trading at an 18 percent discount, and Howard Hughes as a bet on building the next Berkshire.
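
The link between cost of capital and value can be illustrated with a simple discounted cash flow calculation. This is a minimal sketch with made-up numbers: the cash flows, the two discount rates, and the function name `present_value` are all assumptions chosen for illustration, not figures from the interview.

```python
from typing import Sequence


def present_value(cash_flows: Sequence[float], discount_rate: float) -> float:
    """Discount a series of year-end cash flows back to today."""
    if discount_rate <= -1.0:
        raise ValueError("discount_rate must be greater than -100%")
    return sum(
        cf / (1.0 + discount_rate) ** year
        for year, cf in enumerate(cash_flows, start=1)
    )


# Hypothetical business: 100 of cash per year for 10 years.
flows = [100.0] * 10

high_cost = present_value(flows, 0.12)  # investors demand a 12% return
low_cost = present_value(flows, 0.08)   # investors demand an 8% return

print(f"Value at 12% cost of capital: {high_cost:.0f}")
print(f"Value at  8% cost of capital: {low_cost:.0f}")
```

The same cash flows are worth more when discounted at the lower rate, which is the mechanical half of the reflexivity Ackman describes. The other half, using a richly valued stock as currency to raise money and acquire businesses, is not modeled here.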

Notable Quotes

“The best investments are one where you don’t need to join the board and do anything.”
Bill Ackman, on the kind of business he most wants to own

“The probability of your being disrupted has gone up enormously.”
Bill Ackman, on why assessing disruption risk now dominates his research

“Valuation is like a tether on the market, right? When it gets too high, it’s like this rubber band that’s stretching and inevitably it bounces back.”
Bill Ackman, on how prices revert at both extremes

“People, opportunity, context, deal.”
Bill Ackman, on the business school framework he uses to underwrite companies like SpaceX

“Every CEO in America today is like, how do I use AI?”
Bill Ackman, on AI as the top opportunity and threat in every boardroom

“A closed mouth gathers no foot.”
Bill Ackman, quoting the line a friend put next to his name in his high school yearbook

“The increase in value of the company increases the value of the company, right? Because it lowers the cost of capital, it gives you more flexibility, gives you the ability to issue stock, raise capital, acquire other businesses.”
Bill Ackman, on the reflexivity between stock price and corporate value

“The company’s got like a $4 billion market cap and the goal is to build it into a trillion dollar thing over time compounding.”
Bill Ackman, on his fifty-year plan for Howard Hughes

Taken together, the conversation is a tour of how Ackman now thinks about quality, disruption, and compounding, and a preview of the Berkshire-style machine he wants to build out of Howard Hughes. Watch the full conversation here.

Related Reading
- Pershing Square Holdings. The public vehicle and primary source for Ackman’s portfolio and strategy.
- Howard Hughes Holdings. The master-planned community company Ackman is reshaping into an insurance-driven compounder.
- Bill Ackman (Wikipedia). Background on the investor’s career and major activist campaigns.
- Berkshire Hathaway. The insurance-float compounding model he is trying to emulate.
- The All-In Podcast. The show where this conversation took place.
June 3, 2026
Gavin Baker on Orbital Compute, TSMC, Frontier AI Models, Anthropic’s Vertical Take Off, and the Coming Wafer Shortage
Gavin Baker, founder and CIO of Atreides Management, returns to Patrick O’Shaughnessy’s Invest Like the Best for his sixth appearance. He calls the current AI moment the most extraordinary moment in the history of capitalism, walks through what Anthropic’s vertical takeoff in revenue actually means, lays out why orbital compute is closer than skeptics believe, dissects the TSMC bottleneck that may be the only thing standing between today’s market and a full-on AI bubble, and rates every hyperscaler on how they have positioned for a world where frontier model providers may stop selling API access altogether.

TLDW

Anthropic added eleven billion dollars of ARR in a single month, which is roughly the combined business of Palantir, Snowflake, and Databricks built over a decade. That is the setup. From there Gavin Baker covers the March and April selloff, the contrarian read that a closed Strait of Hormuz was actually bullish for American manufacturing competitiveness, why Anthropic and OpenAI multiples may be misleadingly cheap on an unconstrained run rate basis, and why Elon Musk’s discipline on SpaceX valuation created a superpower of permanent access to capital. On infrastructure, he makes the practical engineering case for orbital compute as racks in space rather than Pentagon sized space stations, explains why TSMC’s capacity discipline is the single most important variable in whether the AI cycle becomes a bubble, and describes what Terafab in Texas changes. On models, he covers why the Pareto frontier of AI models has flipped from Google dominance to Anthropic and OpenAI dominance in nine months, the shift from all you can eat AI subscriptions to usage based pricing and what that means for revenue scaling, Richard Sutton’s bitter lesson as the largest risk to the AI trade, why frontier tokens still capture an overwhelming share of economic value, and the role of continual learning as the third great open question. On chips, he explains why most new chip startups should not try to build a better GPU, why Cerebras did something different and hard, and why disaggregated inference may extend GPU useful lives to ten or fifteen years and rescue the private credit industry. He closes with why being in the token path is the new venture filter, the new prisoner’s dilemma around releasing frontier models via API, an honest rating of Google, Meta, Amazon, and Microsoft, why personal safety is becoming a real AI era risk, and why he remains an AI optimist maximalist who believes this could be the next Pax Americana.
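
The claim that headline multiples may be misleadingly cheap comes down to simple division. Here is a minimal sketch using the round figures Baker cites in the takeaways (roughly nine hundred billion of valuation, fifty billion of ARR, and an unconstrained run rate of one hundred fifty to two hundred billion). The function name `revenue_multiple` and the scenario labels are hypothetical.

```python
def revenue_multiple(valuation_bn: float, revenue_bn: float) -> float:
    """Return valuation divided by annual revenue, both in billions of dollars."""
    if revenue_bn <= 0:
        raise ValueError("revenue_bn must be positive")
    return valuation_bn / revenue_bn


valuation_bn = 900.0  # approximate valuation cited in the takeaways

scenarios = [
    ("reported ARR", 50.0),
    ("unconstrained run rate, low end", 150.0),
    ("unconstrained run rate, high end", 200.0),
]

for label, arr_bn in scenarios:
    print(f"{label}: {revenue_multiple(valuation_bn, arr_bn):.1f}x")
```

On reported ARR the multiple is 18 times. On the unconstrained range it falls to between 4.5 and 6 times, consistent with the "closer to five times" figure in the takeaways.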

Key Takeaways
- Anthropic added eleven billion dollars of ARR in one month, more than the combined businesses of Palantir, Snowflake, and Databricks built across a decade. There is no precedent for this in the history of capitalism.
- The SaaS and cloud revolution created between five and ten trillion dollars of value over twenty years. AI is replaying that compression on a timeline measured in months.
- The March selloff was a drawdown driven by disagreement with price action, not by an invalidated thesis. That is the kind of drawdown an investor can lean into.
- DeepSeek Monday in January 2025 was a similar setup. By the day of the selloff, AWS Asia GPU prices had already doubled, GPU availability had fallen, and it was obvious reasoning models would be vastly more compute hungry at inference. The market priced the opposite.
- The Strait of Hormuz closing was actually positive for America. US natural gas (the primary input into US electricity, which feeds AI) fell twenty percent on Bloomberg while Asian and European natural gas doubled or tripled. American manufacturing competitiveness improved overnight.
- The US is now the world’s largest producer and exporter of oil and gas. The economy is dramatically less energy intensive than in the 1970s. The shortage trauma comparison does not hold.
- Tech as a sector traded as cheaply versus the rest of the market in early April as at any point in the last ten years, into the single most bullish moment for AI fundamentals on record.
- Anthropic is dramatically more capital efficient than OpenAI, having burned roughly eighty percent less to reach a similar revenue scale. They have very different structural returns on invested capital.
- Anthropic at roughly nine hundred billion for fifty billion of ARR (growing a thousand percent) is striking. Adjusted for compute constraint, the unconstrained run rate could be one hundred fifty to two hundred billion, putting the implied multiple closer to five times.
- Claude Opus generates roughly seventy percent fewer tokens for the same question than previously, with token quantity tied to answer quality. Subscribers on flat-fee plans are getting a lobotomized model.
- Elon Musk’s superpower is twenty years of making investors money. He never pushes valuation. SpaceX compounded at a low-thirties percent annual rate for a decade because Musk treats fair pricing as a sacred covenant.
- Capitalism will solve the watts shortage. The current bottleneck has shifted from chips and energy to zoning and political approval. Many capex decisions are paused until after the US midterms.
- The watts shortage probably begins to alleviate in 2027 and 2028. Orbital compute solves it longer term.
- Orbital compute is not Pentagon sized data centers in space. It is racks in space. A Blackwell rack is three thousand pounds, eight feet tall, four feet deep, three feet wide. SpaceX has shown a satellite roughly that size.
- The satellites operate in sun synchronous orbit so solar wings (around five hundred feet per side) always face the sun and the radiator on the dark side always points to deep space.
- Starlink V3 satellites already run at around twenty kilowatts. A Blackwell rack runs at one hundred kilowatts. SpaceX engineers express genuine confidence they have already solved cooling and radiator design at these scales.
- Racks in space are connected with lasers traveling through vacuum, the same lasers already on every Starlink. SpaceX operates the world’s largest satellite fleet and, via xAI Colossus, the world’s largest data center on Earth.
- Inference will move to orbit. Training will stay on Earth for a long time. Terrestrial data centers remain valuable for the rest of an investor’s career.
- The wafer bottleneck is structural and political. TSMC is essentially Taiwan’s GDP, water, and electricity. The leaders see themselves as inheritors of Morris Chang’s sacred legacy and they do not behave like a Western public company.
- Jensen Huang has never had a contract with TSMC. The relationship is run on handshakes and the assumption that things will be fair over time.
- If TSMC did everything Jensen wanted, Nvidia could be selling two to three trillion dollars of GPUs in 2026 and 2027. TSMC’s discipline is the single largest factor preventing a true AI bubble.
- Historically, foundational technologies always get a bubble. Railroads, canals, the internet. The current AI buildout is overwhelmingly funded out of operating cash flow, GPUs are running at one hundred percent utilization, and that is fundamentally different from the year 2000 fiber overbuild.
- If either Intel or Samsung Foundry catches up at the leading node, the other will follow, and TSMC’s discipline collapses. Watch TSMC capacity decisions to predict a bubble.
- Terafab, the SpaceX and Tesla joint venture to build the world’s largest fab in America, has a partnership with Intel that grants access to fifty years of institutional foundry knowledge. The A teams at ASML, KLA, Lam Research, and Applied Materials will follow Elon’s reputation in hardware engineering.
- The hiring playbook for Terafab includes building Taiwan Town, Japan Town, and Korea Town next to the fab. Recruit the engineers and import their families, their restaurants, and their staff.
- Frontier tokens still capture an overwhelming share of all economic value created at the model layer. This is surprising and is one of the three big open questions for AI investing.
- The Pareto frontier of intelligence versus cost has flipped. Nine months ago Google’s TPU dominated every point on the frontier. Today Anthropic and OpenAI dominate, with Grok 4.3 on the frontier and Gemini 3.1 hanging on.
- Google’s conservative TPU V8 design (partly an attempt to reduce dependence on Broadcom and Nvidia) is the leading explanation for the loss of per token cost leadership.
- AI pricing is shifting from all you can eat to usage based, mirroring the cellular and long distance industries. Cellular stopped being a great growth industry when it went all you can eat. AI just made the opposite move.
- OpenAI and Anthropic together could exceed two hundred billion in ARR this year if compute keeps coming online and frontier token pricing holds.
- The two hundred fifty dollar a month consumer AI plan is no longer enough to evaluate frontier capability. Enterprise plans with usage based billing are required because rate limits are now severe.
- The three biggest open questions for AI investors are: violation of the bitter lesson via ASI or human ingenuity, whether frontier tokens keep commanding their premium, and when continual learning arrives.
- Today’s continual learning is crude reinforcement learning during mid training on verifiable tasks. True continual learning means weights updating dynamically, like a human who learns the first time they touch fire.
- Trying to build a better GPU is a losing strategy. Jensen will copy any one to three percent share design. Startups should target one percent share, do something different, and make it hard enough that Nvidia cannot fast follow.
- Disaggregated inference (separating prefill and decode) opens new design canvases. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently.
- Cerebras did something different and hard with wafer scale computing. Three generations of chips and real grit to get there.
- Disaggregation of inference may stretch GPU useful lives to ten or fifteen years, dropping financing costs from low sevens to five or six percent, mathematically lowering the cost of the AI buildout and likely saving the private credit industry from its SaaS loan exposure.
- Sellers of shortage outperform buyers of shortage. But owning the largest installed base of what is currently in shortage (hyperscaler CPU fleets, for example) is also a strong position.
- Most of the economic value at the application layer of AI has been destroyed, not created. The exceptions are companies in the token path or in niches small enough that frontier labs ignore them.
- Coding may be the shortest path to ASI. If you can write code, you can write code that does anything. Cursor, Cognition, and Anthropic correctly focused on it.
- Jensen could probably get close to the frontier with his own Nemotron family of models whenever he wants. The fact that he chooses not to is a strategic decision about not commoditizing his customers.
- The new prisoner’s dilemma in AI is whether frontier labs release their best model via API. If everyone agrees not to, Chinese open source falls behind. If anyone defects, the defector pulls ahead on revenue and resources, forcing everyone else to defect.
- Google still owns the largest compute installed base. Without TPU’s prior cost advantage, this matters more. YouTube data has real value in a world of robotics. GCP is going crazy.
- Meta deserves credit for becoming AI first internally faster than any other internet giant. Musa, their first MSL model, is impressively close to the Pareto frontier.
- Amazon is strong because of Trainium and robotics driven retail P&L efficiency. Nova is better than it gets credit for.
- Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Microsoft products rather than reselling to OpenAI is a courageous and probably correct call, even at the cost of an eight hundred dollar stock price.
- The hyperscalers most engaged with startups are Amazon and Nvidia by a mile, followed by Google. Broadcom is the favorite ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement and that will cost them as the best teams are now at startups.
- Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion at the speed of FaceTime is already feasible.
- Ukraine is winning largely on the back of having the best battlefield AI outside America and Israel. Adversaries are starting to internalize what AI dominance means geopolitically.
- An optimistic read is that this becomes a new Pax Americana, the way the post 1945 American nuclear monopoly was used to rebuild Germany and Japan rather than dominate.
- AI cured a friend’s daughter’s rare disease by spinning up a research effort that identified a market drug capable of impacting her condition. That is the upside that keeps Gavin an AI optimist maximalist.
Detailed Summary

The most extraordinary moment in the history of capitalism

Gavin’s framing of the current moment is unusually direct. Anthropic added eleven billion dollars of annual recurring revenue in a single month. The three highest profile SaaS companies of the last decade plus (Palantir, Snowflake, and Databricks) collectively took a decade and tens of thousands of employees to build the combined business that Anthropic added in thirty days. He has been investing through every major tech cycle and says there is no historical analog. Not the dotcom era, not the cloud transition, not mobile. This is its own thing.

The market response, then, was peculiar. The NASDAQ sold off into the single most bullish moment for AI fundamentals on record. Tech traded at roughly its widest discount versus the rest of the market in a decade. Investors who said they wished they had bought into AI during 2022, during COVID, or during DeepSeek Monday got the same valuation setup again in early April, this time with an even clearer inflection.

Why the Strait of Hormuz closing was secretly bullish for America

One reason the macro fear in March may have been mispriced is that the same geopolitical event that drove the selloff was, in practice, a relative benefit to the United States. American natural gas, the input into American electricity, which is the input into American AI training and inference, fell roughly twenty percent. Asian and European natural gas prices doubled or tripled. The US emerged with sharply improved relative manufacturing competitiveness, which is exactly what the current administration cares about.

The 1970s comparison does not hold. The US economy is dramatically less energy intensive, it is now the world’s largest producer and largest exporter of oil and gas, and there are no shortages, only price moves. That backdrop made it easier for disciplined investors to stay focused on AI fundamentals through the volatility.

Anthropic and OpenAI valuations on an unconstrained run rate

Anthropic at roughly nine hundred billion for fifty billion of ARR sounds rich until you adjust for the fact that the company is severely compute constrained. Gavin estimates that, unconstrained, Anthropic might be at one hundred fifty to two hundred billion in run rate revenue, putting the implied multiple closer to five times. He also points out that Claude Opus now generates roughly seventy percent fewer tokens for the same question than it used to. Token quantity correlates with answer quality, and Anthropic is rate limiting and shrinking outputs to ration capacity across its user base.
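
The multiple arithmetic in that paragraph can be checked in a few lines. This is a minimal sketch that uses only the round numbers quoted above; the function name is illustrative, and the unconstrained run rate is Gavin's estimate, not a reported figure.

```python
def revenue_multiple(valuation_bn: float, run_rate_bn: float) -> float:
    """Valuation divided by annual run rate revenue, both in billions of dollars."""
    return valuation_bn / run_rate_bn

valuation_bn = 900.0                 # rough valuation quoted in the summary
reported_arr_bn = 50.0               # compute constrained ARR quoted in the summary
unconstrained_range_bn = (150.0, 200.0)  # Gavin's estimated unconstrained run rate

headline_multiple = revenue_multiple(valuation_bn, reported_arr_bn)
# The lower revenue estimate gives the higher multiple, and vice versa.
upper_multiple = revenue_multiple(valuation_bn, unconstrained_range_bn[0])
lower_multiple = revenue_multiple(valuation_bn, unconstrained_range_bn[1])

print(f"Headline multiple: {headline_multiple:.1f}x")
print(f"Unconstrained multiple: {lower_multiple:.1f}x to {upper_multiple:.1f}x")
```

On these inputs the headline multiple is eighteen times and the unconstrained range brackets five times, which is where the "closer to five times" framing comes from.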

Anthropic and OpenAI are also structurally very different. Anthropic has burned around eighty percent less cash than OpenAI to reach a comparable revenue scale. That implies very different long term returns on invested capital, though OpenAI has done a better job locking in compute and Sarah Friar is one of the most exceptional CFOs Gavin has worked with.

Why neither lab is raising at a three trillion dollar valuation

The answer Gavin gives is that both labs are deliberately leaving valuation on the table the way Elon has done for two decades. SpaceX compounded at a low thirties percent annual rate for a decade because Elon never pushed price. The result is a permanent superpower of access to capital. Investors trust him because they have made money with him for twenty years. That is a moat that compounds with every round.

Anthropic could probably raise at a one hundred percent premium to its rumored latest mark. They are choosing not to. In an uncertain world (Ukraine, Russia, Iran, Taiwan), preserving the ability to raise more capital later at fair prices is more valuable than maximizing this round.

Watts and wafers, the two real constraints

Capitalism is solving the watts problem. The leading PE infrastructure investors now say zoning and political approval, not chips or energy, are the gating factors. Companies are deferring big capex announcements until after the US midterms. Turbine capacity is being doubled at the manufacturers. Companies like Boom Supersonic are repurposing jet engines for grid use. Watts probably ease meaningfully in 2027 and 2028 and then orbital compute does the rest.

Wafers are the harder problem because they live in Taiwan, run on handshakes, and depend on a corporate culture that does not respond to public market incentives. TSMC is essentially the GDP, water consumption, and electricity consumption of Taiwan. Its leadership treats the company as the legacy of Morris Chang. The Silicon Shield doctrine is real and internal.

Orbital compute as racks in space

The biggest mental update Gavin asks listeners to make is to stop picturing data centers in space as Pentagon sized space stations. A Blackwell rack is three thousand pounds and roughly the size of a refrigerator. SpaceX has shown a concept satellite of about that size. Solar wings extend five hundred feet to each side and the radiator extends hundreds of feet behind, both possible because the orbit is sun synchronous and the orientation is fixed relative to the sun.

SpaceX engineers Gavin has spoken to at Starbase express genuine confidence that they have solved cooling at these power levels. They have. Starlink V3 satellites already operate at twenty kilowatts. A Blackwell rack is one hundred kilowatts. The same company operates the world’s largest satellite fleet and the world’s largest data center on Earth via xAI Colossus. The racks are connected to each other with lasers traveling through vacuum, technology already deployed in every Starlink. The naysayers, Gavin observes, are armchair skeptics and Larry Ellison’s response (he is out there landing rockets, no one else is) is the right frame.

Terafab in Texas and the threat to TSMC’s discipline

Terafab, the SpaceX and Tesla joint venture, intends to be the largest fab in the world. The partnership with Intel grants access to fifty years of foundry institutional knowledge, allowing Terafab to start three to five quarters behind the leading node rather than fifteen years behind. The A teams at the semicap equipment companies (ASML, KLA, Lam Research, Applied Materials) will follow Elon’s reputation in hardware engineering the same way they followed TSMC twenty years ago when Intel stumbled.

The talent strategy is the part most observers underestimate. Recruit the best engineers globally, then import their families, their restaurants, their staff. Build Taiwan Town, Japan Town, and Korea Town next to the fab. Optimize the human experience for the people whose work matters. Intel and Samsung do not think that way.

Bubble watch and the year 2000 comparison

Every foundational technology in modern history has had a bubble. Railroads, canals, the internet. Carlota Perez documented why. Markets correctly identify the importance, diversity of opinion collapses, supply gets ahead of demand, the bubble crashes. The current cycle has two important differences. The buildout is overwhelmingly funded out of operating cash flow, not debt. Every GPU is running at one hundred percent utilization, while at the peak of the fiber bubble ninety nine percent of fiber was unused.

TSMC discipline is the single largest reason a bubble has not formed. If Jensen could buy everything TSMC could theoretically make, Nvidia could sell two to three trillion dollars of GPUs in 2026 and 2027. At some point that becomes more than the market can absorb. If Intel or Samsung Foundry catches up at the leading node, the other will too. TSMC’s pricing discipline collapses and the bubble starts.

The Pareto frontier and the loss of Google’s cost advantage

The most important chart in AI is the Pareto frontier of model intelligence versus per token cost. Nine months ago, Google’s TPU based models dominated every point on it. OpenAI, Anthropic, and xAI sat inside the frontier. Today the frontier is dominated by Anthropic and OpenAI, with Grok 4.3 on the frontier and Gemini 3.1 hanging on by subsidization more than economics. The most likely cause is Google’s conservative TPU V8 design, an attempt to reduce dependence on Broadcom and Nvidia that sacrificed per token economics.
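
For readers who want the definition made concrete, a model sits on the Pareto frontier when no other model is both at least as cheap and at least as capable. The sketch below illustrates that rule only: the model names, prices, and scores are invented placeholders, not measurements of any real system.

```python
from typing import NamedTuple

class Model(NamedTuple):
    name: str
    cost_per_mtok: float  # hypothetical dollars per million tokens (lower is better)
    score: float          # hypothetical intelligence score (higher is better)

def pareto_frontier(models: list[Model]) -> list[Model]:
    """Return the models that no other model dominates, sorted by cost.

    A model is dominated when another one is at least as cheap and at least
    as capable, and strictly better on at least one of the two axes.
    """
    frontier = []
    for m in models:
        dominated = any(
            o.cost_per_mtok <= m.cost_per_mtok
            and o.score >= m.score
            and (o.cost_per_mtok < m.cost_per_mtok or o.score > m.score)
            for o in models
        )
        if not dominated:
            frontier.append(m)
    return sorted(frontier, key=lambda m: m.cost_per_mtok)

# Placeholder data for illustration only.
models = [
    Model("model-a", 1.0, 50.0),
    Model("model-b", 2.0, 70.0),
    Model("model-c", 3.0, 65.0),  # costs more than model-b and scores lower
    Model("model-d", 5.0, 90.0),
    Model("model-e", 6.0, 85.0),  # costs more than model-d and scores lower
]
frontier_names = [m.name for m in pareto_frontier(models)]
print(frontier_names)
```

A model "inside the frontier", in the language used above, is one that this filter drops: some competitor offers the same or better capability at the same or lower price.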

The bitter lesson, frontier tokens, and continual learning

Three open questions dominate AI investing. The first is whether Richard Sutton’s bitter lesson (more compute beats human algorithmic cleverness) gets violated by ASI itself optimizing for efficiency. Closer observers of AI are more skeptical of a violation. Gavin thinks ASI’s first move will be to make itself more efficient and more resourced, which is technically a temporary violation.

The second is whether frontier tokens keep capturing the overwhelming share of economic value at the model layer. Today they do, surprisingly. Gemini 3.1 Pro was mind blowing nine months ago and is intolerable today. The third is when continual learning arrives. Today’s models need a million fire touches to learn what a human learns from one. True continual learning would mean dynamic weight updates in real time and would produce a fast takeoff.

From all you can eat to usage based AI pricing

AI is shifting from flat fee plans to usage based pricing. The historical analogy is cellular and long distance. Both stopped being great growth industries when they went all you can eat. AI just made the opposite move. The consequence is that flat fee subscribers, even on premium consumer plans, get a rate limited and token throttled version of the frontier model. Enterprise plans with usage based billing are now required to evaluate true capability. Gavin thinks the combination of new compute coming online and usage based pricing is what gets OpenAI and Anthropic past two hundred billion in combined ARR this year.

Chip startups, prefill decode disaggregation, and Cerebras

Trying to build a better GPU is the wrong move. The four scaled players (Nvidia, AMD, Trainium, TPU) have copy capability for any one to three percent share design that looks attractive. The good news for startups is that disaggregated inference (separating prefill and decode) opens a richer design canvas. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently. Andrew Fox’s analogy is a British naval ship of the eighteenth century. Prefill is loading the cannon. Decode is firing it.

Cerebras is the model. Wafer scale computing is genuinely different and genuinely hard. It took three generations of chips to get right. Andrew Feldman and his team had the grit to keep going through chip one being a failure. The design has a high ratio of on chip compute and memory relative to shoreline IO, which is why Cerebras is now experimenting with putting an optical wafer on top of the compute wafer to solve scale out.

GPU useful lives and the rescue of private credit

One of the strongest claims in the conversation is that disaggregated inference will stretch GPU useful lives to ten or fifteen years. The skeptical narrative (GPUs are obsolete in two years, companies are cooking their depreciation books) is wrong. You can put a Cerebras system or Groq LPU in front of older Hopper or Ampere parts, use them only for prefill, and run them until they physically melt. Private credit, which is in pain from SaaS loans and which underwrote GPU loans on three to four year lives, may be saved by this.

If GPU financing rates can come down from low sevens to five or six percent, the mathematics of the AI buildout improves materially. That is a structural tailwind that compounds for years.
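
A rough way to see why this matters is the standard level payment (annuity) formula. The sketch below assumes a fully amortizing loan whose term matches the GPU's useful life, and it picks one hypothetical point from each range quoted in the conversation; real GPU financing structures may look quite different.

```python
def annual_payment(principal: float, rate: float, years: int) -> float:
    """Level annual payment that fully amortizes a loan (standard annuity formula)."""
    if rate == 0:
        return principal / years
    return principal * rate / (1.0 - (1.0 + rate) ** -years)

# Hypothetical inputs, per $100 of GPUs financed.
short_life = annual_payment(100.0, 0.0725, 4)  # "low sevens" rate, four year life
long_life = annual_payment(100.0, 0.055, 10)   # five to six percent, ten year life

print(f"4 years at 7.25%: ${short_life:.2f} per year")
print(f"10 years at 5.5%: ${long_life:.2f} per year")
```

Under these assumed inputs the annual payment falls by more than half, with most of the drop coming from the longer term rather than the lower rate.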

The application layer, the token path, and a new prisoner’s dilemma

Trillions of dollars of value have been destroyed at the application layer, not created. Cursor and Cognition are the rare scaled exceptions, and they got there by focusing on coding very early. As Amjad Masad noted, coding is plausibly the shortest path to ASI because a coding agent can write itself into any new domain. Jamin Ball’s frame is that the new venture filter is whether the company is in the token path. Databricks is. Most application layer startups are not.

Jensen could probably get close to the frontier with Nemotron whenever he wants, and the strategic question of whether to do that is a new prisoner’s dilemma. If every frontier lab agrees not to release best models via API, Chinese open source falls steadily behind. If anyone defects, the defector gains revenue and resources, and everyone else has to defect. The same dynamic exists between TSMC, Intel, and Samsung. If Nvidia or AMD ever truly used an alternative foundry, that foundry would catch up rapidly.

Rating the hyperscalers

Google has the largest compute installed base, the YouTube data that matters in a robotics world, and a search business that prints. Their loss of TPU cost leadership is the surprise of the year. If Google I/O in five days does not produce a leapfrog model, the Nvidia centric narrative gets even stronger.

Meta deserves real credit. Zuckerberg made Meta AI first internally faster than any other internet giant, paid up for the talent contracts when no one else would, and shipped Musa as a first model from MSL that is close to the Pareto frontier. Amazon is well positioned on Trainium, robotics in retail, and a Nova model line that is better than it gets credit for. Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Copilot rather than reselling to OpenAI is courageous and probably correct, even at the cost of stock price.

The most interesting cross hyperscaler metric is startup engagement. Nvidia and Amazon engage deeply with startups. Google is next. Broadcom is the favored ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement, which Gavin believes will cost them as the best teams now sit at startups.

Personal safety, geopolitics, and the Pax Americana case

The closing section turns darker. Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion via something that looks exactly like your child calling on FaceTime is already feasible. Political violence against AI leaders is a real concern. Geopolitically, Ukraine is winning largely because it has the best battlefield AI outside America and Israel. How adversaries respond to that asymmetry is the next great variable.

Gavin’s optimistic frame is the Pax Americana. After 1945 the US had a nuclear monopoly and could have controlled the world. Instead it rebuilt Germany and Japan, both of which became the most reliable American allies for the next eighty years. If AI dominance plays out similarly, this is a generationally positive story rather than a destabilizing one. The personal anecdote that closes the conversation is a friend whose daughter was diagnosed with a rare genetic condition. He spun up agents, identified a drug already on the market that addresses her mutation, and her life is immeasurably different because of AI. That is the upside.

Thoughts

The Anthropic eleven billion in a month framing is the kind of stat that resets priors. The right way to interpret it is not as a one off but as a measure of how fast value can compound when the underlying technology improves on a curve steeper than the ability of the rest of the economy to absorb it. The skeptical question is whether that ARR is durable or whether it is heavily tied to a customer base of other AI companies that are themselves on a single venture funded year of runway. The bullish answer is that frontier coding, frontier research, and frontier enterprise tasks are not going to stop being valuable, and Anthropic is the best at all three. Both can be true. The number is still extraordinary.

The argument that TSMC discipline is the only thing preventing a bubble is the analytically tightest part of the conversation. The implied trade is to watch TSMC capacity additions like a hawk and to be more, not less, cautious if Intel Foundry or Samsung Foundry ever announce real share at the leading node. The Terafab thesis is more speculative but more interesting. If Elon’s talent recruiting playbook works and the Intel partnership gives Terafab a real seat at the table within five years, the geometry of the global semiconductor industry shifts in a way that is bullish for American manufacturing, bullish for power and water infrastructure in Texas, and ambiguous for TSMC itself.

The Pareto frontier discussion deserves more attention than it usually gets. Pricing leadership in AI is not a vanity metric. It determines who can subsidize free tier usage, who can absorb compute shortages, who can ship cheaper enterprise plans, and ultimately whose model becomes the default for any given workload. Google losing per token leadership in nine months is one of the most under analyzed events in the sector and it explains a lot about why Anthropic and OpenAI are growing the way they are. If Google I/O does not produce a leapfrog model, the implied verdict on TPU V8 design choices gets a lot harsher.

The application layer destruction point is worth sitting with. Founders building on top of frontier models are competing in a world where the model itself moves faster than any moat they can build, where the model lab can absorb their niche if it gets interesting, and where the only protection is either deep token path integration or a niche so small the lab does not bother. That is a much harsher venture environment than the early SaaS era. The compensating opportunity is that one human can now run a hundred agents, so the ceiling on what a small team can build is correspondingly higher. The bet is that productivity per founder rises faster than competitive pressure from the labs. We will find out.

The orbital compute pitch is the section that will polarize listeners. The naive read is that this is science fiction. The closer read is that every component (sun synchronous orbit, laser interconnect, twenty kilowatt satellite buses, ten thousand satellite manufacturing cadence, full rocket reusability) already exists. The remaining engineering problems are repair, maintenance, and radiator scale, all of which are real but tractable on a five to ten year horizon. The strategic implication is that the political and zoning ceiling on terrestrial data centers becomes less binding if orbital compute is a credible alternative for inference workloads. The investor implication is that being short the watts and cooling complex on a five year horizon is a real trade, not a meme.

Watch the full conversation here.
May 20, 2026
Krishna Rao on Anthropic Going From 9 Billion to 30 Billion ARR in One Quarter and the Compute Strategy Powering Claude
Krishna Rao, Chief Financial Officer of Anthropic, sat down with Patrick O’Shaughnessy on Invest Like the Best for one of the most detailed public looks yet at the operating engine behind Claude. He covers how Anthropic compounded from $9 billion of run rate revenue at the start of the year to north of $30 billion by the end of Q1, why he spends 30 to 40 percent of his time on compute, the playbook for buying gigawatts of AI infrastructure across Trainium, TPU, and GPU platforms, how Anthropic prices its models, why returns to frontier intelligence keep climbing, and what the Mythos release tells us about the cyber capabilities of the next generation of Claude.

TLDW

Anthropic is running the most compute fungible frontier lab in the world, with active deployments across AWS Trainium, Google TPU, and Nvidia GPU, and an internal orchestration layer that lets a chip serve inference in the morning and run reinforcement learning the same evening. Krishna Rao explains the cone of uncertainty that governs gigawatt scale compute procurement, the floor Anthropic refuses to drop below on model development compute, the Jevons paradox unlock from cutting Opus pricing, the 500 percent annualized net dollar retention from enterprise customers, the layer cake of long term deals with Google, Broadcom, Amazon, and the recent xAI Colossus tie up in Memphis, the phased release of the Mythos model in response to spiking cyber capabilities, the internal use of Claude Code to produce statutory financial statements and run a Monthly Financial Review skill, and why the team believes scaling laws are alive and well. The interview also covers fundraising history through Series D and Series E, the $75 billion already raised plus another $50 billion coming, talent density beating talent mass during the Meta poaching wave, and Rao’s belief that biotech and drug discovery represent the most exciting frontier for AI.

Key Takeaways
- Anthropic entered the year with about $9 billion of run rate revenue and ended the first quarter with north of $30 billion of run rate revenue, a more than 3x leap driven by model intelligence gains and the products built around them.
- Compute is described as the lifeblood of the company, the canvas everything else is built on, and the most consequential class of decisions Rao makes. Buy too much and you go bankrupt. Buy too little and you cannot serve customers or stay at the frontier.
- Rao spends 30 to 40 percent of his time on compute, even today, and the leadership team meets repeatedly on both procurement and ongoing compute allocation.
- Anthropic is the only frontier language lab actively using all three major chip platforms in production: AWS Trainium, Google TPU, and Nvidia GPU. It is also the only major model available on all three clouds.
- Flexibility is the central design principle. Anthropic builds flexibility into the deals themselves, into the orchestration layer that maps workloads to chips, and into compilers built from the chip level up.
- The cone of uncertainty frames procurement. Small differences in weekly or monthly growth compound into wildly different two year outcomes, so the team plans across a range of scenarios rather than a single point estimate, and ranges toward the upper end while protecting downside.
- Compute allocation across the company sits in three buckets: model development and research, internal employee acceleration, and external customer serving. A non negotiable floor protects model development even when customer demand is tight.
- Anthropic estimates that if it cut off internal employee use of its own models, the freed compute could serve billions of dollars of additional revenue. It chooses not to, because internal use compounds into better future models.
- Intelligence is multi dimensional, not a single IQ score. Anthropic measures real world capability through customer feedback, long horizon task performance, tool use, computer use, and speed at agentic tasks, not just leaderboard benchmarks that have largely saturated.
- Each Opus generation, 4 to 4.5 to 4.6 to 4.7, delivers both capability improvements and an efficiency multiplier on token processing. New models often serve customers at a fraction of the prior cost while doing more.
- Reinforcement learning is described as inference inside a sandbox with a reward function, so model efficiency gains directly improve internal RL throughput. The flywheel is tightly coupled.
- Over 90 percent of code at Anthropic is now written by Claude Code, and a large share of Claude Code itself is written by Claude Code.
- Anthropic shipped roughly 30 distinct product and feature releases in January and the pace has accelerated since.
- Scaling laws, in Anthropic’s internal data, are alive and well. The team holds itself to a skeptical scientific standard and still does not see them slowing down.
- Anthropic recently signed a 5 gigawatt deal with Google and Broadcom for TPUs starting in 2027, plus an Amazon Trainium agreement for up to 5 gigawatts, totaling more than $100 billion in commitments. A significant portion lands this year and next year.
- A new partnership for capacity at the xAI Colossus facility in Memphis was announced just before the interview, aimed at expanding consumer and prosumer capacity.
- Pricing has been remarkably stable across Haiku, Sonnet, and Opus. The biggest deliberate change was lowering Opus pricing, which produced a textbook Jevons paradox: consumption rose far faster than the price drop, and the new Opus 4.6 and 4.7 slot in at the same price point.
- Mythos is the first model Anthropic chose to release in a phased way because of a sharp spike in cyber capability. In an open source codebase where a prior model found 22 security vulnerabilities, Mythos found roughly 250.
- The Mythos release framework focuses on defensive use first, expands access over time, and is presented as a template for future capability spikes.
- Anthropic now sells to 9 of the Fortune 10 and reports net dollar retention above 500 percent on an annualized basis. These are not pilots. Rao describes signing two double digit million dollar commitments during a 20 minute Uber ride to the studio.
- The platform strategy is mostly horizontal. Anthropic will go vertical with offerings like Claude for Financial Services, Claude for Life Sciences, and Claude Security where it can demonstrate the model’s capabilities, but expects most application value to accrue to customers building on top.
- Anthropic has raised over $75 billion in equity since Rao joined, with another $50 billion in commitments tied to the Amazon and Google deals. Capital intensity is real, but the raises fund the upper end of the cone of uncertainty more than they fund current losses.
- The Series E close coincided with the day the DeepSeek news broke, forcing investors to reassess their AI thesis in real time. Anthropic closed the round anyway.
- Inside finance, Claude now produces statutory financial statements for every Anthropic legal entity, with a human checker. A library of more than 70 finance specific skills underpins workflows.
- A custom Monthly Financial Review skill produces a 90 to 95 percent ready monthly close report, so leadership discussion shifts from reconciling numbers to debating implications.
- An internal real time analytics platform called Anthrop Stats compresses weekly insight cycles from hours to about 30 minutes.
- The biggest token user inside Anthropic’s finance team is the head of tax, focused on tax policy engines and workflow automation. The most senior people, not the youngest, are leading internal adoption.
- Talent density beats talent mass. When Meta and others ran aggressive offer waves, Anthropic lost two people while peer labs lost dozens.
- All seven Anthropic co founders remain at the company, as do most of the first 20 to 30 employees, which Rao credits to a collaborative, transparent, debate friendly culture and a real culture interview that can veto otherwise top tier candidates.
- Dario Amodei holds an open all hands every two weeks, writes a short prepared document, and takes unscripted questions from anyone at the company.
- AI safety investments in interpretability and alignment have a commercial side effect. Looking inside the model helps Anthropic build better models, and enterprises running sensitive workloads want to trust the lab they hand customer data to.
- Anthropic explicitly identifies as America first in its approach to model development, and engages closely with the US administration on capability releases such as Mythos.
- The longer term product vision is the virtual collaborator: an agent with organizational context, access to the company’s tools, persistent memory, and the ability to work on ideas, not just tasks, over long horizons.
- CoWork, Anthropic’s extension of the Claude Code paradigm into general knowledge work, is being adopted faster than Claude Code itself when indexed to the same point in its launch curve.
- Anthropic’s product teams ship daily, with a fleet of agents working across the company on specific tasks. Everyone effectively becomes a manager of agents.
- The dominant downside risks to Anthropic’s high end forecast are slower customer diffusion of model capability into real workflows, scaling laws flattening unexpectedly, and Anthropic losing its position at the frontier.
- Rao is most excited about biotech and healthcare outcomes, especially the prospect that AI could push drug discovery and lab throughput up 10x or 100x, turning currently incurable diagnoses into treatable ones within a patient’s lifetime.
Detailed Summary

Compute as Lifeblood and the Cone of Uncertainty

Rao opens with the claim that compute is the most important resource at Anthropic, and the most consequential decision class in the company. You cannot buy a gigawatt of compute next week. You have to anticipate demand a year or two in advance, and the cost of being wrong in either direction is high. Buy too much and the unit economics collapse. Buy too little and you cannot serve customers or stay at the frontier, which are described as the same failure mode. To navigate this, the team uses a cone of uncertainty rather than point estimates. Small differences in weekly growth compound into vastly different two year outcomes, and Anthropic tries to position itself toward the upper end of that cone while preserving optionality. Rao notes he has had to consciously break a lifetime of linear thinking and force himself into exponential models.
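To make the compounding point concrete, here is a minimal sketch of how a cone of uncertainty widens over two years. The weekly growth rates are hypothetical placeholders chosen for illustration, not figures from the interview, and the function names are invented for this example.

```python
# Hypothetical weekly growth rates; none of these figures come from the interview.
WEEKS = 104  # roughly two years


def compound(weekly_rate: float, weeks: int = WEEKS) -> float:
    """Return the growth multiple after compounding a weekly rate."""
    return (1.0 + weekly_rate) ** weeks


def cone(low: float, mid: float, high: float, weeks: int = WEEKS) -> dict[str, float]:
    """Return the low, mid, and high growth multiples that bound the cone."""
    return {
        "low": compound(low, weeks),
        "mid": compound(mid, weeks),
        "high": compound(high, weeks),
    }


if __name__ == "__main__":
    # Assumed scenario: 2%, 3%, and 4% growth per week.
    for name, multiple in cone(0.02, 0.03, 0.04).items():
        print(f"{name}: {multiple:.1f}x after {WEEKS} weeks")
```

Under these assumed rates, a gap of two percentage points per week separates roughly 8x growth from roughly 59x growth, which is why a single point estimate is of little use for compute planning.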

Three Chip Platforms, One Orchestration Layer

Anthropic uses Amazon’s Trainium, Google’s TPUs, and Nvidia’s GPUs fungibly. That was not free. Adopting TPUs at scale started around the third TPU generation, when outside observers thought it was a strange choice. Anthropic invested years into compilers and orchestration so workloads can flow across chips by generation and by job type. The team works deeply with Annapurna Labs at AWS to influence Trainium roadmaps because Anthropic stresses these chips harder than almost anyone. The result is what Rao believes is the most efficient utilization of compute across any frontier lab, with a dollar of compute going further inside Anthropic than anywhere else.

Three Buckets and the Model Development Floor

Compute gets allocated across model development, internal acceleration of employees, and customer serving. The conversations are collaborative rather than zero sum, but there is a hard floor on model development that the company refuses to cross even if it makes customer demand harder to serve in the short term. The thesis is simple. The returns to frontier intelligence are extremely high, especially in enterprise, so cutting model investment to chase near term revenue is a bad trade. Internal employee use is also explicitly protected. Rao notes that diverting that internal usage to external customers would unlock billions of additional revenue today, but the compounding benefit of accelerating researchers and engineers outweighs that.

Intelligence Is Multi Dimensional

Rao pushes back hard on the IQ framing of model progress. Benchmarks saturate quickly, and the real signal comes from how customers actually use the models. Anthropic looks at long horizon task completion, tool use, computer use, and time to result on agentic tasks. Two equally capable agents that differ only in speed produce dramatically different value, because the faster one compounds into more attempts and more outcomes. Frontier model leaps are also fuel efficient. The sedan to sports car analogy, in which more performance costs more fuel, breaks down because each Opus generation, 4 to 4.5 to 4.6 to 4.7, delivers both a step up in capability and a multiplier on per token efficiency.

From 9 Billion to 30 Billion ARR in One Quarter

The headline number for the quarter is a leap from about $9 billion of run rate revenue to over $30 billion, accomplished without onboarding a corresponding step up in compute, because new compute lands on ramps locked in 12 months prior. Rao attributes the leap to model capability gains, products that surface that intelligence in usable form factors, and an enterprise customer base that pulls more workloads onto Claude as each generation unlocks new use cases. Coding started the wave with Sonnet 3.5 and 3.6, and the same pattern is now playing out elsewhere in the economy.

Recursive Self Improvement and Talent Density

Over 90 percent of Anthropic’s code is now written by Claude Code, including most of Claude Code itself. Rao describes this as a structural reason to keep allocating internal compute to employees even when external demand is hungry. Recursive self improvement is not happening through models that need no humans. It is happening through researchers who set direction and use frontier models to compress months of work into days. Talent density beats talent mass. When Meta and other labs went after Anthropic researchers with very large packages, Anthropic lost two people while peer labs lost dozens.

Procurement Strategy and the Layer Cake

Compute lands as a layer cake. Last month Anthropic signed a 5 gigawatt TPU deal with Google and Broadcom starting in 2027, alongside an Amazon Trainium agreement for up to 5 gigawatts. The total is north of $100 billion in commitments. A new tie up with xAI’s Colossus facility in Memphis was announced just before the interview, intended for nearer term capacity to support consumer and prosumer growth. Anthropic evaluates near term and long term compute deals against the same set of variables: price, duration, location, chip type, and how efficiently the team can run it. The relationships are deeper than procurement. The hyperscalers are also distribution channels for the model.

Platform First, Selective Vertical Bets

Rao describes Anthropic as a platform first business, with most expected value accruing to customers building on the platform. The team will only go vertical when it can either demonstrate capabilities that are skating to where the puck is going, like Claude Code did before the models could fully support it, or when it wants to set a template for an industry vertical, as with Claude for Financial Services, Claude for Life Sciences, and Claude Security. He acknowledges that surprise capability jumps make customers anxious about the platform competing with them, and frames Anthropic’s mitigation as deeper partnerships, early access programs, and an emphasis on accelerating customer building rather than disintermediating it.

Pricing, Jevons Paradox, and Return on Compute

Pricing across Haiku, Sonnet, and Opus has been stable. The notable exception is Opus, which Anthropic deliberately repriced lower when launching Opus 4.5 because Opus class problems were being squeezed into Sonnet workloads. Efficiency gains made it possible to serve Opus profitably at the new level. The consumption response was a classic Jevons paradox, with usage rising far more than the price reduction would have predicted, and Opus 4.6 then slotted in at the same price with a capability bump. Margins are not framed as a per token markup. Compute is fungible across model development, internal acceleration, and customer serving, so Anthropic measures return on the entire compute envelope rather than software style variable cost per call.
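The arithmetic behind that consumption response can be sketched in a few lines. The price cut and usage growth below are hypothetical inputs, not Anthropic's actual numbers, and the helper names are invented for this illustration. The point is only that when usage grows by more than the break-even amount, total spend rises even though the unit price fell.

```python
def revenue_change(price_cut: float, usage_growth: float) -> float:
    """Fractional change in revenue after a price cut and a usage increase.

    price_cut and usage_growth are fractions, e.g. 0.5 for a 50% cut
    and 2.0 for a 200% increase in usage.
    """
    return (1.0 - price_cut) * (1.0 + usage_growth) - 1.0


def break_even_usage_growth(price_cut: float) -> float:
    """Usage growth needed for revenue to stay flat after a price cut."""
    if not 0.0 <= price_cut < 1.0:
        raise ValueError("price_cut must be in [0, 1)")
    return 1.0 / (1.0 - price_cut) - 1.0


if __name__ == "__main__":
    cut = 0.5     # assumed: price halves
    growth = 2.0  # assumed: usage triples
    print(f"break-even usage growth: {break_even_usage_growth(cut):.0%}")
    print(f"revenue change: {revenue_change(cut, growth):+.0%}")
```

In this assumed scenario a 50 percent price cut needs usage to double just to hold revenue flat, so a tripling of usage leaves revenue 50 percent higher.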

Fundraising, DeepSeek, and Capital Intensity

Rao joined while Anthropic was closing its Series D, mid frontier model launch and during the FTX share liquidation. Investors initially questioned whether Anthropic needed a frontier model, whether AI safety and a real business could coexist, and why the sales team was so small. The Series E closed the same day the DeepSeek news broke, with markets violently re pricing AI in real time. Since Rao joined, Anthropic has raised over $75 billion, with another $50 billion tied to the Amazon and Google compute deals. The reason for the size of the raises is the cone of uncertainty, not current losses. Returns on compute today are described as robust.

Mythos, Cyber Capability, and Phased Releases

The Mythos release marks the first time Anthropic shipped a model under a deliberately phased rollout because of a specific capability spike. Cyber is the dimension that spiked. Where a prior model found 22 vulnerabilities in an open source codebase, Mythos found roughly 250. The defensive applications, automatically patching massive codebases, are genuinely valuable, but the offensive risk is real enough that Anthropic chose to release to a smaller group first and expand access over time. Rao positions this as a template for future capability spikes, not a permanent restriction. He also describes the relationship with the US administration as cooperative, including the Department of War interaction, with Anthropic supporting a regulatory framework that does not strangle innovation but takes responsibility seriously.

Claude Inside Finance

Anthropic’s finance team is one of the strongest internal case studies. Statutory financial statements for every legal entity are produced by Claude, with a human reviewer. A skill library of more than 70 finance specific skills underpins a Monthly Financial Review skill that drafts the monthly close at 90 to 95 percent ready, so leadership meetings shift from explaining the numbers to discussing what to do about them. An internal analytics platform called Anthrop Stats compresses weekly insight cycles from hours to 30 minutes. The biggest internal token user in finance is the head of tax, building policy engines, which Rao highlights as evidence that adoption is driven by the most senior people, not just younger engineers.

Culture, Co Founders, and the Race to the Top

Seven co founders should not, on paper, work as a leadership group. Rao argues it works because the culture was set early around collaboration, intellectual honesty, transparency, and humility. The culture interview is a real veto, not a checkbox. Dario Amodei runs an all hands every two weeks with a short written piece followed by unscripted questions, and decisions, once made, get clean alignment rather than residual politics. Anthropic frames its approach as a race to the top, where being a model for how to build the technology responsibly is itself a recruiting and retention advantage.

The Virtual Collaborator and the Frontier Ahead

The product vision Rao describes is the virtual collaborator. Not just a smarter chatbot, but an agent with organizational context, access to the company’s tools, memory, and the ability to work on ideas over long horizons. Coding was the first domain to feel this, but CoWork, Anthropic’s extension of the Claude Code pattern into general knowledge work, is being adopted faster than Claude Code was at the same age. Product development inside Anthropic already looks different. Teams ship daily, with fleets of agents working across the company, and individual humans increasingly act as managers of those fleets.

Downside Risks and What Excites Him Most

The three risks Rao names if asked to do a premortem on a softer year are slower customer diffusion of model capability into real workflows, scaling laws unexpectedly flattening, and Anthropic losing its frontier position to competitors. None of these are observed today, but he is unwilling to rule them out with certainty. On the upside, he is most excited about biotech and healthcare. Lab throughput rising 10x or 100x, paired with AI assisted clinical workflows, could turn currently incurable diagnoses into treatable ones within a patient’s lifetime. That is the outcome he wants the technology to chase.

Thoughts

The most consequential structural point in this interview is the framing of compute as a single fungible resource pool measured by return on the entire envelope, not as a variable cost per inference call. That accounting shift, if you accept it, breaks most of the bear cases about AI lab unit economics. The bear argument almost always assumes that a token served to a customer is the only thing the chip did that day. Rao’s version is that the same fleet trains models in the morning, runs reinforcement learning at lunch, serves customers in the afternoon, and accelerates internal engineers in the evening. If even half of that is real, the right comparison is total compute spend versus total enterprise value created by the platform, and on that ratio Anthropic looks structurally strong rather than weak.

The Jevons paradox on Opus pricing is the most actionable insight for anyone running an AI product. Most teams default to either chasing premium pricing on the newest model or undercutting to chase volume. Anthropic did something more disciplined: it left Sonnet and Haiku alone, dropped Opus when efficiency gains made it serveable, and watched aggregate usage rise faster than the price cut. The lesson is that frontier model pricing is not really a price problem. It is a capability access problem, and elasticity around the right tier is much higher than the standard SaaS playbook implies.

The Mythos cyber jump deserves more attention than it has gotten. Going from 22 to 250 vulnerabilities found in the same codebase is the kind of capability discontinuity that genuinely changes the regulatory calculus. Anthropic is signaling that it can identify these discontinuities ahead of release and choose a deployment shape that respects them. Whether peer labs adopt similar discipline is the open question. Anthropic’s race to the top framing assumes they will be forced to. The competitive market may say otherwise.

The hiring data point is the most underrated investor signal. Two departures while peer labs lost dozens, during the most aggressive talent war in tech history, is not a culture poster. It is a structural advantage that compounds every time another lab tries to buy its way to the frontier. Money can be matched. Conviction in the mission, transparent leadership, and a culture interview that can veto otherwise stellar candidates cannot. If you believe scaling laws hold, talent retention at this density is one of the few moats that actually scales with capital.

Finally, the most interesting personal admission is that Krishna Rao, a finance leader trained at Blackstone and Cedar, is openly telling investors that linear thinking is the failure mode he had to break out of. The companies that pattern match this moment to prior technology waves are mispricing it, in both directions. The cone of uncertainty Anthropic uses internally is the right metaphor for everyone else too. If you are forecasting AI as if it is cloud in 2010, you are almost certainly wrong, and the magnitude of the error is much larger than it would be in any prior era.

Watch the full conversation with Krishna Rao on Invest Like the Best here.
May 13, 2026
Inside Microsoft’s AGI Masterplan: Satya Nadella Reveals the 50-Year Bet That Will Redefine Computing, Capital, and Control
1) Fairwater 2 is live at unprecedented scale, with Fairwater 4 linking over a one-petabit AI WAN

Nadella walks through the new Fairwater 2 site and states Microsoft has targeted a 10x training capacity increase every 18 to 24 months relative to GPT-5’s compute. He also notes Fairwater 4 will connect on a one petabit network, enabling multi-site aggregation for frontier training, data generation, and inference.

2) Microsoft’s MAI program, a parallel superintelligence effort alongside OpenAI

Microsoft is standing up its own frontier lab and will “continue to drop” models in the open, with an omni-model on the roadmap and high-profile hires joining Mustafa Suleyman. This is a clear signal that Microsoft intends to compete at the top tier while still leveraging OpenAI models in products.

3) Clarification on IP: Microsoft says it has full access to the GPT family’s IP

Nadella says Microsoft has access to all of OpenAI’s model IP (consumer hardware excluded) and adds that the firms co-developed system-level designs for supercomputers. This resolves long-standing ambiguity about who holds rights to GPT-class systems.

4) New exclusivity boundaries: OpenAI’s API is Azure-exclusive, SaaS can run elsewhere with limited exceptions

The interview spells out that OpenAI’s platform API must run on Azure. ChatGPT as SaaS can be hosted elsewhere only under specific carve-outs, for example certain US government cases.

5) Per-agent future for Microsoft’s business model

Nadella describes a shift where companies provision Windows 365 style computers for autonomous agents. Licensing and provisioning evolve from per-user to per-user plus per-agent, with identity, security, storage, and observability provided as the substrate.

6) The 2024–2025 capacity “pause” explained

Nadella confirms Microsoft paused or dropped some leases in the second half of last year to avoid lock-in to a single accelerator generation, keep the fleet fungible across GB200, GB300, and future parts, and balance training with global serving to match monetization.

7) Concrete scaling cadence disclosure

The 10x training capacity target every 18 to 24 months is stated on the record while touring Fairwater 2. This implies the next frontier runs will be roughly an order of magnitude above GPT-5 compute.
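As a rough sketch of what that cadence implies, the snippet below converts "10x every 18 to 24 months" into an equivalent yearly growth factor and projects capacity forward from a baseline normalized to 1. The function names are invented for this example, and the assumption of smooth exponential growth between upgrades is a simplification, since real capacity arrives in steps.

```python
def annual_factor(multiple: float, months: float) -> float:
    """Equivalent yearly growth factor for `multiple`x growth every `months` months."""
    return multiple ** (12.0 / months)


def projected_capacity(baseline: float, multiple: float, months: float,
                       horizon_months: float) -> float:
    """Capacity after `horizon_months`, assuming smooth exponential growth."""
    return baseline * multiple ** (horizon_months / months)


if __name__ == "__main__":
    for cadence in (18, 24):
        yearly = annual_factor(10, cadence)
        in_three_years = projected_capacity(1.0, 10, cadence, 36)
        print(f"10x every {cadence} months: {yearly:.2f}x per year, "
              f"{in_three_years:.0f}x the baseline after 36 months")
```

On these assumptions the stated range works out to somewhere between about 3.2x and 4.6x per year.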

8) Multi-model, multi-supplier posture

Microsoft will keep using OpenAI models in products for years, build MAI models in parallel, and integrate other frontier models where product quality or cost warrants it.

Why these points matter
- Industrial scale: Fairwater’s disclosed networking and capacity targets set a new bar for AI factories and imply rapid model scaling.
- Strategic independence: MAI plus GPT IP access gives Microsoft a dual track that reduces single-partner risk.
- Ecosystem control: Azure exclusivity for OpenAI’s API consolidates platform power at the infrastructure layer.
- New revenue primitives: Per-agent provisioning reframes Microsoft’s core metrics and pricing.
Pull quotes

“We’ve tried to 10x the training capacity every 18 to 24 months.”

“The API is Azure-exclusive. The SaaS business can run anywhere, with a few exceptions.”

“We have access to the GPT family’s IP.”

TL;DW
- Microsoft is building a global network of AI super-datacenters (Fairwater 2 and beyond) designed for fast upgrade cycles and cross-region training at petabit scale.
- Strategy spans three layers: infrastructure, models, and application scaffolding, so Microsoft creates value regardless of which model wins.
- AI economics shift margins, so Microsoft blends subscriptions with metered consumption and focuses on tokens per dollar per watt.
- Future includes autonomous agents that get provisioned like users with identity, security, storage, and observability.
- Trust and sovereignty are central. Microsoft leans into compliant, sovereign cloud footprints to win globally.
Detailed Summary

1) Fairwater 2: AI Superfactory

Microsoft’s Fairwater 2 is presented as the most powerful AI datacenter yet, packing hundreds of thousands of GB200 and GB300 accelerators, tied by a petabit AI WAN and designed to stitch training jobs across buildings and regions. The key lesson: keep the fleet fungible and avoid overbuilding for a single hardware generation, because power density and cooling requirements change with each new wave, such as Vera Rubin and Rubin Ultra.

2) The Three-Layer Strategy
- Infrastructure: Azure’s hyperscale footprint, tuned for training, data generation, and inference, with strict flexibility across model architectures.
- Models: Access to OpenAI’s GPT family for seven years plus Microsoft’s own MAI roadmap for text, image, and audio, moving toward an omni-model.
- Application Scaffolding: Copilots and agent frameworks like GitHub’s Agent HQ and Mission Control that orchestrate many agents on real repos and workflows.
This layered approach lets Microsoft compete whether the value accrues to models, tooling, or infrastructure.

3) Business Models and Margins

AI raises COGS relative to classic SaaS, so pricing blends entitlements with consumption tiers. GitHub Copilot helped catalyze a multibillion-dollar market in a year, even as rivals emerged. Microsoft aims to ride a market that is expanding 10x rather than clinging to legacy share. Efficiency focus: tokens per dollar per watt through software optimization as much as hardware.
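The interview does not define the efficiency metric precisely, so the sketch below takes one plausible reading: tokens served, divided by dollars spent, divided by average power drawn. The `FleetSample` type, its field names, and all the numbers are assumptions made for illustration, not Microsoft's accounting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleetSample:
    """One measurement window for a hypothetical serving fleet."""
    tokens: float           # tokens generated in the window
    cost_usd: float         # total cost attributed to the window
    avg_power_watts: float  # average power drawn during the window


def tokens_per_dollar_per_watt(sample: FleetSample) -> float:
    """One possible reading of the metric: tokens / (dollars * watts)."""
    if sample.cost_usd <= 0 or sample.avg_power_watts <= 0:
        raise ValueError("cost and power must be positive")
    return sample.tokens / (sample.cost_usd * sample.avg_power_watts)


if __name__ == "__main__":
    # Assumed numbers: same hardware, cost, and power, but a software
    # optimization doubles throughput.
    baseline = FleetSample(tokens=1e9, cost_usd=1_000, avg_power_watts=1e6)
    optimized = FleetSample(tokens=2e9, cost_usd=1_000, avg_power_watts=1e6)
    gain = tokens_per_dollar_per_watt(optimized) / tokens_per_dollar_per_watt(baseline)
    print(f"efficiency gain from software alone: {gain:.1f}x")
```

Under this reading, a software change that doubles throughput on unchanged hardware doubles the metric, which is consistent with the emphasis on software optimization as much as hardware.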

4) Copilot, GitHub, and Agent Control Planes

GitHub becomes the control plane for multi-agent development. Agent HQ and Mission Control aim to let teams launch, steer, and observe multiple agents working in branches, with repo-native primitives for issues, actions, and reviews.

5) Models vs Scaffolding

Nadella argues model monopolies are checked by open source and substitution. Durable value sits in the scaffolding layer that brings context, data liquidity, compliance, and deep tool knowledge, exemplified by Excel Agent that understands formulas and artifacts beyond screen pixels.

6) Rise of Autonomous Agents

Two worlds emerge: human-in-the-loop Copilots and fully autonomous agents. Microsoft plans to provision agents with computers, identity, security, storage, and observability, evolving end-user software into an infrastructure business for agents as well as people.

7) MAI: Microsoft’s In-House Frontier Effort

Microsoft is assembling a top-tier lab led by Mustafa Suleyman and veterans from DeepMind and Google. Early MAI models show progress in multimodal arenas. The plan is to combine OpenAI access with independent research and product-optimized models for latency and cost.

8) Capex and Industrial Transformation

Capex has surged. Microsoft frames this era as capital intensive and knowledge intensive. Software scheduling, workload placement, and continual throughput improvements are essential to maximize returns on a fleet that upgrades every 18 to 24 months.

9) The Lease Pause and Flexibility

Microsoft paused some leases to avoid single-generation lock-in and to prevent over-reliance on a small number of mega-customers. The portfolio favors global diversity, regulatory alignment, balanced training and inference, and location choices that respect sovereignty and latency needs.

10) Chips and Systems

Custom silicon like Maia will scale in lockstep with Microsoft’s own models and OpenAI collaboration, while Nvidia remains central. The bar for any new accelerator is total fleet TCO, not just raw performance, and system design is co-evolved with model needs.

11) Sovereign AI and Trust

Nations want AI benefits with continuity and control. Microsoft’s approach combines sovereign cloud patterns, data residency, confidential computing, and compliance so countries can adopt leading AI while managing concentration risk. Nadella emphasizes trust in American technology and institutions as a decisive global advantage.

Key Takeaways
1. Build for flexibility: Datacenters, pricing, and software are optimized for fast evolution and multi-model support.
2. Three-layer stack wins: Infrastructure, models, and scaffolding compound each other and hedge against shifts in where value accrues.
3. Agents are the next platform: Provisioned like users with identity and observability, agents will demand a new kind of enterprise infrastructure.
4. Efficiency is king: Tokens per dollar per watt drives margins more than any single chip choice.
5. Trust and sovereignty matter: Compliance and credible guarantees are strategic differentiators in a bipolar world.
November 12, 2025
The Benefits of Bubbles: Why the AI Boom’s Madness Is Humanity’s Shortcut to Progress
TL;DR:

Ben Thompson’s “The Benefits of Bubbles” argues that financial manias like today’s AI boom, while destined to burst, play a crucial role in accelerating innovation and infrastructure. Drawing on Carlota Perez and the newer work of Byrne Hobart and Tobias Huber, Thompson contends that bubbles aren’t just speculative excess—they’re coordination mechanisms that align capital, talent, and belief around transformative technologies. Even when they collapse, the lasting payoff is progress.

Summary

Ben Thompson revisits the classic question: are bubbles inherently bad? His answer is nuanced. Yes, bubbles pop. But they also build. Thompson situates the current AI explosion—OpenAI’s trillion-dollar commitments and hyperscaler spending sprees—within the historical pattern described by Carlota Perez in Technological Revolutions and Financial Capital. Perez’s thesis: every major technological revolution begins with an “Installation Phase” fueled by speculation and waste. The bubble funds infrastructure that outlasts its financiers, paving the way for a “Deployment Phase” where society reaps the benefits.

Thompson extends this logic using Byrne Hobart and Tobias Huber’s concept of “Inflection Bubbles,” which he contrasts with destructive “Mean-Reversion Bubbles” like subprime mortgages. Inflection bubbles occur when investors bet that the future will be radically different, not just marginally improved. The dot-com bubble, for instance, built the Internet’s cognitive and physical backbone—from fiber networks to AJAX-driven interactivity—that enabled the next two decades of growth.

Applied to AI, Thompson sees similar dynamics. The bubble is creating massive investment in GPUs, fabs, and—most importantly—power generation. Unlike chips, which decay quickly, energy infrastructure lasts decades and underpins future innovation. Microsoft, Amazon, and others are already building gigawatts of new capacity, potentially spurring a long-overdue resurgence in energy growth. This, Thompson suggests, may become the “railroads and power plants” of the AI age.

He also highlights AI’s “cognitive capacity payoff.” As everyone from startups to Chinese labs works on AI, knowledge diffusion is near-instantaneous, driving rapid iteration. Investment bubbles fund parallel experimentation—new chip architectures, lithography startups, and fundamental rethinks of computing models. Even failures accelerate collective learning. Hobart and Huber call this “parallelized innovation”: bubbles compress decades of progress into a few intense years through shared belief and FOMO-driven coordination.

Thompson concludes with a warning against stagnation. He contrasts the AI mania with the risk-aversion of the 2010s, when Big Tech calcified and innovation slowed. Bubbles, for all their chaos, restore the “spiritual energy” of creation—a willingness to take irrational risks for something new. While the AI boom will eventually deflate, its benefits, like power infrastructure and new computing paradigms, may endure for generations.

Key Takeaways
- Bubbles are essential accelerators. They fund infrastructure and innovation that rational markets never would.
- Carlota Perez’s “Installation Phase” framework explains how speculative capital lays the groundwork for future growth.
- Inflection bubbles drive paradigm shifts. They aren’t about small improvements—they bet on orders-of-magnitude change.
- The AI bubble is building the real economy. Fabs, power plants, and chip ecosystems are long-term assets disguised as mania.
- Cognitive capacity grows in parallel. When everyone builds simultaneously, progress compounds across fields.
- FOMO has a purpose. Speculative energy coordinates capital and creativity at scale.
- Stagnation is the alternative. Without bubbles, societies drift toward safety, bureaucracy, and creative paralysis.
- The true payoff of AI may be infrastructure. Power generation, not GPUs, could be the era’s lasting legacy.
- Belief drives progress. Mania is a social technology for collective imagination.
1-Sentence Summary:

Ben Thompson argues that the AI boom is a classic “inflection bubble” — a burst of coordinated mania that wastes money in the short term but builds the physical and intellectual foundations of the next technological age.
November 6, 2025
Microsoft Unveils Majorana 1: A Quantum Leap in Computing
Introduction

Microsoft has introduced Majorana 1, the world’s first quantum chip utilizing a groundbreaking Topological Core architecture. This innovation, built on the newly developed topoconductor material, aims to accelerate the realization of scalable, industrial-grade quantum computing, transforming problem-solving capabilities in fields ranging from materials science to artificial intelligence.

Topoconductors: The Foundation of Majorana 1

The Majorana 1 chip leverages a revolutionary material class, topoconductors, to enable more reliable and scalable qubits, the fundamental units of quantum computation. This breakthrough positions Microsoft to lead the quantum computing industry towards achieving a million-qubit system within years rather than decades. By integrating error-resistant properties at the hardware level, the Majorana 1 ensures greater qubit stability, a crucial factor for scaling quantum operations.

Scalability and Real-World Applications

Unlike current quantum architectures, which require fine-tuned analog control, Microsoft’s approach employs digital control for qubits, simplifying quantum computations and reducing hardware constraints. This architecture enables the integration of a million qubits on a single chip, unlocking solutions to some of the most complex industrial and environmental challenges, such as:
- Microplastic Breakdown: Quantum calculations could facilitate the development of catalysts capable of breaking down plastics into harmless byproducts.
- Self-Healing Materials: Engineering materials that can autonomously repair structural damage in construction and manufacturing.
- Advanced Enzyme Engineering: Enhancing agricultural productivity and healthcare by designing more efficient biological catalysts.
- Corrosion Prevention: Analyzing material interactions at the atomic level to create corrosion-resistant structures.
Microsoft’s Quantum Roadmap and DARPA Collaboration

Recognizing the potential of Majorana 1, the Defense Advanced Research Projects Agency (DARPA) has selected Microsoft as one of two companies progressing to the final stage of its US2QC program. This initiative aims to accelerate the development of utility-scale, fault-tolerant quantum computers capable of commercial impact.

Precision Measurement and Digital Control

A key challenge in quantum computing is qubit instability due to environmental perturbations. Microsoft has overcome this hurdle with a pioneering measurement approach that enables digital qubit control, making quantum systems easier to manage and scale. This precise measurement technique distinguishes between one billion and one billion and one electrons, ensuring the accuracy needed for advanced computations.

Engineering Breakthrough: Atom-By-Atom Material Design

Majorana 1 is built on a meticulously engineered materials stack comprising indium arsenide and aluminum. Microsoft designed and fabricated this stack atom by atom to create the necessary topological state for stable qubits. This breakthrough is pivotal in overcoming the scalability limitations of traditional quantum computing approaches.

Integration with AI and Cloud Computing

Quantum computing’s synergy with artificial intelligence will redefine problem-solving across industries. Microsoft’s Azure Quantum platform provides enterprises with early access to quantum capabilities, enabling AI-driven insights and innovation. The combination of quantum computing and AI will revolutionize material science, drug discovery, and sustainable technology development.

Microsoft’s Majorana 1 chip marks a paradigm shift in quantum computing, paving the way for practical, large-scale quantum applications. With its topologically protected qubits, digital control systems, and scalable architecture, Majorana 1 is set to drive the next frontier of computational advancements. As quantum computing progresses towards commercial viability, industries worldwide stand to benefit from solutions that were previously unattainable with classical computing methods.
February 19, 2025