PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: NVIDIA

The Next 3 Years of AI, According to Steve Jurvetson: Moore’s Law, Superintelligence Odds, Elon Musk’s Operating Principles, and Where the Legendary SpaceX and Tesla Investor Is Betting Next
Steve Jurvetson has spent 30 years funding the future before it was a category: an early check into SpaceX when space was not a venture sector, Tesla before electric cars were taken seriously, and now a portfolio spanning fusion, analog AI chips, and epigenetic editing at his firm Future Ventures. In this fireside chat he lays out what the next three years of AI actually look like, the three principles he has learned from working alongside Elon Musk for nearly three decades, the question he uses to separate missionary founders from opportunists, and why he thinks alignment of frontier AI systems may simply not be possible.

TLDW

Jurvetson argues the 130-year exponential in compute per dollar (Ray Kurzweil’s abstraction of Moore’s Law from his book The Age of Spiritual Machines) will keep running for at least three more years, carried by analog and custom AI silicon, and that this compounding is what makes startups and disruption possible at all. His gut says the next big leap will be “architecturally variant”: a new generation of labs going back to DeepMind’s founding premise of reinforcement learning, continuous learning, and novelty-seeking goal functions rather than bigger LLMs. He relays Anthropic co-founder Jack Clark’s 30 percent odds of superintelligence within a year but notes the crucial missing piece is that humans still set every goal. Adoption will be wildly uneven: anything made of atoms (cars, robots) switches over glacially, while creative work and white-collar categories like call centers (roughly 1 percent of US GDP) flip almost instantly. From Musk he draws three lessons: insane focus and saying no, maniacal attention to the cycle time of learning loops (Tesla gathers more AI training data every 4 days than Waymo has in its entire history), and being a magnet for talent by selling a grander mission. He explains Future Ventures’ current bets (fusion, free diagnostics via phone, slaughter-free meat, epigenetic editing, critical minerals, analog in-memory compute), tells solo founders their 30-day plan is to find a co-founder, predicts a turbulent transition to abundance, doubts Neuralink can keep pace with AI, dismisses Penrose’s quantum consciousness argument, and frames the post-work question with Man’s Search for Meaning: humans need symbolic immortality, not just employment.

Thoughts

The most load-bearing claim in this conversation is not about scaling laws, it is about architecture. Jurvetson is telling you where the smart contrarian money is looking: away from ever-larger language models and back toward reinforcement learning agents with continuous learning and self-generated goals, the original DeepMind thesis that got shelved when LLMs took off. His framing of the open problem is unusually precise. The recursive self-improvement loops everyone is excited about are real, but every one of them is still human-directed. The goal-setting layer, what he calls the selection pressure of the evolutionary algorithm, is the “thin veneer of activity” AI does not yet do, and it happens to be the layer where superintelligence either does or does not arrive. That is a much sharper way to track AGI progress than benchmark scores: watch who cracks autonomous goal formation, not who tops a leaderboard.

Almost everything else Jurvetson says reduces to a single metric: the cycle time of the learning loop. It is his explanation for Musk’s edge (launch cadence, the Tesla fleet as a data-collection machine), his filter for which industries flip fast (bits iterate at machine speed, atoms are stuck with 11-to-12-year car replacement cycles and FDA timelines), and even his bear case on Neuralink, which he has invested in. Biology cannot iterate at synthetic speed, so the substrate that learns fastest wins. Once you see the pattern, it becomes a genuinely useful lens for evaluating any company, career, or technology: ask how fast the loop spins, not how impressive the current artifact is.

The aside that deserves the most attention is his flat statement that mechanistic interpretability will not bear fruit and that control and alignment of a cutting-edge system are not possible. His reasoning is structural, not rhetorical: anything produced by an iterative algorithm run billions of times (evolution, neural network training) is inherently inscrutable, and it will always be easier to build a new intelligence than to reverse engineer one you already made. He swaps “teenager” for “AI” whenever he thinks about control, which is funny until you notice he is one of the most connected investors in the Musk orbit saying the safety agenda rests on a false premise. Setting that next to the 30 percent superintelligence odds he cites from Jack Clark produces an uncomfortable arithmetic that nobody on stage follows to its conclusion.

For builders, the practical gold is the 50-year question. Ask a founder what their business looks like in 50 years: the opportunist laughs at the question, the missionary is relieved someone finally asked. Paired with his other filters (if only two out of ten people think your idea is crazy it is not bold enough, and a good business is one that could not have been started three years ago), it doubles as a hiring screen and a self-diagnostic. And his 30-day plan for a solo founder is refreshingly unglamorous: do not build the MVP, do not pitch investors, go persuade one person to give up their job and join you. If you cannot recruit a co-founder, that is the market’s first answer about your idea.

Key Takeaways
- Jurvetson invested early in SpaceX and Tesla precisely because space and automotive were not venture categories at all; a software-centric systems engineering approach applied to a sleepy industry that has not changed in decades unlocks enormous value, and that playbook is now rippling through every industry.
- The Kurzweil curve plots 130 years of compute per dollar across five substrates (mechanical, relay, vacuum tube, discrete transistor, integrated circuit) and shows a 10,000 billion billion X improvement; Jurvetson calls it the most important thing ever graphed.
- Customers buy compute capacity and memory, not transistors, and both have been “on rails” for 130 years; the default prediction for the next three years is simply that the curve keeps going.
- When an incumbent declares Moore’s Law dead, it usually signals they are losing their business to someone new, as Intel was to Nvidia 15 years ago.
- Analog chips and customized AI silicon that do discrete matrix multiply-and-add extremely efficiently will carry the mantle of Moore’s Law over the next three years.
- Without exponential technological change there would be no startups: if business is predictable, the big get bigger and incumbents block new entrants; disruption is almost always computationally based.
- Over the next three years AI ripples through energy, agriculture, and construction: three enormous industries that are growing as a percentage of GDP and are the least digitized on the planet, with healthcare close behind.
- His gut says the next driver will be architecturally variant, possibly subsuming today’s models the way mixture of experts subsumes other architectures or massively parallel diffusion models reinterpret the transformer.
- A whole new generation of neural labs is returning to the founding premise of DeepMind: reinforcement learning with continuous learning, let loose on the internet’s data sets, hunting for the algorithm that bootstraps intelligence.
- The open question for these systems is the goal function: what plays the role of evolutionary selection pressure? Candidates include understanding the universe (the xAI mission) or a novelty-seeking algorithm that uses new discoveries as its measure of progress.
- Jack Clark, co-founder of Anthropic, gives roughly 30 percent odds that superintelligence arrives within a year; Jurvetson declines to put odds on it himself and admits “I do not know” is the honest answer.
- Today’s self-improving AI loops (automated verification, hyperparameter adjustment between training runs, AI-mediated experimentation) are real but still human-directed; goal setting remains the thin veneer AI does not do, and it may be the most important layer.
- Human intelligence was bootstrapped on top of reactive limbic systems and emotional centers with cortex layered on top; it is an open philosophical question whether AI systems need to recapitulate that functional specialization to take on purpose and meaning.
- Anything involving atoms switches over slowly: fully autonomous vehicles are inevitable (every car, train, and airplane), but people keep cars 11 to 12 years, so the physical swap-out cycle makes the transition feel glacial.
- Physical robotics faces the same constraint: making a billion robots takes time even with recursive manufacturing techniques.
- The domains that flip like wildfire are the ones we held as uniquely human: creative arts, moviemaking, and imagery came first, which Jurvetson finds somewhat shocking.
- Call centers represent roughly 1 percent of US GDP and can switch over almost entirely and almost instantly; white-collar work generally has no physical swap-out cycle to slow it down.
- People will increasingly prefer AI to human interactions when the AI is better: studies of physician bedside manner and customer service already show AIs doing a better job with emotional connection than humans.
- Musk principle one is an insane ability to focus: running many companies forces ruthless prioritization, and he says no to anything that is not mission-critical right now, including a Craig Venter brainstorm on terraforming Mars because “none of this stuff on Mars matters” until Starship flies.
- Musk principle two, the most important: maniacal focus on the cycle time of innovation, the core learning loop, whether launch cadence or fleet data; Tesla cameras gather more AI training data every 4 days than Waymo has collected in its entire history, because every vehicle collects data whether or not the customer paid for full self-driving.
- Musk principle three: being a magnet for talent, screening for mastery by drilling into engineering crises a candidate actually solved rather than leaning on credentials (which are often an albatross), and framing the company as something grander (sustainable energy, multi-planetary humanity, understanding the universe) so the best people want to join.
- Jurvetson filters founders with one question: what does your business look like in 50 years? Opportunists chuckle at the absurdity; missionaries are relieved and finally tell you what has been driving them all along. He passes on the ones who laugh.
- The best startups hold two things in tension simultaneously: an audacious 50-to-500-year vision and a concrete plan to iterate with real customers over the next three years, chaining backward from the future to what must be built now.
- The perpetual surprise of great companies is expanding option value: autonomous driving was nowhere in Tesla’s founding plan, and Starlink, direct-to-cell, and orbital data centers were not on SpaceX’s dance card even five years ago. Exploring the option space beats purposeful ten-year planning.
- Future Ventures invests in things unlike anything they have seen before yet adjacent to what they know, ideally companies that are literally one of a kind.
- Current bets include nuclear fusion and subcritical fusion that avoids NRC regulation, because energy is the third bottleneck for AI after talent and compute.
- Other 500-year-problem bets: free healthcare via a cell phone (all diagnostics as a free global service, probably launching outside the US to bypass FDA and insurance), slaughter-free meat via cellular agriculture and mycelium, and construction, where labor productivity has been flat for 30 years.
- Recent investments span epigenetic editing (the software of biology rather than the firmware of the genome, applied to crops, pesticides, and human health), critical minerals from deep sea mining to copper refining, and reshoring US industrial capacity.
- Three separate analog AI chip investments approach the same goal from different angles, including Mythic’s in-memory compute doing 8-bit multiplication in a single transistor, each chasing 100X and then another 100X reduction in power per calculation.
- The portfolio is roughly 40 percent life sciences and 60 percent IT, deliberately hunting the weird edge cases that fall through the cracks of traditional pharma VC: organs grown for transplant, a male birth control pill, dramatically improved IVF.
- Old industries with no new entrants are the best targets: the four largest tunnel boring companies competing with the Boring Company were all started in the 1800s.
- The 30-day plan for a single person with an idea: find a co-founder. Great startups tend to have a dynamic duo at the founding (Jobs and Wozniak, Sergey Brin and Larry Page, Larry Ellison and Bob Miner), and persuading one person to quit their job for your mission is the first real test of the idea.
- A founding pair with diverse backgrounds and mutual respect sets the culture for everyone hired afterward and creates cognitive diversity that ripples through the whole firm.
- Calibrate boldness by the crazy ratio: if 100 percent of people say your idea is crazy, take the feedback; nine out of ten is pretty good; if only two out of ten think it is crazy, it is not bold enough. Also ask whether the business could have been started three years ago; if yes, that is a bad sign.
- Co-founders most often meet at universities, one of the few places where people cross academic disciplines; breakthrough innovation happens at the interstices between formally discrete fields, and LLMs are exceptionally good at exactly that cross-domain translation, opening a fountainhead of idea discovery.
- Roughly 19 percent of global employment involves driving vehicles, and that work is going away, just more slowly than people imagine.
- Humans have a fundamental desire for symbolic immortality: contributing something that outlasts our brief time here, whether children, books, philanthropy, or companies. Accumulated cultural knowledge, not biology, is the primary vector of human evolutionary progress.
- There is no peaceful path from full employment to no employment: passing through 30, 40, 50 percent unemployment will be turbulent, and no politicians are taking a long-term perspective on it.
- On Neuralink (which he invested in): expanding the sensory periphery is very doable (higher data rates, restoring hearing and spinal function, seeing more wavelengths), but upgrading core intelligence requires reverse engineering an inscrutable iterated system, and biology’s FDA-and-wetware timescales cannot keep up with synthetic learning loops.
- Any product of an iterative algorithm run billions of times (evolution, neural networks, genetic programming) is inherently inscrutable; Jurvetson doubts mechanistic interpretability will bear fruit and does not think control or alignment of a cutting-edge AI system is possible, likening it to mind-controlling a teenager.
- On Penrose’s quantum consciousness argument: there is no clear mechanism and no evidence of quantum processes in the brain, and arguments that consciousness requires our specific substrate are uncompelling; machines may one day have consciousness, just not necessarily human consciousness, the same way computer memory is real memory without being human memory.

Detailed Summary

Betting on Sectors That Do Not Exist Yet

Asked what he saw in SpaceX that other investors missed, Jurvetson flips the question: there were almost no investors even considering space, just as automotive and nuclear energy were not venture sectors. The bet was on Elon Musk, whom he has known for 29 years and backed across all his companies (“and his cousins, too”), and on a thesis that has since crystallized: a software-centric systems engineering approach applied to a sleepy industry that has not changed in decades unlocks extraordinary value. Aerospace and automotive proved it, and the same conversion of industrial low-margin businesses into information businesses is now playing out across the economy.

The 130-Year Compute Curve and the Next 3 Years

Jurvetson polls the room on Kurzweil’s famous graph, first published around 1999, and finds only a quarter have seen what he calls the most important thing ever graphed: five successive technology substrates delivering a 10,000 billion billion X improvement in the computation a dollar buys, sustained over 130 years. Moore’s Law is just the most recent refraction of a longer, almost cosmological trend that transcends the dramas of individual companies. His baseline prediction for the next three years is that the curve keeps going, carried by analog chips and custom AI silicon optimized for matrix math, and he notes that when a company like Intel declares the end of Moore’s Law, it usually means they are losing to someone new, as they did to Nvidia. The deeper point: exponential technological change is the precondition for startups existing at all, because predictable business favors incumbents. AI is the most intense crucible of compute-centric innovation yet, and over the next three years it flows into energy, agriculture, construction, and healthcare, the largest and least digitized sectors.
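
To make the scale of that curve concrete, here is a back-of-the-envelope sketch in Python. It takes the two figures quoted above (a 10,000 billion billion X improvement over 130 years) and assumes, purely for illustration, that the growth rate was constant the whole time, which the real curve need not be. The function names are our own, not anything from the talk.

```python
import math

# Figures quoted in the summary above.
IMPROVEMENT_FACTOR = 10_000 * 10**9 * 10**9  # "10,000 billion billion X" = 10**22
YEARS = 130


def implied_doubling_time(factor: float, years: float) -> float:
    """Average years per doubling, ASSUMING a constant exponential rate."""
    doublings = math.log2(factor)
    return years / doublings


def projected_multiple(factor: float, years: float, horizon_years: float) -> float:
    """Gain over `horizon_years` if the same average rate simply continues."""
    return factor ** (horizon_years / years)


doubling_time = implied_doubling_time(IMPROVEMENT_FACTOR, YEARS)
three_year_gain = projected_multiple(IMPROVEMENT_FACTOR, YEARS, 3)

print(f"Total doublings: {math.log2(IMPROVEMENT_FACTOR):.1f}")
print(f"Average doubling time: {doubling_time:.2f} years")
print(f"Implied gain over the next 3 years: {three_year_gain:.1f}X")
```

Under that constant-rate assumption the numbers work out to roughly 73 doublings, or one doubling about every 1.8 years, and about a 3.2X gain in compute per dollar over the next three years if the curve simply keeps going.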

Architecturally Variant: The Return of Reinforcement Learning

Pressed on what technology drives the next wave (better LLMs, world models, robotics), Jurvetson shares a gut feeling he stresses he has not yet invested in: something architecturally variant that may subsume today’s models. He points to a new generation of neural labs returning to DeepMind’s founding premise, reinforcement learning, which was set aside when LLMs took off. The open design problem is the goal function: what is the multi-decade agentic drive, the selection pressure, the definition of success beyond reproductive fitness? He floats understanding the universe (the Grok and xAI framing) and novelty-seeking algorithms that treat new discoveries as progress. The question these labs chase is whether a single reinforcement learning algorithm with continuous learning, let loose on the internet’s data, could bootstrap intelligence. He adds a caution about today’s chatbots: we ascribe consciousness and meaning where there is none. “There’s no light on inside,” at least for now.

Superintelligence Odds and the Missing Goal-Setting Layer

On whether self-directed, goal-setting AI arrives within three years, Jurvetson cites Jack Clark of Anthropic giving 30 percent odds of superintelligence next year, which he finds fun mostly because at least someone put a stake in the ground. The recursive self-improvement debate is live, but he insists on a distinction: the huge improvements in the current self-improving loop (automated verification, hyperparameter tuning between runs, AI-mediated experimentation) are all still directed by humans. Goal setting remains human, and while that may be only a thin veneer of remaining activity, it is arguably the most important part, and nobody is sure how the transition happens. It may require recapitulating the brain’s functional specialization, the limbic-then-cortex layering that produced our bootstrapped consciousness. His honest answer: he does not know and does not even have odds, because three years out is genuinely hard to predict.

Atoms Move Slowly, Bits Sweep Like Wildfire

The gap between what the technology can do and how we use it is governed by physics and replacement cycles. Fully autonomous vehicles are, to him, obviously inevitable for everything that moves on Earth, yet cars stay on the road 11 to 12 years, so the switchover feels glacial; a billion robots likewise take time to manufacture. What flips fast is the world of bits, and strangely it started with what we considered most human: creative arts, movies, and images. White-collar work follows because there is no physical swap-out cycle: call centers, about 1 percent of US GDP, can convert almost overnight. And people will increasingly prefer the AI when it is better, showing more emotional understanding and better reading of the situation, something already visible in comparisons of physician bedside manner and customer service quality.
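
The replacement-cycle point reduces to simple arithmetic. The sketch below is a toy model, not something presented in the talk: it assumes a steady-state fleet with evenly spread vehicle ages, treats the 11-to-12-year figure as the full vehicle lifetime, and assumes every new car sold from day one is autonomous, which is the most optimistic case. The function name and parameters are hypothetical.

```python
def autonomous_fleet_share(
    years_elapsed: float,
    vehicle_lifetime_years: float,
    new_sales_autonomous_share: float = 1.0,
) -> float:
    """Fraction of the fleet that is autonomous after `years_elapsed`.

    Toy model: each year 1/lifetime of the fleet is retired and replaced,
    and `new_sales_autonomous_share` of the replacements are autonomous.
    """
    replaced_fraction = min(years_elapsed / vehicle_lifetime_years, 1.0)
    return replaced_fraction * new_sales_autonomous_share


for lifetime in (11, 12):
    share = autonomous_fleet_share(3, lifetime)
    print(f"{lifetime}-year lifetime: {share:.0%} of the fleet after 3 years")
```

Even in that best case, only about a quarter of the fleet has turned over after three years, which is why the physical transition feels glacial next to anything made of bits.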

Three Principles from Working with Elon Musk

Jurvetson opens with humility (even Maye Musk cannot explain how Elon became Elon, and the books piling up on his bedside table may not have been written by humans), but offers three observations from close range. First, an insane ability to focus. Running multiple companies paradoxically helps: nobody questions Elon skipping a holiday party, and he says no to fascinating distractions, including Jurvetson’s attempt to connect him with Craig Venter to brainstorm terraforming Mars with gene sequencers. Musk’s answer: none of it matters until Starship flies. Second, and even more important, a maniacal focus on the cycle time of innovation: how fast the core learning loop runs, whether launch cadence or fleet learning. The Tesla data flywheel is the exemplar: every car collects training data whether or not the owner paid for FSD, so Tesla gathers more data every 4 days than Waymo has in its history. Third, a well-honed talent stack: pattern recognition that ignores credentials (often an albatross), drills candidates on the engineering crises they actually navigated to test for real mastery, and wraps the company in a mission grand enough (sustainable energy, multi-planetary life, understanding the universe) that the best people want in, which compounds because great people attract great people.

The 50-Year Question and Expanding Option Value

How do founders stay true to a mission when 99 percent of the world says it is too early? Jurvetson admits selection bias: for 30 years he has tried to back only people with a sincere, almost messianic mission rather than arbitrage-seeking opportunists. His filter is to ask what the business looks like in 50 years. Opportunists laugh (“I’ll be on my third startup by then”); the best founders are relieved to finally unload the dream they have been hiding because “colonizing Mars is an uninvestable proposition” as a day-one pitch. The best startups pair an audacious 50-to-500-year vision with a plausible path of customer iteration over the next three years, chaining backward from the future. What still surprises him is how the option value of frontier companies keeps expanding: autonomous driving was not in Tesla’s founding plan at all, and SpaceX kept unfolding from cheap launch to Starlink to direct-to-cell to orbital data centers, none of which was on the dance card five years ago. Exploring the light cone of possibilities beats designing a ten-year plan.

Where Future Ventures Is Betting Now

The firm looks for companies unlike anything it has seen before yet adjacent to familiar ground, targeting problems that will obviously be solved 500 years from now. In energy: multiple fusion investments plus subcritical fusion that sidesteps NRC regulation, because energy is the third bottleneck for AI after people and compute. In health: free diagnostic healthcare delivered by cell phone as a global free service, likely launched outside the US to bypass FDA and reimbursement. In food: slaughter-free meat via cellular agriculture and mycelium. In construction: still looking, after trying and failing a few times in an industry where labor productivity has been flat for 30 years. Recent themes include epigenetic editing (the software of biology rather than the firmware of the genome, spanning crop health, pesticides, herbicides, and human health), critical minerals and metals from deep sea mining to copper refining as part of reshoring, and three separate analog AI chip bets, including Mythic’s in-memory compute doing 8-bit multiplication in a single transistor, each chasing successive 100X reductions in power per calculation. The mix runs about 40 percent life sciences, 60 percent IT, with a taste for the weird edge: organs grown for transplant, a male birth control pill, radically improved IVF. His favorite hunting ground is old, crappy industries with no new entrants, like tunnel boring, where the Boring Company’s four largest competitors were founded in the 1800s.

Advice for Founders: Find Your Batman and Robin

His 30-day plan for a single person with an idea is not an MVP or a pitch deck: find a co-founder. Startups tend to be founded by dynamic duos (Jobs and Wozniak, Sergey Brin and Larry Page, Larry Ellison and the lesser-known Bob Miner), and a pair with diverse backgrounds and mutual respect creates a rapid iteration loop and sets the cultural template for every future hire. Persuading one person to quit their job for your crazy idea is the first proof the mission can recruit. On calibrating craziness: if literally everyone thinks the idea is crazy, take the feedback; nine out of ten is pretty good; only two out of ten means it is not bold enough, because obvious ideas get done by others. Ask whether the business could have been started three years ago; the right answer is no. Co-founders most often meet at universities, where students (unlike professors in their stovepipes) cross-pollinate between academic disciplines, and breakthrough innovation lives at those interstices. As an aside, he notes LLMs excel at exactly this translation between domains, opening a new fountainhead of idea discovery we are only beginning to tap.

When Machines Do Everything: Meaning, Abundance, and Turbulence

Asked the closing question (when machines do everything, what is the meaning of life?), Jurvetson starts with scale: roughly 19 percent of global employment is driving vehicles, and it is going away. But humans want meaningful work, driven by what he calls a fundamental desire for symbolic immortality: children, books, philanthropy, companies named after founders, all instantiations of the urge to contribute something that outlasts us. Translating the question into humanity’s mission statement, he lands where Yuri Milner and Musk do: to understand the universe and add to accumulated knowledge, because culture, not biology, is the primary vector of human evolutionary progress. If we could hyperspace-jump to Peter Diamandis-style abundance, where everything physical costs a dollar a pound and machines do all labor, we could all be philosopher kings and artists. But he refuses to end on false comfort: there is no visible peaceful path from full employment through 30, 40, 50 percent unemployment, that transition will be turbulent, and no politicians are taking a long-term view of it.

Neuralink, Inscrutable Systems, and the Alignment Heresy

In audience Q&A, Jurvetson confirms he invested in Neuralink (the idea traces to the neural lace of Iain M. Banks’ novel Surface Detail, which he recommends) but offers a contrarian view. Working from the periphery is very promising: restoring broken function, fixing spinal cords, expanding senses, higher-bandwidth communication. Upgrading core functionality, actually making someone smarter, is another matter. His reasoning comes from decades of watching complex systems: any artifact produced by an iterative algorithm run billions of times (evolution, neural networks, genetic programming, cellular automata) is inherently inscrutable. That is why he doubts mechanistic interpretability will bear fruit and flatly does not think control and alignment are possible for a cutting-edge AI system; he mentally swaps “teenager” for “AI” whenever the control question comes up. The same inscrutability applies to the brain: it will be easier to build a new intelligence than to reverse engineer one already made, and FDA cycles plus human biology cannot iterate at the speed of synthetic learning loops, so he lacks faith Neuralink keeps up with AI. Kurzweil’s uploading dream, he suggests, is a case of wanting something to be true within one’s lifetime.

Penrose, Quantum Brains, and Machine Consciousness

On Roger Penrose’s argument that consciousness depends on quantum processes and is therefore unreachable by AI, Jurvetson is respectful of the man and dismissive of the claim: there is no clear mechanism (a speculative lithium isotope coupling aside), and it amounts to wishful thinking. Generalizing, he finds all vitalist arguments that our substrate is uniquely necessary uncompelling; you could make a better case that carbon is special to life than that neurons are essential to consciousness. His favorite reframe swaps in the word memory: computers have memory that is nothing like holographic, gracefully degrading human memory, yet nobody debates whether computer memory is real. Machines may likewise develop a kind of consciousness that is real without being human consciousness. Declaring something impossible is a much higher-order proposition than admitting ignorance, so his position is: he does not know whether the current AI path leads to consciousness, but his gut says machines will get there one day, perhaps via evolution-like reinforcement learning approaches that recapitulate what biology already proved possible.

Notable Quotes

“I have this gut feeling that it’ll be something architecturally variant. It might subsume the models that we know now.”
Steve Jurvetson, on what drives the next three years of AI

“It’s almost cosmological. Like, why has humanity’s capacity to compute compounded for 130 years?”
Steve Jurvetson, on the Kurzweil abstraction of Moore’s Law

“If business is predictable, if there isn’t disruptive technological change, the big get bigger.”
Steve Jurvetson, on why exponential compute is the precondition for startups

“The Tesla cars today in their cameras gather for their AI training set more data every 4 days than Waymo has in its entire history.”
Steve Jurvetson, on the data flywheel behind Musk’s learning-loop obsession

“If it’s like only two people think it’s crazy, that’s bad because it’s clearly not bold enough. If it’s an obvious idea, other people will do it.”
Steve Jurvetson, on calibrating how crazy a startup idea should be

“Despite attempts at mechanistic interpretability in AI, I don’t think that’s going to bear fruit.”
Steve Jurvetson, on why iterated systems are inherently inscrutable

“It’d be easier to build a new intelligence than it is to reverse engineer one you’ve made.”
Steve Jurvetson, on why he doubts Neuralink can keep pace with AI

“I think all humans have a fundamental desire for symbolic immortality, this belief that we’ve contributed something to the world that transcends our brief time on this world.”
Steve Jurvetson, on the meaning of life when machines do everything

“It’s much higher order proposition to say something is impossible than to say I don’t know.”
Steve Jurvetson, on whether AI can ever be conscious

Watch the full conversation here: The Next 3 Years of AI: Lessons from Elon Musk’s First Investor.

Related Reading
- Steve Jurvetson (Wikipedia): background on the investor behind early bets on SpaceX, Tesla, and Hotmail.
- Future Ventures: the firm Jurvetson co-founded with Maryanna Saenko, primary source for the investment theses discussed on stage.
- Accelerating change (Wikipedia): the broader idea behind Kurzweil’s 130-year compute curve and the law of accelerating returns.
- Reinforcement learning (Wikipedia): the architecture Jurvetson’s gut says produces the next breakthrough, back to DeepMind’s founding premise.
- The Pursuit of Purpose: our guide to the meaning-of-life question Jurvetson closes the conversation on.
July 9, 2026
Jonathan Ross on Groq’s $20 Billion NVIDIA Deal, Faster Inference, and Why Asking the Right Questions Wins the AI Age
Jonathan Ross, the founder of Groq and the inventor of Google’s Tensor Processing Unit (TPU), sits down with David Senra (host of the Founders podcast) to walk through Groq’s roughly $20 billion partnership with NVIDIA and the decade of near-death struggle that preceded it. You can watch the full conversation here. Ross, now a senior executive at NVIDIA following the deal, is unusually candid about being one of the world’s worst leaders when he started, about coming three weeks from running out of money, and about the single contrarian bet (that faster inference would make AI both faster and smarter) that almost everyone, including his own engineers, told him was pointless.

TLDW

Ross explains the structure of the NVIDIA deal (a call to Jensen Huang about buying 100,000 GPUs turned, in three weeks, into NVIDIA’s largest deal by nearly 3x) and why pairing Groq’s LPU with the GPU defeats the many different bottlenecks inside an LLM the way you would use both 18-wheelers and delivery vans in a logistics network. He unpacks the AlphaGo moment that revealed faster inference makes models smarter, the shift from the information age (answering questions) to the AI age (asking the right questions), and a leadership philosophy built on autonomy, one brutally clear priority (25 million tokens per second on a challenge coin), and giving people the fewest constraints so they can surprise you. He shares hard-won lessons from Jensen and NVIDIA (the least political large org he has seen, no secret one-on-ones), his concepts of reality quotient and the dominant game, return on luck and the GitHub opportunity he let his team talk him out of, intentional leadership (“I intend to do this”), the Groq bonds that traded salary for equity and saved the company, hiring for negatives instead of positives, loss bias and manufactured discontent, and a closing case for radical optimism: code is becoming free, software creation is being democratized like literacy, and education should stop teaching kids to answer questions and start teaching them to ask.

Thoughts

The technical spine of this interview is a genuinely counterintuitive claim: you can make a model smarter by making it faster. Ross’s proof is the AlphaGo anecdote, where the exact same model, ported from GPUs to his TPU, saw its Elo rating jump by hundreds of points and beat the world champion, because more compute per unit of time let it search deeper and surface moves like the famous Move 37 that were too far down the tree to find otherwise. Once you internalize that inference speed is not a convenience but a capability multiplier, the entire Groq thesis, and the logic of the NVIDIA deal, snaps into focus. The industry spent years treating fast inference as a nice-to-have. Ross treated it as the whole game, and was nearly alone in doing so for a very long time.

The most transferable material is the leadership arc, precisely because Ross is willing to say he was bad at it. His core insight is that there is no single correct way to lead, any more than there is one way to invest, and the founder’s first job is to know which way is true to them. Ross is a delegator who hires autonomous people and gives them a single, poetically compressed objective, then gets out of the way. The reason that matters is subtle: if you over-constrain the goal, your team can never surprise you with a better answer than the one you already had, which means they can never actually innovate. The Kelly Johnson line Senra offers (“extreme performance often comes from one brutally clear priority”) is the same idea from the Skunk Works side. A challenge coin that reads “25 million tokens per second” is not a slogan, it is a mechanism that lets every engineer connect their work to one dominant game.

Two ideas deserve to be lifted out and used directly. The first is intentional leadership, borrowed from David Marquet’s submarine turnaround: replace “should I do this?” with “I intend to do this.” Asking for opinions invites pessimism and hands your most timid people a veto. Declaring intent still lets someone shout “the hatch is open” when it truly matters, but it stops the reflexive no. Ross traces years of stalled progress to the simple error of asking instead of declaring. The second is his inversion of hiring: hire for negatives, not positives. Growing talent means showing people the path, so you emphasize positives. Selecting talent means screening people out, so you hunt for the disqualifying negatives, because one person’s negative trait infects the whole team. Most founders, Ross included for years, are clever enough to talk themselves into any candidate. A versioned “people spec” and a deliberate loss-averse posture are the antidote.

The Groq bonds story is the emotional center and a small masterpiece of change management. Facing a layoff list that would have killed the company (because the people slated to be cut were exactly the ones needed to make the product work at all), Ross instead asked the team to trade salary for equity, framed with World War II war-bond imagery. Eighty percent participated, half went to statutory minimum wage, and attrition actually fell. His phrase for why is “put everyone’s hands on the steering wheel.” Passengers fear a windy road; drivers feel in control. It is a reminder that morale under existential stress is often a function of agency, not comfort, and that the Phil Knight move of converting employee sacrifice into ownership is a recurring pattern in company survival stories for a reason.

Where the conversation turns almost spiritual is manufactured discontent. Ross observes that the entrepreneurs in a room of successful people were the least happy with their wealth, and that this very dissatisfaction was the fuel that kept them building. His own current discontent is stark and worth sitting with: the world does not have enough compute, and if it takes an extra year to cure cancer or slow aging because of that shortage, he considers it his fault. Whether or not you accept the moral weight he assigns himself, the mechanism is instructive. Edwin Land wrote “300 people died today” on the whiteboard while inventing anti-glare technology. A concrete, human cost attached to delay is a far more durable motivator than a revenue target. Paired with his closing optimism about code becoming free and software creation democratizing like literacy, it makes for one of the more clear-eyed and yet hopeful founder conversations in recent memory.

Key Takeaways
- The NVIDIA deal began as a request to buy about 100,000 GPUs; Jensen saw what Groq had built pairing GPUs and LPUs and decided to make it available to all NVIDIA customers, closing what Ross calls the firm’s biggest deal by nearly 3x in roughly three weeks from first call to wired money.
- GPUs and LPUs are complementary: inside an LLM’s decoder layer, the GPU is better at the compute-bound attention portion and the LPU is better at the memory-throughput-bound weights, so combining them defeats bottlenecks across the whole performance curve, like using both 18-wheelers and last-mile vans.
- As AI increasingly talks to AI, speed dominates, because agents kick off other agents and compound; a human tolerates a one-second wait, but AI is just sitting there idle.
- Agentic micropayments will make the number of payments skyrocket, but payments infrastructure is not yet built for AI operating inside an allocated budget.
- Ross prototypes cutting-edge ideas as personal hobby projects first, then brings them to work; his personalized “daily brief” evolved from long text into headlines he can interrogate with follow-up questions, like the game of 20 questions.
- The information age rewarded answering questions; the AI age rewards asking the right ones, as everyone shifts from individual contributor to leader of AI, and good leaders ask the question no one else did.
- There is no single right way to lead, just as there are many ways to invest; the founder’s job is to know themselves and pick the leadership form that is true to them (inspiration versus fear, control versus delegation).
- Ross was, by his own account, one of the world’s worst leaders at the start, which cost Groq three to four years; his fix was to define one goal simple enough to fit on a challenge coin: 25 million tokens per second.
- The fewer constraints you give a person (or an AI agent), the more freedom they have to surprise you with a better solution; over-constraining the goal makes real innovation impossible.
- Lessons from Jensen and NVIDIA: it is the least political large organization Ross has seen, Jensen never runs secret one-on-ones (tell everyone at once, copy everyone on email), and the whole strategy reduces to “what does the customer actually need?”
- Jensen manages around 60 direct reports, each smarter than him in their own domain, which he offers as the model for orchestrating AI agents that may be smarter than you.
- Asking a sharp question that makes an expert say “I didn’t think of that” is a universal founder skill (it appears in every Bezos book) and can be honed.
- Confidence, not competence, was Ross’s early bottleneck: shadowing a leader of 2,000 people, he realized he would have made the same decisions, and acting with confidence made people follow his direction without changing the decisions themselves.
- The better and more creative your people, the harder they are to manage; running 450 highly creative scientists felt more like managing 5,000.
- Reality quotient (RQ), distinct from IQ, is the ability to recognize reality and, in its extreme form, to choose the dominant game; MySpace optimized accounts signed up while Facebook optimized monthly active users and won.
- The first principle of change management is to make it feel like it is not a change; people who seem fine with change are usually anchored to something that did not change.
- Return on luck (from Jim Collins): the most successful companies do not get more lucky breaks, they seize the ones they get; Ross let his team talk him out of powering GitHub’s LLMs on Groq chips, then vowed never again.
- People adopt fast inference only when they experience it personally; an Anthropic demo three months before ChatGPT drew no reaction because the answers were not the audience’s own, and Groq later went viral off a fast-LLM video posted on X.
- Great innovators often experience a problem before others do; the future is already here, just not evenly distributed, and Ross saw fast inference’s value first because of AlphaGo.
- Intentional leadership (from David Marquet’s USS Santa Fe turnaround): say “I intend to do this” instead of asking for an opinion, which stops reflexive pessimism while still letting people flag a real problem.
- Groq bonds: three weeks from running out of money, Ross swapped a layoff for a war-bond-style salary-for-equity exchange; 80% participated, about half took statutory minimum wage, and it bought roughly two months of runway.
- “Put everyone’s hands on the steering wheel”: participation in saving the company cut attrition to under 10% during the crisis, echoing Phil Knight converting employee loans into Nike equity.
- West Coast VCs behave like lemmings (one pass triggers all passes), while East Coast VCs run independent analysis; the herd missed what became NVIDIA’s biggest deal ever, a live example of the Keynesian beauty contest.
- For the first time, top startups are not starved for cash, so putting in more money is no longer an advantage even though investors still behave as if it is.
- Hiring flip: move from hiring for positives (how you grow talent) to hiring for negatives (how you select talent), because one negative trait poisons the team; write a versioned “people spec” like a product spec.
- Loss bias (a loss feels roughly six times more painful than an equal gain) can be a hiring signal: Ross looks for people who “book the win early,” treating any missed improvement as a loss.
- Poetic design (maximum meaning in minimal expression, “every word matters”) was a positive on the people spec; its negative is maximalist, cluttered design.
- Michael Jordan manufactured pressure by taunting opponents so a loss would be humiliating, forcing superhuman performance (per his trainer Tim Grover), a deliberate version of throwing your keys over the fence.
- Manufactured discontent (David Ogilvy’s “divine discontent”): the best entrepreneurs never rest on wins; the least happy people with their wealth were the ones who kept building.
- Ross’s discontent today is the world’s lack of compute; he treats every delayed medical breakthrough as partly his responsibility, the way Edwin Land wrote a daily death count on the whiteboard while fighting headlight glare.
- Software has run on “code rationing” because code was expensive to write, enforced by “no engineers”; as the marginal cost of code approaches zero, you just implement, experience, and re-implement.
- AI democratizes software creation like the alphabet democratized literacy: Ross’s executive assistant now builds working apps, and individual founders with taste but no coding background will create valuable companies.
- Education should be revamped around asking questions and solving real community problems; if a kid can look up or prompt the answer, the assignment taught nothing, but making them ask the right questions to get AI to solve a real problem does.
Detailed Summary

The $20 Billion NVIDIA Deal and Why LPUs and GPUs Belong Together

The deal’s most striking feature is speed: the idea was first floated on a call roughly three weeks before the money was in the bank. Groq had been integrating GPUs and LPUs and went to Jensen Huang wanting to buy about 100,000 GPUs to deploy themselves. Jensen saw the combined system and decided it should be offered to all of NVIDIA’s customers. The technical logic is that processing an LLM token involves many matrix multiplies with different bottlenecks, some compute-constrained (better on the GPU, especially the attention portion) and some memory-throughput-constrained (better on the LPU, applying the trained weights). There is no single perfect architecture, so putting the two together defeats bottlenecks across the whole curve. Ross adds that as AI talks to AI, speed becomes everything, because agents spawn agents and compound exponentially.
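
One way to make the compute-constrained versus memory-throughput-constrained split concrete is a roofline-style estimate. The sketch below is illustrative only: the peak-throughput and bandwidth figures are made-up placeholders rather than specs for any GPU or LPU, the function names are hypothetical, and the two matrix shapes are generic stand-ins, not a model of any particular LLM layer.

```python
# Roofline-style sketch: is a matrix multiply limited by arithmetic or by
# memory traffic? All hardware numbers below are hypothetical placeholders.

def matmul_intensity(m: int, k: int, n: int, bytes_per_elem: int = 2) -> float:
    """FLOPs per byte moved for an (m x k) @ (k x n) matmul, ignoring caching."""
    flops = 2 * m * k * n  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)  # read A, read B, write C
    return flops / bytes_moved

def bottleneck(intensity: float, peak_flops: float, mem_bandwidth: float) -> str:
    """Compare the op's intensity with the machine's balance point."""
    ridge = peak_flops / mem_bandwidth  # FLOPs the chip can do per byte it can fetch
    return "compute-bound" if intensity >= ridge else "memory-bound"

PEAK_FLOPS = 1.0e15     # assumed: 1 PFLOP/s
MEM_BANDWIDTH = 2.0e12  # assumed: 2 TB/s, so the balance point is 500 FLOPs/byte

# One token at a time: a single activation row times a large weight matrix.
single_token = matmul_intensity(1, 8192, 8192)
# Many rows sharing the same weights: far more arithmetic per byte fetched.
large_batch = matmul_intensity(4096, 8192, 8192)

for name, value in [("single_token", single_token), ("large_batch", large_batch)]:
    print(name, round(value, 2), bottleneck(value, PEAK_FLOPS, MEM_BANDWIDTH))
```

Under these assumed numbers the single-token case does roughly one FLOP per byte and is limited by memory throughput, while the large-batch case does about 2,048 FLOPs per byte and is limited by compute. That is the general shape of the argument for pairing two architectures with different strengths.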

Asking Questions, Daily Briefs, and the Shift to Leading AI

Ross builds cutting-edge tools as personal hobby projects before bringing them to work, including a personalized “daily brief” that functions like a presidential daily brief. He redesigned it from long text into headlines he can interrogate, because interactivity, like 20 questions, distills straight to what you actually care about. This grounds one of his signature ideas: success in the information age meant answering questions, but success in the AI age means asking the right questions. As people move from individual contributors to leaders of AI, the skill that matters is the leader’s skill of asking the question everyone else missed or was afraid to raise, since the question you ask determines the output you get.

Knowing Your Leadership Style and the Challenge Coin

Ross frames leadership like investing: the first principle is simply having followers, but there are infinitely many valid styles. New founders fail by copying advice that is not true to them. Ross is a natural delegator (he has not held a driver’s license since his teens because he would rather think than control the car) who hires unusually autonomous people. Early on this backfired badly, because he entrusted people who needed direction, and he calls himself one of the world’s worst early leaders, a gap that cost Groq years. His breakthrough was distilling the mission onto a challenge coin reading “25 million tokens per second,” which let everyone connect their work to one dominant game. Ross brings in David Marquet’s Turn the Ship Around later in the conversation; the coin itself embodies Kelly Johnson’s Skunk Works principle that extreme performance comes from one brutally clear priority, plus the rule that fewer constraints give people more room to surprise you, turning a team from Superman into the Avengers.

Lessons from Jensen: Killing Politics and Serving the Customer

Working at NVIDIA taught Ross how much further he could have pushed lessons he half-learned at Groq. NVIDIA is, in his experience, the least political large organization anywhere, and a big reason is that Jensen never tells different people different things in private one-on-ones. When you address a room, everyone hears the same message; separate conversations breed side cliques. Ross’s practical rules: hold big meetings for anything you want a group to know, and copy everyone on email so no one can route politics through you. The other Jensen lesson is to stop playing 3D chess and just ask what the customer needs, tell them only what you believe and can support, and refuse to sell them something they do not need. Senra notes he has covered roughly 19 ideas from The Nvidia Way on his Founders podcast, and Jensen’s line that he already manages 60 reports smarter than him is the template for managing AI agents.

Reality Quotient, the Dominant Game, and Change Management

Groq hired for reality quotient, not just IQ, because plenty of very smart people construct elaborate stories disconnected from reality. In its extreme form, RQ is the ability to choose the dominant game, the way Facebook’s focus on monthly active users beat MySpace’s focus on accounts signed up. The founder’s job is to help everyone connect their activity to that dominant game (for Groq, tokens per second), then manage the change. Ross’s first principle of change management is to make it feel like it is not a change: nobody likes change, and people who tolerate it well are usually focused on something that stayed constant. If your team is anchored to the dominant goal, a new tactic does not feel like change; if they are anchored to a narrow task, it does.

Return on Luck, the AlphaGo Insight, and the GitHub Miss

From Jim Collins’s Great by Choice, Ross took the idea that winners seize luck better, not that they get more of it. He experienced it first-hand with AlphaGo: after a DeepMind team asked whether his TPU was as fast as rumored (he said yes, Ghostbusters-style), porting the identical model from GPUs to TPUs pushed its Elo rating from around 3,200 to roughly 3,900 and it crushed the world champion. In the terms of Daniel Kahneman’s Thinking, Fast and Slow, more compute lets the model virtually play out more moves and occasionally find a better second-best line, which is how the famous Move 37 surfaced. Faster thinking is smarter thinking. Yet Ross also let his own engineers talk him out of powering GitHub’s LLMs on Groq chips, twice, because they focused on why it could not be done rather than why it could. He eventually did the math himself, hit the numbers, and learned to stop inviting that pessimism.
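
To get a feel for what a jump of roughly 700 rating points means, the standard Elo expected-score formula can be applied to the two approximate ratings quoted above. This is a back-of-the-envelope sketch, not something from the interview: the function name is hypothetical, and treating the before and after versions as two opponents is an assumption made purely for illustration.

```python
# Standard Elo expected-score curve applied to the approximate ratings
# quoted in the summary. Pitting the two versions against each other is
# an illustrative assumption, not a claim about any actual match.

def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (win = 1, loss = 0) of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

before, after = 3200, 3900  # approximate ratings from the summary
p = elo_expected_score(after, before)
print(f"{p:.3f}")  # prints 0.983
```

On that reading, the faster version would be expected to win about 98 percent of its games against the slower one, which is why a speed-up alone can look like a step change in capability.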

Selling Speed and Intentional Leadership

Customers could not grasp fast inference until they felt it. Ross recalls an Anthropic demo three months before ChatGPT that drew no reaction, because seeing someone else’s answer appear is not magical, but getting your own question answered instantly is. So Groq simply put fast inference online, and it went viral after someone posted a video of a blazing-fast LLM on X (Ross noticed his own demo slowing in Norway because usage had skyrocketed). The deeper fix for internal resistance came from Turn the Ship Around, David Marquet’s account of turning the USS Santa Fe from worst to best in nuclear readiness by replacing command-and-control with intentional leadership. Saying “I intend to do this” rather than “should I?” stops people from reflexively supplying negative opinions, while still letting someone shout “the hatch is open” when there is a genuine problem.

Groq Bonds: Three Weeks From Zero

With three weeks of cash left and a layoff list on the table, Ross realized the cuts targeted exactly the people needed to finish an unprecedented compiler and reach the critical mass where the product would even work. Layoffs would not save the company; only reducing burn without losing people could. So Groq held an all-hands, put up World War II war-bond imagery, and launched “Groq bonds,” an exchange of salary for equity. Ross expected heavy attrition; instead 80% participated and about half dropped to statutory minimum wage, real pain for engineers used to six-figure salaries. It bought roughly two months of runway. His framing, “put everyone’s hands on the steering wheel,” explains why attrition actually fell below 10%: drivers feel more in control than passengers, and it echoes Phil Knight in Shoe Dog converting employee loans into Nike equity on the edge of collapse.

Hiring for Negatives, Loss Bias, and Manufactured Discontent

Ross was good at spotting smart, talented people but kept hiring ones who caused organizational problems, because he could always talk himself into a candidate. Watching a sharp head of HR screen people out, he realized he had been hiring wrong: growing talent means showing positives, but selecting talent means hunting for disqualifying negatives, since one bad trait spreads to the whole team. He formalized a versioned “people spec” with positives like return on luck and poetic design, each paired with a negative. He also hired for loss bias, the fact that a loss feels roughly six times more painful than an equal gain, seeking people who “book the win early.” That competitive, pressure-seeking wiring links to Michael Jordan manufacturing humiliation stakes (per Tim Grover in Relentless) and to David Ogilvy’s divine discontent. Ross’s own manufactured discontent today is the world’s shortage of compute, which he frames in life-and-death terms.

The Optimistic Close: Free Code and Universal Software Literacy

Ross ends on aggressive optimism. Software has long run on “code rationing” because code was expensive to write, policed by “no engineers” whose job is to say no. As the marginal cost of code approaches zero, the workflow flips to implement, experience, then re-implement. More important is accessibility: just as alphabets and universal education turned reading and writing from a scribe’s monopoly into a question of quality, AI is making software creation universal. His executive assistant now builds working apps, and a wave of individual founders with taste but no coding background will create valuable companies. The corollary for education is to stop teaching kids to answer questions and start teaching them to ask, revamping curricula around real community problems where the point is asking the right questions to get AI to solve something that matters.

Notable Quotes

“Success in the information age was about being able to answer questions. Success in the AI age will be about being able to ask the right questions.”
Jonathan Ross, on the fundamental shift AI creates

“The fewer constraints that you give someone, the more freedom they have to solve the problem, and the more freedom they have to surprise you with the solution.”
Jonathan Ross, on leading creative teams

“Being able to think faster makes you think smarter.”
Jonathan Ross, on why faster inference produces more capable models

“There are plenty of really smart people who wouldn’t recognize reality if it tapped them on the shoulder.”
Jonathan Ross, defining reality quotient versus IQ

“If you express intentional leadership, you say, ‘I intend to do this.’ People don’t tend to offer their opinion, but if it’s very wrong and there’s a reason, they will push back.”
Jonathan Ross, on the lesson from Turn the Ship Around

“When people are passengers in a car, they’re more nervous about a windy road or a scary road. But when they’re the driver, they feel more in control.”
Jonathan Ross, on why Groq bonds kept the team together

“The biggest flip in my hiring was when I went from looking for positives, which is what you do when you’re trying to grow talent, to looking for negatives, which is what you do when you’re trying to select talent.”
Jonathan Ross, on inverting his approach to hiring

“If it takes us an extra year to cure cancer because we don’t have enough compute, that’s my fault.”
Jonathan Ross, on the discontent that drives him today

Watch the full conversation between Jonathan Ross and David Senra here on YouTube.

Related Reading
- Groq: the company Ross founded and the LPU behind the fast-inference story and the NVIDIA partnership.
- AlphaGo versus Lee Sedol (Wikipedia): the match, including Move 37, that showed Ross how much faster hardware raises a model’s capability.
- The Keynesian Beauty Contest (Wikipedia): the dynamic Ross uses to explain why West Coast VCs herded past what became NVIDIA’s biggest deal.
- Zero to One by Peter Thiel: the source of the first-principles thinking Ross applied to the contrarian bet on fast inference.
- Founders podcast by David Senra: the host’s biography-driven show, source of the Jensen, Michael Jordan, and Edwin Land ideas referenced throughout.
July 7, 2026
OpenAI and Broadcom Unveil Jalapeño, a Custom LLM Inference Chip to Cut Compute Costs and Reduce Nvidia Dependence
OpenAI and Broadcom pulled the wrapper off Jalapeño on Wednesday, June 24, 2026, a custom silicon accelerator that OpenAI is calling its first “Intelligence Processor” and its first real move into designing the hardware underneath its own models. Broadcom President and CEO Hock Tan and President Charlie Kawwas physically handed the wafer to OpenAI CEO Sam Altman and President and Co-Founder Greg Brockman, a staged moment meant to signal that the ChatGPT maker is no longer just a models-and-products company but is now reaching all the way down to the chip. Jalapeño is purpose-built for large language model inference, the compute-intensive job of actually serving answers to users rather than training the model in the first place, and OpenAI plans to deploy it at gigawatt scale by the end of 2026 as the first step in a multi-generation platform built with Broadcom and Canadian electronics manufacturer Celestica. You can read the announcement straight from the source in OpenAI’s official post.

TLDR

OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom AI chip, an ASIC designed from a blank slate specifically for LLM inference rather than training, manufactured by TSMC and integrated into server systems by Celestica that only OpenAI will use. OpenAI claims the chip went from initial design to manufacturing tape-out in just nine months, what it calls the fastest ASIC development cycle ever in high-performance advanced semiconductors, accelerated in part by using its own AI models to design the silicon. Engineering samples are already running ML workloads in the lab, including GPT-5.3-Codex-Spark, and OpenAI says early testing shows performance per watt “substantially better” than current state-of-the-art, a self-reported and not yet independently verified claim with a full technical report promised in the coming months. Broadcom CEO Hock Tan told Reuters the chip matches Nvidia’s Blackwell and Google’s TPUs, framing the launch as part of a flywheel where OpenAI owns the full stack from chip to model to product. The chip slots into a broader infrastructure strategy targeting 10 gigawatts of custom accelerator capacity between 2026 and 2029 with deployments alongside Microsoft and other partners, and The Decoder reported Microsoft is expected to buy 40 percent of the chips, a guarantee Broadcom reportedly demanded to secure the first phase. The move is widely read as OpenAI diversifying away from Nvidia, continuing a procurement spree that already includes AWS Trainium, AMD, and Cerebras, as inference quietly becomes the company’s real cost center.

Thoughts

The single most important word in this announcement is “inference.” Training a frontier model is a capital expense that happens in bursts. Inference is the bill that arrives every single day, forever, scaling linearly with usage. Every ChatGPT reply, every Codex task, every API call, every agent step is an inference event, and as OpenAI’s product surface explodes, that recurring cost is the thing that actually threatens the unit economics. A custom chip aimed squarely at inference is therefore not a vanity project or a research flex. It is OpenAI attacking the largest variable cost in its business at the root, trying to bend its cost-per-token curve below what it pays to rent Nvidia GPUs. If Jalapeño lands anywhere near its claims, the payoff is not faster benchmarks, it is gross margin.
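
The burst-versus-recurring distinction is easy to see with toy arithmetic. Every figure in the sketch below is a hypothetical placeholder chosen for round numbers, not an OpenAI, Nvidia, or Broadcom figure, and the function names are invented for illustration.

```python
import math

# Toy cost model: a one-off training bill versus a per-token inference bill.
# All dollar and token figures are hypothetical placeholders.

def daily_inference_cost(tokens_per_day: float, usd_per_million_tokens: float) -> float:
    """Recurring cost per day; grows linearly with tokens served."""
    return tokens_per_day / 1e6 * usd_per_million_tokens

def days_until_inference_matches_training(
    training_cost: float, tokens_per_day: float, usd_per_million_tokens: float
) -> int:
    """Days of serving after which cumulative inference spend reaches the training bill."""
    return math.ceil(training_cost / daily_inference_cost(tokens_per_day, usd_per_million_tokens))

TRAINING_COST = 500e6    # assumed one-off training spend, USD
TOKENS_PER_DAY = 1e12    # assumed tokens served per day
RENTED_USD_PER_M = 1.00  # assumed cost per million tokens on rented hardware
CUSTOM_USD_PER_M = 0.70  # assumed cost per million tokens on custom silicon

breakeven_days = days_until_inference_matches_training(
    TRAINING_COST, TOKENS_PER_DAY, RENTED_USD_PER_M
)
annual_savings = 365 * (
    daily_inference_cost(TOKENS_PER_DAY, RENTED_USD_PER_M)
    - daily_inference_cost(TOKENS_PER_DAY, CUSTOM_USD_PER_M)
)
print(breakeven_days, round(annual_savings))
```

With these made-up inputs, cumulative inference spend catches up with the training bill after about 500 days, and a 30 percent cut in cost per token is worth roughly $110 million a year. The exact numbers are meaningless; the point is that any per-token saving is multiplied by usage every day, which is why the margin argument centers on inference.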

The performance-per-watt claim, though, deserves the most skeptical reading in the room. OpenAI says Jalapeño will deliver performance per watt “substantially better” than current state-of-the-art, but it has not finalized the numbers, has not said which chips it tested against, on what tasks, or under what conditions, and the full technical report is somewhere in the indefinite “coming months.” These are self-reported figures from a company with an enormous interest in convincing the market it has a credible alternative to Nvidia. Hock Tan’s line that the chip is “as good as” Blackwell and Google’s TPUs is a CEO talking his own book in an interview, not a measured result. The honest posture is to treat the figures as marketing until the technical report lands. A chip running engineering samples in a lab at target frequency is real progress, but it is a very long way from a chip that holds those numbers across a production fleet under messy real-world load.

OpenAI left the most revealing detail out of its own press release: the report, via The Decoder, that Broadcom demanded Microsoft guarantee it will buy 40 percent of the chips to secure the first phase. That single sentence tells you who is actually carrying the risk. Building gigawatt-scale custom silicon is brutally capital-intensive, and Broadcom is not willing to commit manufacturing capacity on the strength of OpenAI’s demand alone. It wants a balance sheet behind the order, and Microsoft, OpenAI’s largest backer, is the balance sheet. That detail quietly reframes the whole “OpenAI owns the stack” narrative. OpenAI may design the chip, but the deployment is underwritten by Microsoft’s purchasing commitment, which means Microsoft also gets leverage and supply security out of an OpenAI-branded part. Ownership of the design is not the same as ownership of the risk.

The flywheel framing is genuinely interesting and probably the most defensible strategic claim OpenAI is making. OpenAI says it used its own models to accelerate parts of the chip design and optimization, compressing a normally multi-year ASIC cycle into nine months. If that is even partly true, it is a meaningful loop: the models help design the chips, the chips run the models more cheaply, the cheaper models drive more usage and revenue, and the revenue funds the next chip. That is a compounding advantage that is hard for a pure hardware vendor to replicate and hard for a pure software lab to replicate. The catch is that nine months from design to tape-out is a claim about speed, not about whether the resulting chip is actually competitive in volume. Fast tape-out and great silicon are different achievements, and the industry has seen plenty of chips that taped out quickly and underwhelmed in production.

Strip away the “Intelligence Processor” branding and this is a playbook we have already watched run three times. Google built TPUs, Amazon built Trainium and Inferentia, Meta built MTIA, and all of them turned to Broadcom or Marvell for the design IP that is hard to replicate in-house. OpenAI is doing the same thing with the same partner, just later and louder. The diversification arc is unmistakable: OpenAI was one of the biggest Nvidia GPU buyers on earth, and in the span of a year it has signed deals for AWS Trainium, AMD accelerators, and Cerebras inference hardware, and now its own custom ASIC. Nvidia is not in trouble, demand still vastly outstrips supply, but the era where the largest AI labs were captive single-vendor customers is clearly ending. The most intriguing wildcard is OpenAI’s own line that Jalapeño is “designed with flexibility to work with all LLMs.” That is not how you describe a chip you intend to keep entirely to yourself. It hints, however faintly, at an OpenAI that could one day rent out inference infrastructure the way it now rents models, which would put it in direct competition with the very cloud providers it currently depends on.

Key Takeaways
- OpenAI and Broadcom unveiled Jalapeño on Wednesday, June 24, 2026, OpenAI’s first custom AI chip and its first piece of in-house silicon after years focused on models and products.
- The chip is branded an “Intelligence Processor” and described as the first AI accelerator in a multi-generation compute platform the two companies are building together.
- Jalapeño is purpose-built for large language model inference, the compute-intensive work of generating responses and serving answers to users, and explicitly not for training.
- Inference is OpenAI’s recurring cost center: every ChatGPT conversation, coding request, image generation, and agent action relies on it, making it one of the highest ongoing costs in the business.
- Broadcom President and CEO Hock Tan and President Charlie Kawwas physically delivered the first wafer to OpenAI CEO Sam Altman and President Greg Brockman.
- OpenAI designed the chip from scratch around its understanding of LLM fundamentals, informed by its roadmap of models, kernels, serving systems, and product needs.
- Jalapeño is described as a blank-slate design for modern LLM inference, not a general-purpose accelerator adapted from earlier AI workloads.
- The chip is shaped by the systems OpenAI runs daily across ChatGPT, Codex, the API, and future agentic products, while also being designed to work with current and future LLMs across the industry.
- The stated performance goal is to combine the throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, suiting it for interactive LLM products at scale.
- OpenAI frames this as its full-stack advantage: it designs frontier models, builds products on top of them, and now designs the chip architecture, kernels, memory systems, networking, scheduling, and deployment systems underneath.
- OpenAI claims Jalapeño went from initial design to manufacturing tape-out in just nine months.
- The companies say they believe it is the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors, against a backdrop of typically multi-year timelines.
- OpenAI used its own AI models to accelerate parts of the chip design and optimization process, which it credits for the speed.
- OpenAI frames the result as a flywheel: the same models served to users help improve the infrastructure that runs future models, lowering compute cost across the industry.
- Engineering samples of Jalapeño are already running ML workloads in the lab at production target frequency and power.
- Among the workloads running on the samples is OpenAI’s GPT-5.3-Codex-Spark model.
- GPT-5.3-Codex-Spark currently runs on Cerebras hardware, which also specializes in inference, per The Decoder.
- OpenAI says early testing shows Jalapeño will deliver performance per watt “substantially better” than current state-of-the-art hardware.
- That performance-per-watt claim is self-reported and lacks independent verification; OpenAI has not said which chips it tested against, on what tasks, or under what conditions.
- OpenAI says it is still measuring final performance and has promised a detailed technical report in the coming months.
- The architecture reduces data movement and balances compute, memory, and networking resources to push realized utilization much closer to theoretical peak performance.
- Jalapeño is an ASIC, which experts say is less flexible than Nvidia’s GPUs but less expensive and can be tailored to specific AI tasks.
- Broadcom contributes silicon implementation and networking technologies, including its Tomahawk networking silicon, to bring the platform to large-scale production.
- Canadian electronics manufacturer Celestica provides board, rack, and system integration expertise and will build the server systems.
- The chips are manufactured by Taiwan’s TSMC, the world’s leading advanced semiconductor foundry, after OpenAI sent over the design.
- Both the chips and the Celestica-built server systems will be used only by OpenAI, not sold to outside customers.
- OpenAI plans to deploy Jalapeño at gigawatt scale by the end of 2026, with expansion in the years ahead, as the first step in a multi-generation plan.
- Hock Tan said gigawatt-scale data center deployment will happen with Microsoft and other partners beginning in 2026.
- The Decoder reported Microsoft is expected to buy 40 percent of the chips, with Broadcom reportedly demanding Microsoft guarantee that share to secure the first phase.
- Broadcom CEO Hock Tan told Reuters that Jalapeño is as good as Nvidia’s Blackwell chips and the TPUs designed by Alphabet’s Google.
- In October 2025, after 18 months of working together, OpenAI and Broadcom went public with plans to develop and deploy racks of OpenAI-designed chips starting in late 2026; CNBC framed the unveiling as coming eight months after that deal.
- The prior OpenAI-Broadcom plan ultimately aimed at 10 gigawatts of custom AI accelerator capacity, with deployments expected between 2026 and 2029.
- Estimates suggest OpenAI’s broader infrastructure plans could eventually involve around 26 gigawatts of computing capacity across custom chips, Nvidia hardware, and other accelerators.
- OpenAI has been one of the biggest buyers of Nvidia’s GPUs since kickstarting the generative AI boom in 2022, but explosive demand has pushed it to seek other sources of advanced silicon.
- Earlier in 2026 OpenAI struck a deal with Amazon Web Services that includes use of AWS Trainium chips, and has also signed agreements with AMD and with Cerebras, which held its IPO in May.
- The move is widely characterized as OpenAI diversifying away from and reducing dependence on Nvidia while creating an alternative to its GPUs.
- OpenAI’s stated goals with the chip are to reduce costs, improve energy efficiency, secure long-term computing supply, and gain more control over the infrastructure powering its services.
- Broadcom shares climbed about 2 percent following the announcement, are up roughly 10 percent year-to-date in 2026, and have multiplied almost sevenfold since the end of 2022.
- To build in-house chips, Meta, Amazon, and Google have turned to firms like Broadcom and Marvell for design services and IP that are hard to replicate internally; Reuters first reported OpenAI was exploring its own chip in 2023, and sources told Reuters in April 2026 that Anthropic is weighing its own AI chip.
- Broadcom’s margin on custom AI chips is currently lower than on products like networking switches due to AI-driven high-bandwidth memory demand; Tan said SK Hynix and Samsung Electronics supply Broadcom with memory chips.

Detailed Summary

A blank-slate chip built only for inference

Jalapeño is OpenAI’s first so-called Intelligence Processor, and the company is emphatic that it is not a repurposed general-purpose accelerator. It was designed from a blank slate specifically for modern large language model inference, the job of crunching data to answer a user’s query rather than the separate, bursty work of training a model. OpenAI says it designed the chip from scratch around its own deep understanding of LLM fundamentals, informed by its roadmap of models, kernels, serving systems, and product needs, drawing on the systems it runs every day across ChatGPT, Codex, the API, and future agentic products. The stated objective is to fuse the raw power and throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, which would make Jalapeño particularly well suited to interactive products used at scale. Notably, OpenAI also says the chip is designed with flexibility to work with all LLMs across the industry, not only its own, a claim that sits a little oddly next to its plan to keep the hardware entirely in-house.

The full-stack flywheel and AI designing its own silicon

OpenAI is selling Jalapeño as proof of a full-stack advantage. The argument is that because OpenAI now develops frontier models, builds products on top of them, and designs the infrastructure underneath them, including chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and the product experience, every layer can be optimized around the same goal of making its models faster, more reliable, and cheaper. OpenAI describes this as a flywheel: better infrastructure drives compute efficiency, which enables better training and serving, which powers more capable models, which become better products, which drive more usage and revenue, which funds the next generation of infrastructure. The most striking piece of that loop is that OpenAI used its own AI models to accelerate parts of the chip’s design and optimization. The company’s framing is direct: if AI can help engineers design better chips faster, it can lower the cost of compute across the industry. That self-referential loop is the part of the announcement that is genuinely novel rather than a rerun of an existing hyperscaler playbook.

Nine-month tape-out and the partner stack

OpenAI claims it took roughly nine months to go from initial design to manufacturing tape-out, which it believes is the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors, against an industry norm measured in years. It credits deep software-hardware co-development, Broadcom’s silicon implementation expertise, and the use of its own models to compress the schedule. The work is split across a clear partner stack: OpenAI provides the architecture and AI-specific requirements, Broadcom contributes silicon implementation and networking technology, including its Tomahawk networking silicon, and Celestica handles boards, racks, and system integration, building the actual server systems. Once the design was complete, OpenAI sent it to TSMC in Taiwan, the world’s leading advanced foundry, for manufacturing. Crucially, both the chips and the systems built around them are for OpenAI’s exclusive use; they are not products being sold to outside customers.

Performance claims that nobody can check yet

OpenAI says early testing shows Jalapeño will deliver performance per watt substantially better than current state-of-the-art hardware, with an architecture that reduces data movement and balances compute, memory, and networking to push realized utilization much closer to theoretical peak. Hardware program lead Richard Ho said the team optimized around the kernels, memory movement, networking, and serving patterns that matter most for frontier models, and that the chip will execute key workloads close to the hardware’s theoretical limits. He told Reuters he expects it to be performant on all kinds of future LLM iterations. The important caveat is that none of this is verifiable yet. OpenAI is still measuring final performance, has not finalized the numbers, and has not disclosed which chips it benchmarked against, on what tasks, or under what conditions, with the technical report only promised in the coming months. As The Decoder put it bluntly, these are self-reported numbers, unverifiable for now, that should not be taken at face value. Broadcom CEO Hock Tan’s separate claim to Reuters that the chip is as good as Nvidia’s Blackwell and Google’s TPUs is similarly an unverified assertion from an interested party.

Gigawatts, Microsoft’s 40 percent, and who carries the risk

Jalapeño is the opening move in a much larger infrastructure buildout. Initial deployment is targeted for the end of 2026 at gigawatt scale, expanding over multiple generations. Tan said the gigawatt-scale data centers will come online with Microsoft and other partners beginning in 2026. The deal traces back to October 2025, when, after 18 months of collaboration, OpenAI and Broadcom went public with plans to deploy racks of OpenAI-designed chips, ultimately aiming for 10 gigawatts of custom accelerator capacity with deployments expected between 2026 and 2029. Broader estimates put OpenAI’s total infrastructure ambition at around 26 gigawatts across custom chips, Nvidia hardware, and other accelerators. The detail that cuts through the optimism comes from The Decoder: Microsoft is expected to buy 40 percent of the chips, and Broadcom reportedly demanded that Microsoft guarantee that purchase to secure the first phase. That guarantee shows that the financial risk of this buildout is not OpenAI’s alone; it rests heavily on its largest backer’s balance sheet.

The Nvidia diversification arc and Broadcom’s windfall

Jalapeño is the clearest signal yet of OpenAI loosening its dependence on Nvidia. OpenAI has been one of the biggest buyers of Nvidia GPUs since it kickstarted the generative AI boom in 2022, but demand has exploded past what any single vendor can supply. Within 2026 alone, OpenAI has struck a deal with AWS that includes Trainium chips, signed agreements with AMD and with Cerebras, which held its IPO in May, and now rolled out its own ASIC. The pattern mirrors what Meta, Amazon, and Google already did, all of them leaning on firms like Broadcom and Marvell for design IP that is hard to build in-house, and Anthropic is reportedly weighing the same move, per sources who spoke to Reuters in April 2026. Broadcom is the obvious beneficiary, with shares up about 2 percent on the news, up roughly 10 percent in 2026, and up nearly sevenfold since the end of 2022. Even so, Tan noted that the AI-driven surge in high-bandwidth memory demand makes Broadcom’s margin on custom AI chips lower than on products like networking switches, with SK Hynix and Samsung Electronics supplying the memory.

Notable Quotes

“The world is moving to a compute-powered economy.”
Greg Brockman, President and Co-Founder of OpenAI, framing the launch as a broad economic shift

“Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems. By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access.”
Greg Brockman, President and Co-Founder of OpenAI, on the full-stack rationale for building its own chip

“Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers.”
Richard Ho, who leads OpenAI’s hardware program, describing the chip as purpose-built rather than adapted

“We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits.”
Richard Ho, who leads OpenAI’s hardware program, on the architecture’s optimization targets and early performance

“It will be performant on, we think, all kind of future iterations of LLMs.”
Richard Ho, OpenAI hardware chief, to Reuters on the chip’s forward compatibility with future models

“Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI.”
Hock Tan, President and CEO, Broadcom, on the scale of the infrastructure commitment

“This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026.”
Hock Tan, President and CEO, Broadcom, on the multi-generation plan and 2026 gigawatt-scale deployment with Microsoft

“The goal is to combine the power and throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, making Jalapeño well suited for interactive LLM products at scale.”
OpenAI, in the press release, stating the performance objective for the chip

“These are self-reported numbers that haven’t been finalized. Take them with a grain of salt.”
Maximilian Schreiner, The Decoder, on the unverified performance-per-watt claim

Jalapeño is a real chip running real workloads in a lab, but the gap between an engineering sample and a profitable production fleet is exactly where this story will be decided over the next year, and the most important numbers, the performance-per-watt figures that justify the whole effort, remain self-reported and unverified until OpenAI publishes its technical report.

Related Reading
- OpenAI, the chip’s designer and the primary source of the announcement and quotes.
- Broadcom, the co-developer providing silicon implementation and Tomahawk networking.
- Celestica, which builds the boards, racks, and server systems around the Jalapeño chip.
- ASIC (application-specific integrated circuit), the kind of chip Jalapeño is: a custom chip built for one task, unlike a general-purpose GPU.
- Nvidia Blackwell, the Nvidia architecture Broadcom’s CEO claims Jalapeño matches.

June 24, 2026

Lloyd Blankfein on the 3 Sectors Where He Puts His Money Now: Big Tech, Energy, and Financial Services, Day Trading From an iPad, and the Warren Buffett Handshake That Backed Goldman in 2008
Lloyd Blankfein spent almost 40 years at Goldman Sachs, the last dozen as its chairman and chief executive, and he still trades almost every day from an iPad. In this wide-ranging conversation on the My First Million podcast, the former Goldman boss lays out exactly where he is putting his own money right now, why a supportive spouse beats nearly any investment, how Warren Buffett put five billion dollars into Goldman on a handshake during the 2008 crisis, and why he reads medieval history to stay calm about the present. It is part stock picking, part risk philosophy, and part frank accounting of money, marriage, and the scars of growing up in the projects.

TLDW

Blankfein says he is roughly 98 percent in risky assets, almost all equities, and concentrated in three sectors he knows cold: big tech, energy, and financial services. His personal book leans heavily into single stocks over ETFs, weighted toward the big hyperscalers and a few second tier names, and he trades daily, alone, from an iPad and a phone, using calls and texts as his research network. Yet the advice he gives a normal investor is the boring opposite: a diversified S&P 500 fund like VOO, more risk when you are young because you will outlive your mistakes, the same thing Warren Buffett would tell you. The conversation ranges across the 2008 Buffett investment in Goldman, the cost of trying to legislate risk out of markets, the thin margin between the best and the rest, luck and the myth of the genius, why reputation is the real contract on Wall Street, why a supportive spouse is the highest return asset he knows, the money anxiety he carried out of a Brooklyn housing project, the dignity of a 500 dollar financial aid check, giving with a warm hand versus a cold one, the dangers of gamified investing, the big misses like SpaceX and early cellular, the obituary test a senior partner once gave him, and why reading history keeps the present in proportion.

Thoughts

The most useful tension in this interview is the gap between what Blankfein practices and what he preaches. He tells young people to buy a diversified S&P 500 index fund, he holds VOO himself, and he calls the host’s plain 90 percent stocks and 10 percent bonds split sensible. Then he admits his own portfolio is something like 90 percent single stocks that he trades by hand every day. The honest read is that his edge is not a transferable tip. It is a 40 year information network of phone calls and a tolerance for risk that most people neither have nor should want. The replicable lesson is the boring half, not the day trading half.

The most contrarian idea here is not a stock pick; it is his defense of risk itself. His argument that regulators trying to prevent the hundred-year storm also forfeit the 99 normal years of growth in between is a serious claim about the price of safety, and it travels far beyond Wall Street. The same goes for his point that a good risk manager sometimes has to push people to take more risk, not less. The moment after a loss, when everyone goes gun-shy, is exactly when the best operators lean back in. That is an uncomfortable thing for a former bank CEO to say out loud, and it is the part of the conversation most worth sitting with.

The Warren Buffett story is a master class in what actually moves markets, and it is not cash. Goldman did not need the five billion dollars. Blankfein says the money was almost irrelevant because the firm already had money. What it could not manufacture was confidence, and Buffett’s name supplied it. The handshake, the commitment with no paperwork, and the line about worrying enough for the both of them all point to the same thing. At the top, reputation is the collateral. His aside that most trades are never written down, because anyone who reneges will never eat lunch in this town again, is the same idea wearing street clothes.

Quietly, the personal finance thread may be the most valuable part for a normal listener. A former Goldman CEO saying that a supportive partner is more game changing than any investment, that a bad marriage is financially worse than being lonely, and that he has not paid a bill in over 40 years because his wife runs the household economy, is a reminder that household stability is itself an asset class. The 500 dollar financial aid check he still remembers half a century later, and his “give with your warm hand” philosophy, reframe wealth as something measured by how it feels to give and to receive, not just by the size of a pie chart.

Finally, the history obsession is not a side hobby; it is his risk model. Reading about the black plague, the McCarthy era, and the Vietnam draft is how he keeps the present in proportion. His Mark Twain line, that history does not repeat but it rhymes, is the direct antidote to the “in this economy” defeatism he and the host both complain about. For an investor, that long view is close to the whole game. It is what lets you hold through the drawdowns that scare everyone else out of the market.

Key Takeaways
- Blankfein estimates he is about 98 percent in risky assets, with roughly 95 of those 98 points in equities, and the rest spread thin. He invests in risky assets because, in his words, that is what is fun for him.
- Within his equities, he is heavily tilted toward single stocks rather than ETFs. He frames it as roughly a quarter to a third in ETFs and the rest in single names, and concedes it could be as lopsided as 90 percent single stocks because picking names is what he enjoys.
- The three sectors he has concentrated in for years are big tech, energy, and financial services, and he says his outperformance comes from where he focused, not from any special genius.
- On tech he owns the big hyperscalers, the Googles, Microsofts, and Nvidias of the world, plus a tier just below them, naming Oracle and Larry Ellison as an example of a slightly riskier second tier name. He thinks in categories, not fixed tickers, because he changes positions constantly.
- He says he has a background in trading energy, which is why energy is a core sleeve, and he knows financial services from the inside after almost 40 years at Goldman, so those are natural areas of edge.
- He still owns a lot of Goldman Sachs stock, out of affection for the firm he spent his career building.
- He is bullish on big tech and plans to stay bullish until it stops going up. His foreseeable future, he jokes, lasts until he finishes the conversation and checks the screen again.
- He trades every single day, alone, with no team. He does it from an iPad and a phone, not a computer, and treats the market like background music rather than a job.
- His research is human, not algorithmic. He chats and texts with people, then calls them because he is tired of fixing typos, and he reads the New York Post, the Wall Street Journal, the New York Times, the Financial Times, and Bloomberg.
- The advice he gives ordinary investors is deliberately boring and different from his own behavior: hold a diversified equity portfolio like an S&P 500 fund, with VOO as his own example, and tilt more aggressively when you are young because you have time to outlive mistakes.
- He notes that broad indexes are already heavily weighted toward tech because of market cap, so a plain index gives meaningful tech exposure, and a tech focused ETF on top can add a disproportionate tilt for believers.
- He calls the host’s simple 90 percent index and 10 percent bonds allocation sensible, and says this is essentially the same advice Warren Buffett would give a normal person.
- The older you get, the more conservative you should become, shifting from maximizing gains toward not losing what you have. Young people can afford more risk precisely because they will outlive their errors.
- During the 2008 financial crisis, Warren Buffett invested about five billion dollars in Goldman through a preferred stock structure, essentially on a phone call and a handshake, with no demand for due diligence.
- Buffett’s real value was confidence, not capital. Goldman already had money, but it had lost the confidence of the market while peers were failing. Buffett’s name signaled the firm was a good investment being beaten down by circumstances that would reverse.
- Buffett asked for a verbal commitment that Goldman would not sell shares before he did, and declined to put it in writing. He waved off the worry with the line that five billion dollars going bad would not even be a bad hurricane for Berkshire, an insurer.
- Most trading is done on reputation, not paper. Blankfein says people buy and sell bonds worth enormous sums without written contracts, relying on probity, because anyone who reneges will never eat lunch in this town again.
- On risk and regulation, he argues you cannot legislate risk away. Trying to prevent the hundred year storm also forgoes the 99 in between years of growth, and a good risk manager sometimes has to encourage people to take risk, not suppress it.
- The best traders have resilience. They bounce back, focus on new information rather than the past, and adapt quickly instead of staying gun-shy after a loss.
- The difference between someone who is really good and someone who cannot make it is small. He compares it to a golf tournament won by one stroke with six people tied for second, and notes much of life is winner take all at razor thin margins.
- Luck matters enormously. He became Goldman CEO partly because his predecessor was nominated to be Treasury Secretary, a reference to Hank Paulson, and the timing of opportunities is often out of your control.
- He is skeptical of the word genius. He says he can usually see how successful people do what they do, with Elon Musk as a rare exception, and that powerful people are more normal, more insecure, and more flawed than outsiders assume.
- On democratized investing, he thinks apps that make markets accessible are good in their own terms, but gamifying trading with confetti and high fives can mask real danger for people who can lose more than they can afford.
- He has missed plenty. He thought SpaceX was overpriced at a 100 billion dollar valuation, now discussed at close to 1.75 trillion dollars, and passed on early cellular because he could not imagine why anyone would carry a bulky phone when payphones existed. He says he missed far more than he got.
- He frames a supportive spouse as more game changing than almost any investment, and warns that a bad marriage, with custody fights and property settlements, is financially and personally worse than being lonely.
- He has not paid a bill in over 40 years. His wife Laura, a former lawyer he says now chairs Barnard College, runs a bill paying service and manages the household economy. He generates the money, she distributes it.
- He grew up in an East New York, Brooklyn housing project, the son of a postal worker, and carried money anxiety well into his 30s. He recalls buying a vacation home that cost more than all their savings, with his wife unable to make the math work until they remembered the down payment.
- A 500 dollar financial aid check, handed to him without shame as a college freshman around 1971, shaped his philosophy on giving. He learned it is not enough to give people what they need, you have to give it in a way that feels dignified.
- He embraces the give with your warm hand, not your cold hand idea, the notion of giving while alive so you can experience the joy, which connects to the spirit of the book Die With Zero.
- He admits ambivalence about giving to his kids, the strange feeling of resenting that they have what he provided, and notes the heavy burden carried by children of prominent people who must prove they earned their place.
- He describes himself as wired for anxiety, inherited from his father, and says looking around corners for what could go wrong actually suited a career in a risky business with a big balance sheet.
- When he made partner, a senior partner gave him rules of the road, including avoiding misconduct, being conservative on taxes, setting up a charitable foundation, and living so that no more than three of the nine paragraphs in his eventual obituary would be about Goldman. He says he stayed too long to pass that test.
- He reads history as a discipline, favoring Barbara Tuchman, Robert Caro’s The Power Broker, Ron Chernow, Rick Atkinson, and Stephen Ambrose. His core belief, borrowed from Mark Twain, is that history does not repeat but it rhymes, which is why he would not bet against America.

Detailed Summary

The three sectors he actually invests in

The headline answer to where the former Goldman CEO is putting his money is simple: big tech, energy, and financial services. He says he has been focused on those three areas for a long time, and that his outperformance is a function of where he aimed rather than any unusual investing gift. Energy is natural because he has a background trading it. Financial services is natural because he spent nearly 40 years inside the industry. Tech is where he is most heavily concentrated, and he expects to stay there for good reason, citing the threshold of large changes in technology. He owns the major hyperscalers by category, the Googles, Microsofts, and Nvidias, plus a tier just below, offering Oracle and Larry Ellison as a polite example of a slightly riskier second tier name. He is careful to say he thinks in categories rather than fixed tickers because he changes his positions all the time.

How the portfolio is really built: single stocks over ETFs

Asked to describe his portfolio as a pie chart, Blankfein says he is about 98 percent in risky assets, with roughly 95 of those points in equities. He pushes back on the idea that index funds are safe, pointing out that a diversified equity ETF is still equities and still risky, just spread out, and very different from debt or short term money markets. Within his equity sleeve he leans into single stocks, framing it as somewhere between a quarter and a third in ETFs and the rest in individual names, and conceding it might be as extreme as 10 percent ETFs and 90 percent single stocks. The reason is preference, not theory. Picking and trading names is what he likes to do, and he is honest that this is a hobby pursued by a professional, not a model for someone investing for a living.
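To make the pie chart concrete, here is a minimal arithmetic sketch in Python. It assumes the ETF fractions he mentions apply to the equity sleeve (roughly 95 of the 98 risky points), as described above; the function name `split_portfolio` and the bucket labels are hypothetical and used only for illustration. The figures are his rough estimates, not a reported allocation.

```python
from fractions import Fraction


def split_portfolio(etf_share, risky_points=98, equity_points=95):
    """Return percentage points of the whole portfolio in each bucket.

    etf_share is the fraction of the equity sleeve held in ETFs
    (an assumption about how the quoted ranges are meant to apply).
    """
    etf_share = Fraction(etf_share)
    etfs = equity_points * etf_share
    single_stocks = equity_points - etfs
    return {
        "etfs": etfs,
        "single_stocks": single_stocks,
        "other_risky": Fraction(risky_points - equity_points),
        "not_risky": Fraction(100 - risky_points),
    }


# The three framings mentioned in the interview summary.
scenarios = {
    "one third in ETFs": Fraction(1, 3),
    "one quarter in ETFs": Fraction(1, 4),
    "10 percent in ETFs": Fraction(1, 10),
}

for label, share in scenarios.items():
    buckets = split_portfolio(share)
    # Every scenario must account for the full 100 points.
    assert sum(buckets.values()) == 100
    summary = ", ".join(
        f"{name} {float(points):.2f}" for name, points in buckets.items()
    )
    print(f"{label}: {summary}")
```

Under these assumptions, even the most ETF-heavy framing leaves well over half of the total portfolio in individual names, which is the point of the passage: the single-stock tilt is large however the estimate is read.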

How he actually trades: an iPad, a phone, and a network

He trades every day, by himself, with no team. There is no Bloomberg terminal and no desk of analysts. He uses an iPad and a phone, and admits it takes discipline not to glance at his screen mid conversation. The market, he says, is like music playing in the background while he does other things. His information edge is relational. People text him, he texts back, and then he calls because he is tired of fixing typos with what he calls his fat fingers. He follows general and business news, reads a stack of newspapers starting with the New York Post, and treats companies like little stories, almost like gossip. He even notes, with some delight, that he still watches commercials on Netflix, a small window into a frugality that never fully left him.

The advice he gives young investors, and what Buffett would say

For a normal person, his counsel is the opposite of his own behavior. He would hold a diversified portfolio of equities like an S&P 500 fund, naming the SPY and VOO tickers and saying he personally uses VOO. Because of the importance of technology, he might add a tech oriented ETF for extra tilt, while noting the broad index is already tech heavy by market cap. He endorses the host’s plain 90 percent index and 10 percent bonds split as sensible and says it mirrors what Warren Buffett would advise. His one piece of age based guidance is that younger investors should accept more risk through equities, because they have time to recover, while older investors should grow more conservative and focus on not losing what they have rather than maximizing returns.

The Warren Buffett handshake that backed Goldman in 2008

The most cinematic story in the conversation is Buffett’s roughly five billion dollar investment in Goldman during the financial crisis, structured as a preferred stock that sits between a loan and equity. Blankfein describes a deal done largely on trust. When he offered to walk Buffett through everything he was worried about, Buffett replied that he knew Lloyd well enough to know he worried enough for the both of them. Buffett also asked, verbally and without writing, for a commitment that Goldman would not sell shares before he did. Blankfein is clear that the cash itself was almost irrelevant, since Goldman had money. What the firm lacked was the confidence of a frightened market, and Buffett’s willingness to invest before things improved supplied exactly that signal. Buffett, he stresses, was acting for his own shareholders, not as a rescuer, which is precisely what made the vote of confidence credible.

Why you cannot legislate risk out of the system

Reflecting on the post crisis regulatory push to make sure 2008 never happened again, Blankfein makes a careful argument about the price of safety. Once you are in the business of taking risk, anything can happen, and trying to legislate it away has a hidden cost. You may think you are protecting the world from the hundred year storm, but you also forgo the 99 years of growth in between. He extends this inside the firm too. After a period of big losses, partners had become gun-shy and were talking themselves out of every idea. A good risk manager, he argues, sometimes has to promote risk taking rather than repress it, because without risk there is no growth, no entrepreneurship, and no progress. The flip side is real: take risk and there is a meaningful chance you fail and lose other people’s money, which is a terrible outcome. But the alternative, never risking anything, buys comfort at the cost of ever moving forward.

Small margins, big outcomes, and the role of luck

Asked what separated the traders who could not outperform from the rest, Blankfein says the gap between the very good and those who cannot make it is surprisingly small. He likens it to a golf tournament decided by a single stroke with six players tied for second, and to acting, where the best performer gets every role and the second best waits tables. Much of life, he says, is winner take all at tiny margins. Luck compounds this. He freely credits fortune for his own rise, noting he became CEO in part because his predecessor was tapped to be Treasury Secretary. He is also skeptical of the genius label. He can usually see how accomplished people do what they do, with Elon Musk a rare exception, and insists the powerful are more normal, more insecure, and more driven by their flaws than outsiders imagine.

Reputation is the real contract

A recurring theme is that the financial world runs on reputation more than paperwork. Blankfein notes that most of what traders do is not written down. People buy and sell bonds and other instruments that settle days later, relying on probity rather than signed contracts, because anyone who lies or reneges will never eat lunch in this town again. He references the casual texts between Elon Musk and Larry Ellison around the Twitter acquisition as proof that big does not mean complicated. There are big things that are simple and little things that are complicated. Documentation is good when execution is far off, but when a deal will be performed in two days, dotting every i is often pointless. The point is not that documents do not matter, it is that trust and reputation are the load bearing structure.

A supportive spouse as the highest return asset

The conversation turns personal when both men agree that a supportive partner may be the single most game changing factor in a life, more than any investment. Blankfein adds the inverse warning: a bad marriage, with breakups, custody battles, and property settlements, is worse than loneliness. He credits his wife Laura, a former big firm lawyer he says now chairs Barnard College, with handling everything when his career moved the family overseas, from the car to the house to the kids’ schooling, while he took the visible victory laps at work. He has not paid a bill in over 40 years. Laura manages a bill paying service and runs the household finances. As he puts it, he is in charge of generating the money and she is in charge of distributing it. The host contrasts this with his own monthly money meetings with his wife, a discipline he picked up from a personal finance author friend.

Money scars, the 500 dollar check, and giving with a warm hand

Blankfein grew up in an East New York housing project, the son of a postal worker who had earlier lost a job, in a household where rent money was scarce. He calls himself an urban hick who barely left Brooklyn as a kid. That scarcity left a mark that lasted into his 30s. He tells the story of buying a small beach house that cost more than all their savings, and of his wife driving 30 miles while failing to make the closing math work, until they realized she had forgotten to count the 10 percent down payment. The most resonant memory is a 500 dollar financial aid check handed to him as a freshman around 1971, made out on the spot by a clerk with a generosity of spirit that let him receive it without shame. That experience shaped a lifelong view that giving well means preserving dignity, and he now co chairs a financial aid campaign at his university. It also connects to his embrace of the idea of giving with your warm hand rather than your cold hand, giving while alive so you can feel the joy, the same spirit as the book Die With Zero. He is candid about a strange ambivalence, the way he can resent that his kids enjoy what he himself gave them.

Robinhood, confetti, and the misses

On apps like Robinhood, Blankfein takes a balanced view. Democratizing investing and making assets accessible is good in its own terms, and advertising can pull people toward markets they would otherwise ignore. But if you make trading too much like a video game, with confetti and high fives, you can mask the danger and lure people into losing more than they can afford. He is equally frank about his own misses. He thought SpaceX was overpriced at a 100 billion dollar valuation, a figure now discussed at close to 1.75 trillion dollars. He passed on early cellular because he could not imagine why anyone would carry a bulky phone with payphones everywhere. His blunt summary is that he missed far more than he got, and that nobody is great at predicting the future.

The obituary test, thick skin, and staying too long

When Blankfein made partner, a senior partner assigned to acculturate new partners gave him rules of the road: avoid anything that would today be called misconduct, be rigorous and conservative on taxes, set up and actually use a charitable foundation, and keep enough balance that, if your obituary runs nine paragraphs, no more than three are about Goldman. Blankfein says he failed that last test by staying too long, even titling his memoir around the firm. He also reflects on having a thick skin, recalling unflattering press and concluding that he could take a punch, a trait not everyone has and one he did not know he possessed until he was tested. He is careful to say this does not make people who cannot take a punch bad, just differently wired.

Why he reads history: it rhymes

The final stretch is a love letter to reading history. Blankfein favors Barbara Tuchman, whose A Distant Mirror he has read twice and whose Guns of August he calls fantastic and influential, along with Robert Caro’s The Power Broker on Robert Moses, Ron Chernow’s biographies, Rick Atkinson’s Revolution series, and Stephen Ambrose’s Undaunted Courage. He describes rereading the Robert Moses book after 40 years of trying to get things done and finding his appreciation for the achievements rise, even as the flaws stayed the same, because he had changed. He ties history directly to markets through the Mark Twain line that history does not repeat but it rhymes. Patterns recur, every generation maximizes its own crises and minimizes resolved ones, and reading about the black plague, the McCarthy era, or the Vietnam draft is how he stays calm. His conclusion, echoing a sentiment often attributed to Buffett, is that he would not bet against America, a country he describes as mostly good and able to improve.

Notable Quotes

“I invest in risky assets. That’s what’s fun for me.”
Lloyd Blankfein, describing his own portfolio, which he says is roughly 98 percent risky assets

“It’s been good to be bullish on big tech, and I’ll stop being bullish on it when it stops going up.”
Lloyd Blankfein, on why he stays concentrated in technology

“I’m not at a computer. I don’t have a computer. I have an iPad.”
Lloyd Blankfein, on how he day trades every day, alone and with no team

“To me, the market is like music. It’s out there. It’s going on.”
Lloyd Blankfein, on why trading daily feels like a hobby rather than work

“Look, $5 billion if it all goes bad, that’s not even a bad hurricane on the East Coast.”
Warren Buffett to Lloyd Blankfein, waving off the risk of his 2008 investment in Goldman Sachs

“The difference between somebody who’s really, really good and somebody who can’t make it is not that great.”
Lloyd Blankfein, on the thin margin between the best and the rest

“You may think you’re protecting the world from the hundred-year storm, but you’re also going to forego the 99 years of in between when there was growth.”
Lloyd Blankfein, on the cost of trying to legislate risk out of markets after 2008

“I’m in charge of generating the money, and she’s in charge of distributing it.”
Lloyd Blankfein, on his 40-plus-year marriage to Laura and why he has not paid a bill in decades

“History doesn’t repeat, but to paraphrase Mark Twain, it rhymes.”
Lloyd Blankfein, on why reading history keeps the present in proportion

Watch the full conversation with Lloyd Blankfein on the My First Million podcast here.

Related Reading
- Lloyd Blankfein (Wikipedia): background on the former Goldman Sachs chairman and CEO whose investing views anchor the conversation.
- My First Million podcast: the show where this interview took place, for the full back catalog of investor and founder conversations.
- Berkshire Hathaway: primary source on Warren Buffett’s company, which made the roughly five billion dollar Goldman investment in 2008.
- Vanguard S&P 500 ETF (VOO): the diversified index fund Blankfein names as the sensible core holding for a normal investor.
- Die With Zero by Bill Perkins: the book behind the give with your warm hand, not your cold hand philosophy discussed near the end.
June 16, 2026
Whale Rock Capital Founder Alex Sacerdote on S-Curve Investing, Why Anthropic Is His Highest Conviction Bet, and the Decommoditization of AI Hardware
Alex Sacerdote built Whale Rock Capital into one of the most respected technology hedge funds in the world by treating markets through a single disciplined lens: the technology adoption S-curve. In this long conversation on Invest Like the Best with Patrick O’Shaughnessy, he lays out the full framework that has carried him through internet 1.0, mobile, cloud, e-commerce, and now AI, and he explains why Anthropic became his highest conviction position, why his fund went net short application software, and why the least glamorous corner of the market, the hardware and chips that build out data centers, may be one of the best ways to play artificial intelligence right now. What follows is the working theory of a money manager who has spent twenty years trying to think exponentially while the rest of the market thinks one quarter at a time.

TLDW

Sacerdote walks through Whale Rock’s three-part investment framework: find the right part of an S-curve, identify the company with a durable competitive advantage, and buy when long-term earnings power is underappreciated. He tells the story of investing in Anthropic at a 180 billion dollar valuation in August 2025 after Claude Code made coding the true unlock of AI, and frames the foundational model market as a three-horse race between Anthropic, OpenAI, and Google that resolved from sixty startups into an oligopoly. He argues enterprise AI is less than 1 percent penetrated, calls the adoption shape an L curve rather than an S-curve, and warns there is not enough compute in the world. He explains why he sold almost all of his application software and went net short, why he loves the decommoditization of AI hardware (Celestica, Corning, Elite Materials, Delta, Advanced Energy, high bandwidth memory, 40-layer PCBs), introduces a modified rule of 40 for chip investing, surveys the moats that let leaders win (network effects, industry standard, scale, critical IP, brand, recursive self-improvement), discusses moving from public markets into private deals like Stripe and Anthropic, lays out Whale Rock’s fund products including the new Mega Cap Tech Fund, defends old-fashioned scuttlebutt research in an AI age, and closes on the kindest thing anyone ever did for him, his father joining the firm after 41 years at Goldman Sachs.

Thoughts

The most useful idea in this conversation is not the bullishness on AI, which is everywhere now, but the discipline underneath it. Sacerdote’s framework forces a separation that most investors collapse. A great market is not a great investment. A great company is not a great investment. You need a tall S-curve, a company with a moat that survives the curve, and a price that does not yet reflect the earnings power. He says the quiet part out loud: he has repeatedly bought the best companies in the world at four or five times earnings precisely because the market refuses to extrapolate exponential growth. Nvidia at four times earnings in 2023, Tesla at five times in 2019, Amazon where AWS came free. The edge is not information, it is the willingness to underwrite two to four years out when the consensus cannot see past the next quarter.

The Anthropic story is the framework applied in real time, and it is worth noting how late and how cautious he was. Whale Rock passed on the 60 billion dollar round because gross margins were negative and coding had not yet exploded. They only got conviction once Claude Code flipped from autocomplete to agentic work, once they heard Anthropic engineers were burning 100 dollars a day in tokens, and once the math on twenty million coders implied a half trillion dollar market from coding alone. The lesson he repeats throughout, that it is okay to be late, that you can miss the first 100 percent if the curve is tall enough, is a direct rebuke to the fear of missing out that drives most AI investing. He waited for the moat to be visible before he paid up.

His most contrarian and most actionable call is on hardware. The consensus reflex is that chips and components are commodities that get competed to zero. Sacerdote argues the opposite is happening: AI workloads growing 10x a year are pushing every layer of the server to its physical limits, and that pressure is decommoditizing the entire stack. A liquid-cooled AI server is a 300,000 dollar piece of critical infrastructure, not a 5,000 dollar throwaway box, which means the supplier becomes a permanent fixture like a parts vendor on a plane. The Celestica example is the template: a contract manufacturer left for dead since 1999 that turned out to be the sole supplier of Google’s TPU server and a leader in liquid cooling and Ethernet switching, trading at eight times earnings. If he is right that we are 30 percent short on DRAM, NAND, and PCBs, the picks-and-shovels trade has years left to run regardless of which model company wins.

The software bear case deserves the most scrutiny because it is the most consequential and the least certain. Going from 40 to 50 percent of the portfolio in software to net short is a violent reallocation, and his reasons are layered: AI products that nobody will pay for, CIO budgets being raided to fund Anthropic tokens, pricing power evaporating, and the long-term threat that AI-native startups rebuild incumbents from scratch. But he is honest that the bull case is real too, that old technology is sticky, that companies prefer to buy rather than build, and that AI might actually make platforms like Slack or CRM more important if agents end up operating inside them. This is the genuine uncertainty in the whole AI trade. The bottom of Jensen’s cake, chips and models, is where the value has accrued so far, but historically the application layer captured most of the market cap. Sacerdote is betting that this time the infrastructure and model layers hold the value longer, and he admits the application ecosystem is still unclear and a little bit dangerous. That admission is more valuable than any of his confident calls.

Finally, the section on research in an AI age is a quiet refutation of the idea that this work automates away. Sacerdote runs a Philip Fisher scuttlebutt operation, 2,500 to 3,000 face-to-face management meetings a year, two decades of compounding relationships, the tripod of conviction where he, his analyst, and a respected outsider all independently like an idea. AI writes better notes now, but the paragraph on top, the wisdom about what it means and how it fits the thesis, is still human. The durable moat in his own business is the same one he looks for in the companies he buys: an accumulated advantage that newcomers cannot replicate quickly. That consistency between how he invests and how he operates is the most credible thing in the interview.

Key Takeaways
- Whale Rock’s framework has three legs: identify the right part of a technology S-curve, find the company with a powerful competitive advantage, and invest when long-term earnings power is underappreciated.
- The core insight is exponential, not linear. Strong tech business models grow earnings exponentially, and because the market refuses to extrapolate, you can buy elite companies at very low multiples.
- Concrete examples of buying exponential growth cheaply: Nvidia at four times earnings in 2023, Tesla at five times in 2019, Apple at four times, and Amazon where AWS was effectively free.
- When ChatGPT launched in November 2022, Whale Rock did a firm-wide deep dive and chose to invest in chips and infrastructure first, because demand arrives there first and the winners are knowable regardless of who wins the model layer.
- The foundational model market went from roughly 60 startups to a three-horse race: Anthropic, OpenAI, and Google. Most startups died, Amazon never showed up, and Meta faltered and had to reboot.
- Anthropic was the dark horse that focused purely on enterprise while OpenAI won consumer. Whale Rock made it their highest conviction position.
- Coding is the true unlock of AI. The progression went from Microsoft Copilot at 20 dollars a month (fixing grammar, finding a bug) to Claude running agentically and writing most of the code.
- The market math: Anthropic engineers were reportedly spending 100 dollars a day on tokens, roughly 20 to 30 thousand dollars a year, and with about 20 million coders in the world that implies a half trillion dollar market from coding alone.
- Whale Rock invested in Anthropic at the 180 billion dollar valuation in August 2025, when the company hoped to reach 9 billion in revenue and nobody yet knew what 2026 could be.
- Andrej Karpathy and Linus Torvalds both flipped on AI coding. Karpathy went from 80 percent handwritten code to writing almost no code except in English.
- Models are not pure commodities. There is real differentiation: Anthropic is strong for private equity and finance, Google is strong at ingesting PDFs, and routers that switch between models mask but do not erase that differentiation.
- Anthropic is building an ecosystem around the API (SDK, orchestration, the harness, tools), echoing how AWS built lock-in with products around commodity servers starting in 2013.
- The 800 million people using AI are mostly using AI 1.0, a search engine on steroids. Sundar Pichai estimated only about 10 basis points of knowledge workers are truly using AI’s new capabilities.
- Enterprise AI is less than 1 percent penetrated. Whale Rock calls the adoption shape an L curve or backwards L curve because it goes straight up, unlike the slower 30 to 50 percent growth of cloud and SaaS.
- There is not enough compute in the world. Anthropic reportedly has half of what it needs, and Marc Andreessen said the one thing he is sure of is that there will not be enough compute for the next four years.
- The infrastructure S-curve is only about 10 percent penetrated and remains one of the best ways to play AI.
- Getting into private deals requires a double opt-in. Whale Rock did a 90-page deck (built with Claude Code) on the coding market to win their Anthropic allocation, and their first private was Stripe in 2020 at a 35 billion dollar valuation.
- The unicorn private market is now bigger than most European stock markets, larger than Germany or the UK individually. Whale Rock does 2,500 to 3,000 management meetings a year, 10 to 15 percent with privates.
- S-curves come in two sizes: mega S-curves (internet, mobile, cloud, e-commerce, AI) and sub S-curves within them. AI is the biggest of all and each curve builds on the last.
- Adoption inflects when barriers fall. Steve Jobs cut the smartphone price to 200 dollars on a 3G touchscreen, and Elon Musk cut the EV price to 40,000 dollars with a 300-mile range and a working supply chain. Remove the barriers and you get the tornado of demand.
- Knowing how tall the curve is tells you when to sell. Growth stops being exponential around 30 to 40 percent penetration, when the sell side catches up and big beats end. EVs hit a wall at 10 to 15 percent instead of the expected 40 to 50 percent.
- Selling Apple in 2012 at roughly 50 percent US smartphone penetration was a mistake, because the moat let it keep compounding around 20 percent even after the explosive phase ended.
- At strategic inflection points you cannot trust the data (Andy Grove). The signal is intuition and anecdote: a 12-year-old in China on a giant phone playing a real game, or standing-room-only sessions at the Gartner IT Symposium for AWS, VMware, and Splunk.
- Adoption slope varies. The radio curve hit near-full penetration in about 7 years, while B2B and infrastructure (the dishwasher that has to be plugged in) take far longer. AI is fast because you just open a browser.
- The moats that let leaders win: network effects, becoming an industry standard, rapid scale, critical intellectual property, brand, and platform lock-in. Anthropic appears to have critical IP, enterprise brand, escape velocity, and recursive self-improvement from using its own code on its own models.
- On the internet, the leader usually goes bigger, faster, and wins, and compounds on itself (Amazon, Shopify). Exceptions come at paradigm shifts, like AOL failing to make the dialup-to-broadband transition.
- Whale Rock went from 40 to 50 percent in software five years ago to net short entering this year, which helped performance in the first quarter. AI products were not good enough to charge for and were not moving the needle.
- Software faces a stack of headaches: falling priority on CIO to-do lists, budget pressure from token spend, lost pricing power, hiring freezes that hurt seat-based models, and the long-term threat of AI-native replacements.
- The classic rule of 40 is growth rate plus operating margin. Whale Rock’s modified rule of 40 for chip investing is percent of sales that are AI plus market share in that category. Software AI exposure is still only 1 to 2 percent.
- AI may make some platforms more important. The first thing you do with Claude is plug it into Slack, which could make Slack a permanent repository, and agents may end up operating inside incumbent tools like CRM, solidifying rather than killing them.
- The data center stood still for 40 years on Intel x86, with every component commoditized. AI changed that. Workloads growing 10x a year are driving the decommoditization of the hardware industry.
- Celestica is the template: a contract manufacturer left for dead since 1999, sole supplier of the Google TPU server, strong in liquid cooling and Ethernet white-box switching, with 50 to 60 percent share of the cloud Ethernet switch market, once trading at eight times earnings.
- The whole supply chain is rerating: high bandwidth memory stacked 10 chips high, 40-layer PCBs (versus 10 for a normal server), Elite Materials copper clad laminate, Corning fiber (enough to circle the world four and a half times in one Microsoft data center), and Delta and Advanced Energy power supplies seeing ASPs rise 40 percent a year.
- Networking has three layers: scale out (racks together), scale across (data centers together), and scale up (every GPU in a rack, currently copper, eventually fiber). The copper-to-fiber shift could double or triple Corning’s opportunity.
- Whale Rock estimates the market is roughly 30 percent short on DRAM, NAND, and PCBs even at today’s 10 basis points of real AI usage.
- Rate of change matters more than absolute level. When Claude plotted market share data it missed the rate of change, the thing that drives accelerating growth and margins as a company moves from 10 to 30 percent share.
- Key risks: public and government negativity toward AI (Maine reportedly banned data centers, only 20 percent of people are optimistic), models hitting a wall and letting open source catch up into a race to the bottom, and a major player faltering and stranding compute.
- Chip companies do not care who wins the token war, which makes them a relatively safe way to play AI. Jensen Huang actively wants open source to take off.
- Research is still human work. Whale Rock runs a Philip Fisher scuttlebutt process, the tripod of conviction (Alex, the analyst, and a respected outsider), and 20 years of compounding knowledge. AI writes better notes but cannot supply the wisdom paragraph on top or pick stocks.
- The firm’s product evolution: 15 years as a long short fund, a long only fund in 2020 that is now larger than the long short, opt-in privates formalized around 2015 and activated in 2020, an 80 percent privates hybrid fund in 2021, and the new Whale Rock Mega Cap Tech Fund.
- The Mega Cap Tech Fund thesis: endowments are structurally underweight the largest tech companies because they believe there is no alpha in large cap. Whale Rock takes the top 30 global market caps and picks the best 12 or 13, arguing it takes 100 diversified PMs to realize Google is a winner.
- The kindest thing anyone ever did for Sacerdote: his father, after 41 years at Goldman Sachs, joined Whale Rock as chairman and the gray hair for six years until he passed away in 2011.
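
The two rule-of-40 variants mentioned in the takeaways above can be written as simple sums. This is a minimal sketch: the function names and the example inputs are hypothetical, and applying the same 40-point threshold to the modified version is an assumption, since the summary only gives the two components.

```python
def classic_rule_of_40(revenue_growth_pct, operating_margin_pct):
    """Classic software metric: growth rate plus operating margin."""
    return revenue_growth_pct + operating_margin_pct


def modified_rule_of_40(ai_pct_of_sales, market_share_pct):
    """Chip variant described above: percent of sales that are AI
    plus market share in that category."""
    return ai_pct_of_sales + market_share_pct


def clears_threshold(score, threshold=40):
    """True when a score meets the threshold (40 is assumed for both variants)."""
    return score >= threshold


# Hypothetical inputs, not figures from the interview.
software_score = classic_rule_of_40(revenue_growth_pct=25, operating_margin_pct=20)
chip_score = modified_rule_of_40(ai_pct_of_sales=30, market_share_pct=50)

print(software_score, clears_threshold(software_score))
print(chip_score, clears_threshold(chip_score))
```

A software company with only 1 to 2 percent AI exposure would need an implausibly dominant market share to clear the modified version, which is consistent with why the firm applies it to chips rather than to software.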
Detailed Summary

The Anthropic Investment and the Three-Horse Race

When ChatGPT launched in November 2022, Whale Rock immediately took its 10-person team and ran a firm-wide deep dive. Sacerdote’s first principle is that every new compute paradigm creates a new stack with new winners and losers, and in this stack the layers run from power and chips at the bottom, to the clouds, to the foundational models, to the applications on top. In early 2023 the firm deliberately positioned in chips and infrastructure first, reasoning that demand arrives there first and the winners are knowable no matter who wins above. At an April 2023 webinar they framed the model layer as a coin flip between winner-take-all, total commodity, a race to zero, or an oligopoly of three or four. Over the next three years the answer became clear: of roughly 60 startups, almost all died, Amazon never really showed up, Meta came in strong then faltered and rebooted, and Anthropic emerged as the dark horse focused purely on enterprise while OpenAI won consumer and Google remained a perennial threat. The result looked like the cloud market, where three companies underpin the entire SaaS world with excellent businesses.

The decisive factor was code. Sacerdote says the firm was initially skeptical AI could replace labor, given the negative corporate feedback on early models. That changed in 2025 when Claude Code and the agentic coding tools exploded. The progression ran from Microsoft Copilot at 20 dollars a month, which could improve coding grammar or find a bug, to Claude running agentically and doing far more. The token economics were staggering: Anthropic engineers reportedly spending 100 dollars a day, which annualizes to 20 to 30 thousand dollars, and with 20 million coders worldwide that implied a half trillion dollar market from coding alone, on technology that was only 7 to 9 months old. Whale Rock made the investment at the 180 billion dollar valuation in August 2025, writing in their letter that the company hoped to reach 9 billion in revenue, with growth like nothing they had ever seen, 100 million to a billion on the way to 9 billion, and no one yet knowing what 2026 could bring.
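
The token math in this paragraph is easy to check. The sketch below is only a back-of-envelope reconstruction: the working-day counts (200 to 300 per year) are assumptions chosen to reproduce the stated 20 to 30 thousand dollar range, and the function names are illustrative.

```python
def annual_token_spend(dollars_per_day, working_days_per_year):
    """Yearly token bill for one engineer."""
    return dollars_per_day * working_days_per_year


def coding_market_size(number_of_coders, spend_per_coder):
    """Implied market if every coder spent the same amount on tokens."""
    return number_of_coders * spend_per_coder


DAILY_SPEND = 100            # dollars per day, as reported in the interview
WORLDWIDE_CODERS = 20_000_000

# Assumed working days per year; the source gives only the annualized range.
low = annual_token_spend(DAILY_SPEND, 200)
mid = annual_token_spend(DAILY_SPEND, 250)
high = annual_token_spend(DAILY_SPEND, 300)

market = coding_market_size(WORLDWIDE_CODERS, mid)

print(f"Annual spend per coder: {low:,} to {high:,} dollars")
print(f"Implied coding market at the midpoint: {market:,} dollars")
```

At the 25,000 dollar midpoint the product is 500 billion dollars, which matches the half trillion figure. The estimate assumes every coder in the world spends like an Anthropic engineer, so it is better read as a ceiling than as a forecast.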

Why the Models Are Not Commodities

Everyone expected the foundational models to be pure commodities, but Sacerdote argues there is tremendous differentiation within them. Different training methods produce different skills: Anthropic excels at anything touching private equity and finance, Google is strong at ingesting PDFs. Routers that switch between models make them look like commodities but mask genuine, critical IP. Beyond the model itself, Anthropic is building a whole ecosystem around the API: the SDK, the orchestration layer, the tools, and the harness, the software wrapped around the API that gets the most out of the model. He compares this directly to AWS in 2013, when people dismissed cloud as commodity servers in a warehouse and missed that Amazon was inventing products that slowly built lock-in. The open-source risk from China is real, but Sacerdote got comfortable that leading-edge token quality is superior, because going from 80 to 85 percent of benchmark performance is a huge unlock and the open-source players lack the compute to leapfrog the frontier.

The S-Curve Framework in Full

Whale Rock’s whole edge is thinking exponentially when the world thinks linearly. Sacerdote argues very few people believe you can accurately predict two, three, or four years out, but if you understand the S-curve, the moats, and how to model, you can. Every technology follows the same pattern: it exists hidden for years (smartphones 10 years before the iPhone, the internet 20 years before Netscape, EVs 15 years before Tesla went vertical in 2019) until the barriers to adoption fall and demand inflects into a tornado. Knowing how tall the curve is tells you when to sell, because exponential growth stops around 30 to 40 percent penetration when the sell side catches up. Curves can also be dynamic: AWS turned out to address a far larger TAM than expected once it became clear cloud was not actually deflationary. There are mega S-curves (internet, mobile, cloud, e-commerce, AI) and sub S-curves within them. AI is the biggest. Slope also varies enormously with the nature of the technology: radio reached near-full penetration in about 7 years, while B2B and infrastructure take decades because, like a dishwasher, they have to be plugged into existing systems.
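
One way to see why growth stops looking exponential partway up the curve is to model adoption with a logistic function. The logistic form, the midpoint, and the steepness below are assumptions for illustration. The interview describes the shape of the curve but does not give a formula.

```python
import math


def penetration(year, midpoint=10.0, steepness=0.6):
    """Fraction of the market adopted by a given year under a logistic S-curve.

    midpoint is the year of 50 percent penetration; steepness sets the slope.
    Both are illustrative assumptions.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (year - midpoint)))


def year_over_year_growth(year, **curve_params):
    """Growth in penetration from one year to the next, as a fraction."""
    return penetration(year + 1, **curve_params) / penetration(year, **curve_params) - 1.0


for year in range(2, 17, 2):
    p = penetration(year)
    g = year_over_year_growth(year)
    print(f"year {year:2d}: penetration {p:6.1%}, next-year growth {g:6.1%}")
```

Early in the curve the yearly growth rate is high and nearly constant, which is the exponential phase. It then falls steadily as penetration rises, so the big beats fade well before the market is saturated. On this simple model the slowdown is gradual rather than a sharp break at 30 to 40 percent, so that range is best read as Whale Rock's practical rule of thumb.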

On timing, Sacerdote is relaxed about being late. Citing Peter Lynch, who mentored him at Fidelity and told him to white out the chart because it is all about the future, he argues it is fine to miss the first one, two, or three years and even the first 100 percent if the top of the curve is half a trillion. At strategic inflection points, per Andy Grove, you cannot trust the data, so the firm relies on intuition and anecdote: a 12-year-old in China playing a real video game on a huge phone, or the AWS session at the Gartner IT Symposium that was standing-room-only at 9, 10, and 11 in the morning. Spotting the leader pulling away matters because, on the internet, the leader usually goes bigger, faster, and wins, compounding on itself, with exceptions only at paradigm shifts like AOL missing the move from dialup to broadband.

The Software Bear Case

Five years ago Whale Rock had 40 to 50 percent of its portfolio in software. Their April 2023 thesis was that incumbents with huge sales forces and proprietary data would take the AI APIs and build great products. Instead, the AI products were not good enough to charge for and did not move the needle, so the firm sold almost all of its application software and entered this year net short, which helped in the first quarter. The bear case is layered: software has fallen down the CIO priority list, budgets are being raided to fund Anthropic tokens with faster ROI, annual price increases look risky, and hiring freezes hurt seat-based models. The deeper threat is that AI-native startups could rebuild any incumbent from scratch, obviating the data advantage. The bull case is genuine too: old tech is sticky (mobile games did not kill consoles, tablets did not kill the PC), companies prefer to buy rather than build, and an ERP is hard to replace. Sacerdote also floats an optimistic twist, that AI could make platforms like Slack more important as agent repositories, and that agents operating inside CRM could solidify rather than destroy it, even as the bear case is that CRM goes headless and gets relegated to a database.

The Decommoditization of AI Hardware

This is Sacerdote’s most differentiated call. For 40 years nothing changed in the data center; Intel x86 became the standard, compute grew 25 to 40 percent a year in line with Moore’s Law, and every component, from the printed circuit board to memory to enclosures to networking, commoditized. AI broke that. Workloads now grow 10x a year and push every aspect of the hardware to its physical limits, creating both tremendous unit growth and what Whale Rock calls the decommoditization of the hardware industry. He cites Shaun Maguire wishing he could run a hardware hedge fund because all the companies are public with powerful IP, and compares it to Sequoia’s best early hardware investments in Apple and Cisco. The economics flip because an AI server is a liquid-cooled, 200 to 300 thousand dollar piece of critical infrastructure where a single failure brings the whole thing down, so suppliers become permanent like a critical part on a plane.

Celestica is the marquee example: a contract manufacturer that had been a disaster industry since 1999 and went offshore to China, but kept its IBM supercomputing heritage and talent, became the sole supplier of the Google TPU server, and was trading at eight times earnings three years ago. It turned out to be excellent at liquid cooling where others failed, holds 50 to 60 percent share of the crucial cloud Ethernet switch market, and its engineers helped write the open-source SONiC software, working closely with Broadcom. The same dynamic runs up and down the chain: high bandwidth memory stacked 10 chips high that took Samsung years to master, 40-layer PCBs versus 10 for a normal server with very few suppliers able to make them, Elite Materials supplying the copper clad laminate, and Corning’s fiber, thinner and more bendable, with enough in a single Microsoft data center to circle the world four and a half times. Networking splits into scale out, scale across, and scale up, with the eventual copper-to-fiber shift in scale up potentially doubling or tripling Corning’s opportunity. Power supplies from Delta and Advanced Energy are seeing ASPs rise 40 percent a year at higher margins because each Nvidia rack uses 50 to 125 percent more power. Visibility has gone from “we’ll call you next week” to “design this roadmap with us for four years,” turning 5 percent low-margin businesses into 35 to 50 percent topline growers with rising margins, and the whole market is roughly 30 percent short on DRAM, NAND, and PCBs.

Private Markets, Risks, and the Research Machine

Moving from public markets into privates meant adapting to a double opt-in, where the company has to choose to let you in. Whale Rock won its Anthropic allocation partly by building a 90-page deck with Claude Code scouring the internet for feedback on the coding market. Their first private was Stripe in April 2020 at a 35 billion dollar valuation, which they could only underwrite because they knew the public comp Adyen cold, and they upsized to a 100 million dollar block. The unicorn market is now bigger than most European stock markets combined. On risk, Sacerdote worries about public and government negativity (Maine reportedly banning data centers, only 20 percent of people optimistic), the possibility that models hit a wall and open source catches up into a race to the bottom, and a major player faltering and stranding compute, though he notes someone else (like Meta stepping into a cancelled Oracle deal) would likely absorb it, and that chip companies benefit regardless of who wins the token war. He explains his caution on the application layer by noting that it always comes later: the iPhone took years to spawn its app economy, and the AI application ecosystem is still unclear and a little dangerous. He points to Bret Taylor’s Sierra as the kind of company that could prove it out.

On the research itself, Sacerdote insists AI has not supplanted the analyst. Whale Rock runs the scuttlebutt approach straight out of Philip Fisher’s Common Stocks and Uncommon Profits, doing 2,500 to 3,000 face-to-face management meetings a year and talking to suppliers, customers, and competitors. AI now writes much better notes and gets the team up to speed quickly on complex areas like ABF substrates, but there must be a wisdom paragraph on top, and it cannot pick stocks or replicate the work two analysts did building conviction in AppLovin and a relationship with Adam Foroughi. He calls the firm the Whale Rock learning machine, a group of 10 highly experienced people compounding knowledge for 20 years, with the tripod of conviction (himself, his analyst, and a respected outside investor all liking an idea) as the test. The firm’s products evolved from a 15-year long-short fund to a 2020 long-only fund now larger than the original, opt-in privates, an 80 percent privates hybrid in 2021, and the new Mega Cap Tech Fund built on the thesis that endowments are structurally underweight the largest tech companies because they wrongly believe large cap has no alpha. He closes on his father, who left Goldman after 41 years to join Whale Rock as chairman and the gray hair until his death in 2011, a mentor remembered by countless people for his humility and grace.

Notable Quotes

“When you get the right part of the S-curve, you get exponential unit growth. If you have a very strong business model, your earnings don’t grow linearly, they grow exponentially.”
Alex Sacerdote, stating the core of the Whale Rock investment framework

“The world doesn’t think exponentially. Very few people believe you can accurately predict two, three, four years out. But if you follow and understand the S-curve and you know the moats and you know how to model, you really can predict these great things.”
Alex Sacerdote, on why the market consistently underprices long-term earnings power

“The enterprise AI or enterprise application AI market is less than 1 percent penetrated, and we’ve never seen, you know, we talk about S-curves, we call this an L curve, just straight up.”
Alex Sacerdote, on why AI adoption looks different from every prior technology curve

“We’re at 10 basis points of people really using AI and we’re already sold out. There’s not enough compute in the world. So Anthropic has half of what they need right now, and that’s before this huge takeup.”
Alex Sacerdote, on the scale of the compute shortage relative to actual adoption

“It’s okay to be late. It’s okay to miss the first one, two, three years in a lot of cases, because if the top of the S-curve is half a trillion, the growth can go on for a long time. It’s okay to miss the first 100 percent.”
Alex Sacerdote, on why fear of missing out is the wrong instinct in a tall S-curve

“The old way of software is like using a pen and paper or a horse and buggy. The new way of software is like a jet engine or frankly like the transporter from Star Trek. It’s so revolutionary it feels like it has to be disruptive.”
Alex Sacerdote, explaining why Whale Rock went net short application software

“You become like critical infrastructure, like selling a critical part on a plane. You’ll never get swapped out.”
Alex Sacerdote, on how liquid-cooled AI servers turned commodity hardware suppliers into permanent fixtures

“Why do you tell everyone your secret? It’s like why does the casino teach people how to play blackjack? It’s harder. It’s really hard to do.”
Alex Sacerdote, quoting his mother on why a public framework does not erase the edge

“He said, you know, I’ve been at Goldman for 41 years. How about I come and join you? I’ll be the gray hair. I’ll be the oversight. I’ll be the chairman. You do what you do.”
Alex Sacerdote, recalling his father joining Whale Rock, the kindest thing anyone ever did for him

Watch the full conversation here: Whale Rock Capital Founder on Investing in the Age of Exponential AI.

Related Reading
- Invest Like the Best (Colossus) — the podcast where Patrick O’Shaughnessy hosts this conversation and a deep archive of investor interviews.
- Technology adoption life cycle (Wikipedia) — the tinkerers-to-mainstream model that underpins the entire S-curve framework Sacerdote uses.
- Anthropic — the maker of Claude and Claude Code, Whale Rock’s highest conviction position and the center of this discussion.
- Common Stocks and Uncommon Profits by Philip Fisher — the 1950s classic whose scuttlebutt method still drives Whale Rock’s research process.
- Andy Grove (Wikipedia) — the Intel leader whose idea that you cannot trust the data at strategic inflection points anchors Sacerdote’s approach to timing.
June 9, 2026
Uber CEO Dara Khosrowshahi on AI, Autonomous Vehicles, Robotaxis, Drones, and the Future of Transportation
Uber CEO Dara Khosrowshahi sat down with Patrick O’Shaughnessy on the Invest Like the Best podcast for a long, candid conversation about the forces remaking transportation. There is artificial intelligence inside the company, and there is physical AI out in the real world, meaning autonomous vehicles, robotaxis, and delivery drones. He calls the autonomous opportunity another trillion dollar marketplace and argues it will change how society operates. You can watch the full interview here. What follows is a structured breakdown of the most useful ideas, the strategy behind Uber’s AV bet, and the operating philosophy that runs underneath all of it.

TLDW

Dara Khosrowshahi explains how he brought order to the chaos he inherited at Uber in 2017 by treating hard problems like vector mathematics, and how an immigrant childhood shaped his all-in, low-stress operating style. He describes AI hitting Uber on two fronts at once: much larger digital models that predict rider intent, and physical AI that changes how rides and food get fulfilled in the real world. The conversation covers Uber blowing through a full year of AI budget in a single quarter, metering headcount as engineers become superhuman, the more than 30 AV partnerships with Waymo, Nuro, Lucid, Nvidia, Wayve, and Pony AI, and why supply, not demand, is the whole game. It runs through the coexistence model borrowed from travel and Uber Eats, the Uber One membership flywheel at 50 million members, the push from on-demand to planned travel through hotels and Uber Reserve, the economics of cheaper autonomous cars and delivery drones, the regional race from the Middle East to Europe, and the lessons from Barry Diller and Herbert Allen about getting to ground truth and betting on people. It closes on his capital allocation philosophy of prioritizing organic growth and AV commitments over buybacks.

Thoughts

The most underappreciated line in the whole interview is the budget one. Blowing a full year of AI spend in a single quarter is the clearest signal yet that frontier intelligence is being consumed far faster than even an AI-native company planned for. Dara’s response has quietly become the default enterprise playbook: explore on the expensive frontier models, then scale the proven interactions onto cheaper or open-source models. The deeper tension is that he is simultaneously telling teams to drive adoption and metering headcount, which is the real story of AI in large companies. The productivity gains are showing up as fewer hires, not only as faster shipping.

The supply-first framing is the strategic core, and it inverts the demand-first logic he learned at Expedia. In autonomous vehicles this means Uber does not need to win the self-driving race itself. It needs to own the demand layer and aggregate every AV maker’s supply, the same way online travel agents coexist with hotels and Uber Eats coexists with McDonald’s. The 30 percent higher utilization figure for AVs on Uber’s network is the wedge in that argument. It is the reason a Waymo stays on the platform even while building its own brand, because filling more of an expensive asset’s day changes the entire return on the car.

His premortem answer is unusually honest. Asked what kills the opportunity, he does not name an Uber-specific execution failure. He names AI’s unpopularity with the general public. That is a CEO admitting the gating factor is social license, not technology. The early data he leans on, drivers in Austin and Atlanta earning more and signing up in greater numbers as AVs add incremental demand, is the counter-narrative he is betting the public conversation on. Whether that story holds as AV volume scales from thousands of vehicles to hundreds of thousands is the open risk the entire industry shares.

Underneath the strategy is one repeated instinct: get to ground truth. It shows up in the Barry Diller story about reading the model from the analyst who built it, in his hunt for the troublemakers who keep a company mutating, and in the fact that he bought an ebike to deliver food in San Francisco. It is the same move applied at every altitude, and it is why he frames AI as a chance to rebuild processes from first principles rather than shave 20 percent off the ones that exist. The leaders who treat AI as an efficiency tool will likely lose to the ones who rebuild from the ground up.

Key Takeaways
- Dara took the Uber job in 2017 after Daniel Ek recommended him at the Allen and Company Sun Valley conference and told him, when he hesitated, that life is about impact rather than happiness.
- He inherited what he calls complete chaos: a board fighting for control, lost trust with regulators and the public, and a committee running the company after Travis Kalanick stepped back.
- His method for chaos is to treat it like vector mathematics, breaking a seemingly unassailable problem into component dimensions and solving each one.
- Early moves included bringing in chairman Ron Sugar to unite the board, running a listening tour with stakeholders, and rebuilding the executive team with leaders like Andrew Macdonald and Tony West.
- He credits an engineering mindset and an immigrant childhood for his calm under pressure. His family lost everything leaving Iran when he was nine and rebuilt from nothing.
- On parenting, he argues that overcoming challenges is what forms people, and that doing everything for your kids is a long-term disservice disguised as a short-term favor.
- Uber has always operated in a probabilistic real world of traffic, cancellations, and late food, so it has used machine learning longer than most consumer companies.
- The current inflection is AI on two fronts: larger digital models that predict intent, and physical AI that changes how Uber fulfills in the real world.
- Uber’s feed and search models are now roughly 10,000 times bigger than the older ones, enabling universal search across rides, eats, and grocery in a single query.
- Uber can already guess a rider’s destination about three quarters of the time, turning booking into a one-tap interaction.
- AI adoption is bottoms-up across engineering, legal, and marketing. Developers in India are driving roughly ten times the code commits using autonomous agents.
- Dara pushes teams to rebuild processes from first principles with AI rather than settling for 20 to 30 percent optimization of an existing process.
- He wants the rebels and troublemakers to win, and treats unpredictable internal adoption patterns as something to find and promote.
- Uber blew through its full-year AI budget in a single quarter, which is now forcing it to meter headcount as engineer throughput climbs.
- The token strategy is to explore on expensive frontier models, then scale proven interactions onto cheaper or open-source models.
- Uber generates over 10 billion dollars in free cash flow on more than 10 billion trips a year, but it is not a high-margin business, so efficiency funds lower prices and higher earnings.
- In autonomous vehicles, the thesis is supply: own the demand layer and aggregate every AV maker’s vehicles, the way Uber aggregates drivers and restaurants.
- Uber has more than 30 AV partnerships, including Waymo, Nuro, Lucid, Nvidia, Wayve, and Pony AI.
- Uber is building the surrounding ecosystem: depots, charging, fleet partners, a one billion dollar Santander financing line for EV and AV fleets, and autonomous insurance.
- AVs operating on Uber’s network are about 30 percent busier in trips and revenue per vehicle per day than vehicles not on the network, which transforms the return on an expensive car.
- The build, partner, or buy answer is coexistence, mirroring how travel agents coexist with hotels and airlines and how Uber Eats coexists with McDonald’s, Starbucks, and Chipotle.
- His public premortem is that AI’s unpopularity, not Uber-specific execution, is the biggest risk, so the company must move at the pace society will accept to avoid backlash.
- Early data in Austin and Atlanta shows drivers earning more and more drivers joining, suggesting AVs are adding incremental demand rather than only displacing humans.
- AV hardware costs typically fall 30 to 40 percent per generation. A Lucid midsize built with Nuro could land around 60,000 to 70,000 dollars and bring transportation costs down.
- Lower cost expands demand. Uber already dwarfs the taxi market it was once sized against, and Dara expects the same dynamic with AVs.
- Traditional OEMs are now investing in L4-ready systems and should arrive over the next two to four years. Each AV drives roughly three to four times what a human driver does.
- Chinese manufacturing capability and bill of materials are described as unrivaled. A low-cost Western, Foxconn-style player for AVs is being worked on but does not exist yet.
- Drones are gated by battery density. Food and grocery drones should reach real scale in two to five years and become normal in five to ten, with Joby and Zipline cited as examples.
- The Middle East, including Abu Dhabi, Dubai, and Saudi Arabia, is moving fastest thanks to entrepreneurial regulators. Europe is catching up, with London robotaxi pilots expected before year end.
- Uber Eats wins the number one position more often internationally. The playbook is selection plus reliability, amplified by cross-platform upsell, with about 13 percent of Eats bookings coming from the mobility app.
- Uber One has 50 million members growing 50 percent year on year. Dara frames it like Netflix, more content for the same price, and accepts a first-year loss for multi-year profit.
- Uber is pushing from on-demand to planned through hotels, via a deal with Expedia, and through Uber Reserve, now at over a 5 billion dollar run rate with 99 percent-plus reliability.
- His leadership lessons: from Barry Diller, get to ground truth from source material and tell the truth as a leader. From Herbert Allen, bet on people, not companies.
- On capital allocation, he prioritizes organic growth and financialized AV commitments over buybacks, while keeping costs growing slower than revenue.

Detailed Summary

From chaos to structure: the 2017 turnaround

Dara came to Uber from 13 years running Expedia under Barry Diller, recruited through a headhunter after Daniel Ek floated his name at the Sun Valley conference. He arrived into what he describes as complete chaos, with the board fighting over control rather than the fate of the company and trust badly damaged with regulators, the public, and employees. His approach was to decompose the situation the way an engineer decomposes a multidimensional problem, solving each dimension and reassembling the whole. Practically that meant a new chairman in Ron Sugar to unite the board, a listening tour to understand stakeholder concerns, and a rebuild of the leadership team that kept strong insiders like Andrew Macdonald while adding people like Tony West.

An engineering mind and an immigrant chip on the shoulder

His wife Sid calls him a robot, by which she means he does not get rattled. He traces that to an engineering education and to a childhood upheaval. His family left Iran when he was nine and lost the business his father had built, and he watched that loss diminish his father over the years. The experience produced a durable drive to rebuild and a refusal to let external chaos define him internally. He applies a similar philosophy to his kids, arguing that challenges and the act of overcoming them are what form a person, and that helicopter parenting removes the very friction that builds capability.

AI inside Uber: prediction, agents, and superhuman engineers

Uber has always lived in a probabilistic world where the digital booking is deterministic but the real-world fulfillment is not, so it adopted machine learning earlier than most consumer companies. The newest models are roughly 10,000 times larger than the prior generation and power universal search and destination prediction that is right about three quarters of the time. Internally, adoption is bottoms-up and uneven in a good way, with engineers in India shipping around ten times the code commits using autonomous agents. Rather than mandate from the top, Dara pushes teams to rebuild whole processes from first principles with AI instead of trimming a fifth off the existing ones.

The cost of intelligence

The flip side of fast adoption is cost. Uber blew through its annual AI budget in a single quarter, and that is forcing a real adjustment. Because engineer throughput is climbing, the company is metering headcount increases rather than simply hiring. The operating rule is to keep driving adoption while pursuing efficiency, using frontier models from providers like OpenAI and Anthropic to experiment with new interactions, then moving the scaled experiences onto more efficient or open-source models to bring the per-token cost down. With more than 10 billion dollars of free cash flow on over 10 billion trips, Uber is not a high-margin business, so efficiency directly funds lower prices for riders and higher earnings for drivers.

Why supply decides the AV race

At Expedia, Dara learned a demand-first model where you attract consumers and then build inventory to match. Uber is the opposite, a supply company, where securing every car, restaurant, courier, and retailer causes the demand to follow. Applied to autonomous vehicles, the strategy is to be the go-to-market and demand layer for anyone building a digital driver. Uber wants to aggregate the largest pool of AV supply, just as it aggregates human drivers, so that the companies building the actual self-driving software can focus on the driver while Uber handles distribution and utilization.

Building the ecosystem around the digital driver

Uber now has more than 30 AV partnerships spanning Waymo, Nuro, Lucid, Nvidia, Wayve, and Pony AI, and it expects many winners rather than one, the same shape as the foundation model market. Around those partners it is assembling the connective infrastructure: depots and charging in cities where the regulatory path is opening, fleet partners, a one billion dollar financing line with Santander for EV and AV fleets, and work on autonomous insurance. It is also collecting street data today that can feed the models, so that when a partner’s cars hit the market there is instant demand waiting. The early proof point is that AVs on Uber’s network run about 30 percent busier than comparable vehicles off it, which materially improves the return on a costly car.
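A rough way to see why a 30 percent utilization lift matters for an expensive asset is to look at payback time. The sketch below is illustrative only: the vehicle cost, daily revenue, and daily operating cost are hypothetical placeholders, and it assumes operating cost stays flat as trips increase, which a real model would not. Only the 30 percent uplift comes from the interview.

```python
def payback_days(vehicle_cost, daily_revenue, daily_operating_cost):
    """Days of operation needed for net daily cash to cover the vehicle's cost."""
    net_per_day = daily_revenue - daily_operating_cost
    if net_per_day <= 0:
        raise ValueError("vehicle never pays back at these inputs")
    return vehicle_cost / net_per_day

# Hypothetical inputs, chosen only to make the arithmetic visible.
VEHICLE_COST = 150_000          # dollars, assumed
OFF_NETWORK_REVENUE = 300       # dollars per day, assumed
DAILY_OPERATING_COST = 100      # dollars per day, assumed flat
NETWORK_UPLIFT = 0.30           # the roughly 30 percent figure cited above

off_network = payback_days(VEHICLE_COST, OFF_NETWORK_REVENUE, DAILY_OPERATING_COST)
on_network = payback_days(
    VEHICLE_COST,
    OFF_NETWORK_REVENUE * (1 + NETWORK_UPLIFT),
    DAILY_OPERATING_COST,
)

if __name__ == "__main__":
    print(f"off network: {off_network:.0f} days to pay back")
    print(f"on network:  {on_network:.0f} days to pay back")
```

Because the uplift lands on revenue while the vehicle cost is fixed, the payback period shrinks by more than the headline utilization number might suggest whenever operating costs do not rise in step.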

The premortem and the public’s patience

Asked what derails the opportunity, Dara points outward rather than inward. The risk is that AI is powerful but unpopular, and the average person experiences it as a threat to electricity costs or a cousin’s job rather than as magic. The same dynamic could hit AVs even though the technology should end up safer than human drivers, which is why questions about emergency services, equitable access, and driver earnings have to be worked through with regulators and communities. The encouraging early signal is in Austin and Atlanta, where drivers are making more money and more are joining because AVs appear to be adding incremental demand. The controllable risk, he says, is access to supply, which is exactly why Uber has partnered with nearly every AV provider across mobility, delivery, and freight.

A trillion dollar marketplace: cheaper cars and delivery drones

Dara sizes the autonomous opportunity as another trillion dollar marketplace. As AV software and hardware costs fall, typically 30 to 40 percent per generation, a Lucid midsize built with Nuro could come in around 60,000 to 70,000 dollars, which starts to lower the real cost of transportation. History says lower cost expands demand, and Uber already became multiples larger than the taxi market it was once compared to. Manufacturing scales from hundreds to thousands to hundreds of thousands of vehicles, each driving three to four times what a human does, with traditional OEMs investing in L4-ready systems over the next two to four years and Chinese manufacturers setting the bar on cost and quality. Delivery drones are further out, gated mainly by battery density, but should reach real scale in two to five years and feel normal in five to ten.
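The compounding in that cost claim is easy to underestimate, so here is a small sketch of it. The starting cost of 100,000 dollars and the number of generations are hypothetical; only the 30 to 40 percent decline per generation comes from the conversation.

```python
def cost_after(start_cost, decline_per_generation, generations):
    """Cost after repeated percentage declines, one per hardware generation."""
    return start_cost * (1.0 - decline_per_generation) ** generations

START_COST = 100_000  # dollars, a hypothetical starting point

if __name__ == "__main__":
    for decline in (0.30, 0.40):
        path = [round(cost_after(START_COST, decline, g)) for g in range(4)]
        print(f"{decline:.0%} per generation: {path}")
```

Under these assumptions, three generations are enough to cut the cost by roughly two thirds at the low end of the range and by nearly four fifths at the high end.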

Membership, hotels, and the shift from on-demand to planned

Uber Eats often reaches the number one position internationally by nailing selection and reliability and then layering on cross-platform advantages, with roughly 13 percent of Eats bookings flowing from the mobility app. Uber One, at 50 million members growing 50 percent year on year, is the loyalty engine, and Dara likens it to Netflix in that members get more for the same price. He explains the membership economics through Amazon Prime, accepting a money-losing first year to earn multi-year profit as members spend more across services. The newest expansion is travel: hotels through a deal with Expedia, and a broader move from Uber’s on-demand brand toward planned bookings, proven out by Uber Reserve at a 5 billion dollar-plus run rate and 99 percent-plus reliability. The end state he wants is a trip where Uber pre-books your ride to the airport, knows your hotel, and brings in-market magic to the whole journey.

Operating philosophy: ground truth, troublemakers, and capital allocation

The mentors thread through everything. From Barry Diller, with whom he worked for more than 20 years, he took the discipline of getting unfiltered truth from the source, illustrated by Diller insisting on hearing the Paramount LBO model from the young analyst who built it. From Herbert Allen he took the lesson to bet on people rather than companies, because great people stay great across cycles. In his own practice that becomes radical transparency, a deliberate hunt for the troublemakers who act as the mutations that keep an organism from dying, and a willingness to be wrong, since learning, often through pain, is what he finds interesting. On capital, he treats allocation as an art, prioritizing organic growth, which took Uber Eats from under a billion to over a hundred billion in gross bookings, then AV commitments that can be financialized, with buybacks coming after growth rather than instead of it.

Notable Quotes

“I know who I am, and I’m always going to be that same person. I’m not going to let the chaos of the world affect me mentally.”
Dara Khosrowshahi, on why crisis does not rattle him

“We blew through our AI budget in a quarter, you know, for the whole year essentially. And it is forcing us to adjust.”
Dara Khosrowshahi, on the real cost of AI adoption at Uber

“What’s magical now is going to seem normal to all of us 10 years from now.”
Dara Khosrowshahi, on how fast riders stop noticing autonomous vehicles

“We think it’s another trillion dollar marketplace.”
Dara Khosrowshahi, on the scale of the autonomous vehicle opportunity

“If we do that, the demand will take care of itself.”
Dara Khosrowshahi, on why Uber obsesses over securing supply first

“I’m looking for those mutations. I’m looking for those troublemakers constantly.”
Dara Khosrowshahi, on keeping a large company adaptive

“It’s the filtering that gets the edge out of the story or out of the situation. And it’s often the edge that gives you an edge.”
Dara Khosrowshahi, on a lesson from Barry Diller about going to the source

“If I’m not wrong, if I’m not making mistakes, it’s just not very interesting.”
Dara Khosrowshahi, on why learning, often through pain, drives him

“Meeting her and seeing her operate, I think, finally allowed me to be the person I want to be versus the person I thought I was supposed to be.”
Dara Khosrowshahi, on his wife Sid, when asked the kindest thing someone has done for him

The throughline is that Uber intends to be the demand layer for autonomous transportation the way it became the demand layer for human drivers, while rebuilding its own operations around AI from first principles. Whether the public grants the industry enough patience is the open question Dara keeps returning to. Watch the full conversation here.

Related Reading
- Uber: primary source for the company, products, and AV partnerships discussed in the interview.
- Dara Khosrowshahi (Wikipedia): background on the CEO’s path from Iran to Expedia to Uber.
- Invest Like the Best: the podcast with Patrick O’Shaughnessy where this conversation took place.
- Waymo: the autonomous driving company behind the Austin and Atlanta partnerships referenced.
- Barry Diller (Wikipedia): the mentor whose lessons on ground truth shaped Dara’s leadership style.
June 3, 2026
Dan Loeb on Building Third Point’s $25 Billion Investment Empire: AI, Activism, Credit, and the FTX Mistake
Dan Loeb has spent three decades turning a $3 million fund into Third Point, a roughly $25 billion collection of hedge fund, credit, insurance, and venture businesses. In this Invest Like the Best conversation with Patrick O’Shaughnessy, Loeb walks through how he reinvented his strategy from deep value and event-driven trades into quality and thematic investing, why he now believes every serious investor has to be a technology investor, how he reads the AI cycle and the semiconductor melt-up, where activism and corporate governance still pay, and the single mistake that taught him the most. It is a rare, unhurried look at how a famously sharp-elbowed activist actually thinks about markets, businesses, and people.

TLDW

Loeb covers an enormous amount of ground: his daily process for staying ahead of the information firehose, Jensen Huang’s AI stack as a mental model, and why Nvidia, Anthropic, and Elon Musk’s companies are the three most consequential firms he tracks. He traces Third Point’s roots in credit and event-driven investing at Jefferies, the influence of Joel Greenblatt’s “You Can Be a Stock Market Genius,” and his later pivot to quality investing shaped by “The Outsiders” and Lawrence Cunningham’s “Quality Investing.” He argues the AI rally is not a dot-com-style valuation bubble because the leaders generate enormous cash, explains why human judgment and structural market quirks still create alpha, and makes the case that AI will never fully run a capital system. He digs into corporate governance and his father’s influence, the Sotheby’s and Sony activism campaigns, the hard reality of activism in Japan, and what investing in Danaher taught him about its operating system. He names FTX as his hardest lesson, breaks down Third Point’s evolution into a 60-percent-credit platform spanning CLOs, structured credit, reinsurance and annuities, describes how he is pushing his analysts to use AI and Claude daily, and closes on kindness and the friend who let him sleep on a couch before he made it.

Thoughts

The most striking thing about Loeb is that he treats his own strategy as a thing to be disrupted rather than defended. He built his reputation on Greenblatt-style special situations, spin-offs, demutualizations, and post-reorg equities bought cheap because of forced selling and sandbagged guidance. Most investors who win that way spend the rest of their careers protecting the formula. Loeb instead watched the people who stayed rigid about deep value and low multiples underperform or disappear, and deliberately retrained himself and his team around business quality and thematic conviction. The willingness to abandon a winning identity is the actual edge here, more than any single trade. It is the rare investor who can say his current strategy would not fit cleanly on a PowerPoint deck and treat that as a feature.

His AI framing deserves attention because it is unfashionably calm. The bear case on AI is usually about valuation, and Loeb dismantles it on the leaders’ own numbers: these are companies investing off their balance sheets, generating enormous cash, trading at multiples that do not resemble 1999. He was short the dot-com bubble, so he is not a permabull cheering from the sidelines. His real point is subtler, that the danger is expectations, not valuations. The semiconductor index ran up 40 percent on genuinely strong fundamentals, but Micron and Nvidia both put up monster quarters and saw their stocks fall because expectations had simply outrun even great results. That gap between fundamentals and price is where he thinks the human investor still earns a living, precisely because quant strategies, CTAs, and risk-managed pods are forced to sell into weakness rather than buy it.

The governance material is the most quietly radical part of the conversation. Loeb defends shareholder primacy against the Business Roundtable’s softer stakeholder language, but his argument is not the cartoon version where shareholder value means strip-mining a company. It is that boards have one job, accountability for capital allocation and management, and that vague multi-stakeholder mandates become an excuse for directors to avoid the hard work. His read on bad governance is almost always relational: directors who let loyalty to an underperforming CEO override their duty, or who sit on boards for status and income. The Sotheby’s story is the clean illustration, a centuries-old, high-status business run unprofitably because nobody treated it like a business. Loeb’s pattern is to find the gap between claimed status and actual performance and to raise the social cost of coasting.

What is genuinely new in Loeb’s posture is how he talks about AI inside his own firm. He is not pitching it as a moat or a headcount-reduction story. He frames Claude and AI tools as a way to make each person a more autonomous self-improver, something that gives back whatever you put into it, with some analysts running agents overnight and burning tokens while he personally uses it more for queries. Coming from a 30-year fundamental investor, the absence of defensiveness is the signal. He pairs it with Brad Gerstner’s nod to “Essentialism”: the firehose is now infinite, so the scarce skill is deciding what is actually relevant. That is a more honest answer to the AI question than either doom or hype.

Finally, the FTX confession is worth sitting with because of how he frames it. He does not retreat into cynicism about venture or crypto. He notes that Sam Bankman-Fried, fraud aside, had a real nose for value, with stakes in Anthropic, Cursor, and Solana that would have made him a top venture investor of the era. The lesson Loeb extracts is procedural, not philosophical: their due diligence now includes checking bank balances, the most basic verification that would have surfaced the problem. It is a useful reminder that even sophisticated capital can skip boring fundamentals when a company is growing fast and the cap table looks good. The discipline is not in having a grand theory of fraud, it is in never skipping the unglamorous checks.

Key Takeaways
- Loeb’s macro focus right now collapses to two variables: where oil goes, dictated by war and geopolitics, and what AI does on the spending and infrastructure front and its impact on society and the economy.
- He argues you can no longer punt on technology and focus on industrials or consumer; tech is a big, growing, compounding part of the economy that affects everything else, so every investor has to become a tech investor.
- He uses Jensen Huang’s AI stack as a mental model: power and energy at the bottom, then chips and infrastructure, up through large language models, software, and applications.
- The three most consequential companies he tracks are Nvidia, Anthropic, and Elon Musk’s companies collectively.
- Third Point’s roots are in credit and event-driven investing, shaped by his time at Jefferies watching investors like David Tepper before he founded Appaloosa, Eric Mindich at Goldman, and firms like Angelo Gordon and Farallon.
- Joel Greenblatt’s “You Can Be a Stock Market Genius” was his foundational framework: spin-offs, demutualizations, privatizations, and post-reorg equities where a new, illiquid security gets dumped by holders who will not do the work.
- Spin-off managers often sandbag guidance because their incentive packages get set at the time of the spin-off, creating a predictable gap between conservative numbers and real value.
- From 1995 to roughly 2013-2015, event-driven special situations were Third Point’s bread and butter; those opportunities still exist, but the real edge now is overlaying them with a business-quality lens.
- The pivot to quality and thematic investing was influenced most by “The Outsiders” (capital allocation plus great operations) and Lawrence Cunningham’s “Quality Investing” (high-moat, high-return-on-capital businesses to own for years).
- AI disruption made last year one of the worst for many apparently high-quality companies, as businesses that looked durable rapidly became less so.
- Loeb sees the AI rally as fundamentally different from the dot-com bubble: the leaders invest off their balance sheets, generate enormous cash, and do not carry the valuation excess of 1999.
- The danger in semis is expectations, not valuation: Nvidia and Micron posted spectacular quarters yet saw stocks fall because expectations had outrun even great numbers.
- Structural forces still create alpha for fundamental investors: quants, CTAs, and multi-strategy pods have risk metrics that force selling on the way down, the opposite of what is rational for long-term holders.
- He believes AI will not fully run a capital system; private equity, restructurings, creditor committees, and high-touch negotiation will always need humans.
- His interest in governance came from his father, a securities lawyer and corporate governance expert who sat on the boards of Mattel and Williams-Sonoma and pushed ethical sourcing ahead of his time.
- Loeb defends shareholder primacy, citing Milton Friedman and Warren Buffett, and criticizes the Business Roundtable’s move away from shareholder value as a distraction from the board’s real duty.
- Bad governance usually comes from directors letting loyalty to a weak CEO override fiduciary duty, lacking the knowledge to do the job, or serving for status and income.
- Writing is a core activism lever: great writing is clear thinking, and social pressure through writing and PR is one of the most effective ways to move a board, alongside financial and legal levers.
- The Sotheby’s campaign targeted a high-status, centuries-old business run unprofitably; Third Point bought 9.9 percent and eventually brought in Tad Smith from MSG, who cleaned up operations and technology before the company was sold.
- Third Point increasingly prefers to back great companies with excellent management and cheer them on rather than hunt for mismanaged businesses, because bad management tends to cluster into a morass.
- Third Point is a collection of businesses; the flagship hedge fund grew from $3 million to about $9 billion and is roughly 30 percent credit, with the broader firm closer to 60 percent credit.
- The firm spans a roughly $7 billion CLO business, structured and corporate credit, an insurance company, asbestos liabilities, a small private credit unit, and a venture capital arm.
- The unifying thread is valuing enterprises across early, mid, and mature stages and investing in whichever fulcrum security offers the best risk-reward, from equity to senior debt.
- Loeb cites buying Twitter’s financing debt near 96-97 cents at a 12 percent yield when most credit investors were scared, and a difficult xAI debt financing, as examples of cross-discipline conviction.
- He is the portfolio manager only of the hedge fund; the credit, CLO, structured credit, and high-yield businesses have their own PMs and investment committees he does not sit on.
- The Sony campaign saw Third Point own up to 7 percent and push to separate the conglomerate; management resisted for years before spinning out the semiconductor and financial services businesses.
- He learned that activism in Japan is hard, but the government often wants reform; he co-wrote a paper with Larry Lindsey and Niall Ferguson urging corporate governance and return on invested capital as a fourth arrow of Abenomics, picked up as a Wall Street Journal editorial.
- Investing in Danaher was his most instructive experience, teaching him how the Danaher Business System drives continuous improvement (Kaizen) and how the company treats finding an underperformer as something to celebrate rather than shame, because the problems are fixable.
- FTX was his hardest lesson; it looked great and was verifiable on the blockchain, but was not what it appeared, and now Third Point’s diligence includes checking bank balances.
- He notes that, fraud aside, Sam Bankman-Fried had a strong nose for value with stakes in Anthropic, Cursor, and Solana.
- Recent mistakes also include shorts where Third Point thought certain info-services businesses would resist AI disruption; he still expects a shakeout with some phoenixes rising from the ashes.
- He is pushing his whole team to use AI daily, hiring native computer scientists and system integrators, and describes Claude as a tool that makes you autonomous and gives back whatever you put into it.
- Third Point’s distinctive edge is optimism about AI creating net jobs and the ability to default into credit investing during stressed times, as it did with investment-grade credit in 2020.
- Credit is hard to copy because it runs on relationships, not electronic trading; that is why Third Point built into CLOs and eyes the roughly $6 trillion structured credit market rather than treating it as tourism.
- The great analyst has changed: 20 years ago it was someone who could model fast and crack a complex restructuring (Loeb made a career-defining bet on Drexel Burnham claims); today it is a Gavin Baker type who deeply understands an industry, like the analyst who flew to Texas and realized Casey’s General Stores was really a pizza chain.
- Outside the US, Loeb is more bullish on Korea, Taiwan, and Japan as hunting grounds, finds Europe tough on regulation (though he owns Rolls-Royce and ASML), and finds the Middle East the most vibrant region.
- What worries him most is not the business but running out of time for family, surfing, and reading; what excites him is incorporating everything relevant about the world and forming relationships with people building interesting things.
- His closing reflection is on kindness as a top-tier value, and the friend, Carter, who let him sleep on a couch and seeded his early fund, echoing a Palmer Luckey line that money cannot buy friends who believed in you when you had nothing.
Detailed Summary

Staying ahead of the firehose and reading the macro

Loeb opens by admitting he does not have a perfectly organized system for processing the modern flood of information. He checks the news for what is relevant to the economy and to Third Point’s positions, tries not to obsess over minute-to-minute moves, and leans more tactical than strategic. When people ask him about macro, he says the usual government-reported metrics (growth, unemployment, inflation, rates, currencies, gold, crypto) are trumped right now by two things: where oil goes, which depends on war and geopolitics, and what AI does on the spending and infrastructure side and its impact on society and the economy. To understand technology, he leans on Jensen Huang’s framing of the AI stack and talks to smart people regularly, and he watches three companies above all: Nvidia, Anthropic, and Elon Musk’s companies as a group.

From event-driven roots to quality investing

Third Point’s DNA comes from Loeb’s time as a credit investor at Jefferies, where he watched some of the best distressed, event-driven, and risk-arbitrage investors operate, from David Tepper to Eric Mindich to firms like Angelo Gordon and Farallon. His first lens was event-driven: spin-offs, demutualizations, privatizations, and post-reorg equities, where a newly created and illiquid security gets dumped by holders who will not do the work, and management sandbags guidance because incentive packages are set at the spin date. He barely thought about moats or returns on capital; he just wanted to buy something genuinely cheap with those characteristics. That was the firm’s bread and butter from 1995 until roughly 2013-2015. Those opportunities still exist, but Loeb describes deliberately evolving toward business quality and thematic investing, influenced by “The Outsiders” on capital allocation and Lawrence Cunningham’s “Quality Investing” on durable, high-return businesses. He organized the team around industry experts rather than generalists. The twist: AI disruption recently turned many apparently high-quality companies into much lower-quality ones, fast.

The AI cycle, bubbles, and the human edge

Loeb resists the bubble narrative. He was short the dot-com bubble and remembers the valuation excess; today’s AI leaders, by contrast, invest off their balance sheets and generate enormous cash, so unless you believe the capex yields no return, the earnings and multiples do not look like 1999. The real driver of volatility, he argues, is expectations: the semiconductor index ran up 40 percent on strong fundamentals, but Nvidia and Micron both delivered blowout quarters and still saw their stocks fall because expectations had run too high. That dynamic is exactly where a fundamental investor earns a living, because quants, CTAs, and risk-managed pods are structurally forced to sell into weakness. He also doubts AI will ever fully run a capital system, since private equity, restructurings, creditor committees, and high-touch credit always need humans. He cites “Reminiscences of a Stock Operator” and Ecclesiastes: there is nothing new under the sun, and human nature, with its bubbles, panics, and extremes, does not change.

Governance, his father, and the duty of boards

Loeb traces his governance interest to his father, a securities lawyer and corporate-governance expert who served on the boards of Mattel and Williams-Sonoma and championed ethical sourcing before it was common. He calls the American board system beautiful: directors are answerable to shareholders and accountable for strategy and key financial decisions. Governance breaks down when directors lose sight of their fiduciary duty, lack the knowledge or talent diversity to do the job, or prioritize things other than shareholders. He invokes Milton Friedman and Warren Buffett to argue that caring about communities, employees, and conduct is not inconsistent with shareholder value but part of it, and criticizes the Business Roundtable for muddying the board’s core duty. The most common failure he sees is directors letting loyalty to an underperforming CEO override their duty. Most of the time Third Point redirects existing boards without even taking a seat; the extreme proxy fights are the exception.

Activism, writing, Sotheby’s, and Sony

Great writing, Loeb says, is clear thinking and organizing your thoughts to get a desired outcome, and it is one of activism’s most effective levers alongside financial and legal pressure. Social pressure through writing and PR can move a board on its own. He sees a pattern in his campaigns: targets that hold themselves out as high status but are not living up to it. Sotheby’s is the clean example, a centuries-old, high-status business run unprofitably, where Third Point bought 9.9 percent, gave the existing CEO a year, then helped install Tad Smith from MSG, who modernized operations and technology before the company was sold. Sony was a two-act campaign in which Third Point owned up to 7 percent and pushed to break up the conglomerate; he recounts sharing the thesis with Andrew Ross Sorkin at the New York Times under embargo, the panic it caused, and how management resisted for years before spinning out the semiconductor and financial services units. The lesson: activism in Japan is genuinely hard, even though the government wanted reform. He co-authored a paper with Larry Lindsey and Niall Ferguson arguing corporate governance and return on invested capital should be a fourth arrow of Abenomics, which ran as a Wall Street Journal editorial.

The Danaher operating system

Loeb calls Danaher his most instructive investment. He and his partner persuaded the company to compress its five-day Danaher Business System training into a single day, and he came away with a deep appreciation for how a real operating system drives continuous improvement. The standout lesson was cultural: Danaher holds people individually accountable, but when it finds someone underperforming it celebrates rather than shames, because the problems are addressable and fixable, and it does this relentlessly across operations and working capital. He also points to the diaspora of Danaher executives, including Larry Culp and the leadership at Ingersoll Rand, as evidence of the system’s depth. The investment worked for about four years before COVID-era order surges and inventory swings turned tailwinds into headwinds; Third Point sold and has recently bought back in modestly.

The structure of Third Point and the fulcrum security

Third Point is not one fund but a collection of businesses. The flagship hedge fund grew from $3 million to about $9 billion and is roughly 30 percent credit, generically around 110 percent long and 30-40 percent short on the equity side. Across the firm the credit weight is closer to 60 percent, spanning a roughly $7 billion CLO business, several billion in structured and corporate credit, an insurance company, a couple billion in asbestos liabilities, a small new private credit unit, and a venture arm. The unifying thread is valuing enterprises at any stage and investing in whichever fulcrum security (the one with the best risk-reward) makes sense. Loeb illustrates with Credit Suisse’s takeover by UBS, where the holdco paper proved the fulcrum, and with buying Twitter’s resold financing debt near 96-97 cents at a 12 percent yield when other credit investors were scared, plus a difficult xAI debt financing that few credit people wanted. He pushes back on the idea that he sits atop everything: he is the PM only of the hedge fund, while the other businesses have their own PMs and committees he is not on.
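
The long and short figures quoted above imply a gross and a net exposure, which is the usual way such a book is summarized. A minimal sketch of that arithmetic, assuming the standard definitions (gross = long + short, net = long - short); the function name `exposure` is illustrative and not something from the interview:

```python
def exposure(long_pct: float, short_pct: float) -> tuple[float, float]:
    """Return (gross, net) exposure as percentages of fund capital."""
    gross = long_pct + short_pct  # total capital at work on both sides of the book
    net = long_pct - short_pct    # directional exposure left after netting shorts
    return gross, net


# Figures from the paragraph above: about 110% long and 30-40% short.
for short_pct in (30.0, 40.0):
    gross, net = exposure(110.0, short_pct)
    print(f"short {short_pct:.0f}%: gross {gross:.0f}%, net {net:.0f}%")
```

On those numbers the equity book would run at roughly 140 to 150 percent gross and 70 to 80 percent net long.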

Insurance, the FTX lesson, and recent mistakes

Loeb started a Bermuda reinsurance company in 2010, backed by himself, Kelso, and Pine Brook, on a barbell thesis of investing the float in Third Point and treasuries to defer taxes and lever capital. The reinsurance side soured, and about three years ago he concluded they had the right idea but the wrong vehicle: plain-vanilla annuities (which can only invest in credit) would have fit better. Third Point merged the reinsurer into its UK closed-end fund, Third Point Offshore Investors, reincorporated from Guernsey to Cayman, and repurposed it into an insurance company managing private credit, structured credit, whole-loan mortgages, real estate lending, and investment-grade debt. His hardest lesson was FTX: it looked great, was verifiable on the blockchain, and had a strong cap table, but was not what it seemed; diligence now includes checking bank balances. He notes Sam Bankman-Fried, fraud aside, had a great nose for value (Anthropic, Cursor, Solana). Other recent mistakes were shorts where Third Point bet certain info-services businesses would resist AI disruption; he still expects a shakeout with some survivors rising from the ashes.

AI inside the firm, the analyst of the future, and kindness

Loeb is pushing his entire team to use AI daily, hiring native computer scientists and system integrators, and describes Claude as a tool that makes you an autonomous self-improver and gives back whatever you put into it, with some analysts running agents overnight while he uses it more for queries. He pairs this with Brad Gerstner’s recommendation of “Essentialism”: you cannot do it all, so you must decide what is most relevant. The great analyst has changed: 20 years ago it was someone who could model fast and crack a complex restructuring, as Loeb did with the Drexel Burnham bankruptcy claims early in his career; today it is a Gavin Baker type who deeply understands an industry and its technology, like the analyst who flew to Texas and realized Casey’s General Stores was really a pizza chain in disguise. On the rest of the world, he is more bullish on Korea, Taiwan, and Japan, finds Europe tough on regulation (while owning Rolls-Royce and ASML), and finds the Middle East the most vibrant region. He closes on what worries and excites him (time with family, surfing, and reading versus the joy of incorporating everything relevant about the world), and on kindness, crediting his friend Carter, who let him sleep on a couch and seeded his early fund, and echoing Palmer Luckey’s line that money cannot buy friends who believed in you when you had nothing.

Notable Quotes

“I think you have to be a tech person today. It’s a big and growing and compounding part of the economy. It affects everything else.”
Dan Loeb, on why no serious investor can punt on technology anymore

“Hold on to your seats because things are only going to accelerate from here.”
Dan Loeb, recounting a 2013 Davos warning about technological change he now applies to AI

“Maybe that’s where the human element comes in, to understand and to be able to make those tough trading decisions when fundamentals are going one way and stock prices are going the other way, and to be able to take the pain of losses in the short run.”
Dan Loeb, on where a human investor still has an edge over machines

“It’s very different from the dot-com bubble, which we were short going into. You don’t have the valuation bubble now on those companies that you had back in those days.”
Dan Loeb, on why he does not see the AI rally as a 1999-style bubble

“When they found someone that was underperforming, it was celebrated instead of shamed, because look at all these things you’re doing wrong, we can fix those. And they did.”
Dan Loeb, on the accountability culture he learned from the Danaher Business System

“I would have to say our investment in FTX. It looked great. The company was growing fast. We could verify it all on the blockchain.”
Dan Loeb, naming his hardest investment lesson

“Be kind to people; you have no idea how it will ever benefit you. And sometimes it will and sometimes it won’t.”
Dan Loeb, on elevating kindness in your hierarchy of values

“The one thing money doesn’t buy you is friends that believed in you when you had nothing.”
Dan Loeb, quoting Gavin Baker quoting Palmer Luckey, on the friend who seeded his early fund

Watch the full conversation between Dan Loeb and Patrick O’Shaughnessy here.

Related Reading
- Third Point LLC — the official site of Dan Loeb’s investment firm, the primary source for the strategies discussed.
- You Can Be a Stock Market Genius (Wikipedia) — Joel Greenblatt’s classic on special situations that became Loeb’s foundational framework.
- The Outsiders by William Thorndike (Wikipedia) — the capital-allocation study that helped shape Loeb’s pivot to quality investing.
- Essentialism by Greg McKeown (Wikipedia) — the book Loeb cites for deciding what is actually relevant in an age of infinite information.
- Invest Like the Best by Patrick O’Shaughnessy — the podcast where this conversation took place, with a full archive of investor interviews.
May 28, 2026
Raoul Pal: Why the Crypto Bull Run Is Just Starting, the AI Economic Singularity, and Why You Should Never Sell Bitcoin
Macro investor and Real Vision co-founder Raoul Pal returned to the When Shift Happens podcast for episode 173 to argue that the recent crypto drawdown is a nasty correction inside a much larger bull market, not the end of the cycle. Across an hour and a half he ties together the AI capital race, the coming economic singularity, why layer one blockchains are a kind of universal basic equity, and the deceptively simple discipline that actually compounds wealth: buy, hold, and almost never sell.

TLDW

Pal frames everything through what he calls the universal code, the conversion of units of energy into units of intelligence, and says the global race to fund AI is so large that no government or company can stop feeding it capital. That liquidity, plus relentless currency debasement, is the engine under both the AI stocks going vertical and the crypto market that has lagged them. He calls the Bitcoin slide from 126K toward 60K a normal correction in a bull market, says liquidity is now reaccelerating, and argues smart contract layer ones (Ethereum, Solana, Sui) are the best risk-adjusted bet because the entire financial system and a coming swarm of AI agents will run on those rails, giving crypto an effectively infinite total addressable market. He explains why he added Zcash as a Bitcoin-with-privacy and quantum-proof trade, lays out his plan to launch an NFT fund built around grail digital art and NFT-backed lending, and makes a data-backed case that buying oversold dips and never selling beats trying to trade cycles. The conversation closes on a 70/30 bullish framework for 2026 and 2027 and a reflection on kindness.

Thoughts

The strongest idea in this conversation is not a price target, it is a reframe. Pal keeps pulling the camera back from “what will Bitcoin do this quarter” to “what is the organizing principle of the entire economy right now,” and his answer is the funneling of all available capital into anything that produces intelligence. Once you accept that frame, the buy-the-dip behavior in both AI equities and crypto stops looking like mania and starts looking like a rational response to a one-way game. The part worth sitting with is his game-theory claim that neither the US nor China can stop, and that even a spectacular failure like an OpenAI blowup would simply trigger an instant asset auction rather than a collapse, because no single player can be allowed to win outright. Whether or not that is fully true, it is a genuinely different mental model than the recession-and-bust cycle most investors carry around.

His layer-one thesis is the most actionable takeaway and also the most quietly radical. The pitch is that for the first time ordinary people can own a piece of the core infrastructure that the machine economy will be built on, the way you never got to own a slice of TCP/IP or the open web. He calls this universal basic equity and treats it as humanity’s pension plan. The honest tension he admits is that the racy returns may not be in the boring base layer at all, and that the truly investable winners of this era, the private stablecoin companies, are largely closed off to retail. So the layer-one trade is partly a consolation prize for the fact that the best businesses are unreachable. That is a more candid admission than most crypto bulls will make.

The behavioral core of the episode is the most useful for a normal reader, and it is almost embarrassingly simple. Pal has been in markets for 35 years and says he does not know a single person who reliably buys bottoms and sells tops, including the legends, who he points out made most of their money on management fees rather than heroic trades. His prescription is to add only when the asset is one to two standard deviations oversold on its long-term log trend, otherwise do nothing, and to treat patience as an action rather than inaction. The line that does the most work is “the market owes you nothing.” It quietly dismantles the entitlement that drives people to overtrade, chase, and burn emotional energy on a strategy that the data says underperforms simply holding.
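
The "standard deviations oversold on the long-term log trend" rule can be sketched in a few lines. This is only an illustration under stated assumptions, not Pal's actual method: it fits a straight line to log(price) over the whole history supplied, measures the latest residual in units of the residuals' population standard deviation, and uses a threshold of -1.0 as a stand-in for "one standard deviation oversold". The names `log_trend_zscores` and `should_add` are hypothetical.

```python
import math
from statistics import pstdev


def log_trend_zscores(prices):
    """Fit a line to log(price) vs. time; return each residual in std-dev units."""
    n = len(prices)
    xs = list(range(n))
    ys = [math.log(p) for p in prices]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Ordinary least-squares slope and intercept in log space.
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sigma = pstdev(residuals)
    if sigma < 1e-12:  # prices sit exactly on the trend, so nothing is oversold
        return [0.0] * n
    return [r / sigma for r in residuals]


def should_add(prices, threshold=-1.0):
    """True when the latest price is at or below `threshold` std devs from trend."""
    return log_trend_zscores(prices)[-1] <= threshold


# Synthetic series (assumed data): 1% growth per step with a small repeating wobble.
wobble = [1.0, 1.05, 1.0, 0.95]
history = [100 * 1.01 ** t * wobble[t % 4] for t in range(101)]
print(should_add(history))                             # latest price near trend
print(should_add(history[:-1] + [history[-1] * 0.5]))  # latest price halved
```

Everything else in the prescription is the absence of code: when `should_add` is false, the rule is to do nothing.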

Where a reader should keep some skepticism is the certainty. Pal assigns the bull case a 70 percent probability and the bear case 30, but the bear case he sketches (Middle East war reignites, inflation forces tightening, liquidity gets starved, the intelligence buildout slows) is not a minor footnote, it is the whole structure failing at once. The thesis also leans hard on the assumption that AI agents will become massive on-chain economic actors, which is plausible but still mostly forward-looking rather than observed. The value here is the framework, not the forecast. If you take one thing, take the energy-into-intelligence lens and the standard-deviation discipline, and hold the specific tickers and timelines loosely.

Key Takeaways
- Pal’s central frame is the universal code: the universe, and now the economy, continuously converts units of energy into units of intelligence, and capital flows to whatever produces the most intelligence.
- The AI buildout is a race of nations and corporations that nobody can exit. Game theory means neither the US nor China can stop, because the other side would gain a decisive advantage.
- Even a catastrophic AI failure would not break the trend. If OpenAI ran out of money, its assets would be auctioned instantly to multiple buyers so no single company could double its compute and win the whole game.
- The economic singularity is the point where institutions and the way we measure the economy can no longer keep up with the speed of technology, made worse when AI and robots are added to the population as economic actors.
- AI is the first real-world example of Reed’s law, the exponential of the exponential, whereas most past technology followed the slower Metcalfe’s law log channel.
- By around 2028, roughly five to six years after AI went mainstream, AI will have produced more words than all of humanity has produced in sum total since the Gutenberg press.
- The current run is funded by cash flow, not debt. Unlike the late-1990s tech boom, the buildout is paid for out of the earnings of the most cash-generative firms in history.
- Chips and energy are the binding constraints. Companies report being booked out three years and beyond, and xAI is reportedly handing older data centers to Anthropic because no one can get enough compute.
- Pal expects the Fed to run a Greenspan-style playbook, cut rates and then get out of the way, letting a productivity miracle grow the economy faster than the debt pile so debt to GDP falls.
- Bitcoin falling from 126K toward 60K is a nasty correction in a bull market, not a bear market. Pal has seen many 50 percent Bitcoin drawdowns since 2013, and altcoins always fall further on the risk curve.
- The 2025 to 2026 correction has been choppy and slow rather than the fast V-shape of 2021, which is part of why sentiment feels so bad.
- Crypto lagged because liquidity is finite. The government shutdown withdrew liquidity, which hits crypto with about a three-month lag, while AI capex and Chinese gold buying sucked capital away.
- Liquidity is now reaccelerating in the US, China, and globally, which Pal sees as the reason the worst is likely over for crypto.
- The birth of economic agents in late 2024 gives crypto an effectively infinite total addressable market, since agents will be economic actors that hold treasuries, make payments, and transact on-chain.
- Smart contract layer ones are Pal’s preferred bet. He compares the structure to operating systems and cloud, where value concentrates into three to five major players plus a few specialists.
- He calls owning layer ones universal basic equity and humanity’s pension plan, the chance to own the rails the agentic economy will run on, something the internet never offered retail.
- Discounted cash flow analysis is the wrong tool for valuing a blockchain. The whole purpose of the network is to be the cheapest, fastest, and most programmable, so high fees are a bug, not a strength.
- Pal measures layer ones by intelligence density: number of developers, programmability, speed to finality, applications per user, and the ratio of stablecoins to total value locked as stored energy.
- Only three tokens maintained economic density when the market fell 80 percent: Ethereum, Solana, and Sui. ETH is the safe Microsoft-like choice, Solana is faster and cheaper, Sui is earlier but extremely fast and programmable.
- Pal added Zcash in the correction as a Bitcoin-with-privacy trade. The left-curve case is simple privacy value, the right-curve case is that it is also quantum-proof and a hedge against AI-enabled state surveillance.
- He admits he did not execute the Zcash buy well, kept meaning to add more while traveling, and watched it run up 50 percent. He treats it as a small position, not a portfolio overhaul.
- On Hyperliquid he is complimentary but uninvested, because he does not trade, use perps, or use leverage, and he expects Robinhood and Coinbase to compete hard for that niche.
- DeFi is better suited to machines than humans. Agents may not even need front ends or websites, just low-friction access to swap across multiple stablecoins and currencies instantly.
- DeFi is not dead despite mega-hacks. Pal argues hacks force better products, and notes that banks quietly absorb theft losses too, so the answer is to build more secure systems.
- The entire financial system is moving to blockchain rails because they are the most efficient way to operate, a prediction Pal first made in 2014 before smart contracts existed.
- Pal is launching an NFT fund focused on grail assets (one-of-one alien CryptoPunks, top artists) trading from roughly 600K to tens of millions, plus a convex middle tier of artists with social consensus.
- He names artists such as Die With The Most Likes (whom he compares to a Hunter S. Thompson of art) and Kim Asendorf, whose work uses tokens at the pixel level.
- The fund will also lend against NFTs for yields around 15 percent or more, acquiring assets cheaply if borrowers default and recycling yield into emerging artists.
- His real estate analogy: a smaller NFT in a great collection is like a modest apartment in a billionaire neighborhood, while grails are the 20 million dollar penthouses that actually compound.
- Bitcoin is partly an AI proxy because global savings should rise as AI lifts economic growth, and Bitcoin targets a share of those savings as a digital store of value.
- The core mindset shift: if you know where the world is going and roughly where market cap is heading on the log trend, you would never sell, you would only ever accumulate.
- Selling well is nearly impossible. Even if you take profit at two standard deviations overbought, adding it back at the bottom is something almost no one actually manages.
- The people who made the most money in crypto are the ones who did not trade it. Pal cites holders who profited by doing essentially nothing while active traders lost their edge.
- Pal’s discipline requires roughly two to three actions every five years: add when one to two standard deviations oversold, optionally trim when two standard deviations overbought, otherwise nothing.
- By his standard deviation measure, Bitcoin and crypto are as cheap as they have been in their long-term uptrend versus the NASDAQ, which he reads as a signal to allocate more to crypto.
- Fear and greed sat below 10 for the longest stretch in the index’s history during this correction, hitting its lowest reading ever, a classic oversold extreme.
- His 2026 to 2027 bull case stacks stablecoin explosion, the Clarity Act getting signed, rising global liquidity, debt rollovers forcing money printing, a strong business cycle, AI agents, and a cheap entry point. He puts it at roughly 70/30 to the upside.

Detailed Summary

Two economies and the money illusion

The conversation opens loosely with travel, stablecoin spending, and a riff on why people agonize over a 75 dollar airport breakfast but happily lose money on an NFT that drops 80 percent. Pal’s explanation is that we live in two economies at once. The crypto and tech economy can grow 50 to 150 percent in a good year, while the real economy grows around 2 percent. Money earned in the fast economy does not feel real, which is why people spend and speculate so freely with it. This sets up the rest of the episode, where Pal treats the fast economy as the place serious capital is being forced to go.

The AI capital race nobody can stop

Asked why the stock market only seems to go up, Pal gives two reasons: liquidity expansion and the most extraordinary capital event in human history, the funneling of all capital into intelligence. He frames it as a race of nations, corporations, and individuals that cannot be slowed because of game theory. No superpower can let another reach AGI alone, only the US and China can afford the race, and neither can stop without ceding the advantage. He even games out an OpenAI bankruptcy and concludes the US would instantly auction the assets across many buyers rather than let one firm double its compute and win, which is why he calls the whole thing too big to fail. The practical conclusion is blunt: buy the dip, because the structure forces capital to keep flowing.

The economic singularity, Reed’s law, and electricity through sand

Pal defines the economic singularity as the moment when institutions and our economic measurements can no longer cope with the speed of technology, especially once AI and robots count as population. He explains that almost all past technology adoption followed Metcalfe’s law, a log channel visible in the charts of Google, Facebook, and the NASDAQ, but AI is the first observed example of Reed’s law, the exponential of the exponential. To make it concrete he cites ARK research showing AI will, by roughly 2028, have produced more words per year than all of humanity, and notes Anthropic expected 10x growth and got 80x in a quarter. He marvels that we are putting electricity through silicon, the second most common element on Earth, and producing intelligence six orders of magnitude faster than a human neuron.
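
As a rough illustration of how far apart the two laws sit, the sketch below compares their textbook forms: Metcalfe's law counts the possible pairwise connections in a network, n(n-1)/2, while Reed's law counts the possible subgroups of two or more members, 2^n - n - 1. The function names are illustrative, and nothing here models AI adoption itself; it only shows how quickly the two curves diverge as a network grows.

```python
def metcalfe_value(n: int) -> int:
    """Possible pairwise connections among n members: n(n-1)/2."""
    return n * (n - 1) // 2


def reed_value(n: int) -> int:
    """Possible subgroups of two or more members: 2^n - n - 1."""
    return 2**n - n - 1


if __name__ == "__main__":
    # Print both measures side by side for a few network sizes.
    for n in (5, 10, 20, 40):
        print(f"n={n:>2}  Metcalfe={metcalfe_value(n):>4}  Reed={reed_value(n):,}")
```

Even at 40 members the subgroup count is in the trillions while the pairwise count is still in the hundreds, which is the intuition behind treating the two as different growth regimes.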

Why crypto lagged and why the worst is over

Pal explains the crypto underperformance mechanically. There is only so much liquidity, the government shutdown withdrew it, and that hits crypto with roughly a three-month lag, landing right in the middle of the October drawdown. At the same time, the AI buildout and Chinese gold buying pulled capital toward the longest-duration assets, leaving SaaS and crypto with nearly identical charts as they got left behind. His read for 2026 is that liquidity is now reaccelerating across the US, China, and the world, so there is nothing to worry about yet. The Bitcoin move from 126K toward 60K is, in his framing, a normal correction, comparable in length to the roughly six-month 2021 pullback that resolved into new highs.

Layer ones as universal basic equity

The heart of the investment thesis is that smart contract layer ones will accrue a growing share of crypto value as the investable infrastructure layer. Pal argues the entire financial system plus a coming swarm of AI agents will use these rails, giving crypto an infinite total addressable market. Like operating systems and cloud, value will concentrate into three to five chains plus specialists. He measures them by intelligence density rather than discounted cash flow, since the point of the network is to be cheapest and fastest. By his analysis only Ethereum, Solana, and Sui held economic density through an 80 percent drawdown. ETH wins on developers, security, and Lindy effects (the Microsoft you do not get fired for owning), Solana is faster and cheaper, and Sui is earlier but offers a different order of magnitude on speed, finality, and programmability. He frames owning a basket of four or five as humanity’s pension plan.

Zcash, privacy, and the quantum hedge

Pal reveals he added Zcash during the correction, alongside buying more Sui. He had said in December he would wait for it to pull back, and he did, though he admits he did not buy enough as it ran up 50 percent. His left-curve case is that privacy has real value and people will understand it more, making it essentially Bitcoin with privacy that could plausibly reach 5 to 10 percent of Bitcoin’s value. His right-curve case is that it is also quantum-proof and a hedge against governments wielding AI-enabled control over people. He dismisses the mid-curve worry that it will be banned, noting that the ban fear has shadowed crypto his entire career and never materialized.

Agents, DeFi, and financial rails

Pal argues the biggest future users of DeFi and crypto payments will be AI agents, whose scale is effectively infinite. Setting up agents himself, he keeps hitting walls that require small payments, and sees agents making endless micro-payments plus larger transactions, holding treasuries across multiple stablecoins and currencies, and rebalancing through DeFi instantly without any human involved. DeFi, he says, is actually better suited to machines than people, and may not even need front ends. On the wave of mega-hacks he is unbothered, arguing they force better products, that banks quietly absorb theft too, and that the financial system always migrates to the most efficient rails because that is how you make more money. He first predicted blockchain would become the financial industry’s infrastructure rail back in 2014.

The NFT fund and grail digital art

Pal is launching an NFT fund because so many people told him they want exposure but do not know how. The fund targets grail assets, the scarce one-of-one pieces with proven social consensus that trade from around 600K into the tens of millions, plus a convex middle tier of artists who have long-term proven value and could be wildly re-rated. He names Die With The Most Likes, an Indiana artist cataloging the decline of middle America whom he likens to Hunter S. Thompson, and German artist Kim Asendorf, whose 3D works are built from individually tokenized pixels. The math of convexity is the draw: an artist re-rating from 20 to 200 ETH while ETH itself multiplies could compound into a 100x. The fund will also lend against NFTs for yields above 15 percent, acquiring assets cheaply on default and recycling yield into emerging artists, and will build a club connecting investors to artists. His real estate framing reassures smaller holders: owning a lesser piece in a top collection is like a modest flat in a billionaire neighborhood.
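
The convexity arithmetic can be made explicit with a short sketch. The 20 to 200 ETH re-rating comes from the episode; the 10x move in ETH itself is an assumption, chosen because it is the multiple that would produce the 100x figure, and `combined_multiple` is a hypothetical helper name.

```python
def combined_multiple(artist_rerating: float, eth_multiple: float) -> float:
    """Return in dollar terms when a piece re-rates in ETH and ETH re-rates in dollars."""
    return artist_rerating * eth_multiple


if __name__ == "__main__":
    artist_rerating = 200 / 20  # 20 ETH -> 200 ETH, as cited in the episode
    eth_multiple = 10.0         # assumption: ETH itself rises 10x in dollar terms
    total = combined_multiple(artist_rerating, eth_multiple)
    print(f"Combined multiple: {total:.0f}x")
```

The two multiples stack multiplicatively, so the same logic cuts the other way: a re-rating in ETH terms can be partly or fully offset if ETH falls in dollar terms.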

Never sell, and the math of patience

The behavioral spine of the episode is Pal’s argument that buying, holding, and accumulating beats trading cycles. He has built a Real Vision indicator that signals a buy when an asset is one to two standard deviations oversold on its log regression channel, and says it compounds at a stupid rate. The problem with selling is deciding how much and then having the discipline to buy it back at the bottom, which almost no one does. In 35 years he says he has never met anyone who reliably buys bottoms and sells tops, and notes the trading legends made most of their money on management fees. The people who made the most in crypto are the ones who did nothing. He reframes holding as patience, an active stance, and ties it back to the universal code: buying Bitcoin and doing nothing is the most energy-efficient trade you can make, while overtrading burns mental and emotional energy for a worse outcome. His advice to those tempted by AI’s vertical charts is to go play with AI and just hold your Bitcoin.
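
The episode does not spell out how the Real Vision indicator is built, so what follows is only a minimal sketch of the general idea under stated assumptions: fit a straight line to the logarithm of price over time, measure how many standard deviations the latest price sits from that line, and act only at the extremes. The function names, the ordinary least-squares fit, and the synthetic price series are all illustrative; the thresholds (add at one standard deviation oversold, optionally trim at two overbought) follow the summary above.

```python
import math
from statistics import mean, pstdev


def log_trend_zscore(prices: list[float]) -> float:
    """Z-score of the latest price's residual versus a linear fit to log(price)."""
    if len(prices) < 3:
        raise ValueError("need at least three prices")
    xs = list(range(len(prices)))
    ys = [math.log(p) for p in prices]
    x_bar, y_bar = mean(xs), mean(ys)
    # Ordinary least-squares slope and intercept for log(price) against time.
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sigma = pstdev(residuals)
    return 0.0 if sigma == 0 else residuals[-1] / sigma


def action(z: float, add_below: float = -1.0, trim_above: float = 2.0) -> str:
    """Map a z-score to the three choices described in the episode."""
    if z <= add_below:
        return "add"
    if z >= trim_above:
        return "trim (optional)"
    return "do nothing"


if __name__ == "__main__":
    # Synthetic series: steady log growth with a small wobble, then a 30 percent drop.
    prices = [100 * math.exp(0.01 * i + (0.05 if i % 2 else -0.05)) for i in range(200)]
    prices[-1] *= 0.7
    z = log_trend_zscore(prices)
    print(f"z-score: {z:.2f} -> {action(z)}")
```

Most readings fall between the thresholds, which is the point of the discipline: the default output is "do nothing".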

The 2026 to 2027 outlook

Pal closes the macro case by stacking the bull factors: a massive stablecoin expansion over the next 24 months, the Clarity Act getting signed and freeing builders, rising global liquidity, trillions in interest payments that force more money printing, a strong business cycle recycling earnings into speculative assets, the arrival of AI agents, and a cheap entry point with fear and greed at historic lows. He even floats a permanent resolution of Middle East conflict as part of the upside. The bear case is the mirror image: war reignites, inflation runs hotter, tightening starves capital, and the intelligence buildout slows. He puts the odds at roughly 70 percent bullish, 30 percent bearish, and says he does not see the bear case yet. The episode ends on a personal note about kindness, with Pal unable to name a single kindest act because, he says, everything is made of kindness.

Notable Quotes

“We’re going through the most extraordinary time in human history. Nothing else matters. This whole funneling of all capital into intelligence is the biggest race that’s ever happened.”
Raoul Pal, on why capital keeps flooding into AI

“The game is so big that nobody will stop.”
Raoul Pal, on the game theory of the US and China AI race

“This is how amazing it is. We’re putting electricity through sand and creating intelligence.”
Raoul Pal, on silicon and the universal code

“It’s a nasty correction in a bull market. I’ve been in crypto since 2013. I’ve seen many corrections, non-bear markets of 50% in Bitcoin.”
Raoul Pal, on Bitcoin falling from 126K toward 60K

“The market owes you nothing. You would just have to be better at doing a job.”
Raoul Pal, on the entitlement that ruins crypto investors

“This is humanity’s pension plan. We get to invest in the infrastructure rails of which all the agentic economy will run.”
Raoul Pal, on owning layer one blockchains

“The people who’ve made the most money out of crypto are the people who don’t trade it.”
Raoul Pal, on why holding beats trading

“Your job is to be a mercenary for your own capital. You want to make the most money over time.”
Raoul Pal, on why no one has to stay loyal to crypto

“Bitcoin and crypto is as cheap as it has been in its long-term uptrend versus NASDAQ.”
Raoul Pal, on the relative value signal he watches

This is a compressed look at a wide-ranging conversation. Watch the full episode on When Shift Happens for Pal’s complete reasoning, the charts he references, and the back-and-forth that the summary above leaves out.

Related Reading
- Real Vision: the financial media platform Raoul Pal co-founded, where his Global Macro Investor research and exponential age thesis live.
- Metcalfe’s law (Wikipedia): the network-value relationship Pal uses to model the log regression channel for crypto.
- Reed’s law (Wikipedia): background on the exponential-of-the-exponential growth Pal says AI is the first real-world example of.
- Technological singularity (Wikipedia): context for the economic singularity Pal argues is now only about four years away.
- Zcash: the privacy coin Pal added in the correction as a Bitcoin-with-privacy and quantum-proof trade.
May 28, 2026
Gavin Baker on Orbital Compute, TSMC, Frontier AI Models, Anthropic’s Vertical Take Off, and the Coming Wafer Shortage
Gavin Baker, founder and CIO of Atreides Management, returns to Patrick O’Shaughnessy’s Invest Like the Best for his sixth appearance. He calls the current AI moment the most extraordinary moment in the history of capitalism, walks through what Anthropic’s vertical takeoff in revenue actually means, lays out why orbital compute is closer than skeptics believe, dissects the TSMC bottleneck that may be the only thing standing between today’s market and a full-on AI bubble, and rates every hyperscaler on how they have positioned for a world where frontier model providers may stop selling API access altogether.

TLDW

Anthropic added eleven billion dollars of ARR in a single month, which is roughly the combined business of Palantir, Snowflake, and Databricks built over a decade. That is the setup. From there Gavin Baker covers the March and April selloff, the contrarian read that a closed Strait of Hormuz was actually bullish for American manufacturing competitiveness, why Anthropic and OpenAI multiples may be misleadingly cheap on an unconstrained run rate basis, why Elon Musk’s discipline on SpaceX valuation created a superpower of permanent access to capital, the practical engineering case for orbital compute as racks in space rather than Pentagon sized space stations, why TSMC’s capacity discipline is the single most important variable in whether the AI cycle becomes a bubble, what Terafab in Texas changes, why the Pareto frontier of AI models has flipped from Google dominance to Anthropic and OpenAI dominance in nine months, the shift from all you can eat AI subscriptions to usage based pricing and what that means for revenue scaling, Richard Sutton’s bitter lesson as the largest risk to the AI trade, why frontier tokens still capture an overwhelming share of economic value, the role of continual learning as the third great open question, why most new chip startups should not try to build a better GPU, why Cerebras did something different and hard, why disaggregated inference may extend GPU useful lives to ten or fifteen years and rescue the private credit industry, why being in the token path is the new venture filter, the new prisoner’s dilemma around releasing frontier models via API, an honest rating of Google, Meta, Amazon, and Microsoft, why personal safety is becoming a real AI era risk, and why he remains an AI optimist maximalist who believes this could be the next Pax Americana.

Key Takeaways
- Anthropic added eleven billion dollars of ARR in one month, more than the combined businesses of Palantir, Snowflake, and Databricks built across a decade. There is no precedent for this in the history of capitalism.
- The SaaS and cloud revolution created between five and ten trillion dollars of value over twenty years. AI is replaying that compression on a timeline measured in months.
- The March selloff was a drawdown driven by disagreement with price action, not invalidated thesis. That is the kind of drawdown an investor can lean into.
- DeepSeek Monday in January 2025 was a similar setup. By the day of the selloff, AWS Asia GPU prices had already doubled, GPU availability had fallen, and it was obvious reasoning models would be vastly more compute hungry at inference. The market priced the opposite.
- The Strait of Hormuz closing was actually positive for America. US natural gas (the primary input into US electricity, which feeds AI) fell twenty percent on Bloomberg while Asian and European natural gas doubled or tripled. American manufacturing competitiveness improved overnight.
- The US is now the world’s largest producer and exporter of oil and gas. The economy is dramatically less energy intensive than in the 1970s. The shortage trauma comparison does not hold.
- Tech as a sector traded as cheaply versus the rest of the market in early April as at any point in the last ten years, into the single most bullish moment for AI fundamentals on record.
- Anthropic is dramatically more capital efficient than OpenAI, having burned roughly eighty percent less to reach a similar revenue scale. They have very different structural returns on invested capital.
- Anthropic at roughly nine hundred billion for fifty billion of ARR (growing a thousand percent) is striking. Adjusted for compute constraint, the unconstrained run rate could be one hundred fifty to two hundred billion, putting the implied multiple closer to five times.
- Claude Opus generates roughly seventy percent fewer tokens for the same question than previously, with token quantity tied to answer quality. Subscribers on flat-fee plans are getting a lobotomized model.
- Elon Musk’s superpower is twenty years of making investors money. He never pushes valuation. SpaceX compounded at a low-thirty-percent annual rate for a decade because Musk treats fair pricing as a sacred covenant.
- Capitalism will solve the watts shortage. The current bottleneck has shifted from chips and energy to zoning and political approval. Many capex decisions are paused until after the US midterms.
- The watts shortage probably begins to alleviate in 2027 and 2028. Orbital compute solves it longer term.
- Orbital compute is not Pentagon sized data centers in space. It is racks in space. A Blackwell rack is three thousand pounds, eight feet tall, four feet deep, three feet wide. SpaceX has shown a satellite roughly that size.
- The satellites operate in sun synchronous orbit so solar wings (around five hundred feet per side) always face the sun and the radiator on the dark side always points to deep space.
- Starlink V3 satellites already run at around twenty kilowatts. A Blackwell rack runs at one hundred kilowatts. SpaceX engineers express genuine confidence they have already solved cooling and radiator design at these scales.
- Racks in space are connected with lasers traveling through vacuum, the same lasers already on every Starlink. SpaceX operates the world’s largest satellite fleet and, via xAI Colossus, the world’s largest data center on Earth.
- Inference will move to orbit. Training will stay on Earth for a long time. Terrestrial data centers remain valuable for the rest of an investor’s career.
- The wafer bottleneck is structural and political. TSMC is essentially Taiwan’s GDP, water, and electricity. The leaders see themselves as inheritors of Morris Chang’s sacred legacy and they do not behave like a Western public company.
- Jensen Huang has never had a contract with TSMC. The relationship is run on handshakes and the assumption that things will be fair over time.
- If TSMC did everything Jensen wanted, Nvidia could be selling two to three trillion dollars of GPUs in 2026 and 2027. TSMC’s discipline is the single largest factor preventing a true AI bubble.
- Historically, foundational technologies always get a bubble. Railroads, canals, the internet. The current AI buildout is overwhelmingly funded out of operating cash flow, GPUs are running at one hundred percent utilization, and that is fundamentally different from the year 2000 fiber overbuild.
- If one of Intel or Samsung Foundry catches up at the leading node, the other will follow, and TSMC’s discipline collapses. Watch TSMC capacity decisions to predict a bubble.
- Terafab, the SpaceX and Tesla joint venture to build the world’s largest fab in America, has a partnership with Intel that grants access to fifty years of institutional foundry knowledge. The A teams at ASML, KLA, Lam Research, and Applied Materials will follow Elon’s reputation in hardware engineering.
- The hiring playbook for Terafab includes building Taiwan Town, Japan Town, and Korea Town next to the fab. Recruit the engineers and import their families, their restaurants, and their staff.
- Frontier tokens still capture an overwhelming share of all economic value created at the model layer. This is surprising and is one of the three big open questions for AI investing.
- The Pareto frontier of intelligence versus cost has flipped. Nine months ago Google’s TPU dominated every point on the frontier. Today Anthropic and OpenAI dominate, with Grok 4.3 on the frontier and Gemini 3.1 hanging on.
- Google’s conservative TPU V8 design (partly an attempt to reduce dependence on Broadcom and Nvidia) is the leading explanation for the loss of per token cost leadership.
- AI pricing is shifting from all you can eat to usage based, mirroring the cellular and long distance industries. Cellular stopped being a great growth industry when it went all you can eat. AI just made the opposite move.
- OpenAI and Anthropic together could exceed two hundred billion in ARR this year if compute keeps coming online and frontier token pricing holds.
- The two hundred fifty dollar a month consumer AI plan is no longer enough to evaluate frontier capability. Enterprise plans with usage based billing are required because rate limits are now severe.
- The three biggest open questions for AI investors are: violation of the bitter lesson via ASI or human ingenuity, whether frontier tokens keep commanding their premium, and when continual learning arrives.
- Today’s continual learning is crude reinforcement learning during mid training on verifiable tasks. True continual learning means weights updating dynamically, like a human who learns the first time they touch fire.
- Trying to build a better GPU is a losing strategy. Jensen will copy any design that wins one to three percent share. Startups should target one percent share, do something different, and make it hard enough that Nvidia cannot fast follow.
- Disaggregated inference (separating prefill and decode) opens new design canvases. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently.
- Cerebras did something different and hard with wafer scale computing. Three generations of chips and real grit to get there.
- Disaggregation of inference may stretch GPU useful lives to ten or fifteen years, dropping financing costs from low sevens to five or six percent, mathematically lowering the cost of the AI buildout and likely saving the private credit industry from its SaaS loan exposure.
- Sellers of shortage outperform buyers of shortage. But owning the largest installed base of what is currently in shortage (hyperscaler CPU fleets, for example) is also a strong position.
- Most of the economic value at the application layer of AI has been destroyed, not created. The exceptions are companies in the token path or in niches small enough that frontier labs ignore them.
- Coding may be the shortest path to ASI. If you can write code, you can write code that does anything. Cursor, Cognition, and Anthropic correctly focused on it.
- Jensen could probably get close to the frontier with his own Nemotron family of models whenever he wants. The fact that he chooses not to is a strategic decision about not commoditizing his customers.
- The new prisoner’s dilemma in AI is whether frontier labs release their best model via API. If everyone agrees not to, Chinese open source falls behind. If anyone defects, the defector pulls ahead on revenue and resources, forcing everyone else to defect.
- Google still owns the largest compute installed base. Without TPU’s prior cost advantage, this matters more. YouTube data has real value in a world of robotics. GCP is going crazy.
- Meta deserves credit for becoming AI first internally faster than any other internet giant. Musa, their first MSL model, is impressively close to the Pareto frontier.
- Amazon is strong because of Trainium and robotics driven retail P&L efficiency. Nova is better than it gets credit for.
- Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Microsoft products rather than reselling to OpenAI is a courageous and probably correct call, even at the cost of an eight hundred dollar stock price.
- The hyperscalers most engaged with startups are Amazon and Nvidia by a mile, followed by Google. Broadcom is the favorite ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement and that will cost them as the best teams are now at startups.
- Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion at the speed of FaceTime is already feasible.
- Ukraine is winning largely on the back of having the best battlefield AI outside America and Israel. Adversaries are starting to internalize what AI dominance means geopolitically.
- An optimistic read is that this becomes a new Pax Americana, the way the post 1945 American nuclear monopoly was used to rebuild Germany and Japan rather than dominate.
- AI cured a friend’s daughter’s rare disease by spinning up a research effort that identified an already-marketed drug capable of impacting her condition. That is the upside that keeps Gavin an AI optimist maximalist.
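
The takeaway above about GPU useful lives and financing costs can be sanity-checked with the standard level-payment annuity formula. This is a sketch under loud assumptions, not Baker's own math: capex is normalized to 100, the five-year baseline life is a placeholder the episode does not state, "low sevens" is taken as 7.2 percent, and "five or six percent" as 5.5 percent. `annual_payment` is a hypothetical helper name.

```python
def annual_payment(principal: float, rate: float, years: int) -> float:
    """Level annual payment that repays `principal` over `years` at `rate`."""
    if rate == 0:
        return principal / years
    return principal * rate / (1 - (1 + rate) ** -years)


if __name__ == "__main__":
    capex = 100.0  # normalized GPU capex (assumption)
    scenarios = [
        ("baseline: 5-year life at 7.2%", 0.072, 5),    # assumed baseline
        ("stretched: 10-year life at 5.5%", 0.055, 10),
        ("stretched: 15-year life at 5.5%", 0.055, 15),
    ]
    for label, rate, years in scenarios:
        print(f"{label}: {annual_payment(capex, rate, years):.2f} per year")
```

Both levers push the same way: a longer life spreads the principal over more years, and a lower rate shrinks the interest on top, so the annual cost of carrying the same hardware falls.
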

Detailed Summary

The most extraordinary moment in the history of capitalism

Gavin’s framing of the current moment is unusually direct. Anthropic added eleven billion dollars of annual recurring revenue in a single month. The three highest-profile SaaS companies of the past decade-plus (Palantir, Snowflake, and Databricks) took a decade and tens of thousands of employees collectively to build the combined business that Anthropic added in thirty days. He has been investing through every major tech cycle and says there is no historical analog. Not the dotcom era, not the cloud transition, not mobile. This is its own thing.

The market response, then, was peculiar. The NASDAQ sold off into the single most bullish moment for AI fundamentals on record. Tech traded at roughly its widest discount versus the rest of the market in a decade. Investors who said they wished they had bought into AI during 2022, during COVID, or during DeepSeek Monday got the same valuation setup again in early April, this time with an even clearer inflection.

Why the Strait of Hormuz closing was secretly bullish for America

One reason the macro fear in March may have been mispriced is that the same geopolitical event that drove the selloff was, in practice, a relative benefit to the United States. American natural gas, the input into American electricity, which is the input into American AI training and inference, fell roughly twenty percent. Asian and European natural gas prices doubled or tripled. The US emerged with sharply improved relative manufacturing competitiveness, which is exactly what the current administration cares about.

The 1970s comparison does not hold. The US economy is dramatically less energy intensive, it is now the world’s largest producer and largest exporter of oil and gas, and there are no shortages, only price moves. That backdrop made it easier for disciplined investors to stay focused on AI fundamentals through the volatility.

Anthropic and OpenAI valuations on an unconstrained run rate

Anthropic at roughly nine hundred billion for fifty billion of ARR sounds rich until you adjust for the fact that the company is severely compute constrained. Gavin estimates that, unconstrained, Anthropic might be at one hundred fifty to two hundred billion in run rate revenue, putting the implied multiple closer to five times. He also points out that Claude Opus now generates roughly seventy percent fewer tokens for the same question than it used to. Token quantity correlates with answer quality, and Anthropic is rate limiting and shrinking outputs to ration capacity across its user base.
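
The multiple arithmetic is simple enough to check directly. The sketch below uses only the figures cited above (a roughly nine hundred billion valuation, fifty billion of reported ARR, and an unconstrained run rate of one hundred fifty to two hundred billion); `implied_multiple` is a hypothetical helper name, and all amounts are in billions of dollars.

```python
def implied_multiple(valuation_bn: float, run_rate_bn: float) -> float:
    """Valuation divided by annualized revenue, both in billions of dollars."""
    return valuation_bn / run_rate_bn


if __name__ == "__main__":
    valuation = 900.0
    for label, run_rate in [
        ("reported ARR", 50.0),
        ("unconstrained, low end", 150.0),
        ("unconstrained, high end", 200.0),
    ]:
        print(f"{label}: {implied_multiple(valuation, run_rate):.1f}x")
```

On reported ARR the multiple is 18 times; on the unconstrained estimates it lands between 4.5 and 6 times, which is where the "closer to five times" figure comes from.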

Anthropic and OpenAI are also structurally very different. Anthropic has burned around eighty percent less cash than OpenAI to reach a comparable revenue scale. That implies very different long term returns on invested capital, though OpenAI has done a better job locking in compute and Sarah Friar is one of the most exceptional CFOs Gavin has worked with.

Why neither lab is raising at a three trillion dollar valuation

The answer Gavin gives is that both labs are deliberately leaving valuation on the table the way Elon has done for two decades. SpaceX compounded at low thirty percent annually for a decade because Elon never pushed price. The result is a permanent superpower of access to capital. Investors trust him because they have made money with him for twenty years. That is a moat that compounds with every round.
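For a sense of scale, here is a minimal sketch of what a decade of compounding at that pace implies. The 30 percent rate is an assumption standing in for "low thirty percent"; the source does not give an exact figure.

```python
def compound_multiple(annual_rate: float, years: int) -> float:
    """Total growth multiple after compounding annually for `years` years."""
    return (1.0 + annual_rate) ** years


# Assumed rate: 30 percent, as a stand-in for "low thirty percent".
decade_multiple = compound_multiple(0.30, 10)
print(f"{decade_multiple:.1f}x over ten years")
```

Even without pushing price in any single round, that is roughly a fourteenfold increase over the decade.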

Anthropic could probably raise at a one hundred percent premium to its rumored latest mark. They are choosing not to. In an uncertain world (Ukraine, Russia, Iran, Taiwan), preserving the ability to raise more capital later at fair prices is more valuable than maximizing this round.

Watts and wafers, the two real constraints

Capitalism is solving the watts problem. The leading PE infrastructure investors now say zoning and political approval, not chips or energy, are the gating factors. Companies are deferring big capex announcements until after the US midterms. Turbine capacity is being doubled at the manufacturers. Companies like Boom Supersonic are repurposing jet engines for grid use. Watts probably ease meaningfully in 2027 and 2028 and then orbital compute does the rest.

Wafers are the harder problem because they live in Taiwan, run on handshakes, and depend on a corporate culture that does not respond to public market incentives. TSMC is essentially the GDP, water consumption, and electricity consumption of Taiwan. Its leadership treats the company as the legacy of Morris Chang. The Silicon Shield doctrine is real and internal.

Orbital compute as racks in space

The biggest mental update Gavin asks listeners to make is to stop picturing data centers in space as Pentagon sized space stations. A Blackwell rack is three thousand pounds and roughly the size of a refrigerator. SpaceX has shown a concept satellite of about that size. Solar wings extend five hundred feet to each side and the radiator extends hundreds of feet behind, both possible because the orbit is sun synchronous and the orientation is fixed relative to the sun.

SpaceX engineers Gavin has spoken to at Starbase express genuine confidence that they have solved cooling at these power levels. They have. Starlink V3 satellites already operate at twenty kilowatts. A Blackwell rack is one hundred kilowatts. The same company operates the world’s largest satellite fleet and the world’s largest data center on Earth via xAI Colossus. The racks are connected to each other with lasers traveling through vacuum, technology already deployed in every Starlink. The naysayers, Gavin observes, are armchair skeptics and Larry Ellison’s response (he is out there landing rockets, no one else is) is the right frame.

Terafab in Texas and the threat to TSMC’s discipline

Terafab, the SpaceX and Tesla joint venture, intends to be the largest fab in the world. The partnership with Intel grants access to fifty years of foundry institutional knowledge, allowing Terafab to start three to five quarters behind the leading node rather than fifteen years behind. The A teams at the semicap equipment companies (ASML, KLA, Lam Research, Applied Materials) will follow Elon’s reputation in hardware engineering the same way they followed TSMC twenty years ago when Intel stumbled.

The talent strategy is the part most observers underestimate. Recruit the best engineers globally, then import their families, their restaurants, their staff. Build Taiwan Town, Japan Town, and Korea Town next to the fab. Optimize the human experience for the people whose work matters. Intel and Samsung do not think that way.

Bubble watch and the year 2000 comparison

Every foundational technology in modern history has had a bubble. Railroads, canals, the internet. Carlota Perez documented why. Markets correctly identify the importance, diversity of opinion collapses, supply gets ahead of demand, the bubble crashes. The current cycle has two important differences. The buildout is overwhelmingly funded out of operating cash flow, not debt. Every GPU is running at one hundred percent utilization, while at the peak of the fiber bubble ninety nine percent of fiber was unused.

TSMC discipline is the single largest reason a bubble has not formed. If Jensen could buy everything TSMC could theoretically make, Nvidia could sell two to three trillion dollars of GPUs in 2026 and 2027. At some point that becomes more than the market can absorb. If Intel or Samsung Foundry catches up at the leading node, the other will too. TSMC’s pricing discipline collapses and the bubble starts.

The Pareto frontier and the loss of Google’s cost advantage

The most important chart in AI is the Pareto frontier of model intelligence versus per token cost. Nine months ago, Google’s TPU based models dominated every point on it. OpenAI, Anthropic, and xAI sat inside the frontier. Today the frontier is dominated by Anthropic and OpenAI, with Grok 4.3 on the frontier and Gemini 3.1 hanging on by subsidization more than economics. The most likely cause is Google’s conservative TPU V8 design, an attempt to reduce dependence on Broadcom and Nvidia that sacrificed per token economics.
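For readers who have not worked with the concept, a Pareto frontier here is the set of models that no other model beats on both cost and capability at once. The sketch below shows one way to compute it. The `Model` type, the field names, and every data point are made up for illustration and are not measurements of any real model.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    cost_per_mtok: float  # dollars per million tokens, lower is better
    score: float          # capability score, higher is better


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Return the models that are not dominated by any other model.

    A model is dominated when another model is at least as cheap and at
    least as capable, and strictly better on one of the two axes.
    """
    frontier = []
    for m in models:
        dominated = any(
            o.cost_per_mtok <= m.cost_per_mtok
            and o.score >= m.score
            and (o.cost_per_mtok < m.cost_per_mtok or o.score > m.score)
            for o in models
        )
        if not dominated:
            frontier.append(m)
    return sorted(frontier, key=lambda m: m.cost_per_mtok)


# Entirely hypothetical points.
models = [
    Model("A", 1.0, 50.0),
    Model("B", 3.0, 70.0),
    Model("C", 4.0, 65.0),   # dominated by B: pricier and less capable
    Model("D", 10.0, 90.0),
    Model("E", 12.0, 85.0),  # dominated by D
]
frontier_names = [m.name for m in pareto_frontier(models)]
print(frontier_names)
```

A model "inside the frontier" in the sense used above is one like C or E: some competitor is both cheaper and better.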

The bitter lesson, frontier tokens, and continual learning

Three open questions dominate AI investing. The first is whether Richard Sutton’s bitter lesson (more compute beats human algorithmic cleverness) gets violated by ASI itself optimizing for efficiency. Closer observers of AI are more skeptical of a violation. Gavin thinks ASI’s first move will be to make itself more efficient and more resourced, which is technically a temporary violation.

The second is whether frontier tokens keep capturing the overwhelming share of economic value at the model layer. Today they do, surprisingly. Gemini 3.1 Pro was mindblowing nine months ago and is intolerable today. The third is when continual learning arrives. Today’s models need a million fire touches to learn what a human learns from one. True continual learning would mean dynamic weight updates in real time and would produce a fast takeoff.

From all you can eat to usage based AI pricing

AI is shifting from flat fee plans to usage based pricing. The historical analogy is cellular and long distance. Both stopped being great growth industries when they went all you can eat. AI just made the opposite move. The consequence is that flat fee subscribers, even on premium consumer plans, get a rate limited and token throttled version of the frontier model. Enterprise plans with usage based billing are now required to evaluate true capability. Gavin thinks the combination of new compute coming online and usage based pricing is what gets OpenAI and Anthropic past two hundred billion in combined ARR this year.

Chip startups, prefill decode disaggregation, and Cerebras

Trying to build a better GPU is the wrong move. The four scaled players (Nvidia, AMD, Trainium, TPU) can copy any design that looks attractive once it reaches one to three percent share. The good news for startups is that disaggregated inference (separating prefill and decode) opens a richer design canvas. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently. Andrew Fox’s analogy is a British naval ship of the eighteenth century. Prefill is loading the cannon. Decode is firing it.

Cerebras is the model. Wafer scale computing is genuinely different and genuinely hard. It took three generations of chips to get right. Andrew Feldman and his team had the grit to keep going through chip one being a failure. The design has a high ratio of on chip compute and memory relative to shoreline IO, which is why Cerebras is now experimenting with putting an optical wafer on top of the compute wafer to solve scale out.

GPU useful lives and the rescue of private credit

One of the strongest claims in the conversation is that disaggregated inference will stretch GPU useful lives to ten or fifteen years. The skeptical narrative (GPUs are obsolete in two years, companies are cooking their depreciation books) is wrong. You can put a Cerebras system or Groq LPU in front of older Hopper or Ampere parts, use them only for prefill, and run them until they physically melt. Private credit, which is in pain from SaaS loans and which underwrote GPU loans on three to four year lives, may be saved by this.

If GPU financing rates can come down from low sevens to five or six percent, the mathematics of the AI buildout improves materially. That is a structural tailwind that compounds for years.
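To see how useful life and financing rate interact, here is a rough sketch that treats a GPU purchase as a level-payment loan repaid over the asset's life. The loan model itself is an assumption, as are the specific rates (7.25 and 5.5 percent) and lives (four and ten years), which are picked from within the ranges mentioned above.

```python
def annual_payment(principal: float, rate: float, years: int) -> float:
    """Level annual payment that repays `principal` over `years` at `rate`."""
    if rate == 0.0:
        return principal / years
    return principal * rate / (1.0 - (1.0 + rate) ** -years)


# Assumed scenarios, per 100 dollars of GPU capex.
scenarios = {
    "7.25% over 4 years": annual_payment(100.0, 0.0725, 4),
    "5.50% over 4 years": annual_payment(100.0, 0.055, 4),
    "5.50% over 10 years": annual_payment(100.0, 0.055, 10),
}
for label, payment in scenarios.items():
    print(f"{label}: {payment:.2f} per year")
```

Under these assumptions the lower rate trims the annual cost modestly, while the longer life cuts it by more than half, which is why the useful-life claim matters more to lenders than the rate move alone.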

The application layer, the token path, and a new prisoner’s dilemma

Trillions of dollars of value have been destroyed at the application layer, not created. Cursor and Cognition are the rare scaled exceptions, and they got there by focusing on coding very early. As Amjad Masad noted, coding is plausibly the shortest path to ASI because a coding agent can write itself into any new domain. Jamin Ball’s frame is that the new venture filter is whether the company is in the token path. Databricks is. Most application layer startups are not.

Jensen could probably get close to the frontier with Nemotron whenever he wants, and the strategic question of whether to do that is a new prisoner’s dilemma. If every frontier lab agrees not to release best models via API, Chinese open source falls steadily behind. If anyone defects, the defector gains revenue and resources, and everyone else has to defect. The same dynamic exists between TSMC, Intel, and Samsung. If Nvidia or AMD ever truly used an alternative foundry, that foundry would catch up rapidly.
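The structure of that dilemma can be made concrete with a small payoff table. The numbers below are arbitrary and hypothetical; only their ordering matters. "hold" means keeping the best model off the API and "release" means defecting.

```python
# Hypothetical payoffs in arbitrary units for one lab.
# payoff[(my_move, their_move)] = my payoff
payoff = {
    ("hold", "hold"): 3,        # everyone withholds, all do fine
    ("hold", "release"): 0,     # I withhold, rival takes the revenue
    ("release", "hold"): 5,     # I defect alone and gain the most
    ("release", "release"): 1,  # everyone defects
}
moves = ("hold", "release")


def best_response(their_move: str) -> str:
    """My payoff-maximizing move given the other lab's move."""
    return max(moves, key=lambda mine: payoff[(mine, their_move)])


# If the best response is the same whatever the rival does, it is dominant.
dominant = {best_response(their_move) for their_move in moves}
print(dominant)
print(payoff[("hold", "hold")] > payoff[("release", "release")])
```

With this ordering, releasing is the best response to either move, even though both labs would be better off if both held. That is the sense in which one defection forces everyone else to defect.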

Rating the hyperscalers

Google has the largest compute installed base, the YouTube data that matters in a robotics world, and a search business that prints. Their loss of TPU cost leadership is the surprise of the year. If Google IO in five days does not produce a leapfrog model, the Nvidia centric narrative gets even stronger.

Meta deserves real credit. Zuckerberg made Meta AI first internally faster than any other internet giant, paid up for the talent contracts when no one else would, and shipped Musa as a first model from MSL that is close to the Pareto frontier. Amazon is well positioned on Trainium, robotics in retail, and a Nova model line that is better than it gets credit for. Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Copilot rather than reselling to OpenAI is courageous and probably correct, even at the cost of stock price.

The most interesting cross hyperscaler metric is startup engagement. Nvidia and Amazon engage deeply with startups. Google is next. Broadcom is the favored ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement, which Gavin believes will cost them as the best teams now sit at startups.

Personal safety, geopolitics, and the Pax Americana case

The closing section turns darker. Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion via something that looks exactly like your child calling on FaceTime is already feasible. Political violence against AI leaders is a real concern. Geopolitically, Ukraine is winning largely because it has the best battlefield AI outside America and Israel. How adversaries respond to that asymmetry is the next great variable.

Gavin’s optimistic frame is the Pax Americana. After 1945 the US had a nuclear monopoly and could have controlled the world. Instead it rebuilt Germany and Japan, both of which became the most reliable American allies for the next eighty years. If AI dominance plays out similarly, this is a generationally positive story rather than a destabilizing one. The personal anecdote that closes the conversation is a friend whose daughter was diagnosed with a rare genetic condition. He spun up agents, identified a drug already on the market that addresses her mutation, and her life is immeasurably different because of AI. That is the upside.

Thoughts

The Anthropic eleven billion in a month framing is the kind of stat that resets priors. The right way to interpret it is not as a one off but as a measure of how fast value can compound when the underlying technology improves on a curve steeper than the ability of the rest of the economy to absorb it. The skeptical question is whether that ARR is durable or whether it is heavily tied to a customer base of other AI companies that are themselves on a single venture funded year of runway. The bullish answer is that frontier coding, frontier research, and frontier enterprise tasks are not going to stop being valuable, and Anthropic is the best at all three. Both can be true. The number is still extraordinary.

The argument that TSMC discipline is the only thing preventing a bubble is the analytically tightest part of the conversation. The implied trade is to watch TSMC capacity additions like a hawk and to be more, not less, cautious if Intel Foundry or Samsung Foundry ever announce real share at the leading node. The Terafab thesis is more speculative but more interesting. If Elon’s talent recruiting playbook works and the Intel partnership gives Terafab a real seat at the table within five years, the geometry of the global semiconductor industry shifts in a way that is bullish for American manufacturing, bullish for power and water infrastructure in Texas, and ambiguous for TSMC itself.

The Pareto frontier discussion deserves more attention than it usually gets. Pricing leadership in AI is not a vanity metric. It determines who can subsidize free tier usage, who can absorb compute shortages, who can ship cheaper enterprise plans, and ultimately whose model becomes the default for any given workload. Google losing per token leadership in nine months is one of the most under analyzed events in the sector and it explains a lot about why Anthropic and OpenAI are growing the way they are. If Google IO does not produce a leapfrog model, the implied verdict on TPU V8 design choices gets a lot harsher.

The application layer destruction point is worth sitting with. Founders building on top of frontier models are competing in a world where the model itself moves faster than any moat they can build, where the model lab can absorb their niche if it gets interesting, and where the only protection is either deep token path integration or a niche so small the lab does not bother. That is a much harsher venture environment than the early SaaS era. The compensating opportunity is that one human can now run a hundred agents, so the ceiling on what a small team can build is correspondingly higher. The bet is that productivity per founder rises faster than competitive pressure from the labs. We will find out.

The orbital compute pitch is the section that will polarize listeners. The naive read is that this is science fiction. The closer read is that every component (sun synchronous orbit, laser interconnect, twenty kilowatt satellite buses, ten thousand satellite manufacturing cadence, full rocket reusability) already exists. The remaining engineering problems are repair, maintenance, and radiator scale, all of which are real but tractable on a five to ten year horizon. The strategic implication is that the political and zoning ceiling on terrestrial data centers becomes less binding if orbital compute is a credible alternative for inference workloads. The investor implication is that being short the watts and cooling complex on a five year horizon is a real trade, not a meme.

Watch the full conversation here.
May 20, 2026
Jensen Huang at Stanford CS153 Frontier Systems on Co-Design, Agentic Computing, Vera Rubin, Open Models, and the Million-X Decade That Reshaped AI Infrastructure
https://www.youtube.com/watch?v=tsQB0n0YV3k

NVIDIA CEO Jensen Huang returned to Stanford for the CS153 Frontier Systems class (the room nicknamed itself “AI Coachella”) to lay out, in raw form, how he thinks about the computer being reinvented for the first time in over sixty years. Across roughly seventy minutes of student questions he walks through the codesign philosophy that gave NVIDIA a million-x decade, the architectural through-line from Hopper to Grace Blackwell to Vera Rubin to Feynman, the case for open source foundation models, the realities of tokens per watt and MFU, energy demand running a thousand times higher, the China and export-control debate, and his own biggest strategic mistakes. Watch the full conversation on YouTube.

TLDW

Huang argues every layer of computing has changed: the programming model, the system architecture, the deployment pattern, the economics. Co-design across CPUs, GPUs, networking, storage, switches and compilers gave NVIDIA roughly a million-x speed-up over ten years versus the ten-x Moore’s Law era, and that headroom is what let researchers say “just train on the whole internet.” Hopper was built for pre-training, Grace Blackwell NVLink72 for inference and reasoning (50x over Hopper in two years), Vera Rubin is built for agents that load long memory, call tools and need a low-latency single-threaded CPU bolted directly to the GPU, and Feynman extends that to swarms of agents that spawn sub-agents. Open weights matter because safety, sovereignty (230-plus languages no one else will fund) and domain models for biology, autonomy, robotics and climate need a foundation that NVIDIA is willing to seed. Compute is not really the scarce resource (Huang says place the order and the chips ship), the broken thing is institutional budgeting that can’t put a billion dollars into a shared university supercomputer. Energy demand is heading a thousand times higher and this is finally the moment market forces alone will fund sustainable generation. On geopolitics he rejects the GPUs-as-atomic-bombs framing and warns America will end up like its telecom industry if it cedes two thirds of the world. On career he advises seeking suffering on purpose. On strategy he says observe, reason from first principles, build a mental model, work backwards, minimize opportunity cost, maximize optionality.

Key Takeaways
- The computing model has been substantially unchanged since the IBM System 360, sixty-plus years ago. Huang’s first computer architecture book was the System 360 manual. AI is the first true reinvention.
- Old computing was pre-recorded retrieval. New computing is generated, contextually aware and continuous. Cloud was on-demand. Agentic systems run continuously.
- Codesign is NVIDIA’s central thesis. It is inherited from the RISC era of Hennessy at Stanford and Patterson, and extended across CPUs, GPUs, networking, switches, storage, compilers and frameworks, all optimized together.
- The result of full-stack codesign: roughly 1,000,000x faster compute over ten years, versus a generous 10x to 100x for Moore’s Law in the same period. Dennard scaling effectively ended a decade ago.
- That million-x speed-up is what unlocked “train on all of the internet” as a realistic AI strategy.
- After GPT, Huang says it was obvious thinking was next. Reasoning is just generating tokens consumed internally, then using tools is generating tokens consumed externally. Agentic systems followed predictably.
- Education needs AI baked into the curriculum, not just taught as a subject. Pre-recorded textbooks cannot keep pace with knowledge being generated in real time.
- Huang says he cannot learn anymore without AI. He has the AI read the paper, then read every related paper, then become a dedicated researcher he can interrogate.
- Mead and Conway and the first-principles methodology of semiconductor design are still worth learning even though most of the scaling tricks have been exhausted.
- NVIDIA itself is one of the largest consumers of Anthropic and OpenAI tokens in the world. One hundred percent of NVIDIA engineers are now agentically supported. Huang recommends Claude and similar tools by name and says open-source downloads will not match the integrated product harness.
- NVIDIA still invests heavily in open foundation models because language and intelligence represent the codification of human knowledge. Five pillars: Nemotron (language), BioNeMo (biology), Alpamayo (autonomous vehicles), GR00T (humanoid robotics) and a climate science model (mesoscale multiphysics).
- Sovereign language models matter. Roughly 230 world languages will never be a top priority for a commercial frontier lab. Nemotron is near-frontier and fully fine-tunable so any country can adapt it.
- Safety and security require open weights. You cannot defend against or audit a black box. Transparent systems let researchers interrogate models and let defenders deploy swarms.
- The future of cyber defense is not bigger-model-versus-bigger-model. It is trillions of cheap fast small models like Nemotron Nano surrounding the threat.
- Domain models fuse language priors with world models. Alpamayo learned to drive safely on a few million miles instead of billions because it can reason like a human about the road.
- MFU (Model FLOPs Utilization) is a misleading metric. Huang says he wants low MFU, because that means he over-provisioned every resource and never gets pinned by Amdahl’s law during a spike.
- The xAI Memphis cluster running at 11 percent MFU is not necessarily a failure mode. In disaggregated prefill plus decode inference you can deliver very high tokens per watt with very low MFU.
- The right metric is performance, ultimately tokens per watt as a proxy for intelligence per watt, and even that needs adjustment because not all tokens are equal. Coding tokens are worth more than other tokens.
- Hopper was designed for pre-training. NVIDIA chose to build multi-billion-dollar systems when the largest existing scientific supercomputer cost $350 million, with no proven customer base. It worked.
- Grace Blackwell NVLink72 was designed for inference, especially the high-memory-bandwidth decode phase. It is the world’s first rack-scale computer and delivered a 50x speed-up over Hopper in two years, against an expected 2x from Moore’s Law.
- Vera Rubin is designed for agents: long-term memory wired into storage and into the GPU fabric, working memory, heavy tool use, and Vera, a multi-core CPU optimized for low-latency single-threaded code, so a multi-billion-dollar GPU system does not stall waiting on a slow tool call.
- Feynman is being shaped for swarms of agents with sub-agents and sub-sub-agents, a recursive software topology that demands a new compute pattern.
- Tokens per watt improved 50x in one generation. Compounding energy efficiency is the lever NVIDIA controls directly.
- Total compute energy demand is heading roughly a thousand times higher than today, possibly two orders of magnitude beyond that. Huang says he would not be surprised if the estimate is low.
- For the first time in history, market forces alone are enough to fund solar, nuclear and grid upgrades. Government subsidies are no longer required to make sustainable energy investment rational.
- Copper interconnect is becoming a bottleneck. Photonics is moving from optional to structural inside racks and across them.
- Comparing NVIDIA GPUs to atomic bombs, Huang says, is a stupid analogy. A billion people use NVIDIA GPUs. He advocates them to his family. He does not advocate atomic bombs to anyone.
- If the United States cedes two thirds of the global market to competitors on policy grounds, the American technology industry will end up like American telecommunications, which was policied out of existence.
- Huang directly rejects AI doom-by-singularity narratives. It is not true that we have no idea how these systems work. It is not true that the technology becomes infinitely powerful in a nanosecond. He calls the rhetoric irresponsible and harmful to the field students are about to enter.
- On Stanford specifically: if the university president places an order, NVIDIA will deliver the chips. The bottleneck is that no university department has a billion-dollar compute budget because budgeting is fragmented across grants. Stanford’s $40 billion endowment is more than enough to fix that.
- “It’s Stanford’s fault” is meant as empowerment. If something is your fault, you can solve it.
- Career advice: do not optimize purely for passion. Most people do not yet know what they love. Pick the job in front of you and do it as well as possible. Even as CEO, Huang says, 90 percent of the work is hard and he suffers through it.
- Suffering on purpose builds the muscle of resilience. When the company, the team or the family needs you to be tough, that muscle has to already exist.
- NVIDIA’s first generation of products was technically wrong in nearly every dimension: curved surfaces instead of triangles, no Z-buffer, forward instead of inverse texture mapping, no floating point. The strategic recovery, not the technology, taught Huang the lessons that have lasted decades.
- The biggest clean strategic mistake Huang names is the move into mobile chips (Tegra). It grew to a billion dollars then went to zero when Qualcomm’s modem dominance shut NVIDIA out of the 3G to 4G transition. The recovery into automotive and robotics (the Thor chip is the great great great grandson of that mobile lineage) was real, but Huang refuses to rationalize the original choice.
- Forecasting framework: observe, reason from first principles, ask “so what” and “what next” until you have a mental model of the future, place your company inside that model, then work backwards while minimizing opportunity cost and maximizing optionality.
- Best part of the CEO job: living at the intersection of vision, strategy and execution surrounded by people capable enough to make ambitious visions real. Worst part: the responsibility for everyone who joined the spaceship, especially in the near-death moments NVIDIA had four or five times early on.
- Underrated insider note: Huang’s first apple pie with cheese, first hot fudge sandwich and first milkshake all happened at Denny’s. The Superbird, the fried chicken and a custom Superbird-style ham and cheese with tomato and mustard are his order.
Detailed Summary

Computing reinvented from the ground up

Huang frames the moment as the first true rewrite of the computer in sixty-plus years. From the IBM System 360 forward, the mental model of writing code, running code, taking a computer to market and reasoning about applications stayed roughly constant. AI changes the programming model itself. Software is no longer a compiled binary running deterministically on a CPU. It is a neural network running on a GPU producing generated, contextual, real-time output. That cascades into how companies are organized, what tools developers use, what the network and storage stack look like, and what an application is even allowed to do. Robo-taxis, he notes, are an application no one would have attempted before deep learning unlocked perception.

Codesign and the million-x decade

Codesign is the philosophical center of the talk. Huang traces it to the RISC work of John Hennessy at Stanford, where simpler instruction sets won by being co-designed with the compiler rather than maximally optimized in isolation. NVIDIA extends the principle across every layer simultaneously: GPU architecture, CPU architecture, NVLink and NVSwitch fabrics, photonic interconnects, networking silicon, storage paths, CUDA libraries, frameworks and ultimately the model design. The numbers Huang gives are arresting. Moore’s Law in its prime delivered roughly 100x per decade. By the time Dennard scaling broke, real-world gains had compressed to roughly 10x. NVIDIA’s codesigned stack delivered between 100,000x and 1,000,000x over the same ten-year window. That non-linear speed-up is, in Huang’s telling, the precondition for modern AI: it is what allowed researchers to stop curating training sets and just feed the entire internet to the model.
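Decade-scale multiples are easier to compare as annual rates. The sketch below converts the figures quoted above into a per-year factor by taking the tenth root. The helper name `annual_factor` is hypothetical, and the inputs are the round numbers from the talk rather than measured benchmarks.

```python
def annual_factor(total_multiple: float, years: int) -> float:
    """Per-year improvement factor that compounds to `total_multiple`."""
    return total_multiple ** (1.0 / years)


# Round numbers quoted in the paragraph above.
rates = {
    "late Moore's Law (10x per decade)": annual_factor(10.0, 10),
    "prime Moore's Law (100x per decade)": annual_factor(100.0, 10),
    "codesign (1,000,000x per decade)": annual_factor(1_000_000.0, 10),
}
for label, factor in rates.items():
    print(f"{label}: {factor:.2f}x per year")
```

On these numbers the codesigned stack improves at roughly 4x per year, against about 1.6x for Moore's Law in its prime and about 1.26x after Dennard scaling broke.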

Education has to fuse first principles with AI tools

Asked how curriculum should evolve, Huang argues AI must be integrated into the learning process, not just taught about. He recalls Hennessy writing his textbook by hand a chapter a week while Huang was a student, and says pre-recorded textbooks cannot keep up with the rate at which AI generates new knowledge. He describes his own learning workflow: hand the paper to an AI, then have it read the entire surrounding literature, then treat the AI as a dedicated researcher who can be interrogated. At the same time he defends the classics. Mead and Conway are still the foundation. Most modern semiconductor scaling tricks have been exhausted, but knowing where the field came from sharpens judgment when designing what comes next.

Open source and the five domain pillars

Huang gives one of the most detailed public accounts of why NVIDIA invests so heavily in open foundation models even while being a top customer of closed labs. He recommends Claude and OpenAI by name for production coding work, and says 100 percent of NVIDIA engineers are now agentically supported. The open-weights case rests on three legs. First, language is the codification of intelligence, and there are at least 230 languages that no commercial lab will ever prioritize. Nemotron is built near frontier and released so any country or community can fine-tune it. Second, the same representation-learning approach has to be replicated in domains where the data is not internet text, so NVIDIA seeded BioNeMo for biology, Alpamayo for autonomy, GR00T for humanoid robotics and a climate model for mesoscale multiphysics. The economics of those fields would never produce a foundation model on their own. Third, safety and security require transparency. A black box cannot be defended or audited, and the future of cyber defense is not bigger-model-versus-bigger-model but swarms of cheap fast small models like Nemotron Nano surrounding the threat.

MFU is the wrong metric, tokens per watt is closer

A student raises the leaked memo that the xAI Memphis cluster is running at 11 percent Model FLOPs Utilization. Huang flips the framing. He says he would rather be at low MFU all the time, because that means he over-provisioned flops, memory bandwidth, memory capacity and network capacity. Bottlenecks shift constantly, so over-provisioning across every dimension is what lets the system absorb a spike without getting pinned by Amdahl’s law. In disaggregated inference, where prefill and decode are physically separated and decode is bandwidth-bound rather than flop-bound, NVLink72 can deliver extremely high tokens per watt while reporting very low MFU. Huang argues the right framing is performance, and ultimately tokens per watt as a rough proxy for intelligence per watt, adjusted for the fact that not all tokens are equal. A coding token is worth more than a generic token.
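Why a bandwidth-bound decode step reports low MFU can be shown with a simplified roofline estimate. This is a sketch under loud assumptions: the accelerator figures and model size are hypothetical round numbers, each weight is assumed to be read once per step and to cost about 2 FLOPs per sequence in the batch, and KV-cache traffic, attention cost and interconnect are ignored.

```python
def decode_step(params, bytes_per_param, batch, peak_flops, mem_bandwidth):
    """Simplified roofline estimate of MFU for one decode step."""
    flops = 2.0 * params * batch              # multiply and add per weight
    bytes_moved = params * bytes_per_param    # weights streamed once per step
    compute_time = flops / peak_flops
    memory_time = bytes_moved / mem_bandwidth
    step_time = max(compute_time, memory_time)
    bound = "memory bandwidth" if memory_time >= compute_time else "compute"
    mfu = flops / (step_time * peak_flops)
    return mfu, bound


# Hypothetical accelerator and model, chosen as round numbers only.
PEAK_FLOPS = 1e15      # 1 PFLOP/s
MEM_BANDWIDTH = 4e12   # 4 TB/s
PARAMS = 70e9          # 70B-parameter dense model
BYTES_PER_PARAM = 2    # 16-bit weights

for batch in (1, 8, 64):
    mfu, bound = decode_step(PARAMS, BYTES_PER_PARAM, batch, PEAK_FLOPS, MEM_BANDWIDTH)
    print(f"batch={batch:3d}  MFU={mfu:6.1%}  bound by {bound}")
```

In this toy setup the memory system is fully busy at every batch size shown, yet MFU stays in the low single digits until the batch grows large. Low MFU here reflects spare flops, not an idle machine.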

Hopper, Grace Blackwell NVLink72, Vera Rubin, Feynman

Huang gives the clearest public framing of NVIDIA’s roadmap as a sequence of architectural answers to evolving compute patterns. Hopper was built for pre-training, at a moment when NVIDIA chose to build multi-billion-dollar machines while the largest scientific supercomputer in the world cost $350 million and the marketplace for such systems was, on paper, zero. Grace Blackwell NVLink72 was the answer to inference and reasoning: a rack-scale computer that ganged 72 GPUs together because decode needs aggregate memory bandwidth far beyond a single chip. The generation-over-generation speed-up was 50x in two years, twenty-five times what Moore’s Law would have delivered. Vera Rubin is being built explicitly for agents. Agents load long-term memory from storage that has to be wired directly into the GPU fabric, they use working memory, they call tools that run on a CPU, and they wait. So the CPU has to be Vera, optimized for low-latency single-threaded code, because the multi-billion-dollar GPU system cannot afford to idle waiting on a slow tool call. Feynman extends the pattern to swarms of agents with sub-agents and sub-sub-agents, a recursive software topology that will demand its own compute pattern.

Energy demand and the grid

Huang’s energy projection is one of the most aggressive numbers in the talk. NVIDIA can compound tokens per watt by 50x per generation through codesign, but the total compute demand is heading roughly a thousand times higher, and Huang says he would not be surprised if the real figure is one or two orders of magnitude beyond that. The reason is structural: future computing is generative and continuous, not pre-recorded and on-demand. The good news, he argues, is that this is the best moment in the history of humanity to invest in sustainable generation. Market forces alone are now sufficient to fund solar, nuclear and grid upgrades. Government subsidies are no longer required to make the math work.
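
A rough way to read these numbers together is to divide demand growth by efficiency growth. The talk does not give a time horizon or say how many chip generations the demand figure spans, so the sketch below treats both as open parameters. The function name `net_energy_multiplier` and the scenario grid are illustrative assumptions, not figures from Huang.

```python
def net_energy_multiplier(demand_multiplier, efficiency_per_generation=50.0, generations=1):
    """Energy growth if token demand grows by `demand_multiplier` while tokens per
    watt improves by `efficiency_per_generation` in each of `generations` generations."""
    return demand_multiplier / efficiency_per_generation ** generations


# Demand scenarios: the quoted ~1000x, plus one and two orders of magnitude more.
for demand in (1e3, 1e4, 1e5):
    for gens in (1, 2):  # number of 50x generations elapsed (assumption)
        energy = net_energy_multiplier(demand, generations=gens)
        print(f"demand {demand:>9,.0f}x, {gens} generation(s): energy {energy:,.1f}x")
```

With a single 50x generation, a 1000x rise in demand still implies about 20x more energy, which is why the efficiency gains alone do not settle the grid question.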

Adversarial countries, export controls and the telecom warning

This is the segment where Huang is visibly fired up. He attacks the GPUs-as-atomic-bombs framing on its face. NVIDIA GPUs power medical imaging, video games and soy sauce delivery. A billion people use them. He recommends them to his family. The analogy collapses at the first comparison. He attacks the second framing, that American companies should not compete abroad because they will lose anyway, as a self-fulfilling defeat. Competition makes the company better. The third framing, that depriving the rest of the world of general-purpose computing benefits the United States, also fails on first principles: it benefits one or two American companies at the cost of an entire industry. The cautionary parallel is telecommunications. The United States once had a leading position in fundamental telecom technology and, through its own policy choices, gave it up. Huang’s worry, voiced explicitly to a room of CS students, is that they will graduate into a shell of a computer industry if the same path is repeated.

AI doom and rational optimism

In the same arc Huang rejects the science-fiction framing of AI as a singularity that arrives suddenly on a Wednesday at 7pm and ends civilization. He calls those claims irresponsible, says they are not true, and points out that the people advancing them are believed by audiences who then make policy on that basis. It is not true that no one understands how these systems work. It is not true that intelligence becomes infinitely powerful instantaneously. It is not true that there is no defense. His framing, which the host echoes as “rational optimism,” is that the goal is to create a future where people care about computers because the technology students are learning is worth mastering.

Stanford’s compute problem is Stanford’s fault

A student presses on the scarcity of compute for independent researchers, startups and universities inside the United States. Huang’s answer is sharp: there is no shortage. Place the order and the chips will arrive. The actual broken thing is institutional. University grants are fragmented across departments. No researcher can raise enough on a single grant to fund a billion-dollar shared cluster, and no one shares. He compares it to showing up at the grocery store demanding a billion dollars of tomatoes today. The solution is planning, aggregation and a campus-scale supercomputer, the way Stanford once built the linear accelerator. The endowment is $40 billion. Pulling a billion off it, contracting cloud capacity and giving every student and researcher AI supercomputer access is, in Huang’s view, obviously doable. When he says “it is Stanford’s fault” the host laughs, but Huang clarifies: if it is your fault you have the power to fix it.

Career, suffering and resilience

Asked how a CS student should spend the next few years, Huang pushes back on the standard “follow your passion” advice. Most people do not know what they love yet, because no one knows what they do not know. The bar of demanding joy from every working day is too high. Whatever the job is, do it as well as you can. Even as CEO of NVIDIA he says he genuinely loves about 10 percent of his work. The other 90 percent is hard and he suffers through it. He recommends suffering on purpose, because resilience is a muscle that only builds under load, and when the company, the team or the family needs that muscle, it has to already exist. Earlier in his life that meant cleaning toilets and busing tables at Denny’s. Today it means running a multi-trillion-dollar company.

The biggest mistakes

Huang separates technical mistakes from strategic mistakes. NVIDIA’s first generation of products was technically wrong in almost every way: curved surfaces instead of triangles, no Z-buffer, forward instead of inverse texture mapping, no floating point inside. The company wasted two and a half years. But the strategic genius of the recovery, the reading of the market, the conservation of resources and the reapplication of talent, is what taught him strategy. The clean strategic mistake he names is mobile. NVIDIA’s Tegra line grew to a billion dollars of revenue and then collapsed to zero when Qualcomm’s modem dominance locked NVIDIA out of the 3G to 4G transition. Huang explicitly refuses the comforting rationalization that the Tegra effort fed the Thor automotive chip (“Thor is the great great great grandson”). The original decision, he says, was a waste of time. The lesson is to think one or two clicks further about whether a market is structurally winnable before committing the company.

Forecasting under fog of war

The final substantive exchange is on forecasting. Huang’s method has four steps. Observe what is actually happening (AlexNet crushing two decades of computer vision research in one shot, GPT producing reasoning by token generation). Reason from first principles about why it works. Ask “so what” and “what next” recursively until a mental model of the future emerges. Place the company inside that future and work backwards. Crucially, expect to be partly wrong. Some outcomes will absolutely happen, some will likely happen, some might happen, and the strategy has to be robust across that distribution. The real cost of any strategic choice is the opportunity cost of the alternatives you did not take, so the discipline is to minimize that cost and maximize optionality while letting the journey itself pay for the journey.

Thoughts

The most useful thing in this conversation is the explicit architectural mapping of compute patterns to chip generations. Hopper for pre-training. Grace Blackwell NVLink72 for inference, because decode is bandwidth-bound and a single chip cannot supply it. Vera Rubin for agents, because tool calls stall multi-billion-dollar GPU systems and so the CPU has to be optimized for low-latency single-threaded code. Feynman for swarms. That sequence is not marketing. It is a falsifiable thesis about where the bottleneck moves next, and every other infrastructure company should be measuring itself against it. If Huang is right that swarms of sub-agents are the next dominant pattern, then the design pressure shifts from raw flops to fabric topology, memory hierarchy and storage-to-GPU latency. That has implications for everyone downstream, including the hyperscalers building competing accelerators.

The MFU section is the most intellectually generous moment in the talk. The instinct in the AI ops community has been to chase MFU as if it were a virtue. Huang argues, persuasively, that low MFU is consistent with high tokens per watt in a disaggregated inference setup, and that bottlenecks rotate fast enough that over-provisioning every resource is the rational design. That reframing matters because it changes what “scarce” means. Compute is not scarce in the way the discourse treats it. What is scarce is a coherent system designed end-to-end. The xAI 11 percent number, in that frame, is not embarrassing. It is the natural reading of a workload that is mostly decode.

The Stanford segment is the part most likely to be quoted out of context. “It’s Stanford’s fault” is a deliberately provocative line, but the underlying claim is correct and load-bearing. Compute is not gated by NVIDIA refusing to ship chips. It is gated by the fact that fragmented grant funding cannot aggregate into the billion-dollar order that NVIDIA can fulfill. The implication is that universities and national labs need a structural change in how they pool capital for compute, and that the current model of every researcher buying a handful of cards is genuinely obsolete. Huang’s nudge about pulling a billion off the endowment is concrete enough to be acted on, and other major research universities should read this segment as a direct prompt.

The geopolitical segment is the highest-stakes one. The telecommunications comparison is correct as a historical pattern, and Huang is one of the very few executives in a position to deliver that warning credibly. The unresolved tension is that the argument applies symmetrically. If American AI dominance is built by selling globally, that includes selling into adversarial states, and the policy question is where the line falls. Huang does not answer that question. He attacks the framing that lets the question be answered badly. That is a meaningful contribution to the discourse even if it does not resolve the underlying tradeoff.

The career advice section is the part the social-media clips will mishandle. “Seek suffering” reads as macho when extracted. In context it is a specific operational claim about how resilience compounds, and it is paired with the Tegra story where Huang himself paid the price of not thinking one more click ahead. That kind of self-implication is rare in CEO talks, and it is the reason the talk is worth listening to in full rather than only reading the recap.

Watch the full Stanford CS153 Frontier Systems conversation with Jensen Huang here.
May 13, 2026