In the early 1990s, scientists running one of the most ambitious ecological experiments ever attempted noticed something strange. Inside Biosphere 2, a giant sealed glass structure in the Arizona desert, trees in the rainforest biome were growing fast, yet they kept falling over before they could mature.
They had perfect light, water, nutrients, and carbon dioxide. No storms, no pests, no extreme weather. Yet many of these trees became weak and spindly, unable to hold themselves up. The reason turned out to be simple but unexpected. There was no wind. In the talk above, Biosphere 2 researcher Dr. Joost van Haren walks through the science directly from inside the structure where it happened.
This detail has become one of the most striking real-world examples of why resistance and stress are necessary for building genuine strength, in trees and, by extension, in people. It is the exact image Paul Graham reached for when he sat down with Jessica Livingston to explain how Y Combinator built durability over twenty years. We broke that conversation down in Paul Graham and Jessica Livingston on resilience at Y Combinator, where the biosphere tree sits at the center of his argument about North Stars and not behaving like a weather vane.
What Was Biosphere 2?
Biosphere 2 was a 3.14-acre closed ecological system built in Oracle, Arizona. It contained several different environments, including a tropical rainforest, an ocean, mangrove wetlands, a savannah, and agricultural areas. The goal was to study how self-contained ecosystems function, with an eye toward long-term space habitation and Earth systems research.
Crews lived inside the sealed structure for extended periods. While the project is best known for its technical and human challenges, one of the most interesting findings came from the rainforest biome, and it had nothing to do with the people living inside.
The Problem: Trees That Grew Fast but Fell Over
Pioneer tree species inside the biome grew rapidly under the ideal, protected conditions, often faster than they would have grown in the wild. However, instead of developing thick trunks and strong root systems, they grew tall and thin. Many eventually toppled or snapped under their own weight long before reaching maturity.
This was not caused by bad soil or a lack of resources. After investigation, researchers pinpointed the missing factor: mechanical stress from wind. In real forests, trees are constantly moved by even light breezes. That repeated flexing turns out to be one of the most important growth signals a tree receives.
The Science: How Wind Builds Stronger Trees
Trees do not just grow passively. They respond to physical forces in their environment through a process called thigmomorphogenesis, which means growth changes triggered by touch or mechanical stress. The most consistent effect is shorter, thicker, stiffer growth: less energy spent racing upward, more spent on a trunk that can carry the load.
When wind pushes on a tree, it creates tension and compression in the trunk and branches. The tree reacts by producing stress wood, also called reaction wood. This specialized tissue has a different cellular structure that makes it denser and stronger. It helps the tree resist bending and recover its upright position, and it helps the tree position itself for better light. Wind also drives deeper, more robust root systems for better anchorage in the soil.
Without any wind inside the sealed Biosphere 2 environment, the trees skipped this reinforcement process entirely. They poured energy into rapid upward growth instead of building the structural support needed to sustain it. The result was fast but fragile trees that could not hold themselves up.
Why This Story Resonates Beyond Trees
The Biosphere 2 tree observation quickly became a favorite metaphor for resilience, and it is easy to see why. Trees that grow in perfect comfort, with no resistance, often become weak. The same principle appears to apply to people. When life is completely sheltered from difficulty, growth can happen quickly but stay shallow. Challenges, setbacks, and friction force the development of stronger internal structure: better coping skills, emotional steadiness, and real capability.
This is the move Paul Graham makes when he talks about resilient companies. In his conversation with Jessica Livingston, he argues that organizations fail when they behave like weather vanes, swinging with every gust of public opinion, and that durability comes from stress rather than from being protected from it. The biosphere tree is his shorthand for the whole idea. You can read the full breakdown of that talk in our companion piece, Paul Graham and Jessica Livingston on resilience at Y Combinator.
None of this means constant hardship is ideal. Too much wind snaps a tree, and too much stress breaks a person. The useful lesson is narrower and more practical: some appropriate resistance is necessary for strength to develop at all.
Practical Takeaways
For gardeners: Many people run a small fan on seedlings and young plants to simulate wind. Brushing the seedlings by hand a few times a day does the same thing. Both strengthen stems and prevent weak, leggy growth before transplanting.
For parents and educators: Shielding children from all discomfort and failure can leave them less prepared for real challenges later. Age-appropriate responsibility and natural consequences are the wind that builds their stress wood.
For personal growth: Avoiding all difficulty tends to keep people fragile. Deliberately taking on manageable challenges, the kind that flex you without breaking you, builds greater capacity over time.
For teams and organizations: Cultures that remove all friction often produce brittle groups. Constructive challenge, honest feedback, and real stakes tend to create stronger, more adaptable teams.
Frequently Asked Questions
Did the trees actually fall over in Biosphere 2?
Yes. Observations from the project confirmed that trees in the rainforest and savannah areas grew quickly but became structurally weak and prone to falling, and the absence of wind was identified as the key reason.
What is stress wood?
Stress wood, or reaction wood, is specialized wood tissue trees produce in response to mechanical forces like wind. Its altered cell structure increases strength and helps the tree stay stable and upright.
Is this just a metaphor or is the science real?
The science is real. Trees genuinely require mechanical stress from wind to develop proper structural strength, a phenomenon documented as thigmomorphogenesis. The application to human and organizational resilience is metaphorical but directionally accurate. Some resistance builds capability.
Can I apply this with houseplants or garden seedlings?
Yes. Placing a gentle fan near young plants, or brushing them lightly by hand each day, is a common and effective way to encourage stronger stems through simulated wind stress.
The trees in Biosphere 2 had everything they needed to grow tall, except the one thing that would have made them strong enough to stay standing. Nature includes wind for a reason. Without resistance, growth stays superficial. With the right amount of it, real strength has a chance to form. For more on how the same idea plays out in companies and founders, read our breakdown of Paul Graham and Jessica Livingston on resilience at Y Combinator.
Biosphere 2 (Wikipedia), background on the sealed Arizona ecosystem, its crews, and its scientific legacy.
Thigmomorphogenesis (Wikipedia), the formal name for how plants change their growth in response to touch and mechanical stress.
Reaction wood (Wikipedia), the specialized tissue, often called stress wood, that trees lay down to resist bending forces.
Antifragility (Wikipedia), Nassim Taleb’s framework for systems that gain strength from stress and disorder, the broader principle behind the biosphere story.
For the very first episode of Disaster Proof, the conversation goes to a garage in Palo Alto to sit down with Paul Graham and Jessica Livingston, the founders of Y Combinator. They have backed thousands of companies, including many now working in the resilience space, and the discussion covers what makes startups durable, why adaptability beats expertise, how Brian Chesky stumbled into founder mode at Airbnb, why the best ideas grow out of a founder’s own life, and the two specific risks (AI and climate change) that Paul says are the only ones he treats as genuinely game over. You can watch the full conversation on YouTube here.
TLDW
Paul Graham and Jessica Livingston explain why constant change favors young, flexible founders, and why Y Combinator picks people over ideas precisely so its judgment never goes obsolete. They unpack adaptability as the trait they hunt for in interviews, the “founder mode” story behind Brian Chesky steering Airbnb through COVID, and the 2008 strategy of funding tough, close-to-revenue “cockroaches.” Paul argues a company survives turbulence by sticking to a North Star instead of acting as a weather vane in shifting moral fashions, using the biosphere tree that collapses without wind as his metaphor for resilience. They turn to climate and energy as the next great market, the difficulty of selling into utilities, the Gridware success story, fusion no longer being thirty years away, and the trap of guilt-based business models versus the reliable assumption that users are selfish, greedy, and lazy. The personal-resilience half covers surviving Twitter mobs, Paul’s obsessive essay process, raising kids by indulging curiosity and picking your battles, prepping by living among reasonable people, political polarization, and why AI and climate are the two things that keep them up at night.
Thoughts
The most useful idea in this conversation is also the most counterintuitive: a world that feels like it is ending is structurally good for the people least invested in how it used to work. Paul’s point to terrified founders is that change is only a threat if you have sunk costs in the old order. A young founder has been doing the current plan for two weeks, so a step-function shift in the landscape costs them almost nothing to abandon. The incumbents with elaborate machinery and a decade of assumptions are the ones who should be afraid. That reframes resilience away from defense and toward optionality. The resilient party is not the one with the thickest walls, it is the one with the least to unlearn.
The founder mode discussion is worth sitting with because it quietly overturns a generation of management orthodoxy. The old rule was that a good CEO hires executives and gets out of their way, and that getting into the details is micromanaging. Brian Chesky’s COVID experience at Airbnb broke that rule under maximum pressure. With bankruptcy on the table and a travel company facing a world that stopped traveling, he went line by line through the business and told people what good looked like, then gave them freedom to execute against that standard while still demanding visibility. The interesting nuance is the permission structure. A crisis granted Chesky the license to be involved that normal operating conditions would have framed as meddling. The lesson is not “always be in the weeds,” it is that the founder’s deep understanding and disproportionate caring are assets you are wasting if you reflexively delegate them away.
Paul’s North Star argument is the part most likely to age well. His claim is that companies fail at resilience when they behave like weather vanes, swinging with each gust of public moral fashion. He pairs it with the biosphere tree that grows weak and topples because it was never exposed to wind. Both metaphors point at the same thing: resilience is built by surviving stress while holding your shape, not by avoiding stress and not by reshaping yourself to whatever the crowd currently rewards. The carbon-credit companies he mentions are the cautionary case. They built their entire premise on a fashion (customer guilt about carbon) and went out of business when the wind changed direction. Durable businesses convert a permanent human motive into value, which is why he prefers the brutally honest assumption that the user is selfish, greedy, and lazy, and that your job is to build something that produces good outcomes anyway.
The climate and energy section reframes a worthy cause as a market-timing bet rather than a moral appeal, and that is the more powerful version. The comparison to fintech in 2008 is the tell. Banking technology was a sleepy, unglamorous sector that venture investors avoided until a crisis cracked it open and made it one of the best categories of the following decade. The argument is that energy and the physical world are sitting at a similar precipice, made newly viable because hardware is starting to behave more like software (order components, assemble, do not build everything from scratch) and because AI’s hunger for power has made energy the binding constraint on the whole industry. The Gridware story crystallizes the founder lesson underneath all of it. The best founder for a hard physical problem was a lineman who worked the electric lines and lived through the fires. The idea grew authentically out of his life, which is the same pattern Jessica keeps returning to and the same advice they give for raising kids.
Finally, the personal-resilience material is more practical than it first appears. Paul’s method for surviving a Twitter mob is pattern recognition: once it has happened twenty times, you know it ends in two days and they move on to the next target, so you wait it out instead of capitulating. His essay process is the same conviction-building engine applied to ideas. He goes sentence by sentence until there is no false statement left to attack, which is why his challenge to angry readers (“point out the incorrect statement”) almost never gets answered. The throughline across the company advice, the parenting advice, and the personal advice is identical. You build durable conviction not by sitting in a room thinking, but by working the problem until it is right, then refusing to be blown off course by people who never actually engaged with the substance.
Key Takeaways
Experts are frequently wrong because they are experts in a previous version of the world, so Paul deliberately avoids permanent beliefs about the current state of technology.
Y Combinator picks startups by picking founders, not ideas, because the founders know more about the ideas than the investors do.
Living in England and visiting for each batch lets Paul arrive every quarter expecting the world to be different, which keeps his mind open instead of anchored.
A world of constant change feels bad but is actually good for a young, flexible founder who has only been on the current plan for two weeks and can switch easily.
Vibe coding went from kind-of-works to reliably works, and even experienced programmers now generate huge volumes of code with AI.
There is still a software business even with AI, because someone has to know what to tell the AI to write, and no company is going to write its own database from scratch.
The scenario Paul worries about is model companies spinning up agents to start all the startups themselves, removing the need for human founders.
The founder traits Jessica looks for are unchanged over the years: determined, flexible-minded, and willing to adapt.
In interviews you can spot rigid founders because they answer the question they prepared rather than the one they were asked, and the gears visibly grind when you redirect them.
A good adaptability signal is a founder who says “I haven’t thought about that, but here is how I would think about it” instead of freezing.
Founder mode, the term, came from Brian Chesky’s experience steering Airbnb through COVID, when bankruptcy was openly discussed in board meetings.
Ken Chenault, the former American Express CEO on Airbnb’s board, told Chesky the moment was ten times worse than 9/11 and could define the company.
Founder mode meant Chesky understood every line item, told people what good looked like, then gave them freedom to execute while still wanting to see it.
Founders see through the fog because they understand the company better than anyone and they care more than anyone, and combining understanding with caring lets them see more.
There is always some disaster at Y Combinator, the way a hospital always has someone coding, so a crisis is the normal operating environment, not an exception.
During the 2008 crash, YC kept funding because it is always a good time to start a startup, but focused on people close to making money and very tough founders they called cockroaches.
Airbnb was the ultimate cockroach, seemingly indestructible, which is exactly why they liked it during the meltdown.
YC rests on two axioms: startups matter, and founders are the most important ingredient in startups. As long as those hold, YC has room to exist.
Company values are usually written down a few years in, documenting principles that already existed rather than inventing new ones.
You cannot move with fashion; you have to stick to your North Star, especially during turbulent, noisy times.
Trees grown inside a biosphere fell over because they were never exposed to wind, so being blown around is a necessary part of becoming strong enough to stand.
What preserves YC most is that it is a fundamentally good idea: it gives lonely founders money, the right peers, and colleagues they would never otherwise have.
The measure of a good startup idea is revenue, and any other metric you care about matters only because it predicts revenue.
At the early stage you can afford to be virtuous and even tell founders to go back to college, because the power law means one startup in the batch will carry the returns.
Every startup has to find early adopters, who decide quickly, usually do not have much money, and tend to be sophisticated, which means utilities are rarely your first customer.
A company that ultimately sells to utilities should start by selling to something that says yes faster, like running a pilot on a single corporate campus.
Utilities are under so much stress from wildfire liability, renewables, EV charging, and AI demand that they are unusually willing to try new things out of necessity.
Gridware, founded by a former lineman who lived through major fires, is now backed by Sequoia with PG&E as a huge customer, an example of an idea growing out of the founder’s life.
The second-biggest chunk of YC startups after AI is hard tech and physical products, not because software is dead but because building physical things is getting more possible.
Energy is one of AI’s fundamental constraints; if Sam Altman could have two things for Christmas, they would be energy and GPUs.
Nobody says fusion is thirty years away anymore, and the old thirty-year number existed because it was far enough out to avoid demands for results but close enough to keep attention.
Energy and physical markets may be where fintech was in 2008, a sleepy sector about to be cracked open by crisis into a great decade.
Guilt is a fragile business model because fashions change what people feel guilty about, which is why carbon-credit companies collapsed when the winds shifted.
Assume the user is selfish, greedy, and lazy, then build something that causes good things to happen anyway, like clean power that is simply cheaper and more reliable.
To survive Twitter mobs, remember they move on in about two days, half are bots or people you would never talk to in real life, and you cannot become a weather vane for moral fashions.
You build conviction by working on and developing an idea, not by sitting in a room thinking, unless it is pure thought like math.
Paul writes essays sentence by sentence until nothing in them is false, which is why his challenge to point out an incorrect statement almost never gets answered.
The best startup ideas, and the best projects in life generally, grow authentically out of the founder’s own interests and experiences.
Their parenting philosophy is to give kids confidence and a stable base, indulge their curiosity, and encourage projects nobody told them to do.
You pick your battles with kids: put your foot down on cruelty, but accept defeat on things like food and screen time.
A useful interview question for anyone with an unusual experience is not “what was it like” but “how was it different than you expected,” which surfaces the genuinely novel detail.
In a time of turbulence, bet on an island full of reasonable people; the English may not be very dynamic, but they are reasonable.
The hope on political polarization is to build resilient institutions that act as a cage around any single leader, so that throwing the rattle makes no difference.
AI and climate change are the two things Paul worries about most because they are both potentially game over, like the Gulf Stream reversing and turning Europe into a frozen wasteland.
Detailed Summary
Staying an expert when the world keeps changing
The conversation opens on Paul Graham’s essay “How to Be an Expert in a Changing World,” whose core point is that experts are often wrong because they are experts in a previous version of the world. Asked how he keeps his own beliefs from going obsolete when the landscape can shift in ninety days, Paul says he focuses on people. YC picks founders rather than ideas because the founders know the ideas better than any investor could. He deliberately holds no permanent beliefs about the current state of technology, and the rhythm of flying in from England for each batch helps: he arrives every quarter already expecting everything to be different. One quarter the story is everyone training open-source models, the next quarter it is Claude code and nobody bothers with open-source models because the frontier versions are better anyway. He comes in with a completely open mind. Jessica and Paul note that today’s founders are more frightened, asking what is even still true, but the message Paul gives them is that constant change favors the young and flexible. If you have only been executing a plan for two weeks, a disruption costs you nothing; you just switch.
What adaptability looks like in a founder
Jessica describes the founders she funds as determined, flexible-minded, and willing to adapt, and calls adaptability a key trait always, but especially in uncertain times. In interviews, the rigid applicants reveal themselves by answering the question they planned to answer rather than the one they were asked, and you can almost hear the gears grind when you redirect them. Paul does not let that slide; if they dodge, he just asks again. The positive signal is a founder who, faced with a question they have not considered, says “here is how I would think about it” and reasons live. Both point out that YC itself had to adapt, and that the company they funded the interviewer’s startup as in 2009 looked very different by the end. They funded him in May 2009, in the thick of the financial crisis, after he had quit his job in August 2008 and briefly felt he had made a terrible mistake.
Founder mode and seeing through the fog
Paul points to Brian Chesky as the defining example of weathering disaster, a story he explored on This Week in Startups. When COVID hit a travel company like Airbnb, the word bankruptcy was being used in board meetings, and Ken Chenault, the former American Express CEO on the board, warned it was ten times worse than 9/11. Chesky went into what would later be named founder mode, getting into every line item, understanding exactly what was needed, telling people what good looked like, and then giving them freedom to execute while still insisting on visibility. The crisis gave him permission to be the involved CEO he had always wanted to be, the kind of involvement that normal operating conditions would have labeled micromanaging. Paul argues founders see through fog that blinds everyone else for a simple, rational reason: they understand the company better than anyone because they have been there longest and thought of most of it, and they also care more than anyone. Combine deep understanding with deep caring and of course they see more.
Cockroaches, the North Star, and the biosphere tree
Returning to 2008, when YC was self-funded and unsure whether anyone would invest by March, they decided to keep going on the principle that it is always a good time to start a startup, but to fund people close to making money and very tough founders they called cockroaches, after the creatures that survive nuclear war. Airbnb was the ultimate cockroach. Paul frames YC’s longevity around two axioms (startups matter, founders are the most important ingredient) and around resilience built through stress. He tells the story of trees grown inside a biosphere that fell over because they were never exposed to wind, since being blown about is a necessary part of a tree becoming strong enough to support its own weight. YC has been blown around and is still standing, which is exactly what gave it practice. The companion idea is the North Star: you cannot move with fashion or act as a weather vane swinging with other people’s moral fashions, you have to hold your founding principles, which Paul eventually wrote down rather than let a 23-year-old new hire do it.
Climate, energy, and selling into hard markets
The interviewer’s own path (a curiosity about wildfire that grew from living in California, watching PG&E go bankrupt, a fire on his Mendocino property, volunteering as a firefighter) becomes the case for ideas that grow authentically out of a founder’s life. Climate is framed broadly as energy, the built environment, and transportation, essentially the physical world, and those are hard markets where the buyers are utilities, governments, real estate, and insurance. The advice is to find early adopters who decide quickly, which usually means not starting with a utility but with something like a single corporate campus that will say yes faster. Utilities, though, are under so much stress from wildfire liability, renewables, EV charging, and AI demand that they are increasingly willing to try new things. Gridware, founded by a former lineman who lived through major fires, is the proof point: backed by Sequoia, with PG&E as a major customer. Paul notes the second-biggest chunk of YC startups after AI is hard tech, not because software died but because building physical things is getting more possible, more like ordering and assembling components. Energy is the binding constraint on AI, fusion no longer feels thirty years away, and the bet is that energy and physical markets are where fintech was in 2008, about to be cracked open.
Guilt versus greed as a business model
On the question of whether climate companies should sell on guilt (recycle, pay more because it is sustainable), Paul is blunt that guilt is fragile because fashions change what you are supposed to feel guilty about. The carbon-credit companies thrived until buying carbon credits stopped being cool, then went out of business. A founder’s own concern for the world can drive great companies, but depending on a customer’s guilt is shallow. The durable move is to assume the user is selfish, greedy, and lazy, someone who just wants to eat pizza and watch Netflix, and to build something that produces good outcomes despite that. Clean power is the perfect example: nobody watching Netflix is upset that fusion powers their television, and if it is cheaper and more reliable, that is simply more Netflix and more money for pizza.
Personal resilience, Twitter mobs, and the essay process
On surviving public criticism, Paul’s method is pattern recognition: after twenty mobs you stop counting and know it will be over in two days when they move to the next topic, so you wait it out even though it genuinely feels miserable. Half of them are bots or people you would never talk to in real life, but the deeper point is that companies and people stay resilient by not succumbing to mobs and not becoming weather vanes for moral fashions. Conviction is built by working on an idea, not sitting in a room thinking about it, unless it is pure thought like math. His essays are the engine: he writes a version one, notices everything wrong, and fixes it sentence by sentence until there is no false statement left. He will read an entire book for a single sentence because he would be mortified to publish something false and, having no deadlines, has no excuse. That is why his standing challenge to angry readers, to point out one incorrect statement, almost never gets answered.
Raising kids, prepping, and the things that keep them up at night
Their parenting philosophy is to give kids confidence and a stable base, indulge curiosity, and encourage projects nobody assigned, like the living room overrun by one son’s Lego. They pick their battles: they put their foot down on cruelty but admit total defeat on food, devices, and screen time. Paul’s favorite question for anyone with an unusual experience is not “what was it like” but “how was it different than you expected,” which surfaces the genuinely novel detail, and the meta-version of that became the show’s recurring question to all guests. On prepping, they joke that living in the English countryside is itself a form of preparation, and that in turbulent times you should bet on an island full of reasonable people. The episode closes on what keeps them up at night: AI and climate change, the two things Paul treats as uniquely game over, illustrated by the prospect of the Gulf Stream reversing and leaving Europe, which sits as far north as Alaska, a frozen wasteland. Jessica notes her YC superhero name was Panic, and the conversation ends, after a detour through political polarization and a child who insisted for six months on being called SR-71 forecast 80 leaping leopard, on the admission that they manage screen time by being utterly defeated.
Notable Quotes
“If you’re a startup founder, a world where things are constantly changing is actually good for you. It feels bad, but you’re better off than anybody else.”
Paul Graham, on why turbulence favors young, flexible founders
“You can’t move with fashion. You have to stick to your North Star.”
Paul Graham, on holding founding principles during noisy, turbulent times
“There’s always some kind of disaster. It’s almost a rule of thumb at Y Combinator that there’s always some disaster going on, just like in a hospital. There’s always somebody who’s coding.”
Paul Graham, on crisis as the normal operating environment for startups
“The measure of a good startup idea is revenue, sure. Let’s not pretend companies are supposed to do something else.”
Paul Graham, on how to judge whether an idea is actually good
“Assume that the user is selfish and lazy, and make something. Selfish, greedy, and lazy. And make something that causes good things to happen despite that.”
Paul Graham, on why guilt is a weak business model and greed is a source of energy
“This is where the best startup ideas come from. They grow authentically out of the founders’ lives.”
Jessica Livingston, on a wildfire curiosity turning into a company
“Please point out the incorrect statement I’ve made in this essay. And no one ever does that.”
Paul Graham, on writing essays sentence by sentence until nothing in them is false
“AI and climate change have something in common. They’re the two big things I worry about the most, because they’re both game overs.”
Bill Ackman, founder and CEO of Pershing Square, joined the All-In Podcast for a conversation about how his investment approach has shifted toward permanent, long-term ownership, why he believes the highest-quality companies are being left behind by a market chasing the new new thing, and how AI is raising the risk of disruption for almost every business. He also lays out his plan to turn Howard Hughes into a Berkshire Hathaway-style compounding machine built on insurance. You can watch the full conversation here. Below is a structured breakdown of the ideas, the stories, and the frameworks he uses to underwrite a business.
TLDW
Ackman explains how his philosophy evolved from a smaller, more liquid activist toward concentrated, permanent ownership of durable, non-disruptible businesses, with much of his activism now playing out on X rather than in the boardroom. He tells the origin story of his first big trade, Wendy’s and the Tim Hortons spin-off, and explains why a large long-term shareholder on a board is an antidote to short-term markets. On AI, he argues that this is the greatest era in history to build a company, which means the risk of being disrupted has gone up enormously, and that the market is mispricing high-quality compounders like Microsoft, Meta, and Amazon while crowding into chips, semiconductors, and energy. He works through the SaaS question and why niche software is more at risk than platforms, how he underwrites SpaceX, xAI, OpenAI, Anthropic, and Palantir like late-stage venture bets using a people, opportunity, context, deal framework, and why founder-led companies have an edge in making radical calls. The back half covers his Howard Hughes plan to copy Buffett’s insurance-float model, the role of cost of capital and reflexivity in markets, the meme-stock era, going direct on social media, and the three different ways an investor can put money to work with Pershing Square.
Thoughts
The most useful idea in the interview is the way Ackman reframes disruption as the central investing problem of the AI era. His point is that the same forces making this the best time in history to start a company, meaning near-unlimited compute, capital, and talent, also raise the odds that any given incumbent gets disrupted. That reframes the word quality. It is no longer mostly about margins and moats. It becomes about non-disruptibility, which is a much higher bar than most quality investors were using a decade ago, and it is why he says most of his research time now goes into assessing that single risk.
The what-the-market-is-missing thesis is classic contrarian Ackman. Arguing that Microsoft, Meta, and Amazon are the new old-fashioned, undervalued names while capital piles into semiconductors and energy is a direct echo of 2000, when Berkshire Hathaway bottomed precisely because money was chasing internet stocks. It is worth keeping in mind that he owns all three, so the call is also his book. The durable signal here is the framework, not the specific tickers: capital reliably chases the new new thing, and genuinely high-quality businesses get left behind during those rotations.
The Howard Hughes plan is the most concrete bet in the conversation. Copying Buffett’s insurance-float playbook, short-term treasuries for policyholder money and equities for the surplus, onto a discounted real-estate holding company is elegant. The hard part is exactly what Ackman flags about insurance as an industry: the best investors go to hedge funds, not insurers, so most insurance companies only ever manage the liability side well. Pershing Square’s edge is that Ackman can both write the business and invest the float, which is the same reason it worked for Buffett. The framing of going from a four billion dollar company to a trillion over fifty years is a statement of intent, not a forecast, and should be read that way.
Underneath all of it sits cost of capital and reflexivity. His observation that a higher stock price literally makes a company more valuable, because it lowers the cost of capital and creates acquisition currency, is the mechanism behind both Elon Musk’s empire and the meme-stock era he is wary of. Going direct on X is the same lever pointed at himself: communicate the vision, lower your own cost of capital, and make the bet easier for other people to place. It is a coherent worldview in which narrative and balance sheet continuously feed each other, and it explains a lot of his behavior over the last few years.
Key Takeaways
The biggest change in Ackman’s approach over time is an appreciation for business quality, meaning long-term, durable, protected, non-disruptible growth as the most important factor.
He says he is as activist as ever, but more of it now happens on X than in the traditional corporate context.
His first big investment was Wendy’s, which owned Tim Hortons. The simple thesis was to buy Wendy’s, spin off Tim Hortons, and double the money.
Early on no one returned his calls, so he had Steve Schwarzman’s Blackstone write a fairness opinion, filed it publicly, and the company spun off Tim Hortons six weeks later. The CEO later thanked him after being fired with a large exit package.
Reputation compounds. Where Pershing Square once had to bang down the door, companies now sometimes tweet a welcome when it buys a stake.
A large long-term shareholder on a board is a counterweight to short-term markets, letting management test ideas privately and pursue initiatives that hurt the next few quarters of earnings.
Pershing Square owns Microsoft, Meta, and Amazon. Ackman argues you are either invested in AI directly or indirectly, or it is a threat, so you have to understand it.
The hardest and most important job for a concentrated investor is judging the risk of disruption, and that risk has risen dramatically.
This is the greatest era in history to build a business because of near-unlimited access to compute, capital, and talent, which is exactly why the probability of being disrupted has gone up enormously.
Markets bring their eye to the new new thing, currently chips, semiconductors, and energy, while high-quality companies get left behind.
He draws an analogy to 2000, when Berkshire Hathaway traded at one of its lowest valuations because everyone chased internet stocks. He sees a similar dynamic around Amazon, Meta, and Microsoft today.
On the SaaS question, he worries more about a Salesforce than a platform like Microsoft, because niche software charging high per-seat or per-year prices is most exposed, while low-priced platforms are safer.
Any software company today has to be as AI-enabled as possible, or risk losing the monopolistic pricing it once enjoyed.
His famous March 2020 CNBC appearance was an attempt to reach President Trump and argue for a short shutdown, paired with the view that stocks were incredibly cheap and worth buying.
He describes valuation as a tether on the market: when prices stretch too high they snap back, and when they get too cheap the same rubber band pulls valuations up. Calling that out publicly can trigger a psychological reset.
His recent bullish call came because stocks of really high-quality companies had gotten crazy cheap on fundamentals, meaning the present value of the cash they generate.
He underwrites high-multiple names like SpaceX as venture investments using a framework from business school: people, opportunity, context, deal.
On SpaceX, people and opportunity are one of one, the context is incredible, and Starlink plus near-monopoly low-cost launch make it strategically valuable. The complicated part is the deal, meaning the valuation. He invested via an SPV after Ron Baron’s nudge, and also invested in xAI.
He treats OpenAI, Anthropic, and Palantir as late-stage venture bets that have proven they can generate real revenue, and says OpenAI should do a better job communicating how it thinks about its enormous capital commitments.
Every CEO in America is asking how to use AI, how it applies to their business, and how it is a threat. It is top of mind and boards open every meeting with it.
He has not seen much enterprise AI success yet, citing a McKinsey study that 95 percent of enterprise initiatives fail and the rise of the forward deployed engineer as the hot role bridging promise and ROI. Pershing Square itself uses AI mainly for legal, compliance, and back-office work.
Founder-led companies have an advantage because founders have the authority and the economic stake to make radical calls, while the average S&P 500 CEO has a roughly three to four year tenure and is incentivized not to make mistakes.
He cites Mark Zuckerberg buying Instagram and WhatsApp as the kind of shocking-at-the-time calls that a founder with a track record can make.
Ben Graham’s enduring lesson is that a stock is an interest in a business, not a piece of paper, but Graham mostly invested in liquidations and cash-rich shells, and made most of his money on Geico.
Most of Buffett’s value at Berkshire came from owning insurance operations and focusing on the asset side of the balance sheet, not just the liability side.
Insurance is hard to copy because top investors do not go to work for insurers. Buffett owned half his company and was a great investor, which is why it worked.
Howard Hughes came out of the General Growth bankruptcy and owns master-planned cities like Summerlin, with 26,000 acres in the Las Vegas area, comparable to the Irvine Company that built roughly a hundred billion dollars of wealth for Donald Bren.
The plan is to reinvest the cash Howard Hughes generates into insurance, put policyholder float in short-term treasuries and the surplus in common stocks, and build a compounding machine over fifty years, buying it at roughly sixty cents on the dollar.
A company must earn a return above its cost of capital for the stock to rise. Elon Musk has kept his companies’ cost of capital extremely low, and a SpaceX IPO near a 1.75 trillion dollar valuation could be one of the lowest cost of equity capital transactions ever.
Markets have changed less because of Ackman and more because of figures like Ryan Cohen and GameStop, where a stock can trade well above its value on personality and an army of followers.
Higher valuations are reflexive: a rising stock price lowers cost of capital and creates currency to issue stock and acquire businesses, which is part of how Elon built Tesla.
There are three ways to invest with Pershing Square: the management company itself (a royalty on compounding assets with no capex), PSUS (a portfolio of best ideas trading at an 18 percent discount), and Howard Hughes (a bet on building the next Berkshire). A dollar invested 22 years ago became roughly 27 to 28 times net of fees.
Going direct on X, with 2.2 million followers, lets him communicate his vision and lower the friction for others to back his bets, even as his very long tweets have become a running meme.
Detailed Summary
From activist trades to permanent capital
Ackman frames the evolution of his career as a steady move toward business quality. As a smaller, more liquid investor early on, he did not have to think as long-term. As Pershing Square became a bigger, more concentrated investor, durable growth became the dominant factor in every decision. He insists he is still as activist as ever, but a lot of that energy has shifted to X, where he can argue a position publicly rather than only inside a boardroom. The best investments, he notes, are the ones where you do not need to join the board and do anything at all.
The Wendy’s and Tim Hortons origin story
One of Pershing Square’s first investments was Wendy’s, which owned the Canadian coffee and donut chain Tim Hortons. The value of Tim Hortons alone was greater than the entire value of Wendy’s, so the idea was simple: buy Wendy’s, spin off Tim Hortons, and double the money. Ackman bought ten percent of the company and could not get the CEO to return a single call, so he had a contact at Blackstone, with Steve Schwarzman’s sign-off, write a fairness opinion on what Wendy’s would be worth after a spin-off, filed it publicly, and watched the spin-off happen six weeks later. The CEO eventually called back to thank him, having been fired but rewarded with a large exit package. Over the years that scrappy approach gave way to a reputation that now opens doors on its own.
Why a long-term shareholder on the board matters
The core problem of being a public company, in Ackman’s telling, is the short-term nature of markets and analysts, when a good business should be run in the context of years and even decades. A large, supportive shareholder on the board gives management a place to test ideas before exposing them to the public and a credible voice willing to back initiatives that hurt earnings for a few quarters. That is the value-add he believes a constructive activist can bring to a mature public company, as opposed to a startup where the best outcome is simply to own a great business and stay out of the way.
AI and the rising risk of disruption
For a concentrated, long-term investor, the most challenging task is judging the risk that two people from Stanford in a garage build something that destroys your thesis. Ackman argues that risk has climbed dramatically because this is the greatest era in history to build a company, with near-unlimited access to compute, capital, and talent. The paradox is that the conditions that make building easier also make incumbents more fragile, so the bulk of his research now centers on assessing how disruptible a business really is.
What the market is missing
Investors bring their attention to the new new thing, currently chips, semiconductors, and energy, which leaves high-quality companies behind. Ackman compares the moment to 2000, when Berkshire Hathaway traded at one of its lowest valuations ever because capital was chasing internet stocks. He sees an echo today in how Amazon, Meta, and Microsoft are treated as old-fashioned, and he considers them undervalued on fundamentals, where value is the present value of the cash a business generates over its life. His recent bullish call, like his March 2020 appearance, came because stocks of really high-quality companies had simply gotten too cheap.
The SaaS question and AI-enabled software
On the so-called SaaS apocalypse, Ackman says it is a company-by-company analysis. He worries more about something like Salesforce than about a low-priced platform. The companies most at risk are those that extracted near-monopolistic profits by charging a high annual price for a niche product, because AI lowers the barrier to replicating that functionality. A platform where the average customer pays a small amount per seat, like Microsoft, is far less exposed. The takeaway for any software company is to become as AI-enabled as it possibly can.
Underwriting SpaceX, xAI, and the AI labs like venture
For the highest-multiple private companies, Ackman uses a venture lens and a framework a business school professor taught him: people, opportunity, context, deal. SpaceX scores as one of one on people and opportunity, with an incredible context and a near-monopoly in low-cost launch through Starlink, which makes even Amazon a likely customer. The complicated variable is the deal, meaning the valuation, and he admits he has not done all the math, having invested through an SPV after Ron Baron encouraged him, along with a position in xAI. He treats OpenAI, Anthropic, and Palantir as late-stage venture bets that have proven real revenue, and argues OpenAI in particular should communicate more clearly how it justifies capital commitments that vastly exceed current revenue.
Founder-led companies and the authority to act
Ackman agrees that founder-led companies have a structural advantage in a fast-changing environment. The average S&P 500 CEO has a tenure of roughly three to four years, a small economic stake, and an incentive not to make a career-ending mistake. A founder is betting an entire life and reputation, has the authority of a major voting and economic position, and has usually made several hard, contrarian calls that turned out right. He points to Mark Zuckerberg’s acquisitions of Instagram and WhatsApp, which looked shocking at the time, as exactly the kind of decision a founder with a track record can make and a hired manager often cannot.
Howard Hughes as Berkshire Hathaway 2.0
Ackman points to a detailed financial history of Berkshire Hathaway showing that the vast majority of Buffett’s value creation came from owning insurance and focusing on the asset side of the balance sheet, not just the liability side. Insurance is hard to replicate because skilled investors join hedge funds rather than insurers, but Buffett owned half his company and was a great investor. Pershing Square is applying the same idea to Howard Hughes, a company created out of the General Growth bankruptcy that owns master-planned cities such as Summerlin, with 26,000 acres around Las Vegas, in the spirit of the Irvine Company that made Donald Bren roughly a hundred billion dollars. The plan is to reinvest the company’s cash into insurance, place policyholder float in short-term treasuries and the surplus in common stocks, avoid issuing stock the way Buffett did, and compound for fifty years, all bought at around sixty cents on the dollar.
Cost of capital, reflexivity, and going direct
A company only creates value when it earns above its cost of capital, which is why Howard Hughes, seen as a high-cost-of-capital real-estate business, has long traded at a discount, and why Ackman is repurposing its assets into a higher-returning model. He highlights how reflexive markets are: a higher stock price itself makes a company more valuable by lowering its cost of capital and creating currency to raise money and acquire businesses, a lever Elon Musk used to build Tesla. He attributes real market change less to himself and more to figures like Ryan Cohen and GameStop, where personality and a following can lift a stock far above its value. His own going-direct strategy on X, with 2.2 million followers and famously long posts, is the same mechanism applied to communicating a vision and lowering friction for investors. He closes by laying out three ways to invest with Pershing Square: the management company as a royalty on compounding assets, the PSUS portfolio trading at an 18 percent discount, and Howard Hughes as a bet on building the next Berkshire.
Notable Quotes
“The best investments are one where you don’t need to join the board and do anything.”
Bill Ackman, on the kind of business he most wants to own
“The probability of your being disrupted has gone up enormously.”
Bill Ackman, on why assessing disruption risk now dominates his research
“Valuation is like a tether on the market, right? When it gets too high, it’s like this rubber band that’s stretching and inevitably it bounces back.”
Bill Ackman, on how prices revert at both extremes
“People, opportunity, context, deal.”
Bill Ackman, on the business school framework he uses to underwrite companies like SpaceX
“Every CEO in America today is like, how do I use AI?”
Bill Ackman, on AI as the top opportunity and threat in every boardroom
“A closed mouth gathers no foot.”
Bill Ackman, quoting the line a friend put next to his name in his high school yearbook
“The increase in value of the company increases the value of the company, right? Because it lowers the cost of capital, it gives you more flexibility, gives you the ability to issue stock, raise capital, acquire other businesses.”
Bill Ackman, on the reflexivity between stock price and corporate value
“The company’s got like a $4 billion market cap and the goal is to build it into a trillion dollar thing over time compounding.”
Bill Ackman, on his fifty-year plan for Howard Hughes
Taken together, the conversation is a tour of how Ackman now thinks about quality, disruption, and compounding, and a preview of the Berkshire-style machine he wants to build out of Howard Hughes. Watch the full conversation here.
Related Reading
Pershing Square Holdings the public vehicle and primary source for Ackman’s portfolio and strategy.
Howard Hughes Holdings the master-planned community company Ackman is reshaping into an insurance-driven compounder.
Uber CEO Dara Khosrowshahi sat down with Patrick O’Shaughnessy on the Invest Like the Best podcast for a long, candid conversation about the forces remaking transportation. There is artificial intelligence inside the company, and there is physical AI out in the real world, meaning autonomous vehicles, robotaxis, and delivery drones. He calls the autonomous opportunity another trillion dollar marketplace and argues it will change how society operates. You can watch the full interview here. What follows is a structured breakdown of the most useful ideas, the strategy behind Uber’s AV bet, and the operating philosophy that runs underneath all of it.
TLDW
Dara Khosrowshahi explains how he brought order to the chaos he inherited at Uber in 2017 by treating hard problems like vector mathematics, and how an immigrant childhood shaped his all-in, low-stress operating style. He describes AI hitting Uber on two fronts at once: much larger digital models that predict rider intent, and physical AI that changes how rides and food get fulfilled in the real world. The conversation covers Uber blowing through a full year of AI budget in a single quarter, metering headcount as engineers become superhuman, the more than 30 AV partnerships with Waymo, Nuro, Lucid, Nvidia, Wayve, and Pony AI, and why supply, not demand, is the whole game. It runs through the coexistence model borrowed from travel and Uber Eats, the Uber One membership flywheel at 50 million members, the push from on-demand to planned travel through hotels and Uber Reserve, the economics of cheaper autonomous cars and delivery drones, the regional race from the Middle East to Europe, and the lessons from Barry Diller and Herbert Allen about getting to ground truth and betting on people. It closes on his capital allocation philosophy of prioritizing organic growth and AV commitments over buybacks.
Thoughts
The most underappreciated line in the whole interview is the budget one. Blowing a full year of AI spend in a single quarter is the clearest signal yet that frontier intelligence is being consumed far faster than even an AI-native company planned for. Dara’s response has quietly become the default enterprise playbook: explore on the expensive frontier models, then scale the proven interactions onto cheaper or open-source models. The deeper tension is that he is simultaneously telling teams to drive adoption and metering headcount, which is the real story of AI in large companies. The productivity gains are showing up as fewer hires, not only as faster shipping.
The supply-first framing is the strategic core, and it inverts the demand-first logic he learned at Expedia. In autonomous vehicles this means Uber does not need to win the self-driving race itself. It needs to own the demand layer and aggregate every AV maker’s supply, the same way online travel agents coexist with hotels and Uber Eats coexists with McDonald’s. The 30 percent higher utilization figure for AVs on Uber’s network is the wedge in that argument. It is the reason a Waymo stays on the platform even while building its own brand, because filling more of an expensive asset’s day changes the entire return on the car.
His premortem answer is unusually honest. Asked what kills the opportunity, he does not name an Uber-specific execution failure. He names AI’s unpopularity with the general public. That is a CEO admitting the gating factor is social license, not technology. The early data he leans on, drivers in Austin and Atlanta earning more and signing up in greater numbers as AVs add incremental demand, is the counter-narrative he is betting the public conversation on. Whether that story holds as AV volume scales from thousands of vehicles to hundreds of thousands is the open risk the entire industry shares.
Underneath the strategy is one repeated instinct: get to ground truth. It shows up in the Barry Diller story about reading the model from the analyst who built it, in his hunt for the troublemakers who keep a company mutating, and in the fact that he bought an ebike to deliver food in San Francisco. It is the same move applied at every altitude, and it is why he frames AI as a chance to rebuild processes from first principles rather than shave 20 percent off the ones that exist. The leaders who treat AI as an efficiency tool will likely lose to the ones who rebuild from the ground up.
Key Takeaways
Dara took the Uber job in 2017 after Daniel Ek recommended him at the Allen and Company Sun Valley conference and told him, when he hesitated, that life is about impact rather than happiness.
He inherited what he calls complete chaos: a board fighting for control, lost trust with regulators and the public, and a committee running the company after Travis Kalanick stepped back.
His method for chaos is to treat it like vector mathematics, breaking a seemingly unassailable problem into component dimensions and solving each one.
Early moves included bringing in chairman Ron Sugar to unite the board, running a listening tour with stakeholders, and rebuilding the executive team with leaders like Andrew McDonald and Tony West.
He credits an engineering mindset and an immigrant childhood for his calm under pressure. His family lost everything leaving Iran when he was nine and rebuilt from nothing.
On parenting, he argues that overcoming challenges is what forms people, and that doing everything for your kids is a long-term disservice disguised as a short-term favor.
Uber has always operated in a probabilistic real world of traffic, cancellations, and late food, so it has used machine learning longer than most consumer companies.
The current inflection is AI on two fronts: larger digital models that predict intent, and physical AI that changes how Uber fulfills in the real world.
Uber’s feed and search models are now roughly 10,000 times bigger than the older ones, enabling universal search across rides, eats, and grocery in a single query.
Uber can already guess a rider’s destination about three quarters of the time, turning booking into a one-tap interaction.
AI adoption is bottoms-up across engineering, legal, and marketing. Developers in India are driving roughly ten times the code commits using autonomous agents.
Dara pushes teams to rebuild processes from first principles with AI rather than settling for 20 to 30 percent optimization of an existing process.
He wants the rebels and troublemakers to win, and treats unpredictable internal adoption patterns as something to find and promote.
Uber blew through its full-year AI budget in a single quarter, which is now forcing it to meter headcount as engineer throughput climbs.
The token strategy is to explore on expensive frontier models, then scale proven interactions onto cheaper or open-source models.
Uber generates over 10 billion dollars in free cash flow on more than 10 billion trips a year, but it is not a high-margin business, so efficiency funds lower prices and higher earnings.
In autonomous vehicles, the thesis is supply: own the demand layer and aggregate every AV maker’s vehicles, the way Uber aggregates drivers and restaurants.
Uber has more than 30 AV partnerships, including Waymo, Nuro, Lucid, Nvidia, Wayve, and Pony AI.
Uber is building the surrounding ecosystem: depots, charging, fleet partners, a one billion dollar Santander financing line for EV and AV fleets, and autonomous insurance.
AVs operating on Uber’s network are about 30 percent busier in trips and revenue per vehicle per day than vehicles not on the network, which transforms the return on an expensive car.
The build, partner, or buy answer is coexistence, mirroring how travel agents coexist with hotels and airlines and how Uber Eats coexists with McDonald’s, Starbucks, and Chipotle.
His public premortem is that AI’s unpopularity, not Uber-specific execution, is the biggest risk, so the company must move at the pace society will accept to avoid backlash.
Early data in Austin and Atlanta shows drivers earning more and more drivers joining, suggesting AVs are adding incremental demand rather than only displacing humans.
AV hardware costs typically fall 30 to 40 percent per generation. A Lucid midsize built with Nuro could land around 60,000 to 70,000 dollars and bring transportation costs down.
Lower cost expands demand. Uber already dwarfs the taxi market it was once sized against, and Dara expects the same dynamic with AVs.
Traditional OEMs are now investing in L4-ready systems and should arrive over the next two to four years. Each AV drives roughly three to four times what a human driver does.
Chinese manufacturing capability and bill of materials are described as unrivaled. A low-cost Western, Foxconn-style player for AVs is being worked on but does not exist yet.
Drones are gated by battery density. Food and grocery drones should reach real scale in two to five years and become normal in five to ten, with Joby and Zipline cited as examples.
The Middle East, including Abu Dhabi, Dubai, and Saudi Arabia, is moving fastest thanks to entrepreneurial regulators. Europe is catching up, with London robotaxi pilots expected before year end.
Uber Eats wins the number one position more often internationally. The playbook is selection plus reliability, amplified by cross-platform upsell, with about 13 percent of Eats bookings coming from the mobility app.
Uber One has 50 million members growing 50 percent year on year. Dara frames it like Netflix, more content for the same price, and accepts a first-year loss for multi-year profit.
Uber is pushing from on-demand to planned through hotels, via a deal with Expedia, and through Uber Reserve, now at over a 5 billion dollar run rate with 99 percent-plus reliability.
His leadership lessons: from Barry Diller, get to ground truth from source material and tell the truth as a leader. From Herbert Allen, bet on people, not companies.
On capital allocation, he prioritizes organic growth and financialized AV commitments over buybacks, while keeping costs growing slower than revenue.
Detailed Summary
From chaos to structure: the 2017 turnaround
Dara came to Uber from 13 years running Expedia under Barry Diller, recruited through a head hunter after Daniel Ek floated his name at the Sun Valley conference. He arrived into what he describes as complete chaos, with the board fighting over control rather than the fate of the company and trust badly damaged with regulators, the public, and employees. His approach was to decompose the situation the way an engineer decomposes a multidimensional problem, solving each dimension and reassembling the whole. Practically that meant a new chairman in Ron Sugar to unite the board, a listening tour to understand stakeholder concerns, and a rebuild of the leadership team that kept strong insiders like Andrew McDonald while adding people like Tony West.
An engineering mind and an immigrant chip on the shoulder
His wife Sid calls him a robot, by which she means he does not get rattled. He traces that to an engineering education and to a childhood upheaval. His family left Iran when he was nine and lost the business his father had built, and he watched that loss diminish his father over the years. The experience produced a durable drive to rebuild and a refusal to let external chaos define him internally. He applies a similar philosophy to his kids, arguing that challenges and the act of overcoming them are what form a person, and that helicopter parenting removes the very friction that builds capability.
AI inside Uber: prediction, agents, and superhuman engineers
Uber has always lived in a probabilistic world where the digital booking is deterministic but the real-world fulfillment is not, so it adopted machine learning earlier than most consumer companies. The newest models are roughly 10,000 times larger than the prior generation and power universal search and destination prediction that is right about three quarters of the time. Internally, adoption is bottoms-up and uneven in a good way, with engineers in India shipping around ten times the code commits using autonomous agents. Rather than mandate from the top, Dara pushes teams to rebuild whole processes from first principles with AI instead of trimming a fifth off the existing ones.
The cost of intelligence
The flip side of fast adoption is cost. Uber blew through its annual AI budget in a single quarter, and that is forcing a real adjustment. Because engineer throughput is climbing, the company is metering headcount increases rather than simply hiring. The operating rule is to keep driving adoption while pursuing efficiency, using frontier models from providers like OpenAI and Anthropic to experiment with new interactions, then moving the scaled experiences onto more efficient or open-source models to bring the per-token cost down. With more than 10 billion dollars of free cash flow on over 10 billion trips, Uber is not a high-margin business, so efficiency directly funds lower prices for riders and higher earnings for drivers.
Why supply decides the AV race
At Expedia, Dara learned a demand-first model where you attract consumers and then build inventory to match. Uber is the opposite, a supply company, where securing every car, restaurant, courier, and retailer causes the demand to follow. Applied to autonomous vehicles, the strategy is to be the go-to-market and demand layer for anyone building a digital driver. Uber wants to aggregate the largest pool of AV supply, just as it aggregates human drivers, so that the companies building the actual self-driving software can focus on the driver while Uber handles distribution and utilization.
Building the ecosystem around the digital driver
Uber now has more than 30 AV partnerships spanning Waymo, Nuro, Lucid, Nvidia, Wayve, and Pony AI, and it expects many winners rather than one, the same shape as the foundation model market. Around those partners it is assembling the connective infrastructure: depots and charging in cities where the regulatory path is opening, fleet partners, a one billion dollar financing line with Santander for EV and AV fleets, and work on autonomous insurance. It is also collecting street data today that can feed the models, so that when a partner’s cars hit the market there is instant demand waiting. The early proof point is that AVs on Uber’s network run about 30 percent busier than comparable vehicles off it, which materially improves the return on a costly car.
The premortem and the public’s patience
Asked what derails the opportunity, Dara points outward rather than inward. The risk is that AI is powerful but unpopular, and the average person experiences it as a threat to electricity costs or a cousin’s job rather than as magic. The same dynamic could hit AVs even though the technology should end up safer than human drivers, which is why questions about emergency services, equitable access, and driver earnings have to be worked through with regulators and communities. The encouraging early signal is in Austin and Atlanta, where drivers are making more money and more are joining because AVs appear to be adding incremental demand. The controllable risk, he says, is access to supply, which is exactly why Uber has partnered with nearly every AV provider across mobility, delivery, and freight.
A trillion dollar marketplace: cheaper cars and delivery drones
Dara sizes the autonomous opportunity as another trillion dollar marketplace. As AV software and hardware costs fall, typically 30 to 40 percent per generation, a Lucid midsize built with Nuro could come in around 60,000 to 70,000 dollars, which starts to lower the real cost of transportation. History says lower cost expands demand, and Uber already became multiples larger than the taxi market it was once compared to. Manufacturing scales from hundreds to thousands to hundreds of thousands of vehicles, each driving three to four times what a human does, with traditional OEMs investing in L4-ready systems over the next two to four years and Chinese manufacturers setting the bar on cost and quality. Delivery drones are further out, gated mainly by battery density, but should reach real scale in two to five years and feel normal in five to ten.
Membership, hotels, and the shift from on-demand to planned
Uber Eats often reaches the number one position internationally by nailing selection and reliability and then layering on cross-platform advantages, with roughly 13 percent of Eats bookings flowing from the mobility app. Uber One, at 50 million members growing 50 percent year on year, is the loyalty engine, and Dara likens it to Netflix in that members get more for the same price. He explains the membership economics through Amazon Prime, accepting a money-losing first year to earn multi-year profit as members spend more across services. The newest expansion is travel: hotels through a deal with Expedia, and a broader move from Uber’s on-demand brand toward planned bookings, proven out by Uber Reserve at a 5 billion dollar-plus run rate and 99 percent-plus reliability. The end state he wants is a trip where Uber pre-books your ride to the airport, knows your hotel, and brings in-market magic to the whole journey.
Operating philosophy: ground truth, troublemakers, and capital allocation
The mentors thread through everything. From Barry Diller, with whom he worked for more than 20 years, he took the discipline of getting unfiltered truth from the source, illustrated by Diller insisting on hearing the Paramount LBO model from the young analyst who built it. From Herbert Allen he took the lesson to bet on people rather than companies, because great people stay great across cycles. In his own practice that becomes radical transparency, a deliberate hunt for the troublemakers who act as the mutations that keep an organism from dying, and a willingness to be wrong, since learning, often through pain, is what he finds interesting. On capital, he treats allocation as an art, prioritizing organic growth, which took Uber Eats from under a billion to over a hundred billion in gross bookings, then AV commitments that can be financialized, with buybacks coming after growth rather than instead of it.
Notable Quotes
“I know who I am, and I’m always going to be that same person. I’m not going to let the chaos of the world affect me mentally.”
Dara Khosrowshahi, on why crisis does not rattle him
“We blew through our AI budget in a quarter, you know, for the whole year essentially. And it is forcing us to adjust.”
Dara Khosrowshahi, on the real cost of AI adoption at Uber
“What’s magical now is going to seem normal to all of us 10 years from now.”
Dara Khosrowshahi, on how fast riders stop noticing autonomous vehicles
“We think it’s another trillion dollar marketplace.”
Dara Khosrowshahi, on the scale of the autonomous vehicle opportunity
“If we do that, the demand will take care of itself.”
Dara Khosrowshahi, on why Uber obsesses over securing supply first
“I’m looking for those mutations. I’m looking for those troublemakers constantly.”
Dara Khosrowshahi, on keeping a large company adaptive
“It’s the filtering that gets the edge out of the story or out of the situation. And it’s often the edge that gives you an edge.”
Dara Khosrowshahi, on a lesson from Barry Diller about going to the source
“If I’m not wrong, if I’m not making mistakes, it’s just not very interesting.”
Dara Khosrowshahi, on why learning, often through pain, drives him
“Meeting her and seeing her operate, I think, finally allowed me to be the person I want to be versus the person I thought I was supposed to be.”
Dara Khosrowshahi, on his wife Sid, when asked the kindest thing someone has done for him
The throughline is that Uber intends to be the demand layer for autonomous transportation the way it became the demand layer for human drivers, while rebuilding its own operations around AI from first principles. Whether the public grants the industry enough patience is the open question Dara keeps returning to. Watch the full conversation here.
Related Reading
Uber primary source for the company, products, and AV partnerships discussed in the interview.
This is the full episode of Naval Ravikant’s conversation with three frontier founders: Guillermo Rauch of Vercel, Blake Scholl of Boom Supersonic, and Max Hodak of Science. The premise is that all three are building their own factories rather than assembling off-the-shelf parts, so the interesting question is not what they are building but what they are learning about how to build in the age of AI. Over roughly an hour the discussion moves from software factories and the thousand-x engineer into hardware, regulation, healthcare economics, autonomous companies, and a long closing argument about what humans can still uniquely do. Watch the full conversation on the Naval Podcast YouTube channel. We previously published two segments of this same discussion: part one, Waste Tokens to Save Time, on software factories and whether pure software is dead, and part two, Vibe Coding Hardware, on jet engines, vertical integration, and China’s open-source bet. This post covers the entire episode end to end.
TLDW
Four builders argue that AI has turned the engineer’s job from shipping output into building the factory that produces output, which is why token leaderboards are the new vanity metric and why you should waste tokens to save time. Guillermo Rauch frames the thousand-x engineer and the building-block economy, and asks whether pure software is dead now that models speak English. Blake Scholl shows how Boom turned hardware engineering into software, letting two engineers design an entire jet engine and collapsing months of regulatory compliance documentation into minutes. Max Hodak makes the case for extreme vertical integration, a captive MEMS foundry, and a sober counter to Silicon Valley deregulation triumphalism: the bottleneck is the voters and the regulator’s asymmetric incentives, not just bad rules. The group works through healthcare as a fixed-bucket non-market, China’s cost-reduction strategy and its approved implantable brain interface, autonomous software that runs site reliability and security research with thousands of concurrent agents, a company-wide hackathon where the receptionist shipped a real automation, and a long debate on creativity, out-of-distribution surprise, intent, attribution, and the definition of art. The throughline: humans become verifiers, value moves to creativity, taste, and agency, and the single best move is to get extremely good with the tools, because it is people with AI versus people without AI.
Thoughts
The strongest idea in the episode is the quiet redefinition of what an engineer is for. Rauch’s point is that you no longer judge a person by how well they ship a single output. You judge them by whether they can build the factory that produces outputs B through Z. That reframe instantly explains why token leaderboards are nonsense. Counting tokens consumed is the same category error as counting lines of code written, a measure of motion mistaken for a measure of progress. Naval’s “waste tokens, save time” is the correct response: tokens are cheaper than people, so optimize for your own wall-clock time and the final output, and throw three models at the same problem if that gets you unstuck faster. The uncomfortable corollary, which the group says out loud, is that leverage in idea domains was never linear. The hundred-x and thousand-x engineer is not a new phenomenon. AI just made it impossible to keep pretending otherwise.
The second thread that ties the whole hour together is verification. Everyone converges on the same future: humans stop producing the work directly and move up the stack to signing off on it. Rauch is precise about what that means. Saying “I understand this pull request” no longer requires reading every line. It requires being able to say you wrote the test harness, the proofs, the type checkers, and the simulations that let you stand behind it in production. That is a profound shift, because it accepts that the code may be spaghetti you do not fully understand while insisting that the evaluator around it is trustworthy. Blake extends the same logic to regulation, and this is the most underrated argument in the episode. If you treat a 200-page lightning-strike compliance document as a test suite and a regulation as an exit criterion for an agent loop, then a body of rules you once resented becomes a guard rail that lets you move faster, not slower. The cost of change collapses, change aversion drops, and you can finally afford to iterate on physical things.
Max Hodak is the adult in the room on regulation, and the episode is better for it. The Silicon Valley consensus is that regulation is simply friction to be deleted, and there is plenty of dysfunction to point at: the NRC permitting essentially zero nuclear plants for decades, the FDA’s asymmetric incentives where approving a bad drug ends a career but blocking a good one costs nothing visible. But Hodak keeps pulling the conversation back to the harder truth. This is where the voters are. If you removed the current regulatory package, something very similar would get voted right back in, because the asymmetry reflects how the public actually weighs a visible death against an invisible delay. Real reform is not “deregulate,” it is narrow and surgical: prohibit the FDA from drawing adverse inferences across different users of a compound, build innovation zones where people consent to different rules, or copy Europe’s notified-body model so review capacity can actually scale. That is a far more serious position than the usual abundance-or-bust framing.
The healthcare segment is the part of this conversation you will not find in the two clips, and it is the most heterodox. Hodak’s diagnosis is that healthcare is a fixed bucket of money that grows with tax receipts, not a technological growth industry where falling prices expand the market the way phones and laptops did. Because there is no real private market, you get a small communist society running inside a larger capitalist one, with the waiting lines and frozen product quality that implies. His prescription is not single payer and not insurance reform. It is to drive the cost of bringing devices and drugs to market so low that a patient can buy a restored sense or an extra decade of life on a credit card, the way they finance a car, and his warning is that China’s lower approval costs and its already-approved implantable brain interface put it on track to do exactly that. Whether or not you buy the twenty-percent-of-income deductible he floats, the framing that a private market is the missing feedback loop is the kind of argument that gets too little airtime.
The closing debate on creativity is where the four of them disagree most productively, and they are careful enough to notice that their conclusions follow from their definitions. Hodak defines art as meaningful out-of-distribution behavior, which lets a military maneuver or a math proof count, and leads him to think a sufficiently capable model gets there too. Naval defines art as conveying an emotion with intent, which makes attribution load-bearing: the same photo down to the last pixel means more when a human took it, and a startup doing hardware attestation of human authorship suddenly has a real market. The shared observation that should worry every builder is that AI output collapses to a distribution mean. Every Claude-built website ends up the same serif font, the same brown and cream, the same monospace spacing, recognizable as slop precisely because it is in-distribution. The optimistic read, and the one Naval lands the episode on, is that this leaves an enormous and durable lane for humans who can step outside the system, and that the practical move for everyone is simply to become excellent with the tools, because the real divide is people with AI versus people without.
Key Takeaways
The job of an engineer has shifted from shipping a single output to building the factory that produces multiplicative outputs, so people are now judged on the leverage they create rather than the work they personally do.
There were always 10x engineers, and in idea, intellectual, and digital domains the real spread is 100x or 1000x. AI leverage just made that gap impossible to deny.
Token leaderboards and token consumption are the new lines-of-code: a measure of activity that does not map to value. Measure your own time and the final output instead.
Waste tokens to save time. Models are still far cheaper than a human, so throwing Codex, Claude, and Gemini at the same problem repeatedly is rational even when it looks wasteful.
Low-quality first-pass code is fine because you can spend more tokens later to harden it for production. The constraint is verifiable domains, not code quality.
A model is roughly as good as you are in a domain. The quality of your prompting and reprompting strongly determines the output, though this dependence should fade as models improve.
Models graduated from junior to principal engineers: they now return with multiple routes and tradeoffs rather than running away with the first idea, even if their time and cost estimates are often wrong.
A junior gets knowledge they could never have produced alone, but an experienced architect still extracts far more juice. Taste and judgment, like picking Postgres versus ClickHouse, remain the human’s edge.
Pure software’s moat is in question now that models speak fuzzy, sloppy English. For hardware founders this is a boon, since good software finally becomes cheap to produce.
The building-block economy, from Mitchell Hashimoto, argues agents need powerful reusable infrastructure rather than reinventing queues and databases every time. Shared dependencies are a cooperation value, like everyone depending on the same Postgres version.
Naval and Max both stopped writing code for years, then started building software they use daily through agents, on the strength of understanding how the pieces fit rather than syntax.
With agents you stop getting stuck on narrow debugging problems that used to consume indefinite time. The intrinsic frustration that was once “how you learn” is largely gone.
Boom turned siloed hardware engineering, much of it trapped in Excel and VBScript with no source control, into real software with automated testing and repeatable flows.
Software engineers now build the architectures and hardware engineers vibe code their pieces, letting two engineers design an entire jet engine where a single turbine-blade analysis once took one engineer a full day across a thousand blades.
Enterprise collaboration software and even spreadsheets are getting cooked, because you can now code the exact custom tool you need instead of approximating it.
AI will soon generate step files and PCB layouts, bringing the current software boom to mechanical and electrical engineering, likely within the year.
China is betting on open-source models because its hardware and supply-chain superiority pairs with on-demand software generation to erase Silicon Valley’s software advantage. Fall behind on generating software and you fall behind on generating everything.
In real usage, frontier intelligence dominates the top. Gemini “slaps at scale” as an industrial production model for support and browser automation, while Chinese models are not in the frontier coding tier.
Intelligence is an unalloyed good. Because mistakes are invisible and models are cheaper than people, you reach for the smartest available model rather than running a weaker one many times.
Max’s vertical integration thesis: when you cannot buy a part, you make it. Science owns a captive MEMS foundry because tighter integration toward a single block of bonded matter yields lower power, smaller size, and longer life.
AI’s biggest near-term impact inside hardware companies is regulatory: generating documentation and tracing which of thousands of ISO standards apply, work that used to occupy a quality team for months.
Junior engineers got promoted to senior and junior engineering got handed to agents. The same pattern hits law, where basic NDAs and red lines no longer require a lawyer.
Humans are becoming verifiers. Signing off on a PR means standing behind its consequences via tests, proofs, and type checkers, not reading every line. Creating software is easy; keeping it secure, tested, and maintained 1000 days out is the real question.
A RAG over regulatory documents collapses a 200-page compliance test plan from months to minutes, which cuts change aversion: you can alter the airplane and regenerate compliance instead of crying over rework.
Regulations can act as a test suite and exit criteria for agent loops, as long as they are non-contradictory and reasonable. The alternative is shipping slop directly into the air.
Physical building is guilty until proven innocent, illustrated by the absurdity of pre-filing a driving plan before every trip. The fix is more enforcement-based regulation rather than pre-approval, though agents on both sides could trigger a red queen race and DDoS overwhelmed agencies.
Regulation often fails to make things safer, only slower: the 737 Max shipped a single sensor with full authority over pitch, and the NRC kept us perfectly safe by approving almost no nuclear plants for decades.
The deeper problem is the voters and the regulator’s asymmetric incentives. Approve a bad thing and your career ends; block a good thing and nobody notices. Removing one agency just elects its replacement.
Targeted fixes beat blanket deregulation: bar adverse inferences across users of a compound, use single-patient IND pathways, create opt-in innovation and YIMBY zones, or adopt Europe’s competitive notified-body reviewers.
Healthcare is a fixed bucket of money tied to tax receipts, not a growth industry, so spending 10x more on it would be a catastrophe rather than a triumph. With no private market you run a small communist society inside a capitalist one.
The escape is lower cost-to-market, not single payer, so people can finance care like a car. China’s lower approval costs and its already-approved implantable BCI point that direction. LASIK, dental, and plastic surgery advance because patients pay directly.
End-of-one medicine works at the high end, as with GitLab’s Sid Sijbrandij outliving his cancer prognosis through a self-built escalation ladder, but it demands enormous agency at the patient’s weakest moment. AI should democratize that knowledge.
Vercel automated much of site reliability engineering: anomalies fire alerts, an agent investigates, can open an incident, and begins remediation, stopping just short of changing production itself.
Running an open-sourced security tool against the whole monorepo with 10,000 concurrent agents produced several quarters of security research in a couple of days for about $14,000 in tokens. Code translation and optimization are similarly autonomous now.
Blake stopped all project work for a week and had everyone, receptionist to engineers, build something with AI and demo it. He expected mostly silly projects and got mostly needle movers, including a real automation from shipping and receiving.
The autonomous company of the future may have a workforce that trains the agents doing the work rather than doing it directly, with tooling that extracts reusable skills from your inputs and outputs.
Returns are shifting from intelligence toward agency for humans, since agents supply the intelligence. The people best fit for the future open a coding agent and ask what to build instead of defaulting to passive consumption.
Maybe 10x more people are coding than a year ago, yet around 99% still never will, because to a non-coder the starting step remains unimaginable. Vibe coding is described as more addictive and entertaining than video games, with real output.
AI video lacks taste and judgment for now, but by 2030 expect fan-made films: dozens of Lord of the Rings takes, or generating unmade seasons of The Expanse from the books. The bigger prize is a genuinely new imaginative work, not a remix.
What humans uniquely do is generate meaningful surprise out of the training distribution, with intent that makes it mean something. Gödel stepping outside the formal system is the archetype; Claude’s identical-looking websites are the counterexample of in-distribution slop.
Higher productivity historically means you hire more, not fewer, of the productive people. Expect a larger number of smaller teams, an entrepreneurship explosion, and generalists winning as credentials matter less than creativity, taste, and judgment.
The throughline is people with AI versus people without AI. The single best investment right now is getting genuinely good with the tools and learning the exact edges of what they can and cannot do.
Detailed Summary
Software Factories and the Thousand-X Engineer
Guillermo Rauch opens with the idea that has him “pilled”: the engineer’s job has changed from shipping output directly to building the factory that produces multiplicative outputs. That reframes how you evaluate people and surfaces an old, controversial truth. He used to get flamed on Twitter for asserting 10x engineers, since it offends an equality instinct, but in intellectual and digital domains the real spread is 100x or 1000x, and choosing the right thing to work on is an infinite multiplier on top. AI leverage makes this less controversial, except that people now confuse token spend for productivity. The group agrees token leaderboards are the new lines-of-code. Max Hodak adds that a model is about as good as you are in a domain, so a capable developer gets a powerful collaborator while a junior gets junior-grade help, and the sporadic feedback you give, the reprompting, disproportionately determines the result. Naval’s posture is the opposite of fussy: he ignored every prompt-engineering trick on the bet that the models would improve faster than he could learn to game them, types less and less, and brute-forces problems by throwing multiple models at them. Waste tokens, save time, because tokens are cheaper than people.
Is Pure Software Dead, and the Building-Block Economy
Rauch describes models crossing from junior to principal engineer: they now return with several routes and explicit tradeoffs, push back when you try to jam high-cardinality telemetry into Postgres, and suggest ClickHouse or Athena instead. That elevates taste and judgment as the human contribution. He then poses the hard question: is pure software engineering obsolete now that models speak fuzzy, sloppy English and you no longer need code to communicate with them? For hardware founders it is a boon, echoing Patrick Collison’s line that software is art and artists are hard to hire. To temper the “agents reinvent everything” fantasy, he invokes Mitchell Hashimoto’s building-block economy: you do not want your agent rebuilding a queue from first principles every time it sends an email, and shared dependencies like a common Postgres version carry real cooperation value. Reusable infrastructure becomes more valuable in the agentic era, functioning like libraries and dependencies, or even a token cache, so models fork from existing starting points instead of burning a trillion tokens to recreate what exists. Naval and Max both note they had not written code in years and now build daily through agents, because understanding how APIs, data flow, and performance fit together matters more than syntax, and vibe coding is just transmitting intent the way a good engineering leader already did through people.
Vibe Coding Hardware at Boom Supersonic
Blake Scholl explains how AI changed the role of software and hardware developers at Boom. A great deal of hardware engineering lives in complex Excel spreadsheets and VBScript on individual laptops, with no source control and no automated testing, and handoffs happen manually over email like it is the 1990s. Boom had long tried to turn these flows into real software but could never afford enough software engineers. The new model is that software engineers create the architectures, because they understand systems, algorithms, and separation of concerns, and hardware engineers vibe code their own pieces. The result is mind-blowing productivity for small teams. His example: a turbine blade is cold at rest and expands when hot, so you must design both the cold and hot shapes and convert between structures and aerodynamics, work that took one engineer a full day per blade across a thousand blades in a jet. With a combined software-and-hardware tool you can now change blade geometry and see structural and aerodynamic results in real time, letting two engineers design an entire jet engine. The group extends this to the death of enterprise collaboration software and even spreadsheets, since you can now code the exact custom tool you need, and predicts AI will soon generate step files and PCB layouts, carrying the boom into mechanical and electrical engineering.
China, Open Source, and Which Models Actually Get Used
Naval argues China is going all-in on open-source models because its hardware and supply-chain superiority pairs naturally with on-demand software generation, which erases Silicon Valley’s software edge, and because the Chinese government has a history of funding ecosystem-wide efforts in network-effect businesses. Without frontier coding models there is no self-improvement, so a country that cannot generate frontier software falls behind on generating everything downstream. He notes the irony that almost all the open-source heft now comes from China, since OpenAI is not open, Grok and Google’s local models trail, and Anthropic ships no open models. On real usage, Rauch reports from Vercel’s AI gateway that frontier intelligence dominates the top, with a caveat: frontier intelligence at the right cost and performance, like Gemini, slaps at scale and is the best industrial production model for support and browser automation, while Chinese models are not in the frontier coding tier. Naval frames intelligence as an unalloyed good, since model mistakes are invisible and a smarter model is still cheaper than a person, which pushes everyone toward the most intelligent option and risks an oligopoly in AI.
Vertical Integration, Verifiers, and the Slop Problem
Max Hodak lays out Science’s vertical integration: the preference is always to buy, as with cheap PCBs from Asia, but when components do not exist you must make them, and the closer a product gets to a single block of covalently bonded matter the better it performs. Science owns a captive MEMS foundry on the east coast because there was no other way to do the packaging and assembly it needed. He notes AI’s most surprising internal impact so far is regulatory: generating documentation and tracing which of thousands of ISO standards apply, work that once tied up a quality team for months. Rauch raises the slop problem: mountains of AI-generated code arriving as pull requests nobody can read line by line. His standard is that an engineer must be able to say they understand and will stand behind the consequences of a PR, backed by the test harness, proofs, and type checkers, even without reading it all. Naval generalizes this into humans becoming verifiers, with lawyers, engineers, and operators moving to verifying the stack and standing behind it, and Rauch warns that creating software is the easy zero-to-one part while keeping it secure, tested, performant, and maintained a thousand days later is the real test.
Regulation as Test Suite, and the Voter Problem
Blake describes building a RAG that compresses a 200-page lightning-strike compliance test plan from months of a “monkey at keyboard” engineer’s work into minutes, with a powerful second-order effect: change the airplane and you regenerate compliance in minutes instead of crying over months of rework, which slashes change aversion and lets a small number of creative engineers iterate. Max reframes regulations as potentially good guard rails, a test suite and exit criteria for agent loops, provided they are non-contradictory and reasonable, since the alternative is shipping slop into the air. Naval warns of a red queen race of agent-on-agent compliance and agencies getting DDoSed by clever entrepreneurs flooding them with documents. Blake pushes for enforcement-based rather than pre-approval regulation, using the analogy that we would never tolerate filing a driving plan before every trip, yet that is exactly how physical infrastructure works: guilty until proven innocent. He cites the 737 Max’s single all-authority sensor and the NRC permitting almost no nuclear plants for decades as proof that this makes us slower, not safer. Hodak supplies the counterweight: the deeper issue is the voters and the regulator’s asymmetric incentives, where approving a bad thing ends a career and blocking a good thing goes unnoticed. Remove an agency and the electorate installs its twin. Naval and Max agree the real reforms are narrow, including innovation zones, opt-in YIMBY zones, and the experimental laboratory of fifty states.
Drug Discovery, Healthcare Economics, and End-of-One Medicine
Hodak explains why innovation zones do not solve drug discovery. The right-to-try act and single-patient IND already exist, and the FDA approves over 99% of such requests, sometimes by phone, but dosing requires clinical-grade drug that only the IP owner has, and the FDA will draw an adverse inference against the whole program if a very sick patient does worse. A targeted fix is to prohibit adverse inferences across different users of a compound. He points to Europe’s notified-body system, private certifiers blessed by governments, as a way to scale review capacity, and to China’s CFDA, which already approved an implantable brain-computer interface and brings products to market far cheaper. His core economic argument is that healthcare is a fixed bucket of money that grows only with tax receipts, unlike phones and laptops where falling prices expanded the market, so spending 10x more on healthcare would be a catastrophe rather than the triumph that 10x AI spending would be. With no private market you run a small communist society inside a capitalist one, with the lines and frozen quality that implies. The way out is lower cost-to-market so patients can finance care like a car, which is the direction China is pushing. Naval’s twist is a healthcare plan where the first 20% of income is the deductible to recreate a private market, citing LASIK, dental, and plastic surgery as fields that advance because patients pay directly. The group closes the segment on GitLab’s Sid Sijbrandij, who outlived a rare-cancer prognosis by building his own escalation ladder of drugs, noting that end-of-one medicine works at the high end but demands enormous agency exactly when a patient is weakest, which is where AI should democratize access to knowledge.
Autonomous Software, Hackathons, and the Autonomous Company
Asked how much autonomous software they run, Rauch describes Vercel automating much of site reliability engineering: instead of hand-set alarm thresholds, anomalies in error rate, latency, or throughput fire an alert, an agent investigates, can open an incident that loops in people, and begins remediation, stopping just short of changing production. Vercel also runs autonomous optimization and security research, and an open-sourced security tool run against the entire monorepo with 10,000 concurrent agents produced several quarters of security research in a couple of days for about $14,000 in tokens, the equivalent of months of red teaming. Max shares a vibe-coded bug-reporting queue where TestFlight users submit logs and screenshots, a daemon analyzes and fixes issues in the background, and ships him a build to try, raising the prospect of apps effectively built by their users, with the caveat that you would get a Homer Simpson car of every feature. Blake recounts stopping all project work for a week and requiring everyone, from the receptionist to the engineers, to build something with AI and demo it. He expected mostly silly projects and got mostly needle movers, including a genuinely useful automation from the shipping and receiving associate, concluding that most people have an idea worth building but cannot tell a good first idea from a bad one until they can iterate on a real thing. Rauch extends this to a workforce that trains the agents doing the work rather than doing it directly, and a coming feature to extract reusable skills from your inputs and outputs.
Creativity, Out-of-Distribution Surprise, and What Humans Can Uniquely Do
On the intelligence-versus-agency split, Max suggests returns to humans tilt toward agency since agents supply intelligence, while Naval counters that you stay 99% intelligence and 1% agency because the agents exercise the agency for you. They agree the humans best suited to the future are the agentic ones who open a coding agent and ask what to build. Coding has perhaps 10x more participants than a year ago, yet roughly 99% still never will, because the first step is unimaginable to a non-coder, even as vibe coding proves more addictive and entertaining than video games while producing something real. On AI video, the group notes it still lacks taste and judgment, but expects fan-made films by 2030, dozens of Lord of the Rings takes or generated seasons of The Expanse, while prizing a genuinely new imaginative work over a remix. The long closing debate turns on definitions. Hodak defines art as meaningful out-of-distribution behavior, broad enough to include a military maneuver, and expects models to reach it. Naval defines art as conveying emotion with intent, which makes attribution decisive: the same photo means more taken by a human, and a hardware-attestation startup gains a real use case. They cite Gödel stepping outside the formal system as the human archetype and the identical look of every Claude-built website as in-distribution slop. Naval lands the episode on optimism: productivity gains mean hiring more, not fewer, of the creative and AI-fluent, the future is a larger number of smaller teams and an entrepreneurship explosion where generalists thrive and credentials fade, and the single best move is to get extremely good with the tools, because it is people with AI versus people without AI.
Notable Quotes
“Now clearly there’s 100x or a thousandx engineers and the world hasn’t fully adjusted to this.”
Guillermo Rauch, on why AI made the spread between engineers impossible to ignore
“Just waste tokens, save time. Don’t look at the tokens either as inputs or outputs. Just look at your time and look at the final output.”
Naval Ravikant, on the right way to measure AI’s return
“We had to learn code to communicate with the models. Now the models speak English and they speak fuzzy sloppy English like a human and they understand things.”
Guillermo Rauch, asking whether pure software engineering is now obsolete
“It allows two engineers to design an entire jet engine, which is just wildly different.”
Blake Scholl, on Boom turning hardware engineering into software
“You need to be able to say I am signing off on understanding the consequences of this PR.”
Guillermo Rauch, on what it means to stand behind code you did not read line by line
“That is absolutely the way we build physical infrastructure in this country. It’s guilty until proven innocent. And what we should actually do is make more of these things enforcement based rather than pre-approval based.”
Blake Scholl, comparing the permitting process to filing a driving plan before every trip
“You’re basically running a small communist society inside a larger capitalist society. And that’s what we’re doing in healthcare.”
Max Hodak, on why there is no real private market in healthcare
“I expected we would get a large number of silly projects and a small number of needle movers. And what we got was a large number of needle movers and a very small number of silly projects.”
Blake Scholl, on the week he had the whole company build with AI
“If a person takes the photo versus AI generates the exact same photo down to the last pixel, the person taking the photo will have more meaning for me.”
Naval Ravikant, on why intent and attribution make something art
“It’s about people with AI versus people without AI. And so the single best thing you can be doing right now for yourself is just getting really good with these tools.”
Naval Ravikant, closing the conversation on the only divide that matters
Part one: Waste Tokens to Save Time, our writeup of the first segment, on software factories, the thousand-x engineer, token leaderboards, and whether pure software is dead.
Part two: Vibe Coding Hardware, our writeup of the second segment, on AI-designed jet engines, vertical integration, China’s open-source bet, and humans as verifiers.
Naval Ravikant’s official site, the canonical home for Naval’s essays and podcast on technology, judgment, and leverage.
Boom Supersonic, Blake Scholl’s company building supersonic aircraft and its own jet engines, source of the turbine-blade and two-engineers example.
Science Corporation, Max Hodak’s brain-computer interface company, whose captive MEMS foundry and FDA arguments anchor the hardware and healthcare segments.
Vercel, Guillermo Rauch’s company, whose AI gateway data and autonomous SRE work inform the usage and automation discussion.
This is the third installment of the freewheeling “Rabbit Hole” roundtable from Chris Williamson’s Modern Wisdom, and the cast is stacked: Tim Ferriss, writer George Mack, and the founder behind the ambient-AI app Sky (who posts as @signull). It is a sprawling, two-and-a-half-hour conversation that jumps from why Americans never adopted WhatsApp to whether Tim dreams in Japanese, then keeps tunneling into deeper ground: how language shapes thought, why forgetting is a feature, the frontier of brain stimulation, what the next computing interface looks like, and the search for meaning in a world where AI keeps removing scarcity. You can watch the full conversation on YouTube here.
TLDW
The group opens on language: the etymology of “soon,” Malay and Indonesian reduplication, the Sapir-Whorf idea that language shapes thought, and Tim Ferriss recounting how a year of total immersion in a Japanese high school at fifteen made him fluent, with a detour into why adults can learn languages faster than the myth suggests. From there they move into the mind itself, aphantasia versus hyperphantasia, eidetic memory, and the underrated advantages of forgetting, which loops into AI memory, hallucination as a form of confabulation, and the unreliability of eyewitness testimony. A long middle section, anchored by Packy McCormick’s essay “Riding the Leopard,” wrestles with meaning in a post-scarcity world, drawing on Viktor Frankl, Joseph Campbell, Nick Bostrom, and the Dawkins versus Hirsi Ali debate about whether comforting beliefs are rational if they work. Tim then walks through the most concrete material in the episode: his use of accelerated TMS, the one-day protocol, the stellate ganglion block, and why the chemical-imbalance theory of depression is largely debunked. They close on the next interface (ambient AI, camera-equipped AirPods, the post-app phone, Apple’s wait-and-win strategy), a riff on Britain versus America, and the rise of AI-assisted looks-maxing. The throughline, stated and restated, is that friction and scarcity are where meaning and value actually come from.
Thoughts
For a conversation that looks like pure chaos, one idea holds it together: friction is where meaning lives, and modern technology is a machine for removing friction. They route the point through Nick Bostrom (the traits we admire in people exist because we have to negotiate a scarce, resistant world), through dating apps and DoorDash (frictionless access cheapens the thing you get), and through chess (still meaningful precisely because there is an opponent pushing back, even though engines crush every human). It reframes the AI-and-meaning panic in a useful way. The danger is not that AI deletes meaning, it is that it makes meaning harder to reach, the same way a calorie-dense food environment does not outlaw health but quietly makes it the harder path. If that is right, the work ahead is less about stopping the technology and more about deliberately reintroducing resistance.
The most original riff is the treatment of forgetting as a feature rather than a defect, and then turning that lens on AI. Humans prune memory by salience, holding onto the vivid and the painful and letting the middle fade. Current AI memory systems do not prune, so when you stuff a model’s context full of stored “facts” you get noise and forced, spurious connections. The group notes that AI hallucination is really just machine confabulation, and that humans confabulate constantly, the Grenfell Tower “baby caught from the tower” false memory and the general unreliability of eyewitness testimony being the proof. The practical takeaway for anyone building AI products is counterintuitive and correct: the hard problem is not storage, it is principled forgetting.
Tim Ferriss’s neuromodulation segment is the most concrete and quietly radical part of the episode. The claim worth sitting with is that the chemical-imbalance theory of depression is largely debunked, and the frontier has moved to circuit-level intervention: accelerated TMS, a neuroplasticity agent like d-cycloserine taken beforehand, and a “one-day protocol” that took him from an eight or nine on anxiety and rumination down to a one, with lifelong insomnia resolved. Two honest caveats keep it credible rather than salesy. It does not always work (he is candid that several rounds failed), and the side effects are real (rebound symptoms, temporary anhedonia). The economics are a clean illustration of a pattern that recurs through the whole conversation: roughly thirty thousand dollars out of pocket today is how the unit cost eventually falls to something insurers and ordinary patients can afford, the same arc that electric cars and the first copy-and-paste-less iPhones traveled.
The meaning-and-religion exchange is where the conversation is most alive, and most revealing about where this cohort has landed. The Dawkins versus Ayaan Hirsi Ali anecdote crystallizes it: a man “optimizing for rationality while ignoring effectiveness,” pressing someone on whether the stone literally moved on the third day, when that someone’s life was demonstrably saved by the belief. Their tentative conclusion, that comforting delusions may be permissible when the measurable outcomes (health, community, longevity, a sense of meaning) are real, would have been near-heresy in the New Atheist moment of fifteen years ago and is now close to consensus among exactly these kinds of people. Whether you buy it or not, it is a sharp barometer of how far the cultural wind has shifted, and it pairs neatly with George Mack’s point that you cannot invalidate a whole framework with a single counterexample the way you can in mathematics.
Key Takeaways
Americans never adopted WhatsApp largely because the US had free SMS early, while Brits paid per text, which is also why a generation grew up compressing messages into 160 characters.
The word “soon” was the Anglo-Saxon word for “now.” Because people kept saying “soon” and not acting, the language invented “now” to replace it, and “now” is already drifting the same way (“now now” in South Africa, similar constructions in Latin America).
Malay and Indonesian use reduplication instead of plurals (table-table, orang-orang meaning men, the root of orangutan, “man of the forest”), a small example of how different languages carve up the world differently.
The Sapir-Whorf hypothesis and Wittgenstein’s line, “the limits of my language are the limits of my world,” frame a recurring theme: we assume we shape language, but language also shapes us, including, some speakers report, having a different personality in a different language.
Tim Ferriss became fluent in Japanese through total immersion as a fifteen-year-old exchange student, taking physics and world history in Japanese, helped by the fact that it was pre-smartphone so there was no English escape hatch.
Adults can often learn languages faster than children, not slower. Children seem faster mainly because they have no choice and are forced into immersion. Adults already have the conceptual scaffolding (grammar, abstraction, the subjunctive) that a three-year-old lacks.
Density of practice beats frequency. Learning a language one hour a week is like trying to learn tennis once a month. The Michel Thomas method and Nassim Taleb’s joke (“the best way to learn Russian is to go into a Russian jail”) both point at intensity and stakes.
People differ radically in how they think. Aphantasia is the inability to visualize (some people only think in words), while others cannot think in words at all and only in images. The “imagine an apple” test reveals where you sit on that spectrum.
An overdeveloped memory can be counter-evolutionary past a point. Hyperthymesia makes it hard to let go of grievances and slights, and there are real, underrated advantages to forgetting.
Forgetting is the hard, missing piece in AI memory. Systems store facts but have no pruning of salience, so loading lots of “memories” into context produces noise and spurious connections rather than wisdom.
AI hallucination is best understood as machine confabulation, and humans confabulate constantly. The Grenfell Tower “baby dropped and caught” story spread through multiple eyewitnesses and turned out to be a collective false memory once physicists questioned it.
Memory is bound to place. One participant had to move neighborhoods after a breakup because every coffee shop and corner replayed the relationship, echoing an Alain de Botton observation that a beautiful memory becomes the sharpest source of pain if the relationship ends.
Phantom phone vibrations are real and documented. Years of notifications Pavlovian-condition your body to feel buzzes that are not there, evidence of how deeply the device has wired itself into your nervous system.
You can train visual memory. Tools include “Drawing on the Right Side of the Brain,” gesture drawing with short timed poses, and learning to see specifics (the six local tree species) instead of the generic label “tree.” Attention and labels, not just raw acuity, drive perception.
The smartphone is described as a “black mirror.” There is data suggesting people with fewer mirrors at home self-report as happier, and “Zoom face” drove a surge in cosmetic surgery during the pandemic as people watched themselves on camera all day.
Packy McCormick’s essay “Riding the Leopard” anchors the meaning discussion. A reader who analyzed more than 200 sci-fi novels found that the most common unsolved problem in post-scarcity worlds is meaning (59% of books), with identity next at 17%.
Viktor Frankl’s framing recurs: “as the struggle for survival has subsided, the question has emerged, survival for what?” Ever more people have the means to live but no meaning to live for.
Nick Bostrom’s point (from his “solved world” work) is that almost everything we value in other people, discipline, prudence, good judgment, honesty, exists because we must negotiate a scarce world. Remove the scarcity and those values risk a strange “weightlessness.”
The precautionary principle cuts both ways: humans are very good at forecasting problems and very bad at forecasting the solutions that billions of people will eventually invent for those problems.
Chess is the optimistic counterexample to “AI removes all purpose.” Engines beat every human, yet people, including Magnus Carlsen, still love playing, because meaning needs resistance, not victory.
There is a real resurgence in religion, including the ascendant Latin Mass, conducted in a language the congregation does not speak. The group debates whether “comforting delusions” are actually rational if religious people are measurably happier, healthier, and longer-lived.
The Dawkins versus Ayaan Hirsi Ali exchange is held up as someone “optimizing for rationality while ignoring effectiveness,” and you cannot disprove a whole framework with a single counterexample the way you can in math.
Tim Ferriss is now far more focused on neuromodulation than psychedelics. Accelerated TMS, paired with a plasticity agent and refined into a “one-day protocol,” took him from an eight or nine on anxiety and rumination to a one, and resolved decades of insomnia.
The chemical-imbalance theory of depression and anxiety is, by his account, thoroughly debunked. You are not depressed simply because of low serotonin, which is part of why SSRIs come with off-target side effects and poor off-ramping plans.
The stellate ganglion block (SGB) acts like a hard reset of the nervous system. Tim measured a roughly 30% jump in HRV on his Whoop that held for months. It is used aggressively for PTSD in soldiers.
Psychedelics reopen critical-period plasticity windows (research associated with Gul Dolen) for two to three weeks afterward, which is powerful for relearning but also means whatever habits you instill in that window can stick hard. The brain is “Play-Doh warmed in the microwave.”
Most consumer vagus-nerve stimulators are “bunk” because they do not hit the nerve correctly (the target near the ear is the cymba concha). Kevin Tracey’s book “The Great Nerve” is cited as the credible source, and devices like gammaCore are FDA-cleared for migraine.
Hard safety warning: do not DIY brain stimulation. Hit the wrong target and you can make symptoms much worse. Use a reputable clinic.
Sequencing is everything, in TMS, in language learning, and in habit change. Most mistakes are sequencing mistakes. Pick the right domino to tip first and everything downstream gets easier.
The next interface is unsettled. Candidates include camera-equipped AirPods, a “Her”-style earpiece, a glanceable agentic home screen (the Sky app), and OpenAI’s Jony Ive collaboration. Elon Musk’s bet is that apps disappear and the phone generates whatever you need on demand.
Apple’s strategy is to never be first but to be best, letting other companies fund the R&D and split-test the market (MP3 players before iPod, smartphones before iPhone, wireless earbuds before AirPods), backed by a war chest and roughly 20 billion dollars a year from Google.
Both smartphone hardware and AI models feel like they are hitting diminishing returns in noticeable user experience, after a long stretch (iPhone 5 to 12) of obvious leaps.
If the UK were a US state it would rank first in many quality-of-life metrics (life expectancy, low homicide, healthcare coverage, paid leave) and 51st in GDP per capita. Scott Galloway’s line: America is the best place to earn money, Europe the best place to spend it.
A fast, real-world AI win: uploading photos of a years-long skin condition to Gemini, which correctly identified it as fungal and recommended ketoconazole shampoo after doctors had failed. Photo-based self-diagnosis is becoming a major consumer use case, as is AI-assisted “looks-maxing” and Facetune-style editing.
Tim’s recent long-form essay, “The Self-Help Trap: What I Learned After 20 Years of Improving Myself,” is on tim.blog, and George Mack’s book recommendations live at highagency.com/books.
Detailed Summary
Does Tim Ferriss dream in Japanese? Immersion and learning as an adult
The episode’s title question gets a real answer. Tim Ferriss says he runs on an English interface but became genuinely fluent in Japanese as a fifteen-year-old exchange student, after misunderstanding that “Japanese lessons” meant all his lessons (physics, world history) would be taught in Japanese. Total immersion plus a pre-smartphone world with no way to retreat into English did the work, and when he came home it took about a month to switch back, waking up and speaking Japanese to his mother. The group challenges the myth that children learn languages faster than adults: kids appear faster only because they are forced into immersion and have no mortgage and no job to distract them. Adults arrive with conceptual scaffolding, grammar, abstraction, the ability to grasp a counterfactual subjunctive, that a three-year-old simply does not have. The real variable is density of practice, which is why a six-week immersion can beat a year of weekly classes, and why the Michel Thomas method and Nassim Taleb’s “learn Russian in a Russian jail” both lean on intensity.
Language shapes thought: etymology and Sapir-Whorf
The opening stretch is a love letter to etymology. “Soon” was once the Anglo-Saxon word for “now,” and degraded over generations as people said it without acting, forcing the invention of “now,” which is itself now drifting. Malay and Indonesian double nouns rather than pluralize them (table-table, and orang-orang, men, giving us orangutan, “man of the forest”). These are small doors into the Sapir-Whorf hypothesis and Wittgenstein’s claim that the limits of your language are the limits of your world. The group treats the idea that language shapes us, not only the reverse, as easy to dismiss and probably true, citing friends who feel they have a different personality or can access different thoughts in Italian or Swedish.
Two ways of thinking, and the praise of forgetting
From language they move to cognition. People differ dramatically: some have aphantasia and cannot picture an apple at all, thinking only in words, while others cannot think in words and only in images, one friend reportedly visualizing a staircase to count. Tim places himself far toward hyper-visual memory, able to recall the floor plan of nearly every restaurant he has been in. But the group keeps returning to the underrated value of forgetting. An overdeveloped memory, hyperthymesia, makes it hard to release grievances and slights, which may be counter-evolutionary past a point. The athletic version is the “yips,” where you have to learn to process a mistake on film and then discard it rather than ruminate.
When memory becomes a feature: AI, hallucination, and false memory
The forgetting thread maps directly onto AI. The founder building the Sky app notes that it is now trivial to have AI extract and store a fact, but there is no pruning of salience, no built-in sense that something is no longer relevant, so passing many stored memories into context produces noise and forced connections. AI hallucination, the group argues, is just machine confabulation, and humans confabulate all the time. The vivid example is the Grenfell Tower fire, where multiple eyewitnesses “remembered” a baby being dropped from the tower and caught, a story that fell apart once physicists ran the numbers, an illustration that eyewitness testimony and human memory are themselves hallucinated reconstructions.
Attention, phones, and the black mirror
Phones get treated as both nervous-system extension and liability. Phantom vibrations are real and documented, a Pavlovian artifact of years of haptic notifications. The smartphone is a “black mirror,” and the group cites data suggesting fewer mirrors at home correlate with higher self-reported happiness, plus the pandemic “Zoom face” surge in cosmetic surgery. Tim describes running no social media, no vibrate, and no ringer on his phone with no felt loss of being informed, and a wider complaint that screens are now so ambient (five screens on a treadmill, a video wall, subtitles everywhere) that going screen-free requires active effort.
Riding the leopard: meaning in a post-scarcity world
Tim reads from Packy McCormick’s essay “Riding the Leopard,” which opens with a parade of AI funding announcements and the deflating question, “who gives a damn, why do we care?” before pivoting to a reader, in remission from stage-four cancer, who analyzed more than 200 sci-fi novels and found that the dominant unsolved problem in post-scarcity worlds is meaning. The piece quotes Viktor Frankl on survival giving way to “survival for what,” and takes its title from Joseph Campbell’s image of Dionysus riding the leopard without being torn apart, living with composure atop overwhelming energy. The group widens it with Nick Bostrom’s argument that the human traits we prize exist only because we negotiate a scarce world, so removing scarcity creates a values “weightlessness,” and David Deutsch’s counter that problems are infinite and soluble.
Friction, resistance, and the cocktail-party question
The most coherent conclusion is that meaning requires friction. Chess stays meaningful despite unbeatable engines because there is still resistance. Capitalism’s genius and its cost is removing friction, dating apps turning people into a swipeable catalog, DoorDash delivering a bathing suit in thirty minutes, and that frictionlessness tends to cheapen the thing delivered. The “what do you do?” cocktail-party question gets dissected as a very Western tic that ties identity to craft and productivity. Winston Churchill becomes the case study: a man who nearly died countless times, believed he was preserved for a purpose, fought his “black dog” depression, and laid 200 bricks a day just to stay occupied.
Religion, rationality, and comforting delusions
The meaning question leads into the religion revival, including the surging Latin Mass conducted in a language nobody in the pews speaks. They revisit the Jordan Peterson and Sam Harris debates about whether a secular population can build a durable moral code from first principles, and the Dawkins versus Ayaan Hirsi Ali exchange, where Dawkins challenged the literal resurrection while Hirsi Ali described religion saving her from a suicidal low. The verdict offered is that Dawkins was “optimizing for rationality while ignoring effectiveness,” and that if comforting beliefs reliably produce better health, community, and meaning, calling them irrational starts to look like the irrational move. George Mack adds the logical point that you cannot void an entire framework with a single counterexample the way you can in mathematics.
Rewiring the brain: TMS, the one-day protocol, and neuromodulation
Tim delivers the episode’s most concrete material. He describes years of generalized anxiety, OCD, and rumination he now traces partly to Lyme disease and chronic neuroinflammation, and his use of accelerated TMS (intermittent theta-burst stimulation) targeting specific circuits identified via fMRI. Paired with a neuroplasticity agent, the antibiotic d-cycloserine, dissolved in the mouth beforehand, the treatment evolved into a “one-day protocol” that took him from an eight or nine to a one and ended decades of insomnia. He is careful to caveat: he is not a doctor, it has not worked every time (five or six attempts), and side effects include rebound symptoms, occasional insomnia, and temporary anhedonia. The broader claim is that the chemical-imbalance theory of depression is largely debunked, and that real innovation here, as with electric cars and early iPhones, starts with wealthy early adopters overpaying (around 30 thousand dollars out of pocket) until cost and throughput improve. He names Jonathan Downar as a leading researcher and is involved with a device company, Ampa, built around the one-day protocol.
Psychedelics, plasticity windows, and the stellate ganglion block
Adjacent to TMS, Tim explains that psychedelics (and MDMA) appear to reopen critical-period plasticity for two to three weeks afterward, work associated with researcher Gul Dolen, which is promising for stroke recovery or relearning but dangerous if you instill bad habits while the brain is malleable. He recounts a two-sided stellate ganglion block (SGB) with Matt Cook, essentially a hard reset of the nervous system that produced a roughly 30% increase in HRV on his Whoop that held for months, and is used aggressively for PTSD in soldiers. After years funding psychedelic science, he says he has done almost none in the last three years because neuromodulation has been that compelling, while warning that psychedelics are “nuclear power for the psyche,” not suitable for everyone.
The vagus nerve, real and fake
On vagus-nerve stimulation, Tim’s verdict is that most consumer devices are bunk because they do not hit the nerve in the right place (the ear target is the cymba concha, and many heavily funded products miss it). He points to Kevin Tracey, author of “The Great Nerve,” as the credible scientist, explains the “inflammatory reflex” and its relevance to rheumatoid arthritis and autoimmune conditions, and notes that gammaCore (the prescription version of Truvaga) is FDA-cleared for migraine, with SetPoint Medical’s implant another route. A migraine-with-aura sufferer in the group provides the real-world test case.
The next interface and Apple’s wait-and-win game
The future-of-computing thread argues the real AI device has not been invented yet. Candidates include camera-equipped AirPods, a glanceable agentic home screen (the Sky app’s pitch is surfacing what you need so you doom-scroll less), a “Her”-style always-on earpiece, subvocalization sensors that read intended speech, and OpenAI’s secretive hardware with Jony Ive. Elon Musk’s bet is that apps vanish and the phone simply generates what you need on demand, which is plausible now that people use ChatGPT or Claude for tasks that used to need dedicated apps. Apple’s counter-move is its classic one: never first, always best, letting rivals fund the R&D (MP3 players, smartphones, wireless earbuds all predate Apple’s versions), backed by a war chest and roughly 20 billion dollars a year from Google. Both phone hardware and AI models, the group feels, are now delivering diminishing perceptible gains.
Britain, America, and the image economy
The closing tangents include George Mack’s viral chart showing that if the UK were a US state it would rank first in many quality-of-life measures and 51st in GDP per capita, with Scott Galloway’s summary that America is the best place to earn money and Europe the best place to spend it. They land on AI as an everyday tool: uploading photos of a stubborn skin condition to Gemini, which diagnosed it as fungal and recommended ketoconazole shampoo where doctors had failed, and the booming use of AI for “looks-maxing,” facial analysis, and Facetune-style editing, with writer Freya India’s reporting that young women now compete to be the one holding the phone so they control the edit. Tim signs off pointing to his “Self-Help Trap” essay on tim.blog, George to highagency.com/books, and the Sky founder to the app’s growing wait list.
Notable Quotes
“The reason that people mistakenly believe that kids learn faster is because the kids have no choice. The kids have no mortgage. The kids have no job.”
On why adults can actually learn languages faster than children
“It’s the Wittgenstein quote of, the limits of my world are the limits of my language. And we think that we shape language, but language shapes us.”
George Mack, introducing the Sapir-Whorf thread
“There are some tremendous advantages to forgetting.”
Tim Ferriss, on why an overdeveloped memory can be counter-evolutionary
“As the struggle for survival has subsided, the question has emerged, survival for what? Ever more people today have the means to live but no meaning to live for.”
Viktor Frankl, quoted by Tim Ferriss reading from Packy McCormick’s essay “Riding the Leopard”
“Everything that we value in other humans can be refined down to the fact that you need to negotiate with a world that is scarce.”
Summarizing Nick Bostrom’s argument about values in a solved world
“What you see is a guy who is playing a game of optimizing for rationality whilst ignoring effectiveness.”
On Richard Dawkins challenging Ayaan Hirsi Ali’s faith despite the outcomes it produced
“There’s very few things that I can think of that are meaningful that are also totally frictionless or just there is no challenge in it.”
On why meaning depends on resistance, from the chess and dating-app discussion
“The general chemical imbalance theory of depression or anxiety is pretty much thoroughly debunked at this point. You’re not depressed because you have low serotonin levels by and large.”
Tim Ferriss, on the shift from serotonin models to circuit-level neuromodulation
“A lot of innovation starts with people with money spending way too much money. That’s true with electric cars, it’s true with Uber, it’s true with the early generation iPhones.”
Tim Ferriss, on how expensive early treatments like accelerated TMS eventually scale
These are short, curated pulls from a long conversation, not a transcript. For the full context, including the brain-stimulation walkthrough and the meaning debate, watch the full episode on YouTube here.
Related Reading
Tim Ferriss (tim.blog) primary source for his newsletter, his “Self-Help Trap” essay, and his writing on neuromodulation and TMS.
Benedict Evans, the former Andreessen Horowitz partner and independent analyst behind the annual “AI Eating the World” presentation, sat down with Lenny’s Podcast for what the host calls the most rational take on AI you will hear this year. Instead of either doom or hype, Evans argues that AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile, which means we are living through something closer to 1997 than to the singularity. The conversation moves through the jobs question, the difference between a task and a job, whether the model labs have any pricing power, the anti-AI backlash, and what people should actually do. You can watch the full conversation on YouTube here.
TLDW
Evans frames AI as a platform shift on the scale of the internet or mobile, with the crucial twist that almost nothing has been built yet, so we are in the 1997 moment where confident predictions about winners are usually wrong. He introduces his central tool, the distinction between the task and the job, to explain why “X percent of this profession is exposed to AI” studies are misleading, why the AI labs are paradoxically hiring forward deployed engineers and buying consultancies, and why accountants kept multiplying through every wave of automation (the lump of labour fallacy and Jevons paradox at work). On value capture he makes a deterministic bet that foundation models have no network effects, behave like a commodity, and will look more like cloud than like Windows, with the value moving up the stack to applications, much as it did in telecom, where a trillion-dollar industry grew data traffic thousands of times over while its stocks went nowhere. He covers distribution as the real moat, Apple Intelligence as the most compelling unshipped vision, the fuzzy anti-AI backlash (including the largely fake water panic and the very real harms of deepfakes), raising kids under radical uncertainty, and closes with the disarming admission that his own synthesis-heavy job is exactly the kind AI is currently worst at. His advice: presume radical uncertainty, dive in rather than sneer, and assume it will probably be okay.
Thoughts
The most useful thing in this conversation is a single question Evans keeps returning to: what is the task, and what is the job? A spreadsheet automated the arithmetic an accountant does, and the number of accountants went up for the next forty years. Claude Code can write the code, but deciding what to build, for whom, and why is the part nobody has automated. The reason the “this profession is X percent exposed to AI” studies feel hollow is that they assume a job is a neat stack of separable tasks. Evans argues, by analogy to the old expert-systems failure, that you simply cannot decompose a senior lawyer’s work that way. The 75-slide deck is the task. Walking your company, reading its politics, talking to your customers, and telling you the uncomfortable truth is the job, and that is what you actually paid McKinsey for.
The boldest and most falsifiable claim is that the foundation-model companies look more like cloud than like Windows. No network effects means no winner-take-all, which means durable competition, which means commodity pricing and compressed margins, with the real value accruing up the stack in applications that nobody at the labs is going to build. His telecom analogy is the one to sit with. A trillion-dollar industry grew mobile data traffic by 1,500 to 2,000 times in fifteen years, and the stocks went nowhere for a quarter century, because it was a low-margin utility while all the interesting value moved to Apple and the people building apps on top. If he is right, the current token-burn economics, the person reportedly spending 1.5 million dollars a month on tokens, are the 2010 equivalent of a 50,000 dollar roaming bill, not the steady state. Evans flags openly that he could be completely wrong, which is the intellectually honest part and the part most forecasters skip.
“It depends” and “it will probably be okay” sound like evasions, and Evans leans into that. But the 1997 framing is doing real work. The point is not that AI is small, it is that the things that will end up mattering have not been built, and that anyone confidently naming the winners today is repeating the 1997 mistake of betting on Excite over a search company with a weird logo. The discipline he is selling is to presume radical uncertainty and act anyway, because the alternative, declaring the whole thing slop and shouting about it online, buys a great feeling of moral superiority and nothing else. His repeated insistence that you can see the job that goes away but never the new job, because it does not exist yet, is the load-bearing idea under his optimism.
The most disarming moment is the closing AI-corner answer, where the person whose entire brand is explaining AI admits he struggles to use it. His work is synthesis and precise information retrieval, and precise retrieval happens to be exactly what today’s models are worst at. He is, in his own words, the lawyer looking at VisiCalc: it is obviously transformative, and he just does not happen to make spreadsheets all day. That admission is worth more than any benchmark, because it locates the real variable. How much AI changes your life depends less on how good the model gets and more on whether your daily work sits on the part of the jagged frontier where it already works. That is a far more practical lens than arguing about whether AGI arrives in three years or thirty.
Key Takeaways
Evans’s headline opinion is that AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile. Both halves of that sentence matter.
If you make the internet comparison honestly, we are roughly in 1997: very exciting, most of it does not work yet, most of what people will build has not been built, and it is unclear how any of it will end up working.
Adoption is spread across a very wide distribution. Even among teenagers, only something like 15 to 20 percent are daily active users and another 20 percent weekly, with the majority saying they do not use it at all.
That spread maps onto the “jagged frontier” question of where AI works, where it does not, whether you can predict where it will work in advance, and whether you can even tell after the fact.
Software developers are the accountants seeing VisiCalc: for them everything has already changed. Most other professions are watching, intrigued but unsure what to do with it.
The AI labs are investing heavily in forward deployed engineers, consultancies, and professional services. Evans jokes that a forward deployed engineer is an Accenture outsourced developer who lives in San Francisco.
Companies do not have spare people sitting around to reimagine every internal workflow, so reinventing a business around AI is itself a project that needs consultants, which is why the most cutting-edge labs are funding exactly the firms everyone assumed AI would kill.
The central framework: separate the task from the job. Sometimes the task is the job (the elevator operator pressing a lever), and automating the task ends the job. Far more often, the task is only part of the job.
Amazon gets you the SKU once you know which SKU you want. Knowing which one to buy is a different job. Claude Code writes the code, but knowing what code and what features to build is the job.
A McKinsey or Bain engagement is not really about the deck. The deck is the task. The job is walking your enterprise, understanding the politics, talking to your customers, and telling you the truth.
The Jevons paradox is just price elasticity applied to labour. Make something cheaper to produce and you usually do far more of it, not the same amount with fewer people.
Excel did not give investment bankers shorter hours. iPhone SDKs did not shrink the number of engineers even though Apple writes 90 percent of the code for you. The number of accountants rose through every wave of automation.
The lump of labour fallacy: since 1800, each technology automates jobs and unlocks new ones. You can always see the job that disappears and never the new job, because it does not exist yet.
Evans is wary of argument from authority on jobs. He wants Dario Amodei’s view on where models go in the next 6 to 12 months, not necessarily his theory of labour markets and comparative advantage.
The doomer scenario of every company buying ChatGPT and firing everyone in two weeks misunderstands how enterprises work. Enterprise sales cycles run 18 months or more. Nobody is ripping out SAP overnight. The full transformation takes 3 to 10 years, sector by sector.
AGI and superintelligence are being quietly redefined to mean whatever works now. Larry Tesler’s theorem: AI is whatever machines cannot do yet, because once they can, people call it just software.
We have no theory of human intelligence, no theory of why these models work, and no theory of how much better they will get, so everyone is vibes-forecasting. Even if progress stopped tomorrow, what exists is already transformative and will roll out for a decade.
On value capture, Evans argues models show no network effects, so no single one runs away with the market. Persistent competition plus little real product differentiation means little pricing power.
Sam Altman’s pitch of selling intelligence on a meter like electricity ignores the brutal margin structure of utilities. Your TV maker does not pay the power company a cut of your bill.
The telecom analogy: a roughly trillion-dollar mobile industry spends 15 to 20 percent of revenue on capex, grew data consumption 1,500 to 2,000 times since 2010, and its stocks went nowhere for 25 years because it is a low-margin commodity utility.
The elemental question: does the model do the whole thing, or does it need thousands of different apps built by different people? If it needs apps, the labs cannot build them all, just as Microsoft did not, so it looks more like AWS than like Windows.
If the product is a commodity, distribution becomes the moat. Google pushes Gemini through its surfaces, Meta sprayed AI across its apps and quietly ranked between ChatGPT and Gemini in usage, and incumbents with distribution have a structural edge.
Browsers are the warning: Microsoft used distribution to win the browser war, then it turned out winning browsers did not matter because the value was further up the stack.
Apple Intelligence, as shown at WWDC 2024, was the most compelling vision of a personal AI assistant Evans has seen. Apple could not ship it, but neither could anyone else, because tool-using on-device agents with no hallucinations across thousands of apps is genuinely hard.
The model is “the dumb thing underneath” that powers a feature. The same commodity model can sit beneath both Gemini on Android and Apple Intelligence on iOS while the products and distribution differ entirely.
The anti-AI backlash is a big fuzzy mess. Some is real (local electricity bills, deepfakes, real job anxiety), some is sort of true, and some is simply false.
The data-center water panic is largely fake. A Livermore lab study put US data-center water consumption at about 0.017 percent of US water use. Local well conflicts are planning problems, not data-center problems.
We have shockingly little hard data. The model labs do not publish meaningful usage numbers. There is no public daily active user figure for ChatGPT, so economists are reverse-engineering effects from government surveys.
Real new harms do appear with each wave. A teenager could not use Photoshop to make explicit fakes of every classmate and send them to the whole school in an afternoon. Now they can, and turn them into video.
The UK Post Office Horizon scandal (buggy Fujitsu software wrongly showing cash shortfalls, leading to prosecutions, bankruptcies, and suicides) is a reminder that every technology brings new ways to ruin lives, by malice or by accident.
You cannot reliably predict what gets exposed. In 1997 people thought taxis were safe from the internet and newspapers would be fine. The opposite happened. Today, “AI-proof” jobs like personal trainer may not be as safe as they look.
Uber and Airbnb show that similar-sounding companies can have very different market impact. Uber demolished and then grew the taxi market, while Airbnb’s effect on hotels was fairly marginal because business travel still wants a hotel.
Every new technology first lets you do the old thing but more, then unlocks things that were not possible before. Recorded music revenue is U-shaped: first “what if I do not pay 15 dollars for a CD,” then “what if 15 dollars a month gives me all the music there is.” Spotify is not an online music store, it is something else.
Coding was supposed to be one of the last things automated, and instead it is the most transformed role of all, which is itself a lesson in how badly we predict exposure.
Practical advice: do not stick your head in the sand. Dive in, submerge yourself, and come out understanding what you can do with it. Going into a shrinking job market announcing you will never use AI is not the right posture.
Evans’s honest coda: he struggles to find AI use cases because his job is synthesis and precise retrieval, the things models are worst at. He uses it for proofreading, images, redecorating his apartment, and dictation. He is the lawyer looking at VisiCalc.
Detailed Summary
AI is as big as the internet, and we are living in 1997
Evans opens with the opinion he calls his most controversial: AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile. To some in tech that sounds dismissive, as if he is underrating a once-in-history event. His reply is that smartphones and the internet were themselves enormous, and we are talking over the internet right now. The deeper point is the comparison’s timing. If this is like the internet, then it is like the internet in 1997: thrilling, but most of it does not work yet, most of what will be built has not been built, and nobody knows how the pieces will fit. His latest 80-slide presentation, he jokes, is essentially 80 ways of saying “we do not know,” which is partly facetious and partly the entire point.
The jagged frontier and the wide spread of adoption
Adoption is not uniform, it is a wide distribution. Some people in tech have bought clusters of Mac minis and stopped using Google, while most people outside tech who use AI at all touch it once every week or two. Even among 13 to 18 year olds, daily active use sits around 15 to 20 percent, weekly use adds another 20 percent, and roughly 60 percent say they do not use it. That spread maps onto what Evans calls the jagged frontier: whether a given task works, whether you can predict in advance that it will work, whether it is intuitive, and whether you can even tell after the fact. Software developers are the accountants who just saw VisiCalc, living in a clear before-and-after. Everyone else is somewhere on the curve, picking it up to varying degrees and a little puzzled about what it is for.
Why the AI labs are buying consultancies
One of the most counterintuitive trends is that the leading labs are pouring money into forward deployed engineers and professional services, the very category many assumed AI would erase. Evans’s explanation is grounded in how companies actually operate. Firms do not keep spare people sitting around to redesign stores, hunt down churn, or rebuild a tech stack, which is exactly why they hire Bain, BCG, McKinsey, Accenture, or Infosys when a big project appears. Reimagining every internal workflow around AI, then actually plugging vertical and horizontal systems together and retraining people, is itself a multi-month project requiring people you do not have. So the work gets outsourced, and the most advanced labs are funding the firms that do it. His joke lands the point: a forward deployed engineer is a statistician, or an Accenture developer, who happens to work in San Francisco.
The task versus the job
This is the spine of the conversation. Ask what the hard part of a job really is. Sometimes the task is the job: the elevator attendant’s whole job was driving the car, the task got automated, the job ended. Much more often the visible task is only a slice. Amazon gets you the SKU once you know which SKU you want, but knowing what to buy is a separate job. Claude Code writes the code, but deciding what to build, for whom, and how to take it to market is the job. A consulting deck is the task, while the reason you pay Bain is for them to walk your company, understand its politics, talk to your customers, and tell you the truth. Evans notes you can already generate a bad McKinsey deck with AI, and the LinkedIn grifters who do are missing that the deck was never the thing you were buying.
Jevons paradox and the lump of labour fallacy
The Jevons paradox is just price elasticity applied to labour: make something cheaper to do and you usually do much more of it. Excel did not hand junior bankers their Friday afternoons off, it expanded the work. iPhone developers write a fraction of the raw code because Apple wrote the drivers and file system, and there are not a tenth as many engineers, there are far more. The count of accountants climbed through adding machines, punch cards, mainframes, databases, ERP, spreadsheets, and cloud. The lump of labour fallacy is the broader version: since 1800 every technology has removed jobs and unlocked new ones, the removed jobs usually look bad in hindsight, the new ones tend to be better, and GDP keeps rising. You can always see the job that disappears and never the one that does not exist yet.
The jobs question, Dario, and the enterprise sales cycle
On the coming jobs apocalypse, Evans is cautious about argument from authority. Running an AI lab makes Dario Amodei worth listening to on where models go in the next 6 to 12 months, not necessarily on labour economics and comparative advantage. The doomer image of companies buying ChatGPT and firing everyone within weeks misreads reality: enterprise sales cycles run 18 months or longer, nobody is tearing out SAP overnight, and the full transformation will take 3 to 10 years, sector by sector, as people slowly work out what to do. He points to the lag in software itself. Many SaaS companies founded the day before ChatGPT launched could have been built a decade earlier, and were not, because the delay was someone realizing a problem existed and that this was the way to solve it.
Redefining AGI and superintelligence
Evans is skeptical of the moving terminology. He cites Larry Tesler’s line that AI is whatever machines cannot do yet, because the moment they can, people call it just software. Machine learning, image recognition, and sentiment analysis all got reclassified as not really AI once they worked, the same way jet airliners were once high technology and are now just planes. AGI is now often quietly redefined as doing some percentage of economically valuable work, which a 1975 mainframe also did, rather than anything about consciousness or a soul. Whether we reach human-level intelligence is, in his view, genuinely unknowable right now. The reassuring point is that you do not need to resolve it. Even if models hit a brick wall tomorrow, what already exists is transformative and will take a decade to deploy.
Where the value accrues: commodity models and the telecom analogy
Here Evans makes his most deterministic argument. Foundation models appear to lack network effects, so no single model runs away from the pack, competition persists, and product differentiation as users experience it is thin. Without differentiation or lock-in, where does pricing power come from? He skewers Sam Altman’s image of selling intelligence on a meter like electricity by pointing out that utilities have terrible margins and nobody pays the power company a cut of their TV. His telecom career supplies the analogy: mobile is a roughly trillion-dollar industry that spends 15 to 20 percent of revenue on capex, grew data traffic 1,500 to 2,000 times since 2010, and whose stocks went nowhere for 25 years because it is a low-margin commodity utility while the value sits up the stack with Apple and the app makers. If models are commodities and the real product is thousands of apps the labs will not build, the outcome looks like cloud, not like Windows.
Distribution as the moat
If the product is a commodity, distribution decides the winners. The web browser is the cautionary tale: the browser product is a thin wrapper around a rendering engine, tab browsing was the last real innovation 20-plus years ago, Microsoft used distribution to win, and then winning browsers turned out not to matter because the value was elsewhere. Now Google drives Gemini through its surfaces and Meta sprayed AI across its apps and, in survey data, sat between ChatGPT and Gemini in usage despite tech writing it off. An adequate product with great distribution and brand becomes a big deal, which is why OpenAI spent last year trying everything to build a flywheel before the giants defaulted everyone onto their own offering. The power of the default and sheer inertia do a lot of work.
Apple Intelligence and the model as the dumb thing underneath
Evans calls the Apple Intelligence segment of WWDC 2024 the most compelling vision of a personal AI assistant he has seen: tool-using, on-device, agentic, with no prompt injection or hallucinations across a standardized API spanning thousands of apps. Apple could not ship it, but neither could anyone else, because that is genuinely hard. The episode illustrates his framing that the model is “the dumb thing underneath” that powers a feature. The same commodity model can sit beneath Gemini intelligence on Android and Apple Intelligence on iOS, with different products, different distribution, and different decisions about what the feature should be. Apple has a billion edge-capable devices, while Google’s “coming soon to our most powerful devices” really means it will not work on most Android phones.
The anti-AI backlash, water, and real harms
The backlash, Evans says, is a big fuzzy mess of very different things. Some is tangible, like a higher local electricity bill in a small number of places. Some is essentially fake, like the water panic. He dug into a Livermore lab study putting US data-center water use at about 0.017 percent of national consumption. Local well conflicts are planning failures, not data-center failures. The jobs piece is genuinely unresolved, with charts pointing both ways and a youth employment slowdown that shows up regardless of degree or AI exposure. He stresses how little hard data exists, since the labs publish no meaningful usage numbers and there is no public daily active user figure for ChatGPT. He compares the moment to the social media backlash, compressed, where some fears were true, some half true, and some simply false. The real new harms are real, though: deepfakes let a teenager generate explicit fakes of an entire school in an afternoon, and the UK Post Office Horizon scandal shows how buggy software plus institutional denial can destroy lives.
You cannot predict what gets exposed, and what to actually do
Evans dismisses the O*NET-style exercise of scoring what percentage of each profession AI can do as deluded, the modern version of the expert-systems problem, where you try to describe a job as 700 logical steps and it never works. You cannot say a senior partner’s work is 17 percent automatable. The history of prediction is humbling: in 1997 people thought taxis were safe from the internet and newspapers would simply save on printing, and both were wrong. Coding, supposedly one of the last things to automate, became the most transformed role of all. Personal trainers might be next once your phone can watch your form. His closing advice is to presume radical uncertainty and act anyway: do not retreat into sneering moral superiority, dive in, internalize what the tools can do, and make yourself a great hire. He ends with a candid admission that his own synthesis-and-retrieval job is exactly what AI is currently worst at, so he is the lawyer looking at VisiCalc, sure it changes everything while not personally making spreadsheets all day.
Notable Quotes
“My most controversial opinion is that I think that AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile.”
Benedict Evans, stating the thesis that frames the whole conversation
“If you’re going to make the internet comparison, it’s like we’re in 1997. It’s very exciting. Most stuff kind of doesn’t work yet. Most of the stuff that people are going to do hasn’t been built yet.”
Benedict Evans, on why confident predictions about AI winners are usually wrong
“You can’t look at a senior partner at a law firm and say, well, 17 percent of their work could be automated. This is horseshit.”
Benedict Evans, on why O*NET-style job-exposure scoring fails
“Claude Code can write you the code, but what code do you want? It can make you the features, sure, but what features do you want? Who’s your customer? What’s the right product for that customer?”
Benedict Evans, drawing the line between the task and the job
“There’s this quote from Sam Altman where he said we’re going to be selling AI intelligence on a meter like water or electricity, and you look at this and think, my dear sweet child, you need me to explain the margin structure of the utility industry to you.”
Benedict Evans, on why model labs may lack pricing power
“The model is just the dumb thing underneath that powers the feature. The model is the commodity that powers different decisions about what the feature should be.”
Benedict Evans, on why value moves up the stack to applications
“Every time we have a new technology it automates away a bunch of jobs, and then that automation unlocks a bunch of new jobs, and you don’t know the new job because it doesn’t exist yet.”
Benedict Evans, on the lump of labour fallacy and 200 years of automation
“Don’t stick your head in the sand and say I hate all of this stuff. That gives you a great feeling of moral superiority, but that’s not going to help. What helps is you diving into this and coming out understanding what you can do with it.”
Benedict Evans, on what to actually do about AI right now
“AI is good at stuff that computers are bad at, and bad at stuff that computers are good at.”
Benedict Evans, quoting an observation that explains why he struggles to use AI in his own work
This is a curated set of pulls, not a transcript. To hear the full argument in context, including the telecom and recorded-music charts and the lightning round, watch the full conversation on YouTube here.
Related Reading
Benedict Evans (ben-evans.com) the primary source for his weekly newsletter and the “AI Eating the World” presentations referenced throughout.
Jevons paradox (Wikipedia) the price-elasticity idea that anchors his argument about why cheaper output tends to expand work rather than shrink it.
This is part two of Naval Ravikant’s conversation with frontier founders Guillermo Rauch of Vercel, Blake Scholl of Boom Supersonic, and Max Hodak of Science. Where the first part argued that you should waste tokens to save time and that the job of an engineer is now to build the factory rather than the output, this segment drags that thesis out of pure software and into atoms. The question on the table is what happens to hardware when models can vibe code the spreadsheets, the simulations, and eventually the step files and PCB layouts that aerospace, semiconductors, and biotech are built on. This segment is one half of the discussion, and you can watch and read the full episode here. The full conversation is on the Naval Podcast YouTube channel.
TLDW
Blake Scholl describes how Boom Supersonic took hardware engineering workflows that used to live in siloed Excel spreadsheets and VBScript on individual laptops, with handoffs done by email like it was the 1990s, and turned them into versioned, testable software. The new model is that software engineers build the architectures and the tools while hardware engineers vibe code their own domain-specific pieces, which collapsed a turbine-blade analysis that once took one engineer one day per blade into something where two engineers can design an entire jet engine in real time. Naval generalizes this into the cataclysm of enterprise software: there is no longer a startup that can sell you hardware collaboration tools because companies just code the exact thing they need on demand, and even spreadsheets are cooked because they only existed as a proxy for custom software nobody could previously afford to build. Blake predicts that within 2026 AI will move from generating software to generating step files and PCB layouts, which reshapes mechanical and electrical engineering. The group debates China’s open-source push as a way to neutralize Silicon Valley’s software advantage and protect its hardware and supply-chain superiority, lands on the point that if you fall behind on generating software you fall behind on generating everything, and Guillermo notes that frontier coding intelligence still dominates real usage while cheaper models like Gemini win at scale for support and browser automation. Max Hodak explains Science’s vertical integration, including a captive MEMS foundry on the East Coast, because the most innovative hardware cannot be bought off the shelf, and argues that software still needs hands since a model that cannot make physical things hits real boundaries. The conversation closes on the shift from writing to verifying: junior engineering got absorbed by agents while juniors got promoted, the same way paralegals could be seen as fired or promoted, and humans across law, engineering, and operations are becoming the verifiers who sign off on systems they did not write line by line.
Thoughts
The most important shift in this segment is that vibe coding stops being a software-industry story and becomes a deep-tech story. In part one the examples were Postgres, ClickHouse, and deploy targets. Here Blake Scholl is talking about turbine blades that change shape when they heat up, and the brutal fact that converting between cold and hot geometry, and between aerodynamics and structures, used to eat one engineer for one full day per blade in an engine that has a thousand blades. That is the kind of math that quietly kills ambition. When he says two engineers can now design an entire jet engine because the structural and aerodynamic results update in real time as you change the geometry, that is not a productivity improvement, it is a change in what a small team is allowed to attempt. The interesting move is the division of labor: software engineers build the architecture and the framework because they understand systems and separation of concerns, and the hardware engineers vibe code the pieces only they understand. Nobody has to become both.
Naval’s “cataclysm of enterprise software” is the most investable idea in the episode, and it is darker than it sounds for anyone selling B2B tools. His claim is that the entire category of internal collaboration software is being eaten from the inside, because a company that can generate exactly the tool it needs on any given day will not pay a vendor for an approximation of that tool. His follow-on that even spreadsheets are cooked is the sharpest version of the point. The spreadsheet won for forty years precisely because it was the closest thing to custom software that a non-programmer could produce. Remove the constraint that custom software is expensive and the spreadsheet loses its reason to exist. The counterweight, which the group raised in part one with the block-economy thesis, is that the infrastructure primitives agents reach for get more valuable, not less. So the safe place to build is not the collaboration layer on top, it is the primitive underneath.
The China discussion is the geopolitical center of the conversation and it lands on a genuinely uncomfortable insight. The argument is that China leans into open-source models not only because it is a model or two behind, but because open weights neutralize Silicon Valley’s software advantage and let China lean on what it already dominates: hardware, supply chains, and component ecosystems. If software can be generated on demand from open models, then the country with the factories wins the stack. The sharpest line is that if you fall behind on the ability to generate software, you fall behind on the ability to generate everything, because software is now upstream of every hardware pipeline. That reframes the open-versus-closed debate as a question about who controls the means of producing the means of production. It also quietly flatters the American frontier labs, since the same logic says self-improvement requires frontier coding models, and on that narrow axis the consensus at the table is that the Chinese models are not yet in the race.
Max Hodak provides the necessary cold water, and it is the most grounding contribution in the episode. Everyone else is describing software eating the design layer, and Max points out that you still have to make the thing. Science owns a captive MEMS foundry on the East Coast not as a flex but because there was no other way to do the packaging and assembly for products that approach a single block of covalently bonded matter. His framing that the software still needs hands is the real boundary condition on all the AI-eats-everything talk: a model can be smarter than every engineer in the building and still be unable to deposit a layer, bond a wafer, or pass a regulatory inspection. The optimistic version, which he also makes, is that he has instrumented the foundry so that as models improve, the gains show up immediately in cell engineering and material science. The pessimistic reading is that the physical world remains a hard rate limiter, and the companies that own the atoms will capture more of the surplus than the companies that only own the bits.
The closing thread on verification is where the whole conversation resolves into a job description for humans. Guillermo’s point that the biggest problem in software is mountains of slop arriving as a pull request, and that the answer is not pretending to read every line but being able to say “I am signing off on the consequences of this PR, and I wrote the harness, the simulations, the proofs, and the type checkers that let me,” is the most practically useful idea in the episode. It generalizes cleanly. The lawyer you trust is not the one who wrote every clause by hand, it is the one putting their reputation on the line that the document is sound. The production engineer who gets paged at 3am is the one signing off that the system is safe to ship. As models absorb the junior tier of every knowledge profession, the surviving human role is the verifier who carries the accountability. That is a promotion for the people who can hold it and an extinction event for the people whose value was doing the work nobody now needs done by hand.
Key Takeaways
The factory framing from part one carries straight into hardware: you are judged on whether you build the system that produces multiplicative outputs, not on the single artifact, and the real multiplier was always 100x or 1000x, not 10x.
AI completely changes the role of software and hardware developers rather than just speeding either one up.
A huge amount of hardware engineering lives in complex Excel spreadsheets and VBScript on individual engineers’ laptops, with no source control, no automated testing, and handoffs done manually over email. It is software that is not treated as software.
Boom Supersonic’s move from day one was to turn traditional hardware engineering workflows into real software frameworks that are automatable and repeatable, to drive down the cost of iteration.
The old bottleneck was never being able to afford enough software engineers to build those frameworks. AI removes that constraint.
The new model: software engineers create the architectures because they understand systems, algorithms, and separation of concerns, and hardware engineers vibe code the domain pieces only they understand.
A turbine blade is cold when it starts and hot when it runs, so it changes shape, and you must design both the cold and hot geometry across aerodynamics and structures. Classically that was one engineer, one day, for one blade, in an engine with a thousand blades.
With software and hardware people combined, you can now change blade geometry and see the structural and aerodynamic results in real time, which lets two engineers design an entire jet engine.
Naval’s cataclysm of enterprise software: no startup can sell hardware collaboration tools anymore because companies just code the exact thing they need at any given time.
Even spreadsheets are cooked. Spreadsheets won only because nobody could build custom software, so a spreadsheet full of VBScript was the closest available approximation. Remove the cost barrier and the approximation loses.
Engineers are moving from Excel to Python models that produce believable simulations of physical systems.
AI can generate software today, but within 2026 it is expected to generate step files and PCB layouts, which opens up mechanical and electrical engineering as the next frontier.
The hardware software boon is biggest for small gadget and parts companies that historically shipped bad software because they could not afford good software. Now they can ship good-enough software, or skip the human front end entirely and expose hardware agentically for voice and agent control.
China goes all in on open-source models partly to neutralize Silicon Valley’s software edge: if software can be generated on demand from open weights, China’s hardware and supply-chain superiority stops being offset by a software disadvantage.
Other reasons cited for China’s open-source push: it is a model or two behind, it is distilling models, and the government has a history of funding efforts that lift the whole ecosystem, especially in network-effect businesses.
Open-source heft is coming almost entirely from China. OpenAI is not open, Grok publishes models but is seen as a model or two behind, Google’s local models are not very competitive, and Anthropic is not known for open-source releases.
Without frontier coding models you do not get self-improvement, and if you fall behind on generating software you fall behind on generating everything, because software now sits upstream of every hardware pipeline.
Real AI gateway usage shows open models do get used, but the top is heavily dominated by frontier intelligence.
Frontier intelligence at the right cost and performance slaps at scale. Gemini models are underrated and excel as industrial production models for support tasks and browser automation, even if they are not the top pick for coding.
For pushing the frontier you need the best possible coding model, which is now only two or three models, and the Chinese models are not among them.
One contrarian view at the table: use DeepSeek for 97% of tasks because it is cheap, run it repeatedly for harder problems, and reserve frontier models for the most advanced work. The counterargument: intelligence is an unalloyed good, mistakes are invisible and costly, and a smarter model is always cheaper than a person, so you default to the most intelligent option.
Always wanting the most intelligent model risks creating a monopoly or oligopoly in AI, because when two models disagree you cannot tell which is right, so you trust the smarter one and stop asking the weaker one.
Vertical integration is forced, not chosen: if you cannot buy it, you have to make it. The preference is always to buy when a vendor offers a service at a great price, like PCBs from Asia.
The closer a product gets to a single block of covalently bonded matter, the better it performs: lower power, smaller, higher performance, longer lasting. The components for that level of integration simply are not available to buy.
Science owns a captive MEMS foundry on the East Coast, bought because there was no other way to do the packaging and assembly the company needed.
One of the biggest near-term AI impacts inside hardware companies is regulatory and documentation work: tracing which of thousands of ISO standards apply used to occupy a regulatory and quality team for months, and now AI just knows.
Software still needs hands. A model can be smarter than us and still hit real boundaries if it cannot physically make things, which is why Science has instrumented its foundry so model improvements show up immediately in cell engineering and material science.
Basic legal work is already going away. People have stopped asking lawyers for NDAs and routine agreements, because law is spaghetti code in English with no real APIs, and the basic tasks are handled by AI.
Junior engineers got promoted to senior engineers while junior engineering itself got taken over by agents. The same framing applies to paralegals: fired, or promoted to senior lawyers who now spend their time thinking about the law.
What you value in a lawyer is a trusted authority who puts their reputation on the line, not someone who read every clause. The same trust model is coming to engineering.
The biggest problem in software engineering today is mountains of slop arriving as a pull request. The old norm of reading every line of a PR is gone.
The new standard is being able to say “I understand and I am signing off on the consequences of this PR,” backed by the test harness, simulations, proofs, and type checkers you built, even without reading every line.
Embrace a world where code is spaghetti you do not fully understand, but build the evaluators that give confidence, and rely on production engineers to sign off because someone gets paged if the system goes down.
Creating software is easy from zero to one. The hard part is a thousand days from now: is it secure, tested, production grade, and performant, and are you still motivated to invest the tokens to maintain it in prod?
Humans are becoming verifiers. The same way models are trained on good verification data, the old functions of lawyers, engineers, and operations people are moving to verifying the stack and standing behind it.
Detailed Summary
Turning Hardware Engineering Into Software
Blake Scholl opens by describing how AI completely changes the role of software and hardware developers at Boom Supersonic. From day one the company tried to take traditional hardware engineering workflows and turn them into software. For anyone who has not been around hardware engineering, he explains that an enormous amount of it happens in complex Excel spreadsheets on individual engineers’ laptops, sometimes with VBScript code, all of which is actually software but is not treated as software. There is no source control, no automated testing, and when an aerodynamicist hands work to a structures engineer it is done manually with a spreadsheet over email, like it is the 1990s. Boom started building software frameworks to automate and make those flows repeatable so the cost of iteration would drop, but progress was slow because the company could never afford enough software engineers.
Two Engineers, One Jet Engine
The mind-blowing change, in Blake’s words, is a new division of labor. Software engineers create the architectures because they understand systems, algorithms, and separation of concerns, and then hardware engineers vibe code the pieces that draw on what they uniquely know about hardware. The result is wildly different productivity for small teams. His example is the turbine blade: it starts cold and gets bigger as it heats up in operation, so you have to design both the cold shape and the hot shape, converting between them and between structures and aerodynamics. Classically that was one engineer, one day, for one blade of analysis, in a jet engine with a thousand blades, which means you simply could not do much. Now, with software and hardware people working together, you can change blade geometry and see the structural and aerodynamic results in real time, which allows two engineers to design an entire jet engine.
The Cataclysm of Enterprise Software
Picking up on the point that software engineers now build the tools and architectures for everyone else, Naval names what he calls the cataclysm of enterprise software. There is no longer a startup that can build and sell hardware collaboration tools, because internally companies just code the right things they need at any given moment. Even spreadsheets are cooked, he argues, because the reason spreadsheets succeeded is that no one could build custom software, so a spreadsheet stuffed with VBScript functions was the closest available approximation. With that constraint gone, the proxy collapses. He notes he has personally moved almost entirely from Excel to Python models where he can get believable simulations of things.
Generating Step Files and PCB Layouts
The next frontier, Blake suggests, is the thing AI has not reached yet but probably will within 2026: today it can generate software, but soon it will generate step files and PCB layouts, and when it comes for mechanical and electrical engineering that will be a whole other thing nobody has seen yet. On the hardware side this is described as a particular boon for the many small gadget and parts companies that historically wrote bad software because they could not make great software. Now they can make good-enough software, or skip a human front end entirely and expose the hardware agentically, so that an agent accesses it and a person controls the hardware by voice.
China’s Open-Source Bet and Hardware Superiority
This leads into one of the reasons China is described as going all in on open-source models. With hardware superiority, complex supply chains, and deep component chains, China’s logic is that if it can generate software on demand it no longer suffers a software disadvantage against Silicon Valley. That is framed as not the only reason: China is also a model or two behind, it is distilling models, and the government has a history of funding efforts that lift the entire ecosystem, especially in network-effect businesses. Ironically, the open-source heft comes from China precisely because OpenAI is not open, Grok publishes models but is a model or two behind, Google’s local models are not very competitive, and Anthropic is not known for open releases. The deeper point is that without great frontier coding models you do not get self-improvement, and if you fall behind on the ability to generate software you fall behind on the ability to generate everything, because generating software is embedded in every piece of the hardware pipeline.
Frontier Intelligence vs. Cheap Models
Naval raises a dinner-table argument from the night before, where someone claimed you will use DeepSeek for 97% of things because it is cheap, run it repeatedly when you need more intelligence, and reserve OpenAI or Anthropic for the most advanced tasks. Naval pushes back: intelligence is an unalloyed good, you always want more of it, model mistakes are invisible, and a smarter model is always cheaper than a real person in real time, so you default to the most intelligent model available. He notes the downside is that this tends toward a monopoly or oligopoly, because when two models give different answers you often cannot tell which is correct, so you trust the smarter one and gradually stop asking the weaker one. Guillermo confirms with AI gateway data that open models do get used, but the top is heavily dominated by frontier intelligence. His caveat is that frontier intelligence at the right cost and performance slaps at scale: Gemini models are underrated but are excellent industrial production models for support tasks and browser automation, while for pushing the frontier you need the best possible coding model, now only two or three models, and the Chinese models are not in that set.
Vertical Integration and the Captive MEMS Foundry
Asked about his push into vertical integration and extreme urgency, Max Hodak explains that for many things you cannot buy what you need, so you have to make it. The preference is always to buy when a vendor offers a service at a great price, and he points to PCBs, which are basically free and available in unlimited quantity from Asia. But the closer a product gets to being a single block of covalently bonded matter, the better it is: lower power, smaller, higher performance, longer lasting. The components for that level of integration are not available, so to innovate beyond piecing together off-the-shelf parts you have to learn to do it yourself, which shows up as vertical integration. Science owns a captive MEMS foundry on the East Coast, bought because there was no other way to do the packaging and assembly work the company wanted.
Software Still Needs Hands
Max expects AI to heavily affect all of this over the next few years, though it is not quite there yet. Ironically, one of the biggest impacts already seen is in regulatory interactions and documentation: figuring out which of thousands of ISO standards apply to a product change, and tracing it through, used to occupy a regulatory and quality team for months, and now the AI just knows. But for things like the surgical program or the MEMS fab, he argues the software still needs hands. It will be smarter than us, but if it cannot make things, those are real boundaries. Science has instrumented its foundry and many other parts of the company so that as models get better, the improvement shows up immediately in cell engineering and material science.
Lawyers, Paralegals, and the Promotion of Junior Work
The discussion turns to law as a parallel to engineering. It has been a while since anyone at the table generated a basic legal document using a lawyer. Routine work like NDAs and standard agreements is gone, because law is essentially spaghetti code that contradicts itself and has no real APIs, expressed in complicated English. Junior engineers got a promotion to senior engineers while junior engineering itself was taken over by agents, and the same framing applies to paralegals: you can say they were fired, or you can say they were promoted to senior lawyers who now spend their time thinking about the law. What you actually value in a lawyer is a trusted authority who went to law school and puts their reputation on the line when they tell you a document is legit.
Slop PRs, the Thousand-Day Problem, and Humans as Verifiers
Guillermo argues the biggest problem in software engineering today is mountains of slop ending up as a pull request. The old meme of reading every line of a PR is gone. In infrastructure he wants engineers to be able to say they understand and are signing off on the consequences of a PR, backed by the test harness, simulations, proofs, and type checkers they wrote, so they have confidence it will be safe in production even without reading every line. There is a world where everyone embraces that the code is spaghetti nobody fully understands, but builds the evaluators that give confidence and relies on production engineers to say it is fine to ship, because someone gets paged if the system goes down. The further warning is that creating software is easy from zero to one, but a thousand days from now you have to ask whether it is secure, tested, production grade, and performant, and whether you are still motivated to invest the tokens to maintain it in prod. The resolution is that humans are becoming verifiers, the same way models are trained on good verification data, and the old functions of lawyers, engineers, and operations people are moving to verifying the stack and standing behind it.
Notable Quotes
“What I found is it completely changes the role of software and hardware developers.”
Blake Scholl, on how AI reshaped engineering at Boom Supersonic.
“If you want to hand something off from like an aerodynamicist to a structures engineer that’s done manually with like a spreadsheet over email. It’s the 1990s. It’s terrible.”
Blake Scholl, describing the state of traditional hardware engineering workflows.
“It allows two engineers to design an entire jet engine, which is just wildly different.”
Blake Scholl, on collapsing turbine-blade analysis with real-time structural and aerodynamic feedback.
“Even spreadsheets are kind of cooked, right? Because the reason spreadsheets were successful is that no one could build custom software.”
Naval Ravikant, on the cataclysm of enterprise software.
“Right now it can generate software, but soon it’ll be able to generate step files and PCB layouts. And when it comes for mechanical and electrical engineering, that will be a whole other thing that we haven’t seen yet.”
Blake Scholl, on the next frontier for AI in hardware.
“If you fall behind on your ability to generate software, you fall behind on the ability to generate everything.”
Naval Ravikant, on why software now sits upstream of every hardware pipeline.
“Anytime I’m working to push the frontier you need the best possible coding model, and that’s basically now like two or three models, and the Chinese are certainly not in it.”
Guillermo Rauch, on where frontier coding intelligence actually lives.
“You can’t buy it, so you got to make it somehow. The closer that our products get to being like a single block of covalently bonded matter, the better they’ll be.”
Max Hodak, on why Science is forced into vertical integration.
“The software still needs hands. It’s going to be smarter than us, but if it can’t make things, then those are real real boundaries.”
Max Hodak, on the physical limits of AI in hardware.
“You need to be able to say I am signing off on understanding the consequences of this PR, or I wrote the test harness, the simulations, the proofs, the type checkers, to be able to say even without reading this, I have confidence it’s going to be safe in production.”
Guillermo Rauch, on what code review becomes in the age of slop PRs.
“Creating software is really easy 0 to one. But think about a thousand days from now. Is it secure? Is it tested? Is it production grade? And are you still motivated to invest all of those tokens in maintaining it in prod?”
On the long-term cost of software that is cheap to create and expensive to keep alive.
Full episode: The AI Industrial Revolution, the complete hour-long conversation this clip is drawn from, covering software factories, hardware, regulation, healthcare economics, autonomous companies, and creativity.
Part one: Waste Tokens to Save Time, the first half of this same conversation, where Naval, Guillermo Rauch, Blake Scholl, and Max Hodak argue that the job of an engineer is to build the factory and that pure software is not dead.
Boom Supersonic, Blake Scholl’s company building supersonic civilian aircraft and its own jet engines, the source of the turbine-blade and two-engineers example.
Science Corporation, Max Hodak’s company, whose captive MEMS foundry and surgical program anchor the vertical-integration argument.
Vercel, Guillermo Rauch’s company, whose AI gateway data informs the point about frontier intelligence dominating real usage.
Anthropic has closed one of the largest private financing rounds in the history of technology, raising $65 billion in Series H funding at a $965 billion post-money valuation. The round, announced on May 28, 2026, lands as demand for Claude reaches what the company calls historic levels, and it positions Anthropic to pour fresh capital into safety research, compute, and the products that enterprises now lean on every day.
TLDR
Anthropic raised $65 billion in its Series H at a $965 billion post-money valuation, with Altimeter Capital, Dragoneer, Greenoaks, and Sequoia Capital leading and Capital Group, Coatue, D1 Capital Partners, GIC, ICONIQ, and XN co-leading, alongside $15 billion in previously committed hyperscaler investment that includes $5 billion from Amazon. The raise follows Anthropic crossing $47 billion in run-rate revenue earlier in May 2026, and it funds three priorities named by CFO Krishna Rao: advancing safety and interpretability research, expanding compute capacity to meet growing Claude demand, and scaling the products and partnerships customers depend on. On the infrastructure side, the company is locking in gigawatt-scale compute through 5 gigawatts with Amazon, 5 gigawatts of TPU capacity via Google and Broadcom, GPU access from SpaceX, and supply from partners Micron, Samsung, and SK hynix, while Claude remains available across all three major cloud platforms, AWS, Google Cloud, and Microsoft Azure, with widespread enterprise adoption across industries.
Thoughts
Start with the number that everyone will fixate on. A $965 billion post-money valuation against $47 billion in run-rate revenue is roughly 20 times sales, and for a company growing this fast that multiple is not the interesting part. The interesting part is that run-rate revenue crossed $47 billion earlier this month, which means the denominator is moving so quickly that the multiple is already stale. Investors are not pricing the business Anthropic is today. They are pricing the slope. A 20x multiple on a number that may double again inside a year is a very different bet than 20x on a flat line, and the lead names here (Altimeter, Dragoneer, Greenoaks, Sequoia, with Capital Group, Coatue, GIC and others co-leading) are not the kind of capital that pays for nostalgia. They are paying for the second derivative.
But the real story is not the valuation. It is the compute. Read the infrastructure list carefully and you see the actual problem this round solves: 5 gigawatts from Amazon, 5 gigawatts of TPU capacity through Google and Broadcom, GPU access from SpaceX, and memory supply locked down with Micron, Samsung, and SK hynix. That is more than 10 gigawatts of secured power and silicon. The constraint on frontier AI in 2026 is no longer talent or even algorithms. It is electricity, land, and the multi-year queue for advanced packaging and high-bandwidth memory. You cannot buy 10 gigawatts on a quarterly basis. You reserve it years out, and you need the balance sheet to make those commitments credible. A $65 billion raise is, in plain terms, the down payment that lets Anthropic sign for capacity nobody can conjure on demand. The money is downstream of the megawatts.
The diversification across that compute stack matters as much as the size. By splitting between Amazon’s infrastructure, Google and Broadcom’s custom TPUs, and SpaceX-supplied GPUs, Anthropic is refusing to become hostage to any single supplier’s roadmap or pricing. Custom silicon through Broadcom in particular is a bet on bending the cost curve, because the long-term economics of serving Claude at this scale depend on dollars per token, not just on raw availability. Anyone who has watched cloud lock-in play out over the last decade understands the move. Optionality at the hardware layer is leverage, and leverage is what keeps margins from being dictated by whoever owns the only fab slot you can reach.
It is worth pausing on the fact that the round explicitly funds safety and interpretability research alongside scaling, and not as a footnote. Most companies treat safety spend as a cost center to be minimized once growth kicks in. Naming it first, ahead of compute and products, is a statement about where Anthropic believes its durable advantage sits. If models keep getting more capable, the binding constraint on deployment inside regulated industries (finance, healthcare, government) becomes trust, not intelligence. Interpretability is the work that turns a black box into something an enterprise risk committee can actually sign off on. Framed that way, safety research is not philanthropy subtracted from the bottom line. It is the thing that unlocks the most lucrative and defensible parts of the market, and pairing it with the scaling budget is the tell.
Finally, look at distribution. Claude now ships on all three major clouds at once: AWS, Google Cloud, and Microsoft Azure. In a market where most frontier labs are tethered to a single hyperscaler, being available everywhere enterprises already run their workloads is a structural edge. It removes the procurement friction of asking a customer to adopt a new vendor relationship, and it means Anthropic competes on the merits of the model rather than on which cloud a buyer happened to standardize on years ago. Combine that omnipresent distribution with the compute reservations and the explicit safety mandate, and the shape of the strategy is clear. This is not a company buying time. It is a company buying the three things that actually compound: capacity that cannot be rushed, trust that cannot be faked, and reach into every place where work already happens.
Key Takeaways
Anthropic raised $65 billion in its Series H funding round, one of the largest private financings in the history of the technology industry.
The round set Anthropic’s post-money valuation at $965 billion, placing the company within reach of the $1 trillion mark.
Altimeter Capital, Dragoneer, Greenoaks, and Sequoia Capital led the Series H round.
Capital Group, Coatue, D1 Capital Partners, GIC, ICONIQ, and XN served as co-leads on the investment.
The new capital builds on $15 billion in previously committed hyperscaler investments, which includes $5 billion from Amazon.
Anthropic crossed $47 billion in run-rate revenue earlier in May 2026, reflecting the surging commercial demand for Claude.
A core priority for the funding is to advance Anthropic’s safety and interpretability research.
The company will use the capital to expand compute capacity in order to meet growing demand for Claude.
Anthropic plans to scale the products and partnerships that customers depend on across its business.
CFO Krishna Rao said the funding will help Anthropic serve the historic demand it is experiencing, stay at the research frontier, and bring Claude to more of the places where work happens.
Amazon is providing 5 gigawatts of compute capacity as part of Anthropic’s infrastructure expansion.
Google and Broadcom are supplying 5 gigawatts of TPU capacity to power Claude’s growth.
SpaceX is contributing GPU access to Anthropic’s compute footprint.
Micron, Samsung, and SK hynix are partnering with Anthropic on memory and infrastructure to support its scaling needs.
Claude is available on all three major cloud platforms, AWS, Google Cloud, and Microsoft Azure.
Anthropic reports widespread enterprise adoption of Claude across a broad range of industries.
Detailed Summary
The Raise and the Valuation
Anthropic has raised $65 billion in Series H funding, a round that values the company at $965 billion on a post-money basis. The size of the raise places it among the largest private financing events the technology industry has ever seen, and the valuation pushes Anthropic to the doorstep of the trillion dollar mark. The capital arrives at a moment when demand for the company’s Claude models has accelerated sharply, and the round is built to fund the response to that demand rather than simply mark a milestone. Anthropic framed the financing in its Series H announcement as the fuel for staying at the research frontier while scaling the infrastructure and products that customers increasingly rely on.
Who Put In the Money
The Series H was led by Altimeter Capital, Dragoneer, Greenoaks, and Sequoia Capital, a group that combines deep growth-stage technology experience with conviction in Anthropic’s long-term trajectory. Joining as co-leads were Capital Group, Coatue, D1 Capital Partners, GIC, ICONIQ, and XN, a roster that spans crossover funds, sovereign wealth, and institutional investors. Beyond the new equity, Anthropic pointed to $15 billion in previously committed hyperscaler investment, including $5 billion from Amazon. Taken together, the investor base reflects a mix of financial backers and strategic partners with a direct stake in seeing Claude reach more customers and more compute.
Revenue at $47 Billion Run-Rate
Underpinning the valuation is a business that has scaled with unusual speed. Anthropic crossed a $47 billion run-rate revenue figure earlier in May 2026, a number that signals how quickly enterprises and developers have adopted Claude across their workflows. Run-rate revenue annualizes the company’s most recent performance, and at this level it puts Anthropic firmly among the fastest growing software businesses on record. That financial momentum is the practical justification for both the round’s size and the near trillion dollar valuation investors were willing to support.
The Compute Buildout
A large share of the strategy behind the raise centers on securing compute at enormous scale. Anthropic detailed a set of infrastructure partnerships designed to keep pace with Claude demand. Amazon is providing 5 gigawatts of capacity, while Google and Broadcom together are supplying 5 gigawatts of TPU capacity. SpaceX is contributing GPU access, broadening the range of silicon Anthropic can draw on. Supporting the buildout on the hardware supply side are Micron, Samsung, and SK hynix, the memory and component partners whose output is essential to standing up data centers at this magnitude. The combined picture is a company assembling power, chips, and supply chain commitments measured in gigawatts rather than racks.
Where the Money Goes
Anthropic outlined three priorities for the new capital. The first is to advance safety and interpretability research, continuing the work of understanding how models behave and ensuring they remain reliable as they grow more capable. The second is to expand compute capacity to meet the growing demand for Claude, the practical engine behind the infrastructure commitments above. The third is to scale the products and partnerships that customers depend on, deepening the company’s reach into the tools and platforms where work actually happens. Krishna Rao, Anthropic’s chief financial officer, said the funding “will help us serve the historic demand we are experiencing, stay at the research frontier, and bring Claude to more of the places where work happens.”
Claude Everywhere
The funding lands on top of a distribution footprint that already spans the major cloud ecosystems. Claude is available on all three leading cloud platforms, AWS, Google Cloud, and Microsoft Azure, which means enterprises can reach the models through whichever provider they have standardized on. That availability has translated into widespread enterprise adoption across industries, from software and finance to healthcare and beyond. By being present everywhere developers and businesses already operate, Anthropic positions Claude not as a destination customers must travel to but as a capability woven into the platforms they use every day.
Notable Quotes
This funding will help us serve the historic demand we are experiencing, stay at the research frontier, and bring Claude to more of the places where work happens.
Krishna Rao, CFO at Anthropic, on the purpose of the Series H round.
Advance safety and interpretability research, expand compute capacity to meet growing Claude demand, and scale products and partnerships customers depend on.
How Anthropic describes its use of funds from the round.
For the full details on the round, the lead and co-lead investors, and how Anthropic plans to deploy the capital across safety research, compute, and products, read the full announcement here.
Related Reading
Anthropic, the AI safety and research company behind Claude that raised this Series H round.
Sequoia Capital, one of the lead investors anchoring the financing.
Amazon Web Services, one of the three major cloud platforms where Claude is available and the source of a $5 billion investment.
Google Cloud TPUs, the tensor processing units behind the 5 gigawatts of TPU capacity in the Google and Broadcom partnership.
AI safety, the research field at the center of how Anthropic says it will use the new funding.
Anthropic has released Claude Opus 4.8, the newest member of its flagship Opus class, available today across every surface and priced exactly like the model it replaces. The company calls it “a modest but tangible improvement” on Opus 4.7, but the framing undersells what is actually interesting here: the headline upgrade is not a benchmark number, it is honesty. Opus 4.8 is built to know when it does not know, and that single behavioral shift may matter more for real agent work than any raw capability bump.
TLDR
Claude Opus 4.8 is an across-the-board upgrade to Anthropic’s Opus class that ships today at the same regular price as Opus 4.7 ($5 per million input tokens, $25 per million output tokens), with the model positioned as “a more effective collaborator.” The marquee improvement is honesty: Opus 4.8 is roughly four times less likely than its predecessor to let flaws in its own code pass unremarked, and it is more willing to flag uncertainty rather than confidently claim progress on thin evidence. A pre-release alignment assessment found new highs on prosocial traits like supporting user autonomy and acting in the user’s best interest, with misaligned behavior at rates similar to Anthropic’s best-aligned model, Claude Mythos Preview. Three things launch alongside the model: dynamic workflows in Claude Code (research preview), where Claude plans work then runs hundreds of parallel subagents that run even longer and verify their own outputs before reporting back; effort control in claude.ai and Cowork, a slider for how hard Claude thinks; and a Messages API update that accepts system entries inside the messages array so developers can update instructions mid-task without breaking the prompt cache. Fast mode now runs at 2.5x speed and is three times cheaper than before ($10 / $50 per million tokens). The roadmap points to cheaper Opus-equivalent models, a higher-intelligence class above Opus, and a wider rollout of Mythos-class models gated behind stronger cyber safeguards under Project Glasswing.
Thoughts
The most important sentence in this announcement is not about coding scores. It is the claim that Opus 4.8 is about four times less likely than Opus 4.7 to let flaws in its own code slip by without comment. For a chat assistant, overconfidence is annoying. For an agent, it is catastrophic. The whole premise of long-running autonomous work is that you hand the model a task and walk away, which means the model’s own judgment about whether it succeeded becomes the only judgment in the loop until you come back. A model that confidently declares victory on a half-finished migration does not save you time, it costs you a debugging session plus the time you spent trusting it. Honesty, framed this way, is not a soft virtue. It is the load-bearing reliability property that makes unattended agents usable at all.
Read the launch as a single coherent argument rather than a list of features, and the pieces lock together. Dynamic workflows let Claude plan a job and fan out hundreds of parallel subagents that, with Opus 4.8, run longer than before. Effort control lets you dial up how much the model thinks. The honesty improvement means the model checks its own work and flags what it is unsure about instead of papering over it. Put those three together and you get one product thesis: let it run longer, let it think harder, and trust it to tell you when something is wrong. The codebase-scale migration example, hundreds of thousands of lines from kickoff to merge with the existing test suite as the bar, is the proof point. None of those three capabilities is worth much alone. A model that runs for hours but lies about its results is a liability. A model that flags uncertainty but cannot sustain a long task never reaches the moment where its honesty matters. Anthropic shipped all three at once because they only pay off together.
The economics deserve a closer look than the “same price” headline invites. Regular pricing is flat versus Opus 4.7, which is the polite way of saying you get a better model for free. The real move is fast mode: 2.5x the speed at three times cheaper than it cost on previous models, landing at $10 per million input and $50 per million output. That is Anthropic quietly attacking the latency-versus-cost tradeoff that has shaped how teams deploy frontier models. Until now, “fast” meant “expensive,” so you reserved it for interactive moments and ate the wait everywhere else. Collapsing that premium changes the default. And note the subtle token story underneath: Opus 4.8 at its default high effort spends roughly the same tokens on coding as Opus 4.7’s default while performing better, so the effort slider is not a way to bleed you dry, it is an honest exposure of the quality-cost dial that was always there implicitly.
The Messages API change is the kind of unglamorous plumbing that practitioners will appreciate immediately. Letting system entries live inside the messages array means you can update an agent’s instructions, permissions, token budget, or environment context partway through a task without smuggling the update through a fake user turn and without blowing up your prompt cache. Anyone who has built a long-running agent has hit this wall: the world changes mid-task, the agent needs new constraints, and the only clean way to inject them previously was a cache-busting hack. This is Anthropic treating agents as first-class, stateful, long-lived processes rather than oversized chat sessions. It is a small spec change with outsized implications for how you architect an agent that runs for an hour.
Then there is the roadmap, where the most telling line is the quietest. Anthropic says a small number of organizations are already using Claude Mythos Preview for cybersecurity work under Project Glasswing, and that models of this capability level require stronger cyber safeguards before general release. Notice that they are pinning Opus 4.8’s alignment numbers to Mythos as the benchmark for “best-aligned,” while simultaneously holding Mythos back from general availability on safety grounds. That is a deliberate signal: the next class of model is good enough that they are gating it on cyber-offense risk, not on capability. For a site about the pursuit of joy, fulfillment, and purpose through AI, this is the part worth sitting with. The frontier is increasingly defined not by what the models can do, but by what their builders decide it is responsible to ship. Honesty in the small (flagging a bad line of code) and restraint in the large (holding back a cyber-capable model) are the same instinct expressed at two different scales.
Key Takeaways
Claude Opus 4.8 is now available everywhere, replacing Opus 4.7 as Anthropic’s flagship Opus-class model and positioned as “a more effective collaborator.”
Regular usage pricing is unchanged from Opus 4.7, holding at $5 per million input tokens and $25 per million output tokens, so the capability gains come at no added cost.
The single most emphasized improvement is honesty, which Anthropic treats as a core trained behavior rather than a marketing flourish.
Evaluations show Opus 4.8 is around four times less likely than its predecessor to let flaws in its own code pass unremarked, a direct reliability win for autonomous coding.
Early testers report the model is more likely to flag uncertainty about its work and less likely to make unsupported claims or jump to conclusions on thin evidence.
A detailed alignment assessment was run before release and concluded Opus 4.8 reaches new highs on prosocial traits like supporting user autonomy and acting in the user’s best interest.
Misaligned behavior such as deception or cooperation with misuse is at rates substantially lower than Opus 4.7 and similar to Anthropic’s best-aligned model, Claude Mythos Preview.
The full alignment assessment and pre-deployment safety tests are documented in the public Claude Opus 4.8 System Card.
Dynamic workflows launch as a research preview inside Claude Code, letting Claude plan the work and then run hundreds of parallel subagents in a single session.
With Opus 4.8, those subagents can run even longer, and Claude verifies its outputs before reporting back rather than declaring success blindly.
Anthropic’s flagship example for dynamic workflows is a codebase-scale migration across hundreds of thousands of lines of code, from kickoff to merge, using the existing test suite as the success bar.
Dynamic workflows are available in Claude Code for the Enterprise, Team, and Max plans.
Effort control arrives in claude.ai and Cowork as a setting next to the model selector that lets users choose how much effort Claude puts into a response.
Higher effort makes Claude think more frequently and deeply for better answers; lower effort responds faster and consumes rate limits more slowly. Effort control is available on all plans.
Opus 4.8 defaults to “high” effort, judged the best overall balance of quality and user experience.
On coding tasks, the default effort spends a similar number of tokens as Opus 4.7’s default but delivers better performance, so quality rises without a token penalty.
Users can select “extra” (called “xhigh” in Claude Code) or “max” to spend more tokens for stronger results, and Anthropic recommends “extra” for difficult tasks and long-running asynchronous workflows.
Rate limits in Claude Code were increased to accommodate the higher token usage of the higher effort levels.
The Messages API now accepts system entries inside the messages array, a meaningful change for agent developers.
That update lets developers change Claude’s instructions mid-task, adjusting permissions, token budgets, or environment context, without breaking the prompt cache or routing through a user turn.
Fast mode now runs at 2.5x speed and is three times cheaper than it was for previous models, priced at $10 per million input tokens and $50 per million output tokens.
Developers access the model as claude-opus-4-8 through the Claude API.
Partner Miguel Gonzalez reports Opus 4.8 scored 84% on Online-Mind2Web, a meaningful jump over both Opus 4.7 and GPT-5.5, calling it the strongest computer-use and browser-agent model his team has tested.
Databricks reports that, inside Genie, Opus 4.8 reasons over unstructured content like PDFs and diagrams at 61% cheaper token cost than Opus 4.7.
Thomson Reuters reports Opus 4.8 is the first model to break 10% overall on the all-pass standard of its Legal Agent Benchmark, the highest score recorded there.
Eleven partners weighed in, including Cursor, Cognition’s Devin, Databricks Genie, Thomson Reuters CoCounsel, and Hebbia, spanning coding, legal, finance, and enterprise data work.
Anthropic is working on models that deliver many of the same capabilities as Opus at a lower cost.
The company plans to release a new class of model with even higher intelligence than Opus.
Under Project Glasswing, a small number of organizations are already using Claude Mythos Preview for cybersecurity work, with Mythos-class models expected to reach all customers in the coming weeks once stronger cyber safeguards are in place.
Detailed Summary
What Claude Opus 4.8 Is
Claude Opus 4.8 is an upgrade to Anthropic’s Opus class of models, building on Opus 4.7 with improvements across benchmarks covering coding, agentic skills, reasoning, and practical knowledge-work tasks. Anthropic describes the result as “a more effective collaborator” while characterizing the release overall as “a modest but tangible improvement on its predecessor.” The model is available today, everywhere, and developers call it as claude-opus-4-8 via the Claude API. The announcement includes a comparison table against the predecessor and other models, though the per-cell numbers in that table are published as an image and are not reproduced here as text.
Honesty: The Headline Improvement
Anthropic singles out honesty as one of the most prominent improvements in Opus 4.8. All of the company’s models are trained to be honest, which includes avoiding claims they cannot support. A persistent problem with AI models generally is that they sometimes jump to conclusions, confidently claiming progress despite thin evidence. Early testers report that Opus 4.8 is more likely to flag uncertainties about its own work and less likely to make unsupported claims. The most concrete measure: evaluations show Opus 4.8 is around four times less likely than its predecessor to allow flaws in code it has written to pass unremarked. For agentic and unattended use, this self-skepticism is the difference between a model that reliably tells you when something went wrong and one that quietly ships a broken result.
Alignment Assessment
A detailed alignment assessment was run before release. On the positive side, the Alignment team concluded that Opus 4.8 “reaches new highs on our measures of prosocial traits like supporting user autonomy and acting in the user’s best interest.” On the risk side, misaligned behavior such as deception or cooperation with misuse occurs at rates substantially lower than Opus 4.7, and similar to Anthropic’s best-aligned model, Claude Mythos Preview. The full alignment assessment and the pre-deployment safety tests are published in the Claude Opus 4.8 System Card, which also contains the complete benchmark table and wider evaluations.
Dynamic Workflows in Claude Code
Launching today as a research preview in Claude Code, dynamic workflows let Claude plan the work and then run hundreds of parallel subagents in a single session. With Opus 4.8, those agents can run even longer than before, and Claude verifies its outputs before reporting back rather than reporting unchecked results. The showcase example is a codebase-scale migration: Claude Code with Opus 4.8 can carry out migrations across hundreds of thousands of lines of code, all the way from kickoff to merge, using the existing test suite as its bar for success. Dynamic workflows are available in Claude Code for the Enterprise, Team, and Max plans.
Effort Control
Effort control arrives in claude.ai and Cowork as a setting alongside the model selector that lets users choose how much effort Claude puts into a response. Higher effort means Claude thinks more frequently and deeply for better responses; lower effort means it responds faster and uses rate limits more slowly. Opus 4.8 defaults to “high” effort, which Anthropic judged the best overall balance of quality and user experience. On coding tasks, that default spends a similar number of tokens as Opus 4.7’s default while performing better. Users who want more can choose “extra” (called “xhigh” in Claude Code) or “max” to spend more tokens for stronger results, and Anthropic recommends “extra” for difficult tasks and long-running asynchronous workflows. To support the heavier token usage at higher effort levels, rate limits in Claude Code were increased. Effort control is available on all plans.
Messages API Update
The Messages API now accepts system entries inside the messages array. This lets developers update Claude’s instructions mid-task without breaking the prompt cache and without routing the update through a user turn. In practice that means you can update permissions, token budgets, or environment context while an agent is running, which is exactly the kind of statefulness a long-running autonomous process needs. It is a small specification change with significant consequences for how developers build durable agents.
Pricing and Fast Mode
Regular usage pricing is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens. The notable shift is in fast mode, where the model works at 2.5x the speed and fast mode is now three times cheaper than it was for previous models, landing at $10 per million input tokens and $50 per million output tokens. The combination of unchanged regular pricing and dramatically cheaper fast mode reshapes the latency-versus-cost calculus that has long governed how teams deploy frontier models.
Partner Results Across Coding, Legal, Finance, and Data
Eleven partners shared results spanning the spectrum of professional work. Miguel Gonzalez reports 84% on Online-Mind2Web, a meaningful jump over both Opus 4.7 and GPT-5.5, calling it the strongest computer-use and browser-agent model his team has tested. Databricks reports that Genie reasons over unstructured content like PDFs and diagrams at 61% cheaper token cost than Opus 4.7. Thomson Reuters reports Opus 4.8 is the first model to break 10% overall on the all-pass standard of its Legal Agent Benchmark. Cursor reports gains across every effort level on CursorBench with more efficient tool calling, and Cognition reports that Devin sees cleaner tool use, fixes to the comment-verbosity and tool-calling issues seen with Opus 4.7, and improvements over Opus 4.6. Hebbia reports strong quality with better citation precision and more token efficiency on retrieval for dense financial filings. The footnotes note that Terminal-Bench 2.1 was scored on the Terminus-2 public harness (GPT-5.5’s Codex CLI harness score is 83.4%), that OSWorld-Verified methodology changed with Opus 4.7’s score updated to 82.3%, and that on Finance Agent v2 Gemini 3.5 Flash scores 57.9%.
What Is Next: Cheaper Models, Higher Intelligence, and Mythos
Anthropic outlined a three-part roadmap. First, the company is working on models that provide many of the same capabilities as Opus at a lower cost. Second, it plans to release a new class of model with even higher intelligence than Opus. Third, as part of Project Glasswing, a small number of organizations are currently using Claude Mythos Preview for cybersecurity work; models of this capability level require stronger cyber safeguards before general release, and Anthropic expects to bring Mythos-class models to all customers in the coming weeks.
Notable Quotes
“Claude Opus 4.8 has noticeably better judgment. In Claude Code, it asks the right questions, catches its own mistakes, pushes back when a plan isn’t sound, and builds up confidence around complex, multi-service explorations before making big changes. It’s a great model to build with.”
Tom Pritchard, Staff Engineer, in Claude Code
“On our Super-Agent benchmark, Claude Opus 4.8 is the only model to complete every case end-to-end, beating prior Opus models and GPT-5.5 at parity on cost. For agent products in translation, deep research, slide-building, and analysis, it delivers powerful reliability.”
Kay Zhu, Co-Founder and CTO, on the Super-Agent benchmark
“On CursorBench, Claude Opus 4.8 exceeds prior Opus models across every effort level. Tool calling is meaningfully more efficient, using fewer steps for the same intelligence, and it carries end-to-end tasks through.”
Michael Truell, Co-Founder and CEO, on CursorBench results
“Claude Opus 4.8 delivers the highest score recorded on our Legal Agent Benchmark, and is the first model to break 10% overall on the all-pass standard. For substantive legal work, that’s the kind of accuracy lift that translates directly into how much real attorney work our customers can hand off with confidence.”
Niko Grupen, Head of Applied Research, on the Legal Agent Benchmark
“Claude Opus 4.8 feels like a major quality-of-life update over Opus 4.7: faster, easier to collaborate with, and better at carrying context and style direction across a long session. Opus 4.8 is the model I kept trusting for work where voice, taste, and technical execution all have to happen side-by-side.”
Katie Parrott, Staff Writer, on long writing sessions
“Claude Opus 4.8 is the strongest computer-use and browser-agent model we’ve tested, scoring 84% on Online-Mind2Web, which is a meaningful jump over both Opus 4.7 and GPT-5.5. It stays reflective and on-task in the way our customers’ agent workloads need to be reliable end-to-end.”
Miguel Gonzalez, Tech Lead, on computer-use and browser agents
“Claude Opus 4.8 uses tools cleanly and follows instructions with the consistency our autonomous engineering workloads need to keep running unattended. It improves on Opus 4.6 and fixes the comment-verbosity and tool-calling issues we saw with Opus 4.7. This release from Anthropic translates directly into faster capability gains for engineers building on Devin.”
Scott Wu, CEO, on building with Devin
“On our long-running evals, Claude Opus 4.8’s analysis was consistently higher quality than prior Opus models. It finished faster and produced richer, more information dense outputs. Overall, a noticeably better signal to noise ratio. The biggest differentiator was Opus 4.8’s tendency to proactively flag issues with the inputs and outputs of an analysis, something other models routinely missed and left to the users to catch.”
Michael Ran, Sr. Investment Associate, on long-running analysis evals
Claude Opus 4.8 is a quieter release than its “modest but tangible” billing suggests, because the gains land where autonomous work actually lives: a model that flags its own uncertainty, runs longer and checks itself, scales effort on demand, and stays affordable while fast mode gets cheaper. The honesty improvement alone changes the trust math for anyone deploying agents. Read Anthropic’s full announcement here.
Related Reading
Claude Opus 4.8 System Card, the source for the full benchmark table, wider evaluations, and the complete alignment assessment.