PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: application layer ai

  • Benedict Evans on Why AI Is Stuck in 1997: The Task vs the Job, Commodity Models, and Why the Jobs Apocalypse Is Overhyped

    Benedict Evans, the former Andreessen Horowitz partner and independent analyst behind the annual “AI Eating the World” presentation, sat down with Lenny’s Podcast for what the host calls the most rational take on AI you will hear this year. Instead of either doom or hype, Evans argues that AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile, which means we are living through something closer to 1997 than to the singularity. The conversation moves through the jobs question, the difference between a task and a job, whether the model labs have any pricing power, the anti-AI backlash, and what people should actually do. The full conversation is available on YouTube.

    TLDW

    Evans frames AI as a platform shift on the scale of the internet or mobile, with the crucial twist that almost nothing has been built yet, so we are in the 1997 moment where confident predictions about winners are usually wrong. He introduces his central tool, the distinction between the task and the job, to explain why “X percent of this profession is exposed to AI” studies are misleading, why the AI labs are paradoxically hiring forward deployed engineers and buying consultancies, and why accountants kept multiplying through every wave of automation (the lump of labour fallacy and Jevons paradox at work). On value capture he makes a deterministic bet that foundation models have no network effects, behave like a commodity, and will look more like cloud than like Windows, with the value moving up the stack to applications, much as it did in telecom, where a trillion-dollar industry grew data traffic thousands of times over while its stocks went nowhere. He covers distribution as the real moat, Apple Intelligence as the most compelling unshipped vision, the fuzzy anti-AI backlash (including the largely fake water panic and the very real harms of deepfakes), raising kids under radical uncertainty, and closes with the disarming admission that his own synthesis-heavy job is exactly the kind AI is currently worst at. His advice: presume radical uncertainty, dive in rather than sneer, and assume it will probably be okay.

    Thoughts

    The most useful thing in this conversation is a single question Evans keeps returning to: what is the task, and what is the job? A spreadsheet automated the arithmetic an accountant does, and the number of accountants went up for the next forty years. Claude Code can write the code, but deciding what to build, for whom, and why is the part nobody has automated. The reason the “this profession is X percent exposed to AI” studies feel hollow is that they assume a job is a neat stack of separable tasks. Evans argues, by analogy to the old expert-systems failure, that you simply cannot decompose a senior lawyer’s work that way. The 75-slide deck is the task. Walking your company, reading its politics, talking to your customers, and telling you the uncomfortable truth is the job, and that is what you actually paid McKinsey for.

    The boldest and most falsifiable claim is that the foundation-model companies look more like cloud than like Windows. No network effects means no winner-take-all, which means durable competition, which means commodity pricing and compressed margins, with the real value accruing up the stack in applications that nobody at the labs is going to build. His telecom analogy is the one to sit with. A trillion-dollar industry grew mobile data traffic by 1,500 to 2,000 times in fifteen years, and the stocks went nowhere for a quarter century, because it was a low-margin utility while all the interesting value moved to Apple and the people building apps on top. If he is right, the current token-burn economics (the person reportedly spending 1.5 million dollars a month on tokens) are the equivalent of a 50,000 dollar roaming bill in 2010, not the steady state. Evans flags openly that he could be completely wrong, which is the intellectually honest part and the part most forecasters skip.

    “It depends” and “it will probably be okay” sound like evasions, and Evans leans into that. But the 1997 framing is doing real work. The point is not that AI is small, it is that the things that will end up mattering have not been built, and that anyone confidently naming the winners today is repeating the 1997 mistake of betting on Excite over a search company with a weird logo. The discipline he is selling is to presume radical uncertainty and act anyway, because the alternative, declaring the whole thing slop and shouting about it online, buys a great feeling of moral superiority and nothing else. His repeated insistence that you can see the job that goes away but never the new job, because it does not exist yet, is the load-bearing idea under his optimism.

    The most disarming moment is the closing AI-corner answer, where the person whose entire brand is explaining AI admits he struggles to use it. His work is synthesis and precise information retrieval, and precise retrieval happens to be exactly what today’s models are worst at. He is, in his own words, the lawyer looking at VisiCalc: it is obviously transformative, and he just does not happen to make spreadsheets all day. That admission is worth more than any benchmark, because it locates the real variable. How much AI changes your life depends less on how good the model gets and more on whether your daily work sits on the part of the jagged frontier where it already works. That is a far more practical lens than arguing about whether AGI arrives in three years or thirty.

    Key Takeaways

    • Evans’s headline opinion is that AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile. Both halves of that sentence matter.
    • If you make the internet comparison honestly, we are roughly in 1997: very exciting, most of it does not work yet, most of what people will build has not been built, and it is unclear how any of it will end up working.
    • Adoption is spread across a very wide distribution. Even among teenagers, only something like 15 to 20 percent are daily active users and another 20 percent weekly, with the majority saying they do not use it at all.
    • That spread maps onto the “jagged frontier” question of where AI works, where it does not, whether you can predict where it will work in advance, and whether you can even tell after the fact.
    • Software developers are the accountants seeing VisiCalc: for them everything has already changed. Most other professions are watching, intrigued but unsure what to do with it.
    • The AI labs are investing heavily in forward deployed engineers, consultancies, and professional services. Evans jokes that a forward deployed engineer is an Accenture outsourced developer who lives in San Francisco.
    • Companies do not have spare people sitting around to reimagine every internal workflow, so reinventing a business around AI is itself a project that needs consultants, which is why the most cutting-edge labs are funding exactly the firms everyone assumed AI would kill.
    • The central framework: separate the task from the job. Sometimes the task is the job (the elevator operator pressing a lever), and automating the task ends the job. Far more often, the task is only part of the job.
    • Amazon gets you the SKU once you know which SKU you want. Knowing which one to buy is a different job. Claude Code writes the code, but knowing what code and what features to build is the job.
    • A McKinsey or Bain engagement is not really about the deck. The deck is the task. The job is walking your enterprise, understanding the politics, talking to your customers, and telling you the truth.
    • The Jevons paradox is just price elasticity applied to labour. Make something cheaper to produce and you usually do far more of it, not the same amount with fewer people.
    • Excel did not give investment bankers shorter hours. iPhone SDKs did not shrink the number of engineers even though Apple writes 90 percent of the code for you. The number of accountants rose through every wave of automation.
    • The lump of labour fallacy: since 1800, each technology automates jobs and unlocks new ones. You can always see the job that disappears and never the new job, because it does not exist yet.
    • Evans is wary of argument from authority on jobs. He wants Dario Amodei’s view on where models go in the next 6 to 12 months, not necessarily his theory of labour markets and comparative advantage.
    • The doomer scenario of every company buying ChatGPT and firing everyone in two weeks misunderstands how enterprises work. Enterprise sales cycles run 18 months or more. Nobody is ripping out SAP overnight. The full transformation takes 3 to 10 years, sector by sector.
    • AGI and superintelligence are being quietly redefined to mean whatever works now. Larry Tesler’s theorem: AI is whatever machines cannot do yet, because once they can, people call it just software.
    • We have no theory of human intelligence, no theory of why these models work, and no theory of how much better they will get, so everyone is vibes-forecasting. Even if progress stopped tomorrow, what exists is already transformative and will roll out for a decade.
    • On value capture, Evans argues models show no network effects, so no single one runs away with the market. Persistent competition plus little real product differentiation means little pricing power.
    • Sam Altman’s pitch of selling intelligence on a meter like electricity ignores the brutal margin structure of utilities. Your TV maker does not pay the power company a cut of your bill.
    • The telecom analogy: a roughly trillion-dollar mobile industry spends 15 to 20 percent of revenue on capex, grew data consumption 1,500 to 2,000 times since 2010, and its stocks went nowhere for 25 years because it is a low-margin commodity utility.
    • The elemental question: does the model do the whole thing, or does it need thousands of different apps built by different people? If it needs apps, the labs cannot build them all, just as Microsoft did not, so it looks more like AWS than like Windows.
    • If the product is a commodity, distribution becomes the moat. Google pushes Gemini through its surfaces, Meta sprayed AI across its apps and quietly ranked between ChatGPT and Gemini in usage, and incumbents with distribution have a structural edge.
    • Browsers are the warning: Microsoft used distribution to win the browser war, then it turned out winning browsers did not matter because the value was further up the stack.
    • Apple Intelligence, as shown at WWDC 2024, was the most compelling vision of a personal AI assistant Evans has seen. Apple could not ship it, but neither could anyone else, because tool-using on-device agents with no hallucinations across thousands of apps is genuinely hard.
    • The model is “the dumb thing underneath” that powers a feature. The same commodity model can sit beneath both Gemini on Android and Apple Intelligence on iOS while the products and distribution differ entirely.
    • The anti-AI backlash is a big fuzzy mess. Some is real (local electricity bills, deepfakes, real job anxiety), some is sort of true, and some is simply false.
    • The data-center water panic is largely fake. A Livermore lab study put US data-center water consumption at about 0.017 percent of US water use. Local well conflicts are planning problems, not data-center problems.
    • We have shockingly little hard data. The model labs do not publish meaningful usage numbers. There is no public daily active user figure for ChatGPT, so economists are reverse-engineering effects from government surveys.
    • Real new harms do appear with each wave. A teenager could not use Photoshop to make explicit fakes of every classmate and send them to the whole school in an afternoon. Now they can, and turn them into video.
    • The UK Post Office Horizon scandal (buggy Fujitsu software wrongly showing cash shortfalls, leading to prosecutions, bankruptcies, and suicides) is a reminder that every technology brings new ways to ruin lives, by malice or by accident.
    • You cannot reliably predict what gets exposed. In 1997 people thought taxis were safe from the internet and newspapers would be fine. The opposite happened. Today, “AI-proof” jobs like personal trainer may not be as safe as they look.
    • Uber and Airbnb show that similar-sounding companies can have very different market impact. Uber demolished and then grew the taxi market, while Airbnb’s effect on hotels was fairly marginal because business travel still wants a hotel.
    • Every new technology first lets you do the old thing but more, then unlocks things that were not possible before. Recorded music revenue is U-shaped: first “what if I do not pay 15 dollars for a CD,” then “what if 15 dollars a month gives me all the music there is.” Spotify is not an online music store, it is something else.
    • Coding was supposed to be one of the last things automated, and instead it is the most transformed role of all, which is itself a lesson in how badly we predict exposure.
    • Practical advice: do not stick your head in the sand. Dive in, submerge yourself, and come out understanding what you can do with it. Going into a shrinking job market announcing you will never use AI is not the right posture.
    • Evans’s honest coda: he struggles to find AI use cases because his job is synthesis and precise retrieval, the things models are worst at. He uses it for proofreading, images, redecorating his apartment, and dictation. He is the lawyer looking at VisiCalc.

    Detailed Summary

    AI is as big as the internet, and we are living in 1997

    Evans opens with the opinion he calls his most controversial: AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile. To some in tech that sounds dismissive, as if he is underrating a once-in-history event. His reply is that smartphones and the internet were themselves enormous, and we are talking over the internet right now. The deeper point is the comparison’s timing. If this is like the internet, then it is like the internet in 1997: thrilling, but most of it does not work yet, most of what will be built has not been built, and nobody knows how the pieces will fit. His latest 80-slide presentation, he jokes, is essentially 80 ways of saying “we do not know,” which is partly facetious and partly the entire point.

    The jagged frontier and the wide spread of adoption

    Adoption is not uniform, it is a wide distribution. Some people in tech have bought clusters of Mac minis and stopped using Google, while most people outside tech who use AI at all touch it once every week or two. Even among 13 to 18 year olds, daily active use sits around 15 to 20 percent, weekly use adds another 20 percent, and roughly 60 percent say they do not use it. That spread maps onto what Evans calls the jagged frontier: whether a given task works, whether you can predict in advance that it will work, whether it is intuitive, and whether you can even tell after the fact. Software developers are the accountants who just saw VisiCalc, living in a clear before-and-after. Everyone else is somewhere on the curve, picking it up to varying degrees and a little puzzled about what it is for.

    Why the AI labs are buying consultancies

    One of the most counterintuitive trends is that the leading labs are pouring money into forward deployed engineers and professional services, the very category many assumed AI would erase. Evans’s explanation is grounded in how companies actually operate. Firms do not keep spare people sitting around to redesign stores, hunt down churn, or rebuild a tech stack, which is exactly why they hire Bain, BCG, McKinsey, Accenture, or Infosys when a big project appears. Reimagining every internal workflow around AI, then actually plugging vertical and horizontal systems together and retraining people, is itself a multi-month project requiring people you do not have. So the work gets outsourced, and the most advanced labs are funding the firms that do it. His joke lands the point: a forward deployed engineer is a statistician, or an Accenture developer, who happens to work in San Francisco.

    The task versus the job

    This is the spine of the conversation. Ask what the hard part of a job really is. Sometimes the task is the job: the elevator attendant’s whole job was driving the car, the task got automated, the job ended. Much more often the visible task is only a slice. Amazon gets you the SKU once you know which SKU you want, but knowing what to buy is a separate job. Claude Code writes the code, but deciding what to build, for whom, and how to take it to market is the job. A consulting deck is the task, while the reason you pay Bain is for them to walk your company, understand its politics, talk to your customers, and tell you the truth. Evans notes you can already generate a bad McKinsey deck with AI, and the LinkedIn grifters who do are missing that the deck was never the thing you were buying.

    Jevons paradox and the lump of labour fallacy

    The Jevons paradox is just price elasticity applied to labour: make something cheaper to do and you usually do much more of it. Excel did not hand junior bankers their Friday afternoons off, it expanded the work. iPhone developers write a fraction of the raw code because Apple wrote the drivers and file system, yet there are not a tenth as many engineers as a result; there are far more. The count of accountants climbed through adding machines, punch cards, mainframes, databases, ERP, spreadsheets, and cloud. The lump of labour fallacy is the broader version: since 1800 every technology has removed jobs and unlocked new ones, the removed jobs usually look bad in hindsight, the new ones tend to be better, and GDP keeps rising. You can always see the job that disappears and never the one that does not exist yet.
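    The elasticity point can be made concrete with a toy calculation. This is a minimal sketch, assuming a textbook constant-elasticity demand curve; the curve, the function names, and the elasticity values are illustrative assumptions, not figures from the conversation. It shows that when demand is elastic (elasticity above 1), halving the cost of a unit of work raises total spending on that work, and when demand is inelastic it lowers it.

    ```python
    def quantity_demanded(price: float, elasticity: float, scale: float = 1.0) -> float:
        """Constant-elasticity demand: quantity = scale * price ** (-elasticity).

        Hypothetical toy model, chosen only to illustrate the argument.
        """
        return scale * price ** (-elasticity)


    def total_spend(price: float, elasticity: float, scale: float = 1.0) -> float:
        """Total spending on the work at a given unit price."""
        return price * quantity_demanded(price, elasticity, scale)


    # Assumed values: the unit cost of the work halves after automation.
    OLD_PRICE, NEW_PRICE = 1.0, 0.5

    for elasticity in (0.5, 1.5):
        before = total_spend(OLD_PRICE, elasticity)
        after = total_spend(NEW_PRICE, elasticity)
        direction = "rises" if after > before else "falls"
        print(f"elasticity {elasticity}: total spend {direction} "
              f"from {before:.2f} to {after:.2f}")
    ```

    Whether a given profession sits on the elastic or inelastic side is exactly the part that cannot be known in advance, which is why the historical record of accountants and engineers carries the argument rather than the formula.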

    The jobs question, Dario, and the enterprise sales cycle

    On the coming jobs apocalypse, Evans is cautious about argument from authority. Running an AI lab makes Dario Amodei worth listening to on where models go in the next 6 to 12 months, not necessarily on labour economics and comparative advantage. The doomer image of companies buying ChatGPT and firing everyone within weeks misreads reality: enterprise sales cycles run 18 months or longer, nobody is tearing out SAP overnight, and the full transformation will take 3 to 10 years, sector by sector, as people slowly work out what to do. He points to the lag in software itself. Many SaaS companies founded the day before ChatGPT launched could have been built a decade earlier. They were not, because the bottleneck was never the technology: it was someone realizing that a problem existed and that this was the way to solve it.

    Redefining AGI and superintelligence

    Evans is skeptical of the moving terminology. He cites Larry Tesler’s line that AI is whatever machines cannot do yet, because the moment they can, people call it just software. Machine learning, image recognition, and sentiment analysis all got reclassified as not really AI once they worked, the same way jet airliners were once high technology and are now just planes. AGI is now often quietly redefined as doing some percentage of economically valuable work, which a 1975 mainframe also did, rather than anything about consciousness or a soul. Whether we reach human-level intelligence is, in his view, genuinely unknowable right now. The reassuring point is that you do not need to resolve it. Even if models hit a brick wall tomorrow, what already exists is transformative and will take a decade to deploy.

    Where the value accrues: commodity models and the telecom analogy

    Here Evans makes his most deterministic argument. Foundation models appear to lack network effects, so no single model runs away from the pack, competition persists, and product differentiation as users experience it is thin. Without differentiation or lock-in, where does pricing power come from? He skewers Sam Altman’s image of selling intelligence on a meter like electricity by pointing out that utilities have terrible margins and nobody pays the power company a cut of their TV. His telecom career supplies the analogy: mobile is a roughly trillion-dollar industry that spends 15 to 20 percent of revenue on capex, grew data traffic 1,500 to 2,000 times since 2010, and whose stocks went nowhere for 25 years because it is a low-margin commodity utility while the value sits up the stack with Apple and the app makers. If models are commodities and the real product is thousands of apps the labs will not build, the outcome looks like cloud, not like Windows.
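    To put the traffic figure in annual terms, here is a minimal sketch of the compound growth rate implied by a 1,500 to 2,000 times increase. It assumes “since 2010” means roughly fifteen years; that span and the helper name `implied_annual_growth` are assumptions made for illustration.

    ```python
    def implied_annual_growth(multiple: float, years: float) -> float:
        """Compound annual growth rate implied by a total growth multiple."""
        return multiple ** (1.0 / years) - 1.0


    YEARS = 15  # assumption: "since 2010" read as roughly fifteen years

    for multiple in (1_500, 2_000):
        rate = implied_annual_growth(multiple, YEARS)
        print(f"{multiple}x over {YEARS} years is about {rate:.0%} per year")
    ```

    Under that reading, traffic compounded at more than 60 percent a year for a decade and a half, and the stocks still went nowhere, which is the whole force of the analogy.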

    Distribution as the moat

    If the product is a commodity, distribution decides the winners. The web browser is the cautionary tale: the browser product is a thin wrapper around a rendering engine, tabbed browsing was the last real innovation 20-plus years ago, Microsoft used distribution to win, and then winning browsers turned out not to matter because the value was elsewhere. Now Google drives Gemini through its surfaces, and Meta sprayed AI across its apps and, in survey data, sat between ChatGPT and Gemini in usage despite the tech industry writing it off. An adequate product with great distribution and brand becomes a big deal, which is why OpenAI spent last year trying everything to build a flywheel before the giants defaulted everyone onto their own offering. The power of the default and sheer inertia do a lot of work.

    Apple Intelligence and the model as the dumb thing underneath

    Evans calls the Apple Intelligence segment of WWDC 2024 the most compelling vision of a personal AI assistant he has seen: tool-using, on-device, agentic, with no prompt injection or hallucinations across a standardized API spanning thousands of apps. Apple could not ship it, but neither could anyone else, because that is genuinely hard. The episode illustrates his framing that the model is “the dumb thing underneath” that powers a feature. The same commodity model can sit beneath Gemini on Android and Apple Intelligence on iOS, with different products, different distribution, and different decisions about what the feature should be. Apple has a billion edge-capable devices, while Google’s “coming soon to our most powerful devices” really means it will not work on most Android phones.

    The anti-AI backlash, water, and real harms

    The backlash, Evans says, is a big fuzzy mess of very different things. Some is tangible, like a higher local electricity bill in a small number of places. Some is essentially fake, like the water panic. He dug into a Livermore lab study putting US data-center water use at about 0.017 percent of national consumption. Local well conflicts are planning failures, not data-center failures. The jobs piece is genuinely unresolved, with charts pointing both ways and a youth employment slowdown that shows up regardless of degree or AI exposure. He stresses how little hard data exists, since the labs publish no meaningful usage numbers and there is no public daily active user figure for ChatGPT. He compares the moment to a compressed version of the social media backlash, where some fears were true, some half true, and some simply false. The new harms are real, though: deepfakes let a teenager generate explicit fakes of an entire school in an afternoon, and the UK Post Office Horizon scandal shows how buggy software plus institutional denial can destroy lives.

    You cannot predict what gets exposed, and what to actually do

    Evans dismisses the O*NET-style exercise of scoring what percentage of each profession AI can do as deluded, the modern version of the expert-systems problem, where you try to describe a job as 700 logical steps and it never works. You cannot say a senior partner’s work is 17 percent automatable. The history of prediction is humbling: in 1997 people thought taxis were safe from the internet and newspapers would simply save on printing, and both were wrong. Coding, supposedly one of the last things to automate, became the most transformed role of all. Personal trainers might be next once your phone can watch your form. His closing advice is to presume radical uncertainty and act anyway: do not retreat into sneering moral superiority, dive in, internalize what the tools can do, and make yourself a great hire. He ends with a candid admission that his own synthesis-and-retrieval job is exactly what AI is currently worst at, so he is the lawyer looking at VisiCalc, sure it changes everything while not personally making spreadsheets all day.

    Notable Quotes

    “My most controversial opinion is that I think that AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile.”

    Benedict Evans, stating the thesis that frames the whole conversation

    “If you’re going to make the internet comparison, it’s like we’re in 1997. It’s very exciting. Most stuff kind of doesn’t work yet. Most of the stuff that people are going to do hasn’t been built yet.”

    Benedict Evans, on why confident predictions about AI winners are usually wrong

    “You can’t look at a senior partner at a law firm and say, well, 17 percent of their work could be automated. This is horseshit.”

    Benedict Evans, on why O*NET-style job-exposure scoring fails

    “Claude Code can write you the code, but what code do you want? It can make you the features, sure, but what features do you want? Who’s your customer? What’s the right product for that customer?”

    Benedict Evans, drawing the line between the task and the job

    “There’s this quote from Sam Altman where he said we’re going to be selling AI intelligence on a meter like water or electricity, and you look at this and think, my dear sweet child, you need me to explain the margin structure of the utility industry to you.”

    Benedict Evans, on why model labs may lack pricing power

    “The model is just the dumb thing underneath that powers the feature. The model is the commodity that powers different decisions about what the feature should be.”

    Benedict Evans, on why value moves up the stack to applications

    “Every time we have a new technology it automates away a bunch of jobs, and then that automation unlocks a bunch of new jobs, and you don’t know the new job because it doesn’t exist yet.”

    Benedict Evans, on the lump of labour fallacy and 200 years of automation

    “Don’t stick your head in the sand and say I hate all of this stuff. That gives you a great feeling of moral superiority, but that’s not going to help. What helps is you diving into this and coming out understanding what you can do with it.”

    Benedict Evans, on what to actually do about AI right now

    “AI is good at stuff that computers are bad at, and bad at stuff that computers are good at.”

    Benedict Evans, quoting an observation that explains why he struggles to use AI in his own work

    This is a curated set of pulls, not a transcript. To hear the full argument in context, including the telecom and recorded-music charts and the lightning round, watch the full conversation on YouTube.

  • Gavin Baker on Orbital Compute, TSMC, Frontier AI Models, Anthropic’s Vertical Take Off, and the Coming Wafer Shortage

    Gavin Baker, founder and CIO of Atreides Management, returns to Patrick O’Shaughnessy’s Invest Like the Best for his sixth appearance. He calls the current AI moment the most extraordinary moment in the history of capitalism, walks through what Anthropic’s vertical takeoff in revenue actually means, lays out why orbital compute is closer than skeptics believe, dissects the TSMC bottleneck that may be the only thing standing between today’s market and a full-on AI bubble, and rates every hyperscaler on how they have positioned for a world where frontier model providers may stop selling API access altogether.

    TLDW

    Anthropic added eleven billion dollars of ARR in a single month, which is roughly the combined business of Palantir, Snowflake, and Databricks built over a decade. That is the setup. From there Gavin Baker turns to markets: the March and April selloff, the contrarian read that a closed Strait of Hormuz was actually bullish for American manufacturing competitiveness, why Anthropic and OpenAI multiples may be misleadingly cheap on an unconstrained run-rate basis, and why Elon Musk’s discipline on SpaceX valuation created a superpower of permanent access to capital. On infrastructure, he makes the practical engineering case for orbital compute as racks in space rather than Pentagon-sized space stations, explains why TSMC’s capacity discipline is the single most important variable in whether the AI cycle becomes a bubble, and describes what Terafab in Texas changes. On models, he covers why the Pareto frontier of AI models has flipped from Google dominance to Anthropic and OpenAI dominance in nine months, the shift from all-you-can-eat AI subscriptions to usage-based pricing and what that means for revenue scaling, Richard Sutton’s bitter lesson as the largest risk to the AI trade, why frontier tokens still capture an overwhelming share of economic value, and the role of continual learning as the third great open question. On chips and venture, he explains why most new chip startups should not try to build a better GPU, why Cerebras did something different and hard, why disaggregated inference may extend GPU useful lives to ten or fifteen years and rescue the private credit industry, and why being in the token path is the new venture filter. He closes with the new prisoner’s dilemma around releasing frontier models via API, an honest rating of Google, Meta, Amazon, and Microsoft, why personal safety is becoming a real AI-era risk, and why he remains an AI optimist maximalist who believes this could be the next Pax Americana.

    Key Takeaways

    • Anthropic added eleven billion dollars of ARR in one month, more than the combined businesses of Palantir, Snowflake, and Databricks built across a decade. There is no precedent for this in the history of capitalism.
    • The SaaS and cloud revolution created between five and ten trillion dollars of value over twenty years. AI is replaying that compression on a timeline measured in months.
    • The March selloff was a drawdown driven by disagreement with price action, not an invalidated thesis. That is the kind of drawdown an investor can lean into.
    • DeepSeek Monday in January 2025 was a similar setup. By the day of the selloff, AWS Asia GPU prices had already doubled, GPU availability had fallen, and it was obvious reasoning models would be vastly more compute hungry at inference. The market priced the opposite.
    • The Strait of Hormuz closing was actually positive for America. US natural gas (the primary input into US electricity, which feeds AI) fell twenty percent on Bloomberg while Asian and European natural gas doubled or tripled. American manufacturing competitiveness improved overnight.
    • The US is now the world’s largest producer and exporter of oil and gas. The economy is dramatically less energy intensive than in the 1970s. The shortage trauma comparison does not hold.
    • Tech as a sector traded as cheaply versus the rest of the market in early April as at any point in the last ten years, into the single most bullish moment for AI fundamentals on record.
    • Anthropic is dramatically more capital efficient than OpenAI, having burned roughly eighty percent less to reach a similar revenue scale. They have very different structural returns on invested capital.
    • Anthropic at roughly nine hundred billion for fifty billion of ARR (growing a thousand percent) is striking. Adjusted for compute constraint, the unconstrained run rate could be one hundred fifty to two hundred billion, putting the implied multiple closer to five times.
    • Claude Opus generates roughly seventy percent fewer tokens for the same question than previously, with token quantity tied to answer quality. Subscribers on flat-fee plans are getting a lobotomized model.
    • Elon Musk’s superpower is twenty years of making investors money. He never pushes valuation. SpaceX compounded at a low thirties percent annual rate for a decade because Musk treats fair pricing as a sacred covenant.
    • Capitalism will solve the watts shortage. The current bottleneck has shifted from chips and energy to zoning and political approval. Many capex decisions are paused until after the US midterms.
    • The watts shortage probably begins to alleviate in 2027 and 2028. Orbital compute solves it longer term.
    • Orbital compute is not Pentagon sized data centers in space. It is racks in space. A Blackwell rack is three thousand pounds, eight feet tall, four feet deep, three feet wide. SpaceX has shown a satellite roughly that size.
    • The satellites operate in sun synchronous orbit so solar wings (around five hundred feet per side) always face the sun and the radiator on the dark side always points to deep space.
    • Starlink V3 satellites already run at around twenty kilowatts. A Blackwell rack runs at one hundred kilowatts. SpaceX engineers express genuine confidence they have already solved cooling and radiator design at these scales.
    • Racks in space are connected with lasers traveling through vacuum, the same lasers already on every Starlink. SpaceX operates the world’s largest satellite fleet and, via xAI Colossus, the world’s largest data center on Earth.
    • Inference will move to orbit. Training will stay on Earth for a long time. Terrestrial data centers remain valuable for the rest of an investor’s career.
    • The wafer bottleneck is structural and political. TSMC is essentially Taiwan’s GDP, water, and electricity. The leaders see themselves as inheritors of Morris Chang’s sacred legacy and they do not behave like a Western public company.
    • Jensen Huang has never had a contract with TSMC. The relationship is run on handshakes and the assumption that things will be fair over time.
    • If TSMC did everything Jensen wanted, Nvidia could be selling two to three trillion dollars of GPUs in 2026 and 2027. TSMC’s discipline is the single largest factor preventing a true AI bubble.
    • Historically, foundational technologies always get a bubble. Railroads, canals, the internet. The current AI buildout is overwhelmingly funded out of operating cash flow, GPUs are running at one hundred percent utilization, and that is fundamentally different from the year 2000 fiber overbuild.
    • If one of Intel or Samsung Foundry catches up at the leading node, the other will follow, and TSMC’s discipline collapses. Watch TSMC capacity decisions to predict a bubble.
    • Terafab, the SpaceX and Tesla joint venture to build the world’s largest fab in America, has a partnership with Intel that grants access to fifty years of institutional foundry knowledge. The A teams at ASML, KLA, Lam Research, and Applied Materials will follow Elon’s reputation in hardware engineering.
    • The hiring playbook for Terafab includes building Taiwan Town, Japan Town, and Korea Town next to the fab. Recruit the engineers and import their families, their restaurants, and their staff.
    • Frontier tokens still capture an overwhelming share of all economic value created at the model layer. This is surprising and is one of the three big open questions for AI investing.
    • The Pareto frontier of intelligence versus cost has flipped. Nine months ago Google’s TPU based models dominated every point on the frontier. Today Anthropic and OpenAI dominate, with Grok 4.3 on the frontier and Gemini 3.1 hanging on.
    • Google’s conservative TPU V8 design (partly an attempt to reduce dependence on Broadcom and Nvidia) is the leading explanation for the loss of per token cost leadership.
    • AI pricing is shifting from all you can eat to usage based, mirroring the cellular and long distance industries. Cellular stopped being a great growth industry when it went all you can eat. AI just made the opposite move.
    • OpenAI and Anthropic together could exceed two hundred billion in ARR this year if compute keeps coming online and frontier token pricing holds.
    • The two hundred fifty dollar a month consumer AI plan is no longer enough to evaluate frontier capability. Enterprise plans with usage based billing are required because rate limits are now severe.
    • The three biggest open questions for AI investors are: violation of the bitter lesson via ASI or human ingenuity, whether frontier tokens keep commanding their premium, and when continual learning arrives.
    • Today’s continual learning is crude reinforcement learning during mid training on verifiable tasks. True continual learning means weights updating dynamically, like a human who learns the first time they touch fire.
    • Trying to build a better GPU is a losing strategy. Jensen will copy any one to three percent share design. Startups should target one percent share, do something different, and make it hard enough that Nvidia cannot fast follow.
    • Disaggregated inference (separating prefill and decode) opens new design canvases. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently.
    • Cerebras did something different and hard with wafer scale computing. Three generations of chips and real grit to get there.
    • Disaggregation of inference may stretch GPU useful lives to ten or fifteen years, dropping financing costs from low sevens to five or six percent, mathematically lowering the cost of the AI buildout and likely saving the private credit industry from its SaaS loan exposure.
    • Sellers of shortage outperform buyers of shortage. But owning the largest installed base of what is currently in shortage (hyperscaler CPU fleets, for example) is also a strong position.
    • Most of the economic value at the application layer of AI has been destroyed, not created. The exceptions are companies in the token path or in niches small enough that frontier labs ignore them.
    • Coding may be the shortest path to ASI. If you can write code, you can write code that does anything. Cursor, Cognition, and Anthropic correctly focused on it.
    • Jensen could probably get close to the frontier with his own Nemotron family of models whenever he wants. The fact that he chooses not to is a strategic decision about not commoditizing his customers.
    • The new prisoner’s dilemma in AI is whether frontier labs release their best model via API. If everyone agrees not to, Chinese open source falls behind. If anyone defects, the defector pulls ahead on revenue and resources, forcing everyone else to defect.
    • Google still owns the largest compute installed base. Without TPU’s prior cost advantage, this matters more. YouTube data has real value in a world of robotics. GCP is going crazy.
    • Meta deserves credit for becoming AI first internally faster than any other internet giant. Musa, their first MSL model, is impressively close to the Pareto frontier.
    • Amazon is strong because of Trainium and robotics driven retail P&L efficiency. Nova is better than it gets credit for.
    • Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Microsoft products rather than reselling to OpenAI is a courageous and probably correct call, even at the cost of an eight hundred dollar stock price.
    • The large compute players most engaged with startups are Nvidia and Amazon by a mile, followed by Google. Broadcom is the favorite ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement, and that will cost them because the best teams are now at startups.
    • Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion at the speed of FaceTime is already feasible.
    • Ukraine is winning largely on the back of having the best battlefield AI outside America and Israel. Adversaries are starting to internalize what AI dominance means geopolitically.
    • An optimistic read is that this becomes a new Pax Americana, the way the post 1945 American nuclear monopoly was used to rebuild Germany and Japan rather than dominate.
    • A friend used AI agents to spin up a research effort that identified a drug already on the market that addresses his daughter’s rare genetic condition. That is the upside that keeps Gavin a maximalist AI optimist.

    Detailed Summary

    The most extraordinary moment in the history of capitalism

    Gavin’s framing of the current moment is unusually direct. Anthropic added eleven billion dollars of annual recurring revenue in a single month. The three highest profile SaaS companies of the last decade plus, Palantir, Snowflake, and Databricks, took a decade and tens of thousands of employees collectively to build the combined business that Anthropic added in thirty days. He has been investing through every major tech cycle and says there is no historical analog. Not the dotcom era, not the cloud transition, not mobile. This is its own thing.

    The market response, then, was peculiar. The NASDAQ sold off into the single most bullish moment for AI fundamentals on record. Tech traded at roughly its widest discount versus the rest of the market in a decade. Investors who said they wished they had bought into AI during 2022, during COVID, or during DeepSeek Monday got the same valuation setup again in early April, this time with an even clearer inflection.

    Why the Strait of Hormuz closing was secretly bullish for America

    One reason the macro fear in March may have been mispriced is that the same geopolitical event that drove the selloff was, in practice, a relative benefit to the United States. American natural gas, the input into American electricity, which is the input into American AI training and inference, fell roughly twenty percent. Asian and European natural gas prices doubled or tripled. The US emerged with sharply improved relative manufacturing competitiveness, which is exactly what the current administration cares about.

    The 1970s comparison does not hold. The US economy is dramatically less energy intensive, it is now the world’s largest producer and largest exporter of oil and gas, and there are no shortages, only price moves. That backdrop made it easier for disciplined investors to stay focused on AI fundamentals through the volatility.

    Anthropic and OpenAI valuations on an unconstrained run rate

    Anthropic at roughly nine hundred billion for fifty billion of ARR sounds rich until you adjust for the fact that the company is severely compute constrained. Gavin estimates that, unconstrained, Anthropic might be at one hundred fifty to two hundred billion in run rate revenue, putting the implied multiple closer to five times. He also points out that Claude Opus now generates roughly seventy percent fewer tokens for the same question than it used to. Token quantity correlates with answer quality, and Anthropic is rate limiting and shrinking outputs to ration capacity across its user base.
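
    The multiple arithmetic above is easy to check by hand. The sketch below is only an illustration: it takes the valuation and ARR figures quoted in this summary at face value, and the function and variable names are hypothetical.

```python
# Figures quoted in the summary, in billions of US dollars.
VALUATION_B = 900.0
REPORTED_ARR_B = 50.0
UNCONSTRAINED_ARR_RANGE_B = (150.0, 200.0)


def revenue_multiple(valuation_b: float, arr_b: float) -> float:
    """Return valuation divided by annual recurring revenue."""
    if arr_b <= 0:
        raise ValueError("ARR must be positive")
    return valuation_b / arr_b


reported = revenue_multiple(VALUATION_B, REPORTED_ARR_B)
low_arr, high_arr = UNCONSTRAINED_ARR_RANGE_B
# The lower ARR estimate produces the higher multiple, and vice versa.
adjusted_high = revenue_multiple(VALUATION_B, low_arr)
adjusted_low = revenue_multiple(VALUATION_B, high_arr)

print(f"Reported multiple: {reported:.1f}x")
print(f"Adjusted multiple: {adjusted_low:.1f}x to {adjusted_high:.1f}x")
```

    On the quoted numbers the reported multiple is 18 times ARR, and the unconstrained range works out to between 4.5 and 6 times, which is where the "closer to five times" shorthand comes from.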

    Anthropic and OpenAI are also structurally very different. Anthropic has burned around eighty percent less cash than OpenAI to reach a comparable revenue scale. That implies very different long term returns on invested capital, though OpenAI has done a better job locking in compute and Sarah Friar is one of the most exceptional CFOs Gavin has worked with.

    Why neither lab is raising at a three trillion dollar valuation

    The answer Gavin gives is that both labs are deliberately leaving valuation on the table the way Elon has done for two decades. SpaceX compounded at a low thirties percent annual rate for a decade because Elon never pushed price. The result is a permanent superpower: access to capital. Investors trust him because they have made money with him for twenty years. That is a moat that compounds with every round.

    Anthropic could probably raise at a one hundred percent premium to its rumored latest mark. They are choosing not to. In an uncertain world (Ukraine, Russia, Iran, Taiwan), preserving the ability to raise more capital later at fair prices is more valuable than maximizing this round.

    Watts and wafers, the two real constraints

    Capitalism is solving the watts problem. The leading PE infrastructure investors now say zoning and political approval, not chips or energy, are the gating factors. Companies are deferring big capex announcements until after the US midterms. Turbine capacity is being doubled at the manufacturers. Companies like Boom Supersonic are repurposing jet engines for grid use. Watts probably ease meaningfully in 2027 and 2028 and then orbital compute does the rest.

    Wafers are the harder problem because they live in Taiwan, run on handshakes, and depend on a corporate culture that does not respond to public market incentives. TSMC is essentially the GDP, water consumption, and electricity consumption of Taiwan. Its leadership treats the company as the legacy of Morris Chang. The Silicon Shield doctrine is real and internal.

    Orbital compute as racks in space

    The biggest mental update Gavin asks listeners to make is to stop picturing data centers in space as Pentagon sized space stations. A Blackwell rack is three thousand pounds and roughly the size of a refrigerator. SpaceX has shown a concept satellite of about that size. Solar wings extend five hundred feet to each side and the radiator extends hundreds of feet behind, both possible because the orbit is sun synchronous and the orientation is fixed relative to the sun.

    SpaceX engineers Gavin has spoken to at Starbase express genuine confidence that they have solved cooling at these power levels. They have. Starlink V3 satellites already operate at twenty kilowatts. A Blackwell rack is one hundred kilowatts. The same company operates the world’s largest satellite fleet and the world’s largest data center on Earth via xAI Colossus. The racks are connected to each other with lasers traveling through vacuum, technology already deployed in every Starlink. The naysayers, Gavin observes, are armchair skeptics and Larry Ellison’s response (he is out there landing rockets, no one else is) is the right frame.

    Terafab in Texas and the threat to TSMC’s discipline

    Terafab, the SpaceX and Tesla joint venture, intends to be the largest fab in the world. The partnership with Intel grants access to fifty years of foundry institutional knowledge, allowing Terafab to start three to five quarters behind the leading node rather than fifteen years behind. The A teams at the semicap equipment companies (ASML, KLA, Lam Research, Applied Materials) will follow Elon’s reputation in hardware engineering the same way they followed TSMC twenty years ago when Intel stumbled.

    The talent strategy is the part most observers underestimate. Recruit the best engineers globally, then import their families, their restaurants, their staff. Build Taiwan Town, Japan Town, and Korea Town next to the fab. Optimize the human experience for the people whose work matters. Intel and Samsung do not think that way.

    Bubble watch and the year 2000 comparison

    Every foundational technology in modern history has had a bubble. Railroads, canals, the internet. Carlota Perez documented why. Markets correctly identify the importance, diversity of opinion collapses, supply gets ahead of demand, the bubble crashes. The current cycle has two important differences. The buildout is overwhelmingly funded out of operating cash flow, not debt. Every GPU is running at one hundred percent utilization, while at the peak of the fiber bubble ninety nine percent of fiber was unused.

    TSMC discipline is the single largest reason a bubble has not formed. If Jensen could buy everything TSMC could theoretically make, Nvidia could sell two to three trillion dollars of GPUs in 2026 and 2027. At some point that becomes more than the market can absorb. If Intel or Samsung Foundry catches up at the leading node, the other will too. TSMC’s pricing discipline collapses and the bubble starts.

    The Pareto frontier and the loss of Google’s cost advantage

    The most important chart in AI is the Pareto frontier of model intelligence versus per token cost. Nine months ago, Google’s TPU based models dominated every point on it. OpenAI, Anthropic, and xAI sat inside the frontier. Today the frontier is dominated by Anthropic and OpenAI, with Grok 4.3 on the frontier and Gemini 3.1 hanging on by subsidization more than economics. The most likely cause is Google’s conservative TPU V8 design, an attempt to reduce dependence on Broadcom and Nvidia that sacrificed per token economics.

    The bitter lesson, frontier tokens, and continual learning

    Three open questions dominate AI investing. The first is whether Richard Sutton’s bitter lesson (more compute beats human algorithmic cleverness) gets violated by ASI itself optimizing for efficiency. Closer observers of AI are more skeptical of a violation. Gavin thinks ASI’s first move will be to make itself more efficient and more resourced, which is technically a temporary violation.

    The second is whether frontier tokens keep capturing the overwhelming share of economic value at the model layer. Today they do, surprisingly. Gemini 3.1 Pro was mindblowing nine months ago and is intolerable today. The third is when continual learning arrives. Today’s models need a million fire touches to learn what a human learns from one. True continual learning would mean dynamic weight updates in real time and would produce a fast takeoff.

    From all you can eat to usage based AI pricing

    AI is shifting from flat fee plans to usage based pricing. The historical analogy is cellular and long distance. Both stopped being great growth industries when they went all you can eat. AI just made the opposite move. The consequence is that flat fee subscribers, even on premium consumer plans, get a rate limited and token throttled version of the frontier model. Enterprise plans with usage based billing are now required to evaluate true capability. Gavin thinks the combination of new compute coming online and usage based pricing is what gets OpenAI and Anthropic past two hundred billion in combined ARR this year.

    Chip startups, prefill decode disaggregation, and Cerebras

    Trying to build a better GPU is the wrong move. The four scaled players (Nvidia, AMD, Trainium, TPU) have copy capability for any one to three percent share design that looks attractive. The good news for startups is that disaggregated inference (separating prefill and decode) opens a richer design canvas. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently. Andrew Fox’s analogy is a British naval ship of the eighteenth century. Prefill is loading the cannon. Decode is firing it.

    Cerebras is the model. Wafer scale computing is genuinely different and genuinely hard. It took three generations of chips to get right. Andrew Feldman and his team had the grit to keep going through chip one being a failure. The design has a high ratio of on chip compute and memory relative to shoreline IO, which is why Cerebras is now experimenting with putting an optical wafer on top of the compute wafer to solve scale out.

    GPU useful lives and the rescue of private credit

    One of the strongest claims in the conversation is that disaggregated inference will stretch GPU useful lives to ten or fifteen years. The skeptical narrative (GPUs are obsolete in two years, companies are cooking their depreciation books) is wrong. You can put a Cerebras system or Groq LPU in front of older Hopper or Ampere parts, use them only for prefill, and run them until they physically melt. Private credit, which is in pain from SaaS loans and which underwrote GPU loans on three to four year lives, may be saved by this.

    If GPU financing rates can come down from low sevens to five or six percent, the mathematics of the AI buildout improves materially. That is a structural tailwind that compounds for years.
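
    To see why longer useful lives and lower rates compound, here is a rough sketch using the standard level payment (annuity) formula. Everything specific in it is an assumption for illustration: the GPU cost is normalized to 100, the rates and lives are picked from inside the ranges mentioned above, and real GPU financing has fees, residual values, and covenants that this ignores.

```python
def annual_payment(principal: float, rate: float, years: int) -> float:
    """Level annual payment that repays `principal` over `years` at `rate`."""
    if years <= 0:
        raise ValueError("years must be positive")
    if rate == 0:
        return principal / years
    return principal * rate / (1 - (1 + rate) ** -years)


# Hypothetical inputs: GPU cost normalized to 100; rates and lives chosen
# from inside the ranges quoted in the text, not taken from any real loan.
GPU_COST = 100.0
short_life = annual_payment(GPU_COST, rate=0.072, years=4)
long_life = annual_payment(GPU_COST, rate=0.055, years=10)

print(f"4 year life at 7.2%:  {short_life:.2f} per year")
print(f"10 year life at 5.5%: {long_life:.2f} per year")
print(f"Annual cost ratio:    {long_life / short_life:.2f}")
```

    Under these assumed inputs most of the drop in annual cost comes from spreading repayment over more years, with the lower rate adding to it. That is the sense in which a longer useful life changes the math of the buildout.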

    The application layer, the token path, and a new prisoner’s dilemma

    Trillions of dollars of value have been destroyed at the application layer, not created. Cursor and Cognition are the rare scaled exceptions, and they got there by focusing on coding very early. As Amjad Masad noted, coding is plausibly the shortest path to ASI because a coding agent can write itself into any new domain. Jamin Ball’s frame is that the new venture filter is whether the company is in the token path. Databricks is. Most application layer startups are not.

    Jensen could probably get close to the frontier with Nemotron whenever he wants, and the strategic question of whether to do that is a new prisoner’s dilemma. If every frontier lab agrees not to release best models via API, Chinese open source falls steadily behind. If anyone defects, the defector gains revenue and resources, and everyone else has to defect. The same dynamic exists between TSMC, Intel, and Samsung. If Nvidia or AMD ever truly used an alternative foundry, that foundry would catch up rapidly.
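
    The defection logic can be made concrete with a toy two player game. The payoff numbers below are invented purely to reproduce the ordering described above (defecting alone beats mutual restraint, which beats mutual defection, which beats being the only one to hold back); they are not estimates of anything real, and the strategy names are hypothetical labels.

```python
from itertools import product

STRATEGIES = ("withhold", "release")

# Hypothetical payoffs as (lab_a, lab_b). "withhold" means keeping the best
# model off the API; "release" means defecting and selling it.
PAYOFFS = {
    ("withhold", "withhold"): (3, 3),
    ("withhold", "release"): (0, 5),
    ("release", "withhold"): (5, 0),
    ("release", "release"): (1, 1),
}


def is_nash(a: str, b: str) -> bool:
    """True if neither lab can gain by unilaterally switching strategy."""
    pay_a, pay_b = PAYOFFS[(a, b)]
    a_cannot_improve = all(PAYOFFS[(alt, b)][0] <= pay_a for alt in STRATEGIES)
    b_cannot_improve = all(PAYOFFS[(a, alt)][1] <= pay_b for alt in STRATEGIES)
    return a_cannot_improve and b_cannot_improve


equilibria = [pair for pair in product(STRATEGIES, repeat=2) if is_nash(*pair)]
print(equilibria)  # [('release', 'release')]
```

    With this payoff ordering the only stable outcome is that both labs release, even though both would do better if both held back. That is the structure of a prisoner’s dilemma.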

    Rating the hyperscalers

    Google has the largest compute installed base, the YouTube data that matters in a robotics world, and a search business that prints. Their loss of TPU cost leadership is the surprise of the year. If Google I/O in five days does not produce a leapfrog model, the Nvidia centric narrative gets even stronger.

    Meta deserves real credit. Zuckerberg made Meta AI first internally faster than any other internet giant, paid up for the talent contracts when no one else would, and shipped Musa as a first model from MSL that is close to the Pareto frontier. Amazon is well positioned on Trainium, robotics in retail, and a Nova model line that is better than it gets credit for. Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Copilot rather than reselling to OpenAI is courageous and probably correct, even at the cost of stock price.

    The most interesting cross hyperscaler metric is startup engagement. Nvidia and Amazon engage deeply with startups. Google is next. Broadcom is the favored ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement, which Gavin believes will cost them as the best teams now sit at startups.

    Personal safety, geopolitics, and the Pax Americana case

    The closing section turns darker. Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion via something that looks exactly like your child calling on FaceTime is already feasible. Political violence against AI leaders is a real concern. Geopolitically, Ukraine is winning largely because it has the best battlefield AI outside America and Israel. How adversaries respond to that asymmetry is the next great variable.

    Gavin’s optimistic frame is the Pax Americana. After 1945 the US had a nuclear monopoly and could have controlled the world. Instead it rebuilt Germany and Japan, both of which became the most reliable American allies for the next eighty years. If AI dominance plays out similarly, this is a generationally positive story rather than a destabilizing one. The personal anecdote that closes the conversation is a friend whose daughter was diagnosed with a rare genetic condition. He spun up agents, identified a drug already on the market that addresses her mutation, and her life is immeasurably different because of AI. That is the upside.

    Thoughts

    The Anthropic eleven billion in a month framing is the kind of stat that resets priors. The right way to interpret it is not as a one off but as a measure of how fast value can compound when the underlying technology improves on a curve steeper than the ability of the rest of the economy to absorb it. The skeptical question is whether that ARR is durable or whether it is heavily tied to a customer base of other AI companies that are themselves on a single venture funded year of runway. The bullish answer is that frontier coding, frontier research, and frontier enterprise tasks are not going to stop being valuable, and Anthropic is the best at all three. Both can be true. The number is still extraordinary.

    The argument that TSMC discipline is the only thing preventing a bubble is the analytically tightest part of the conversation. The implied trade is to watch TSMC capacity additions like a hawk and to be more, not less, cautious if Intel Foundry or Samsung Foundry ever announce real share at the leading node. The Terafab thesis is more speculative but more interesting. If Elon’s talent recruiting playbook works and the Intel partnership gives Terafab a real seat at the table within five years, the geometry of the global semiconductor industry shifts in a way that is bullish for American manufacturing, bullish for power and water infrastructure in Texas, and ambiguous for TSMC itself.

    The Pareto frontier discussion deserves more attention than it usually gets. Pricing leadership in AI is not a vanity metric. It determines who can subsidize free tier usage, who can absorb compute shortages, who can ship cheaper enterprise plans, and ultimately whose model becomes the default for any given workload. Google losing per token leadership in nine months is one of the most under analyzed events in the sector and it explains a lot about why Anthropic and OpenAI are growing the way they are. If Google I/O does not produce a leapfrog model, the implied verdict on TPU V8 design choices gets a lot harsher.

    The application layer destruction point is worth sitting with. Founders building on top of frontier models are competing in a world where the model itself moves faster than any moat they can build, where the model lab can absorb their niche if it gets interesting, and where the only protection is either deep token path integration or a niche so small the lab does not bother. That is a much harsher venture environment than the early SaaS era. The compensating opportunity is that one human can now run a hundred agents, so the ceiling on what a small team can build is correspondingly higher. The bet is that productivity per founder rises faster than competitive pressure from the labs. We will find out.

    The orbital compute pitch is the section that will polarize listeners. The naive read is that this is science fiction. The closer read is that every component (sun synchronous orbit, laser interconnect, twenty kilowatt satellite buses, ten thousand satellite manufacturing cadence, full rocket reusability) already exists. The remaining engineering problems are repair, maintenance, and radiator scale, all of which are real but tractable on a five to ten year horizon. The strategic implication is that the political and zoning ceiling on terrestrial data centers becomes less binding if orbital compute is a credible alternative for inference workloads. The investor implication is that being short the watts and cooling complex on a five year horizon is a real trade, not a meme.

    Watch the full conversation here.