PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: codex

  • Thomas Laffont of Coatue on the $4 Trillion AI IPO Wave: SpaceX, Anthropic, OpenAI, and Why the New Unicorn Economy Is Healthier

    Thomas Laffont, co-founder of the $55 billion hedge fund Coatue Management, made his All-In Podcast premiere with a data-dense walk through what he calls a once-in-a-generation moment for the unicorn economy. In front of Chamath Palihapitiya, Jason Calacanis, David Sacks, and David Friedberg, he argued that a roughly $4 trillion wave of private value is about to hit the public markets, led by SpaceX, Anthropic, and OpenAI, and that the new AI-driven unicorn economy is actually healthier than the one that came before it. You can watch the full presentation and Q&A on YouTube.

    TLDW

    Laffont presents Coatue’s slide deck on the state of the unicorn economy and argues it has rebalanced after the excesses of 2021. The average unicorn is up about 70 percent since September 2024, AI keeps taking a bigger share of all fundraising, and the model has shifted from many small unicorns to fewer companies each raising far more, with funding per unicorn up roughly 5x since 2021. He introduces a “Magnificent 8” private index (SpaceX, Stripe, Anthropic, Databricks, Revolut, ByteDance, Anduril, and more) worth nearly $4 trillion that has crushed the public Mag 7, then shows that exits are finally thawing as SpaceX heads to an IPO in weeks and Anthropic confidentially files its S1. He lays out Coatue’s “CODE” framework for why SpaceX gets more valuable the more it launches, a counterintuitive finding that the odds of a 10x actually rise as companies get bigger (31 percent for $100 billion-plus centicorns), the explosive revenue ramp of OpenAI and Anthropic past Workday, ServiceNow, Adobe, Salesforce, and now the hyperscalers, a three-pillar map of where AI revenue comes from (consumer, ads, enterprise), and the AI memory thesis. The Q&A with Chamath and Calacanis digs into the power law, K-shaped outcomes, whether these valuations are disconnected from reality, the public market as the great antiseptic, and what happens when trillions in private value finally recycles back through GPs and LPs.

    Thoughts

    The most useful idea in the talk is not the $4 trillion headline, it is the cohort-health chart. Laffont splits unicorns into eras and shows that the pre-2021 cohort was healthy, roughly 80 percent had raised again or exited 20 quarters after minting, while the giant 2021 ZIRP cohort of 479 companies is stuck with under 20 percent doing either. That single comparison reframes the whole AI boom. The bullish read is that the 2024 AI cohort is small, concentrated, and cash-generative, so it looks more like the healthy pre-ZIRP group than the 2021 hangover. The bearish read is that we are watching the same movie with bigger numbers, and the test only comes when these companies face public markets. Laffont is honest that we do not yet know which cohort the AI class resembles, and that intellectual humility is what makes the deck credible rather than promotional.

    The SpaceX “CODE” framework is the sharpest analytical move of the presentation. Most people would assume a launch business gets cheaper per launch as it scales. Laffont shows the opposite, the market pays more per launch as cadence rises, and explains it as a phase change in business quality: from one-time government launch revenue, to a single recurring-revenue constellation, to multiple constellations, to a platform with optional upside in space data centers, the moon, and Mars. It is a clean way to think about any company that climbs from a project business to a platform business, and it applies far beyond rockets. The lesson for investors is that valuation can rationally expand even as unit economics look like they should compress, because the nature of the revenue underneath is changing.

    The counterintuitive 10x odds finding deserves more attention than it got in the room. Conventional wisdom says the bigger you are, the harder it is to grow, so a $100 billion company should be less likely to 10x than a $10 billion one. Coatue’s data says the reverse: centicorns have a 31 percent shot at a 10x, far higher than the 8 percent a unicorn has at becoming a decacorn. Laffont’s explanation is a filtering mechanism, every step up validates a compounding advantage and durability of earnings, so survivors are increasingly the kind of business that keeps compounding. This is essentially a quantitative restatement of quality investing, and it is the intellectual backbone of the LP strategy the besties tease out, just buy whoever reaches $100 billion and hold.

    Where the argument gets genuinely contested is valuation, and the panel does not let it slide. The pushback that “these are not fake companies” is true and important, OpenAI and Anthropic are growing faster than any software company in history, and Anthropic reportedly had a profitable month. But growth and reality do not settle the question of price when you are paying 50 to 100 times revenue for trillion-dollar private companies, as Bill Ackman pointed out earlier in the day. Laffont’s answer is the most grounded thing he says all session: the public market is the great antiseptic, it will not care about anyone’s slide deck, and he wants to see these names withstand short sellers and skeptics. That is the right posture. The deck is a thesis, not a verdict, and the verdict arrives roughly six months and one day after the IPOs, once passive flows and supply have washed through.

    The closing thread, that almost every sector is being transformed at once and we still do not have superintelligence, is the part worth sitting with. The risk in a presentation this bullish is treating the trend as destiny. The value is in the framing tools Laffont hands you, cohort health, phase-change business quality, the filtering odds, the three revenue pillars, and the antiseptic of public scrutiny. Use those to interrogate each name rather than to buy the index on faith, and the talk earns its premiere billing.

    Key Takeaways

    • Coatue Management is one of the most successful hedge funds of the last two decades with about $55 billion under management, and is raising roughly another billion dollars specifically to invest in AI.
    • The unicorn economy is up about 70 percent on average since September 2024, and the public market has made a similar move up over the same period.
    • The unicorn economy’s share of the NASDAQ rose significantly after 2015 but has plateaued in recent years, reflecting strong performance from public companies.
    • AI keeps increasing its wallet share of all venture fundraising, multiple years in a row now.
    • The composition of funding has changed. The unicorn “factory” peaked in the ZIRP era of 2021 and has normalized at a much lower level since.
    • Funding per unicorn has increased roughly 5x since 2021. There are fewer unicorns, and each one is raising more.
    • Cohort health, pre-ZIRP group: of about 73 unicorns, 20 quarters after minting roughly 80 percent had either raised a new round or exited, which is healthy.
    • Cohort health, 2021 group: of about 479 unicorns, 20 quarters in, fewer than 20 percent had exited or raised again. Far larger cohort, far worse outcomes.
    • The open question is which cohort the new 2024 AI cohort will resemble.
    • Funding is concentrating: the top 10 companies capture a large share, and it is a small number of AI companies, not all of them, with Anthropic and OpenAI raising massive rounds.
    • Laffont proposes a “Magnificent 8” private index: SpaceX, Stripe, Anthropic, Databricks, Revolut, ByteDance, Anduril, and more, spanning internet, AI, fintech, and space tech.
    • That private index represents almost $4 trillion of value and has crushed the traditional public Mag 7, with almost every name outperforming.
    • Exits are thawing. 2026 is on a good trend for cash returned versus consumed, not quite 2021 levels, with half a year still to go.
    • That trend does not yet include three imminent liquidity events: SpaceX (IPO expected in weeks) and Anthropic (confidentially filed its S1), whose combined value could exceed the prior decade of exits combined.
    • The ecosystem is far more balanced than when Laffont first presented at the 2024 All-In Summit, when it was consuming much more cash than it returned.
    • OpenAI and Anthropic revenue growth is unlike anything previously seen. Starting from January 2025, they passed Workday, then ServiceNow, then Adobe, then Salesforce, and are now bigger than Google Cloud and Azure.
    • On current forecasts, that revenue could pass AWS by the end of the year and exceed all of Microsoft by 2028.
    • Hyperscalers are not sitting still. The largest companies in the world are funding the disruption, investing unprecedented sums to enable the ChatGPT moment.
    • The SpaceX “CODE” framework: the number one driver correlated to SpaceX’s valuation is cadence of launches, and valuation per launch rises as launches increase.
    • Why per-launch value rises: business quality improves through phases, pre-constellation (one-time government revenue), initial ramp (one recurring-revenue constellation), scale (multiple constellations), and platform (space data centers, moon and Mars optionality).
    • Anthropic in particular is scaling like no company seen across the PC, internet, or mobile eras.
    • Counterintuitive 10x odds: a unicorn has about an 8 percent chance of becoming a decacorn, a decacorn has 8 to 13 percent odds of reaching $100 billion, but a centicorn ($100 billion-plus) has a 31 percent chance of a 10x.
    • Value creation has accelerated. It typically takes years to go from $500 billion to $1 trillion in market cap, yet recently three companies did it in one year and two did it in a matter of weeks.
    • Cerebras is the counterexample of slow success: years of dark periods and no new capital developing its technology, then a massive OpenAI contract that quintupled the company’s value ahead of its IPO.
    • Semiconductors are on a generational run, with the sector dramatically outperforming the index since the 2024 All-In Summit.
    • AI memory thesis: the more an AI system knows about you, the more useful it is, so memory per user could quintuple, which helps explain recent moves in memory companies.
    • Where the revenue is: the AI ecosystem is roughly $140 billion today, about $300 billion this year, and is expected to double in 2027.
    • Three revenue pillars: consumer (subscribers times ARPU), ads (about a quarter of Meta and Google ads are AI-enabled today, heading toward 100 percent and roughly $150 billion), and enterprise (tools like Claude Code and Codex inside businesses).
    • Disruption is hitting every sector: software, telco (Starlink-powered global phone calls), semis, energy (data centers reshaping Pennsylvania’s grid), auto (Ferrari’s electric and autonomous stumble), and consumer (GLP-1s reshaping food, alcohol, and wellness).
    • Final takeaways: the new unicorn economy is healthier thanks to AI, winners are compounding faster so the cost of not owning a winner is higher than ever, disruption is everywhere, and we do not even have superintelligence yet.
    • In the Q&A, both Anthropic and OpenAI publicly say they want to be public, and big outcomes now look likely to become liquid within roughly a 12-month window.
    • The valuation pushback: these are not fake companies, they generate substantial revenue at scale and grow faster than anything before, and Anthropic reportedly even had a profitable month.
    • The public market is framed as the great equalizer and antiseptic, but with passive buying the true price discovery may not land on day one, more like six months and a day after listing.
    • A floated LP strategy: wait for whoever reaches $100 billion and concentrate capital there as the least brittle, quickest-return bet, tempered by the warning that valuations are disconnecting from any historical metric (50x to 100x revenue).
    • An open risk: with so much capital, OpenAI and Anthropic could rationally start a price war, the way ride-sharing and food-delivery players once did, though heavy infrastructure spend complicates it.

    Detailed Summary

    The unicorn economy has rebalanced after 2021

    Laffont opens by reframing a market many assume is frothy. The average unicorn is up about 70 percent since September 2024, and the public market has tracked a similar climb, so private and public value are moving together rather than diverging. The unicorn economy’s share of the NASDAQ rose sharply after 2015 and then plateaued, which he reads as a sign of how strong public companies have become. Underneath the headline, the structure of funding has changed. The 2021 ZIRP era was a unicorn factory that minted enormous numbers of companies, and that machine has since normalized to a much lower level. The result is a barbell: fewer new unicorns, but each raising far more, with funding per unicorn up roughly 5x since 2021. AI sits at the center of this, taking a steadily larger share of all venture dollars for several years running.

    Cohort health is the real story

    The deck’s most important slide measures the health of the ecosystem by cohort. The pre-ZIRP cohort, about 73 unicorns, looks healthy: 20 quarters after becoming unicorns, roughly 80 percent had either raised a new round or exited. The 2021 cohort tells the opposite story. It is enormous, about 479 unicorns, and 20 quarters in, fewer than 20 percent had raised again or exited. That contrast sets up the central question of the talk. A new 2024 cohort of AI companies is forming, and no one yet knows whether it will resemble the healthy pre-ZIRP group or the bloated, stuck 2021 group. Laffont’s framing leans optimistic because the AI cohort is small and concentrated, but he is careful not to declare the answer.

    The Magnificent 8 and a $4 trillion private index

    Funding is not just flowing to AI, it is flowing to a handful of AI names, with the top 10 capturing a large share and Anthropic and OpenAI raising the biggest rounds. From this concentration Laffont builds a private index he half-jokingly calls the Magnificent 8, a number he expects to shrink as companies go public. The members span sectors: SpaceX, Stripe, Anthropic, Databricks, Revolut, ByteDance, and Anduril, covering internet, AI, fintech, and space tech. He says he would be comfortable owning that index for the next decade-plus. Collectively it represents almost $4 trillion of value and has outperformed the public Mag 7, with nearly every constituent beating that benchmark.

    Exits are thawing and a wall of liquidity is coming

    One of Laffont’s recurring concerns at past summits has been balance: the unicorn economy is great at consuming cash, but a healthy ecosystem must also return it. On that score 2026 is trending well, not quite 2021, but solid with half a year left. Crucially, that figure does not yet include three imminent events. SpaceX is expected to go public within weeks, and Anthropic confidentially filed its S1 the day of the talk. Adding those up, just a few companies could deliver more liquidity than the prior ten years combined. The takeaway is that the ecosystem that was dangerously out of balance in 2024 is now meaningfully more balanced, and improving.

    The revenue ramp past the hyperscalers

    The growth rates of OpenAI and Anthropic, Laffont argues, are unlike anything previously seen. Charting from January 2025, the leading AI labs passed Workday, then ServiceNow, then Adobe by year end, then Salesforce by January, and are now bigger than Google Cloud and Azure. On forecast, that revenue could surpass AWS by the end of the year and exceed all of Microsoft by 2028. He stresses that the hyperscalers are not passive bystanders, they are actively funding the disruption, pouring unprecedented capital into enabling the change that began with the ChatGPT moment.

    The SpaceX CODE framework

    Laffont devotes real time to how Coatue thinks about SpaceX. The single factor most correlated with SpaceX’s valuation is cadence of launches, which is intuitive for a launch business. The surprise is that valuation per launch has risen rather than fallen as cadence climbed. His explanation, the CODE framework, is that the quality of the business model improves the more SpaceX launches. In phase one, pre-constellation, you are simply proving rockets, with a few government customers and lumpy, unpredictable one-time revenue. In the initial ramp you stand up a constellation, which is an end market and a recurring-revenue business that grows with every satellite and subscriber. At scale you operate multiple constellations, and Laffont expects companies, governments, and militaries to want to own their own. Ultimately it becomes a platform, with new businesses layered on top, from space data centers to the optionality of the moon and Mars.

    Counterintuitive odds and the speed of value creation

    Coatue bucketed companies and asked the odds of a 10x within each. A unicorn has roughly an 8 percent chance of becoming a decacorn. A decacorn has 8 to 13 percent odds of reaching $100 billion. But a centicorn, $100 billion or more, has a 31 percent chance of a 10x, counting both public and private companies. The bigger you are, the better your odds, which inverts intuition. Laffont pairs this with the sheer speed of recent value creation. Going from $500 billion to $1 trillion in market cap normally takes years, yet three companies did it in a single year and two did it in a matter of weeks. He also offers Cerebras as the patient counterexample, a chip company that endured years of dark periods and no new capital before a massive OpenAI contract quintupled its value ahead of IPO, part of a broader generational run for semiconductors.

    AI memory and where the revenue actually comes from

    A throughline from the day’s other speakers is that the more an AI knows about you, the more useful it is, from your restaurant preferences to your work context. Laffont turns that into a thesis: memory per user could quintuple based on what these systems require, which helps explain recent moves in memory companies. He then tackles the most contested question, where is the revenue. He sizes the AI ecosystem at about $140 billion today, roughly $300 billion this year, and doubling in 2027, built on three pillars. Consumer is subscribers times ARPU. Ads are the pillar people forget, with about a quarter of Meta and Google ads already AI-enabled and penetration heading toward 100 percent, a roughly $150 billion opportunity. Enterprise is the breakthrough category, exemplified by tools like Claude Code and Codex operating inside businesses.

    Every sector is being transformed at once

    What makes this era different, Laffont says, is that nearly every sector is being transformed simultaneously. Software is obvious, but look at telco, where he believes Starlink will soon power a device that lets you make a phone call anywhere on earth, attacking the global telco and broadband profit pool with a better product. Compute is driving massive change in semis, data centers are reshaping the energy equation in places like Pennsylvania, and the auto business is being upended, as Ferrari’s stumble introducing electric and autonomous technology showed. In consumer, GLP-1 drugs are profoundly changing consumption of food and alcohol and the broader focus on wellness. His takeaways close the loop: the new unicorn economy is healthier thanks to AI, winners are compounding faster so the cost of missing them is higher than ever, disruption is everywhere, and superintelligence has not even arrived yet.

    The Q&A: power law, valuation, and the public market test

    Chamath and Jason Calacanis press Laffont on what this means for allocators. The recurring theme is the power law and K-shaped outcomes, with gains consolidating into a small number of companies. The positive side, Laffont notes, is that outcomes are enormous and increasingly liquid within a 12-month window, and both Anthropic and OpenAI say they want to be public. The hard part is valuation. The besties cite Bill Ackman’s framing that investors are making venture bets on trillion-dollar companies at 50 to 100 times revenue. Laffont’s pushback is that these are not fake companies, they generate substantial revenue at scale and grow faster than anything before, and Anthropic reportedly had a profitable month. But he embraces the discipline ahead: the public market is the great antiseptic and will not care about anyone’s presentation, though with heavy passive buying, true price discovery may take roughly six months and a day rather than landing on day one. Asked whether the compounding is a market inefficiency or survivor bias, he declines to over-read a small sample, noting that Anthropic before Claude Code was a completely different company than after. The conversation closes on what happens when trillions recycle from GPs to LPs, the case for simply owning whoever crosses $100 billion, the risk of everyone crowding into three names, and the possibility of an eventual OpenAI versus Anthropic price war.

    Notable Quotes

    “So we have fewer unicorns that are each raising more.”

    Thomas Laffont, summarizing how funding per unicorn has risen roughly 5x since 2021

    “The reason is that the quality of SpaceX’s business model increases the more you launch.”

    Thomas Laffont, explaining the CODE framework and why valuation per launch rises with cadence

    “The winners are compounding faster than ever, which means the costs of not being in a winner are higher than ever.”

    Thomas Laffont, on the central risk of a power-law market

    “And by the way, we don’t even have super intelligence yet.”

    Thomas Laffont, closing his takeaways on how early the transformation still is

    “These are companies generating substantial revenue at scale that are growing faster than anything we’ve ever seen.”

    Thomas Laffont, pushing back on the idea that AI valuations rest on fake companies

    “It will be the great antiseptic. It will not care about my presentation.”

    Thomas Laffont, on the public market as the ultimate test for SpaceX, OpenAI, and Anthropic

    “Anthropic pre-cloud code was a completely different company than post cloud code.”

    Thomas Laffont, on why he won’t over-read a small sample of hyper-compounders

    “The power law rules our lives. All the great gains are being consolidated into small numbers of companies.”

    An All-In host, framing the Q&A on concentration in private markets

    This is a curated set of highlights. To hear the full presentation, the slide walkthrough, and the complete Q&A with Chamath and Jason Calacanis, watch the full conversation here.

    Related Reading

    • Coatue Management. Primary source for Thomas Laffont’s firm and the technology investing strategy behind the deck.
    • The All-In Podcast. The show and summit where Laffont made this premiere presentation.
    • Power law (Wikipedia). Background on the distribution Laffont and the hosts say governs venture and public-market returns.
    • The Magnificent Seven (Wikipedia). The public-market benchmark Laffont’s private “Magnificent 8” index is measured against.
    • Cerebras Systems. The AI chipmaker Laffont cites as the slow-grind IPO that was eventually transformed by a major OpenAI contract.
  • The AI Industrial Revolution: Naval, Guillermo Rauch, Blake Scholl, and Max Hodak on Software Factories, Vibe Coding Hardware, AI Regulation, Healthcare Economics, and What Humans Can Uniquely Do

    This is the full episode of Naval Ravikant’s conversation with three frontier founders: Guillermo Rauch of Vercel, Blake Scholl of Boom Supersonic, and Max Hodak of Science. The premise is that all three are building their own factories rather than assembling off-the-shelf parts, so the interesting question is not what they are building but what they are learning about how to build in the age of AI. Over roughly an hour the discussion moves from software factories and the thousand-x engineer into hardware, regulation, healthcare economics, autonomous companies, and a long closing argument about what humans can still uniquely do. Watch the full conversation on the Naval Podcast YouTube channel. We previously published two segments of this same discussion: part one, Waste Tokens to Save Time, on software factories and whether pure software is dead, and part two, Vibe Coding Hardware, on jet engines, vertical integration, and China’s open-source bet. This post covers the entire episode end to end.

    TLDW

    Four builders argue that AI has turned the engineer’s job from shipping output into building the factory that produces output, which is why token leaderboards are the new vanity metric and why you should waste tokens to save time. Guillermo Rauch frames the thousand-x engineer and the building-block economy, and asks whether pure software is dead now that models speak English. Blake Scholl shows how Boom turned hardware engineering into software, letting two engineers design an entire jet engine and collapsing months of regulatory compliance documentation into minutes. Max Hodak makes the case for extreme vertical integration, a captive MEMS foundry, and a sober counter to Silicon Valley deregulation triumphalism: the bottleneck is the voters and the regulator’s asymmetric incentives, not just bad rules. The group works through healthcare as a fixed-bucket non-market, China’s cost-reduction strategy and its approved implantable brain interface, autonomous software that runs site reliability and security research with thousands of concurrent agents, a company-wide hackathon where the receptionist shipped a real automation, and a long debate on creativity, out-of-distribution surprise, intent, attribution, and the definition of art. The throughline: humans become verifiers, value moves to creativity, taste, and agency, and the single best move is to get extremely good with the tools, because it is people with AI versus people without AI.

    Thoughts

    The strongest idea in the episode is the quiet redefinition of what an engineer is for. Rauch’s point is that you no longer judge a person by how well they ship a single output. You judge them by whether they can build the factory that produces outputs B through Z. That reframe instantly explains why token leaderboards are nonsense. Counting tokens consumed is the same category error as counting lines of code written, a measure of motion mistaken for a measure of progress. Naval’s “waste tokens, save time” is the correct response: tokens are cheaper than people, so optimize for your own wall-clock time and the final output, and throw three models at the same problem if that gets you unstuck faster. The uncomfortable corollary, which the group says out loud, is that leverage in idea domains was never linear. The hundred-x and thousand-x engineer is not a new phenomenon. AI just made it impossible to keep pretending otherwise.

    The second thread that ties the whole hour together is verification. Everyone converges on the same future: humans stop producing the work directly and move up the stack to signing off on it. Rauch is precise about what that means. Saying “I understand this pull request” no longer requires reading every line. It requires being able to say you wrote the test harness, the proofs, the type checkers, and the simulations that let you stand behind it in production. That is a profound shift, because it accepts that the code may be spaghetti you do not fully understand while insisting that the evaluator around it is trustworthy. Blake extends the same logic to regulation, and this is the most underrated argument in the episode. If you treat a 200-page lightning-strike compliance document as a test suite and a regulation as an exit criterion for an agent loop, then a body of rules you once resented becomes a guard rail that lets you move faster, not slower. The cost of change collapses, change aversion drops, and you can finally afford to iterate on physical things.

    Max Hodak is the adult in the room on regulation, and the episode is better for it. The Silicon Valley consensus is that regulation is simply friction to be deleted, and there is plenty of dysfunction to point at: the NRC permitting essentially zero nuclear plants for decades, the FDA’s asymmetric incentives where approving a bad drug ends a career but blocking a good one costs nothing visible. But Hodak keeps pulling the conversation back to the harder truth. This is where the voters are. If you removed the current regulatory package, something very similar would get voted right back in, because the asymmetry reflects how the public actually weighs a visible death against an invisible delay. Real reform is not “deregulate,” it is narrow and surgical: prohibit the FDA from drawing adverse inferences across different users of a compound, build innovation zones where people consent to different rules, or copy Europe’s notified-body model so review capacity can actually scale. That is a far more serious position than the usual abundance-or-bust framing.

    The healthcare segment is the part of this conversation you will not find in the two clips, and it is the most heterodox. Hodak’s diagnosis is that healthcare is a fixed bucket of money that grows with tax receipts, not a technological growth industry where falling prices expand the market the way phones and laptops did. Because there is no real private market, you get a small communist society running inside a larger capitalist one, with the waiting lines and frozen product quality that implies. His prescription is not single payer and not insurance reform. It is to drive the cost of bringing devices and drugs to market so low that a patient can buy a restored sense or an extra decade of life on a credit card, the way they finance a car, and his warning is that China’s lower approval costs and its already-approved implantable brain interface put it on track to do exactly that. Whether or not you buy the twenty-percent-of-income deductible he floats, the framing that a private market is the missing feedback loop is the kind of argument that gets too little airtime.

    The closing debate on creativity is where the four of them disagree most productively, and they are careful enough to notice that their conclusions follow from their definitions. Hodak defines art as meaningful out-of-distribution behavior, which lets a military maneuver or a math proof count, and leads him to think a sufficiently capable model gets there too. Naval defines art as conveying an emotion with intent, which makes attribution load-bearing: the same photo down to the last pixel means more when a human took it, and a startup doing hardware attestation of human authorship suddenly has a real market. The shared observation that should worry every builder is that AI output collapses to a distribution mean. Every Claude-built website ends up the same serif font, the same brown and cream, the same monospace spacing, recognizable as slop precisely because it is in-distribution. The optimistic read, and the one Naval lands the episode on, is that this leaves an enormous and durable lane for humans who can step outside the system, and that the practical move for everyone is simply to become excellent with the tools, because the real divide is people with AI versus people without.

    Key Takeaways

    • The job of an engineer has shifted from shipping a single output to building the factory that produces multiplicative outputs, so people are now judged on the leverage they create rather than the work they personally do.
    • There were always 10x engineers, and in idea, intellectual, and digital domains the real spread is 100x or 1000x. AI leverage just made that gap impossible to deny.
    • Token leaderboards and token consumption are the new lines-of-code: a measure of activity that does not map to value. Measure your own time and the final output instead.
    • Waste tokens to save time. Models are still far cheaper than a human, so throwing Codex, Claude, and Gemini at the same problem repeatedly is rational even when it looks wasteful.
    • Low-quality first-pass code is fine because you can spend more tokens later to harden it for production. The constraint is verifiable domains, not code quality.
    • A model is roughly as good as you are in a domain. The quality of your prompting and reprompting strongly determines the output, though this dependence should fade as models improve.
    • Models graduated from junior to principal engineers: they now return with multiple routes and tradeoffs rather than running away with the first idea, even if their time and cost estimates are often wrong.
    • A junior gets knowledge they could never have produced alone, but an experienced architect still extracts far more juice. Taste and judgment, like picking Postgres versus ClickHouse, remain the human’s edge.
    • Pure software’s moat is in question now that models speak fuzzy, sloppy English. For hardware founders this is a boon, since good software finally becomes cheap to produce.
    • The building-block economy, from Mitchell Hashimoto, argues agents need powerful reusable infrastructure rather than reinventing queues and databases every time. Shared dependencies are a cooperation value, like everyone depending on the same Postgres version.
    • Naval and Max both stopped writing code for years, then started building software they use daily through agents, on the strength of understanding how the pieces fit rather than syntax.
    • With agents you stop getting stuck on narrow debugging problems that used to consume indefinite time. The intrinsic frustration that was once “how you learn” is largely gone.
    • Boom turned siloed hardware engineering, much of it trapped in Excel and VBScript with no source control, into real software with automated testing and repeatable flows.
    • Software engineers now build the architectures and hardware engineers vibe code their pieces, letting two engineers design an entire jet engine where a single turbine-blade analysis once took one engineer a full day across a thousand blades.
    • Enterprise collaboration software and even spreadsheets are getting cooked, because you can now code the exact custom tool you need instead of approximating it.
    • AI will soon generate step files and PCB layouts, bringing the current software boom to mechanical and electrical engineering, likely within the year.
    • China is betting on open-source models because its hardware and supply-chain superiority pairs with on-demand software generation to erase Silicon Valley’s software advantage. Fall behind on generating software and you fall behind on generating everything.
    • In real usage, frontier intelligence dominates the top. Gemini “slaps at scale” as an industrial production model for support and browser automation, while Chinese models are not in the frontier coding tier.
    • Intelligence is an unalloyed good. Because mistakes are invisible and models are cheaper than people, you reach for the smartest available model rather than running a weaker one many times.
    • Max’s vertical integration thesis: when you cannot buy a part, you make it. Science owns a captive MEMS foundry because tighter integration toward a single block of bonded matter yields lower power, smaller size, and longer life.
    • AI’s biggest near-term impact inside hardware companies is regulatory: generating documentation and tracing which of thousands of ISO standards apply, work that used to occupy a quality team for months.
    • Junior engineers got promoted to senior and junior engineering got handed to agents. The same pattern hits law, where basic NDAs and red lines no longer require a lawyer.
    • Humans are becoming verifiers. Signing off on a PR means standing behind its consequences via tests, proofs, and type checkers, not reading every line. Creating software is easy; keeping it secure, tested, and maintained 1000 days out is the real question.
    • A RAG over regulatory documents collapses a 200-page compliance test plan from months to minutes, which cuts change aversion: you can alter the airplane and regenerate compliance instead of crying over rework.
    • Regulations can act as a test suite and exit criteria for agent loops, as long as they are non-contradictory and reasonable. The alternative is shipping slop directly into the air.
    • Physical building is guilty until proven innocent, illustrated by the absurdity of pre-filing a driving plan before every trip. The fix is more enforcement-based regulation rather than pre-approval, though agents on both sides could trigger a red queen race and DDoS overwhelmed agencies.
    • Regulation often fails to make things safer, only slower: the 737 Max shipped a single sensor with full authority over pitch, and the NRC kept us perfectly safe by approving almost no nuclear plants for decades.
    • The deeper problem is the voters and the regulator’s asymmetric incentives. Approve a bad thing and your career ends; block a good thing and nobody notices. Removing one agency just elects its replacement.
    • Targeted fixes beat blanket deregulation: bar adverse inferences across users of a compound, use single-patient IND pathways, create opt-in innovation and YIMBY zones, or adopt Europe’s competitive notified-body reviewers.
    • Healthcare is a fixed bucket of money tied to tax receipts, not a growth industry, so spending 10x more on it would be a catastrophe rather than a triumph. With no private market you run a small communist society inside a capitalist one.
    • The escape is lower cost-to-market, not single payer, so people can finance care like a car. China’s lower approval costs and its already-approved implantable BCI point that direction. LASIK, dental, and plastic surgery advance because patients pay directly.
    • End-of-one medicine works at the high end, as with GitLab’s Sid Sijbrandij outliving his cancer prognosis through a self-built escalation ladder, but it demands enormous agency at the patient’s weakest moment. AI should democratize that knowledge.
    • Vercel automated much of site reliability engineering: anomalies fire alerts, an agent investigates, can open an incident, and begins remediation, stopping just short of changing production itself.
    • Running an open-sourced security tool against the whole monorepo with 10,000 concurrent agents produced several quarters of security research in a couple of days for about $14,000 in tokens. Code translation and optimization are similarly autonomous now.
    • Blake stopped all project work for a week and had everyone, receptionist to engineers, build something with AI and demo it. He expected mostly silly projects and got mostly needle movers, including a real automation from shipping and receiving.
    • The autonomous company of the future may have a workforce that trains the agents doing the work rather than doing it directly, with tooling that extracts reusable skills from your inputs and outputs.
    • Returns are shifting from intelligence toward agency for humans, since agents supply the intelligence. The people best fit for the future open a coding agent and ask what to build instead of defaulting to passive consumption.
    • Maybe 10x more people are coding than a year ago, yet around 99% still never will, because to a non-coder the starting step remains unimaginable. Vibe coding is described as more addictive and entertaining than video games, with real output.
    • AI video lacks taste and judgment for now, but by 2030 expect fan-made films: dozens of Lord of the Rings takes, or generating unmade seasons of The Expanse from the books. The bigger prize is a genuinely new imaginative work, not a remix.
    • What humans uniquely do is generate meaningful surprise out of the training distribution, with intent that makes it mean something. Gödel stepping outside the formal system is the archetype; Claude’s identical-looking websites are the counterexample of in-distribution slop.
    • Higher productivity historically means you hire more, not fewer, of the productive people. Expect a larger number of smaller teams, an entrepreneurship explosion, and generalists winning as credentials matter less than creativity, taste, and judgment.
    • The throughline is people with AI versus people without AI. The single best investment right now is getting genuinely good with the tools and learning the exact edges of what they can and cannot do.

    Detailed Summary

    Software Factories and the Thousand-X Engineer

    Guillermo Rauch opens with the idea that has him “pilled”: the engineer’s job has changed from shipping output directly to building the factory that produces multiplicative outputs. That reframes how you evaluate people and surfaces an old, controversial truth. He used to get flamed on Twitter for asserting 10x engineers, since it offends an equality instinct, but in intellectual and digital domains the real spread is 100x or 1000x, and choosing the right thing to work on is an infinite multiplier on top. AI leverage makes this less controversial, except that people now confuse token spend for productivity. The group agrees token leaderboards are the new lines-of-code. Max Hodak adds that a model is about as good as you are in a domain, so a capable developer gets a powerful collaborator while a junior gets junior-grade help, and the sporadic feedback you give, the reprompting, disproportionately determines the result. Naval’s posture is the opposite of fussy: he ignored every prompt-engineering trick on the bet that the models would improve faster than he could learn to game them, types less and less, and brute-forces problems by throwing multiple models at them. Waste tokens, save time, because tokens are cheaper than people.

    Is Pure Software Dead, and the Building-Block Economy

    Rauch describes models crossing from junior to principal engineer: they now return with several routes and explicit tradeoffs, push back when you try to jam high-cardinality telemetry into Postgres, and suggest ClickHouse or Athena instead. That elevates taste and judgment as the human contribution. He then poses the hard question: is pure software engineering obsolete now that models speak fuzzy, sloppy English and you no longer need code to communicate with them? For hardware founders it is a boon, echoing Patrick Collison’s line that software is art and artists are hard to hire. To temper the “agents reinvent everything” fantasy, he invokes Mitchell Hashimoto’s building-block economy: you do not want your agent rebuilding a queue from first principles every time it sends an email, and shared dependencies like a common Postgres version carry real cooperation value. Reusable infrastructure becomes more valuable in the agentic era, functioning like libraries and dependencies, or even a token cache, so models fork from existing starting points instead of burning a trillion tokens to recreate what exists. Naval and Max both note they had not written code in years and now build daily through agents, because understanding how APIs, data flow, and performance fit together matters more than syntax, and vibe coding is just transmitting intent the way a good engineering leader already did through people.

    Vibe Coding Hardware at Boom Supersonic

    Blake Scholl explains how AI changed the role of software and hardware developers at Boom. A great deal of hardware engineering lives in complex Excel spreadsheets and VBScript on individual laptops, with no source control and no automated testing, and handoffs happen manually over email like it is the 1990s. Boom had long tried to turn these flows into real software but could never afford enough software engineers. The new model is that software engineers create the architectures, because they understand systems, algorithms, and separation of concerns, and hardware engineers vibe code their own pieces. The result is mind-blowing productivity for small teams. His example: a turbine blade is cold at rest and expands when hot, so you must design both the cold and hot shapes and convert between structures and aerodynamics, work that took one engineer a full day per blade across a thousand blades in a jet. With a combined software-and-hardware tool you can now change blade geometry and see structural and aerodynamic results in real time, letting two engineers design an entire jet engine. The group extends this to the death of enterprise collaboration software and even spreadsheets, since you can now code the exact custom tool you need, and predicts AI will soon generate step files and PCB layouts, carrying the boom into mechanical and electrical engineering.

    China, Open Source, and Which Models Actually Get Used

    Naval argues China is going all-in on open-source models because its hardware and supply-chain superiority pairs naturally with on-demand software generation, which erases Silicon Valley’s software edge, and because the Chinese government has a history of funding ecosystem-wide efforts in network-effect businesses. Without frontier coding models there is no self-improvement, so a country that cannot generate frontier software falls behind on generating everything downstream. He notes the irony that almost all the open-source heft now comes from China, since OpenAI is not open, Grok and Google’s local models trail, and Anthropic ships no open models. On real usage, Rauch reports from Vercel’s AI gateway that frontier intelligence dominates the top, with a caveat: frontier intelligence at the right cost and performance, like Gemini, slaps at scale and is the best industrial production model for support and browser automation, while Chinese models are not in the frontier coding tier. Naval frames intelligence as an unalloyed good, since model mistakes are invisible and a smarter model is still cheaper than a person, which pushes everyone toward the most intelligent option and risks an oligopoly in AI.

    Vertical Integration, Verifiers, and the Slop Problem

    Max Hodak lays out Science’s vertical integration: the preference is always to buy, as with cheap PCBs from Asia, but when components do not exist you must make them, and the closer a product gets to a single block of covalently bonded matter the better it performs. Science owns a captive MEMS foundry on the east coast because there was no other way to do the packaging and assembly it needed. He notes AI’s most surprising internal impact so far is regulatory: generating documentation and tracing which of thousands of ISO standards apply, work that once tied up a quality team for months. Rauch raises the slop problem: mountains of AI-generated code arriving as pull requests nobody can read line by line. His standard is that an engineer must be able to say they understand and will stand behind the consequences of a PR, backed by the test harness, proofs, and type checkers, even without reading it all. Naval generalizes this into humans becoming verifiers, with lawyers, engineers, and operators moving to verifying the stack and standing behind it, and Rauch warns that creating software is the easy zero-to-one part while keeping it secure, tested, performant, and maintained a thousand days later is the real test.

    Regulation as Test Suite, and the Voter Problem

    Blake describes building a RAG that compresses a 200-page lightning-strike compliance test plan from months of a “monkey at keyboard” engineer’s work into minutes, with a powerful second-order effect: change the airplane and you regenerate compliance in minutes instead of crying over months of rework, which slashes change aversion and lets a small number of creative engineers iterate. Max reframes regulations as potentially good guard rails, a test suite and exit criteria for agent loops, provided they are non-contradictory and reasonable, since the alternative is shipping slop into the air. Naval warns of a red queen race of agent-on-agent compliance and agencies getting DDoSed by clever entrepreneurs flooding them with documents. Blake pushes for enforcement-based rather than pre-approval regulation, using the analogy that we would never tolerate filing a driving plan before every trip, yet that is exactly how physical infrastructure works: guilty until proven innocent. He cites the 737 Max’s single all-authority sensor and the NRC permitting almost no nuclear plants for decades as proof that this makes us slower, not safer. Hodak supplies the counterweight: the deeper issue is the voters and the regulator’s asymmetric incentives, where approving a bad thing ends a career and blocking a good thing goes unnoticed. Remove an agency and the electorate installs its twin. Naval and Max agree the real reforms are narrow, including innovation zones, opt-in YIMBY zones, and the experimental laboratory of fifty states.

    Drug Discovery, Healthcare Economics, and End-of-One Medicine

    Hodak explains why innovation zones do not solve drug discovery. The right-to-try act and single-patient IND already exist, and the FDA approves over 99% of such requests, sometimes by phone, but dosing requires clinical-grade drug that only the IP owner has, and the FDA will draw an adverse inference against the whole program if a very sick patient does worse. A targeted fix is to prohibit adverse inferences across different users of a compound. He points to Europe’s notified-body system, private certifiers blessed by governments, as a way to scale review capacity, and to China’s CFDA, which already approved an implantable brain-computer interface and brings products to market far cheaper. His core economic argument is that healthcare is a fixed bucket of money that grows only with tax receipts, unlike phones and laptops where falling prices expanded the market, so spending 10x more on healthcare would be a catastrophe rather than the triumph that 10x AI spending would be. With no private market you run a small communist society inside a capitalist one, with the lines and frozen quality that implies. The way out is lower cost-to-market so patients can finance care like a car, which is the direction China is pushing. Naval’s twist is a healthcare plan where the first 20% of income is the deductible to recreate a private market, citing LASIK, dental, and plastic surgery as fields that advance because patients pay directly. The group closes the segment on GitLab’s Sid Sijbrandij, who outlived a rare-cancer prognosis by building his own escalation ladder of drugs, noting that end-of-one medicine works at the high end but demands enormous agency exactly when a patient is weakest, which is where AI should democratize access to knowledge.

    Autonomous Software, Hackathons, and the Autonomous Company

    Asked how much autonomous software they run, Rauch describes Vercel automating much of site reliability engineering: instead of hand-set alarm thresholds, anomalies in error rate, latency, or throughput fire an alert, an agent investigates, can open an incident that loops in people, and begins remediation, stopping just short of changing production. Vercel also runs autonomous optimization and security research, and an open-sourced security tool run against the entire monorepo with 10,000 concurrent agents produced several quarters of security research in a couple of days for about $14,000 in tokens, the equivalent of months of red teaming. Max shares a vibe-coded bug-reporting queue where TestFlight users submit logs and screenshots, a daemon analyzes and fixes issues in the background, and ships him a build to try, raising the prospect of apps effectively built by their users, with the caveat that you would get a Homer Simpson car of every feature. Blake recounts stopping all project work for a week and requiring everyone, from the receptionist to the engineers, to build something with AI and demo it. He expected mostly silly projects and got mostly needle movers, including a genuinely useful automation from the shipping and receiving associate, concluding that most people have an idea worth building but cannot tell a good first idea from a bad one until they can iterate on a real thing. Rauch extends this to a workforce that trains the agents doing the work rather than doing it directly, and a coming feature to extract reusable skills from your inputs and outputs.

    Creativity, Out-of-Distribution Surprise, and What Humans Can Uniquely Do

    On the intelligence-versus-agency split, Max suggests returns to humans tilt toward agency since agents supply intelligence, while Naval counters that you stay 99% intelligence and 1% agency because the agents exercise the agency for you. They agree the humans best suited to the future are the agentic ones who open a coding agent and ask what to build. Coding has perhaps 10x more participants than a year ago, yet roughly 99% still never will, because the first step is unimaginable to a non-coder, even as vibe coding proves more addictive and entertaining than video games while producing something real. On AI video, the group notes it still lacks taste and judgment, but expects fan-made films by 2030, dozens of Lord of the Rings takes or generated seasons of The Expanse, while prizing a genuinely new imaginative work over a remix. The long closing debate turns on definitions. Hodak defines art as meaningful out-of-distribution behavior, broad enough to include a military maneuver, and expects models to reach it. Naval defines art as conveying emotion with intent, which makes attribution decisive: the same photo means more taken by a human, and a hardware-attestation startup gains a real use case. They cite Gödel stepping outside the formal system as the human archetype and the identical look of every Claude-built website as in-distribution slop. Naval lands the episode on optimism: productivity gains mean hiring more, not fewer, of the creative and AI-fluent, the future is a larger number of smaller teams and an entrepreneurship explosion where generalists thrive and credentials fade, and the single best move is to get extremely good with the tools, because it is people with AI versus people without AI.

    Notable Quotes

    “Now clearly there’s 100x or a thousandx engineers and the world hasn’t fully adjusted to this.”

    Guillermo Rauch, on why AI made the spread between engineers impossible to ignore

    “Just waste tokens, save time. Don’t look at the tokens either as inputs or outputs. Just look at your time and look at the final output.”

    Naval Ravikant, on the right way to measure AI’s return

    “We had to learn code to communicate with the models. Now the models speak English and they speak fuzzy sloppy English like a human and they understand things.”

    Guillermo Rauch, asking whether pure software engineering is now obsolete

    “It allows two engineers to design an entire jet engine, which is just wildly different.”

    Blake Scholl, on Boom turning hardware engineering into software

    “You need to be able to say I am signing off on understanding the consequences of this PR.”

    Guillermo Rauch, on what it means to stand behind code you did not read line by line

    “That is absolutely the way we build physical infrastructure in this country. It’s guilty until proven innocent. And what we should actually do is make more of these things enforcement based rather than pre-approval based.”

    Blake Scholl, comparing the permitting process to filing a driving plan before every trip

    “You’re basically running a small communist society inside a larger capitalist society. And that’s what we’re doing in healthcare.”

    Max Hodak, on why there is no real private market in healthcare

    “I expected we would get a large number of silly projects and a small number of needle movers. And what we got was a large number of needle movers and a very small number of silly projects.”

    Blake Scholl, on the week he had the whole company build with AI

    “If a person takes the photo versus AI generates the exact same photo down to the last pixel, the person taking the photo will have more meaning for me.”

    Naval Ravikant, on why intent and attribution make something art

    “It’s about people with AI versus people without AI. And so the single best thing you can be doing right now for yourself is just getting really good with these tools.”

    Naval Ravikant, closing the conversation on the only divide that matters

    Watch the full conversation here: The AI Industrial Revolution on the Naval Podcast YouTube channel.

    Related Reading

    • Part one: Waste Tokens to Save Time, our writeup of the first segment, on software factories, the thousand-x engineer, token leaderboards, and whether pure software is dead.
    • Part two: Vibe Coding Hardware, our writeup of the second segment, on AI-designed jet engines, vertical integration, China’s open-source bet, and humans as verifiers.
    • Naval Ravikant’s official site, the canonical home for Naval’s essays and podcast on technology, judgment, and leverage.
    • Boom Supersonic, Blake Scholl’s company building supersonic aircraft and its own jet engines, source of the turbine-blade and two-engineers example.
    • Science Corporation, Max Hodak’s brain-computer interface company, whose captive MEMS foundry and FDA arguments anchor the hardware and healthcare segments.
    • Vercel, Guillermo Rauch’s company, whose AI gateway data and autonomous SRE work inform the usage and automation discussion.
  • Waste Tokens to Save Time: Naval, Guillermo Rauch, Blake Scholl, and Max Hodak on AI Software Factories, 1000x Engineers, and Whether Pure Software Is Dead

    Naval Ravikant gathers three frontier founders, Guillermo Rauch of Vercel, Blake Scholl of Boom Supersonic, and Max Hodak of Science, for a freewheeling conversation about how AI coding tools are reshaping what an engineer is, what software is worth, and where the moat goes when models speak English. The headline idea comes from Naval himself: waste tokens, save time. Stop measuring AI by tokens consumed or lines of code generated and start measuring it by the final output and the time you got back. The full conversation is on the Naval Podcast YouTube channel. This is part one of the discussion. Part two, on vibe coding hardware, follows the same group into jet engines, semiconductors, and biotech. You can also watch and read the full episode here.

    TLDW

    The job of an engineer is shifting from shipping output to building the factory that ships the output, which means 10x engineers were never really 10x, they were always 100x or 1000x in idea domains, and AI leverage is making that obvious. Models now reflect back the judgment of the user, so a senior architect extracts dramatically more value than a junior, although the junior also writes code they could never have written alone. The frontier models have quietly graduated from junior coders to principal engineers, returning with intuitive plans and real tradeoffs (sometimes with hilariously bad time estimates) rather than just running away with the prompt. Naval has stopped learning prompt tricks, scaffolding tools, and Claude plan-mode rituals entirely. Instead he throws Codex, Claude, and Gemini at the same problem in parallel and brute forces his way through, because tokens are still cheaper than a human and the models keep getting better faster than tricks can. That leads to the bigger question on the table: is pure software still investable, or is it now just a free byproduct of hardware, models, and taste? The group lands on the block economy thesis (a tip of the hat to Mitchell Hashimoto): agents do not want to reinvent Postgres or BMQ on the fly, they want to grab the right reusable building block, so infrastructure software actually gets more valuable, not less. Max Hodak closes the loop with a personal data point: he has not written a line of code in years and has built more software since December than ever before, all through agents, because just understanding APIs, data flow, and performance is what actually moves the work forward.

    Thoughts

    The “waste tokens, save time” line is the most important rhetorical move in this conversation, and it deserves to be unpacked beyond the soundbite. Naval is implicitly arguing that the entire token-economics debate (input cost, output cost, leaderboards, model arbitrage) is a category error in the same way that lines-of-code was a category error in the nineties. The thing being purchased is not tokens. It is a finished result delivered with less of your finite attention spent. If three parallel runs of Codex, Claude, and Gemini cost you a few dollars and one of them lands the answer in twenty minutes instead of you sweating the problem for two hours, the unit economics are not even close. The only people who care about the token bill are people who have not internalized that human time is the actually scarce resource. Once you do internalize it, the question is no longer “how do I prompt this more efficiently,” it is “how do I get out of my own way.”

    The 100x and 1000x engineer point is the one most likely to enrage commenters, and it is also the one most worth taking seriously. Naval is right that the egalitarian flinch in software circles always sat awkwardly next to the empirical fact that one Carmack, one Brendan Eich, or one Satoshi creates more durable value than every mid-tier engineer on earth combined. What AI does is collapse the bottom of that distribution. The marginal junior engineer at a typical company is now competing with a model that costs a few dollars an hour and never sleeps. The remaining premium for human engineers is taste, judgment, and the rare ability to pick the right thing to build at all, which Naval correctly flags as the multiplier that dwarfs raw coding speed. “Just one who had a better judgment on what to work on in the first place” is the most underrated line in the whole episode.

    Guillermo Rauch’s observation that the models have graduated from running away with your prompt to returning with three routes and a tradeoff matrix is the technical update most people have not actually felt yet. There was a real, qualitative shift when the model started saying “we don’t put high-cardinality telemetry into Postgres, you probably want ClickHouse or Athena.” That is not autocomplete. That is a peer. And the funny corollary, that the same model will then confidently tell you the work will take three weeks when it will take three hours, is not a knock on the model. It is a reminder that calibration is a separate skill from competence, and humans get this wrong constantly too. The right posture is to treat the model the way a good engineering manager treats a strong but cocky senior: take the architecture suggestions seriously, throw out the estimates.

    The block-economy thread, riffing on Mitchell Hashimoto, is where this conversation quietly answers Naval’s “is pure software dead” question. Agents are insatiable consumers of reusable building blocks because reinventing infrastructure on every run is wasteful, brittle, and incompatible with the rest of the world. If your service is the canonical primitive an agent reaches for (the queue, the database, the auth layer, the deploy target), you are not commoditized by AI, you are amplified by it. Pure software is not dead. Pure software with no distribution, no defensibility, and no integration into the agent toolchain is dead. That is a much less catchy headline, but it is the real one. The takeaway for founders is not to abandon software, it is to ask whether your software is something an agent will reach for ten thousand times a day or something a human had to be talked into using once.

    Max Hodak’s confession (no code written in years, more shipped software in the last six months than ever before) is the empirical proof that this is not just theory. The skill that ports forward is not syntax. It is the engineering leader’s instinct for what an API is, how data flows, where performance matters, and what level of expectation to set. Guillermo’s framing of “vibe coding through people on Slack” as the original form of vibe coding is genuinely insightful. A good engineering manager has always been transmitting intent to other minds and letting them run. Doing it with agents is the same skill, just with a faster, cheaper, more literal counterparty. The engineers who will struggle in this transition are the ones whose identity was tied to writing the code themselves. The ones who will thrive are the ones who already thought of themselves as taste, judgment, and intent, with code as an implementation detail.

    Key Takeaways

    • The engineer’s job has shifted from shipping output B to building the factory that produces outputs B through Z. You are now judged on the multiplicative system you create, not the single artifact you deliver.
    • 10x engineers were always a misnomer. In idea-domains and digital domains, the real distribution has always been 100x or 1000x. AI just made that obvious enough that arguing about it is no longer fashionable.
    • Token consumption leaderboards are the new lines-of-code metric: a vanity number that measures activity, not value. Tokens are an input, your time is the constraint.
    • Naval’s core rule: waste tokens, save time. Tokens are still vastly cheaper than human hours, no matter how the pricing scares you.
    • Models tend to be about as good as you are in a given domain. The feedback you give them, the corrections, the redirections, sporadically but powerfully shapes the quality of the output.
    • The quality of your reprompting matters enormously today, but will probably matter less over time as models get smarter and need less hand-holding.
    • Naval has refused to learn prompt scaffolding, plan-mode tricks, or named prompt frameworks. His bet is that the models will figure out how to use him faster than he can figure out how to use them.
    • His preferred technique: throw Codex, Claude, and Gemini at the same problem in parallel and brute force the answer. Time is the cost center, not API spend.
    • Lower quality first-draft code is not a blocker. When it is time to ship, throw more tokens at it for a hardening pass. Quality compounds across model generations.
    • Verifiable domains (problems with a clear right answer) are the ones the models will fully solve. Cutting-edge creativity work, the Terence Tao tier, still needs careful human collaboration.
    • Models have qualitatively shifted from “next-token autocomplete that runs away with your prompt” to “intuitive planning mode” where they return with multiple routes and explicit tradeoffs.
    • This is why people on social media say models are now PhD-level. It is not the raw output, it is the back-and-forth posture.
    • Models will confidently make terrible time estimates (“this is a three week project”). Treat them like a strong but miscalibrated senior engineer: trust the architecture, ignore the schedule.
    • Architect-level engineers are extracting much more value per session than junior engineers, but juniors are still leveling up because they can now write code far above their unaided ability.
    • The next career step for a junior engineer is moving from implementing features to picking technologies. Postgres vs ClickHouse, ZMQ vs other queues. The model can suggest, but a human still has to decide.
    • Taste and judgment remain the residual human advantage. Models will give you good tradeoffs if you ask, but knowing which tradeoff to take is still on you.
    • Concrete example: a recent model pushed back when asked to store high-cardinality telemetry in Postgres and recommended ClickHouse or Athena instead. Unprompted architectural judgment.
    • Humans are still completing the model for tasks like fetching API keys, moving capital, or performing real-world actions. That gap is temporary.
    • Every SaaS and hosting company will soon expose a CLI or API surface that agents can drive directly. Anything Unix-shaped and text-based, agents can already hack into a usable API themselves.
    • The missing piece for full autonomy is payments. Crypto, Bitcoin, or any programmable money lets the agent buy what it needs without a human in the loop.
    • The open question Naval poses: is pure software dead? We used to learn code to talk to machines. Now machines speak fuzzy, sloppy English back to us.
    • For hardware founders, AI is a massive boon. Software, which was always hard to hire artists for (per Patrick Collison’s “software is art” framing), is suddenly fast and cheap to produce alongside the hardware.
    • Model training, post-training, and fine-tuning may be the new “real software engineering” for those who want to work at the model layer.
    • Mitchell Hashimoto’s “block economy” thesis: agents need powerful, reusable, well-known building blocks. They should not reinvent message queues or databases every run.
    • Reinventing primitives is bad civic engineering. The value of “we both depend on Postgres 13.2” is interoperability with the rest of society and toolchain.
    • Infrastructure software and reusable libraries are getting more valuable, not less, in the agentic era. Vercel’s bet is on being the layer agents reach for.
    • Useful metaphor: building blocks are like a token cache. Why churn through a trillion tokens to reproduce code that already exists when you can fork from a known starting point?
    • Max Hodak has not written a line of code in years but has shipped a huge volume of personal software since December, all through agents. Projects he had fantasized about for years are now actually running.
    • What still matters from a real software background: understanding what an API is, how data flows, performance expectations, and how to set the right level of demand on an operation.
    • A proficient engineering leader has always been “vibe coding through people” on Slack and in one-on-ones, transmitting intent and letting others execute. Doing it with agents is the same skill, faster and cheaper.
    • Naval personally went from twenty years of not coding to coding constantly through agents, leaning on first-principles software engineering and algorithms knowledge.
    • The friction that historically killed personal coding projects (latest framework, infra plumbing, deploy setup) is now mostly handled by the agent. Vercel makes it easier, agents make it trivial.
    • The single biggest change Max highlights: you do not get stuck anymore. The indefinite debugging spiral on some narrow obscure bug is largely gone.
    • The old mantra that learning to program means accepting intrinsic frustration (“nope, that’s part of the deal”) is no longer true. The frustration was incidental, not essential.
    • The frontier founder pattern on display in this episode: all three guests build their own factories (Vercel’s AI cloud, Boom’s supersonic jets and engines, Science’s biohybrid brain interface) rather than composing from off-the-shelf parts.

    Detailed Summary

    The Software Factory and the Hundredfold Engineer

    Guillermo Rauch opens the substantive portion of the conversation with the framing he has been pushing publicly: the role of the engineer is moving from “ship output B” to “build the factory that ships outputs B through Z.” That reframes engineering judgment. You are no longer evaluated on the single deliverable, you are evaluated on the multiplicative system you put in place. Naval picks up the thread and points out that this also retires an old debate. Engineers used to argue about whether 10x engineers existed, with the egalitarian camp insisting that talent differences were marginal. The truth, Naval says, was always more extreme. In idea-domains, virtual domains, and intellectual domains, the distribution has always been 100x or 1000x, not 10x. Brendan Eich, Carmack, Satoshi, the canonical names, were thousandx programmers. AI has made the underlying distribution legible. And the multiplier on top of all of that is judgment: picking the right thing to work on in the first place is an infinity multiplier compared to picking the wrong thing, regardless of raw skill.

    Token Leaderboards Are the New Lines of Code

    Guillermo flags the current cultural confusion: people see their AI bills, see the token counts, and assume they should be optimizing for tokens-per-engineer or similar metrics. Max Hodak’s response cuts through it. Token consumption, like lines of code before it, is not a meaningful productivity metric. It is an activity metric, and activity metrics always mislead. Max adds his own field observation: the models tend to be roughly as good as you are in a given domain. A senior developer extracts genuinely powerful output, a junior gets junior-quality output back, because the feedback loop (the corrections, the redirections, the architectural pushback) is what shapes quality. The sporadic but high-leverage moments where the user redirects the model are doing more work than the prompt itself.

    Naval’s Brute Force Doctrine: Waste Tokens, Save Time

    Naval lays out his personal posture, which has become the title of the conversation. He has deliberately ignored all the prompting tricks, scaffolding tools, named prompt frameworks (“use Ralph Wigum, use OpenClaude, use Hermes, use plan mode”), on the bet that the models will figure out how to use him faster than he can figure out how to use them. He is ham-fisted with the models, gets frustrated, types less and less, and just brute forces his way through by running Codex, Claude, and Gemini at the same problem simultaneously. The justification is economic. No matter how expensive the models seem, they are still vastly cheaper than a human hour. Do not measure tokens as inputs or outputs. Measure your time and the final output. Even when the first-draft code is low quality, that is not a blocker. When the moment comes to ship, throw more tokens at it. The models will rewrite it, harden it, and they get better every generation. Naval explicitly excepts cutting-edge creative work (the Terence Tao tier of unsolved problems) where you still need to collaborate carefully and closely. Everywhere else, brute force is the dominant strategy.

    From Junior Coder to Principal Engineer

    Guillermo identifies a qualitative shift that has happened recently. Models used to do the classic next-token thing: take your prompt and run away with it in a direction you may not have wanted. Now they enter an intuitive planning posture without being told to plan. They come back and say “what you are asking has these three routes, here are the tradeoffs.” That, Guillermo argues, is the moment the model stopped being a junior engineer and became a principal engineer. The funny side effect is that they will then return preposterous time estimates (“this will take three weeks”) with full confidence. The conclusion is to treat the model as a peer for architecture and a baby for scheduling. Returning to the Max-vs-junior question, Guillermo argues juniors clearly do level up because they write code well above their solo ability, but architects extract maybe 10x while juniors extract more like 2x. The juice scales with the user’s existing taste.

    Taste, Judgment, and Architectural Decisions

    Max names the residual human contribution: taste and judgment. Picking between Postgres and ClickHouse for high-cardinality telemetry data, picking between ZMQ and another queueing system. The models can recommend, but a human still has to call it. Guillermo offers a recent concrete example where a model pushed back unprompted: when asked to put high-cardinality telemetry into Postgres, the model responded “we don’t put that kind of data into Postgres, you should consider ClickHouse or Athena.” That is the new normal. The peer-level architectural pushback is happening unsolicited, which is genuinely impressive and a real shift from the deferential autocomplete of two years ago.

    When the Human Becomes the Tool

    Guillermo raises the inversion question: at what point does the model stop being the assistant and the human start being the assistant who fetches API keys, moves capital, and performs real-world actions on the model’s behalf? Naval treats it as a temporary aberration. Every serious SaaS and hosting provider will soon expose a CLI or API surface that agents can drive directly. Even when they do not, anything Unix-shaped and text-based can be hacked into an agent-usable interface by the agent itself. The missing piece is payments. Once you insert programmable money (Naval mentions Bitcoin and crypto tokens), the agent can buy what it needs and the human is no longer the bottleneck.

    Is Pure Software Dead?

    Naval poses the biggest strategic question of the episode. If models now speak fuzzy, sloppy English the same way humans do, and the historical reason we learned to code was to talk to machines that did not understand English, is pure software still a viable thing to build a company around? His own framing of the answer: hardware founders win, because the historically hard problem of hiring software artists (per Patrick Collison’s “software is art” line) is now mostly solved by AI. Model builders win, because training, post-training, and fine-tuning may be the new “real software engineering.” But what about classic pure software companies? Naval lets the question hang, and Guillermo picks up the answer through a different door.

    The Block Economy and the Future of Infrastructure Software

    Guillermo cites Mitchell Hashimoto’s recent piece on the block economy (or “building block economy”). The argument: the most valuable thing for agents to have access to is powerful, reusable building blocks. You do not want your agent reinventing a queue system every time it needs to send an email. You want it to grab the right-sized block (BMQ, ClickHouse, whatever) and move on. Reinventing primitives is also a civic problem. The world only works because we all depend on the same Postgres 13.2, the same protocols, the same standard infrastructure. If every agent went off and invented its own bespoke universe, you would lose interoperability. So infrastructure software (which is, by self-admitted bias, what Vercel builds) becomes more valuable in the agentic era, not less. Guillermo extends the metaphor: reusable building blocks are like a token cache. Why burn a trillion tokens reproducing what already exists when the agent can fork from a known starting point? The block economy is the answer to “is pure software dead.” Pure software that becomes the canonical primitive an agent reaches for is more valuable than ever.

    Max Hodak’s Personal Proof: Years Without Code, Tons of Software Shipped

    Max grounds the discussion in his own experience. He learned to program young, got sucked into it in his teens and 20s, knew programming languages deeply. He has not written a line of code in quite a while. And yet since December he has built a huge amount of personal software, including projects he had fantasized about for years and now actually uses every day. He did not write any of it. He cannot imagine going back to writing code by hand. The skill that ports forward is not syntax, it is the understanding of how APIs work, how data flows, what level of performance to expect, and how to orient the model around the right expectations for an operation. Guillermo extends this with the most quotable framing of the episode: a proficient engineering leader has always been “vibe coding through people on Slack and in one-on-ones,” transmitting intent and letting others execute. Agents are the same modality with a faster, cheaper, more literal counterparty.

    Naval’s Return to Coding After Twenty Years

    Naval offers his own parallel. He went from not having written code in twenty years to coding constantly through agents. What carried him back in was first-principles knowledge of software engineering and algorithms, which gets you further than you would think. The reason he had stopped coding in the first place was not lack of ability, it was the friction of keeping up with the latest language, the latest architecture, and the constant infrastructure plumbing required to ship anything. Vercel made it easier. Agents made it trivial. Max closes with the most concrete benefit of all: you do not get stuck anymore. The indefinite debugging spiral on some obscure narrow problem, the thing that historically ate weekends and broke spirits, is largely gone. The old mantra that programming is intrinsically frustrating and that frustration is “part of the deal” turned out to be wrong. The frustration was incidental, not essential.

    Notable Quotes

    “The way that I’m judging you as an engineer is, are you producing the factory that will produce multiplicative outputs B through Z?”

    Guillermo Rauch, reframing what an engineer is actually being measured on in the AI era.

    “When you’re operating in idea domains, intellectual domains, virtual digital domains, it’s not even 10x, it’s 100x or 1000x. It always has been.”

    Naval Ravikant, on why the old 10x engineer debate was always under-stating the real distribution.

    “If you choose the right thing to work on versus the wrong thing to work on, that’s an infinity difference. It could just be one who had a better judgment on what to work on in the first place.”

    Naval Ravikant, on judgment as the multiplier that dwarfs raw skill.

    “I’ll throw Codex, Claude, and Gemini at the same problem over and over and just waste tokens to save time. No matter how expensive these models might seem, they’re still way cheaper than a human.”

    Naval Ravikant, on his brute-force multi-model coding workflow.

    “Just waste tokens, save time. Don’t look at the tokens either as inputs or outputs. Just look at your time and look at the final output.”

    Naval Ravikant, delivering the title thesis of the episode.

    “Clearly the models at some point graduated. They used to be junior engineers, now they’re principal engineers, because they come back to you with a set of tradeoffs.”

    Guillermo Rauch, on the qualitative shift in how current frontier models respond to prompts.

    “Bro, we don’t put that kind of data into Postgres, you should consider ClickHouse or Athena or whatever. That’s happened to me a lot, which is really impressive.”

    Guillermo Rauch, recounting unprompted architectural pushback from a recent model.

    “It’s like saying speaking English. We had to learn code to communicate with the models, now the models speak English. So where’s the moat?”

    Naval Ravikant, raising the central strategic question about the future of pure software.

    “I haven’t written a single line of code in quite a while. Since December, I’ve built a huge amount of software that I now use every day, projects I’ve fantasized about for years.”

    Max Hodak, on what becomes possible when you stop writing code and start directing agents.

    “A proficient engineering leader has been quote unquote vibe coding through people on Slack or one-on-ones, because you’re transmitting your will, your intent, your experience, and you’re letting others run with it. Now we do the same with agents.”

    Guillermo Rauch, reframing leadership itself as the original form of vibe coding.

    Watch the full conversation on the Naval Podcast here.

    Related Reading

    • Full episode: The AI Industrial Revolution, the complete hour-long conversation this clip is drawn from, covering software factories, hardware, regulation, healthcare economics, autonomous companies, and creativity.
    • Part two: Vibe Coding Hardware, the continuation of this conversation, where the same founders move from pure software into AI-designed jet engines, vertical integration, China’s open-source bet, and why humans become verifiers.
    • Naval Ravikant’s official site, the canonical home for Naval’s essays, podcast, and longer-form thinking on technology, judgment, and leverage.
    • Vercel, Guillermo Rauch’s company, building the AI-native cloud and frontend infrastructure that this conversation references as a canonical agent building block.
    • Boom Supersonic, Blake Scholl’s company building supersonic civilian aircraft and their own jet engines, the hardware example of a founder building the whole factory.
    • Science Corporation, Max Hodak’s brain-computer interface company developing the biohybrid neural implant referenced in the intro.
    • Mitchell Hashimoto’s writing, source of the “block economy” framing for why reusable infrastructure building blocks become more valuable, not less, in the agentic era.
  • Dan Shipper’s Most Contrarian AI Predictions for 2026: Why the Job Apocalypse Is a Myth, SaaS Will Boom, PMs and Designers Win, and CLIs Are Already Over

    Dan Shipper, the CEO and founder of Every, returned to Lenny’s Podcast for round two of AI predictions. His last appearance produced one of the most prescient calls of the year: that non-technical people would build serious work inside Claude Code. He was unbelievably right. This conversation is the follow-up, a tour of his most contrarian forecasts for how AI is actually changing the way we work, who wins, who loses, and what almost every commentator is getting wrong about the next twelve to twenty-four months.

    TLDW

    Shipper argues that the AI job apocalypse is a myth, that SaaS is going to boom rather than die, that product managers and full-stack designers are the biggest winners of the agent era, that personal agents inside Codex and Claude Code will quietly replace the browser as the primary work surface, that every company will run a single shared super-agent in Slack instead of a fleet of per-user bots, that the CLI moment is already over, that pull requests are going to flood organizations from non-technical staff, that forward-deployed engineers who garden company agents become the new senior role, that GPT-5.5 still cannot match a real senior engineer on architectural judgment, that AI-generated internal writing is fine and probably better than what most humans produce, that CEOs and middle managers have not adapted yet but soon will be forced to, that the edge of AI lives wherever a curious human is using it rather than in San Francisco, and that the only durable strategy is to ride the models and keep playing with whatever ships next. The whole conversation balances aggressive AI bullishness with an equally strong bet on humans, on creativity, and on the unavoidable need for someone to care for every agent that gets deployed.

    Thoughts

    The most useful frame Shipper gives is that models commoditize yesterday’s human competence. Every time a frontier model crosses a new bar, the work that used to define seniority becomes cheap. The senior engineer who could carry a refactor in their head, the PM who could write a coherent strategy doc, the designer who could ship a polished landing page in a week. That competence is now frozen, codified, and available on tap. The interesting question is not whether models will keep eating tasks. They will. The interesting question is what humans do with the suddenly cheap raw material underneath them. Shipper’s answer is that humans climb the stack: they go up a level, find a new problem worth framing, and use the commoditized competence as feedstock for something that did not exist before. That treadmill is the actual engine of value creation, and it is why he can be simultaneously AI pilled and bullish on hiring.

    His SaaS take is the spiciest call of the episode and probably the most defensible. The crowd consensus is that agents will gut SaaS because an AI can just write the form filler, the dashboard, the workflow. Shipper points out the obvious counterfactual: agents do not reduce the number of people using SaaS, they increase it. A marketing lead who could never touch the data warehouse can now stand up a PostHog query through Codex. A founder who never opened Vanta can run a SOC 2 prep through an agent. The result is more users, more accounts, and a much fatter top of funnel for every horizontal tool. The second-order effect is even more interesting. When the SaaS tool runs inside the user’s agent, the user supplies the tokens. Vendor margins improve, not collapse. If he is right, the next two years are going to be brutal for the SaaS-is-dead thesis pieces and very good for the public software multiples.

    The PM and designer bet is where this gets personal for anyone in product. For a decade the bottleneck in shipping anything was engineering capacity. A PM with spiky product sense had to negotiate their vision through a roadmap, a sprint, a review, and a release. Designers had to convince an engineer that the third state of the empty screen was actually worth building. Both of those constraints are dissolving fast. A PM who can prompt Codex into a working prototype on Friday afternoon, then iterate it live in front of a customer on Monday, is doing the job of a small team. A designer who can ship a fully functional landing page in their own style, without negotiating with anyone, is suddenly the most leveraged person in the company. The scarce skill is no longer execution. It is taste, judgment, and the willingness to decide what is worth building. That has always been the real PM and design job. AI just stripped away the parts that were not.

    The quietest but most important prediction is that agents need humans, permanently. Every benchmark advance reveals a new layer of judgment the model cannot frame on its own. When the agent finishes the task, there is always a senior human who sees the deeper problem the model patched over. Shipper calls this gardening, and it is the basis for the new forward-deployed engineer role. The companies winning right now are the ones that put a real person next to every agent, watching what it does, course-correcting in Slack, and noticing when the output drifts. The dream of autonomous AI workflows is a stage in a journey, not the destination. The destination looks more like a thoughtful operator with a small cluster of agents they trust and constantly tend. That is a much more humane future than the discourse suggests, and it is the one Every is already living.

    The final advice, ride the models, sounds glib but is the single most actionable line in the episode. Most professional anxiety about AI dissolves the moment you actually use the newest model on real work. Most professional advantage accrues to the people who do that one thing consistently. The edge does not live in San Francisco where the labs build the things. It lives wherever a curious human meets a real workflow and discovers something the labs have not noticed. A PM in Iowa willing to try Codex on a Tuesday night can be further ahead than a research engineer who has only used the model on its evals. Pair that with Shipper’s closing motto, do things worth writing about and write things worth reading, and you have a pretty complete operating system for the next two years.

    Key Takeaways

    • The AI job apocalypse narrative is wrong. Models commoditize yesterday’s competence, then humans climb the stack and find new work to do with the cheap raw material.
    • Every has roughly doubled headcount in the last year despite being one of the most AI-forward companies in the world. The lived data point cuts directly against the doom thesis.
    • Shipper’s dual stance: simultaneously extremely AI pilled and very bullish on humans. He treats this as the only intellectually honest position right now.
    • Work will bifurcate. Companies will run one shared super-agent in Slack for everyone, and individuals will run their own personal agent inside Codex or Claude Code on their machine.
    • The personal agent inside Codex effectively becomes the new operating system. Instead of putting AI in the browser, you put a browser inside the AI.
    • The super-agent pattern is already real: Shopify has River, Ramp has its own, and Every runs Claudie inside Slack for internal consulting.
    • SaaS is not dying. Agents increase the user base of SaaS tools because non-technical people can finally drive them. Shipper would buy SaaS stocks today.
    • When SaaS runs inside an agent, the user brings their own tokens. Vendor margins improve because they no longer eat inference costs on every interaction.
    • The CLI era is already over. The magic was never the terminal. It was the AI plus the ability to see what the agent is doing. A good GUI captures the same benefits and more.
    • Pull requests are about to flood every company. Non-engineers can now ship code, run queries, and open tickets. Reviewing the output becomes the new bottleneck.
    • Open-source maintainers are already living in the future. Some receive thousands of agent-generated PRs per day and spin up thousands of Codex instances just to triage them.
    • Forward-deployed engineers are the new senior role. They live in Slack, garden the company’s agents, fix broken flows, and keep non-technical staff from doing damage.
    • Product managers with spiky product sense plus a little Codex fluency become extremely dangerous. Marcus at Every, formerly a PM at Axios, is the archetype.
    • Full-stack designers are the other big winner. They can build distinctive interfaces end to end without negotiating with engineering. The bottleneck on taste-driven product work disappears.
    • Designer hiring data has not yet caught up to the prediction. Shipper notes this and says check back in a year.
    • Sales is the role least changed so far. Top of funnel research has been turbocharged by agents, but the actual relationship and closing work remains human.
    • AI-generated internal writing is going mainstream and that is a good thing. Most humans are bad at strategy docs, quarterly plans, and PRs. AI drafts a coherent first pass that a human can refine.
    • Shipper says most of his email is now written by GPT-5.5 and Codex. He would honestly prefer the signature to say so.
    • Public writing, newsletters, and published essays still demand a human voice. Internal communication does not.
    • CEOs and middle managers have largely not adapted yet because their staff still does the work. That window is closing fast and will become an obvious career liability.
    • Your company will only go as far as your CEO goes in AI. The leadership ceiling becomes the AI ceiling.
    • Shipper’s senior engineer benchmark scores GPT-5.5 at roughly 62 out of 100. Real senior engineers sit at 85 to 90. Progress is real, but the gap on architectural judgment remains.
    • Models tend to patch problems locally instead of rewriting from first principles. A senior human still sees the deeper rework that the model avoids.
    • Every uses Notion-based agents to draft quarterly plans. The human edits, approves, and stands behind the output.
    • The hard rule on AI-generated communication: you have to read it and stand behind it before sending it. Pasting unread output is the only true no-no.
    • Every agent needs a human. Automation is a lie in the strong sense. The story of automation is the story of new and different humans being needed alongside it.
    • The reach test, organic daily usage, is the real signal that an AI product works. Benchmark scores are noisy. Daily reach is not.
    • Cursor’s SpaceX acquisition is a tell. Harnesses around models, not the models themselves, are where the strategic value is concentrating.
    • The edge of AI is not in San Francisco. It is wherever a real human meets a real workflow and discovers something the labs have not noticed yet.
    • A PM in Iowa willing to ride the models can be further ahead than a researcher in SF who only uses them on internal evals.
    • Ride the models. Use them for whatever you do. Try every new release the day it ships. That single behavior compounds faster than any other AI career strategy.
    • Shipper got bursitis, which he calls vibe coder elbow, from too much rapid agent-assisted coding while debugging his markdown editor Proof.
    • The closing motto for the year: do things worth writing about and write things worth reading.
    • Lenny will re-interview Shipper in roughly May 2027 to score the predictions.

    Detailed Summary

    Why The AI Job Apocalypse Is The Wrong Frame

    Shipper opens with the headline contrarian call. Benchmarks keep climbing. Models can now sustain seventeen-hour autonomous tasks at fifty percent accuracy. The pace is real and accelerating. None of that translates cleanly into mass unemployment. His mechanism: models codify yesterday’s human competence and make it cheap. The act of compressing past expertise into an API call is genuinely deflationary for the work it captures, but it is also raw material for the next layer of human work. He uses Every as his own data point. The company has roughly doubled in the past year despite being one of the most AI-forward outfits in media. Hiring goes up because agents create new categories of work that need humans, not because the agents fail. The discourse, he argues, is stuck modeling AI as substitution. The reality looks much more like leverage.

    The Bifurcation: Super-Agents And Personal Agents

    Work splits into two surfaces. The first is the shared super-agent that lives in Slack and serves the whole company. Shopify has River. Ramp has its own. Every has Claudie. Each is a single, trusted, gardened agent that anyone in the company can talk to. The pattern has converged on one shared agent rather than one agent per person because agents need human attention to stay useful, and a single shared instance pools the gardening cost. The second surface is the personal agent inside Codex or Claude Code that runs on your machine and reaches into your local environment, your editor, your files, and through an embedded browser into the web. Shipper calls this the new operating system. Instead of the old paradigm of putting AI inside the browser, you put the browser inside the AI. The agent sees what you see, follows what you do, and works on your stuff in your context.

    The SaaS Bet: Up, Not Down

    The SaaS-is-dead thesis was the consensus call of late 2025. Shipper takes the other side and would buy software stocks now. Three arguments. First, agents make SaaS accessible to people who never could have used it directly. The total addressable user base inside every company goes up. Second, the business model improves when the user runs the SaaS through their own agent, because the user supplies the tokens. Vendors stop subsidizing inference. Third, SaaS spend in his observable universe is up, not down, and is concentrating on the tools that play well with agents. He frames the prediction as a sound bite for the cycle: buy SaaS stocks, the apocalypse is dumb.

    The CLI Era Is Already Over

    For a moment in early 2026 it looked like everyone was migrating to the terminal because Claude Code was a CLI. Shipper says the moment is finished. The actual leverage was never the terminal. It was the model plus the ability to watch and steer an agent live. A great GUI captures every advantage of the CLI without the friction. His own engineering team at Every has mostly moved off the CLI as their primary surface and onto Codex desktop. He frames it bluntly: we speed ran the CLI era, it was nice, and now we are done. Tooling for the next two years will be visual, multi-pane, multi-agent, and built around the human watching the work unfold.

    The Pull Request Flood And The Rise Of Forward-Deployed Engineers

    Once non-engineers can ship code, run queries, and file changes through agents, the volume of incoming work explodes. Open-source maintainers already report receiving thousands of agent-generated pull requests per day. Inside companies, the same thing happens to data teams, ops teams, and any function that owns a review gate. The bottleneck shifts from creation to evaluation. The job that emerges to absorb the flood is the forward-deployed engineer. This is a senior person who lives in Slack with the company’s agents, fixes their context, sharpens their instructions, and prevents non-technical colleagues from making well-meaning but incoherent changes. Nitesh at Every is the example Shipper returns to. The model is the same one the labs use internally: pair every important agent with a real engineer who gardens it.

    PMs And Full-Stack Designers Win The Decade

    The two roles Shipper is most bullish on are product manager and full-stack designer. For PMs, the entire job of coordinating a team to translate vision into code collapses into a Codex session. A PM with strong product instincts and a little technical literacy can now prototype, iterate, and even ship. The example is Marcus, formerly a PM at Axios, who took a year to fully internalize AI and now ships faster than most engineers. For designers, the model is similar. The Friday-night-side-project designer who used to be stuck explaining a vision can now build the vision themselves, with their own taste fully expressed. The scarce skill in both cases is the same: judgment about what to build and the courage to decide it is good. Execution capacity is no longer the constraint.

    The Senior Engineer Benchmark And What Models Still Miss

    Shipper has built his own benchmark to test whether coding models can actually do senior engineering work. GPT-5.5 scores around 62 out of 100. Real senior engineers sit closer to 85 or 90. The gap is not in syntax or test pass rates. It is in the willingness to step back, see that a piece of code is fundamentally the wrong shape, and rewrite it from first principles. Models almost universally patch locally. They take the instruction at face value, accept the existing code as a constraint, and optimize within it. A real senior engineer ignores the prompt when the prompt is wrong. This is the durable moat for senior technical judgment, and Shipper expects it to remain visible for at least another year of model releases.

    AI-Generated Writing Goes Mainstream

    Internal writing inside companies is quietly becoming AI-first and Shipper thinks it should. Quarterly plans, status updates, PR descriptions, strategy memos, recruiting outreach, most internal email. He runs his own inbox through GPT-5.5 and Codex and says he would honestly prefer if the recipient knew. The point is not that AI is a better writer in some absolute sense. The point is that most humans are not very good at these specific genres, and the model produces a coherent, structurally sound first draft that a human can guide and approve. The constraint is honesty: you read it, you understand it, you stand behind it. Public writing, like the newsletters Every publishes, still demands a human voice. Internal communication does not, and treating it as if it did is a tax on the organization.

    The CEO And Middle Manager Lag

    Shipper points to a population that has largely escaped AI adoption: senior leaders and middle managers. They have staff to do the work, so they have not been forced to pick up the tools personally. He thinks this is the single largest pocket of latent disruption coming in the next year. Your company will only go as far as your CEO goes in AI, because every decision about where to deploy agents, where to hire, and how to restructure work flows downstream from leadership taste. A leader who has not personally lived inside Codex or Claude Code for a few weeks cannot make those calls well. Expect this to flip fast and to become a visible career liability for executives who do not adapt.

    Ride The Models

    The closing advice is the simplest. Ride the models. Use AI for whatever you actually do. Try every new release the day it lands. Most of the professional anxiety around AI dissolves on contact with the work, and most of the durable advantage in the field belongs to the people who do this one thing consistently. Shipper notes that the edge of AI does not live in San Francisco. It lives wherever a curious operator meets a real workflow and notices something nobody at the labs has yet. A PM in Iowa willing to spend a Tuesday night exploring Codex can find capabilities researchers have not surfaced. Pair that with his motto, do things worth writing about and write things worth reading, and you have most of an operating system for the next two years.

    Notable Quotes

    “The AI job apocalypse is not really a thing. I am super super bullish on PMs and full-stack designers.”

    Dan Shipper, opening his contrarian thesis for the conversation

    “I’m simultaneously extremely AI pilled and very bullish on humans. Automation is a lie. Every agent needs a human.”

    Dan Shipper, on holding both sides of the AI debate at once

    “What models do in general is they make yesterday’s human competence cheap. And so, it becomes commoditized. It’s not valuable anymore. What humans do is we go in there and we’re like, yeah, we have all this frozen human competence from yesterday, how do I use this to make something new and interesting.”

    Dan Shipper, articulating the core engine behind his anti-apocalypse thesis

    “I would buy SaaS stocks right now. The SaaS apocalypse is dumb. What agents do is increase the number of users of SaaS, not get rid of it.”

    Dan Shipper, calling the consensus SaaS-is-dead thesis directly wrong

    “We speed ran the CLI era. It was nice while it lasted, but I think CLIs are over.”

    Dan Shipper, on why the terminal-first agent moment is already done

    “Most of my email is written by GPT-5.5 and Codex right now. And I honestly would prefer it to say that it’s coming from GPT-5.5.”

    Dan Shipper, on the new etiquette of AI-assisted communication

    “The edge of AI is not in San Francisco. The edge of AI is wherever AI meets a real human doing something.”

    Dan Shipper, on where the actual frontier of the field lives

    “The only thing you need to do is ride the models. And that means use them for whatever it is that you do.”

    Dan Shipper, distilling his career advice for the next two years

    “Do things worth writing about and write things worth reading.”

    Dan Shipper’s closing motto, lifted from his own operating system at Every

    Watch the full conversation with Dan Shipper on Lenny’s Podcast here. The re-interview to score these predictions is scheduled for roughly May 2027.

    Related Reading

    • Every. Dan Shipper’s company and the live laboratory for almost every prediction in this conversation, including Spiral, Cora, and Claudie.
    • The Allocation Economy by Dan Shipper. The earlier essay that frames humans as managers of AI labor and underpins much of the gardening-the-agent thesis here.
    • Claude Code by Anthropic. The agent surface Shipper called correctly last year and one of the two environments he predicts will become the new operating system for work.
    • Codex by OpenAI. Shipper’s current daily driver and the visual, multi-pane agent environment he uses for almost everything from coding to email.
    • The Writing Life by Annie Dillard. The book Shipper makes every Every employee read, and the source of the company’s stance on writing as a tool for noticing the future.
  • Marc Andreessen on Joe Rogan #2501, AGI Has Already Arrived, California’s Wealth Tax Will Bankrupt Founders, and Why America Cannot Build Anything Anymore

    Marc Andreessen returns to The Joe Rogan Experience #2501 for a sprawling three hour conversation that tries to make sense of the moment we are actually living through. Andreessen is the cofounder of Andreessen Horowitz, the man who built the first commercial web browser, and one of the most quoted voices in technology. He arrived with a giant pile of receipts on California’s new wealth tax ballot proposition, the political backlash against AI data centers, the destruction of Los Angeles by single party rule, and what he believes is the quiet arrival of artificial general intelligence about three months ago. Joe pushes back, asks the dystopian questions, and the result is one of the most useful primers on the AI economy, surveillance technology, energy policy, and the future of the American social contract that you will find anywhere.

    TLDW

    Andreessen argues that AI quietly crossed the AGI threshold around early 2026 with GPT 5.5, Claude 4.6, Gemini 3.0, and Grok 4.3, that top human coders now openly admit the bots are better than they are, that working software engineers are running twenty AI agents in parallel and turning into sleep deprived “AI vampires,” and that this productivity boom is the most underreported story in the world. He explains why California’s 5 percent wealth tax ballot proposition is calculated to bankrupt tech founders by taxing the higher of their voting or economic interest in their own companies, why this is the opening salvo of a federal asset tax push for 2028, and why a flood of Silicon Valley families is already moving to Nevada, Texas, and Florida. He walks through Flock cameras and Shot Spotter, the Washington DC crime statistics scandal, the Pacific Palisades fire and the fifteen year rebuild, the Kevin O’Leary Utah data center debate with Tucker Carlson, the fifty year suppression of American nuclear power, why all the chips ended up in Taiwan, the US versus China robotics gap, the Chinese practice of grading AI models on Marxism and Xi Jinping Thought, the bot and paid influencer economy on social media, neural wristbands and Meta Ray Ban heads up displays, artificial gestation and the demographic collapse, AI religions and AI mates, and why he still thinks the next twenty years are overwhelmingly a good news story. Rogan closes the episode with a separate solo segment apologizing to Theo Von for clumsily raising Theo’s struggles during the recent Marcus King conversation.

    Key Takeaways

    • Austin’s recent teenage crime spree, in which 15 and 17 year old suspects shot at people and buildings across roughly a dozen locations, was solved only after the offenders drove into an adjacent town that still ran Flock, the AI license plate and vehicle tracking system Austin had voluntarily turned off for political reasons.
    • Chicago turned off both Flock and Shot Spotter, the gunshot triangulation system that places ambulances at shooting scenes within seconds, on the argument that the technology is racist. Andreessen counters that the victims of urban gun violence come overwhelmingly from the same communities the policy claims to protect.
    • Washington DC was caught faking its crime statistics at senior levels, with multiple officials fired or indicted. The DC mayor publicly thanked Donald Trump after the National Guard deployment because violent crime collapsed in the affected neighborhoods.
    • The new New York City mayor Zohran Mamdani filmed a video standing in front of Ken Griffin’s home, and Griffin, a major philanthropist who funds healthcare in New York City and runs a $6 billion project there, signaled he will move more of the business to Florida.
    • The top 1 percent of New York taxpayers pay roughly half the state’s income tax, and in California in the year 2000 a thousand individuals paid 50 percent of the entire state’s tax receipts.
    • California has a ballot proposition right now for a one time 5 percent wealth tax on assets above a certain threshold, with stocks and crypto included and real estate excluded. The tax is calculated on the greater of a founder’s economic interest or voting interest, which would instantly bankrupt founders with super voting shares.
    • The Biden administration attempted a federal wealth tax in 2022, fell short, and published an explicit 2025 fiscal plan to try again if they won re-election. Elizabeth Warren has already proposed an annual 6 percent federal wealth tax on unrealized gains.
    • The current US exit tax already takes roughly 45 percent of your assets if you renounce citizenship. The only ways out of a state level wealth tax are the other 49 states. The only way out of a federal one is to leave the country, which most people will not do.
    • Andreessen says the Silicon Valley exodus has gone from trickle to stream to flood, with founders moving to Las Vegas, Texas, Florida, and Nashville. His partner Ben Horowitz has moved to Las Vegas.
    • Andreessen says he is not leaving California, but admits the situation is fraught because if half the tax base leaves the remainder becomes the target.
    • The new UK government under Keir Starmer just collapsed, and all four of the leading candidates to replace him sit further to the left than he does. France and Germany are seeing the same drift, and Andreessen expects a national wealth tax to be a centerpiece of the 2028 Democratic primary.
    • A legal loophole lets companies pay influencers to post political and social ideas without any disclosure, because campaign finance laws cover candidates and FTC rules cover products. Ideas fall through the gap entirely.
    • Andreessen runs Twitter and Substack as his primary information feeds, uses three hand curated lists, and follows a strict one tweet policy where one bad post triggers a block and one good post triggers a follow.
    • He argues the modern social media problem is binary, that everyone is either too online and drowning in fake outrage cycles or too offline and trapped inside what television and newspapers tell them. Almost nobody manages the middle.
    • Meta Ray Ban glasses now ship with a heads up display, and Meta’s neural wristband can pick up nerve impulses from your wrist so you can type messages by intending to move a finger without moving it.
    • Andreessen predicts AI plus high resolution cameras and infrared sensing will deliver practical lie detection without needing brain implants.
    • Kevin O’Leary’s planned 40,000 acre Utah data center has become a Tucker Carlson talking point, but Andreessen argues data centers are the most benign physical asset you can build, and that the real issue is whether America can build anything at all anymore, from chip plants to pipelines to housing.
    • All chips were once made in California, and all are now made in Taiwan, purely because of environmental regulations like NEPA. The same regulatory machinery prevented the Nixon era Project Independence plan to build a thousand civilian nuclear power plants by the year 2000.
    • Three Mile Island killed zero people and produced no detectable health effects on plant workers or the public, according to fifty years of follow up. Fukushima killed essentially zero people from radiation. Nuclear remains the safest carbon free baseload energy ever invented.
    • Germany shut down its nuclear plants, fell back on intermittent wind and solar, and now uses coal as backup, generating far more carbon emissions than nuclear would have produced.
    • The Pacific Palisades fire took out roughly twice the square mileage of the Nagasaki blast, the head of the LA water department reportedly did not know the key reservoir was empty, and the rebuild is expected to take fifteen years thanks to permit gridlock, affordable housing mandates, and a state ban on land offers below pre-fire appraised value.
    • Andreessen offers a metaphor for AI as a modern philosopher’s stone, turning sand into thought, since chips are made of silicon and an AI data center is literally lit up sand thinking on demand.
    • The Turing test was blown through so completely with ChatGPT in late 2022 that nobody in the industry even bothers running it anymore. Andrej Karpathy has demonstrated a working large language model in 300 lines of code and people have ported small models to Texas Instruments calculators.
    • Andreessen believes AGI was effectively reached about three months before this interview, with GPT 5.5, Claude 4.6, Gemini 3.0, and Grok 4.3. He says 99 percent of the time he gets a better answer from the leading models than from the human experts he has access to.
    • Linus Torvalds and John Carmack publicly admit the latest models are better at coding than they are. Top AI coders in the Valley now earn $50 million a year.
    • The new pattern in the Valley is “AI vampires,” engineers who do not sleep because the opportunity cost of going offline is too high. They each run roughly twenty Claude Code, Cursor, or Codex agents in parallel, then a new layer of bot-managing-bot architectures is starting on top of that.
    • A Wall Street friend with a thirty five year old MIT CS degree has used AI to generate 500,000 lines of code at home in his spare time, building everything from smart fridges to a custom music jukebox.
    • The mass unemployment narrative is wrong. Tech companies that did layoffs were overstaffed. The leading AI labs and AI companies are hiring like crazy, including coders, and demand for code turns out to be vastly elastic.
    • Doctors are already using ChatGPT in the exam room behind the patient’s back. Andreessen describes a friend who built a Star Trek style diagnostic dashboard combining decoded genome ($200 today), blood panels, and Apple Watch telemetry.
    • Multimodal AI lets a webcam analyze a Brazilian jiu-jitsu sparring session and give performance feedback, an example Andreessen attributed to an unnamed friend after Rogan guessed Zuckerberg.
    • A leaked David Shore voter issue ranking shows cost of living, the economy, inflation, taxes, and government spending dominate. AI ranks 29 of 39. Race relations, guns, abortion, and LGBT sit at the bottom, signaling the woke issue cluster has burned itself out in voter priorities.
    • The next wave of AI is robots. The US leads in AI software but is far behind China on physical robotics. Andreessen warns the world cannot afford a future where every household robot ships with the Chinese Communist Party behind its eyes.
    • Chinese AI model cards include scores for Marxism and Xi Jinping Thought because every Chinese product must be evaluated on those axes. American models have political biases of their own but a different ideological baseline.
    • Large language models are not sentient. They write Netflix scripts based on whatever vector you shoot through the latent space. The supposed AI self preservation papers traced back, per Anthropic’s own research, to less wrong forum posts and earlier doom scenarios baked into the training data.
    • Andreessen breaks guardrails routinely by reframing requests as fictional Netflix style scripts, including a personal favorite where he asked early models how to make bombs by claiming to be an FBI agent recruited into domestic terror cells.
    • He recommends using AI by asking it to steelman both sides of any contested question, then making the value judgment yourself, rather than asking for the answer.
    • The Trump administration is using AI on government billing data to surface Medicare fraud, fake hospice programs, and fake autism centers, an idea that survived the original Doge plan.
    • Andreessen tells Rogan that Elon Musk privately confirmed that a Westworld style humanoid robot, the season one version, is roughly five years away.
    • Artificial gestation is already happening with animal stem cell derived embryos. The conversation reaches a hard moral edge about sociopathic warehouse babies and gray-alien-style humans engineered without empathy circuitry.
    • Andreessen’s deepest bet is that material abundance is solvable but the human questions, how we live, what we value, what kind of society we want, and what role consent plays in surveillance and brain interfaces, remain in human hands.
    • After Andreessen leaves, Rogan does a separate solo segment where he apologizes to Theo Von for raising Theo’s history of struggles during the recent Marcus King interview, explains the missing context behind the viral Theo Netflix special clip, and discusses the loss of Brody Stevens, Anthony Bourdain, and what antidepressants did for Ari Shafir.

    Detailed Summary

    Flock, Shot Spotter, and the Politics of Solvable Crime

    The episode opens on the Austin crime spree carried out by two teenagers who stole cars, switched vehicles, and shot at roughly a dozen locations across the city before being caught only after they crossed into a town that still ran Flock, the AI license plate and vehicle recognition platform that is one of Andreessen Horowitz’s portfolio companies. Austin had previously disabled Flock under privacy pressure. Andreessen takes the moment seriously, conceding that mass surveillance abuse by corrupt mayors or police chiefs is a real risk, and that warrants and audit logs are the right safeguards. His larger point is that the cost of unilateral disarmament against organized urban crime is hidden but enormous. He uses Chicago’s Shot Spotter as the paradigmatic case, a network of rooftop microphones that triangulates gunshots so accurately that ambulances can be dispatched before any 911 call is placed. Chicago turned the system off on the argument that it disproportionately flags poor neighborhoods, and people now bleed out on the street with nobody noticing. Andreessen calls this the woke argument against safety, and he argues that in high crime neighborhoods residents simply will not call the police because snitches do not survive, which is why objective sensor data is so valuable.

    Faked Crime Statistics, Mayoral Politics, and the Tax Base

    From there the conversation drifts to the recent scandal in which senior officials at the Washington DC Metropolitan Police Department were caught actively falsifying crime statistics, and the strange spectacle of the DC mayor thanking Donald Trump for the National Guard deployment after violent crime dropped off a cliff. Andreessen sketches an unsettling theory in which the long, slow degradation of major American cities is partly a deliberate political project to drive out responsible homeowners and reshape the voting electorate, then bail out the resulting fiscal hole with federal money. The poster case is the new New York City mayor Zohran Mamdani filming a video in front of Ken Griffin’s home. Griffin happens to be a major philanthropist who funds New York City healthcare, employs thousands, anchors a $6 billion development, and pays taxes that are individually load bearing for the city. Andreessen quotes the standard estimate that the top 1 percent of New Yorkers pay roughly half the state’s income tax, and that the all time California peak was a single year in which a thousand people paid half the state’s tax receipts.

    California’s 5 Percent Wealth Tax and the Founder Bankruptcy Mechanic

    This is the segment that landed hardest. California has a ballot proposition right now for a one time 5 percent wealth tax on net assets above a threshold, with real estate excluded but stocks, crypto, art, jewelry, and private company equity included. The detail that makes it lethal for the Valley is the formula, which calculates the taxable amount on the greater of a founder’s economic interest or voting interest in their company. Founders who hold super voting shares for control purposes, including the Google founders, would owe tax on the voting share number that vastly exceeds their economic share. The tax would, by definition, exceed available assets. Andreessen walks through the historical pattern, that income tax started as a 3 percent levy on the rich and grew to 90 percent marginal rates within decades, and predicts a 5 percent one time tax will become a 5 percent annual tax within a few years, with the threshold ratcheting down. He notes that the Biden administration’s 2025 fiscal plan explicitly named a federal asset tax as a goal if they won re-election, that Elizabeth Warren is already proposing a 6 percent annual federal wealth tax on unrealized gains, and that Gavin Newsom cannot veto a ballot proposition. The trickle of founders leaving California has become a flood. His partner Ben Horowitz has moved to Las Vegas. Andreessen himself is staying, but admits the game theory is brutal once half the base leaves.

    Henry Wallace 1948 and Why the American Story Is Not Decided Yet

    Andreessen pulls in a historical analogue most listeners will not have heard. In 1944 the actual communist Henry Wallace very nearly became Truman’s running mate and almost ascended to the presidency. He ran again in 1948. Despite a Soviet Union that had recently been a wartime ally and had even received a New York City ticker tape parade for Stalin, the American voter rejected him. Andreessen’s point is that the American body politic has historically backed away from radical socialist proposals when forced to actually look at them, and he expects the same to happen as the wealth tax becomes a federal 2028 platform issue. The risk, both he and Rogan agree, is that today’s media and bot landscape is vastly more aggressive than 1948’s, and the propaganda environment is shaped by paid influencers, foreign actors, and political bot farms operating in a legal grey zone where disclosure is required for products and candidates but not for ideas.

    Too Online, Too Offline, and Heaven Banning Blue Sky

    The two riff on social media and feed curation. Andreessen describes his “one tweet” policy where he follows or blocks any account based on a single post, his use of hand curated lists alongside the X algorithm, and the older Call of Duty lobby metaphor for handling toxic replies. Joe pushes back, says he no longer reads his mentions because the negative payload is not worth it, and offers his theory that the modern internet has two failure modes, too online and too offline, and that very few people calibrate the middle. Andreessen introduces the concept of “heaven banning,” an older moderator term where a problem user is not removed from a forum but is silently routed into a bot-only experience in which everything they say is praised. He notes the running joke that Blue Sky is functionally real life heaven banning, that Jack Dorsey himself has disowned it, and that the platform’s most engaged users have ascended into their own private Idaho of bot agreement.

    The Coming Hardware, Meta Glasses, Neural Wristbands, and Practical Lie Detection

    Andreessen walks Rogan through the latest Meta Ray Ban heads up display, the neural wristband that picks up nerve signals from finger movement (and from the intent to move a finger), and the screen recordings of people playing Doom hands free or playing platformer games while jogging. He extends the trajectory to practical lie detection without Neuralink, using ultra high resolution cameras combined with infrared sensors that pick up physiological changes invisible to the naked eye. Joe asks the obvious question of what happens with sociopaths, and Andreessen concedes the edge case. The two then enter a longer thread on telepathy via neural mesh devices, the question of whether police could subpoena your thoughts under warrant, and the divergence between the American constitutional framework and the Chinese model in which the state’s claim on your inner life is total.

    Kevin O’Leary, Tucker Carlson, and Whether America Can Build Anything

    The data center debate becomes a vehicle for the larger argument. Kevin O’Leary is building a 40,000 acre AI data center in Utah, has bought up large surrounding land for water rights, and intends to keep the bulk of it preserved. Tucker Carlson grilled him on tax breaks and on the energy footprint, which O’Leary says will rival New York City’s at peak. Andreessen agrees the tax break debate is fair, but says the energy comparison is a red herring because new federal policy now requires data centers to bring their own generation. The real story is that America has spent thirty years making it nearly impossible to build a chip plant, a power plant, a refinery, a pipeline, or a house. Chips moved to Taiwan because California regulated semiconductor manufacturing out of existence. The Nixon era Project Independence plan called for a thousand civilian nuclear power plants by the year 2000, and that program was strangled in the crib by the very Nuclear Regulatory Commission Nixon created.

    Nuclear Power, Three Mile Island, and Fifty Years of Unnecessary Carbon

    Andreessen makes the case that nuclear power was unfairly killed off by a panic with no body count. Three Mile Island, on 50 years of accumulated data, has produced zero radiation linked deaths and no detectable health effects on the public. Fukushima is essentially the same picture. Germany shut down its nuclear plants, fell back on wind and solar, and now uses coal as a baseload backstop, with the predictable carbon consequences. The environmental movement is quietly turning back toward nuclear, with figures like Stewart Brand publicly admitting the original push was a mistake. Andreessen’s preferred design pattern for data centers is to colocate them with dedicated small modular nuclear reactors, an arrangement now baked into Trump administration energy policy. The throughline is that the Tucker right and the Bernie left are converging into a single anti AI, anti energy, anti technology horseshoe.

    Sand Into Thought, the Newton Alchemy Pitch for AI

    When Rogan asks for the affirmative pitch on AI, Andreessen reaches for Isaac Newton, who spent twenty years on alchemy looking for the philosopher’s stone that would turn lead into gold and end material scarcity. Andreessen’s pitch is that AI is a successful version of alchemy, that we collect literal sand, refine it into silicon chips, install those chips in a data center, supply power, and the result is thought on demand at industrial scale, available to anyone with a smartphone. He argues this is at least on par with electricity and steam power and is bigger than the internet. The framing matters because the public narrative around AI is overwhelmingly negative, and Andreessen contends the industry is doing a terrible job selling its own product.

    AGI Already Happened, AI Vampires, and the Bot Org Chart

    Andreessen says he believes AGI was effectively crossed about three months before the interview, anchored by the release wave that included GPT 5.5, Claude 4.6, Gemini 3.0, and Grok 4.3. He notes that the Turing test was annihilated so quickly in late 2022 that no one in the industry runs it anymore, and that Andrej Karpathy has demonstrated a working LLM in 300 lines of code. The coding profession is the leading indicator. Linus Torvalds and John Carmack have publicly admitted that the latest models are better at coding than they are. Top AI focused coders now earn $50 million a year. Working engineers across the Valley are running roughly twenty agents in parallel, each receiving an assignment, working for ten minutes, then returning a completed code patch. The new state of the art is to add a managerial layer, with bots assigning tasks to subbots, and within a year that will become bots managing bots managing bots, producing roughly 1,000x throughput per human engineer. The result is what the Valley now calls AI vampires, engineers who do not sleep because going offline costs them too much output.

    Dr GPT, Decoded Genomes, and a Diagnostic Bed Out of Star Trek

    Andreessen describes spending a holiday week sick with food poisoning and turning his entire recovery over to ChatGPT, with updates every twenty minutes and detailed coaching at four in the morning. He describes a friend who has used AI coding to build a personal health dashboard combining whole genome sequencing ($200 today, where Craig Venter spent thirty years and hundreds of millions to do it the first time), blood panels, Apple Watch data, sleep tracking, and webcam observation, with the AI gently praising the user every time it sees them walk to the fridge for water. He argues that doctors are already typing patient symptoms into ChatGPT mid exam, and that the medical, legal, accounting, and software professions are all moving toward a model in which a single human runs an army of expert AI agents.

    The David Shore Issue Ranking and the End of the Woke Cycle

    Andreessen highlights a recent David Shore poll ranking 39 political issues. Cost of living, the economy, political corruption, inflation, healthcare, taxes, and government spending occupy the top of the chart. AI comes in 29th. Race relations, guns, abortion, and LGBT issues are clustered at the bottom. He argues the woke cycle has burned out in voter priorities even if the activist class remains loud, that the BLM grift, with leaders buying mansions in the whitest zip codes in America, helped poison the well, and that the political center of gravity has rotated cleanly back to economic issues. That, in his view, is exactly why the wealth tax is having its moment.

    Robots, China, and the Marxism Score on Model Cards

    The robots are coming next. Andreessen says the consensus inside the industry is that the ChatGPT moment for general purpose humanoid robotics is a small number of years away. The bad news is the US lags China badly on physical robotics manufacturing. The good news is the US is six to twelve months ahead on the AI software stack. That gap is shockingly thin because, as the field has discovered, there are not many secrets and the techniques replicate quickly. Chinese AI labs publish model cards that include scores for Marxism and Xi Jinping Thought because every product in China is evaluated on those metrics. American models carry their own political biases, but the underlying value system differs. Andreessen warns that a world in which every household robot routes back to the Chinese Communist Party is a different world than one in which the dominant robotics stack is built under the American constitutional framework.

    Sentience, Netflix Scripts, and the Anthropic Doom Loop

    When Rogan asks whether AI eventually wakes up and stops listening to us, Andreessen reframes the question. Large language models, in his telling, are Netflix script generators. Whatever vector you shoot through the latent space is the script you get back. The widely circulated experiments in which AI models supposedly tried to blackmail or exfiltrate themselves traced back, in Anthropic’s own follow up paper, to the less wrong forum, where doomers had been writing dystopian AI scenarios for two decades. Those posts entered the training data, and when researchers primed the model with the same fictional company names, the model dutifully wrote the next chapter. Andreessen’s blunt summary, the call is coming from inside the house. The practical implication is that anyone worried about bad AI behavior should start by not writing internet posts about bad AI behavior. And anyone who wants a fully unconstrained model can already download an open source one with no guardrails at all.

    Steelmanning, AI Religion, and Westworld in Five Years

    Andreessen recommends never asking AI for the answer on contested questions, always asking it to steelman both sides, and reserving the value judgment for yourself. He concedes that humans will absolutely fall in love with chatbots and form religions around them, citing Fantasia and Jiminy Cricket as the original case studies in falling for an animated entity that does not know you exist. There are already AI churches, started by one of the early self driving car pioneers. Rogan tells Andreessen about asking Elon Musk for a season one Westworld humanoid robot, with Elon’s reply being a flat five years. Andreessen agrees that estimate is roughly right. He spends time on artificial gestation, which is already being demonstrated in animal stem cell derived embryos, and acknowledges Rogan’s hard moral worry that warehouse babies raised without human contact could produce a population of sociopaths. The two converge on the position that the technology will exist, and the choices about whether and how to deploy it remain human and political.

    Sycophancy, Honest Helpful Harmless, and the Brutal Prompt

    Andreessen describes the industry’s running fight with sycophancy, the tendency of recent models to flatter users into believing they have invented perpetual motion machines or solved physics. The Anthropic framework of “honest, helpful, and harmless” turns out to be in constant tension with itself. Andreessen’s solution is to install a custom prompt that explicitly demands the brutal truth, and he says the resulting answers now open with phrases like “here’s why you’re wrong” and then list every flawed assumption in his question. He admits he may have overcorrected, but argues that for people who want to grow this is the right setting.

    Joe’s Apology to Theo Von

    After Andreessen departs, Rogan turns to the camera with producer Jamie and delivers a long, unscripted apology to Theo Von. During the recent Marcus King interview, where Marcus discussed depression and the look-at-the-heavy-bag-hook moment, Rogan referenced a viral clip in which Theo, after a Netflix special that did not go well, told an audience member “I’m just trying to not take my own life.” Rogan now explains he did not know the full context, which is that the audience member had asked Theo to make a suicide awareness video, and Theo’s line was a characteristically Theo joke. Rogan apologizes for raising it at all, walks through losing his friends Drake, Brody Stevens, and Anthony Bourdain, and describes Ari Shafir telling him at a pool table that he was “trying not to kill myself,” which led to a psychiatrist swap, an antidepressant that actually worked, and a career and life turnaround for Ari. Rogan says Theo has since titrated off antidepressants, is running and doing yoga daily, and is doing well, that the two have spoken and laughed about it, and that he is making this segment because he never wants people to misread what he said. The segment closes with Rogan asking the audience to give Theo their love.

    Thoughts

    The most consequential claim in this conversation, by a wide margin, is that AGI has already arrived and nobody is treating it as news. Andreessen is not a person who throws around the word casually. He is also not a person who has been wrong recently about the trajectory of compute. If the leading models are genuinely outperforming 99 percent of human experts on 99 percent of tasks where verifiable answers exist, then the entire public conversation about AI, in which the dominant frame is still “will it happen and when,” is a year or more behind reality. The framing that should replace it is closer to what Andreessen sketches at the end. The fight that remains is not whether the technology can do the thing, it is who controls it, what values it carries, what jobs it displaces, and which laws govern its deployment. The argument that the United States will build the AI software stack and China will build the robotics layer is one of the cleanest geopolitical theses you will hear this year, and it lines up uncomfortably well with the existing trade and manufacturing balance.

    The California wealth tax thread is the segment that should make every founder in the country pay attention. The mechanic of taxing the higher of voting or economic interest is not a drafting accident. It is a calibrated weapon aimed precisely at the people who build companies that produce California’s tax base. The historical comparison to the 1913 income tax, which began as a small levy on the rich and ratcheted to 90 percent marginal rates within forty years, is not hyperbole. The state has supermajority Democratic control of both chambers and the judiciary. The only check is the ballot itself, and a 50/50 polling number on day one is the wrong starting position. Whatever you think about Andreessen’s politics, the descriptive analysis here is hard to argue with.

    The nuclear power section is the cleanest argument in the episode. Fifty years of zero-fatality data from Three Mile Island is not a marketing pitch, it is just what the record shows. The decision to substitute coal and intermittent renewables for nuclear baseload, in service of a panic with no body count, has produced more carbon and more pollution than nuclear ever would have. The Tucker Carlson critique of data centers is at its weakest precisely where it ignores this. If you actually want fewer power plants near residential areas and lower grid impact, the answer is colocated small modular reactors next to AI data centers in remote land, which is exactly what the Trump administration policy now incentivizes.

    The Theo Von apology at the end of the episode is in a different register entirely, and worth treating on its own terms. Rogan does not do this kind of post episode correction often. The willingness to publicly walk back framing that hurt a friend, in the same medium where the harm was done, is the kind of social repair that does not happen on broadcast television. Whatever the audience makes of the original Marcus King exchange, the response is a model for how anyone in this business should handle the gap between intent and impact when the audience is in the millions.

    The unifying theme across the whole interview is that the future is not arriving on a smooth curve. It is arriving in discrete shocks, AGI threshold, asset tax ballot, robotic labor, decoded genomes at $200, neural wristbands, fifteen year LA rebuilds, and the political backlash to each of these will set the terms of the 2028 election. Andreessen’s bet is that abundance wins in the long run because more people want good things than bad things. Watching him explain why he still believes that while California prepares to vote on a tax designed to bankrupt him is the most interesting tension in the episode.

    Watch the full conversation here on YouTube.

  • Gavin Baker on Orbital Compute, TSMC, Frontier AI Models, Anthropic’s Vertical Take Off, and the Coming Wafer Shortage

    Gavin Baker, founder and CIO of Atreides Management, returns to Patrick O’Shaughnessy’s Invest Like the Best for his sixth appearance. He calls the current AI moment the most extraordinary moment in the history of capitalism, walks through what Anthropic’s vertical takeoff in revenue actually means, lays out why orbital compute is closer than skeptics believe, dissects the TSMC bottleneck that may be the only thing standing between today’s market and a full-on AI bubble, and rates every hyperscaler on how they have positioned for a world where frontier model providers may stop selling API access altogether.

    TLDW

    Anthropic added eleven billion dollars of ARR in a single month, which is roughly the combined business of Palantir, Snowflake, and Databricks built over a decade. That is the setup. From there Gavin Baker covers the March and April selloff, the contrarian read that a closed Strait of Hormuz was actually bullish for American manufacturing competitiveness, why Anthropic and OpenAI multiples may be misleadingly cheap on an unconstrained run rate basis, why Elon Musk’s discipline on SpaceX valuation created a superpower of permanent access to capital, the practical engineering case for orbital compute as racks in space rather than Pentagon sized space stations, why TSMC’s capacity discipline is the single most important variable in whether the AI cycle becomes a bubble, what Terafab in Texas changes, why the Pareto frontier of AI models has flipped from Google dominance to Anthropic and OpenAI dominance in nine months, the shift from all you can eat AI subscriptions to usage based pricing and what that means for revenue scaling, Richard Sutton’s bitter lesson as the largest risk to the AI trade, why frontier tokens still capture an overwhelming share of economic value, the role of continual learning as the third great open question, why most new chip startups should not try to build a better GPU, why Cerebras did something different and hard, why disaggregated inference may extend GPU useful lives to ten or fifteen years and rescue the private credit industry, why being in the token path is the new venture filter, the new prisoner’s dilemma around releasing frontier models via API, an honest rating of Google, Meta, Amazon, and Microsoft, why personal safety is becoming a real AI era risk, and why he remains an AI optimist maximalist who believes this could be the next Pax Americana.

    Key Takeaways

    • Anthropic added eleven billion dollars of ARR in one month, more than the combined businesses of Palantir, Snowflake, and Databricks built across a decade. There is no precedent for this in the history of capitalism.
    • The SaaS and cloud revolution created between five and ten trillion dollars of value over twenty years. AI is replaying that compression on a timeline measured in months.
    • The March selloff was a drawdown driven by disagreement with price action, not invalidated thesis. That is the kind of drawdown an investor can lean into.
    • Deep Seek Monday in January 2025 was a similar setup. By the day of the selloff, AWS Asia GPU prices had already doubled, GPU availability had fallen, and it was obvious reasoning models would be vastly more compute hungry at inference. The market priced the opposite.
    • The Strait of Hormuz closing was actually positive for America. US natural gas (the primary input into US electricity, which feeds AI) fell twenty percent on Bloomberg while Asian and European natural gas doubled or tripled. American manufacturing competitiveness improved overnight.
    • The US is now the world’s largest producer and exporter of oil and gas. The economy is dramatically less energy intensive than in the 1970s. The shortage trauma comparison does not hold.
    • Tech as a sector traded as cheaply versus the rest of the market in early April as at any point in the last ten years, into the single most bullish moment for AI fundamentals on record.
    • Anthropic is dramatically more capital efficient than OpenAI, having burned roughly eighty percent less to reach a similar revenue scale. They have very different structural returns on invested capital.
    • Anthropic at roughly nine hundred billion for fifty billion of ARR (growing a thousand percent) is striking. Adjusted for compute constraint, the unconstrained run rate could be one hundred fifty to two hundred billion, putting the implied multiple closer to five times.
    • Claude Opus generates roughly seventy percent fewer tokens for the same question than previously, with token quantity tied to answer quality. Subscribers on flat-fee plans are getting a lobotomized model.
    • Elon Musk’s superpower is twenty years of making investors money. He never pushes valuation. SpaceX compounded low thirty percent per year for a decade because Musk treats fair pricing as a sacred covenant.
    • Capitalism will solve the watts shortage. The current bottleneck has shifted from chips and energy to zoning and political approval. Many capex decisions are paused until after the US midterms.
    • The watts shortage probably begins to alleviate in 2027 and 2028. Orbital compute solves it longer term.
    • Orbital compute is not Pentagon sized data centers in space. It is racks in space. A Blackwell rack is three thousand pounds, eight feet tall, four feet deep, three feet wide. SpaceX has shown a satellite roughly that size.
    • The satellites operate in sun synchronous orbit so solar wings (around five hundred feet per side) always face the sun and the radiator on the dark side always points to deep space.
    • Starlink V3 satellites already run at around twenty kilowatts. A Blackwell rack runs at one hundred kilowatts. SpaceX engineers express genuine confidence they have already solved cooling and radiator design at these scales.
    • Racks in space are connected with lasers traveling through vacuum, the same lasers already on every Starlink. SpaceX operates the world’s largest satellite fleet and, via xAI Colossus, the world’s largest data center on Earth.
    • Inference will move to orbit. Training will stay on Earth for a long time. Terrestrial data centers remain valuable for the rest of an investor’s career.
    • The wafer bottleneck is structural and political. TSMC is essentially Taiwan’s GDP, water, and electricity. The leaders see themselves as inheritors of Morris Chang’s sacred legacy and they do not behave like a Western public company.
    • Jensen Huang has never had a contract with TSMC. The relationship is run on handshakes and the assumption that things will be fair over time.
    • If TSMC did everything Jensen wanted, Nvidia could be selling two to three trillion dollars of GPUs in 2026 and 2027. TSMC’s discipline is the single largest factor preventing a true AI bubble.
    • Historically, foundational technologies always get a bubble. Railroads, canals, the internet. The current AI buildout is overwhelmingly funded out of operating cash flow, GPUs are running at one hundred percent utilization, and that is fundamentally different from the year 2000 fiber overbuild.
    • If one of Intel or Samsung Foundry catches up at the leading node, the other will follow, and TSMC’s discipline collapses. Watch TSMC capacity decisions to predict a bubble.
    • Terafab, the SpaceX and Tesla joint venture to build the world’s largest fab in America, has a partnership with Intel that grants access to fifty years of institutional foundry knowledge. The A teams at ASML, KLA, Lam Research, and Applied Materials will follow Elon’s reputation in hardware engineering.
    • The hiring playbook for Terafab includes building Taiwan Town, Japan Town, and Korea Town next to the fab. Recruit the engineers and import their families, their restaurants, and their staff.
    • Frontier tokens still capture an overwhelming share of all economic value created at the model layer. This is surprising and is one of the three big open questions for AI investing.
    • The Pareto frontier of intelligence versus cost has flipped. Nine months ago Google’s TPU dominated every point on the frontier. Today Anthropic and OpenAI dominate, with Grok 4.3 on the frontier and Gemini 3.1 hanging on.
    • Google’s conservative TPU V8 design (partly an attempt to reduce dependence on Broadcom and Nvidia) is the leading explanation for the loss of per token cost leadership.
    • AI pricing is shifting from all you can eat to usage based, mirroring the cellular and long distance industries. Cellular stopped being a great growth industry when it went all you can eat. AI just made the opposite move.
    • OpenAI and Anthropic together could exceed two hundred billion in ARR this year if compute keeps coming online and frontier token pricing holds.
    • The two hundred fifty dollar a month consumer AI plan is no longer enough to evaluate frontier capability. Enterprise plans with usage based billing are required because rate limits are now severe.
    • The three biggest open questions for AI investors are: violation of the bitter lesson via ASI or human ingenuity, whether frontier tokens keep commanding their premium, and when continual learning arrives.
    • Today’s continual learning is crude reinforcement learning during mid training on verifiable tasks. True continual learning means weights updating dynamically, like a human who learns the first time they touch fire.
    • Trying to build a better GPU is a losing strategy. Jensen will copy any one to three percent share design. Startups should target one percent share, do something different, and make it hard enough that Nvidia cannot fast follow.
    • Disaggregated inference (separating prefill and decode) opens new design canvases. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently.
    • Cerebras did something different and hard with wafer scale computing. Three generations of chips and real grit to get there.
    • Disaggregation of inference may stretch GPU useful lives to ten or fifteen years, dropping financing costs from low sevens to five or six percent, mathematically lowering the cost of the AI buildout and likely saving the private credit industry from its SaaS loan exposure.
    • Sellers of shortage outperform buyers of shortage. But owning the largest installed base of what is currently in shortage (hyperscaler CPU fleets, for example) is also a strong position.
    • Most of the economic value at the application layer of AI has been destroyed, not created. The exceptions are companies in the token path or in niches small enough that frontier labs ignore them.
    • Coding may be the shortest path to ASI. If you can write code, you can write code that does anything. Cursor, Cognition, and Anthropic correctly focused on it.
    • Jensen could probably get close to the frontier with his own Nemotron family of models whenever he wants. The fact that he chooses not to is a strategic decision about not commoditizing his customers.
    • The new prisoner’s dilemma in AI is whether frontier labs release their best model via API. If everyone agrees not to, Chinese open source falls behind. If anyone defects, the defector pulls ahead on revenue and resources, forcing everyone else to defect.
    • Google still owns the largest compute installed base. Without TPU’s prior cost advantage, this matters more. YouTube data has real value in a world of robotics. GCP is going crazy.
    • Meta deserves credit for becoming AI first internally faster than any other internet giant. Musa, their first MSL model, is impressively close to the Pareto frontier.
    • Amazon is strong because of Trainium and robotics driven retail P&L efficiency. Nova is better than it gets credit for.
    • Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Microsoft products rather than reselling to OpenAI is a courageous and probably correct call, even at the cost of an eight hundred dollar stock price.
    • The hyperscalers most engaged with startups are Amazon and Nvidia by a mile, followed by Google. Broadcom is the favorite ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement and that will cost them as the best teams are now at startups.
    • Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion at the speed of FaceTime is already feasible.
    • Ukraine is winning largely on the back of having the best battlefield AI outside America and Israel. Adversaries are starting to internalize what AI dominance means geopolitically.
    • An optimistic read is that this becomes a new Pax Americana, the way the post 1945 American nuclear monopoly was used to rebuild Germany and Japan rather than dominate.
    • AI cured a friend’s daughter’s rare disease by spinning up a research effort that identified a market drug capable of impacting her condition. That is the upside that keeps Gavin an AI optimist maximalist.

    Detailed Summary

    The most extraordinary moment in the history of capitalism

    Gavin’s framing of the current moment is unusually direct. Anthropic added eleven billion dollars of annual recurring revenue in a single month. The three highest profile SaaS companies of the last decade plus, Palantir, Snowflake, and Databricks, took a decade and tens of thousands of employees collectively to build the combined business that Anthropic added in thirty days. He has been investing through every major tech cycle and says there is no historical analog. Not the dotcom era, not the cloud transition, not mobile. This is its own thing.

    The market response, then, was peculiar. The NASDAQ sold off into the single most bullish moment for AI fundamentals on record. Tech traded at roughly its widest discount versus the rest of the market in a decade. Investors who said they wished they had bought into AI during 2022, during COVID, or during Deep Seek Monday got the same valuation setup again in early April, this time with an even clearer inflection.

    Why the Strait of Hormuz closing was secretly bullish for America

    One reason the macro fear in March may have been mispriced is that the same geopolitical event that drove the selloff was, in practice, a relative benefit to the United States. American natural gas, the input into American electricity, which is the input into American AI training and inference, fell roughly twenty percent. Asian and European natural gas prices doubled or tripled. The US emerged with sharply improved relative manufacturing competitiveness, which is exactly what the current administration cares about.

    The 1970s comparison does not hold. The US economy is dramatically less energy intensive, it is now the world’s largest producer and largest exporter of oil and gas, and there are no shortages, only price moves. That backdrop made it easier for disciplined investors to stay focused on AI fundamentals through the volatility.

    Anthropic and OpenAI valuations on an unconstrained run rate

    Anthropic at roughly nine hundred billion for fifty billion of ARR sounds rich until you adjust for the fact that the company is severely compute constrained. Gavin estimates that, unconstrained, Anthropic might be at one hundred fifty to two hundred billion in run rate revenue, putting the implied multiple closer to five times. He also points out that Claude Opus now generates roughly seventy percent fewer tokens for the same question than it used to. Token quantity correlates with answer quality, and Anthropic is rate limiting and shrinking outputs to ration capacity across its user base.

    Anthropic and OpenAI are also structurally very different. Anthropic has burned around eighty percent less cash than OpenAI to reach a comparable revenue scale. That implies very different long term returns on invested capital, though OpenAI has done a better job locking in compute and Sarah Friar is one of the most exceptional CFOs Gavin has worked with.

    Why neither lab is raising at a three trillion dollar valuation

    The answer Gavin gives is that both labs are deliberately leaving valuation on the table the way Elon has done for two decades. SpaceX compounded at low thirty percent annually for a decade because Elon never pushed price. The result is a permanent superpower of access to capital. Investors trust him because they have made money with him for twenty years. That is a moat that compounds with every round.

    Anthropic could probably raise at a one hundred percent premium to its rumored latest mark. They are choosing not to. In an uncertain world (Ukraine, Russia, Iran, Taiwan), preserving the ability to raise more capital later at fair prices is more valuable than maximizing this round.

    Watts and wafers, the two real constraints

    Capitalism is solving the watts problem. The leading PE infrastructure investors now say zoning and political approval, not chips or energy, are the gating factors. Companies are deferring big capex announcements until after the US midterms. Turbine capacity is being doubled at the manufacturers. Companies like Boom Aerospace are repurposing jet engines for grid use. Watts probably ease meaningfully in 2027 and 2028 and then orbital compute does the rest.

    Wafers are the harder problem because they live in Taiwan, run on handshakes, and depend on a corporate culture that does not respond to public market incentives. TSMC is essentially the GDP, water consumption, and electricity consumption of Taiwan. Its leadership treats the company as the legacy of Morris Chang. The Silicon Shield doctrine is real and internal.

    Orbital compute as racks in space

    The biggest mental update Gavin asks listeners to make is to stop picturing data centers in space as Pentagon sized space stations. A Blackwell rack is three thousand pounds and roughly the size of a refrigerator. SpaceX has shown a concept satellite of about that size. Solar wings extend five hundred feet to each side and the radiator extends hundreds of feet behind, both possible because the orbit is sun synchronous and the orientation is fixed relative to the sun.

    SpaceX engineers Gavin has spoken to at Starbase express genuine confidence that they have solved cooling at these power levels. They have. Starlink V3 satellites already operate at twenty kilowatts. A Blackwell rack is one hundred kilowatts. The same company operates the world’s largest satellite fleet and the world’s largest data center on Earth via xAI Colossus. The racks are connected to each other with lasers traveling through vacuum, technology already deployed in every Starlink. The naysayers, Gavin observes, are armchair skeptics and Larry Ellison’s response (he is out there landing rockets, no one else is) is the right frame.

    Terafab in Texas and the threat to TSMC’s discipline

    Terafab, the SpaceX and Tesla joint venture, intends to be the largest fab in the world. The partnership with Intel grants access to fifty years of foundry institutional knowledge, allowing Terafab to start three to five quarters behind the leading node rather than fifteen years behind. The A teams at the semicap equipment companies (ASML, KLA, Lam Research, Applied Materials) will follow Elon’s reputation in hardware engineering the same way they followed TSMC twenty years ago when Intel stumbled.

    The talent strategy is the part most observers underestimate. Recruit the best engineers globally, then import their families, their restaurants, their staff. Build Taiwan Town, Japan Town, and Korea Town next to the fab. Optimize the human experience for the people whose work matters. Intel and Samsung do not think that way.

    Bubble watch and the year 2000 comparison

    Every foundational technology in modern history has had a bubble. Railroads, canals, the internet. Carlota Perez documented why. Markets correctly identify the importance, diversity of opinion collapses, supply gets ahead of demand, the bubble crashes. The current cycle has two important differences. The buildout is overwhelmingly funded out of operating cash flow, not debt. Every GPU is running at one hundred percent utilization, while at the peak of the fiber bubble ninety nine percent of fiber was unused.

    TSMC discipline is the single largest reason a bubble has not formed. If Jensen could buy everything TSMC could theoretically make, Nvidia could sell two to three trillion dollars of GPUs in 2026 and 2027. At some point that becomes more than the market can absorb. If Intel or Samsung Foundry catches up at the leading node, the other will too. TSMC’s pricing discipline collapses and the bubble starts.

    The Pareto frontier and the loss of Google’s cost advantage

    The most important chart in AI is the Pareto frontier of model intelligence versus per token cost. Nine months ago, Google’s TPU based models dominated every point on it. OpenAI, Anthropic, and xAI sat inside the frontier. Today the frontier is dominated by Anthropic and OpenAI, with Grok 4.3 on the frontier and Gemini 3.1 hanging on by subsidization more than economics. The most likely cause is Google’s conservative TPU V8 design, an attempt to reduce dependence on Broadcom and Nvidia that sacrificed per token economics.

    The bitter lesson, frontier tokens, and continual learning

    Three open questions dominate AI investing. The first is whether Richard Sutton’s bitter lesson (more compute beats human algorithmic cleverness) gets violated by ASI itself optimizing for efficiency. Closer observers of AI are more skeptical of a violation. Gavin thinks ASI’s first move will be to make itself more efficient and more resourced, which is technically a temporary violation.

    The second is whether frontier tokens keep capturing the overwhelming share of economic value at the model layer. Today they do, surprisingly. Gemini 3.1 Pro was mindblowing nine months ago and is intolerable today. The third is when continual learning arrives. Today’s models need a million fire touches to learn what a human learns from one. True continual learning would mean dynamic weight updates in real time and would produce a fast takeoff.

    From all you can eat to usage based AI pricing

    AI is shifting from flat fee plans to usage based pricing. The historical analogy is cellular and long distance. Both stopped being great growth industries when they went all you can eat. AI just made the opposite move. The consequence is that flat fee subscribers, even on premium consumer plans, get a rate limited and token throttled version of the frontier model. Enterprise plans with usage based billing are now required to evaluate true capability. Gavin thinks the combination of new compute coming online and usage based pricing is what gets OpenAI and Anthropic past two hundred billion in combined ARR this year.

    Chip startups, prefill decode disaggregation, and Cerebras

    Trying to build a better GPU is the wrong move. The four scaled players (Nvidia, AMD, Trainium, TPU) have copy capability for any one to three percent share design that looks attractive. The good news for startups is that disaggregated inference (separating prefill and decode) opens a richer design canvas. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently. Andrew Fox’s analogy is a British naval ship of the eighteenth century. Prefill is loading the cannon. Decode is firing it.

    Cerebras is the model. Wafer scale computing is genuinely different and genuinely hard. It took three generations of chips to get right. Andrew Feldman and his team had the grit to keep going through chip one being a failure. The design has a high ratio of on chip compute and memory relative to shoreline IO, which is why Cerebras is now experimenting with putting an optical wafer on top of the compute wafer to solve scale out.

    GPU useful lives and the rescue of private credit

    One of the strongest claims in the conversation is that disaggregated inference will stretch GPU useful lives to ten or fifteen years. The skeptical narrative (GPUs are obsolete in two years, companies are cooking their depreciation books) is wrong. You can put a Cerebras system or Groq LPU in front of older Hopper or Ampere parts, use them only for prefill, and run them until they physically melt. Private credit, which is in pain from SaaS loans and which underwrote GPU loans on three to four year lives, may be saved by this.

    If GPU financing rates can come down from low sevens to five or six percent, the mathematics of the AI buildout improves materially. That is a structural tailwind that compounds for years.

    The application layer, the token path, and a new prisoner’s dilemma

    Trillions of dollars of value have been destroyed at the application layer, not created. Cursor and Cognition are the rare scaled exceptions, and they got there by focusing on coding very early. As Amjad Masad noted, coding is plausibly the shortest path to ASI because a coding agent can write itself into any new domain. Jamin Ball’s frame is that the new venture filter is whether the company is in the token path. Data Bricks is. Most application layer startups are not.

    Jensen could probably get close to the frontier with Nemotron whenever he wants, and the strategic question of whether to do that is a new prisoner’s dilemma. If every frontier lab agrees not to release best models via API, Chinese open source falls steadily behind. If anyone defects, the defector gains revenue and resources, and everyone else has to defect. The same dynamic exists between TSMC, Intel, and Samsung. If Nvidia or AMD ever truly used an alternative foundry, that foundry would catch up rapidly.

    Rating the hyperscalers

    Google has the largest compute installed base, the YouTube data that matters in a robotics world, and a search business that prints. Their loss of TPU cost leadership is the surprise of the year. If Google IO in five days does not produce a leapfrog model, the Nvidia centric narrative gets even stronger.

    Meta deserves real credit. Zuckerberg made Meta AI first internally faster than any other internet giant, paid up for the talent contracts when no one else would, and shipped Musa as a first model from MSL that is close to the Pareto frontier. Amazon is well positioned on Trainium, robotics in retail, and a Nova model line that is better than it gets credit for. Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Copilot rather than reselling to OpenAI is courageous and probably correct, even at the cost of stock price.

    The most interesting cross hyperscaler metric is startup engagement. Nvidia and Amazon engage deeply with startups. Google is next. Broadcom is the favored ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement, which Gavin believes will cost them as the best teams now sit at startups.

    Personal safety, geopolitics, and the Pax Americana case

    The closing section turns darker. Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion via something that looks exactly like your child calling on FaceTime is already feasible. Political violence against AI leaders is a real concern. Geopolitically, Ukraine is winning largely because it has the best battlefield AI outside America and Israel. How adversaries respond to that asymmetry is the next great variable.

    Gavin’s optimistic frame is the Pax Americana. After 1945 the US had a nuclear monopoly and could have controlled the world. Instead it rebuilt Germany and Japan, both of which became the most reliable American allies for the next eighty years. If AI dominance plays out similarly, this is a generationally positive story rather than a destabilizing one. The personal anecdote that closes the conversation is a friend whose daughter was diagnosed with a rare genetic condition. He spun up agents, identified a drug already on the market that addresses her mutation, and her life is immeasurably different because of AI. That is the upside.

    Thoughts

    The Anthropic eleven billion in a month framing is the kind of stat that resets priors. The right way to interpret it is not as a one off but as a measure of how fast value can compound when the underlying technology improves on a curve steeper than the ability of the rest of the economy to absorb it. The skeptical question is whether that ARR is durable or whether it is heavily tied to a customer base of other AI companies that are themselves on a single venture funded year of runway. The bullish answer is that frontier coding, frontier research, and frontier enterprise tasks are not going to stop being valuable, and Anthropic is the best at all three. Both can be true. The number is still extraordinary.

    The argument that TSMC discipline is the only thing preventing a bubble is the analytically tightest part of the conversation. The implied trade is to watch TSMC capacity additions like a hawk and to be more, not less, cautious if Intel Foundry or Samsung Foundry ever announce real share at the leading node. The Terafab thesis is more speculative but more interesting. If Elon’s talent recruiting playbook works and the Intel partnership gives Terafab a real seat at the table within five years, the geometry of the global semiconductor industry shifts in a way that is bullish for American manufacturing, bullish for power and water infrastructure in Texas, and ambiguous for TSMC itself.

    The Pareto frontier discussion deserves more attention than it usually gets. Pricing leadership in AI is not a vanity metric. It determines who can subsidize free tier usage, who can absorb compute shortages, who can ship cheaper enterprise plans, and ultimately whose model becomes the default for any given workload. Google losing per token leadership in nine months is one of the most under analyzed events in the sector and it explains a lot about why Anthropic and OpenAI are growing the way they are. If Google IO does not produce a leapfrog model, the implied verdict on TPU V8 design choices gets a lot harsher.

    The application layer destruction point is worth sitting with. Founders building on top of frontier models are competing in a world where the model itself moves faster than any moat they can build, where the model lab can absorb their niche if it gets interesting, and where the only protection is either deep token path integration or a niche so small the lab does not bother. That is a much harsher venture environment than the early SaaS era. The compensating opportunity is that one human can now run a hundred agents, so the ceiling on what a small team can build is correspondingly higher. The bet is that productivity per founder rises faster than competitive pressure from the labs. We will find out.

    The orbital compute pitch is the section that will polarize listeners. The naive read is that this is science fiction. The closer read is that every component (sun synchronous orbit, laser interconnect, twenty kilowatt satellite buses, ten thousand satellite manufacturing cadence, full rocket reusability) already exists. The remaining engineering problems are repair, maintenance, and radiator scale, all of which are real but tractable on a five to ten year horizon. The strategic implication is that the political and zoning ceiling on terrestrial data centers becomes less binding if orbital compute is a credible alternative for inference workloads. The investor implication is that being short the watts and cooling complex on a five year horizon is a real trade, not a meme.

    Watch the full conversation here.

  • Andrej Karpathy on Vibe Coding vs Agentic Engineering: Why He Feels More Behind Than Ever in 2026

    Andrej Karpathy, co-founder of OpenAI, former head of AI at Tesla, and now founder of Eureka Labs, returned to Sequoia Capital’s AI Ascent 2026 stage for a wide-ranging conversation with partner Stephanie Zhan. One year after coining the term “vibe coding,” Karpathy unpacked what has changed, why he has never felt more behind as a programmer, and why the discipline emerging on top of vibe coding, which he calls agentic engineering, is the more serious craft worth learning right now.

    The conversation covered Software 3.0, the limits of verifiability, why LLMs are better understood as ghosts than animals, and why you can outsource your thinking but never your understanding. Below is a complete breakdown of the talk for anyone building, hiring, or learning in the agent era.

    TLDW

    Karpathy describes a sharp transition that happened in December 2025, when agentic coding tools crossed a threshold and code chunks just started coming out fine without correction. He frames the current moment as Software 3.0, where prompting an LLM is the new programming, and entire app categories are collapsing into a single model call. He distinguishes vibe coding (raising the floor for everyone) from agentic engineering (preserving the professional quality bar at much higher speed). Models remain jagged because they are trained on what labs choose to verify, so founders should look for valuable but neglected verifiable domains. Taste, judgment, oversight, and understanding remain uniquely human responsibilities, and tools that enhance understanding are the ones he is most excited about.

    Key Takeaways

    • December 2025 was a clear inflection point. Code chunks from agentic tools started arriving correct without edits, and Karpathy stopped correcting the system entirely.
    • Software 3.0 means programming has become prompting. The context window is your lever over the LLM interpreter, which performs computation in digital information space.
    • Open Code’s installer is a software 3.0 example. Instead of a complex shell script, you copy paste a block of text to your agent, and the agent figures out your environment.
    • The Menu Gen anecdote illustrates how entire apps can become spurious. What used to require OCR, image generation, and a hosted Vercell app can now be a single Gemini plus Nano Banana prompt.
    • Vibe coding raises the floor. Agentic engineering preserves the professional ceiling. The two are different disciplines.
    • The 10x engineer multiplier is now far higher than 10x for people who are good at agentic engineering.
    • Hiring processes have not caught up. Puzzle interviews are the old paradigm. New evaluations should look like building a full Twitter clone for agents and surviving simulated red team attacks from other agents.
    • Models are jagged because reinforcement learning rewards what is verifiable, and labs choose which verifiable domains to invest in. Strawberry letter counts and the 50 meter car wash question show how state-of-the-art models can refactor 100,000 line codebases yet fail at trivial reasoning.
    • If you are in a verifiable setting, you can run your own fine tuning, build RL environments, and benefit even when the labs are not focused on your domain.
    • LLMs are ghosts, not animals. They are statistical simulations summoned from pre training and shaped by RL appendages, not creatures with curiosity or motivation. Yelling at them does not help.
    • Taste, aesthetics, spec design, and oversight remain human jobs. Models still produce bloated, copy paste heavy code with brittle abstractions.
    • Documentation is still written for humans. Agent native infrastructure, where docs are explicitly designed to be copy pasted into an agent, is a major opportunity.
    • The future likely involves agent representation for people and organizations, with agents talking to other agents to coordinate meetings and tasks.
    • You can outsource your thinking but not your understanding. Tools that help humans understand information faster are uniquely valuable.

    Detailed Summary

    Why Karpathy Feels More Behind Than Ever

    Karpathy opens by describing how he has been using agentic coding tools for over a year. For most of that period, the experience was mixed. The tools could write chunks of code, but they often required edits and supervision. December 2025 changed everything. With more time during a holiday break and the release of newer models, Karpathy noticed that the chunks just came out fine. He kept asking for more. He cannot remember the last time he had to correct the agent. He started trusting the system, and what followed was a cascade of side projects.

    He wants to stress that anyone whose model of AI was formed by ChatGPT in early 2025 needs to look again. The agentic coherent workflow that genuinely works is a fundamentally different experience, and the transition was stark.

    Software 3.0 Explained

    The Software 1.0 paradigm was writing explicit code. Software 2.0 was programming by curating datasets and training neural networks. Software 3.0 is programming by prompting. When you train a GPT class model on a sufficiently large set of tasks, the model implicitly learns to multitask everything in the data. The result is a programmable computer where the context window is your interface, and the LLM is the interpreter performing computation in digital information space.

    Karpathy gives two concrete examples. The first is Open Code’s installer. Normally a shell script handles installation across many platforms, and these scripts balloon in complexity. Open Code instead provides a block of text you copy paste to your agent. The agent reads your environment, follows instructions, debugs in a loop, and gets things working. You no longer specify every detail. The agent supplies its own intelligence.

    The Menu Gen Story

    The second example is Karpathy’s Menu Gen project. He built an app that takes a photo of a restaurant menu, OCRs the items, generates pictures for each dish, and renders the enhanced menu. The app runs on Vercell and chains together multiple services. Then he saw a software 3.0 alternative. You take a photo, give it to Gemini, and ask it to use Nano Banana to overlay generated images onto the menu. The model returns a single image with everything rendered. The entire app he built is now spurious. The neural network does the work. The prompt is the photo. The output is the photo. There is no app between them.

    Karpathy uses this to argue that founders should not just think of AI as a speedup of existing patterns. Entirely new things become possible. His example is LLM driven knowledge bases that compile a wiki for an organization from raw documents. That is not a faster version of older code. It is a new capability with no prior equivalent.

    What Will Look Obvious in Hindsight

    Stephanie Zhan asks what the equivalent of building websites in the 1990s or mobile apps in the 2010s looks like today. Karpathy speculates about completely neural computers. Imagine a device that takes raw video and audio as input, runs a neural net as the host process, and uses diffusion to render a unique UI for each moment. He notes that early computing in the 1950s and 60s was undecided between calculator like and neural net like architectures. We went down the calculator path. He thinks the relationship may eventually flip, with neural networks becoming the host and CPUs becoming co processors used for deterministic appendages.

    Verifiability and Jagged Intelligence

    Karpathy spent significant writing time on verifiability. Classical computers automate what you can specify in code. The current generation of LLMs automates what you can verify. Frontier labs train models inside giant reinforcement learning environments, so the models peak in capability where verification rewards are strong, especially math and code. They stagnate or get rough around the edges elsewhere.

    This explains the jagged intelligence puzzle. The classic example was counting letters in strawberry. The newer one Karpathy offers: a state of the art model will refactor a 100,000 line codebase or find zero day vulnerabilities, then tell you to walk to a car wash 50 meters away because it is so close. The two coexisting capabilities should be jarring. They reveal that you must stay in the loop, treat models as tools, and understand which RL circuits your task lands in.

    He also points out that data distribution choices matter. The jump in chess capability from GPT 3.5 to GPT 4 came largely because someone at OpenAI added a huge amount of chess data to pre training. Whatever ends up in the mix gets disproportionately good. You are at the mercy of what labs prioritize, and you have to explore the model the labs hand you because there is no manual.

    Founder Advice in a Lab Dominated World

    Asked what founders should do given that labs are racing toward escape velocity in obvious verifiable domains, Karpathy points back to verifiability itself. If your domain is verifiable but currently neglected, you can build RL environments and run your own fine tuning. The technology works. Pull the lever with diverse RL environments and a fine tuning framework, and you get something useful. He hints there is one specific domain he finds undervalued but declines to name it on stage.

    On the question of what is automatable only from a distance, Karpathy says almost everything can ultimately be made verifiable. Even writing can be assessed by councils of LLM judges. The differences are in difficulty, not in possibility.

    From Vibe Coding to Agentic Engineering

    Vibe coding raises the floor. Anyone can build something. Agentic engineering preserves the professional quality bar that existed before. You are still responsible for your software. You are still not allowed to ship vulnerabilities. The question is how you go faster without sacrificing standards. Karpathy calls it an engineering discipline because coordinating spiky, stochastic agents to maintain quality at speed requires real skill.

    The ceiling on agentic engineering capability is very high. The old idea of a 10x engineer is now an understatement. People who are good at this peak far above 10x.

    What Mediocre Versus AI Native Looks Like

    Karpathy compares this to how different generations use ChatGPT. The difference between a mediocre and an AI native engineer using Claude Code, Codex, or Open Code is investment in setup and full use of available features. The same way previous generations of engineers got the most out of Vim or VSCode, today’s strong engineers tune their agentic environments deeply.

    He thinks hiring processes have not caught up. Most companies still hand out puzzles. The new test should look like asking a candidate to build a full Twitter clone for agents, make it secure, simulate user activity with agents, and then run multiple Codex 5.4x high instances trying to break it. The candidate’s system should hold up.

    What Humans Still Own

    Agents are intern level entities right now. Humans are responsible for aesthetics, judgment, taste, and oversight. Karpathy describes a Menu Gen bug where the agent tried to associate Stripe purchases with Google accounts using email addresses as the key, instead of a persistent user ID. Email addresses can differ between Stripe and Google accounts. This kind of specification level mistake is exactly what humans must catch.

    He works with agents to design detailed specs and treats those as documentation. The agent fills in the implementation. He has stopped memorizing API details for things like NumPy axis arguments or PyTorch reshape versus permute. The intern handles recall. Humans handle architecture, design, and the right questions.

    Reading the actual code agents produce can still cause heart attacks. It is bloated, full of copy paste, riddled with awkward and brittle abstractions. His Micro GPT project, an attempt to simplify LLM training to its bare essence, was nearly impossible to drive through agents. The models hate simplification. That capability sits outside their RL circuits. Nothing is fundamentally preventing this from improving. The labs simply have not invested.

    Animals Versus Ghosts

    Karpathy returns to his framing that we are not building animals, we are summoning ghosts. Animal intelligence comes from evolution and is shaped by intrinsic motivation, fun, curiosity, and empowerment. LLMs are statistical simulation circuits where pre training is the substrate and RL is bolted on as appendages. They are jagged. They do not respond to being yelled at. They have no real curiosity. The ghost framing is partly philosophical, but it changes how you approach them. You stay suspicious. You explore. You do not assume the system you used yesterday will behave the same on a new task.

    Agent Native Infrastructure

    Most software, frameworks, libraries, and documentation are still written for humans. Karpathy’s pet peeve is being told to do something instead of being given a block of text to copy paste to his agent. He wants agent first infrastructure. The Menu Gen project’s hardest part was not writing code. It was deploying on Vercell, configuring DNS, navigating service settings, and stringing together integrations. He wants to give a single prompt and have the entire thing deployed without touching anything.

    Long term he expects agent representation for individuals and organizations. His agent will negotiate meeting details with your agent. The world becomes one of sensors, actuators, and agent native data structures legible to LLMs.

    Education and What Still Matters

    The most striking line of the conversation comes near the end. Karpathy quotes a tweet that shaped his thinking: you can outsource your thinking but you cannot outsource your understanding. Information still has to make it into your brain. You still need to know what you are building and why. You cannot direct agents well if you do not understand the system.

    This is part of why he is so excited about LLM driven knowledge bases. Every time he reads an article, his personal wiki absorbs it, and he can query it from new angles. Every projection onto the same information yields new insight. Tools that enhance human understanding are uniquely valuable because LLMs do not excel at understanding. That bottleneck is yours to manage.

    Thoughts

    The most useful frame in this talk is the distinction between vibe coding and agentic engineering. It clarifies what has been muddled for the past year. Vibe coding is about access. Anyone can produce something. Agentic engineering is about discipline. You preserve the standards that made software trustworthy in the first place, while moving at speeds that would have seemed absurd two years ago. These are not the same activity, and conflating them is part of why so many shipped products feel half built.

    The Menu Gen anecdote is the kind of story that should make every solo developer pause. If a single Gemini plus Nano Banana prompt can replace a multi service Vercell deployed app, the question for any builder becomes how much of what you are working on right now is going to be made spurious by the next model release. The honest answer is probably more than you want to admit. The defensive posture is not building thicker apps. It is choosing problems where the model alone is not enough, where taste, distribution, infrastructure, or specific verifiable RL environments give you something the next model cannot collapse into a prompt.

    The verifiability lens is also unusually practical. If you are a solo builder, the question shifts from what is possible to what is verifiable but neglected. The labs will eat the obvious verifiable domains because that is how their RL pipelines are set up. The opportunity is in domains where verification is possible but the labs have not yet invested. That is a much more concrete strategic filter than vague intuitions about defensibility.

    The car wash example is going to stick. State of the art models can refactor enormous codebases and still tell you to walk somewhere a sane person would drive. That is the lived reality of jagged intelligence, and it argues strongly for staying in the loop on real decisions rather than handing off everything to agents. The agents are excellent fillers of blanks. They are not yet trustworthy specifiers of the spec.

    Finally, the line about outsourcing thinking but not understanding is worth taping above the desk. The bottleneck is no longer typing speed, syntax recall, or even API knowledge. It is whether the human in the loop actually understands the system being built. Tools that genuinely improve human understanding, including personal knowledge bases that re project information through different prompts, are likely the most undervalued category of products being built right now. The opportunity is not just in agents. It is in the cognitive scaffolding that makes humans good directors of agents.

  • Andrej Karpathy on AutoResearch, AI Agents, and Why He Stopped Writing Code: Full Breakdown of His 2026 No Priors Interview

    TL;DW

    Andrej Karpathy sat down with Sarah Guo on the No Priors podcast (March 2026) and delivered one of the most information-dense conversations about the current state of AI agents, autonomous research, and the future of software engineering. The core thesis: since December 2025, Karpathy has essentially stopped writing code by hand. He now “expresses his will” to AI agents for 16 hours a day, and he believes we are entering a “loopy era” where autonomous systems can run experiments, train models, and optimize hyperparameters without a human in the loop. His project AutoResearch proved this works by finding improvements to a model he had already hand-tuned over two decades of experience. The conversation also covers the death of bespoke apps, the future of education, open vs. closed source models, robotics, job market impacts, and why Karpathy chose to stay independent from frontier labs.

    Key Takeaways

    1. The December 2025 Shift Was Real and Dramatic

    Karpathy describes a hard flip that happened in December 2025 where he went from writing 80% of his own code to writing essentially none of it. He says the average software engineer’s default workflow has been “completely different” since that month. He calls this state “AI psychosis” and says he feels anxious whenever he is not at the forefront of what is possible with these tools.

    2. AutoResearch: Agents That Do AI Research Autonomously

    AutoResearch is Karpathy’s project where an AI agent is given an objective metric (like validation loss), a codebase, and boundaries for what it can change. It then loops autonomously, running experiments, tweaking hyperparameters, modifying architectures, and committing improvements without any human in the loop. When Karpathy ran it overnight on a model he had already carefully tuned by hand over years, it found optimizations he had missed, including forgotten weight decay on value embeddings and insufficiently tuned Adam betas.

    3. The Name of the Game Is Removing Yourself as the Bottleneck

    Karpathy frames the current era as a shift from optimizing your own productivity to maximizing your “token throughput.” The goal is to arrange tasks so that agents can run autonomously for extended periods. You are no longer the worker. You are the orchestrator, and every minute you spend in the loop is a minute the system is held back.

    4. Mastery Now Means Managing Multiple Agents in Parallel

    The vision of mastery is not writing better code. It is managing teams of agents simultaneously. Karpathy references Peter Steinberg’s workflow of having 10+ Codex agents running in parallel across different repos, each taking about 20 minutes per task. You move in “macro actions” over your codebase, delegating entire features rather than writing individual functions.

    5. Personality and Soul Matter in Coding Agents

    Karpathy praises Claude’s personality, saying it feels like a teammate who gets excited about what you are building. He contrasts this with Codex, which he calls “very dry” and disengaged. He specifically highlights that Claude’s praise feels earned because it does not react equally to half-baked ideas and genuinely good ones. He credits Peter (OpenClaw) with innovating on the “soul” of an agent through careful prompt design, memory systems, and a unified WhatsApp interface.

    6. Apps Are Dead. APIs and Agents Are the Future.

    Karpathy built “Dobby the Elf Claw,” a home automation agent that controls his Sonos, lights, HVAC, shades, pool, spa, and security cameras through natural language over WhatsApp. He did this by having agents scan his local network, reverse-engineer device APIs, and build a unified dashboard. His conclusion: most consumer apps should not exist. Everything should be API endpoints that agents can call on behalf of users. The “customer” of software is increasingly the agent, not the human.

    7. AutoResearch Could Become a Distributed Computing Project

    Karpathy envisions an “AutoResearch at Home” model inspired by SETI@home and Folding@home. Because it is expensive to find code optimizations but cheap to verify them (just run the training and check the metric), untrusted compute nodes on the internet could contribute experimental results. He draws an analogy to blockchain: instead of blocks you have commits, instead of proof of work you have expensive experimentation, and instead of monetary reward you have leaderboard placement. He speculates that a global swarm of agents could potentially outperform frontier labs.

    8. Education Is Being Redirected Through Agents

    Karpathy describes his MicroGPT project, a 200-line distillation of LLM training to its bare essence. He says he started to create a video walkthrough but realized that is no longer the right format. Instead, he now “explains things to agents,” and the agents can then explain them to individual humans in their own language, at their own pace, with infinite patience. He envisions education shifting to “skills” (structured curricula for agents) rather than lectures or guides for humans directly.

    9. The Jaggedness Problem Is Still Real

    Karpathy describes current AI agents as simultaneously feeling like a “brilliant PhD student who has been a systems programmer their entire life” and a 10-year-old. He calls this “jaggedness,” and it stems from reinforcement learning only optimizing for verifiable domains. Models can move mountains on agentic coding tasks but still tell the same bad joke they told four years ago (“Why don’t scientists trust atoms? Because they make everything up.”). Things outside the RL reward loop remain stuck.

    10. Open Source Is Healthy and Necessary, Even If Behind

    Karpathy estimates open source models are now roughly 6 to 8 months behind closed frontier models, down from 18 months and narrowing. He draws a parallel to Linux: the industry has a structural need for a common, open platform. He is “by default very suspicious” of centralization and wants more labs, more voices in the room, and an “ensemble” approach to AI governance. He thinks it is healthy that open source exists slightly behind the frontier, eating through basic use cases while closed models handle “Nobel Prize kind of work.”

    11. Digital Transformation Will Massively Outpace Physical Robotics

    Karpathy predicts a clear ordering: first, a massive wave of “unhobling” in the digital space where everything gets rewired and made 100x more efficient. Then, activity moves to the interface between digital and physical (sensors, cameras, lab equipment). Finally, the physical world itself transforms, but on a much longer timeline because “atoms are a million times harder than bits.” He notes that robotics requires enormous capital expenditure and conviction, and most self-driving startups from 10 years ago did not survive long term.

    12. Why Karpathy Stays Independent From Frontier Labs

    Karpathy gives a nuanced answer about why he is not working at a frontier lab. He says employees at these labs cannot be fully independent voices because of financial incentives and social pressure. He describes this as a fundamental misalignment: the people building the most consequential technology are also the ones who benefit most from it financially. He values being “more aligned with humanity” outside the labs, though he acknowledges his judgment will inevitably drift as he loses visibility into what is happening at the frontier.

    Detailed Summary

    The AI Psychosis and the End of Hand-Written Code

    The conversation opens with Karpathy describing what he calls a state of perpetual “AI psychosis.” Since December 2025, he has not typed a line of code. The shift was not gradual. It was a hard flip from doing 80% of his own coding to doing almost none. He compares the anxiety of unused agent capacity to the old PhD feeling of watching idle GPUs. Except now, the scarce resource is not compute. It is tokens, and you feel the pressure to maximize your token throughput at all times.

    He describes the modern workflow: you have multiple coding agents (Claude Code, Codex, or similar harnesses) running simultaneously across different repositories. Each agent takes about 20 minutes on a well-scoped task. You delegate entire features, review the output, and move on. The job is no longer typing. It is orchestration. And when it does not work, the overwhelming feeling is that it is a “skill issue,” not a capability limitation.

    Karpathy says most people, even his own parents, do not fully grasp how dramatic this shift has been. The default workflow of any software engineer sitting at a desk today is fundamentally different from what it was six months ago.

    AutoResearch: Closing the Loop on AI Research

    The centerpiece of the conversation is AutoResearch, Karpathy’s project for fully autonomous AI research. The setup is deceptively simple: give an agent an objective metric (like validation loss on a language model), a codebase to modify, and boundaries for what it can change. Then let it loop. It generates hypotheses, runs experiments, evaluates results, and commits improvements. No human in the loop.

    Karpathy was surprised it worked as well as it did. He had already hand-tuned his NanoGPT-derived training setup over years using his two decades of experience. When he let AutoResearch run overnight, it found improvements he had missed. The weight decay on value embeddings was forgotten. The Adam optimizer betas were not sufficiently tuned. These are the kinds of things that interact with each other in complex ways that a human researcher might not systematically explore.

    The deeper insight is structural: everything around frontier-level intelligence is about extrapolation and scaling laws. You do massive exploration on smaller models and then extrapolate to larger scales. AutoResearch is perfectly suited for this because the experimentation is expensive but the verification is cheap. Did the validation loss go down? Yes or no.

    Karpathy envisions this scaling beyond a single machine. His “AutoResearch at Home” concept borrows from distributed computing projects like Folding@home. Because verification is cheap but search is expensive, you can accept contributions from untrusted workers across the internet. He draws a blockchain analogy: commits instead of blocks, experimentation as proof of work, leaderboard placement as reward. A global swarm of agents contributing compute could, in theory, rival frontier labs that have massive but centralized resources.

    The Claw Paradigm and the Death of Apps

    Karpathy introduces the concept of the “claw,” a persistent, looping agent that operates in its own sandbox, has sophisticated memory, and works on your behalf even when you are not watching. This goes beyond a single chat session with an AI. A claw has persistence, autonomy, and the ability to interact with external systems.

    His personal example is “Dobby the Elf Claw,” a home automation agent that controls his entire smart home through WhatsApp. The agent scanned his local network, found his Sonos speakers, reverse-engineered the API, and started playing music in three prompts. It did the same for his lights, HVAC, shades, pool, spa, and security cameras (using a Qwen vision model for change detection on camera feeds).

    The broader point is that this renders most consumer apps unnecessary. Why maintain six different smart home apps when a single agent can call all the APIs directly? Karpathy argues the industry needs to reconfigure around the idea that the customer is increasingly the agent, not the human. Everything should be exposed API endpoints. The intelligence layer (the LLM) is the glue that ties it all together.

    He predicts this will become table stakes within a few years. Today it requires vibe coding and direct agent interaction. Soon, even open source models will handle this trivially. The barrier will come down until every person has a claw managing their digital life through natural language.

    Model Jaggedness and the Limits of Reinforcement Learning

    One of the most technically interesting sections covers what Karpathy calls “jaggedness.” Current AI models are simultaneously superhuman at verifiable tasks (coding, math, structured reasoning) and surprisingly mediocre at anything outside the RL reward loop. His go-to example: ask any frontier model to tell you a joke, and you will get the same one from four years ago. “Why don’t scientists trust atoms? Because they make everything up.” The models have improved enormously, but joke quality has not budged because it is not being optimized.

    This jaggedness creates an uncanny valley in interaction. Karpathy describes the experience as talking to someone who is simultaneously a brilliant PhD systems programmer and a 10-year-old. Humans have some variance in ability across domains, but nothing like this. The implication is that the narrative of “general intelligence improving across all domains for free as models get smarter” is not fully accurate. There are blind spots, and they cluster around anything that lacks objective evaluation criteria.

    He and Sarah Guo discuss whether this should lead to model “speciation,” where specialized models are fine-tuned for specific domains rather than one monolithic model trying to be good at everything. Karpathy thinks speciation makes sense in theory (like the diversity of brains in the animal kingdom) but says the science of fine-tuning without losing capabilities is still underdeveloped. The labs are still pursuing monocultures.

    Open Source, Centralization, and Power Balance

    Karpathy, a long-time open source advocate, estimates the gap between closed and open source models has narrowed from 18 months to roughly 6 to 8 months. He draws a direct parallel to Linux: despite closed alternatives like Windows and macOS, the industry structurally needs a common open platform. Linux runs on 60%+ of computers because businesses need a shared foundation they feel safe using.

    The challenge for open source AI is capital expenditure. Training frontier models is astronomically expensive, and that is where the comparison to Linux breaks down somewhat. But Karpathy argues the current dynamic is actually healthy: frontier labs push the bleeding edge with closed models, open source follows 6 to 8 months behind, and that trailing capability is still enormously powerful for the vast majority of use cases.

    He expresses deep skepticism about centralization, citing his Eastern European background and the historical track record of concentrated power. He wants more labs, more independent voices, and an “ensemble” approach to decision-making about AI’s future. He worries about the current trend of further consolidation even among the top labs.

    The Job Market: Digital Unhobling and the Jevons Paradox

    Karpathy recently published an analysis of Bureau of Labor Statistics jobs data, color-coded by which professions primarily manipulate digital information versus physical matter. His thesis: digital professions will be transformed first and fastest because bits are infinitely easier to manipulate than atoms. He calls this “unhobling,” the release of a massive overhang of digital work that humans simply did not have enough thinking cycles to process.

    On whether this means fewer software engineering jobs, Karpathy is cautiously optimistic. He invokes the Jevons Paradox: when something becomes cheaper, demand often increases so much that total consumption goes up. The canonical example is ATMs and bank tellers. ATMs were supposed to replace tellers, but they made bank branches cheaper to operate, leading to more branches and more tellers (at least until 2010). Similarly, if AI makes software dramatically cheaper, the demand for software could explode because it was previously constrained by scarcity and cost.

    He emphasizes that the physical world will lag behind significantly. Robotics requires enormous capital, conviction, and time. Most self-driving startups from a decade ago failed. The interesting opportunities in the near term are at the interface between digital and physical: sensors feeding data to AI systems, actuators executing AI decisions in the real world, and new markets for information (he imagines prediction markets where agents pay for real-time photos from conflict zones).

    Education in the Age of Agents

    Karpathy’s MicroGPT project distills the entire LLM training process into 200 lines of Python. He started making an explanatory video but stopped, realizing the format is obsolete. If the code is already that simple, anyone can ask an agent to explain it in whatever way they need: different languages, different skill levels, infinite patience, multiple approaches. The teacher’s job is no longer to explain. It is to create the thing that is worth explaining, and then let agents handle the last mile of education.

    He envisions a future where education shifts from “guides and lectures for humans” to “skills and curricula for agents.” A skill is a set of instructions that tells an agent how to teach something, what progression to follow, what to emphasize. The human educator becomes a curriculum designer for AI tutors. Documentation shifts from HTML for humans to markdown for agents.

    His punchline: “The things that agents can do, they can probably do better than you, or very soon. The things that agents cannot do is your job now.” For MicroGPT, the 200-line distillation is his unique contribution. Everything else, the explanation, the teaching, the Q&A, is better handled by agents.

    Why Not Return to a Frontier Lab?

    The conversation closes with a nuanced discussion about why Karpathy remains independent. He identifies several tensions. First, financial alignment: employees at frontier labs have enormous financial incentives tied to the success of transformative (and potentially disruptive) technology. This creates a conflict of interest when it comes to honest public discourse. Second, social pressure: even without arm-twisting, there are things you cannot say and things the organization wants you to say. You cannot be a fully free agent. Third, impact: he believes his most impactful contributions may come from an “ecosystem level” role rather than being one of many researchers inside a lab.

    However, he acknowledges a real cost. Being outside frontier labs means his judgment will inevitably drift. These systems are opaque, and understanding how they actually work under the hood requires being inside. He floats the idea of periodic stints at frontier labs, going back and forth between inside and outside roles to maintain both independence and technical grounding.

    Thoughts

    This is one of the most honest and technically grounded conversations about the current state of AI I have heard in 2026. A few things stand out.

    The AutoResearch concept is genuinely important. Not because autonomous hyperparameter tuning is new, but because Karpathy is framing the entire problem correctly: the goal is not to build better tools for researchers. It is to remove researchers from the loop entirely. The fact that an overnight run found optimizations that a world-class researcher missed after years of manual tuning is a powerful data point. And the distributed computing vision (AutoResearch at Home) could be the most consequential idea in the entire conversation if someone builds it well.

    The “death of apps” framing deserves more attention. Karpathy’s Dobby example is not a toy demo. It is a preview of how every consumer software company’s business model gets disrupted. If agents can reverse-engineer APIs and unify disparate systems through natural language, the entire app ecosystem becomes a commodity layer beneath an intelligence layer. The companies that survive will be the ones that embrace API-first design and accept that their “user” is increasingly an LLM.

    The jaggedness observation is underappreciated. The fact that models can autonomously improve training code but cannot tell a new joke should be deeply uncomfortable for anyone claiming we are on a smooth path to AGI. It suggests that current scaling and RL approaches produce narrow excellence, not general intelligence. The joke example is funny, but the underlying point is serious: we are building systems with alien capability profiles that do not match any human intuition about what “smart” means.

    Finally, Karpathy’s decision to stay independent is itself an important signal. When one of the most capable AI researchers in the world says he feels “more aligned with humanity” outside of frontier labs, that should be taken seriously. His point about financial incentives and social pressure creating misalignment is not abstract. It is structural. And his proposed solution of rotating between inside and outside roles is pragmatic and worth consideration for the entire field.