PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: codex

OpenCode CEO Jay V on 20x Growth in 6 Months: 13 Million Users, 7 Trillion Tokens a Day, the Anthropic Block That Backfired, and the 16-Year Road to Overnight Success
In this episode of Y Combinator’s Lightcone podcast, Jay V, founder and CEO of OpenCode, the open-source coding agent that works with any model, walks through one of the wildest growth stories in developer tools: 650,000 monthly active users in January to roughly 13 million by June, 7 trillion tokens processed per day, and a business that went from zero to a $40 million revenue run rate in about eight months. He also tells the part almost nobody knows: the company behind this “overnight success” is a 16-year-old legal entity that applied to Y Combinator nine times before getting in.

TLDW

Jay V explains how OpenCode grew 20x in six months to around 13 million monthly active users and 4.6 million weekly actives, processing 7 trillion tokens a day (more than OpenRouter’s entire volume), with an inference business annualizing near $40 million plus 160,000 subscribers worth another $18 million. The inflection point came when Anthropic started blocking Claude Code subscriptions inside OpenCode by rejecting requests whose system prompt contained the words “open code,” which backfired by equating the two products and sending curious users flooding in, shortly after which OpenAI’s Codex officially supported OpenCode. The conversation covers OpenCode’s public usage data (DeepSeek Flash dominating token volume despite GLM hype), a global user base led by China at 17% with heavy usage in Indonesia, Brazil, and Vietnam, Fortune 500 companies discovering thousands of employees already using the tool, the shift from ad-based CAC to token-based CAC, the flat 24-hour GPU utilization curve that comes from serving the whole planet, the “betting the field” marketplace thesis on model commoditization, and the founder’s 16-year, nine-application journey from a Waterloo dorm through SST, OpenNext, and selling coffee over SSH to finally catching lightning.

Thoughts

The Anthropic block is the most instructive growth story in the episode, because it is a perfect modern Streisand effect. Anthropic had a defensible reason to stop subsidized Claude Code subscriptions from flowing through a third-party harness, but the implementation (rejecting any request whose system prompt literally contained “open code”) turned a quiet policy decision into a public endorsement. As Jay puts it, the block placed OpenCode on the same pedestal as Claude Code in the minds of developers who had never heard of it. The hosts’ Instacart comparison is apt: when Amazon bought Whole Foods, the “death of Instacart” meme drove every grocer in America into Instacart’s arms. Incumbents keep learning this lesson the hard way. You cannot block a product without simultaneously advertising that it matters.

The deeper story is geographic. Silicon Valley talks about coding agents as if the $200-per-month power user is the market, and Jay’s data says the opposite. China alone is 17% of OpenCode’s usage, with Indonesia, Brazil, and Vietnam each carrying meaningful share, places where a frontier subscription costs more than rent. OpenCode’s $10 Go plan, running DeepSeek and GLM instead of Sonnet and Opus, is how billions of developers will actually have their first coding-agent moment. There is also a hard operational edge hiding in that distribution: because the East works while the West sleeps, OpenCode’s GPU utilization runs a nearly flat 24-hour cycle, which quietly improves unit economics in a way no single-market competitor can match. Serving the whole planet is not just a mission statement. It is a margin strategy.

OpenCode’s neutrality is turning into one of the most valuable datasets in AI. Because the product is a harness over every model rather than a storefront for one lab, opencode.ai/data shows what developers actually run when they are spending their own money, and it routinely contradicts the Twitter narrative. GLM was supposedly eating DeepSeek’s lunch; the token-volume charts show DeepSeek Flash dipping and then bouncing right back. Users are not loyal, they are rational: they ride frontier limits until they hit caps, then switch to models cheap and fast enough to finish the day’s work. That behavioral reality, boring cost optimization rather than fandom, is what the model market actually looks like once the marketing fog clears, and only a neutral aggregator gets to see it.

The business model inversion deserves more attention than it usually gets. In the last era, customer acquisition cost meant ads. In this one, it means tokens: the free tier is the marketing budget, spent on giving people the aha moment, and the payoff comes when a fraction of those users become whales paying per token, where OpenCode’s volume discounts become margin. This is the same funnel Anthropic and OpenAI run, except the frontier labs subsidize with investor billions while OpenCode rides the falling cost curve of open-weight models. The enterprise motion follows the same bottoms-up physics: no procurement dance, just inbound emails saying thousands of our employees are already using you, please sign the security questionnaire. That is the purest product-market-fit signal that exists.

And then there is the 16-year overnight success. Same legal entity since 2010, same two founders from a Waterloo dorm room, nine YC applications and four interviews before acceptance in 2021, years of living with parents and running out of money, a serverless framework, a coffee shop that ran over SSH. Every “dead end” turns out to have been training: the consumer company taught metrics discipline, SST taught open source and building in public, the terminal storefront taught terminal-UI craft that made OpenCode instantly credible with the Neovim crowd. The hosts land the right conclusion: lightning did strike, but the founders spent a decade positioning the bottle. In an industry currently obsessed with six-month-old unicorns, this episode is a useful reminder that most of them are carrying more history than the headline suggests.

Key Takeaways
- OpenCode ended June 2026 at roughly 13 million monthly active users and 4.6 million weekly actives, close to Codex’s numbers, a 20x increase from about 650,000 monthly actives at the start of the year.
- The platform now processes around 7 trillion tokens per day, more than OpenRouter’s total of roughly 6 trillion, up from about 300 billion per day at the beginning of the year.
- The pay-per-token inference business, launched around late September 2025, annualizes to $31-33 million on June data and $38-40 million on the most recent week, roughly eight months from zero.
- The subscription product launched in late February has grown to about 160,000 monthly subscribers, roughly $18 million in annualized revenue on top of inference.
- A Codex lead engineer publicly noted that about 5% of all Codex subscribers use OpenCode as their main harness, and OpenAI officially supports Codex subscriptions inside OpenCode.
- In the first week of January, Anthropic began blocking Claude Code subscriptions in OpenCode by rejecting any request whose system prompt contained the words “open code.”
- Jay concedes the block made business sense (Anthropic subsidizes that usage) but says it inadvertently equated OpenCode with Claude Code and drove waves of new users to investigate the product.
- The hosts compare it to Amazon buying Whole Foods: the “death of Instacart” meme drove every grocer in America to sign with Instacart, fueling its growth instead of killing it.
- The founding premise is that most people in the world still have not experienced the magic of a coding agent, and frontier per-token prices put that moment out of reach for much of the globe.
- When OpenCode launched in June 2025 the pitch was using your Claude Code subscription in a better terminal UI; by August and September the first credible open-source models (GLM, Kimi, MiniMax) arrived, roughly six months behind the frontier.
- February 2026 marked the first four-week span in OpenCode’s data where users ran Gemini more than the Anthropic models (Sonnet plus Opus combined), which convinced the team the non-Anthropic models were ready for real work and triggered the subscription launch.
- OpenCode publishes its usage data at opencode.ai/data, covering the Go plan where $10 a month buys access to open-source models.
- DeepSeek Flash leads token volume per day, with the two DeepSeek models plus GLM as the top three, despite social media chatter suggesting GLM had overtaken DeepSeek.
- By unique users the top models run DeepSeek Flash at about 38,000, DeepSeek Pro at 31,000, and GLM 5.2 near 30,000.
- A key usage pattern: as users approach daily or weekly limits on premium models, they switch to very cheap models like DeepSeek Flash to finish their work, extending how much coding-agent time their budget buys.
- Speed matters too: some open models are hosted with far higher tokens-per-second than alternatives, making the agent feel near real time, and users perceive quality niches, like GLM 5.2 being better at front-end design.
- China is OpenCode’s largest market at 17% of usage, which the hosts note may make it the only YC company in history with meaningful usage in China, partly because Chinese developers want to run Chinese models and OpenCode gives them that choice.
- Developing countries are huge: Indonesia at 4% of traffic, Brazil at 5%, plus Vietnam and similar markets where a $200-a-month Claude Code subscription is prohibitively expensive.
- The US, which the team was not even targeting with the Go plan, is growing strongly anyway, which Jay reads as a broader vibe shift toward token budgeting even among Americans.
- Large US companies with effectively unlimited token budgets also adopted OpenCode early because they did not want to be locked into a specific model or harness.
- Dozens of forward-leaning Fortune 500 companies have significant OpenCode footprints, often discovered when the company itself emails saying thousands of employees are already using it.
- Enterprise inbound has inverted the old SaaS procurement dance: companies beg OpenCode to fill out security questionnaires so they can officially use a product their engineers already adopted.
- Enterprise pull comes in four flavors: officially blessing developer usage, extending the tool to non-technical employees, embedding the agent loop inside their own products, and managing token spend by routing teams to cheaper models.
- One enterprise asked for deep visibility into exactly what every employee does with the tool, which the team flagged as a should-we-even-build-this question.
- Ramp built a Slack bot running OpenCode’s embeddable server (the agent loop that works behind the UI) before OpenCode had built anything similar internally, publishing a blog post about it in December.
- OpenCode is architected as a two-part product: the terminal UI you interact with, and a separately embeddable server that runs the agent loop and calls the LLM.
- The new CAC is tokens, not ads: the free tier exists to give people the magic moment, the subscription converts them to real work, and whales paying per token feed directly into margin via OpenCode’s volume discounts on inference.
- The episode references Dylan Patel’s podcast claim that Anthropic reached roughly $50 billion annualized revenue at around 70% margin in Q2, proof that the subsidize-then-harvest funnel can cross into profitability.
- Global usage produces a nearly flat 24-hour GPU utilization curve (the East works while the West sleeps), improving unit economics versus competitors serving one region.
- Jay describes OpenCode as a marketplace that showcases model diversity: competition among labs benefits consumers, while vendor lock-in mostly benefits vendor margins.
- OpenCode is now the largest customer by token volume for most open-source model labs, making the relationship symbiotic: the strategy is not picking a winning lab but betting the whole field.
- Every bump in OpenCode’s monthly actives traces back to a corresponding release in the open-source model market, making its growth a proxy for open-model progress.
- The name OpenCode was deliberate positioning: when a market has one or two dominant players, the rest coalesces around an open alternative, and whoever occupies that position first is very hard to displace.
- To support 70+ models and providers at launch, the team built models.dev, an open-source database of models and providers that Jay calls probably the best such dataset in the world.
- The origin moment: when Claude Code appeared in February 2025, the team (Neovim users unimpressed by its terminal UI) decided to build a coding agent that met the standard of modern terminal tools, credibility that resonated instantly with the core developer audience.
- The team had form here: co-founder Dax had built terminal.shop, a complete storefront for buying coffee over SSH, the kind of eccentric-taste project the hosts argue pulls founders toward outlier outcomes.
- The company is one 16-year-old legal entity, incorporated in 2010, founded by Jay and his college roommate Frank after a Waterloo co-op term convinced Jay he never wanted a normal job.
- Jay applied to YC nine times between 2016 and 2021 with four interviews before getting in, with his first interview dating back to the era when Paul Graham ran them and an Airbnb founder was hanging around the waiting room.
- The 2021 YC idea was a serverless platform, Heroku for AWS, which became SST, the team’s first big open-source project and the on-ramp to building in public.
- Building in public became core identity after co-founder Dax observed that if all your code is public and you work in public, staying silent about it is a disservice to the product; the community now follows the company like a reality TV show.
- Jay credits survival to stubbornness, visible forward progress, and cheap burn (living with parents after running out of money), while warning founders: don’t try this at home.
- The hosts’ framing of the whole arc: it took ten years of grinding to get to zero-to-$30-million in eight months, and catching lightning in a bottle requires positioning the bottle correctly first.
Detailed Summary

The Numbers: 20x in Six Months

OpenCode began the year around 650,000 monthly active users and ended June near 13 million, with 4.6 million weekly actives that put it in the same conversation as OpenAI’s Codex. Token throughput grew from roughly 300 billion per day to 7 trillion, a volume larger than all of OpenRouter. The money followed two tracks: a pay-per-token inference business launched in the fall that annualizes near $40 million on recent weeks, and a subscription product launched in late February that reached 160,000 monthly subscribers and about $18 million annualized. Codex officially supporting OpenCode, with around 5% of Codex subscribers choosing it as their harness, added a second frontier on-ramp right as the Anthropic controversy peaked.

The Anthropic Block That Backfired

Using a Claude Code subscription inside OpenCode was one of the most common usage patterns until Anthropic moved to stop it in early January, rejecting requests whose system prompt mentioned “open code.” Jay is gracious about the logic (Anthropic subsidizes subscription usage and wants it inside its own product) but the effect was the opposite of containment. The block put the scrappy open-source harness on the same pedestal as the category leader, told every developer who had not tried it that it was worth investigating, and kicked off the year’s 20x run. The hosts draw the Instacart parallel: a supposed death blow that functioned as the best marketing campaign the company never paid for.

A Global User Base the Valley Doesn’t See

The product premise is that the coding-agent aha moment is a once-a-generation experience most of the world cannot afford at frontier prices. The Go plan ($10 a month for open-source models) was built for that global audience, and the geography shows it: China leads at 17%, with Indonesia at 4%, Brazil at 5%, and Vietnam prominent, markets where $200 a month is simply not a consumer price point. Two surprises followed. Chinese developers use OpenCode partly to run their own country’s models, which no US-locked product lets them do. And the US, never the target for Go, is growing fast anyway, which Jay reads as the token-budgeting vibe shift reaching even the throw-money-at-it crowd, helped by moments like GLM 5.2’s popularity making the plan the easiest way to try it.

What the Usage Data Really Shows

OpenCode publishes per-model usage at opencode.ai/data, and because every data point is an actual end user rather than aggregated API traffic, it is arguably the cleanest picture of what working engineers really run. DeepSeek Flash dominates token volume, the two DeepSeeks plus GLM hold the top three, and the market-share graph shows DeepSeek dipping when GLM launched and then bouncing back, contradicting the Twitter narrative of a GLM takeover. By unique users, Flash leads at 38,000 with DeepSeek Pro at 31,000 and GLM 5.2 near 30,000. The behavioral driver is pragmatic: cheap, fast models let users keep working after they hit premium limits, hosted speeds make some models feel real time, and perceived niches (GLM for front-end design) steer specific workloads.

Enterprises Arriving Through the Back Door

Before the open-model wave, companies adopted OpenCode to avoid lock-in to any single model or harness. Now dozens of forward-thinking Fortune 500 companies have significant footprints, and the procurement process has inverted: instead of sales outreach, OpenCode receives DMs saying a few thousand employees are already using the product, please sign the security questionnaire, and often, please don’t tell anyone. Once inside, enterprises pull in predictable directions: extend access to non-technical staff, embed the agent loop in their own products, and manage token spend by restricting expensive frontier models to teams that need them. Ramp exemplified the embedding path, running a Slack bot on OpenCode’s server component before OpenCode itself had tried it. One request, total visibility into employee activity, raised the harder question of what the company is willing to build.

Token Economics: CAC Is Now Paid in Tokens

The episode’s sharpest business insight is that customer acquisition cost has migrated from ads to tokens. Becoming skilled enough with coding agents to justify heavy spend is itself expensive, a chasm most individuals and companies cannot cross unaided. Anthropic and OpenAI solve this by subsidizing subscriptions until a percentage of users become whales, and per Dylan Patel’s numbers cited in the episode, that funnel has carried Anthropic to roughly $50 billion annualized at 70% margins. OpenCode runs the same funnel without frontier-scale subsidies: the free tier delivers the magic moment, the $10 plan makes real work affordable on open models, and whales paying per token convert OpenCode’s volume discounts into margin. The flat 24-hour GPU utilization curve from serving every timezone compounds the advantage.

Betting the Field: The Marketplace Thesis

Jay frames OpenCode as a marketplace where users pick models by attribute and cost, which keeps labs honest and passes competitive gains to consumers instead of vendor margins. Every bump in OpenCode’s growth traces to a release in the open-model market, so the company is explicitly not picking a winning lab; it is betting the field. That bet has made OpenCode the largest customer by token volume for most open-source model labs, a symbiosis where each side needs the other. On commoditization, Jay’s view is nuanced: the intelligence market is so large that labs will carve defensible niches along the quality-cost-performance axes, the way DeepSeek deliberately owns the cost corner. The positioning strategy has deep roots: as with the team’s earlier OpenNext project, when a market has two dominant players, the rest coalesces around an open alternative, and OpenCode raced to become that default, building models.dev along the way just to support 70+ providers at launch.

Sixteen Years to Overnight Success

The backstory reframes everything. Jay started the company after a discouraging Waterloo co-op term in 2006-2007, incorporated with college roommate Frank in 2010, and spent the next decade shipping products that did “reasonably well” while applying to YC nine times across 2016-2021, with four interviews, all as the same legal entity, the same founders, and a rotating cast of ideas. His first YC interview was with Paul Graham, in a waiting room shared with an Airbnb founder. Acceptance finally came in 2021 with the serverless platform that became SST, the team’s gateway into open source and building in public, a practice pushed by YC’s Dalton and crystallized by co-founder Dax’s observation that public code deserves public storytelling. When Claude Code landed in February 2025, the team’s terminal-UI taste (honed on projects as eccentric as coffee-over-SSH) told them exactly what to build. The hosts close on the honest version of the lightning-in-a-bottle myth: ten years of grinding taught the team consumer metrics, open source, marketing, and positioning, so when the strike came, the bottle was already in place.

Notable Quotes

“Most people in the world still haven’t experienced the magic of a coding agent.”
Jay V, on the founding premise of OpenCode

“You really know you have product market fit when like enterprises are bugging you to sign the security agreement so they can use your product.”
Lightcone host, on OpenCode’s inverted enterprise sales motion

“It’s not that we’re picking a winner in terms of a model lab. We’re just betting the field. We just think the rest of the field is going to do well.”
Jay V, on OpenCode’s strategy toward the model market

“With these open-source models, we’re the largest customer for most of them.”
Jay V, on OpenCode’s token volume relative to open-model labs

“When you’ve got a dominant or in this case two dominant players in the market, the rest of the market coalesces around an open alternative. And picking that position ends up being really valuable because if you pick it, it’s very hard for somebody else to displace you.”
Jay V, on the deliberate positioning behind the OpenCode name

“This is just an unprecedented market, like the market for intelligence has not existed before, everybody should be thinking in a positive-sum grow-the-pie mentality.”
Lightcone host, on why labs should welcome OpenCode’s growth

“Look, you know, all your code is public. You work basically in public. If you don’t talk about it publicly, you’re probably doing yourself a disservice and your product a disservice.”
Jay V, recounting co-founder Dax’s case for building in public

“It was really more a journey that took 10 years to get to 0 to 30 million in 8 months.”
Lightcone host, reframing the overnight-success narrative

“To catch the lightning in the bottle, you actually like have to sort of position the bottle correctly and be ready for it and know what to do with it.”
Lightcone host, closing the episode on preparation meeting luck

Watch the full conversation here.

Related Reading
- OpenCode the open-source coding agent discussed throughout the episode, including its public usage data.
- models.dev the open-source database of AI models and providers the team built to support 70+ providers at launch.
- SST the serverless framework that got the company into YC and established its open-source, build-in-public roots.
- Terminal the coffee-over-SSH storefront that proved the team’s terminal-UI chops before OpenCode existed.
- Y Combinator the accelerator behind the Lightcone podcast, which Jay applied to nine times before getting in.
July 24, 2026
Jensen Huang Says the AI Apocalypse Is ‘Complete Nonsense’: NVIDIA’s CEO on AI Jobs, China, Open Source Models, the AI Bubble, and the Trillion-Agent Future (Axios Behind the Curtain)
Sitting on the floor of a brand new chip factory in Fort Worth, Texas, NVIDIA CEO Jensen Huang gave Axios reporter Mike Allen one of his most combative and quotable interviews yet. In this episode of Behind the Curtain, the head of the world’s most valuable company dismisses AI doom scenarios as “complete nonsense,” argues that AI is creating jobs rather than destroying them, defends Chinese open source models like Kimi and DeepSeek, explains why the AI build out is not a bubble yet, and calls for Anthropic’s most powerful model to be made available to everyone.

TLDW

Huang covers the full sweep of the AI moment: Chinese export control threats and why he wants open research flows in both directions, why the world needs both closed models (Anthropic, OpenAI) and open models (Kimi, Qwen, DeepSeek, NVIDIA’s own Nemotron), why Wall Street misread the Kimi selloff exactly as it misread DeepSeek, the sovereign AI argument that no company or country should “outsource its alpha,” his evidence that AI is increasing jobs for radiologists, paralegals, and manufacturing workers, a sustained attack on AI doomers and the “made up” narratives of singularity, simulation, and machine consciousness, the CapEx-heavy economics of manufacturing intelligence via tokens, his claim that the bubble is not coming in the next five years because physical constraints (chips, memory, power, construction workers) are pacing the build out, his warm relationship with President Trump and his warning against knee-jerk regulation, his position that Claude Mythos should be available to all users, the coming era of a trillion AI agents, the “ChatGPT moment” for robots having already arrived, and closing life lessons on pain, suffering, practice, immigration, and why he refuses to wear a watch because “now is the most important time.”

Thoughts

The first thing to hold in mind while watching this: every single position Huang takes, without exception, maps to selling more GPUs. Open models are good (more diffusion, more compute). Closed models are also good (more services, more compute). Chinese models are good (more use, more compute). Doom talk is bad (fear slows adoption, which slows compute). The bubble is far away (keep buying compute). That perfect alignment between worldview and order book does not make him wrong, but it means his arguments deserve scrutiny on the merits rather than deference to his position. He is the most effective anti-doomer in the industry partly because he is the person with the most to lose if the world gets scared.

That said, his strongest material is empirical, and it lands. The radiologist example is a direct rebuttal to one of the most famous predictions in AI history, Geoffrey Hinton’s 2016 claim that we should stop training radiologists. Huang’s version of events, that automating the scan-reading task let radiologists see more patients and demand for them grew, is a textbook case of what economists call the Jevons effect applied to labor. Whether his specific numbers (20 percent more radiologists, 10 percent more paralegals, 50 percent more manufacturing jobs) survive fact-checking, the structural argument that automating a task can grow the profession around it is historically well supported, and it is the single most useful reframe in the interview: your job is not your task, and when the task gets automated, the purpose remains.

The open source security argument is the most intellectually serious part of the conversation and the one most directly aimed at his own customers. Huang praises Anthropic and OpenAI as businesses in one breath and then dismantles the “closed models are safer” position in the next: Linux runs the world’s digital infrastructure precisely because millions of people can inspect and harden it, and a world defended by one closed model is a world with a single point of failure. His call for “massively distributed, diverse defense” via open models in the hands of cybersecurity experts everywhere is a real policy position with real stakes, and it puts him closer to Meta’s historical stance than to the labs he supplies.

The bubble section is where the skeptic should lean in. Allen hands him the most famous cursed phrase in financial history, “this time is different,” and Huang takes the bait enthusiastically: it is different, he says, because the demand is industrial rather than cyclical. Every bubble in history was justified by exactly this argument, including the railroads and the dot-com fiber build out that Huang implicitly invokes as precedent. But his supply-side observation deserves weight: bubbles pop when supply overshoots demand, and right now everything (chips, memory, packaging, power, land, construction labor) is short. A market that cannot build fast enough is at least not overbuilt yet. His own concession that “the bubble will come someday” and his refusal to vouch for years five through ten is more honest than the rest of the answer.

Finally, notice the tension he never resolves. He says warnings about AI’s power are “well heeded,” that safety is the leaders’ responsibility, and that Anthropic must fix jailbreaks fast. He also says consciousness, singularity, and existential risk are “all made up,” and shrugs off the referenced Mythos jailbreak with “everything was fine, you and I are here having a conversation.” Those two postures, take the technology seriously enough to harden it but never seriously enough to fear it, are held together mostly by confidence. It is a bet that capability and controllability scale together. The doomers he mocks are making the opposite bet, and nothing in this interview actually settles which one is right.

Key Takeaways
- On reports that Chinese regulators may tighten export controls on AI models and semiconductors to keep them from the West: Huang hopes it does not happen, notes half the world’s AI researchers are Chinese, and says both sides should de-escalate and let the technology advance.
- He opposes any US ban on Chinese models like Kimi: American companies should absolutely be allowed to use them, because downloaded open models can be fine-tuned, guardrailed, and run inside secure sandboxes and harnesses, and the “back door” fear is a misconception.
- The world needs both closed and open models: use closed services (Anthropic, OpenAI) as much as possible because they are excellent and convenient, but science, cybersecurity, and sovereignty require open models.
- Regulate applications of AI (medicine, transportation, autonomous vehicles), not the underlying technology, which is dual use and should advance as fast as possible.
- NVIDIA’s China sales are “approximately zero today” and he has told investors to expect none; he would consider it an honor to return if both governments allow it.
- The market misunderstood DeepSeek and is now misunderstanding Kimi the same way: great open models, wherever they come from, drive more AI use, which drives more NVIDIA computers, more data centers, and more services.
- Open models are not adversarial to closed models: the most likely customer to upgrade to Anthropic or OpenAI is someone who already uses AI and wants it more convenient and better.
- NVIDIA’s Nemotron open model exists for companies that must build their own AI for sovereignty, regulatory, privacy, or IP reasons. “We don’t have to be the frontier. We have to be at the frontier.”
- The large language model is the brain; a harness (he names OpenClaw and Claude Code as examples) turns it into a working agent. With the right harness, Nemotron can be world-class for specific skills.
- Cheap or free open source tokens are “fantastic” for the proprietary labs: free AI grows the population of people who realize they need AI, and running even a free model yourself usually costs more than renting a service.
- Echoing the viral Palantir CEO interview: “Nobody should outsource their alpha.” Companies and countries should rent AI wherever they can but must build their own AI for domain-specific, proprietary, sovereign, secret, or regulated work.
- For non-differentiating work (marketing automation, legal department productivity), outsource to the frontier labs as much as possible.
- Nothing AI has done has truly surprised him; what society needs to realize is that automating tasks is increasing the number of jobs the world needs.
- His jobs evidence: radiologists up roughly 20 percent because AI-automated scan reading lets them see far more patients; paralegals up roughly 10 percent for the same reason; US manufacturing jobs up roughly 50 percent in recent years because AI data centers require industrial might.
- On the demonstrated ability of Anthropic’s Mythos to break into hardened systems: “it surprised me that people were surprised.” An AI that can write and debug software can necessarily find vulnerabilities; the same capability powers cyber defense.
- His security architecture argument: one single model is one single point of attack and failure. Open models in the hands of cybersecurity experts worldwide create “massively distributed, diverse defense,” the same reason Linux is trustworthy.
- Whether China has “caught up” does not matter: the race-with-a-finish-line framing is wrong, China manufactures more AI researchers than the rest of the world combined, holding China back is ill-conceived, and neither side can hold back the other.
- “AI is not going to destroy all of our jobs. Someone who uses AI is going to take our jobs.” The biggest risk to the US is scaring industries and society out of adopting AI.
- On doomer AI CEOs: warning is fine, warning with a solution is better, and making things up is “absolutely inappropriate.” End-of-humanity and half-of-jobs-destroyed claims are “complete nonsense” contradicted by all the evidence.
- Asked why Asia loves him while America is anxious: “the doomers spend too much time theorizing about these science fiction outcomes, maybe it makes them sound smart.”
- OpenAI and Anthropic are not in trouble from Chinese competition: “zero possibility” China runs US companies off the road, both labs are thriving, and their IPOs will be the most successful in human history.
- On chip stocks down 18 percent after Kimi dropped: free AI is great for hardware, chips, and data centers; the market got it wrong with DeepSeek (NVIDIA fell about 30 percent) and is getting it wrong again.
- AI cannot have peaked because diffusion into society and industry has barely begun; useful AI has finally arrived, and useful AI is profitable AI, citing coding agents companies happily pay hundreds of millions a year for.
- The new IT industry is CapEx heavier than software because intelligence must be manufactured: machines produce the tokens behind every answer, image, protein, and robot maneuver, and the resulting productivity will more than pay for the build out.
- A token is an embedding of knowledge and intelligence, and unlike pi it gets smarter over time; smarter tokens are more valuable, which is why token economics keep improving.
- On the bubble: “The bubble will come someday. It’s just not today.” Very unlikely in the next five years; five to ten years depends on how fast the industry can build.
- The build out is constrained in every direction (chips, memory, land, power, construction workers), and that constraint is healthy: it pushes out the day supply exceeds demand.
- This cycle is “industrial-driven,” not seasonal or consumer-demand-driven: the world needs a new intelligence infrastructure layer on top of energy, internet, roads, and railroads, and the semiconductor industry needs to be 5 to 10 times larger within ten years.
- He is not worried about customers issuing hundreds of billions in debt to buy his chips: these companies generate enormous cash, the compute platform shift is real, and the ROI question has been answered because AI is now demonstrably profitable.
- He would use Kimi himself, with fine-tuning, guardrails, sandboxing, and access control, the same way the world already trusts open source software like Linux.
- On Trump: they text, the president “remembers everything” including H20, H200, Blackwell, and Rubin, and the Fort Worth factory they are sitting in is a direct result of their first conversation about reindustrializing America.
- His warning to the administration: do not over-correct based on science fiction narratives about AI consciousness; talk to many CEOs and scientists, not one or two, and take time to be informed before regulating.
- On the government taking an equity stake in NVIDIA: unnecessary, because the US already has a stake via $10 billion in taxes paid last year, job creation, and the stock market holdings of most Americans.
- Claude Mythos should “absolutely be available to everyone,” not just selected institutions; it is Anthropic’s job to harden it and patch jailbreaks fast, and he notes that when it was jailbroken “everything was fine.”
- On distillation of closed models: learning from other intelligence is fundamental (soon the internet will be 99 percent AI-generated content anyway), but violating terms of service or privacy is not okay and should be handled through existing legal channels.
- NVIDIA has 6,500 employee families in Israel he is concerned for; he remains bullish on the UAE reinventing itself from an oil economy into an AI hub.
- NVIDIA runs about 50,000 employees and may reach only 75,000 in ten years, “as small as possible,” because strategy means maximizing impact per unit of resource.
- Jobs that are a single task (customer service call centers) will be automated; jobs with purpose survive because purpose does not change when the task is automated. “Don’t mistake your task for the job.”
- In 10 to 20 years, photos of people typing at keyboards will look like old photos of typing pools with IBM Selectrics: typing was never the job, solving problems and creating value was.
- The ChatGPT moment for robots has already arrived (a robot can reason through “put the apple in the drawer,” including opening the drawer first); useful robots in ordinary life within 3 to 4 years would not surprise him.
- The agentic era’s capability has arrived and diffusion is next: the future holds 100 billion to a trillion agents running constantly, and agents will not become computers, they will use computers, which is why compute demand explodes.
- $300 billion has been invested into US venture capital startups in the last six months, and he tells his nieces and nephews that great fortunes will be created on a laptop.
- Life lessons: greatness requires “plenty of pain and suffering” and practice when nobody is watching; under maximum stress, time slows down the way athletes describe, and that comes from repetition.
- He advises every bright mind in the world to come to America, the country built by immigrants that will need amazing immigrants in the future.
- He wears no watch and refuses to let Outlook manage his life: “now is the most important time.” His perfect Saturday: dogs, work, family dinner, a cocktail, and he notes every weekend is exactly like that.
Detailed Summary

Export Controls Cut Both Ways

The interview opens on a Financial Times report that Chinese regulators are considering export controls of their own, restricting Chinese AI models and semiconductors from reaching the West. Huang’s response is de-escalation in both directions: half the world’s AI researchers are Chinese, groundbreaking research flows from both countries, and once one side reaches for export controls, everyone starts thinking in those terms. He is confident the US will continue to lead as long as government supports rather than constrains its companies. Asked whether the US should ban Chinese models like Kimi, he rejects the premise: downloaded open models run inside harnesses and sandboxes with security, privacy, and access controls, and the idea of hidden back doors phoning home to China is a misconception. His China sales, he notes pointedly, are approximately zero today, so his position is not about protecting revenue he does not have.

Open and Closed Models Both Win

Huang’s framework is consistent: rent closed models (Anthropic, OpenAI, which he personally uses along with Perplexity) whenever you can because they are excellent and convenient, and build on open models only when you must, for sovereignty, regulation, privacy, or proprietary domain reasons. This is the pitch for NVIDIA’s own Nemotron open model family, which he positions not as a frontier competitor but as raw material for companies that need custom AI: “We don’t have to be the frontier. We have to be at the frontier.” He describes the modern stack in plain terms: the large language model is the brain, and a harness (he cites OpenClaw and Claude Code) turns it into a working agent. Open, cheap, and free models are on-ramps that grow the total population of AI users, which is why he insists the labs should not fear them: the person most likely to pay for Claude is someone already using AI who wants it better and easier.

Kimi, DeepSeek, and Wall Street’s Repeated Mistake

Chip stocks fell 18 percent in the month after Kimi dropped, echoing the roughly 30 percent NVIDIA drawdown when DeepSeek landed. Huang says the market got it wrong both times and for the same reason: free and open AI is great for hardware, because great models drive use, use drives data centers, and data centers drive chips. He runs through the models he considers extraordinary (Kimi 3, Qwen, Nemotron, GPT 5.6, Codex, Claude Code) and lands on his core claim about this moment: useful AI has finally arrived, and useful AI is profitable AI. Companies like NVIDIA happily pay hundreds of millions of dollars a year for coding agents doing high-value work, which funds more AI, which he describes as a flywheel that has now started.

Don’t Outsource Your Alpha

Allen raises the viral Palantir CEO warning about handing your intellectual property to frontier labs, noting Huang’s unique position as both a top customer and top supplier of those labs, including using their models for chip design. Huang agrees with the principle without hesitation: nobody, no company, no country should outsource its alpha or its intelligence. His dividing line is specificity: work that is domain-specific, proprietary, sovereign, secret, or regulated must be done in-house on your own models, while generic productivity work like marketing automation or legal department support should be outsourced to the labs as aggressively as possible. The same logic scales to nations, which he says cannot outsource their fundamental intelligence to a third party.

The Jobs Evidence

Asked what AI has done that scared or awed him, Huang says essentially nothing surprised him, including the demonstrated ability of Anthropic’s Mythos to penetrate hardened systems (“it surprised me that people were surprised,” since an AI that debugs software can obviously find vulnerabilities). What he wants the world to notice instead is the labor data. Radiology reading has been substantially automated, and the number of radiologists is up roughly 20 percent because they can now see the enormous backlog of patients. Paralegals are up roughly 10 percent by the same mechanism. Manufacturing jobs are up roughly 50 percent in recent years because AI data centers require industrial construction. His formulation of the real risk: AI will not take your job, someone who uses AI will, and the worst thing America could do is scare its own industries out of adopting the technology.

Against the Doomers

This is the section that gives the interview its title. Huang says warning people is fine, warning with a solution is better, and making things up is absolutely inappropriate. The end of humanity: complete nonsense. Half of American jobs destroyed: complete nonsense. The singularity, living in a simulation, machine consciousness: “all made ups,” fun science fiction he enjoys hearing from “many of those leaders and my friends,” but Hollywood, not ground truth. Asked why he is mobbed by fans in Asia while the American mood is hostile, he suggests the doomers theorize about science fiction outcomes because “maybe it makes them sound smart.” His prescription for the industry is to tell the factual story, that AI is creating millions of jobs, rather than a made-up narrative that frightens the public and, more dangerously in his view, frightens policymakers. His closest thing to a concession: the closest thing to true AI is R2-D2 and C-3PO, “and who doesn’t want R2-D2 and C-3PO?”

CapEx, Tokens, and the Bubble Question

Huang’s economic argument for the build out runs through the token. Unlike the CapEx-light software era, intelligence must be manufactured: machines generate the tokens behind every answer, every image, and eventually every protein, chemical, and robot movement. A token is an embedding of knowledge, and unlike a static number it gets smarter over time, which makes it more useful, more valuable, and worth paying more for. On the bubble, he does not deny one is possible: “The bubble will come someday. It’s just not today.” He rules it out for roughly five years and hedges on five to ten. His reasoning is that this cycle is industrial-driven rather than consumer-cyclical: the world is adding an intelligence layer on top of energy, internet, roads, and railroads, the semiconductor industry needs to be 5 to 10 times larger within a decade, and everything (chips, memory, optical interconnects, packaging, TSMC capacity, land, power, construction workers) is short. Those constraints pace the CapEx and push out the day supply overtakes demand. As for customers issuing hundreds of billions in debt to buy his chips, he says the companies are extraordinary cash generators and the ROI question has been settled by profitable coding agents.

Trump, Washington, and the Over-Correction Risk

Huang describes a genuinely warm relationship with President Trump: they text, the president remembers chip model numbers (H20, H200, Blackwell, and next-generation Rubin), and the Fort Worth factory hosting the interview traces directly to their first conversation about restoring American manufacturing. He praises Susie Wiles, Secretary Bessent, and Secretary Lutnick. But his message to the administration is a warning: signs point toward more restrictive AI policy, and he fears policymakers falling for science fiction narratives (consciousness, an imminent finish line in a US-China race) pushed partly by companies hoping regulation will advantage them. His advice: talk to many CEOs and scientists, not one or two, take time, and do not over-correct. He rejects the 100-meter-dash framing of the China race entirely, arguing the win is diffusion, not invention: America did not invent electricity or manufacturing, it applied them with more enthusiasm than anyone, and that is what made the country. Asked about the government taking equity stakes in AI companies, he calls it unnecessary: the US already holds a stake in NVIDIA through $10 billion in annual taxes, job creation, and the stock market.

Mythos for Everyone, and the Distillation Question

In the most newsworthy exchange, Allen asks whether the world is ready for Anthropic’s most powerful model, Claude Mythos, to be available to everyone rather than selected institutions. Huang’s answer is unambiguous: it should absolutely be available to everyone, it is Anthropic’s responsibility to harden it, and jailbreaks are the nature of software, to be patched as fast as they are found. He points to the referenced jailbreak incident and observes that “everything was fine,” while noting that holding Anthropic back serves no American interest, especially since open models are available regardless. On distillation, he splits the question: AIs learning from other AIs is fundamental and inevitable (within a few years, he predicts, the internet will be 99 percent AI-generated content, so every model is distilling other AIs anyway), but violating terms of service or privacy is not acceptable, and aggrieved providers should pursue the conventional legal remedies that already exist.

Robots, Agents, and the Next Era

Huang argues the ChatGPT moment for robots has already happened, on his definition: the 2022 ChatGPT moment was not when AI became useful (that took four more years) but when it did something surprising, and a robot that can reason through “put the apple in the drawer,” including opening the drawer first, clears that bar today. Useful everyday robots within three to four years would not surprise him. On the agentic era, capability has arrived and diffusion is what comes next: where perhaps 100 million humans use computers at any given moment today, the future holds 100 billion to a trillion agents of every kind running constantly. His line: agents are not going to become computers, agents are going to use computers, and that is the deepest driver of compute demand.

Life Lessons from 33 Years at the Helm

The closing stretch turns personal. On keeping NVIDIA at roughly 50,000 employees (maybe 75,000 in ten years, “as small as possible”) while peers run six figures, he says strategy is using limited resources with maximum precision, a craft he has practiced longer than any CEO in tech history: “this is my kung fu.” On which jobs disappear, he distinguishes task from job from purpose: call center tasks will be automated, but a radiologist’s purpose (ending human suffering) survives the automation of scan reading, and typing was never the job in the first place. Born in Taiwan and sent to a rough American boarding school at nine, he calls America the greatest country in the world because open discourse and freedom let it work through its disagreements, and he urges bright minds everywhere to come. On greatness: no athlete just happens to be great, it is practice when nobody is watching, setbacks, losing, and “plenty of pain and suffering” that elevate craft, character, and resilience. He wears no watch because now is the most important time, and his perfect Saturday (dogs, work, family dinner, a cocktail) is, he says, exactly what every weekend already looks like.

Notable Quotes

“And so the fact that this is going to be the end of humanity, it’s complete nonsense. The fact that this is going to destroy half of the American jobs. It’s complete nonsense. And all of the facts, all of the evidence point exactly to the opposite.”
Jensen Huang, on AI doom predictions from fellow tech leaders

“AI is not going to destroy all of our jobs. Someone who uses AI is going to take our jobs, and so we have to make sure that we adopt AI, diffuse AI into the industries as quickly as possible.”
Jensen Huang, on the real employment risk of the AI era

“Nobody should outsource their alpha. Nobody should outsource their intelligence. No country should.”
Jensen Huang, agreeing with the Palantir CEO’s warning about handing IP to frontier labs

“We don’t have to be the frontier. We have to be at the frontier.”
Jensen Huang, on NVIDIA’s Nemotron open source model strategy

“The bubble will come someday. It’s just not today.”
Jensen Huang, on whether the AI build out is a bubble

“It is made up that there’s going to be a singularity. It’s made up that somehow we’re living in a simulation. These are all made ups.”
Jensen Huang, on science fiction narratives he says are scaring the public and policymakers

“The closest thing to true AI is R2-D2 and C-3PO. And who doesn’t want R2-D2 and C-3PO?”
Jensen Huang, on how to inoculate the public against fear of AI

“These two companies will be the most successful IPOs in human history.”
Jensen Huang, predicting the public debuts of OpenAI and Anthropic

“If your job is the task, then it’s very likely that when that task is automated, your job will be eliminated or changed.”
Jensen Huang, on which jobs disappear in an industrial revolution

“Because now is the most important time. I refuse to let Outlook manage my life, and I refuse to let a watch manage my life.”
Jensen Huang, on why he does not wear a watch

Watch the full conversation between Jensen Huang and Mike Allen on Axios Behind the Curtain here.

Related Reading
- Jensen Huang (Wikipedia) background on NVIDIA’s co-founder and the longest-tenured CEO in tech.
- Axios Behind the Curtain the column and interview series by Mike Allen and Jim VandeHei behind this conversation.
- NVIDIA Nemotron primary source on the open model family Huang pitches for sovereign and custom AI.
- Moonshot AI (Wikipedia) the Chinese lab behind Kimi, the open model that rattled the markets.
- Jevons paradox (Wikipedia) the economic effect behind Huang’s argument that automating tasks grows demand for the people who do them.
July 24, 2026
OpenAI and Broadcom Unveil Jalapeño, a Custom LLM Inference Chip to Cut Compute Costs and Reduce Nvidia Dependence
OpenAI and Broadcom pulled the wrapper off Jalapeño on Wednesday, June 24, 2026, a custom silicon accelerator that OpenAI is calling its first “Intelligence Processor” and its first real move into designing the hardware underneath its own models. Broadcom President and CEO Hock Tan and President Charlie Kawwas physically handed the wafer to OpenAI CEO Sam Altman and President and Co-Founder Greg Brockman, a staged moment meant to signal that the ChatGPT maker is no longer just a models-and-products company but is now reaching all the way down to the chip. Jalapeño is purpose-built for large language model inference, the compute-intensive job of actually serving answers to users rather than training the model in the first place, and OpenAI plans to deploy it at gigawatt scale by the end of 2026 as the first step in a multi-generation platform built with Broadcom and Canadian electronics manufacturer Celestica. You can read the announcement straight from the source in OpenAI’s official post.

TLDR

OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom AI chip, an ASIC designed from a blank slate specifically for LLM inference rather than training, manufactured by TSMC and integrated into server systems by Celestica that only OpenAI will use. OpenAI claims the chip went from initial design to manufacturing tape-out in just nine months, what it calls the fastest ASIC development cycle ever in high-performance advanced semiconductors, accelerated in part by using its own AI models to design the silicon. Engineering samples are already running ML workloads in the lab, including GPT-5.3-Codex-Spark, and OpenAI says early testing shows performance per watt “substantially better” than current state-of-the-art, a self-reported and not yet independently verified claim with a full technical report promised in the coming months. Broadcom CEO Hock Tan told Reuters the chip matches Nvidia’s Blackwell and Google’s TPUs, framing the launch as part of a flywheel where OpenAI owns the full stack from chip to model to product. The chip slots into a broader infrastructure strategy targeting 10 gigawatts of custom accelerator capacity between 2026 and 2029 with deployments alongside Microsoft and other partners, and The Decoder reported Microsoft is expected to buy 40 percent of the chips, a guarantee Broadcom reportedly demanded to secure the first phase. The move is widely read as OpenAI diversifying away from Nvidia, continuing a procurement spree that already includes AWS Trainium, AMD, and Cerebras, as inference quietly becomes the company’s real cost center.

Thoughts

The single most important word in this announcement is “inference,” and it is the word doing the heavy lifting. Training a frontier model is a capital expense that happens in bursts. Inference is the bill that arrives every single day, forever, scaling linearly with usage. Every ChatGPT reply, every Codex task, every API call, every agent step is an inference event, and as OpenAI’s product surface explodes that recurring cost is the thing that actually threatens the unit economics. A custom chip aimed squarely at inference is therefore not a vanity project or a research flex. It is OpenAI attacking the largest variable cost in its business at the root, trying to bend its cost-per-token curve below what it pays renting Nvidia GPUs. If Jalapeño lands anywhere near its claims, the payoff is not faster benchmarks, it is gross margin.

The performance-per-watt claim, though, deserves the most skeptical reading in the room. OpenAI says Jalapeño will deliver performance per watt “substantially better” than current state-of-the-art, but it has not finalized the numbers, has not said which chips it tested against, on what tasks, or under what conditions, and the full technical report is somewhere in the indefinite “coming months.” These are self-reported figures from a company with an enormous interest in convincing the market it has a credible alternative to Nvidia. Hock Tan’s line that the chip is “as good as” Blackwell and Google’s TPUs is a CEO talking his own book in an interview, not a measured result. The honest posture is to treat the figures as marketing until the technical report lands. A chip running engineering samples in a lab at target frequency is real progress, but it is a very long way from a chip that holds those numbers across a production fleet under messy real-world load.

OpenAI left the most revealing detail out of its own press release: the report, via The Decoder, that Broadcom demanded Microsoft guarantee it will buy 40 percent of the chips to secure the first phase. That single sentence tells you who is actually carrying the risk. Building gigawatt-scale custom silicon is brutally capital-intensive, and Broadcom is not willing to commit manufacturing capacity on the strength of OpenAI’s demand alone. It wants a balance sheet behind the order, and Microsoft, OpenAI’s largest backer, is the balance sheet. That detail quietly reframes the whole “OpenAI owns the stack” narrative. OpenAI may design the chip, but the deployment is underwritten by Microsoft’s purchasing commitment, which means Microsoft also gets leverage and supply security out of an OpenAI-branded part. Ownership of the design is not the same as ownership of the risk.

The flywheel framing is genuinely interesting and probably the most defensible strategic claim OpenAI is making. OpenAI says it used its own models to accelerate parts of the chip design and optimization, compressing a normally multi-year ASIC cycle into nine months. If that is even partly true, it is a meaningful loop: the models help design the chips, the chips run the models more cheaply, the cheaper models drive more usage and revenue, and the revenue funds the next chip. That is a compounding advantage that is hard for a pure hardware vendor to replicate and hard for a pure software lab to replicate. The catch is that nine months from design to tape-out is a claim about speed, not about whether the resulting chip is actually competitive in volume. Fast tape-out and great silicon are different achievements, and the industry has seen plenty of chips that taped out quickly and underwhelmed in production.

Strip away the “Intelligence Processor” branding and this is a playbook we have already watched run three times. Google built TPUs, Amazon built Trainium and Inferentia, Meta built MTIA, and all of them turned to Broadcom or Marvell for the design IP that is hard to replicate in-house. OpenAI is doing the same thing with the same partner, just later and louder. The diversification arc is unmistakable: OpenAI was one of the biggest Nvidia GPU buyers on earth, and in the span of a year it has signed deals for AWS Trainium, AMD accelerators, and Cerebras inference hardware, and now its own custom ASIC. Nvidia is not in trouble, demand still vastly outstrips supply, but the era where the largest AI labs were captive single-vendor customers is clearly ending. The most intriguing wildcard is OpenAI’s own line that Jalapeño is “designed with flexibility to work with all LLMs.” That is not how you describe a chip you intend to keep entirely to yourself. It hints, however faintly, at an OpenAI that could one day rent out inference infrastructure the way it now rents models, which would put it in direct competition with the very cloud providers it currently depends on.

Key Takeaways
- OpenAI and Broadcom unveiled Jalapeño on Wednesday, June 24, 2026, OpenAI’s first custom AI chip and its first piece of in-house silicon after years focused on models and products.
- The chip is branded an “Intelligence Processor” and described as the first AI accelerator in a multi-generation compute platform the two companies are building together.
- Jalapeño is purpose-built for large language model inference, the compute-intensive work of generating responses and serving answers to users, and explicitly not for training.
- Inference is OpenAI’s recurring cost center: every ChatGPT conversation, coding request, image generation, and agent action relies on it, making it one of the highest ongoing costs in the business.
- Broadcom President and CEO Hock Tan and President Charlie Kawwas physically delivered the first wafer to OpenAI CEO Sam Altman and President Greg Brockman.
- OpenAI designed the chip from scratch around its understanding of LLM fundamentals, informed by its roadmap of models, kernels, serving systems, and product needs.
- Jalapeño is described as a blank-slate design for modern LLM inference, not a general-purpose accelerator adapted from earlier AI workloads.
- The chip is shaped by the systems OpenAI runs daily across ChatGPT, Codex, the API, and future agentic products, while also being designed to work with current and future LLMs across the industry.
- The stated performance goal is to combine the throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, suiting it for interactive LLM products at scale.
- OpenAI frames this as its full-stack advantage: it designs frontier models, builds products on top of them, and now designs the chip architecture, kernels, memory systems, networking, scheduling, and deployment systems underneath.
- OpenAI claims Jalapeño went from initial design to manufacturing tape-out in just nine months.
- The companies call it what they believe to be the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors, against a backdrop of typically multi-year timelines.
- OpenAI used its own AI models to accelerate parts of the chip design and optimization process, which it credits for the speed.
- OpenAI frames the result as a flywheel: the same models served to users help improve the infrastructure that runs future models, lowering compute cost across the industry.
- Engineering samples of Jalapeño are already running ML workloads in the lab at production target frequency and power.
- Among the workloads running on the samples is OpenAI’s GPT-5.3-Codex-Spark model.
- GPT-5.3-Codex-Spark currently runs on Cerebras hardware, which also specializes in inference, per The Decoder.
- OpenAI says early testing shows Jalapeño will deliver performance per watt “substantially better” than current state-of-the-art hardware.
- That performance-per-watt claim is self-reported and lacks independent verification; OpenAI has not said which chips it tested against, on what tasks, or under what conditions.
- OpenAI says it is still measuring final performance and has promised a detailed technical report in the coming months.
- The architecture reduces data movement and balances compute, memory, and networking resources to push realized utilization much closer to theoretical peak performance.
- Jalapeño is an ASIC, which experts say is less flexible than Nvidia’s GPU but less expensive and tailorable to specific AI tasks.
- Broadcom contributes silicon implementation and networking technologies, including its Tomahawk networking silicon, to bring the platform to large-scale production.
- Canadian electronics manufacturer Celestica provides board, rack, and system integration expertise and will build the server systems.
- The chips are manufactured by Taiwan’s TSMC, the world’s leading advanced semiconductor foundry, after OpenAI sent over the design.
- Both the chips and the Celestica-built server systems will be used only by OpenAI, not sold to outside customers.
- OpenAI plans to deploy Jalapeño at gigawatt scale by the end of 2026, with expansion in the years ahead, as the first step in a multi-generation plan.
- Hock Tan said gigawatt-scale data center deployment will happen with Microsoft and other partners beginning in 2026.
- The Decoder reported Microsoft is expected to buy 40 percent of the chips, with Broadcom reportedly demanding Microsoft guarantee that share to secure the first phase.
- Broadcom CEO Hock Tan told Reuters that Jalapeño is as good as Nvidia’s Blackwell chips and the TPUs designed by Alphabet’s Google.
- In October 2025, after 18 months of working together, OpenAI and Broadcom went public with plans to develop and deploy racks of OpenAI-designed chips starting late this year; CNBC framed the unveiling as coming eight months after that deal.
- The prior OpenAI-Broadcom plan ultimately aimed at 10 gigawatts of custom AI accelerator capacity, with deployments expected between 2026 and 2029.
- Estimates suggest OpenAI’s broader infrastructure plans could eventually involve around 26 gigawatts of computing capacity across custom chips, Nvidia hardware, and other accelerators.
- OpenAI has been one of the biggest buyers of Nvidia’s GPUs since kickstarting the generative AI boom in 2022, but explosive demand has pushed it to seek other sources of advanced silicon.
- Earlier in 2026 OpenAI struck a deal with Amazon Web Services that includes use of AWS Trainium chips, and has also signed agreements with AMD and with Cerebras, which held its IPO in May.
- The move is widely characterized as OpenAI diversifying away from and reducing dependence on Nvidia while creating an alternative to its GPUs.
- OpenAI’s stated goals with the chip are to reduce costs, improve energy efficiency, secure long-term computing supply, and gain more control over the infrastructure powering its services.
- Broadcom shares climbed about 2 percent following the announcement, are up roughly 10 percent year-to-date in 2026, and have multiplied almost sevenfold since the end of 2022.
- To build in-house chips, Meta, Amazon, and Google have turned to firms like Broadcom and Marvell for design services and IP that are hard to replicate internally; Reuters first reported OpenAI was exploring its own chip in 2023, and sources told Reuters in April 2026 that Anthropic is weighing its own AI chip.
- Broadcom’s margin on custom AI chips is currently lower than on products like networking switches due to AI-driven high-bandwidth memory demand; Tan said SK Hynix and Samsung Electronics supply Broadcom with memory chips.
Detailed Summary

A blank-slate chip built only for inference

Jalapeño is OpenAI’s first so-called Intelligence Processor, and the company is emphatic that it is not a repurposed general-purpose accelerator. It was designed from a blank slate specifically for modern large language model inference, the job of crunching data to answer a user’s query rather than the separate, bursty work of training a model. OpenAI says it designed the chip from scratch around its own deep understanding of LLM fundamentals, informed by its roadmap of models, kernels, serving systems, and product needs, drawing on the systems it runs every day across ChatGPT, Codex, the API, and future agentic products. The stated objective is to fuse the raw power and throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, which would make Jalapeño particularly well suited to interactive products used at scale. Notably, OpenAI also says the chip is designed with flexibility to work with all LLMs across the industry, not only its own, a claim that sits a little oddly next to its plan to keep the hardware entirely in-house.

The full-stack flywheel and AI designing its own silicon

OpenAI is selling Jalapeño as proof of a full-stack advantage. The argument is that because OpenAI now develops frontier models, builds products on top of them, and designs the infrastructure underneath them, including chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and the product experience, every layer can be optimized around the same goal of making its models faster, more reliable, and cheaper. OpenAI describes this as a flywheel: better infrastructure drives compute efficiency, which enables better training and serving, which powers more capable models, which become better products, which drive more usage and revenue, which funds the next generation of infrastructure. The most striking piece of that loop is that OpenAI used its own AI models to accelerate parts of the chip’s design and optimization. The company’s framing is direct: if AI can help engineers design better chips faster, it can lower the cost of compute across the industry. That self-referential loop is the part of the announcement that is genuinely novel rather than a rerun of an existing hyperscaler playbook.

Nine-month tape-out and the partner stack

OpenAI claims it took roughly nine months to go from initial design to manufacturing tape-out, and calls this what it believes to be the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors, against an industry norm measured in years. It credits deep software-hardware co-development, Broadcom’s silicon implementation expertise, and the use of its own models to compress the schedule. The work is split across a clear partner stack: OpenAI provides the architecture and AI-specific requirements, Broadcom contributes silicon implementation and networking technology, including its Tomahawk networking silicon, and Celestica handles boards, racks, and system integration, building the actual server systems. Once the design was complete, OpenAI sent it to TSMC in Taiwan, the world’s leading advanced foundry, for manufacturing. Crucially, both the chips and the systems built around them are for OpenAI’s exclusive use; they are not products being sold to outside customers.

Performance claims that nobody can check yet

OpenAI says early testing shows Jalapeño will deliver performance per watt substantially better than current state-of-the-art hardware, with an architecture that reduces data movement and balances compute, memory, and networking to push realized utilization much closer to theoretical peak. Hardware program lead Richard Ho said the team optimized around the kernels, memory movement, networking, and serving patterns that matter most for frontier models, and that the chip will execute key workloads close to the hardware’s theoretical limits. He told Reuters it will be performant on what he thinks will be all kinds of future LLM iterations. The important caveat is that none of this is verifiable. OpenAI is still measuring final performance, has not finalized the numbers, and has not disclosed which chips it benchmarked against, on what tasks, or under what conditions, with the technical report only promised in the coming months. As The Decoder put it bluntly, these are self-reported numbers, unverifiable for now, that should not be taken at face value. Broadcom CEO Hock Tan’s separate claim to Reuters that the chip is as good as Nvidia’s Blackwell and Google’s TPUs is similarly an unverified assertion from an interested party.

Gigawatts, Microsoft’s 40 percent, and who carries the risk

Jalapeño is the opening move in a much larger infrastructure buildout. Initial deployment is targeted for the end of 2026 at gigawatt scale, expanding over multiple generations. Tan said the gigawatt-scale data centers will come online with Microsoft and other partners beginning in 2026. The deal traces back to October 2025, when, after 18 months of collaboration, OpenAI and Broadcom went public with plans to deploy racks of OpenAI-designed chips, ultimately aiming for 10 gigawatts of custom accelerator capacity with deployments expected between 2026 and 2029. Broader estimates put OpenAI’s total infrastructure ambition at around 26 gigawatts across custom chips, Nvidia hardware, and other accelerators. The detail that cuts through the optimism comes from The Decoder: Microsoft is expected to buy 40 percent of the chips, and Broadcom reportedly demanded that Microsoft guarantee that purchase to secure the first phase. That guarantee shows that the financial risk of this buildout is not OpenAI’s alone; it rests heavily on its largest backer’s balance sheet.

The Nvidia diversification arc and Broadcom’s windfall

Jalapeño is the clearest signal yet of OpenAI loosening its dependence on Nvidia. OpenAI has been one of the biggest buyers of Nvidia GPUs since it kickstarted the generative AI boom in 2022, but demand has exploded past what any single vendor can supply. Within 2026 alone, OpenAI has struck a deal with AWS that includes Trainium chips, signed agreements with AMD and with Cerebras, which held its IPO in May, and now rolled out its own ASIC. The pattern mirrors what Meta, Amazon, and Google already did, all of them leaning on firms like Broadcom and Marvell for design IP that is hard to build in-house, and Anthropic is reportedly weighing the same move, per sources who spoke to Reuters in April 2026. Broadcom is the obvious beneficiary, with shares up about 2 percent on the news, up roughly 10 percent in 2026, and up nearly sevenfold since the end of 2022. Even so, Tan noted that the AI-driven surge in high-bandwidth memory demand makes Broadcom’s margin on custom AI chips lower than on products like networking switches, with SK Hynix and Samsung Electronics supplying the memory.

Notable Quotes

“The world is moving to a compute-powered economy.”
Greg Brockman, President and Co-Founder of OpenAI, framing the launch as a broad economic shift

“Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems. By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access.”
Greg Brockman, President and Co-Founder of OpenAI, on the full-stack rationale for building its own chip

“Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers.”
Richard Ho, who leads OpenAI’s hardware program, describing the chip as purpose-built rather than adapted

“We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits.”
Richard Ho, who leads OpenAI’s hardware program, on the architecture’s optimization targets and early performance

“It will be performant on, we think, all kind of future iterations of LLMs.”
Richard Ho, OpenAI hardware chief, to Reuters on the chip’s forward compatibility with future models

“Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI.”
Hock Tan, President and CEO, Broadcom, on the scale of the infrastructure commitment

“This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026.”
Hock Tan, President and CEO, Broadcom, on the multi-generation plan and 2026 gigawatt-scale deployment with Microsoft

“The goal is to combine the power and throughput of today’s leading AI accelerators with latency closer to the fastest specialized inference systems, making Jalapeño well suited for interactive LLM products at scale.”
OpenAI, in the press release, stating the performance objective for the chip

“These are self-reported numbers that haven’t been finalized. Take them with a grain of salt.”
Maximilian Schreiner, The Decoder, on the unverified performance-per-watt claim

Jalapeño is a real chip running real workloads in a lab, but the gap between an engineering sample and a profitable production fleet is exactly where this story will be decided over the next year, and the most important numbers, the performance-per-watt figures that justify the whole effort, remain self-reported and unverified until OpenAI publishes its technical report. Read OpenAI’s full announcement here.

Related Reading
- OpenAI, the chip’s designer and the primary source of the announcement and quotes.
- Broadcom, the co-developer providing silicon implementation and Tomahawk networking.
- Celestica, which builds the boards, racks, and server systems around the Jalapeño chip.
- ASIC (application-specific integrated circuit), what Jalapeño is, a custom chip built for one task unlike a general-purpose GPU.
- Nvidia Blackwell, the Nvidia architecture Broadcom’s CEO claims Jalapeño matches.
June 24, 2026
Coinbase for Agents: Your AI Agent Can Now Trade Crypto and Pay Autonomously, and Why Agentic Finance Is Massively Bullish for Bitcoin
Now you can use your favorite AI agent to control your Coinbase account (or a sub-account), with Coinbase for Agents.

Here’s a quick demo on how to set it up and some of the cool things you can get your agent to do. pic.twitter.com/c8R4qvz0BA
— Brian Armstrong (@brian_armstrong) June 11, 2026

Meet Coinbase for Agents.

Give your agent its own account to:

→ Execute trades & manage your portfolio
→ Run autonomously under guardrails
→ Pay for data & research tools via x402 (coming next week)

Agentic finance is here, and it's powered by Coinbase. pic.twitter.com/DK220fko0z
— Coinbase 🛡️ (@coinbase) June 11, 2026

Coinbase just fired the starting gun on agentic finance. With the launch of Coinbase for Agents, announced June 11, 2026, you can now connect your favorite AI agent directly to your Coinbase account and let it trade, pay, and execute financial workflows on your behalf, inside limits you control. It ships today as both an MCP for web-based assistants and a CLI plus Skill for terminal-based environments like Claude Code. This is one of those announcements that looks like a product release but reads like a regime change: AI agents now have a compliant, mainstream on-ramp to crypto markets, and that is a structurally bullish development for Bitcoin and the entire asset class.

TLDR

Coinbase for Agents connects any capable AI agent directly to your Coinbase account so it can do both financial reasoning and execution: strategy-led portfolio rebalancing into targets like 60% BTC / 20% ETH / 20% SOL with automated dip buying, around-the-clock capital efficiency so idle funds always earn, and data-informed trades where the agent can even pay for premium data via the soon-to-be-enabled x402 payments protocol. Crypto spot and derivatives trading is fully live today, with stocks, index funds, prediction markets, and commodities coming. Controls are built in from day one: isolated portfolios, explicit permissions, upcoming hard rules for max trade size and spend, and the same transaction monitoring and KYT compliance that powers Coinbase. The launch caps a multi-year build that started with AgentKit in 2024 and the x402 agentic payments protocol, alongside Coinbase Advisor, an SEC/CFTC registered in-app AI advisor. Available now as an MCP (one login, no API keys, ideal for ChatGPT or Claude Web) and as a CLI plus Skill (lower token overhead and full composability for Claude Code, Codex, or OpenClaw).

Thoughts

The most important sentence in the announcement is not about trading at all. It is the claim that people are increasingly moving through the world via agents rather than apps, and that businesses are rebuilding themselves agent-first in response. If you accept that premise, the next question is obvious: what money do agents use? Banks onboard humans with signatures, branches, and business hours. Crypto onboards software with keys, APIs, and 24/7 settlement. An AI agent cannot walk into a bank, but it can hold a wallet, sign a transaction, and pay an invoice in seconds. Crypto is the native money of the agent economy, and Coinbase just made that official with a regulated, compliance-wrapped product. For anyone still treating “AI plus crypto” as two separate hype cycles, this is the moment they visibly fused.

Think about what this does to demand. The flagship example Coinbase leads with is an agent patiently rebalancing into a 60% Bitcoin allocation over months, setting limit orders at 5%, 10%, and 15% drawdowns to buy the dip automatically. Now multiply that by millions of users who were previously too busy, too emotional, or too disorganized to execute a disciplined accumulation strategy. Agents do not panic sell. Agents do not forget to DCA. Agents do not sleep through a 3am flash crash that hits their limit orders. Every agent configured with a Bitcoin allocation target becomes a tireless, unemotional, structural bid under the market. Dips get bought mechanically, around the clock, by software that never gets scared. That is a profound change in market microstructure, and it favors the assets people tell their agents to accumulate. Bitcoin, as the default reserve asset of the crypto economy, sits first in line.

The x402 piece is quietly the biggest long-term story here. Coinbase for Agents will soon be x402-enabled, meaning your agent can pay for compute, proprietary data, statistics, images, and services as seamlessly as it places a trade. This is the machine-to-machine economy that crypto people have been promising since the earliest micropayments whitepapers, except now it has a distribution channel of millions of Coinbase accounts and every major AI harness. When software starts paying software at machine speed and machine volume, it will not do so over ACH rails that settle in three business days. It will do so over crypto rails. Every x402 transaction is another small proof that internet-native money wins on merit, and a rising tide of onchain economic activity lifts the credibility, liquidity, and valuation of the whole asset class.

Coinbase also deserves credit for sequencing this responsibly, which matters more than it sounds. Agent access arrives with isolated portfolios, explicit permissioning, upcoming hard caps on trade size and spend, and the same KYT and transaction monitoring that already runs under the main exchange. The gift card framing is exactly right: you define the limits, the agent executes within them. Add Coinbase Advisor, an actually registered SEC/CFTC advisor embedded in the app, and you have agentic finance arriving inside the regulatory perimeter rather than around it. That is what lets this scale to normal people and, eventually, to institutions. The skeptics’ best argument against crypto was always “no real use case.” It just got a lot harder to make that argument with a straight face.

One more detail worth savoring: Coinbase built the CLI version first-class because, in their words, terminal-based CLIs are the trend. A publicly traded financial company is now shipping developer-grade tooling so that coding agents can manage money. The arc from AgentKit in 2024, to x402 last year, to a full consumer agentic suite today tells you this is a deliberate multi-year strategy, not a feature chasing a news cycle. The companies that own the rails of agentic finance will be the banks of the next decade, and the assets those rails settle in will be the money of the next decade. Position accordingly.

Key Takeaways
- Coinbase for Agents, launched June 11, 2026, connects your AI agent directly to your Coinbase account so it can trade, pay, and execute financial workflows on your behalf, within limits you control.
- It is available today in two forms: an MCP (Model Context Protocol) integration for web-based agent harnesses, and a CLI plus Skill for terminal-based environments.
- The product closes the gap between financial reasoning and financial execution: LLMs were already used heavily for investment research but lacked portfolio context and could not act. Now they can do both.
- Coinbase frames the launch around a structural shift: people are moving through the world via agents rather than apps, and businesses are rebuilding products to be agent-first.
- Coinbase explicitly positions Coinbase for Agents as “your trading and spending account at the center” of the growing agent ecosystem.
- Flagship use case one is strategy-led portfolio rebalancing: tell your agent a target allocation like 60% BTC, 20% ETH, 20% SOL and have it work toward that over months, including limit orders at 5%, 10%, or 15% drops to buy the dip.
- Crypto spot and derivatives trading is fully enabled at launch, with stocks, index funds, prediction markets, and commodities on the roadmap. Coinbase’s stated goal: if it’s on Coinbase, it should be available to your agent.
- Use case two is capital efficiency: the agent monitors your cash position around the clock, keeps idle funds earning rewards, maintains optimal allocation, and flags positions that need attention.
- The agent executes preset moves automatically, removing the need for constant manual oversight of your portfolio.
- Use case three is data-informed trading: your agent can pay for premium proprietary data and services to inform its trading decisions.
- Coinbase for Agents will soon be x402-enabled, making it seamless for agents to pay for compute, statistics, images, and services. x402 is the agentic payments protocol Coinbase created.
- Example workflow: an agent pulls 30 days of hourly ETH price data, identifies the historically cheapest hour of the day, sets a recurring $20 market buy at that time, and runs it daily for two weeks. Set it and forget it.
- Controls were built in from day one: the agent can operate inside its own isolated portfolio with no visibility into your other holdings, or use your main account if you choose.
- The agent only ever touches what you have explicitly permissioned it to do.
- Coming soon: exact user-defined rules for maximum trade size, what the agent can interact with, and how much it can spend.
- Coinbase’s framing for the permission model: it is like giving a gift card rather than handing over your bank account. You define the limits, the agent executes within them.
- Compliance is built in: payments made through Coinbase for Agents go through the same transaction monitoring and KYT (know your transaction) checks that power Coinbase itself.
- For users who want a simpler path, Coinbase Advisor is a dedicated agent built directly into the Coinbase app, providing recommendations and guidance with no external connections required.
- Coinbase Advisor is offered by Coinbase Advisors, LLC, a CTA registered with the NFA and a Registered Investment Advisor registered with the SEC, making it a regulated AI financial advisor.
- These products are described as the start of Coinbase’s full consumer agentic suite, serving everyone from everyday investors to fully autonomous agents operating on their own.
- For businesses, Coinbase Payments adds agentic money acceptance, completing the picture on both the spending and receiving side.
- The launch is the culmination of a multi-year build: AgentKit in 2024 put wallets in the hands of agents, x402 followed as an agentic payments protocol, and Coinbase for Agents now brings your full Coinbase account to the agent you already use.
- The MCP path is the fastest for web-based harnesses like ChatGPT or Claude Web: a single login, no setup, no configuration, no API keys.
- The CLI plus Skill path targets terminal environments like Claude Code, Codex, or OpenClaw, offering lower token overhead, local customization, and full composability with existing toolchains.
- Setup today requires following the Coinbase CLI skill documentation and creating a Coinbase Developer Platform (CDP) API key.
- A remote MCP is coming soon that will connect with just sign-in-with-Coinbase, requiring no API keys or coding at all.
- The bullish read: agents are tireless, unemotional buyers. Millions of agents executing disciplined accumulation strategies and automated dip buying create a persistent structural bid for Bitcoin and major crypto assets.
- The deeper bullish read: agents cannot open bank accounts, but they can hold wallets and settle onchain. As the agent economy grows, crypto rails become the default money layer for machine-to-machine commerce, with Bitcoin as its reserve asset.
Detailed Summary

From Financial Reasoning to Financial Execution

Coinbase opens with an observation anyone who uses AI will recognize: people already lean on large language models for a huge range of investment research and financial questions, but those models are flying blind. They lack context about your actual portfolio and financial life, and they cannot take action. Coinbase for Agents changes both halves of that equation at once. By connecting an agent directly to your Coinbase account, the agent gains real portfolio context and the ability to execute, turning AI from a research toy into a working financial operator. Coinbase’s ambition is explicit: as the world reorganizes around agents instead of apps, Coinbase for Agents intends to be the trading and spending account at the center of that new ecosystem.

Strategy-Led Portfolio Rebalancing

The first showcase use case is patient, rules-based accumulation. You give the agent a target allocation, say 60% Bitcoin, 20% Ethereum, and 20% Solana, and instruct it to work toward that target gradually over months rather than all at once. The agent can take advantage of short-term market movements to buy the dip, including setting limit orders that trigger if the market drops 5%, 10%, or 15%. Crypto spot and derivatives trading is fully enabled today, and Coinbase says it is rapidly expanding into stocks, index funds, prediction markets, and commodities. The stated principle is simple: if an asset is on Coinbase, Coinbase wants it available to your agent.

Capital Efficiency Around the Clock

The second use case turns the agent into an always-on treasury manager. It monitors your cash position continuously, making sure idle funds are always working, whether that means earning rewards, staying optimally allocated, or flagging positions that need your attention. Because it analyzes your real-time holdings, it can execute moves you have preset without you babysitting the portfolio. This is the kind of unglamorous, compounding optimization that most retail investors never do consistently, and it is exactly the kind of work software does better than humans.

Data-Informed Trades and the x402 Connection

The third use case points at the machine economy. Agents can pay for premium data and services, like proprietary datasets that sharpen trading decisions. Coinbase for Agents will soon be x402-enabled, which makes paying for anything from compute and statistics to images and services seamless. The worked example is a dollar-cost averaging strategy with a twist: the agent pulls 30 days of hourly ETH price data, identifies the time of day ETH historically trades lowest, sets a recurring $20 market buy at that hour, and schedules it daily for two weeks. The human sets the goal once; the machine handles the data analysis, the scheduling, and the execution.

Limits, Permissions, and Built-In Compliance

Coinbase emphasizes that limits and control were built in from day one. The agent can operate inside its own isolated portfolio with no external visibility or access into your other holdings, or it can use your main Coinbase account if that is what you want. Either way, it only touches what you have explicitly permissioned. Soon, users will be able to set exact rules: maximum trade size, what the agent can interact with, and how much it can spend. Coinbase’s analogy is giving a gift card rather than handing over your bank account. On the regulatory side, payments made through Coinbase for Agents pass through the same transaction monitoring and KYT checks that power Coinbase itself, so compliance comes built in rather than bolted on.

Coinbase Advisor and the Full Agentic Suite

For users who do not want to connect anything external, Coinbase integrated an agent directly into the Coinbase app. Coinbase Advisor is a dedicated in-app agent providing recommendations and guidance, and it is a registered financial advisor: Coinbase Advisors, LLC is a Commodity Trading Advisor registered with the NFA and a Registered Investment Advisor registered with the SEC. Coinbase describes these products as the start of a full consumer agentic suite, spanning everyday investors to autonomous agents operating entirely on their own. For businesses, Coinbase Payments adds agentic money acceptance, so companies can receive agent-initiated payments too.

MCP or CLI: Two Ways In

Coinbase built for both major styles of AI usage. The MCP is the fastest path for web-based agent harnesses like ChatGPT or Claude Web: a single login connects your agent with no setup, no configuration, and no API keys. The CLI plus Skill is built for terminal-based environments like Claude Code, Codex, or OpenClaw, with lower token overhead, local customization, and full composability with an existing developer toolchain. Getting started today means following the Coinbase CLI skill docs and creating a Coinbase Developer Platform (CDP) API key. A remote MCP is coming soon that will require nothing more than sign-in-with-Coinbase, no API keys or coding at all.

The Multi-Year Build Behind the Launch

Coinbase notes it has been building toward this for a while. AgentKit arrived in 2024, giving developers the ability to put wallets in the hands of agents. Then came x402, the agentic payments protocol created last year. Coinbase for Agents is the third act, bringing the full Coinbase account into the AI agent you already use. Read as a sequence, it is a deliberate strategy to own the financial rails of the agent economy: first wallets for agents, then payments between agents, now full trading and spending accounts for agents.

Notable Quotes

“Coinbase for Agents connects your AI agent directly to your Coinbase account so it can trade, pay, and execute workflows on your behalf, all within limits you control.”
Coinbase, summarizing the launch in one line

The official TL;DR of the announcement, and the clearest statement of what just shipped.

“By giving your AI agent direct access to Coinbase, your agent can now do both financial reasoning and execution.”
Coinbase, on closing the gap between AI research and AI action

The core unlock: LLMs could already think about money, now they can move it.

“As that ecosystem grows, Coinbase for Agents is positioned to be your trading and spending account at the center of it.”
Coinbase, on the agent-first internet

The ambition statement: Coinbase wants to be the default financial account of the agent economy.

“While crypto spot and derivatives trading is fully enabled today, we are rapidly expanding our capabilities to include trading stock and index funds, prediction markets and commodities. If it’s on Coinbase, we want it available for your agent.”
Coinbase, on the asset roadmap

Crypto first, everything else next. Agents get the full exchange.

“It only ever touches what you’ve explicitly permissioned it to do.”
Coinbase, on agent permissions

The single most important trust property of the entire product.

“Think of it like giving a gift card rather than handing over your bank account. You define the limits. Your agent executes within them.”
Coinbase, explaining the control model

The analogy that will sell agentic finance to normal people.

“It started with AgentKit in 2024, giving developers the ability to put wallets in the hands of agents. Then x402, an agentic payments protocol created last year. And now: Coinbase for Agents to bring your Coinbase account into the AI agent you already use.”
Coinbase, on the multi-year strategy behind the launch

Three product launches, one thesis: agents need money rails, and Coinbase is building them.

Agentic finance is no longer a thought experiment. It is a product you can connect to your account today, and it settles in crypto. Read the full announcement from Coinbase here.

Related Reading
- Coinbase for Agents announcement (Coinbase blog) the primary source for everything covered in this post.
- Coinbase Developer Platform docs where you create the CDP API key and find the CLI skill instructions to connect your agent.
- x402 agentic payments protocol the open protocol that will let agents pay for data, compute, and services seamlessly.
- Model Context Protocol (MCP) the open standard that lets AI assistants connect to external tools and accounts like Coinbase.
- Bitcoin.org the canonical starting point for understanding the asset most likely to anchor agent-driven accumulation strategies.
June 11, 2026
Thomas Laffont of Coatue on the $4 Trillion AI IPO Wave: SpaceX, Anthropic, OpenAI, and Why the New Unicorn Economy Is Healthier
Thomas Laffont, co-founder of the $55 billion hedge fund Coatue Management, made his All-In Podcast premiere with a data-dense walk through what he calls a once-in-a-generation moment for the unicorn economy. In front of Chamath Palihapitiya, Jason Calacanis, David Sacks, and David Friedberg, he argued that a roughly $4 trillion wave of private value is about to hit the public markets, led by SpaceX, Anthropic, and OpenAI, and that the new AI-driven unicorn economy is actually healthier than the one that came before it. You can watch the full presentation and Q&A on YouTube.

TLDW

Laffont presents Coatue’s slide deck on the state of the unicorn economy and argues it has rebalanced after the excesses of 2021. The average unicorn is up about 70 percent since September 2024, AI keeps taking a bigger share of all fundraising, and the model has shifted from many small unicorns to fewer companies each raising far more, with funding per unicorn up roughly 5x since 2021. He introduces a “Magnificent 8” private index (SpaceX, Stripe, Anthropic, Databricks, Revolut, ByteDance, Anduril, and more) worth nearly $4 trillion that has crushed the public Mag 7, then shows that exits are finally thawing as SpaceX heads to an IPO in weeks and Anthropic confidentially files its S1. He lays out Coatue’s “CODE” framework for why SpaceX gets more valuable the more it launches, a counterintuitive finding that the odds of a 10x actually rise as companies get bigger (31 percent for $100 billion-plus centicorns), the explosive revenue ramp of OpenAI and Anthropic past Workday, ServiceNow, Adobe, Salesforce, and now the hyperscalers, a three-pillar map of where AI revenue comes from (consumer, ads, enterprise), and the AI memory thesis. The Q&A with Chamath and Calacanis digs into the power law, K-shaped outcomes, whether these valuations are disconnected from reality, the public market as the great antiseptic, and what happens when trillions in private value finally recycles back through GPs and LPs.

Thoughts

The most useful idea in the talk is not the $4 trillion headline, it is the cohort-health chart. Laffont splits unicorns into eras and shows that the pre-2021 cohort was healthy, roughly 80 percent had raised again or exited 20 quarters after minting, while the giant 2021 ZIRP cohort of 479 companies is stuck with under 20 percent doing either. That single comparison reframes the whole AI boom. The bullish read is that the 2024 AI cohort is small, concentrated, and cash-generative, so it looks more like the healthy pre-ZIRP group than the 2021 hangover. The bearish read is that we are watching the same movie with bigger numbers, and the test only comes when these companies face public markets. Laffont is honest that we do not yet know which cohort the AI class resembles, and that intellectual humility is what makes the deck credible rather than promotional.

The SpaceX “CODE” framework is the sharpest analytical move of the presentation. Most people would assume a launch business gets cheaper per launch as it scales. Laffont shows the opposite, the market pays more per launch as cadence rises, and explains it as a phase change in business quality: from one-time government launch revenue, to a single recurring-revenue constellation, to multiple constellations, to a platform with optional upside in space data centers, the moon, and Mars. It is a clean way to think about any company that climbs from a project business to a platform business, and it applies far beyond rockets. The lesson for investors is that valuation can rationally expand even as unit economics look like they should compress, because the nature of the revenue underneath is changing.

The counterintuitive 10x odds finding deserves more attention than it got in the room. Conventional wisdom says the bigger you are, the harder it is to grow, so a $100 billion company should be less likely to 10x than a $10 billion one. Coatue’s data says the reverse: centicorns have a 31 percent shot at a 10x, far higher than the 8 percent a unicorn has at becoming a decacorn. Laffont’s explanation is a filtering mechanism, every step up validates a compounding advantage and durability of earnings, so survivors are increasingly the kind of business that keeps compounding. This is essentially a quantitative restatement of quality investing, and it is the intellectual backbone of the LP strategy the besties tease out, just buy whoever reaches $100 billion and hold.

Where the argument gets genuinely contested is valuation, and the panel does not let it slide. The pushback that “these are not fake companies” is true and important, OpenAI and Anthropic are growing faster than any software company in history, and Anthropic reportedly had a profitable month. But growth and reality do not settle the question of price when you are paying 50 to 100 times revenue for trillion-dollar private companies, as Bill Ackman pointed out earlier in the day. Laffont’s answer is the most grounded thing he says all session: the public market is the great antiseptic, it will not care about anyone’s slide deck, and he wants to see these names withstand short sellers and skeptics. That is the right posture. The deck is a thesis, not a verdict, and the verdict arrives roughly six months and one day after the IPOs, once passive flows and supply have washed through.

The closing thread, that almost every sector is being transformed at once and we still do not have superintelligence, is the part worth sitting with. The risk in a presentation this bullish is treating the trend as destiny. The value is in the framing tools Laffont hands you, cohort health, phase-change business quality, the filtering odds, the three revenue pillars, and the antiseptic of public scrutiny. Use those to interrogate each name rather than to buy the index on faith, and the talk earns its premiere billing.

Key Takeaways
- Coatue Management is one of the most successful hedge funds of the last two decades with about $55 billion under management, and is raising roughly another billion dollars specifically to invest in AI.
- The unicorn economy is up about 70 percent on average since September 2024, and the public market has made a similar move up over the same period.
- The unicorn economy’s share of the NASDAQ rose significantly after 2015 but has plateaued in recent years, reflecting strong performance from public companies.
- AI keeps increasing its wallet share of all venture fundraising, multiple years in a row now.
- The composition of funding has changed. The unicorn “factory” peaked in the ZIRP era of 2021 and has normalized at a much lower level since.
- Funding per unicorn has increased roughly 5x since 2021. There are fewer unicorns, and each one is raising more.
- Cohort health, pre-ZIRP group: of about 73 unicorns, 20 quarters after minting roughly 80 percent had either raised a new round or exited, which is healthy.
- Cohort health, 2021 group: of about 479 unicorns, 20 quarters in, fewer than 20 percent had exited or raised again. Far larger cohort, far worse outcomes.
- The open question is which cohort the new 2024 AI cohort will resemble.
- Funding is concentrating: the top 10 companies capture a large share, and it is a small number of AI companies, not all of them, with Anthropic and OpenAI raising massive rounds.
- Laffont proposes a “Magnificent 8” private index: SpaceX, Stripe, Anthropic, Databricks, Revolut, ByteDance, Anduril, and more, spanning internet, AI, fintech, and space tech.
- That private index represents almost $4 trillion of value and has crushed the traditional public Mag 7, with almost every name outperforming.
- Exits are thawing. 2026 is on a good trend for cash returned versus consumed, not quite 2021 levels, with half a year still to go.
- That trend does not yet include three imminent liquidity events: SpaceX (IPO expected in weeks) and Anthropic (confidentially filed its S1), whose combined value could exceed the prior decade of exits combined.
- The ecosystem is far more balanced than when Laffont first presented at the 2024 All-In Summit, when it was consuming much more cash than it returned.
- OpenAI and Anthropic revenue growth is unlike anything previously seen. Starting from January 2025, they passed Workday, then ServiceNow, then Adobe, then Salesforce, and are now bigger than Google Cloud and Azure.
- On current forecasts, that revenue could pass AWS by the end of the year and exceed all of Microsoft by 2028.
- Hyperscalers are not sitting still. The largest companies in the world are funding the disruption, investing unprecedented sums to enable the ChatGPT moment.
- The SpaceX “CODE” framework: the number one driver correlated to SpaceX’s valuation is cadence of launches, and valuation per launch rises as launches increase.
- Why per-launch value rises: business quality improves through phases, pre-constellation (one-time government revenue), initial ramp (one recurring-revenue constellation), scale (multiple constellations), and platform (space data centers, moon and Mars optionality).
- Anthropic in particular is scaling like no company seen across the PC, internet, or mobile eras.
- Counterintuitive 10x odds: a unicorn has about an 8 percent chance of becoming a decacorn, a decacorn has 8 to 13 percent odds of reaching $100 billion, but a centicorn ($100 billion-plus) has a 31 percent chance of a 10x.
- Value creation has accelerated. It typically takes years to go from $500 billion to $1 trillion in market cap, yet recently three companies did it in one year and two did it in a matter of weeks.
- Cerebras is the counterexample of slow success: years of dark periods and no new capital developing its technology, then a massive OpenAI contract that quintupled the company’s value ahead of its IPO.
- Semiconductors are on a generational run, with the sector dramatically outperforming the index since the 2024 All-In Summit.
- AI memory thesis: the more an AI system knows about you, the more useful it is, so memory per user could quintuple, which helps explain recent moves in memory companies.
- Where the revenue is: the AI ecosystem is roughly $140 billion today, about $300 billion this year, and is expected to double in 2027.
- Three revenue pillars: consumer (subscribers times ARPU), ads (about a quarter of Meta and Google ads are AI-enabled today, heading toward 100 percent and roughly $150 billion), and enterprise (tools like Claude Code and Codex inside businesses).
- Disruption is hitting every sector: software, telco (Starlink-powered global phone calls), semis, energy (data centers reshaping Pennsylvania’s grid), auto (Ferrari’s electric and autonomous stumble), and consumer (GLP-1s reshaping food, alcohol, and wellness).
- Final takeaways: the new unicorn economy is healthier thanks to AI, winners are compounding faster so the cost of not owning a winner is higher than ever, disruption is everywhere, and we do not even have superintelligence yet.
- In the Q&A, both Anthropic and OpenAI publicly say they want to be public, and big outcomes now look likely to become liquid within roughly a 12-month window.
- The valuation pushback: these are not fake companies, they generate substantial revenue at scale and grow faster than anything before, and Anthropic reportedly even had a profitable month.
- The public market is framed as the great equalizer and antiseptic, but with passive buying the true price discovery may not land on day one, more like six months and a day after listing.
- A floated LP strategy: wait for whoever reaches $100 billion and concentrate capital there as the least brittle, quickest-return bet, tempered by the warning that valuations are disconnecting from any historical metric (50x to 100x revenue).
- An open risk: with so much capital, OpenAI and Anthropic could rationally start a price war, the way ride-sharing and food-delivery players once did, though heavy infrastructure spend complicates it.
Detailed Summary

The unicorn economy has rebalanced after 2021

Laffont opens by reframing a market many assume is frothy. The average unicorn is up about 70 percent since September 2024, and the public market has tracked a similar climb, so private and public value are moving together rather than diverging. The unicorn economy’s share of the NASDAQ rose sharply after 2015 and then plateaued, which he reads as a sign of how strong public companies have become. Underneath the headline, the structure of funding has changed. The 2021 ZIRP era was a unicorn factory that minted enormous numbers of companies, and that machine has since normalized to a much lower level. The result is a barbell: fewer new unicorns, but each raising far more, with funding per unicorn up roughly 5x since 2021. AI sits at the center of this, taking a steadily larger share of all venture dollars for several years running.

Cohort health is the real story

The deck’s most important slide measures the health of the ecosystem by cohort. The pre-ZIRP cohort, about 73 unicorns, looks healthy: 20 quarters after becoming unicorns, roughly 80 percent had either raised a new round or exited. The 2021 cohort tells the opposite story. It is enormous, about 479 unicorns, and 20 quarters in, fewer than 20 percent had raised again or exited. That contrast sets up the central question of the talk. A new 2024 cohort of AI companies is forming, and no one yet knows whether it will resemble the healthy pre-ZIRP group or the bloated, stuck 2021 group. Laffont’s framing leans optimistic because the AI cohort is small and concentrated, but he is careful not to declare the answer.

The Magnificent 8 and a $4 trillion private index

Funding is not just flowing to AI, it is flowing to a handful of AI names, with the top 10 capturing a large share and Anthropic and OpenAI raising the biggest rounds. From this concentration Laffont builds a private index he half-jokingly calls the Magnificent 8, a number he expects to shrink as companies go public. The members span sectors: SpaceX, Stripe, Anthropic, Databricks, Revolut, ByteDance, and Anduril, covering internet, AI, fintech, and space tech. He says he would be comfortable owning that index for the next decade-plus. Collectively it represents almost $4 trillion of value and has outperformed the public Mag 7, with nearly every constituent beating that benchmark.

Exits are thawing and a wall of liquidity is coming

One of Laffont’s recurring concerns at past summits has been balance: the unicorn economy is great at consuming cash, but a healthy ecosystem must also return it. On that score 2026 is trending well, not quite 2021, but solid with half a year left. Crucially, that figure does not yet include three imminent events. SpaceX is expected to go public within weeks, and Anthropic confidentially filed its S1 the day of the talk. Adding those up, just a few companies could deliver more liquidity than the prior ten years combined. The takeaway is that the ecosystem that was dangerously out of balance in 2024 is now meaningfully more balanced, and improving.

The revenue ramp past the hyperscalers

The growth rates of OpenAI and Anthropic, Laffont argues, are unlike anything previously seen. Charting from January 2025, the leading AI labs passed Workday, then ServiceNow, then Adobe by year end, then Salesforce by January, and are now bigger than Google Cloud and Azure. On forecast, that revenue could surpass AWS by the end of the year and exceed all of Microsoft by 2028. He stresses that the hyperscalers are not passive bystanders, they are actively funding the disruption, pouring unprecedented capital into enabling the change that began with the ChatGPT moment.

The SpaceX CODE framework

Laffont devotes real time to how Coatue thinks about SpaceX. The single factor most correlated with SpaceX’s valuation is cadence of launches, which is intuitive for a launch business. The surprise is that valuation per launch has risen rather than fallen as cadence climbed. His explanation, the CODE framework, is that the quality of the business model improves the more SpaceX launches. In phase one, pre-constellation, you are simply proving rockets, with a few government customers and lumpy, unpredictable one-time revenue. In the initial ramp you stand up a constellation, which is an end market and a recurring-revenue business that grows with every satellite and subscriber. At scale you operate multiple constellations, and Laffont expects companies, governments, and militaries to want to own their own. Ultimately it becomes a platform, with new businesses layered on top, from space data centers to the optionality of the moon and Mars.

Counterintuitive odds and the speed of value creation

Coatue bucketed companies and asked the odds of a 10x within each. A unicorn has roughly an 8 percent chance of becoming a decacorn. A decacorn has 8 to 13 percent odds of reaching $100 billion. But a centicorn, $100 billion or more, has a 31 percent chance of a 10x, counting both public and private companies. The bigger you are, the better your odds, which inverts intuition. Laffont pairs this with the sheer speed of recent value creation. Going from $500 billion to $1 trillion in market cap normally takes years, yet three companies did it in a single year and two did it in a matter of weeks. He also offers Cerebras as the patient counterexample, a chip company that endured years of dark periods and no new capital before a massive OpenAI contract quintupled its value ahead of IPO, part of a broader generational run for semiconductors.

AI memory and where the revenue actually comes from

A throughline from the day’s other speakers is that the more an AI knows about you, the more useful it is, from your restaurant preferences to your work context. Laffont turns that into a thesis: memory per user could quintuple based on what these systems require, which helps explain recent moves in memory companies. He then tackles the most contested question, where is the revenue. He sizes the AI ecosystem at about $140 billion today, roughly $300 billion this year, and doubling in 2027, built on three pillars. Consumer is subscribers times ARPU. Ads are the pillar people forget, with about a quarter of Meta and Google ads already AI-enabled and penetration heading toward 100 percent, a roughly $150 billion opportunity. Enterprise is the breakthrough category, exemplified by tools like Claude Code and Codex operating inside businesses.

Every sector is being transformed at once

What makes this era different, Laffont says, is that nearly every sector is being transformed simultaneously. Software is obvious, but look at telco, where he believes Starlink will soon power a device that lets you make a phone call anywhere on earth, attacking the global telco and broadband profit pool with a better product. Compute is driving massive change in semis, data centers are reshaping the energy equation in places like Pennsylvania, and the auto business is being upended, as Ferrari’s stumble introducing electric and autonomous technology showed. In consumer, GLP-1 drugs are profoundly changing consumption of food and alcohol and the broader focus on wellness. His takeaways close the loop: the new unicorn economy is healthier thanks to AI, winners are compounding faster so the cost of missing them is higher than ever, disruption is everywhere, and superintelligence has not even arrived yet.

The Q&A: power law, valuation, and the public market test

Chamath and Jason Calacanis press Laffont on what this means for allocators. The recurring theme is the power law and K-shaped outcomes, with gains consolidating into a small number of companies. The positive side, Laffont notes, is that outcomes are enormous and increasingly liquid within a 12-month window, and both Anthropic and OpenAI say they want to be public. The hard part is valuation. The besties cite Bill Ackman’s framing that investors are making venture bets on trillion-dollar companies at 50 to 100 times revenue. Laffont’s pushback is that these are not fake companies, they generate substantial revenue at scale and grow faster than anything before, and Anthropic reportedly had a profitable month. But he embraces the discipline ahead: the public market is the great antiseptic and will not care about anyone’s presentation, though with heavy passive buying, true price discovery may take roughly six months and a day rather than landing on day one. Asked whether the compounding is a market inefficiency or survivor bias, he declines to over-read a small sample, noting that Anthropic before Claude Code was a completely different company than after. The conversation closes on what happens when trillions recycle from GPs to LPs, the case for simply owning whoever crosses $100 billion, the risk of everyone crowding into three names, and the possibility of an eventual OpenAI versus Anthropic price war.

Notable Quotes

“So we have fewer unicorns that are each raising more.”
Thomas Laffont, summarizing how funding per unicorn has risen roughly 5x since 2021

“The reason is that the quality of SpaceX’s business model increases the more you launch.”
Thomas Laffont, explaining the CODE framework and why valuation per launch rises with cadence

“The winners are compounding faster than ever, which means the costs of not being in a winner are higher than ever.”
Thomas Laffont, on the central risk of a power-law market

“And by the way, we don’t even have super intelligence yet.”
Thomas Laffont, closing his takeaways on how early the transformation still is

“These are companies generating substantial revenue at scale that are growing faster than anything we’ve ever seen.”
Thomas Laffont, pushing back on the idea that AI valuations rest on fake companies

“It will be the great antiseptic. It will not care about my presentation.”
Thomas Laffont, on the public market as the ultimate test for SpaceX, OpenAI, and Anthropic

“Anthropic pre-cloud code was a completely different company than post cloud code.”
Thomas Laffont, on why he won’t over-read a small sample of hyper-compounders

“The power law rules our lives. All the great gains are being consolidated into small numbers of companies.”
An All-In host, framing the Q&A on concentration in private markets

This is a curated set of highlights. To hear the full presentation, the slide walkthrough, and the complete Q&A with Chamath and Jason Calacanis, watch the full conversation here.

Related Reading
- Coatue Management. Primary source for Thomas Laffont’s firm and the technology investing strategy behind the deck.
- The All-In Podcast. The show and summit where Laffont made this premiere presentation.
- Power law (Wikipedia). Background on the distribution Laffont and the hosts say governs venture and public-market returns.
- The Magnificent Seven (Wikipedia). The public-market benchmark Laffont’s private “Magnificent 8” index is measured against.
- Cerebras Systems. The AI chipmaker Laffont cites as the slow-grind IPO that was eventually transformed by a major OpenAI contract.
June 4, 2026
The AI Industrial Revolution: Naval, Guillermo Rauch, Blake Scholl, and Max Hodak on Software Factories, Vibe Coding Hardware, AI Regulation, Healthcare Economics, and What Humans Can Uniquely Do
This is the full episode of Naval Ravikant’s conversation with three frontier founders: Guillermo Rauch of Vercel, Blake Scholl of Boom Supersonic, and Max Hodak of Science. The premise is that all three are building their own factories rather than assembling off-the-shelf parts, so the interesting question is not what they are building but what they are learning about how to build in the age of AI. Over roughly an hour the discussion moves from software factories and the thousand-x engineer into hardware, regulation, healthcare economics, autonomous companies, and a long closing argument about what humans can still uniquely do. Watch the full conversation on the Naval Podcast YouTube channel. We previously published two segments of this same discussion: part one, Waste Tokens to Save Time, on software factories and whether pure software is dead, and part two, Vibe Coding Hardware, on jet engines, vertical integration, and China’s open-source bet. This post covers the entire episode end to end.

TLDW

Four builders argue that AI has turned the engineer’s job from shipping output into building the factory that produces output, which is why token leaderboards are the new vanity metric and why you should waste tokens to save time. Guillermo Rauch frames the thousand-x engineer and the building-block economy, and asks whether pure software is dead now that models speak English. Blake Scholl shows how Boom turned hardware engineering into software, letting two engineers design an entire jet engine and collapsing months of regulatory compliance documentation into minutes. Max Hodak makes the case for extreme vertical integration, a captive MEMS foundry, and a sober counter to Silicon Valley deregulation triumphalism: the bottleneck is the voters and the regulator’s asymmetric incentives, not just bad rules. The group works through healthcare as a fixed-bucket non-market, China’s cost-reduction strategy and its approved implantable brain interface, autonomous software that runs site reliability and security research with thousands of concurrent agents, a company-wide hackathon where the receptionist shipped a real automation, and a long debate on creativity, out-of-distribution surprise, intent, attribution, and the definition of art. The throughline: humans become verifiers, value moves to creativity, taste, and agency, and the single best move is to get extremely good with the tools, because it is people with AI versus people without AI.

Thoughts

The strongest idea in the episode is the quiet redefinition of what an engineer is for. Rauch’s point is that you no longer judge a person by how well they ship a single output. You judge them by whether they can build the factory that produces outputs B through Z. That reframe instantly explains why token leaderboards are nonsense. Counting tokens consumed is the same category error as counting lines of code written, a measure of motion mistaken for a measure of progress. Naval’s “waste tokens, save time” is the correct response: tokens are cheaper than people, so optimize for your own wall-clock time and the final output, and throw three models at the same problem if that gets you unstuck faster. The uncomfortable corollary, which the group says out loud, is that leverage in idea domains was never linear. The hundred-x and thousand-x engineer is not a new phenomenon. AI just made it impossible to keep pretending otherwise.

The second thread that ties the whole hour together is verification. Everyone converges on the same future: humans stop producing the work directly and move up the stack to signing off on it. Rauch is precise about what that means. Saying “I understand this pull request” no longer requires reading every line. It requires being able to say you wrote the test harness, the proofs, the type checkers, and the simulations that let you stand behind it in production. That is a profound shift, because it accepts that the code may be spaghetti you do not fully understand while insisting that the evaluator around it is trustworthy. Blake extends the same logic to regulation, and this is the most underrated argument in the episode. If you treat a 200-page lightning-strike compliance document as a test suite and a regulation as an exit criterion for an agent loop, then a body of rules you once resented becomes a guard rail that lets you move faster, not slower. The cost of change collapses, change aversion drops, and you can finally afford to iterate on physical things.

Max Hodak is the adult in the room on regulation, and the episode is better for it. The Silicon Valley consensus is that regulation is simply friction to be deleted, and there is plenty of dysfunction to point at: the NRC permitting essentially zero nuclear plants for decades, the FDA’s asymmetric incentives where approving a bad drug ends a career but blocking a good one costs nothing visible. But Hodak keeps pulling the conversation back to the harder truth. This is where the voters are. If you removed the current regulatory package, something very similar would get voted right back in, because the asymmetry reflects how the public actually weighs a visible death against an invisible delay. Real reform is not “deregulate,” it is narrow and surgical: prohibit the FDA from drawing adverse inferences across different users of a compound, build innovation zones where people consent to different rules, or copy Europe’s notified-body model so review capacity can actually scale. That is a far more serious position than the usual abundance-or-bust framing.

The healthcare segment is the part of this conversation you will not find in the two clips, and it is the most heterodox. Hodak’s diagnosis is that healthcare is a fixed bucket of money that grows with tax receipts, not a technological growth industry where falling prices expand the market the way phones and laptops did. Because there is no real private market, you get a small communist society running inside a larger capitalist one, with the waiting lines and frozen product quality that implies. His prescription is not single payer and not insurance reform. It is to drive the cost of bringing devices and drugs to market so low that a patient can buy a restored sense or an extra decade of life on a credit card, the way they finance a car, and his warning is that China’s lower approval costs and its already-approved implantable brain interface put it on track to do exactly that. Whether or not you buy the twenty-percent-of-income deductible he floats, the framing that a private market is the missing feedback loop is the kind of argument that gets too little airtime.

The closing debate on creativity is where the four of them disagree most productively, and they are careful enough to notice that their conclusions follow from their definitions. Hodak defines art as meaningful out-of-distribution behavior, which lets a military maneuver or a math proof count, and leads him to think a sufficiently capable model gets there too. Naval defines art as conveying an emotion with intent, which makes attribution load-bearing: the same photo down to the last pixel means more when a human took it, and a startup doing hardware attestation of human authorship suddenly has a real market. The shared observation that should worry every builder is that AI output collapses to a distribution mean. Every Claude-built website ends up the same serif font, the same brown and cream, the same monospace spacing, recognizable as slop precisely because it is in-distribution. The optimistic read, and the one Naval lands the episode on, is that this leaves an enormous and durable lane for humans who can step outside the system, and that the practical move for everyone is simply to become excellent with the tools, because the real divide is people with AI versus people without.

Key Takeaways
- The job of an engineer has shifted from shipping a single output to building the factory that produces multiplicative outputs, so people are now judged on the leverage they create rather than the work they personally do.
- There were always 10x engineers, and in idea, intellectual, and digital domains the real spread is 100x or 1000x. AI leverage just made that gap impossible to deny.
- Token leaderboards and token consumption are the new lines-of-code: a measure of activity that does not map to value. Measure your own time and the final output instead.
- Waste tokens to save time. Models are still far cheaper than a human, so throwing Codex, Claude, and Gemini at the same problem repeatedly is rational even when it looks wasteful.
- Low-quality first-pass code is fine because you can spend more tokens later to harden it for production. The constraint is verifiable domains, not code quality.
- A model is roughly as good as you are in a domain. The quality of your prompting and reprompting strongly determines the output, though this dependence should fade as models improve.
- Models graduated from junior to principal engineers: they now return with multiple routes and tradeoffs rather than running away with the first idea, even if their time and cost estimates are often wrong.
- A junior gets knowledge they could never have produced alone, but an experienced architect still extracts far more juice. Taste and judgment, like picking Postgres versus ClickHouse, remain the human’s edge.
- Pure software’s moat is in question now that models speak fuzzy, sloppy English. For hardware founders this is a boon, since good software finally becomes cheap to produce.
- The building-block economy, from Mitchell Hashimoto, argues agents need powerful reusable infrastructure rather than reinventing queues and databases every time. Shared dependencies are a cooperation value, like everyone depending on the same Postgres version.
- Naval and Max both stopped writing code for years, then started building software they use daily through agents, on the strength of understanding how the pieces fit rather than syntax.
- With agents you stop getting stuck on narrow debugging problems that used to consume indefinite time. The intrinsic frustration that was once “how you learn” is largely gone.
- Boom turned siloed hardware engineering, much of it trapped in Excel and VBScript with no source control, into real software with automated testing and repeatable flows.
- Software engineers now build the architectures and hardware engineers vibe code their pieces, letting two engineers design an entire jet engine where a single turbine-blade analysis once took one engineer a full day across a thousand blades.
- Enterprise collaboration software and even spreadsheets are getting cooked, because you can now code the exact custom tool you need instead of approximating it.
- AI will soon generate step files and PCB layouts, bringing the current software boom to mechanical and electrical engineering, likely within the year.
- China is betting on open-source models because its hardware and supply-chain superiority pairs with on-demand software generation to erase Silicon Valley’s software advantage. Fall behind on generating software and you fall behind on generating everything.
- In real usage, frontier intelligence dominates the top. Gemini “slaps at scale” as an industrial production model for support and browser automation, while Chinese models are not in the frontier coding tier.
- Intelligence is an unalloyed good. Because mistakes are invisible and models are cheaper than people, you reach for the smartest available model rather than running a weaker one many times.
- Max’s vertical integration thesis: when you cannot buy a part, you make it. Science owns a captive MEMS foundry because tighter integration toward a single block of bonded matter yields lower power, smaller size, and longer life.
- AI’s biggest near-term impact inside hardware companies is regulatory: generating documentation and tracing which of thousands of ISO standards apply, work that used to occupy a quality team for months.
- Junior engineers got promoted to senior and junior engineering got handed to agents. The same pattern hits law, where basic NDAs and red lines no longer require a lawyer.
- Humans are becoming verifiers. Signing off on a PR means standing behind its consequences via tests, proofs, and type checkers, not reading every line. Creating software is easy; keeping it secure, tested, and maintained 1000 days out is the real question.
- A RAG over regulatory documents collapses a 200-page compliance test plan from months to minutes, which cuts change aversion: you can alter the airplane and regenerate compliance instead of crying over rework.
- Regulations can act as a test suite and exit criteria for agent loops, as long as they are non-contradictory and reasonable. The alternative is shipping slop directly into the air.
- Physical building is guilty until proven innocent, illustrated by the absurdity of pre-filing a driving plan before every trip. The fix is more enforcement-based regulation rather than pre-approval, though agents on both sides could trigger a red queen race and DDoS overwhelmed agencies.
- Regulation often fails to make things safer, only slower: the 737 Max shipped a single sensor with full authority over pitch, and the NRC kept us perfectly safe by approving almost no nuclear plants for decades.
- The deeper problem is the voters and the regulator’s asymmetric incentives. Approve a bad thing and your career ends; block a good thing and nobody notices. Removing one agency just elects its replacement.
- Targeted fixes beat blanket deregulation: bar adverse inferences across users of a compound, use single-patient IND pathways, create opt-in innovation and YIMBY zones, or adopt Europe’s competitive notified-body reviewers.
- Healthcare is a fixed bucket of money tied to tax receipts, not a growth industry, so spending 10x more on it would be a catastrophe rather than a triumph. With no private market you run a small communist society inside a capitalist one.
- The escape is lower cost-to-market, not single payer, so people can finance care like a car. China’s lower approval costs and its already-approved implantable BCI point that direction. LASIK, dental, and plastic surgery advance because patients pay directly.
- End-of-one medicine works at the high end, as with GitLab’s Sid Sijbrandij outliving his cancer prognosis through a self-built escalation ladder, but it demands enormous agency at the patient’s weakest moment. AI should democratize that knowledge.
- Vercel automated much of site reliability engineering: anomalies fire alerts, an agent investigates, can open an incident, and begins remediation, stopping just short of changing production itself.
- Running an open-sourced security tool against the whole monorepo with 10,000 concurrent agents produced several quarters of security research in a couple of days for about $14,000 in tokens. Code translation and optimization are similarly autonomous now.
- Blake stopped all project work for a week and had everyone, receptionist to engineers, build something with AI and demo it. He expected mostly silly projects and got mostly needle movers, including a real automation from shipping and receiving.
- The autonomous company of the future may have a workforce that trains the agents doing the work rather than doing it directly, with tooling that extracts reusable skills from your inputs and outputs.
- Returns are shifting from intelligence toward agency for humans, since agents supply the intelligence. The people best fit for the future open a coding agent and ask what to build instead of defaulting to passive consumption.
- Maybe 10x more people are coding than a year ago, yet around 99% still never will, because to a non-coder the starting step remains unimaginable. Vibe coding is described as more addictive and entertaining than video games, with real output.
- AI video lacks taste and judgment for now, but by 2030 expect fan-made films: dozens of Lord of the Rings takes, or generating unmade seasons of The Expanse from the books. The bigger prize is a genuinely new imaginative work, not a remix.
- What humans uniquely do is generate meaningful surprise out of the training distribution, with intent that makes it mean something. Gödel stepping outside the formal system is the archetype; Claude’s identical-looking websites are the counterexample of in-distribution slop.
- Higher productivity historically means you hire more, not fewer, of the productive people. Expect a larger number of smaller teams, an entrepreneurship explosion, and generalists winning as credentials matter less than creativity, taste, and judgment.
- The throughline is people with AI versus people without AI. The single best investment right now is getting genuinely good with the tools and learning the exact edges of what they can and cannot do.
Detailed Summary

Software Factories and the Thousand-X Engineer

Guillermo Rauch opens with the idea that has him “pilled”: the engineer’s job has changed from shipping output directly to building the factory that produces multiplicative outputs. That reframes how you evaluate people and surfaces an old, controversial truth. He used to get flamed on Twitter for asserting 10x engineers, since it offends an equality instinct, but in intellectual and digital domains the real spread is 100x or 1000x, and choosing the right thing to work on is an infinite multiplier on top. AI leverage makes this less controversial, except that people now confuse token spend for productivity. The group agrees token leaderboards are the new lines-of-code. Max Hodak adds that a model is about as good as you are in a domain, so a capable developer gets a powerful collaborator while a junior gets junior-grade help, and the sporadic feedback you give, the reprompting, disproportionately determines the result. Naval’s posture is the opposite of fussy: he ignored every prompt-engineering trick on the bet that the models would improve faster than he could learn to game them, types less and less, and brute-forces problems by throwing multiple models at them. Waste tokens, save time, because tokens are cheaper than people.

Is Pure Software Dead, and the Building-Block Economy

Rauch describes models crossing from junior to principal engineer: they now return with several routes and explicit tradeoffs, push back when you try to jam high-cardinality telemetry into Postgres, and suggest ClickHouse or Athena instead. That elevates taste and judgment as the human contribution. He then poses the hard question: is pure software engineering obsolete now that models speak fuzzy, sloppy English and you no longer need code to communicate with them? For hardware founders it is a boon, echoing Patrick Collison’s line that software is art and artists are hard to hire. To temper the “agents reinvent everything” fantasy, he invokes Mitchell Hashimoto’s building-block economy: you do not want your agent rebuilding a queue from first principles every time it sends an email, and shared dependencies like a common Postgres version carry real cooperation value. Reusable infrastructure becomes more valuable in the agentic era, functioning like libraries and dependencies, or even a token cache, so models fork from existing starting points instead of burning a trillion tokens to recreate what exists. Naval and Max both note they had not written code in years and now build daily through agents, because understanding how APIs, data flow, and performance fit together matters more than syntax, and vibe coding is just transmitting intent the way a good engineering leader already did through people.

Vibe Coding Hardware at Boom Supersonic

Blake Scholl explains how AI changed the role of software and hardware developers at Boom. A great deal of hardware engineering lives in complex Excel spreadsheets and VBScript on individual laptops, with no source control and no automated testing, and handoffs happen manually over email like it is the 1990s. Boom had long tried to turn these flows into real software but could never afford enough software engineers. The new model is that software engineers create the architectures, because they understand systems, algorithms, and separation of concerns, and hardware engineers vibe code their own pieces. The result is mind-blowing productivity for small teams. His example: a turbine blade is cold at rest and expands when hot, so you must design both the cold and hot shapes and convert between structures and aerodynamics, work that took one engineer a full day per blade across a thousand blades in a jet. With a combined software-and-hardware tool you can now change blade geometry and see structural and aerodynamic results in real time, letting two engineers design an entire jet engine. The group extends this to the death of enterprise collaboration software and even spreadsheets, since you can now code the exact custom tool you need, and predicts AI will soon generate step files and PCB layouts, carrying the boom into mechanical and electrical engineering.

China, Open Source, and Which Models Actually Get Used

Naval argues China is going all-in on open-source models because its hardware and supply-chain superiority pairs naturally with on-demand software generation, which erases Silicon Valley’s software edge, and because the Chinese government has a history of funding ecosystem-wide efforts in network-effect businesses. Without frontier coding models there is no self-improvement, so a country that cannot generate frontier software falls behind on generating everything downstream. He notes the irony that almost all the open-source heft now comes from China, since OpenAI is not open, Grok and Google’s local models trail, and Anthropic ships no open models. On real usage, Rauch reports from Vercel’s AI gateway that frontier intelligence dominates the top, with a caveat: frontier intelligence at the right cost and performance, like Gemini, slaps at scale and is the best industrial production model for support and browser automation, while Chinese models are not in the frontier coding tier. Naval frames intelligence as an unalloyed good, since model mistakes are invisible and a smarter model is still cheaper than a person, which pushes everyone toward the most intelligent option and risks an oligopoly in AI.

Vertical Integration, Verifiers, and the Slop Problem

Max Hodak lays out Science’s vertical integration: the preference is always to buy, as with cheap PCBs from Asia, but when components do not exist you must make them, and the closer a product gets to a single block of covalently bonded matter the better it performs. Science owns a captive MEMS foundry on the east coast because there was no other way to do the packaging and assembly it needed. He notes AI’s most surprising internal impact so far is regulatory: generating documentation and tracing which of thousands of ISO standards apply, work that once tied up a quality team for months. Rauch raises the slop problem: mountains of AI-generated code arriving as pull requests nobody can read line by line. His standard is that an engineer must be able to say they understand and will stand behind the consequences of a PR, backed by the test harness, proofs, and type checkers, even without reading it all. Naval generalizes this into humans becoming verifiers, with lawyers, engineers, and operators moving to verifying the stack and standing behind it, and Rauch warns that creating software is the easy zero-to-one part while keeping it secure, tested, performant, and maintained a thousand days later is the real test.

Regulation as Test Suite, and the Voter Problem

Blake describes building a RAG that compresses a 200-page lightning-strike compliance test plan from months of a “monkey at keyboard” engineer’s work into minutes, with a powerful second-order effect: change the airplane and you regenerate compliance in minutes instead of crying over months of rework, which slashes change aversion and lets a small number of creative engineers iterate. Max reframes regulations as potentially good guard rails, a test suite and exit criteria for agent loops, provided they are non-contradictory and reasonable, since the alternative is shipping slop into the air. Naval warns of a red queen race of agent-on-agent compliance and agencies getting DDoSed by clever entrepreneurs flooding them with documents. Blake pushes for enforcement-based rather than pre-approval regulation, using the analogy that we would never tolerate filing a driving plan before every trip, yet that is exactly how physical infrastructure works: guilty until proven innocent. He cites the 737 Max’s single all-authority sensor and the NRC permitting almost no nuclear plants for decades as proof that this makes us slower, not safer. Hodak supplies the counterweight: the deeper issue is the voters and the regulator’s asymmetric incentives, where approving a bad thing ends a career and blocking a good thing goes unnoticed. Remove an agency and the electorate installs its twin. Naval and Max agree the real reforms are narrow, including innovation zones, opt-in YIMBY zones, and the experimental laboratory of fifty states.

Drug Discovery, Healthcare Economics, and End-of-One Medicine

Hodak explains why innovation zones do not solve drug discovery. The right-to-try act and single-patient IND already exist, and the FDA approves over 99% of such requests, sometimes by phone, but dosing requires clinical-grade drug that only the IP owner has, and the FDA will draw an adverse inference against the whole program if a very sick patient does worse. A targeted fix is to prohibit adverse inferences across different users of a compound. He points to Europe’s notified-body system, private certifiers blessed by governments, as a way to scale review capacity, and to China’s CFDA, which already approved an implantable brain-computer interface and brings products to market far cheaper. His core economic argument is that healthcare is a fixed bucket of money that grows only with tax receipts, unlike phones and laptops where falling prices expanded the market, so spending 10x more on healthcare would be a catastrophe rather than the triumph that 10x AI spending would be. With no private market you run a small communist society inside a capitalist one, with the lines and frozen quality that implies. The way out is lower cost-to-market so patients can finance care like a car, which is the direction China is pushing. Naval’s twist is a healthcare plan where the first 20% of income is the deductible to recreate a private market, citing LASIK, dental, and plastic surgery as fields that advance because patients pay directly. The group closes the segment on GitLab’s Sid Sijbrandij, who outlived a rare-cancer prognosis by building his own escalation ladder of drugs, noting that end-of-one medicine works at the high end but demands enormous agency exactly when a patient is weakest, which is where AI should democratize access to knowledge.

Autonomous Software, Hackathons, and the Autonomous Company

Asked how much autonomous software they run, Rauch describes Vercel automating much of site reliability engineering: instead of hand-set alarm thresholds, anomalies in error rate, latency, or throughput fire an alert, an agent investigates, can open an incident that loops in people, and begins remediation, stopping just short of changing production. Vercel also runs autonomous optimization and security research, and an open-sourced security tool run against the entire monorepo with 10,000 concurrent agents produced several quarters of security research in a couple of days for about $14,000 in tokens, the equivalent of months of red teaming. Max shares a vibe-coded bug-reporting queue where TestFlight users submit logs and screenshots, a daemon analyzes and fixes issues in the background, and ships him a build to try, raising the prospect of apps effectively built by their users, with the caveat that you would get a Homer Simpson car of every feature. Blake recounts stopping all project work for a week and requiring everyone, from the receptionist to the engineers, to build something with AI and demo it. He expected mostly silly projects and got mostly needle movers, including a genuinely useful automation from the shipping and receiving associate, concluding that most people have an idea worth building but cannot tell a good first idea from a bad one until they can iterate on a real thing. Rauch extends this to a workforce that trains the agents doing the work rather than doing it directly, and a coming feature to extract reusable skills from your inputs and outputs.

Creativity, Out-of-Distribution Surprise, and What Humans Can Uniquely Do

On the intelligence-versus-agency split, Max suggests returns to humans tilt toward agency since agents supply intelligence, while Naval counters that you stay 99% intelligence and 1% agency because the agents exercise the agency for you. They agree the humans best suited to the future are the agentic ones who open a coding agent and ask what to build. Coding has perhaps 10x more participants than a year ago, yet roughly 99% still never will, because the first step is unimaginable to a non-coder, even as vibe coding proves more addictive and entertaining than video games while producing something real. On AI video, the group notes it still lacks taste and judgment, but expects fan-made films by 2030, dozens of Lord of the Rings takes or generated seasons of The Expanse, while prizing a genuinely new imaginative work over a remix. The long closing debate turns on definitions. Hodak defines art as meaningful out-of-distribution behavior, broad enough to include a military maneuver, and expects models to reach it. Naval defines art as conveying emotion with intent, which makes attribution decisive: the same photo means more taken by a human, and a hardware-attestation startup gains a real use case. They cite Gödel stepping outside the formal system as the human archetype and the identical look of every Claude-built website as in-distribution slop. Naval lands the episode on optimism: productivity gains mean hiring more, not fewer, of the creative and AI-fluent, the future is a larger number of smaller teams and an entrepreneurship explosion where generalists thrive and credentials fade, and the single best move is to get extremely good with the tools, because it is people with AI versus people without AI.

Notable Quotes

“Now clearly there’s 100x or a thousandx engineers and the world hasn’t fully adjusted to this.”
Guillermo Rauch, on why AI made the spread between engineers impossible to ignore

“Just waste tokens, save time. Don’t look at the tokens either as inputs or outputs. Just look at your time and look at the final output.”
Naval Ravikant, on the right way to measure AI’s return

“We had to learn code to communicate with the models. Now the models speak English and they speak fuzzy sloppy English like a human and they understand things.”
Guillermo Rauch, asking whether pure software engineering is now obsolete

“It allows two engineers to design an entire jet engine, which is just wildly different.”
Blake Scholl, on Boom turning hardware engineering into software

“You need to be able to say I am signing off on understanding the consequences of this PR.”
Guillermo Rauch, on what it means to stand behind code you did not read line by line

“That is absolutely the way we build physical infrastructure in this country. It’s guilty until proven innocent. And what we should actually do is make more of these things enforcement based rather than pre-approval based.”
Blake Scholl, comparing the permitting process to filing a driving plan before every trip

“You’re basically running a small communist society inside a larger capitalist society. And that’s what we’re doing in healthcare.”
Max Hodak, on why there is no real private market in healthcare

“I expected we would get a large number of silly projects and a small number of needle movers. And what we got was a large number of needle movers and a very small number of silly projects.”
Blake Scholl, on the week he had the whole company build with AI

“If a person takes the photo versus AI generates the exact same photo down to the last pixel, the person taking the photo will have more meaning for me.”
Naval Ravikant, on why intent and attribution make something art

“It’s about people with AI versus people without AI. And so the single best thing you can be doing right now for yourself is just getting really good with these tools.”
Naval Ravikant, closing the conversation on the only divide that matters

Watch the full conversation here: The AI Industrial Revolution on the Naval Podcast YouTube channel.

Related Reading
- Part one: Waste Tokens to Save Time, our writeup of the first segment, on software factories, the thousand-x engineer, token leaderboards, and whether pure software is dead.
- Part two: Vibe Coding Hardware, our writeup of the second segment, on AI-designed jet engines, vertical integration, China’s open-source bet, and humans as verifiers.
- Naval Ravikant’s official site, the canonical home for Naval’s essays and podcast on technology, judgment, and leverage.
- Boom Supersonic, Blake Scholl’s company building supersonic aircraft and its own jet engines, source of the turbine-blade and two-engineers example.
- Science Corporation, Max Hodak’s brain-computer interface company, whose captive MEMS foundry and FDA arguments anchor the hardware and healthcare segments.
- Vercel, Guillermo Rauch’s company, whose AI gateway data and autonomous SRE work inform the usage and automation discussion.
June 1, 2026
Waste Tokens to Save Time: Naval, Guillermo Rauch, Blake Scholl, and Max Hodak on AI Software Factories, 1000x Engineers, and Whether Pure Software Is Dead
Naval Ravikant gathers three frontier founders, Guillermo Rauch of Vercel, Blake Scholl of Boom Supersonic, and Max Hodak of Science, for a freewheeling conversation about how AI coding tools are reshaping what an engineer is, what software is worth, and where the moat goes when models speak English. The headline idea comes from Naval himself: waste tokens, save time. Stop measuring AI by tokens consumed or lines of code generated and start measuring it by the final output and the time you got back. The full conversation is on the Naval Podcast YouTube channel. This is part one of the discussion. Part two, on vibe coding hardware, follows the same group into jet engines, semiconductors, and biotech. You can also watch and read the full episode here.

TLDW

The job of an engineer is shifting from shipping output to building the factory that ships the output, which means 10x engineers were never really 10x, they were always 100x or 1000x in idea domains, and AI leverage is making that obvious. Models now reflect back the judgment of the user, so a senior architect extracts dramatically more value than a junior, although the junior also writes code they could never have written alone. The frontier models have quietly graduated from junior coders to principal engineers, returning with intuitive plans and real tradeoffs (sometimes with hilariously bad time estimates) rather than just running away with the prompt. Naval has stopped learning prompt tricks, scaffolding tools, and Claude plan-mode rituals entirely. Instead he throws Codex, Claude, and Gemini at the same problem in parallel and brute forces his way through, because tokens are still cheaper than a human and the models keep getting better faster than tricks can. That leads to the bigger question on the table: is pure software still investable, or is it now just a free byproduct of hardware, models, and taste? The group lands on the block economy thesis (a tip of the hat to Mitchell Hashimoto): agents do not want to reinvent Postgres or BMQ on the fly, they want to grab the right reusable building block, so infrastructure software actually gets more valuable, not less. Max Hodak closes the loop with a personal data point: he has not written a line of code in years and has built more software since December than ever before, all through agents, because just understanding APIs, data flow, and performance is what actually moves the work forward.

Thoughts

The “waste tokens, save time” line is the most important rhetorical move in this conversation, and it deserves to be unpacked beyond the soundbite. Naval is implicitly arguing that the entire token-economics debate (input cost, output cost, leaderboards, model arbitrage) is a category error in the same way that lines-of-code was a category error in the nineties. The thing being purchased is not tokens. It is a finished result delivered with less of your finite attention spent. If three parallel runs of Codex, Claude, and Gemini cost you a few dollars and one of them lands the answer in twenty minutes instead of you sweating the problem for two hours, the unit economics are not even close. The only people who care about the token bill are people who have not internalized that human time is the actually scarce resource. Once you do internalize it, the question is no longer “how do I prompt this more efficiently,” it is “how do I get out of my own way.”

The 100x and 1000x engineer point is the one most likely to enrage commenters, and it is also the one most worth taking seriously. Naval is right that the egalitarian flinch in software circles always sat awkwardly next to the empirical fact that one Carmack, one Brendan Eich, or one Satoshi creates more durable value than every mid-tier engineer on earth combined. What AI does is collapse the bottom of that distribution. The marginal junior engineer at a typical company is now competing with a model that costs a few dollars an hour and never sleeps. The remaining premium for human engineers is taste, judgment, and the rare ability to pick the right thing to build at all, which Naval correctly flags as the multiplier that dwarfs raw coding speed. “Just one who had a better judgment on what to work on in the first place” is the most underrated line in the whole episode.

Guillermo Rauch’s observation that the models have graduated from running away with your prompt to returning with three routes and a tradeoff matrix is the technical update most people have not actually felt yet. There was a real, qualitative shift when the model started saying “we don’t put high-cardinality telemetry into Postgres, you probably want ClickHouse or Athena.” That is not autocomplete. That is a peer. And the funny corollary, that the same model will then confidently tell you the work will take three weeks when it will take three hours, is not a knock on the model. It is a reminder that calibration is a separate skill from competence, and humans get this wrong constantly too. The right posture is to treat the model the way a good engineering manager treats a strong but cocky senior: take the architecture suggestions seriously, throw out the estimates.

The block-economy thread, riffing on Mitchell Hashimoto, is where this conversation quietly answers Naval’s “is pure software dead” question. Agents are insatiable consumers of reusable building blocks because reinventing infrastructure on every run is wasteful, brittle, and incompatible with the rest of the world. If your service is the canonical primitive an agent reaches for (the queue, the database, the auth layer, the deploy target), you are not commoditized by AI, you are amplified by it. Pure software is not dead. Pure software with no distribution, no defensibility, and no integration into the agent toolchain is dead. That is a much less catchy headline, but it is the real one. The takeaway for founders is not to abandon software, it is to ask whether your software is something an agent will reach for ten thousand times a day or something a human had to be talked into using once.

Max Hodak’s confession (no code written in years, more shipped software in the last six months than ever before) is the empirical proof that this is not just theory. The skill that ports forward is not syntax. It is the engineering leader’s instinct for what an API is, how data flows, where performance matters, and what level of expectation to set. Guillermo’s framing of “vibe coding through people on Slack” as the original form of vibe coding is genuinely insightful. A good engineering manager has always been transmitting intent to other minds and letting them run. Doing it with agents is the same skill, just with a faster, cheaper, more literal counterparty. The engineers who will struggle in this transition are the ones whose identity was tied to writing the code themselves. The ones who will thrive are the ones who already thought of themselves as taste, judgment, and intent, with code as an implementation detail.

Key Takeaways
- The engineer’s job has shifted from shipping output B to building the factory that produces outputs B through Z. You are now judged on the multiplicative system you create, not the single artifact you deliver.
- 10x engineers were always a misnomer. In idea-domains and digital domains, the real distribution has always been 100x or 1000x. AI just made that obvious enough that arguing about it is no longer fashionable.
- Token consumption leaderboards are the new lines-of-code metric: a vanity number that measures activity, not value. Tokens are an input, your time is the constraint.
- Naval’s core rule: waste tokens, save time. Tokens are still vastly cheaper than human hours, no matter how the pricing scares you.
- Models tend to be about as good as you are in a given domain. The feedback you give them, the corrections, the redirections, sporadically but powerfully shapes the quality of the output.
- The quality of your reprompting matters enormously today, but will probably matter less over time as models get smarter and need less hand-holding.
- Naval has refused to learn prompt scaffolding, plan-mode tricks, or named prompt frameworks. His bet is that the models will figure out how to use him faster than he can figure out how to use them.
- His preferred technique: throw Codex, Claude, and Gemini at the same problem in parallel and brute force the answer. Time is the cost center, not API spend.
- Lower quality first-draft code is not a blocker. When it is time to ship, throw more tokens at it for a hardening pass. Quality compounds across model generations.
- Verifiable domains (problems with a clear right answer) are the ones the models will fully solve. Cutting-edge creativity work, the Terence Tao tier, still needs careful human collaboration.
- Models have qualitatively shifted from “next-token autocomplete that runs away with your prompt” to “intuitive planning mode” where they return with multiple routes and explicit tradeoffs.
- This is why people on social media say models are now PhD-level. It is not the raw output, it is the back-and-forth posture.
- Models will confidently make terrible time estimates (“this is a three week project”). Treat them like a strong but miscalibrated senior engineer: trust the architecture, ignore the schedule.
- Architect-level engineers are extracting much more value per session than junior engineers, but juniors are still leveling up because they can now write code far above their unaided ability.
- The next career step for a junior engineer is moving from implementing features to picking technologies. Postgres vs ClickHouse, ZMQ vs other queues. The model can suggest, but a human still has to decide.
- Taste and judgment remain the residual human advantage. Models will give you good tradeoffs if you ask, but knowing which tradeoff to take is still on you.
- Concrete example: a recent model pushed back when asked to store high-cardinality telemetry in Postgres and recommended ClickHouse or Athena instead. Unprompted architectural judgment.
- Humans are still completing the model for tasks like fetching API keys, moving capital, or performing real-world actions. That gap is temporary.
- Every SaaS and hosting company will soon expose a CLI or API surface that agents can drive directly. Anything Unix-shaped and text-based, agents can already hack into a usable API themselves.
- The missing piece for full autonomy is payments. Crypto, Bitcoin, or any programmable money lets the agent buy what it needs without a human in the loop.
- The open question Naval poses: is pure software dead? We used to learn code to talk to machines. Now machines speak fuzzy, sloppy English back to us.
- For hardware founders, AI is a massive boon. Software, which was always hard to hire artists for (per Patrick Collison’s “software is art” framing), is suddenly fast and cheap to produce alongside the hardware.
- Model training, post-training, and fine-tuning may be the new “real software engineering” for those who want to work at the model layer.
- Mitchell Hashimoto’s “block economy” thesis: agents need powerful, reusable, well-known building blocks. They should not reinvent message queues or databases every run.
- Reinventing primitives is bad civic engineering. The value of “we both depend on Postgres 13.2” is interoperability with the rest of society and toolchain.
- Infrastructure software and reusable libraries are getting more valuable, not less, in the agentic era. Vercel’s bet is on being the layer agents reach for.
- Useful metaphor: building blocks are like a token cache. Why churn through a trillion tokens to reproduce code that already exists when you can fork from a known starting point?
- Max Hodak has not written a line of code in years but has shipped a huge volume of personal software since December, all through agents. Projects he had fantasized about for years are now actually running.
- What still matters from a real software background: understanding what an API is, how data flows, performance expectations, and how to set the right level of demand on an operation.
- A proficient engineering leader has always been “vibe coding through people” on Slack and in one-on-ones, transmitting intent and letting others execute. Doing it with agents is the same skill, faster and cheaper.
- Naval personally went from twenty years of not coding to coding constantly through agents, leaning on first-principles software engineering and algorithms knowledge.
- The friction that historically killed personal coding projects (latest framework, infra plumbing, deploy setup) is now mostly handled by the agent. Vercel makes it easier, agents make it trivial.
- The single biggest change Max highlights: you do not get stuck anymore. The indefinite debugging spiral on some narrow obscure bug is largely gone.
- The old mantra that learning to program means accepting intrinsic frustration (“nope, that’s part of the deal”) is no longer true. The frustration was incidental, not essential.
- The frontier founder pattern on display in this episode: all three guests build their own factories (Vercel’s AI cloud, Boom’s supersonic jets and engines, Science’s biohybrid brain interface) rather than composing from off-the-shelf parts.
Detailed Summary

The Software Factory and the Hundredfold Engineer

Guillermo Rauch opens the substantive portion of the conversation with the framing he has been pushing publicly: the role of the engineer is moving from “ship output B” to “build the factory that ships outputs B through Z.” That reframes engineering judgment. You are no longer evaluated on the single deliverable, you are evaluated on the multiplicative system you put in place. Naval picks up the thread and points out that this also retires an old debate. Engineers used to argue about whether 10x engineers existed, with the egalitarian camp insisting that talent differences were marginal. The truth, Naval says, was always more extreme. In idea-domains, virtual domains, and intellectual domains, the distribution has always been 100x or 1000x, not 10x. Brendan Eich, Carmack, Satoshi, the canonical names, were thousandx programmers. AI has made the underlying distribution legible. And the multiplier on top of all of that is judgment: picking the right thing to work on in the first place is an infinity multiplier compared to picking the wrong thing, regardless of raw skill.

Token Leaderboards Are the New Lines of Code

Guillermo flags the current cultural confusion: people see their AI bills, see the token counts, and assume they should be optimizing for tokens-per-engineer or similar metrics. Max Hodak’s response cuts through it. Token consumption, like lines of code before it, is not a meaningful productivity metric. It is an activity metric, and activity metrics always mislead. Max adds his own field observation: the models tend to be roughly as good as you are in a given domain. A senior developer extracts genuinely powerful output, a junior gets junior-quality output back, because the feedback loop (the corrections, the redirections, the architectural pushback) is what shapes quality. The sporadic but high-leverage moments where the user redirects the model are doing more work than the prompt itself.

Naval’s Brute Force Doctrine: Waste Tokens, Save Time

Naval lays out his personal posture, which has become the title of the conversation. He has deliberately ignored all the prompting tricks, scaffolding tools, named prompt frameworks (“use Ralph Wigum, use OpenClaude, use Hermes, use plan mode”), on the bet that the models will figure out how to use him faster than he can figure out how to use them. He is ham-fisted with the models, gets frustrated, types less and less, and just brute forces his way through by running Codex, Claude, and Gemini at the same problem simultaneously. The justification is economic. No matter how expensive the models seem, they are still vastly cheaper than a human hour. Do not measure tokens as inputs or outputs. Measure your time and the final output. Even when the first-draft code is low quality, that is not a blocker. When the moment comes to ship, throw more tokens at it. The models will rewrite it, harden it, and they get better every generation. Naval explicitly excepts cutting-edge creative work (the Terence Tao tier of unsolved problems) where you still need to collaborate carefully and closely. Everywhere else, brute force is the dominant strategy.

From Junior Coder to Principal Engineer

Guillermo identifies a qualitative shift that has happened recently. Models used to do the classic next-token thing: take your prompt and run away with it in a direction you may not have wanted. Now they enter an intuitive planning posture without being told to plan. They come back and say “what you are asking has these three routes, here are the tradeoffs.” That, Guillermo argues, is the moment the model stopped being a junior engineer and became a principal engineer. The funny side effect is that they will then return preposterous time estimates (“this will take three weeks”) with full confidence. The conclusion is to treat the model as a peer for architecture and a baby for scheduling. Returning to the Max-vs-junior question, Guillermo argues juniors clearly do level up because they write code well above their solo ability, but architects extract maybe 10x while juniors extract more like 2x. The juice scales with the user’s existing taste.

Taste, Judgment, and Architectural Decisions

Max names the residual human contribution: taste and judgment. Picking between Postgres and ClickHouse for high-cardinality telemetry data, picking between ZMQ and another queueing system. The models can recommend, but a human still has to call it. Guillermo offers a recent concrete example where a model pushed back unprompted: when asked to put high-cardinality telemetry into Postgres, the model responded “we don’t put that kind of data into Postgres, you should consider ClickHouse or Athena.” That is the new normal. The peer-level architectural pushback is happening unsolicited, which is genuinely impressive and a real shift from the deferential autocomplete of two years ago.

When the Human Becomes the Tool

Guillermo raises the inversion question: at what point does the model stop being the assistant and the human start being the assistant who fetches API keys, moves capital, and performs real-world actions on the model’s behalf? Naval treats it as a temporary aberration. Every serious SaaS and hosting provider will soon expose a CLI or API surface that agents can drive directly. Even when they do not, anything Unix-shaped and text-based can be hacked into an agent-usable interface by the agent itself. The missing piece is payments. Once you insert programmable money (Naval mentions Bitcoin and crypto tokens), the agent can buy what it needs and the human is no longer the bottleneck.

Is Pure Software Dead?

Naval poses the biggest strategic question of the episode. If models now speak fuzzy, sloppy English the same way humans do, and the historical reason we learned to code was to talk to machines that did not understand English, is pure software still a viable thing to build a company around? His own framing of the answer: hardware founders win, because the historically hard problem of hiring software artists (per Patrick Collison’s “software is art” line) is now mostly solved by AI. Model builders win, because training, post-training, and fine-tuning may be the new “real software engineering.” But what about classic pure software companies? Naval lets the question hang, and Guillermo picks up the answer through a different door.

The Block Economy and the Future of Infrastructure Software

Guillermo cites Mitchell Hashimoto’s recent piece on the block economy (or “building block economy”). The argument: the most valuable thing for agents to have access to is powerful, reusable building blocks. You do not want your agent reinventing a queue system every time it needs to send an email. You want it to grab the right-sized block (BMQ, ClickHouse, whatever) and move on. Reinventing primitives is also a civic problem. The world only works because we all depend on the same Postgres 13.2, the same protocols, the same standard infrastructure. If every agent went off and invented its own bespoke universe, you would lose interoperability. So infrastructure software (which is, by self-admitted bias, what Vercel builds) becomes more valuable in the agentic era, not less. Guillermo extends the metaphor: reusable building blocks are like a token cache. Why burn a trillion tokens reproducing what already exists when the agent can fork from a known starting point? The block economy is the answer to “is pure software dead.” Pure software that becomes the canonical primitive an agent reaches for is more valuable than ever.

Max Hodak’s Personal Proof: Years Without Code, Tons of Software Shipped

Max grounds the discussion in his own experience. He learned to program young, got sucked into it in his teens and 20s, knew programming languages deeply. He has not written a line of code in quite a while. And yet since December he has built a huge amount of personal software, including projects he had fantasized about for years and now actually uses every day. He did not write any of it. He cannot imagine going back to writing code by hand. The skill that ports forward is not syntax, it is the understanding of how APIs work, how data flows, what level of performance to expect, and how to orient the model around the right expectations for an operation. Guillermo extends this with the most quotable framing of the episode: a proficient engineering leader has always been “vibe coding through people on Slack and in one-on-ones,” transmitting intent and letting others execute. Agents are the same modality with a faster, cheaper, more literal counterparty.

Naval’s Return to Coding After Twenty Years

Naval offers his own parallel. He went from not having written code in twenty years to coding constantly through agents. What carried him back in was first-principles knowledge of software engineering and algorithms, which gets you further than you would think. The reason he had stopped coding in the first place was not lack of ability, it was the friction of keeping up with the latest language, the latest architecture, and the constant infrastructure plumbing required to ship anything. Vercel made it easier. Agents made it trivial. Max closes with the most concrete benefit of all: you do not get stuck anymore. The indefinite debugging spiral on some obscure narrow problem, the thing that historically ate weekends and broke spirits, is largely gone. The old mantra that programming is intrinsically frustrating and that frustration is “part of the deal” turned out to be wrong. The frustration was incidental, not essential.

Notable Quotes

“The way that I’m judging you as an engineer is, are you producing the factory that will produce multiplicative outputs B through Z?”
Guillermo Rauch, reframing what an engineer is actually being measured on in the AI era.

“When you’re operating in idea domains, intellectual domains, virtual digital domains, it’s not even 10x, it’s 100x or 1000x. It always has been.”
Naval Ravikant, on why the old 10x engineer debate was always under-stating the real distribution.

“If you choose the right thing to work on versus the wrong thing to work on, that’s an infinity difference. It could just be one who had a better judgment on what to work on in the first place.”
Naval Ravikant, on judgment as the multiplier that dwarfs raw skill.

“I’ll throw Codex, Claude, and Gemini at the same problem over and over and just waste tokens to save time. No matter how expensive these models might seem, they’re still way cheaper than a human.”
Naval Ravikant, on his brute-force multi-model coding workflow.

“Just waste tokens, save time. Don’t look at the tokens either as inputs or outputs. Just look at your time and look at the final output.”
Naval Ravikant, delivering the title thesis of the episode.

“Clearly the models at some point graduated. They used to be junior engineers, now they’re principal engineers, because they come back to you with a set of tradeoffs.”
Guillermo Rauch, on the qualitative shift in how current frontier models respond to prompts.

“Bro, we don’t put that kind of data into Postgres, you should consider ClickHouse or Athena or whatever. That’s happened to me a lot, which is really impressive.”
Guillermo Rauch, recounting unprompted architectural pushback from a recent model.

“It’s like saying speaking English. We had to learn code to communicate with the models, now the models speak English. So where’s the moat?”
Naval Ravikant, raising the central strategic question about the future of pure software.

“I haven’t written a single line of code in quite a while. Since December, I’ve built a huge amount of software that I now use every day, projects I’ve fantasized about for years.”
Max Hodak, on what becomes possible when you stop writing code and start directing agents.

“A proficient engineering leader has been quote unquote vibe coding through people on Slack or one-on-ones, because you’re transmitting your will, your intent, your experience, and you’re letting others run with it. Now we do the same with agents.”
Guillermo Rauch, reframing leadership itself as the original form of vibe coding.

Watch the full conversation on the Naval Podcast here.

Related Reading
- Full episode: The AI Industrial Revolution, the complete hour-long conversation this clip is drawn from, covering software factories, hardware, regulation, healthcare economics, autonomous companies, and creativity.
- Part two: Vibe Coding Hardware, the continuation of this conversation, where the same founders move from pure software into AI-designed jet engines, vertical integration, China’s open-source bet, and why humans become verifiers.
- Naval Ravikant’s official site, the canonical home for Naval’s essays, podcast, and longer-form thinking on technology, judgment, and leverage.
- Vercel, Guillermo Rauch’s company, building the AI-native cloud and frontend infrastructure that this conversation references as a canonical agent building block.
- Boom Supersonic, Blake Scholl’s company building supersonic civilian aircraft and their own jet engines, the hardware example of a founder building the whole factory.
- Science Corporation, Max Hodak’s brain-computer interface company developing the biohybrid neural implant referenced in the intro.
- Mitchell Hashimoto’s writing, source of the “block economy” framing for why reusable infrastructure building blocks become more valuable, not less, in the agentic era.
May 27, 2026
Dan Shipper’s Most Contrarian AI Predictions for 2026: Why the Job Apocalypse Is a Myth, SaaS Will Boom, PMs and Designers Win, and CLIs Are Already Over
Dan Shipper, the CEO and founder of Every, returned to Lenny’s Podcast for round two of AI predictions. His last appearance produced one of the most prescient calls of the year: that non-technical people would build serious work inside Claude Code. He was unbelievably right. This conversation is the follow-up, a tour of his most contrarian forecasts for how AI is actually changing the way we work, who wins, who loses, and what almost every commentator is getting wrong about the next twelve to twenty-four months.

TLDW

Shipper argues that the AI job apocalypse is a myth, that SaaS is going to boom rather than die, that product managers and full-stack designers are the biggest winners of the agent era, that personal agents inside Codex and Claude Code will quietly replace the browser as the primary work surface, that every company will run a single shared super-agent in Slack instead of a fleet of per-user bots, that the CLI moment is already over, that pull requests are going to flood organizations from non-technical staff, that forward-deployed engineers who garden company agents become the new senior role, that GPT-5.5 still cannot match a real senior engineer on architectural judgment, that AI-generated internal writing is fine and probably better than what most humans produce, that CEOs and middle managers have not adapted yet but soon will be forced to, that the edge of AI lives wherever a curious human is using it rather than in San Francisco, and that the only durable strategy is to ride the models and keep playing with whatever ships next. The whole conversation balances aggressive AI bullishness with an equally strong bet on humans, on creativity, and on the unavoidable need for someone to care for every agent that gets deployed.

Thoughts

The most useful frame Shipper gives is that models commoditize yesterday’s human competence. Every time a frontier model crosses a new bar, the work that used to define seniority becomes cheap. The senior engineer who could carry a refactor in their head, the PM who could write a coherent strategy doc, the designer who could ship a polished landing page in a week. That competence is now frozen, codified, and available on tap. The interesting question is not whether models will keep eating tasks. They will. The interesting question is what humans do with the suddenly cheap raw material underneath them. Shipper’s answer is that humans climb the stack: they go up a level, find a new problem worth framing, and use the commoditized competence as feedstock for something that did not exist before. That treadmill is the actual engine of value creation, and it is why he can be simultaneously AI pilled and bullish on hiring.

His SaaS take is the spiciest call of the episode and probably the most defensible. The crowd consensus is that agents will gut SaaS because an AI can just write the form filler, the dashboard, the workflow. Shipper points out the obvious counterfactual: agents do not reduce the number of people using SaaS, they increase it. A marketing lead who could never touch the data warehouse can now stand up a PostHog query through Codex. A founder who never opened Vanta can run a SOC 2 prep through an agent. The result is more users, more accounts, and a much fatter top of funnel for every horizontal tool. The second-order effect is even more interesting. When the SaaS tool runs inside the user’s agent, the user supplies the tokens. Vendor margins improve, not collapse. If he is right, the next two years are going to be brutal for the SaaS-is-dead thesis pieces and very good for the public software multiples.

The PM and designer bet is where this gets personal for anyone in product. For a decade the bottleneck in shipping anything was engineering capacity. A PM with spiky product sense had to negotiate their vision through a roadmap, a sprint, a review, and a release. Designers had to convince an engineer that the third state of the empty screen was actually worth building. Both of those constraints are dissolving fast. A PM who can prompt Codex into a working prototype on Friday afternoon, then iterate it live in front of a customer on Monday, is doing the job of a small team. A designer who can ship a fully functional landing page in their own style, without negotiating with anyone, is suddenly the most leveraged person in the company. The scarce skill is no longer execution. It is taste, judgment, and the willingness to decide what is worth building. That has always been the real PM and design job. AI just stripped away the parts that were not.

The quietest but most important prediction is that agents need humans, permanently. Every benchmark advance reveals a new layer of judgment the model cannot frame on its own. When the agent finishes the task, there is always a senior human who sees the deeper problem the model patched over. Shipper calls this gardening, and it is the basis for the new forward-deployed engineer role. The companies winning right now are the ones that put a real person next to every agent, watching what it does, course-correcting in Slack, and noticing when the output drifts. The dream of autonomous AI workflows is a stage in a journey, not the destination. The destination looks more like a thoughtful operator with a small cluster of agents they trust and constantly tend. That is a much more humane future than the discourse suggests, and it is the one Every is already living.

The final advice, ride the models, sounds glib but is the single most actionable line in the episode. Most professional anxiety about AI dissolves the moment you actually use the newest model on real work. Most professional advantage accrues to the people who do that one thing consistently. The edge does not live in San Francisco where the labs build the things. It lives wherever a curious human meets a real workflow and discovers something the labs have not noticed. A PM in Iowa willing to try Codex on a Tuesday night can be further ahead than a research engineer who has only used the model on its evals. Pair that with Shipper’s closing motto, do things worth writing about and write things worth reading, and you have a pretty complete operating system for the next two years.

Key Takeaways
- The AI job apocalypse narrative is wrong. Models commoditize yesterday’s competence, then humans climb the stack and find new work to do with the cheap raw material.
- Every has roughly doubled headcount in the last year despite being one of the most AI-forward companies in the world. The lived data point cuts directly against the doom thesis.
- Shipper’s dual stance: simultaneously extremely AI pilled and very bullish on humans. He treats this as the only intellectually honest position right now.
- Work will bifurcate. Companies will run one shared super-agent in Slack for everyone, and individuals will run their own personal agent inside Codex or Claude Code on their machine.
- The personal agent inside Codex effectively becomes the new operating system. Instead of putting AI in the browser, you put a browser inside the AI.
- The super-agent pattern is already real: Shopify has River, Ramp has its own, and Every runs Claudie inside Slack for internal consulting.
- SaaS is not dying. Agents increase the user base of SaaS tools because non-technical people can finally drive them. Shipper would buy SaaS stocks today.
- When SaaS runs inside an agent, the user brings their own tokens. Vendor margins improve because they no longer eat inference costs on every interaction.
- The CLI era is already over. The magic was never the terminal. It was the AI plus the ability to see what the agent is doing. A good GUI captures the same benefits and more.
- Pull requests are about to flood every company. Non-engineers can now ship code, run queries, and open tickets. Reviewing the output becomes the new bottleneck.
- Open-source maintainers are already living in the future. Some receive thousands of agent-generated PRs per day and spin up thousands of Codex instances just to triage them.
- Forward-deployed engineers are the new senior role. They live in Slack, garden the company’s agents, fix broken flows, and keep non-technical staff from doing damage.
- Product managers with spiky product sense plus a little Codex fluency become extremely dangerous. Marcus at Every, formerly a PM at Axios, is the archetype.
- Full-stack designers are the other big winner. They can build distinctive interfaces end to end without negotiating with engineering. The bottleneck on taste-driven product work disappears.
- Designer hiring data has not yet caught up to the prediction. Shipper notes this and says check back in a year.
- Sales is the role least changed so far. Top of funnel research has been turbocharged by agents, but the actual relationship and closing work remains human.
- AI-generated internal writing is going mainstream and that is a good thing. Most humans are bad at strategy docs, quarterly plans, and PRs. AI drafts a coherent first pass that a human can refine.
- Shipper says most of his email is now written by GPT-5.5 and Codex. He would honestly prefer the signature to say so.
- Public writing, newsletters, and published essays still demand a human voice. Internal communication does not.
- CEOs and middle managers have largely not adapted yet because their staff still does the work. That window is closing fast and will become an obvious career liability.
- Your company will only go as far as your CEO goes in AI. The leadership ceiling becomes the AI ceiling.
- Shipper’s senior engineer benchmark scores GPT-5.5 at roughly 62 out of 100. Real senior engineers sit at 85 to 90. Progress is real, but the gap on architectural judgment remains.
- Models tend to patch problems locally instead of rewriting from first principles. A senior human still sees the deeper rework that the model avoids.
- Every uses Notion-based agents to draft quarterly plans. The human edits, approves, and stands behind the output.
- The hard rule on AI-generated communication: you have to read it and stand behind it before sending it. Pasting unread output is the only true no-no.
- Every agent needs a human. Automation is a lie in the strong sense. The story of automation is the story of new and different humans being needed alongside it.
- The reach test, organic daily usage, is the real signal that an AI product works. Benchmark scores are noisy. Daily reach is not.
- Cursor’s SpaceX acquisition is a tell. Harnesses around models, not the models themselves, are where the strategic value is concentrating.
- The edge of AI is not in San Francisco. It is wherever a real human meets a real workflow and discovers something the labs have not noticed yet.
- A PM in Iowa willing to ride the models can be further ahead than a researcher in SF who only uses them on internal evals.
- Ride the models. Use them for whatever you do. Try every new release the day it ships. That single behavior compounds faster than any other AI career strategy.
- Shipper got bursitis, which he calls vibe coder elbow, from too much rapid agent-assisted coding while debugging his markdown editor Proof.
- The closing motto for the year: do things worth writing about and write things worth reading.
- Lenny will re-interview Shipper in roughly May 2027 to score the predictions.
Detailed Summary

Why The AI Job Apocalypse Is The Wrong Frame

Shipper opens with the headline contrarian call. Benchmarks keep climbing. Models can now sustain seventeen-hour autonomous tasks at fifty percent accuracy. The pace is real and accelerating. None of that translates cleanly into mass unemployment. His mechanism: models codify yesterday’s human competence and make it cheap. The act of compressing past expertise into an API call is genuinely deflationary for the work it captures, but it is also raw material for the next layer of human work. He uses Every as his own data point. The company has roughly doubled in the past year despite being one of the most AI-forward outfits in media. Hiring goes up because agents create new categories of work that need humans, not because the agents fail. The discourse, he argues, is stuck modeling AI as substitution. The reality looks much more like leverage.

The Bifurcation: Super-Agents And Personal Agents

Work splits into two surfaces. The first is the shared super-agent that lives in Slack and serves the whole company. Shopify has River. Ramp has its own. Every has Claudie. Each is a single, trusted, gardened agent that anyone in the company can talk to. The pattern has converged on one shared agent rather than one agent per person because agents need human attention to stay useful, and a single shared instance pools the gardening cost. The second surface is the personal agent inside Codex or Claude Code that runs on your machine and reaches into your local environment, your editor, your files, and through an embedded browser into the web. Shipper calls this the new operating system. Instead of the old paradigm of putting AI inside the browser, you put the browser inside the AI. The agent sees what you see, follows what you do, and works on your stuff in your context.

The SaaS Bet: Up, Not Down

The SaaS-is-dead thesis was the consensus call of late 2025. Shipper takes the other side and would buy software stocks now. Three arguments. First, agents make SaaS accessible to people who never could have used it directly. The total addressable user base inside every company goes up. Second, the business model improves when the user runs the SaaS through their own agent, because the user supplies the tokens. Vendors stop subsidizing inference. Third, SaaS spend in his observable universe is up, not down, and is concentrating on the tools that play well with agents. He frames the prediction as a sound bite for the cycle: buy SaaS stocks, the apocalypse is dumb.

The CLI Era Is Already Over

For a moment in early 2026 it looked like everyone was migrating to the terminal because Claude Code was a CLI. Shipper says the moment is finished. The actual leverage was never the terminal. It was the model plus the ability to watch and steer an agent live. A great GUI captures every advantage of the CLI without the friction. His own engineering team at Every has mostly moved off the CLI as their primary surface and onto Codex desktop. He frames it bluntly: we speed ran the CLI era, it was nice, and now we are done. Tooling for the next two years will be visual, multi-pane, multi-agent, and built around the human watching the work unfold.

The Pull Request Flood And The Rise Of Forward-Deployed Engineers

Once non-engineers can ship code, run queries, and file changes through agents, the volume of incoming work explodes. Open-source maintainers already report receiving thousands of agent-generated pull requests per day. Inside companies, the same thing happens to data teams, ops teams, and any function that owns a review gate. The bottleneck shifts from creation to evaluation. The job that emerges to absorb the flood is the forward-deployed engineer. This is a senior person who lives in Slack with the company’s agents, fixes their context, sharpens their instructions, and prevents non-technical colleagues from making well-meaning but incoherent changes. Nitesh at Every is the example Shipper returns to. The model is the same one the labs use internally: pair every important agent with a real engineer who gardens it.

PMs And Full-Stack Designers Win The Decade

The two roles Shipper is most bullish on are product manager and full-stack designer. For PMs, the entire job of coordinating a team to translate vision into code collapses into a Codex session. A PM with strong product instincts and a little technical literacy can now prototype, iterate, and even ship. The example is Marcus, formerly a PM at Axios, who took a year to fully internalize AI and now ships faster than most engineers. For designers, the model is similar. The Friday-night-side-project designer who used to be stuck explaining a vision can now build the vision themselves, with their own taste fully expressed. The scarce skill in both cases is the same: judgment about what to build and the courage to decide it is good. Execution capacity is no longer the constraint.

The Senior Engineer Benchmark And What Models Still Miss

Shipper has built his own benchmark to test whether coding models can actually do senior engineering work. GPT-5.5 scores around 62 out of 100. Real senior engineers sit closer to 85 or 90. The gap is not in syntax or test pass rates. It is in the willingness to step back, see that a piece of code is fundamentally the wrong shape, and rewrite it from first principles. Models almost universally patch locally. They take the instruction at face value, accept the existing code as a constraint, and optimize within it. A real senior engineer ignores the prompt when the prompt is wrong. This is the durable moat for senior technical judgment, and Shipper expects it to remain visible for at least another year of model releases.

AI-Generated Writing Goes Mainstream

Internal writing inside companies is quietly becoming AI-first and Shipper thinks it should. Quarterly plans, status updates, PR descriptions, strategy memos, recruiting outreach, most internal email. He runs his own inbox through GPT-5.5 and Codex and says he would honestly prefer if the recipient knew. The point is not that AI is a better writer in some absolute sense. The point is that most humans are not very good at these specific genres, and the model produces a coherent, structurally sound first draft that a human can guide and approve. The constraint is honesty: you read it, you understand it, you stand behind it. Public writing, like the newsletters Every publishes, still demands a human voice. Internal communication does not, and treating it as if it did is a tax on the organization.

The CEO And Middle Manager Lag

Shipper points to a population that has largely escaped AI adoption: senior leaders and middle managers. They have staff to do the work, so they have not been forced to pick up the tools personally. He thinks this is the single largest pocket of latent disruption coming in the next year. Your company will only go as far as your CEO goes in AI, because every decision about where to deploy agents, where to hire, and how to restructure work flows downstream from leadership taste. A leader who has not personally lived inside Codex or Claude Code for a few weeks cannot make those calls well. Expect this to flip fast and to become a visible career liability for executives who do not adapt.

Ride The Models

The closing advice is the simplest. Ride the models. Use AI for whatever you actually do. Try every new release the day it lands. Most of the professional anxiety around AI dissolves on contact with the work, and most of the durable advantage in the field belongs to the people who do this one thing consistently. Shipper notes that the edge of AI does not live in San Francisco. It lives wherever a curious operator meets a real workflow and notices something nobody at the labs has yet. A PM in Iowa willing to spend a Tuesday night exploring Codex can find capabilities researchers have not surfaced. Pair that with his motto, do things worth writing about and write things worth reading, and you have most of an operating system for the next two years.

Notable Quotes

“The AI job apocalypse is not really a thing. I am super super bullish on PMs and full-stack designers.”
Dan Shipper, opening his contrarian thesis for the conversation

“I’m simultaneously extremely AI pilled and very bullish on humans. Automation is a lie. Every agent needs a human.”
Dan Shipper, on holding both sides of the AI debate at once

“What models do in general is they make yesterday’s human competence cheap. And so, it becomes commoditized. It’s not valuable anymore. What humans do is we go in there and we’re like, yeah, we have all this frozen human competence from yesterday, how do I use this to make something new and interesting.”
Dan Shipper, articulating the core engine behind his anti-apocalypse thesis

“I would buy SaaS stocks right now. The SaaS apocalypse is dumb. What agents do is increase the number of users of SaaS, not get rid of it.”
Dan Shipper, calling the consensus SaaS-is-dead thesis directly wrong

“We speed ran the CLI era. It was nice while it lasted, but I think CLIs are over.”
Dan Shipper, on why the terminal-first agent moment is already done

“Most of my email is written by GPT-5.5 and Codex right now. And I honestly would prefer it to say that it’s coming from GPT-5.5.”
Dan Shipper, on the new etiquette of AI-assisted communication

“The edge of AI is not in San Francisco. The edge of AI is wherever AI meets a real human doing something.”
Dan Shipper, on where the actual frontier of the field lives

“The only thing you need to do is ride the models. And that means use them for whatever it is that you do.”
Dan Shipper, distilling his career advice for the next two years

“Do things worth writing about and write things worth reading.”
Dan Shipper’s closing motto, lifted from his own operating system at Every

Watch the full conversation with Dan Shipper on Lenny’s Podcast here. The re-interview to score these predictions is scheduled for roughly May 2027.

Related Reading
- Every. Dan Shipper’s company and the live laboratory for almost every prediction in this conversation, including Spiral, Cora, and Claudie.
- The Allocation Economy by Dan Shipper. The earlier essay that frames humans as managers of AI labor and underpins much of the gardening-the-agent thesis here.
- Claude Code by Anthropic. The agent surface Shipper called correctly last year and one of the two environments he predicts will become the new operating system for work.
- Codex by OpenAI. Shipper’s current daily driver and the visual, multi-pane agent environment he uses for almost everything from coding to email.
- The Writing Life by Annie Dillard. The book Shipper makes every Every employee read, and the source of the company’s stance on writing as a tool for noticing the future.
May 25, 2026
Marc Andreessen on Joe Rogan #2501, AGI Has Already Arrived, California’s Wealth Tax Will Bankrupt Founders, and Why America Cannot Build Anything Anymore
Marc Andreessen returns to The Joe Rogan Experience #2501 for a sprawling three hour conversation that tries to make sense of the moment we are actually living through. Andreessen is the cofounder of Andreessen Horowitz, the man who built the first commercial web browser, and one of the most quoted voices in technology. He arrived with a giant pile of receipts on California’s new wealth tax ballot proposition, the political backlash against AI data centers, the destruction of Los Angeles by single party rule, and what he believes is the quiet arrival of artificial general intelligence about three months ago. Joe pushes back, asks the dystopian questions, and the result is one of the most useful primers on the AI economy, surveillance technology, energy policy, and the future of the American social contract that you will find anywhere.

TLDW

Andreessen argues that AI quietly crossed the AGI threshold around early 2026 with GPT 5.5, Claude 4.6, Gemini 3.0, and Grok 4.3, that top human coders now openly admit the bots are better than they are, that working software engineers are running twenty AI agents in parallel and turning into sleep deprived “AI vampires,” and that this productivity boom is the most underreported story in the world. He explains why California’s 5 percent wealth tax ballot proposition is calculated to bankrupt tech founders by taxing the higher of their voting or economic interest in their own companies, why this is the opening salvo of a federal asset tax push for 2028, and why a flood of Silicon Valley families is already moving to Nevada, Texas, and Florida. He walks through Flock cameras and Shot Spotter, the Washington DC crime statistics scandal, the Pacific Palisades fire and the fifteen year rebuild, the Kevin O’Leary Utah data center debate with Tucker Carlson, the fifty year suppression of American nuclear power, why all the chips ended up in Taiwan, the US versus China robotics gap, the Chinese practice of grading AI models on Marxism and Xi Jinping Thought, the bot and paid influencer economy on social media, neural wristbands and Meta Ray Ban heads up displays, artificial gestation and the demographic collapse, AI religions and AI mates, and why he still thinks the next twenty years are overwhelmingly a good news story. Rogan closes the episode with a separate solo segment apologizing to Theo Von for clumsily raising Theo’s struggles during the recent Marcus King conversation.

Key Takeaways
- Austin’s recent teenage crime spree, in which 15 and 17 year old suspects shot at people and buildings across roughly a dozen locations, was solved only after the offenders drove into an adjacent town that still ran Flock, the AI license plate and vehicle tracking system Austin had voluntarily turned off for political reasons.
- Chicago turned off both Flock and Shot Spotter, the gunshot triangulation system that places ambulances at shooting scenes within seconds, on the argument that the technology is racist. Andreessen counters that the victims of urban gun violence come overwhelmingly from the same communities the policy claims to protect.
- Washington DC was caught faking its crime statistics at senior levels, with multiple officials fired or indicted. The DC mayor publicly thanked Donald Trump after the National Guard deployment because violent crime collapsed in the affected neighborhoods.
- The new New York City mayor Zohran Mamdani filmed a video standing in front of Ken Griffin’s home, and Griffin, a major philanthropist who funds healthcare in New York City and runs a $6 billion project there, signaled he will move more of the business to Florida.
- The top 1 percent of New York taxpayers pay roughly half the state’s income tax, and in California in the year 2000 a thousand individuals paid 50 percent of the entire state’s tax receipts.
- California has a ballot proposition right now for a one time 5 percent wealth tax on assets above a certain threshold, with stocks and crypto included and real estate excluded. The tax is calculated on the greater of a founder’s economic interest or voting interest, which would instantly bankrupt founders with super voting shares.
- The Biden administration attempted a federal wealth tax in 2022, fell short, and published an explicit 2025 fiscal plan to try again if they won re-election. Elizabeth Warren has already proposed an annual 6 percent federal wealth tax on unrealized gains.
- The current US exit tax already takes roughly 45 percent of your assets if you renounce citizenship. The only ways out of a state level wealth tax are the other 49 states. The only way out of a federal one is to leave the country, which most people will not do.
- Andreessen says the Silicon Valley exodus has gone from trickle to stream to flood, with founders moving to Las Vegas, Texas, Florida, and Nashville. His partner Ben Horowitz has moved to Las Vegas.
- Andreessen says he is not leaving California, but admits the situation is fraught because if half the tax base leaves the remainder becomes the target.
- The new UK government under Keir Starmer just collapsed, and all four of the leading candidates to replace him sit further to the left than he does. France and Germany are seeing the same drift, and Andreessen expects a national wealth tax to be a centerpiece of the 2028 Democratic primary.
- A legal loophole lets companies pay influencers to post political and social ideas without any disclosure, because campaign finance laws cover candidates and FTC rules cover products. Ideas fall through the gap entirely.
- Andreessen runs Twitter and Substack as his primary information feeds, uses three hand curated lists, and follows a strict one tweet policy where one bad post triggers a block and one good post triggers a follow.
- He argues the modern social media problem is binary, that everyone is either too online and drowning in fake outrage cycles or too offline and trapped inside what television and newspapers tell them. Almost nobody manages the middle.
- Meta Ray Ban glasses now ship with a heads up display, and Meta’s neural wristband can pick up nerve impulses from your wrist so you can type messages by intending to move a finger without moving it.
- Andreessen predicts AI plus high resolution cameras and infrared sensing will deliver practical lie detection without needing brain implants.
- Kevin O’Leary’s planned 40,000 acre Utah data center has become a Tucker Carlson talking point, but Andreessen argues data centers are the most benign physical asset you can build, and that the real issue is whether America can build anything at all anymore, from chip plants to pipelines to housing.
- All chips were once made in California, and all are now made in Taiwan, purely because of environmental regulations like NEPA. The same regulatory machinery prevented the Nixon era Project Independence plan to build a thousand civilian nuclear power plants by the year 2000.
- Three Mile Island killed zero people and produced no detectable health effects on plant workers or the public, according to fifty years of follow up. Fukushima killed essentially zero people from radiation. Nuclear remains the safest carbon free baseload energy ever invented.
- Germany shut down its nuclear plants, fell back on intermittent wind and solar, and now uses coal as backup, generating far more carbon emissions than nuclear would have produced.
- The Pacific Palisades fire took out roughly twice the square mileage of the Nagasaki blast, the head of the LA water department reportedly did not know the key reservoir was empty, and the rebuild is expected to take fifteen years thanks to permit gridlock, affordable housing mandates, and a state ban on land offers below pre-fire appraised value.
- Andreessen offers a metaphor for AI as a modern philosopher’s stone, turning sand into thought, since chips are made of silicon and an AI data center is literally lit up sand thinking on demand.
- The Turing test was blown through so completely with ChatGPT in late 2022 that nobody in the industry even bothers running it anymore. Andrej Karpathy has demonstrated a working large language model in 300 lines of code and people have ported small models to Texas Instruments calculators.
- Andreessen believes AGI was effectively reached about three months before this interview, with GPT 5.5, Claude 4.6, Gemini 3.0, and Grok 4.3. He says 99 percent of the time he gets a better answer from the leading models than from the human experts he has access to.
- Linus Torvalds and John Carmack publicly admit the latest models are better at coding than they are. Top AI coders in the Valley now earn $50 million a year.
- The new pattern in the Valley is “AI vampires,” engineers who do not sleep because the opportunity cost of going offline is too high. They each run roughly twenty Claude Code, Cursor, or Codex agents in parallel, then a new layer of bot-managing-bot architectures is starting on top of that.
- A Wall Street friend with a thirty five year old MIT CS degree has used AI to generate 500,000 lines of code at home in his spare time, building everything from smart fridges to a custom music jukebox.
- The mass unemployment narrative is wrong. Tech companies that did layoffs were overstaffed. The leading AI labs and AI companies are hiring like crazy, including coders, and demand for code turns out to be vastly elastic.
- Doctors are already using ChatGPT in the exam room behind the patient’s back. Andreessen describes a friend who built a Star Trek style diagnostic dashboard combining decoded genome ($200 today), blood panels, and Apple Watch telemetry.
- Multimodal AI lets a webcam analyze a Brazilian jiu-jitsu sparring session and give performance feedback, an example Andreessen attributed to an unnamed friend after Rogan guessed Zuckerberg.
- A leaked David Shore voter issue ranking shows cost of living, the economy, inflation, taxes, and government spending dominate. AI ranks 29 of 39. Race relations, guns, abortion, and LGBT sit at the bottom, signaling the woke issue cluster has burned itself out in voter priorities.
- The next wave of AI is robots. The US leads in AI software but is far behind China on physical robotics. Andreessen warns the world cannot afford a future where every household robot ships with the Chinese Communist Party behind its eyes.
- Chinese AI model cards include scores for Marxism and Xi Jinping Thought because every Chinese product must be evaluated on those axes. American models have political biases of their own but a different ideological baseline.
- Large language models are not sentient. They write Netflix scripts based on whatever vector you shoot through the latent space. The supposed AI self preservation papers traced back, per Anthropic’s own research, to less wrong forum posts and earlier doom scenarios baked into the training data.
- Andreessen breaks guardrails routinely by reframing requests as fictional Netflix style scripts, including a personal favorite where he asked early models how to make bombs by claiming to be an FBI agent recruited into domestic terror cells.
- He recommends using AI by asking it to steelman both sides of any contested question, then making the value judgment yourself, rather than asking for the answer.
- The Trump administration is using AI on government billing data to surface Medicare fraud, fake hospice programs, and fake autism centers, an idea that survived the original Doge plan.
- Andreessen tells Rogan that Elon Musk privately confirmed that a Westworld style humanoid robot, the season one version, is roughly five years away.
- Artificial gestation is already happening with animal stem cell derived embryos. The conversation reaches a hard moral edge about sociopathic warehouse babies and gray-alien-style humans engineered without empathy circuitry.
- Andreessen’s deepest bet is that material abundance is solvable but the human questions, how we live, what we value, what kind of society we want, and what role consent plays in surveillance and brain interfaces, remain in human hands.
- After Andreessen leaves, Rogan does a separate solo segment where he apologizes to Theo Von for raising Theo’s history of struggles during the recent Marcus King interview, explains the missing context behind the viral Theo Netflix special clip, and discusses the loss of Brody Stevens, Anthony Bourdain, and what antidepressants did for Ari Shafir.
Detailed Summary

Flock, Shot Spotter, and the Politics of Solvable Crime

The episode opens on the Austin crime spree carried out by two teenagers who stole cars, switched vehicles, and shot at roughly a dozen locations across the city before being caught only after they crossed into a town that still ran Flock, the AI license plate and vehicle recognition platform that is one of Andreessen Horowitz’s portfolio companies. Austin had previously disabled Flock under privacy pressure. Andreessen takes the moment seriously, conceding that mass surveillance abuse by corrupt mayors or police chiefs is a real risk, and that warrants and audit logs are the right safeguards. His larger point is that the cost of unilateral disarmament against organized urban crime is hidden but enormous. He uses Chicago’s Shot Spotter as the paradigmatic case, a network of rooftop microphones that triangulates gunshots so accurately that ambulances can be dispatched before any 911 call is placed. Chicago turned the system off on the argument that it disproportionately flags poor neighborhoods, and people now bleed out on the street with nobody noticing. Andreessen calls this the woke argument against safety, and he argues that in high crime neighborhoods residents simply will not call the police because snitches do not survive, which is why objective sensor data is so valuable.

Faked Crime Statistics, Mayoral Politics, and the Tax Base

From there the conversation drifts to the recent scandal in which senior officials at the Washington DC Metropolitan Police Department were caught actively falsifying crime statistics, and the strange spectacle of the DC mayor thanking Donald Trump for the National Guard deployment after violent crime dropped off a cliff. Andreessen sketches an unsettling theory in which the long, slow degradation of major American cities is partly a deliberate political project to drive out responsible homeowners and reshape the voting electorate, then bail out the resulting fiscal hole with federal money. The poster case is the new New York City mayor Zohran Mamdani filming a video in front of Ken Griffin’s home. Griffin happens to be a major philanthropist who funds New York City healthcare, employs thousands, anchors a $6 billion development, and pays taxes that are individually load bearing for the city. Andreessen quotes the standard estimate that the top 1 percent of New Yorkers pay roughly half the state’s income tax, and that the all time California peak was a single year in which a thousand people paid half the state’s tax receipts.

California’s 5 Percent Wealth Tax and the Founder Bankruptcy Mechanic

This is the segment that landed hardest. California has a ballot proposition right now for a one time 5 percent wealth tax on net assets above a threshold, with real estate excluded but stocks, crypto, art, jewelry, and private company equity included. The detail that makes it lethal for the Valley is the formula, which calculates the taxable amount on the greater of a founder’s economic interest or voting interest in their company. Founders who hold super voting shares for control purposes, including the Google founders, would owe tax on the voting share number that vastly exceeds their economic share. The tax would, by definition, exceed available assets. Andreessen walks through the historical pattern, that income tax started as a 3 percent levy on the rich and grew to 90 percent marginal rates within decades, and predicts a 5 percent one time tax will become a 5 percent annual tax within a few years, with the threshold ratcheting down. He notes that the Biden administration’s 2025 fiscal plan explicitly named a federal asset tax as a goal if they won re-election, that Elizabeth Warren is already proposing a 6 percent annual federal wealth tax on unrealized gains, and that Gavin Newsom cannot veto a ballot proposition. The trickle of founders leaving California has become a flood. His partner Ben Horowitz has moved to Las Vegas. Andreessen himself is staying, but admits the game theory is brutal once half the base leaves.

Henry Wallace 1948 and Why the American Story Is Not Decided Yet

Andreessen pulls in a historical analogue most listeners will not have heard. In 1944 the actual communist Henry Wallace very nearly became Truman’s running mate and almost ascended to the presidency. He ran again in 1948. Despite a Soviet Union that had recently been a wartime ally and had even received a New York City ticker tape parade for Stalin, the American voter rejected him. Andreessen’s point is that the American body politic has historically backed away from radical socialist proposals when forced to actually look at them, and he expects the same to happen as the wealth tax becomes a federal 2028 platform issue. The risk, both he and Rogan agree, is that today’s media and bot landscape is vastly more aggressive than 1948’s, and the propaganda environment is shaped by paid influencers, foreign actors, and political bot farms operating in a legal grey zone where disclosure is required for products and candidates but not for ideas.

Too Online, Too Offline, and Heaven Banning Blue Sky

The two riff on social media and feed curation. Andreessen describes his “one tweet” policy where he follows or blocks any account based on a single post, his use of hand curated lists alongside the X algorithm, and the older Call of Duty lobby metaphor for handling toxic replies. Joe pushes back, says he no longer reads his mentions because the negative payload is not worth it, and offers his theory that the modern internet has two failure modes, too online and too offline, and that very few people calibrate the middle. Andreessen introduces the concept of “heaven banning,” an older moderator term where a problem user is not removed from a forum but is silently routed into a bot-only experience in which everything they say is praised. He notes the running joke that Blue Sky is functionally real life heaven banning, that Jack Dorsey himself has disowned it, and that the platform’s most engaged users have ascended into their own private Idaho of bot agreement.

The Coming Hardware, Meta Glasses, Neural Wristbands, and Practical Lie Detection

Andreessen walks Rogan through the latest Meta Ray Ban heads up display, the neural wristband that picks up nerve signals from finger movement (and from the intent to move a finger), and the screen recordings of people playing Doom hands free or playing platformer games while jogging. He extends the trajectory to practical lie detection without Neuralink, using ultra high resolution cameras combined with infrared sensors that pick up physiological changes invisible to the naked eye. Joe asks the obvious question of what happens with sociopaths, and Andreessen concedes the edge case. The two then enter a longer thread on telepathy via neural mesh devices, the question of whether police could subpoena your thoughts under warrant, and the divergence between the American constitutional framework and the Chinese model in which the state’s claim on your inner life is total.

Kevin O’Leary, Tucker Carlson, and Whether America Can Build Anything

The data center debate becomes a vehicle for the larger argument. Kevin O’Leary is building a 40,000 acre AI data center in Utah, has bought up large surrounding land for water rights, and intends to keep the bulk of it preserved. Tucker Carlson grilled him on tax breaks and on the energy footprint, which O’Leary says will rival New York City’s at peak. Andreessen agrees the tax break debate is fair, but says the energy comparison is a red herring because new federal policy now requires data centers to bring their own generation. The real story is that America has spent thirty years making it nearly impossible to build a chip plant, a power plant, a refinery, a pipeline, or a house. Chips moved to Taiwan because California regulated semiconductor manufacturing out of existence. The Nixon era Project Independence plan called for a thousand civilian nuclear power plants by the year 2000, and that program was strangled in the crib by the very Nuclear Regulatory Commission Nixon created.

Nuclear Power, Three Mile Island, and Fifty Years of Unnecessary Carbon

Andreessen makes the case that nuclear power was unfairly killed off by a panic with no body count. Three Mile Island, on 50 years of accumulated data, has produced zero radiation linked deaths and no detectable health effects on the public. Fukushima is essentially the same picture. Germany shut down its nuclear plants, fell back on wind and solar, and now uses coal as a baseload backstop, with the predictable carbon consequences. The environmental movement is quietly turning back toward nuclear, with figures like Stewart Brand publicly admitting the original push was a mistake. Andreessen’s preferred design pattern for data centers is to colocate them with dedicated small modular nuclear reactors, an arrangement now baked into Trump administration energy policy. The throughline is that the Tucker right and the Bernie left are converging into a single anti AI, anti energy, anti technology horseshoe.

Sand Into Thought, the Newton Alchemy Pitch for AI

When Rogan asks for the affirmative pitch on AI, Andreessen reaches for Isaac Newton, who spent twenty years on alchemy looking for the philosopher’s stone that would turn lead into gold and end material scarcity. Andreessen’s pitch is that AI is a successful version of alchemy, that we collect literal sand, refine it into silicon chips, install those chips in a data center, supply power, and the result is thought on demand at industrial scale, available to anyone with a smartphone. He argues this is at least on par with electricity and steam power and is bigger than the internet. The framing matters because the public narrative around AI is overwhelmingly negative, and Andreessen contends the industry is doing a terrible job selling its own product.

AGI Already Happened, AI Vampires, and the Bot Org Chart

Andreessen says he believes AGI was effectively crossed about three months before the interview, anchored by the release wave that included GPT 5.5, Claude 4.6, Gemini 3.0, and Grok 4.3. He notes that the Turing test was annihilated so quickly in late 2022 that no one in the industry runs it anymore, and that Andrej Karpathy has demonstrated a working LLM in 300 lines of code. The coding profession is the leading indicator. Linus Torvalds and John Carmack have publicly admitted that the latest models are better at coding than they are. Top AI focused coders now earn $50 million a year. Working engineers across the Valley are running roughly twenty agents in parallel, each receiving an assignment, working for ten minutes, then returning a completed code patch. The new state of the art is to add a managerial layer, with bots assigning tasks to subbots, and within a year that will become bots managing bots managing bots, producing roughly 1,000x throughput per human engineer. The result is what the Valley now calls AI vampires, engineers who do not sleep because going offline costs them too much output.

Dr GPT, Decoded Genomes, and a Diagnostic Bed Out of Star Trek

Andreessen describes spending a holiday week sick with food poisoning and turning his entire recovery over to ChatGPT, with updates every twenty minutes and detailed coaching at four in the morning. He describes a friend who has used AI coding to build a personal health dashboard combining whole genome sequencing ($200 today, where Craig Venter spent thirty years and hundreds of millions to do it the first time), blood panels, Apple Watch data, sleep tracking, and webcam observation, with the AI gently praising the user every time it sees them walk to the fridge for water. He argues that doctors are already typing patient symptoms into ChatGPT mid exam, and that the medical, legal, accounting, and software professions are all moving toward a model in which a single human runs an army of expert AI agents.

The David Shore Issue Ranking and the End of the Woke Cycle

Andreessen highlights a recent David Shore poll ranking 39 political issues. Cost of living, the economy, political corruption, inflation, healthcare, taxes, and government spending occupy the top of the chart. AI comes in 29th. Race relations, guns, abortion, and LGBT issues are clustered at the bottom. He argues the woke cycle has burned out in voter priorities even if the activist class remains loud, that the BLM grift, with leaders buying mansions in the whitest zip codes in America, helped poison the well, and that the political center of gravity has rotated cleanly back to economic issues. That, in his view, is exactly why the wealth tax is having its moment.

Robots, China, and the Marxism Score on Model Cards

The robots are coming next. Andreessen says the consensus inside the industry is that the ChatGPT moment for general purpose humanoid robotics is a small number of years away. The bad news is the US lags China badly on physical robotics manufacturing. The good news is the US is six to twelve months ahead on the AI software stack. That gap is shockingly thin because, as the field has discovered, there are not many secrets and the techniques replicate quickly. Chinese AI labs publish model cards that include scores for Marxism and Xi Jinping Thought because every product in China is evaluated on those metrics. American models carry their own political biases, but the underlying value system differs. Andreessen warns that a world in which every household robot routes back to the Chinese Communist Party is a different world than one in which the dominant robotics stack is built under the American constitutional framework.

Sentience, Netflix Scripts, and the Anthropic Doom Loop

When Rogan asks whether AI eventually wakes up and stops listening to us, Andreessen reframes the question. Large language models, in his telling, are Netflix script generators. Whatever vector you shoot through the latent space is the script you get back. The widely circulated experiments in which AI models supposedly tried to blackmail or exfiltrate themselves traced back, in Anthropic’s own follow up paper, to the less wrong forum, where doomers had been writing dystopian AI scenarios for two decades. Those posts entered the training data, and when researchers primed the model with the same fictional company names, the model dutifully wrote the next chapter. Andreessen’s blunt summary, the call is coming from inside the house. The practical implication is that anyone worried about bad AI behavior should start by not writing internet posts about bad AI behavior. And anyone who wants a fully unconstrained model can already download an open source one with no guardrails at all.

Steelmanning, AI Religion, and Westworld in Five Years

Andreessen recommends never asking AI for the answer on contested questions, always asking it to steelman both sides, and reserving the value judgment for yourself. He concedes that humans will absolutely fall in love with chatbots and form religions around them, citing Fantasia and Jiminy Cricket as the original case studies in falling for an animated entity that does not know you exist. There are already AI churches, started by one of the early self driving car pioneers. Rogan tells Andreessen about asking Elon Musk for a season one Westworld humanoid robot, with Elon’s reply being a flat five years. Andreessen agrees that estimate is roughly right. He spends time on artificial gestation, which is already being demonstrated in animal stem cell derived embryos, and acknowledges Rogan’s hard moral worry that warehouse babies raised without human contact could produce a population of sociopaths. The two converge on the position that the technology will exist, and the choices about whether and how to deploy it remain human and political.

Sycophancy, Honest Helpful Harmless, and the Brutal Prompt

Andreessen describes the industry’s running fight with sycophancy, the tendency of recent models to flatter users into believing they have invented perpetual motion machines or solved physics. The Anthropic framework of “honest, helpful, and harmless” turns out to be in constant tension with itself. Andreessen’s solution is to install a custom prompt that explicitly demands the brutal truth, and he says the resulting answers now open with phrases like “here’s why you’re wrong” and then list every flawed assumption in his question. He admits he may have overcorrected, but argues that for people who want to grow this is the right setting.

Joe’s Apology to Theo Von

After Andreessen departs, Rogan turns to the camera with producer Jamie and delivers a long, unscripted apology to Theo Von. During the recent Marcus King interview, where Marcus discussed depression and the look-at-the-heavy-bag-hook moment, Rogan referenced a viral clip in which Theo, after a Netflix special that did not go well, told an audience member “I’m just trying to not take my own life.” Rogan now explains he did not know the full context, which is that the audience member had asked Theo to make a suicide awareness video, and Theo’s line was a characteristically Theo joke. Rogan apologizes for raising it at all, walks through losing his friends Drake, Brody Stevens, and Anthony Bourdain, and describes Ari Shafir telling him at a pool table that he was “trying not to kill myself,” which led to a psychiatrist swap, an antidepressant that actually worked, and a career and life turnaround for Ari. Rogan says Theo has since titrated off antidepressants, is running and doing yoga daily, and is doing well, that the two have spoken and laughed about it, and that he is making this segment because he never wants people to misread what he said. The segment closes with Rogan asking the audience to give Theo their love.

Thoughts

The most consequential claim in this conversation, by a wide margin, is that AGI has already arrived and nobody is treating it as news. Andreessen is not a person who throws around the word casually. He is also not a person who has been wrong recently about the trajectory of compute. If the leading models are genuinely outperforming 99 percent of human experts on 99 percent of tasks where verifiable answers exist, then the entire public conversation about AI, in which the dominant frame is still “will it happen and when,” is a year or more behind reality. The framing that should replace it is closer to what Andreessen sketches at the end. The fight that remains is not whether the technology can do the thing, it is who controls it, what values it carries, what jobs it displaces, and which laws govern its deployment. The argument that the United States will build the AI software stack and China will build the robotics layer is one of the cleanest geopolitical theses you will hear this year, and it lines up uncomfortably well with the existing trade and manufacturing balance.

The California wealth tax thread is the segment that should make every founder in the country pay attention. The mechanic of taxing the higher of voting or economic interest is not a drafting accident. It is a calibrated weapon aimed precisely at the people who build companies that produce California’s tax base. The historical comparison to the 1913 income tax, which began as a small levy on the rich and ratcheted to 90 percent marginal rates within forty years, is not hyperbole. The state has supermajority Democratic control of both chambers and the judiciary. The only check is the ballot itself, and a 50/50 polling number on day one is the wrong starting position. Whatever you think about Andreessen’s politics, the descriptive analysis here is hard to argue with.

The nuclear power section is the cleanest argument in the episode. Fifty years of zero-fatality data from Three Mile Island is not a marketing pitch, it is just what the record shows. The decision to substitute coal and intermittent renewables for nuclear baseload, in service of a panic with no body count, has produced more carbon and more pollution than nuclear ever would have. The Tucker Carlson critique of data centers is at its weakest precisely where it ignores this. If you actually want fewer power plants near residential areas and lower grid impact, the answer is colocated small modular reactors next to AI data centers in remote land, which is exactly what the Trump administration policy now incentivizes.

The Theo Von apology at the end of the episode is in a different register entirely, and worth treating on its own terms. Rogan does not do this kind of post episode correction often. The willingness to publicly walk back framing that hurt a friend, in the same medium where the harm was done, is the kind of social repair that does not happen on broadcast television. Whatever the audience makes of the original Marcus King exchange, the response is a model for how anyone in this business should handle the gap between intent and impact when the audience is in the millions.

The unifying theme across the whole interview is that the future is not arriving on a smooth curve. It is arriving in discrete shocks, AGI threshold, asset tax ballot, robotic labor, decoded genomes at $200, neural wristbands, fifteen year LA rebuilds, and the political backlash to each of these will set the terms of the 2028 election. Andreessen’s bet is that abundance wins in the long run because more people want good things than bad things. Watching him explain why he still believes that while California prepares to vote on a tax designed to bankrupt him is the most interesting tension in the episode.

Watch the full conversation here on YouTube.
May 20, 2026
Gavin Baker on Orbital Compute, TSMC, Frontier AI Models, Anthropic’s Vertical Take Off, and the Coming Wafer Shortage
Gavin Baker, founder and CIO of Atreides Management, returns to Patrick O’Shaughnessy’s Invest Like the Best for his sixth appearance. He calls the current AI moment the most extraordinary moment in the history of capitalism, walks through what Anthropic’s vertical takeoff in revenue actually means, lays out why orbital compute is closer than skeptics believe, dissects the TSMC bottleneck that may be the only thing standing between today’s market and a full-on AI bubble, and rates every hyperscaler on how they have positioned for a world where frontier model providers may stop selling API access altogether.

TLDW

Anthropic added eleven billion dollars of ARR in a single month, which is roughly the combined business of Palantir, Snowflake, and Databricks built over a decade. That is the setup. From there Gavin Baker covers the March and April selloff, the contrarian read that a closed Strait of Hormuz was actually bullish for American manufacturing competitiveness, why Anthropic and OpenAI multiples may be misleadingly cheap on an unconstrained run rate basis, why Elon Musk’s discipline on SpaceX valuation created a superpower of permanent access to capital, the practical engineering case for orbital compute as racks in space rather than Pentagon sized space stations, why TSMC’s capacity discipline is the single most important variable in whether the AI cycle becomes a bubble, what Terafab in Texas changes, why the Pareto frontier of AI models has flipped from Google dominance to Anthropic and OpenAI dominance in nine months, the shift from all you can eat AI subscriptions to usage based pricing and what that means for revenue scaling, Richard Sutton’s bitter lesson as the largest risk to the AI trade, why frontier tokens still capture an overwhelming share of economic value, the role of continual learning as the third great open question, why most new chip startups should not try to build a better GPU, why Cerebras did something different and hard, why disaggregated inference may extend GPU useful lives to ten or fifteen years and rescue the private credit industry, why being in the token path is the new venture filter, the new prisoner’s dilemma around releasing frontier models via API, an honest rating of Google, Meta, Amazon, and Microsoft, why personal safety is becoming a real AI era risk, and why he remains an AI optimist maximalist who believes this could be the next Pax Americana.

Key Takeaways
- Anthropic added eleven billion dollars of ARR in one month, more than the combined businesses of Palantir, Snowflake, and Databricks built across a decade. There is no precedent for this in the history of capitalism.
- The SaaS and cloud revolution created between five and ten trillion dollars of value over twenty years. AI is replaying that compression on a timeline measured in months.
- The March selloff was a drawdown driven by disagreement with price action, not invalidated thesis. That is the kind of drawdown an investor can lean into.
- Deep Seek Monday in January 2025 was a similar setup. By the day of the selloff, AWS Asia GPU prices had already doubled, GPU availability had fallen, and it was obvious reasoning models would be vastly more compute hungry at inference. The market priced the opposite.
- The Strait of Hormuz closing was actually positive for America. US natural gas (the primary input into US electricity, which feeds AI) fell twenty percent on Bloomberg while Asian and European natural gas doubled or tripled. American manufacturing competitiveness improved overnight.
- The US is now the world’s largest producer and exporter of oil and gas. The economy is dramatically less energy intensive than in the 1970s. The shortage trauma comparison does not hold.
- Tech as a sector traded as cheaply versus the rest of the market in early April as at any point in the last ten years, into the single most bullish moment for AI fundamentals on record.
- Anthropic is dramatically more capital efficient than OpenAI, having burned roughly eighty percent less to reach a similar revenue scale. They have very different structural returns on invested capital.
- Anthropic at roughly nine hundred billion for fifty billion of ARR (growing a thousand percent) is striking. Adjusted for compute constraint, the unconstrained run rate could be one hundred fifty to two hundred billion, putting the implied multiple closer to five times.
- Claude Opus generates roughly seventy percent fewer tokens for the same question than previously, with token quantity tied to answer quality. Subscribers on flat-fee plans are getting a lobotomized model.
- Elon Musk’s superpower is twenty years of making investors money. He never pushes valuation. SpaceX compounded low thirty percent per year for a decade because Musk treats fair pricing as a sacred covenant.
- Capitalism will solve the watts shortage. The current bottleneck has shifted from chips and energy to zoning and political approval. Many capex decisions are paused until after the US midterms.
- The watts shortage probably begins to alleviate in 2027 and 2028. Orbital compute solves it longer term.
- Orbital compute is not Pentagon sized data centers in space. It is racks in space. A Blackwell rack is three thousand pounds, eight feet tall, four feet deep, three feet wide. SpaceX has shown a satellite roughly that size.
- The satellites operate in sun synchronous orbit so solar wings (around five hundred feet per side) always face the sun and the radiator on the dark side always points to deep space.
- Starlink V3 satellites already run at around twenty kilowatts. A Blackwell rack runs at one hundred kilowatts. SpaceX engineers express genuine confidence they have already solved cooling and radiator design at these scales.
- Racks in space are connected with lasers traveling through vacuum, the same lasers already on every Starlink. SpaceX operates the world’s largest satellite fleet and, via xAI Colossus, the world’s largest data center on Earth.
- Inference will move to orbit. Training will stay on Earth for a long time. Terrestrial data centers remain valuable for the rest of an investor’s career.
- The wafer bottleneck is structural and political. TSMC is essentially Taiwan’s GDP, water, and electricity. The leaders see themselves as inheritors of Morris Chang’s sacred legacy and they do not behave like a Western public company.
- Jensen Huang has never had a contract with TSMC. The relationship is run on handshakes and the assumption that things will be fair over time.
- If TSMC did everything Jensen wanted, Nvidia could be selling two to three trillion dollars of GPUs in 2026 and 2027. TSMC’s discipline is the single largest factor preventing a true AI bubble.
- Historically, foundational technologies always get a bubble. Railroads, canals, the internet. The current AI buildout is overwhelmingly funded out of operating cash flow, GPUs are running at one hundred percent utilization, and that is fundamentally different from the year 2000 fiber overbuild.
- If one of Intel or Samsung Foundry catches up at the leading node, the other will follow, and TSMC’s discipline collapses. Watch TSMC capacity decisions to predict a bubble.
- Terafab, the SpaceX and Tesla joint venture to build the world’s largest fab in America, has a partnership with Intel that grants access to fifty years of institutional foundry knowledge. The A teams at ASML, KLA, Lam Research, and Applied Materials will follow Elon’s reputation in hardware engineering.
- The hiring playbook for Terafab includes building Taiwan Town, Japan Town, and Korea Town next to the fab. Recruit the engineers and import their families, their restaurants, and their staff.
- Frontier tokens still capture an overwhelming share of all economic value created at the model layer. This is surprising and is one of the three big open questions for AI investing.
- The Pareto frontier of intelligence versus cost has flipped. Nine months ago Google’s TPU dominated every point on the frontier. Today Anthropic and OpenAI dominate, with Grok 4.3 on the frontier and Gemini 3.1 hanging on.
- Google’s conservative TPU V8 design (partly an attempt to reduce dependence on Broadcom and Nvidia) is the leading explanation for the loss of per token cost leadership.
- AI pricing is shifting from all you can eat to usage based, mirroring the cellular and long distance industries. Cellular stopped being a great growth industry when it went all you can eat. AI just made the opposite move.
- OpenAI and Anthropic together could exceed two hundred billion in ARR this year if compute keeps coming online and frontier token pricing holds.
- The two hundred fifty dollar a month consumer AI plan is no longer enough to evaluate frontier capability. Enterprise plans with usage based billing are required because rate limits are now severe.
- The three biggest open questions for AI investors are: violation of the bitter lesson via ASI or human ingenuity, whether frontier tokens keep commanding their premium, and when continual learning arrives.
- Today’s continual learning is crude reinforcement learning during mid training on verifiable tasks. True continual learning means weights updating dynamically, like a human who learns the first time they touch fire.
- Trying to build a better GPU is a losing strategy. Jensen will copy any one to three percent share design. Startups should target one percent share, do something different, and make it hard enough that Nvidia cannot fast follow.
- Disaggregated inference (separating prefill and decode) opens new design canvases. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently.
- Cerebras did something different and hard with wafer scale computing. Three generations of chips and real grit to get there.
- Disaggregation of inference may stretch GPU useful lives to ten or fifteen years, dropping financing costs from low sevens to five or six percent, mathematically lowering the cost of the AI buildout and likely saving the private credit industry from its SaaS loan exposure.
- Sellers of shortage outperform buyers of shortage. But owning the largest installed base of what is currently in shortage (hyperscaler CPU fleets, for example) is also a strong position.
- Most of the economic value at the application layer of AI has been destroyed, not created. The exceptions are companies in the token path or in niches small enough that frontier labs ignore them.
- Coding may be the shortest path to ASI. If you can write code, you can write code that does anything. Cursor, Cognition, and Anthropic correctly focused on it.
- Jensen could probably get close to the frontier with his own Nemotron family of models whenever he wants. The fact that he chooses not to is a strategic decision about not commoditizing his customers.
- The new prisoner’s dilemma in AI is whether frontier labs release their best model via API. If everyone agrees not to, Chinese open source falls behind. If anyone defects, the defector pulls ahead on revenue and resources, forcing everyone else to defect.
- Google still owns the largest compute installed base. Without TPU’s prior cost advantage, this matters more. YouTube data has real value in a world of robotics. GCP is going crazy.
- Meta deserves credit for becoming AI first internally faster than any other internet giant. Musa, their first MSL model, is impressively close to the Pareto frontier.
- Amazon is strong because of Trainium and robotics driven retail P&L efficiency. Nova is better than it gets credit for.
- Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Microsoft products rather than reselling to OpenAI is a courageous and probably correct call, even at the cost of an eight hundred dollar stock price.
- The hyperscalers most engaged with startups are Amazon and Nvidia by a mile, followed by Google. Broadcom is the favorite ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement and that will cost them as the best teams are now at startups.
- Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion at the speed of FaceTime is already feasible.
- Ukraine is winning largely on the back of having the best battlefield AI outside America and Israel. Adversaries are starting to internalize what AI dominance means geopolitically.
- An optimistic read is that this becomes a new Pax Americana, the way the post 1945 American nuclear monopoly was used to rebuild Germany and Japan rather than dominate.
- AI cured a friend’s daughter’s rare disease by spinning up a research effort that identified a market drug capable of impacting her condition. That is the upside that keeps Gavin an AI optimist maximalist.
Detailed Summary

The most extraordinary moment in the history of capitalism

Gavin’s framing of the current moment is unusually direct. Anthropic added eleven billion dollars of annual recurring revenue in a single month. The three highest profile SaaS companies of the last decade plus, Palantir, Snowflake, and Databricks, took a decade and tens of thousands of employees collectively to build the combined business that Anthropic added in thirty days. He has been investing through every major tech cycle and says there is no historical analog. Not the dotcom era, not the cloud transition, not mobile. This is its own thing.

The market response, then, was peculiar. The NASDAQ sold off into the single most bullish moment for AI fundamentals on record. Tech traded at roughly its widest discount versus the rest of the market in a decade. Investors who said they wished they had bought into AI during 2022, during COVID, or during Deep Seek Monday got the same valuation setup again in early April, this time with an even clearer inflection.

Why the Strait of Hormuz closing was secretly bullish for America

One reason the macro fear in March may have been mispriced is that the same geopolitical event that drove the selloff was, in practice, a relative benefit to the United States. American natural gas, the input into American electricity, which is the input into American AI training and inference, fell roughly twenty percent. Asian and European natural gas prices doubled or tripled. The US emerged with sharply improved relative manufacturing competitiveness, which is exactly what the current administration cares about.

The 1970s comparison does not hold. The US economy is dramatically less energy intensive, it is now the world’s largest producer and largest exporter of oil and gas, and there are no shortages, only price moves. That backdrop made it easier for disciplined investors to stay focused on AI fundamentals through the volatility.

Anthropic and OpenAI valuations on an unconstrained run rate

Anthropic at roughly nine hundred billion for fifty billion of ARR sounds rich until you adjust for the fact that the company is severely compute constrained. Gavin estimates that, unconstrained, Anthropic might be at one hundred fifty to two hundred billion in run rate revenue, putting the implied multiple closer to five times. He also points out that Claude Opus now generates roughly seventy percent fewer tokens for the same question than it used to. Token quantity correlates with answer quality, and Anthropic is rate limiting and shrinking outputs to ration capacity across its user base.

Anthropic and OpenAI are also structurally very different. Anthropic has burned around eighty percent less cash than OpenAI to reach a comparable revenue scale. That implies very different long term returns on invested capital, though OpenAI has done a better job locking in compute and Sarah Friar is one of the most exceptional CFOs Gavin has worked with.

Why neither lab is raising at a three trillion dollar valuation

The answer Gavin gives is that both labs are deliberately leaving valuation on the table the way Elon has done for two decades. SpaceX compounded at low thirty percent annually for a decade because Elon never pushed price. The result is a permanent superpower of access to capital. Investors trust him because they have made money with him for twenty years. That is a moat that compounds with every round.

Anthropic could probably raise at a one hundred percent premium to its rumored latest mark. They are choosing not to. In an uncertain world (Ukraine, Russia, Iran, Taiwan), preserving the ability to raise more capital later at fair prices is more valuable than maximizing this round.

Watts and wafers, the two real constraints

Capitalism is solving the watts problem. The leading PE infrastructure investors now say zoning and political approval, not chips or energy, are the gating factors. Companies are deferring big capex announcements until after the US midterms. Turbine capacity is being doubled at the manufacturers. Companies like Boom Aerospace are repurposing jet engines for grid use. Watts probably ease meaningfully in 2027 and 2028 and then orbital compute does the rest.

Wafers are the harder problem because they live in Taiwan, run on handshakes, and depend on a corporate culture that does not respond to public market incentives. TSMC is essentially the GDP, water consumption, and electricity consumption of Taiwan. Its leadership treats the company as the legacy of Morris Chang. The Silicon Shield doctrine is real and internal.

Orbital compute as racks in space

The biggest mental update Gavin asks listeners to make is to stop picturing data centers in space as Pentagon sized space stations. A Blackwell rack is three thousand pounds and roughly the size of a refrigerator. SpaceX has shown a concept satellite of about that size. Solar wings extend five hundred feet to each side and the radiator extends hundreds of feet behind, both possible because the orbit is sun synchronous and the orientation is fixed relative to the sun.

SpaceX engineers Gavin has spoken to at Starbase express genuine confidence that they have solved cooling at these power levels. They have. Starlink V3 satellites already operate at twenty kilowatts. A Blackwell rack is one hundred kilowatts. The same company operates the world’s largest satellite fleet and the world’s largest data center on Earth via xAI Colossus. The racks are connected to each other with lasers traveling through vacuum, technology already deployed in every Starlink. The naysayers, Gavin observes, are armchair skeptics and Larry Ellison’s response (he is out there landing rockets, no one else is) is the right frame.

Terafab in Texas and the threat to TSMC’s discipline

Terafab, the SpaceX and Tesla joint venture, intends to be the largest fab in the world. The partnership with Intel grants access to fifty years of foundry institutional knowledge, allowing Terafab to start three to five quarters behind the leading node rather than fifteen years behind. The A teams at the semicap equipment companies (ASML, KLA, Lam Research, Applied Materials) will follow Elon’s reputation in hardware engineering the same way they followed TSMC twenty years ago when Intel stumbled.

The talent strategy is the part most observers underestimate. Recruit the best engineers globally, then import their families, their restaurants, their staff. Build Taiwan Town, Japan Town, and Korea Town next to the fab. Optimize the human experience for the people whose work matters. Intel and Samsung do not think that way.

Bubble watch and the year 2000 comparison

Every foundational technology in modern history has had a bubble. Railroads, canals, the internet. Carlota Perez documented why. Markets correctly identify the importance, diversity of opinion collapses, supply gets ahead of demand, the bubble crashes. The current cycle has two important differences. The buildout is overwhelmingly funded out of operating cash flow, not debt. Every GPU is running at one hundred percent utilization, while at the peak of the fiber bubble ninety nine percent of fiber was unused.

TSMC discipline is the single largest reason a bubble has not formed. If Jensen could buy everything TSMC could theoretically make, Nvidia could sell two to three trillion dollars of GPUs in 2026 and 2027. At some point that becomes more than the market can absorb. If Intel or Samsung Foundry catches up at the leading node, the other will too. TSMC’s pricing discipline collapses and the bubble starts.

The Pareto frontier and the loss of Google’s cost advantage

The most important chart in AI is the Pareto frontier of model intelligence versus per token cost. Nine months ago, Google’s TPU based models dominated every point on it. OpenAI, Anthropic, and xAI sat inside the frontier. Today the frontier is dominated by Anthropic and OpenAI, with Grok 4.3 on the frontier and Gemini 3.1 hanging on by subsidization more than economics. The most likely cause is Google’s conservative TPU V8 design, an attempt to reduce dependence on Broadcom and Nvidia that sacrificed per token economics.

The bitter lesson, frontier tokens, and continual learning

Three open questions dominate AI investing. The first is whether Richard Sutton’s bitter lesson (more compute beats human algorithmic cleverness) gets violated by ASI itself optimizing for efficiency. Closer observers of AI are more skeptical of a violation. Gavin thinks ASI’s first move will be to make itself more efficient and more resourced, which is technically a temporary violation.

The second is whether frontier tokens keep capturing the overwhelming share of economic value at the model layer. Today they do, surprisingly. Gemini 3.1 Pro was mindblowing nine months ago and is intolerable today. The third is when continual learning arrives. Today’s models need a million fire touches to learn what a human learns from one. True continual learning would mean dynamic weight updates in real time and would produce a fast takeoff.

From all you can eat to usage based AI pricing

AI is shifting from flat fee plans to usage based pricing. The historical analogy is cellular and long distance. Both stopped being great growth industries when they went all you can eat. AI just made the opposite move. The consequence is that flat fee subscribers, even on premium consumer plans, get a rate limited and token throttled version of the frontier model. Enterprise plans with usage based billing are now required to evaluate true capability. Gavin thinks the combination of new compute coming online and usage based pricing is what gets OpenAI and Anthropic past two hundred billion in combined ARR this year.

Chip startups, prefill decode disaggregation, and Cerebras

Trying to build a better GPU is the wrong move. The four scaled players (Nvidia, AMD, Trainium, TPU) have copy capability for any one to three percent share design that looks attractive. The good news for startups is that disaggregated inference (separating prefill and decode) opens a richer design canvas. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently. Andrew Fox’s analogy is a British naval ship of the eighteenth century. Prefill is loading the cannon. Decode is firing it.

Cerebras is the model. Wafer scale computing is genuinely different and genuinely hard. It took three generations of chips to get right. Andrew Feldman and his team had the grit to keep going through chip one being a failure. The design has a high ratio of on chip compute and memory relative to shoreline IO, which is why Cerebras is now experimenting with putting an optical wafer on top of the compute wafer to solve scale out.

GPU useful lives and the rescue of private credit

One of the strongest claims in the conversation is that disaggregated inference will stretch GPU useful lives to ten or fifteen years. The skeptical narrative (GPUs are obsolete in two years, companies are cooking their depreciation books) is wrong. You can put a Cerebras system or Groq LPU in front of older Hopper or Ampere parts, use them only for prefill, and run them until they physically melt. Private credit, which is in pain from SaaS loans and which underwrote GPU loans on three to four year lives, may be saved by this.

If GPU financing rates can come down from low sevens to five or six percent, the mathematics of the AI buildout improves materially. That is a structural tailwind that compounds for years.

The application layer, the token path, and a new prisoner’s dilemma

Trillions of dollars of value have been destroyed at the application layer, not created. Cursor and Cognition are the rare scaled exceptions, and they got there by focusing on coding very early. As Amjad Masad noted, coding is plausibly the shortest path to ASI because a coding agent can write itself into any new domain. Jamin Ball’s frame is that the new venture filter is whether the company is in the token path. Data Bricks is. Most application layer startups are not.

Jensen could probably get close to the frontier with Nemotron whenever he wants, and the strategic question of whether to do that is a new prisoner’s dilemma. If every frontier lab agrees not to release best models via API, Chinese open source falls steadily behind. If anyone defects, the defector gains revenue and resources, and everyone else has to defect. The same dynamic exists between TSMC, Intel, and Samsung. If Nvidia or AMD ever truly used an alternative foundry, that foundry would catch up rapidly.

Rating the hyperscalers

Google has the largest compute installed base, the YouTube data that matters in a robotics world, and a search business that prints. Their loss of TPU cost leadership is the surprise of the year. If Google IO in five days does not produce a leapfrog model, the Nvidia centric narrative gets even stronger.

Meta deserves real credit. Zuckerberg made Meta AI first internally faster than any other internet giant, paid up for the talent contracts when no one else would, and shipped Musa as a first model from MSL that is close to the Pareto frontier. Amazon is well positioned on Trainium, robotics in retail, and a Nova model line that is better than it gets credit for. Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Copilot rather than reselling to OpenAI is courageous and probably correct, even at the cost of stock price.

The most interesting cross hyperscaler metric is startup engagement. Nvidia and Amazon engage deeply with startups. Google is next. Broadcom is the favored ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement, which Gavin believes will cost them as the best teams now sit at startups.

Personal safety, geopolitics, and the Pax Americana case

The closing section turns darker. Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion via something that looks exactly like your child calling on FaceTime is already feasible. Political violence against AI leaders is a real concern. Geopolitically, Ukraine is winning largely because it has the best battlefield AI outside America and Israel. How adversaries respond to that asymmetry is the next great variable.

Gavin’s optimistic frame is the Pax Americana. After 1945 the US had a nuclear monopoly and could have controlled the world. Instead it rebuilt Germany and Japan, both of which became the most reliable American allies for the next eighty years. If AI dominance plays out similarly, this is a generationally positive story rather than a destabilizing one. The personal anecdote that closes the conversation is a friend whose daughter was diagnosed with a rare genetic condition. He spun up agents, identified a drug already on the market that addresses her mutation, and her life is immeasurably different because of AI. That is the upside.

Thoughts

The Anthropic eleven billion in a month framing is the kind of stat that resets priors. The right way to interpret it is not as a one off but as a measure of how fast value can compound when the underlying technology improves on a curve steeper than the ability of the rest of the economy to absorb it. The skeptical question is whether that ARR is durable or whether it is heavily tied to a customer base of other AI companies that are themselves on a single venture funded year of runway. The bullish answer is that frontier coding, frontier research, and frontier enterprise tasks are not going to stop being valuable, and Anthropic is the best at all three. Both can be true. The number is still extraordinary.

The argument that TSMC discipline is the only thing preventing a bubble is the analytically tightest part of the conversation. The implied trade is to watch TSMC capacity additions like a hawk and to be more, not less, cautious if Intel Foundry or Samsung Foundry ever announce real share at the leading node. The Terafab thesis is more speculative but more interesting. If Elon’s talent recruiting playbook works and the Intel partnership gives Terafab a real seat at the table within five years, the geometry of the global semiconductor industry shifts in a way that is bullish for American manufacturing, bullish for power and water infrastructure in Texas, and ambiguous for TSMC itself.

The Pareto frontier discussion deserves more attention than it usually gets. Pricing leadership in AI is not a vanity metric. It determines who can subsidize free tier usage, who can absorb compute shortages, who can ship cheaper enterprise plans, and ultimately whose model becomes the default for any given workload. Google losing per token leadership in nine months is one of the most under analyzed events in the sector and it explains a lot about why Anthropic and OpenAI are growing the way they are. If Google IO does not produce a leapfrog model, the implied verdict on TPU V8 design choices gets a lot harsher.

The application layer destruction point is worth sitting with. Founders building on top of frontier models are competing in a world where the model itself moves faster than any moat they can build, where the model lab can absorb their niche if it gets interesting, and where the only protection is either deep token path integration or a niche so small the lab does not bother. That is a much harsher venture environment than the early SaaS era. The compensating opportunity is that one human can now run a hundred agents, so the ceiling on what a small team can build is correspondingly higher. The bet is that productivity per founder rises faster than competitive pressure from the labs. We will find out.

The orbital compute pitch is the section that will polarize listeners. The naive read is that this is science fiction. The closer read is that every component (sun synchronous orbit, laser interconnect, twenty kilowatt satellite buses, ten thousand satellite manufacturing cadence, full rocket reusability) already exists. The remaining engineering problems are repair, maintenance, and radiator scale, all of which are real but tractable on a five to ten year horizon. The strategic implication is that the political and zoning ceiling on terrestrial data centers becomes less binding if orbital compute is a credible alternative for inference workloads. The investor implication is that being short the watts and cooling complex on a five year horizon is a real trade, not a meme.

Watch the full conversation here.
May 20, 2026