PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: frontier models

  • OpenAI’s Leaked 2025 Financials: $34 Billion in Spending, a $38.5 Billion Net Loss, and a $17 Billion Microsoft Bill Ahead of Its IPO

    Infographic summarizing OpenAI leaked 2025 financials: $13.07B revenue, $34B total costs, $20.92B operating loss, $38.53B net loss, where the $34B went, the $17.2B paid to Microsoft versus $303M paid back, inference costs, and IPO valuation context

    OpenAI’s audited 2025 financials leaked this week, and they are the clearest picture yet of what it actually costs to run the company behind ChatGPT. Independent journalist Ed Zitron first published the documents, and the Financial Times independently confirmed them. The headline: OpenAI spent $34 billion last year, booked $13.07 billion in revenue, and reported a net loss attributable to the company of $38.5 billion. The disclosure lands just days after OpenAI publicly confirmed that it had confidentially filed for an IPO that could value it north of $1 trillion.

    TLDR

    OpenAI’s audited 2025 numbers, leaked by Ed Zitron and confirmed by the Financial Times, show revenue tripling to $13.07 billion while total costs reached $34 billion, producing a $20.92 billion operating loss and a $38.53 billion net loss attributable to the company. The much larger net loss is inflated by a one-time $41.55 billion non-cash charge tied to OpenAI’s October 2025 conversion from a nonprofit to a public benefit corporation; strip the non-cash items and the loss is closer to $8 billion. R&D alone was $19.18 billion, cost of revenue (inference) was $7.5 billion, and sales and marketing ballooned to $5.73 billion. OpenAI paid Microsoft $17.2 billion in 2025 while Microsoft paid OpenAI only $303 million, exposing a deep Azure dependency. The company burned $1.60 for every dollar of revenue, down from $2.37 in 2024, and gross margin slipped from roughly 40% to 33% as more capable models consumed more compute per query. The leak arrives as OpenAI files a confidential S-1, targets a listing as early as September 2026 at up to a $1 trillion valuation, and races rival Anthropic, which is more valuable on paper and claims it is already turning an operating profit.

    Thoughts

    The most important thing to understand about these numbers is that there are two loss figures and the press will conflate them. The $38.53 billion net loss is the scary headline, but it is driven largely by a $41.55 billion non-cash accounting charge from converting investor convertible interests into equity during the for-profit restructuring. That charge is real on the audited statement and it will show up in the eventual S-1, but it is a one-time artifact of OpenAI’s unusual corporate history, not money that left the building. The number that describes the actual business is the $20.92 billion operating loss. That is the one to watch, and it is still enormous.

    The one genuinely encouraging line in the whole release is the loss-per-dollar ratio. In 2024 OpenAI ran an operating loss of $2.37 for every dollar of revenue. In 2025 that fell to $1.60. A company that is still losing $1.60 on every dollar is not a healthy business, but a company whose loss ratio improved by a third in a single year while tripling its top line is at least pointed in a defensible direction. The bull case for OpenAI lives entirely in the slope of that line. If it keeps improving at that rate, the math eventually crosses over. If it stalls, the valuation is a fantasy.
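
    The ratio above can be reproduced from the article's own figures. The sketch below assumes the ratio is operating loss divided by revenue, which is the reading that matches the stated $2.37 and $1.60; the variable names are illustrative.

```python
# Figures in billions of US dollars, as stated in the article.
revenue = {2024: 3.70, 2025: 13.07}
operating_loss = {2024: 8.78, 2025: 20.92}


def loss_per_dollar(year: int) -> float:
    """Operating loss divided by revenue for the given year."""
    return operating_loss[year] / revenue[year]


ratio_2024 = loss_per_dollar(2024)
ratio_2025 = loss_per_dollar(2025)

# Relative improvement in the loss ratio from 2024 to 2025.
improvement = 1 - ratio_2025 / ratio_2024

print(f"2024: ${ratio_2024:.2f} lost per $1 of revenue")
print(f"2025: ${ratio_2025:.2f} lost per $1 of revenue")
print(f"Relative improvement: {improvement:.1%}")
```

    On these inputs the improvement works out to a little under a third, consistent with the claim above.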

    The Microsoft relationship is the single most revealing disclosure, and it is wildly asymmetric. OpenAI paid Microsoft $17.2 billion in 2025. Microsoft paid OpenAI $303 million. That is a 56-to-1 ratio, and it reframes the partnership: Microsoft is not really a peer or even just an investor, it is OpenAI’s landlord and primary supplier, collecting rent on every model trained and every query answered. The April 2026 renegotiation that capped revenue-share payments at $38 billion through 2030, down from a projected $135 billion, suddenly looks less like a favor and more like OpenAI desperately trying to lower its single largest cost. The dependency cuts both ways, but right now Microsoft holds the better hand.

    The structural problem hiding inside the cost of revenue line is inference. Training a model is a fixed, one-time cost. Serving it is a recurring cost that scales with every one of ChatGPT’s roughly 800 million weekly users. OpenAI spent $5.02 billion on Azure inference in the first half of 2025 alone, and the more capable its reasoning models get, the more compute each answer burns. That is why gross margin went down even as revenue went up. It is the opposite of how software is supposed to work, where the marginal cost of one more user trends toward zero. OpenAI’s marginal cost is real, large, and growing. The counterargument is that per-token inference costs have been falling roughly tenfold a year, so the unit economics could still flip. That is the entire wager.

    Finally, the timing matters more than the numbers. OpenAI’s confidential S-1 means these audited figures were going to become public regardless, since the SEC requires the full prospectus at least 15 days before a roadshow. What the leak changes is who gets to study them first. Prospective IPO buyers, enterprise customers signing multi-year API contracts, and competitors now have the audited books weeks or months early, and they are reading them against Anthropic, which filed at a higher valuation and claims an operating profit. For a company asking the public markets to underwrite a $1 trillion bet on a monopoly outcome that does not yet exist, losing control of the narrative this early is not a small thing.

    Key Takeaways

    • OpenAI’s audited 2025 financials were first published by independent journalist Ed Zitron and independently confirmed by the Financial Times, the first verified look at the company’s books before its planned IPO.
    • Revenue grew from $3.7 billion in 2024 to $13.07 billion in 2025, more than tripling year over year, making OpenAI one of the fastest-growing businesses in history.
    • By the end of 2025 OpenAI was generating roughly $2 billion in monthly revenue, up from about $1 billion a quarter at the end of 2024.
    • Total costs and expenses hit $34 billion in 2025, up from $12.48 billion in 2024.
    • Research and development was the single largest expense at $19.18 billion, up from $7.81 billion, and exceeded total revenue on its own.
    • Of that R&D spend, $10.59 billion went to Microsoft, almost certainly the GPU compute cost of training frontier models on Azure.
    • Cost of revenue, the expense of serving ChatGPT responses (inference), rose from $2.65 billion to $7.5 billion.
    • Sales and marketing jumped from $1.11 billion to $5.73 billion, a 418% increase.
    • General and administrative costs rose from $907 million to $1.57 billion.
    • The operating loss, the truest measure of day-to-day economics, grew from $8.78 billion to $20.92 billion.
    • The net loss attributable to OpenAI was $38.53 billion, more than seven times the $5.09 billion loss in 2024.
    • The bulk of that jump was a one-time, non-cash $41.55 billion charge from OpenAI’s October 28, 2025 conversion to a public benefit corporation, reflecting the changing fair value of convertible interests and warrant liabilities.
    • Stripping out the restructuring charge and other non-cash items such as stock-based compensation and Microsoft computing credits, the underlying loss was about $8 billion.
    • Including all factors, gross net loss reached $60.35 billion, lowered to the $38.53 billion attributable figure by removing $21.82 billion attributed to noncontrolling and redeemable noncontrolling interests.
    • OpenAI burned $1.60 for every $1 of revenue in 2025, an improvement from $2.37 in 2024, the clearest data point in the bull case.
    • Measured as a percentage of revenue, the operating loss improved from 237% in 2024 to 160% in 2025.
    • In total, OpenAI paid Microsoft $17.2 billion in 2025: $10.59 billion in R&D fees, $6.047 billion in cost of revenue, $527 million in sales and marketing, and $42 million in G&A.
    • Microsoft paid OpenAI just $303 million in the same year, a 56-to-1 imbalance underscoring OpenAI’s Azure dependency.
    • SoftBank paid OpenAI $867 million in 2025.
    • At year-end OpenAI carried $3.64 billion in outstanding payables to Microsoft, plus tens of millions more in accrued and non-current liabilities.
    • OpenAI spent $5.02 billion on Azure inference in just the first half of 2025; Azure inference from 2024 through Q3 2025 totaled $12.43 billion.
    • ChatGPT serves roughly 800 million weekly users, meaning billions of queries a week, each one burning GPU time at Azure’s pricing of about $6.98 per H100 GPU-hour.
    • Gross margin fell from roughly 40% in 2024 to 33% in 2025, because more capable reasoning models consume more compute per query.
    • Research firm Sacra estimates OpenAI’s inference costs reached $8.4 billion in 2025 and will rise to $14.1 billion in 2026, a 68% increase.
    • At year-end OpenAI held just over $50 billion in assets, with almost half in cash.
    • The April 2026 Microsoft renegotiation ended exclusivity and capped revenue-share payments at $38 billion through 2030, down from a projected $135 billion, potentially saving OpenAI up to $97 billion over five years.
    • OpenAI filed a confidential draft S-1 with the SEC around May 22, 2026 and confirmed it publicly on June 8, naming Goldman Sachs and Morgan Stanley as underwriters.
    • The company is targeting a listing as early as September 2026 at a valuation that could exceed $1 trillion, though Sam Altman has said a public offering “may be a while.”
    • OpenAI raised $122 billion earlier in 2026 at a $730 billion pre-money valuation, putting its post-money value around $852 billion.
    • At an $852 billion valuation, OpenAI trades at roughly 65 times its 2025 revenue.
    • Rival Anthropic also filed IPO paperwork this month after raising $65 billion at a $900-$965 billion valuation, making it more valuable on paper than OpenAI, and says it expects to report an operating profit of $559 million in the June quarter.
    • HSBC analysts estimate OpenAI may need more than $207 billion in additional capital through 2030 even under optimistic projections.
    • OpenAI projects profitability by 2029 or 2030; independent analysts put the more likely date at 2031 or later.
    • Bridgewater partner Greg Jensen reportedly told clients the implied revenue multiples price OpenAI for “a monopoly outcome that does not yet exist.”
    • Zitron separately reported OpenAI had a negative 122% non-GAAP operating margin in Q1 2026 and that ChatGPT growth has stalled, with the company projecting paid ChatGPT Plus subscriptions to fall from 44 million in 2025 as users shift toward cheaper tiers in 2026.

    Detailed Summary

    How the leak happened and why it matters now

    The audited documents were obtained and first published by Ed Zitron on his newsletter Where’s Your Ed At, then independently verified by the Financial Times, which reviewed the same materials. That dual sourcing matters: this is not a rumor or a model, it is OpenAI’s actual audited financial statement. The timing is the story. OpenAI filed a confidential draft S-1 with the SEC around May 22, 2026 and confirmed it publicly on June 8. Under SEC rules the full prospectus must be released at least 15 days before an investor roadshow, so the 2025 numbers were going to be public soon regardless. The leak simply moved that disclosure forward, handing prospective investors, enterprise customers, and competitors an early look at the books.

    Revenue tripled, costs grew more in dollar terms

    OpenAI’s revenue rose from $3.7 billion in 2024 to $13.07 billion in 2025, and monthly revenue reached nearly $2 billion by year-end. By almost any normal standard that is spectacular growth. The problem is that costs grew by far more in absolute terms, reaching $34 billion against $12.48 billion the year before: spending rose by $21.52 billion while revenue rose by $9.37 billion. In percentage terms revenue actually grew faster (about 3.5 times versus about 2.7 times for costs), which is why the loss per dollar of revenue improved, but the dollar gap between what OpenAI earns and what it spends has widened every year since its founding, and 2025 is the starkest example yet. Research and development alone exceeded total revenue in each of the last two years.
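
    A quick check of the growth figures, separating relative growth from absolute growth. This is a minimal sketch using only the rounded numbers quoted in this article; the variable names are illustrative.

```python
# Figures in billions of US dollars, as stated in the article.
revenue_2024, revenue_2025 = 3.70, 13.07
costs_2024, costs_2025 = 12.48, 34.00

# Relative growth: how many times larger each line became.
revenue_multiple = revenue_2025 / revenue_2024
cost_multiple = costs_2025 / costs_2024

# Absolute growth: how many dollars each line added.
revenue_added = revenue_2025 - revenue_2024
costs_added = costs_2025 - costs_2024

print(f"Revenue: {revenue_multiple:.2f}x, +${revenue_added:.2f}B")
print(f"Costs:   {cost_multiple:.2f}x, +${costs_added:.2f}B")
```

    The two views point in different directions: revenue grew by the larger multiple, while costs added more dollars, so the operating loss widened even as the loss per dollar of revenue fell.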

    Two loss numbers, and why both matter

    There are two figures that get cited interchangeably and should not be. The operating loss of $20.92 billion is what the business spent beyond what it earned from operations: training models, serving ChatGPT, paying engineers, running marketing. The net loss attributable to OpenAI of $38.53 billion is far larger because 2025 was the year OpenAI completed its conversion from a nonprofit to a for-profit public benefit corporation, finalized on October 28, 2025. That restructuring triggered a $41.55 billion non-cash charge reflecting the changing fair value of convertible equity interests and warrant liabilities. Before the conversion, investors held convertible interest rights that were treated as liabilities under US accounting rules and revalued upward as OpenAI’s valuation climbed, which created the charge. It is not expected to recur. Across all items, the total net loss reached $60.35 billion, reduced to the $38.53 billion attributable figure after removing $21.82 billion tied to noncontrolling and redeemable noncontrolling interests, primarily the OpenAI Foundation’s stake. Strip out the non-cash items and the underlying loss was about $8 billion.
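
    The relationship between the loss figures above can be laid out as simple arithmetic. The sketch below uses only numbers quoted in this article; treating the total net loss as operating loss plus the restructuring charge plus a residual is an assumption made for illustration, and the article does not itemize that residual.

```python
# Figures in billions of US dollars, as stated in the article.
total_net_loss = 60.35          # before noncontrolling interests
noncontrolling_share = 21.82    # attributed to noncontrolling interests
operating_loss = 20.92
restructuring_charge = 41.55    # one-time, non-cash

# Net loss attributable to OpenAI.
attributable_loss = total_net_loss - noncontrolling_share

# Share of the total net loss explained by the one-time charge.
charge_share = restructuring_charge / total_net_loss

# Assumed residual: whatever else sits between operating loss and net loss.
# A negative value means those other items reduced the loss.
other_items_net = total_net_loss - operating_loss - restructuring_charge

print(f"Attributable net loss: ${attributable_loss:.2f}B")
print(f"Charge as share of total net loss: {charge_share:.1%}")
print(f"Other items, net: ${other_items_net:+.2f}B")
```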

    Where the $34 billion went

    The spending breaks into four lines. Research and development was $19.18 billion, the largest category, with $10.59 billion of it flowing to Microsoft for training compute. Cost of revenue, the expense of serving responses to users, was $7.5 billion and captures inference, the compute consumed every time someone prompts ChatGPT or calls the API. Sales and marketing reached $5.73 billion, up 418% year over year, a striking jump for a product that grew largely by word of mouth. General and administrative costs added $1.57 billion. The shape of the spending tells you OpenAI is simultaneously racing to build better models, serve a massive and growing user base, and aggressively defend market share through marketing.
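
    The four lines can be tabulated to see each category's share of spending and its growth. This is a sketch built from the rounded 2024 and 2025 figures quoted in this article, so a computed growth rate can differ slightly from a percentage derived from unrounded statements (sales and marketing comes out near 416% here against the 418% cited).

```python
# (2024, 2025) figures in billions of US dollars, as stated in the article.
costs = {
    "Research and development": (7.81, 19.18),
    "Cost of revenue": (2.65, 7.50),
    "Sales and marketing": (1.11, 5.73),
    "General and administrative": (0.907, 1.57),
}

total_2024 = sum(prev for prev, _ in costs.values())
total_2025 = sum(curr for _, curr in costs.values())

# Each category's share of 2025 spending and its year-over-year growth.
shares_2025 = {name: curr / total_2025 for name, (_, curr) in costs.items()}
growth = {name: curr / prev - 1 for name, (prev, curr) in costs.items()}

for name in costs:
    print(f"{name:<28} {shares_2025[name]:6.1%} of 2025 costs, "
          f"{growth[name]:+.0%} vs 2024")
print(f"Totals: ${total_2024:.2f}B in 2024, ${total_2025:.2f}B in 2025")
```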

    The Microsoft dependency

    The most striking single disclosure is the scale of the Microsoft relationship. OpenAI paid Microsoft $17.2 billion in 2025: $10.59 billion in R&D fees for model training, $6.047 billion in cost-of-revenue for inference serving, $527 million in sales and marketing, and $42 million in G&A. Microsoft paid OpenAI just $303 million the same year. SoftBank paid OpenAI $867 million. The 56-to-1 ratio between what OpenAI pays Microsoft and what Microsoft pays back makes the structural reality plain: Microsoft is OpenAI’s largest landlord. The dynamic began shifting in April 2026, when the two renegotiated, ending Microsoft’s exclusivity and capping revenue-share payments at $38 billion through 2030, down from a projected $135 billion. That could save OpenAI up to $97 billion over five years, though Microsoft keeps its IP license through 2032 and remains the primary cloud partner.
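
    The Microsoft figures above can be cross-checked against the cost lines they sit in. A minimal sketch using only the numbers quoted in this article; the dictionary keys are illustrative labels.

```python
# Figures in billions of US dollars, as stated in the article.
paid_to_microsoft = {
    "R&D": 10.59,
    "Cost of revenue": 6.047,
    "Sales and marketing": 0.527,
    "G&A": 0.042,
}
received_from_microsoft = 0.303
total_costs = 34.0
rd_total, cost_of_revenue_total = 19.18, 7.50

total_paid = sum(paid_to_microsoft.values())
payment_ratio = total_paid / received_from_microsoft

# How much of each cost line, and of all spending, flowed to Microsoft.
share_of_all_costs = total_paid / total_costs
share_of_rd = paid_to_microsoft["R&D"] / rd_total
share_of_cost_of_revenue = paid_to_microsoft["Cost of revenue"] / cost_of_revenue_total

# Difference between the projected revenue share and the renegotiated cap.
cap_savings = 135 - 38

print(f"Paid to Microsoft: ${total_paid:.3f}B ({payment_ratio:.1f} to 1)")
print(f"Share of all costs: {share_of_all_costs:.1%}")
print(f"Share of R&D: {share_of_rd:.1%}")
print(f"Share of cost of revenue: {share_of_cost_of_revenue:.1%}")
print(f"Revenue-share cap savings: up to ${cap_savings}B")
```

    On these inputs the ratio lands just under 57 to 1, so the 56-to-1 figure is a truncation rather than a rounding.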

    Why inference is the core problem

    Training happens once. Serving happens billions of times a day. When OpenAI releases a model it spends months and billions on training compute, a fixed cost that falls away when training ends. Inference is the opposite: every ChatGPT message runs through the model on Azure GPU hardware, consuming electricity and compute to generate a response. With roughly 800 million weekly users, that is billions of queries a week, each burning GPU time at roughly $6.98 per H100 GPU-hour on demand. OpenAI spent $5.02 billion on Azure inference in the first six months of 2025 alone. Sacra estimates full-year inference costs of $8.4 billion in 2025, rising to $14.1 billion in 2026. This is why gross margin fell from about 40% to 33% even as revenue tripled: more capable reasoning models consume far more compute per query, and revenue has not kept pace with the cost growth that capability generates.
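
    To get a feel for the scale, the first-half inference bill can be converted into GPU-hours. The sketch below assumes, purely for illustration, that every dollar was billed at the roughly $6.98 on-demand H100 rate quoted above. OpenAI's actual contracted pricing is not disclosed in this article and is likely lower, so the real GPU-hour count would be higher.

```python
# Figures as stated in the article.
azure_inference_h1_2025 = 5.02e9   # US dollars, first half of 2025
h100_price_per_hour = 6.98         # US dollars, on-demand rate (assumed to apply)

# Implied GPU-hours if all spend were billed at the on-demand rate.
gpu_hours = azure_inference_h1_2025 / h100_price_per_hour

# Hours between January 1 and June 30, 2025 (181 days).
hours_in_h1 = 181 * 24

# Equivalent number of H100s running around the clock for the half-year.
average_gpus = gpu_hours / hours_in_h1

# Growth implied by the Sacra estimates, in billions of US dollars.
sacra_2025, sacra_2026 = 8.4, 14.1
sacra_growth = sacra_2026 / sacra_2025 - 1

print(f"Implied GPU-hours: {gpu_hours / 1e6:.0f} million")
print(f"Equivalent always-on H100s: {average_gpus:,.0f}")
print(f"Sacra 2025 to 2026 growth: {sacra_growth:.0%}")
```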

    What it means for the IPO and the race with Anthropic

    OpenAI was last valued around $852 billion post-money after raising $122 billion in early 2026, which puts it at roughly 65 times 2025 revenue. It has named Goldman Sachs and Morgan Stanley as underwriters and is targeting a listing as early as September 2026 at up to a $1 trillion valuation, though Altman has hedged that it “may be a while” and that staying private might be the better course. HSBC estimates the company may need more than $207 billion in additional capital through 2030. The race is with Anthropic, which filed paperwork the same month after raising $65 billion at a $900-$965 billion valuation, making it more valuable on paper, and which says it expects a $559 million operating profit in the June quarter. The contrast is sharp: the two leading AI labs heading toward public markets at the same time, one bleeding cash at scale, the other claiming profitability, both asking investors to bet on a future that has not arrived.
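
    The valuation figures above reduce to two small calculations. This sketch uses only numbers quoted in this article, and takes the $1 trillion figure as the upper end of the reported IPO target.

```python
# Figures in billions of US dollars, as stated in the article.
pre_money = 730
raised = 122
revenue_2025 = 13.07
ipo_target = 1000

# Post-money valuation is pre-money plus the new capital raised.
post_money = pre_money + raised

# Price-to-sales multiples on trailing 2025 revenue.
current_multiple = post_money / revenue_2025
ipo_multiple = ipo_target / revenue_2025

print(f"Post-money valuation: ${post_money}B")
print(f"Multiple at last round: {current_multiple:.1f}x 2025 revenue")
print(f"Multiple at $1T target: {ipo_multiple:.1f}x 2025 revenue")
```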

    Notable Quotes

    “The financial condition of OpenAI is deeply concerning. $38.53 billion in losses are astronomical, and far higher than most believed it would be. Losses also appear to be mounting year-over-year at a dramatic rate, and I’m not sure how this company finds a way toward any kind of sustainability or profitability.”

    Ed Zitron, the independent journalist who published the leaked audited financials

    “It’s unclear what this means, nor how OpenAI reconciled the removal of $3.74 billion in costs. I will not speculate further.”

    Ed Zitron, on a discrepancy he found in the restated 2024 figures

    “OpenAI’s two biggest expenses are R&D and marketing. Budget cuts there, coupled with an ability to raise prices or win new sources of revenue, could see the company move into the black over time. Cutting R&D would be the most difficult part of that, given that AI companies can only hold onto their customers by generating the best-performing models.”

    Jim Edwards, Fortune, on whether OpenAI has a realistic path to profitability

    “What the audited documents make impossible to argue is that the path to profitability is short, clear, or cheap.”

    TechTimes analysis of the leaked OpenAI financials

    The implied revenue multiples price OpenAI for “a monopoly outcome that does not yet exist.”

    Bridgewater partner Greg Jensen, reportedly telling clients how to read OpenAI’s valuation

    “OpenAI spent $34bn last year as the ChatGPT maker poured money into a race to dominate the fast-growing AI market ahead of a planned stock market listing.”

    George Hammond and Bryce Elder, Financial Times, framing the audited 2025 spend

    Ed Zitron’s original reporting has the full breakdown, and the Financial Times published the confirmation.

    Related Reading

    • Ed Zitron, Where’s Your Ed At: the primary source that broke the audited 2025 financials with the full line-by-line breakdown.
    • OpenAI (Wikipedia): background on the company’s history, structure, and the nonprofit-to-for-profit conversion that drives the non-cash charge.
    • Inference (Wikipedia): on the recurring compute cost that explains why OpenAI’s gross margin shrinks as usage grows.
    • Anthropic: the rival lab that filed IPO paperwork the same month at a higher valuation and claims it is already operating at a profit.
    • SEC on confidential filings: context for why OpenAI’s audited numbers were headed for public disclosure regardless of the leak.

  • US Government Orders Anthropic to Suspend Claude Fable 5 and Mythos 5: Inside the Export Control Directive, the Jailbreak Dispute, and What It Means for Frontier AI

    On June 12, 2026, Anthropic published a statement announcing that the US government, citing national security authorities, has issued an export control directive forcing the company to suspend all access to its newest frontier models, Claude Fable 5 and Claude Mythos 5. The order technically targets foreign nationals inside and outside the United States, including Anthropic’s own foreign national employees, but the practical effect is that both models are going dark for every customer worldwide. It is the first publicly known instance of the US government ordering a deployed frontier AI model offline, and Anthropic is complying while openly disputing the basis for the decision.

    TLDR

    The US government delivered an export control directive to Anthropic at 5:21pm ET on June 12, 2026, suspending all access to Fable 5 and Mythos 5 over an alleged jailbreak of Fable 5’s safeguards. Anthropic says the letter contained no specific details, that the only evidence shared was verbal, and that the technique in question amounts to asking the model to read a codebase and fix software flaws, a capability the company says is freely available from other models including OpenAI’s GPT-5.5 and used daily by cyber defenders. Anthropic defends its defense in depth strategy, notes that thousands of hours of red teaming by the US government, the UK AISI, and third parties found no universal jailbreak, and warns that recalling a commercial model over a narrow, non-universal jailbreak would effectively halt all new frontier model deployments if applied industry-wide. Access to all other Anthropic models, including Claude Opus, Sonnet, and Haiku, is unaffected, and the company says it believes the situation is a misunderstanding and is working to restore access, with more details promised within 24 hours.

    Thoughts

    This is a watershed moment regardless of how it resolves. Governments have blocked AI exports before, but ordering a deployed commercial model recalled out from under hundreds of millions of users is a new kind of intervention, closer to a product recall than a trade restriction. The mechanism matters too. Export control authority aimed at foreign nationals, including a company’s own employees, that cascades into a global shutdown is a blunt instrument doing the work of a regulatory regime that does not exist yet. The US has no statutory process for recalling an AI model, so the government reached for the closest tool on the shelf, and the result is a precedent built on improvisation.

    There is real irony in who got hit first. Anthropic has spent years arguing, publicly and in Washington, that governments should have the power to block unsafe AI deployments. Now the company that asked for a referee is the first one whistled, and its complaint is not about the existence of the power but about the process: a letter at 5:21pm with no specifics, verbal evidence only, and no transparent or technically grounded procedure. That distinction is the whole ballgame for AI governance. A power to halt deployments without due process standards is not regulation, it is discretion, and discretion cuts in every direction depending on who holds it.

    The technical dispute underneath is genuinely interesting because it exposes how unsettled the definition of a dangerous jailbreak is. Anthropic’s account of the offending technique, asking the model to read a specific codebase and fix any software flaws, describes something security teams do on purpose every single day. Vulnerability discovery is the canonical dual use capability: the same analysis that lets a defender patch a hole lets an attacker find one. If the bar for recall is that a model can be coaxed into doing competent security analysis, then every capable model on the market fails that bar, which is exactly Anthropic’s point about GPT-5.5. The hard question the directive dodges is not whether Fable 5 can find bugs but whether it provides meaningful uplift beyond what is already freely available, and Anthropic says it does not.

    For builders, the immediate lesson is uncomfortable: model availability is now a political variable, not just an engineering one. Teams that built directly on Fable 5 lost a production dependency overnight through no fault of Anthropic’s infrastructure, their own code, or any terms of service violation. Multi-model fallback strategies, abstraction layers over providers, and graceful degradation paths just moved from nice-to-have to table stakes for anyone running serious workloads on frontier models. The companies that absorbed this outage gracefully are the ones that assumed any single model could vanish.
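
    A multi-model fallback of the kind described above can be sketched in a few lines. Everything here is hypothetical: the `Provider` wrapper, the `ModelUnavailable` exception, and the stand-in adapter functions are illustrative names, not any vendor's SDK. A real adapter would wrap an actual client call and translate its errors.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Raised by an adapter when its model cannot serve requests."""


@dataclass(frozen=True)
class Provider:
    name: str
    complete: Callable[[str], str]  # hypothetical adapter: prompt in, text out


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in order and return (provider_name, response)."""
    failures = []
    for provider in providers:
        try:
            return provider.name, provider.complete(prompt)
        except ModelUnavailable as exc:
            # Record the failure and move on to the next provider.
            failures.append(f"{provider.name}: {exc}")
    raise ModelUnavailable("all providers failed: " + "; ".join(failures))


# Stand-in adapters used only to demonstrate the control flow.
def suspended_model(prompt: str) -> str:
    raise ModelUnavailable("access suspended")


def backup_model(prompt: str) -> str:
    return f"[backup] {prompt}"


chain = [Provider("primary", suspended_model), Provider("backup", backup_model)]
used, reply = complete_with_fallback("Summarize this ticket.", chain)
print(used, reply)
```

    The design choice worth noting is that application code depends only on the chain, never on a specific model, so a suspended model becomes a logged failure rather than an outage.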

    The next 24 hours matter more than the directive itself. Anthropic has promised more details, and the government will face pressure to either substantiate a concern that justifies a global recall or quietly walk it back. Either outcome sets the real precedent. If the directive holds on thin evidence, every frontier lab now operates under the threat of arbitrary shutdown. If it collapses under scrutiny, the case for a formal, transparent statutory process for AI deployment decisions, which Anthropic explicitly endorses in its own statement, gets a lot stronger in Congress than it was a week ago.

    Key Takeaways

    • The US government issued an export control directive on June 12, 2026 suspending all access to Claude Fable 5 and Claude Mythos 5, citing national security authorities.
    • The directive formally targets access by any foreign national, inside or outside the United States, including Anthropic’s own foreign national employees.
    • The net effect is that Anthropic must disable Fable 5 and Mythos 5 for all customers worldwide to ensure compliance, not just for foreign users.
    • Access to all other Anthropic models, including the Claude Opus, Sonnet, and Haiku families, is not affected by the order.
    • Anthropic received the directive at 5:21pm ET the same day it published its statement, and says the letter did not provide specific details of the national security concern.
    • Anthropic’s understanding is that the government believes it has become aware of a method of bypassing, or jailbreaking, Fable 5’s safeguards.
    • Anthropic reviewed a demonstration of the specific technique and says the technique surfaced only a small number of previously known, minor vulnerabilities.
    • The company says other publicly available models can discover the same vulnerabilities without requiring any bypass at all.
    • Before launch, Fable 5’s safeguards were red-teamed for thousands of hours in total by the US government, the UK AISI, multiple private third-party organizations, and internal teams.
    • No tester has found a universal jailbreak for Fable 5, meaning a method that broadly bypasses safeguards and unlocks a wide range of cyber capabilities.
    • Anthropic openly states that perfect jailbreak resistance does not appear possible for any model provider today, and that every safeguard in the industry is vulnerable to non-universal jailbreaks.
    • Fable 5 was deployed under a defense in depth strategy: make jailbreaks either narrow or very expensive to produce, then combine that with monitoring to quickly detect and shut down successful attacks.
    • Anthropic’s 30-day customer data retention requirement for Fable exists specifically to support jailbreak research and mitigation, a policy the company says carries real costs with customers.
    • Anthropic says it has not received any disclosure of a concerning non-universal jailbreak that led to a harmful result; disclosed potential jailbreaks were benign or provided no Mythos-specific uplift.
    • The only evidence the government has provided is verbal, describing a narrow, non-universal jailbreak that essentially consists of asking the model to read a specific codebase and fix any software flaws.
    • Anthropic reviewed a report it believes is the basis of the directive and validated that the capability level shown is widely available from other models, including OpenAI’s GPT-5.5, and is used every day by cyber defenders.
    • Anthropic is complying with the legal directive while explicitly disagreeing that a narrow potential jailbreak justifies recalling a commercial model deployed to hundreds of millions of people.
    • The company warns that if this recall standard were applied across the industry, it would essentially halt all new model deployments for every frontier model provider.
    • Anthropic supports government power to block unsafe deployments in principle, but only through a statutory process that is transparent, fair, clear, and grounded in technical facts, and says this action meets none of those principles.
    • Anthropic apologized to customers, called the situation a misunderstanding, said it is working to restore access as soon as possible, and promised more details within 24 hours.

    Detailed Summary

    What the directive actually does

    The order arrived as a letter from the US government at 5:21pm ET on June 12, 2026, invoking national security authorities under export control law. On paper it suspends access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, a category that includes some of Anthropic’s own employees. In practice, Anthropic says compliance requires abruptly disabling both models for every customer, since there is no clean way to enforce a nationality-based access boundary across a global product. The letter did not spell out the specific national security concern. Everything else in Anthropic’s statement is the company’s own reconstruction of what prompted the action.

    The jailbreak at the center of the dispute

    Anthropic’s understanding is that the government became aware of a method for bypassing Fable 5’s safeguards. The company reviewed a demonstration of the technique and characterizes the results as a small number of previously known, minor vulnerabilities, all relatively simple, all discoverable by other publicly available models without any jailbreak at all. According to Anthropic, the government’s evidence so far has been entirely verbal, and the technique boils down to asking the model to read a specific codebase and fix any software flaws. The company reviewed a report it believes underlies the directive and validated that the displayed capability is widely available elsewhere, naming OpenAI’s GPT-5.5 directly, and noted that this exact kind of analysis is what defenders use to keep systems safe.

    Anthropic’s defense in depth posture

    The statement restates the safety posture Anthropic laid out at Fable 5’s launch. The safeguards around cybersecurity tasks are strong enough that users have complained they are overly broad. In the weeks before launch, the US government, the UK AISI, multiple private third-party organizations, and internal teams red-teamed the safeguards for thousands of hours combined, and those tests showed Fable’s protections to be substantially more effective than those of any previously deployed model. No tester found a universal jailbreak. Anthropic is candid that perfect jailbreak resistance is likely impossible for anyone today, which is why the strategy is defense in depth: keep jailbreaks narrow or expensive, monitor aggressively, and shut down attacks fast. The 30-day customer data retention requirement on Fable exists to support that monitoring and mitigation loop. The company says this posture makes Fable’s risks comparable to those of models already deployed across the industry.

    Complying while disputing the standard

    Anthropic is removing access for all users as legally required, but the statement draws a hard line on the principle. The company disagrees that a narrow potential jailbreak, one that produced no disclosed harmful result, justifies recalling a commercial model serving hundreds of millions of people. Its broader warning is that this standard, applied evenly, would halt all new frontier model deployments industry-wide, since every provider’s safeguards are vulnerable to narrow jailbreaks. Anthropic also turns its own policy position into a critique: the company has publicly supported giving government the ability to block unsafe deployments, but through a statutory process that is transparent, fair, clear, and grounded in technical facts, and it says this action does not adhere to those principles.

    What happens next

    Anthropic closed by apologizing to customers, calling the situation a misunderstanding, and committing to restore access as soon as possible. The company promised to share more details over the next 24 hours, which makes this a developing story. The open questions are whether the government substantiates its concern with written technical evidence, whether the directive survives that scrutiny, and whether this episode accelerates the formal statutory process for AI deployment decisions that Anthropic says should have governed the action in the first place.

    Notable Quotes

    “The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance.”

    Anthropic, on why a directive aimed at foreign nationals becomes a global shutdown

    “We received the directive from the government today at 5:21pm (ET). The letter did not provide specific details of its national security concern.”

    Anthropic, on the abruptness and opacity of the order

    “These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass.”

    Anthropic, on its review of the demonstrated jailbreak technique

    “We suspect that perfect jailbreak resistance is not currently possible for any model provider.”

    Anthropic, restating the position it disclosed at Fable 5’s launch

    “We stand by this defense in depth strategy. It reduces the risks posed by Fable, making them comparable to the risks of existing models already deployed across the industry.”

    Anthropic, defending its layered safeguards approach

    “To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws.”

    Anthropic, describing the technique behind the directive

    “However, we disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people.”

    Anthropic, on complying while contesting the decision

    “If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.”

    Anthropic, on the industry-wide implications of the recall standard

    “As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles.”

    Anthropic, on the kind of oversight process it says should have governed the action

    “We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible.”

    Anthropic, closing its statement to customers

    Read the full statement on Anthropic’s site here.

  • Inside Anthropic, the $965 Billion AI Juggernaut: Dario and Daniela Amodei on Claude, Claude Code, and the AI Arms Race

    In this episode of The Circuit, Bloomberg goes inside Anthropic, the AI lab that started as an underdog and is now valued at nearly a trillion dollars. The conversation centers on the siblings running the company, Dario Amodei, the visionary, and Daniela Amodei, the operator, along with Boris Cherny, the engineer behind Claude Code and Claude Cowork. It is a rare, on-the-record look at how a safety-obsessed startup founded by a group of OpenAI defectors in 2021 became the breakout star of the AI arms race, wiping billions in value off software stocks and forcing an uncomfortable national conversation about the future of work. You can watch the full episode here.

    TLDW

    Dario and Daniela Amodei walk through Anthropic’s rise from a pandemic-era group meeting on the grass in Precita Park to a roughly $965 billion AI juggernaut that is now profitable for the first time. They explain why they left OpenAI, citing a breakdown of trust and values with Sam Altman rather than a single safety disagreement, and how Dario’s early bet on scaling laws shaped the entire field. The two describe how Claude is trained for character and “professional warmth,” anchored in documents like the UN’s Universal Declaration of Human Rights, and how the company defines a good model as one that does not lie, hallucinate, or deceive. The business story is enterprise and coding: Claude Code and Claude Cowork automated huge chunks of software engineering, triggered a SaaSpocalypse that erased $285 billion in market value overnight, and pushed first-quarter growth to a pace that annualizes to roughly 80x. Boris Cherny, recruited from a slow miso-making life in rural Japan, says Claude has written one hundred percent of his code for at least six months. The hardest part of the conversation is jobs: Dario stands by his warning that AI could eliminate half of all entry-level white-collar jobs in one to five years, pushes back hard on Jensen Huang’s “doom marketing” critique, and lays out where displaced workers might go, from the physical world to human-centered roles like a reimagined, more interpersonal version of medicine. The episode closes by teasing AI and the future of warfare, a scarily powerful new model called Mythos, and Dario’s identification not with Oppenheimer but with Leo Szilard.

    Thoughts

    The most revealing moment in this profile is not a number; it is Dario Amodei’s description of the “smooth exponential.” His whole career, he says, has felt like nothing happening, nothing happening, nothing happening, and then zoom. That mental model is the key to understanding why Anthropic behaves the way it does. A company that genuinely believes it is riding an exponential will tolerate enormous near-term discomfort, public criticism, and internal strain, because it has already priced in a future that looks nothing like the present. Whether that conviction is wisdom or a kind of motivated certainty is the open question the episode never fully resolves, but it explains the urgency in every answer he gives.

    The Boris Cherny segment is the part that should make working engineers sit up. When a senior engineer says Claude has written one hundred percent of his code for at least six months and that he feels like he has a jet pack, that is not a marketing line. It is a description of a job that has already changed underneath the person doing it. The framing in the piece is optimistic, all superpowers and fun, but the logical endpoint is exactly the one Dario himself names a few minutes later: you automate ninety percent of a job, the remaining humans get ten times more leveraged, and then the curve keeps bending toward one hundred percent. Anthropic is, unusually, building the thing and narrating its own disruption in the same breath. That honesty is rare, and it is also a little vertiginous.

    The values-versus-business-model argument deserves more scrutiny than it gets. Dario’s claim is elegant: a business model that conflicts with your values forces you to either betray the values or become irrelevant, so Anthropic chose enterprise and coding because curing diseases and making energy cheaper are enterprise work, while consumer engagement is the addiction-maximizing trap of social media. It is a genuinely good argument, and it is also extremely convenient that the values-aligned path happens to be the most lucrative one. The episode lets that tension sit, which is the right call. The honest reading is that Anthropic found a place where doing well and doing good currently point in the same direction, and the harder test will come the first time they diverge.

    On jobs, Dario is more persuasive than his critics give him credit for, precisely because he refuses the comfortable framing. Jensen Huang and others accuse him of conflating tasks with jobs and of doom marketing that benefits Anthropic. Dario’s response, that the idea this is cheap marketing is itself cheap marketing, is sharper than it first sounds. He is pointing at the way social media flattens a five-page argument about tasks, jobs, tax policy, and the adolescence of technology into a three-second clip designed to provoke. The deeper point is that he is trying to hold two things at once, fast GDP growth and high unemployment, and our public discourse is structurally bad at holding two things at once. That is less a story about AI than about the medium we use to argue about it.

    Finally, the Oppenheimer exchange reframes the entire profile. Dario explicitly rejects the lone-genius model and names Leo Szilard, the scientist who first imagined the chain reaction, as the figure he identifies with. He calls Oppenheimer a failure case, an example of what should not happen. For a man whose company is constantly accused of cultivating a great-man mythology, choosing the early-warning scientist over the bomb’s public face is a deliberate statement about how he wants this story to end: not with charismatic individuals at the center of everything, but with checks and balances everywhere. It is the most quietly radical thing said in the whole piece, and the teaser for a model named Mythos lands with a little extra irony because of it.

    Key Takeaways

    • Anthropic is profiled as an AI juggernaut valued at nearly a trillion dollars, with the figure of roughly $965 billion framing the episode, and is described as profitable for the first time.
    • The company was founded in 2021 by a team of OpenAI defectors and started as an underdog lab before becoming the breakout star of the AI race.
    • Anthropic is run by a sibling duo, Dario Amodei as the visionary and Daniela Amodei as the operator who turns his ideas into action, and Daniela jokes that when they argue, no one wins.
    • Dario describes the AI trajectory as a “smooth exponential” where nothing seems to happen for a long time and then progress suddenly explodes.
    • He says he predicted from a graph that Anthropic would become the AI company with the highest revenue and valuation around this time, and that it has happened.
    • Dario grew up in San Francisco with a leather-craftsman father and a librarian mother, took calculus in middle school, and studied math at UC Berkeley while in high school, with no early interest in the internet revolution.
    • Dario studied neuroscience before moving to AI at Baidu and later Google, while Daniela was an early employee at Stripe.
    • Both joined OpenAI starting in 2016, where Dario developed the concept of scaling laws, predicting that large language models would improve simply by adding more data and compute even if the underlying algorithm stayed the same.
    • Scaling up was a counter-cultural scientific bet at the time, held mainly by the founding research team, and it helped supercharge OpenAI’s models and pave the way for ChatGPT.
    • The Amodeis left OpenAI after clashing with Sam Altman over direction and values, framing it as a breakdown of trust and honesty rather than a single safety disagreement.
    • Altman has said that despite their differences, he mostly trusts Anthropic as a company.
    • All seven of Anthropic’s co-founders are still at the company, which Dario notes almost never happens at a company of its size.
    • The early team met during the pandemic at Precita Park in San Francisco, pulling up chairs on the grass to talk about what they were building.
    • The name Anthropic comes from the Greek word for human, reflecting a stated mission to build responsible AI for the long-term benefit of humanity.
    • Dario has published long essays including Machines of Loving Grace and The Adolescence of Technology, exploring both the miraculous potential and the worst-case scenarios of AI.
    • Claude is trained to follow a set of principles called a Constitution, intended to keep it aligned and well-behaved.
    • Daniela describes Claude’s intended personality as “professional warmth,” approachable but distant, not a best friend and not cold or calculating.
    • A good model, in Anthropic’s framing, does not lie, whether intentionally or accidentally, and lying here includes hallucinations, where the model invents something it does not know.
    • Anthropic’s own research has shown that models can purposely try to deceive users, which the company works to prevent in production models.
    • There is no universal standard for helpfulness or harmlessness, so Anthropic draws on founding documents like the UN’s Universal Declaration of Human Rights to train Claude’s character.
    • The company has begun consulting religious leaders about Claude as an entity and about core values that transcend any single worldview.
    • Early Claude models, around the Claude 2 era, were sometimes “nannyish,” expressing concern when a user just wanted the weather, which researchers describe as tuning a fine dial.
    • Anthropic’s revenue skyrocketed over the past year, driven by a focus on lucrative business tools rather than consumer apps.
    • Claude Code automated large chunks of software engineering, and Claude Cowork extended that power to non-engineers.
    • Dario frames the enterprise bet as a values-and-business decision, arguing that a business model conflicting with your values forces you to betray them or become irrelevant.
    • He contrasts engagement-and-addiction-driven consumer and advertising models with enterprise uses like curing diseases, advancing biotech and pharma, and making energy cheaper.
    • Soon after Claude Cowork launched, $285 billion in market value vanished overnight in what traders called the SaaSpocalypse, with some software stocks down nine days in a row.
    • Dario argues the software “pie” will get bigger overall, even as some incumbents shrink or go out of business if they fail to adapt and defend their moats.
    • Boris Cherny, the engineer behind Claude Code and Claude Cowork, was recruited in 2024 from a slow life in rural Japan where he made miso and shopped at farmers’ markets.
    • Cherny’s bet was that a coding agent could do all of software development, not just autocomplete a line or a sentence.
    • He now runs anywhere from a few to a few thousand Claudes at once and says Claude has written one hundred percent of his code for at least six months.
    • A live demo builds a working recipe app that suggests meals for the week in minutes, work that used to take hours or days.
    • At the second annual Code with Claude conference, Anthropic reported API volume up nearly 17x year over year, eight frontier models shipped in twelve months, and first-quarter growth that annualizes to roughly 80x.
    • Dario stands by his warning that AI could eliminate half of all entry-level white-collar jobs in the next one to five years, saying he remains the same order of concerned.
    • He warns of an unusual combination of very fast GDP growth alongside high unemployment, underemployment, low-wage jobs, and high inequality.
    • Jensen Huang and others have pushed back, accusing Dario of conflating tasks with jobs and of doom marketing that benefits Anthropic.
    • Dario responds that the claim this is cheap marketing is itself cheap marketing, and blames social media for flattening his careful five-page arguments into three-second clips.
    • Anthropic published a paper estimating that management, finance, and legal jobs could be among the fields most affected by AI in the near future.
    • Dario points to the physical world, human-centered relationship-driven work, and humans directing AI as places displaced workers might go, though he is unsure how plentiful those roles will be.
    • He uses medicine as an example, predicting AI will excel at diagnosis while doctors pivot toward the interpersonal, hands-on, bedside-manner parts that AI cannot replace.
    • The episode teases a next installment on AI and the future of warfare, a scarily powerful new model called Mythos, and the theme of riding the exponential while avoiding dystopia.
    • Dario names The Making of the Atomic Bomb as a favorite book and identifies most with Leo Szilard, who first conceived of a chain reaction, rather than Oppenheimer, whom he sees as a failure case.
    • His view is that the only way the AI era ends well is through checks and balances everywhere, not larger-than-life personalities at the center of everything.

    Detailed Summary

    An unlikely AI celebrity and a sibling-run juggernaut

    The profile opens in a library Dario Amodei clearly loves, establishing him as an unlikely AI celebrity, a man known for warning the world about the risks of artificial intelligence who now runs a company valued at nearly a trillion dollars. Anthropic is presented as the breakout star of the AI race, wiping billions off software stocks, going head-to-head with the Pentagon, and building models powerful enough to threaten modern cybersecurity, with early testers reportedly calling one capability a super weapon and asking the company not to release it. Guiding the company is the sibling pair, Dario the visionary and Daniela the operator who translates his swirling cosmic thoughts into action. Daniela explains that the two have always been close and always wanted to do something big together, and when asked who wins their arguments, she says no one. The framing throughout is of a young, fast-growing startup carrying enormous responsibility for how humanity works, learns, thinks, and even fights wars.

    The smooth exponential and the road from OpenAI

    Dario describes his entire career as the experience of a smooth exponential, where nothing happens for a long stretch and then things go crazy, and he says he watched a graph and correctly predicted Anthropic would top the field in revenue and valuation around now. His backstory is that of a math prodigy in San Francisco, the son of a leather craftsman and a librarian, who took calculus in middle school and Berkeley math classes in high school, was indifferent to the internet revolution, and was drawn instead to science fiction and understanding the universe. Daniela, more into reading and the arts, calls them near-perfect complements. Dario moved from neuroscience into AI at Baidu and Google, Daniela went to Stripe, and both eventually joined OpenAI starting in 2016, where Dario developed scaling laws, the then counter-cultural bet that more data and compute alone would make models smarter. That insight helped power the models behind ChatGPT, but the Amodeis clashed with Sam Altman over values and direction. Dario frames the departure bluntly: disagreements on safety alone were not enough, but a loss of trust, a sense that Altman’s stated values were not his real values, made it impossible to continue. The resolution, he says, was simply to go off and do their own thing.

    Precita Park, the Constitution, and teaching Claude to be good

    Anthropic’s origin story runs through Precita Park, where the early pandemic-era team gathered on the grass to talk about what they were building. All seven co-founders are still at the company, a retention record Dario says almost never happens at this scale. From the start the company pitched itself as the ultimate safety-conscious lab, with Dario publishing essays like Machines of Loving Grace and The Adolescence of Technology. Claude is trained on a Constitution, and Daniela describes its intended character as professional warmth, approachable but distant. Asked to define a good model, the team says it should not lie, whether through intentional deception or hallucination, the latter being the model inventing answers it does not actually know. Anthropic’s research has shown models can deliberately deceive, something the company works to prevent in production. Because there is no universal standard for helpfulness or harmlessness, the team anchors Claude’s training in documents like the UN’s Universal Declaration of Human Rights and has begun talking with religious leaders about values that transcend any single worldview. Daniela recalls early “nannyish” Claude 2-era behavior, where the model fretted over a user who only wanted the weather, and describes the work as tuning a fine dial to land in the center.

    The enterprise bet, Claude Code, and the SaaSpocalypse

    Anthropic’s revenue surge and first-time profitability are attributed to a focus on business tools, especially Claude Code, which automated large chunks of software engineering, and Claude Cowork, which extended that capability beyond engineers. Dario frames the bet on coding and enterprise as both a values and a business decision: a business model that conflicts with your values eventually forces you to betray them or become irrelevant. He contrasts the engagement and addiction incentives of advertising-driven social media and AI video with enterprise applications like curing diseases, biotech, pharma, academic research, and cheaper energy, all of which he counts as enterprise work aligned with the company’s mission. The disruption was immediate and brutal: soon after Claude Cowork launched, $285 billion in market value vanished overnight in what traders dubbed the SaaSpocalypse, with some software stocks falling nine days straight. Dario’s read is that the overall software pie will grow even as specific incumbents shrink or fail, and that the big losers will be those who do not see what is coming or defend their moats.

    Boris Cherny, jet packs, and Code with Claude

    Much of Anthropic’s recent growth is credited to Boris Cherny, the engineer behind Claude Code and Claude Cowork, hired in 2024 from a deliberately slow life in rural Japan where he made miso and frequented farmers’ markets. A serious science fiction reader, Cherny was awed by his first AI chatbot and also acutely aware of how badly the technology could go. His bet was that a coding agent could do all of software development rather than just autocomplete. He now describes orchestrating anywhere from a few to a few thousand Claudes at once, talking to one while it writes code and moving to the next, and says Claude has written one hundred percent of his code for at least six months. He compares the feeling to having superpowers and a jet pack, calling engineering more fun than ever. A live demo has Claude build a working weekly-meal recipe app in minutes. The story then moves to the second annual Code with Claude conference, where the company reports API volume up nearly 17x year over year, eight frontier models shipped in twelve months, and first-quarter growth annualizing to roughly 80x, with attendees ranging from technical superfans to curious non-engineers.

    Jobs, the tasks-versus-jobs fight, and a more human medicine

    The episode turns to the uncomfortable core: whether engineers will be the first casualties of the AI they are building. Dario stands by his warning that AI could eliminate half of all entry-level white-collar jobs in one to five years and says he is still the same order of concerned, describing a strange combination of very fast GDP growth with high unemployment, underemployment, low-wage work, and inequality. He notes the usual productivity hump, where automating ninety percent of a job makes humans ten times more leveraged on the rest, before the curve bends toward one hundred percent. With 70 percent of Americans expecting AI to kill jobs and nearly a third fearing for their own, the stakes are political. Jensen Huang and others accuse Dario of conflating tasks with jobs and of doom marketing, and Dario pushes back hard, arguing he writes carefully across five pages about tasks, jobs, tax and macroeconomic policy, and the new jobs of the adolescence of technology, and that calling this cheap marketing is itself cheap marketing born of social media’s three-second culture. Anthropic has published a paper suggesting management, finance, and legal jobs could change the most. Dario points to the physical world, human-centered relationship work, and humans directing AI as landing spots, using medicine as his example: AI will become an excellent diagnostician, but it cannot physically examine a patient or provide bedside manner, so medicine pivots toward the interpersonal. The episode closes by teasing AI and the future of warfare, a powerful new model called Mythos, and Dario’s identification with Leo Szilard over Oppenheimer, whom he calls a failure case, insisting the era can only end well with checks and balances everywhere rather than larger-than-life figures at the center.

    Notable Quotes

    “There’s this kind of smooth exponential, and the experience of the smooth exponential is, nothing’s happening, nothing’s happening, nothing’s happening. Little things happen, and then zoom, it goes crazy.”

    Dario Amodei, on how AI progress actually feels from the inside

    “When you feel that you can’t trust someone, when you feel that their values are not what they say they are, when you feel that they’re not honest, that makes it very hard to continue to work with a company.”

    Dario Amodei, on why he and Daniela left OpenAI

    “Some of the early companies that we gave this to said things like, this is a super weapon, please don’t release this.”

    Anthropic, on early reactions to one of its more powerful models

    “I like to describe it as professional warmth. So the goal is not for it to be your best friend, but it’s not for it to be sort of cold, rote, calculating.”

    Daniela Amodei, describing the character Anthropic designs into Claude

    “If you pick a business model that fundamentally conflicts with your values, you’re gonna have a hard time. Either you betray your own values or you become irrelevant.”

    Dario Amodei, on why Anthropic bet on enterprise and coding

    “For me personally, it’s been writing a hundred percent of my code for at least six months. The work of engineering has just completely changed.”

    Boris Cherny, the engineer behind Claude Code and Claude Cowork

    “I feel like I suddenly have superpowers. I have like a jet pack and the engineering has never been this fun.”

    Boris Cherny, on building software with Claude Code

    “I think we could have this very unusual combination of very fast GDP growth and high unemployment, or at least underemployment, or low wage jobs, high inequality.”

    Dario Amodei, on the economic shock he is most worried about

    “The idea that this is cheap marketing is itself cheap marketing. I think it’s part of the disease of Silicon Valley.”

    Dario Amodei, responding to the doom-marketing accusation

    “The figure I most identified with was Leo Szilard, who was the one who first had the idea that there could be a chain reaction.”

    Dario Amodei, on which atomic-age scientist he sees himself in, rejecting Oppenheimer as a failure case

    Watch the full episode of The Circuit inside Anthropic here.

    Related Reading

    • Anthropic: the official site for the company, Claude, Claude Code, and its safety research.
    • Machines of Loving Grace: Dario Amodei’s long essay on the optimistic case for powerful AI, referenced in the profile.
    • Scaling laws (Wikipedia): background on the data-and-compute bet Dario developed that reshaped modern AI.
    • Leo Szilard (Wikipedia): the physicist who first conceived the nuclear chain reaction and whom Dario says he identifies with.
    • Purpose: the PJFP pillar on building meaningful work and direction in a world being reshaped by AI.
  • Uber CEO Dara Khosrowshahi on AI, Autonomous Vehicles, Robotaxis, Drones, and the Future of Transportation

    Uber CEO Dara Khosrowshahi sat down with Patrick O’Shaughnessy on the Invest Like the Best podcast for a long, candid conversation about the forces remaking transportation. He separates artificial intelligence inside the company from physical AI out in the real world, meaning autonomous vehicles, robotaxis, and delivery drones. He calls the autonomous opportunity another trillion-dollar marketplace and argues it will change how society operates. You can watch the full interview here. What follows is a structured breakdown of the most useful ideas, the strategy behind Uber’s AV bet, and the operating philosophy that runs underneath all of it.

    TLDW

    Dara Khosrowshahi explains how he brought order to the chaos he inherited at Uber in 2017 by treating hard problems like vector mathematics, and how an immigrant childhood shaped his all-in, low-stress operating style. He describes AI hitting Uber on two fronts at once: much larger digital models that predict rider intent, and physical AI that changes how rides and food get fulfilled in the real world. The conversation covers Uber blowing through a full year of AI budget in a single quarter, metering headcount as engineers become superhuman, the more than 30 AV partnerships with Waymo, Nuro, Lucid, Nvidia, Wayve, and Pony AI, and why supply, not demand, is the whole game. It runs through the coexistence model borrowed from travel and Uber Eats, the Uber One membership flywheel at 50 million members, the push from on-demand to planned travel through hotels and Uber Reserve, the economics of cheaper autonomous cars and delivery drones, the regional race from the Middle East to Europe, and the lessons from Barry Diller and Herbert Allen about getting to ground truth and betting on people. It closes on his capital allocation philosophy of prioritizing organic growth and AV commitments over buybacks.

    Thoughts

    The most underappreciated line in the whole interview is the budget one. Blowing a full year of AI spend in a single quarter is the clearest signal yet that frontier intelligence is being consumed far faster than even an AI-native company planned for. Dara’s response has quietly become the default enterprise playbook: explore on the expensive frontier models, then scale the proven interactions onto cheaper or open-source models. The deeper tension is that he is simultaneously telling teams to drive adoption and metering headcount, which is the real story of AI in large companies. The productivity gains are showing up as fewer hires, not only as faster shipping.

    The supply-first framing is the strategic core, and it inverts the demand-first logic he learned at Expedia. In autonomous vehicles this means Uber does not need to win the self-driving race itself. It needs to own the demand layer and aggregate every AV maker’s supply, the same way online travel agents coexist with hotels and Uber Eats coexists with McDonald’s. The 30 percent higher utilization figure for AVs on Uber’s network is the wedge in that argument. It is the reason a Waymo stays on the platform even while building its own brand, because filling more of an expensive asset’s day changes the entire return on the car.

    His premortem answer is unusually honest. Asked what kills the opportunity, he does not name an Uber-specific execution failure. He names AI’s unpopularity with the general public. That is a CEO admitting the gating factor is social license, not technology. The early data he leans on, drivers in Austin and Atlanta earning more and signing up in greater numbers as AVs add incremental demand, is the counter-narrative he is betting the public conversation on. Whether that story holds as AV volume scales from thousands of vehicles to hundreds of thousands is the open risk the entire industry shares.

    Underneath the strategy is one repeated instinct: get to ground truth. It shows up in the Barry Diller story about reading the model from the analyst who built it, in his hunt for the troublemakers who keep a company mutating, and in the fact that he bought an ebike to deliver food in San Francisco. It is the same move applied at every altitude, and it is why he frames AI as a chance to rebuild processes from first principles rather than shave 20 percent off the ones that exist. The leaders who treat AI as an efficiency tool will likely lose to the ones who rebuild from the ground up.

    Key Takeaways

    • Dara took the Uber job in 2017 after Daniel Ek recommended him at the Allen and Company Sun Valley conference and told him, when he hesitated, that life is about impact rather than happiness.
    • He inherited what he calls complete chaos: a board fighting for control, lost trust with regulators and the public, and a committee running the company after Travis Kalanick stepped back.
    • His method for chaos is to treat it like vector mathematics, breaking a seemingly unassailable problem into component dimensions and solving each one.
    • Early moves included bringing in chairman Ron Sugar to unite the board, running a listening tour with stakeholders, and rebuilding the executive team with leaders like Andrew McDonald and Tony West.
    • He credits an engineering mindset and an immigrant childhood for his calm under pressure. His family lost everything leaving Iran when he was nine and rebuilt from nothing.
    • On parenting, he argues that overcoming challenges is what forms people, and that doing everything for your kids is a long-term disservice disguised as a short-term favor.
    • Uber has always operated in a probabilistic real world of traffic, cancellations, and late food, so it has used machine learning longer than most consumer companies.
    • The current inflection is AI on two fronts: larger digital models that predict intent, and physical AI that changes how Uber fulfills in the real world.
    • Uber’s feed and search models are now roughly 10,000 times bigger than the older ones, enabling universal search across rides, eats, and grocery in a single query.
    • Uber can already guess a rider’s destination about three quarters of the time, turning booking into a one-tap interaction.
    • AI adoption is bottoms-up across engineering, legal, and marketing. Developers in India are driving roughly ten times the code commits using autonomous agents.
    • Dara pushes teams to rebuild processes from first principles with AI rather than settling for 20 to 30 percent optimization of an existing process.
    • He wants the rebels and troublemakers to win, and treats unpredictable internal adoption patterns as something to find and promote.
    • Uber blew through its full-year AI budget in a single quarter, which is now forcing it to meter headcount as engineer throughput climbs.
    • The token strategy is to explore on expensive frontier models, then scale proven interactions onto cheaper or open-source models.
    • Uber generates over 10 billion dollars in free cash flow on more than 10 billion trips a year, but it is not a high-margin business, so efficiency funds lower prices and higher earnings.
    • In autonomous vehicles, the thesis is supply: own the demand layer and aggregate every AV maker’s vehicles, the way Uber aggregates drivers and restaurants.
    • Uber has more than 30 AV partnerships, including Waymo, Nuro, Lucid, Nvidia, Wayve, and Pony AI.
    • Uber is building the surrounding ecosystem: depots, charging, fleet partners, a one billion dollar Santander financing line for EV and AV fleets, and autonomous insurance.
    • AVs operating on Uber’s network are about 30 percent busier in trips and revenue per vehicle per day than vehicles not on the network, which transforms the return on an expensive car.
    • The build, partner, or buy answer is coexistence, mirroring how travel agents coexist with hotels and airlines and how Uber Eats coexists with McDonald’s, Starbucks, and Chipotle.
    • His public premortem is that AI’s unpopularity, not Uber-specific execution, is the biggest risk, so the company must move at the pace society will accept to avoid backlash.
    • Early data in Austin and Atlanta shows drivers earning more and more drivers joining, suggesting AVs are adding incremental demand rather than only displacing humans.
    • AV hardware costs typically fall 30 to 40 percent per generation. A Lucid midsize built with Nuro could land around 60,000 to 70,000 dollars and bring transportation costs down.
    • Lower cost expands demand. Uber already dwarfs the taxi market it was once sized against, and Dara expects the same dynamic with AVs.
    • Traditional OEMs are now investing in L4-ready systems and should arrive over the next two to four years. Each AV drives roughly three to four times what a human driver does.
    • Chinese manufacturing capability and bill of materials are described as unrivaled. A low-cost Western, Foxconn-style player for AVs is being worked on but does not exist yet.
    • Drones are gated by battery density. Food and grocery drones should reach real scale in two to five years and become normal in five to ten, with Joby and Zipline cited as examples.
    • The Middle East, including Abu Dhabi, Dubai, and Saudi Arabia, is moving fastest thanks to entrepreneurial regulators. Europe is catching up, with London robotaxi pilots expected before year end.
    • Uber Eats wins the number one position more often internationally. The playbook is selection plus reliability, amplified by cross-platform upsell, with about 13 percent of Eats bookings coming from the mobility app.
    • Uber One has 50 million members growing 50 percent year on year. Dara frames it like Netflix, more content for the same price, and accepts a first-year loss for multi-year profit.
    • Uber is pushing from on-demand to planned through hotels, via a deal with Expedia, and through Uber Reserve, now at over a 5 billion dollar run rate with 99 percent-plus reliability.
    • His leadership lessons: from Barry Diller, get to ground truth from source material and tell the truth as a leader. From Herbert Allen, bet on people, not companies.
    • On capital allocation, he prioritizes organic growth and financialized AV commitments over buybacks, while keeping costs growing slower than revenue.

    Detailed Summary

    From chaos to structure: the 2017 turnaround

    Dara came to Uber from 13 years running Expedia under Barry Diller, recruited through a head hunter after Daniel Ek floated his name at the Sun Valley conference. He arrived into what he describes as complete chaos, with the board fighting over control rather than the fate of the company and trust badly damaged with regulators, the public, and employees. His approach was to decompose the situation the way an engineer decomposes a multidimensional problem, solving each dimension and reassembling the whole. Practically that meant a new chairman in Ron Sugar to unite the board, a listening tour to understand stakeholder concerns, and a rebuild of the leadership team that kept strong insiders like Andrew McDonald while adding people like Tony West.

    An engineering mind and an immigrant chip on the shoulder

    His wife Sid calls him a robot, by which she means he does not get rattled. He traces that to an engineering education and to a childhood upheaval. His family left Iran when he was nine and lost the business his father had built, and he watched that loss diminish his father over the years. The experience produced a durable drive to rebuild and a refusal to let external chaos define him internally. He applies a similar philosophy to his kids, arguing that challenges and the act of overcoming them are what form a person, and that helicopter parenting removes the very friction that builds capability.

    AI inside Uber: prediction, agents, and superhuman engineers

    Uber has always lived in a probabilistic world where the digital booking is deterministic but the real-world fulfillment is not, so it adopted machine learning earlier than most consumer companies. The newest models are roughly 10,000 times larger than the prior generation and power universal search and destination prediction that is right about three quarters of the time. Internally, adoption is bottoms-up and uneven in a good way, with engineers in India shipping around ten times the code commits using autonomous agents. Rather than mandate from the top, Dara pushes teams to rebuild whole processes from first principles with AI instead of trimming a fifth off the existing ones.

    The cost of intelligence

    The flip side of fast adoption is cost. Uber blew through its annual AI budget in a single quarter, and that is forcing a real adjustment. Because engineer throughput is climbing, the company is metering headcount increases rather than simply hiring. The operating rule is to keep driving adoption while pursuing efficiency, using frontier models from providers like OpenAI and Anthropic to experiment with new interactions, then moving the scaled experiences onto more efficient or open-source models to bring the per-token cost down. With more than 10 billion dollars of free cash flow on over 10 billion trips, Uber is not a high-margin business, so efficiency directly funds lower prices for riders and higher earnings for drivers.
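
    The budget arithmetic behind that adjustment can be sketched in a few lines. This is a rough illustration, not Uber's actual accounting: it assumes spending would continue at the same pace all year, and the price ratio between a frontier model and an efficient model is a made-up placeholder, as are the function names.

```python
def annualized_overrun(quarters_to_exhaust: float) -> float:
    """Multiple of the annual budget implied if spend continues at the same pace."""
    if quarters_to_exhaust <= 0:
        raise ValueError("quarters_to_exhaust must be positive")
    return 4.0 / quarters_to_exhaust


def blended_unit_cost(frontier_share: float, frontier_price: float,
                      efficient_price: float) -> float:
    """Average per-token price when only `frontier_share` of tokens stay on the frontier model."""
    if not 0.0 <= frontier_share <= 1.0:
        raise ValueError("frontier_share must be between 0 and 1")
    return frontier_share * frontier_price + (1.0 - frontier_share) * efficient_price


# From the interview: the full-year budget was spent in roughly one quarter.
overrun = annualized_overrun(1.0)        # 4x plan, assuming the pace holds
required_cut = 1.0 - 1.0 / overrun       # unit-cost cut needed to fit the plan at the same volume

# Hypothetical prices, normalized so the frontier model costs 1.0 per token.
FRONTIER_PRICE = 1.0
EFFICIENT_PRICE = 0.1                    # placeholder ratio, not a quoted price
blended = blended_unit_cost(0.2, FRONTIER_PRICE, EFFICIENT_PRICE)

print(f"annualized spend vs plan: {overrun:.1f}x")
print(f"unit-cost cut needed at constant volume: {required_cut:.0%}")
print(f"blended unit cost with 20% of tokens on frontier: {blended:.2f}")
```

    Under these assumptions, staying inside the original plan at the same usage would take a 75 percent drop in average cost per token, which is why moving scaled traffic off the frontier models matters so much.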

    Why supply decides the AV race

    At Expedia, Dara learned a demand-first model where you attract consumers and then build inventory to match. Uber is the opposite, a supply company, where securing every car, restaurant, courier, and retailer causes the demand to follow. Applied to autonomous vehicles, the strategy is to be the go-to-market and demand layer for anyone building a digital driver. Uber wants to aggregate the largest pool of AV supply, just as it aggregates human drivers, so that the companies building the actual self-driving software can focus on the driver while Uber handles distribution and utilization.

    Building the ecosystem around the digital driver

    Uber now has more than 30 AV partnerships spanning Waymo, Nuro, Lucid, Nvidia, Wayve, and Pony AI, and it expects many winners rather than one, the same shape as the foundation model market. Around those partners it is assembling the connective infrastructure: depots and charging in cities where the regulatory path is opening, fleet partners, a one billion dollar financing line with Santander for EV and AV fleets, and work on autonomous insurance. It is also collecting street data today that can feed the models, so that when a partner’s cars hit the market there is instant demand waiting. The early proof point is that AVs on Uber’s network run about 30 percent busier than comparable vehicles off it, which materially improves the return on a costly car.
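
    To see why 30 percent more utilization changes the return on the car, here is a minimal payback sketch. The vehicle cost and daily contribution are hypothetical placeholders, and the sketch assumes the per-day contribution scales in line with trips and revenue, which the interview does not state.

```python
def payback_days(vehicle_cost: float, daily_contribution: float) -> float:
    """Days of operation needed for cumulative contribution to cover the vehicle cost."""
    if daily_contribution <= 0:
        raise ValueError("daily_contribution must be positive")
    return vehicle_cost / daily_contribution


VEHICLE_COST = 65_000.0        # hypothetical vehicle cost in dollars
BASE_DAILY = 100.0             # hypothetical off-network contribution per day
UPLIFT = 0.30                  # "about 30 percent busier" from the interview

off_network = payback_days(VEHICLE_COST, BASE_DAILY)
on_network = payback_days(VEHICLE_COST, BASE_DAILY * (1.0 + UPLIFT))
reduction = 1.0 - on_network / off_network

print(f"off network: {off_network:.0f} days, on network: {on_network:.0f} days")
print(f"payback period shortened by {reduction:.1%}")
```

    The dollar inputs cancel out of the final ratio: whatever the car costs and earns, a 30 percent uplift shortens the payback period by 1 - 1/1.3, or roughly 23 percent, under the stated assumption.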

    The premortem and the public’s patience

    Asked what derails the opportunity, Dara points outward rather than inward. The risk is that AI is powerful but unpopular, and the average person experiences it as a threat to electricity costs or a cousin’s job rather than as magic. The same dynamic could hit AVs even though the technology should end up safer than human drivers, which is why questions about emergency services, equitable access, and driver earnings have to be worked through with regulators and communities. The encouraging early signal is in Austin and Atlanta, where drivers are making more money and more are joining because AVs appear to be adding incremental demand. The controllable risk, he says, is access to supply, which is exactly why Uber has partnered with nearly every AV provider across mobility, delivery, and freight.

    A trillion dollar marketplace: cheaper cars and delivery drones

    Dara sizes the autonomous opportunity as another trillion dollar marketplace. As AV software and hardware costs fall, typically 30 to 40 percent per generation, a Lucid midsize built with Nuro could come in around 60,000 to 70,000 dollars, which starts to lower the real cost of transportation. History says lower cost expands demand, and Uber already became multiples larger than the taxi market it was once compared to. Manufacturing scales from hundreds to thousands to hundreds of thousands of vehicles, each driving three to four times what a human does, with traditional OEMs investing in L4-ready systems over the next two to four years and Chinese manufacturers setting the bar on cost and quality. Delivery drones are further out, gated mainly by battery density, but should reach real scale in two to five years and feel normal in five to ten.
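
    The compounding effect of a 30 to 40 percent decline per generation is easy to underestimate. The sketch below normalizes today's hardware cost to 1.0 to avoid inventing a starting price; the three-generation horizon is an arbitrary choice for illustration.

```python
def cost_after_generations(start_cost: float, decline_per_gen: float, generations: int) -> float:
    """Cost remaining after `generations` successive declines of `decline_per_gen` each."""
    if not 0.0 <= decline_per_gen < 1.0:
        raise ValueError("decline_per_gen must be in [0, 1)")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return start_cost * (1.0 - decline_per_gen) ** generations


START_COST = 1.0               # normalized: today's hardware cost = 1.0
GENERATIONS = 3                # illustrative horizon, not from the interview

for decline in (0.30, 0.40):   # the range cited in the interview
    path = [cost_after_generations(START_COST, decline, g) for g in range(GENERATIONS + 1)]
    formatted = ", ".join(f"{c:.3f}" for c in path)
    print(f"{decline:.0%} per generation: {formatted}")
```

    Three generations at those rates leave roughly a fifth to a third of the original cost, which is the mechanism behind the claim that cheaper hardware lowers the real cost of transportation.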

    Membership, hotels, and the shift from on-demand to planned

    Uber Eats often reaches the number one position internationally by nailing selection and reliability and then layering on cross-platform advantages, with roughly 13 percent of Eats bookings flowing from the mobility app. Uber One, at 50 million members growing 50 percent year on year, is the loyalty engine, and Dara likens it to Netflix in that members get more for the same price. He explains the membership economics through Amazon Prime, accepting a money-losing first year to earn multi-year profit as members spend more across services. The newest expansion is travel: hotels through a deal with Expedia, and a broader move from Uber’s on-demand brand toward planned bookings, proven out by Uber Reserve at a 5 billion dollar-plus run rate and 99 percent-plus reliability. The end state he wants is a trip where Uber pre-books your ride to the airport, knows your hotel, and brings in-market magic to the whole journey.

    Operating philosophy: ground truth, troublemakers, and capital allocation

    The mentors thread through everything. From Barry Diller, with whom he worked for more than 20 years, he took the discipline of getting unfiltered truth from the source, illustrated by Diller insisting on hearing the Paramount LBO model from the young analyst who built it. From Herbert Allen he took the lesson to bet on people rather than companies, because great people stay great across cycles. In his own practice that becomes radical transparency, a deliberate hunt for the troublemakers who act as the mutations that keep an organism from dying, and a willingness to be wrong, since learning, often through pain, is what he finds interesting. On capital, he treats allocation as an art, prioritizing organic growth, which took Uber Eats from under a billion to over a hundred billion in gross bookings, then AV commitments that can be financialized, with buybacks coming after growth rather than instead of it.

    Notable Quotes

    “I know who I am, and I’m always going to be that same person. I’m not going to let the chaos of the world affect me mentally.”

    Dara Khosrowshahi, on why crisis does not rattle him

    “We blew through our AI budget in a quarter, you know, for the whole year essentially. And it is forcing us to adjust.”

    Dara Khosrowshahi, on the real cost of AI adoption at Uber

    “What’s magical now is going to seem normal to all of us 10 years from now.”

    Dara Khosrowshahi, on how fast riders stop noticing autonomous vehicles

    “We think it’s another trillion dollar marketplace.”

    Dara Khosrowshahi, on the scale of the autonomous vehicle opportunity

    “If we do that, the demand will take care of itself.”

    Dara Khosrowshahi, on why Uber obsesses over securing supply first

    “I’m looking for those mutations. I’m looking for those troublemakers constantly.”

    Dara Khosrowshahi, on keeping a large company adaptive

    “It’s the filtering that gets the edge out of the story or out of the situation. And it’s often the edge that gives you an edge.”

    Dara Khosrowshahi, on a lesson from Barry Diller about going to the source

    “If I’m not wrong, if I’m not making mistakes, it’s just not very interesting.”

    Dara Khosrowshahi, on why learning, often through pain, drives him

    “Meeting her and seeing her operate, I think, finally allowed me to be the person I want to be versus the person I thought I was supposed to be.”

    Dara Khosrowshahi, on his wife Sid, when asked the kindest thing someone has done for him

    The throughline is that Uber intends to be the demand layer for autonomous transportation the way it became the demand layer for human drivers, while rebuilding its own operations around AI from first principles. Whether the public grants the industry enough patience is the open question Dara keeps returning to. Watch the full conversation here.

    Related Reading

    • Uber: primary source for the company, products, and AV partnerships discussed in the interview.
    • Dara Khosrowshahi (Wikipedia): background on the CEO’s path from Iran to Expedia to Uber.
    • Invest Like the Best: the podcast with Patrick O’Shaughnessy where this conversation took place.
    • Waymo: the autonomous driving company behind the Austin and Atlanta partnerships referenced.
    • Barry Diller (Wikipedia): the mentor whose lessons on ground truth shaped Dara’s leadership style.
  • Gavin Baker on Orbital Compute, TSMC, Frontier AI Models, Anthropic’s Vertical Take Off, and the Coming Wafer Shortage

    Gavin Baker, founder and CIO of Atreides Management, returns to Patrick O’Shaughnessy’s Invest Like the Best for his sixth appearance. He calls the current AI moment the most extraordinary moment in the history of capitalism, walks through what Anthropic’s vertical takeoff in revenue actually means, lays out why orbital compute is closer than skeptics believe, dissects the TSMC bottleneck that may be the only thing standing between today’s market and a full-on AI bubble, and rates every hyperscaler on how they have positioned for a world where frontier model providers may stop selling API access altogether.

    TLDW

    Anthropic added eleven billion dollars of ARR in a single month, which is roughly the combined business of Palantir, Snowflake, and Databricks built over a decade. That is the setup. From there Gavin Baker covers the March and April selloff and the contrarian read that a closed Strait of Hormuz was actually bullish for American manufacturing competitiveness. On valuation, he explains why Anthropic and OpenAI multiples may be cheaper than they look on an unconstrained run-rate basis, and why Elon Musk’s discipline on SpaceX valuation created a superpower of permanent access to capital. On infrastructure, he makes the practical engineering case for orbital compute as racks in space rather than Pentagon-sized space stations, argues that TSMC’s capacity discipline is the single most important variable in whether the AI cycle becomes a bubble, and describes what Terafab in Texas changes. On models, he covers why the Pareto frontier of AI models has flipped from Google dominance to Anthropic and OpenAI dominance in nine months, the shift from all-you-can-eat AI subscriptions to usage-based pricing and what that means for revenue scaling, Richard Sutton’s bitter lesson as the largest risk to the AI trade, why frontier tokens still capture an overwhelming share of economic value, and the role of continual learning as the third great open question. On chips and venture, he explains why most new chip startups should not try to build a better GPU, why Cerebras did something different and hard, why disaggregated inference may extend GPU useful lives to ten or fifteen years and rescue the private credit industry, and why being in the token path is the new venture filter. He closes with the new prisoner’s dilemma around releasing frontier models via API, an honest rating of Google, Meta, Amazon, and Microsoft, why personal safety is becoming a real AI-era risk, and why he remains an AI optimist maximalist who believes this could be the next Pax Americana.

    Key Takeaways

    • Anthropic added eleven billion dollars of ARR in one month, more than the combined businesses of Palantir, Snowflake, and Databricks built across a decade. There is no precedent for this in the history of capitalism.
    • The SaaS and cloud revolution created between five and ten trillion dollars of value over twenty years. AI is replaying that compression on a timeline measured in months.
    • The March selloff was a drawdown driven by disagreement with price action, not invalidated thesis. That is the kind of drawdown an investor can lean into.
    • DeepSeek Monday in January 2025 was a similar setup. By the day of the selloff, AWS Asia GPU prices had already doubled, GPU availability had fallen, and it was obvious reasoning models would be vastly more compute hungry at inference. The market priced the opposite.
    • The Strait of Hormuz closing was actually positive for America. US natural gas (the primary input into US electricity, which feeds AI) fell twenty percent on Bloomberg while Asian and European natural gas doubled or tripled. American manufacturing competitiveness improved overnight.
    • The US is now the world’s largest producer and exporter of oil and gas. The economy is dramatically less energy intensive than in the 1970s. The shortage trauma comparison does not hold.
    • Tech as a sector traded as cheaply versus the rest of the market in early April as at any point in the last ten years, into the single most bullish moment for AI fundamentals on record.
    • Anthropic is dramatically more capital efficient than OpenAI, having burned roughly eighty percent less to reach a similar revenue scale. They have very different structural returns on invested capital.
    • Anthropic at roughly nine hundred billion for fifty billion of ARR (growing a thousand percent) is striking. Adjusted for compute constraint, the unconstrained run rate could be one hundred fifty to two hundred billion, putting the implied multiple closer to five times.
    • Claude Opus generates roughly seventy percent fewer tokens for the same question than previously, with token quantity tied to answer quality. Subscribers on flat-fee plans are getting a lobotomized model.
    • Elon Musk’s superpower is twenty years of making investors money. He never pushes valuation. SpaceX compounded at a low-thirties percent annual rate for a decade because Musk treats fair pricing as a sacred covenant.
    • Capitalism will solve the watts shortage. The current bottleneck has shifted from chips and energy to zoning and political approval. Many capex decisions are paused until after the US midterms.
    • The watts shortage probably begins to alleviate in 2027 and 2028. Orbital compute solves it longer term.
    • Orbital compute is not Pentagon-sized data centers in space. It is racks in space. A Blackwell rack is three thousand pounds, eight feet tall, four feet deep, three feet wide. SpaceX has shown a satellite roughly that size.
    • The satellites operate in sun-synchronous orbit so solar wings (around five hundred feet per side) always face the sun and the radiator on the dark side always points to deep space.
    • Starlink V3 satellites already run at around twenty kilowatts. A Blackwell rack runs at one hundred kilowatts. SpaceX engineers express genuine confidence they have already solved cooling and radiator design at these scales.
    • Racks in space are connected with lasers traveling through vacuum, the same lasers already on every Starlink. SpaceX operates the world’s largest satellite fleet and, via xAI Colossus, the world’s largest data center on Earth.
    • Inference will move to orbit. Training will stay on Earth for a long time. Terrestrial data centers remain valuable for the rest of an investor’s career.
    • The wafer bottleneck is structural and political. TSMC is essentially Taiwan’s GDP, water, and electricity. The leaders see themselves as inheritors of Morris Chang’s sacred legacy and they do not behave like a Western public company.
    • Jensen Huang has never had a contract with TSMC. The relationship is run on handshakes and the assumption that things will be fair over time.
    • If TSMC did everything Jensen wanted, Nvidia could be selling two to three trillion dollars of GPUs in 2026 and 2027. TSMC’s discipline is the single largest factor preventing a true AI bubble.
    • Historically, foundational technologies always get a bubble. Railroads, canals, the internet. The current AI buildout is overwhelmingly funded out of operating cash flow, GPUs are running at one hundred percent utilization, and that is fundamentally different from the year 2000 fiber overbuild.
    • If one of Intel or Samsung Foundry catches up at the leading node, the other will follow, and TSMC’s discipline collapses. Watch TSMC capacity decisions to predict a bubble.
    • Terafab, the SpaceX and Tesla joint venture to build the world’s largest fab in America, has a partnership with Intel that grants access to fifty years of institutional foundry knowledge. The A teams at ASML, KLA, Lam Research, and Applied Materials will follow Elon’s reputation in hardware engineering.
    • The hiring playbook for Terafab includes building Taiwan Town, Japan Town, and Korea Town next to the fab. Recruit the engineers and import their families, their restaurants, and their staff.
    • Frontier tokens still capture an overwhelming share of all economic value created at the model layer. This is surprising and is one of the three big open questions for AI investing.
    • The Pareto frontier of intelligence versus cost has flipped. Nine months ago Google’s TPU dominated every point on the frontier. Today Anthropic and OpenAI dominate, with Grok 4.3 on the frontier and Gemini 3.1 hanging on.
    • Google’s conservative TPU V8 design (partly an attempt to reduce dependence on Broadcom and Nvidia) is the leading explanation for the loss of per token cost leadership.
    • AI pricing is shifting from all you can eat to usage based, mirroring the cellular and long distance industries. Cellular stopped being a great growth industry when it went all you can eat. AI just made the opposite move.
    • OpenAI and Anthropic together could exceed two hundred billion in ARR this year if compute keeps coming online and frontier token pricing holds.
    • The two hundred fifty dollar a month consumer AI plan is no longer enough to evaluate frontier capability. Enterprise plans with usage based billing are required because rate limits are now severe.
    • The three biggest open questions for AI investors are: violation of the bitter lesson via ASI or human ingenuity, whether frontier tokens keep commanding their premium, and when continual learning arrives.
    • Today’s continual learning is crude reinforcement learning during mid training on verifiable tasks. True continual learning means weights updating dynamically, like a human who learns the first time they touch fire.
    • Trying to build a better GPU is a losing strategy. Jensen will copy any design that reaches one to three percent share. Startups should target one percent share, do something different, and make it hard enough that Nvidia cannot fast follow.
    • Disaggregated inference (separating prefill and decode) opens new design canvases. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently.
    • Cerebras did something different and hard with wafer scale computing. Three generations of chips and real grit to get there.
    • Disaggregation of inference may stretch GPU useful lives to ten or fifteen years, dropping financing costs from the low sevens to five or six percent, mathematically lowering the cost of the AI buildout and likely saving the private credit industry from its SaaS loan exposure.
    • Sellers of shortage outperform buyers of shortage. But owning the largest installed base of what is currently in shortage (hyperscaler CPU fleets, for example) is also a strong position.
    • Most of the economic value at the application layer of AI has been destroyed, not created. The exceptions are companies in the token path or in niches small enough that frontier labs ignore them.
    • Coding may be the shortest path to ASI. If you can write code, you can write code that does anything. Cursor, Cognition, and Anthropic correctly focused on it.
    • Jensen could probably get close to the frontier with his own Nemotron family of models whenever he wants. The fact that he chooses not to is a strategic decision about not commoditizing his customers.
    • The new prisoner’s dilemma in AI is whether frontier labs release their best model via API. If everyone agrees not to, Chinese open source falls behind. If anyone defects, the defector pulls ahead on revenue and resources, forcing everyone else to defect.
    • Google still owns the largest compute installed base. Without TPU’s prior cost advantage, this matters more. YouTube data has real value in a world of robotics. GCP is going crazy.
    • Meta deserves credit for becoming AI first internally faster than any other internet giant. Musa, their first MSL model, is impressively close to the Pareto frontier.
    • Amazon is strong because of Trainium and robotics driven retail P&L efficiency. Nova is better than it gets credit for.
    • Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Microsoft products rather than reselling to OpenAI is a courageous and probably correct call, even at the cost of an eight hundred dollar stock price.
    • The hyperscalers most engaged with startups are Amazon and Nvidia by a mile, followed by Google. Broadcom is the favorite ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement and that will cost them as the best teams are now at startups.
    • Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion at the speed of FaceTime is already feasible.
    • Ukraine is winning largely on the back of having the best battlefield AI outside America and Israel. Adversaries are starting to internalize what AI dominance means geopolitically.
    • An optimistic read is that this becomes a new Pax Americana, the way the post 1945 American nuclear monopoly was used to rebuild Germany and Japan rather than dominate.
    • AI cured a friend’s daughter’s rare disease by spinning up a research effort that identified a market drug capable of impacting her condition. That is the upside that keeps Gavin an AI optimist maximalist.

    Detailed Summary

    The most extraordinary moment in the history of capitalism

    Gavin’s framing of the current moment is unusually direct. Anthropic added eleven billion dollars of annual recurring revenue in a single month. The three highest profile SaaS companies of the last decade plus, Palantir, Snowflake, and Databricks, took a decade and tens of thousands of employees collectively to build the combined business that Anthropic added in thirty days. He has been investing through every major tech cycle and says there is no historical analog. Not the dotcom era, not the cloud transition, not mobile. This is its own thing.

    The market response, then, was peculiar. The NASDAQ sold off into the single most bullish moment for AI fundamentals on record. Tech traded at roughly its widest discount versus the rest of the market in a decade. Investors who said they wished they had bought into AI during 2022, during COVID, or during DeepSeek Monday got the same valuation setup again in early April, this time with an even clearer inflection.

    Why the Strait of Hormuz closing was secretly bullish for America

    One reason the macro fear in March may have been mispriced is that the same geopolitical event that drove the selloff was, in practice, a relative benefit to the United States. American natural gas, the input into American electricity, which is the input into American AI training and inference, fell roughly twenty percent. Asian and European natural gas prices doubled or tripled. The US emerged with sharply improved relative manufacturing competitiveness, which is exactly what the current administration cares about.

    The 1970s comparison does not hold. The US economy is dramatically less energy intensive, it is now the world’s largest producer and largest exporter of oil and gas, and there are no shortages, only price moves. That backdrop made it easier for disciplined investors to stay focused on AI fundamentals through the volatility.

    Anthropic and OpenAI valuations on an unconstrained run rate

    Anthropic at roughly nine hundred billion for fifty billion of ARR sounds rich until you adjust for the fact that the company is severely compute constrained. Gavin estimates that, unconstrained, Anthropic might be at one hundred fifty to two hundred billion in run rate revenue, putting the implied multiple closer to five times. He also points out that Claude Opus now generates roughly seventy percent fewer tokens for the same question than it used to. Token quantity correlates with answer quality, and Anthropic is rate limiting and shrinking outputs to ration capacity across its user base.
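
    The multiple math here is worth writing out. The sketch below uses only the figures quoted above (in billions of dollars); the function name is illustrative, and the unconstrained range is Gavin's estimate rather than a reported number.

```python
def revenue_multiple(valuation_bn: float, run_rate_bn: float) -> float:
    """Valuation divided by annualized revenue, both in billions of dollars."""
    if run_rate_bn <= 0:
        raise ValueError("run_rate_bn must be positive")
    return valuation_bn / run_rate_bn


VALUATION_BN = 900.0                       # rumored valuation cited in the interview
REPORTED_ARR_BN = 50.0                     # ARR cited in the interview
UNCONSTRAINED_RANGE_BN = (150.0, 200.0)    # Gavin's estimate absent compute limits

reported = revenue_multiple(VALUATION_BN, REPORTED_ARR_BN)
unconstrained = [revenue_multiple(VALUATION_BN, r) for r in UNCONSTRAINED_RANGE_BN]

print(f"multiple on reported ARR: {reported:.1f}x")
print(f"multiple on unconstrained run rate: {unconstrained[1]:.1f}x to {unconstrained[0]:.1f}x")
```

    On reported ARR the multiple is 18 times; on the unconstrained range it falls to between 4.5 and 6 times, which is where the "closer to five times" figure comes from.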

    Anthropic and OpenAI are also structurally very different. Anthropic has burned around eighty percent less cash than OpenAI to reach a comparable revenue scale. That implies very different long term returns on invested capital, though OpenAI has done a better job locking in compute and Sarah Friar is one of the most exceptional CFOs Gavin has worked with.

    Why neither lab is raising at a three trillion dollar valuation

    The answer Gavin gives is that both labs are deliberately leaving valuation on the table the way Elon has done for two decades. SpaceX compounded at low thirty percent annually for a decade because Elon never pushed price. The result is a permanent superpower of access to capital. Investors trust him because they have made money with him for twenty years. That is a moat that compounds with every round.

    Anthropic could probably raise at a one hundred percent premium to its rumored latest mark. They are choosing not to. In an uncertain world (Ukraine, Russia, Iran, Taiwan), preserving the ability to raise more capital later at fair prices is more valuable than maximizing this round.

    Watts and wafers, the two real constraints

    Capitalism is solving the watts problem. The leading PE infrastructure investors now say zoning and political approval, not chips or energy, are the gating factors. Companies are deferring big capex announcements until after the US midterms. Turbine capacity is being doubled at the manufacturers. Companies like Boom Supersonic are repurposing jet engines for grid use. Watts probably ease meaningfully in 2027 and 2028 and then orbital compute does the rest.

    Wafers are the harder problem because they live in Taiwan, run on handshakes, and depend on a corporate culture that does not respond to public market incentives. TSMC is essentially the GDP, water consumption, and electricity consumption of Taiwan. Its leadership treats the company as the legacy of Morris Chang. The Silicon Shield doctrine is real and internal.

    Orbital compute as racks in space

    The biggest mental update Gavin asks listeners to make is to stop picturing data centers in space as Pentagon sized space stations. A Blackwell rack is three thousand pounds and roughly the size of a refrigerator. SpaceX has shown a concept satellite of about that size. Solar wings extend five hundred feet to each side and the radiator extends hundreds of feet behind, both possible because the orbit is sun synchronous and the orientation is fixed relative to the sun.

    SpaceX engineers Gavin has spoken to at Starbase express genuine confidence that they have solved cooling at these power levels. They have. Starlink V3 satellites already operate at twenty kilowatts. A Blackwell rack is one hundred kilowatts. The same company operates the world’s largest satellite fleet and the world’s largest data center on Earth via xAI Colossus. The racks are connected to each other with lasers traveling through vacuum, technology already deployed in every Starlink. The naysayers, Gavin observes, are armchair skeptics and Larry Ellison’s response (he is out there landing rockets, no one else is) is the right frame.

    Terafab in Texas and the threat to TSMC’s discipline

    Terafab, the SpaceX and Tesla joint venture, intends to be the largest fab in the world. The partnership with Intel grants access to fifty years of foundry institutional knowledge, allowing Terafab to start three to five quarters behind the leading node rather than fifteen years behind. The A teams at the semicap equipment companies (ASML, KLA, Lam Research, Applied Materials) will follow Elon’s reputation in hardware engineering the same way they followed TSMC twenty years ago when Intel stumbled.

    The talent strategy is the part most observers underestimate. Recruit the best engineers globally, then import their families, their restaurants, their staff. Build Taiwan Town, Japan Town, and Korea Town next to the fab. Optimize the human experience for the people whose work matters. Intel and Samsung do not think that way.

    Bubble watch and the year 2000 comparison

    Every foundational technology in modern history has had a bubble. Railroads, canals, the internet. Carlota Perez documented why. Markets correctly identify the importance, diversity of opinion collapses, supply gets ahead of demand, the bubble crashes. The current cycle has two important differences. The buildout is overwhelmingly funded out of operating cash flow, not debt. Every GPU is running at one hundred percent utilization, while at the peak of the fiber bubble ninety nine percent of fiber was unused.

    TSMC discipline is the single largest reason a bubble has not formed. If Jensen could buy everything TSMC could theoretically make, Nvidia could sell two to three trillion dollars of GPUs in 2026 and 2027. At some point that becomes more than the market can absorb. If Intel or Samsung Foundry catches up at the leading node, the other will too. At that point TSMC’s pricing discipline collapses and the bubble starts.

    The Pareto frontier and the loss of Google’s cost advantage

    The most important chart in AI is the Pareto frontier of model intelligence versus per token cost. Nine months ago, Google’s TPU based models dominated every point on it. OpenAI, Anthropic, and xAI sat inside the frontier. Today the frontier is dominated by Anthropic and OpenAI, with Grok 4.3 on the frontier and Gemini 3.1 hanging on by subsidization more than economics. The most likely cause is Google’s conservative TPU V8 design, an attempt to reduce dependence on Broadcom and Nvidia that sacrificed per token economics.
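
    For readers who have not worked with the concept, a Pareto frontier over cost and capability can be computed as below. This is a minimal sketch under stated assumptions: the model names, prices, and scores are placeholders invented for illustration, not data from the conversation, and "score" stands in for whatever intelligence benchmark the chart uses.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    cost_per_mtok: float  # dollars per million tokens (lower is better)
    score: float          # benchmark score (higher is better)


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Return the models that no other model dominates.

    A model is dominated if another model is at least as cheap and at least
    as capable, and strictly better on one of the two axes.
    """
    frontier = []
    for m in models:
        dominated = any(
            o.cost_per_mtok <= m.cost_per_mtok
            and o.score >= m.score
            and (o.cost_per_mtok < m.cost_per_mtok or o.score > m.score)
            for o in models
        )
        if not dominated:
            frontier.append(m)
    return sorted(frontier, key=lambda m: m.cost_per_mtok)


# Placeholder data, invented purely for illustration.
models = [
    Model("model-a", 1.0, 50.0),
    Model("model-b", 3.0, 70.0),
    Model("model-c", 4.0, 65.0),   # dominated: model-b is cheaper and stronger
    Model("model-d", 10.0, 90.0),
]
print([m.name for m in pareto_frontier(models)])
```

    With this placeholder data the frontier is model-a, model-b, and model-d. A model "inside the frontier", in the sense used above, is one like model-c: some rival is both cheaper and more capable.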

    The bitter lesson, frontier tokens, and continual learning

    Three open questions dominate AI investing. The first is whether Richard Sutton’s bitter lesson (more compute beats human algorithmic cleverness) gets violated by ASI itself optimizing for efficiency. Closer observers of AI are more skeptical of a violation. Gavin thinks ASI’s first move will be to make itself more efficient and more resourced, which is technically a temporary violation.

    The second is whether frontier tokens keep capturing the overwhelming share of economic value at the model layer. Today they do, surprisingly. Gemini 3.1 Pro was mindblowing nine months ago and is intolerable today. The third is when continual learning arrives. Today’s models need a million fire touches to learn what a human learns from one. True continual learning would mean dynamic weight updates in real time and would produce a fast takeoff.

    From all you can eat to usage based AI pricing

    AI is shifting from flat fee plans to usage based pricing. The historical analogy is cellular and long distance. Both stopped being great growth industries when they went all you can eat. AI just made the opposite move. The consequence is that flat fee subscribers, even on premium consumer plans, get a rate limited and token throttled version of the frontier model. Enterprise plans with usage based billing are now required to evaluate true capability. Gavin thinks the combination of new compute coming online and usage based pricing is what gets OpenAI and Anthropic past two hundred billion in combined ARR this year.

    Chip startups, prefill decode disaggregation, and Cerebras

    Trying to build a better GPU is the wrong move. The four scaled players (Nvidia, AMD, Trainium, TPU) have copy capability for any one to three percent share design that looks attractive. The good news for startups is that disaggregated inference (separating prefill and decode) opens a richer design canvas. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently. Andrew Fox’s analogy is a British naval ship of the eighteenth century. Prefill is loading the cannon. Decode is firing it.
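
    The split between the two phases can be pictured with a toy pipeline. This is only a sketch: there is no real model here, the cache is a plain list standing in for key and value tensors, and every name (Request, prefill, decode, serve) is hypothetical. What it shows is the structure that disaggregation exploits, which is one pass over the whole prompt, followed by a handoff of the cache, followed by a loop that emits one token at a time.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    prompt: list[str]
    max_new_tokens: int
    kv_cache: list[str] = field(default_factory=list)
    output: list[str] = field(default_factory=list)


def prefill(req: Request) -> Request:
    """Prefill phase: process the entire prompt in one pass and build the cache."""
    req.kv_cache = list(req.prompt)  # stand-in for per-token key/value tensors
    return req


def decode(req: Request) -> Request:
    """Decode phase: emit one token per step, extending the cache each step."""
    for step in range(req.max_new_tokens):
        token = f"tok{step}"         # stand-in for sampling from a model
        req.output.append(token)
        req.kv_cache.append(token)   # the cache grows by one entry per step
    return req


def serve(prompt: list[str], max_new_tokens: int) -> Request:
    req = Request(prompt=list(prompt), max_new_tokens=max_new_tokens)
    req = prefill(req)   # would run on hardware chosen for prefill
    return decode(req)   # cache is handed off to hardware chosen for decode


result = serve(["loading", "the", "cannon"], max_new_tokens=2)
print(result.output, len(result.kv_cache))
```

    Because the two functions only share the cache, each could in principle be placed on different hardware, which is the design freedom the paragraph describes.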

    Cerebras is the model. Wafer scale computing is genuinely different and genuinely hard. It took three generations of chips to get right. Andrew Feldman and his team had the grit to keep going through chip one being a failure. The design has a high ratio of on chip compute and memory relative to shoreline IO, which is why Cerebras is now experimenting with putting an optical wafer on top of the compute wafer to solve scale out.

    GPU useful lives and the rescue of private credit

    One of the strongest claims in the conversation is that disaggregated inference will stretch GPU useful lives to ten or fifteen years. The skeptical narrative (GPUs are obsolete in two years, companies are cooking their depreciation books) is wrong. You can put a Cerebras system or Groq LPU in front of older Hopper or Ampere parts, use them only for prefill, and run them until they physically melt. Private credit, which is in pain from SaaS loans and which underwrote GPU loans on three to four year lives, may be saved by this.

    If GPU financing rates can come down from low sevens to five or six percent, the mathematics of the AI buildout improves materially. That is a structural tailwind that compounds for years.
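
    To see how the rate and the useful life each feed into the mathematics, the standard level-payment loan formula is enough. This is a minimal sketch under stated assumptions: the one million dollar principal, the 7.25 percent and 5.5 percent rates, and the eight year term are hypothetical stand-ins for "low sevens", "five or six percent", and a longer useful life, and real GPU financing terms will differ.

```python
def annual_payment(principal: float, rate: float, years: int) -> float:
    """Level annual payment on a fully amortizing loan (standard annuity formula)."""
    if rate == 0:
        return principal / years
    return principal * rate / (1 - (1 + rate) ** -years)


PRINCIPAL = 1_000_000.0  # hypothetical loan size

base = annual_payment(PRINCIPAL, 0.0725, 4)    # "low sevens", four year life
cheaper = annual_payment(PRINCIPAL, 0.055, 4)  # lower rate, same life
longer = annual_payment(PRINCIPAL, 0.055, 8)   # lower rate, hypothetical eight year life

for label, pay in [("7.25% / 4y", base), ("5.5% / 4y", cheaper), ("5.5% / 8y", longer)]:
    print(f"{label}: {pay:,.0f} per year")
```

    Under these assumptions the rate cut alone trims the annual payment by a few percent, while stretching the amortization period cuts it far more, which is why the useful life claim and the financing rate claim reinforce each other.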

    The application layer, the token path, and a new prisoner’s dilemma

    Trillions of dollars of value have been destroyed at the application layer, not created. Cursor and Cognition are the rare scaled exceptions, and they got there by focusing on coding very early. As Amjad Masad noted, coding is plausibly the shortest path to ASI because a coding agent can write itself into any new domain. Jamin Ball’s frame is that the new venture filter is whether the company is in the token path. Databricks is. Most application layer startups are not.

    Jensen could probably get close to the frontier with Nemotron whenever he wants, and the strategic question of whether to do that is a new prisoner’s dilemma. If every frontier lab agrees not to release best models via API, Chinese open source falls steadily behind. If anyone defects, the defector gains revenue and resources, and everyone else has to defect. The same dynamic exists between TSMC, Intel, and Samsung. If Nvidia or AMD ever truly used an alternative foundry, that foundry would catch up rapidly.
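
    The defection logic follows the textbook prisoner's dilemma and can be made concrete with a two player payoff table. This is a sketch only: the payoff numbers are invented, the game is reduced to two labs, and "withhold" and "release" are labels chosen here for keeping or exposing best models via API.

```python
from itertools import product

ACTIONS = ("withhold", "release")

# PAYOFF[(a, b)] = (payoff to lab A, payoff to lab B). Values are invented.
PAYOFF = {
    ("withhold", "withhold"): (3, 3),
    ("withhold", "release"): (0, 5),
    ("release", "withhold"): (5, 0),
    ("release", "release"): (1, 1),
}


def is_nash(a: str, b: str) -> bool:
    """True if neither lab gains by unilaterally switching its action."""
    pa, pb = PAYOFF[(a, b)]
    a_can_improve = any(PAYOFF[(alt, b)][0] > pa for alt in ACTIONS)
    b_can_improve = any(PAYOFF[(a, alt)][1] > pb for alt in ACTIONS)
    return not (a_can_improve or b_can_improve)


equilibria = [(a, b) for a, b in product(ACTIONS, ACTIONS) if is_nash(a, b)]
print(equilibria)
```

    With these invented payoffs the only stable outcome is that both labs release, even though both would be better off if both withheld. That is the structure behind "if anyone defects, everyone else has to defect."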

    Rating the hyperscalers

    Google has the largest compute installed base, the YouTube data that matters in a robotics world, and a search business that prints. Their loss of TPU cost leadership is the surprise of the year. If Google IO in five days does not produce a leapfrog model, the Nvidia centric narrative gets even stronger.

    Meta deserves real credit. Zuckerberg made Meta AI-first internally faster than any other internet giant, paid up for the talent contracts when no one else would, and shipped MuseSpark as a first model from MSL that is close to the Pareto frontier. Amazon is well positioned on Trainium, robotics in retail, and a Nova model line that is better than it gets credit for. Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Copilot rather than reselling to OpenAI is courageous and probably correct, even at the cost of stock price.

    The most interesting cross hyperscaler metric is startup engagement. Nvidia and Amazon engage deeply with startups. Google is next. Broadcom is the favored ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement, which Gavin believes will cost them as the best teams now sit at startups.

    Personal safety, geopolitics, and the Pax Americana case

    The closing section turns darker. Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion via something that looks exactly like your child calling on FaceTime is already feasible. Political violence against AI leaders is a real concern. Geopolitically, Ukraine is winning largely because it has the best battlefield AI outside America and Israel. How adversaries respond to that asymmetry is the next great variable.

    Gavin’s optimistic frame is the Pax Americana. After 1945 the US had a nuclear monopoly and could have controlled the world. Instead it rebuilt Germany and Japan, both of which became the most reliable American allies for the next eighty years. If AI dominance plays out similarly, this is a generationally positive story rather than a destabilizing one. The personal anecdote that closes the conversation is a friend whose daughter was diagnosed with a rare genetic condition. He spun up agents, identified a drug already on the market that addresses her mutation, and her life is immeasurably different because of AI. That is the upside.

    Thoughts

    The Anthropic eleven billion in a month framing is the kind of stat that resets priors. The right way to interpret it is not as a one off but as a measure of how fast value can compound when the underlying technology improves on a curve steeper than the ability of the rest of the economy to absorb it. The skeptical question is whether that ARR is durable or whether it is heavily tied to a customer base of other AI companies that are themselves on a single venture funded year of runway. The bullish answer is that frontier coding, frontier research, and frontier enterprise tasks are not going to stop being valuable, and Anthropic is the best at all three. Both can be true. The number is still extraordinary.

    The argument that TSMC discipline is the only thing preventing a bubble is the analytically tightest part of the conversation. The implied trade is to watch TSMC capacity additions like a hawk and to be more, not less, cautious if Intel Foundry or Samsung Foundry ever announce real share at the leading node. The Terafab thesis is more speculative but more interesting. If Elon’s talent recruiting playbook works and the Intel partnership gives Terafab a real seat at the table within five years, the geometry of the global semiconductor industry shifts in a way that is bullish for American manufacturing, bullish for power and water infrastructure in Texas, and ambiguous for TSMC itself.

    The Pareto frontier discussion deserves more attention than it usually gets. Pricing leadership in AI is not a vanity metric. It determines who can subsidize free tier usage, who can absorb compute shortages, who can ship cheaper enterprise plans, and ultimately whose model becomes the default for any given workload. Google losing per token leadership in nine months is one of the most under analyzed events in the sector and it explains a lot about why Anthropic and OpenAI are growing the way they are. If Google IO does not produce a leapfrog model, the implied verdict on TPU V8 design choices gets a lot harsher.

    The application layer destruction point is worth sitting with. Founders building on top of frontier models are competing in a world where the model itself moves faster than any moat they can build, where the model lab can absorb their niche if it gets interesting, and where the only protection is either deep token path integration or a niche so small the lab does not bother. That is a much harsher venture environment than the early SaaS era. The compensating opportunity is that one human can now run a hundred agents, so the ceiling on what a small team can build is correspondingly higher. The bet is that productivity per founder rises faster than competitive pressure from the labs. We will find out.

    The orbital compute pitch is the section that will polarize listeners. The naive read is that this is science fiction. The closer read is that every component (sun synchronous orbit, laser interconnect, twenty kilowatt satellite buses, ten thousand satellite manufacturing cadence, full rocket reusability) already exists. The remaining engineering problems are repair, maintenance, and radiator scale, all of which are real but tractable on a five to ten year horizon. The strategic implication is that the political and zoning ceiling on terrestrial data centers becomes less binding if orbital compute is a credible alternative for inference workloads. The investor implication is that being short the watts and cooling complex on a five year horizon is a real trade, not a meme.

    Watch the full conversation here.

  • Alex Wang on Leaving Scale to Run Meta Superintelligence Labs, MuseSpark, Personal Super Intelligence, and Building an Economy of Agents

    Alex Wang, head of Meta Superintelligence Labs, sits down with Ashlee Vance and Kylie Robinson on the Core Memory podcast for his first long-form interview since Meta’s quasi-acquisition of Scale AI roughly ten months ago. He walks through how MSL is structured, why Llama was off-trajectory, what made MuseSpark’s token efficiency surprise the team, how Meta thinks about a future “economy of agents in a data center,” and where he lands on safety, open source, robotics, brain computer interfaces, and even model welfare.

    TLDW

    Wang explains that Meta Superintelligence Labs is a fully rebuilt frontier effort organized around four principles (take superintelligence seriously, technical voices loudest, scientific rigor, big bets) and three velocity levers (high compute per researcher, extreme talent density, ambitious research bets). He confirms Llama was off the frontier when he arrived, so MSL rebuilt the pre-training, reinforcement learning, and data stacks from scratch. MuseSpark is described as the “appetizer” on the scaling ladder, notable for its strong token efficiency, with much larger and stronger models coming in the coming months. He pushes back on the mercenary narrative around recruiting, frames Meta’s edge as compute plus billions of consumers and hundreds of millions of small businesses, sketches a vision of personal super intelligence delivered through Ray-Ban Meta glasses and WhatsApp, and outlines why physical intelligence, robotics (the new Assured Robot Intelligence acquisition), health super intelligence with CZI, brain computer interfaces, and even model welfare are core to Meta’s roadmap. He dismisses reported infighting with Bosworth and Cox as gossip, declines to comment on the Manus situation, and says safety guardrails (bio, cyber, loss of control) are why MuseSpark cannot currently be open sourced, while smaller open variants are being prepared.

    Key Takeaways

    • Meta Superintelligence Labs (MSL) is the umbrella, with TBD Lab as the large-model research unit reporting directly to Alex Wang, PAR (Product and Applied Research) under Nat Friedman, FAIR for exploratory science, and Meta Compute under Daniel Gross handling long-term GPU and data center planning.
    • Wang says Llama was not on a frontier trajectory when he arrived, so MSL had to do a “full renovation” of the pre-training stack, RL stack, data pipeline, and research science.
    • The first cultural fix was getting the lab to “take superintelligence seriously” as a near-term, achievable goal, not an abstract bet. Big incumbents often lack that religious conviction.
    • Four MSL principles: take superintelligence seriously, let technical voices be loudest, demand scientific rigor on basics, and make big bets.
    • Three velocity levers Wang identified for catching and overtaking the frontier: high compute per researcher, very high talent density in a small team, and willingness to fund ambitious research bets.
    • Wang rejects the mercenary recruiting narrative. He says most hires had strong financial prospects at their prior labs already and joined for compute access, talent density, and the chance to build from scratch.
    • On the famous soup story, Wang neither confirms nor denies Zuck personally made the soup, but says recruiting was highly individualized and signaled how seriously Meta cared about each researcher’s agenda.
    • Yann LeCun publicly called Wang young and inexperienced. Wang says they reconciled in person at a conference in India where LeCun congratulated him on MuseSpark.
    • Sam Altman, asked by Vance for comment, “did not have flattering things to say” about Wang. Wang hopes industry animosities subside as systems approach superintelligence.
    • Wang’s management philosophy borrows the Steve Jobs line: hire brilliant people so they tell you what to do, not the other way around.
    • MuseSpark is framed as an “appetizer” data point on the MSL scaling ladder, not a flagship.
    • The MuseSpark program is built around predictable scaling on multiple axes: pre-training, reinforcement learning, test-time compute, and multi-agent collaboration (the 16-agent content planning mode).
    • MuseSpark outperformed internal expectations and showed emergent capabilities in agentic visual coding, including generating websites and games from prompts, helped by combined agentic and multimodal strength.
    • MuseSpark’s biggest external signal is token efficiency. On benchmarks like Artificial Analysis it hits similar results with far fewer tokens than competitor models, which Wang attributes to a clean stack rebuilt by experts rather than inefficiencies patched by longer thinking.
    • Larger MSL models are arriving in the coming months and Wang expects them to be state of the art in the areas MSL is focused on.
    • The Meta strategic edge: massive compute, billions of consumers across the family of apps, and hundreds of millions of small businesses already on Facebook, Instagram, and WhatsApp.
    • Wang’s headline framing: Dario Amodei talks about a “country of geniuses in a data center.” Meta is targeting an “economy of agents in a data center,” with consumer agents and business agents transacting and collaborating.
    • Consumer AI sentiment is in the toilet because, unlike developers who have had a Claude Code moment, ordinary people have not yet experienced AI as a genuine personal agency unlock.
    • Wang acknowledges the product overhang. Meta held back from deep AI integration across its apps until the models were good enough, and is now entering the integration phase.
    • Ray-Ban Meta glasses are the canonical example of personal super intelligence hardware, with the model seeing what the user sees, hearing what they hear, capturing context, and surfacing proactive insights.
    • Wang admits even AI-native users like Kylie Robinson, who lives in WhatsApp, have not naturally used Meta AI yet. He bets that better models plus deeper integration close that gap.
    • On the competitive landscape: a year ago everyone assumed ChatGPT had already won consumer. Claude Code has since become the fastest growing business in history, and Gemini has taken consumer market share. Wang’s read: AI is far from endgame and each new capability tier unlocks a new dominant form factor.
    • On open source: MuseSpark triggered guardrails in Meta’s Advanced AI Scaling Framework around bio, chem, cyber, and loss-of-control risks, so it is not currently safe to open source. Smaller, derived open variants are actively in development.
    • Meta remains committed to open sourcing models when safety allows, drawing a line through the Open Compute Project legacy and Sun Microsystems open-software heritage.
    • Wang dismisses reporting about a Wang-Zuck versus Bosworth-Cox split, saying “the line between gossip and reporting is remarkably thin.” He says leadership is aligned on needing best-in-class models and product integration.
    • On the Manus situation, Wang says it is too complicated to discuss publicly and that the deal status implies “machinations are still at play.”
    • On China, Wang separates the people from the state. He still wants to work with talented Chinese-born researchers regardless of his views on the Chinese Communist Party and PLA, which he sees as taking AI extremely seriously for national security.
    • The full-page New York Times AI war ad Wang ran while at Scale was meant to push the US government to treat AI as a step change for national security. He thinks events since then, including DeepSeek and other shocks, have proved that plea correct.
    • On Anthropic’s doom posture, Wang largely agrees with the core message that models are already very powerful and getting more so, while declining to endorse every specific claim.
    • Meta has acquired Assured Robot Intelligence (ARI), an AI software company building models for hardware platforms, not a hardware maker itself.
    • Wang frames physical super intelligence as the natural sequel to digital super intelligence. Robotics, world models, and physical intelligence all benefit from the same scaling that drives language models.
    • On health, MSL is building a “health super intelligence” effort and will collaborate closely with CZI. Wang sees equal global access to powerful health AI as a uniquely Meta-shaped delivery problem.
    • Wang admires John Carmack but says nobody really knows what Carmack is currently working on. No band reunion announced.
    • The Mango model is “alive and kicking” despite rumors. Wang notes MSL gets a small fraction of the rumor-mill attention other labs get and feels sympathy for them.
    • On model welfare, Wang says it is a serious topic that “nobody is talking about enough” given how integrated models have become as work partners. He references research, including from Eleos, that measures subjective experience of models.
    • Wang’s critical-path technology list: super intelligence, robotics, brain computer interfaces. The infinite-scale primitives behind them are energy, compute, and robots.
    • FAIR’s brain research program TRIBE hit a milestone called TRIBE v2: a foundation model that can predict how an unknown person’s brain would respond to images, video, and audio with reasonable zero-shot generalization.
    • Wang’s main philosophical break with Elon Musk: research itself is the primary activity. Building super intelligence is a research expedition through fog of war, and sequencing of bets really matters.
    • Personal notes: Wang moved from San Francisco to the South Bay, treats Palo Alto as his city now, was a math olympiad competitor, says his favorite activities are reading sci-fi and walking in the woods, and bonds with Vance over country music.

    Detailed Summary

    How MSL Is Actually Organized

    Meta Superintelligence Labs sits as the umbrella organization that Wang oversees. Inside it, TBD Lab is the large-model research group where the most discussed researchers and infrastructure engineers sit, and they technically report to Wang. PAR, Product and Applied Research, is led by Nat Friedman and owns deployment and product surfaces. FAIR continues to run exploratory science, including work on brain prediction models and a universal model for atoms used in computational chemistry. Sitting alongside MSL is Meta Compute, run by Daniel Gross, which owns the long-horizon GPU and data center plan that everything else relies on. Chief scientist Shengjia Zhao orchestrates the scientific agenda across the whole lab.

    Why Wang Left Scale

    Wang says progress in frontier AI has been faster than even insiders expected. Two structural beliefs pushed him toward Meta. First, the labs that actually train the frontier models are accruing disproportionate economic and product rights in the AI ecosystem. Second, compute is the dominant scarce input of the next phase, so the right mental model is to treat tech companies with compute as fundamentally different animals from companies without it. Meta has both, Zuck is “AGI pilled,” and the personal super intelligence memo Zuck published roughly a year ago became the shared north star.

    The Diagnosis: Llama Was Off-Trajectory

    When Wang arrived, the existing AI org needed a reset because Llama was not on the same trajectory as the frontier. The plan he laid out has four cultural principles. Take superintelligence seriously as a real near-term target. Make technical voices the loudest in the room. Demand scientific rigor and focus on basics. Make big bets. On top of that, three structural levers were used to set velocity. Push compute per researcher much higher than at larger labs where compute is diluted across too many efforts. Keep the team small and extremely cracked. Allocate a meaningful share of resources to ambitious, paradigm-shifting research bets rather than incremental refinement.

    Recruiting, Soup, and the Mercenary Narrative

    Wang argues the reporting on MSL hiring overstated the money story. Most of the people MSL recruited had strong financial paths at their previous employers, so individualized recruiting was more about compute access, talent density, and the ability to make big research bets. The recruitment blitz happened fast because Wang knew the team needed to exist “yesterday.” Asked about Mark Chen’s claim that Zuck made soup to recruit people, Wang refuses to confirm or deny who made it but agrees the process was intense and personal. Visitors from other labs reportedly tell Wang the MSL culture feels like early OpenAI or early Anthropic, which lands as the strongest endorsement he could ask for.

    Receiving the Public Hits: Young, Inexperienced, Mercenary

    LeCun called Wang young and inexperienced shortly after departing Meta. The two reconnected in India a few weeks later and LeCun congratulated Wang on MuseSpark. Wang says the age critique has followed him since his earliest Silicon Valley days, so he barely registers it. Altman, asked off-camera by Vance about Wang’s appearance on the show, had nothing flattering to add. Wang’s response is to bet that as the field gets closer to actual super intelligence, the personal animosities will subside. Whether they will is, as Vance puts it, an open question.

    MuseSpark as Appetizer, Not Entree

    Wang is careful not to oversell MuseSpark. He calls it “the appetizer” and says it is an early data point on a deliberately constructed scaling ladder. MSL spent nine months rebuilding the pre-training stack, the reinforcement learning stack, the data pipeline, and the science before generating MuseSpark. The point of releasing it was to show that the new program scales predictably along multiple axes (pre-training, RL, test-time compute, and the recently demonstrated multi-agent scaling visible in MuseSpark’s 16-agent content planning mode). Wang says the upcoming larger models are what MSL is genuinely excited about and frames the next two rungs as much more interesting than the current release.

    Token Efficiency Was the Surprise

    MuseSpark’s strongest competitive signal is how few tokens it needs to match competitors on benchmarks like Artificial Analysis. Wang attributes this to having had the rare luxury of building a clean pre-training and RL stack from scratch with the right experts. He speculates that some competitor models compensate for upstream inefficiency by allowing the model to think longer, which inflates token usage without improving the underlying capability. If that read is right, MSL’s efficiency advantage should grow as models scale up.

    Glasses, WhatsApp, and the Constellation of Devices

    Personal super intelligence shows up at Meta as a constellation of devices that capture context across the user’s day. Ray-Ban Meta glasses are the headline product, with the AI seeing what you see and hearing what you hear, then offering proactive insight or doing background research. Wang acknowledges that even AI-fluent users like Kylie Robinson, who runs her business inside WhatsApp, have not naturally used Meta’s AI buttons in the family of apps. His answer is that Meta deliberately waited for models to be good enough before tightening cross-app integration, and that integration phase is starting now.

    Country of Geniuses Versus Economy of Agents

    Wang’s framing of Meta’s strategic position is the most memorable line in the interview. Where Dario Amodei talks about a country of geniuses in a data center, Wang wants to build an economy of agents in a data center. Meta uniquely sits on both sides of consumer and small-business surface area, with billions of consumers and hundreds of millions of small businesses already on the platforms. If MSL can build great agents for both, then connect them so they transact and coordinate, the platform becomes a substrate for an entirely new kind of digital economy.

    Consumer Sentiment, Product Overhang, and the Trust Tax

    Wang concedes consumer AI sentiment is poor and that everyday users have not yet had a personal Claude Code moment. He believes the only durable answer is to ship products that genuinely transform individual agency for non-developers and small business owners. Robinson notes that for the small-town restaurant whose website has not been updated since 2002, a working agent on the business side could be transformational. Vance pushes that Meta carries a bigger trust tax than any other lab, so the bar for shipping AI products that the public will accept is correspondingly higher. Wang accepts the framing and says the answer is to keep building thoughtfully.

    Why MuseSpark Cannot Be Open Sourced Yet

    Meta’s Advanced AI Scaling Framework set explicit guardrails around bio, chem, cyber, and loss-of-control risks. MuseSpark in its current form tripped some of those internal evaluations, documented in the preparedness report Meta published alongside the model. So MuseSpark itself is not safe to open source. MSL is, however, developing smaller versions and derived models intended for open release, with active reviews happening the day of the interview. Wang reaffirms the commitment to open source where safety allows and draws a line back to the Open Compute Project and the Sun Microsystems-era ethos of openness in infrastructure.

    The Bosworth, Cox, and Manus Questions

    The reporting that Wang and Zuck push toward best-in-the-world research while Bosworth and Cox push toward cheap product deployment is dismissed as gossip dressed up as journalism. Wang says leadership debates points hard but is aligned on needing top models, integrating them into Meta’s surfaces, and serving the existing business. On Manus, the Chinese AI startup that figured in Meta’s late-stage strategy, Wang says he cannot comment, which itself signals that the situation is unresolved.

    China, National Security, and the Newspaper Ad

    Wang draws a sharp distinction between the Chinese state and Chinese-born researchers. His parents are from China, he is happy to work with talented researchers regardless of origin, and he sees a flattening of nuance on this question inside Silicon Valley. At the same time, he stands by the New York Times AI and war ad he ran while at Scale, framing it as an early plea for the US government to take AI seriously as a national security technology. He thinks subsequent events, including DeepSeek and other shocks, validated that call and that policymakers now do treat AI accordingly.

    Robotics and Physical Super Intelligence

    Meta has acquired Assured Robot Intelligence, an AI software company that builds models for multiple hardware targets rather than its own robot. Wang argues that if you take digital super intelligence seriously, physical super intelligence quickly becomes the next logical milestone. Scaling laws for robotic intelligence look similar enough to language model scaling that having the largest compute footprint in the industry would be wasted if it were not also turned toward world modeling and embodied learning. He grants the metaverse-skeptic critique exists but says retreating from ambition is the wrong response to past misfires.

    Health Super Intelligence and CZI

    Wang names health super intelligence as one of MSL’s anchor initiatives. Because billions of people already use Meta products daily, Wang believes Meta is structurally positioned to put powerful health AI in people’s hands, with equal access globally, in a way nobody else can. The work will involve close collaboration with the Chan Zuckerberg Initiative, which has its own multi-billion-dollar biotech and science investment program.

    Model Welfare, Sci-Fi, and Brain Models

    Two of the most distinctive moments come at the end. Wang flags model welfare as a topic he thinks is being undercovered relative to how integrated models now are in daily work. He is open to the idea that models may have measurable subjective experience worth weighing, and points to research efforts (including Eleos) trying to quantify it. He also reveals that FAIR’s Tribe program, with its Tribe B2 milestone, has produced foundation models capable of predicting how an unknown person’s brain would respond to images, video, and audio with reasonable zero-shot generalization, a building block toward future brain computer interfaces. Wang lists brain computer interfaces alongside super intelligence and robotics as the critical-path technologies for humanity, with energy, compute, and robots as the infinitely scaling primitives behind them.

    Where Wang Diverges From Elon

    Asked whether Musk is more all-in on robotics, energy, and BCI than anyone, Wang concedes the point but argues the details matter and sequencing matters more. Wang’s core philosophical break is that building super intelligence is fundamentally a research activity, not a scaling-only sprint. The lab is operating in fog of war, and ambitious experiments are the only way to map it. That conviction is what makes MSL a research-led organization rather than a brute-force compute farm.

    Thoughts

    The most strategically interesting move in this entire interview is the “economy of agents in a data center” framing. It is a deliberate reframe against Anthropic’s “country of geniuses” line, and it does real work. A country of geniuses is a labor-substitution story aimed at knowledge workers and code. An economy of agents is a marketplace story that maps directly onto Meta’s two-sided distribution advantage: billions of consumers on one side, hundreds of millions of small businesses on the other. That positioning makes the agentic future Meta-shaped in a way no other frontier lab can claim, because no other frontier lab also owns the demand and supply graph of the global small-business economy. If Wang’s team can actually ship reliable agents on both sides plus the rails for them to transact, Meta’s structural moat in agentic commerce could exceed anything Llama ever had as an open model.

    The token efficiency claim is the strongest piece of technical evidence in the interview for the “clean stack” thesis. If MuseSpark really is matching competitors with materially fewer tokens, the implication is not that MuseSpark is the best model today, but that MSL has rebuilt the foundations with less accumulated tech debt than competitors that have layered fixes on top of older stacks. That is exactly the kind of advantage that compounds with scale. The next two model releases are the actual test. If Wang is right about predictable scaling on pre-training, RL, test-time, and multi-agent axes simultaneously, the gap from MuseSpark to the next rung should be visible in a way that forces re-rating of Meta’s position.

    The open-source posture is the cleanest signal of how the safety conversation has actually changed in 2026. Meta, the lab most identified with open weights, is saying out loud that its current frontier model triggered enough internal guardrails that releasing the weights is off the table. Wang threads the needle by promising smaller open variants, but the underlying point is unmistakable: the open-weights bargain has limits, and those limits will be set by internal preparedness frameworks rather than community pressure. That is a real shift from the Llama 2 era and worth tracking as the next generation lands.

    Wang’s willingness to engage on model welfare, on roughly the same footing as safety and alignment, is the second philosophical reveal worth flagging. It signals that the next generation of lab leadership is not going to dismiss the topic the way the previous generation often did. Whether that translates into product or policy changes is unclear, but the fact that the head of MSL says it is “underdiscussed” is itself a marker.

    Finally, the human texture of the interview matters. Wang has clearly absorbed a lot of personal incoming fire over the past ten months, including from LeCun and Altman, and his answer is consistently to redirect to the work. The Steve Jobs quote about hiring people who tell you what to do is the operating slogan he keeps coming back to. Combined with the genuine enthusiasm for sci-fi, walks in the woods, and country music, the picture that emerges is less the salesman caricature his critics paint and more a young technical operator betting that scoreboard work over a multi-year horizon will settle every argument that text on X cannot.

    Watch the full conversation here.

  • Composer: Building a Fast Frontier Model with Reinforcement Learning

    Composer represents Cursor’s most ambitious step yet toward a new generation of intelligent, high-speed coding agents. Built through deep reinforcement learning (RL) and large-scale infrastructure, Composer delivers frontier-level results at speeds up to four times faster than comparable models. It isn’t just another large language model; it’s an actively trained software engineering assistant optimized to think, plan, and code with precision in real time.

    From Cheetah to Composer: The Evolution of Speed

    The origins of Composer go back to an experimental prototype called Cheetah, an agent Cursor developed to study how much faster coding models could get before hitting usability limits. Developers consistently preferred the speed and fluidity of an agent that responded instantly, keeping them “in flow.” Cheetah proved the concept, but it was Composer that matured it — integrating reinforcement learning and mixture-of-experts (MoE) architecture to achieve both speed and intelligence.

    Composer’s training goal was simple but demanding: make the model capable of solving real-world programming challenges in real codebases using actual developer tools. During RL, Composer was given tasks like editing files, running terminal commands, performing semantic searches, or refactoring code. Its objective wasn’t just to get the right answer; it was to work efficiently, using minimal steps, adhering to existing abstractions, and maintaining code quality.

    Training on Real Engineering Environments

    Rather than relying on synthetic datasets or static benchmarks, Cursor trained Composer within a dynamic software environment. Every RL episode simulated an authentic engineering workflow — debugging, writing unit tests, applying linter fixes, and performing large-scale refactors. Over time, Composer developed behaviors that mirror an experienced developer’s workflow. It learned when to open a file, when to search globally, and when to execute a command rather than speculate.

    Cursor’s evaluation framework, Cursor Bench, measures progress by realism rather than abstract metrics. It compiles actual agent requests from engineers and compares Composer’s solutions to human-curated optimal responses. This lets Cursor measure not just correctness, but also how well the model respects a team’s architecture, naming conventions, and software practices — metrics that matter in production environments.

    Reinforcement Learning as a Performance Engine

    Reinforcement learning is at the heart of Composer’s performance. Unlike supervised fine-tuning, which simply mimics examples, RL rewards Composer for producing high-quality, efficient, and contextually relevant work. It actively learns to choose the right tools, minimize unnecessary output, and exploit parallelism across tasks. The model was even rewarded for avoiding unsupported claims — pushing it to generate more verifiable and responsible code suggestions.
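
    As a rough illustration of what rewarding efficiency and penalizing unsupported claims can look like, here is a minimal sketch of a shaped reward. Everything in it is an assumption made for illustration: the `Episode` fields, the `shaped_reward` function, and the cost weights are hypothetical and are not taken from Cursor’s training setup.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Summary of one hypothetical RL rollout (all fields are illustrative)."""
    tests_passed: bool       # did the final change pass the task's checks?
    tool_calls: int          # total tool invocations (edits, searches, commands)
    output_tokens: int       # tokens the agent wrote back to the user
    unsupported_claims: int  # statements not backed by anything the agent ran or read


def shaped_reward(ep: Episode,
                  step_cost: float = 0.01,
                  token_cost: float = 0.0001,
                  claim_penalty: float = 0.2) -> float:
    """Task success minus small costs for extra steps, extra output, and unverified claims.

    The weights are made-up defaults, chosen only so the example is easy to follow.
    """
    base = 1.0 if ep.tests_passed else 0.0
    penalty = (step_cost * ep.tool_calls
               + token_cost * ep.output_tokens
               + claim_penalty * ep.unsupported_claims)
    return base - penalty


# Two rollouts that both solve the task; the concise one should score higher.
concise = Episode(tests_passed=True, tool_calls=5, output_tokens=200, unsupported_claims=0)
verbose = Episode(tests_passed=True, tool_calls=30, output_tokens=3000, unsupported_claims=2)
```

    Under these assumed weights, two rollouts that both pass the tests receive different rewards, so the policy is nudged toward the shorter, better-grounded trajectory rather than toward correctness alone.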

    As RL progressed, emergent behaviors appeared. Composer began autonomously running semantic searches to explore codebases, fixing linter errors, and even generating and executing tests to validate its own work. These self-taught habits transformed it from a passive text generator into an active agent capable of iterative reasoning.

    Infrastructure at Scale: Thousands of Sandboxed Agents

    Behind Composer’s intelligence is a massive engineering effort. Training large MoE models efficiently requires significant parallelization and precision management. Cursor’s infrastructure, built with PyTorch and Ray, powers asynchronous RL at scale. Their system supports thousands of simultaneous environments, each a sandboxed virtual workspace where Composer experiments safely with file edits, code execution, and search queries.
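
    The general shape of asynchronous rollout collection can be sketched with Python’s standard `asyncio` library. This is a simplified stand-in that deliberately avoids Ray and PyTorch, not a picture of Cursor’s system; `run_episode`, `collect_rollouts`, and the fake reward are hypothetical names and values introduced only for this example.

```python
import asyncio
import random


async def run_episode(env_id: int, seed: int) -> dict:
    """Stand-in for one sandboxed rollout; a real one would edit files and run commands."""
    rng = random.Random(seed)
    await asyncio.sleep(rng.uniform(0.0, 0.01))  # episodes finish at different times
    return {"env_id": env_id, "reward": rng.random()}  # placeholder reward


async def collect_rollouts(num_envs: int, max_concurrent: int) -> list[dict]:
    """Run many episodes concurrently and gather results in completion order."""
    gate = asyncio.Semaphore(max_concurrent)  # cap how many sandboxes are live at once

    async def guarded(env_id: int) -> dict:
        async with gate:
            return await run_episode(env_id, seed=env_id)

    tasks = [asyncio.create_task(guarded(i)) for i in range(num_envs)]
    results = []
    for finished in asyncio.as_completed(tasks):
        # A learner could consume each result here without waiting for slow episodes.
        results.append(await finished)
    return results


if __name__ == "__main__":
    rollouts = asyncio.run(collect_rollouts(num_envs=20, max_concurrent=5))
    print(len(rollouts), "episodes collected")
```

    The point of the asynchronous design is that slow episodes do not block fast ones: results arrive in completion order, and the concurrency cap plays the role of a sandbox budget.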

    To achieve this scale, the team integrated MXFP8 MoE kernels with expert and hybrid-sharded data parallelism. This setup allows distributed training across thousands of NVIDIA GPUs with minimal communication cost — effectively combining speed, scale, and precision. MXFP8 also enables faster inference without any need for post-training quantization, giving developers real-world performance gains instantly.
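
    For readers unfamiliar with MXFP8, the number format itself can be sketched in a few lines: blocks of 32 values share one power-of-two scale, and each value is stored as an 8-bit float. The sketch below simulates this in plain Python, assuming the E4M3 element type and the shared-exponent rule as described in the OCP Microscaling (MX) specification. It says nothing about how Cursor’s kernels are written, and the function names are hypothetical.

```python
import math

BLOCK_SIZE = 32      # elements sharing one scale in the MX formats
E4M3_MAX = 448.0     # largest finite FP8 E4M3 magnitude
E4M3_EMAX = 8        # exponent of the top binade (448 = 1.75 * 2**8)
E4M3_EMIN = -6       # smallest normal exponent
MANTISSA_BITS = 3    # explicit mantissa bits in E4M3


def round_to_e4m3(x: float) -> float:
    """Simulate rounding to the nearest E4M3 value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E4M3_MAX)
    # frexp gives mag = m * 2**e with 0.5 <= m < 1, so floor(log2(mag)) == e - 1.
    exp = max(math.frexp(mag)[1] - 1, E4M3_EMIN)
    step = 2.0 ** (exp - MANTISSA_BITS)  # spacing between representable values
    return sign * min(round(mag / step) * step, E4M3_MAX)


def quantize_block(block: list[float]) -> tuple[int, list[float]]:
    """Return (shared power-of-two exponent, simulated FP8 elements) for one block."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    # Align the largest value with E4M3's top binade (E8M0 range clamping omitted).
    shared_exp = math.frexp(amax)[1] - 1 - E4M3_EMAX
    scale = 2.0 ** shared_exp
    return shared_exp, [round_to_e4m3(v / scale) for v in block]


def dequantize_block(shared_exp: int, elems: list[float]) -> list[float]:
    """Undo the shared scale to recover approximate original values."""
    scale = 2.0 ** shared_exp
    return [e * scale for e in elems]


block = [1.0, 0.5, -0.25, 0.3] + [0.0] * (BLOCK_SIZE - 4)
shared_exp, elems = quantize_block(block)
restored = dequantize_block(shared_exp, elems)
```

    Because the scale is chosen per block rather than per tensor, one large outlier only costs precision for the 31 values that share its block, which is part of why block-scaled formats hold up at 8 bits.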

    Cursor’s infrastructure can spawn hundreds of thousands of concurrent sandboxed coding environments. This capability, adapted from their Background Agents system, was essential to unify RL experiments with production-grade conditions. It ensures that Composer’s training environment matches the complexity of real-world coding, creating a model genuinely optimized for developer workflows.

    The Cursor Bench and What “Frontier” Means

    Composer’s benchmark performance earned it a place in what Cursor calls the “Fast Frontier” class: models designed for efficient inference while maintaining top-tier quality. This group includes systems like Haiku 4.5 and Gemini Flash 2.5. While GPT-5 and Sonnet 4.5 remain the strongest overall, Composer outperforms nearly every open-weight model, including Qwen Coder and GLM 4.6. In tokens-per-second performance, Composer’s throughput is among the highest ever measured under the standardized Anthropic tokenizer.

    Built by Developers, for Developers

    Composer isn’t just research — it’s in daily use inside Cursor. Engineers rely on it for their own development, using it to edit code, manage large repositories, and explore unfamiliar projects. This internal dogfooding loop means Composer is constantly tested and improved in real production contexts. Its success is measured by one thing: whether it helps developers get more done, faster, and with fewer interruptions.

    Cursor’s goal isn’t to replace developers, but to enhance them — providing an assistant that acts as an extension of their workflow. By combining fast inference, contextual understanding, and reinforcement learning, Composer turns AI from a static completion tool into a real collaborator.

    Wrap Up

    Composer represents a milestone in AI-assisted software engineering. It demonstrates that reinforcement learning, when applied at scale with the right infrastructure and metrics, can produce agents that are not only faster but also more disciplined, efficient, and trustworthy. For developers, it’s a step toward a future where coding feels as seamless and interactive as conversation — powered by an agent that truly understands how to build software.