PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: ai economics

  • OpenAI’s Leaked 2025 Financials: $34 Billion in Spending, a $38.5 Billion Net Loss, and a $17 Billion Microsoft Bill Ahead of Its IPO

    Infographic summarizing OpenAI leaked 2025 financials: $13.07B revenue, $34B total costs, $20.92B operating loss, $38.53B net loss, where the $34B went, the $17.2B paid to Microsoft versus $303M paid back, inference costs, and IPO valuation context

    OpenAI’s audited 2025 financials leaked this week, and they are the clearest picture yet of what it actually costs to run the company behind ChatGPT. Independent journalist Ed Zitron first published the documents, and the Financial Times independently confirmed them. The headline: OpenAI spent $34 billion last year, booked $13.07 billion in revenue, and reported a net loss attributable to the company of $38.5 billion. The disclosure lands just days after OpenAI confidentially filed for an IPO that could value it north of $1 trillion.

    TLDR

    OpenAI’s audited 2025 numbers, leaked by Ed Zitron and confirmed by the Financial Times, show revenue tripling to $13.07 billion while total costs reached $34 billion, producing a $20.92 billion operating loss and a $38.53 billion net loss attributable to the company. The much larger net loss is inflated by a one-time $41.55 billion non-cash charge tied to OpenAI’s October 2025 conversion from a nonprofit to a public benefit corporation; strip the non-cash items and the loss is closer to $8 billion. R&D alone was $19.18 billion, cost of revenue (inference) was $7.5 billion, and sales and marketing ballooned to $5.73 billion. OpenAI paid Microsoft $17.2 billion in 2025 while Microsoft paid OpenAI only $303 million, exposing a deep Azure dependency. The operating loss came to $1.60 for every dollar of revenue, down from $2.37 in 2024, and gross margin slipped from roughly 40% to 33% as more capable models consumed more compute per query. The leak arrives as OpenAI files a confidential S-1, targets a listing as early as September 2026 at up to a $1 trillion valuation, and races rival Anthropic, which is more valuable on paper and claims it is already turning an operating profit.

    Thoughts

    The most important thing to understand about these numbers is that there are two loss figures and the press will conflate them. The $38.53 billion net loss is the scary headline, but it is driven by a $41.55 billion non-cash accounting charge from converting investor convertible interests into equity during the for-profit restructuring. The charge is larger than the attributable loss itself because it is booked against the $60.35 billion gross net loss, before $21.82 billion of that loss is allocated to noncontrolling interests. That charge is real on the audited statement and it will show up in the eventual S-1, but it is a one-time artifact of OpenAI’s unusual corporate history, not money that left the building. The number that describes the actual business is the $20.92 billion operating loss. That is the one to watch, and it is still enormous.

    The genuinely encouraging line in the whole release is the loss-per-dollar ratio. In 2024 OpenAI’s operating loss was $2.37 for every dollar of revenue. In 2025 that fell to $1.60. A company that is still losing $1.60 on every dollar is not a healthy business, but a company whose efficiency improved by a third in a single year while tripling its top line is at least pointed in a defensible direction. The bull case for OpenAI lives entirely in the slope of that line. If it keeps improving at that rate, the math eventually crosses over. If it stalls, the valuation is a fantasy.
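
    The ratio is plain division, so it can be checked against the revenue and operating-loss figures quoted elsewhere in this article. A minimal Python sketch follows; the variable and function names are chosen here for illustration and the inputs are the rounded figures from the text, so the last digit may differ from the audited statement.

```python
# Figures in billions of US dollars, as quoted in the article (rounded).
REVENUE = {2024: 3.70, 2025: 13.07}
OPERATING_LOSS = {2024: 8.78, 2025: 20.92}


def loss_per_revenue_dollar(year: int) -> float:
    """Operating loss divided by revenue for the given year."""
    return OPERATING_LOSS[year] / REVENUE[year]


ratio_2024 = loss_per_revenue_dollar(2024)
ratio_2025 = loss_per_revenue_dollar(2025)

# Relative improvement in the ratio from 2024 to 2025.
improvement = (ratio_2024 - ratio_2025) / ratio_2024

print(f"2024: ${ratio_2024:.2f} of operating loss per $1 of revenue")
print(f"2025: ${ratio_2025:.2f} of operating loss per $1 of revenue")
print(f"Improvement in the ratio: {improvement:.0%}")
```

    On these inputs the ratio drops by roughly a third, which is the slope the bull case depends on.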

    The Microsoft relationship is the single most revealing disclosure, and it is wildly asymmetric. OpenAI paid Microsoft $17.2 billion in 2025. Microsoft paid OpenAI $303 million. That is a 56-to-1 ratio, and it reframes the partnership: Microsoft is not really a peer or even just an investor, it is OpenAI’s landlord and primary supplier, collecting rent on every model trained and every query answered. The April 2026 renegotiation that capped revenue-share payments at $38 billion through 2030, down from a projected $135 billion, suddenly looks less like a favor and more like OpenAI desperately trying to lower its single largest cost. The dependency cuts both ways, but right now Microsoft holds the better hand.

    The structural problem hiding inside the cost of revenue line is inference. Training a model is a fixed, one-time cost. Serving it is a recurring cost that scales with every one of ChatGPT’s roughly 800 million weekly users. OpenAI spent $5.02 billion on Azure inference in the first half of 2025 alone, and the more capable its reasoning models get, the more compute each answer burns. That is why gross margin went down even as revenue went up. It is the opposite of how software is supposed to work, where the marginal cost of one more user trends toward zero. OpenAI’s marginal cost is real, large, and growing. The counterargument is that per-token inference costs have been falling roughly tenfold a year, so the unit economics could still flip. That is the entire wager.

    Finally, the timing matters more than the numbers. OpenAI’s confidential S-1 means these audited figures were going to become public regardless, since SEC rules require a confidentially submitted registration statement to be filed publicly at least 15 days before a roadshow. What the leak changes is who gets to study them first. Prospective IPO buyers, enterprise customers signing multi-year API contracts, and competitors now have the audited books weeks or months early, and they are reading them against Anthropic, which filed at a higher valuation and claims an operating profit. For a company asking the public markets to underwrite a $1 trillion bet on a monopoly outcome that does not yet exist, losing control of the narrative this early is not a small thing.

    Key Takeaways

    • OpenAI’s audited 2025 financials were first published by independent journalist Ed Zitron and independently confirmed by the Financial Times, the first verified look at the company’s books before its planned IPO.
    • Revenue grew from $3.7 billion in 2024 to $13.07 billion in 2025, more than tripling year over year, making OpenAI one of the fastest-growing businesses in history.
    • By the end of 2025 OpenAI was generating roughly $2 billion in monthly revenue, about six times the end-of-2024 pace of about $1 billion a quarter.
    • Total costs and expenses hit $34 billion in 2025, up from $12.48 billion in 2024.
    • Research and development was the single largest expense at $19.18 billion, up from $7.81 billion, and exceeded total revenue on its own.
    • Of that R&D spend, $10.59 billion went to Microsoft, almost certainly the GPU compute cost of training frontier models on Azure.
    • Cost of revenue, the expense of serving ChatGPT responses (inference), rose from $2.65 billion to $7.5 billion.
    • Sales and marketing jumped from $1.11 billion to $5.73 billion, a 418% increase.
    • General and administrative costs rose from $907 million to $1.57 billion.
    • The operating loss, the truest measure of day-to-day economics, grew from $8.78 billion to $20.92 billion.
    • The net loss attributable to OpenAI was $38.53 billion, up nearly eightfold from $5.09 billion in 2024.
    • That jump was driven by a one-time, non-cash $41.55 billion charge from OpenAI’s October 28, 2025 conversion to a public benefit corporation, reflecting the changing fair value of convertible interests and warrant liabilities.
    • Stripping out the restructuring charge and other non-cash items such as stock-based compensation and Microsoft computing credits, the underlying loss was about $8 billion.
    • Including all factors, gross net loss reached $60.35 billion, lowered to the $38.53 billion attributable figure by removing $21.82 billion attributed to noncontrolling and redeemable noncontrolling interests.
    • OpenAI’s operating loss came to $1.60 for every $1 of revenue in 2025, an improvement from $2.37 in 2024, the clearest data point in the bull case.
    • Measured as a percentage of revenue, the operating loss improved from 237% in 2024 to 160% in 2025.
    • In total, OpenAI paid Microsoft $17.2 billion in 2025: $10.59 billion in R&D fees, $6.047 billion in cost of revenue, $527 million in sales and marketing, and $42 million in G&A.
    • Microsoft paid OpenAI just $303 million in the same year, a 56-to-1 imbalance underscoring OpenAI’s Azure dependency.
    • SoftBank paid OpenAI $867 million in 2025.
    • At year-end OpenAI carried $3.64 billion in outstanding payables to Microsoft, plus tens of millions more in accrued and non-current liabilities.
    • OpenAI spent $5.02 billion on Azure inference in just the first half of 2025; Azure inference from 2024 through Q3 2025 totaled $12.43 billion.
    • ChatGPT serves roughly 800 million weekly users, meaning billions of queries a week, each one burning GPU time at Azure’s pricing of about $6.98 per H100 GPU-hour.
    • Gross margin fell from roughly 40% in 2024 to 33% in 2025, because more capable reasoning models consume more compute per query.
    • Research firm Sacra estimates OpenAI’s inference costs reached $8.4 billion in 2025 and will rise to $14.1 billion in 2026, a 68% increase.
    • At year-end OpenAI held just over $50 billion in assets, with almost half in cash.
    • The April 2026 Microsoft renegotiation ended exclusivity and capped revenue-share payments at $38 billion through 2030, down from a projected $135 billion, potentially saving OpenAI up to $97 billion over five years.
    • OpenAI filed a confidential draft S-1 with the SEC around May 22, 2026 and confirmed it publicly on June 8, naming Goldman Sachs and Morgan Stanley as underwriters.
    • The company is targeting a listing as early as September 2026 at a valuation that could exceed $1 trillion, though Sam Altman has said a public offering “may be a while.”
    • OpenAI raised $122 billion earlier in 2026 at a $730 billion pre-money valuation, putting its post-money value around $852 billion.
    • At an $852 billion valuation, OpenAI trades at roughly 65 times its 2025 revenue.
    • Rival Anthropic also filed IPO paperwork this month after raising $65 billion at a $900-$965 billion valuation, making it more valuable on paper than OpenAI, and says it expects to report an operating profit of $559 million in the June quarter.
    • HSBC analysts estimate OpenAI may need more than $207 billion in additional capital through 2030 even under optimistic projections.
    • OpenAI projects profitability by 2029 or 2030; independent analysts put the more likely date at 2031 or later.
    • Bridgewater partner Greg Jensen reportedly told clients the implied revenue multiples price OpenAI for “a monopoly outcome that does not yet exist.”
    • Zitron separately reported OpenAI had a negative 122% non-GAAP operating margin in Q1 2026 and that ChatGPT growth has stalled, with the company projecting paid ChatGPT Plus subscriptions to decline from 44 million in 2025 as users move toward cheaper tiers in 2026.

    Detailed Summary

    How the leak happened and why it matters now

    The audited documents were obtained and first published by Ed Zitron on his newsletter Where’s Your Ed At, then independently verified by the Financial Times, which reviewed the same materials. That dual sourcing matters: this is not a rumor or a model, it is OpenAI’s actual audited financial statement. The timing is the story. OpenAI filed a confidential draft S-1 with the SEC around May 22, 2026 and confirmed it publicly on June 8. Under SEC rules a confidentially submitted registration statement must be filed publicly at least 15 days before an investor roadshow, so the 2025 numbers were going to be public soon regardless. The leak simply moved that disclosure forward, handing prospective investors, enterprise customers, and competitors an early look at the books.

    Revenue tripled, costs grew more in dollar terms

    OpenAI’s revenue rose from $3.7 billion in 2024 to $13.07 billion in 2025, and monthly revenue reached nearly $2 billion by year-end. By almost any normal standard that is spectacular growth. The problem is that costs grew by more in dollar terms, reaching $34 billion against $12.48 billion the year before: an increase of about $21.5 billion against roughly $9.4 billion of added revenue. In percentage terms costs grew more slowly than revenue, which is why the loss per dollar of revenue improved even as the absolute loss widened. The gap between what OpenAI earns and what it spends has widened every year since its founding, and 2025 is the starkest example yet. Research and development alone, a single line item, exceeded total revenue in each of the last two years.

    Two loss numbers, and why both matter

    There are two figures that get cited interchangeably and should not be. The operating loss of $20.92 billion is what the business spent beyond what it earned from operations: training models, serving ChatGPT, paying engineers, running marketing. The net loss attributable to OpenAI of $38.53 billion is far larger because 2025 was the year OpenAI completed its conversion from a nonprofit to a for-profit public benefit corporation, finalized on October 28, 2025. That restructuring triggered a $41.55 billion non-cash charge reflecting the changing fair value of convertible equity interests and warrant liabilities. Before the conversion, investors held convertible interest rights treated as liabilities under US accounting rules and revalued upward as OpenAI’s valuation climbed, creating the charge. It is not expected to recur. Including all minor items, gross net loss reached $60.35 billion, reduced to the $38.53 billion attributable figure after removing $21.82 billion tied to noncontrolling and redeemable noncontrolling interests, primarily the OpenAI Foundation’s stake. Strip the non-cash noise and the underlying loss was about $8 billion.
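
    The relationship between the loss figures can be laid out as a short reconciliation. The Python sketch below uses only the rounded figures quoted in this article; the variable names are illustrative, and the roughly $8 billion underlying loss is not reproduced because the other non-cash items are not itemized here.

```python
# All figures in billions of US dollars, as quoted in the article (rounded).
REVENUE = 13.07
TOTAL_COSTS = 34.00            # rounded headline figure
GROSS_NET_LOSS = 60.35         # before allocation to noncontrolling interests
NONCONTROLLING_SHARE = 21.82   # noncontrolling and redeemable noncontrolling interests
RESTRUCTURING_CHARGE = 41.55   # one-time, non-cash

# Operating loss: costs minus revenue. Rounded inputs land within
# about $0.01 billion of the reported $20.92 billion.
operating_loss = TOTAL_COSTS - REVENUE

# Net loss attributable to OpenAI: gross loss minus the share borne by others.
attributable_net_loss = GROSS_NET_LOSS - NONCONTROLLING_SHARE

# Share of the gross net loss explained by the restructuring charge alone.
charge_share_of_gross = RESTRUCTURING_CHARGE / GROSS_NET_LOSS

print(f"Operating loss:              ${operating_loss:.2f}B")
print(f"Attributable net loss:       ${attributable_net_loss:.2f}B")
print(f"Charge as share of gross:    {charge_share_of_gross:.0%}")
```

    The restructuring charge accounts for a little over two thirds of the gross net loss, which is why the operating loss is the steadier number to track.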

    Where the $34 billion went

    The spending breaks into four lines. Research and development was $19.18 billion, the largest category, with $10.59 billion of it flowing to Microsoft for training compute. Cost of revenue, the expense of serving responses to users, was $7.5 billion and captures inference, the compute consumed every time someone prompts ChatGPT or calls the API. Sales and marketing reached $5.73 billion, up 418% year over year, a striking jump for a product that grew largely by word of mouth. General and administrative costs added $1.57 billion. The shape of the spending tells you OpenAI is simultaneously racing to build better models, serve a massive and growing user base, and aggressively defend market share through marketing.
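
    The four lines can be tabulated to show each one’s share of 2025 costs and its growth over 2024. This is a sketch using the rounded figures quoted in this article, with illustrative names. Because the inputs are rounded, the sales and marketing growth computes to about 416% rather than the reported 418%, presumably because the reported percentage was derived from unrounded figures.

```python
# Costs by line item in billions of US dollars, as quoted in the article (rounded).
COSTS_2024 = {
    "R&D": 7.81,
    "Cost of revenue": 2.65,
    "Sales and marketing": 1.11,
    "G&A": 0.907,
}
COSTS_2025 = {
    "R&D": 19.18,
    "Cost of revenue": 7.50,
    "Sales and marketing": 5.73,
    "G&A": 1.57,
}


def breakdown(previous: dict, current: dict) -> dict:
    """Return {line: (share of current total, growth over previous year)}."""
    total = sum(current.values())
    return {
        line: (amount / total, (amount - previous[line]) / previous[line])
        for line, amount in current.items()
    }


total_2024 = sum(COSTS_2024.values())
total_2025 = sum(COSTS_2025.values())
rows = breakdown(COSTS_2024, COSTS_2025)

print(f"Total costs: ${total_2024:.2f}B (2024) -> ${total_2025:.2f}B (2025)")
for line, (share, growth) in rows.items():
    print(f"{line:<20} {share:6.1%} of 2025 costs, {growth:+7.1%} vs 2024")
```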

    The Microsoft dependency

    The most striking single disclosure is the scale of the Microsoft relationship. OpenAI paid Microsoft $17.2 billion in 2025: $10.59 billion in R&D fees for model training, $6.047 billion in cost-of-revenue for inference serving, $527 million in sales and marketing, and $42 million in G&A. Microsoft paid OpenAI just $303 million the same year. SoftBank paid OpenAI $867 million. The 56-to-1 ratio between what OpenAI pays Microsoft and what Microsoft pays back makes the structural reality plain: Microsoft is OpenAI’s largest landlord. The dynamic began shifting in April 2026, when the two renegotiated, ending Microsoft’s exclusivity and capping revenue-share payments at $38 billion through 2030, down from a projected $135 billion. That could save OpenAI up to $97 billion over five years, though Microsoft keeps its IP license through 2032 and remains the primary cloud partner.
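
    The Microsoft figures can be summed and compared directly. The sketch below uses the amounts quoted above, expressed in millions of dollars so the arithmetic stays in integers; the names are illustrative.

```python
# Payments in millions of US dollars, as quoted in the article.
PAID_TO_MICROSOFT = {
    "R&D": 10_590,
    "Cost of revenue": 6_047,
    "Sales and marketing": 527,
    "G&A": 42,
}
RECEIVED_FROM_MICROSOFT = 303

# Revenue-share payments through 2030, before and after the April 2026 deal.
PROJECTED_REVENUE_SHARE = 135_000
CAPPED_REVENUE_SHARE = 38_000

total_paid = sum(PAID_TO_MICROSOFT.values())
payment_ratio = total_paid / RECEIVED_FROM_MICROSOFT
potential_savings = PROJECTED_REVENUE_SHARE - CAPPED_REVENUE_SHARE

print(f"Paid to Microsoft:   ${total_paid / 1000:.2f}B")
print(f"Paid versus received: {payment_ratio:.1f} to 1")
print(f"Potential savings:   ${potential_savings / 1000:.0f}B")
```

    The four payment lines sum to $17.206 billion, and the ratio works out to a little under 57 to 1, which the text rounds down to 56-to-1.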

    Why inference is the core problem

    Training happens once. Serving happens billions of times a day. When OpenAI releases a model it spends months and billions on training compute, a fixed cost that falls away when training ends. Inference is the opposite: every ChatGPT message runs through the model on Azure GPU hardware, consuming electricity and compute to generate a response. With roughly 800 million weekly users, that is billions of queries a week, each burning GPU time at roughly $6.98 per H100 GPU-hour on demand. OpenAI spent $5.02 billion on Azure inference in the first six months of 2025 alone. Sacra estimates full-year inference costs of $8.4 billion in 2025, rising to $14.1 billion in 2026. This is why gross margin fell from about 40% to 33% even as revenue tripled: more capable reasoning models consume far more compute per query, and revenue has not kept pace with the cost growth that capability generates.
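
    To get a feel for the scale, the first-half Azure inference spend can be converted into GPU time. The sketch below assumes, purely for illustration, that every dollar was billed at the quoted on-demand H100 rate; negotiated pricing would differ, so the result is an order-of-magnitude estimate and not a reported figure. It also checks the growth implied by the Sacra estimates. All names are illustrative.

```python
# Figures quoted in the article, in US dollars.
AZURE_INFERENCE_H1_2025 = 5.02e9   # Azure inference spend, first half of 2025
H100_ON_DEMAND_RATE = 6.98         # dollars per H100 GPU-hour, on demand
SACRA_INFERENCE_2025 = 8.4e9       # Sacra estimate
SACRA_INFERENCE_2026 = 14.1e9      # Sacra estimate

# January 1 through June 30, 2025 is 181 days.
HOURS_IN_H1_2025 = 181 * 24

# ASSUMPTION (illustrative only): all spend billed at the on-demand rate.
gpu_hours = AZURE_INFERENCE_H1_2025 / H100_ON_DEMAND_RATE
gpus_running_continuously = gpu_hours / HOURS_IN_H1_2025

# Year-over-year growth implied by the two Sacra estimates.
sacra_growth = SACRA_INFERENCE_2026 / SACRA_INFERENCE_2025 - 1

print(f"Implied H100 GPU-hours in H1 2025: {gpu_hours / 1e6:,.0f} million")
print(f"Equivalent GPUs running nonstop:   {gpus_running_continuously:,.0f}")
print(f"Sacra 2025 to 2026 growth:         {sacra_growth:.0%}")
```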

    What it means for the IPO and the race with Anthropic

    OpenAI was last valued around $852 billion post-money after raising $122 billion in early 2026, which puts it at roughly 65 times 2025 revenue. It has named Goldman Sachs and Morgan Stanley as underwriters and is targeting a listing as early as September 2026 at up to a $1 trillion valuation, though Altman has hedged that it “may be a while” and that staying private might be the better course. HSBC estimates the company may need more than $207 billion in additional capital through 2030. The race is with Anthropic, which filed paperwork the same month after raising $65 billion at a $900-$965 billion valuation, making it more valuable on paper, and which says it expects a $559 million operating profit in the June quarter. The contrast is sharp: the two leading AI labs heading toward public markets at the same time, one bleeding cash at scale, the other claiming profitability, both asking investors to bet on a future that has not arrived.
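
    The valuation arithmetic is short enough to verify by hand, but a sketch makes the multiples explicit. It uses the figures quoted in this article, with illustrative names, and treats the $1 trillion figure as a hypothetical listing valuation for comparison.

```python
# Figures in billions of US dollars, as quoted in the article.
PRE_MONEY_VALUATION = 730.0
CAPITAL_RAISED = 122.0
REVENUE_2025 = 13.07
TARGET_IPO_VALUATION = 1000.0  # the "up to $1 trillion" scenario


def revenue_multiple(valuation: float, revenue: float) -> float:
    """Valuation expressed as a multiple of trailing annual revenue."""
    return valuation / revenue


# Post-money valuation is the pre-money valuation plus the new capital.
post_money = PRE_MONEY_VALUATION + CAPITAL_RAISED
current_multiple = revenue_multiple(post_money, REVENUE_2025)
ipo_multiple = revenue_multiple(TARGET_IPO_VALUATION, REVENUE_2025)

print(f"Post-money valuation: ${post_money:.0f}B")
print(f"Multiple of 2025 revenue today:       {current_multiple:.1f}x")
print(f"Multiple of 2025 revenue at $1T:      {ipo_multiple:.1f}x")
```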

    Notable Quotes

    “The financial condition of OpenAI is deeply concerning. $38.53 billion in losses are astronomical, and far higher than most believed it would be. Losses also appear to be mounting year-over-year at a dramatic rate, and I’m not sure how this company finds a way toward any kind of sustainability or profitability.”

    Ed Zitron, the independent journalist who published the leaked audited financials

    “It’s unclear what this means, nor how OpenAI reconciled the removal of $3.74 billion in costs. I will not speculate further.”

    Ed Zitron, on a discrepancy he found in the restated 2024 figures

    “OpenAI’s two biggest expenses are R&D and marketing. Budget cuts there, coupled with an ability to raise prices or win new sources of revenue, could see the company move into the black over time. Cutting R&D would be the most difficult part of that, given that AI companies can only hold onto their customers by generating the best-performing models.”

    Jim Edwards, Fortune, on whether OpenAI has a realistic path to profitability

    “What the audited documents make impossible to argue is that the path to profitability is short, clear, or cheap.”

    TechTimes analysis of the leaked OpenAI financials

    The implied revenue multiples price OpenAI for “a monopoly outcome that does not yet exist.”

    Bridgewater partner Greg Jensen, reportedly telling clients how to read OpenAI’s valuation

    “OpenAI spent $34bn last year as the ChatGPT maker poured money into a race to dominate the fast-growing AI market ahead of a planned stock market listing.”

    George Hammond and Bryce Elder, Financial Times, framing the audited 2025 spend

    Read Ed Zitron’s original reporting with the full breakdown here, and the Financial Times confirmation here.

    Related Reading

    • Ed Zitron, Where’s Your Ed At: the primary source that broke the audited 2025 financials with the full line-by-line breakdown.
    • OpenAI (Wikipedia): background on the company’s history, structure, and the nonprofit-to-for-profit conversion that drives the non-cash charge.
    • Inference (Wikipedia): the recurring compute cost that explains why OpenAI’s gross margin shrinks as usage grows.
    • Anthropic: the rival lab that filed IPO paperwork the same month at a higher valuation and claims it is already operating at a profit.
    • SEC on confidential filings: context for why OpenAI’s audited numbers were headed for public disclosure regardless of the leak.

  • Benedict Evans on the Economics of AI Usage, Why Foundation Models May Become Commodities, and What Comes Next for SaaS

    Benedict Evans returns to the a16z podcast to update the thesis behind his widely read “AI eats the world” presentation, and the picture he paints is less about hype and more about hard economics. In this conversation he works through what has actually played out in the last year, why agentic coding became the one use case with real product market fit, and why he keeps arguing that foundation models may end up as commodities while the value moves somewhere else entirely. You can watch the full conversation here.

    TLDW

    Benedict Evans argues that the AI moment looks a lot like the early internet, the early PC era, and the rollout of mobile data, which means it is exciting, genuinely transformative, and almost impossible to predict use case by use case. Agentic coding is the only field with clear product market fit right now, with revenue run rates exploding from roughly nine billion to forty seven billion, while consumers still use chatbots weekly rather than daily. His central claim is that foundation models show no obvious network effect or sustainable differentiation, the chatbot is a limited v1 interface, and the model labs cannot build every application, so the value will likely move up the stack the way it did with chips, ISPs, and mobile networks rather than staying with the model providers. He covers the brutal supply and demand disequilibrium driving today’s token pricing and ten thousand dollar surprise bills, the financial gravity problem of hyperscalers spending over half their revenue on capex, the Jevons paradox and consumer surplus that may compete away productivity gains, the way the important questions move out of San Francisco and into industries like law, consulting, finance, and advertising, and the distinction between automating tasks and changing jobs. His closing image is an IBM ad from the 1950s promising “150 extra engineers,” a reminder that every platform shift feels unprecedented and that in twenty years we will simply say of course computers do that.

    Thoughts

    The most useful thing Evans does here is refuse to collapse uncertainty into a clean prediction, and then explain exactly why that refusal is the correct posture rather than a cop out. He distinguishes between the parts where he will commit to a view, that foundation models are probably not a product and the chatbot is probably not the right interface, and the parts where there are simply too many open paths to call. That discipline is rare in AI commentary, where the incentive is to sound certain. The commodity argument is not “models are worthless.” It is a chain of reasoning: there is no visible network effect, no durable differentiation beyond willingness to spend, no lock in comparable to Windows or iOS, and a likely structure of three to six well funded competitors plus open source and edge models all selling the same thing. Ask where price discipline comes from in that picture and the honest answer is that it probably does not, which is how you get a commodity even when demand is effectively infinite.

    The mobile data analogy is the load bearing comparison and it deserves to be taken seriously. Mobile data traffic rose something like fifteen hundred to two thousand times over fifteen years, the networks built an extraordinary piece of global infrastructure, everyone came to depend on it, and yet the operators captured almost none of the value because all the interesting stuff got built on top by someone else. Telco stocks were flat for two decades. If that is the template, then the trillion dollars of capex flowing into AI infrastructure can be both a worthwhile investment and a terrible place to expect outsized equity returns, because building the road is not the same as owning the traffic. The counterpoint Evans keeps fairly on the table is the operating system path, where Windows and iOS did capture value, but he notes they had levers and network effects that LLMs do not appear to have.
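
    The traffic multiple is easier to compare with other growth rates once it is annualized. A small sketch, assuming smooth compounding over the full fifteen years (real traffic growth was certainly lumpier) and using an illustrative function name:

```python
def implied_annual_growth(multiple: float, years: float) -> float:
    """Constant annual growth rate that compounds to `multiple` over `years`."""
    return multiple ** (1 / years) - 1


YEARS = 15
low = implied_annual_growth(1500, YEARS)
high = implied_annual_growth(2000, YEARS)

print(f"1,500x over {YEARS} years: about {low:.0%} a year")
print(f"2,000x over {YEARS} years: about {high:.0%} a year")
```

    Under that assumption, a fifteen-hundred to two-thousand-fold rise corresponds to traffic growing by a bit more than sixty percent a year for fifteen years, while operator share prices went nowhere.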

    His framing of where the questions live is the part most people in tech underweight. Once a technology works, the interesting questions stop being technology questions. Netflix is not a tech company in the sense that matters, because its real decisions are Los Angeles decisions about shows, talent, and sports, not San Francisco decisions about infrastructure. By the same logic, what AI means for a law firm is mostly a question for people who understand what associates actually do and what clients are actually paying for, not for model researchers. This is why the “the model will just do the whole thing” story keeps running aground. Most valuable software does not solve a problem the customer already knew they had. It often takes years to convince an industry that a problem even exists, and an LLM prompt does not surface latent problems that no one has articulated.

    The economic plumbing he describes is where the near term risk actually sits. We are in extreme disequilibrium, where twenty dollars a month can buy ten thousand dollars of tokens on one side and a weekend of experimentation can produce a ten thousand dollar bill on the other, exactly the pattern mobile data went through around 2009 and 2010. That gets resolved with the boring machinery of caps, throttling, and pricing tiers, not with magic. Layered on top is the financial gravity problem: Microsoft, Meta, and Google heading toward spending more than half of revenue on capex, with roughly seven hundred billion dollars of guidance across the big players, against a hard ceiling because there is not ten trillion dollars a year available to spend. And even when the productivity gains are real, the Jevons paradox and consumer surplus suggest much of the benefit gets competed away. If a discounted cash flow model used to take a week and now takes ten seconds, you do fifty of them and charge the client the same, which is great for clients and unremarkable for margins.

    The honest takeaway for builders is that the answer to “what does this do to software” is more software, probably one or two orders of magnitude more, just as SaaS itself produced an explosion rather than a consolidation. The SaaS apocalypse is real in the sense that some meaningful percentage of existing companies get wiped out, and unknowable in the sense that no one can yet say which ones, which is why thoughtful investors are reluctant to be long software in the dark. For anyone pursuing a more deliberate, purposeful relationship with technology, the closing note is the one to keep: every one of these shifts felt singular and world ending and world making at the time, it reshaped work and put people out of jobs and created things we love, and then it quietly became invisible. The goal is to stay clear eyed about which of those buckets a given change lands in rather than getting swept up in the noise of what someone said at a party yesterday.

    Key Takeaways

    • Agentic coding shifted from “kind of useful” to “really changing everything” at the start of the year, and it is the single field with unambiguous product market fit, where customers are pulling it out of your hands.
    • Coding working first was foreseeable in hindsight: software developers were the ones messing with the tools, and the first thing people do with a new kind of computer is build more computing, just as the first thing people did with PCs was make computers.
    • Anthropic, with less capital raised, chose to focus on coding and got it working, while OpenAI cycled through a more everything-all-at-once strategy before narrowing in.
    • The intense focus on coding comes bundled with a supply crunch, a capacity crunch, and a price and capex imbalance that defines the current moment.
    • Most of the fundamental questions from two or three years ago still have no answers: whether there will be a winner in models, whether models capture value up the stack, how much they can do, and whether consumers will use this daily rather than weekly.
    • There is a wide gap between Valley insiders running clusters of Mac Studios all day and the roughly forty percent of people who say AI is “kind of useful, I used it last week for something.”
    • Outside tech, companies are adopting AI as one-at-a-time point solutions for specific back office processes, like a commodities company using LLMs for better cash flow forecasting, not as a general purpose assistant.
    • Adoption always compounds on prior platforms: you could not have nine hundred million weekly active users in the Netscape era because there were not nine hundred million PCs on the planet.
    • Early in any platform shift almost nothing works smoothly, from sound cards and floppy disks with TCP/IP to computers that froze and lost your work, and AI is at that stage now.
    • Today’s token pricing crunch mirrors the mobile data shock of 2009 to 2010, where flat rate plans collided with surging usage and networks had to realign price with marginal cost through caps, fair use, and throttling.
    • Mobile data traffic rose roughly fifteen hundred to two thousand times in fifteen years, mobile networks bring in around a trillion dollars a year and spend about two hundred billion of it on capex, yet their stocks have been flat for twenty years because all the value moved up the stack.
    • The central LLM question is whether the model can do the whole thing or whether you need hundreds of applications built on top, the same way you needed apps on Windows and iOS.
    • Evans sees no network effect and no sustainable differentiation between models beyond willingness to spend money, which points toward commodity infrastructure sold near marginal cost.
    • Chip companies, ISPs, and mobile operators did not capture the value; Windows and iOS did, but only because they had levers to move up the stack and real network effects, which models lack.
    • A useful comparison is semiconductors, where each generation gets more expensive and the field narrows to fewer players, suggesting three to six frontier model makers spending somewhere between two hundred billion and two trillion dollars a year.
    • Enterprises do not standardize on a model the way they once thought about AWS; the cloud and the model get abstracted away, so customers do not even know which one their SaaS product runs on.
    • Demand for tokens being effectively infinite does not prevent a price equilibrium, exactly as infinite demand for mobile bits still produced murderous price wars between commodity carriers.
    • History teaches that something will happen but rarely what; the smartest people in tech wrongly predicted Android would crush the iPhone on open versus closed grounds.
    • One characteristic of tech is that the moment you understand how something works is the moment to move on, which is why Evans stopped updating his Apple spreadsheet years ago.
    • The people who are good at using a tool are usually not the people who are good at designing what the tool should be, which is why model labs cannot build every skill or vertical application.
    • Claude skills and similar templates resemble File > New in Excel: useful starting points that users eventually outgrow, raising the question of who builds the real software.
    • The questions increasingly move out of technology and into specific industries; what AI means for law, consulting, advertising, or accounting is partly an AI question and partly a deep domain question.
    • Netflix is not a tech company in the way that matters, because its real questions are media industry questions about shows, talent, and sports, not infrastructure; the same logic now applies across industries facing AI.
    • AI differs from prior platform shifts because the physical limits are unknown; in 1995 you knew PCs cost three thousand dollars and broadband could not reach everyone overnight, but no one knows how cheap, fast, or capable models will get.
    • Evans offers four buttons to press on any use case: is it just price elasticity and the Jevons paradox, does it remove a cost barrier to entry, does it unlock a new business model, or does it make something previously impossible now possible like trains over horses or Spotify over CDs.
    • Advertising and e-commerce are a standout opportunity because today’s systems know a SKU and a metadata field but not what a product actually is or why people buy it, and LLMs could change that level of understanding.
    • The valuable shift is not doing the old thing more, like more spreadsheets or better email, but doing genuinely new things, such as asking an LLM how to change prices to improve churn using all your call recordings, CRM flows, and product telemetry.
    • Enterprise software today splits into three buckets: big horizontal systems like SAP and Workday, three to four hundred vertical SaaS apps plus a thousand internal apps, and a fuzzy improvised middle of Excel, email, and shared files, with AI arriving as a new option across all three.
    • A core design tension is where to put the probabilistic software that can make mistakes versus the deterministic database that cannot, and whether the LLM sits at the top or the bottom of the stack; the answer is probably both depending on the task.
    • The net effect on software is way more software, since SaaS itself produced one to two orders of magnitude more software and all software companies exist to solve problems created by other software companies.
    • The SaaS apocalypse is real but unknowable: some percentage of SaaS companies get wiped out, but no one knows which, so you should not derate the whole sector by fifty percent, even though many investors are wary of being long software for now.
    • Much of what an organization does is implicit, undocumented, and not in the training data, which is exactly the value McKinsey, Bain, and BCG provide by getting license to map how a company really works.
    • The real decisions are usually exception handling: the question is always what you cannot automate and what still requires human judgment about cases that were never written down.
    • Distinguish tasks from jobs: accountants spend almost none of their time the way they did fifty years ago, yet to the client the job looks the same.
    • LLMs excel where you want the average, the answer anyone would give, and struggle where you specifically do not want the average and cannot fully explain why you did it differently.
    • There is a financial gravity ceiling: Microsoft, Meta, and Google are on track to spend over fifty percent of revenue on capex versus fifteen to twenty percent for capital intensive telecoms, with seven hundred billion in guidance this year and no path to ten trillion.
    • Hyperscalers face an existential FOMO trap: returns look positive now, but they cannot let rivals build the future of compute without participating, even as the CFO asks how much participation is enough.
    • Token maxing will face a reckoning as the disequilibrium resolves, but measuring ROI is hard because most reported benefits so far, like better analytics, support, and productivity, are tough to put a financial value on.
    • Consumer surplus means many gains get competed away: if analysis that took a week now takes a day, you do five times more analysis and charge the same, the way investment banks did with spreadsheets.
    • Evans closes with a 1950s IBM ad promising “150 extra engineers,” a reminder that every fundamental technology change feels unprecedented, and that in twenty years AI will simply be invisible magic we take for granted.

    Detailed Summary

    What changed in the last year

    Evans frames the past year as a narrowing of focus. A year and a half after the first version of his presentation, the field has developed a much clearer sense of diverging product strategies and competitive tension that goes beyond simply building a bigger model with more compute. The dominant shift is that agentic coding started genuinely working, and the entire industry narrowed in on it because it has absolute product market fit, the kind where customers pull the product out of your hands. That success arrives alongside the supply crunch, capacity constraints, and price imbalance that now define the moment. At the same time, the charts keep climbing, models keep getting bigger, capex keeps growing, and usage keeps growing, while the deep questions from a few years ago remain unanswered.

    Why coding worked first

    That coding led was predictable at a naive level: the people experimenting with the tools were software developers, and they naturally tried to make software development work. Evans compares the moment to the internet around 1997 and 1998, and also to PCs in the late seventies and early eighties, when the technology was exciting but it was not clear what it was for and it did not quite work yet. The first thing people did with PCs was make computers, and since LLMs are in a sense computers, the first thing people are doing with them is making more compute. What was harder to foresee was the precise timing of the shift, the moment when agentic coding flipped from useful to transformative at the start of this year.

    Jobs, juniors, and what we have not learned

    On the question of what this means for engineers and team structure, Evans is blunt that we have learned almost nothing yet, because this did not even work six months ago and everyone is scrambling to interpret it. The pricing crunch alone means it will take a couple of years to settle. The newly concrete questions include whether you still hire junior people and what they would do, and why you were hiring juniors in the first place, whether to do the work itself or to develop people. Because software development now genuinely automates a class of work that used to be done by people, those questions have moved from theoretical to real, but no one can responsibly claim to know what a software team or a software career looks like in three years.

    OpenAI, Anthropic, and the strategy split

    Evans dryly notes the drama around the model labs, including the disruption caused by a senior leadership medical leave at OpenAI. In the latter part of last year, OpenAI’s question was essentially what to build on top of the models, an everything all at once approach that looked almost like asking the model for fifteen ideas and then doing all of them. Anthropic, with less capital raised, instead committed to coding and got it working, whether by deliberate strategy or by stumbling into it. The result is that software development plus a few other fields are where things genuinely work, surrounded by a large population of people excited around the edges and corporations quietly automating specific back office processes. He cites a commodities company that wants LLMs for better cash flow forecasting across many small producers, a very different thing from asking a chatbot to summarize your meetings.

    The mobile data analogy and value capture

    The richest section is the comparison to mobile. Adoption always compounds on prior platforms, so AI inherits a far larger installed base than the internet or mobile did at their starts. Early on, nothing works smoothly, and Evans recalls the era of buying a three hundred dollar sound card or wrestling a floppy disk of TCP/IP into a machine. The pricing dynamics directly echo mobile data around 2009 and 2010, when flat rate plans met exploding usage and customers received bills of ten thousand dollars, forcing networks to realign price with marginal cost. Crucially, mobile data traffic then rose fifteen hundred to two thousand times, the networks built extraordinary global infrastructure with around a trillion dollars of revenue and two hundred billion in annual capex, and yet their stocks stayed flat for twenty years because all the cool stuff and all the value got built and captured by someone else higher up the stack. Chip companies, ISPs, and mobile operators did not capture value; Windows and iOS did, but they had levers and network effects that models do not appear to share.

    The case that models become commodities

    Evans lays out the building blocks of his commodity thesis. First, there is no clear way to build a model that is sustainably and fundamentally better than everyone else’s, with no visible network effect and no strategic lever comparable to what Instagram, YouTube, or Google search enjoy. Differences in emphasis and taste exist, but not durable competitive moats beyond spending. Second, the chatbot is a weird, limited v1 interface that works well for some tasks and people but requires tooling, the right data, configuration, control, and thoughtful design for most real jobs, and the people good at a job are rarely the people good at designing the tool for it. Third, the labs cannot build every application any more than Microsoft or Apple could build every Windows or iPhone app. Enterprises do not standardize on a model, just as they never standardized on a visible cloud provider, because the model gets abstracted away. Taken together, that points to low level infrastructure sold by perhaps half a dozen competitors plus open source and edge, with no obvious source of price discipline, which is the definition of a commodity even when demand is infinite.

    The questions move out of technology

    One of the next big questions is when models become good enough that you no longer need the largest, fastest, most expensive model, and can use an older model, an open source model, or one running on device where compute is effectively free to the developer. But the deeper shift is that the important questions move out of technology and into industries. Drawing on his own essays “content isn’t king” and “Netflix isn’t a tech company,” Evans argues that Netflix’s real decisions are Los Angeles media questions, not San Francisco infrastructure questions, and San Francisco does not even know what the right questions are. By the same logic, what AI means for a law firm is mostly a question for people who understand law firms, what generative video means for Hollywood is a question Ben Affleck can answer better than Evans can, and the questions become half AI and half something else.

    Four buttons and the new things AI unlocks

    To reason about impact, Evans offers four buttons. Is a use case just price elasticity, the Jevons paradox of doing the same thing for less or more for the same money? Does it remove a cost that was a barrier to entry, like a newspaper’s printing press? Does it unlock something in your business model? Or does it make something previously impossible now possible, the way steam engines made trains possible regardless of how many horses you bought, or Spotify turned fifteen dollars a month into all the music there is? He stresses that the same broad change can mean wildly different things by industry, just as the internet devastated newspapers but barely touched movie studios. His favorite tractable example is advertising and e-commerce, a trillion dollar advertising market against twenty five trillion in retail, where today’s systems know a SKU and a metadata field and that people who bought one thing bought another, but do not know what a product is or why people buy it. An LLM could in principle understand the product, recommend ten coats at different prices with pros and cons, or look at your Instagram and suggest a winter coat that changes your look but not too much, which would have been science fiction three years ago.

    More software, the SaaS apocalypse, and tasks versus jobs

    For software specifically, Evans expects more competition, cheaper and quicker building, and new categories that were impossible before, all under an uncertain new margin structure where outcome based pricing is hard because most software work cannot be tied cleanly to profit and loss. He frames enterprise software as three buckets, big horizontal systems, hundreds of vertical and internal apps, and a fuzzy improvised middle of Excel and email, with AI arriving as another option across all of them. The deeper design tension is where to place probabilistic software that can make mistakes versus deterministic systems that cannot, and whether the LLM sits at the top or bottom of the stack, with the answer being both depending on the task. The net result is way more software, since SaaS itself produced orders of magnitude more software and software exists to solve problems created by other software. That fuels the SaaS apocalypse anxiety: some companies clearly get wiped out, but since no one knows which, you should not derate the whole sector, even as many investors stay cautious about being long software.

    Implicit knowledge, exception handling, and where the average fails

    Much of what organizations do is implicit, undocumented, and absent from any training data, which is precisely the value of strategy consultancies that get license to map how a company really works versus how it is supposed to work. The real decisions tend to be exception handling, the cases that require human judgment because they were never written down or do not resemble anything seen before. Evans separates tasks from jobs, noting accountants do almost nothing the way they did fifty years ago while the client still buys the same thing. And he offers a sharp test: LLMs are excellent where you want the average, the answer anyone would give, and weak where you specifically do not want the average and cannot fully articulate why you did it differently.

    Capex, financial gravity, and the ROI question

    On spending, Evans describes a financial gravity problem. Microsoft, Meta, and Google are on track to spend over half their revenue on capex this year, against fifteen to twenty percent for capital intensive telecoms, with roughly seven hundred billion in guidance across the big players, a sum comparable to all of telecom or oil and gas. They cannot sustainably leap to one and a half trillion next year because the money is not there, so the curve must eventually taper. The hyperscalers are caught in an existential FOMO trap: returns look positive now, but they cannot sit out what might be the future of compute without risking becoming the next stranded incumbent, even as the CFO asks how much is enough. On token maxing, he expects a reckoning as the disequilibrium resolves, but measuring ROI is genuinely hard because most reported benefits so far are soft and hard to value, and consumer surplus means much of the gain gets competed away, the way faster spreadsheets simply meant more analysis at the same price.

    Closing image

    Evans ends with an IBM advertisement from the early 1950s showing a sea of engineers holding slide rules, with the tagline that an IBM electronic calculator gives you 150 extra engineers, exactly the pitch behind countless modern startup decks. We move through these fundamental technology waves every ten or fifteen or twenty years, each one feeling completely unlike anything before, and AI is amazing and transformative in the same way mobile, the internet, and PCs were. The base case is that it will produce wonderful things, ruin some livelihoods, put people out of work, and eventually become invisible. His one line description of where it all ends up is that it will be magic, and in twenty years we will simply say of course computers do that, the way an hour of crash free streaming HD video over Wi-Fi already feels unremarkable.

    Notable Quotes

    “Agentic coding went from being kind of useful to really changing everything.”

    Benedict Evans, on the pivotal shift at the start of the year

    “We are in this extreme scarcity. We can’t spend $10 trillion a year on AI infrastructure cuz there isn’t $10 trillion a year there to spend on it.”

    Benedict Evans, on the hard ceiling of AI capex

    “I don’t think foundation models are a product. I don’t think a chatbot is a product. I think the value will be further up.”

    Benedict Evans, stating the core of his thesis

    “They built this amazing piece of global incredibly sophisticated very expensive global infrastructure with enormous growth in use, and they didn’t make any money from it because all the value moved up stack.”

    Benedict Evans, on the mobile network analogy

    “The moment that you understand something and you know how it works and what’s going to happen is the moment you should move on to something else.”

    Benedict Evans, on how to pay attention in tech

    “These are all Los Angeles questions. These are not San Francisco questions. No one in San Francisco even knows what the right questions are.”

    Benedict Evans, on why Netflix is not a tech company

    “The important stuff is not doing the old thing but more. It’s doing something new that you couldn’t have done with the old thing.”

    Benedict Evans, on where the real value of a new technology shows up

    “All software companies exist to solve problems created by other software companies.”

    Benedict Evans, on why AI produces more software, not less

    “It’s going to be magic, and in 20 years time we’ll just say, well, of course that’s how it is. Computers have always done that.”

    Benedict Evans, on how the whole shift ends up

    This is a dense, clear eyed conversation that rewards a full listen, especially if you are trying to think past the hype cycle about where AI value actually lands. Watch the full conversation here, and check out the “AI eats the world” presentation referenced throughout.

    Related Reading

    • Benedict Evans’ website: home of the “AI eats the world” presentation and his newsletter referenced throughout the conversation.
    • Andreessen Horowitz (a16z): the venture firm whose podcast hosted this discussion and where Evans was formerly a partner.
    • Jevons paradox (Wikipedia): background on the price elasticity idea Evans uses to explain how cheaper AI may lead to more usage rather than savings.
    • Stratechery by Ben Thompson: the analysis Evans cites on software as a designed workflow versus a process that grows out of how a business runs.
    • The Pursuit of Purpose: a PJFP look at finding direction and meaning in work as automation reshapes careers and industries.
  • Dario Amodei on the AGI Exponential: Anthropic’s High-Stakes Financial Model and the Future of Intelligence

    TL;DW (Too Long; Didn’t Watch)

    Anthropic CEO Dario Amodei joined Dwarkesh Patel for a high-stakes deep dive into the endgame of the AI exponential. Amodei predicts that by 2026 or 2027, we will reach a “country of geniuses in a data center”—AI systems capable of Nobel Prize-level intellectual work across all digital domains. While technical scaling remains remarkably smooth, Amodei warns that the real-world friction of economic diffusion and the ruinous financial risks of $100 billion training clusters are now the primary bottlenecks to total global transformation.


    Key Takeaways

    • The Big Blob Hypothesis: Intelligence is an emergent property of scaling compute, data, and broad distribution; specific algorithmic “cleverness” is often just a temporary workaround for lack of scale.
    • AGI is a 2026-2027 Event: Amodei is 90% certain we reach genius-level AGI by 2035, with a strong “hunch” that the technical threshold for a “country of geniuses” arrives in the next 12-24 months.
    • Software Engineering is the First Domino: Within 6-12 months, models will likely perform end-to-end software engineering tasks, shifting human engineers from “writers” to “editors” and strategic directors.
    • The $100 Billion Gamble: AI labs are entering a “Cournot equilibrium” where massive capital requirements create a high barrier to entry. Being off by just one year in revenue growth projections can lead to company-wide bankruptcy.
    • Economic Diffusion Lag: Even after AGI-level capabilities exist in the lab, real-world adoption (curing diseases, legal integration) will take years due to regulatory “jamming” and organizational change management.

    Detailed Summary: Scaling, Risk, and the Post-Labor Economy

    The Three Laws of Scaling

    Amodei revisits his foundational “Big Blob of Compute” hypothesis, asserting that intelligence scales predictably when compute and data are scaled in proportion—a process he likens to a chemical reaction. He notes a shift from pure pre-training scaling to a new regime of Reinforcement Learning (RL) and Test-Time Scaling. These allow models to “think” longer at inference time, unlocking reasoning capabilities that pre-training alone could not achieve. Crucially, these new scaling laws appear just as smooth and predictable as the ones that preceded them.

    The “Country of Geniuses” and the End of Code

    A recurring theme is the imminent automation of software engineering. Amodei predicts that AI will soon handle end-to-end SWE tasks, including setting technical direction and managing environments. He argues that because AI can ingest a million-line codebase into its context window in seconds, it bypasses the months of “on-the-job” learning required by human engineers. This “country of geniuses” will operate at 10-100x human speed, potentially compressing a century of biological and technical progress into a single decade—a concept he calls the “Compressed 21st Century.”

    Financial Models and Ruinous Risk

    The economics of building the first AGI are terrifying. Anthropic’s revenue has scaled 10x annually (zero to $10 billion in three years), but labs are trapped in a cycle of spending every dollar on the next, larger cluster. Amodei explains that building a $100 billion data center requires a 2-year lead time; if demand growth slows from 10x to 5x during that window, the lab collapses. This financial pressure forces a “soft takeoff” where labs must remain profitable on current models to fund the next leap.
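
    The lead-time risk described above comes down to compounding arithmetic. The following is a minimal sketch, not Anthropic's actual financial model: it assumes the $10 billion starting revenue mentioned in this summary, assumes the slower growth rate applies for the full two-year lead time, and uses hypothetical function and variable names.

```python
def projected_revenue(start_billions: float, annual_multiple: float, years: int) -> float:
    """Compound a starting revenue figure by a fixed yearly multiple."""
    return start_billions * annual_multiple ** years


# Assumptions for illustration only (not figures from Anthropic's plan):
START_BILLIONS = 10.0   # the roughly $10 billion revenue cited in the summary
LEAD_TIME_YEARS = 2     # the data center lead time cited in the summary

planned = projected_revenue(START_BILLIONS, 10, LEAD_TIME_YEARS)  # plan: 10x per year
actual = projected_revenue(START_BILLIONS, 5, LEAD_TIME_YEARS)    # outcome: 5x per year

# Fraction of the planned revenue that never materializes.
shortfall = 1 - actual / planned

print(f"planned: ${planned:,.0f}B, actual: ${actual:,.0f}B, shortfall: {shortfall:.0%}")
```

    Under these assumptions, halving the growth multiple for two years leaves revenue at a quarter of plan regardless of the starting figure, which is why a cluster sized for the optimistic curve becomes so hard to pay for.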

    Governance and the Authoritarian Threat

    Amodei expresses deep concern over “offense-dominant” AI, where a single misaligned model could cause catastrophic damage. He advocates for “AI Constitutions”—teaching models principles like “honesty” and “harm avoidance” rather than rigid rules—to allow for better generalization. Geopolitically, he supports aggressive chip export controls, arguing that democratic nations must hold the “stronger hand” during the inevitable post-AI world order negotiations to prevent a global “totalitarian nightmare.”


    Final Thoughts: The Intelligence Overhang

    The most chilling takeaway from this interview is the concept of the Intelligence Overhang: the gap between what AI can do in a lab and what the economy is prepared to absorb. Amodei suggests that while the “silicon geniuses” will arrive shortly, our institutions—the FDA, the legal system, and corporate procurement—are “jammed.” We are heading into a world of radical “biological freedom” and the potential cure for most diseases, yet we may be stuck in a decade-long regulatory bottleneck while the “country of geniuses” sits idle in their data centers. The winner of the next era won’t just be the lab with the most FLOPs, but the society that can most rapidly retool its institutions to survive its own technological adolescence.

    For more insights, visit Anthropic or check out the full transcript at Dwarkesh Patel’s Podcast.

  • Inside Microsoft’s AGI Masterplan: Satya Nadella Reveals the 50-Year Bet That Will Redefine Computing, Capital, and Control

    1) Fairwater 2 is live at unprecedented scale, with Fairwater 4 linking over a 1 Pb AI WAN

    Nadella walks through the new Fairwater 2 site and states Microsoft has targeted a 10x training capacity increase every 18 to 24 months relative to GPT-5’s compute. He also notes Fairwater 4 will connect on a one petabit network, enabling multi-site aggregation for frontier training, data generation, and inference.

    2) Microsoft’s MAI program, a parallel superintelligence effort alongside OpenAI

    Microsoft is standing up its own frontier lab and will “continue to drop” models in the open, with an omni-model on the roadmap and high-profile hires joining Mustafa Suleyman. This is a clear signal that Microsoft intends to compete at the top tier while still leveraging OpenAI models in products.

    3) Clarification on IP: Microsoft says it has full access to the GPT family’s IP

    Nadella says Microsoft has access to all of OpenAI’s model IP (consumer hardware excluded) and shared that the firms co-developed system-level designs for supercomputers. This resolves long-standing ambiguity about who holds rights to GPT-class systems.

    4) New exclusivity boundaries: OpenAI’s API is Azure-exclusive, SaaS can run elsewhere with limited exceptions

    The interview spells out that OpenAI’s platform API must run on Azure. ChatGPT as SaaS can be hosted elsewhere only under specific carve-outs, for example certain US government cases.

    5) Per-agent future for Microsoft’s business model

    Nadella describes a shift where companies provision Windows 365 style computers for autonomous agents. Licensing and provisioning evolve from per-user to per-user plus per-agent, with identity, security, storage, and observability provided as the substrate.

    6) The 2024–2025 capacity “pause” explained

    Nadella confirms Microsoft paused or dropped some leases in the second half of last year to avoid lock-in to a single accelerator generation, keep the fleet fungible across GB200, GB300, and future parts, and balance training with global serving to match monetization.

    7) Concrete scaling cadence disclosure

    The 10x training capacity target every 18 to 24 months is stated on the record while touring Fairwater 2. This implies the next frontier runs will be roughly an order of magnitude above GPT-5 compute.
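
    To put that cadence in yearly terms, the stated target can be converted to an equivalent per-year multiple. This is a rough sketch that assumes smooth exponential growth between the stated endpoints; the function name is hypothetical, and real capacity arrives in discrete build-outs.

```python
def annualized_growth(factor: float, months: float) -> float:
    """Convert 'factor x every `months` months' into an equivalent per-year multiple."""
    return factor ** (12.0 / months)


# The two ends of the stated 18 to 24 month range.
for months in (18, 24):
    per_year = annualized_growth(10, months)
    print(f"10x every {months} months is about {per_year:.2f}x per year")
```

    On that assumption, the target works out to somewhere between roughly 3.2x and 4.6x more training capacity per year.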

    8) Multi-model, multi-supplier posture

    Microsoft will keep using OpenAI models in products for years, build MAI models in parallel, and integrate other frontier models where product quality or cost warrants it.

    Why these points matter

    • Industrial scale: Fairwater’s disclosed networking and capacity targets set a new bar for AI factories and imply rapid model scaling.
    • Strategic independence: MAI plus GPT IP access gives Microsoft a dual track that reduces single-partner risk.
    • Ecosystem control: Azure exclusivity for OpenAI’s API consolidates platform power at the infrastructure layer.
    • New revenue primitives: Per-agent provisioning reframes Microsoft’s core metrics and pricing.

    Pull quotes

      “We’ve tried to 10x the training capacity every 18 to 24 months.”

      “The API is Azure-exclusive. The SaaS business can run anywhere, with a few exceptions.”

      “We have access to the GPT family’s IP.”

    TL;DW

    • Microsoft is building a global network of AI super-datacenters (Fairwater 2 and beyond) designed for fast upgrade cycles and cross-region training at petabit scale.
    • Strategy spans three layers: infrastructure, models, and application scaffolding, so Microsoft creates value regardless of which model wins.
    • AI economics shift margins, so Microsoft blends subscriptions with metered consumption and focuses on tokens per dollar per watt.
    • Future includes autonomous agents that get provisioned like users with identity, security, storage, and observability.
    • Trust and sovereignty are central. Microsoft leans into compliant, sovereign cloud footprints to win globally.

    Detailed Summary

    1) Fairwater 2: AI Superfactory

    Microsoft’s Fairwater 2 is presented as the most powerful AI datacenter yet, packing hundreds of thousands of GB200 and GB300 accelerators, tied by a petabit AI WAN and designed to stitch training jobs across buildings and regions. The key lesson: keep the fleet fungible and avoid overbuilding for a single hardware generation, because power density and cooling change with each new wave, such as Vera Rubin and Rubin Ultra.

    2) The Three-Layer Strategy

    • Infrastructure: Azure’s hyperscale footprint, tuned for training, data generation, and inference, with strict flexibility across model architectures.
    • Models: Access to OpenAI’s GPT family for seven years plus Microsoft’s own MAI roadmap for text, image, and audio, moving toward an omni-model.
    • Application Scaffolding: Copilots and agent frameworks like GitHub’s Agent HQ and Mission Control that orchestrate many agents on real repos and workflows.

    This layered approach lets Microsoft compete whether the value accrues to models, tooling, or infrastructure.

    3) Business Models and Margins

    AI raises COGS relative to classic SaaS, so pricing blends entitlements with consumption tiers. GitHub Copilot helped catalyze a multibillion-dollar market in a year, even as rivals emerged. Microsoft aims to ride a market that is expanding 10x rather than clinging to legacy share. Efficiency focus: tokens per dollar per watt, achieved through software optimization as much as hardware.

    4) Copilot, GitHub, and Agent Control Planes

    GitHub becomes the control plane for multi-agent development. Agent HQ and Mission Control aim to let teams launch, steer, and observe multiple agents working in branches, with repo-native primitives for issues, actions, and reviews.

    5) Models vs Scaffolding

    Nadella argues model monopolies are checked by open source and substitution. Durable value sits in the scaffolding layer that brings context, data liquidity, compliance, and deep tool knowledge, exemplified by an Excel agent that understands formulas and artifacts beyond screen pixels.

    6) Rise of Autonomous Agents

    Two worlds emerge: human-in-the-loop Copilots and fully autonomous agents. Microsoft plans to provision agents with computers, identity, security, storage, and observability, evolving end-user software into an infrastructure business for agents as well as people.

    7) MAI: Microsoft’s In-House Frontier Effort

    Microsoft is assembling a top-tier lab led by Mustafa Suleyman and veterans from DeepMind and Google. Early MAI models show progress in multimodal arenas. The plan is to combine OpenAI access with independent research and product-optimized models for latency and cost.

    8) Capex and Industrial Transformation

    Capex has surged. Microsoft frames this era as capital intensive and knowledge intensive. Software scheduling, workload placement, and continual throughput improvements are essential to maximize returns on a fleet that upgrades every 18 to 24 months.

    9) The Lease Pause and Flexibility

    Microsoft paused some leases to avoid single-generation lock-in and to prevent over-reliance on a small number of mega-customers. The portfolio favors global diversity, regulatory alignment, balanced training and inference, and location choices that respect sovereignty and latency needs.

    10) Chips and Systems

    Custom silicon like Maia will scale in lockstep with Microsoft’s own models and OpenAI collaboration, while Nvidia remains central. The bar for any new accelerator is total fleet TCO, not just raw performance, and system design is co-evolved with model needs.

    11) Sovereign AI and Trust

    Nations want AI benefits with continuity and control. Microsoft’s approach combines sovereign cloud patterns, data residency, confidential computing, and compliance so countries can adopt leading AI while managing concentration risk. Nadella emphasizes trust in American technology and institutions as a decisive global advantage.


    Key Takeaways

    1. Build for flexibility: Datacenters, pricing, and software are optimized for fast evolution and multi-model support.
    2. Three-layer stack wins: Infrastructure, models, and scaffolding compound each other and hedge against shifts in where value accrues.
    3. Agents are the next platform: Provisioned like users with identity and observability, agents will demand a new kind of enterprise infrastructure.
    4. Efficiency is king: Tokens per dollar per watt drives margins more than any single chip choice.
    5. Trust and sovereignty matter: Compliance and credible guarantees are strategic differentiators in a bipolar world.