PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: ai compute

  • Elon Musk Announces SpaceX AI Satellites, Starship Mass to Orbit, and a Moon Mass Driver to Climb the Kardashev Scale

    Elon Musk sat down with the SpaceX Starlink team for a wide-ranging update that connects every recent SpaceX move into one thesis: harness far more of the sun’s energy by putting AI compute in orbit. In this conversation, the group walks from galaxy-sized framing (the Kardashev scale) all the way down to the engineering specifics of a new AI satellite, the manufacturing buildout in Bastrop, Texas, and a long-term plan that ends with a mass driver on the moon. The pitch is that none of it requires magic, just scaling technology SpaceX already flies.

    TLDW

    Musk frames civilizational progress with the Kardashev scale, a measure of how much power a species harnesses, and points out that humanity uses less than a trillionth of the sun’s output, barely registering even on the Type 1 (planet) level. Because most of Earth is water and the usable sunlit land is limited, the only way to capture a meaningful fraction of the sun’s energy is to go to space, where cooling is also easier since heat radiates straight into the vacuum. Three limiting factors must be solved: mass to orbit (handled by fully and rapidly reusable Starship, which already beats the Saturn V on thrust and aims for millions of tons to orbit per year), solar power plus radiators, and AI chips. SpaceX unveils its first AI satellite design, AI1, a roughly 70 meter wingspan craft at 150 kW peak and 120 kW sustained power that matches an Nvidia GB300 rack, reuses Starlink V3 solar technology, links by laser, and runs at only a few milliseconds of latency from low orbit. Chips start as off the shelf Nvidia GB300 and Rubin parts plus a TPU reference design, then scale through a planned 100 million square foot “Terafab” toward a terawatt per year of compute, about twice current US electricity use. The endgame pushes another 1,000x by manufacturing on the moon and using a lunar mass driver to fling satellites into deep space without rockets.

    Thoughts

    The most important reframe in this conversation is that Starlink, Starship, the xAI acquisition, and a new chip factory are not separate bets. They are one bet expressed as a single number: the percentage of the sun’s energy that civilization can capture and put to work. By anchoring everything to the Kardashev scale, Musk turns “build more satellites” into a measurable physics goal rather than a product roadmap. It is a rhetorically powerful move because it makes today’s hyperscale AI buildout, which already strains terrestrial grids, look like the obvious forcing function for going to space. If you accept that compute demand keeps compounding, then the constraint stops being chips and becomes power and cooling, and space genuinely is better at both.

    The cleverest engineering insight is almost understated: an AI satellite is simpler than a Starlink satellite, not harder. A Starlink craft carries complex phased array and parabolic antennas to talk to millions of dispersed users. An orbital data center mostly needs solar cells, radiators, some laser links, and the chips. SpaceX has already industrialized the hard parts (mass-produced solar arrays, constellation flight operations at roughly 10,000 satellites, laser mesh networking), so the new product is closer to a remix of proven subsystems than a clean-sheet program. That is the real argument for why SpaceX, specifically, can do this when “data center in space” has sounded like science fiction for a decade.

    The numbers are where skepticism should live, and to his credit Musk says to take the timeline with a grain of salt. An annualized rate of a gigawatt of space compute per year by the end of next year, scaling roughly 10x per year toward a terawatt per year, is an extraordinary ramp. A terawatt is about twice the average electrical power the entire United States consumes, delivered as orbiting hardware. Getting there leans on Starship hitting rapid reusability and on a 100 million square foot chip fab that is ten times the size of Gigafactory Texas. Each of those is itself a moonshot, and stacking them multiplies the risk. The honest read is that the architecture is coherent even if the schedule is aspirational.

    The moon segment is where the talk turns from aggressive to genuinely speculative, and it is the part worth watching. A lunar mass driver, essentially a long linear motor that accelerates payloads to escape velocity, only makes sense once you are already moving enormous mass and want to escape Earth’s gravity well and atmosphere entirely. It is a classic Musk pattern: solve the near term problem (mass to orbit with Starship) in a way that creates the precondition for the next, larger problem (local production on the moon). Whether or not the dates hold, the dependency chain is logical, and it explains why SpaceX keeps investing in capabilities that look excessive for today’s market.

    One underrated takeaway for readers outside aerospace: this is as much a manufacturing story as a space story. The bottleneck is not whether a single AI satellite works, it is whether you can stamp out thousands to a million of them, plus the solar, plus the chips, at volume and low cost. That is why so much of the conversation is about Bastrop production lines, a solar manufacturing facility already under construction, and the Terafab. The space hardware is the visible part; the factories are the actual product.

    Key Takeaways

    • The whole strategy is framed around the Kardashev scale, a measure of how much power a civilization harnesses, named for Russian physicist Nikolai Kardashev.
    • Type 1 harnesses a planet’s available power, Type 2 a star’s full output, and Type 3 a galaxy’s; humanity sits at the very bottom of even Type 1.
    • We currently use much less than a trillionth of the sun’s power output, and a trillion is a million times a million.
    • The sun is about 99.86% of all mass in the solar system; most of the remaining 0.14% is Jupiter, and Earth is a tiny dust mote by comparison.
    • Incident solar energy on Earth’s cross section is roughly a half billionth of the sun’s total power output.
    • Most of that sunlight is unusable because about 70% of Earth is water and much of the land is at the poles or far north where solar is weak.
    • Reaching one millionth of the sun’s output, a “micro” milestone on the way to Kardashev Type 2, would be an epic achievement relative to today, and 1% would make a civilization vastly more powerful than ours.
    • Space avoids building massive ground power plants and makes cooling easier, because waste heat can radiate directly into the vacuum.
    • Three limiting factors must be solved to scale: mass to orbit, solar power plus radiators, and AI chips.
    • Starship provides the mass to orbit and is the first rocket designed for full and rapid reusability, the breakthrough behind both multiplanetary life and ascending the Kardashev scale.
    • SpaceX catches the booster with the launch tower instead of adding heavy landing legs, an extreme mass optimization measure.
    • Starship V3 already produces more than double the thrust of the Saturn V, and V4 will be roughly three times; Starship is the largest, heaviest, most powerful moving object ever built.
    • Starship is targeted to eventually fly more than once per hour.
    • SpaceX already delivers roughly 85 to 90% of all mass launched from Earth to orbit, using Falcon 9 and Falcon Heavy.
    • The plan is to go from around 2,500 tons to orbit per year to millions of tons per year, reaching a million tons per year in about three years.
    • The AI satellite, called AI1, is actually simpler than a Starlink satellite because it lacks the complex phased array and parabolic antennas.
    • AI1 targets 150 kW peak power and 120 kW sustained power, roughly matching an Nvidia GB300 rack of 72 GPUs.
    • Design assumptions are about 250 watts per square meter for the solar array and about 1,400 watts per square meter for the double sided radiators, both expected to improve over time.
    • Radiators are oriented knife edge to the sun and radiate from both sides; each satellite has roughly a 70 meter wingspan.
    • Each satellite carries on the order of a terabit of laser link connectivity.
    • Satellites connect to each other or to the Starlink constellation by laser, and Starlink relays data to the ground over existing Ka and Ku antennas plus laser to ground links.
    • At 600 to 800 km altitude, the signal travel time to the ground is only around 3 milliseconds, since light travels about 300 km per millisecond.
    • SpaceX has about 10,000 Starlinks in orbit and is the only operator with experience flying constellations at that scale.
    • The constellation could eventually grow to thousands or even up to a million satellites; space is big enough to pack and fly them safely.
    • The satellites and solar will be built in Bastrop, Texas, where a solar manufacturing facility is already under construction.
    • The AI satellite production building and solar production are expected to be operating at reasonable volume by the end of next year.
    • SpaceX keeps making Starlink user terminals in Bastrop and is turning on new, higher volume production lines, with possibly a few hundred million terminals eventually, plus a direct to cell constellation that connects straight to phones.
    • Initial chips are off the shelf: the reference design targets Nvidia GB300 or Rubin chips, with a TPU reference design as well, and essentially any existing chip can be put into orbit.
    • The chip industry looks set to reach maybe 100 gigawatts a year of AI compute, far short of the terawatt SpaceX wants.
    • To close that gap, SpaceX plans a “Terafab,” a chip factory around 100 million square feet, roughly 10 times the size of Tesla Gigafactory Texas.
    • A terawatt of chip output per year is like a billion full reticle equivalent chips, each running about a kilowatt, plus a lot of memory.
    • The timeline targets an annualized rate of a gigawatt per year of space compute by the end of next year, scaling roughly 10x per year: 10 GW in about 2.5 years, 100 GW in about 3.5 years, then a terawatt per year, which is 1,000 GW and about twice current US electricity consumption.
    • Beyond a terawatt, the only path to another 1,000x is the moon, using local production of solar photovoltaics and radiators so most mass does not have to be shipped from Earth.
    • A lunar mass driver (a linear electric motor or rail gun) could accelerate AI satellites into deep space without rockets, thanks to the moon’s lack of atmosphere and one sixth gravity.
    • Bringing that much mass to the moon would also make it possible for anyone who wants to go to the moon to go, and even live there.
    • Musk stresses none of this requires magic; the AI satellite reuses Starlink V3 solar technology, and he frames the timelines as a best guess rather than a promise.
    • SpaceX has acquired xAI, now referred to as SpaceX AI, folding its AI ambitions directly into the space company.

    Detailed Summary

    The Kardashev Scale and Why Earth Barely Registers

    Musk opens with the question of how you objectively measure a civilization’s progress, the metric an alien species would use to calibrate us. The answer he reaches for is the Kardashev scale, named for the Russian physicist who proposed it, which ranks civilizations by the power they harness: a planet’s worth (Type 1), a star’s worth (Type 2), or a galaxy’s worth (Type 3). Humanity is extremely low even on Type 1. To dramatize the scale of the sun, he notes it is about 99.86% of all the mass in the solar system, with most of the rest being Jupiter and Earth a tiny dust mote in the miscellaneous category. The incident solar energy hitting Earth’s cross section is only about a half billionth of the sun’s total output, and we capture a vanishingly small slice of even that.
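    The fractions quoted above can be checked with a few lines of arithmetic. The sketch below is a rough sanity check, not something from the talk: the solar luminosity, solar constant, and Earth radius are standard reference values, and the figure for humanity's total power use (about 20 TW) is an approximate assumption introduced here for illustration.

```python
import math

# Standard reference values (not taken from the talk).
SOLAR_LUMINOSITY_W = 3.828e26   # total power output of the sun
SOLAR_CONSTANT_W_M2 = 1361.0    # sunlight intensity at Earth's distance
EARTH_RADIUS_M = 6.371e6        # mean radius of Earth

# Assumption: humanity's total primary power use is on the order of 20 TW.
HUMAN_POWER_W = 2.0e13


def intercepted_fraction() -> float:
    """Fraction of the sun's output that falls on Earth's cross section."""
    cross_section_m2 = math.pi * EARTH_RADIUS_M ** 2
    intercepted_w = SOLAR_CONSTANT_W_M2 * cross_section_m2
    return intercepted_w / SOLAR_LUMINOSITY_W


def human_fraction() -> float:
    """Fraction of the sun's output that humanity uses (under the assumption above)."""
    return HUMAN_POWER_W / SOLAR_LUMINOSITY_W


if __name__ == "__main__":
    print(f"Earth intercepts {intercepted_fraction():.2e} of solar output")
    print(f"Humanity uses    {human_fraction():.2e} of solar output")
    print(f"One millionth of the sun is {SOLAR_LUMINOSITY_W * 1e-6:.2e} W")
```

    Under these inputs the intercepted fraction comes out a little under half a billionth and humanity's share falls well below a trillionth, which is consistent with both claims in the summary.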

    Why Energy at Scale Means Going to Space

    Because roughly 70% of Earth is water and much of the remaining land sits at the poles or in far northern regions where solar is weak and few people live, the usable area for ground solar is small. To reach any meaningful percentage of the sun’s energy, you have to go to space. Musk sets the aspiration at a millionth of the sun’s output as a first “micro” milestone, noting that even 1% would make a civilization vastly more powerful than today’s. Orbit also solves two practical problems at once: you avoid building enormous terrestrial power plants, and cooling becomes easier because waste heat can be radiated straight into the vacuum rather than fought against in an atmosphere.

    The Three Limiting Factors

    Scaling to space-based compute comes down to three things: a large mass-to-orbit capability, a lot of solar power and radiators, and a lot of AI chips. To put a hundred gigawatts and ultimately a terawatt into space, you need a terawatt of solar generation, the radiators to reject the heat, and a terawatt of AI chips. The rest of the conversation works through each limiting factor in turn, starting with the one SpaceX has spent two decades on.

    Starship and the Reusability Breakthrough

    Starship supplies the mass to orbit. Musk argues that full and rapid reusability is the fundamental breakthrough required for both multiplanetary life and climbing the Kardashev scale, since expendable rockets are simply too expensive and you cannot build enough of them. Every other mode of transport, from cars to planes to bicycles, is reusable; rockets are uniquely hard because Earth has a deep gravity well and thick atmosphere, which is why many prior reusable rocket attempts were abandoned. SpaceX pushes mass optimization to the extreme, even catching the booster with the launch tower instead of carrying heavy landing legs. The goal beyond catching the rocket is reflying it with no refurbishment, like an aircraft. Starship V3 already more than doubles the Saturn V’s thrust, V4 will be roughly triple, and the vehicle is the largest and most powerful moving object ever made, targeted to fly more than once per hour. SpaceX already lifts an estimated 85 to 90% of all mass launched from Earth to orbit, and plans to scale from about 2,500 tons per year to millions of tons per year, reaching a million tons per year in roughly three years.
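    To get a feel for how steep that ramp is, the sketch below works out the implied yearly growth factor from the figures in the talk (2,500 tons per year today, a million tons per year in about three years). The payload per flight is not stated in the source; the 100 tons used here is a hypothetical round number chosen only to illustrate the launch cadence such a tonnage would require.

```python
# Figures from the talk.
CURRENT_TONS_PER_YEAR = 2_500
TARGET_TONS_PER_YEAR = 1_000_000
YEARS_TO_TARGET = 3  # "about three years"

# Assumption (not stated in the talk): payload delivered per Starship flight.
ASSUMED_TONS_PER_FLIGHT = 100
HOURS_PER_YEAR = 365 * 24


def annual_growth_factor(start: float, target: float, years: float) -> float:
    """Constant yearly multiplier needed to go from start to target."""
    return (target / start) ** (1 / years)


def flights_per_year(tons_per_year: float, tons_per_flight: float) -> float:
    """Number of flights needed to lift a given annual tonnage."""
    return tons_per_year / tons_per_flight


if __name__ == "__main__":
    growth = annual_growth_factor(
        CURRENT_TONS_PER_YEAR, TARGET_TONS_PER_YEAR, YEARS_TO_TARGET
    )
    flights = flights_per_year(TARGET_TONS_PER_YEAR, ASSUMED_TONS_PER_FLIGHT)
    print(f"Total increase: {TARGET_TONS_PER_YEAR / CURRENT_TONS_PER_YEAR:.0f}x")
    print(f"Implied growth: {growth:.2f}x per year")
    print(f"Flights needed: {flights:.0f} per year")
    print(f"Fleet cadence:  {flights / HOURS_PER_YEAR:.2f} launches per hour")
```

    With the assumed payload, a million tons per year works out to a fleet-wide cadence slightly above one launch per hour, so the tonnage goal and the hourly flight-rate target are at least the same order of magnitude. A different payload figure would shift that result proportionally.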

    Inside the AI Satellite (AI1)

    The team explains that a data center in space is not a building with engines bolted on; it reduces to chips plus the power and cooling to run them. The AI satellite, dubbed AI1, is actually simpler than a Starlink satellite because it skips the complex phased array and parabolic antennas, leaving mostly solar cells, a radiator, and some laser links. The draft version targets 150 kW peak power and 120 kW sustained, roughly matching what an Nvidia GB300 rack of 72 GPUs draws. Design assumptions are about 250 watts per square meter of solar array and about 1,400 watts per square meter for double-sided radiators oriented knife edge to the sun, both numbers expected to improve. The result is a craft with around a 70 meter wingspan and roughly a terabit of laser connectivity. Compute racks link to each other or to the Starlink constellation by laser, and data reaches the ground via existing Ka and Ku antennas or laser to ground links. From 600 to 800 km up, the signal travel time to the ground is only about 3 milliseconds, since light travels about 300 km per millisecond, so the common worry about high latency does not apply.
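    The design numbers above imply rough sizes that are easy to work out. The following is a back-of-the-envelope sketch, not SpaceX's sizing method. It assumes the solar array is sized for peak power, that all sustained electrical power ends up as heat to be rejected, and that the 1,400 W per square meter figure is per square meter of radiator panel with both faces counted. The emissivity of 0.9 is a hypothetical value, and the temperature estimate ignores heat absorbed from the sun and Earth.

```python
# Figures from the talk.
PEAK_POWER_W = 150_000
SUSTAINED_POWER_W = 120_000
SOLAR_W_PER_M2 = 250
RADIATOR_W_PER_M2 = 1_400  # double-sided radiator, assumed per m^2 of panel

# Physical constants and illustrative assumptions (not from the talk).
STEFAN_BOLTZMANN = 5.670374419e-8      # W / (m^2 K^4)
ASSUMED_EMISSIVITY = 0.9               # hypothetical radiator coating
SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # km per millisecond


def solar_area_m2(power_w: float = PEAK_POWER_W) -> float:
    """Solar array area needed at the assumed areal power density."""
    return power_w / SOLAR_W_PER_M2


def radiator_area_m2(heat_w: float = SUSTAINED_POWER_W) -> float:
    """Radiator panel area needed to reject the given heat load."""
    return heat_w / RADIATOR_W_PER_M2


def radiator_temperature_k() -> float:
    """Panel temperature at which two faces radiate RADIATOR_W_PER_M2 in total."""
    flux_per_face = RADIATOR_W_PER_M2 / 2
    return (flux_per_face / (ASSUMED_EMISSIVITY * STEFAN_BOLTZMANN)) ** 0.25


def light_time_ms(distance_km: float) -> float:
    """One-way light travel time over a straight-line distance."""
    return distance_km / SPEED_OF_LIGHT_KM_PER_MS


if __name__ == "__main__":
    print(f"Solar array:   {solar_area_m2():.0f} m^2")
    print(f"Radiator:      {radiator_area_m2():.0f} m^2")
    print(f"Radiator temp: {radiator_temperature_k():.0f} K")
    for altitude_km in (600, 800):
        print(f"{altitude_km} km: {light_time_ms(altitude_km):.2f} ms one way")
```

    Under these assumptions the solar array is several times larger than the radiator, which fits the description of a craft dominated by its solar wings. The light-time figures are for a straight path directly below the satellite; real paths through laser relays and ground networks would be longer.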

    Operating a Constellation of a Million Satellites

    The satellites are large, but space is enormous, so even thousands or up to a million of them would not crowd orbit; viewed against the Earth they are nearly invisible. SpaceX leans on hard-won operational experience, with about 10,000 Starlinks already flying and a unique track record of operating constellations at that scale safely. Knowing how tightly satellites can be packed and flown without collisions is treated as the number one constraint when designing the constellation.

    Manufacturing in Bastrop, Texas

    The satellites and solar will be built in Bastrop, Texas, in a facility the hosts describe as already massive and about to be dwarfed by what comes next. A solar manufacturing facility is already under construction, and the AI satellite production building will follow, with both expected to operate at reasonable volume by the end of next year. The same site keeps producing Starlink user terminals and is spinning up new, higher volume lines. Musk projects there could eventually be a few hundred million Starlink terminals, alongside a direct to cell constellation that connects straight from a phone to space for high bandwidth communication.

    Chips, the Terafab, and the Road to a Terawatt

    In the near term, SpaceX simply launches chips that already exist. The current reference design targets Nvidia GB300 or Rubin chips, with a TPU reference design as well, and essentially any existing chip can be flown. The problem is that the chip industry as a whole may only reach about 100 gigawatts a year of AI compute, which does not answer how you get to a terawatt. The answer is a gigantic chip factory, a “Terafab” around 100 million square feet, roughly ten times the size of Tesla Gigafactory Texas, big enough that Musk jokes about needing Starship point to point to cross it. Even with no new fundamental breakthroughs, scaling existing chip technology to a terawatt of output per year is, from a logic die standpoint, like a billion full reticle equivalent chips each running a kilowatt, plus a lot of memory. The stated timeline is an annualized gigawatt per year of space compute by the end of next year, then scaling roughly an order of magnitude per year: about 10 GW in 2.5 years, 100 GW in 3.5 years, and eventually a terawatt per year, which is 1,000 GW, about twice the current electricity consumption of the United States. Musk repeatedly flags these as best guesses, not promises.
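    The terawatt arithmetic in this section can be reproduced directly. The sketch below uses the figures from the talk (a kilowatt per chip, 10x growth per year, 120 kW sustained per AI1 satellite) plus one outside assumption: that US electricity consumption is roughly 4,000 TWh per year. It also assumes, purely for illustration, that per-satellite power stays at the draft AI1 level.

```python
import math

# Figures from the talk.
TERAWATT_W = 1e12
GIGAWATT_W = 1e9
CHIP_POWER_W = 1_000       # "about a kilowatt" per full-reticle chip
AI1_SUSTAINED_W = 120_000  # draft AI1 sustained power
GROWTH_PER_YEAR = 10       # "roughly 10x per year"

# Assumption (not from the talk): approximate annual US electricity use.
US_ELECTRICITY_TWH_PER_YEAR = 4_000
HOURS_PER_YEAR = 8_760


def chips_per_terawatt() -> float:
    """Number of one-kilowatt chips that add up to a terawatt."""
    return TERAWATT_W / CHIP_POWER_W


def satellites_per_gigawatt() -> float:
    """AI1-class satellites needed per gigawatt of sustained compute."""
    return GIGAWATT_W / AI1_SUSTAINED_W


def years_to_scale(start_gw: float, target_gw: float, growth: float) -> float:
    """Years of constant multiplicative growth to go from start to target."""
    return math.log10(target_gw / start_gw) / math.log10(growth)


def us_average_power_tw() -> float:
    """Average US electrical power implied by the assumed annual consumption."""
    return US_ELECTRICITY_TWH_PER_YEAR / HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"Chips per terawatt:      {chips_per_terawatt():.0e}")
    print(f"Satellites per gigawatt: {satellites_per_gigawatt():.0f}")
    print(f"Years from 1 GW to 1 TW: {years_to_scale(1, 1000, GROWTH_PER_YEAR):.1f}")
    print(f"1 TW vs US average:      {1 / us_average_power_tw():.1f}x")
```

    The chip count matches the "billion chips" figure, and three further years of 10x growth separate the first gigawatt from a terawatt. The comparison to US electricity holds to within the rounding implied by "about twice," given the assumed consumption figure.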

    The Moon, a Mass Driver, and the Next 1,000x

    Asked why stop at a terawatt, Musk says a terawatt is actually very small. Getting another three orders of magnitude, a 1,000x jump, points to the moon. The plan is local lunar production of solar photovoltaics and radiators, so that most of the mass does not have to be transported from Earth, with chips either shipped up or eventually made on the moon. Because the moon has no atmosphere and only one sixth of Earth’s gravity, you can accelerate AI satellites into deep space without a rocket, using an electromagnetic mass driver, essentially a rail gun or linear electric motor. A side benefit of moving that much mass to the moon is that anyone who wants to go to the moon would be able to, and could even live there. The team closes on the excitement of building a whole new kind of satellite and the sci-fi prospect of a mass driver on the moon.

    Notable Quotes

    “We currently use much less than a trillionth of the power output of the sun. And a trillion is a million times a million.”

    Elon Musk, on how far humanity sits from harnessing the sun’s energy

    “The sun is about 99.86% of all mass in the solar system.”

    Elon Musk, dramatizing the scale of the star we orbit

    “You’re an extremely kick-ass civilization if you get to 1% of the sun’s energy.”

    Elon Musk, on what a meaningful Kardashev milestone would look like

    “Reusability is the fundamental breakthrough that is necessary to make life multiplanetary, as well as to ascend the Kardashev scale.”

    Elon Musk, on why Starship matters

    “An AI satellite is essentially a lot of solar cells, a radiator, and you still need some laser links, but you don’t have all of the super complex antennas that you have on a Starlink satellite.”

    Elon Musk, on why the orbital data center is simpler than Starlink

    “There’s not some magic that’s necessary that doesn’t exist for the AI satellites.”

    Elon Musk, on reusing existing Starlink technology

    “We expect that the Terafab is going to be around 100 million square feet, which is 10 times the size of the Tesla Gigafactory Texas.”

    Elon Musk, on the chip factory needed to reach a terawatt

    “The only way that we can really see that you can achieve that is on the moon with a mass driver.”

    Elon Musk, on scaling another 1,000x beyond a terawatt

    Watch the full conversation here: Elon Musk and the SpaceX team on AI satellites and climbing the Kardashev scale.

    Related Reading

    • Kardashev scale (Wikipedia), background on the Type 1, 2, and 3 framework that anchors the entire conversation.
    • Starship (SpaceX), the official page for the fully reusable vehicle behind the mass to orbit numbers.
    • Starlink, the constellation whose solar arrays, laser links, and operations the AI satellites are built on.
    • Mass driver (Wikipedia), the electromagnetic launch concept proposed for flinging satellites off the moon.
    • Nvidia GB300 (Nvidia), the GPU rack whose power profile defines the first AI satellite’s compute target.
  • Anthropic Raises $65 Billion Series H at $965 Billion Valuation to Fund AI Safety Research and Massive Compute Expansion

    Anthropic has closed one of the largest private financing rounds in the history of technology, raising $65 billion in Series H funding at a $965 billion post-money valuation. The round, announced on May 28, 2026, lands as demand for Claude reaches what the company calls historic levels, and it positions Anthropic to pour fresh capital into safety research, compute, and the products that enterprises now lean on every day.

    TLDR

    Anthropic raised $65 billion in its Series H at a $965 billion post-money valuation, with Altimeter Capital, Dragoneer, Greenoaks, and Sequoia Capital leading and Capital Group, Coatue, D1 Capital Partners, GIC, ICONIQ, and XN co-leading, alongside $15 billion in previously committed hyperscaler investment that includes $5 billion from Amazon. The raise follows Anthropic crossing $47 billion in run-rate revenue earlier in May 2026, and it funds three priorities named by CFO Krishna Rao: advancing safety and interpretability research, expanding compute capacity to meet growing Claude demand, and scaling the products and partnerships customers depend on. On the infrastructure side, the company is locking in gigawatt-scale compute through 5 gigawatts with Amazon, 5 gigawatts of TPU capacity via Google and Broadcom, GPU access from SpaceX, and supply from partners Micron, Samsung, and SK hynix, while Claude remains available across all three major cloud platforms, AWS, Google Cloud, and Microsoft Azure, with widespread enterprise adoption across industries.

    Thoughts

    Start with the number that everyone will fixate on. A $965 billion post-money valuation against $47 billion in run-rate revenue is roughly 20 times sales, and for a company growing this fast that multiple is not the interesting part. The interesting part is that run-rate revenue crossed $47 billion earlier this month, which means the denominator is moving so quickly that the multiple is already stale. Investors are not pricing the business Anthropic is today. They are pricing the slope. A 20x multiple on a number that may double again inside a year is a very different bet than 20x on a flat line, and the lead names here (Altimeter, Dragoneer, Greenoaks, Sequoia, with Capital Group, Coatue, GIC and others co-leading) are not the kind of capital that pays for nostalgia. They are paying for the second derivative.

    But the real story is not the valuation. It is the compute. Read the infrastructure list carefully and you see the actual problem this round solves: 5 gigawatts from Amazon, 5 gigawatts of TPU capacity through Google and Broadcom, GPU access from SpaceX, and memory supply locked down with Micron, Samsung, and SK hynix. That is 10 gigawatts of named capacity before counting the SpaceX GPUs. The constraint on frontier AI in 2026 is no longer talent or even algorithms. It is electricity, land, and the multi-year queue for advanced packaging and high-bandwidth memory. You cannot buy 10 gigawatts on a quarterly basis. You reserve it years out, and you need the balance sheet to make those commitments credible. A $65 billion raise is, in plain terms, the down payment that lets Anthropic sign for capacity nobody can conjure on demand. The money is downstream of the megawatts.

    The diversification across that compute stack matters as much as the size. By splitting between Amazon’s infrastructure, Google and Broadcom’s custom TPUs, and SpaceX-supplied GPUs, Anthropic is refusing to become hostage to any single supplier’s roadmap or pricing. Custom silicon through Broadcom in particular is a bet on bending the cost curve, because the long-term economics of serving Claude at this scale depend on dollars per token, not just on raw availability. Anyone who has watched cloud lock-in play out over the last decade understands the move. Optionality at the hardware layer is leverage, and leverage is what keeps margins from being dictated by whoever owns the only fab slot you can reach.

    It is worth pausing on the fact that the round explicitly funds safety and interpretability research alongside scaling, and not as a footnote. Most companies treat safety spend as a cost center to be minimized once growth kicks in. Naming it first, ahead of compute and products, is a statement about where Anthropic believes its durable advantage sits. If models keep getting more capable, the binding constraint on deployment inside regulated industries (finance, healthcare, government) becomes trust, not intelligence. Interpretability is the work that turns a black box into something an enterprise risk committee can actually sign off on. Framed that way, safety research is not philanthropy subtracted from the bottom line. It is the thing that unlocks the most lucrative and defensible parts of the market, and pairing it with the scaling budget is the tell.

    Finally, look at distribution. Claude now ships on all three major clouds at once: AWS, Google Cloud, and Microsoft Azure. In a market where most frontier labs are tethered to a single hyperscaler, being available everywhere enterprises already run their workloads is a structural edge. It removes the procurement friction of asking a customer to adopt a new vendor relationship, and it means Anthropic competes on the merits of the model rather than on which cloud a buyer happened to standardize on years ago. Combine that omnipresent distribution with the compute reservations and the explicit safety mandate, and the shape of the strategy is clear. This is not a company buying time. It is a company buying the three things that actually compound: capacity that cannot be rushed, trust that cannot be faked, and reach into every place where work already happens.

    Key Takeaways

    • Anthropic raised $65 billion in its Series H funding round, one of the largest private financings in the history of the technology industry.
    • The round set Anthropic’s post-money valuation at $965 billion, placing the company within reach of the $1 trillion mark.
    • Altimeter Capital, Dragoneer, Greenoaks, and Sequoia Capital led the Series H round.
    • Capital Group, Coatue, D1 Capital Partners, GIC, ICONIQ, and XN served as co-leads on the investment.
    • The new capital builds on $15 billion in previously committed hyperscaler investments, which includes $5 billion from Amazon.
    • Anthropic crossed $47 billion in run-rate revenue earlier in May 2026, reflecting the surging commercial demand for Claude.
    • A core priority for the funding is to advance Anthropic’s safety and interpretability research.
    • The company will use the capital to expand compute capacity in order to meet growing demand for Claude.
    • Anthropic plans to scale the products and partnerships that customers depend on across its business.
    • CFO Krishna Rao said the funding will help Anthropic serve the historic demand it is experiencing, stay at the research frontier, and bring Claude to more of the places where work happens.
    • Amazon is providing 5 gigawatts of compute capacity as part of Anthropic’s infrastructure expansion.
    • Google and Broadcom are supplying 5 gigawatts of TPU capacity to power Claude’s growth.
    • SpaceX is contributing GPU access to Anthropic’s compute footprint.
    • Micron, Samsung, and SK hynix are partnering with Anthropic on memory and infrastructure to support its scaling needs.
    • Claude is available on all three major cloud platforms, AWS, Google Cloud, and Microsoft Azure.
    • Anthropic reports widespread enterprise adoption of Claude across a broad range of industries.

    Detailed Summary

    The Raise and the Valuation

    Anthropic has raised $65 billion in Series H funding, a round that values the company at $965 billion on a post-money basis. The size of the raise places it among the largest private financing events the technology industry has ever seen, and the valuation pushes Anthropic to the doorstep of the trillion-dollar mark. The capital arrives at a moment when demand for the company’s Claude models has accelerated sharply, and the round is built to fund the response to that demand rather than simply mark a milestone. Anthropic framed the financing in its Series H announcement as the fuel for staying at the research frontier while scaling the infrastructure and products that customers increasingly rely on.

    Who Put In the Money

    The Series H was led by Altimeter Capital, Dragoneer, Greenoaks, and Sequoia Capital, a group that combines deep growth-stage technology experience with conviction in Anthropic’s long-term trajectory. Joining as co-leads were Capital Group, Coatue, D1 Capital Partners, GIC, ICONIQ, and XN, a roster that spans crossover funds, sovereign wealth, and institutional investors. Beyond the new equity, Anthropic pointed to $15 billion in previously committed hyperscaler investment, including $5 billion from Amazon. Taken together, the investor base reflects a mix of financial backers and strategic partners with a direct stake in seeing Claude reach more customers and more compute.

    Revenue at $47 Billion Run-Rate

    Underpinning the valuation is a business that has scaled with unusual speed. Anthropic crossed a $47 billion run-rate revenue figure earlier in May 2026, a number that signals how quickly enterprises and developers have adopted Claude across their workflows. Run-rate revenue annualizes the company’s most recent performance, and at this level it puts Anthropic firmly among the fastest-growing software businesses on record. That financial momentum is the practical justification for both the round’s size and the near-trillion-dollar valuation investors were willing to support.
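    For readers less familiar with the term, the sketch below shows how a run-rate figure and the resulting revenue multiple are computed. The valuation and run-rate come from the announcement as summarized here. Annualizing from a single month is an assumption made for illustration, since the announcement does not say which period the run-rate is based on, and the function names are hypothetical.

```python
# Figures from the announcement, in billions of US dollars.
POST_MONEY_VALUATION_B = 965.0
RUN_RATE_REVENUE_B = 47.0


def annualize(period_revenue_b: float, periods_per_year: int = 12) -> float:
    """Run-rate: revenue from one recent period scaled up to a full year."""
    return period_revenue_b * periods_per_year


def implied_period_revenue_b(run_rate_b: float, periods_per_year: int = 12) -> float:
    """Revenue in a single period that would produce the given run-rate."""
    return run_rate_b / periods_per_year


def revenue_multiple(valuation_b: float, run_rate_b: float) -> float:
    """Valuation divided by annualized revenue."""
    return valuation_b / run_rate_b


if __name__ == "__main__":
    monthly = implied_period_revenue_b(RUN_RATE_REVENUE_B)
    print(f"Implied monthly revenue: ${monthly:.2f}B")
    print(f"Multiple today: {revenue_multiple(POST_MONEY_VALUATION_B, RUN_RATE_REVENUE_B):.1f}x")
    # If the run-rate doubled while the valuation stayed fixed, the multiple halves.
    doubled = revenue_multiple(POST_MONEY_VALUATION_B, 2 * RUN_RATE_REVENUE_B)
    print(f"Multiple if run-rate doubles: {doubled:.1f}x")
```

    This is the arithmetic behind the "roughly 20 times sales" figure in the Thoughts section, and it shows why a fast-moving denominator makes the headline multiple go stale quickly.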

    The Compute Buildout

    A large share of the strategy behind the raise centers on securing compute at enormous scale. Anthropic detailed a set of infrastructure partnerships designed to keep pace with Claude demand. Amazon is providing 5 gigawatts of capacity, while Google and Broadcom together are supplying 5 gigawatts of TPU capacity. SpaceX is contributing GPU access, broadening the range of silicon Anthropic can draw on. Supporting the buildout on the hardware supply side are Micron, Samsung, and SK hynix, the memory and component partners whose output is essential to standing up data centers at this magnitude. The combined picture is a company assembling power, chips, and supply chain commitments measured in gigawatts rather than racks.

    Where the Money Goes

    Anthropic outlined three priorities for the new capital. The first is to advance safety and interpretability research, continuing the work of understanding how models behave and ensuring they remain reliable as they grow more capable. The second is to expand compute capacity to meet the growing demand for Claude, the practical engine behind the infrastructure commitments above. The third is to scale the products and partnerships that customers depend on, deepening the company’s reach into the tools and platforms where work actually happens. Krishna Rao, Anthropic’s chief financial officer, said the funding “will help us serve the historic demand we are experiencing, stay at the research frontier, and bring Claude to more of the places where work happens.”

    Claude Everywhere

    The funding lands on top of a distribution footprint that already spans the major cloud ecosystems. Claude is available on all three leading cloud platforms, AWS, Google Cloud, and Microsoft Azure, which means enterprises can reach the models through whichever provider they have standardized on. That availability has translated into widespread enterprise adoption across industries, from software and finance to healthcare and beyond. By being present everywhere developers and businesses already operate, Anthropic positions Claude not as a destination customers must travel to but as a capability woven into the platforms they use every day.

    Notable Quotes

    This funding will help us serve the historic demand we are experiencing, stay at the research frontier, and bring Claude to more of the places where work happens.

    Krishna Rao, CFO at Anthropic, on the purpose of the Series H round.

    Advance safety and interpretability research, expand compute capacity to meet growing Claude demand, and scale products and partnerships customers depend on.

    How Anthropic describes its use of funds from the round.

    For the full details on the round, the lead and co-lead investors, and how Anthropic plans to deploy the capital across safety research, compute, and products, read the full announcement here.

    Related Reading

    • Anthropic, the AI safety and research company behind Claude that raised this Series H round.
    • Sequoia Capital, one of the lead investors anchoring the financing.
    • Amazon Web Services, one of the three major cloud platforms where Claude is available and the source of a $5 billion investment.
    • Google Cloud TPUs, the tensor processing units behind the 5 gigawatts of TPU capacity in the Google and Broadcom partnership.
    • AI safety, the research field at the center of how Anthropic says it will use the new funding.
  • Krishna Rao on Anthropic Going From 9 Billion to 30 Billion ARR in One Quarter and the Compute Strategy Powering Claude

    Krishna Rao, Chief Financial Officer of Anthropic, sat down with Patrick O’Shaughnessy on Invest Like the Best for one of the most detailed public looks yet at the operating engine behind Claude. He covers how Anthropic compounded from $9 billion of run rate revenue at the start of the year to north of $30 billion by the end of Q1, why he spends 30 to 40 percent of his time on compute, the playbook for buying gigawatts of AI infrastructure across Trainium, TPU, and GPU platforms, how Anthropic prices its models, why returns to frontier intelligence keep climbing, and what the Mythos release tells us about the cyber capabilities of the next generation of Claude.

    TLDW

    Anthropic is running the most compute fungible frontier lab in the world, with active deployments across AWS Trainium, Google TPU, and Nvidia GPU, and an internal orchestration layer that lets a chip serve inference in the morning and run reinforcement learning the same evening. Krishna Rao explains the cone of uncertainty that governs gigawatt scale compute procurement, the floor Anthropic refuses to drop below on model development compute, the Jevons paradox unlock from cutting Opus pricing, the 500 percent annualized net dollar retention from enterprise customers, the layer cake of long term deals with Google, Broadcom, Amazon, and the recent xAI Colossus tie up in Memphis, the phased release of the Mythos model in response to spiking cyber capabilities, the internal use of Claude Code to produce statutory financial statements and run a Monthly Financial Review skill, and why the team believes scaling laws are alive and well. The interview also covers fundraising history through Series D and Series E, the $75 billion already raised plus another $50 billion coming, talent density beating talent mass during the Meta poaching wave, and Rao’s belief that biotech and drug discovery represent the most exciting frontier for AI.

    Key Takeaways

    • Anthropic entered the year with about $9 billion of run rate revenue and ended the first quarter with north of $30 billion of run rate revenue, a more than 3x leap driven by model intelligence gains and the products built around them.
    • Compute is described as the lifeblood of the company, the canvas everything else is built on, and the most consequential class of decisions Rao makes. Buy too much and you go bankrupt. Buy too little and you cannot serve customers or stay at the frontier.
    • Rao spends 30 to 40 percent of his time on compute, even today, and the leadership team meets repeatedly on both procurement and ongoing compute allocation.
    • Anthropic is the only frontier language lab actively using all three major chip platforms in production: AWS Trainium, Google TPU, and Nvidia GPU. It is also the only major model available on all three clouds.
    • Flexibility is the central design principle. Anthropic builds flexibility into the deals themselves, into the orchestration layer that maps workloads to chips, and into compilers built from the chip level up.
    • The cone of uncertainty frames procurement. Small differences in weekly or monthly growth compound into wildly different two year outcomes, so the team plans across a range of scenarios rather than a single point estimate, and leans toward the upper end while protecting the downside.
    • Compute allocation across the company sits in three buckets: model development and research, internal employee acceleration, and external customer serving. A non negotiable floor protects model development even when customer demand is tight.
    • Anthropic estimates that if it cut off internal employee use of its own models, the freed compute could serve billions of dollars of additional revenue. It chooses not to, because internal use compounds into better future models.
    • Intelligence is multi dimensional, not a single IQ score. Anthropic measures real world capability through customer feedback, long horizon task performance, tool use, computer use, and speed at agentic tasks, not just leaderboard benchmarks that have largely saturated.
    • Each Opus generation, 4 to 4.5 to 4.6 to 4.7, delivers both capability improvements and an efficiency multiplier on token processing. New models often serve customers at a fraction of the prior cost while doing more.
    • Reinforcement learning is described as inference inside a sandbox with a reward function, so model efficiency gains directly improve internal RL throughput. The flywheel is tightly coupled.
    • Over 90 percent of code at Anthropic is now written by Claude Code, and a large share of Claude Code itself is written by Claude Code.
    • Anthropic shipped roughly 30 distinct product and feature releases in January and the pace has accelerated since.
    • Scaling laws, in Anthropic’s internal data, are alive and well. The team holds itself to a skeptical scientific standard and still does not see them slowing down.
    • Anthropic recently signed a 5 gigawatt deal with Google and Broadcom for TPUs starting in 2027, plus an Amazon Trainium agreement for up to 5 gigawatts, totaling more than $100 billion in commitments. A significant portion lands this year and next year.
    • A new partnership for capacity at the xAI Colossus facility in Memphis was announced just before the interview, aimed at expanding consumer and prosumer capacity.
    • Pricing has been remarkably stable across Haiku, Sonnet, and Opus. The biggest deliberate change was lowering Opus pricing, which produced a textbook Jevons paradox: consumption rose by far more than the price fell, and the new Opus 4.6 and 4.7 slot in at the same price point.
    • Mythos is the first model Anthropic chose to release in a phased way because of a sharp spike in cyber capability. In an open source codebase where a prior model found 22 security vulnerabilities, Mythos found roughly 250.
    • The Mythos release framework focuses on defensive use first, expands access over time, and is presented as a template for future capability spikes.
    • Anthropic now sells to 9 of the Fortune 10 and reports net dollar retention above 500 percent on an annualized basis. These are not pilots. Rao describes signing two double digit million dollar commitments during a 20 minute Uber ride to the studio.
    • The platform strategy is mostly horizontal. Anthropic will go vertical with offerings like Claude for Financial Services, Claude for Life Sciences, and Claude Security where it can demonstrate the model’s capabilities, but expects most application value to accrue to customers building on top.
    • Anthropic has raised over $75 billion in equity since Rao joined, with another $50 billion in commitments tied to the Amazon and Google deals. Capital intensity is real, but the raises fund the upper end of the cone of uncertainty more than they fund current losses.
    • The Series E close coincided with the day the DeepSeek news broke, forcing investors to reassess their AI thesis in real time. Anthropic closed the round anyway.
    • Inside finance, Claude now produces statutory financial statements for every Anthropic legal entity, with a human checker. A library of more than 70 finance specific skills underpins workflows.
    • A custom Monthly Financial Review skill produces a 90 to 95 percent ready monthly close report, so leadership discussion shifts from reconciling numbers to debating implications.
    • An internal real time analytics platform called Anthrop Stats compresses weekly insight cycles from hours to about 30 minutes.
    • The biggest token user inside Anthropic’s finance team is the head of tax, focused on tax policy engines and workflow automation. The most senior people, not the youngest, are leading internal adoption.
    • Talent density beats talent mass. When Meta and others ran aggressive offer waves, Anthropic lost two people while peer labs lost dozens.
    • All seven Anthropic co founders remain at the company, as do most of the first 20 to 30 employees, which Rao credits to a collaborative, transparent, debate friendly culture and a real culture interview that can veto otherwise top tier candidates.
    • Dario Amodei holds an open all hands every two weeks, writes a short prepared document, and takes unscripted questions from anyone at the company.
    • AI safety investments in interpretability and alignment have a commercial side effect. Looking inside the model helps Anthropic build better models, and enterprises running sensitive workloads want to trust the lab they hand customer data to.
    • Anthropic explicitly identifies as America first in its approach to model development, and engages closely with the US administration on capability releases such as Mythos.
    • The longer term product vision is the virtual collaborator: an agent with organizational context, access to the company’s tools, persistent memory, and the ability to work on ideas, not just tasks, over long horizons.
    • CoWork, Anthropic’s extension of the Claude Code paradigm into general knowledge work, is being adopted faster than Claude Code itself when indexed to the same point in its launch curve.
    • Anthropic’s product teams ship daily, with a fleet of agents working across the company on specific tasks. Everyone effectively becomes a manager of agents.
    • The dominant downside risks to Anthropic’s high end forecast are slower customer diffusion of model capability into real workflows, scaling laws flattening unexpectedly, and Anthropic losing its position at the frontier.
    • Rao is most excited about biotech and healthcare outcomes, especially the prospect that AI could push drug discovery and lab throughput up 10x or 100x, turning currently incurable diagnoses into treatable ones within a patient’s lifetime.

    Detailed Summary

    Compute as Lifeblood and the Cone of Uncertainty

    Rao opens with the claim that compute is the most important resource at Anthropic, and the most consequential decision class in the company. You cannot buy a gigawatt of compute next week. You have to anticipate demand a year or two in advance, and the cost of being wrong in either direction is high. Buy too much and the unit economics collapse. Buy too little and you cannot serve customers or stay at the frontier, which are described as the same failure mode. To navigate this, the team uses a cone of uncertainty rather than point estimates. Small differences in weekly growth compound into vastly different two year outcomes, and Anthropic tries to position itself toward the upper end of that cone while preserving optionality. Rao notes he has had to consciously break a lifetime of linear thinking and force himself into exponential models.
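    The compounding effect behind the cone of uncertainty can be sketched numerically. The weekly growth rates below are hypothetical scenarios chosen only for illustration; the interview summary does not give Anthropic’s actual planning inputs.

```python
def project(start: float, weekly_growth: float, weeks: int = 104) -> float:
    """Compound a starting level at a constant weekly growth rate."""
    return start * (1 + weekly_growth) ** weeks


# Hypothetical scenarios: weekly growth differs by only one percentage point.
scenarios = {"low": 0.01, "mid": 0.02, "high": 0.03}

# Demand multiple after two years (104 weeks), starting from 1.0.
cone = {name: project(1.0, growth) for name, growth in scenarios.items()}

for name, multiple in cone.items():
    print(f"{name}: {multiple:.1f}x after two years")
```

    Under these assumptions the low case ends near 2.8x and the high case near 22x, so a gap of two percentage points in weekly growth becomes a gap of roughly 8x in the capacity needed two years out. That spread is why a single point estimate is a poor basis for a purchase that must be committed a year or two in advance.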

    Three Chip Platforms, One Orchestration Layer

    Anthropic uses Amazon’s Trainium, Google’s TPUs, and Nvidia’s GPUs fungibly. That was not free. Adopting TPUs at scale started around the third TPU generation, when outside observers thought it was a strange choice. Anthropic invested years into compilers and orchestration so workloads can flow across chips by generation and by job type. The team works deeply with Annapurna Labs at AWS to influence Trainium roadmaps because Anthropic stresses these chips harder than almost anyone. The result is what Rao believes is the most efficient utilization of compute across any frontier lab, with a dollar of compute going further inside Anthropic than anywhere else.

    Three Buckets and the Model Development Floor

    Compute gets allocated across model development, internal acceleration of employees, and customer serving. The conversations are collaborative rather than zero sum, but there is a hard floor on model development that the company refuses to cross even if it makes customer demand harder to serve in the short term. The thesis is simple. The returns to frontier intelligence are extremely high, especially in enterprise, so cutting model investment to chase near term revenue is a bad trade. Internal employee use is also explicitly protected. Rao notes that diverting that internal usage to external customers would unlock billions of additional revenue today, but the compounding benefit of accelerating researchers and engineers outweighs that.
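    One way to picture a hard floor across three buckets is the toy allocator below. It is a sketch under stated assumptions, not Anthropic’s actual policy: the function, the pro rata sharing rule, and every number are hypothetical. The floor is reserved for model development first, and whatever remains is shared in proportion to unmet demand.

```python
def allocate(total: float, floor: float, demand: dict[str, float]) -> dict[str, float]:
    """Reserve a floor for model development, then share the rest pro rata."""
    reserved = min(floor, total)
    remaining = total - reserved

    # Demand still unmet after the floor has been reserved.
    wants = {
        "model_development": max(demand["model_development"] - reserved, 0.0),
        "internal_acceleration": demand["internal_acceleration"],
        "customer_serving": demand["customer_serving"],
    }
    total_wanted = sum(wants.values())

    # Scale every request down equally when demand exceeds what is left.
    scale = 1.0 if total_wanted <= remaining else remaining / total_wanted
    allocation = {name: want * scale for name, want in wants.items()}
    allocation["model_development"] += reserved
    return allocation


# Hypothetical units of compute; demand (150) exceeds supply (100).
example = allocate(
    total=100.0,
    floor=40.0,
    demand={
        "model_development": 50.0,
        "internal_acceleration": 20.0,
        "customer_serving": 80.0,
    },
)
print(example)
```

    In this example customer serving absorbs most of the shortfall while model development never drops below its 40 unit floor, which mirrors the trade described above.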

    Intelligence Is Multi Dimensional

    Rao pushes back hard on the IQ framing of model progress. Benchmarks saturate quickly, and the real signal comes from how customers actually use the models. Anthropic looks at long horizon task completion, tool use, computer use, and time to result on agentic tasks. Two equally capable agents who differ only in speed produce dramatically different value, because the faster one compounds into more attempts and more outcomes. Frontier model leaps are also fuel efficient. The sedan to sports car analogy breaks down because each Opus generation, 4 to 4.5 to 4.6 to 4.7, delivers a step up in capability and a multiplier on per token efficiency.

    From 9 Billion to 30 Billion ARR in One Quarter

    The headline number for the quarter is a leap from about $9 billion of run rate revenue to over $30 billion, accomplished without onboarding a corresponding step up in compute, because new compute lands on ramps locked in 12 months prior. Rao attributes the leap to model capability gains, products that surface that intelligence in usable form factors, and an enterprise customer base that pulls more workloads onto Claude as each generation unlocks new use cases. Coding started the wave with Sonnet 3.5 and 3.6, and the same pattern is now playing out elsewhere in the economy.
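    The implied growth rate behind that leap can be backed out with simple arithmetic. The sketch assumes smooth compounding over a 13 week quarter, which is a simplification; the summary gives only the approximate start and end figures.

```python
start_billions = 9.0   # approximate run rate at the start of the year
end_billions = 30.0    # approximate run rate at the end of Q1
weeks_in_quarter = 13  # assumption: growth compounds evenly over 13 weeks

multiple = end_billions / start_billions
weekly_growth = multiple ** (1 / weeks_in_quarter) - 1
monthly_growth = multiple ** (1 / 3) - 1

print(f"Multiple over the quarter: {multiple:.2f}x")
print(f"Implied weekly growth: {weekly_growth:.1%}")
print(f"Implied monthly growth: {monthly_growth:.1%}")
```

    Under that assumption the run rate grew a little under 10 percent per week, or just under 50 percent per month.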

    Recursive Self Improvement and Talent Density

    Over 90 percent of Anthropic’s code is now written by Claude Code, including most of Claude Code itself. Rao describes this as a structural reason to keep allocating internal compute to employees even when external demand is hungry. Recursive self improvement is not happening through models that need no humans. It is happening through researchers who set direction and use frontier models to compress months of work into days. Talent density beats talent mass. When Meta and other labs went after Anthropic researchers with very large packages, Anthropic lost two people while peer labs lost dozens.

    Procurement Strategy and the Layer Cake

    Compute lands as a layer cake. Last month Anthropic signed a 5 gigawatt TPU deal with Google and Broadcom starting in 2027, alongside an Amazon Trainium agreement for up to 5 gigawatts. The total is north of $100 billion in commitments. A new tie up with xAI’s Colossus facility in Memphis was announced just before the interview, intended for nearer term capacity to support consumer and prosumer growth. Anthropic evaluates near term and long term compute deals against the same set of variables: price, duration, location, chip type, and how efficiently the team can run it. The relationships are deeper than procurement. The hyperscalers are also distribution channels for the model.

    Platform First, Selective Vertical Bets

    Rao describes Anthropic as a platform first business, with most expected value accruing to customers building on the platform. The team will only go vertical when it can either demonstrate capabilities that are skating to where the puck is going, like Claude Code did before the models could fully support it, or when it wants to set a template for an industry vertical, as with Claude for Financial Services, Claude for Life Sciences, and Claude Security. He acknowledges that surprise capability jumps make customers anxious about the platform competing with them, and frames Anthropic’s mitigation as deeper partnerships, early access programs, and an emphasis on accelerating customer building rather than disintermediating it.

    Pricing, Jevons Paradox, and Return on Compute

    Pricing across Haiku, Sonnet, and Opus has been stable. The notable exception is Opus, which Anthropic deliberately repriced lower when launching Opus 4.5 because Opus class problems were being squeezed into Sonnet workloads. Efficiency gains made it possible to serve Opus profitably at the new level. The consumption response was a classic Jevons paradox, with usage rising far more than the price reduction would have predicted, and Opus 4.6 then slotted in at the same price with a capability bump. Margins are not framed as a per token markup. Compute is fungible across model development, internal acceleration, and customer serving, so Anthropic measures return on the entire compute envelope rather than software style variable cost per call.
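    The revenue logic of a price cut that triggers a Jevons style response can be sketched as follows. The percentages are hypothetical; the summary says only that usage rose far more than the price fell and gives no figures.

```python
def revenue_change(price_change: float, usage_change: float) -> float:
    """Fractional revenue change given fractional changes in price and usage."""
    return (1 + price_change) * (1 + usage_change) - 1


def breakeven_usage_growth(price_cut: float) -> float:
    """Usage growth needed to keep revenue flat after a fractional price cut."""
    return 1 / (1 - price_cut) - 1


# Hypothetical: price falls 50 percent and usage triples (+200 percent).
change = revenue_change(price_change=-0.5, usage_change=2.0)
needed = breakeven_usage_growth(price_cut=0.5)

print(f"Revenue change: {change:+.0%}")
print(f"Usage growth needed to break even: {needed:.0%}")
```

    With these made-up inputs, revenue rises by half even though the unit price was halved, because usage grew well past the 100 percent needed to break even.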

    Fundraising, DeepSeek, and Capital Intensity

    Rao joined while Anthropic was closing its Series D, mid frontier model launch and during the FTX share liquidation. Investors initially questioned whether Anthropic needed a frontier model, whether AI safety and a real business could coexist, and why the sales team was so small. The Series E closed the same day the DeepSeek news broke, with markets violently re pricing AI in real time. Since Rao joined, Anthropic has raised over $75 billion, with another $50 billion tied to the Amazon and Google compute deals. The reason for the size of the raises is the cone of uncertainty, not current losses. Returns on compute today are described as robust.

    Mythos, Cyber Capability, and Phased Releases

    The Mythos release marks the first time Anthropic shipped a model under a deliberately phased rollout because of a specific capability spike. Cyber is the dimension that spiked. Where a prior model found 22 vulnerabilities in an open source codebase, Mythos found roughly 250. The defensive applications, automatically patching massive codebases, are genuinely valuable, but the offensive risk is real enough that Anthropic chose to release to a smaller group first and expand access over time. Rao positions this as a template for future capability spikes, not a permanent restriction. He also describes the relationship with the US administration as cooperative, including the Department of War interaction, with Anthropic supporting a regulatory framework that does not strangle innovation but takes responsibility seriously.

    Claude Inside Finance

    Anthropic’s finance team is one of the strongest internal case studies. Statutory financial statements for every legal entity are produced by Claude, with a human reviewer. A skill library of more than 70 finance specific skills underpins a Monthly Financial Review skill that drafts the monthly close at 90 to 95 percent ready, so leadership meetings shift from explaining the numbers to discussing what to do about them. An internal analytics platform called Anthrop Stats compresses weekly insight cycles from hours to 30 minutes. The biggest internal token user in finance is the head of tax, building policy engines, which Rao highlights as evidence that adoption is driven by the most senior people, not just younger engineers.

    Culture, Co Founders, and the Race to the Top

    Seven co founders should not, on paper, work as a leadership group. Rao argues it works because the culture was set early around collaboration, intellectual honesty, transparency, and humility. The culture interview is a real veto, not a checkbox. Dario Amodei runs an all hands every two weeks with a short written piece followed by unscripted questions, and decisions, once made, get clean alignment rather than residual politics. Anthropic frames its approach as a race to the top, where being a model for how to build the technology responsibly is itself a recruiting and retention advantage.

    The Virtual Collaborator and the Frontier Ahead

    The product vision Rao describes is the virtual collaborator. Not just a smarter chatbot, but an agent with organizational context, access to the company’s tools, memory, and the ability to work on ideas over long horizons. Coding was the first domain to feel this, but CoWork, Anthropic’s extension of the Claude Code pattern into general knowledge work, is being adopted faster than Claude Code was at the same age. Product development inside Anthropic already looks different. Teams ship daily, with fleets of agents working across the company, and individual humans increasingly act as managers of those fleets.

    Downside Risks and What Excites Him Most

    The three risks Rao names when asked to do a premortem on a softer year are slower customer diffusion of model capability into real workflows, scaling laws unexpectedly flattening, and Anthropic losing its frontier position to competitors. None of these are observed today, but he is unwilling to rule them out with certainty. On the upside, he is most excited about biotech and healthcare. Lab throughput rising 10x or 100x, paired with AI assisted clinical workflows, could turn currently incurable diagnoses into treatable ones within a patient’s lifetime. That is the outcome he wants the technology to chase.

    Thoughts

    The most consequential structural point in this interview is the framing of compute as a single fungible resource pool measured by return on the entire envelope, not as a variable cost per inference call. That accounting shift, if you accept it, breaks most of the bear cases about AI lab unit economics. The bear argument almost always assumes that a token served to a customer is the only thing the chip did that day. Rao’s version is that the same fleet trains models in the morning, runs reinforcement learning at lunch, serves customers in the afternoon, and accelerates internal engineers in the evening. If even half of that is real, the right comparison is total compute spend versus total enterprise value created by the platform, and on that ratio Anthropic looks structurally strong rather than weak.

    The Jevons paradox on Opus pricing is the most actionable insight for anyone running an AI product. Most teams default to either chasing premium pricing on the newest model or undercutting to chase volume. Anthropic did something more disciplined: it left Sonnet and Haiku alone, dropped the Opus price when efficiency gains made it servable at that level, and watched aggregate usage rise by more than the price fell. The lesson is that frontier model pricing is not really a price problem. It is a capability access problem, and elasticity around the right tier is much higher than the standard SaaS playbook implies.

    The Mythos cyber jump deserves more attention than it has gotten. Going from 22 to 250 vulnerabilities found in the same codebase is the kind of capability discontinuity that genuinely changes the regulatory calculus. Anthropic is signaling that it can identify these discontinuities ahead of release and choose a deployment shape that respects them. Whether peer labs adopt similar discipline is the open question. Anthropic’s race to the top framing assumes they will be forced to. The competitive market may say otherwise.

    The hiring data point is the most underrated investor signal. Two departures while peer labs lost dozens, during the most aggressive talent war in tech history, is not a culture poster. It is a structural advantage that compounds every time another lab tries to buy its way to the frontier. Money can be matched. Conviction in the mission, transparent leadership, and a culture interview that can veto otherwise stellar candidates cannot. If you believe scaling laws hold, talent retention at this density is one of the few moats that actually scales with capital.

    Finally, the most interesting personal admission is that Krishna Rao, a finance leader trained at Blackstone and Cedar, is openly telling investors that linear thinking is the failure mode he had to break out of. The companies that pattern match this moment to prior technology waves are mispricing it, in both directions. The cone of uncertainty Anthropic uses internally is the right metaphor for everyone else too. If you are forecasting AI as if it is cloud in 2010, you are almost certainly wrong, and the magnitude of the error is much larger than it would be in any prior era.

    Watch the full conversation with Krishna Rao on Invest Like the Best here.

  • Jensen Huang on Nvidia’s Supply Chain Moat, TPU Competition, China Export Controls, and Why Nvidia Will Not Become a Cloud (Dwarkesh Podcast Summary)

    TLDW (Too Long, Didn’t Watch)

    Jensen Huang sat down with Dwarkesh Patel for over 90 minutes covering Nvidia’s supply chain dominance, the TPU threat, why Nvidia will not become a hyperscaler, whether the US should sell AI chips to China, and why Nvidia does not pursue multiple chip architectures at once. Jensen framed Nvidia’s entire business as transforming “electrons into tokens” and argued that Nvidia’s real moat is not any single technology but the full stack ecosystem it has built over two decades. He was blunt about his regret over not investing in Anthropic and OpenAI earlier, passionate about keeping the American tech stack dominant worldwide, and dismissive of the idea that China’s chip industry can be meaningfully contained through export controls.

    Key Takeaways

    1. Nvidia’s moat is the ecosystem, not the chip. Jensen repeatedly emphasized that Nvidia’s competitive advantage comes from CUDA, its massive installed base, its deep partnerships across the entire supply chain, and the fact that it operates in every cloud. The moat is not a single product but an interlocking system that took 20+ years to build.

    2. Supply chain bottlenecks are temporary, energy bottlenecks are not. Jensen argued that CoWoS packaging, HBM memory, EUV capacity, and logic fabrication bottlenecks can all be resolved in two to three years with the right demand signal. The real constraint on AI scaling is energy policy, which takes far longer to fix.

    3. TPUs and ASICs are not an existential threat to Nvidia. Jensen was emphatic that no competitor has demonstrated better price-performance or performance-per-watt than Nvidia, and challenged TPU and Trainium to prove otherwise on public benchmarks like InferenceMAX and MLPerf. He described Anthropic as a “unique instance, not a trend” for TPU adoption.

    4. Jensen regrets not investing in Anthropic and OpenAI earlier. He admitted he did not deeply internalize how much capital AI labs needed and that traditional VC funding was not sufficient for companies at that scale. He described this as a clear miss, though he said Nvidia was not in a position to make multi-billion dollar investments at the time.

    5. Nvidia will not become a hyperscaler. Jensen’s philosophy is “do as much as needed, as little as possible.” Building cloud infrastructure is something other companies can do, so Nvidia supports neoclouds like CoreWeave, Nebius, and Nscale instead of competing with them. Nvidia invests in ecosystem partners rather than vertically integrating into cloud services.

    6. Jensen is strongly against US chip export controls on China. This was the longest and most heated segment of the interview. Jensen argued that China already has abundant compute, energy, and AI researchers, and that export controls have accelerated China’s domestic chip industry while causing the US to concede the world’s second-largest technology market. He compared the situation to how US telecom policy allowed Huawei to dominate global telecommunications.

    7. AI will cause software tool usage to skyrocket, not collapse. Jensen pushed back on the narrative that AI will commoditize software companies. He argued that agents will use existing tools at massive scale, causing the number of instances of products like Excel, Synopsys Design Compiler, and other enterprise tools to grow exponentially.

    8. Nvidia does not pick winners among AI labs. Jensen explained that Nvidia invests across multiple foundation model companies simultaneously and refuses to favor any single one. He cited his own company’s unlikely survival story as the reason for this humility: Nvidia’s original graphics architecture was “precisely wrong” and would have been counted out by anyone picking winners.

    9. Nvidia added Groq for premium token economics. Nvidia recently acquired Groq and is folding it into the CUDA ecosystem because the market is now segmenting into different token tiers. Some customers will pay premium prices for faster response times even at lower throughput, creating a new segment of the inference market.

    10. Without AI, Nvidia would still be very large. Jensen was clear that accelerated computing, not AI specifically, is the foundational mission of the company. Molecular dynamics, quantum chemistry, computational lithography, data processing, and physics simulation all benefit from GPU acceleration regardless of deep learning.

    Detailed Summary

    Nvidia’s Real Business: Electrons to Tokens

    Jensen opened the conversation by reframing Nvidia’s entire value proposition. When Dwarkesh suggested that Nvidia is fundamentally a software company that sends a GDSII file to TSMC for manufacturing, Jensen pushed back hard. He described Nvidia’s job as transforming electrons into tokens, with everything in between representing an “incredible journey” of artistry, engineering, science, and invention. He said the transformation is far from deeply understood and the journey is far from over, making commoditization unlikely.

    Jensen described Nvidia as operating a philosophy of doing “as much as necessary and as little as possible.” Whatever Nvidia does not need to do itself, it partners with someone else and makes it part of the broader ecosystem. This is why Nvidia has what Jensen called probably the largest ecosystem of partners in the industry, spanning the full supply chain upstream and downstream, application developers, model makers, and all five layers of the AI stack.

    On the question of whether AI will commoditize software companies, Jensen offered a contrarian take. He argued that agents are going to use software tools at unprecedented scale, meaning the number of instances of products like Excel, Cadence design tools, and Synopsys compilers will skyrocket. Today the bottleneck is the number of human engineers. Tomorrow, those engineers will be supported by swarms of agents exploring design spaces and using the same tools humans use today. Jensen said the reason this has not happened yet is simply that the agents are not good enough at using tools. That will change.

    The Supply Chain Moat

    Dwarkesh pressed Jensen on Nvidia’s reported $100 billion (and potentially $250 billion) in purchase commitments with foundries, memory manufacturers, and packaging companies. The question was whether Nvidia’s real moat for the next few years is simply locking up scarce upstream components so that no competitor can get the memory and logic they need to build alternative accelerators.

    Jensen confirmed this is a significant advantage but framed it differently. He said Nvidia has made enormous explicit and implicit commitments upstream. The implicit commitments matter just as much: Jensen personally meets with CEOs across the supply chain to explain the scale of the coming AI industry, convince them to invest in capacity, and assure them that Nvidia’s downstream demand is large enough to justify that investment. Nvidia’s GTC conference serves this purpose too, bringing the entire ecosystem together so upstream suppliers can see downstream demand and vice versa.

    Jensen described a process of systematically “prefetching bottlenecks” years in advance. CoWoS advanced packaging was a major bottleneck two years ago, but Nvidia swarmed it with repeated doubling of capacity until TSMC recognized it as mainstream computing technology rather than a specialty product. More recently, Nvidia has invested in the silicon photonics ecosystem through partnerships with Lumentum and Coherent, invented new packaging technologies, licensed patents to keep the supply chain open, and even invested in new testing equipment like double-sided probing.

    When Dwarkesh asked about the ultimate physical bottlenecks, Jensen surprised him. The hardest bottleneck to solve is not CoWoS or HBM or EUV machines. It is the supply of plumbers and electricians needed to build data centers. Jensen used this as a launching point to criticize “doomers” who discourage people from pursuing careers in software engineering or radiology, arguing that scaring people out of these professions creates the real bottlenecks.

    On EUV and logic scaling specifically, Jensen was optimistic. He said no supply chain bottleneck lasts longer than two to three years. Once you can build one of something, you can build ten, and once you can build ten, you can build a million. The key is a clear demand signal. If TSMC is convinced of the demand, ASML will produce enough EUV machines. Meanwhile, Nvidia continues to improve computing efficiency by 10x to 50x per generation through architecture, algorithms, and system design.

    The TPU Question

    Dwarkesh pushed hard on whether Google’s TPUs represent a real threat, noting that two of the top three AI models (Claude and Gemini) were trained on TPUs. Jensen drew a sharp distinction between what Nvidia builds and what a TPU is. Nvidia builds accelerated computing, which serves molecular dynamics, quantum chromodynamics, data processing, fluid dynamics, particle physics, and AI. A TPU is a tensor processing unit optimized for matrix multiplies. Nvidia’s market reach is far greater than any TPU or ASIC can possibly have.

    Jensen emphasized programmability as Nvidia’s core architectural advantage. If you want to invent a new attention mechanism, build a hybrid SSM model, fuse diffusion and autoregressive techniques, or disaggregate computation in a novel way, you need a generally programmable architecture. The only way to achieve 10x or 100x performance leaps (versus the roughly 25% per year from Moore’s Law) is to fundamentally change the algorithm, and that requires the flexibility CUDA provides.
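
    To put that gap in perspective, here is a rough arithmetic sketch. It assumes the 25% annual figure compounds steadily and nothing else changes, which is a simplification; the function name `years_to_reach` is invented for illustration.

```python
import math


def years_to_reach(speedup: float, annual_gain: float = 0.25) -> float:
    """Years of steady compounding at `annual_gain` needed to reach `speedup`."""
    return math.log(speedup) / math.log(1.0 + annual_gain)


for target in (10, 100):
    years = years_to_reach(target)
    print(f"{target}x takes about {years:.1f} years at 25% per year")
```

    Under those assumptions, a 10x gain takes roughly a decade of hardware scaling alone and a 100x gain roughly two, which is why the argument leans on algorithmic change rather than process improvements.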

    On the specific question of whether hyperscalers with huge engineering teams can simply write their own kernels and bypass CUDA, Jensen acknowledged they do write custom kernels but argued that Nvidia’s engineers still routinely deliver 2x to 3x speedups when they optimize a partner’s stack. He described Nvidia’s GPUs as “F1 racers” that anyone can drive at 100 mph, but extracting peak performance requires deep architectural expertise. Nvidia uses AI itself to generate many of its optimized kernels.

    Jensen was particularly blunt about public benchmarks. He pointed to Dylan Patel’s InferenceMAX benchmark and said neither TPU nor Trainium has been willing to demonstrate their claimed performance advantages on it. He said Nvidia’s performance-per-TCO is the best in the world, “bar none,” and challenged anyone to prove otherwise.

    Regarding Anthropic’s multi-gigawatt deal with Broadcom and Google for TPUs, Jensen called it “a unique instance, not a trend.” He said without Anthropic, there would be essentially no TPU growth and no Trainium growth. He traced this back to his own mistake: when Anthropic and OpenAI needed multi-billion dollar investments from their compute suppliers to get off the ground, Nvidia was not in a position to provide that capital. Google and AWS were, and in return, Anthropic committed to using their compute.

    Nvidia’s Investment Strategy and Regrets

    Jensen was unusually candid about his regret over not investing in foundation model companies earlier. He said he did not deeply internalize how different AI labs were from typical startups. A traditional VC would never put $5 to $10 billion into a single AI lab, but that was exactly what companies like OpenAI and Anthropic needed. By the time Jensen understood this, Nvidia was not in a financial or cultural position to make those kinds of investments.

    Now, Nvidia has invested approximately $30 billion in OpenAI and $10 billion in Anthropic. Jensen said he is delighted to support both and considers their existence essential for the world. But he acknowledged that these investments came at much higher valuations than would have been possible years earlier.

    Jensen explained Nvidia’s broader investment philosophy: support everyone, do not pick winners. If he invests in one foundation model company, he invests in all of them. This comes from hard-won humility. When Nvidia started, there were 60 3D graphics companies. Nvidia’s original architecture was “precisely wrong” and the company would have been at the top of most lists to fail. Jensen said he has enough humility from that experience to know that you cannot predict which AI company will ultimately succeed.

    Why Nvidia Will Not Become a Hyperscaler

    Dwarkesh pointed out that Nvidia has the cash to build and operate its own cloud infrastructure, bypassing the middleman ecosystem that converts CapEx into OpEx for AI labs. Jensen rejected this path based on his core operating philosophy.

    If Nvidia did not build its computing platform, NVLink, and the CUDA ecosystem, nobody else would have done it. He is “completely certain” of that. These are things Nvidia must do. But the world has lots of clouds. If Nvidia did not build a cloud, someone else would show up. So the answer is to support the ecosystem instead: invest in CoreWeave, Nscale, Nebius, and others to help them exist and scale, rather than competing with them.

    Jensen was clear that Nvidia is not trying to be in the financing business either. When OpenAI needed a $30 billion investment before its IPO, Nvidia stepped up because OpenAI needed it and Nvidia deeply believed in the company. But these are targeted ecosystem investments, not a strategic pivot into cloud services.

    On GPU allocation during shortages, Jensen pushed back on the narrative that Nvidia strategically “fractures” the market by giving allocations to smaller neoclouds. He said the process is straightforward: you forecast demand, you place a purchase order, and it is first in, first out. Nvidia never changes prices based on demand. Jensen said he prefers to be dependable and serve as the foundation of the industry rather than extracting maximum short-term value.
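
    The first in, first out rule described above can be sketched as a simple queue. This is an illustration only, not Nvidia’s actual allocation system: the customer names, quantities, and the `allocate_fifo` helper are all hypothetical.

```python
from collections import deque


def allocate_fifo(orders, supply):
    """Fill purchase orders strictly in arrival order until supply runs out.

    `orders` is a list of (customer, units_requested) in the order received.
    Returns (shipments, backlog), where backlog keeps its original order.
    """
    queue = deque(orders)
    shipments = []
    while queue and supply > 0:
        customer, requested = queue.popleft()
        shipped = min(requested, supply)
        supply -= shipped
        shipments.append((customer, shipped))
        if shipped < requested:
            # Partially filled order stays at the front of the line.
            queue.appendleft((customer, requested - shipped))
    return shipments, list(queue)


# Hypothetical orders, listed in the order they were placed.
orders = [("cloud_a", 500), ("neocloud_b", 200), ("lab_c", 400)]
shipments, backlog = allocate_fifo(orders, supply=800)
print(shipments)
print(backlog)
```

    Nothing in the sketch looks at customer size or willingness to pay; position in the queue is the only input, which matches the dependability Jensen says he is aiming for.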

    The China Debate

    The longest and most heated section of the interview was Jensen’s case against US chip export controls on China. This was a genuine debate, with Dwarkesh pushing the national security argument and Jensen pushing back forcefully.

    Jensen’s core argument rested on several pillars. First, China already has abundant compute. They manufacture 60% or more of the world’s mainstream chips, have massive energy infrastructure (including empty data centers with full power), and employ roughly 50% of the world’s AI researchers. The threshold of compute needed to build models like Anthropic’s Mythos has already been reached and exceeded by China’s existing infrastructure.

    Second, export controls have backfired. They accelerated China’s domestic chip industry, forced their AI ecosystem to optimize for internal architectures instead of the American tech stack, and caused the United States to concede the second-largest technology market in the world. Jensen compared this directly to how US telecom policy allowed Huawei to dominate global telecommunications infrastructure.

    Third, Jensen argued that AI is a five-layer stack (energy, chips, computing platform, models, applications) and the US needs to win at every layer. Fixating on one layer (models) at the expense of another layer (chips) is counterproductive. If Chinese open source AI models end up optimized for non-American hardware and that stack gets exported to the global south, the Middle East, Africa, and Southeast Asia, the US will have lost something far more valuable than whatever marginal compute advantage the export controls provided.

    Dwarkesh countered with the Mythos example: Anthropic’s new model found thousands of high-severity zero-day vulnerabilities across every major operating system and browser, including one that had existed in OpenBSD for 27 years. If China had enough compute to train and deploy a model like Mythos at scale before the US could prepare, the cyber-offensive capabilities would be devastating.

    Jensen’s response was direct. Mythos was trained on “fairly mundane capacity” that is already abundantly available in China. The amount of compute is not the bottleneck for that kind of breakthrough. Great computer science is, and China has no shortage of brilliant AI researchers. He pointed to DeepSeek as evidence: most advances in AI come from algorithmic innovation, not raw hardware. If China’s researchers can achieve breakthroughs like DeepSeek with limited hardware, imagine what they could do with more.

    Jensen also argued for dialogue over confrontation. He said it is essential that American and Chinese AI researchers are talking to each other, and that both countries agree on what AI should not be used for. The idea that you can prevent AI risks by cutting off chip sales, when the real advances come from algorithms and computer science, reflects a fundamental misunderstanding of how AI progress works.

    The debate ended without resolution, but Jensen’s final point was sharp: “I’m not talking to somebody who woke up a loser. That loser attitude, that loser premise, makes no sense to me.”

    Why Not Multiple Chip Architectures?

    Near the end of the interview, Dwarkesh asked why Nvidia does not run multiple parallel chip projects with different architectures, like a Cerebras-style wafer-scale design or a Dojo-style huge package, or even one without CUDA.

    Jensen’s answer was simple: “We don’t have a better idea.” Nvidia simulates all of these alternative approaches in its internal simulators and they are provably worse. The company works on exactly the projects it wants to work on. If the workload were to change dramatically (not just the algorithms, but the actual market shape), Nvidia might add other accelerators.

    In fact, Nvidia recently did exactly this by acquiring Groq. The inference market is now segmenting into different tiers. Some customers will pay premium prices for extremely fast response times even if throughput is lower. This creates a new “high ASP token” segment that justifies a different point on the performance curve. But Jensen was clear: if he had more money, he would put it all behind Nvidia’s existing architecture, not diversify into alternatives.

    Nvidia Without AI

    Jensen closed by saying that even if the deep learning revolution had never happened, Nvidia would be “very, very large.” The premise of the company has always been that general-purpose computing cannot scale indefinitely and that domain-specific acceleration is the way forward. Molecular dynamics, seismic processing, image processing, computational lithography, quantum chemistry, and data processing all benefit from GPU acceleration regardless of AI. Jensen said the fundamental promise of accelerated computing has not changed “not even a little bit.”

    Thoughts

    This interview is one of the most revealing Jensen Huang conversations in years, partly because Dwarkesh actually pushes back instead of lobbing softballs. A few things stand out.

    The Anthropic regret is real and significant. Jensen is essentially admitting that Nvidia’s biggest strategic miss of the AI era was not understanding that foundation model companies needed supplier-level capital commitments, not VC funding. The fact that Google and AWS used compute investments to lock in Anthropic’s architecture choices has had downstream consequences that Nvidia is still working to unwind. When Jensen says Anthropic is “a unique instance, not a trend” for TPU adoption, he is simultaneously downplaying the threat and revealing exactly how seriously he takes it.

    The China debate is the highlight. Jensen’s argument is more nuanced than it first appears. He is not saying “sell China everything.” He is saying the current binary approach of near-total restriction has backfired by accelerating China’s domestic chip industry and pushing the Chinese AI ecosystem away from the American tech stack. His comparison to the US telecom industry losing global market share to Huawei is pointed and historically grounded. Whether you agree with his conclusion or not, the framing of AI as a five-layer stack where the US needs to compete at every layer is a useful mental model.

    The “electrons to tokens” framing is Jensen at his best. It is a simple metaphor that captures something genuinely complex about where value is created in the AI supply chain. And his insistence that the transformation is “far from deeply understood” is a subtle way of arguing that Nvidia’s competitive position will be durable because the problem space is not close to being solved.

    The Groq acquisition reveal is interesting for what it signals about the inference market. If Nvidia is creating a separate product tier for premium-priced, low-latency tokens, it suggests the company sees inference economics fragmenting significantly. This aligns with the broader trend of AI becoming an enterprise product where different customers have wildly different willingness to pay based on how they use tokens.

    Finally, Jensen’s refusal to diversify chip architectures is a bold bet. “We simulate it all in our simulator, provably worse” is an incredibly confident statement. History is full of companies that were right until they were not. But Nvidia’s track record of 50x generation-over-generation improvements through co-design across processors, fabric, libraries, and algorithms is hard to argue with. The question is whether the current paradigm of transformer-based models on GPU clusters represents a local or global optimum for AI compute.

  • Inside Microsoft’s AGI Masterplan: Satya Nadella Reveals the 50-Year Bet That Will Redefine Computing, Capital, and Control

    1) Fairwater 2 is live at unprecedented scale, with Fairwater 4 linking over a one-petabit AI WAN

    Nadella walks through the new Fairwater 2 site and states Microsoft has targeted a 10x training capacity increase every 18 to 24 months relative to GPT-5’s compute. He also notes Fairwater 4 will connect on a one petabit network, enabling multi-site aggregation for frontier training, data generation, and inference.

    2) Microsoft’s MAI program, a parallel superintelligence effort alongside OpenAI

    Microsoft is standing up its own frontier lab and will “continue to drop” models in the open, with an omni-model on the roadmap and high-profile hires joining Mustafa Suleyman. This is a clear signal that Microsoft intends to compete at the top tier while still leveraging OpenAI models in products.

    3) Clarification on IP: Microsoft says it has full access to the GPT family’s IP

    Nadella says Microsoft has access to all of OpenAI’s model IP (consumer hardware excluded) and shared that the firms co-developed system-level designs for supercomputers. This resolves long-standing ambiguity about who holds rights to GPT-class systems.

    4) New exclusivity boundaries: OpenAI’s API is Azure-exclusive, SaaS can run elsewhere with limited exceptions

    The interview spells out that OpenAI’s platform API must run on Azure. ChatGPT as SaaS can be hosted elsewhere only under specific carve-outs, for example certain US government cases.

    5) Per-agent future for Microsoft’s business model

    Nadella describes a shift where companies provision Windows 365 style computers for autonomous agents. Licensing and provisioning evolve from per-user to per-user plus per-agent, with identity, security, storage, and observability provided as the substrate.

    6) The 2024–2025 capacity “pause” explained

    Nadella confirms Microsoft paused or dropped some leases in the second half of last year to avoid lock-in to a single accelerator generation, keep the fleet fungible across GB200, GB300, and future parts, and balance training with global serving to match monetization.

    7) Concrete scaling cadence disclosure

    The 10x training capacity target every 18 to 24 months is stated on the record while touring Fairwater 2. This implies the next frontier runs will be roughly an order of magnitude above GPT-5 compute.
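
    For a sense of what that cadence means year to year, here is a back-of-the-envelope sketch. It assumes smooth compounding between milestones, which is an assumption; the interview states only the 10x target and the 18 to 24 month window. The function name `annualized_multiplier` is invented for illustration.

```python
def annualized_multiplier(total_multiplier: float, months: float) -> float:
    """Equivalent per-year growth factor for a multiplier reached in `months`."""
    return total_multiplier ** (12.0 / months)


for months in (18, 24):
    rate = annualized_multiplier(10, months)
    print(f"10x every {months} months is about {rate:.2f}x per year")
```

    Under that assumption the target works out to somewhere between roughly 3x and 4.6x more training capacity each year.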

    8) Multi-model, multi-supplier posture

    Microsoft will keep using OpenAI models in products for years, build MAI models in parallel, and integrate other frontier models where product quality or cost warrants it.

    Why these points matter

    • Industrial scale: Fairwater’s disclosed networking and capacity targets set a new bar for AI factories and imply rapid model scaling.
    • Strategic independence: MAI plus GPT IP access gives Microsoft a dual track that reduces single-partner risk.
    • Ecosystem control: Azure exclusivity for OpenAI’s API consolidates platform power at the infrastructure layer.
    • New revenue primitives: Per-agent provisioning reframes Microsoft’s core metrics and pricing.

    Pull quotes

      “We’ve tried to 10x the training capacity every 18 to 24 months.”

      “The API is Azure-exclusive. The SaaS business can run anywhere, with a few exceptions.”

      “We have access to the GPT family’s IP.”

    TL;DW

    • Microsoft is building a global network of AI super-datacenters (Fairwater 2 and beyond) designed for fast upgrade cycles and cross-region training at petabit scale.
    • Strategy spans three layers: infrastructure, models, and application scaffolding, so Microsoft creates value regardless of which model wins.
    • AI economics shift margins, so Microsoft blends subscriptions with metered consumption and focuses on tokens per dollar per watt.
    • Future includes autonomous agents that get provisioned like users with identity, security, storage, and observability.
    • Trust and sovereignty are central. Microsoft leans into compliant, sovereign cloud footprints to win globally.

    Detailed Summary

    1) Fairwater 2: AI Superfactory

    Microsoft’s Fairwater 2 is presented as the most powerful AI datacenter yet, packing hundreds of thousands of GB200 and GB300 accelerators, tied by a petabit AI WAN and designed to stitch training jobs across buildings and regions. The key lesson: keep the fleet fungible and avoid overbuilding for a single hardware generation as power density and cooling change with each wave like Vera Rubin and Rubin Ultra.

    2) The Three-Layer Strategy

    • Infrastructure: Azure’s hyperscale footprint, tuned for training, data generation, and inference, with strict flexibility across model architectures.
    • Models: Access to OpenAI’s GPT family for seven years plus Microsoft’s own MAI roadmap for text, image, and audio, moving toward an omni-model.
    • Application Scaffolding: Copilots and agent frameworks like GitHub’s Agent HQ and Mission Control that orchestrate many agents on real repos and workflows.

    This layered approach lets Microsoft compete whether the value accrues to models, tooling, or infrastructure.

    3) Business Models and Margins

    AI raises COGS relative to classic SaaS, so pricing blends entitlements with consumption tiers. GitHub Copilot helped catalyze a multibillion-dollar market in a year, even as rivals emerged. Microsoft aims to ride a market that is expanding 10x rather than clinging to legacy share. The efficiency focus is tokens per dollar per watt, achieved through software optimization as much as hardware.

    4) Copilot, GitHub, and Agent Control Planes

    GitHub becomes the control plane for multi-agent development. Agent HQ and Mission Control aim to let teams launch, steer, and observe multiple agents working in branches, with repo-native primitives for issues, actions, and reviews.

    5) Models vs Scaffolding

    Nadella argues model monopolies are checked by open source and substitution. Durable value sits in the scaffolding layer that brings context, data liquidity, compliance, and deep tool knowledge, exemplified by an Excel Agent that understands formulas and artifacts rather than just screen pixels.

    6) Rise of Autonomous Agents

    Two worlds emerge: human-in-the-loop Copilots and fully autonomous agents. Microsoft plans to provision agents with computers, identity, security, storage, and observability, evolving end-user software into an infrastructure business for agents as well as people.

    7) MAI: Microsoft’s In-House Frontier Effort

    Microsoft is assembling a top-tier lab led by Mustafa Suleyman and veterans from DeepMind and Google. Early MAI models show progress in multimodal arenas. The plan is to combine OpenAI access with independent research and product-optimized models for latency and cost.

    8) Capex and Industrial Transformation

    Capex has surged. Microsoft frames this era as capital intensive and knowledge intensive. Software scheduling, workload placement, and continual throughput improvements are essential to maximize returns on a fleet that upgrades every 18 to 24 months.

    9) The Lease Pause and Flexibility

    Microsoft paused some leases to avoid single-generation lock-in and to prevent over-reliance on a small number of mega-customers. The portfolio favors global diversity, regulatory alignment, balanced training and inference, and location choices that respect sovereignty and latency needs.

    10) Chips and Systems

    Custom silicon like Maia will scale in lockstep with Microsoft’s own models and OpenAI collaboration, while Nvidia remains central. The bar for any new accelerator is total fleet TCO, not just raw performance, and system design is co-evolved with model needs.

    11) Sovereign AI and Trust

    Nations want AI benefits with continuity and control. Microsoft’s approach combines sovereign cloud patterns, data residency, confidential computing, and compliance so countries can adopt leading AI while managing concentration risk. Nadella emphasizes trust in American technology and institutions as a decisive global advantage.


    Key Takeaways

    1. Build for flexibility: Datacenters, pricing, and software are optimized for fast evolution and multi-model support.
    2. Three-layer stack wins: Infrastructure, models, and scaffolding compound each other and hedge against shifts in where value accrues.
    3. Agents are the next platform: Provisioned like users with identity and observability, agents will demand a new kind of enterprise infrastructure.
    4. Efficiency is king: Tokens per dollar per watt drives margins more than any single chip choice.
    5. Trust and sovereignty matter: Compliance and credible guarantees are strategic differentiators in a bipolar world.