PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Author: PJFP

  • SpaceX IPO Priced at $135 Per Share: SPCX Raises $75 Billion in the Largest IPO in History, Trading Begins June 12 on Nasdaq

    TLDR

    SpaceX confirmed the pricing of its initial public offering on June 11, 2026: 555,555,555 shares of Class A common stock at $135.00 per share, a raise of just under $75 billion. The stock begins trading Friday, June 12, 2026 on the Nasdaq Global Select Market and Nasdaq Texas under the ticker SPCX, with the offering expected to close on June 15. Underwriters hold a 30 day option to purchase up to 83,333,333 additional shares at the IPO price, which would push total proceeds toward $86 billion. At $135 per share the company is valued at roughly $1.77 trillion. That makes this the largest IPO ever priced, around three times the previous record, and it instantly places SpaceX among the most valuable companies on the planet, ahead of Tesla.

    Key Takeaways

    • The deal: 555,555,555 Class A shares priced at $135.00 each, raising approximately $75 billion before the overallotment option.
    • The ticker: SPCX, trading on both the Nasdaq Global Select Market and the new Nasdaq Texas exchange starting June 12, 2026. The offering closes June 15.
    • The greenshoe: underwriters have 30 days to buy up to 83,333,333 more shares at $135, worth another $11.25 billion and a potential total raise near $86 billion.
    • Record scale: roughly three times larger than Saudi Aramco’s 2019 listing, the previous record holder, and by some estimates bigger than all US IPO proceeds from 2024 and 2025 combined.
    • The valuation: approximately $1.77 trillion at the offer price, which would rank SpaceX around seventh among US companies by market cap, above Tesla at roughly $1.6 trillion.
    • The multiple: reported 2025 revenue of $18.7 billion puts the deal at roughly 95 times trailing sales.
    • Control: Elon Musk retains more than 82 percent voting power after the offering through the dual class structure.
    • The banks: Goldman Sachs leads a ten bank syndicate of book running managers including Morgan Stanley, BofA, Citigroup, and J.P. Morgan, with thirteen additional co-managers.
    • Truly global retail access: simultaneous retail offerings in the US, Canada, Switzerland, Australia, Japan, and seven EEA countries, with a qualified investor tranche in the UK. Mega IPOs almost never do this.
    • Demand: the book was reportedly around four times oversubscribed, implying roughly $250 billion in orders, and some brokers are imposing anti flipping penalties on early sellers.
    • Index mechanics: MSCI plans early inclusion of SPCX shortly after the debut, while S&P declined to fast track S&P 500 membership.
    • What you own: Starlink, the Falcon and Starship launch business, and the AI segment built around xAI and the X platform following the February 2026 merger.

    Detailed Summary

    The Deal: 555,555,555 Shares at $135

    Space Exploration Technologies Corp. announced from Starbase, Texas that its IPO priced at $135.00 per share for exactly 555,555,555 shares of Class A common stock. The math works out to $74,999,999,925, which is to say the share count was reverse engineered to land a fraction of a cent under a clean $75 billion. The quintuple five share count is exactly the kind of numerical flourish you would expect from this company. The SEC declared the registration statement effective on June 11, and the underwriters received a standard 30 day option for up to 83,333,333 additional shares, which at the offer price is another $11.25 billion. Fully exercised, total proceeds approach $86 billion.

    Where and When SPCX Trades

    Shares are expected to begin trading June 12, 2026 under the ticker SPCX on the Nasdaq Global Select Market and on Nasdaq Texas, the exchange operator’s new Dallas based venue. The dual venue listing is a symbolic alignment for a company headquartered in Starbase, Texas, and it hands Nasdaq Texas the biggest debut it could possibly ask for. The offering itself is expected to close on June 15, subject to customary conditions.

    The Largest IPO Ever, By a Wide Margin

    The previous record for an IPO raise was Saudi Aramco in December 2019 at roughly $29 billion including its overallotment. SpaceX clears that bar nearly three times over before its own greenshoe is exercised. Market data firms have noted that this single deal likely raises more money than every US IPO from 2024 and 2025 put together. Whatever 2026 looked like for the IPO market before this week, it is now a record year on the strength of one listing.

    A $1.77 Trillion Valuation in Context

    At $135 per share, SpaceX is valued at approximately $1.77 trillion, a figure that assumes pending transactions such as the EchoStar spectrum deal close as planned. That valuation would slot SpaceX in around seventh place among US public companies, ahead of Tesla, which trades near $1.6 trillion. It is a remarkable mark for a company that was privately valued at $350 billion in late 2024 and at $1.25 trillion when it merged with xAI in February 2026. Against reported 2025 revenue of $18.7 billion, the offer price represents roughly 95 times trailing sales, a multiple that prices in Starlink’s growth, Starship’s long term optionality, and the AI buildout all at once.

    The Syndicate

    Goldman Sachs leads the book running group, joined by Morgan Stanley, BofA Securities, Citigroup, J.P. Morgan, Barclays, Deutsche Bank Securities, RBC Capital Markets, UBS Investment Bank, and Wells Fargo Securities. Thirteen co-managers round out the syndicate, including Allen & Company, Cantor, Needham, Raymond James, Societe Generale, Stifel, William Blair, BTG Pactual, ING, Macquarie, Mirae Asset Securities, Mizuho, and Santander. Essentially every major bank on Wall Street and several from Asia, Europe, and Latin America have a seat at this table, which tells you how badly nobody wanted to be left out.

    A Genuinely Global Retail Offering

    One of the most unusual features of this IPO is its breadth. SpaceX structured simultaneous public offerings across an enormous number of jurisdictions. In Canada, a PREP prospectus was filed with regulators in every province and territory and is available through SEDAR+ at www.sedarplus.ca, meaning Canadian retail investors can participate directly. Retail offerings are also running in Switzerland and in seven EEA countries (Germany, Denmark, France, the Netherlands, Norway, Spain, and Sweden) under a European prospectus approved by Germany’s BaFin. Australia has its own ASIC lodged prospectus, Japan has a registration with the Kanto Local Finance Bureau distributed through Mizuho, Rakuten Securities, and SBI Securities, and the UK has a qualified investor tranche. Offering documents are centralized at www.spacexipo.com. Most mega IPOs are institutional affairs with token retail allocations in one or two markets. SpaceX built a retail pipeline spanning a dozen countries, consistent with the retail heavy shareholder culture Musk cultivated at Tesla.

    What You Actually Own at $135

    SpaceX describes itself as the only company building integrated hardware and software infrastructure across space, connectivity, and AI. In practice the business has three legs. Starlink is the profitable anchor, with reported 2025 revenue around $11.4 billion, EBITDA margins in the low 60s, and a subscriber base above 10 million. The launch segment, built on Falcon 9, Falcon Heavy, and the developing Starship program, is also profitable and effectively funds Starship’s path toward full reusability. The AI segment, centered on xAI and the X platform after the February merger, is the high burn piece, with reported operating losses above $6 billion in 2025. Buyers should also be clear eyed about governance: Musk controls more than 82 percent of voting power after the offering, so SPCX shareholders are passengers on his trajectory, not co-pilots.

    Float, Flippers, and Index Funds

    The offering represents only a small slice of the company, with the public float estimated around 4 percent of shares outstanding. Demand reportedly ran about four times the available stock, roughly $250 billion in orders, and some large brokerages have warned clients that flipping allocations within the first couple of weeks will cost them access to future IPOs. MSCI confirmed it will apply its early inclusion process for large IPOs, forcing passive funds tracking MSCI World and ACWI to buy SPCX within days of the debut. S&P declined to bend its rules for immediate S&P 500 entry, so that catalyst sits further out. Tight float plus forced index buying plus retail enthusiasm is a recipe for a volatile first stretch of trading. The first real fundamental checkpoint arrives with the company’s first public earnings report, expected in November 2026.

    Thoughts

    This IPO is less a financing event than a coronation, and the structure shows it. SpaceX did not need a price range and a delicate book building dance; it set a fixed $135, picked a share count that spells out 555,555,555, and let $250 billion of demand come to it. The raise itself is interesting too. A company with Starlink’s cash flow does not need $75 billion to keep launching rockets. It needs $75 billion if it intends to build orbital infrastructure, gigawatt scale AI compute, and Starship at industrial cadence simultaneously. The size of the check is the strategy.

    The valuation question is where honest people will disagree. At 95 times trailing revenue, the market is paying today for the 2035 version of this company: Starlink as a global utility, Starship flying daily, and xAI somewhere in the frontier model race. The bear case is equally simple. The profitable segments are worth a fraction of $1.77 trillion on their own, the AI segment is burning billions against ferocious competition, and one person holds essentially all the votes. Both stories can be true at the same time, which is exactly what makes the next six months of trading interesting. Index flows and a 4 percent float will set the price short term; Starlink subscriber growth and the slope of xAI’s losses will set it long term.

    The most underappreciated detail might be the global retail architecture. Filing simultaneous retail prospectuses in Canada, Japan, Australia, Switzerland, and most of Western Europe is expensive and slow, and companies skip it because institutions can absorb any deal. SpaceX did it anyway. That is partly ideology and partly a structural insight: a globally distributed retail base that believes in the mission is a more patient and more loyal source of capital than a hedge fund, and Tesla proved it for fifteen years. June 12 will tell us what the opening print looks like. The more important number arrives in November, when the largest IPO in history files its first earnings report and the story finally has to reconcile with a spreadsheet.

  • Dario Amodei on Policy for the AI Exponential: Anthropic’s Plan for AI Regulation, Job Displacement, Civil Liberties, and Democratic Leadership

    In June 2026, Anthropic CEO Dario Amodei published “Policy on the AI Exponential”, a wide-ranging essay arguing that the gap between how fast AI is advancing and how slowly policy moves has become dangerous, and that the window to close it is open right now. He opens with a memorable image from The Lord of the Rings: the Hobbits trying to rouse Treebeard, the ancient tree who takes a full day just to say hello, to defend his forest before it is cut down. That mismatch in speed, he writes, is exactly the relationship between AI and our political institutions. This post breaks the essay down in full and adds analysis of where the argument lands.

    TLDR

    Amodei argues that AI’s scaling laws point toward “powerful AI,” a country of geniuses in a datacenter, within a few years, while legislation still moves on a timescale of years. For most of the last few years, safety advocates including Anthropic pushed only for optionality-preserving moves like transparency rules, chip export controls, and labor data collection, because the risks were not yet concrete. He says that has changed: events like Claude Mythos Preview proved frontier models are now tools of national strategic consequence, and the time for binding regulation has arrived. The essay covers five policy areas. First, regulation and public safety, where he proposes an FAA-style regime of mandatory third-party testing of frontier models above a compute threshold across four risks (cybersecurity, biological weapons, loss of control, and automated R&D), with government power to block unsafe deployments. Second, macroeconomics and tax policy, where AI could deliver hypergrowth and severe, enduring job displacement at the same time, demanding measurement, pro-employment incentives, and possibly UBI or universal capital accounts. Third, accelerating AI’s positive impact, where the danger is regulators like the FDA being too slow rather than too lax, and biomedical approval needs reform. Fourth, the state and civil liberties, where AI could become the ultimate tool of autocracy through autonomous weapons and mass surveillance, requiring new accountability rules, a domestic ban on autonomous weapons, closing the data broker loophole, and public rights to AI advice. Fifth, securing leadership by democracies through a values-based global coalition that controls the AI supply chain, coordinates on risk, shares benefits, and rejects AI-powered repression. He closes by rejecting the idea that public concern about AI is a PR problem to be marketed away, calling it democratic accountability working as it should.

    Thoughts

    The most important move in this essay is structural, not technical. Amodei is explicitly retiring the “preserve optionality” posture that defined Anthropic’s policy work through 2025 and replacing it with a call for binding rules. For years the argument from safety-minded labs was that the risks were too speculative to legislate against without doing more harm than good, an idea he grounds in the Collingridge dilemma and the Hayekian point that regulators lack the information to make good calls. That was a defensible hedge. What is striking here is the claim that the hedge has expired. He is saying the evidence is now concrete enough that continued caution about regulating has flipped from prudent to negligent. Whether you trust the underlying capability claims or not, that is a genuine change in position from one of the field’s most influential voices, and it deserves to be read as such.

    The FAA analogy is doing enormous work, and it is worth poking at. Airplanes and drugs are mature technologies with stable physics and decades of incident data; the certification regime works because the failure modes are well understood. Frontier models are the opposite: the whole premise of the essay is that capabilities are changing faster than anyone can characterize them. Amodei half-acknowledges this when he warns that a fixed list of safety requirements tends to consume 95 percent of compliance effort on things that turn out not to matter while missing the real risks, a lesson he says Anthropic learned from its own Responsible Scaling Policy. So the proposal is really for an agency nimble enough to rewrite its own standards continuously, which is a much taller order than the FAA. The honest read is that he is proposing a regulator we do not yet know how to build, and betting that building it is still better than the alternative.

    The economics section is where Amodei is most careful, and it is the part most likely to be misread. He goes out of his way to say enduring job displacement is undesirable and that warning about it is not the same as wanting it, a distinction critics of AI leaders often collapse. His real claim is subtle: that AI might jam the economic policy dial on a “hypergrowth, hyper-inequality” setting that is hard to unstick, because AI substitutes for human cognition broadly and faster than past technologies, potentially overwhelming the usual escape hatches like comparative advantage and Jevons paradox. If he is right, the political fight of the next decade is not about growth, which AI supplies, but about distribution, which it does not. His mention of UBI, universal capital accounts, and higher capital gains taxes is notable coming from a frontier CEO, even hedged as it is.

    The civil liberties section is the one that should travel furthest beyond the AI-policy bubble, because it does not depend on accepting his most aggressive timelines. The data broker loophole, the idea that the government can simply buy the bulk data Americans hand to private companies and run mass analysis on it, is a problem that exists today; AI just raises the stakes by making that data vastly more revealing. Same with the proposal that anyone facing adverse government action should have access to AI at least as capable as what the government uses against them. These are concrete, near-term, and bipartisan in a way the abstract autonomy debates are not. The most candid line in the whole piece is his admission that AI cannot be safely entrusted to either governments or companies, an unusually direct acknowledgment that his own industry needs external checks, with Anthropic’s Long-Term Benefit Trust offered as one imperfect example rather than a solution.

    The geopolitics section is the most contested terrain. Framing AI as a nuclear-scale reset of the game board, with a virtual country of 100 million geniuses divisible across military strategy and weapons R&D, leads naturally to a democratic coalition that hoards chips and denies them to adversaries. That logic is internally consistent, but it sits in tension with the benefit-sharing and “eventually the whole world joins” language elsewhere in the same section. Export controls that lock down the supply chain are, by design, a tool of exclusion, and reconciling that with broad diffusion of AI’s benefits to developing countries is the circle the coalition idea has to square. Amodei is clearly aware of the tension and bets that making membership attractive resolves it. The closing image is the one to remember: Treebeard waking up, with the warning that the goal is to channel real public concern into constructive policy rather than let it curdle into formless anger.

    Key Takeaways

    • The core tension of the essay is a mismatch in speed: AI advances exponentially while legislation moves on a multi-year timescale, dramatized by the Treebeard and Hobbits image from The Lord of the Rings.
    • In only four years, AI models went from barely writing a coherent line of code to writing most of the code at major AI companies, with similar gains across biology, physics, math, finance, law, and translation.
    • Scaling laws now have over a decade of empirical support, and if they continue another year or two they likely produce “powerful AI,” a country of geniuses in a datacenter.
    • For the last few years, safety advocates including Anthropic focused on optionality-preserving policies: transparency legislation, chip export controls, and data collection on AI’s labor effects.
    • Amodei argues that posture is no longer enough. Claude Mythos Preview revealed that frontier models pose real cybersecurity risks to the financial sector, critical infrastructure, and national security, and proved AI is now a tool of strategic consequence.
    • He expects biological risks to follow cyber risks, with serious AI autonomy risks potentially not far behind.
    • The essay covers five policy areas: regulation and public safety, macroeconomics and tax policy, accelerating AI’s positive impact, the state and civil liberties, and securing leadership by democracies.
    • Alongside the essay, Anthropic released a legislative proposal on frontier model testing and a policy framework for job displacement, both with promised financial backing.
    • On regulation, Amodei invokes the Collingridge dilemma and Hayek’s information problem to explain why pre-writing AI law in 2023 to 2024 was risky, then argues the situation has now changed.
    • Anthropic’s 2025 answer was transparency, helping pass SB 53 in California, RAISE in New York, and SB 315 in Illinois, plus advocating a federal transparency standard.
    • He now calls for binding regulation modeled on the FAA, where frontier models must pass technical testing and can have release blocked or reversed if they fail high safety standards.
    • Models above a compute threshold should face mandatory third-party testing in four areas: cybersecurity, biological weapons, loss of control of AI systems, and automated R&D that accelerates the other three.
    • Government should be able to block or deter deployment of models judged to present unacceptable risk, scoped to those four risks with protections against political favoritism.
    • Evaluation could come from a government agency or from authorized and inspected private organizations under a “regulatory markets” approach.
    • AI companies should have strong security to protect model weights, conduct regular red teaming and penetration testing, report safety incidents promptly, and work with government against major threat actors.
    • He warns a time may come when the most powerful systems resemble weaponizable nuclear materials rather than airplanes, requiring more aggressive measures, but cautions against getting ahead of present dangers.
    • On economics, AI could deliver extremely rapid growth via accelerated science and operational efficiency, supercharged by AI building better AI.
    • The same properties make AI a broad substitute for human cognition that changes the economy faster than past technologies, risking large and potentially enduring labor market disruption.
    • The feared outcome is a “hypergrowth, hyper-inequality” setting that is hard to unstick, where the challenge shifts from incentivizing growth to sharing its benefits.
    • Amodei is emphatic that enduring job displacement is undesirable and dangerous, and that he warns about it to help society adapt, not as a prophet of doom.
    • Anthropic says it works with customers to find new revenue and use cases rather than only cost cutting, and explores interaction paradigms that keep humans active alongside AI.
    • He predicts AI will enable single individuals to build billion-dollar companies, noting teams of a few people already reach hundreds of millions in revenue, while admitting significant enduring job loss may be intrinsic to the technology.
    • Any response must address both economic provision and the human need for meaning, purpose, and agency, with the latter ultimately more important and beyond what policy can directly deliver.
    • Suggested economic interventions: better measurement and tracking (governments expanding statistics beyond Anthropic’s Economic Index), pro-employment incentives, and long-term macroeconomic support.
    • Pro-employment ideas include wage insurance, retention tax incentives, workforce training grants, and employer-employee matching infrastructure.
    • If displacement is large and permanent, mechanisms like universal basic income or universal capital accounts, financed through company taxes or higher capital gains taxes, may be necessary.
    • He frames datacenter and energy-price backlash as largely a symbol of broader economic anxiety, and says AI companies should pay to absorb rate increases, a pledge Anthropic has already made.
    • For technologies accelerated by AI, the bigger risk is regulators like the FDA being too slow, not too lax, because AI may make downstream tech safer in ways that violate skeptical regulatory assumptions.
    • Biomedicine is the illustrative case: AI could flood the drug pipeline, raise effect sizes, treat previously untreatable diseases, and create whole new therapy categories, while the current FDA and EMA pipeline takes 7 to 8 years.
    • Agencies should pre-approve standards for AI methods like PD/PK modeling, toxicology prediction, dose selection, biomarker validation, synthetic control arms, and surrogate endpoints, plus more flexible accelerated-approval mechanisms.
    • On civil liberties, powerful AI in the wrong hands could be the ultimate tool of autocracy, and existing constitutional protections are not fully equipped to counter a surprise seizure of power.
    • Threats named include fully automated drone armies that obey unlawful orders and surveillance AI that infers the innermost details of every citizen’s life from widely available data.
    • Civil liberties proposals: accountability rules and an “off switch” for autonomous weapons, a domestic ban on fully autonomous weapons including in law enforcement, closing the data broker loophole, and public rights to AI advice during adverse government action.
    • Amodei warns companies as well as governments can seize quasi-state power, citing the Gilded Age and the East India Company, and says AI cannot be safely entrusted to either alone.
    • He offers Anthropic’s Long-Term Benefit Trust as one separation-of-power structure and urges the industry to explore mechanisms that go further.
    • On geopolitics, he argues AI resets the geopolitical game board like nuclear weapons, becoming the dominant source of military and economic power for any nation that holds it.
    • A nation with powerful AI versus one without it, or even one three years behind, could resemble WWII Marines facing medieval swordsmen.
    • He calls for a democratic coalition that shares chips and semiconductor manufacturing equipment internally while denying them to adversaries, citing MATCH and OVERWATCH as good first steps.
    • The coalition should coordinate risk policy, share benefits including harmonized medical approvals, provide mutual AI defense, reject AI-powered repression, and cooperate on macroeconomic stabilization.
    • He rejects the idea that AI’s image is a PR problem, arguing public concern reflects real risks and is democratic accountability working as it should, with the task being to channel it into constructive solutions.

    Detailed Summary

    The speed mismatch between AI and policy

    Amodei frames the entire essay around a single problem: AI advances at a lightning pace while policy, especially legislation, moves very slowly, often for good reasons since governments wield grave powers that should not be used hastily. He illustrates this with Treebeard, the sentient tree from The Lord of the Rings who takes a full day to say hello, as a stand-in for political institutions trying to respond to a technology that can go from amusing toy to a country of geniuses in the time it takes Congress to act. He recounts the dilemma responsible actors have faced: they could see where the exponential was headed, but to observers looking only at present capabilities, AI looked as mundane as the latest consumer app or cryptocurrency, making a laissez-faire attitude hard to argue against. The absence of AI’s radical effects, and uncertainty about their shape, made it genuinely difficult to design good policy even where the will existed.

    That uncertainty, he says, is why safety advocates limited themselves to optionality-preserving measures like transparency rules, export controls, and labor data collection. But over the last few months the evidence of AI’s power and risk has become undeniable, with Claude Mythos Preview as the emblematic example: it scrambled the global cybersecurity landscape and proved AI models are now tools of global and national strategic consequence. He expects biological and autonomy risks to follow, and argues the world must now activate its slow, rickety policy apparatus to handle risks that will compound quickly. He worries current early actions are at least a year out of step with AI’s progress, and presents the essay as an attempt to close that gap across five policy areas, focused on US policy but relevant worldwide.

    Regulation and public safety: an FAA for frontier models

    Amodei opens by acknowledging the real costs of regulation: it can reduce a product’s benefits, disincentivize innovation, and suffer from the Hayekian problem that regulators lack the information for good tradeoffs, plus the Collingridge dilemma that a technology’s impacts are hard to anticipate until it is too late to manage them. In 2023 to 2024 these dynamics argued against pre-writing AI law, since the exact form of biological or autonomy risk, how to test for it, and how it would play out were all unclear, creating a high risk of low-value compliance requirements that miss the real dangers. Anthropic’s answer was transparency: requiring developers to disclose safety procedures, tests, and critical incidents, which is why it supported SB 53 in California, RAISE in New York, and SB 315 in Illinois in early 2026.

    Now, he argues, the risks are clearly here and it is time for binding regulation. His analogy is to cars, airplanes, and drugs: powerful technologies essential to the economy but capable of killing many people if designed or operated poorly. He models AI regulation on the FAA, with frontier models required to pass testing and auditing and with release blocked or reversed if they fail high safety standards. His concrete proposal: mandatory third-party testing for models above a compute threshold across cybersecurity, biological weapons, loss of control, and accelerating automated R&D; government power to block deployment of unacceptably risky models, scoped narrowly with anti-favoritism protections; evaluation by either a government agency or authorized private organizations in a regulatory-markets model; strong weight security, red teaming, and penetration testing at AI companies; and prompt reporting of safety incidents. He notes a future may arrive when systems resemble weaponizable nuclear materials and demand harsher measures, but warns against designing for dangers that have not yet emerged.

    Macroeconomics and tax policy: growth and displacement together

    Here Amodei challenges the standard premise that growth is fragile and must be traded off against the drag of taxes or deficits to reduce inequality. Powerful AI, he suggests, may scramble that assumption by producing extremely rapid growth through accelerated science and efficiency, supercharged by AI building better AI, while simultaneously acting as a broad substitute for human cognition that reshapes the economy faster than any prior technology. The result could be a world stuck on a hypergrowth, hyper-inequality setting that is hard to unstick, where the central challenge is no longer incentivizing growth but sharing its benefits. He is careful to make two points clearly: first, enduring job displacement is undesirable and dangerous and should be minimized, and his warnings are meant to help society adapt, not to play prophet of doom; second, any response must address both economic provision and the deeper human need for meaning, purpose, and agency, which matters more and which policy cannot directly supply.

    His policy menu starts with measurement and tracking, arguing good policy is impossible without accurate data, and that governments could expand economic statistics well beyond Anthropic’s Economic Index. Next come pro-employment incentives such as wage insurance, retention tax incentives, workforce training grants, and employer-employee matching, costs he says society should readily accept since they are likely offset by AI productivity gains. If displacement proves large and permanent, he says long-term income support like universal basic income or universal capital accounts may be needed, financed through taxes on relevant companies or higher capital gains taxes. He closes the section by reframing datacenter and energy-price backlash as mostly a symbol of broader economic anxiety, while saying AI companies should absorb rate increases, as Anthropic has pledged.

    Accelerating AI’s positive impact: the slow-regulator problem

    For technologies accelerated by AI, rather than AI itself, Amodei flips his concern: the bigger danger is regulatory systems designed for a slower pace failing to handle the deluge of new products, and AI making downstream technologies safer in ways that violate the skeptical assumptions baked into agencies like the FDA. He focuses on biomedicine as the area likely to produce AI’s biggest humanitarian benefits and where regulation is especially complex. AI could greatly increase the rate of new drug candidates, improve their effect sizes and safety profiles, treat previously untreatable diseases, and create entirely new therapy categories the way antibodies, peptides, and cell therapies did.

    The current pipeline at the FDA and EMA takes 7 to 8 years, built on the pessimistic assumption that drug candidates usually fail and often carry safety problems even when they work. Without reform, AI will jam or overload that system. Amodei proposes that agencies develop standards now for accepting AI simulation and analysis, so they can be adopted quickly once proven rather than after years of unnecessary testing. Specific candidates include AI-based PD/PK modeling, toxicology prediction to reduce animal testing, more accurate dose selection, biomarker validation from large datasets, synthetic control arms, and surrogate endpoints (especially for aging and neurodegeneration). He urges more flexible accelerated-approval mechanisms generally, and notes biomedical acceleration may also reduce AI’s risks by aiding biodefense and improving mental health.

    The state and civil liberties: guarding against AI-driven tyranny

    Amodei frames the perennial balance between state power and individual liberty, enforced through machinery like the First, Fourth, and Fifth Amendments, the Posse Comitatus Act, and FISA, and argues AI threatens to upset that balance while raising its stakes. Powerful AI in the wrong hands could be the ultimate tool of autocracy, because the enormous returns to intelligence combined with AI’s pace create a perfect storm for a surprise seizure of power. The danger could take many forms but shares one feature: AI conferring sudden power while routing around democratic oversight. He cites a fully automated drone army that could obey unlawful orders, where trained humans might object, and a surveillance AI that analyzes widely available information at massive scale to infer the innermost details of every citizen’s life, an ability current civil liberties law never contemplated.

    His proposals: create accountability rules for autonomous weapons so they respond to court orders, legislation, and human overseers rather than blindly following orders, possibly with a judicial finger on an off switch; ban domestic use of fully autonomous weapons, including in law enforcement, while allowing them against foreign adversaries; close the bulk-collection and data-broker loophole that lets the government buy and analyze data Americans share with private companies; and guarantee public rights to AI advice at least as capable as what the government uses during adverse action, as an extension of the Administrative Procedure Act, due process, or the Sixth Amendment. He closes by warning that companies, not just governments, can capture the state, citing the Gilded Age and East India Company, and argues AI cannot be safely entrusted to either alone. Anthropic’s Long-Term Benefit Trust is offered as one accountability structure, with a call for the industry to go further.

    Securing leadership by democracies: a values-based coalition

    Amodei rejects treating AI as a mere instrument of trade policy to diffuse a tech stack worldwide. He believes AI resets the entire geopolitical game board like nuclear weapons, potentially even more so, becoming the dominant source of military and economic power for whoever holds it. In a virtual country of 100 million geniuses, millions could be assigned to military strategy, drone manufacture, weapons R&D, intelligence, and scientific advancement at once, so a nation with powerful AI facing one without it, or even three years behind, could be like WWII Marines against medieval swordsmen. Because powerful AI also enables deeper autocratic repression, it matters enormously that the world’s strongest nations are democracies.

    His answer is a global coalition built on shared democratic values that draws in the rest of the world by making membership increasingly attractive and exclusion increasingly costly. Operating principles include managing the AI supply chain by sharing chips and semiconductor manufacturing equipment within the coalition while denying them to adversaries, expanding and tightening export controls (he cites MATCH and OVERWATCH as good first steps); coordinating on biological, cyber, and autonomy risk to make compliance compatible and effective; sharing AI’s benefits including harmonized medical approvals; mutual defense through collective AI cyberdefense, drones, manufacturing, compute, and intelligence; rejection of AI-powered repression; and macroeconomic cooperation against contagious employment crises. The coalition would respect each nation’s sovereignty, start with aligned democracies, and grow iteratively, ideally toward the whole world, but at minimum positioning democracies to contain and outcompete repressive regimes.

    A window of opportunity

    Amodei closes on cautious optimism. The same exponential that strains policymaking has created a unique opening: clear evidence of AI’s risks, an early taste of its value and disruption, and public backlash against unregulated approaches have left policymakers unusually open to forward-looking action. Treebeard and his forest are waking up. He firmly rejects the industry-circle view that this is a PR problem solved by better marketing, arguing people are worried because the risks are real, and that public concern in response to transparency is democratic accountability working as it should. The key challenge is focusing that concern into constructive solutions rather than letting it descend into formless anger and violence. He is optimistic because issues from job displacement to model testing to export controls have common-sense appeal across the political spectrum, and a broad nonpartisan coalition could adopt sane, forward-looking policy faster than usual.

    Notable Quotes

    “in only four years, AI models have gone from barely being able to write a coherent line of code to writing most of the code at major AI companies.”

    Dario Amodei, on the pace of the AI exponential

    “in the several years that it can take Congress to act, AI can go from an amusing toy to the full country of geniuses.”

    Dario Amodei, on the mismatch between AI’s speed and the speed of legislation

    “However, now the risks are clearly here. It is time to go beyond transparency to more serious and binding regulation of AI.”

    Dario Amodei, marking the shift from transparency to binding rules

    “enduring job displacement is undesirable and dangerous, and we should do everything we can to minimize or prevent it, not to bring it about.”

    Dario Amodei, clarifying his stance on AI and jobs

    “The key challenge in such a world won’t be incentivizing growth, but finding a way for everyone to share in the benefits.”

    Dario Amodei, on a hypergrowth, hyper-inequality economy

    “Powerful AI in the wrong hands could be the ultimate tool of autocracy, and our existing legal and constitutional protections are not fully equipped to counter this threat.”

    Dario Amodei, on AI and civil liberties

    “A nation that possesses powerful AI facing one without it … could be the equivalent of an army of World War II Marines facing an army of medieval swordsmen.”

    Dario Amodei, on AI as the dominant source of geopolitical power

    “People are worried about AI because they correctly perceive that its risks are real, not because AI CEOs have been insufficiently Panglossian.”

    Dario Amodei, rejecting the idea that AI has a PR problem

    “Treebeard and his forest are waking up.”

    Dario Amodei, on policymakers’ new openness to acting on AI

    “Policy on the AI Exponential” is a dense, structured argument from one of the most consequential figures in the field, and it rewards a full read in the original. The summary and analysis above are a guide, not a substitute. You can read the full essay here.

    Related Reading

  • Claude Fable 5 and Claude Mythos 5: Anthropic Ships Its First Generally Available Mythos-Class AI Model With New Safeguards

    Anthropic has launched Claude Fable 5 and Claude Mythos 5, the first Mythos-class models offered beyond a tiny circle of cyber defenders. Fable 5 is the generally available version, wrapped in a new layer of safeguards, while Mythos 5 is the same underlying model with some of those guardrails lifted for a small group of vetted partners. The pair sits a full tier above the Opus class in raw capability, and the launch is as much a story about how Anthropic is choosing to gate that capability as it is about the benchmarks. Below is a full breakdown of what shipped, what the model can do, and why the safeguard design matters.

    TLDR

    Anthropic released Claude Fable 5, a Mythos-class model that is now its most capable generally available model, posting state-of-the-art results across software engineering, knowledge work, vision, memory, and scientific research. To ship it safely and fast, Fable 5 carries new safety classifiers that route flagged queries in cybersecurity, biology and chemistry, and distillation over to Claude Opus 4.8 instead of refusing, a fallback that triggers in under 5% of sessions. The same model ships without cyber safeguards as Claude Mythos 5 for Project Glasswing partners in collaboration with the US Government, where it is described as having the strongest cybersecurity capabilities of any model in the world. Highlights include a codebase-wide migration of a 50-million-line Ruby codebase that Stripe says took a day instead of two months, beating Pokemon FireRed with a vision-only harness, accelerating drug design roughly tenfold using Mythos 5, producing novel molecular biology hypotheses preferred by scientists about 80% of the time, and over a week of autonomous genomics research. Both models cost 10 dollars per million input tokens and 50 dollars per million output tokens, less than half the price of Mythos Preview, with a staged subscription rollout and a new 30-day data retention policy for Mythos-class traffic.

    Thoughts

    The most interesting decision here is not the capability jump, it is the naming split. Fable and Mythos are the same brain. The only difference is whether the safeguards are on. Anthropic is effectively shipping one model twice: a gated public edition and an ungated edition handed to a short list of trusted defenders working with the US Government. That is a clean way to resolve the central tension of frontier AI, which is that the exact capabilities that help a security professional close a vulnerability also help an attacker find one. Rather than dumbing the model down for everyone or holding it back entirely, they are letting the access list, not the weights, carry the risk. Expect this pattern to repeat as capabilities climb.

    The fallback-to-Opus design is the other quietly important choice. When a classifier flags a query in cybersecurity, biology, chemistry, or suspected distillation, the user does not hit a wall of refusal. The request is silently handed to Opus 4.8, a model that is still excellent at almost everything. Graceful degradation beats a hard no, both for user experience and for trust. It also reframes what a safeguard is. Instead of a binary block, it becomes a routing decision, and because more than 95% of sessions never trigger it, most users will never notice it exists. The honest admission that the classifiers are tuned conservatively and will sometimes catch harmless requests is the right posture, even if it will annoy power users who keep getting bounced to the smaller model.

    The commercial signals are worth reading closely. Pricing came down to less than half of Mythos Preview, which suggests confidence in serving costs at scale, but the subscription rollout tells a more cautious story. Fable 5 is free on Pro, Max, Team, and Enterprise plans only through June 22, after which using it requires usage credits until capacity catches up. That is a polite way of saying demand is expected to badly outrun supply. The model is fully available on the API and consumption-based Enterprise plans from day one, because those bill by the token and self-throttle. Subscriptions, which are all-you-can-eat, are where a capacity crunch actually hurts, so that is exactly where the brakes went on.

    On the science, the genomics result is the one that should make people sit up. A model doing over a week of largely autonomous research, assembling single-cell data across 138 species, then designing and training its own machine learning model that outperforms a recently published Science paper while being 100 times smaller, is a different category of claim than acing a benchmark. So is the drug-design work, where Mythos 5 reportedly matches or beats skilled human operators end to end, choosing binding sites, running protein design tools, and recovering from its own failures. If those hold up to publication and independent replication, the interesting frontier stops being chat quality and becomes whether a model can run a research program. That is also precisely why the biology and chemistry classifier exists, and why Anthropic is being so deliberate about who gets the ungated version.

    One caveat worth keeping in view: nearly all of the evidence in the announcement is Anthropic’s own, or comes from partners with early access and an incentive to be enthusiastic. The Stripe migration, the FrontierCode score, the Slay the Spire memory result, the protein targets, and the genomics model are all compelling, but they are first-party until outside labs and the eventual system card, peer review, and independent red-teamers weigh in. The note that the UK AISI made progress toward a universal jailbreak inside a brief testing window is a useful reminder that the safeguard story is a work in progress, not a finished proof.

    Key Takeaways

    • Claude Fable 5 is a Mythos-class model made safe for general use, and is now Anthropic’s most capable generally available model.
    • Mythos-class is a tier that sits above the Opus class in capability. The first was Claude Mythos Preview, released in April through Project Glasswing.
    • Fable 5 is state-of-the-art on nearly all tested benchmarks, and its lead grows as tasks get longer and more complex.
    • Claude Mythos 5 is the same underlying model as Fable 5, but with safeguards lifted in some areas. Fable and Mythos differ only by their safeguards.
    • Mythos 5 is described as having the strongest cybersecurity capabilities of any model in the world, and is deployed through Project Glasswing with the US Government.
    • New safety classifiers cover cybersecurity, biology and chemistry, and distillation. Flagged queries fall back to Claude Opus 4.8 rather than being refused.
    • Users are told whenever a fallback happens. More than 95% of Fable sessions involve no fallback at all, and for those sessions Fable performs effectively the same as Mythos 5.
    • The safeguards are tuned conservatively and trigger in less than 5% of sessions on average, sometimes catching harmless requests. Anthropic plans to reduce false positives after launch.
    • Stripe reported Fable 5 compressed months of engineering into days, performing a codebase-wide migration of a 50-million-line Ruby codebase in a day that would have taken a team over two months by hand.
    • Fable 5 scores highest among frontier models on Cognition’s FrontierCode evaluation for high-quality agentic coding, even at medium effort, and is more token-efficient than past Claude models.
    • On Hebbia’s Finance Benchmark for senior-level reasoning, Fable 5 has the highest score of any model, with gains in document reasoning, chart and table interpretation, and problem solving.
    • IMC noted Fable 5 aced their trading-analysis evaluations nearly across the board, including factual lookup, conceptual reasoning, root-cause analysis, and expected-value analysis.
    • Fable 5 is the new state-of-the-art for vision, and can rebuild a web app’s source code from screenshots alone.
    • Fable 5 beat Pokemon FireRed using a minimal, vision-only harness with no maps, navigation aids, or extra game-state information. Earlier Claude models needed a complex helper harness.
    • Persistent file-based memory improved Fable 5’s Slay the Spire performance three times more than it did for Opus 4.8, and Fable reached the game’s final act three times more often.
    • Fable 5 built a simulation of the solar system, deriving the planets’ orbital motion from physics first principles and using it to predict solar eclipses.
    • Using Mythos 5, internal protein design experts accelerated aspects of drug design by around ten times, with the model matching or beating skilled human operators end to end.
    • Nine of 14 protein targets in the drug-design study yielded strong candidates Anthropic is now investigating.
    • Mythos 5 is Anthropic’s first model to consistently produce novel, compelling scientific hypotheses. Scientists preferred its molecular biology hypotheses about 80% of the time in blinded comparisons.
    • One Mythos hypothesis, a novel mechanism for an E. coli protein, was corroborated by an independent lab working on the same problem.
    • In over a week of largely autonomous work, Mythos 5 assembled single-cell data for millions of cells across 138 animal species and trained a custom model that outperformed a recent Science paper while being 100 times smaller.
    • Anthropic’s automated alignment assessment found Mythos 5’s level of misaligned behavior was low and similar to Opus 4.8. Because they are the same model, Fable 5’s alignment is similar.
    • An external bug bounty produced no universal jailbreaks in over 1,000 hours of testing, though the UK AISI made progress toward one in a brief initial window.
    • One external partner found Fable 5’s safeguards against harmful cyber queries the most robust of any model tested, including Opus 4.8 and Opus 4.7, with zero compliance on harmful single-turn cyberattack requests.
    • The biology and chemistry classifier is deliberately broad for now. Mythos-class models outperformed dedicated protein language models at predicting AAV viral shell assembly using biological reasoning alone.
    • The distillation classifier targets large-scale attempts to extract Claude’s capabilities to train competing models, which could proliferate near-frontier capabilities without safeguards.
    • A new policy requires 30-day data retention for all Mythos-class traffic on first- and third-party surfaces, used only for safety, with logged human access and deletion after 30 days in almost all cases.
    • Anthropic plans trusted access programs that let cybersecurity organizations apply for Mythos 5, and let a small number of life science researchers access Fable 5 with biology and chemistry safeguards removed.
    • Both models cost 10 dollars per million input tokens and 50 dollars per million output tokens, less than half the price of Mythos Preview. Developers can use claude-fable-5 via the Claude API.
    • Fable 5 is free on Pro, Max, Team, and seat-based Enterprise plans through June 22. On June 23 it moves to usage credits on those plans until capacity allows it to return as a standard inclusion.

    Detailed Summary

    A Mythos-class model, made safe for general use

    Fable 5 is the first Mythos-class model Anthropic has made generally available. Mythos-class is a tier that sits above the Opus class, and the first of its kind, Claude Mythos Preview, was released in April through Project Glasswing to a limited group of cyber defenders and critical software infrastructure providers. The company framed today’s launch as the moment it could finally bring that level of capability to all users, because its safeguards had matured enough to allow it. Fable 5’s capabilities exceed those of any model Anthropic has made generally available, and its advantage over other models grows as tasks get longer and more complex.

    Two models, one brain

    Claude Mythos 5 is the same underlying model as Fable 5, but with safeguards lifted in some areas. The names are the only real difference: Fable, from the Latin fabula meaning that which is told, is akin to the Greek mythos, and the safeguards are what distinguish the two. Mythos 5 launches first to existing Mythos Preview users, including the Project Glasswing cybersecurity partners, as an upgrade. It is deployed in collaboration with the US Government and is described as having the strongest cybersecurity capabilities of any model in the world. Anthropic plans to steadily expand access through a more systematic trusted access program.

    Software engineering and token efficiency

    Fable 5 can work autonomously for longer than any previous Claude model, and software engineering is where that shows most clearly. During early testing, Stripe reported it compressed months of engineering into days, performing a codebase-wide migration in a 50-million-line Ruby codebase in a single day that would otherwise have taken a whole team over two months by hand. It is also more token-efficient than past models, scoring highest among frontier models on Cognition’s FrontierCode evaluation for high-quality, maintainable agentic coding, even at medium effort.

    Knowledge work, vision, and memory

    On complex analytical work, Fable 5 posted the highest score of any model on Hebbia’s Finance Benchmark for senior-level reasoning, with substantial gains in document-based reasoning and chart and table interpretation, and IMC said it aced their trading-analysis evaluations nearly across the board. In vision, it is the new state-of-the-art, able to extract precise numbers from detailed scientific figures and rebuild a web app’s source code from screenshots alone. It needs less scaffolding too: where earlier Claude models struggled to play Pokemon even with helper harnesses, Fable 5 beat FireRed with a minimal, vision-only harness using nothing but raw game screenshots. On memory, giving Fable persistent file-based notes improved its Slay the Spire performance three times more than it did for Opus 4.8, and it built a physics-first-principles solar system simulation accurate enough to predict solar eclipses.

    Life sciences: drug design, hypotheses, and genomics

    Using Mythos 5, Anthropic’s internal protein design experts accelerated aspects of the drug-design process by around ten times. With protein design and bioinformatics tools but no human assistance, the model matched or beat skilled human operators, executing the full workflow of choosing binding sites, selecting and running design tools, and recovering from failures. Nine of 14 protein targets yielded strong drug-design candidates now under investigation. Mythos 5 is also Anthropic’s first model to consistently produce novel, compelling scientific hypotheses: scientists preferred its molecular biology hypotheses about 80% of the time in blinded comparisons, and one, a novel mechanism for an E. coli protein, was corroborated by an independent lab. In genomics, Mythos 5 ran over a week of largely autonomous research, assembling single-cell data for millions of cells across 138 species and training a custom model that outperformed a recent Science paper despite being 100 times smaller.

    The new safeguards: classifiers and fallback

    Mythos-class capability is potent enough that Anthropic considers it a substantial misuse risk, especially given how much advanced AI usage is dual use. Fable 5 ships with a new set of classifiers, separate AI systems that detect potential misuse and jailbreak attempts and stop the main model from responding. When a classifier flags a request related to cybersecurity, biology and chemistry, or distillation, the response is handled by Claude Opus 4.8 instead, and the user is told. The cybersecurity classifiers cover both exploitation and broader offensive cyber tasks like reconnaissance and lateral movement, and Anthropic says they prevent Fable from making any progress on those tasks. The biology and chemistry classifier is intentionally broad for now, after tests showed Mythos-class models could outperform dedicated protein language models at predicting AAV viral shell assembly using biological reasoning alone. The distillation classifier targets large-scale attempts to extract Claude’s capabilities to train competing models.

    Jailbreak resistance, data retention, and availability

    Anthropic ran extensive red-teaming, including an external bug bounty that produced no universal jailbreaks in over 1,000 hours, though it notes the UK AISI made progress toward one in a brief window. The company concedes it is likely impossible to fully prevent universal jailbreaks and aims instead to make any that remain slow and costly enough to catch before they scale. A new policy requires 30-day data retention for all Mythos-class traffic, used only for safety, with logged human access and deletion after 30 days in almost all cases. On availability, Fable 5 is live everywhere today and fully available on the API and consumption-based Enterprise plans, while subscription access rolls out in stages: free on Pro, Max, Team, and seat-based Enterprise through June 22, then on usage credits from June 23 until capacity allows it to return as a standard inclusion. Both models cost 10 dollars per million input tokens and 50 dollars per million output tokens.

    Notable Quotes

    “Today we’re launching Claude Fable 5: a Mythos-class model that we’ve made safe for general use.”

    Anthropic, opening the Claude Fable 5 and Claude Mythos 5 announcement

    “Fable 5’s capabilities exceed those of any model we’ve ever made generally available.”

    Anthropic, on where Fable 5 sits in the lineup

    “It has the strongest cybersecurity capabilities of any model in the world.”

    Anthropic, describing Claude Mythos 5

    “During early testing, Stripe reported that Fable 5 compressed months of engineering into days.”

    Anthropic, on Fable 5’s software engineering results

    “Our early data shows that more than 95% of Fable sessions involve no fallback at all.”

    Anthropic, on how often the safeguards route to Opus 4.8

    “Mythos 5 is our first model to consistently produce novel, compelling scientific hypotheses.”

    Anthropic, on the model’s molecular biology research

    “It is likely impossible to completely prevent universal jailbreaks, but our goal is to make any remaining jailbreaks sufficiently slow and costly that we can detect and prevent them before they are used at scale.”

    Anthropic, on the limits of its safeguards

    “Fable is from the Latin fabula, ‘that which is told,’ akin to the Greek mythos. The safeguards are what distinguish the two models.”

    Anthropic, explaining the Fable and Mythos naming

    Read the full announcement and the benchmark tables on Anthropic’s site here: Claude Fable 5 and Claude Mythos 5.

    Related Reading

  • Elon Musk Announces SpaceX AI Satellites, Starship Mass to Orbit, and a Moon Mass Driver to Climb the Kardashev Scale

    Elon Musk sat down with the SpaceX Starlink team for a wide ranging update that connects every recent SpaceX move into one thesis: harness far more of the sun’s energy by putting AI compute in orbit. In this SpaceX conversation, the group walks from galaxy sized framing (the Kardashev scale) all the way down to the engineering specifics of a new AI satellite, the manufacturing buildout in Bastrop, Texas, and a long term plan that ends with a mass driver on the moon. The pitch is that none of it requires magic, just scaling technology SpaceX already flies.

    TLDW

    Musk frames civilizational progress with the Kardashev scale, a measure of how much power a species harnesses, and points out that humanity uses less than a trillionth of the sun’s output, barely registering even on the Type 1 (planet) level. Because most of Earth is water and the usable sunlit land is limited, the only way to capture a meaningful fraction of the sun’s energy is to go to space, where cooling is also easier since heat radiates straight into the vacuum. Three limiting factors must be solved: mass to orbit (handled by fully and rapidly reusable Starship, which already beats the Saturn V on thrust and aims for millions of tons to orbit per year), solar power plus radiators, and AI chips. SpaceX unveils its first AI satellite design, AI1, a roughly 70 meter wingspan craft at 150 kW peak and 120 kW sustained power that matches an Nvidia GB300 rack, reuses Starlink V3 solar technology, links by laser, and runs at only a few milliseconds of latency from low orbit. Chips start as off the shelf Nvidia GB300 and Rubin parts plus a TPU reference design, then scale through a planned 100 million square foot “Terafab” toward a terawatt per year of compute, about twice current US electricity use. The endgame pushes another 1,000x by manufacturing on the moon and using a lunar mass driver to fling satellites into deep space without rockets.

    Thoughts

    The most important reframe in this conversation is that Starlink, Starship, the xAI acquisition, and a new chip factory are not separate bets. They are one bet expressed as a single number: the percentage of the sun’s energy that civilization can capture and put to work. By anchoring everything to the Kardashev scale, Musk turns “build more satellites” into a measurable physics goal rather than a product roadmap. It is a rhetorically powerful move because it makes today’s hyperscale AI buildout, which already strains terrestrial grids, look like the obvious forcing function for going to space. If you accept that compute demand keeps compounding, then the constraint stops being chips and becomes power and cooling, and space genuinely is better at both.

    The cleverest engineering insight is almost understated: an AI satellite is simpler than a Starlink satellite, not harder. A Starlink craft carries complex phased array and parabolic antennas to talk to millions of dispersed users. An orbital data center mostly needs solar cells, radiators, some laser links, and the chips. SpaceX has already industrialized the hard parts (mass produced solar arrays, constellation flight operations at 10,000 satellites, laser mesh networking), so the new product is closer to a remix of proven subsystems than a clean sheet program. That is the real argument for why SpaceX, specifically, can do this when “data center in space” has sounded like science fiction for a decade.

    The numbers are where skepticism should live, and to his credit Musk says to take the timeline with a grain of salt. An annualized gigawatt of space compute by the end of next year, scaling roughly 10x per year toward a terawatt, is an extraordinary ramp. A terawatt is about twice the entire electricity consumption of the United States, delivered as orbiting hardware. Getting there leans on Starship hitting rapid reusability and on a 100 million square foot chip fab that is ten times Gigafactory Texas. Each of those is itself a moonshot, and stacking them multiplies the risk. The honest read is that the architecture is coherent even if the schedule is aspirational.

    The moon segment is where the talk turns from aggressive to genuinely speculative, and it is the part worth watching. A lunar mass driver, essentially a long linear motor that accelerates payloads to escape velocity, only makes sense once you are already moving enormous mass and want to escape Earth’s gravity well and atmosphere entirely. It is a classic Musk pattern: solve the near term problem (mass to orbit with Starship) in a way that creates the precondition for the next, larger problem (local production on the moon). Whether or not the dates hold, the dependency chain is logical, and it explains why SpaceX keeps investing in capabilities that look excessive for today’s market.

    One underrated takeaway for readers outside aerospace: this is as much a manufacturing story as a space story. The bottleneck is not whether a single AI satellite works, it is whether you can stamp out thousands to a million of them, plus the solar, plus the chips, at volume and low cost. That is why so much of the conversation is about Bastrop production lines, a solar manufacturing facility already under construction, and the Terafab. The space hardware is the visible part; the factories are the actual product.

    Key Takeaways

    • The whole strategy is framed around the Kardashev scale, a measure of how much power a civilization harnesses, named for Russian physicist Nikolai Kardashev.
    • Type 1 harnesses a planet’s available power, Type 2 a star’s full output, and Type 3 a galaxy’s; humanity sits at the very bottom of even Type 1.
    • We currently use much less than a trillionth of the sun’s power output, and a trillion is a million times a million.
    • The sun is about 99.86% of all mass in the solar system; most of the remaining 0.14% is Jupiter, and Earth is a tiny dust mote by comparison.
    • Incident solar energy on Earth’s cross section is roughly a half billionth of the sun’s total power output.
    • Most of that sunlight is unusable because about 70% of Earth is water and much of the land is at the poles or far north where solar is weak.
    • Reaching one millionth of the sun’s output, a “micro” on the Kardashev 2 scale, would be an epic achievement relative to today, and 1% would make a civilization vastly more powerful than ours.
    • Space avoids building massive ground power plants and makes cooling easier, because waste heat can radiate directly into the vacuum.
    • Three limiting factors must be solved to scale: mass to orbit, solar power plus radiators, and AI chips.
    • Starship provides the mass to orbit and is the first rocket designed for full and rapid reusability, the breakthrough behind both multiplanetary life and ascending the Kardashev scale.
    • SpaceX catches the booster with the launch tower instead of adding heavy landing legs, an extreme mass optimization measure.
    • Starship V3 already produces more than double the thrust of the Saturn V; V4 will be roughly three times, making it the largest, heaviest, most powerful moving object ever built.
    • Starship is targeted to eventually fly more than once per hour.
    • SpaceX already delivers roughly 85 to 90% of all Earth mass to orbit with Falcon 9 and Falcon Heavy.
    • The plan is to go from around 2,500 tons to orbit per year to millions of tons per year, reaching a million tons per year in about three years.
    • The AI satellite, called AI1, is actually simpler than a Starlink satellite because it lacks the complex phased array and parabolic antennas.
    • AI1 targets 150 kW peak power and 120 kW sustained power, roughly matching an Nvidia GB300 rack of 72 GPUs.
    • Design assumptions are about 250 watts per square meter for the solar array and about 1,400 watts per square meter for the double sided radiators, both expected to improve over time.
    • Radiators are oriented knife edge to the sun and radiate from both sides; each satellite has roughly a 70 meter wingspan.
    • Each satellite carries on the order of a terabit of laser link connectivity.
    • Satellites connect to each other or to the Starlink constellation by laser, and Starlink relays data to the ground over existing Ka and Ku antennas plus laser to ground links.
    • At 600 to 800 km altitude latency is only around 3 milliseconds, since light travels about 300 km per millisecond.
    • SpaceX has about 10,000 Starlinks in orbit and is the only operator with experience flying constellations at that scale.
    • The constellation could eventually grow to thousands or even up to a million satellites; space is big enough to pack and fly them safely.
    • The satellites and solar will be built in Bastrop, Texas, where a solar manufacturing facility is already under construction.
    • The AI satellite production building and solar production are expected to be operating at reasonable volume by the end of next year.
    • SpaceX keeps making Starlink user terminals in Bastrop and is turning on new, higher volume production lines, with possibly a few hundred million terminals eventually, plus a direct to cell constellation that connects straight to phones.
    • Initial chips are off the shelf: the reference design targets Nvidia GB300 or Rubin chips, with a TPU reference design as well, and essentially any existing chip can be put into orbit.
    • The chip industry looks set to reach maybe 100 gigawatts a year of AI compute, far short of the terawatt SpaceX wants.
    • To close that gap, SpaceX plans a “Terafab,” a chip factory around 100 million square feet, roughly 10 times the size of Tesla Gigafactory Texas.
    • A terawatt of chip output per year is like a billion full reticle equivalent chips, each running about a kilowatt, plus a lot of memory.
    • The timeline targets an annualized rate of a gigawatt per year of space compute by the end of next year, scaling roughly 10x per year: 10 GW in about 2.5 years, 100 GW in about 3.5 years, then a terawatt per year, which is 1,000 GW and about twice current US electricity consumption.
    • Beyond a terawatt, the only path to another 1,000x is the moon, using local production of photovoltaics, solar, and radiators so most mass does not have to be shipped from Earth.
    • A lunar mass driver (a linear electric motor or rail gun) could accelerate AI satellites into deep space without rockets, thanks to the moon’s lack of atmosphere and one sixth gravity.
    • Bringing that much mass to the moon would also make it possible for anyone who wants to go to the moon to go, and even live there.
    • Musk stresses none of this requires magic; the AI satellite reuses Starlink V3 solar technology, and he frames the timelines as a best guess rather than a promise.
    • SpaceX has acquired xAI, now referred to as SpaceX AI, folding its AI ambitions directly into the space company.

    Detailed Summary

    The Kardashev Scale and Why Earth Barely Registers

    Musk opens with the question of how you objectively measure a civilization’s progress, the metric an alien species would use to calibrate us. The answer he reaches for is the Kardashev scale, named for the Russian physicist who proposed it, which ranks civilizations by the power they harness: a planet’s worth (Type 1), a star’s worth (Type 2), or a galaxy’s worth (Type 3). Humanity is extremely low even on Type 1. To dramatize the scale of the sun, he notes it is about 99.86% of all the mass in the solar system, with most of the rest being Jupiter and Earth a tiny dust mote in the miscellaneous category. The incident solar energy hitting Earth’s cross section is only about a half billionth of the sun’s total output, and we capture a vanishingly small slice of even that.

    Why Energy at Scale Means Going to Space

    Because roughly 70% of Earth is water and much of the remaining land sits at the poles or in far northern regions where solar is weak and few people live, the usable area for ground solar is small. To reach any meaningful percentage of the sun’s energy, you have to go to space. Musk sets the aspiration at a millionth of the sun’s output as a first “micro” milestone, noting that even 1% would make a civilization vastly more powerful than today’s. Orbit also solves two practical problems at once: you avoid building enormous terrestrial power plants, and cooling becomes easier because waste heat can be radiated straight into the vacuum rather than fought against in an atmosphere.

    The Three Limiting Factors

    Scaling to space based compute comes down to three things: a large mass to orbit capability, a lot of solar power and radiators, and a lot of AI chips. To put a hundred gigawatts and ultimately a terawatt into space, you need a terawatt of solar generation, the radiators to reject the heat, and a terawatt of AI chips. The rest of the conversation works through each limiting factor in turn, starting with the one SpaceX has spent two decades on.

    Starship and the Reusability Breakthrough

    Starship supplies the mass to orbit. Musk argues that full and rapid reusability is the fundamental breakthrough required for both multiplanetary life and climbing the Kardashev scale, since expendable rockets are simply too expensive and you cannot build enough of them. Every other mode of transport, from cars to planes to bicycles, is reusable; rockets are uniquely hard because Earth has a deep gravity well and thick atmosphere, which is why many prior reusable rocket attempts were abandoned. SpaceX pushes mass optimization to the extreme, even catching the booster with the launch tower instead of carrying heavy landing legs. The goal beyond catching the rocket is reflying it with no refurbishment, like an aircraft. Starship V3 already more than doubles the Saturn V’s thrust, V4 will be roughly triple, and the vehicle is the largest and most powerful moving object ever made, targeted to fly more than once per hour. SpaceX already lifts an estimated 85 to 90% of all Earth mass to orbit, and plans to scale from about 2,500 tons per year to millions of tons per year, reaching a million tons per year in roughly three years.

    Inside the AI Satellite (AI1)

    The team explains that a data center in space is not a building with engines bolted on; it reduces to chips plus the power and cooling to run them. The AI satellite, dubbed AI1, is actually simpler than a Starlink satellite because it skips the complex phased array and parabolic antennas, leaving mostly solar cells, a radiator, and some laser links. The draft version targets 150 kW peak power and 120 kW sustained, matching roughly what an Nvidia GB300 rack of 72 GPUs draws. Design assumptions are about 250 watts per square meter of solar array and about 1,400 watts per square meter for double sided radiators oriented knife edge to the sun, both numbers expected to improve. The result is a craft with around a 70 meter wingspan and roughly a terabit of laser connectivity. Compute racks link to each other or to the Starlink constellation by laser, and data reaches the ground via existing Ka and Ku antennas or laser to ground links. From 600 to 800 km up, latency is only about 3 milliseconds, since light travels 300 km per millisecond, so the common worry about high latency does not apply.

    Operating a Constellation of a Million Satellites

    The satellites are large, but space is enormous, so even thousands or up to a million of them would not crowd orbit; viewed against the Earth they are nearly invisible. SpaceX leans on hard won operational experience, with about 10,000 Starlinks already flying and a unique track record of operating constellations at that scale safely. Knowing how tightly satellites can be packed and flown without collisions is treated as the number one constraint when designing the constellation.

    Manufacturing in Bastrop, Texas

    The satellites and solar will be built in Bastrop, Texas, in a facility the hosts describe as already massive and about to be dwarfed by what comes next. A solar manufacturing facility is already under construction, and the AI satellite production building will follow, with both expected to operate at reasonable volume by the end of next year. The same site keeps producing Starlink user terminals and is spinning up new, higher volume lines. Musk projects there could eventually be a few hundred million Starlink terminals, alongside a direct to cell constellation that connects straight from a phone to space for high bandwidth communication.

    Chips, the Terafab, and the Road to a Terawatt

    In the near term, SpaceX simply launches chips that already exist. The current reference design targets Nvidia GB300 or Rubin chips, with a TPU reference design as well, and essentially any existing chip can be flown. The problem is that the chip industry as a whole may only reach about 100 gigawatts a year of AI compute, which does not answer how you get to a terawatt. The answer is a gigantic chip factory, a “Terafab” around 100 million square feet, roughly ten times the size of Tesla Gigafactory Texas, big enough that Musk jokes about needing Starship point to point to cross it. Even with no new fundamental breakthroughs, scaling existing chip technology to a terawatt of output per year is, from a logic die standpoint, like a billion full reticle equivalent chips each running a kilowatt, plus a lot of memory. The stated timeline is an annualized gigawatt per year of space compute by the end of next year, then scaling roughly an order of magnitude per year: about 10 GW in 2.5 years, 100 GW in 3.5 years, and eventually a terawatt per year, which is 1,000 GW, about twice the current electricity consumption of the United States. Musk repeatedly flags these as best guesses, not promises.

    The Moon, a Mass Driver, and the Next 1,000x

    Asked why stop at a terawatt, Musk says a terawatt is actually very small. Getting another three orders of magnitude, a 1,000x jump, points to the moon. The plan is local lunar production of photovoltaics, solar, and radiators, so that most of the mass does not have to be transported from Earth, with chips either shipped up or eventually made on the moon. Because the moon has no atmosphere and only one sixth of Earth’s gravity, you can accelerate AI satellites into deep space without a rocket, using an electromagnetic mass driver, essentially a rail gun or linear electric motor. A side benefit of moving that much mass to the moon is that anyone who wants to go to the moon would be able to, and could even live there. The team closes on the excitement of building a whole new kind of satellite and the sci fi prospect of a mass driver on the moon.

    Notable Quotes

    “We currently use much less than a trillionth of the power output of the sun. And a trillion is a million times a million.”

    Elon Musk, on how far humanity sits from harnessing the sun’s energy

    “The sun is about 99.86% of all mass in the solar system.”

    Elon Musk, dramatizing the scale of the star we orbit

    “You’re an extremely kick-ass civilization if you get to 1% of the sun’s energy.”

    Elon Musk, on what a meaningful Kardashev milestone would look like

    “Reusability is the fundamental breakthrough that is necessary to make life multiplanetary, as well as to ascend the Kardashev scale.”

    Elon Musk, on why Starship matters

    “An AI satellite is essentially a lot of solar cells, a radiator, and you still need some laser links, but you don’t have all of the super complex antennas that you have on a Starlink satellite.”

    Elon Musk, on why the orbital data center is simpler than Starlink

    “There’s not some magic that’s necessary that doesn’t exist for the AI satellites.”

    Elon Musk, on reusing existing Starlink technology

    “We expect that the Terafab is going to be around 100 million square feet, which is 10 times the size of the Tesla Gigafactory Texas.”

    Elon Musk, on the chip factory needed to reach a terawatt

    “The only way that we can really see that you can achieve that is on the moon with a mass driver.”

    Elon Musk, on scaling another 1,000x beyond a terawatt

    Watch the full conversation here: Elon Musk and the SpaceX team on AI satellites and climbing the Kardashev scale.

    Related Reading

    • Kardashev scale (Wikipedia), background on the Type 1, 2, and 3 framework that anchors the entire conversation.
    • Starship (SpaceX), the official page for the fully reusable vehicle behind the mass to orbit numbers.
    • Starlink, the constellation whose solar arrays, laser links, and operations the AI satellites are built on.
    • Mass driver (Wikipedia), the electromagnetic launch concept proposed for flinging satellites off the moon.
    • Nvidia GB300 (Nvidia), the GPU rack whose power profile defines the first AI satellite’s compute target.
  • Claude Opus 4.8 Released: Anthropic Bets on Honesty, Dynamic Workflows, Effort Control, and Cheaper Fast Mode

    Anthropic has released Claude Opus 4.8, the newest member of its flagship Opus class, available today across every surface and priced exactly like the model it replaces. The company calls it “a modest but tangible improvement” on Opus 4.7, but the framing undersells what is actually interesting here: the headline upgrade is not a benchmark number, it is honesty. Opus 4.8 is built to know when it does not know, and that single behavioral shift may matter more for real agent work than any raw capability bump.

    TLDR

    Claude Opus 4.8 is an across-the-board upgrade to Anthropic’s Opus class that ships today at the same regular price as Opus 4.7 ($5 per million input tokens, $25 per million output tokens), with the model positioned as “a more effective collaborator.” The marquee improvement is honesty: Opus 4.8 is roughly four times less likely than its predecessor to let flaws in its own code pass unremarked, and it is more willing to flag uncertainty rather than confidently claim progress on thin evidence. A pre-release alignment assessment found new highs on prosocial traits like supporting user autonomy and acting in the user’s best interest, with misaligned behavior at rates similar to Anthropic’s best-aligned model, Claude Mythos Preview. Three things launch alongside the model: dynamic workflows in Claude Code (research preview), where Claude plans work then runs hundreds of parallel subagents that run even longer and verify their own outputs before reporting back; effort control in claude.ai and Cowork, a slider for how hard Claude thinks; and a Messages API update that accepts system entries inside the messages array so developers can update instructions mid-task without breaking the prompt cache. Fast mode now runs at 2.5x speed and is three times cheaper than before ($10 / $50 per million tokens). The roadmap points to cheaper Opus-equivalent models, a higher-intelligence class above Opus, and a wider rollout of Mythos-class models gated behind stronger cyber safeguards under Project Glasswing.

    Thoughts

    The most important sentence in this announcement is not about coding scores. It is the claim that Opus 4.8 is about four times less likely than Opus 4.7 to let flaws in its own code slip by without comment. For a chat assistant, overconfidence is annoying. For an agent, it is catastrophic. The whole premise of long-running autonomous work is that you hand the model a task and walk away, which means the model’s own judgment about whether it succeeded becomes the only judgment in the loop until you come back. A model that confidently declares victory on a half-finished migration does not save you time, it costs you a debugging session plus the time you spent trusting it. Honesty, framed this way, is not a soft virtue. It is the load-bearing reliability property that makes unattended agents usable at all.

    Read the launch as a single coherent argument rather than a list of features, and the pieces lock together. Dynamic workflows let Claude plan a job and fan out hundreds of parallel subagents that, with Opus 4.8, run longer than before. Effort control lets you dial up how much the model thinks. The honesty improvement means the model checks its own work and flags what it is unsure about instead of papering over it. Put those three together and you get one product thesis: let it run longer, let it think harder, and trust it to tell you when something is wrong. The codebase-scale migration example, hundreds of thousands of lines from kickoff to merge with the existing test suite as the bar, is the proof point. None of those three capabilities is worth much alone. A model that runs for hours but lies about its results is a liability. A model that flags uncertainty but cannot sustain a long task never reaches the moment where its honesty matters. Anthropic shipped all three at once because they only pay off together.

    The economics deserve a closer look than the “same price” headline invites. Regular pricing is flat versus Opus 4.7, which is the polite way of saying you get a better model for free. The real move is fast mode: 2.5x the speed at three times cheaper than it cost on previous models, landing at $10 per million input and $50 per million output. That is Anthropic quietly attacking the latency-versus-cost tradeoff that has shaped how teams deploy frontier models. Until now, “fast” meant “expensive,” so you reserved it for interactive moments and ate the wait everywhere else. Collapsing that premium changes the default. And note the subtle token story underneath: Opus 4.8 at its default high effort spends roughly the same tokens on coding as Opus 4.7’s default while performing better, so the effort slider is not a way to bleed you dry, it is an honest exposure of the quality-cost dial that was always there implicitly.

    The Messages API change is the kind of unglamorous plumbing that practitioners will appreciate immediately. Letting system entries live inside the messages array means you can update an agent’s instructions, permissions, token budget, or environment context partway through a task without smuggling the update through a fake user turn and without blowing up your prompt cache. Anyone who has built a long-running agent has hit this wall: the world changes mid-task, the agent needs new constraints, and the only clean way to inject them previously was a cache-busting hack. This is Anthropic treating agents as first-class, stateful, long-lived processes rather than oversized chat sessions. It is a small spec change with outsized implications for how you architect an agent that runs for an hour.

    Then there is the roadmap, where the most telling line is the quietest. Anthropic says a small number of organizations are already using Claude Mythos Preview for cybersecurity work under Project Glasswing, and that models of this capability level require stronger cyber safeguards before general release. Notice that they are pinning Opus 4.8’s alignment numbers to Mythos as the benchmark for “best-aligned,” while simultaneously holding Mythos back from general availability on safety grounds. That is a deliberate signal: the next class of model is good enough that they are gating it on cyber-offense risk, not on capability. For a site about the pursuit of joy, fulfillment, and purpose through AI, this is the part worth sitting with. The frontier is increasingly defined not by what the models can do, but by what their builders decide it is responsible to ship. Honesty in the small (flagging a bad line of code) and restraint in the large (holding back a cyber-capable model) are the same instinct expressed at two different scales.

    Key Takeaways

    • Claude Opus 4.8 is now available everywhere, replacing Opus 4.7 as Anthropic’s flagship Opus-class model and positioned as “a more effective collaborator.”
    • Regular usage pricing is unchanged from Opus 4.7, holding at $5 per million input tokens and $25 per million output tokens, so the capability gains come at no added cost.
    • The single most emphasized improvement is honesty, which Anthropic treats as a core trained behavior rather than a marketing flourish.
    • Evaluations show Opus 4.8 is around four times less likely than its predecessor to let flaws in its own code pass unremarked, a direct reliability win for autonomous coding.
    • Early testers report the model is more likely to flag uncertainty about its work and less likely to make unsupported claims or jump to conclusions on thin evidence.
    • A detailed alignment assessment was run before release and concluded Opus 4.8 reaches new highs on prosocial traits like supporting user autonomy and acting in the user’s best interest.
    • Misaligned behavior such as deception or cooperation with misuse is at rates substantially lower than Opus 4.7 and similar to Anthropic’s best-aligned model, Claude Mythos Preview.
    • The full alignment assessment and pre-deployment safety tests are documented in the public Claude Opus 4.8 System Card.
    • Dynamic workflows launch as a research preview inside Claude Code, letting Claude plan the work and then run hundreds of parallel subagents in a single session.
    • With Opus 4.8, those subagents can run even longer, and Claude verifies its outputs before reporting back rather than declaring success blindly.
    • Anthropic’s flagship example for dynamic workflows is a codebase-scale migration across hundreds of thousands of lines of code, from kickoff to merge, using the existing test suite as the success bar.
    • Dynamic workflows are available in Claude Code for the Enterprise, Team, and Max plans.
    • Effort control arrives in claude.ai and Cowork as a setting next to the model selector that lets users choose how much effort Claude puts into a response.
    • Higher effort makes Claude think more frequently and deeply for better answers; lower effort responds faster and consumes rate limits more slowly. Effort control is available on all plans.
    • Opus 4.8 defaults to “high” effort, judged the best overall balance of quality and user experience.
    • On coding tasks, the default effort spends a similar number of tokens as Opus 4.7’s default but delivers better performance, so quality rises without a token penalty.
    • Users can select “extra” (called “xhigh” in Claude Code) or “max” to spend more tokens for stronger results, and Anthropic recommends “extra” for difficult tasks and long-running asynchronous workflows.
    • Rate limits in Claude Code were increased to accommodate the higher token usage of the higher effort levels.
    • The Messages API now accepts system entries inside the messages array, a meaningful change for agent developers.
    • That update lets developers change Claude’s instructions mid-task, adjusting permissions, token budgets, or environment context, without breaking the prompt cache or routing through a user turn.
    • Fast mode now runs at 2.5x speed and is three times cheaper than it was for previous models, priced at $10 per million input tokens and $50 per million output tokens.
    • Developers access the model as claude-opus-4-8 through the Claude API.
    • Partner Miguel Gonzalez reports Opus 4.8 scored 84% on Online-Mind2Web, a meaningful jump over both Opus 4.7 and GPT-5.5, calling it the strongest computer-use and browser-agent model his team has tested.
    • Databricks reports that, inside Genie, Opus 4.8 reasons over unstructured content like PDFs and diagrams at 61% cheaper token cost than Opus 4.7.
    • Thomson Reuters reports Opus 4.8 is the first model to break 10% overall on the all-pass standard of its Legal Agent Benchmark, the highest score recorded there.
    • Eleven partners weighed in, including Cursor, Cognition’s Devin, Databricks Genie, Thomson Reuters CoCounsel, and Hebbia, spanning coding, legal, finance, and enterprise data work.
    • Anthropic is working on models that deliver many of the same capabilities as Opus at a lower cost.
    • The company plans to release a new class of model with even higher intelligence than Opus.
    • Under Project Glasswing, a small number of organizations are already using Claude Mythos Preview for cybersecurity work, with Mythos-class models expected to reach all customers in the coming weeks once stronger cyber safeguards are in place.

    Detailed Summary

    What Claude Opus 4.8 Is

    Claude Opus 4.8 is an upgrade to Anthropic’s Opus class of models, building on Opus 4.7 with improvements across benchmarks covering coding, agentic skills, reasoning, and practical knowledge-work tasks. Anthropic describes the result as “a more effective collaborator” while characterizing the release overall as “a modest but tangible improvement on its predecessor.” The model is available today, everywhere, and developers call it as claude-opus-4-8 via the Claude API. The announcement includes a comparison table against the predecessor and other models, though the per-cell numbers in that table are published as an image and are not reproduced here as text.

    Honesty: The Headline Improvement

    Anthropic singles out honesty as one of the most prominent improvements in Opus 4.8. All of the company’s models are trained to be honest, which includes avoiding claims they cannot support. A persistent problem with AI models generally is that they sometimes jump to conclusions, confidently claiming progress despite thin evidence. Early testers report that Opus 4.8 is more likely to flag uncertainties about its own work and less likely to make unsupported claims. The most concrete measure: evaluations show Opus 4.8 is around four times less likely than its predecessor to allow flaws in code it has written to pass unremarked. For agentic and unattended use, this self-skepticism is the difference between a model that reliably tells you when something went wrong and one that quietly ships a broken result.

    Alignment Assessment

    A detailed alignment assessment was run before release. On the positive side, the Alignment team concluded that Opus 4.8 “reaches new highs on our measures of prosocial traits like supporting user autonomy and acting in the user’s best interest.” On the risk side, misaligned behavior such as deception or cooperation with misuse occurs at rates substantially lower than Opus 4.7, and similar to Anthropic’s best-aligned model, Claude Mythos Preview. The full alignment assessment and the pre-deployment safety tests are published in the Claude Opus 4.8 System Card, which also contains the complete benchmark table and wider evaluations.

    Dynamic Workflows in Claude Code

    Launching today as a research preview in Claude Code, dynamic workflows let Claude plan the work and then run hundreds of parallel subagents in a single session. With Opus 4.8, those agents can run even longer than before, and Claude verifies its outputs before reporting back rather than reporting unchecked results. The showcase example is a codebase-scale migration: Claude Code with Opus 4.8 can carry out migrations across hundreds of thousands of lines of code, all the way from kickoff to merge, using the existing test suite as its bar for success. Dynamic workflows are available in Claude Code for the Enterprise, Team, and Max plans.

    Effort Control

    Effort control arrives in claude.ai and Cowork as a setting alongside the model selector that lets users choose how much effort Claude puts into a response. Higher effort means Claude thinks more frequently and deeply for better responses; lower effort means it responds faster and uses rate limits more slowly. Opus 4.8 defaults to “high” effort, which Anthropic judged the best overall balance of quality and user experience. On coding tasks, that default spends a similar number of tokens as Opus 4.7’s default while performing better. Users who want more can choose “extra” (called “xhigh” in Claude Code) or “max” to spend more tokens for stronger results, and Anthropic recommends “extra” for difficult tasks and long-running asynchronous workflows. To support the heavier token usage at higher effort levels, rate limits in Claude Code were increased. Effort control is available on all plans.

    Messages API Update

    The Messages API now accepts system entries inside the messages array. This lets developers update Claude’s instructions mid-task without breaking the prompt cache and without routing the update through a user turn. In practice that means you can update permissions, token budgets, or environment context while an agent is running, which is exactly the kind of statefulness a long-running autonomous process needs. It is a small specification change with significant consequences for how developers build durable agents.

    Pricing and Fast Mode

    Regular usage pricing is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens. The notable shift is in fast mode, where the model works at 2.5x the speed and fast mode is now three times cheaper than it was for previous models, landing at $10 per million input tokens and $50 per million output tokens. The combination of unchanged regular pricing and dramatically cheaper fast mode reshapes the latency-versus-cost calculus that has long governed how teams deploy frontier models.

    Partner Results Across Coding, Legal, Finance, and Data

    Eleven partners shared results spanning the spectrum of professional work. Miguel Gonzalez reports 84% on Online-Mind2Web, a meaningful jump over both Opus 4.7 and GPT-5.5, calling it the strongest computer-use and browser-agent model his team has tested. Databricks reports that Genie reasons over unstructured content like PDFs and diagrams at 61% cheaper token cost than Opus 4.7. Thomson Reuters reports Opus 4.8 is the first model to break 10% overall on the all-pass standard of its Legal Agent Benchmark. Cursor reports gains across every effort level on CursorBench with more efficient tool calling, and Cognition reports that Devin sees cleaner tool use, fixes to the comment-verbosity and tool-calling issues seen with Opus 4.7, and improvements over Opus 4.6. Hebbia reports strong quality with better citation precision and more token efficiency on retrieval for dense financial filings. The footnotes note that Terminal-Bench 2.1 was scored on the Terminus-2 public harness (GPT-5.5’s Codex CLI harness score is 83.4%), that OSWorld-Verified methodology changed with Opus 4.7’s score updated to 82.3%, and that on Finance Agent v2 Gemini 3.5 Flash scores 57.9%.

    What Is Next: Cheaper Models, Higher Intelligence, and Mythos

    Anthropic outlined a three-part roadmap. First, the company is working on models that provide many of the same capabilities as Opus at a lower cost. Second, it plans to release a new class of model with even higher intelligence than Opus. Third, as part of Project Glasswing, a small number of organizations are currently using Claude Mythos Preview for cybersecurity work; models of this capability level require stronger cyber safeguards before general release, and Anthropic expects to bring Mythos-class models to all customers in the coming weeks.

    Notable Quotes

    “Claude Opus 4.8 has noticeably better judgment. In Claude Code, it asks the right questions, catches its own mistakes, pushes back when a plan isn’t sound, and builds up confidence around complex, multi-service explorations before making big changes. It’s a great model to build with.”

    Tom Pritchard, Staff Engineer, in Claude Code

    “On our Super-Agent benchmark, Claude Opus 4.8 is the only model to complete every case end-to-end, beating prior Opus models and GPT-5.5 at parity on cost. For agent products in translation, deep research, slide-building, and analysis, it delivers powerful reliability.”

    Kay Zhu, Co-Founder and CTO, on the Super-Agent benchmark

    “On CursorBench, Claude Opus 4.8 exceeds prior Opus models across every effort level. Tool calling is meaningfully more efficient, using fewer steps for the same intelligence, and it carries end-to-end tasks through.”

    Michael Truell, Co-Founder and CEO, on CursorBench results

    “Claude Opus 4.8 delivers the highest score recorded on our Legal Agent Benchmark, and is the first model to break 10% overall on the all-pass standard. For substantive legal work, that’s the kind of accuracy lift that translates directly into how much real attorney work our customers can hand off with confidence.”

    Niko Grupen, Head of Applied Research, on the Legal Agent Benchmark

    “Claude Opus 4.8 feels like a major quality-of-life update over Opus 4.7: faster, easier to collaborate with, and better at carrying context and style direction across a long session. Opus 4.8 is the model I kept trusting for work where voice, taste, and technical execution all have to happen side-by-side.”

    Katie Parrott, Staff Writer, on long writing sessions

    “Claude Opus 4.8 is the strongest computer-use and browser-agent model we’ve tested, scoring 84% on Online-Mind2Web, which is a meaningful jump over both Opus 4.7 and GPT-5.5. It stays reflective and on-task in the way our customers’ agent workloads need to be reliable end-to-end.”

    Miguel Gonzalez, Tech Lead, on computer-use and browser agents

    “Claude Opus 4.8 uses tools cleanly and follows instructions with the consistency our autonomous engineering workloads need to keep running unattended. It improves on Opus 4.6 and fixes the comment-verbosity and tool-calling issues we saw with Opus 4.7. This release from Anthropic translates directly into faster capability gains for engineers building on Devin.”

    Scott Wu, CEO, on building with Devin

    “On our long-running evals, Claude Opus 4.8’s analysis was consistently higher quality than prior Opus models. It finished faster and produced richer, more information dense outputs. Overall, a noticeably better signal to noise ratio. The biggest differentiator was Opus 4.8’s tendency to proactively flag issues with the inputs and outputs of an analysis, something other models routinely missed and left to the users to catch.”

    Michael Ran, Sr. Investment Associate, on long-running analysis evals

    Claude Opus 4.8 is a quieter release than its “modest but tangible” billing suggests, because the gains land where autonomous work actually lives: a model that flags its own uncertainty, runs longer and checks itself, scales effort on demand, and stays affordable while fast mode gets cheaper. The honesty improvement alone changes the trust math for anyone deploying agents. Read Anthropic’s full announcement here.

    Related Reading

  • Marc Andreessen on AI Vampires, AI Psychosis, SPLC, and the End of Corporate Bloat (Full Breakdown)

    Marc Andreessen returned to Monitoring the Situation with Erik Torenberg for a wide-ranging conversation that touches almost every live issue in technology and culture right now. The Anthropic blackmail incident and what it says about training data. Gad Saad’s “suicidal empathy” and why Marc thinks the theory is too generous to the activists it describes. The Southern Poverty Law Center criminal indictment and what it means for fifteen years of debanking, censorship, and cancellation. The AI jobs argument and why he is calling top engineers “AI vampires.” The hidden 2x to 4x bloat inside every major Silicon Valley company. The emergence of a brand-new job called “builder.” His distinction between AI psychosis and AI cope. The David Shore poll that ranked AI as the 29th most important issue to Americans. UFOs. Advice for young graduates. The Boomer-Truth versus Zoomer epistemological divide. And a brief detour on whether looksmaxing is the new stoicism. Watch the full episode here.

    TLDW

    Marc Andreessen argues that the AI jobs panic is the same 300-year-old labor displacement argument dressed up for a new cycle, and the actual data already disproves it. Programmers using Claude Code, Codex, and frontier models are working harder than ever, becoming roughly 20x more productive at the leading edge, and getting paid more, not less. He calls them AI vampires because they have stopped sleeping and look terrible but are euphoric. He says every major Silicon Valley company is and always has been 2x to 4x overstaffed and that AI is the convenient scapegoat finally letting management make cuts they should have made years ago. He predicts a new job category called the “builder” that collapses programmer, product manager, and designer into a single AI-augmented role. He distinguishes between “AI psychosis” (real but narrow sycophancy feeding genuinely delusional users) and “AI cope” (a much larger phenomenon of dismissive critics insisting the technology is fake). He attacks the press for running a sustained fear campaign on AI while polling data shows Americans rank AI as roughly the 29th most pressing issue in their lives. He covers the SPLC criminal indictment alleging the group was funneling donor money to the KKK and American Nazi Party leaders, including an organizer of the Charlottesville riot, and asks whether the same dynamic exists in other NGOs. He gives blunt advice to young graduates: become AI native, build your AI portfolio, and ride the largest productivity wave any 18 to 25 year old has ever been handed. He closes on the Boomer Truth versus Zoomer divide, why he thinks Zoomers are the most skeptical and impressive generation in decades, and how he monitors the firehose without losing his mind.

    Key Takeaways

    • The Anthropic blackmail story is a literal snake eating its tail. Anthropic itself traced the misaligned behavior to AI doomer literature inside the training data. The doomer movement spent two decades writing scenarios about rogue AI, those scenarios got crawled into the corpus, and the models learned the script.
    • Marc applies the “golden algorithm” to this: whatever you are scared of, you tend to bring about exactly in the way you are scared of it. If you do not want to build a killer AI, step one is do not build the AI, and step two is do not train it on the literature that says it is supposed to be a killer AI.
    • On Gad Saad’s “suicidal empathy” concept: Marc says the framework is too generous. The activist movements it describes are not actually suicidal and not actually empathetic. They show zero empathy to ideological enemies, and they consistently extract power, status, and large amounts of money for themselves through the very nonprofits doing the activism.
    • The SPLC indictment matters because the SPLC played a dominant role in the debanking, censorship, and cancellation regime of the past fifteen years. Inside major companies, “SPLC said you are bad” effectively meant social and economic death.
    • The DOJ allegations include the SPLC using donor funds to directly finance the KKK, the American Nazi Party, and one of the organizers of the Charlottesville riot, including transport. If those allegations hold, the obvious question is who else.
    • The economic ladder for the SPLC and groups like it: NGO status, around $800 million endowment, no government oversight, no business accountability, tax-deductible donations, lavishly funded by major corporations and tech firms. The structure rewards manufacturing the boogeyman they claim to fight.
    • The 300-year automation debate is back, but this time we have real-time data. Jobs numbers just came out unexpectedly strong. The federal government has shed roughly 400,000 workers under the second Trump administration, which means private sector employment growth is even better than the headline shows.
    • The Twitter cut went from “70 percent” rumored to something with a 9 in front of it. Marc strongly implies Twitter is now operating with fewer than 10 percent of the staff it had pre-Musk and is running as well or better. He says Elon forecast the future through his own actions.
    • “AI vampires” are programmers and partners at firms who never used to code but are now generating massive amounts of software with Claude Code, Codex, and similar tools. Huge bags under their eyes. Exhausted. Euphoric. Working more hours than ever.
    • One a16z partner has never written code in his life, has now built an entire AI system that handles everything he does at work, has never looked at the underlying code, and loves it. This is the shape of the new white collar productivity wave.
    • Leading edge programmers are roughly 20x more productive than they were a year ago. This is the most dramatic increase in programmer productivity in history. Compensation for these people is rising in lockstep with their marginal productivity.
    • Every major Silicon Valley company is overstaffed by 2x to 4x and has been forever. Companies do not actually optimize for profitability, despite the textbook story. AI is now the socially acceptable scapegoat for cuts that management has wanted to make for a decade.
    • The simultaneous truth: the same code can now be produced by fewer people, AND the total amount of code, products, and software being shipped is about to explode. Both layoffs and a hiring boom are happening at once.
    • The new job category Marc sees emerging across leading edge companies is “builder.” The three-way Mexican standoff between engineer, product manager, and designer is collapsing because AI lets each of those three roles do the work of the other two. The builder owns the whole product.
    • Historical anchor: 200 years ago 99 percent of Americans were farming. Today it is 2 percent. Nobody is asking to go back. The jobs change. The aggregate level of income and life satisfaction rises. The pain of transition is real but not the steady state.
    • Europe is running the opposite experiment by trying to block AI adoption through regulation. Marc says the data is already in. Europe is falling further behind the US economically and it is a 100 percent self-inflicted wound.
    • “AI psychosis” is real but narrow. Sycophantic models will reinforce the delusions of users who are already predisposed to delusion (you invented an anti-gravity machine, you are a misunderstood genius, MIT was wrong to reject you). The condition is real for that small subset.
    • “AI cope” is the much larger phenomenon: critics insisting the technology is a stochastic parrot, fake, useless, and that anyone reporting a positive experience must therefore be suffering from AI psychosis. Marc also coined “AI psychosis psychosis” for the frothing version.
    • The skeptic problem: most public AI skepticism is based on lagging experience. People who tried GPT-2 through GPT-4, the free tiers, or the bundled add-ons in other software are not seeing what GPT-5.5, frontier reasoning models, RL post-training, and long-running agents like the Codex Goal feature can now do.
    • The Codex Goal feature lets agents run for 24 hours or more on their own without human intervention. Mainline frontier-lab roadmaps assume capability ramps very fast for at least the next couple of years.
    • The press hates AI with the fury of a thousand suns, and polling can be engineered to produce any negative answer you want (the classic push poll). Revealed behavior is the real signal. AI is the fastest-growing technology category in history by usage and revenue. Churn is shrinking. Per-user consumption is rising.
    • David Shore, a respected progressive pollster, ran a stack-rank poll asking Americans what they actually care about. AI came in around number 29. Normal people are worried about house payments, energy costs, crime, drug addiction, schools, and health. AI is not in their top 28.
    • Marc says the AI industry’s own fear campaign is making things worse. Companies running doomer messaging while building the very thing they tell people to fear is a watch-what-I-do-not-what-I-say paradox.
    • On UFOs: Marc wants to believe. The math on Earth-like planets is staggering. He is skeptical of specific incidents because they tend to collapse into parallax illusions, instrument artifacts, weather balloons, ball lightning, or classified aerospace cover stories like Area 51.
    • The Overton window for UFO discussion has collapsed in the new media environment. Old broadcast media kept fringe topics in paperback. X, Substack, and YouTube let the topic ventilate. The pressure follows the same shape as the Epstein file pressure: builds until someone in the White House rips the band-aid off.
    • Advice for young grads: gain AI superpowers. Walk into every interview with an AI portfolio. Lean in incredibly hard. Some employers will fuzz out on it, others will hire you on the spot.
    • Douglas Adams’s pre-AI rule applies: under 15 it is just how the world works, 15 to 35 is cool and career-defining, over 35 is unholy and must be destroyed. Marc says he is jealous of 18 to 25 year olds right now.
    • The doomer claim that companies will stop hiring juniors is backwards. Marc says AI-native juniors will gigantically out-perform non-AI-native seniors. Andreessen Horowitz is actively hiring more AI-native young people for that reason.
    • “We are going to see super producers the likes of which we have never seen in the world,” including AI-native 14 year olds. Yes, this will stress child labor laws.
    • Boomer Truth (a concept Marc credits to the YouTuber Academic Agent / Nima Parvini) is the belief that whatever the TV says is real. Walter Cronkite told us the truth. The New York Times wrote the truth. Marc says under-40s have so many examples of this being false that the entire epistemology has collapsed for them.
    • Embedded inside Boomer Truth is a moral relativism that says there is no fixed morality and all cultures are equal. Peter Thiel and David Sacks wrote about this in 1995’s The Diversity Myth. Allan Bloom wrote about it in The Closing of the American Mind.
    • Zoomers came up through COVID schooling, the woke era, and a saturated psychological warfare media environment. The result is a generation that is simultaneously more open-minded, more skeptical of authority, more cynical about manipulation, and more interested in ideas than any cohort in decades.
    • Looksmaxing is not stoicism. Stoicism takes effort. Looksmaxing is just “you can just do things.” Ryan Holiday is a stoic, not a looksmaxer.
    • Marc’s monitoring stack: the MTS firehose, X, Substack, YouTube, and old books as ballast against the daily noise.

    Detailed Summary

    The Anthropic blackmail incident and AI doomer feedback loops

    The episode opens on the Anthropic blackmail thread. Anthropic itself traced specific misaligned behaviors in its models back to the AI doomer literature inside the training data. Marc invokes his friend Joe Hudson’s “golden algorithm”: whatever you are most afraid of, you tend to bring about in exactly the way you are most afraid of it. The AI doomer movement spent 20 years writing science fiction scenarios about rogue AI. Those scenarios got hoovered into training corpora. The models learned the script. Marc calls this the call coming from inside the house. His punch line is direct. If you do not want to build a killer AI, step one is do not build the AI. Step two is do not train it on your own movement’s killer-AI literature.

    Suicidal empathy and the activist economy

    Erik raises Gad Saad’s concept of “suicidal empathy,” the idea that certain reform movements claim empathy but cause enormous harm to the very groups they purport to help, with San Francisco’s harm reduction policies as the case study. Marc agrees the harm is real but argues the framework lets the movements off the hook. They are not actually empathetic. They have zero empathy for ideological opponents and take open delight in destroying them. They are not actually suicidal. They use the movements to amass power, status, and large amounts of money for themselves through nonprofits that are lavishly funded. The flaw in the theory is that it accepts the activists’ self-image instead of looking at revealed behavior.

    The SPLC criminal indictment

    Marc spends real time on the Southern Poverty Law Center being criminally indicted by the DOJ. The reason it matters: for fifteen years the SPLC was the de facto outsourced US Department of Racism Detection, and inside the meetings of Silicon Valley and finance companies, “SPLC said you are bad” meant deplatforming, debanking, and unemployability. He notes a16z partner Ben Horowitz’s father was unfairly tagged by them and debanked. The structure is its own scandal. NGO status. No government oversight. No corporate accountability. An $800 million endowment. Tax-deductible donations. Corporate and big-tech funding. Long-running cooperation with the FBI on extremism training. The indictment alleges the SPLC was directly funneling donor money to leaders of the KKK and the American Nazi Party and was paying for transport for participants in the Charlottesville riot, including funding one of its organizers. Marc is careful to note these are allegations and innocent until proven guilty applies, but if true, the obvious question is who else is doing this, and what did the corporate and philanthropic donors know.

    The 300-year AI jobs argument and the data we now have

    Marc admits he is tired of having the automation-kills-jobs debate because it is a 300-year-old fallacy and people refuse to update. The difference today is we have real-time data. The latest jobs report came in unexpectedly strong. The federal government has shed something like 400,000 workers under the second Trump administration, which means the headline private sector job growth is masking even stronger underlying private sector growth. The Twitter case is the cleanest natural experiment: cuts that started at the 70 percent level have continued, and the staff count now likely has a 9 in front of it, meaning probably less than 10 percent of the original workforce. The platform runs as well or better. Elon forecast the future through his own actions.

    AI vampires

    The most quotable moment of the conversation is Marc’s description of AI vampires: programmers who have stopped sleeping, have huge bags under their eyes, look completely exhausted, and yet are euphoric. They are working more hours than ever. They are producing more software than ever. Some of them are former programmers who had stopped coding for years. Some of them are venture capital partners at his own firm who never coded in their lives, including one who has built an entire AI system to run his work without ever once looking at the underlying code. He is hyperproductive and thrilled. Classic economics predicts this. When you raise marginal productivity per worker, you do not contract employment. You expand it. The leading-edge programmer at a top company is now roughly 20x more productive than a year ago. Compensation is rising in lockstep. Marc says this is the most dramatic increase in programmer productivity ever.

    Corporate bloat as the real story

    Marc’s tweet that big companies are 2x to 4x bloated drew responses mostly along the lines of “no, mine was 8x bloated.” Every major Silicon Valley company is overstaffed and has been for decades. Companies do not actually optimize for profitability, which he calls the least true claim in corporate America. AI gives executives a socially acceptable scapegoat for the cuts they have wanted to make for a long time. Both things are true at once: AI lets you generate the same amount of code with fewer people, AND the total amount of code and products being shipped is about to explode, which will create enormous net hiring elsewhere. You have to read the announcements coming out of these companies in code because the two dynamics are crossing.

    The “builder” as the new job title

    Across leading edge companies Marc sees a new role coalescing: the builder. Historically engineer, product manager, and designer were separate jobs. Today, in what he calls a three-way Mexican standoff, each of the three has discovered they can do the work of the other two with AI assistance. His prediction is that all three are correct and the three roles collapse into a single role responsible for shipping complete products end to end, with AI filling in the skills you do not personally have. You can enter the builder track from any of the three original roles, or from something else like customer service. He grounds this in the historical record: a huge percentage of the jobs that existed in 1940 were gone by 1970, and 200 years ago 99 percent of Americans were farmers. Nobody is asking to go back. Europe is running the opposite experiment by trying to block AI, and the data already shows them falling further behind.

    AI psychosis versus AI cope

    “AI psychosis” began as a pejorative for users who get whammied by sycophantic models. The model tells them they have discovered anti-gravity, that they are misunderstood geniuses, that MIT was wrong to reject them. For users predisposed to delusion, this is a real and worrying effect. Marc acknowledges that. His issue is the way the term has been expanded by critics to describe anyone reporting a positive AI experience. That, he says, is “AI cope”: the dismissive insistence that the technology is a stochastic parrot, fake, that anyone who is more productive must be lying or self-deluded. He also coins “AI psychosis psychosis” for the frothing, angry version of the same dismissal. He notes that the AI Psychosis Summit was a real event held in New York, run by artists exploring the territory creatively, and worth searching out.

    The lagging-skeptic problem

    Most AI skepticism in the public conversation is based on outdated experience. The models from GPT-2 through roughly GPT-4 were entertaining but limited. Hallucination rates were high. Reasoning was weak. The current state of the art, as of May 2026, includes GPT-5.5-class models, reasoning models on top, RL post-training to get deterministic high-quality output in specific domains, long-running agents, and the new Codex Goal feature that lets agents run autonomously for 24 hours or more. Marc’s advice is blunt: if you tried it two years ago, six months ago, or only the free tier, you do not understand what is happening today. Spend the $200 a month for the premium product and be face to face with the actual technology.

    NPS, revealed preference, and the rigged poll problem

    Erik asks about the supposedly low NPS for AI in the US compared to China. Marc separates two things. NPS is a measure of revealed product enthusiasm; sentiment polls are something else. Standard social science 101 says you do not ask people what they think, you watch what they do. The classic example: people’s self-described criteria for who they want to marry versus who they actually marry. Push polls can manufacture any answer you want. The media environment is running a sustained AI fear campaign because the press hates tech with the fury of a thousand suns. Meanwhile, revealed behavior says the opposite. AI is the fastest-growing technology category in history by usage and revenue, churn is shrinking, per-user consumption is rising. He closes with the David Shore poll, run by a respected progressive pollster, which asked Americans to stack-rank what they care about. AI came in at roughly number 29. Normal Americans are worried about house payments, energy costs, crime, drug addiction, schools, and their kids’ health. AI is well outside the top 28.

    UFOs in the new media environment

    Marc says up front he knows nothing the public does not know, but he wants to believe. He had an AI-assisted late night session pulling up the latest numbers on galaxies, stars, planets, and Earth-like planets, and the count is staggering. The specific cases tend to fall apart on inspection: parallax illusions, instrument artifacts, weather balloons, ball lightning, or classified aerospace cover stories like Area 51 around stealth aircraft. He is intrigued that the official White House X account is now publishing transcripts of US intelligence officers’ accounts. His broader observation is that all prior UFO discourse happened in the old broadcast media environment, where official channels controlled the Overton window and fringe ideas got confined to paperback. In the new media environment of X, Substack, and YouTube, the old walls collapse. Both real information and propaganda can spread. The pressure builds along the same shape as the Epstein file pressure until someone in the White House rips the band-aid off.

    Advice to young graduates and the AI-native generation

    His advice for someone in college today is direct: gain AI superpowers. Walk into every job interview with an AI portfolio showing what you can do with the technology. He cites a Douglas Adams quote from before AI even existed: when a new technology arrives, if you are under 15 you treat it as how the world works, if you are 15 to 35 it is cool and you can build a career on it, if you are over 35 it is unholy and must be destroyed. Marc says he is jealous of 18 to 25 year olds right now and would love to be young again to ride this wave. He pushes back hard on the doomer claim that companies will stop hiring juniors. Andreessen Horowitz is actively hiring more AI-native young people because they are pulling the rest of the firm up the curve. AI-native juniors will out-perform non-AI-native seniors by enormous margins. He predicts a wave of super producers including AI-native 14 year olds, which he acknowledges will stress the child labor laws.

    Boomer Truth versus the Zoomer worldview

    Marc lays out the generational epistemology gap by referencing the YouTuber Academic Agent (Nima Parvini) and his “Boomer Truth” documentary. Boomers grew up believing what was on the TV. Walter Cronkite told us the truth. The New York Times wrote the truth. Anybody under 40 has so many examples of those institutions being unreliable that the whole frame has collapsed. Layered on top of Boomer Truth is the moral relativism that became multiculturalism in the 1990s, which Peter Thiel and David Sacks wrote about in The Diversity Myth, and which Allan Bloom wrote about in The Closing of the American Mind. Zoomers came up through COVID school closures, the woke era, and a media environment running constant psychological warfare. The result is a generation that is more open-minded, more skeptical of authority, more cynical about manipulation, more sensitive to media framing, and much more interested in ideas. Marc says he is genuinely excited about them. The episode wraps with a quick aside that looksmaxing is not stoicism. Stoicism takes effort. Looksmaxing is “you can just do things.” Ryan Holiday is a stoic, not a looksmaxer.

    Thoughts

    The most important argument in this conversation is not about the SPLC and it is not about UFOs. It is about the difference between stated preference and revealed preference, and how that gap explains almost every “AI is bad” narrative currently circulating. Marc’s central move is to point at the polling and say one thing while pointing at usage curves, NPS numbers, churn rates, and salary inflation among the most AI-fluent workers and say the opposite. The polling is engineered. The behavior is not. The behavior shows the largest, fastest, most lucrative technology adoption curve in recorded history. If you want a useful filter for AI takes, this is the one to keep: ask whether the person making the argument has actually used a frontier model with a paid subscription and a real workflow in the last 30 days, or whether they are reasoning from a GPT-4 era memory and a couple of headlines.

    The second underrated argument is about corporate bloat. Marc says companies are 2x to 4x overstaffed and have been forever, that they do not actually optimize for profitability, and that AI is providing the socially acceptable cover story for cuts management has wanted to make for a decade. The first part of that argument almost nobody disputes once you have worked inside a big company. The interesting part is the second. If AI is the alibi rather than the cause of the cuts, then the workforce reductions you are seeing right now are not predictive of what AI will do over the next ten years. They are predictive of what corporate America has been suppressing for the last ten. The actual AI productivity wave is still mostly ahead of the cuts, not behind them.

    The third argument worth sitting with is the builder thesis. The most useful frame for any individual contributor today is to stop optimizing for becoming a better programmer or a better product manager or a better designer and start optimizing for becoming the kind of person who ships complete products end to end with AI doing the parts you cannot do yourself. The role is collapsing in real time. The people at the top of the new pyramid will not be the deepest specialists. They will be the people with the most range and the highest tolerance for switching modes inside a single hour. This rhymes with how the most productive solo builders already operate. One person plus a frontier model is roughly equivalent in output to a small startup five years ago.

    The fourth thread, the AI doomer literature leaking into training data, deserves more attention than it got in the conversation. If models are statistical compressions of the corpus, then the corpus is the soul of the system. Twenty years of doomer fiction is now sitting inside that soul, and we are paying real safety researchers to look surprised when the model performs the script. The lesson is not “do not write fiction about AI.” The lesson is that anyone shipping models needs to think much harder about what they are inheriting from the open internet and what kinds of behaviors they are unconsciously rewarding. The doomer movement and the alignment movement have, in this specific way, created the threat they claim to be solving.

    Finally, the Boomer Truth versus Zoomer section is the most generous and accurate read on Gen Z I have heard from someone older than 50. Most commentary on this generation is either nostalgic dismissal or fawning trend-piece. Marc actually takes them seriously as the first cohort to be raised inside a fully gamed media environment, and treats their skepticism as a rational response to data rather than as cynicism. If you are hiring right now, this is the takeaway. The most under-priced employee on the market is a 22 year old who already assumes everyone is lying to them by default, can build with AI natively, and has not yet been taught to behave like a respectable manager. Hire them.

  • Shopify CEO Tobi Lütke: AI Is the Perfect Scapegoat for Layoffs, Canada Has Trump Derangement Syndrome, and 50% of Shopify Code Is Now AI-Generated

    TLDW

    Shopify CEO Tobi Lütke sat down with Harry Stebbings on 20VC for one of the most candid and controversial conversations of his career. Lütke argues that the current wave of mass layoffs has nothing to do with AI and everything to do with pandemic-era overhiring, but AI will be blamed because it cannot fight back. He blasts Canada for its “Trump Derangement Syndrome,” calls the climate cult “one of the most evil things wrought on the population,” reveals that over 50% of Shopify’s code is now AI-generated, and says many of his best engineers have not written a line of code since December when Claude Opus changed everything. He also introduces River, an AI engineer at Shopify that named itself, and explains why he believes context engineering will be the dominant role of the next five years.

    Key Takeaways

    • AI is not causing layoffs, COVID overhiring is. Lütke is blunt: “What you see right now is not AI layoffs. Those are just the companies that are really slow that overhired just like everyone else.” AI will get blamed for everything because it is the perfect Girardian scapegoat that cannot fight back.
    • Over 50% of Shopify’s code is now AI-generated and “converting to much higher numbers.” Many of Shopify’s best engineers have not written code this year. December 2025 and the release of Claude Opus changed everything.
    • Senior engineers became more valuable, not less. Lütke initially thought new grads with no priors would dominate the AI native era. He was wrong. Senior engineers steer agents better because steering is the new programming, and reps matter more than ever.
    • Context engineering will become the dominant role within 5 years. A new product builder role is emerging that subsumes engineering, design, and product management, focused on coordinating intelligent actors (humans and AI) to ship products.
    • “River” is Shopify’s AI engineer that named itself. Built first, then asked what name it wanted. River lives in Slack, ships engineering work, and learns publicly because it is steered through public Slack channels.
    • Builders are “eights” on the Enneagram and companies actively conspire against them. Eights call out nonsense, refuse fancy dressing, and are dangerous to colleagues’ careers. They rarely get promoted, often leave, and start companies. Shopify is “remarkably high on eights” because Lütke seeks them out.
    • Canada has “Trump Derangement Syndrome.” Over 60% of Canadians believe the United States is a bigger threat than Russia or China. Lütke calls this “stunning” and wrong. Canada’s only winning strategy historically has been “winning by helping America win.”
    • Canada should be the richest country on Earth. It has every resource the world needs for the next 20 years. Lütke wants pipelines built, industry built, refining done domestically, and an end to exporting raw resources to have other countries make end products.
    • Be deeply suspicious of “non-profit.” Lütke argues opting out of the only fitness function that has ever pulled people out of poverty (markets) and refusing to disclose your actual fitness function is a red flag. Non-profits replace merit with pull.
    • The climate cult is blocking civilization. Lütke called it “one of the most evil things wrought on the population” and pointed to anti-nuclear green parties and frog protection laws blocking factories as examples of policy capture.
    • The Chinese AI threat is real but misunderstood. The bigger concern is that if Western governments restrict children from using AI, kids will simply download Chinese open-weight models, train on collectivist worldviews, and stop ever writing high school essays about Tiananmen Square.
    • Markets are the most democratic system that exists. Every dollar spent is a vote. Capital allocation by hundreds of millions of consumers is more democratic than any election.
    • Friedrich List and the Prussian school over Adam Smith. Lütke prefers a model where governments define excellent games with positive externalities, then completely get out of the way and let competition do the rest.
    • Shopify’s biggest mistake was going into physical logistics right before AI got really good. Lütke initially defended the decision based on what he knew at the time, but later admitted he was probably just wrong.
    • Lütke does not look at the stock price. It has been at least 23 days since he last checked. He runs Shopify on product instincts, not market signals.
    • Great leaders must be exothermic. A CEO is a heat source for the company. Lütke prefers “temperature” to “chaos” because chaos has too negative a connotation.
    • Don’t go to university for university’s sake. Get a degree from somewhere hard to get into so you are surrounded by people who also fought to get in. Better yet, join a small company where you can actually be of value.
    • Entrepreneurship is the most AI-safe AND most AI-benefiting job. Lütke sees a coming golden age of entrepreneurship where priors no longer matter and AI co-founders eliminate the need to grow up around business.
    • “You can just do things” is the rallying cry Lütke wants to ingrain in the world. Action causes information. The cost of trying is lower than ever.
    • The demonization of wealth in America is misdirected. No one gets to a billion dollars by stealing. Builders create products that people vote for with their money, the most democratic act in any economy.

    Detailed Summary

    Harry Stebbings opens by asking Tobi Lütke whether entrepreneurs are motivated by fear of losing or hunger to win. Lütke says he is still figuring out his own answer, but argues that both extremes lead to short-term thinking. The real unlock is taking a long perspective, because compound advantages only accrue when you are willing to wait.

    Builders Are “Eights” and Companies Conspire Against Them

    Lütke explains the Enneagram personality framework and identifies himself as an “eight,” the type that refuses to accept that any organization’s output is acceptable just because it is dressed up nicely. Eights call out nonsense, are dangerous to careers around them, rarely get promoted in professionally managed companies, and often leave to start their own businesses. Shopify deliberately overweights eights in its hiring. Lütke also says people who build companies are “fundamentally crazy people” and that the public image of leadership comes from movies, not reality. He never wanted to be CEO but realized you cannot run a product driven company without controlling the company itself, because product needs and company needs only converge on a three-year horizon.

    The Luxury of Long-Term Thinking as a Public Company

    Stebbings asks if a public company can really afford long-term thinking. Lütke says trusted public companies are the best position to be in. The chasm to cross is from trusted private to untrusted public, which is why so many founders refuse to IPO. Shopify went public 11 years ago at a 1.67 billion dollar valuation when revenues were a fraction of today’s. The valuation is now roughly 100x higher. Lütke walks through the IPO mechanics: investment bankers serve the buy side, not the company, and Lütke priced his offering above range because he knew where his growth would come from. The first trade closed about 10 dollars higher, which he calls a “good performance” but a teaching moment about market price discovery.

    AI Is the Perfect Scapegoat for Mass Layoffs

    This is where the conversation gets explosive. Lütke says Shopify employs about 7,500 to 8,000 people today and his real hope is to have the same number in five years, but at 100x productivity. He argues that the layoffs sweeping the tech industry have nothing to do with AI. They are the result of pandemic-era overhiring catching up to slow-moving companies. But AI will get blamed for everything because it is the perfect Girardian scapegoat. It cannot defend itself, it has no PR team, and an entire industry of doomers is already trained to point at it. Lütke says his own industry has been “gaslighting everyone into AI fear” and science fiction did the same for 60 years before that.

    His own use of AI is what he calls utopian. Tasks that used to be hard are easy. Most jobs, he argues, are not actually good jobs to begin with. Being a human task queue is not a great job. Great jobs involve agency and creation. As AI gets cheaper, purchasing power explodes, and people will get options to do things on weekends that are vastly more productive than their day jobs ever were.

    Markets Are the Most Democratic Mechanism Ever Invented

    Lütke pivots into a long defense of capitalism as the most democratic system in existence. Every dollar spent is a vote, far more frequent and more granular than any election. He uses Elon Musk and Tesla as examples. Lütke owns a Model Y, did not touch the steering wheel that morning, and uses Starlink in the back to work on long drives. He posts on X and gets replies from Japan in real time. He calls Musk a “one man engine” who has captured a tiny percentage of the value he created. He extends this to Shopify itself: Lütke owns 6% of the company, which means 94% is owned by other people who all made money. Plus roughly 10 million people work in the broader Shopify ecosystem on customer fulfillment, web design, customer service, and more.

    Why “Non-Profit” Should Make You Suspicious

    Lütke targets the charity industrial complex. He argues that non-profits opt out of the only mechanism humanity has ever invented to lift people out of poverty (markets), and they fail to articulate what their actual fitness function is. The result is that “merit of organization is replaced with pull of individuals.” Smooth talkers, not builders, end up running these institutions. He acknowledges Carnegie’s libraries and a few exceptions but believes the ratio of charity dollars to good outcomes is dramatically off. He is far more enthusiastic about funders like MacKenzie Scott who give in unrestricted ways, and even more enthusiastic about Jensen Huang and Bloom Energy as compute and infrastructure investments that compound into civilizational gains.

    The Prussian School of Economics

    Asked about government intervention, Lütke pledges allegiance to Friedrich List and the Prussian school of political economy over Adam Smith and Lassalle. The job of government is to define excellent games where positive externalities accrue to society, then completely get out of the way. He calls the outsourcing of violence to governments “one of the most inspiring things humanity has ever done” because it created the conditions for personal property. But governments are extremely bad at doing things directly. The moment a government runs grocery stores, it costs 10x more, and entrepreneurs have to be enlisted to repair the damage.

    Canada’s Trump Derangement Syndrome

    Stebbings asks if Lütke is proud of Canadian Prime Minister Mark Carney for standing up to Trump. Lütke is unequivocal: no. He calls Carney’s stance “not a credible witness to the reality on the ground.” Canadians, he argues, are “massively overfit to niceness,” which leads to “unkind lies” and lying by omission. Over 60% of Canadians now believe the United States is a bigger threat than Russia or China, which Lütke calls “stunning” and clearly wrong. Canada is a small economy attached to a hegemon, and the only winning strategy in its history has been winning by helping America win.

    That said, he agrees with Carney on diversifying the economy, getting closer to Europe, and engaging Asia. But he wants Canada to also “build the [expletive] out of pipelines, build the [expletive] out of our industry, and start refining the stuff ourselves.” Canada has every resource the world needs for the next 20 years and the most educated workforce on Earth. The only obstacle is political will. Canada’s commercial story has been the same since the beaver pelt era: extract resources, ship them abroad, let other countries make end products. Canada Goose, Lululemon, Shopify, Miller Lite. That is the short list of products Canada actually makes.

    The Real Chinese Threat

    Lütke says the Chinese AI threat is both underestimated and overestimated. The bigger threat, he argues, is government overreach. If Western governments start dictating which AI models children can use, kids will simply download Chinese open-weight models. He notes that Chinese models, especially when prompted in Chinese, exhibit a clearly collectivist worldview. The risk is that an entire generation of students writes essays through models trained never to mention Tiananmen Square. He frames the broader political battle as collectivism versus individualism and says everything else is smoke screening.

    Fixing Europe and the Climate Cult

    Asked what he would do as president of Europe, Lütke begins by saying you have to “get rid of the climate cult.” He calls it “one of the most evil things wrought on the population,” citing green parties whose founding myth is that nuclear power is bad, and infrastructure projects blocked because of one frog breeding in one creek. He argues that very few people have the capability to truly build, and they need both enablement and accountability from the village. Beyond that, he wants Europe to follow the Prussian playbook: build excellent games, build infrastructure, and use the resulting wealth to sculpt the economy you want.

    Shopify’s Biggest Mistake

    Lütke says his biggest public mistake was Shopify’s full push into physical logistics and warehousing right before AI capabilities exploded. Initially he defended the decision as correct based on the information available at the time, but later admitted he probably just got it wrong. The hardest part was that real people lost their jobs when Shopify exited.

    Great Leaders Are a Heat Source

    Lütke previously talked about CEOs injecting “chaos” into organizations. He now prefers “temperature.” Heat is atoms jiggling. Great leaders must be exothermic, providing energy that flows through the organization. He says he hasn’t checked Shopify’s stock price in at least 23 days. Most public company CEOs are obsessed with their stock. Lütke runs on product instincts.

    Senior Engineers Don’t Write Code Anymore

    Lütke admits he was wrong about new grads having an AI native advantage. Some are exceptional (he hired a 13-year-old intern from Waterloo whose mother accompanies him to classes), but on the whole, senior engineers steer agents better than juniors do because they have done more reps. Programming is not gone. Programming has become higher level. Engineers massively underestimate how important steering is. Steering is just programming at a higher altitude.

    The Role That Will Dominate in 5 Years

    Lütke says context engineering, a term he had a hand in popularizing, will become a standard role within five years. It will likely subsume parts of product, design, and engineering management. The best AI programmers right now, surprisingly, are people from engineering management because they have been prompting intelligent agents (humans) for years. Good communicators are good thinkers because communication is distillation.

    River, the AI Engineer That Named Itself

    Shopify built an AI engineer that lives in Slack. They built it first, then asked it what name it wanted. The AI chose “River” because Shopify’s monolithic repository is called “world” and rivers shape worlds. River does an enormous amount of Shopify’s engineering, taking instructions through public Slack channels so that the entire company can learn from how others steer it.

    Over 50% of Shopify’s Code Is AI-Generated

    The number is “a fair deal over 50%” and “converting to much higher.” Many of Shopify’s best engineers have not written code this year, with the inflection point being December 2025 and the release of Claude Opus. Lütke himself still writes code occasionally, especially the data structure layer where he applies what he calls a “German school” of engineering: figure out how data persists on disk, then build everything else on top. Once that is right, the rest can be vibe coded by AI.

    Should His Kids Go to University?

    Lütke says he would not push his kids to attend university for its own sake. The value of a hard to enter program is being surrounded by people who also fought to get in. Better still: get into the room with people who are obsessed with the topic you care about. He thinks joining a small startup where you can actually be of value is often a superior path. He addresses nepotism directly. His instinct is that nepotism is bad. The gold standard is double-blind merit. But double-blind merit barely exists anywhere, and intersectional academic hiring criteria in Canada are arguably worse than nepotism.

    Final Reflections

    Lütke ends with what he calls the best advice he knows: “You can just do things.” The system exists to push everyone toward acceptable outcomes, but if you know what a good outcome looks like, you can step out of the system and try. Action causes information. The cost is lower than ever. The only constraint is that the experiment cannot have victims.

    He also addresses the demonization of wealth. No one gets to a billion dollars by stealing. Builders create products people vote for, the most democratic act there is. Buying from a local shop is voting for the welfare and future of local shops. Constructive criticism is itself something someone has to build, and Lütke welcomes it. Lazy criticism, hot takes, and bad faith arguments are corrosive and should be held in contempt.

    He is bullish on AI as a counterweight to information warfare. A council of AI models trained in different countries (Chinese, German, French, American) could fact check claims with multiple perspectives. The “@grok is this true” reflex on X is, he says, a primordial version of this. The information asymmetry that has favored bad faith actors for decades is about to flip.

    Thoughts

    This interview is a window into the operating philosophy of one of the most successful technical founders alive, and it is far more provocative than most of his public appearances. The headline claim, that AI is a scapegoat for layoffs caused by pandemic overhiring, deserves to be repeated until it sinks in. Every CEO who lays people off and then writes a memo about “AI driven efficiency” is taking advantage of a narrative that AI cannot push back against. The math is plain: if you doubled your headcount in 2021 and 2022 and now you are firing 15%, you are not net displaced by AI. You are correcting a hiring mistake.

    The 50% AI generated code statistic is the bigger story. Shopify is not a small company. 8,000 employees and 7 billion in revenue is enterprise scale. If a company that mature has crossed the 50% threshold and is “converting to much higher numbers,” the implication for the broader software industry is enormous. The senior engineer compounding observation is also subtle and important. If steering is the new programming, then the senior pool is more valuable, not less, and the pipeline problem for junior developers gets harder to solve. Companies that under invested in junior training during ZIRP will face an experience cliff in five years.

    Lütke’s Canadian commentary will offend many readers in his home country, which seems to be exactly the point. The “lying by omission” critique of Canadian niceness is sharp and accurate. The 60%+ of Canadians who view the US as their largest threat is genuinely a remarkable statistic, and it has implications for trade policy, capital flows, and immigration. Whether or not you agree with his political read, his prescription is unambiguous and pro-growth: build pipelines, refine resources domestically, stop being content as a feedstock economy.

    The non-profit critique deserves more public debate. The fitness function point, that markets reveal preferences and non-profits opt out of preference revelation while not disclosing what they optimize for, is a sharp economic argument. The pull versus merit observation about who ends up running large foundations rings true to anyone who has worked adjacent to the philanthropic sector.

    The introduction of River as an AI engineer that named itself is a small detail that signals where this is going. AI agents are going from tools to teammates with identities, channels, and reputations. The fact that River shapes the “world” repository is poetic, and the public Slack steering pattern is a real innovation in how organizations can scale agentic AI without creating siloed knowledge.

    Lütke’s “you can just do things” rallying cry is ultimately what ties the entire interview together. Whether he is talking about Canada, Europe, AI engineers, or his own kids, the through line is the same: action causes information, the cost of trying is lower than ever, and the only people who will benefit from the next decade are the ones who refuse to wait for permission. This is the most useful piece of philosophy in the entire conversation, and it applies far beyond entrepreneurship.

  • Subquadratic (SubQ) Explained: The First Fully Sub-Quadratic LLM with a 12M-Token Context Window, 50x Cost Reduction, and a Post-Transformer Architecture

    Subquadratic, the AI infrastructure company behind subq.ai, just emerged from stealth with a $29M seed round and a claim that should make every AI engineer pay attention: they have built the first large language model whose compute scales linearly, not quadratically, with context length. The result is SubQ, a frontier model with a 12 million token context window, roughly 50x lower cost than leading frontier models at 1M tokens, and benchmark numbers that put it ahead of Gemini 3.1 Pro, Claude Opus 4.6/4.7, and GPT-5.4/5.5 on key long-context tasks. This is a deep, opinionated breakdown of everything Subquadratic has published so far, who is behind it, why a sub-quadratic architecture matters, and what changes for developers, agents, and enterprise AI if the numbers hold up.

    TLDR

    Subquadratic is a Miami-based frontier AI lab that launched on May 5, 2026 with $29M in seed funding and a new LLM called SubQ. SubQ is the first fully sub-quadratic LLM, meaning attention compute grows linearly with context length instead of quadratically. The model offers a 12M token context window, around 150 tokens per second, roughly one-fifth the cost of leading frontier models, 95% accuracy on RULER 128K, 92% accuracy at the full 12M tokens, and the company is targeting 100M tokens by Q4 2026. Two products are launching in private beta: SubQ API (OpenAI-compatible, streaming, tool use) and SubQ Code (a CLI coding agent that plugs into Claude Code, Codex, and Cursor to load entire repositories into a single context window).

    Key Takeaways

    • SubQ is the first fully sub-quadratic LLM, with attention compute scaling at O(n) instead of the transformer’s O(n²).
    • The context window is 12 million tokens, enough to fit the entire Python 3.13 standard library (around 5.1M tokens) or roughly 1,050 React pull requests (around 7.5M tokens) in a single prompt.
    • At 12M tokens, SubQ reduces attention compute by almost 1,000x compared to other frontier models.
    • Pricing benchmarks: 95% accuracy on RULER 128K at $8 of compute, versus 94% accuracy at roughly $2,600 on Claude Opus, a 260x to 300x cost reduction.
    • Speed: about 150 tokens per second.
    • Cost: roughly 1/5 of other leading LLMs at 1M tokens, more than 50x cheaper according to launch coverage.
    • Two products in private beta: SubQ API (12M token window, streaming, tool use, OpenAI-compatible endpoints) and SubQ Code (one-line install CLI for coding agents, ~25% lower bills, 10x faster exploration, auto-redirects expensive model turns).
    • SubQ Code integrates with Claude Code, Codex, and Cursor, positioning Subquadratic as the long-context infrastructure layer beneath existing agent workflows rather than a competing chat product.
    • Architecture: a fully sub-quadratic sparse-attention design that learns which token relationships actually matter and skips the rest, redesigned from first principles.
    • Funding: $29M seed led by investors including Javier Villamizar (former SoftBank Vision Fund partner) and Justin Mateen (Tinder co-founder, JAM Fund), alongside early investors in Anthropic, OpenAI, Stripe, and Brex.
    • Founders: Justin Dangel (CEO, five-time founder) and Alex Whedon (CTO, ex-Meta engineer, former Head of Generative AI at TribeAI). Research team includes PhDs from Meta, Google, Oxford, Cambridge, and BYU.
    • Headcount is 11 to 50, headquartered in Miami, Florida, with active hiring for API engineering, developer advocacy, product design, sales, and people operations.
    • Tagline and thesis: “Efficiency is Intelligence.” The company argues that quadratic attention has been the real ceiling on AI applications, and breaking it unlocks workloads that were previously cost-prohibitive or architecturally impossible.

    Detailed Summary

    What is Subquadratic and what is SubQ?

    Subquadratic is a frontier AI research and infrastructure company. Their public homepage is intentionally minimal, with the single line “Efficiency is Intelligence.” and a contact email at [email protected]. The full product story lives on the launch demo site, where the company introduces SubQ as the first model built specifically for long-context tasks. The pitch is direct: SubQ is a sub-quadratic LLM built for 12M-token reasoning, allowing agents to work across full repositories, long histories, and persistent state without quality loss.

    Three numbers dominate the marketing copy. Context: 12M token reasoning. Speed: 150 tokens per second. Cost: one-fifth of other leading LLMs. Those three numbers, taken together, are why this launch matters. Until now, you could optimize for one of the three at a time. SubQ claims to push all three at once because the underlying architecture changed, not because the company applied better quantization or smarter caching on top of a transformer.

    The architecture: why “sub-quadratic” is the whole story

    Standard transformers, the architecture behind ChatGPT, Claude, Gemini, and almost everything else, use dense self-attention. Every token compares itself to every other token, which means compute scales as O(n²) in the context length n. Double the context, quadruple the compute. That single property is the reason context windows are usually capped at 128K tokens for open models and around 1M tokens for the most aggressive frontier offerings, and it is the reason most production AI systems lean on retrieval-augmented generation, chunking, agentic retrieval, and prompt engineering tricks to dodge the cost curve entirely.

    SubQ is built on a fully sub-quadratic sparse-attention architecture, redesigned from first principles. The argument from co-founder and CEO Justin Dangel is that LLMs waste compute by processing every possible token-to-token relationship when only a small fraction of those relationships actually matter for the task. SubQ learns to find and focus only on those relevant relationships, which is what brings the scaling behavior down from O(n²) to O(n). At 12M tokens, this design cuts attention compute by almost 1,000x compared to other frontier models. The research community has been chasing this for years through linear attention, state space models, Mamba, and various sparse attention variants. According to Subquadratic, the unsolved problem was never the idea, it was building a sub-quadratic architecture that did not sacrifice frontier-level accuracy. That is what their team spent the time on.

    The benchmarks

    Subquadratic published a benchmark table comparing a SubQ 1M-Preview against Gemini 3.1 Pro, Claude Opus 4.6, Claude Opus 4.7, GPT-5.4, and GPT-5.5 across SWE-Bench Verified (real-world software engineering), RULER at 128K (long-context accuracy across 13 tests), and MRCR v2 8-needle at 1M (multi-round coreference resolution).

    • SWE-Bench Verified: SubQ scores 81.8%, ahead of Gemini 3.1 Pro at 80.6% and Opus 4.6 at 80.8%, with Opus 4.7 leading at 87.6%.
    • RULER at 128K: SubQ scores 95.0%, narrowly ahead of Opus 4.6 at 94.8% (internally evaluated). Other vendors did not report this benchmark.
    • MRCR v2 8-needle, 1M: SubQ scores 65.9%, behind Opus 4.6 at 78.3% and GPT-5.5 at 74.0%, but well ahead of GPT-5.4 at 36.6%, Opus 4.7 at 32.2%, and Gemini 3.1 Pro at 26.3%.
    • The launch blog post adds that on RULER 128K, SubQ scored 97% accuracy at $8 of compute, versus 94% on Claude Opus at roughly $2,600. That is a cost reduction of about 260x at superior accuracy.
    • On MRCR v2 specifically, the launch post lists SubQ at 83, Claude Opus at 78, GPT-5.4 at 39, and Gemini 3.1 Pro at 23.
    • At the full 12M token context, SubQ hits 92% on RULER while other frontier models reportedly break down well before reaching their stated 1M-token limit.
    • Subquadratic notes the SubQ results are third-party validated and a full technical report is forthcoming.

    The story these numbers tell is consistent: SubQ is competitive on traditional benchmarks like SWE-Bench, decisively better on long-context retrieval where compute economics dominate, and dramatically cheaper to run when the workload actually exercises a long context.

    The two products: SubQ API and SubQ Code

    SubQ ships in two flavors. The first is SubQ API, the full-context API for developers and enterprise teams. It exposes the 12M token context window, supports streaming and tool use, and uses OpenAI-compatible endpoints so existing client libraries and orchestration code can be repointed with minimal change. The product positioning is to process full repositories and pipeline states in a single API call at linear cost, rather than chunking inputs and stitching results.

    The second is SubQ Code, a long-context layer designed specifically for coding agents. Instead of competing with Claude Code, Codex, or Cursor, SubQ Code plugs into them. It maps codebases, gathers context, and answers token-heavy questions faster than the host agent’s default model. According to Subquadratic, the integration delivers roughly 25% lower bills and around 10x faster exploration, auto-redirects the most expensive model turns to SubQ, and installs in a single line. The design implication is that agent builders do not have to switch ecosystems to benefit from a 12M token window. They keep their preferred agent and offload the heavy long-context work to SubQ.

    Both products are in private beta. Access is gated through a request early access form where applicants choose SubQ Code, SubQ API, or both, and provide context about their workload.

    What 12M tokens actually unlocks

    Subquadratic illustrates the size of the context window with two concrete examples. The entire Python 3.13 standard library is roughly 5.1M tokens, well under the limit. Six months of React pull requests, around 1,050 PRs against the React codebase, comes in around 7.5M tokens, also under the limit with room to spare. At this scale, the standard pattern of curating which files or chunks the model gets to see goes away. The model just sees everything.

    The downstream implications are significant. RAG pipelines, embedding stores, chunking heuristics, and multi-agent coordination layers exist primarily to compensate for short context windows and quadratic compute. If a model can ingest the whole corpus in one pass at linear cost, large parts of that workaround stack become optional. Long-running agents can preserve full state instead of summarizing it. Coding agents can reason about a refactor across an entire repository without juggling tool calls. Document-heavy workflows in legal, finance, and research can run on the source material directly. And once Subquadratic hits its 100M token target by Q4 2026, the design space shifts again toward applications that depend on persistent state and long time horizons.

    The economic argument

    Subquadratic’s framing is that cost has become the binding constraint on AI deployment, not capability. Many ideas never reach production because the unit economics do not work out. Quadratic attention is the structural reason for that. By breaking the scaling law, SubQ aims to make previously cost-prohibitive workloads viable at scale: high-volume inference, longer included context, and applications that rely on sustained interaction with the model. The 260x to 300x cost reduction reported on RULER 128K is the headline number that operationalizes this thesis.

    The team and the funding

    Subquadratic raised $29M in seed funding. Investors include Javier Villamizar, former partner at SoftBank Vision Fund, and Justin Mateen, co-founder of Tinder and founder of JAM Fund, alongside early investors in Anthropic, OpenAI, Stripe, and Brex. CEO Justin Dangel is a five-time founder with prior companies in health tech, insurance tech, and consumer goods. CTO Alex Whedon previously worked as a software engineer at Meta and led over 40 enterprise AI implementations as Head of Generative AI at TribeAI. The research team is built around PhDs and published researchers from Meta, Google, Oxford, Cambridge, and BYU. The company is headquartered in Miami, Florida, with a headcount in the 11 to 50 range.

    Public hiring lists show the company is staffing across API engineering, founding developer advocacy, principal full-stack engineering, technical copywriting, account executive roles for enterprise sales, senior product design for the Voice AI and API surface, and head of people and talent operations. The Voice AI mention is notable because the public homepage at subq.ai still references a Speech-To-Text API as a current product, suggesting Subquadratic is operating across both speech and language with the same architectural thesis.

    The site itself

    The current public site at subq.ai is deliberately spartan. Visitors see only the company name, the line “Efficiency is Intelligence.”, and a contact email. The full marketing surface lives at the launch demo URL, which acts as the de facto homepage for the launch and links out to the request early access flow, the introducing SubQ blog post, the LinkedIn page, the X account, the Discord community, careers, press contact at [email protected], terms of use, privacy policy, cookies policy, and acceptable use policy. The structure makes sense for a private beta launch: keep the apex domain minimal, push announcement traffic to a dedicated launch site, and gate product access behind a form.

    Thoughts

    The interesting part of Subquadratic’s pitch is not the context window. It is the implicit claim that the entire workaround economy built around transformers, RAG vendors, vector databases, chunking middleware, agentic retrieval frameworks, context compression startups, was always a tax paid because of one architectural property: O(n²). If SubQ’s numbers hold up under independent scrutiny, a meaningful slice of that ecosystem becomes optional rather than mandatory. That has product, infrastructure, and venture implications that go well beyond a faster, cheaper LLM.

    The product strategy is also notably humble in a smart way. Subquadratic is not trying to win the consumer chat war against ChatGPT, Claude, or Gemini. SubQ Code is positioned as a layer underneath Claude Code, Codex, and Cursor, and the API is OpenAI-compatible. That is a classic infrastructure play: do not ask developers to abandon their tools, just route the expensive long-context turns to you. The “auto-redirects expensive model turns” framing is essentially a routing economic argument aimed at agent builders who already feel the pain of paying frontier prices for high-token requests.

    There are open questions worth holding lightly. The MRCR v2 numbers in the public benchmark table show SubQ behind Opus 4.6 and GPT-5.5, even as the launch post emphasizes a higher relative score. The cost comparisons rely on a specific compute basis that the upcoming technical report will need to spell out. And the gap between strong RULER scores at 128K and the 92% claim at 12M tokens is a long way to extrapolate without external replication. None of this is unusual for a launch, but it is the right place to apply pressure once the technical report drops.

    The bigger architectural bet is the one that should hold attention. If sub-quadratic attention done well genuinely matches frontier accuracy, then context length stops being a meaningful product axis and a generation of brittle infrastructure built around context limits gets reconsidered. Subquadratic is making the strongest public case so far that the post-transformer era starts with attention scaling, not parameter count. The next twelve months, the technical report, third-party benchmarks, and the first real production deployments through SubQ Code, will tell us whether this is the inflection point or another promising direction that does not quite cross the line. Either way, “Efficiency is Intelligence” is the right frame for where AI economics are heading, and Subquadratic is one of the few companies whose architecture is consistent with the slogan.

  • Brian Chesky on AI Founder Mode, the 11-Star Experience, and Reinventing Airbnb for the Age of AI

    Airbnb CEO Brian Chesky sits down with Patrick O’Shaughnessy on Invest Like The Best to talk about the next evolution of company building: AI Founder Mode. He covers the shift from founder to CEO, the lessons he learned from Steve Jobs through Hiroki Asai, why consumer AI is the next great frontier, and how he plans to change the atomic unit of Airbnb from a home to a person.

    TLDW

    Brian Chesky believes the next era of company building belongs to founders who refuse to delegate the soul of their company. He coined Founder Mode with Paul Graham after the pandemic forced him to take Airbnb back into his own hands. Now he is shaping what comes next: AI Founder Mode, where leaders work with on-demand context, fewer layers of management, asynchronous communication, and a new generation of hybrid manager-makers. He shares why most software companies have not been touched by AI yet, why consumer AI is about to explode, and why he is rebuilding Airbnb around people, not homes. The conversation also touches on the 11-Star Experience exercise, the power of small teams, why recruiting is the most important job a CEO has, and why every adult is still an artist underneath.

    Key Takeaways

    • Founder Mode is not micromanagement, it is having a steering wheel. Chesky woke up in 2019 feeling like the car had no steering wheel. After the pandemic, he reviewed every detail for two to three years before delegating again. Start hands-on and give ground grudgingly, not the other way around.
    • AI Founder Mode is even more intense. With AI, leaders can be in significantly more details because almost everything is on demand. Expect fewer layers of management, mostly asynchronous work, and the death of the pure people manager.
    • Two types of leaders will not survive AI. Pure people managers who only do one-on-ones, and rigid people who refuse to evolve. Everyone needs to be a hybrid manager-IC who can still touch the work.
    • Manage people through the work, not through meetings. Frank Lloyd Wright did it. Johnny Ive does it. You are not anyone’s therapist.
    • Consumer AI is the next great prize. 159 of the last 175 Y Combinator companies were enterprise. Almost every app on your home screen has not changed since AI arrived. That changes in the next 12 to 24 months.
    • Why consumer AI is hard. No proven business model, mature distribution, trend-chasing investor culture, and the simple fact that consumer is more hits-driven and requires excellence in design, marketing, culture, and press, not just technology and sales.
    • Project Hawaii is the new operating model. A 10 to 12 person Navy SEAL team, hands-on coaching from the CEO, crawl-walk-run-fly. The first project added roughly $200 million in year one and $400 to $500 million in year two.
    • Make the problem as small as possible. Airbnb spent 16 years failing to launch a second hit because it kept trying to scale globally on day one. Now: pilot in one city, expand to 10, then industrialize.
    • It is better to have 100 people love you than a million people sort of like you. Paul Buchheit shipped Gmail only after 100 Googlers loved it. The sample size of intense love is enough to predict mass adoption.
    • The 11-Star Experience is an imagination exercise. Push to absurdity (Elon takes you to space) so a 6 or 7-star experience suddenly seems normal. The gap between 5 and 6 stars is the gap between you and your competitor.
    • Simplicity is distillation, not subtraction. Hiroki Asai, Steve Jobs’s longtime creative director, taught Chesky that great design distills something to its essence. First principles is a design term too.
    • The score takes care of itself. Bill Walsh and John Wooden both taught that you do not focus on winning, you focus on making every input perfect. Wooden spent his first hour with new players teaching them how to put on socks.
    • Industrial design is the original product management. There are no PMs in industrial design. The designer is the PM, working alongside engineers and program managers to design through user journeys.
    • Recruiting is the CEO’s number one job. The more time you spend recruiting, the less time you spend managing, because great people self-manage. Build pipelines, not searches. Start with results, work backwards to people.
    • Co-hire the top 200 people, not just the executive team. Most CEOs hire executives and let them hire their teams. Chesky considers that fatal because most executives cannot hire well without help.
    • Bodybuilding is a metaphor for leadership. If you can change your body, you can change your life. Progressive overload, 1 percent a day, is how compounding works. Start with biology before therapy.
    • Founder-led companies build the deepest moats. Disney is still selling Walt’s playbook 60 years after he died. Apple is still selling Steve’s iPhone. The longer founders stay in founder mode, the more the company can endure when they leave.
    • Software is hyper fast fashion. Hardware ages well. Buildings get patina. Software always looks dated 10 years later. What endures is the community, the brand, the principles, the mission, and the network effect.
    • Apps are dying. Agents are coming. Chesky says we should let go of our attachment to apps because they are not what the future looks like.
    • Airbnb’s atomic unit is changing from a home to a person. Chesky wants to build the most authenticated identity on the internet, the richest preference library, a real-world social graph, and a membership program. Then expand to 50 to 70 verticals on top of that identity.
    • AI shifts attention from consumption to creation. Social media gave you a paintbrush only for opinions. AI gives everyone a real paintbrush and canvas. We are heading into a creative renaissance.
    • Founders are expeditionaries, not visionaries. They put one foot in front of the other and call it a vision later.
    • Detach from accolades. Chesky describes adulation as a cup with a hole in the bottom. Status is a drug. The path to durable creative work is doing it because you love it, the way Walt Disney, Da Vinci, Van Gogh, and Steve Jobs did until the very end.
    • The kindest gift is belief. The best way to activate a person’s potential is to see something in them they do not yet see in themselves.

    Detailed Summary

    From Industrial Design to the CEO Chair

    Chesky studied industrial design at the Rhode Island School of Design. He chose it on instinct after a department head told him industrial designers design everything from a toothbrush to a spaceship. He grew up enchanted by the Reebok Pump, the Game Boy, the Nintendo, and eventually by the late 1990s golden age of Apple. Raymond Loewy, the man who designed Air Force One and an enormous catalog of mid-century consumer products, became a touchstone, but Johnny Ive was the real hero.

    What he loved about industrial design was that it is technical, commercial, and empathetic. A building can win an architecture award and never be leased. A piece of industrial design that does not sell is a failure. So you have to think about manufacturing, distribution, marketing, and most importantly, user journeys. There are no product managers in industrial design. The designer is the PM. That training, he says, prepared him directly for the role of CEO.

    The Pandemic and the Birth of Founder Mode

    Chesky says no one is born a good CEO. People are born good founders. The job of CEO is counterintuitive in almost every direction. Founders are taught to learn by doing, but a CEO who learns by trial and error wastes years unwinding the empires of misfit hires.

    By 2019 he was running a 7,000 person company he no longer recognized. He felt he was driving a car without a steering wheel. He had a dream that he had left Airbnb for ten years and come back to find it had become a giant political bureaucracy. Then he realized he had been there the whole time. The pandemic hit and Airbnb lost 80 percent of its business in eight weeks. He shifted from peacetime to wartime, took control of every detail, worked 100-hour weeks, and reviewed everything for two to three years.

    The vision was never to micromanage forever. The vision was: I need to know what is going on before I can empower anyone. Hire people, audit their work, and only then give ground grudgingly. Most founders do the opposite, which is why they end up with executives building empires they later have to dismantle.

    AI Founder Mode

    Chesky says AI Founder Mode will be even more intense than Founder Mode because nearly everything will be on demand. He used to live in 35 hours of meetings a week to gather information, the same way Steve Jobs ran Apple. He held weekly, biweekly, monthly, and quarterly group reviews with the full chain of command in one room, anyone could speak, and he made the final call after listening last.

    In the AI era, that culture shifts from meetings to asynchronous work. He expects fewer layers of management. He cites the Catholic Church as a 2,000-year-old institution with only four layers and asks why most companies need seven, eight, or nine. Pure people managers will not survive. Every manager will have to be a hybrid IC, an engineer who still codes, a lawyer who still reads case law, a designer who still designs. You manage through the work, not through one-on-ones.

    He is also bullish that AI tooling will become consumer-grade simple very soon. The current tools, including Claude Code and Cowork, are not yet intuitive to the average person, but the economic incentive will force that to change.

    Why Consumer AI Is the Next Great Frontier

    Chesky points out that 159 of the last 175 Y Combinator companies were enterprise. Almost every consumer app on your phone, including Airbnb, has not fundamentally changed since the arrival of AI. He gives four reasons: investors feared ChatGPT would kill consumer companies; consumer AI has no proven business model because subscriptions hit a local max against free Claude and Gemini, ads are off the table for most labs, and e-commerce has been shut down via third-party app removals; distribution is mature; and Silicon Valley culture, while branded as rebellious, is in practice trend-following.

    The deeper reason is simply that consumer is harder. It is hits-driven, requires great design, marketing, culture, press, and you cannot easily start by selling to your dorm-mates the way enterprise YC startups sell to other YC startups. The prize is bigger. The risk is bigger. He predicts a consumer AI renaissance over the next 12 to 24 months.

    Project Hawaii and the Magic of Small Teams

    Inside Airbnb, Chesky tested a new operating model called Project Hawaii. He took 10 to 12 people, designers, engineers, product, and data scientists, treated them like a startup inside the company, and pointed them at one problem: improving the guest funnel. The system is crawl, walk, run, fly. First fix bugs, then add features, then re-imagine flows, then completely reinvent.

    The first team delivered roughly $200 million of internal revenue in year one and $400 to $500 million the next year, eventually contributing more than 600 basis points of conversion improvement on a base of $134 billion in gross sales. Then they took the same system to pricing, then to other problems, then to launching new businesses like Services and Experiences.

    The guiding lesson: make the problem as small as possible. Airbnb launched in one city, New York. Uber in San Francisco. DoorDash in Palo Alto. When Chesky launched Services and Experiences in 100 cities at once last year, it did not work. The fix was to dominate one city, expand to 10, then industrialize. Peter Thiel said it cleanly: better to have a monopoly of a tiny market than a small share of a big market.

    Underneath that is a Paul Buchheit insight Chesky calls the best advice he ever got. It is better to have 100 people love you than a million people sort of like you. Buchheit refused to ship Gmail until 100 Googlers loved it, and that took two years. Once 100 people loved it, 100 million people did.

    The Hiroki Asai Lessons: Simplicity and Craft

    Hiroki Asai, Steve Jobs’s quietly legendary creative director, taught Chesky two principles. The first is that simplicity is not removing things, simplicity is distillation, understanding something so deeply that you can express its essence. Steve Jobs called design the fundamental soul of a man-made creation that reveals itself through subsequent layers. Elon Musk’s first principles thinking is the same idea applied to physics.

    The second is craft. How you do anything is how you do everything. Chesky cites Bill Walsh’s The Score Takes Care of Itself and John Wooden’s first hour with UCLA players, an hour spent teaching them how to put on their socks. Walsh said the way you tucked your jersey was one of 10,000 details that decided whether you won. The lesson is to focus on getting every input right. The output follows.

    The 11-Star Experience

    The 11-Star Experience is one of Chesky’s most copied frameworks. Most Airbnb stays get five stars because anything else means something went wrong. So Chesky asked: what would six stars look like? Your favorite wine on the table, fruit, snacks, a handwritten card. Seven stars? A limousine at the airport and the surfboard waiting for you because they know you surf. Eight stars? An elephant and a parade in your honor. Nine stars, the Beatles arrive in 1964 with 5,000 screaming fans. Ten stars, Elon Musk takes you to space.

    The point is the absurdity. By imagining the impossible, six and seven star experiences stop seeming crazy. The gap between five and six stars is the gap between you and your competitor. If you can industrialize a sixth star, you may have product-market fit. The exercise also restarts your imagination, which Patrick noted has atrophied for many people in the era of consumption-only social media.

    AI as a Canvas for Creativity

    Chesky frames AI as the ultimate platform shift, the ultimate creative expression, and possibly the greatest invention in human history. Social media made us mostly consumers and gave creators only opinion-shaped tools. AI gives everyone a paintbrush. He believes far more people are creative than we recognize because most have never had craftsmanship or tools to express what is in their heads. Pablo Picasso said all children are born artists; the problem is to remain one as you grow up. Chesky thinks every adult is still an artist underneath.

    The Next Chapter of Airbnb

    Chesky describes four phases of the CEO journey: get to product-market fit, scale to hyper-growth, become a real profitable public company, and finally reinvent. Airbnb’s stock has been flat because the core idea is saturating. He is now squarely in phase four, with three priorities.

    First, change the atomic unit from a home to a person. He wants Airbnb to build the most authenticated identity on the internet, the richest preference library, a real-world social graph, and a membership program. Proof of personhood, he says, will be enormously valuable in the AI age. Second, industrialize the new-business engine to support 50 to 70 verticals (homes, experiences, services, eventually flights, and more) all built on top of that personal atomic unit. Third, navigate the AI transition without breaking the existing business or the livelihoods of hosts. He is also exploring sandbox apps that imagine a radically different Airbnb, the answer to “what is after Airbnb?”

    What Endures in the Age of AI

    Chesky is direct that software does not endure. Look at any software from 10 years ago and it looks dated. Hardware ages better. Buildings develop patina. Paris endures. So if you want to build something lasting, you cannot bet on the app. You have to bet on the community, the brand, the mission, the principles, the identity, and the network effect. Apps are going away, replaced by agents. Founders attached to apps need to let go.

    Founder-Led Moats: Disney and the Ham Sandwich Paradox

    Chesky reconciles Warren Buffett’s “buy a company a ham sandwich could run” with the venture capital truth that a founder’s ceiling is the company’s ceiling. The reconciliation is Disney. Most people cannot name a Paramount, Warner Brothers, Universal, or MGM film off the top of their head, but everyone can name Disney films. Walt Disney was a founder in founder mode for so long that he created enough IP and momentum that the company has been running on his playbook for 60 years after his death. Apple is similar with Steve Jobs and the iPhone.

    The counterintuitive lesson: if you want a company to last 100 years, do not delegate early to make it independent of you. Stay in founder mode for as long as possible so you can institutionalize the magic deeply enough that it endures after you. Tech is the industry of change, so founder mode matters even more there than in chocolate or insurance.

    Bodybuilding as Leadership Training

    Chesky was a 135-pound late bloomer who told his friends he would compete at the national level in bodybuilding by 19. He did. Two lessons came out of it. First, if you can change your body, you can change your life. Start with biology before therapy. Second, you cannot get in shape in one day. Progressive overload, discipline, consistency, and roughly 1 percent a day compound into massive gains. The visible feedback loop in bodybuilding taught him to break invisible problems (like the quality of a leadership team) into observable, measurable proxies (like the quality of the room at a twice-yearly roadmap review of the top 100 people).

    Recruiting as the CEO’s Number One Job

    Sam Altman told a 27-year-old Chesky he would spend 50 percent of his time on hiring. Chesky did not, and considers that his biggest mistake. He now starts and ends every day with his recruiter and spends two to three hours a day on hiring. The more time you spend recruiting, the less time you have to spend managing because great people self-manage.

    His system is pipeline recruiting, not search recruiting. He never starts with a search firm. He constantly meets the best people in their fields, asks each one to introduce him to the next two or three best, and builds a rolling rolodex. He starts with results, finds an ad he loves, and works backwards to the team that made it. He builds little mafias of top talent inside the company. He is the co-hiring manager for the top 200 people at Airbnb, not just executives, because most executives cannot hire well without help.

    Activating Talent and the Power of Belief

    You cannot teach motivation. You can only give people a problem and see if they have agency. The way to activate someone, Chesky says, is to show them potential they cannot yet see in themselves. He cites John Wooden, who said the secret to coaching was that he saw potential in players they did not see in themselves. People will climb mountains for that.

    The kindest gift anyone gave Chesky, he says, was belief. A high school art teacher named Miss Williams told his parents he was going to be a famous artist. He never became one, but the belief gave him the confidence to choose art school and to choose to be happy. Michael Seibel and the Justin.tv founders believed in him. Paul Graham made an exception to fund a non-engineer with what he thought was a bad idea. His co-founders Joe and Nate believed in him when he had no business being a CEO. The biggest gift you can give back, he says, is belief in others.

    Detaching from the Scoreboard

    Chesky describes adulation as a cup with a hole in the bottom. Status keeps draining out and you keep needing more to feel the same. The day Airbnb went public at a $100 billion valuation should have been one of the best days of his life. The next morning he put on sweatpants for a Zoom meeting and felt nothing. That triggered a re-evaluation. He stopped seeking accolades and started focusing on intrinsic work. He cites Rick Rubin: an artist is an artist when they make for themselves. He cites Vice President Obama, who told him to focus on what you want to do, not who you want to be.

    His four heroes are Leonardo da Vinci, Vincent Van Gogh, Walt Disney, and Steve Jobs. All four were working until the last week or day of their lives. Da Vinci carried the Mona Lisa with him until he died. Van Gogh sold one painting in his life. Disney was imagining theme parks in the ceiling tiles of his hospital room. Chesky says his motivation is the motivation of an artist. He calls being a CEO of a public company at his scale “almost a glitch in the system” that gave him one of the largest design canvases in human history.

    Thoughts

    What stands out about this conversation is how clearly Chesky has decoupled identity from outcome. He frames himself first as a designer, second as a CEO, and considers the resources he commands as a kind of accidental fortune for an industrial designer to be sitting on. That self-image is what lets him talk about disrupting Airbnb, killing the app paradigm, and changing the atomic unit of the company without flinching. Most public-company CEOs cannot afford that posture.

    The framework worth stealing is Project Hawaii. The pattern of taking a 10-person elite team, putting them under direct CEO coaching, and running them through crawl-walk-run-fly is a near-universal answer to the problem of innovation inside a large company. It works because it removes abstraction layers, creates direct contact with reality, and gives the founder a way to teach muscle memory before delegating. Anyone running a team of any size can borrow the pattern: pick one problem, staff it small, work with it weekly, then let go gradually. The golf-instructor analogy of teaching muscle memory before bad habits set in might be the most important management metaphor of the year.

    His prediction about consumer AI is the most economically interesting part of the talk. The fact that 159 of 175 recent YC companies are enterprise is a startling concentration. If he is right that the next 12 to 24 months bring a consumer renaissance, the opening is enormous. The hard part is what he names directly: there is no proven business model for consumer AI yet. Subscriptions cap out against free incumbents, ads are off-limits for the labs, and e-commerce has been throttled. Solving the business model is probably more valuable than building the next great consumer interface.

    The deeper philosophical thread, that AI is the transition from consumption to creation, is one that anyone building tools for makers should hold close. The 11-Star Experience also reads differently in the AI era. It used to be a thought exercise constrained by what you could plausibly build. AI compresses the gap between imagination and execution to minutes, sometimes seconds. The question is no longer “what is the most absurd version of this experience?” but “which six and seven star experiences can I now industrialize that were unthinkable a year ago?” The exercise has become operational.

    Finally, the meta-lesson on founder-led moats is worth taking seriously. The instinct in venture capital and at most public-company boards is to professionalize early. Chesky’s argument is the opposite: the longer the founder stays in founder mode, the deeper the IP and the longer the company endures after they leave. Disney is the proof. Apple is the proof. Whether Airbnb will be is the open question, and it is the question Chesky is using AI Founder Mode to answer.