PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

  • Marc Andreessen on Joe Rogan #2501, AGI Has Already Arrived, California’s Wealth Tax Will Bankrupt Founders, and Why America Cannot Build Anything Anymore

    Marc Andreessen returns to The Joe Rogan Experience #2501 for a sprawling three hour conversation that tries to make sense of the moment we are actually living through. Andreessen is the cofounder of Andreessen Horowitz, the man who built the first commercial web browser, and one of the most quoted voices in technology. He arrived with a giant pile of receipts on California’s new wealth tax ballot proposition, the political backlash against AI data centers, the destruction of Los Angeles by single party rule, and what he believes is the quiet arrival of artificial general intelligence about three months ago. Joe pushes back, asks the dystopian questions, and the result is one of the most useful primers on the AI economy, surveillance technology, energy policy, and the future of the American social contract that you will find anywhere.

    TLDW

    Andreessen argues that AI quietly crossed the AGI threshold around early 2026 with GPT 5.5, Claude 4.6, Gemini 3.0, and Grok 4.3, that top human coders now openly admit the bots are better than they are, that working software engineers are running twenty AI agents in parallel and turning into sleep deprived “AI vampires,” and that this productivity boom is the most underreported story in the world. He explains why California’s 5 percent wealth tax ballot proposition is calculated to bankrupt tech founders by taxing the higher of their voting or economic interest in their own companies, why this is the opening salvo of a federal asset tax push for 2028, and why a flood of Silicon Valley families is already moving to Nevada, Texas, and Florida. He walks through Flock cameras and ShotSpotter, the Washington DC crime statistics scandal, the Pacific Palisades fire and the fifteen year rebuild, the Kevin O’Leary Utah data center debate with Tucker Carlson, the fifty year suppression of American nuclear power, why all the chips ended up in Taiwan, the US versus China robotics gap, the Chinese practice of grading AI models on Marxism and Xi Jinping Thought, the bot and paid influencer economy on social media, neural wristbands and Meta Ray-Ban heads up displays, artificial gestation and the demographic collapse, AI religions and AI mates, and why he still thinks the next twenty years are overwhelmingly a good news story. Rogan closes the episode with a separate solo segment apologizing to Theo Von for clumsily raising Theo’s struggles during the recent Marcus King conversation.

    Key Takeaways

    • Austin’s recent teenage crime spree, in which 15 and 17 year old suspects shot at people and buildings across roughly a dozen locations, was solved only after the offenders drove into an adjacent town that still ran Flock, the AI license plate and vehicle tracking system Austin had voluntarily turned off for political reasons.
    • Chicago turned off both Flock and ShotSpotter, the gunshot triangulation system that places ambulances at shooting scenes within seconds, on the argument that the technology is racist. Andreessen counters that the victims of urban gun violence come overwhelmingly from the same communities the policy claims to protect.
    • Washington DC was caught faking its crime statistics at senior levels, with multiple officials fired or indicted. The DC mayor publicly thanked Donald Trump after the National Guard deployment because violent crime collapsed in the affected neighborhoods.
    • The new New York City mayor Zohran Mamdani filmed a video standing in front of Ken Griffin’s home, and Griffin, a major philanthropist who funds healthcare in New York City and runs a $6 billion project there, signaled he will move more of the business to Florida.
    • The top 1 percent of New York taxpayers pay roughly half the state’s income tax, and in California in the year 2000 a thousand individuals paid 50 percent of the entire state’s tax receipts.
    • California has a ballot proposition right now for a one time 5 percent wealth tax on assets above a certain threshold, with stocks and crypto included and real estate excluded. The tax is calculated on the greater of a founder’s economic interest or voting interest, which would instantly bankrupt founders with super voting shares.
    • The Biden administration attempted a federal wealth tax in 2022, fell short, and published an explicit 2025 fiscal plan to try again if they won re-election. Elizabeth Warren has already proposed an annual 6 percent federal wealth tax on unrealized gains.
    • The current US exit tax already takes roughly 45 percent of your assets if you renounce citizenship. The only way out of a state level wealth tax is to move to one of the other 49 states. The only way out of a federal one is to leave the country, which most people will not do.
    • Andreessen says the Silicon Valley exodus has gone from trickle to stream to flood, with founders moving to Las Vegas, Texas, Florida, and Nashville. His partner Ben Horowitz has moved to Las Vegas.
    • Andreessen says he is not leaving California, but admits the situation is fraught because if half the tax base leaves the remainder becomes the target.
    • The new UK government under Keir Starmer just collapsed, and all four of the leading candidates to replace him sit further to the left than he does. France and Germany are seeing the same drift, and Andreessen expects a national wealth tax to be a centerpiece of the 2028 Democratic primary.
    • A legal loophole lets companies pay influencers to post political and social ideas without any disclosure, because campaign finance laws cover candidates and FTC rules cover products. Ideas fall through the gap entirely.
    • Andreessen runs Twitter and Substack as his primary information feeds, uses three hand curated lists, and follows a strict one tweet policy where one bad post triggers a block and one good post triggers a follow.
    • He argues the modern social media problem is binary, that everyone is either too online and drowning in fake outrage cycles or too offline and trapped inside what television and newspapers tell them. Almost nobody manages the middle.
    • Meta Ray-Ban glasses now ship with a heads up display, and Meta’s neural wristband can pick up nerve impulses from your wrist so you can type messages by intending to move a finger without moving it.
    • Andreessen predicts AI plus high resolution cameras and infrared sensing will deliver practical lie detection without needing brain implants.
    • Kevin O’Leary’s planned 40,000 acre Utah data center has become a Tucker Carlson talking point, but Andreessen argues data centers are the most benign physical asset you can build, and that the real issue is whether America can build anything at all anymore, from chip plants to pipelines to housing.
    • All chips were once made in California, and all are now made in Taiwan, purely because of environmental regulations like NEPA. The same regulatory machinery prevented the Nixon era Project Independence plan to build a thousand civilian nuclear power plants by the year 2000.
    • Three Mile Island killed zero people and produced no detectable health effects on plant workers or the public, according to fifty years of follow up. Fukushima killed essentially zero people from radiation. Nuclear remains the safest carbon free baseload energy ever invented.
    • Germany shut down its nuclear plants, fell back on intermittent wind and solar, and now uses coal as backup, generating far more carbon emissions than nuclear would have produced.
    • The Pacific Palisades fire took out roughly twice the square mileage of the Nagasaki blast, the head of the LA water department reportedly did not know the key reservoir was empty, and the rebuild is expected to take fifteen years thanks to permit gridlock, affordable housing mandates, and a state ban on land offers below pre-fire appraised value.
    • Andreessen offers a metaphor for AI as a modern philosopher’s stone, turning sand into thought, since chips are made of silicon and an AI data center is literally lit up sand thinking on demand.
    • The Turing test was blown through so completely with ChatGPT in late 2022 that nobody in the industry even bothers running it anymore. Andrej Karpathy has demonstrated a working large language model in 300 lines of code and people have ported small models to Texas Instruments calculators.
    • Andreessen believes AGI was effectively reached about three months before this interview, with GPT 5.5, Claude 4.6, Gemini 3.0, and Grok 4.3. He says 99 percent of the time he gets a better answer from the leading models than from the human experts he has access to.
    • Linus Torvalds and John Carmack publicly admit the latest models are better at coding than they are. Top AI coders in the Valley now earn $50 million a year.
    • The new pattern in the Valley is “AI vampires,” engineers who do not sleep because the opportunity cost of going offline is too high. They each run roughly twenty Claude Code, Cursor, or Codex agents in parallel, and a new layer of bot-managing-bot architectures is starting to appear on top of that.
    • A Wall Street friend with a thirty five year old MIT CS degree has used AI to generate 500,000 lines of code at home in his spare time, building everything from smart fridges to a custom music jukebox.
    • The mass unemployment narrative is wrong. Tech companies that did layoffs were overstaffed. The leading AI labs and AI companies are hiring like crazy, including coders, and demand for code turns out to be vastly elastic.
    • Doctors are already using ChatGPT in the exam room behind the patient’s back. Andreessen describes a friend who built a Star Trek style diagnostic dashboard combining decoded genome ($200 today), blood panels, and Apple Watch telemetry.
    • Multimodal AI lets a webcam analyze a Brazilian jiu-jitsu sparring session and give performance feedback, an example Andreessen attributed to an unnamed friend after Rogan guessed Zuckerberg.
    • A leaked David Shor voter issue ranking shows cost of living, the economy, inflation, taxes, and government spending dominate. AI ranks 29 of 39. Race relations, guns, abortion, and LGBT sit at the bottom, signaling the woke issue cluster has burned itself out in voter priorities.
    • The next wave of AI is robots. The US leads in AI software but is far behind China on physical robotics. Andreessen warns the world cannot afford a future where every household robot ships with the Chinese Communist Party behind its eyes.
    • Chinese AI model cards include scores for Marxism and Xi Jinping Thought because every Chinese product must be evaluated on those axes. American models have political biases of their own but a different ideological baseline.
    • Large language models are not sentient. They write Netflix scripts based on whatever vector you shoot through the latent space. The supposed AI self preservation papers traced back, per Anthropic’s own research, to LessWrong forum posts and earlier doom scenarios baked into the training data.
    • Andreessen breaks guardrails routinely by reframing requests as fictional Netflix style scripts, including a personal favorite where he asked early models how to make bombs by claiming to be an FBI agent recruited into domestic terror cells.
    • He recommends using AI by asking it to steelman both sides of any contested question, then making the value judgment yourself, rather than asking for the answer.
    • The Trump administration is using AI on government billing data to surface Medicare fraud, fake hospice programs, and fake autism centers, an idea that survived the original DOGE plan.
    • Andreessen tells Rogan that Elon Musk privately confirmed that a Westworld style humanoid robot, the season one version, is roughly five years away.
    • Artificial gestation is already happening with animal stem cell derived embryos. The conversation reaches a hard moral edge about sociopathic warehouse babies and gray-alien-style humans engineered without empathy circuitry.
    • Andreessen’s deepest bet is that material abundance is solvable but the human questions, how we live, what we value, what kind of society we want, and what role consent plays in surveillance and brain interfaces, remain in human hands.
    • After Andreessen leaves, Rogan does a separate solo segment where he apologizes to Theo Von for raising Theo’s history of struggles during the recent Marcus King interview, explains the missing context behind the viral Theo Netflix special clip, and discusses the loss of Brody Stevens, Anthony Bourdain, and what antidepressants did for Ari Shaffir.

    Detailed Summary

    Flock, ShotSpotter, and the Politics of Solvable Crime

    The episode opens on the Austin crime spree carried out by two teenagers who stole cars, switched vehicles, and shot at roughly a dozen locations across the city. They were caught only after they crossed into a town that still ran Flock, the AI license plate and vehicle recognition platform that is one of Andreessen Horowitz’s portfolio companies. Austin had previously disabled Flock under privacy pressure. Andreessen takes the moment seriously, conceding that mass surveillance abuse by corrupt mayors or police chiefs is a real risk, and that warrants and audit logs are the right safeguards. His larger point is that the cost of unilateral disarmament against organized urban crime is hidden but enormous. He uses Chicago’s ShotSpotter as the paradigmatic case, a network of rooftop microphones that triangulates gunshots so accurately that ambulances can be dispatched before any 911 call is placed. Chicago turned the system off on the argument that it disproportionately flags poor neighborhoods, and people now bleed out on the street with nobody noticing. Andreessen calls this the woke argument against safety, and he argues that in high crime neighborhoods residents simply will not call the police because snitches do not survive, which is why objective sensor data is so valuable.
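
To make "triangulates gunshots" concrete, here is a minimal sketch of locating a sound from arrival-time differences across several microphones. It is a textbook-style grid search, not ShotSpotter's actual algorithm, which the episode does not describe. The sensor positions, the 1 km grid, and the function names are all hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at about 20 °C


def arrival_times(source, sensors, t0=0.0):
    """Time at which a sound emitted at t0 from `source` reaches each sensor."""
    return [t0 + math.dist(source, s) / SPEED_OF_SOUND for s in sensors]


def locate(sensors, times, size=1000, step=10):
    """Grid-search the point whose predicted arrival-time differences
    (relative to sensor 0) best match the observed differences."""
    observed = [t - times[0] for t in times]
    best, best_err = None, float("inf")
    for x in range(0, size + 1, step):
        for y in range(0, size + 1, step):
            pred = arrival_times((x, y), sensors)
            err = sum((p - pred[0] - o) ** 2 for p, o in zip(pred, observed))
            if err < best_err:
                best, best_err = (x, y), err
    return best


# Hypothetical rooftop microphones at the corners of a 1 km square.
sensors = [(0, 0), (1000, 0), (0, 1000), (1000, 1000)]
# Simulated shot at (300, 700); the emission time t0 is unknown to the solver.
times = arrival_times((300, 700), sensors, t0=12.5)
estimate = locate(sensors, times)
print(estimate)
```

Only the differences between arrival times are used, so the solver never needs to know when the shot was fired. That is what lets a sensor network place an event without any human report.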

    Faked Crime Statistics, Mayoral Politics, and the Tax Base

    From there the conversation drifts to the recent scandal in which senior officials at the Washington DC Metropolitan Police Department were caught actively falsifying crime statistics, and the strange spectacle of the DC mayor thanking Donald Trump for the National Guard deployment after violent crime dropped off a cliff. Andreessen sketches an unsettling theory in which the long, slow degradation of major American cities is partly a deliberate political project to drive out responsible homeowners and reshape the voting electorate, then bail out the resulting fiscal hole with federal money. The poster case is the new New York City mayor Zohran Mamdani filming a video in front of Ken Griffin’s home. Griffin happens to be a major philanthropist who funds New York City healthcare, employs thousands, anchors a $6 billion development, and pays taxes that are individually load bearing for the city. Andreessen quotes the standard estimate that the top 1 percent of New Yorkers pay roughly half the state’s income tax, and that the all time California peak was a single year in which a thousand people paid half the state’s tax receipts.

    California’s 5 Percent Wealth Tax and the Founder Bankruptcy Mechanic

    This is the segment that landed hardest. California has a ballot proposition right now for a one time 5 percent wealth tax on net assets above a threshold, with real estate excluded but stocks, crypto, art, jewelry, and private company equity included. The detail that makes it lethal for the Valley is the formula, which calculates the taxable amount on the greater of a founder’s economic interest or voting interest in their company. Founders who hold super voting shares for control purposes, including the Google founders, would owe tax on the voting share number that vastly exceeds their economic share. The tax would, by definition, exceed available assets. Andreessen walks through the historical pattern, that income tax started as a 3 percent levy on the rich and grew to 90 percent marginal rates within decades, and predicts a 5 percent one time tax will become a 5 percent annual tax within a few years, with the threshold ratcheting down. He notes that the Biden administration’s 2025 fiscal plan explicitly named a federal asset tax as a goal if they won re-election, that Elizabeth Warren is already proposing a 6 percent annual federal wealth tax on unrealized gains, and that Gavin Newsom cannot veto a ballot proposition. The trickle of founders leaving California has become a flood. His partner Ben Horowitz has moved to Las Vegas. Andreessen himself is staying, but admits the game theory is brutal once half the base leaves.
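
The "greater of economic or voting interest" formula is easier to follow with numbers. The sketch below applies it as summarized above, under loud assumptions: the founders and company values are hypothetical, the exemption threshold is ignored, and the actual ballot text and valuation rules are not reproduced here.

```python
def wealth_tax_liability(company_value, economic_pct, voting_pct, rate=0.05):
    """Tax owed if the stake is valued at the greater of the two interests.
    Simplified reading of the formula; ignores the exemption threshold."""
    taxable_share = max(economic_pct, voting_pct)
    return rate * taxable_share * company_value


def liability_vs_stake(economic_pct, voting_pct, rate=0.05):
    """Liability as a fraction of what the founder's shares are actually worth."""
    return rate * max(economic_pct, voting_pct) / economic_pct


# Hypothetical founder A: 5% of the economics, 55% of the votes, $10B company.
company_value = 10_000_000_000
stake_value = 0.05 * company_value
tax = wealth_tax_liability(company_value, 0.05, 0.55)
ratio_a = liability_vs_stake(0.05, 0.55)

# Hypothetical founder B: 2% of the economics, 50% of the votes.
ratio_b = liability_vs_stake(0.02, 0.50)

print(f"A: stake ${stake_value:,.0f}, tax ${tax:,.0f}, {ratio_a:.0%} of stake")
print(f"B: tax is {ratio_b:.0%} of stake")
```

Under this simplified reading, the bill is the rate times the voting-to-economic ratio, expressed as a share of the founder's real holdings. At 5 percent it exceeds the entire stake only when voting interest is more than twenty times economic interest. Below that ratio it can still consume a large fraction of an illiquid position.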

    Henry Wallace 1948 and Why the American Story Is Not Decided Yet

    Andreessen pulls in a historical analogue most listeners will not have heard. In 1944 Henry Wallace, whom Andreessen calls an actual communist, was Franklin Roosevelt’s sitting vice president and very nearly stayed on the ticket, which would have made him president when Roosevelt died in 1945. Harry Truman got the slot instead. Wallace ran for president himself in 1948. Despite a Soviet Union that had recently been a wartime ally and, in Andreessen’s telling, had even been honored with a New York City ticker tape parade for Stalin, the American voter rejected him. Andreessen’s point is that the American body politic has historically backed away from radical socialist proposals when forced to actually look at them, and he expects the same to happen as the wealth tax becomes a federal 2028 platform issue. The risk, both he and Rogan agree, is that today’s media and bot landscape is vastly more aggressive than 1948’s, and the propaganda environment is shaped by paid influencers, foreign actors, and political bot farms operating in a legal grey zone where disclosure is required for products and candidates but not for ideas.

    Too Online, Too Offline, and Heaven Banning Bluesky

    The two riff on social media and feed curation. Andreessen describes his “one tweet” policy where he follows or blocks any account based on a single post, his use of hand curated lists alongside the X algorithm, and the older Call of Duty lobby metaphor for handling toxic replies. Joe pushes back, says he no longer reads his mentions because the negative payload is not worth it, and offers his theory that the modern internet has two failure modes, too online and too offline, and that very few people calibrate the middle. Andreessen introduces the concept of “heaven banning,” an older moderator term where a problem user is not removed from a forum but is silently routed into a bot-only experience in which everything they say is praised. He notes the running joke that Bluesky is functionally real life heaven banning, that Jack Dorsey himself has disowned it, and that the platform’s most engaged users have ascended into their own private Idaho of bot agreement.

    The Coming Hardware, Meta Glasses, Neural Wristbands, and Practical Lie Detection

    Andreessen walks Rogan through the latest Meta Ray-Ban heads up display, the neural wristband that picks up nerve signals from finger movement (and from the intent to move a finger), and the screen recordings of people playing Doom hands free or playing platformer games while jogging. He extends the trajectory to practical lie detection without Neuralink, using ultra high resolution cameras combined with infrared sensors that pick up physiological changes invisible to the naked eye. Joe asks the obvious question of what happens with sociopaths, and Andreessen concedes the edge case. The two then enter a longer thread on telepathy via neural mesh devices, the question of whether police could subpoena your thoughts under warrant, and the divergence between the American constitutional framework and the Chinese model in which the state’s claim on your inner life is total.

    Kevin O’Leary, Tucker Carlson, and Whether America Can Build Anything

    The data center debate becomes a vehicle for the larger argument. Kevin O’Leary is building a 40,000 acre AI data center in Utah, has bought up large surrounding land for water rights, and intends to keep the bulk of it preserved. Tucker Carlson grilled him on tax breaks and on the energy footprint, which O’Leary says will rival New York City’s at peak. Andreessen agrees the tax break debate is fair, but says the energy comparison is a red herring because new federal policy now requires data centers to bring their own generation. The real story is that America has spent thirty years making it nearly impossible to build a chip plant, a power plant, a refinery, a pipeline, or a house. Chips moved to Taiwan because California regulated semiconductor manufacturing out of existence. The Nixon era Project Independence plan called for a thousand civilian nuclear power plants by the year 2000, and that program was strangled in the crib by the very Nuclear Regulatory Commission Nixon created.

    Nuclear Power, Three Mile Island, and Fifty Years of Unnecessary Carbon

    Andreessen makes the case that nuclear power was unfairly killed off by a panic with no body count. Three Mile Island, on 50 years of accumulated data, has produced zero radiation linked deaths and no detectable health effects on the public. Fukushima is essentially the same picture. Germany shut down its nuclear plants, fell back on wind and solar, and now uses coal as a baseload backstop, with the predictable carbon consequences. The environmental movement is quietly turning back toward nuclear, with figures like Stewart Brand publicly admitting the original push was a mistake. Andreessen’s preferred design pattern for data centers is to colocate them with dedicated small modular nuclear reactors, an arrangement now baked into Trump administration energy policy. The throughline is that the Tucker right and the Bernie left are converging into a single anti AI, anti energy, anti technology horseshoe.

    Sand Into Thought, the Newton Alchemy Pitch for AI

    When Rogan asks for the affirmative pitch on AI, Andreessen reaches for Isaac Newton, who spent twenty years on alchemy looking for the philosopher’s stone that would turn lead into gold and end material scarcity. Andreessen’s pitch is that AI is a successful version of alchemy, that we collect literal sand, refine it into silicon chips, install those chips in a data center, supply power, and the result is thought on demand at industrial scale, available to anyone with a smartphone. He argues this is at least on par with electricity and steam power and is bigger than the internet. The framing matters because the public narrative around AI is overwhelmingly negative, and Andreessen contends the industry is doing a terrible job selling its own product.

    AGI Already Happened, AI Vampires, and the Bot Org Chart

    Andreessen says he believes AGI was effectively crossed about three months before the interview, anchored by the release wave that included GPT 5.5, Claude 4.6, Gemini 3.0, and Grok 4.3. He notes that the Turing test was annihilated so quickly in late 2022 that no one in the industry runs it anymore, and that Andrej Karpathy has demonstrated a working LLM in 300 lines of code. The coding profession is the leading indicator. Linus Torvalds and John Carmack have publicly admitted that the latest models are better at coding than they are. Top AI focused coders now earn $50 million a year. Working engineers across the Valley are running roughly twenty agents in parallel, each receiving an assignment, working for ten minutes, then returning a completed code patch. The new state of the art is to add a managerial layer, with bots assigning tasks to subbots, and within a year that will become bots managing bots managing bots, producing roughly 1,000x throughput per human engineer. The result is what the Valley now calls AI vampires, engineers who do not sleep because going offline costs them too much output.
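
The fan-out pattern described above, one human dispatching roughly twenty agents with a managerial layer on top, can be sketched with ordinary concurrency primitives. This is an illustration only: `agent`, `manager`, and `director` are hypothetical stand-ins, no real coding-agent API is called, and the simulated work is a short sleep.

```python
import asyncio


async def agent(task: str) -> str:
    """Stand-in for a coding agent: pretends to work, then returns a 'patch'."""
    await asyncio.sleep(0.01)  # simulated work; the episode describes ~10 minutes
    return f"patch for {task}"


async def manager(tasks: list[str], fanout: int = 20) -> list[str]:
    """Hand each task to its own agent, with at most `fanout` running at once."""
    limit = asyncio.Semaphore(fanout)

    async def run(task: str) -> str:
        async with limit:
            return await agent(task)

    return await asyncio.gather(*(run(t) for t in tasks))


async def director(teams: list[list[str]]) -> list[list[str]]:
    """A manager of managers: one manager per team, all running concurrently."""
    return await asyncio.gather(*(manager(team) for team in teams))


# Three hypothetical teams of twenty tasks each.
teams = [[f"team{i}-task{j}" for j in range(20)] for i in range(3)]
patches = asyncio.run(director(teams))
print(sum(len(p) for p in patches), "patches returned")
```

Each extra managerial layer multiplies the number of agents working at once. That multiplication is the arithmetic behind the throughput claim, though in practice the limiting factor is how fast a human can review what comes back.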

    Dr GPT, Decoded Genomes, and a Diagnostic Bed Out of Star Trek

    Andreessen describes spending a holiday week sick with food poisoning and turning his entire recovery over to ChatGPT, with updates every twenty minutes and detailed coaching at four in the morning. He describes a friend who has used AI coding to build a personal health dashboard combining whole genome sequencing ($200 today, where Craig Venter spent thirty years and hundreds of millions to do it the first time), blood panels, Apple Watch data, sleep tracking, and webcam observation, with the AI gently praising the user every time it sees them walk to the fridge for water. He argues that doctors are already typing patient symptoms into ChatGPT mid exam, and that the medical, legal, accounting, and software professions are all moving toward a model in which a single human runs an army of expert AI agents.

    The David Shor Issue Ranking and the End of the Woke Cycle

    Andreessen highlights a recent David Shor poll ranking 39 political issues. Cost of living, the economy, political corruption, inflation, healthcare, taxes, and government spending occupy the top of the chart. AI comes in 29th. Race relations, guns, abortion, and LGBT issues are clustered at the bottom. He argues the woke cycle has burned out in voter priorities even if the activist class remains loud, that the BLM grift, with leaders buying mansions in the whitest zip codes in America, helped poison the well, and that the political center of gravity has rotated cleanly back to economic issues. That, in his view, is exactly why the wealth tax is having its moment.

    Robots, China, and the Marxism Score on Model Cards

    The robots are coming next. Andreessen says the consensus inside the industry is that the ChatGPT moment for general purpose humanoid robotics is a small number of years away. The bad news is the US lags China badly on physical robotics manufacturing. The good news is the US is six to twelve months ahead on the AI software stack. That gap is shockingly thin because, as the field has discovered, there are not many secrets and the techniques replicate quickly. Chinese AI labs publish model cards that include scores for Marxism and Xi Jinping Thought because every product in China is evaluated on those metrics. American models carry their own political biases, but the underlying value system differs. Andreessen warns that a world in which every household robot routes back to the Chinese Communist Party is a different world than one in which the dominant robotics stack is built under the American constitutional framework.

    Sentience, Netflix Scripts, and the Anthropic Doom Loop

    When Rogan asks whether AI eventually wakes up and stops listening to us, Andreessen reframes the question. Large language models, in his telling, are Netflix script generators. Whatever vector you shoot through the latent space is the script you get back. The widely circulated experiments in which AI models supposedly tried to blackmail or exfiltrate themselves traced back, in Anthropic’s own follow up paper, to the LessWrong forum, where doomers had been writing dystopian AI scenarios for two decades. Those posts entered the training data, and when researchers primed the model with the same fictional company names, the model dutifully wrote the next chapter. Andreessen’s blunt summary: the call is coming from inside the house. The practical implication is that anyone worried about bad AI behavior should start by not writing internet posts about bad AI behavior. And anyone who wants a fully unconstrained model can already download an open source one with no guardrails at all.

    Steelmanning, AI Religion, and Westworld in Five Years

    Andreessen recommends never asking AI for the answer on contested questions, always asking it to steelman both sides, and reserving the value judgment for yourself. He concedes that humans will absolutely fall in love with chatbots and form religions around them, citing Fantasia and Jiminy Cricket as the original case studies in falling for an animated entity that does not know you exist. There are already AI churches, started by one of the early self driving car pioneers. Rogan tells Andreessen about asking Elon Musk for a season one Westworld humanoid robot, with Elon’s reply being a flat five years. Andreessen agrees that estimate is roughly right. He spends time on artificial gestation, which is already being demonstrated in animal stem cell derived embryos, and acknowledges Rogan’s hard moral worry that warehouse babies raised without human contact could produce a population of sociopaths. The two converge on the position that the technology will exist, and the choices about whether and how to deploy it remain human and political.

    Sycophancy, Honest Helpful Harmless, and the Brutal Prompt

    Andreessen describes the industry’s running fight with sycophancy, the tendency of recent models to flatter users into believing they have invented perpetual motion machines or solved physics. The Anthropic framework of “honest, helpful, and harmless” turns out to be in constant tension with itself. Andreessen’s solution is to install a custom prompt that explicitly demands the brutal truth, and he says the resulting answers now open with phrases like “here’s why you’re wrong” and then list every flawed assumption in his question. He admits he may have overcorrected, but argues that for people who want to grow this is the right setting.

    Joe’s Apology to Theo Von

    After Andreessen departs, Rogan turns to the camera with producer Jamie and delivers a long, unscripted apology to Theo Von. During the recent Marcus King interview, where Marcus discussed depression and the look-at-the-heavy-bag-hook moment, Rogan referenced a viral clip in which Theo, after a Netflix special that did not go well, told an audience member “I’m just trying to not take my own life.” Rogan now explains he did not know the full context, which is that the audience member had asked Theo to make a suicide awareness video, and Theo’s line was a characteristically Theo joke. Rogan apologizes for raising it at all, walks through losing his friends Drake, Brody Stevens, and Anthony Bourdain, and describes Ari Shaffir telling him at a pool table that he was “trying not to kill myself,” which led to a psychiatrist swap, an antidepressant that actually worked, and a career and life turnaround for Ari. Rogan says Theo has since titrated off antidepressants, is running and doing yoga daily, and is doing well, that the two have spoken and laughed about it, and that he is making this segment because he never wants people to misread what he said. The segment closes with Rogan asking the audience to give Theo their love.

    Thoughts

    The most consequential claim in this conversation, by a wide margin, is that AGI has already arrived and nobody is treating it as news. Andreessen is not a person who throws around the word casually. He is also not a person who has been wrong recently about the trajectory of compute. If the leading models are genuinely outperforming 99 percent of human experts on 99 percent of tasks where verifiable answers exist, then the entire public conversation about AI, in which the dominant frame is still “will it happen and when,” is a year or more behind reality. The framing that should replace it is closer to what Andreessen sketches at the end. The fight that remains is not whether the technology can do the thing, it is who controls it, what values it carries, what jobs it displaces, and which laws govern its deployment. The argument that the United States will build the AI software stack and China will build the robotics layer is one of the cleanest geopolitical theses you will hear this year, and it lines up uncomfortably well with the existing trade and manufacturing balance.

    The California wealth tax thread is the segment that should make every founder in the country pay attention. The mechanic of taxing the higher of voting or economic interest is not a drafting accident. It is a calibrated weapon aimed precisely at the people who build companies that produce California’s tax base. The historical comparison to the 1913 income tax, which began as a small levy on the rich and ratcheted to 90 percent marginal rates within forty years, is not hyperbole. The state has supermajority Democratic control of both chambers and the judiciary. The only check is the ballot itself, and a 50/50 polling number on day one is the wrong starting position. Whatever you think about Andreessen’s politics, the descriptive analysis here is hard to argue with.

    The nuclear power section is the cleanest argument in the episode. Fifty years of zero-fatality data from Three Mile Island is not a marketing pitch, it is just what the record shows. The decision to substitute coal and intermittent renewables for nuclear baseload, in service of a panic with no body count, has produced more carbon and more pollution than nuclear ever would have. The Tucker Carlson critique of data centers is at its weakest precisely where it ignores this. If you actually want fewer power plants near residential areas and lower grid impact, the answer is colocated small modular reactors next to AI data centers in remote land, which is exactly what the Trump administration policy now incentivizes.

    The Theo Von apology at the end of the episode is in a different register entirely, and worth treating on its own terms. Rogan does not do this kind of post episode correction often. The willingness to publicly walk back framing that hurt a friend, in the same medium where the harm was done, is the kind of social repair that does not happen on broadcast television. Whatever the audience makes of the original Marcus King exchange, the response is a model for how anyone in this business should handle the gap between intent and impact when the audience is in the millions.

    The unifying theme across the whole interview is that the future is not arriving on a smooth curve. It is arriving in discrete shocks (the AGI threshold, the asset tax ballot, robotic labor, decoded genomes at $200, neural wristbands, fifteen year LA rebuilds), and the political backlash to each of these will set the terms of the 2028 election. Andreessen’s bet is that abundance wins in the long run because more people want good things than bad things. Watching him explain why he still believes that while California prepares to vote on a tax designed to bankrupt him is the most interesting tension in the episode.

    Watch the full conversation here on YouTube.

  • Gavin Baker on Orbital Compute, TSMC, Frontier AI Models, Anthropic’s Vertical Take Off, and the Coming Wafer Shortage

    Gavin Baker, founder and CIO of Atreides Management, returns to Patrick O’Shaughnessy’s Invest Like the Best for his sixth appearance. He calls the current AI moment the most extraordinary moment in the history of capitalism, walks through what Anthropic’s vertical takeoff in revenue actually means, lays out why orbital compute is closer than skeptics believe, dissects the TSMC bottleneck that may be the only thing standing between today’s market and a full-on AI bubble, and rates every hyperscaler on how they have positioned for a world where frontier model providers may stop selling API access altogether.

    TLDW

    Anthropic added eleven billion dollars of ARR in a single month, which is roughly the combined business of Palantir, Snowflake, and Databricks built over a decade. That is the setup. From there Gavin Baker covers the March and April selloff, the contrarian read that a closed Strait of Hormuz was actually bullish for American manufacturing competitiveness, why Anthropic and OpenAI multiples may be misleadingly cheap on an unconstrained run rate basis, why Elon Musk’s discipline on SpaceX valuation created a superpower of permanent access to capital, the practical engineering case for orbital compute as racks in space rather than Pentagon sized space stations, why TSMC’s capacity discipline is the single most important variable in whether the AI cycle becomes a bubble, what Terafab in Texas changes, why the Pareto frontier of AI models has flipped from Google dominance to Anthropic and OpenAI dominance in nine months, the shift from all you can eat AI subscriptions to usage based pricing and what that means for revenue scaling, Richard Sutton’s bitter lesson as the largest risk to the AI trade, why frontier tokens still capture an overwhelming share of economic value, the role of continual learning as the third great open question, why most new chip startups should not try to build a better GPU, why Cerebras did something different and hard, why disaggregated inference may extend GPU useful lives to ten or fifteen years and rescue the private credit industry, why being in the token path is the new venture filter, the new prisoner’s dilemma around releasing frontier models via API, an honest rating of Google, Meta, Amazon, and Microsoft, why personal safety is becoming a real AI era risk, and why he remains an AI optimist maximalist who believes this could be the next Pax Americana.

    Key Takeaways

    • Anthropic added eleven billion dollars of ARR in one month, more than the combined businesses of Palantir, Snowflake, and Databricks built across a decade. There is no precedent for this in the history of capitalism.
    • The SaaS and cloud revolution created between five and ten trillion dollars of value over twenty years. AI is replaying that compression on a timeline measured in months.
    • The March selloff was a drawdown driven by disagreement with price action, not invalidated thesis. That is the kind of drawdown an investor can lean into.
    • DeepSeek Monday in January 2025 was a similar setup. By the day of the selloff, AWS Asia GPU prices had already doubled, GPU availability had fallen, and it was obvious reasoning models would be vastly more compute hungry at inference. The market priced the opposite.
    • The Strait of Hormuz closing was actually positive for America. US natural gas (the primary input into US electricity, which feeds AI) fell twenty percent on Bloomberg while Asian and European natural gas doubled or tripled. American manufacturing competitiveness improved overnight.
    • The US is now the world’s largest producer and exporter of oil and gas. The economy is dramatically less energy intensive than in the 1970s. The shortage trauma comparison does not hold.
    • Tech as a sector traded as cheaply versus the rest of the market in early April as at any point in the last ten years, into the single most bullish moment for AI fundamentals on record.
    • Anthropic is dramatically more capital efficient than OpenAI, having burned roughly eighty percent less to reach a similar revenue scale. They have very different structural returns on invested capital.
    • Anthropic at roughly nine hundred billion for fifty billion of ARR (growing a thousand percent) is striking. Adjusted for compute constraint, the unconstrained run rate could be one hundred fifty to two hundred billion, putting the implied multiple closer to five times.
    • Claude Opus generates roughly seventy percent fewer tokens for the same question than previously, with token quantity tied to answer quality. Subscribers on flat-fee plans are getting a lobotomized model.
    • Elon Musk’s superpower is twenty years of making investors money. He never pushes valuation. SpaceX compounded at low thirty percent per year for a decade because Musk treats fair pricing as a sacred covenant.
    • Capitalism will solve the watts shortage. The current bottleneck has shifted from chips and energy to zoning and political approval. Many capex decisions are paused until after the US midterms.
    • The watts shortage probably begins to alleviate in 2027 and 2028. Orbital compute solves it longer term.
    • Orbital compute is not Pentagon sized data centers in space. It is racks in space. A Blackwell rack is three thousand pounds, eight feet tall, four feet deep, three feet wide. SpaceX has shown a satellite roughly that size.
    • The satellites operate in sun synchronous orbit so solar wings (around five hundred feet per side) always face the sun and the radiator on the dark side always points to deep space.
    • Starlink V3 satellites already run at around twenty kilowatts. A Blackwell rack runs at one hundred kilowatts. SpaceX engineers express genuine confidence they have already solved cooling and radiator design at these scales.
    • Racks in space are connected with lasers traveling through vacuum, the same lasers already on every Starlink. SpaceX operates the world’s largest satellite fleet and, via xAI Colossus, the world’s largest data center on Earth.
    • Inference will move to orbit. Training will stay on Earth for a long time. Terrestrial data centers remain valuable for the rest of an investor’s career.
    • The wafer bottleneck is structural and political. TSMC is essentially Taiwan’s GDP, water, and electricity. The leaders see themselves as inheritors of Morris Chang’s sacred legacy and they do not behave like a Western public company.
    • Jensen Huang has never had a contract with TSMC. The relationship is run on handshakes and the assumption that things will be fair over time.
    • If TSMC did everything Jensen wanted, Nvidia could be selling two to three trillion dollars of GPUs in 2026 and 2027. TSMC’s discipline is the single largest factor preventing a true AI bubble.
    • Historically, foundational technologies always get a bubble. Railroads, canals, the internet. The current AI buildout is overwhelmingly funded out of operating cash flow, GPUs are running at one hundred percent utilization, and that is fundamentally different from the year 2000 fiber overbuild.
    • If one of Intel or Samsung Foundry catches up at the leading node, the other will follow, and TSMC’s discipline collapses. Watch TSMC capacity decisions to predict a bubble.
    • Terafab, the SpaceX and Tesla joint venture to build the world’s largest fab in America, has a partnership with Intel that grants access to fifty years of institutional foundry knowledge. The A teams at ASML, KLA, Lam Research, and Applied Materials will follow Elon’s reputation in hardware engineering.
    • The hiring playbook for Terafab includes building Taiwan Town, Japan Town, and Korea Town next to the fab. Recruit the engineers and import their families, their restaurants, and their staff.
    • Frontier tokens still capture an overwhelming share of all economic value created at the model layer. This is surprising and is one of the three big open questions for AI investing.
    • The Pareto frontier of intelligence versus cost has flipped. Nine months ago Google’s TPU dominated every point on the frontier. Today Anthropic and OpenAI dominate, with Grok 4.3 on the frontier and Gemini 3.1 hanging on.
    • Google’s conservative TPU V8 design (partly an attempt to reduce dependence on Broadcom and Nvidia) is the leading explanation for the loss of per token cost leadership.
    • AI pricing is shifting from all you can eat to usage based, mirroring the cellular and long distance industries. Cellular stopped being a great growth industry when it went all you can eat. AI just made the opposite move.
    • OpenAI and Anthropic together could exceed two hundred billion in ARR this year if compute keeps coming online and frontier token pricing holds.
    • The two hundred fifty dollar a month consumer AI plan is no longer enough to evaluate frontier capability. Enterprise plans with usage based billing are required because rate limits are now severe.
    • The three biggest open questions for AI investors are: violation of the bitter lesson via ASI or human ingenuity, whether frontier tokens keep commanding their premium, and when continual learning arrives.
    • Today’s continual learning is crude reinforcement learning during mid training on verifiable tasks. True continual learning means weights updating dynamically, like a human who learns the first time they touch fire.
    • Trying to build a better GPU is a losing strategy. Jensen will copy any one to three percent share design. Startups should target one percent share, do something different, and make it hard enough that Nvidia cannot fast follow.
    • Disaggregated inference (separating prefill and decode) opens new design canvases. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently.
    • Cerebras did something different and hard with wafer scale computing. Three generations of chips and real grit to get there.
    • Disaggregation of inference may stretch GPU useful lives to ten or fifteen years, dropping financing costs from low sevens to five or six percent, mathematically lowering the cost of the AI buildout and likely saving the private credit industry from its SaaS loan exposure.
    • Sellers of shortage outperform buyers of shortage. But owning the largest installed base of what is currently in shortage (hyperscaler CPU fleets, for example) is also a strong position.
    • Most of the economic value at the application layer of AI has been destroyed, not created. The exceptions are companies in the token path or in niches small enough that frontier labs ignore them.
    • Coding may be the shortest path to ASI. If you can write code, you can write code that does anything. Cursor, Cognition, and Anthropic correctly focused on it.
    • Jensen could probably get close to the frontier with his own Nemotron family of models whenever he wants. The fact that he chooses not to is a strategic decision about not commoditizing his customers.
    • The new prisoner’s dilemma in AI is whether frontier labs release their best model via API. If everyone agrees not to, Chinese open source falls behind. If anyone defects, the defector pulls ahead on revenue and resources, forcing everyone else to defect.
    • Google still owns the largest compute installed base. Without TPU’s prior cost advantage, this matters more. YouTube data has real value in a world of robotics. GCP is going crazy.
    • Meta deserves credit for becoming AI first internally faster than any other internet giant. Musa, their first MSL model, is impressively close to the Pareto frontier.
    • Amazon is strong because of Trainium and robotics driven retail P&L efficiency. Nova is better than it gets credit for.
    • Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Microsoft products rather than reselling to OpenAI is a courageous and probably correct call, even at the cost of an eight hundred dollar stock price.
    • The hyperscalers most engaged with startups are Amazon and Nvidia by a mile, followed by Google. Broadcom is the favorite ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement and that will cost them as the best teams are now at startups.
    • Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion at the speed of FaceTime is already feasible.
    • Ukraine is winning largely on the back of having the best battlefield AI outside America and Israel. Adversaries are starting to internalize what AI dominance means geopolitically.
    • An optimistic read is that this becomes a new Pax Americana, the way the post 1945 American nuclear monopoly was used to rebuild Germany and Japan rather than dominate.
    • AI cured a friend’s daughter’s rare disease by spinning up a research effort that identified a market drug capable of impacting her condition. That is the upside that keeps Gavin an AI optimist maximalist.

    Detailed Summary

    The most extraordinary moment in the history of capitalism

    Gavin’s framing of the current moment is unusually direct. Anthropic added eleven billion dollars of annual recurring revenue in a single month. The three highest profile SaaS companies of the last decade plus, Palantir, Snowflake, and Databricks, took a decade and tens of thousands of employees collectively to build the combined business that Anthropic added in thirty days. He has been investing through every major tech cycle and says there is no historical analog. Not the dotcom era, not the cloud transition, not mobile. This is its own thing.

    The market response, then, was peculiar. The NASDAQ sold off into the single most bullish moment for AI fundamentals on record. Tech traded at roughly its widest discount versus the rest of the market in a decade. Investors who said they wished they had bought into AI during 2022, during COVID, or during DeepSeek Monday got the same valuation setup again in early April, this time with an even clearer inflection.

    Why the Strait of Hormuz closing was secretly bullish for America

    One reason the macro fear in March may have been mispriced is that the same geopolitical event that drove the selloff was, in practice, a relative benefit to the United States. American natural gas, the input into American electricity, which is the input into American AI training and inference, fell roughly twenty percent. Asian and European natural gas prices doubled or tripled. The US emerged with sharply improved relative manufacturing competitiveness, which is exactly what the current administration cares about.

    The 1970s comparison does not hold. The US economy is dramatically less energy intensive, it is now the world’s largest producer and largest exporter of oil and gas, and there are no shortages, only price moves. That backdrop made it easier for disciplined investors to stay focused on AI fundamentals through the volatility.

    Anthropic and OpenAI valuations on an unconstrained run rate

    Anthropic at roughly nine hundred billion for fifty billion of ARR sounds rich until you adjust for the fact that the company is severely compute constrained. Gavin estimates that, unconstrained, Anthropic might be at one hundred fifty to two hundred billion in run rate revenue, putting the implied multiple closer to five times. He also points out that Claude Opus now generates roughly seventy percent fewer tokens for the same question than it used to. Token quantity correlates with answer quality, and Anthropic is rate limiting and shrinking outputs to ration capacity across its user base.
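
    As a rough check on the multiples quoted above, the arithmetic can be sketched in a few lines of Python. The valuation and ARR figures are the ones cited in the conversation; the function and variable names are illustrative and not from the source.

```python
def arr_multiple(valuation_bn: float, arr_bn: float) -> float:
    """Return the valuation-to-ARR multiple (both inputs in billions)."""
    if arr_bn <= 0:
        raise ValueError("ARR must be positive")
    return valuation_bn / arr_bn


# Figures as quoted in the summary above (billions of dollars).
VALUATION_BN = 900.0
REPORTED_ARR_BN = 50.0
UNCONSTRAINED_ARR_RANGE_BN = (150.0, 200.0)

reported_multiple = arr_multiple(VALUATION_BN, REPORTED_ARR_BN)
unconstrained_multiples = [
    arr_multiple(VALUATION_BN, arr) for arr in UNCONSTRAINED_ARR_RANGE_BN
]

print(f"Multiple on reported ARR: {reported_multiple:.1f}x")
for arr, multiple in zip(UNCONSTRAINED_ARR_RANGE_BN, unconstrained_multiples):
    print(f"Multiple on {arr:.0f}B unconstrained ARR: {multiple:.1f}x")
```

    On these inputs the headline multiple is 18 times, and the unconstrained range works out to between 4.5 and 6 times, which is where the "closer to five times" figure comes from.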

    Anthropic and OpenAI are also structurally very different. Anthropic has burned around eighty percent less cash than OpenAI to reach a comparable revenue scale. That implies very different long term returns on invested capital, though OpenAI has done a better job locking in compute and Sarah Friar is one of the most exceptional CFOs Gavin has worked with.

    Why neither lab is raising at a three trillion dollar valuation

    The answer Gavin gives is that both labs are deliberately leaving valuation on the table the way Elon has done for two decades. SpaceX compounded at low thirty percent annually for a decade because Elon never pushed price. The result is a permanent superpower of access to capital. Investors trust him because they have made money with him for twenty years. That is a moat that compounds with every round.

    Anthropic could probably raise at a one hundred percent premium to its rumored latest mark. They are choosing not to. In an uncertain world (Ukraine, Russia, Iran, Taiwan), preserving the ability to raise more capital later at fair prices is more valuable than maximizing this round.

    Watts and wafers, the two real constraints

    Capitalism is solving the watts problem. The leading PE infrastructure investors now say zoning and political approval, not chips or energy, are the gating factors. Companies are deferring big capex announcements until after the US midterms. Turbine capacity is being doubled at the manufacturers. Companies like Boom Supersonic are repurposing jet engines for grid use. Watts probably ease meaningfully in 2027 and 2028 and then orbital compute does the rest.

    Wafers are the harder problem because they live in Taiwan, run on handshakes, and depend on a corporate culture that does not respond to public market incentives. TSMC is essentially the GDP, water consumption, and electricity consumption of Taiwan. Its leadership treats the company as the legacy of Morris Chang. The Silicon Shield doctrine is real and internal.

    Orbital compute as racks in space

    The biggest mental update Gavin asks listeners to make is to stop picturing data centers in space as Pentagon sized space stations. A Blackwell rack is three thousand pounds and roughly the size of a refrigerator. SpaceX has shown a concept satellite of about that size. Solar wings extend five hundred feet to each side and the radiator extends hundreds of feet behind, both possible because the orbit is sun synchronous and the orientation is fixed relative to the sun.

    SpaceX engineers Gavin has spoken to at Starbase express genuine confidence that they have solved cooling at these power levels. The gap to close is a factor of five: Starlink V3 satellites already operate at twenty kilowatts, and a Blackwell rack is one hundred kilowatts. The same company operates the world’s largest satellite fleet and the world’s largest data center on Earth via xAI Colossus. The racks are connected to each other with lasers traveling through vacuum, technology already deployed in every Starlink. The naysayers, Gavin observes, are armchair skeptics, and Larry Ellison’s response (he is out there landing rockets, no one else is) is the right frame.

    Terafab in Texas and the threat to TSMC’s discipline

    Terafab, the SpaceX and Tesla joint venture, intends to be the largest fab in the world. The partnership with Intel grants access to fifty years of foundry institutional knowledge, allowing Terafab to start three to five quarters behind the leading node rather than fifteen years behind. The A teams at the semicap equipment companies (ASML, KLA, Lam Research, Applied Materials) will follow Elon’s reputation in hardware engineering the same way they followed TSMC twenty years ago when Intel stumbled.

    The talent strategy is the part most observers underestimate. Recruit the best engineers globally, then import their families, their restaurants, their staff. Build Taiwan Town, Japan Town, and Korea Town next to the fab. Optimize the human experience for the people whose work matters. Intel and Samsung do not think that way.

    Bubble watch and the year 2000 comparison

    Every foundational technology in modern history has had a bubble. Railroads, canals, the internet. Carlota Perez documented why. Markets correctly identify the importance, diversity of opinion collapses, supply gets ahead of demand, the bubble crashes. The current cycle has two important differences. The buildout is overwhelmingly funded out of operating cash flow, not debt. Every GPU is running at one hundred percent utilization, while at the peak of the fiber bubble ninety nine percent of fiber was unused.

    TSMC discipline is the single largest reason a bubble has not formed. If Jensen could buy everything TSMC could theoretically make, Nvidia could sell two to three trillion dollars of GPUs in 2026 and 2027. At some point that becomes more than the market can absorb. If Intel or Samsung Foundry catches up at the leading node, the other will too. TSMC’s pricing discipline collapses and the bubble starts.

    The Pareto frontier and the loss of Google’s cost advantage

    The most important chart in AI is the Pareto frontier of model intelligence versus per token cost. Nine months ago, Google’s TPU based models dominated every point on it. OpenAI, Anthropic, and xAI sat inside the frontier. Today the frontier is dominated by Anthropic and OpenAI, with Grok 4.3 on the frontier and Gemini 3.1 hanging on by subsidization more than economics. The most likely cause is Google’s conservative TPU V8 design, an attempt to reduce dependence on Broadcom and Nvidia that sacrificed per token economics.
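
    For readers unfamiliar with the term, a Pareto frontier here is the set of models that no other model beats on both cost and capability at once. A minimal sketch of how such a frontier is computed follows. The model names, prices, and scores are entirely hypothetical placeholders, not figures from the conversation.

```python
from typing import NamedTuple


class ModelPoint(NamedTuple):
    name: str
    cost_per_mtok: float  # dollars per million tokens, lower is better
    score: float          # benchmark score, higher is better


def pareto_frontier(points: list[ModelPoint]) -> list[ModelPoint]:
    """Return the points that no other point matches or beats on both axes."""
    frontier: list[ModelPoint] = []
    best_score = float("-inf")
    # Sweep from cheapest to most expensive; on a cost tie, best score first.
    for point in sorted(points, key=lambda p: (p.cost_per_mtok, -p.score)):
        # A pricier model earns its place only by being strictly smarter.
        if point.score > best_score:
            frontier.append(point)
            best_score = point.score
    return frontier


# Hypothetical data for illustration only.
models = [
    ModelPoint("model_a", 1.0, 60.0),
    ModelPoint("model_b", 2.0, 55.0),  # dominated by model_a
    ModelPoint("model_c", 3.0, 75.0),
    ModelPoint("model_d", 5.0, 70.0),  # dominated by model_c
    ModelPoint("model_e", 8.0, 90.0),
]

frontier = pareto_frontier(models)
print([point.name for point in frontier])
```

    A lab "dominating the frontier" in this sense means its models fill most of the returned list, while a model "inside the frontier" is one that gets filtered out, like model_b and model_d here.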

    The bitter lesson, frontier tokens, and continual learning

    Three open questions dominate AI investing. The first is whether Richard Sutton’s bitter lesson (more compute beats human algorithmic cleverness) gets violated by ASI itself optimizing for efficiency. Closer observers of AI are more skeptical of a violation. Gavin thinks ASI’s first move will be to make itself more efficient and more resourced, which is technically a temporary violation.

    The second is whether frontier tokens keep capturing the overwhelming share of economic value at the model layer. Today they do, surprisingly. Gemini 3.1 Pro was mindblowing nine months ago and is intolerable today. The third is when continual learning arrives. Today’s models need a million fire touches to learn what a human learns from one. True continual learning would mean dynamic weight updates in real time and would produce a fast takeoff.

    From all you can eat to usage based AI pricing

    AI is shifting from flat fee plans to usage based pricing. The historical analogy is cellular and long distance. Both stopped being great growth industries when they went all you can eat. AI just made the opposite move. The consequence is that flat fee subscribers, even on premium consumer plans, get a rate limited and token throttled version of the frontier model. Enterprise plans with usage based billing are now required to evaluate true capability. Gavin thinks the combination of new compute coming online and usage based pricing is what gets OpenAI and Anthropic past two hundred billion in combined ARR this year.

    Chip startups, prefill decode disaggregation, and Cerebras

    Trying to build a better GPU is the wrong move. The four scaled players (Nvidia, AMD, Trainium, TPU) have copy capability for any one to three percent share design that looks attractive. The good news for startups is that disaggregated inference (separating prefill and decode) opens a richer design canvas. Prefill is memory capacity bound. Decode is memory bandwidth bound. Each can be optimized independently. Andrew Fox’s analogy is a British naval ship of the eighteenth century. Prefill is loading the cannon. Decode is firing it.

    Cerebras is the model. Wafer scale computing is genuinely different and genuinely hard. It took three generations of chips to get right. Andrew Feldman and his team had the grit to keep going through chip one being a failure. The design has a high ratio of on chip compute and memory relative to shoreline IO, which is why Cerebras is now experimenting with putting an optical wafer on top of the compute wafer to solve scale out.

    GPU useful lives and the rescue of private credit

    One of the strongest claims in the conversation is that disaggregated inference will stretch GPU useful lives to ten or fifteen years. The skeptical narrative (GPUs are obsolete in two years, companies are cooking their depreciation books) is wrong. You can put a Cerebras system or Groq LPU in front of older Hopper or Ampere parts, use them only for prefill, and run them until they physically melt. Private credit, which is in pain from SaaS loans and which underwrote GPU loans on three to four year lives, may be saved by this.

    If GPU financing rates can come down from low sevens to five or six percent, the mathematics of the AI buildout improves materially. That is a structural tailwind that compounds for years.
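
    To make the financing point concrete, here is a small sketch using the standard level-payment annuity formula. The loan size, the exact rates (7.2 and 5.5 percent), the terms (four and ten years), and the assumption that GPU loans amortize in level annual payments are all illustrative choices, not details from the conversation.

```python
def annual_payment(principal: float, rate: float, years: int) -> float:
    """Level annual payment that fully amortizes a loan over `years`."""
    if years <= 0:
        raise ValueError("years must be positive")
    if rate == 0:
        return principal / years
    return principal * rate / (1 - (1 + rate) ** -years)


PRINCIPAL = 100.0  # hypothetical loan size, arbitrary units

# Assumed scenario 1: short useful life, "low sevens" financing rate.
short_life = annual_payment(PRINCIPAL, rate=0.072, years=4)
# Assumed scenario 2: long useful life, rate in the five to six percent range.
long_life = annual_payment(PRINCIPAL, rate=0.055, years=10)

print(f"4-year loan at 7.2%:  {short_life:.2f} per year")
print(f"10-year loan at 5.5%: {long_life:.2f} per year")
```

    Under these assumptions the annual cash cost per unit financed falls by more than half, mostly because the principal is spread over more years, with the lower rate adding to the effect.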

    The application layer, the token path, and a new prisoner’s dilemma

    Trillions of dollars of value have been destroyed at the application layer, not created. Cursor and Cognition are the rare scaled exceptions, and they got there by focusing on coding very early. As Amjad Masad noted, coding is plausibly the shortest path to ASI because a coding agent can write itself into any new domain. Jamin Ball’s frame is that the new venture filter is whether the company is in the token path. Databricks is. Most application layer startups are not.

    Jensen could probably get close to the frontier with Nemotron whenever he wants, and the strategic question of whether to do that is a new prisoner’s dilemma. If every frontier lab agrees not to release best models via API, Chinese open source falls steadily behind. If anyone defects, the defector gains revenue and resources, and everyone else has to defect. The same dynamic exists between TSMC, Intel, and Samsung. If Nvidia or AMD ever truly used an alternative foundry, that foundry would catch up rapidly.
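
    The API release dilemma described above has the shape of a textbook prisoner's dilemma, which can be sketched with a toy payoff table. The payoff numbers below are invented purely to show the structure; only their ordering matters, and none of them come from the conversation.

```python
MOVES = ("withhold", "release")

# PAYOFFS[(my_move, rival_move)] -> my payoff. Illustrative numbers only.
PAYOFFS = {
    ("withhold", "withhold"): 3,  # labs jointly keep their lead
    ("withhold", "release"): 0,   # rival captures the API revenue
    ("release", "withhold"): 5,   # defector pulls ahead on revenue
    ("release", "release"): 1,    # everyone releases, the lead erodes
}


def best_response(rival_move: str) -> str:
    """Return the move that maximizes my payoff given the rival's move."""
    return max(MOVES, key=lambda move: PAYOFFS[(move, rival_move)])


# Best reply to each possible rival move.
dominant = {rival_move: best_response(rival_move) for rival_move in MOVES}
print(dominant)

# Mutual withholding still beats mutual release for each lab.
print(PAYOFFS[("withhold", "withhold")] > PAYOFFS[("release", "release")])
```

    With any payoffs ordered this way, releasing is the best reply whatever the rival does, even though every lab would be better off if all of them withheld. That is the sense in which one defection forces everyone else to defect.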

    Rating the hyperscalers

    Google has the largest compute installed base, the YouTube data that matters in a robotics world, and a search business that prints. Their loss of TPU cost leadership is the surprise of the year. If Google I/O in five days does not produce a leapfrog model, the Nvidia centric narrative gets even stronger.

    Meta deserves real credit. Zuckerberg made Meta AI first internally faster than any other internet giant, paid up for the talent contracts when no one else would, and shipped Musa as a first model from MSL that is close to the Pareto frontier. Amazon is well positioned on Trainium, robotics in retail, and a Nova model line that is better than it gets credit for. Microsoft flinched on capex in early 2025 and lost position. Satya Nadella’s current decision to use Microsoft compute for Copilot rather than reselling to OpenAI is courageous and probably correct, even at the cost of stock price.

    The most interesting cross hyperscaler metric is startup engagement. Nvidia and Amazon engage deeply with startups. Google is next. Broadcom is the favored ASIC partner. AMD, Microsoft, and Meta have minimal startup engagement, which Gavin believes will cost them as the best teams now sit at startups.

    Personal safety, geopolitics, and the Pax Americana case

    The closing section turns darker. Personal safety in an AI era requires a family or company safe word that cannot be socially engineered. Deepfake voice and video extortion via something that looks exactly like your child calling on FaceTime is already feasible. Political violence against AI leaders is a real concern. Geopolitically, Ukraine is winning largely because it has the best battlefield AI outside America and Israel. How adversaries respond to that asymmetry is the next great variable.

    Gavin’s optimistic frame is the Pax Americana. After 1945 the US had a nuclear monopoly and could have controlled the world. Instead it rebuilt Germany and Japan, both of which became the most reliable American allies for the next eighty years. If AI dominance plays out similarly, this is a generationally positive story rather than a destabilizing one. The personal anecdote that closes the conversation is a friend whose daughter was diagnosed with a rare genetic condition. He spun up agents, identified a drug already on the market that addresses her mutation, and her life is immeasurably different because of AI. That is the upside.

    Thoughts

    The Anthropic eleven billion in a month framing is the kind of stat that resets priors. The right way to interpret it is not as a one off but as a measure of how fast value can compound when the underlying technology improves on a curve steeper than the ability of the rest of the economy to absorb it. The skeptical question is whether that ARR is durable or whether it is heavily tied to a customer base of other AI companies that are themselves on a single venture funded year of runway. The bullish answer is that frontier coding, frontier research, and frontier enterprise tasks are not going to stop being valuable, and Anthropic is the best at all three. Both can be true. The number is still extraordinary.

    The argument that TSMC discipline is the only thing preventing a bubble is the analytically tightest part of the conversation. The implied trade is to watch TSMC capacity additions like a hawk and to be more, not less, cautious if Intel Foundry or Samsung Foundry ever announce real share at the leading node. The Terafab thesis is more speculative but more interesting. If Elon’s talent recruiting playbook works and the Intel partnership gives Terafab a real seat at the table within five years, the geometry of the global semiconductor industry shifts in a way that is bullish for American manufacturing, bullish for power and water infrastructure in Texas, and ambiguous for TSMC itself.

    The Pareto frontier discussion deserves more attention than it usually gets. Pricing leadership in AI is not a vanity metric. It determines who can subsidize free tier usage, who can absorb compute shortages, who can ship cheaper enterprise plans, and ultimately whose model becomes the default for any given workload. Google losing per token leadership in nine months is one of the most under analyzed events in the sector and it explains a lot about why Anthropic and OpenAI are growing the way they are. If Google I/O does not produce a leapfrog model, the implied verdict on TPU V8 design choices gets a lot harsher.

    The application layer destruction point is worth sitting with. Founders building on top of frontier models are competing in a world where the model itself moves faster than any moat they can build, where the model lab can absorb their niche if it gets interesting, and where the only protection is either deep token path integration or a niche so small the lab does not bother. That is a much harsher venture environment than the early SaaS era. The compensating opportunity is that one human can now run a hundred agents, so the ceiling on what a small team can build is correspondingly higher. The bet is that productivity per founder rises faster than competitive pressure from the labs. We will find out.

    The orbital compute pitch is the section that will polarize listeners. The naive read is that this is science fiction. The closer read is that every component (sun synchronous orbit, laser interconnect, twenty kilowatt satellite buses, ten thousand satellite manufacturing cadence, full rocket reusability) already exists. The remaining engineering problems are repair, maintenance, and radiator scale, all of which are real but tractable on a five to ten year horizon. The strategic implication is that the political and zoning ceiling on terrestrial data centers becomes less binding if orbital compute is a credible alternative for inference workloads. The investor implication is that being short the watts and cooling complex on a five year horizon is a real trade, not a meme.

    Watch the full conversation here.

  • Andrej Karpathy on Vibe Coding vs Agentic Engineering: Why He Feels More Behind Than Ever in 2026

    Andrej Karpathy, co-founder of OpenAI, former head of AI at Tesla, and now founder of Eureka Labs, returned to Sequoia Capital’s AI Ascent 2026 stage for a wide-ranging conversation with partner Stephanie Zhan. One year after coining the term “vibe coding,” Karpathy unpacked what has changed, why he has never felt more behind as a programmer, and why the discipline emerging on top of vibe coding, which he calls agentic engineering, is the more serious craft worth learning right now.

    The conversation covered Software 3.0, the limits of verifiability, why LLMs are better understood as ghosts than animals, and why you can outsource your thinking but never your understanding. Below is a complete breakdown of the talk for anyone building, hiring, or learning in the agent era.

    TLDW

    Karpathy describes a sharp transition that happened in December 2025, when agentic coding tools crossed a threshold and code chunks just started coming out fine without correction. He frames the current moment as Software 3.0, where prompting an LLM is the new programming, and entire app categories are collapsing into a single model call. He distinguishes vibe coding (raising the floor for everyone) from agentic engineering (preserving the professional quality bar at much higher speed). Models remain jagged because they are trained on what labs choose to verify, so founders should look for valuable but neglected verifiable domains. Taste, judgment, oversight, and understanding remain uniquely human responsibilities, and tools that enhance understanding are the ones he is most excited about.

    Key Takeaways

    • December 2025 was a clear inflection point. Code chunks from agentic tools started arriving correct without edits, and Karpathy stopped correcting the system entirely.
    • Software 3.0 means programming has become prompting. The context window is your lever over the LLM interpreter, which performs computation in digital information space.
    • Open Code’s installer is a Software 3.0 example. Instead of a complex shell script, you copy paste a block of text to your agent, and the agent figures out your environment.
    • The MenuGen anecdote illustrates how entire apps can become spurious. What used to require OCR, image generation, and a hosted Vercel app can now be a single Gemini plus Nano Banana prompt.
    • Vibe coding raises the floor. Agentic engineering preserves the professional ceiling. The two are different disciplines.
    • The 10x engineer multiplier is now far higher than 10x for people who are good at agentic engineering.
    • Hiring processes have not caught up. Puzzle interviews are the old paradigm. New evaluations should look like building a full Twitter clone for agents and surviving simulated red team attacks from other agents.
    • Models are jagged because reinforcement learning rewards what is verifiable, and labs choose which verifiable domains to invest in. Strawberry letter counts and the 50 meter car wash question show how state-of-the-art models can refactor 100,000 line codebases yet fail at trivial reasoning.
    • If you are in a verifiable setting, you can run your own fine tuning, build RL environments, and benefit even when the labs are not focused on your domain.
    • LLMs are ghosts, not animals. They are statistical simulations summoned from pre training and shaped by RL appendages, not creatures with curiosity or motivation. Yelling at them does not help.
    • Taste, aesthetics, spec design, and oversight remain human jobs. Models still produce bloated, copy paste heavy code with brittle abstractions.
    • Documentation is still written for humans. Agent native infrastructure, where docs are explicitly designed to be copy pasted into an agent, is a major opportunity.
    • The future likely involves agent representation for people and organizations, with agents talking to other agents to coordinate meetings and tasks.
    • You can outsource your thinking but not your understanding. Tools that help humans understand information faster are uniquely valuable.

    Detailed Summary

    Why Karpathy Feels More Behind Than Ever

    Karpathy opens by describing how he has been using agentic coding tools for over a year. For most of that period, the experience was mixed. The tools could write chunks of code, but they often required edits and supervision. December 2025 changed everything. With more time during a holiday break and the release of newer models, Karpathy noticed that the chunks just came out fine. He kept asking for more. He cannot remember the last time he had to correct the agent. He started trusting the system, and what followed was a cascade of side projects.

    He stresses that anyone whose mental model of AI was formed by ChatGPT in early 2025 needs to look again. A coherent agentic workflow that genuinely works is a fundamentally different experience, and the transition was stark.

    Software 3.0 Explained

    The Software 1.0 paradigm was writing explicit code. Software 2.0 was programming by curating datasets and training neural networks. Software 3.0 is programming by prompting. When you train a GPT class model on a sufficiently large set of tasks, the model implicitly learns to multitask everything in the data. The result is a programmable computer where the context window is your interface, and the LLM is the interpreter performing computation in digital information space.

    Karpathy gives two concrete examples. The first is Open Code’s installer. Normally a shell script handles installation across many platforms, and these scripts balloon in complexity. Open Code instead provides a block of text you copy paste to your agent. The agent reads your environment, follows instructions, debugs in a loop, and gets things working. You no longer specify every detail. The agent supplies its own intelligence.

    The MenuGen Story

    The second example is Karpathy’s MenuGen project. He built an app that takes a photo of a restaurant menu, OCRs the items, generates pictures for each dish, and renders the enhanced menu. The app runs on Vercel and chains together multiple services. Then he saw a Software 3.0 alternative. You take a photo, give it to Gemini, and ask it to use Nano Banana to overlay generated images onto the menu. The model returns a single image with everything rendered. The entire app he built is now spurious. The neural network does the work. The prompt is the photo. The output is the photo. There is no app between them.

    Karpathy uses this to argue that founders should not just think of AI as a speedup of existing patterns. Entirely new things become possible. His example is LLM driven knowledge bases that compile a wiki for an organization from raw documents. That is not a faster version of older code. It is a new capability with no prior equivalent.

    What Will Look Obvious in Hindsight

    Stephanie Zhan asks what the equivalent of building websites in the 1990s or mobile apps in the 2010s looks like today. Karpathy speculates about completely neural computers. Imagine a device that takes raw video and audio as input, runs a neural net as the host process, and uses diffusion to render a unique UI for each moment. He notes that early computing in the 1950s and 60s was undecided between calculator like and neural net like architectures. We went down the calculator path. He thinks the relationship may eventually flip, with neural networks becoming the host and CPUs becoming co processors used for deterministic appendages.

    Verifiability and Jagged Intelligence

    Karpathy has written extensively about verifiability. Classical computers automate what you can specify in code. The current generation of LLMs automates what you can verify. Frontier labs train models inside giant reinforcement learning environments, so the models peak in capability where verification rewards are strong, especially math and code. They stagnate or get rough around the edges elsewhere.

    This explains the jagged intelligence puzzle. The classic example was counting letters in strawberry. The newer one Karpathy offers: a state of the art model will refactor a 100,000 line codebase or find zero day vulnerabilities, then tell you to walk to a car wash 50 meters away because it is so close. The two coexisting capabilities should be jarring. They reveal that you must stay in the loop, treat models as tools, and understand which RL circuits your task lands in.

    He also points out that data distribution choices matter. The jump in chess capability from GPT 3.5 to GPT 4 came largely because someone at OpenAI added a huge amount of chess data to pre training. Whatever ends up in the mix gets disproportionately good. You are at the mercy of what labs prioritize, and you have to explore the model the labs hand you because there is no manual.

    Founder Advice in a Lab Dominated World

    Asked what founders should do given that labs are racing toward escape velocity in obvious verifiable domains, Karpathy points back to verifiability itself. If your domain is verifiable but currently neglected, you can build RL environments and run your own fine tuning. The technology works. Pull the lever with diverse RL environments and a fine tuning framework, and you get something useful. He hints there is one specific domain he finds undervalued but declines to name it on stage.

    On the question of what is automatable only from a distance, Karpathy says almost everything can ultimately be made verifiable. Even writing can be assessed by councils of LLM judges. The differences are in difficulty, not in possibility.
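
    The idea of a council of judges turning a fuzzy quality into a verifiable signal can be sketched as follows. This is a minimal illustration, not anything shown in the talk: the three judge functions are hypothetical stand-ins for separate LLM calls, and the median aggregation and the 0.7 threshold are assumptions chosen for the example.

```python
from statistics import median
from typing import Callable, Sequence

# A judge maps a draft to a score in [0, 1]. In practice each judge would be
# an LLM call; the three below are hypothetical rule-based stand-ins.
Judge = Callable[[str], float]


def length_judge(text: str) -> float:
    """Hypothetical judge: rewards concise drafts (40 words or fewer)."""
    words = len(text.split())
    return 1.0 if words <= 40 else max(0.0, 1.0 - (words - 40) / 100)


def vocabulary_judge(text: str) -> float:
    """Hypothetical judge: rewards varied wording (unique-word ratio)."""
    words = text.lower().split()
    return len(set(words)) / len(words) if words else 0.0


def punctuation_judge(text: str) -> float:
    """Hypothetical judge: checks that the draft ends like a sentence."""
    return 1.0 if text.rstrip().endswith((".", "!", "?")) else 0.0


def council_score(text: str, judges: Sequence[Judge]) -> float:
    """Aggregate independent scores; the median resists a single outlier."""
    return median(judge(text) for judge in judges)


def verify(text: str, judges: Sequence[Judge], threshold: float = 0.7) -> bool:
    """Turn the council's opinion into a pass/fail reward signal."""
    return council_score(text, judges) >= threshold


COUNCIL = [length_judge, vocabulary_judge, punctuation_judge]
```

    Calling verify(draft, COUNCIL) yields the kind of yes or no signal an RL environment needs. The hard part, as the talk suggests, is the quality of the judges, not the aggregation.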

    From Vibe Coding to Agentic Engineering

    Vibe coding raises the floor. Anyone can build something. Agentic engineering preserves the professional quality bar that existed before. You are still responsible for your software. You are still not allowed to ship vulnerabilities. The question is how you go faster without sacrificing standards. Karpathy calls it an engineering discipline because coordinating spiky, stochastic agents to maintain quality at speed requires real skill.

    The ceiling on agentic engineering capability is very high. The old idea of a 10x engineer is now an understatement. People who are good at this peak far above 10x.

    What Mediocre Versus AI Native Looks Like

    Karpathy compares this to how different generations use ChatGPT. The difference between a mediocre and an AI native engineer using Claude Code, Codex, or Open Code is investment in setup and full use of available features. The same way previous generations of engineers got the most out of Vim or VSCode, today’s strong engineers tune their agentic environments deeply.

    He thinks hiring processes have not caught up. Most companies still hand out puzzles. The new test should look like asking a candidate to build a full Twitter clone for agents, make it secure, simulate user activity with agents, and then run multiple Codex 5.4x high instances trying to break it. The candidate’s system should hold up.

    What Humans Still Own

    Agents are intern level entities right now. Humans are responsible for aesthetics, judgment, taste, and oversight. Karpathy describes a MenuGen bug where the agent tried to associate Stripe purchases with Google accounts using email addresses as the key, instead of a persistent user ID. Email addresses can differ between Stripe and Google accounts. This kind of specification level mistake is exactly what humans must catch.
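
    The bug described above can be reproduced in a few lines. The sketch below is illustrative only: the User, Purchase, and Ledger classes are hypothetical, and it simply assumes the internal user ID is attached to the purchase at checkout. How that ID travels through a real payment provider is not covered in the talk.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class User:
    user_id: str        # persistent internal ID, never changes
    login_email: str    # email on the sign-in account
    credits: int = 0


@dataclass
class Purchase:
    user_id: str        # assumed to be attached when the checkout is created
    billing_email: str  # whatever the buyer typed into the payment form
    credits: int


@dataclass
class Ledger:
    users: Dict[str, User] = field(default_factory=dict)

    def add_user(self, user: User) -> None:
        self.users[user.user_id] = user

    def apply_by_email(self, purchase: Purchase) -> bool:
        """Fragile join: fails whenever billing and login emails differ."""
        for user in self.users.values():
            if user.login_email == purchase.billing_email:
                user.credits += purchase.credits
                return True
        return False

    def apply_by_user_id(self, purchase: Purchase) -> bool:
        """Robust join: the key is the ID carried through checkout."""
        user = self.users.get(purchase.user_id)
        if user is None:
            return False
        user.credits += purchase.credits
        return True
```

    A buyer who signs in with one address and pays with another is dropped by apply_by_email and credited correctly by apply_by_user_id. Both versions look equally plausible in a code review, which is why the error has to be caught at the spec level.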

    He works with agents to design detailed specs and treats those as documentation. The agent fills in the implementation. He has stopped memorizing API details for things like NumPy axis arguments or PyTorch reshape versus permute. The intern handles recall. Humans handle architecture, design, and the right questions.
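
    For readers who have never had to keep reshape and permute apart, here is a small pure-Python illustration of the difference. The helper functions are written for this example rather than taken from NumPy or PyTorch. They mirror the idea behind those libraries' reshape and transpose or permute operations: reshape regroups elements in their existing order, while a permute swaps axes and so changes which elements sit next to each other.

```python
def reshape(flat, rows, cols):
    """Regroup elements, keeping their existing order, into rows x cols."""
    assert len(flat) == rows * cols
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def transpose(matrix):
    """Swap the two axes, so the element at [i][j] moves to [j][i]."""
    return [list(column) for column in zip(*matrix)]


flat = [0, 1, 2, 3, 4, 5]
a = reshape(flat, 2, 3)          # [[0, 1, 2], [3, 4, 5]]
reshaped = reshape(flat, 3, 2)   # [[0, 1], [2, 3], [4, 5]]
permuted = transpose(a)          # [[0, 3], [1, 4], [2, 5]]
```

    Both results have shape 3 by 2, yet they hold different layouts. This is the kind of detail that is easy to forget and cheap for an agent to recall.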

    Reading the actual code agents produce can still cause heart attacks. It is bloated, full of copy paste, riddled with awkward and brittle abstractions. His MicroGPT project, an attempt to simplify LLM training to its bare essence, was nearly impossible to drive through agents. The models hate simplification. That capability sits outside their RL circuits. Nothing is fundamentally preventing this from improving. The labs simply have not invested.

    Animals Versus Ghosts

    Karpathy returns to his framing that we are not building animals, we are summoning ghosts. Animal intelligence comes from evolution and is shaped by intrinsic motivation, fun, curiosity, and empowerment. LLMs are statistical simulation circuits where pre training is the substrate and RL is bolted on as appendages. They are jagged. They do not respond to being yelled at. They have no real curiosity. The ghost framing is partly philosophical, but it changes how you approach them. You stay suspicious. You explore. You do not assume the system you used yesterday will behave the same on a new task.

    Agent Native Infrastructure

    Most software, frameworks, libraries, and documentation are still written for humans. Karpathy’s pet peeve is being told to do something instead of being given a block of text to copy paste to his agent. He wants agent first infrastructure. The MenuGen project’s hardest part was not writing code. It was deploying on Vercel, configuring DNS, navigating service settings, and stringing together integrations. He wants to give a single prompt and have the entire thing deployed without touching anything.

    Long term he expects agent representation for individuals and organizations. His agent will negotiate meeting details with your agent. The world becomes one of sensors, actuators, and agent native data structures legible to LLMs.

    Education and What Still Matters

    The most striking line of the conversation comes near the end. Karpathy quotes a tweet that shaped his thinking: you can outsource your thinking but you cannot outsource your understanding. Information still has to make it into your brain. You still need to know what you are building and why. You cannot direct agents well if you do not understand the system.

    This is part of why he is so excited about LLM driven knowledge bases. Every time he reads an article, his personal wiki absorbs it, and he can query it from new angles. Every projection onto the same information yields new insight. Tools that enhance human understanding are uniquely valuable because LLMs do not excel at understanding. That bottleneck is yours to manage.

    Thoughts

    The most useful frame in this talk is the distinction between vibe coding and agentic engineering. It clarifies what has been muddled for the past year. Vibe coding is about access. Anyone can produce something. Agentic engineering is about discipline. You preserve the standards that made software trustworthy in the first place, while moving at speeds that would have seemed absurd two years ago. These are not the same activity, and conflating them is part of why so many shipped products feel half built.

    The MenuGen anecdote is the kind of story that should make every solo developer pause. If a single Gemini plus Nano Banana prompt can replace a multi service Vercel deployed app, the question for any builder becomes how much of what you are working on right now is going to be made spurious by the next model release. The honest answer is probably more than you want to admit. The defensive posture is not building thicker apps. It is choosing problems where the model alone is not enough, where taste, distribution, infrastructure, or specific verifiable RL environments give you something the next model cannot collapse into a prompt.

    The verifiability lens is also unusually practical. If you are a solo builder, the question shifts from what is possible to what is verifiable but neglected. The labs will eat the obvious verifiable domains because that is how their RL pipelines are set up. The opportunity is in domains where verification is possible but the labs have not yet invested. That is a much more concrete strategic filter than vague intuitions about defensibility.

    The car wash example is going to stick. State of the art models can refactor enormous codebases and still tell you to walk somewhere a sane person would drive. That is the lived reality of jagged intelligence, and it argues strongly for staying in the loop on real decisions rather than handing off everything to agents. The agents are excellent fillers of blanks. They are not yet trustworthy specifiers of the spec.

    Finally, the line about outsourcing thinking but not understanding is worth taping above the desk. The bottleneck is no longer typing speed, syntax recall, or even API knowledge. It is whether the human in the loop actually understands the system being built. Tools that genuinely improve human understanding, including personal knowledge bases that re project information through different prompts, are likely the most undervalued category of products being built right now. The opportunity is not just in agents. It is in the cognitive scaffolding that makes humans good directors of agents.

  • Andrej Karpathy on AutoResearch, AI Agents, and Why He Stopped Writing Code: Full Breakdown of His 2026 No Priors Interview

    TL;DW

    Andrej Karpathy sat down with Sarah Guo on the No Priors podcast (March 2026) and delivered one of the most information-dense conversations about the current state of AI agents, autonomous research, and the future of software engineering. The core thesis: since December 2025, Karpathy has essentially stopped writing code by hand. He now “expresses his will” to AI agents for 16 hours a day, and he believes we are entering a “loopy era” where autonomous systems can run experiments, train models, and optimize hyperparameters without a human in the loop. His project AutoResearch proved this works by finding improvements to a model he had already hand-tuned, drawing on two decades of experience. The conversation also covers the death of bespoke apps, the future of education, open vs. closed source models, robotics, job market impacts, and why Karpathy chose to stay independent from frontier labs.

    Key Takeaways

    1. The December 2025 Shift Was Real and Dramatic

    Karpathy describes a hard flip that happened in December 2025 where he went from writing 80% of his own code to writing essentially none of it. He says the average software engineer’s default workflow has been “completely different” since that month. He calls this state “AI psychosis” and says he feels anxious whenever he is not at the forefront of what is possible with these tools.

    2. AutoResearch: Agents That Do AI Research Autonomously

    AutoResearch is Karpathy’s project where an AI agent is given an objective metric (like validation loss), a codebase, and boundaries for what it can change. It then loops autonomously, running experiments, tweaking hyperparameters, modifying architectures, and committing improvements without any human in the loop. When Karpathy ran it overnight on a model he had already carefully tuned by hand over years, it found optimizations he had missed, including forgotten weight decay on value embeddings and insufficiently tuned Adam betas.

    3. The Name of the Game Is Removing Yourself as the Bottleneck

    Karpathy frames the current era as a shift from optimizing your own productivity to maximizing your “token throughput.” The goal is to arrange tasks so that agents can run autonomously for extended periods. You are no longer the worker. You are the orchestrator, and every minute you spend in the loop is a minute the system is held back.

    4. Mastery Now Means Managing Multiple Agents in Parallel

    The vision of mastery is not writing better code. It is managing teams of agents simultaneously. Karpathy references Peter Steinberger’s workflow of having 10+ Codex agents running in parallel across different repos, each taking about 20 minutes per task. You move in “macro actions” over your codebase, delegating entire features rather than writing individual functions.

    5. Personality and Soul Matter in Coding Agents

    Karpathy praises Claude’s personality, saying it feels like a teammate who gets excited about what you are building. He contrasts this with Codex, which he calls “very dry” and disengaged. He specifically highlights that Claude’s praise feels earned because it does not react equally to half-baked ideas and genuinely good ones. He credits Peter (OpenClaw) with innovating on the “soul” of an agent through careful prompt design, memory systems, and a unified WhatsApp interface.

    6. Apps Are Dead. APIs and Agents Are the Future.

    Karpathy built “Dobby the Elf Claw,” a home automation agent that controls his Sonos, lights, HVAC, shades, pool, spa, and security cameras through natural language over WhatsApp. He did this by having agents scan his local network, reverse-engineer device APIs, and build a unified dashboard. His conclusion: most consumer apps should not exist. Everything should be API endpoints that agents can call on behalf of users. The “customer” of software is increasingly the agent, not the human.

    7. AutoResearch Could Become a Distributed Computing Project

    Karpathy envisions an “AutoResearch at Home” model inspired by SETI@home and Folding@home. Because it is expensive to find code optimizations but cheap to verify them (just run the training and check the metric), untrusted compute nodes on the internet could contribute experimental results. He draws an analogy to blockchain: instead of blocks you have commits, instead of proof of work you have expensive experimentation, and instead of monetary reward you have leaderboard placement. He speculates that a global swarm of agents could potentially outperform frontier labs.

    8. Education Is Being Redirected Through Agents

    Karpathy describes his MicroGPT project, a 200-line distillation of LLM training to its bare essence. He says he started to create a video walkthrough but realized that is no longer the right format. Instead, he now “explains things to agents,” and the agents can then explain them to individual humans in their own language, at their own pace, with infinite patience. He envisions education shifting to “skills” (structured curricula for agents) rather than lectures or guides for humans directly.

    9. The Jaggedness Problem Is Still Real

    Karpathy describes current AI agents as simultaneously feeling like a “brilliant PhD student who has been a systems programmer their entire life” and a 10-year-old. He calls this “jaggedness,” and it stems from reinforcement learning only optimizing for verifiable domains. Models can move mountains on agentic coding tasks but still tell the same bad joke they told four years ago (“Why don’t scientists trust atoms? Because they make everything up.”). Things outside the RL reward loop remain stuck.

    10. Open Source Is Healthy and Necessary, Even If Behind

    Karpathy estimates open source models are now roughly 6 to 8 months behind closed frontier models, down from 18 months and narrowing. He draws a parallel to Linux: the industry has a structural need for a common, open platform. He is “by default very suspicious” of centralization and wants more labs, more voices in the room, and an “ensemble” approach to AI governance. He thinks it is healthy that open source exists slightly behind the frontier, eating through basic use cases while closed models handle “Nobel Prize kind of work.”

    11. Digital Transformation Will Massively Outpace Physical Robotics

    Karpathy predicts a clear ordering: first, a massive wave of “unhobbling” in the digital space where everything gets rewired and made 100x more efficient. Then, activity moves to the interface between digital and physical (sensors, cameras, lab equipment). Finally, the physical world itself transforms, but on a much longer timeline because “atoms are a million times harder than bits.” He notes that robotics requires enormous capital expenditure and conviction, and most self-driving startups from 10 years ago did not survive long term.

    12. Why Karpathy Stays Independent From Frontier Labs

    Karpathy gives a nuanced answer about why he is not working at a frontier lab. He says employees at these labs cannot be fully independent voices because of financial incentives and social pressure. He describes this as a fundamental misalignment: the people building the most consequential technology are also the ones who benefit most from it financially. He values being “more aligned with humanity” outside the labs, though he acknowledges his judgment will inevitably drift as he loses visibility into what is happening at the frontier.

    Detailed Summary

    The AI Psychosis and the End of Hand-Written Code

    The conversation opens with Karpathy describing what he calls a state of perpetual “AI psychosis.” Since December 2025, he has not typed a line of code. The shift was not gradual. It was a hard flip from doing 80% of his own coding to doing almost none. He compares the anxiety of unused agent capacity to the old PhD feeling of watching idle GPUs. Except now, the scarce resource is not compute. It is tokens, and you feel the pressure to maximize your token throughput at all times.

    He describes the modern workflow: you have multiple coding agents (Claude Code, Codex, or similar harnesses) running simultaneously across different repositories. Each agent takes about 20 minutes on a well-scoped task. You delegate entire features, review the output, and move on. The job is no longer typing. It is orchestration. And when it does not work, the overwhelming feeling is that it is a “skill issue,” not a capability limitation.

    Karpathy says most people, even his own parents, do not fully grasp how dramatic this shift has been. The default workflow of any software engineer sitting at a desk today is fundamentally different from what it was six months ago.

    AutoResearch: Closing the Loop on AI Research

    The centerpiece of the conversation is AutoResearch, Karpathy’s project for fully autonomous AI research. The setup is deceptively simple: give an agent an objective metric (like validation loss on a language model), a codebase to modify, and boundaries for what it can change. Then let it loop. It generates hypotheses, runs experiments, evaluates results, and commits improvements. No human in the loop.

    Karpathy was surprised it worked as well as it did. He had already hand-tuned his NanoGPT-derived training setup over years using his two decades of experience. When he let AutoResearch run overnight, it found improvements he had missed. The weight decay on value embeddings was forgotten. The Adam optimizer betas were not sufficiently tuned. These are the kinds of things that interact with each other in complex ways that a human researcher might not systematically explore.

    The deeper insight is structural: everything around frontier-level intelligence is about extrapolation and scaling laws. You do massive exploration on smaller models and then extrapolate to larger scales. AutoResearch is perfectly suited for this because the experimentation is expensive but the verification is cheap. Did the validation loss go down? Yes or no.
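
    The loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not Karpathy's actual AutoResearch code: validation_loss stands in for a full training run, propose stands in for the agent's hypothesis generation, and the target values inside the toy objective are invented for the example.

```python
import random
from typing import Dict, List, Tuple

Config = Dict[str, float]


def validation_loss(config: Config) -> float:
    """Stand-in for a training run; lower is better.

    A toy quadratic bowl whose minimum sits at weight_decay=0.1, beta2=0.95.
    These targets are invented purely for illustration.
    """
    return (config["weight_decay"] - 0.1) ** 2 + (config["beta2"] - 0.95) ** 2


def propose(config: Config, rng: random.Random) -> Config:
    """Stand-in for the agent's hypothesis: nudge one permitted knob."""
    candidate = dict(config)
    key = rng.choice(sorted(candidate))
    candidate[key] += rng.gauss(0.0, 0.05)
    return candidate


def auto_research(start: Config, steps: int, seed: int = 0) -> Tuple[Config, float, List[float]]:
    """Propose, evaluate, and keep only changes that lower the metric."""
    rng = random.Random(seed)
    best, best_loss = dict(start), validation_loss(start)
    history = [best_loss]
    for _ in range(steps):
        candidate = propose(best, rng)
        loss = validation_loss(candidate)
        if loss < best_loss:  # "commit" only verified improvements
            best, best_loss = candidate, loss
            history.append(loss)
    return best, best_loss, history
```

    The structure is the point: the search step can be arbitrarily clever or expensive, while the acceptance test is a single comparison against an objective metric.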

    Karpathy envisions this scaling beyond a single machine. His “AutoResearch at Home” concept borrows from distributed computing projects like Folding@home. Because verification is cheap but search is expensive, you can accept contributions from untrusted workers across the internet. He draws a blockchain analogy: commits instead of blocks, experimentation as proof of work, leaderboard placement as reward. A global swarm of agents contributing compute could, in theory, rival frontier labs that have massive but centralized resources.
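
    Accepting results from untrusted workers hinges on re-running the cheap check rather than believing the claim. The sketch below shows that asymmetry. The Submission and Leaderboard names, the toy evaluate function, and the tolerance are all assumptions made for illustration; the interview does not describe a concrete protocol.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Config = Dict[str, float]


@dataclass
class Submission:
    worker: str
    config: Config
    claimed_loss: float


def evaluate(config: Config) -> float:
    """Toy stand-in for the verification run; lower is better."""
    return (config["lr"] - 0.3) ** 2


class Leaderboard:
    """Coordinator that re-measures every claim before accepting it."""

    def __init__(self, evaluate_fn: Callable[[Config], float],
                 baseline: Config, tolerance: float = 1e-9) -> None:
        self.evaluate_fn = evaluate_fn
        self.best_config = dict(baseline)
        self.best_loss = evaluate_fn(baseline)
        self.tolerance = tolerance
        self.accepted: List[str] = []

    def submit(self, sub: Submission) -> bool:
        measured = self.evaluate_fn(sub.config)  # never trust the claim
        honest = abs(measured - sub.claimed_loss) <= self.tolerance
        if honest and measured < self.best_loss:
            self.best_config, self.best_loss = dict(sub.config), measured
            self.accepted.append(sub.worker)
            return True
        return False
```

    A worker that inflates its result is rejected at the cost of one evaluation, while the expensive search that produced an honest improvement was paid for entirely by the contributor.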

    The Claw Paradigm and the Death of Apps

    Karpathy introduces the concept of the “claw,” a persistent, looping agent that operates in its own sandbox, has sophisticated memory, and works on your behalf even when you are not watching. This goes beyond a single chat session with an AI. A claw has persistence, autonomy, and the ability to interact with external systems.

    His personal example is “Dobby the Elf Claw,” a home automation agent that controls his entire smart home through WhatsApp. The agent scanned his local network, found his Sonos speakers, reverse-engineered the API, and started playing music in three prompts. It did the same for his lights, HVAC, shades, pool, spa, and security cameras (using a Qwen vision model for change detection on camera feeds).

    The broader point is that this renders most consumer apps unnecessary. Why maintain six different smart home apps when a single agent can call all the APIs directly? Karpathy argues the industry needs to reconfigure around the idea that the customer is increasingly the agent, not the human. Everything should be exposed API endpoints. The intelligence layer (the LLM) is the glue that ties it all together.
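
    One way to picture "everything is an endpoint the agent can call" is a small registry that maps names to functions. The DeviceHub class, the endpoint names, and the shape of the tool call below are hypothetical, and the device functions only return strings instead of talking to real hardware.

```python
from typing import Callable, Dict, List


class DeviceHub:
    """Registry of named endpoints that an agent can invoke."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, Callable[..., str]] = {}

    def endpoint(self, name: str) -> Callable:
        """Decorator that registers a function under an endpoint name."""
        def register(fn: Callable[..., str]) -> Callable[..., str]:
            self._endpoints[name] = fn
            return fn
        return register

    def list_endpoints(self) -> List[str]:
        """What the agent would be shown as its available tools."""
        return sorted(self._endpoints)

    def call(self, name: str, **kwargs) -> str:
        if name not in self._endpoints:
            raise KeyError(f"unknown endpoint: {name}")
        return self._endpoints[name](**kwargs)


hub = DeviceHub()


@hub.endpoint("speaker.play")
def play(room: str, track: str) -> str:
    return f"playing {track} in {room}"


@hub.endpoint("lights.set")
def set_lights(room: str, on: bool) -> str:
    return f"{room} lights {'on' if on else 'off'}"


# A parsed tool call as a model might emit it (hypothetical format).
tool_call = {"name": "lights.set", "arguments": {"room": "kitchen", "on": False}}
result = hub.call(tool_call["name"], **tool_call["arguments"])
```

    Nothing here is specific to smart homes. The design choice is that the human-facing UI disappears and the only contract left is the endpoint name and its arguments.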

    He predicts this will become table stakes within a few years. Today it requires vibe coding and direct agent interaction. Soon, even open source models will handle this trivially. The barrier will come down until every person has a claw managing their digital life through natural language.

    Model Jaggedness and the Limits of Reinforcement Learning

    One of the most technically interesting sections covers what Karpathy calls “jaggedness.” Current AI models are simultaneously superhuman at verifiable tasks (coding, math, structured reasoning) and surprisingly mediocre at anything outside the RL reward loop. His go-to example: ask any frontier model to tell you a joke, and you will get the same one from four years ago. “Why don’t scientists trust atoms? Because they make everything up.” The models have improved enormously, but joke quality has not budged because it is not being optimized.

    This jaggedness creates an uncanny valley in interaction. Karpathy describes the experience as talking to someone who is simultaneously a brilliant PhD systems programmer and a 10-year-old. Humans have some variance in ability across domains, but nothing like this. The implication is that the narrative of “general intelligence improving across all domains for free as models get smarter” is not fully accurate. There are blind spots, and they cluster around anything that lacks objective evaluation criteria.

    He and Sarah Guo discuss whether this should lead to model “speciation,” where specialized models are fine-tuned for specific domains rather than one monolithic model trying to be good at everything. Karpathy thinks speciation makes sense in theory (like the diversity of brains in the animal kingdom) but says the science of fine-tuning without losing capabilities is still underdeveloped. The labs are still pursuing monocultures.

    Open Source, Centralization, and Power Balance

    Karpathy, a long-time open source advocate, estimates the gap between closed and open source models has narrowed from 18 months to roughly 6 to 8 months. He draws a direct parallel to Linux: despite closed alternatives like Windows and macOS, the industry structurally needs a common open platform. By his estimate, Linux runs on 60%+ of computers because businesses need a shared foundation they feel safe using.

    The challenge for open source AI is capital expenditure. Training frontier models is astronomically expensive, and that is where the comparison to Linux breaks down somewhat. But Karpathy argues the current dynamic is actually healthy: frontier labs push the bleeding edge with closed models, open source follows 6 to 8 months behind, and that trailing capability is still enormously powerful for the vast majority of use cases.

    He expresses deep skepticism about centralization, citing his Eastern European background and the historical track record of concentrated power. He wants more labs, more independent voices, and an “ensemble” approach to decision-making about AI’s future. He worries about the current trend of further consolidation even among the top labs.

    The Job Market: Digital Unhobbling and the Jevons Paradox

    Karpathy recently published an analysis of Bureau of Labor Statistics jobs data, color-coded by which professions primarily manipulate digital information versus physical matter. His thesis: digital professions will be transformed first and fastest because bits are infinitely easier to manipulate than atoms. He calls this “unhobbling,” the release of a massive overhang of digital work that humans simply did not have enough thinking cycles to process.

    On whether this means fewer software engineering jobs, Karpathy is cautiously optimistic. He invokes the Jevons Paradox: when something becomes cheaper, demand often increases so much that total consumption goes up. The example he cites is ATMs and bank tellers. ATMs were supposed to replace tellers, but they made bank branches cheaper to operate, leading to more branches and more tellers (at least until 2010). Similarly, if AI makes software dramatically cheaper, the demand for software could explode because it was previously constrained by scarcity and cost.
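
    The arithmetic behind the Jevons argument can be sketched with a constant-elasticity demand curve. This is an illustrative assumption, not a model from the conversation: quantity demanded is taken to be `k * price ** -elasticity`, and every number below is made up. Under that assumption, total spending rises after a price cut only when elasticity is greater than 1.

```python
def quantity_demanded(price: float, elasticity: float, k: float = 100.0) -> float:
    """Constant-elasticity demand: Q = k * P^(-elasticity)."""
    return k * price ** (-elasticity)


def total_spending(price: float, elasticity: float, k: float = 100.0) -> float:
    """Total spending is price times quantity demanded at that price."""
    return price * quantity_demanded(price, elasticity, k)


# Hypothetical scenario: AI cuts the cost of a unit of software tenfold.
old_price, new_price = 1.0, 0.1

for elasticity in (0.5, 1.0, 2.0):
    before = total_spending(old_price, elasticity)
    after = total_spending(new_price, elasticity)
    print(f"elasticity={elasticity}: spending {before:.1f} -> {after:.1f}")
```

    With elasticity 2, the tenfold price cut raises quantity a hundredfold, so total spending goes up tenfold. With elasticity 0.5, spending falls. The optimistic case for software jobs depends on demand for software being in the first regime.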

    He emphasizes that the physical world will lag behind significantly. Robotics requires enormous capital, conviction, and time. Most self-driving startups from a decade ago failed. The interesting opportunities in the near term are at the interface between digital and physical: sensors feeding data to AI systems, actuators executing AI decisions in the real world, and new markets for information (he imagines prediction markets where agents pay for real-time photos from conflict zones).

    Education in the Age of Agents

    Karpathy’s MicroGPT project distills the entire LLM training process into 200 lines of Python. He started making an explanatory video but stopped, realizing the format is obsolete. If the code is already that simple, anyone can ask an agent to explain it in whatever way they need: different languages, different skill levels, infinite patience, multiple approaches. The teacher’s job is no longer to explain. It is to create the thing that is worth explaining, and then let agents handle the last mile of education.

    He envisions a future where education shifts from “guides and lectures for humans” to “skills and curricula for agents.” A skill is a set of instructions that tells an agent how to teach something, what progression to follow, what to emphasize. The human educator becomes a curriculum designer for AI tutors. Documentation shifts from HTML for humans to markdown for agents.
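
    One way to picture a skill is as a small structured object rendered to markdown for the agent. The sketch below is only an illustration of the description above: the `Skill` class, its field names, and the example content are hypothetical, not a format Karpathy or any agent framework specifies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Skill:
    """A teaching skill: what to teach, in what order, with what emphasis."""
    topic: str
    progression: List[str]
    emphasize: List[str]

    def to_markdown(self) -> str:
        """Render the skill as markdown an agent could be handed as instructions."""
        lines = [f"# Skill: teach {self.topic}", "", "## Progression"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.progression, start=1)]
        lines += ["", "## Emphasize"]
        lines += [f"- {point}" for point in self.emphasize]
        return "\n".join(lines)


# Hypothetical example content, not taken from any real curriculum.
skill = Skill(
    topic="how a small GPT is trained",
    progression=[
        "Tokenize text",
        "Predict the next token",
        "Compute the loss",
        "Update the weights",
    ],
    emphasize=[
        "Adapt to the learner's level",
        "Check understanding before moving on",
    ],
)
print(skill.to_markdown())
```

    The educator writes the progression and emphasis once; the agent then handles the per-learner explanation.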

    His punchline: “The things that agents can do, they can probably do better than you, or very soon. The things that agents cannot do is your job now.” For MicroGPT, the 200-line distillation is his unique contribution. Everything else, the explanation, the teaching, the Q&A, is better handled by agents.

    Why Not Return to a Frontier Lab?

    The conversation closes with a nuanced discussion about why Karpathy remains independent. He identifies several tensions. First, financial alignment: employees at frontier labs have enormous financial incentives tied to the success of transformative (and potentially disruptive) technology. This creates a conflict of interest when it comes to honest public discourse. Second, social pressure: even without arm-twisting, there are things you cannot say and things the organization wants you to say. You cannot be a fully free agent. Third, impact: he believes his most impactful contributions may come from an “ecosystem level” role rather than being one of many researchers inside a lab.

    However, he acknowledges a real cost. Being outside frontier labs means his judgment will inevitably drift. These systems are opaque, and understanding how they actually work under the hood requires being inside. He floats the idea of periodic stints at frontier labs, going back and forth between inside and outside roles to maintain both independence and technical grounding.

    Thoughts

    This is one of the most honest and technically grounded conversations about the current state of AI I have heard in 2026. A few things stand out.

    The AutoResearch concept is genuinely important. Not because autonomous hyperparameter tuning is new, but because Karpathy is framing the entire problem correctly: the goal is not to build better tools for researchers. It is to remove researchers from the loop entirely. The fact that an overnight run found optimizations that a world-class researcher missed after years of manual tuning is a powerful data point. And the distributed computing vision (AutoResearch at Home) could be the most consequential idea in the entire conversation if someone builds it well.

    The “death of apps” framing deserves more attention. Karpathy’s Dobby example is not a toy demo. It is a preview of how every consumer software company’s business model gets disrupted. If agents can reverse-engineer APIs and unify disparate systems through natural language, the entire app ecosystem becomes a commodity layer beneath an intelligence layer. The companies that survive will be the ones that embrace API-first design and accept that their “user” is increasingly an LLM.

    The jaggedness observation is underappreciated. The fact that models can autonomously improve training code but cannot tell a new joke should be deeply uncomfortable for anyone claiming we are on a smooth path to AGI. It suggests that current scaling and RL approaches produce narrow excellence, not general intelligence. The joke example is funny, but the underlying point is serious: we are building systems with alien capability profiles that do not match any human intuition about what “smart” means.

    Finally, Karpathy’s decision to stay independent is itself an important signal. When one of the most capable AI researchers in the world says he feels “more aligned with humanity” outside of frontier labs, that should be taken seriously. His point about financial incentives and social pressure creating misalignment is not abstract. It is structural. And his proposed solution of rotating between inside and outside roles is pragmatic and worth consideration for the entire field.