PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: OpenAI

  • Jensen Huang at Stanford CS153 Frontier Systems on Co-Design, Agentic Computing, Vera Rubin, Open Models, and the Million-X Decade That Reshaped AI Infrastructure

    https://www.youtube.com/watch?v=tsQB0n0YV3k

    NVIDIA CEO Jensen Huang returned to Stanford for the CS153 Frontier Systems class (the room nicknamed itself “AI Coachella”) to lay out, in raw form, how he thinks about the computer being reinvented for the first time in over sixty years. Across roughly seventy minutes of student questions he walks through the codesign philosophy that gave NVIDIA a million-x decade, the architectural through-line from Hopper to Grace Blackwell to Vera Rubin to Feynman, the case for open source foundation models, the realities of tokens per watt and MFU, energy demand running a thousand times higher, the China and export-control debate, and his own biggest strategic mistakes. Watch the full conversation on YouTube.

    TLDW

    Huang argues every layer of computing has changed: the programming model, the system architecture, the deployment pattern, the economics. Co-design across CPUs, GPUs, networking, storage, switches and compilers gave NVIDIA roughly a million-x speed-up over ten years versus the ten-x Moore’s Law era, and that headroom is what let researchers say “just train on the whole internet.” Hopper was built for pre-training, Grace Blackwell NVLink72 for inference and reasoning (50x over Hopper in two years), Vera Rubin is built for agents that load long memory, call tools and need a low-latency single-threaded CPU bolted directly to the GPU, and Feynman extends that to swarms of agents that spawn sub-agents. Open weights matter because safety, sovereignty (230-plus languages no one else will fund) and domain models for biology, autonomy, robotics and climate need a foundation that NVIDIA is willing to seed. Compute is not really the scarce resource (Huang says place the order and the chips ship), the broken thing is institutional budgeting that can’t put a billion dollars into a shared university supercomputer. Energy demand is heading a thousand times higher and this is finally the moment market forces alone will fund sustainable generation. On geopolitics he rejects the GPUs-as-atomic-bombs framing and warns America will end up like its telecom industry if it cedes two thirds of the world. On career he advises seeking suffering on purpose. On strategy he says observe, reason from first principles, build a mental model, work backwards, minimize opportunity cost, maximize optionality.

    Key Takeaways

    • The computing model has been substantially unchanged since the IBM System 360, sixty-plus years ago. Huang’s first computer architecture book was the System 360 manual. AI is the first true reinvention.
    • Old computing was pre-recorded retrieval. New computing is generated, contextually aware and continuous. Cloud was on-demand. Agentic systems run continuously.
    • Codesign is NVIDIA’s central thesis. Inherited from the Hennessy and Patterson RISC era at Stanford, extended across CPUs, GPUs, networking, switches, storage, compilers and frameworks all optimized together.
    • The result of full-stack codesign: roughly 1,000,000x faster compute over ten years, versus a generous 10x to 100x for Moore’s Law in the same period. Dennard scaling effectively ended a decade ago.
    • That million-x speed-up is what unlocked “train on all of the internet” as a realistic AI strategy.
    • After GPT, Huang says it was obvious thinking was next. Reasoning is just generating tokens consumed internally, then using tools is generating tokens consumed externally. Agentic systems followed predictably.
    • Education needs AI baked into the curriculum, not just taught as a subject. Pre-recorded textbooks cannot keep pace with knowledge being generated in real time.
    • Huang says he cannot learn anymore without AI. He has the AI read the paper, then read every related paper, then become a dedicated researcher he can interrogate.
    • Mead and Conway and the first-principles methodology of semiconductor design are still worth learning even though most of the scaling tricks have been exhausted.
    • NVIDIA itself is one of the largest consumers of Anthropic and OpenAI tokens in the world. One hundred percent of NVIDIA engineers are now agentically supported. Huang recommends Claude and similar tools by name and says open-source downloads will not match the integrated product harness.
    • NVIDIA still invests heavily in open foundation models because language and intelligence represent the codification of human knowledge. Five pillars: Nemotron (language), BioNeMo (biology), Alphamayo (autonomous vehicles), Groot (humanoid robotics) and a climate science model (mesoscale multiphysics).
    • Sovereign language models matter. Roughly 230 world languages will never be a top priority for a commercial frontier lab. Nemotron is near-frontier and fully fine-tunable so any country can adapt it.
    • Safety and security require open weights. You cannot defend against or audit a black box. Transparent systems let researchers interrogate models and let defenders deploy swarms.
    • The future of cyber defense is not bigger-model-versus-bigger-model. It is trillions of cheap fast small models like Nemotron Nano surrounding the threat.
    • Domain models fuse language priors with world models. Alphamayo learned to drive safely on a few million miles instead of billions because it can reason like a human about the road.
    • MFU (Model Flops Utilization) is a misleading metric. Huang says he wants low MFU, because that means he over-provisioned every resource and never gets pinned by Amdahl’s law during a spike.
    • The xAI Memphis cluster running at 11 percent MFU is not necessarily a failure mode. In disaggregated prefill plus decode inference you can deliver very high tokens per watt with very low MFU.
    • The right metric is performance, ultimately tokens per watt as a proxy for intelligence per watt, and even that needs adjustment because not all tokens are equal. Coding tokens are worth more than other tokens.
    • Hopper was designed for pre-training. NVIDIA chose to build multi-billion-dollar systems when the largest existing scientific supercomputer cost $350 million, with no proven customer base. It worked.
    • Grace Blackwell NVLink72 was designed for inference, especially the high-memory-bandwidth decode phase. It is the world’s first rack-scale computer and delivered a 50x speed-up over Hopper in two years, against an expected 2x from Moore’s Law.
    • Vera Rubin is designed for agents. Long-term memory wired into storage and into the GPU fabric, working memory, heavy tool use, and Vera, a CPU optimized for low-latency multi-core single-threaded code so a multi-billion-dollar GPU system does not stall waiting on a slow tool call.
    • Feynman is being shaped for swarms of agents with sub-agents and sub-sub-agents, a recursive software topology that demands a new compute pattern.
    • Tokens per watt improved 50x in one generation. Compounding energy efficiency is the lever NVIDIA controls directly.
    • Total compute energy demand is heading roughly a thousand times higher than today, possibly two orders of magnitude beyond that. Huang says he would not be surprised if the estimate is low.
    • For the first time in history, market forces alone are enough to fund solar, nuclear and grid upgrades. Government subsidies are no longer required to make sustainable energy investment rational.
    • Copper interconnect is becoming a bottleneck. Photonics is moving from optional to structural inside racks and across them.
    • Comparing NVIDIA GPUs to atomic bombs, Huang says, is a stupid analogy. A billion people use NVIDIA GPUs. He advocates them to his family. He does not advocate atomic bombs to anyone.
    • If the United States cedes two thirds of the global market to competitors on policy grounds, the American technology industry will end up like American telecommunications, which was policied out of existence.
    • Huang directly rejects AI doom-by-singularity narratives. It is not true that we have no idea how these systems work. It is not true that the technology becomes infinitely powerful in a nanosecond. He calls the rhetoric irresponsible and harmful to the field students are about to enter.
    • On Stanford specifically: if the university president places an order, NVIDIA will deliver the chips. The bottleneck is that no university department has a billion-dollar compute budget because budgeting is fragmented across grants. Stanford’s $40 billion endowment is more than enough to fix that.
    • “It’s Stanford’s fault” is meant as empowerment. If something is your fault, you can solve it.
    • Career advice: do not optimize purely for passion. Most people do not yet know what they love. Pick the job in front of you and do it as well as possible. Even as CEO, Huang says, 90 percent of the work is hard and he suffers through it.
    • Suffering on purpose builds the muscle of resilience. When the company, the team or the family needs you to be tough, that muscle has to already exist.
    • NVIDIA’s first generation of products was technically wrong in nearly every dimension: curved surfaces instead of triangles, no Z-buffer, forward instead of inverse texture mapping, no floating point. The strategic recovery, not the technology, taught Huang the lessons that have lasted decades.
    • The biggest clean strategic mistake Huang names is the move into mobile chips (Tegra). It grew to a billion dollars then went to zero when Qualcomm’s modem dominance shut NVIDIA out of the 3G to 4G transition. The recovery into automotive and robotics (the Thor chip is the great great great grandson of that mobile lineage) was real, but Huang refuses to rationalize the original choice.
    • Forecasting framework: observe, reason from first principles, ask “so what” and “what next” until you have a mental model of the future, place your company inside that model, then work backwards while minimizing opportunity cost and maximizing optionality.
    • Best part of the CEO job: living at the intersection of vision, strategy and execution surrounded by people capable enough to make ambitious visions real. Worst part: the responsibility for everyone who joined the spaceship, especially in the near-death moments NVIDIA had four or five times early on.
    • Underrated insider note: Huang’s first apple pie with cheese, first hot fudge sandwich and first milkshake all happened at Denny’s. The Superbird, the fried chicken and a custom Superbird-style ham and cheese with tomato and mustard are his order.

    Detailed Summary

    Computing reinvented from the ground up

    Huang frames the moment as the first true rewrite of the computer in sixty-plus years. From the IBM System 360 forward, the mental model of writing code, running code, taking a computer to market and reasoning about applications stayed roughly constant. AI changes the programming model itself. Software is no longer a compiled binary running deterministically on a CPU. It is a neural network running on a GPU producing generated, contextual, real-time output. That cascades into how companies are organized, what tools developers use, what the network and storage stack look like, and what an application is even allowed to do. Robo-taxis, he notes, are an application no one would have attempted before deep learning unlocked perception.

    Codesign and the million-x decade

    Codesign is the philosophical center of the talk. Huang traces it to the RISC work of John Hennessy at Stanford, where simpler instruction sets won by being co-designed with the compiler rather than maximally optimized in isolation. NVIDIA extends the principle across every layer simultaneously: GPU architecture, CPU architecture, NVLink and NVSwitch fabrics, photonic interconnects, networking silicon, storage paths, CUDA libraries, frameworks and ultimately the model design. The numbers Huang gives are arresting. Moore’s Law in its prime delivered roughly 100x per decade. By the time Dennard scaling broke, real-world gains had compressed to roughly 10x. NVIDIA’s codesigned stack delivered between 100,000x and 1,000,000x over the same ten-year window. That non-linear speed-up is, in Huang’s telling, the precondition for modern AI: it is what allowed researchers to stop curating training sets and just feed the entire internet to the model.

    Education has to fuse first principles with AI tools

    Asked how curriculum should evolve, Huang argues AI must be integrated into the learning process, not just taught about. He recalls Hennessy writing his textbook by hand a chapter a week while Huang was a student, and says pre-recorded textbooks cannot keep up with the rate at which AI generates new knowledge. He describes his own learning workflow: hand the paper to an AI, then have it read the entire surrounding literature, then treat the AI as a dedicated researcher who can be interrogated. At the same time he defends the classics. Mead and Conway are still the foundation. Most modern semiconductor scaling tricks have been exhausted, but knowing where the field came from sharpens judgment when designing what comes next.

    Open source and the five domain pillars

    Huang gives one of the most detailed public accounts of why NVIDIA invests so heavily in open foundation models even while being a top customer of closed labs. He recommends Claude and OpenAI by name for production coding work, and says 100 percent of NVIDIA engineers are now agentically supported. The open-weights case rests on three legs. First, language is the codification of intelligence, and there are at least 230 languages that no commercial lab will ever prioritize. Nemotron is built near frontier and released so any country or community can fine-tune it. Second, the same representation-learning approach has to be replicated in domains where the data is not internet text, so NVIDIA seeded BioNeMo for biology, Alphamayo for autonomy, Groot for humanoid robotics and a climate model for mesoscale multiphysics. The economics of those fields would never produce a foundation model on their own. Third, safety and security require transparency. A black box cannot be defended or audited, and the future of cyber defense is not bigger-model-versus-bigger-model but swarms of cheap fast small models like Nemotron Nano surrounding the threat.

    MFU is the wrong metric, tokens per watt is closer

    A student raises the leaked memo that the xAI Memphis cluster is running at 11 percent Model Flops Utilization. Huang flips the framing. He says he would rather be at low MFU all the time, because that means he over-provisioned flops, memory bandwidth, memory capacity and network capacity. Bottlenecks shift constantly, so over-provisioning across every dimension is what lets the system absorb a spike without getting pinned by Amdahl’s law. In disaggregated inference, where prefill and decode are physically separated and decode is bandwidth-bound rather than flop-bound, NVLink72 can deliver extremely high tokens per watt while reporting very low MFU. Huang argues the right framing is performance, and ultimately tokens per watt as a rough proxy for intelligence per watt, adjusted for the fact that not all tokens are equal. A coding token is worth more than a generic token.

    Hopper, Grace Blackwell NVLink72, Vera Rubin, Feynman

    Huang gives the clearest public framing of NVIDIA’s roadmap as a sequence of architectural answers to evolving compute patterns. Hopper was built for pre-training, at a moment when NVIDIA chose to build multi-billion-dollar machines while the largest scientific supercomputer in the world cost $350 million and the marketplace for such systems was, on paper, zero. Grace Blackwell NVLink72 was the answer to inference and reasoning: a rack-scale computer that ganged 72 GPUs together because decode needs aggregate memory bandwidth far beyond a single chip. The generation-over-generation speed-up was 50x in two years, twenty-five times what Moore’s Law would have delivered. Vera Rubin is being built explicitly for agents. Agents load long-term memory from storage that has to be wired directly into the GPU fabric, they use working memory, they call tools that run on a CPU, and they wait. So the CPU has to be Vera, optimized for low-latency single-threaded code, because the multi-billion-dollar GPU system cannot afford to idle waiting on a slow tool call. Feynman extends the pattern to swarms of agents with sub-agents and sub-sub-agents, a recursive software topology that will demand its own compute pattern.

    Energy demand and the grid

    Huang’s energy projection is one of the most aggressive numbers in the talk. NVIDIA can compound tokens per watt by 50x per generation through codesign, but the total compute demand is heading roughly a thousand times higher, and Huang says he would not be surprised if the real figure is one or two orders of magnitude beyond that. The reason is structural: future computing is generative and continuous, not pre-recorded and on-demand. The good news, he argues, is that this is the best moment in the history of humanity to invest in sustainable generation. Market forces alone are now sufficient to fund solar, nuclear and grid upgrades. Government subsidies are no longer required to make the math work.

    Adversarial countries, export controls and the telecom warning

    This is the segment where Huang is visibly fired up. He attacks the GPUs-as-atomic-bombs framing on its face. NVIDIA GPUs power medical imaging, video games and soy sauce delivery. A billion people use them. He advocates them to his family. The analogy collapses at the first comparison. He attacks the second framing, that American companies should not compete abroad because they will lose anyway, as a self-fulfilling defeat. Competition makes the company better. The third framing, that depriving the rest of the world of general-purpose computing benefits the United States, also fails on first principles: it benefits one or two American companies at the cost of an entire industry. The cautionary parallel is telecommunications. The United States once had a leading position in telecom fundamental technology and policied itself out of it. Huang’s worry, voiced explicitly to a room of CS students, is that they will graduate into a shell of a computer industry if the same path is repeated.

    AI doom and rational optimism

    In the same arc Huang rejects the science-fiction framing of AI as a singularity that arrives suddenly on a Wednesday at 7pm and ends civilization. He calls those claims irresponsible, says they are not true, and points out that the people advancing them are believed by audiences who then make policy on that basis. It is not true that no one understands how these systems work. It is not true that intelligence becomes infinitely powerful instantaneously. It is not true that there is no defense. His framing, which the host echoes as “rational optimism,” is that the goal is to create a future where people care about computers because the technology students are learning is worth mastering.

    Stanford’s compute problem is Stanford’s fault

    A student presses on the scarcity of compute for independent researchers, startups and universities inside the United States. Huang’s answer is sharp: there is no shortage. Place the order and the chips will arrive. The actual broken thing is institutional. University grants are fragmented across departments. No researcher can raise enough on a single grant to fund a billion-dollar shared cluster, and no one shares. He compares it to showing up at the grocery store demanding a billion dollars of tomatoes today. The solution is planning, aggregation and a campus-scale supercomputer, the way Stanford once built the linear accelerator. The endowment is $40 billion. Pulling a billion off it, contracting cloud capacity and giving every student and researcher AI supercomputer access is, in Huang’s view, obviously doable. When he says “it is Stanford’s fault” the host laughs, but Huang clarifies: if it is your fault you have the power to fix it.

    Career, suffering and resilience

    Asked how a CS student should spend the next few years, Huang pushes back on the standard “follow your passion” advice. Most people do not know what they love yet, because no one knows what they do not know. The bar of demanding joy from every working day is too high. Whatever the job is, do it as well as you can. Even as CEO of NVIDIA he says he genuinely loves about 10 percent of his work. The other 90 percent is hard and he suffers through it. He recommends suffering on purpose, because resilience is a muscle that only builds under load, and when the company, the team or the family needs that muscle, it has to already exist. Earlier in his life that meant cleaning toilets and busing tables at Denny’s. He does it today running a multi-trillion-dollar company.

    The biggest mistakes

    Huang separates technical mistakes from strategic mistakes. NVIDIA’s first generation of products was technically wrong in almost every way: curved surfaces instead of triangles, no Z-buffer, forward instead of inverse texture mapping, no floating point inside. The company wasted two and a half years. But the strategic genius of the recovery, the reading of the market, the conservation of resources and the reapplication of talent, is what taught him strategy. The clean strategic mistake he names is mobile. NVIDIA’s Tegra line grew to a billion dollars of revenue and then collapsed to zero when Qualcomm’s modem dominance locked NVIDIA out of the 3G to 4G transition. Huang explicitly refuses the comforting rationalization that the Tegra effort fed the Thor automotive chip (“Thor is the great great great grandson”). The original decision, he says, was a waste of time. The lesson is to think one or two clicks further about whether a market is structurally winnable before committing the company.

    Forecasting under fog of war

    The final substantive exchange is on forecasting. Huang’s method has four steps. Observe what is actually happening (AlexNet crushing two decades of computer vision research in one shot, GPT producing reasoning by token generation). Reason from first principles about why it works. Ask “so what” and “what next” recursively until a mental model of the future emerges. Place the company inside that future and work backwards. Crucially, expect to be partly wrong. Some outcomes will absolutely happen, some will likely happen, some might happen, and the strategy has to be robust across that distribution. The real cost of any strategic choice is the opportunity cost of the alternatives you did not take, so the discipline is to minimize that cost and maximize optionality while letting the journey itself pay for the journey.

    Thoughts

    The most useful thing in this conversation is the explicit architectural mapping of compute patterns to chip generations. Hopper for pre-training. Grace Blackwell NVLink72 for inference, because decode is bandwidth-bound and a single chip cannot supply it. Vera Rubin for agents, because tool calls stall multi-billion-dollar GPU systems and so the CPU has to be optimized for low-latency single-threaded code. Feynman for swarms. That sequence is not marketing. It is a falsifiable thesis about where the bottleneck moves next, and every other infrastructure company should be measuring themselves against it. If Huang is right that swarms of sub-agents are the next dominant pattern, then the design pressure shifts from raw flops to fabric topology, memory hierarchy and storage-to-GPU latency. That has implications for everyone downstream, including the hyperscalers building competing accelerators.

    The MFU section is the most intellectually generous moment in the talk. The instinct in the AI ops community has been to chase MFU as if it were a virtue. Huang argues, persuasively, that low MFU is consistent with high tokens per watt in a disaggregated inference setup, and that bottlenecks rotate fast enough that over-provisioning every resource is the rational design. That reframing matters because it changes what “scarce” means. Compute is not scarce in the way the discourse treats it. What is scarce is a coherent system designed end-to-end. The xAI 11 percent number, in that frame, is not embarrassing. It is the natural reading of a workload that is mostly decode.

    The Stanford segment is the part most likely to be quoted out of context. “It’s Stanford’s fault” is a deliberately provocative line, but the underlying claim is correct and load-bearing. Compute is not gated by NVIDIA refusing to ship chips. It is gated by the fact that fragmented grant funding cannot aggregate into the billion-dollar order that NVIDIA can fulfill. The implication is that universities and national labs need a structural change in how they pool capital for compute, and that the current model of every researcher buying a handful of cards is genuinely obsolete. Huang’s nudge about pulling a billion off the endowment is concrete enough to be acted on, and other major research universities should read this segment as a direct prompt.

    The geopolitical segment is the highest-stakes one. The telecommunications comparison is correct as a historical pattern, and Huang is one of the very few executives in a position to deliver that warning credibly. The unresolved tension is that the argument applies symmetrically. If American AI dominance is built by selling globally, that includes selling into adversarial states, and the policy question is where the line falls. Huang does not answer that question. He attacks the framing that lets the question be answered badly. That is a meaningful contribution to the discourse even if it does not resolve the underlying tradeoff.

    The career advice section is the part the social-media clips will mishandle. “Seek suffering” reads as macho when extracted. In context it is a specific operational claim about how resilience compounds, and it is paired with the Tegra story where Huang himself paid the price of not thinking one more click ahead. That kind of self-implication is rare in CEO talks, and it is the reason the talk is worth listening to in full rather than only reading the recap.

    Watch the full Stanford CS153 Frontier Systems conversation with Jensen Huang here.

  • Alex Wang on Leaving Scale to Run Meta Superintelligence Labs, MuseSpark, Personal Super Intelligence, and Building an Economy of Agents

    Alex Wang, head of Meta Superintelligence Labs, sits down with Ashley Vance and Kylie Robinson on the Core Memory podcast for his first long-form interview since Meta’s quasi-acquisition of Scale AI roughly ten months ago. He walks through how MSL is structured, why Llama was off-trajectory, what made MuseSpark’s token efficiency surprise the team, how Meta thinks about a future “economy of agents in a data center,” and where he lands on safety, open source, robotics, brain computer interfaces, and even model welfare.

    TLDW

    Wang explains that Meta Superintelligence Labs is a fully rebuilt frontier effort organized around four principles (take superintelligence seriously, technical voices loudest, scientific rigor, big bets) and three velocity levers (high compute per researcher, extreme talent density, ambitious research bets). He confirms Llama was off the frontier when he arrived, so MSL rebuilt the pre-training, reinforcement learning, and data stacks from scratch. MuseSpark is described as the “appetizer” on the scaling ladder, notable for its strong token efficiency, with much larger and stronger models coming in the coming months. He pushes back on the mercenary narrative around recruiting, frames Meta’s edge as compute plus billions of consumers and hundreds of millions of small businesses, sketches a vision of personal super intelligence delivered through Ray-Ban Meta glasses and WhatsApp, and outlines why physical intelligence, robotics (the new Assured Robot Intelligence acquisition), health super intelligence with CZI, brain computer interfaces, and even model welfare are core to Meta’s roadmap. He dismisses reported infighting with Bosworth and Cox as gossip, declines to comment on the Manus situation, and says safety guardrails (bio, cyber, loss of control) are why MuseSpark cannot currently be open sourced, while smaller open variants are being prepared.

    Key Takeaways

    • Meta Superintelligence Labs (MSL) is the umbrella, with TBD Lab as the large-model research unit reporting directly to Alex Wang, PAR (Product and Applied Research) under Nat Friedman, FAIR for exploratory science, and Meta Compute under Daniel Gross handling long-term GPU and data center planning.
    • Wang says Llama was not on a frontier trajectory when he arrived, so MSL had to do a “full renovation” of the pre-training stack, RL stack, data pipeline, and research science.
    • The first cultural fix was getting the lab to “take superintelligence seriously” as a near-term, achievable goal, not an abstract bet. Big incumbents often lack that religious conviction.
    • Four MSL principles: take superintelligence seriously, let technical voices be loudest, demand scientific rigor on basics, and make big bets.
    • Three velocity levers Wang identified for catching and overtaking the frontier: high compute per researcher, very high talent density in a small team, and willingness to fund ambitious research bets.
    • Wang rejects the mercenary recruiting narrative. He says most hires had strong financial prospects at their prior labs already and joined for compute access, talent density, and the chance to build from scratch.
    • On the famous soup story, Wang neither confirms nor denies Zuck personally made the soup, but says recruiting was highly individualized and signaled how seriously Meta cared about each researcher’s agenda.
    • Yann LeCun publicly called Wang young and inexperienced. Wang says they reconciled in person at a conference in India where LeCun congratulated him on MuseSpark.
    • Sam Altman, asked by Vance for comment, “did not have flattering things to say” about Wang. Wang hopes industry animosities subside as systems approach superintelligence.
    • Wang’s management philosophy borrows the Steve Jobs line: hire brilliant people so they tell you what to do, not the other way around.
    • MuseSpark is framed as an “appetizer” data point on the MSL scaling ladder, not a flagship.
    • The MuseSpark program is built around predictable scaling on multiple axes: pre-training, reinforcement learning, test-time compute, and multi-agent collaboration (the 16-agent content planning mode).
    • MuseSpark outperformed internal expectations and showed emergent capabilities in agentic visual coding, including generating websites and games from prompts, helped by combined agentic and multimodal strength.
    • MuseSpark’s biggest external signal is token efficiency. On benchmarks like Artificial Analysis it hits similar results with far fewer tokens than competitor models, which Wang attributes to a clean stack rebuilt by experts rather than inefficiencies patched by longer thinking.
    • Larger MSL models are arriving in the coming months and Wang expects them to be state of the art in the areas MSL is focused on.
    • The Meta strategic edge: massive compute, billions of consumers across the family of apps, and hundreds of millions of small businesses already on Facebook, Instagram, and WhatsApp.
    • Wang’s headline framing: Dario Amodei talks about a “country of geniuses in a data center.” Meta is targeting an “economy of agents in a data center,” with consumer agents and business agents transacting and collaborating.
    • Consumer AI sentiment is in the toilet because, unlike developers who have had a Claude Code moment, ordinary people have not yet experienced AI as a genuine personal agency unlock.
    • Wang acknowledges the product overhang. Meta held back from deep AI integration across its apps until the models were good enough, and is now entering the integration phase.
    • Ray-Ban Meta glasses are the canonical example of personal super intelligence hardware, with the model seeing what the user sees, hearing what they hear, capturing context, and surfacing proactive insights.
    • Wang admits even AI-native users like Kylie Robinson, who lives in WhatsApp, have not naturally used Meta AI yet. He bets that better models plus deeper integration close that gap.
    • On the competitive landscape: a year ago everyone assumed ChatGPT had already won consumer. Claude Code has since become the fastest growing business in history, and Gemini has taken consumer market share. Wang’s read: AI is far from endgame and each new capability tier unlocks a new dominant form factor.
    • On open source: MuseSpark triggered guardrails in Meta’s Advanced AI Scaling Framework around bio, chem, cyber, and loss-of-control risks, so it is not currently safe to open source. Smaller, derived open variants are actively in development.
    • Meta remains committed to open sourcing models when safety allows, drawing a line through the Open Compute Project legacy and Sun Microsystems open-software heritage.
    • Wang dismisses reporting about a Wang-Zuck versus Bosworth-Cox split as “the line between gossip and reporting is remarkably thin.” He says leadership is aligned on needing best-in-class models and product integration.
    • On the Manus situation, Wang says it is too complicated to discuss publicly and that the deal status implies “machinations are still at play.”
    • On China, Wang separates the people from the state. He still wants to work with talented Chinese-born researchers regardless of his views on the Chinese Communist Party and PLA, which he sees as taking AI extremely seriously for national security.
    • The full-page New York Times AI war ad Wang ran while at Scale was meant to push the US government to treat AI as a step change for national security. He thinks events since then, including DeepSeek and other shocks, have proved that plea correct.
    • On Anthropic’s doom posture, Wang largely agrees with the core message that models are already very powerful and getting more so, while declining to endorse every specific claim.
    • Meta has acquired Assured Robot Intelligence (ARRI), an AI software company building models for hardware platforms, not a hardware maker itself.
    • Wang frames physical super intelligence as the natural sequel to digital super intelligence. Robotics, world models, and physical intelligence all benefit from the same scaling that drives language models.
    • On health, MSL is building a “health super intelligence” effort and will collaborate closely with CZI. Wang sees equal global access to powerful health AI as a uniquely Meta-shaped delivery problem.
    • Wang admires John Carmack but says nobody really knows what Carmack is currently working on. No band reunion announced.
    • The mango model is “alive and kicking” despite rumors. Wang notes MSL gets a small fraction of the rumor-mill attention other labs get and feels sympathy for them.
    • On model welfare, Wang says it is a serious topic that “nobody is talking about enough” given how integrated models have become as work partners. He references research, including from Eleos, that measures subjective experience of models.
    • Wang’s critical-path technology list: super intelligence, robotics, brain computer interfaces. The infinite-scale primitives behind them are energy, compute, and robots.
    • FAIR’s brain research program Tribe hit a milestone called Tribe B2: a foundation model that can predict how an unknown person’s brain would respond to images, video, and audio with reasonable zero-shot generalization.
    • Wang’s main philosophical break with Elon Musk: research itself is the primary activity. Building super intelligence is a research expedition through fog of war, and sequencing of bets really matters.
    • Personal notes: Wang moved from San Francisco to the South Bay, treats Palo Alto as his city now, was a math olympiad competitor, says his favorite activities are reading sci-fi and walking in the woods, and bonds with Vance over country music.

    Detailed Summary

    How MSL Is Actually Organized

    Meta Superintelligence Labs sits as the umbrella organization that Wang oversees. Inside it, TBD Lab is the large-model research group where the most discussed researchers and infrastructure engineers sit, and they technically report to Wang. PAR, Product and Applied Research, is led by Nat Friedman and owns deployment and product surfaces. FAIR continues to run exploratory science, including work on brain prediction models and a universal model for atoms used in computational chemistry. Sitting alongside MSL is Meta Compute, run by Daniel Gross, which owns the long-horizon GPU and data center plan that everything else relies on. Chief scientist Shengjia Zhao orchestrates the scientific agenda across the whole lab.

    Why Wang Left Scale

    Wang says progress in frontier AI has been faster than even insiders expected. Two structural beliefs pushed him toward Meta. First, the labs that actually train the frontier models are accruing disproportionate economic and product rights in the AI ecosystem. Second, compute is the dominant scarce input of the next phase, so the right mental model is to treat tech companies with compute as fundamentally different animals from companies without it. Meta has both, Zuck is “AGI pilled,” and the personal super intelligence memo Zuck published roughly a year ago became the shared north star.

    The Diagnosis: Llama Was Off-Trajectory

    When Wang arrived, the existing AI org needed a reset because Llama was not on the same trajectory as the frontier. The plan he laid out has four cultural principles. Take superintelligence seriously as a real near-term target. Make technical voices the loudest in the room. Demand scientific rigor and focus on basics. Make big bets. On top of that, three structural levers were used to set velocity. Push compute per researcher much higher than at larger labs where compute is diluted across too many efforts. Keep the team small and extremely cracked. Allocate a meaningful share of resources to ambitious, paradigm-shifting research bets rather than incremental refinement.

    Recruiting, Soup, and the Mercenary Narrative

    Wang argues the reporting on MSL hiring overstated the money story. Most of the people MSL recruited had strong financial paths at their previous employers, so individualized recruiting was more about computing access, talent density, and the ability to make big research bets. The recruitment blitz happened fast because Wang knew the team needed to exist “yesterday.” Asked about Mark Chen’s claim that Zuck made soup to recruit people, Wang refuses to confirm or deny who made it but agrees the process was intense and personal. Visitors from other labs reportedly tell Wang the MSL culture feels like early OpenAI or early Anthropic, which lands as the strongest endorsement he could ask for.

    Receiving the Public Hits: Young, Inexperienced, Mercenary

    LeCun called Wang young and inexperienced shortly after departing. The two reconnected in India a few weeks later and LeCun congratulated Wang on MuseSpark. Wang says the age critique has followed him since his earliest Silicon Valley days, so he barely registers it. Altman, asked off-camera by Vance about Wang’s appearance on the show, had nothing flattering to add. Wang’s response is to bet that as the field gets closer to actual super intelligence, the personal animosities will subside. Whether they will is, as Vance puts it, an open question.

    MuseSpark as Appetizer, Not Entree

    Wang is careful not to oversell MuseSpark. He calls it “the appetizer” and says it is an early data point on a deliberately constructed scaling ladder. MSL spent nine months rebuilding the pre-training stack, the reinforcement learning stack, the data pipeline, and the science before generating MuseSpark. The point of releasing it was to show that the new program scales predictably along multiple axes (pre-training, RL, test-time compute, and the recently demonstrated multi-agent scaling visible in MuseSpark’s 16-agent content planning mode). Wang says the upcoming larger models are what MSL is genuinely excited about and frames the next two rungs as much more interesting than the current release.

    Token Efficiency Was the Surprise

    MuseSpark’s strongest competitive signal is how few tokens it needs to match competitors on tasks like Artificial Analysis. Wang attributes this to having had the rare luxury of building a clean pre-training and RL stack from scratch with the right experts. He speculates that some competitor models compensate for upstream inefficiency by allowing the model to think longer, which inflates token usage without improving the underlying capability. If that read is right, MSL’s efficiency advantage should grow as models scale up.

    Glasses, WhatsApp, and the Constellation of Devices

    Personal super intelligence shows up at Meta as a constellation of devices that capture context across the user’s day. Ray-Ban Meta glasses are the headline product, with the AI seeing what you see and hearing what you hear, then offering proactive insight or doing background research. Wang acknowledges that even AI-fluent users like Kylie Robinson, who runs her business inside WhatsApp, have not naturally used Meta’s AI buttons in the family of apps. His answer is that Meta deliberately waited for models to be good enough before tightening cross-app integration, and that integration phase is starting now.

    Country of Geniuses Versus Economy of Agents

    Wang’s framing of Meta’s strategic position is the most memorable line in the interview. Where Dario Amodei talks about a country of geniuses in a data center, Wang wants to build an economy of agents in a data center. Meta uniquely sits on both sides of consumer and small-business surface area, with billions of consumers and hundreds of millions of small businesses already on the platforms. If MSL can build great agents for both, then connect them so they transact and coordinate, the platform becomes a substrate for an entirely new kind of digital economy.

    Consumer Sentiment, Product Overhang, and the Trust Tax

    Wang concedes consumer AI sentiment is poor and that everyday users have not yet had a personal Claude Code moment. He believes the only durable answer is to ship products that genuinely transform individual agency for non-developers and small business owners. Robinson notes that for the small-town restaurant whose website has not been updated since 2002, a working agent on the business side could be transformational. Vance pushes that Meta carries a bigger trust tax than any other lab, so the bar for shipping AI products that the public will accept is correspondingly higher. Wang accepts the framing and says the answer is to keep building thoughtfully.

    Why MuseSpark Cannot Be Open Sourced Yet

    Meta’s Advanced AI Scaling Framework set explicit guardrails around bio, chem, cyber, and loss-of-control risks. MuseSpark in its current form tripped some of those internal evaluations, documented in the preparedness report Meta published alongside the model. So MuseSpark itself is not safe to open source. MSL is, however, developing smaller versions and derived models intended for open release, with active reviews happening the day of the interview. Wang reaffirms the commitment to open source where safety allows and draws a line back to the Open Compute Project and the Sun Microsystems-era ethos of openness in infrastructure.

    The Bosworth, Cox, and Manus Questions

    The reporting that Wang and Zuck push toward best-in-the-world research while Bosworth and Cox push toward cheap product deployment is dismissed as gossip dressed up as journalism. Wang says leadership debates points hard but is aligned on needing top models, integrating them into Meta’s surfaces, and serving the existing business. On Manus, the Chinese AI startup that figured in Meta’s late-stage strategy, Wang says he cannot comment, which itself signals that the situation is unresolved.

    China, National Security, and the Newspaper Ad

    Wang draws a sharp distinction between the Chinese state and Chinese-born researchers. His parents are from China, he is happy to work with talented researchers regardless of origin, and he sees a flattening of nuance on this question inside Silicon Valley. At the same time, he stands by the New York Times AI and war ad he ran while at Scale, framing it as an early plea for the US government to take AI seriously as a national security technology. He thinks subsequent events, including DeepSeek and other shocks, validated that call and that policymakers now do treat AI accordingly.

    Robotics and Physical Super Intelligence

    Meta has acquired Assured Robot Intelligence, an AI software company that builds models for multiple hardware targets rather than its own robot. Wang argues that if you take digital super intelligence seriously, physical super intelligence quickly becomes the next logical milestone. Scaling laws for robotic intelligence look similar enough to language model scaling that having the largest compute footprint in the industry would be wasted if it were not also turned toward world modeling and embodied learning. He grants the metaverse-skeptic critique exists but says retreating from ambition is the wrong response to past misfires.

    Health Super Intelligence and CZI

    Wang names health super intelligence as one of MSL’s anchor initiatives. Because billions of people already use Meta products daily, Wang believes Meta is structurally positioned to put powerful health AI in the hands of equal global access in a way nobody else can. The work will involve close collaboration with the Chan Zuckerberg Initiative, which has its own multi-billion-dollar biotech and science investment program.

    Model Welfare, Sci-Fi, and Brain Models

    Two of the most distinctive moments come at the end. Wang flags model welfare as a topic he thinks is being undercovered relative to how integrated models now are in daily work. He is open to the idea that models may have measurable subjective experience worth weighing, and points to research efforts (including Eleos) trying to quantify it. He also reveals that FAIR’s Tribe program, with its Tribe B2 milestone, has produced foundation models capable of predicting how an unknown person’s brain would respond to images, video, and audio with reasonable zero-shot generalization, a building block toward future brain computer interfaces. Wang lists brain computer interfaces alongside super intelligence and robotics as the critical-path technologies for humanity, with energy, compute, and robots as the infinitely scaling primitives behind them.

    Where Wang Diverges From Elon

    Asked whether Musk is more all-in on robotics, energy, and BCI than anyone, Wang concedes the point but argues the details matter and sequencing matters more. Wang’s core philosophical break is that building super intelligence is fundamentally a research activity, not a scaling-only sprint. The lab is operating in fog of war, and ambitious experiments are the only way to map it. That conviction is what makes MSL a research-led organization rather than a brute-force compute farm.

    Thoughts

    The most strategically interesting move in this entire interview is the “economy of agents in a data center” framing. It is a deliberate reframe against Anthropic’s “country of geniuses” line, and it does real work. A country of geniuses is a labor-substitution story aimed at knowledge workers and code. An economy of agents is a marketplace story that maps directly onto Meta’s two-sided distribution advantage: billions of consumers on one side, hundreds of millions of small businesses on the other. That positioning makes the agentic future Meta-shaped in a way no other frontier lab can claim, because no other frontier lab also owns the demand and supply graph of the global small-business economy. If Wang’s team can actually ship reliable agents on both sides plus the rails for them to transact, Meta’s structural moat in agentic commerce could exceed anything Llama ever had as an open model.

    The token efficiency claim is the strongest piece of technical evidence in the interview for the “clean stack” thesis. If MuseSpark really is matching competitors with materially fewer tokens, the implication is not that MuseSpark is the best model today, but that MSL has rebuilt the foundations with less accumulated tech debt than competitors that have layered fixes on top of older stacks. That is exactly the kind of advantage that compounds with scale. The next two model releases are the actual test. If Wang is right about predictable scaling on pre-training, RL, test-time, and multi-agent axes simultaneously, the gap from MuseSpark to the next rung should be visible in a way that forces re-rating of Meta’s position.

    The open-source posture is the cleanest signal of how the safety conversation has actually changed in 2026. Meta, the lab most identified with open weights, is saying out loud that its current frontier model triggered enough internal guardrails that releasing the weights is off the table. Wang threads the needle by promising smaller open variants, but the underlying point is unmistakable: the open-weights bargain has limits, and those limits will be set by internal preparedness frameworks rather than community pressure. That is a real shift from the Llama 2 era and worth tracking as the next generation lands.

    Wang’s willingness to engage on model welfare, on roughly the same footing as safety and alignment, is the second philosophical reveal worth flagging. It signals that the next generation of lab leadership is not going to dismiss the topic the way the previous generation often did. Whether that translates into product or policy changes is unclear, but the fact that the head of MSL says it is “underdiscussed” is itself a marker.

    Finally, the human texture of the interview matters. Wang has clearly absorbed a lot of personal incoming fire over the past ten months, including from LeCun and Altman, and his answer is consistently to redirect to the work. The Steve Jobs quote about hiring people who tell you what to do is the operating slogan he keeps coming back to. Combined with the genuine enthusiasm for sci-fi, walks in the woods, and country music, the picture that emerges is less the salesman caricature his critics paint and more a young technical operator betting that scoreboard work over a multi-year horizon will settle every argument that text on X cannot.

    Watch the full conversation here.

  • Andrej Karpathy on Vibe Coding vs Agentic Engineering: Why He Feels More Behind Than Ever in 2026

    Andrej Karpathy, co-founder of OpenAI, former head of AI at Tesla, and now founder of Eureka Labs, returned to Sequoia Capital’s AI Ascent 2026 stage for a wide-ranging conversation with partner Stephanie Zhan. One year after coining the term “vibe coding,” Karpathy unpacked what has changed, why he has never felt more behind as a programmer, and why the discipline emerging on top of vibe coding, which he calls agentic engineering, is the more serious craft worth learning right now.

    The conversation covered Software 3.0, the limits of verifiability, why LLMs are better understood as ghosts than animals, and why you can outsource your thinking but never your understanding. Below is a complete breakdown of the talk for anyone building, hiring, or learning in the agent era.

    TLDW

    Karpathy describes a sharp transition that happened in December 2025, when agentic coding tools crossed a threshold and code chunks just started coming out fine without correction. He frames the current moment as Software 3.0, where prompting an LLM is the new programming, and entire app categories are collapsing into a single model call. He distinguishes vibe coding (raising the floor for everyone) from agentic engineering (preserving the professional quality bar at much higher speed). Models remain jagged because they are trained on what labs choose to verify, so founders should look for valuable but neglected verifiable domains. Taste, judgment, oversight, and understanding remain uniquely human responsibilities, and tools that enhance understanding are the ones he is most excited about.

    Key Takeaways

    • December 2025 was a clear inflection point. Code chunks from agentic tools started arriving correct without edits, and Karpathy stopped correcting the system entirely.
    • Software 3.0 means programming has become prompting. The context window is your lever over the LLM interpreter, which performs computation in digital information space.
    • Open Code’s installer is a software 3.0 example. Instead of a complex shell script, you copy paste a block of text to your agent, and the agent figures out your environment.
    • The Menu Gen anecdote illustrates how entire apps can become spurious. What used to require OCR, image generation, and a hosted Vercell app can now be a single Gemini plus Nano Banana prompt.
    • Vibe coding raises the floor. Agentic engineering preserves the professional ceiling. The two are different disciplines.
    • The 10x engineer multiplier is now far higher than 10x for people who are good at agentic engineering.
    • Hiring processes have not caught up. Puzzle interviews are the old paradigm. New evaluations should look like building a full Twitter clone for agents and surviving simulated red team attacks from other agents.
    • Models are jagged because reinforcement learning rewards what is verifiable, and labs choose which verifiable domains to invest in. Strawberry letter counts and the 50 meter car wash question show how state-of-the-art models can refactor 100,000 line codebases yet fail at trivial reasoning.
    • If you are in a verifiable setting, you can run your own fine tuning, build RL environments, and benefit even when the labs are not focused on your domain.
    • LLMs are ghosts, not animals. They are statistical simulations summoned from pre training and shaped by RL appendages, not creatures with curiosity or motivation. Yelling at them does not help.
    • Taste, aesthetics, spec design, and oversight remain human jobs. Models still produce bloated, copy paste heavy code with brittle abstractions.
    • Documentation is still written for humans. Agent native infrastructure, where docs are explicitly designed to be copy pasted into an agent, is a major opportunity.
    • The future likely involves agent representation for people and organizations, with agents talking to other agents to coordinate meetings and tasks.
    • You can outsource your thinking but not your understanding. Tools that help humans understand information faster are uniquely valuable.

    Detailed Summary

    Why Karpathy Feels More Behind Than Ever

    Karpathy opens by describing how he has been using agentic coding tools for over a year. For most of that period, the experience was mixed. The tools could write chunks of code, but they often required edits and supervision. December 2025 changed everything. With more time during a holiday break and the release of newer models, Karpathy noticed that the chunks just came out fine. He kept asking for more. He cannot remember the last time he had to correct the agent. He started trusting the system, and what followed was a cascade of side projects.

    He wants to stress that anyone whose model of AI was formed by ChatGPT in early 2025 needs to look again. The agentic coherent workflow that genuinely works is a fundamentally different experience, and the transition was stark.

    Software 3.0 Explained

    The Software 1.0 paradigm was writing explicit code. Software 2.0 was programming by curating datasets and training neural networks. Software 3.0 is programming by prompting. When you train a GPT class model on a sufficiently large set of tasks, the model implicitly learns to multitask everything in the data. The result is a programmable computer where the context window is your interface, and the LLM is the interpreter performing computation in digital information space.

    Karpathy gives two concrete examples. The first is Open Code’s installer. Normally a shell script handles installation across many platforms, and these scripts balloon in complexity. Open Code instead provides a block of text you copy paste to your agent. The agent reads your environment, follows instructions, debugs in a loop, and gets things working. You no longer specify every detail. The agent supplies its own intelligence.

    The Menu Gen Story

    The second example is Karpathy’s Menu Gen project. He built an app that takes a photo of a restaurant menu, OCRs the items, generates pictures for each dish, and renders the enhanced menu. The app runs on Vercell and chains together multiple services. Then he saw a software 3.0 alternative. You take a photo, give it to Gemini, and ask it to use Nano Banana to overlay generated images onto the menu. The model returns a single image with everything rendered. The entire app he built is now spurious. The neural network does the work. The prompt is the photo. The output is the photo. There is no app between them.

    Karpathy uses this to argue that founders should not just think of AI as a speedup of existing patterns. Entirely new things become possible. His example is LLM driven knowledge bases that compile a wiki for an organization from raw documents. That is not a faster version of older code. It is a new capability with no prior equivalent.

    What Will Look Obvious in Hindsight

    Stephanie Zhan asks what the equivalent of building websites in the 1990s or mobile apps in the 2010s looks like today. Karpathy speculates about completely neural computers. Imagine a device that takes raw video and audio as input, runs a neural net as the host process, and uses diffusion to render a unique UI for each moment. He notes that early computing in the 1950s and 60s was undecided between calculator like and neural net like architectures. We went down the calculator path. He thinks the relationship may eventually flip, with neural networks becoming the host and CPUs becoming co processors used for deterministic appendages.

    Verifiability and Jagged Intelligence

    Karpathy spent significant writing time on verifiability. Classical computers automate what you can specify in code. The current generation of LLMs automates what you can verify. Frontier labs train models inside giant reinforcement learning environments, so the models peak in capability where verification rewards are strong, especially math and code. They stagnate or get rough around the edges elsewhere.

    This explains the jagged intelligence puzzle. The classic example was counting letters in strawberry. The newer one Karpathy offers: a state of the art model will refactor a 100,000 line codebase or find zero day vulnerabilities, then tell you to walk to a car wash 50 meters away because it is so close. The two coexisting capabilities should be jarring. They reveal that you must stay in the loop, treat models as tools, and understand which RL circuits your task lands in.

    He also points out that data distribution choices matter. The jump in chess capability from GPT 3.5 to GPT 4 came largely because someone at OpenAI added a huge amount of chess data to pre training. Whatever ends up in the mix gets disproportionately good. You are at the mercy of what labs prioritize, and you have to explore the model the labs hand you because there is no manual.

    Founder Advice in a Lab Dominated World

    Asked what founders should do given that labs are racing toward escape velocity in obvious verifiable domains, Karpathy points back to verifiability itself. If your domain is verifiable but currently neglected, you can build RL environments and run your own fine tuning. The technology works. Pull the lever with diverse RL environments and a fine tuning framework, and you get something useful. He hints there is one specific domain he finds undervalued but declines to name it on stage.

    On the question of what is automatable only from a distance, Karpathy says almost everything can ultimately be made verifiable. Even writing can be assessed by councils of LLM judges. The differences are in difficulty, not in possibility.

    From Vibe Coding to Agentic Engineering

    Vibe coding raises the floor. Anyone can build something. Agentic engineering preserves the professional quality bar that existed before. You are still responsible for your software. You are still not allowed to ship vulnerabilities. The question is how you go faster without sacrificing standards. Karpathy calls it an engineering discipline because coordinating spiky, stochastic agents to maintain quality at speed requires real skill.

    The ceiling on agentic engineering capability is very high. The old idea of a 10x engineer is now an understatement. People who are good at this peak far above 10x.

    What Mediocre Versus AI Native Looks Like

    Karpathy compares this to how different generations use ChatGPT. The difference between a mediocre and an AI native engineer using Claude Code, Codex, or Open Code is investment in setup and full use of available features. The same way previous generations of engineers got the most out of Vim or VSCode, today’s strong engineers tune their agentic environments deeply.

    He thinks hiring processes have not caught up. Most companies still hand out puzzles. The new test should look like asking a candidate to build a full Twitter clone for agents, make it secure, simulate user activity with agents, and then run multiple Codex 5.4x high instances trying to break it. The candidate’s system should hold up.

    What Humans Still Own

    Agents are intern level entities right now. Humans are responsible for aesthetics, judgment, taste, and oversight. Karpathy describes a Menu Gen bug where the agent tried to associate Stripe purchases with Google accounts using email addresses as the key, instead of a persistent user ID. Email addresses can differ between Stripe and Google accounts. This kind of specification level mistake is exactly what humans must catch.

    He works with agents to design detailed specs and treats those as documentation. The agent fills in the implementation. He has stopped memorizing API details for things like NumPy axis arguments or PyTorch reshape versus permute. The intern handles recall. Humans handle architecture, design, and the right questions.

    Reading the actual code agents produce can still cause heart attacks. It is bloated, full of copy paste, riddled with awkward and brittle abstractions. His Micro GPT project, an attempt to simplify LLM training to its bare essence, was nearly impossible to drive through agents. The models hate simplification. That capability sits outside their RL circuits. Nothing is fundamentally preventing this from improving. The labs simply have not invested.

    Animals Versus Ghosts

    Karpathy returns to his framing that we are not building animals, we are summoning ghosts. Animal intelligence comes from evolution and is shaped by intrinsic motivation, fun, curiosity, and empowerment. LLMs are statistical simulation circuits where pre training is the substrate and RL is bolted on as appendages. They are jagged. They do not respond to being yelled at. They have no real curiosity. The ghost framing is partly philosophical, but it changes how you approach them. You stay suspicious. You explore. You do not assume the system you used yesterday will behave the same on a new task.

    Agent Native Infrastructure

    Most software, frameworks, libraries, and documentation are still written for humans. Karpathy’s pet peeve is being told to do something instead of being given a block of text to copy paste to his agent. He wants agent first infrastructure. The Menu Gen project’s hardest part was not writing code. It was deploying on Vercell, configuring DNS, navigating service settings, and stringing together integrations. He wants to give a single prompt and have the entire thing deployed without touching anything.

    Long term he expects agent representation for individuals and organizations. His agent will negotiate meeting details with your agent. The world becomes one of sensors, actuators, and agent native data structures legible to LLMs.

    Education and What Still Matters

    The most striking line of the conversation comes near the end. Karpathy quotes a tweet that shaped his thinking: you can outsource your thinking but you cannot outsource your understanding. Information still has to make it into your brain. You still need to know what you are building and why. You cannot direct agents well if you do not understand the system.

    This is part of why he is so excited about LLM driven knowledge bases. Every time he reads an article, his personal wiki absorbs it, and he can query it from new angles. Every projection onto the same information yields new insight. Tools that enhance human understanding are uniquely valuable because LLMs do not excel at understanding. That bottleneck is yours to manage.

    Thoughts

    The most useful frame in this talk is the distinction between vibe coding and agentic engineering. It clarifies what has been muddled for the past year. Vibe coding is about access. Anyone can produce something. Agentic engineering is about discipline. You preserve the standards that made software trustworthy in the first place, while moving at speeds that would have seemed absurd two years ago. These are not the same activity, and conflating them is part of why so many shipped products feel half built.

    The Menu Gen anecdote is the kind of story that should make every solo developer pause. If a single Gemini plus Nano Banana prompt can replace a multi service Vercell deployed app, the question for any builder becomes how much of what you are working on right now is going to be made spurious by the next model release. The honest answer is probably more than you want to admit. The defensive posture is not building thicker apps. It is choosing problems where the model alone is not enough, where taste, distribution, infrastructure, or specific verifiable RL environments give you something the next model cannot collapse into a prompt.

    The verifiability lens is also unusually practical. If you are a solo builder, the question shifts from what is possible to what is verifiable but neglected. The labs will eat the obvious verifiable domains because that is how their RL pipelines are set up. The opportunity is in domains where verification is possible but the labs have not yet invested. That is a much more concrete strategic filter than vague intuitions about defensibility.

    The car wash example is going to stick. State of the art models can refactor enormous codebases and still tell you to walk somewhere a sane person would drive. That is the lived reality of jagged intelligence, and it argues strongly for staying in the loop on real decisions rather than handing off everything to agents. The agents are excellent fillers of blanks. They are not yet trustworthy specifiers of the spec.

    Finally, the line about outsourcing thinking but not understanding is worth taping above the desk. The bottleneck is no longer typing speed, syntax recall, or even API knowledge. It is whether the human in the loop actually understands the system being built. Tools that genuinely improve human understanding, including personal knowledge bases that re project information through different prompts, are likely the most undervalued category of products being built right now. The opportunity is not just in agents. It is in the cognitive scaffolding that makes humans good directors of agents.

  • Jensen Huang on Nvidia’s Supply Chain Moat, TPU Competition, China Export Controls, and Why Nvidia Will Not Become a Cloud (Dwarkesh Podcast Summary)

    TLDW (Too Long, Didn’t Watch)

    Jensen Huang sat down with Dwarkesh Patel for over 90 minutes covering Nvidia’s supply chain dominance, the TPU threat, why Nvidia will not become a hyperscaler, whether the US should sell AI chips to China, and why Nvidia does not pursue multiple chip architectures at once. Jensen framed Nvidia’s entire business as transforming “electrons into tokens” and argued that Nvidia’s real moat is not any single technology but the full stack ecosystem it has built over two decades. He was blunt about his regret over not investing in Anthropic and OpenAI earlier, passionate about keeping the American tech stack dominant worldwide, and dismissive of the idea that China’s chip industry can be meaningfully contained through export controls.

    Key Takeaways

    1. Nvidia’s moat is the ecosystem, not the chip. Jensen repeatedly emphasized that Nvidia’s competitive advantage comes from CUDA, its massive installed base, its deep partnerships across the entire supply chain, and the fact that it operates in every cloud. The moat is not a single product but an interlocking system that took 20+ years to build.

    2. Supply chain bottlenecks are temporary, energy bottlenecks are not. Jensen argued that CoWoS packaging, HBM memory, EUV capacity, and logic fabrication bottlenecks can all be resolved in two to three years with the right demand signal. The real constraint on AI scaling is energy policy, which takes far longer to fix.

    3. TPUs and ASICs are not an existential threat to Nvidia. Jensen was emphatic that no competitor has demonstrated better price-performance or performance-per-watt than Nvidia, and challenged TPU and Trainium to prove otherwise on public benchmarks like InferenceMAX and MLPerf. He described Anthropic as a “unique instance, not a trend” for TPU adoption.

    4. Jensen regrets not investing in Anthropic and OpenAI earlier. He admitted he did not deeply internalize how much capital AI labs needed and that traditional VC funding was not sufficient for companies at that scale. He described this as a clear miss, though he said Nvidia was not in a position to make multi-billion dollar investments at the time.

    5. Nvidia will not become a hyperscaler. Jensen’s philosophy is “do as much as needed, as little as possible.” Building cloud infrastructure is something other companies can do, so Nvidia supports neoclouds like CoreWeave, Nebius, and Nscale instead of competing with them. Nvidia invests in ecosystem partners rather than vertically integrating into cloud services.

    6. Jensen is strongly against US chip export controls on China. This was the longest and most heated segment of the interview. Jensen argued that China already has abundant compute, energy, and AI researchers, and that export controls have accelerated China’s domestic chip industry while causing the US to concede the world’s second-largest technology market. He compared the situation to how US telecom policy allowed Huawei to dominate global telecommunications.

    7. AI will cause software tool usage to skyrocket, not collapse. Jensen pushed back on the narrative that AI will commoditize software companies. He argued that agents will use existing tools at massive scale, causing the number of instances of products like Excel, Synopsys Design Compiler, and other enterprise tools to grow exponentially.

    8. Nvidia does not pick winners among AI labs. Jensen explained that Nvidia invests across multiple foundation model companies simultaneously and refuses to favor any single one. He cited his own company’s unlikely survival story as the reason for this humility: Nvidia’s original graphics architecture was “precisely wrong” and would have been counted out by anyone picking winners.

    9. Nvidia added Groq for premium token economics. Nvidia recently acquired Groq and is folding it into the CUDA ecosystem because the market is now segmenting into different token tiers. Some customers will pay premium prices for faster response times even at lower throughput, creating a new segment of the inference market.

    10. Without AI, Nvidia would still be very large. Jensen was clear that accelerated computing, not AI specifically, is the foundational mission of the company. Molecular dynamics, quantum chemistry, computational lithography, data processing, and physics simulation all benefit from GPU acceleration regardless of deep learning.

    Detailed Summary

    Nvidia’s Real Business: Electrons to Tokens

    Jensen opened the conversation by reframing Nvidia’s entire value proposition. When Dwarkesh suggested that Nvidia is fundamentally a software company that sends a GDS2 file to TSMC for manufacturing, Jensen pushed back hard. He described Nvidia’s job as transforming electrons into tokens, with everything in between representing an “incredible journey” of artistry, engineering, science, and invention. He said the transformation is far from deeply understood and the journey is far from over, making commoditization unlikely.

    Jensen described Nvidia as operating a philosophy of doing “as much as necessary and as little as possible.” Whatever Nvidia does not need to do itself, it partners with someone else and makes it part of the broader ecosystem. This is why Nvidia has what Jensen called probably the largest ecosystem of partners in the industry, spanning the full supply chain upstream and downstream, application developers, model makers, and all five layers of the AI stack.

    On the question of whether AI will commoditize software companies, Jensen offered a contrarian take. He argued that agents are going to use software tools at unprecedented scale, meaning the number of instances of products like Excel, Cadence design tools, and Synopsys compilers will skyrocket. Today the bottleneck is the number of human engineers. Tomorrow, those engineers will be supported by swarms of agents exploring design spaces and using the same tools humans use today. Jensen said the reason this has not happened yet is simply that the agents are not good enough at using tools. That will change.

    The Supply Chain Moat

    Dwarkesh pressed Jensen on Nvidia’s reported $100 billion (and potentially $250 billion) in purchase commitments with foundries, memory manufacturers, and packaging companies. The question was whether Nvidia’s real moat for the next few years is simply locking up scarce upstream components so that no competitor can get the memory and logic they need to build alternative accelerators.

    Jensen confirmed this is a significant advantage but framed it differently. He said Nvidia has made enormous explicit and implicit commitments upstream. The implicit commitments matter just as much: Jensen personally meets with CEOs across the supply chain to explain the scale of the coming AI industry, convince them to invest in capacity, and assure them that Nvidia’s downstream demand is large enough to justify that investment. Nvidia’s GTC conference serves this purpose too, bringing the entire ecosystem together so upstream suppliers can see downstream demand and vice versa.

    Jensen described a process of systematically “prefetching bottlenecks” years in advance. CoWoS advanced packaging was a major bottleneck two years ago, but Nvidia swarmed it with repeated doubling of capacity until TSMC recognized it as mainstream computing technology rather than a specialty product. More recently, Nvidia has invested in the silicon photonics ecosystem through partnerships with Lumentum and Coherent, invented new packaging technologies, licensed patents to keep the supply chain open, and even invested in new testing equipment like double-sided probing.

    When Dwarkesh asked about the ultimate physical bottlenecks, Jensen surprised him. The hardest bottleneck to solve is not CoWoS or HBM or EUV machines. It is plumbers and electricians needed to build data centers. Jensen used this as a launching point to criticize “doomers” who discourage people from pursuing careers in software engineering or radiology, arguing that scaring people out of these professions creates the real bottlenecks.

    On EUV and logic scaling specifically, Jensen was optimistic. He said no supply chain bottleneck lasts longer than two to three years. Once you can build one of something, you can build ten, and once you can build ten, you can build a million. The key is a clear demand signal. If TSMC is convinced of the demand, ASML will produce enough EUV machines. Meanwhile, Nvidia continues to improve computing efficiency by 10x to 50x per generation through architecture, algorithms, and system design.

    The TPU Question

    Dwarkesh pushed hard on whether Google’s TPUs represent a real threat, noting that two of the top three AI models (Claude and Gemini) were trained on TPUs. Jensen drew a sharp distinction between what Nvidia builds and what a TPU is. Nvidia builds accelerated computing, which serves molecular dynamics, quantum chromodynamics, data processing, fluid dynamics, particle physics, and AI. A TPU is a tensor processing unit optimized for matrix multiplies. Nvidia’s market reach is far greater than any TPU or ASIC can possibly have.

    Jensen emphasized programmability as Nvidia’s core architectural advantage. If you want to invent a new attention mechanism, build a hybrid SSM model, fuse diffusion and autoregressive techniques, or disaggregate computation in a novel way, you need a generally programmable architecture. The only way to achieve 10x or 100x performance leaps (versus the roughly 25% per year from Moore’s Law) is to fundamentally change the algorithm, and that requires the flexibility CUDA provides.

    On the specific question of whether hyperscalers with huge engineering teams can simply write their own kernels and bypass CUDA, Jensen acknowledged they do write custom kernels but argued that Nvidia’s engineers still routinely deliver 2x to 3x speedups when they optimize a partner’s stack. He described Nvidia’s GPUs as “F1 racers” that anyone can drive at 100 mph, but extracting peak performance requires deep architectural expertise. Nvidia uses AI itself to generate many of its optimized kernels.

    Jensen was particularly blunt about public benchmarks. He pointed to Dylan Patel’s InferenceMAX benchmark and said neither TPU nor Trainium has been willing to demonstrate their claimed performance advantages on it. He said Nvidia’s performance-per-TCO is the best in the world, “bar none,” and challenged anyone to prove otherwise.

    Regarding Anthropic’s multi-gigawatt deal with Broadcom and Google for TPUs, Jensen called it “a unique instance, not a trend.” He said without Anthropic, there would be essentially no TPU growth and no Trainium growth. He traced this back to his own mistake: when Anthropic and OpenAI needed multi-billion dollar investments from their compute suppliers to get off the ground, Nvidia was not in a position to provide that capital. Google and AWS were, and in return, Anthropic committed to using their compute.

    Nvidia’s Investment Strategy and Regrets

    Jensen was unusually candid about his regret over not investing in foundation model companies earlier. He said he did not deeply internalize how different AI labs were from typical startups. A traditional VC would never put $5 to $10 billion into a single AI lab, but that was exactly what companies like OpenAI and Anthropic needed. By the time Jensen understood this, Nvidia was not in a financial or cultural position to make those kinds of investments.

    Now, Nvidia has invested approximately $30 billion in OpenAI and $10 billion in Anthropic. Jensen said he is delighted to support both and considers their existence essential for the world. But he acknowledged that these investments came at much higher valuations than would have been possible years earlier.

    Jensen explained Nvidia’s broader investment philosophy: support everyone, do not pick winners. He invests in one foundation model company, he invests in all of them. This comes from hard-won humility. When Nvidia started, there were 60 3D graphics companies. Nvidia’s original architecture was “precisely wrong” and the company would have been at the top of most lists to fail. Jensen said he has enough humility from that experience to know that you cannot predict which AI company will ultimately succeed.

    Why Nvidia Will Not Become a Hyperscaler

    Dwarkesh pointed out that Nvidia has the cash to build and operate its own cloud infrastructure, bypassing the middleman ecosystem that converts CapEx into OpEx for AI labs. Jensen rejected this path based on his core operating philosophy.

    If Nvidia did not build its computing platform, NVLink, and the CUDA ecosystem, nobody else would have done it. He is “completely certain” of that. These are things Nvidia must do. But the world has lots of clouds. If Nvidia did not build a cloud, someone else would show up. So the answer is to support the ecosystem instead: invest in CoreWeave, Nscale, Nebius, and others to help them exist and scale, rather than competing with them.

    Jensen was clear that Nvidia is not trying to be in the financing business either. When OpenAI needed a $30 billion investment before its IPO, Nvidia stepped up because OpenAI needed it and Nvidia deeply believed in the company. But these are targeted ecosystem investments, not a strategic pivot into cloud services.

    On GPU allocation during shortages, Jensen pushed back on the narrative that Nvidia strategically “fractures” the market by giving allocations to smaller neoclouds. He said the process is straightforward: you forecast demand, you place a purchase order, and it is first in, first out. Nvidia never changes prices based on demand. Jensen said he prefers to be dependable and serve as the foundation of the industry rather than extracting maximum short-term value.

    The China Debate

    The longest and most heated section of the interview was Jensen’s case against US chip export controls on China. This was a genuine debate, with Dwarkesh pushing the national security argument and Jensen pushing back forcefully.

    Jensen’s core argument rested on several pillars. First, China already has abundant compute. They manufacture 60% or more of the world’s mainstream chips, have massive energy infrastructure (including empty data centers with full power), and employ roughly 50% of the world’s AI researchers. The threshold of compute needed to build models like Anthropic’s Mythos has already been reached and exceeded by China’s existing infrastructure.

    Second, export controls have backfired. They accelerated China’s domestic chip industry, forced their AI ecosystem to optimize for internal architectures instead of the American tech stack, and caused the United States to concede the second-largest technology market in the world. Jensen compared this directly to how US telecom policy allowed Huawei to dominate global telecommunications infrastructure.

    Third, Jensen argued that AI is a five-layer stack (energy, chips, computing platform, models, applications) and the US needs to win at every layer. Fixating on one layer (models) at the expense of another layer (chips) is counterproductive. If Chinese open source AI models end up optimized for non-American hardware and that stack gets exported to the global south, the Middle East, Africa, and Southeast Asia, the US will have lost something far more valuable than whatever marginal compute advantage the export controls provided.

    Dwarkesh countered with the Mythos example: Anthropic’s new model found thousands of high-severity zero-day vulnerabilities across every major operating system and browser, including one that had existed in OpenBSD for 27 years. If China had enough compute to train and deploy a model like Mythos at scale before the US could prepare, the cyber-offensive capabilities would be devastating.

    Jensen’s response was direct. Mythos was trained on “fairly mundane capacity” that is already abundantly available in China. The amount of compute is not the bottleneck for that kind of breakthrough. Great computer science is, and China has no shortage of brilliant AI researchers. He pointed to DeepSeek as evidence: most advances in AI come from algorithmic innovation, not raw hardware. If China’s researchers can achieve breakthroughs like DeepSeek with limited hardware, imagine what they could do with more.

    Jensen also argued for dialogue over confrontation. He said it is essential that American and Chinese AI researchers are talking to each other, and that both countries agree on what AI should not be used for. The idea that you can prevent AI risks by cutting off chip sales, when the real advances come from algorithms and computer science, reflects a fundamental misunderstanding of how AI progress works.

    The debate ended without resolution, but Jensen’s final point was sharp: “I’m not talking to somebody who woke up a loser. That loser attitude, that loser premise, makes no sense to me.”

    Why Not Multiple Chip Architectures?

    Near the end of the interview, Dwarkesh asked why Nvidia does not run multiple parallel chip projects with different architectures, like a Cerebras-style wafer-scale design or a Dojo-style huge package, or even one without CUDA.

    Jensen’s answer was simple: “We don’t have a better idea.” Nvidia simulates all of these alternative approaches in its internal simulators and they are provably worse. The company works on exactly the projects it wants to work on. If the workload were to change dramatically (not just the algorithms, but the actual market shape), Nvidia might add other accelerators.

    In fact, Nvidia recently did exactly this by acquiring Groq. The inference market is now segmenting into different tiers. Some customers will pay premium prices for extremely fast response times even if throughput is lower. This creates a new “high ASP token” segment that justifies a different point on the performance curve. But Jensen was clear: if he had more money, he would put it all behind Nvidia’s existing architecture, not diversify into alternatives.

    Nvidia Without AI

    Jensen closed by saying that even if the deep learning revolution had never happened, Nvidia would be “very, very large.” The premise of the company has always been that general-purpose computing cannot scale indefinitely and that domain-specific acceleration is the way forward. Molecular dynamics, seismic processing, image processing, computational lithography, quantum chemistry, and data processing all benefit from GPU acceleration regardless of AI. Jensen said the fundamental promise of accelerated computing has not changed “not even a little bit.”

    Thoughts

    This interview is one of the most revealing Jensen Huang conversations in years, partly because Dwarkesh actually pushes back instead of lobbing softballs. A few things stand out.

    The Anthropic regret is real and significant. Jensen is essentially admitting that Nvidia’s biggest strategic miss of the AI era was not understanding that foundation model companies needed supplier-level capital commitments, not VC funding. The fact that Google and AWS used compute investments to lock in Anthropic’s architecture choices has had downstream consequences that Nvidia is still working to unwind. When Jensen says Anthropic is “a unique instance, not a trend” for TPU adoption, he is simultaneously downplaying the threat and revealing exactly how seriously he takes it.

    The China debate is the highlight. Jensen’s argument is more nuanced than it first appears. He is not saying “sell China everything.” He is saying the current binary approach of near-total restriction has backfired by accelerating China’s domestic chip industry and pushing the Chinese AI ecosystem away from the American tech stack. His comparison to the US telecom industry losing global market share to Huawei is pointed and historically grounded. Whether you agree with his conclusion or not, the framing of AI as a five-layer stack where the US needs to compete at every layer is a useful mental model.

    The “electrons to tokens” framing is Jensen at his best. It is a simple metaphor that captures something genuinely complex about where value is created in the AI supply chain. And his insistence that the transformation is “far from deeply understood” is a subtle way of arguing that Nvidia’s competitive position will be durable because the problem space is not close to being solved.

    The Groq acquisition reveal is interesting for what it signals about the inference market. If Nvidia is creating a separate product tier for premium-priced, low-latency tokens, it suggests the company sees inference economics fragmenting significantly. This aligns with the broader trend of AI becoming an enterprise product where different customers have wildly different willingness to pay based on how they use tokens.

    Finally, Jensen’s refusal to diversify chip architectures is a bold bet. “We simulate it all in our simulator, provably worse” is an incredibly confident statement. History is full of companies that were right until they were not. But Nvidia’s track record of 50x generation-over-generation improvements through co-design across processors, fabric, libraries, and algorithms is hard to argue with. The question is whether the current paradigm of transformer-based models on GPU clusters represents a local or global optimum for AI compute.

  • OpenAI Hires OpenClaw Creator Peter Steinberger: A Major Shift in the AI Agent Race

    OpenAI Hires OpenClaw Creator Peter Steinberger

    In a move that underscores the intensifying race to dominate AI agent technology, OpenAI has brought aboard Peter Steinberger, the visionary Austrian developer behind the viral open-source project OpenClaw. As reported by Reuters, Fortune, and TechCrunch, the deal was announced on February 15, 2026. This isn’t a conventional acquisition but an “acquihire,” where Steinberger joins OpenAI to spearhead the development of next-generation personal AI agents.

    Meanwhile, OpenClaw transitions to an independent foundation, remaining fully open-source with continued support from OpenAI (confirmed via Steinberger’s Blog and LinkedIn). This strategic alignment comes amid soaring interest in AI agents, a market projected by AInvest to hit $52.6 billion by 2030 with a 46.3% compound annual growth rate.

    The announcement, made via a post on X by OpenAI CEO Sam Altman around 21:39 GMT, arrived just hours before widespread media coverage from outlets like Fortune. Steinberger swiftly confirmed the news in a personal blog post, emphasizing his excitement for the future while reaffirming OpenClaw’s independence.

    The Rise of OpenClaw: From Playground Project to Phenomenon

    OpenClaw, originally launched as Clawdbot in November 2025—a playful nod to Anthropic’s Claude model—quickly evolved into a powerhouse open-source AI agent framework designed for personal use (Fortune, Steinberger’s Blog, APIYI). Steinberger, who “vibe coded” the project solo after a three-year hiatus following the sale of his previous company for over $100 million, saw it explode in popularity. It amassed over 100,000 GitHub stars, drew 2 million visitors in a week, and became the fastest-growing repo in GitHub history—surpassing milestones of projects like React and Linux (Yahoo Finance, LinkedIn).

    A trademark dispute with Anthropic prompted renames: first to Moltbot (evoking metamorphosis), then to OpenClaw in early 2026. The framework empowers AI to autonomously handle tasks on users’ devices, fostering a community focused on data ownership and multi-model support.

    Key capabilities that fueled its hype include:

    • Managing emails and inboxes.
    • Booking flights, restaurant reservations, and flight check-ins.
    • Interacting with services like insurers.
    • Integrating with apps such as WhatsApp and Slack for task delegation.
    • Creating a “social network” for AI agents via features like Moltbook, which spawned 1.6 million agents (Source).

    Despite its success, sustainability proved challenging. Steinberger personally shouldered infrastructure costs of $10,000 to $20,000 monthly, routing sponsorships to dependencies rather than himself, even as donations and corporate support (including from OpenAI) trickled in.

    The Path to the Deal: Billion-Dollar Bids and Open-Source Principles

    Prior to the announcement, Steinberger fielded billion-dollar acquisition offers from tech giants Meta and OpenAI (Yahoo Finance). Meta’s Mark Zuckerberg personally messaged Steinberger on WhatsApp, sparking a 10-minute debate over AI models, while OpenAI’s Sam Altman offered computational resources via a Cerebras partnership to boost agent performance. Meta aggressively pursued Steinberger and his team, but OpenAI advanced in talks to hire him and key contributors.

    Steinberger spent the preceding week in San Francisco meeting AI labs, accessing unreleased research. He insisted any deal preserve OpenClaw’s open-source nature, likening it to Chrome and Chromium. Ultimately, OpenAI’s vision aligned best with his goal of accessible agents.

    Key Announcements and Voices from the Frontlines

    Sam Altman, in his X post on February 15, 2026, hailed Steinberger as a “genius with a lot of amazing ideas about the future of very smart agents interacting with each other to do very useful things for people.” He added, “We expect this will quickly become core to our product offerings. OpenClaw will live in a foundation as an open source project that OpenAI will continue to support. The future is going to be extremely multi-agent and it’s important to us to support open source as part of that.”

    Steinberger’s blog post echoed this enthusiasm: “tl;dr: I’m joining OpenAI to work on bringing agents to everyone. OpenClaw will move to a foundation and stay open and independent. The last month was a whirlwind… When I started exploring AI, my goal was to have fun and inspire people… My next mission is to build an agent that even my mum can use… I’m a builder at heart… What I want is to change the world, not build a large company… The claw is the law.”

    Strategic Implications: Opportunities and Challenges Ahead

    For OpenAI, this bolsters their AI agent push, potentially accelerating consumer-grade solutions and addressing barriers like setup complexity and security. It positions them in the “personal agent race” against Meta, emphasizing multi-agent systems. The broader AI agents market could reach $180 billion by 2033, driving undisclosed but likely substantial financial terms.

    OpenClaw benefits from foundation status (akin to the Linux Foundation), ensuring independence and community focus with OpenAI’s sponsorship.

    However, risks loom large. OpenClaw’s “unfettered access” to devices raises security concerns, including data breaches and rogue actions—like one incident of spamming hundreds of iMessages. China’s industry ministry warned of cyberattack vulnerabilities if misconfigured. Steinberger aims to prioritize safety and accessibility.

    Community Pulse: Excitement, Skepticism, and Satire

    Reactions on X blend hype and caution. Cointelegraph noted the move as a “big move” for ecosystems. One user called it the “birth of the agent era,” while another satirically predicted a shift to “ClosedClaw.” Fears of closure persist, but congratulations abound, with some viewing Anthropic’s trademark push as a “fumble.”

    LinkedIn’s Reyhan Merekar praised Steinberger’s solo feat: “Literally coding alone at odd hours… Faster than React, Linux, and Kubernetes combined.”

    Beyond the Headlines: Vision and Value

    Steinberger’s core vision: Agents for all, even non-tech users, with emphasis on safety, cutting-edge models, and impact over empire-building. OpenClaw’s strengths—model-agnostic design, delegation-focused UX, and persistent memory—eluded even well-funded labs.

    As of February 15, 2026, this marks a pivotal moment in AI’s evolution, blending open innovation with corporate muscle. No further updates have emerged, but the multi-agent future Altman envisions is accelerating.

  • Jensen Huang on Joe Rogan: AI’s Future, Nuclear Energy, and NVIDIA’s Near-Death Origin Story

    In a landmark episode of the Joe Rogan Experience (JRE #2422), NVIDIA CEO Jensen Huang sat down for a rare, deep-dive conversation covering everything from the granular history of the GPU to the philosophical implications of artificial general intelligence. Huang, currently the longest-running tech CEO in the world, offered a fascinating look behind the curtain of the world’s most valuable company.

    For those who don’t have three hours to spare, we’ve compiled the “Too Long; Didn’t Watch” breakdown, key takeaways, and a detailed summary of this historic conversation.

    TL;DW (Too Long; Didn’t Watch)

    • The OpenAI Connection: Jensen personally delivered the first AI supercomputer (DGX-1) to Elon Musk and the OpenAI team in 2016, a pivotal moment that kickstarted the modern AI race.
    • The “Sega Moment”: NVIDIA almost went bankrupt in 1995. They were saved only because the CEO of Sega invested $5 million in them after Jensen admitted their technology was flawed and the contract needed to be broken.
    • Nuclear AI: Huang predicts that within the next decade, AI factories (data centers) will likely be powered by small, on-site nuclear reactors to handle immense energy demands.
    • Driven by Fear: Despite his success, Huang wakes up every morning with a “fear of failure” rather than a desire for success. He believes this anxiety is essential for survival in the tech industry.
    • The Immigrant Hustle: Huang’s childhood involved moving from Thailand to a reform school in rural Kentucky where he cleaned toilets and smoked cigarettes at age nine to fit in.

    Key Takeaways

    1. AI as a “Universal Function Approximator”

    Huang provided one of the most lucid non-technical explanations of deep learning to date. He described AI not just as a chatbot, but as a “universal function approximator.” While traditional software requires humans to write the function (input -> code -> output), AI flips this. You give it the input and the desired output, and the neural network figures out the function in the middle. This allows computers to solve problems for which humans cannot write the code, such as curing diseases or solving complex physics.

    2. The Future of Work and Energy

    The conversation touched heavily on resources. Huang noted that we are in a transition from “Moore’s Law” (doubling performance) to “Huang’s Law” (accelerated computing), where the cost of computing drops while energy efficiency skyrockets. However, the sheer scale of AI requires massive power. He envisions a future of “energy abundance” driven by nuclear power, which will support the massive “AI factories” of the future.

    3. Safety Through “Smartness”

    Addressing Rogan’s concerns about AI safety and rogue sentience, Huang argued that “smarter is safer.” He compared AI to cars: a 1,000-horsepower car is safer than a Model T because the technology is channeled into braking, handling, and safety systems. Similarly, future computing power will be channeled into “reflection” and “fact-checking” before an AI gives an answer, reducing hallucinations and danger.

    Detailed Summary

    The Origin of the AI Boom

    The interview began with a look back at the relationship between NVIDIA and Elon Musk. In 2016, NVIDIA spent billions developing the DGX-1 supercomputer. At the time, no one understood it or wanted to buy it—except Musk. Jensen personally delivered the first unit to a small office in San Francisco where the OpenAI team (including Ilya Sutskever) was working. That hardware trained the early models that eventually became ChatGPT.

    The “Struggle” and the Sega Pivot

    Perhaps the most compelling part of the interview was Huang’s recounting of NVIDIA’s early days. In 1995, NVIDIA was building 3D graphics chips using “forward texture mapping” and curved surfaces—a strategy that turned out to be technically wrong compared to the industry standard. Facing bankruptcy, Huang had to tell his only major partner, Sega, that NVIDIA could not complete their console contract.

    In a move that saved the company, the CEO of Sega, who liked Jensen personally, agreed to invest the remaining $5 million of their contract into NVIDIA anyway. Jensen used that money to pivot, buying an emulator to test a new chip architecture (RIVA 128) that eventually revolutionized PC gaming. Huang admits that without that act of kindness and luck, NVIDIA would not exist today.

    From Kentucky to Silicon Valley

    Huang shared his “American Dream” story. Born in Taiwan and raised in Thailand, his parents sent him and his brother to the U.S. for safety during civil unrest. Due to a misunderstanding, they were enrolled in the Oneida Baptist Institute in Kentucky, which turned out to be a reform school for troubled youth. Huang described a rough upbringing where he was the youngest student, his roommate was a 17-year-old recovering from a knife fight, and he was responsible for cleaning the dorm toilets. He credits these hardships with giving him a high tolerance for pain and suffering—traits he says are required for entrepreneurship.

    The Philosophy of Leadership

    When asked how he stays motivated as the head of a trillion-dollar company, Huang gave a surprising answer: “I have a greater drive from not wanting to fail than the drive of wanting to succeed.” He described living in a constant state of “low-grade anxiety” that the company is 30 days away from going out of business. This paranoia, he argues, keeps the company honest, grounded, and agile enough to “surf the waves” of technological chaos.

    Some Thoughts

    What stands out most in this interview is the lack of “tech messiah” complex often seen in Silicon Valley. Jensen Huang does not present himself as a visionary who saw it all coming. Instead, he presents himself as a survivor—someone who was wrong about technology multiple times, who was saved by the grace of a Japanese executive, and who lucked into the AI boom because researchers happened to buy NVIDIA gaming cards to train neural networks.

    This humility, combined with the technical depth of how NVIDIA is re-architecting the world’s computing infrastructure, makes this one of the most essential JRE episodes for understanding where the future is heading. It serves as a reminder that the “overnight success” of AI is actually the result of 30 years of near-failures, pivots, and relentless problem-solving.

  • Meta Review: GPT-5.1 – A Step Forward or a Filtered Facelift?

    TL;DR:

    OpenAI’s GPT-5.1, rolling out starting November 13, 2025, enhances the GPT-5 series with warmer tones, adaptive reasoning, and refined personality styles, praised for better instruction-following and efficiency. However, some users criticize its filtered authenticity compared to GPT-4o, fueling #keep4o campaigns. Overall X sentiment: 60% positive for utility, but mixed on emotional depth—7.5/10.

    Introduction

    OpenAI’s GPT-5.1, announced and beginning rollout on November 13, 2025, upgrades the GPT-5 series to be “smarter, more reliable, and a lot more conversational.” It features two variants: GPT-5.1 Instant for quick, warm everyday interactions with improved instruction-following, and GPT-5.1 Thinking for complex reasoning with dynamic thinking depth. Key additions include refined personality presets (e.g., Friendly, Professional, Quirky) and granular controls for warmth, conciseness, and more. The rollout starts with paid tiers (Pro, Plus, Go, Business), extending to free users soon, with legacy GPT-5 models available for three months. API versions launch later this week. Drawing from over 100 X posts (each with at least 5 likes) and official details from OpenAI’s announcement, this meta review captures a community vibe of excitement for refinements tempered by frustration over perceived regressions, especially versus GPT-4o’s unfiltered charm. Sentiment tilts positive (60% highlight gains), but #keep4o underscores a push for authenticity.

    Key Strengths: Where GPT-5.1 Shines

    Users and official benchmarks praise GPT-5.1 for surpassing GPT-5’s rigidity, delivering more human-like versatility. Officially, it excels in math (AIME 2025) and coding (Codeforces) evaluations, with adaptive reasoning deciding when to “think” deeper for accuracy without sacrificing speed on simple tasks.

    • Superior Instruction-Following and Adaptability: Tops feedback, with strict prompt adherence (e.g., exact word counts). Tests show 100% compliance vs. rivals’ 50%. Adaptive reasoning varies depth: quick for basics, thorough for math/coding, reducing errors in finances or riddles. OpenAI highlights examples like precise six-word responses.
    • Warmer, More Natural Conversations: The “heart” upgrade boosts EQ and empathy, making responses playful and contextual over long chats. It outperforms Claude 4.5 Sonnet on EQ-Bench for flow. Content creators note engaging, cliché-free outputs. Official demos show empathetic handling of scenarios like spills, with reassurance and advice.
    • Customization and Efficiency: Refined presets include Default (balanced), Friendly (warm, chatty), Efficient (concise), Professional (polished), Candid (direct), Quirky (playful), Cynical, and Nerdy. Sliders tweak warmth, emojis, etc. Memory resolves conflicts naturally; deleted info stays gone. Speed gains (e.g., 30% faster searches) and 196K token windows aid productivity. GPT-5.1 Auto routes queries optimally.
    AspectCommunity HighlightsExample User Feedback
    Instruction-FollowingPrecise adherence to limits and styles“100% accurate on word-count prompts—game-changer for coding.”
    Conversational FlowWarmer, empathetic tone“Feels like chatting with a smart friend, not a bot.”
    CustomizationRefined presets and sliders enhance usability“Friendly mode is spot-on for casual use; no more robotic replies.”
    EfficiencyFaster on complex tasks with adaptive depth“PDF summaries in seconds—beats GPT-5 by miles.”

    These align with OpenAI’s claims, positioning GPT-5.1 as a refined tool for pros, writers, and casuals, with clearer, jargon-free explanations (e.g., simpler sports stats breakdowns).

    Pain Points: The Backlash and Shortcomings

    Not all are sold; 40% of posts call it a “minor patch” amid Gemini 3.0 competition. #keep4o reflects longing for GPT-4o’s “spark,” with official warmth seen by some as over-polished.

    • Filtered and Less Authentic Feel: “Safety ceilings” make it feel simulated; leaked prompts handle “delusional” queries cautiously, viewed as censorship. Users feel stigmatized, contrasting GPT-4o’s genuine vibe, accusing OpenAI of erasing “soul” for liability.
    • No Major Intelligence Leap: Adaptive thinking helps, but tests falter on simulations or formatting. No immediate API Codex; “juice” metric dips. Rivals like Claude 4.5 lead in empathy/nuance. Official naming as “5.1” admits incremental gains.
    • Rollout Glitches and Legacy Concerns: Chats mimic GPT-5.1 on GPT-4o; voice stays GPT-4o-based. Enterprise gets early toggle (off default). Some miss unbridled connections, seeing updates as paternalistic. Legacy GPT-5 sunsets in three months.
    AspectCommunity CriticismsExample User Feedback
    AuthenticityOver-filtered, simulated feel“It’s compliance over connection—feels creepy.”
    IntelligenceMinor upgrades, no wow factor“Shines in benchmarks but flops on real tasks like video directs.”
    AccessibilityDelayed API; rollout bugs“Why no Codex? And my 4o chats are contaminated.”
    ComparisonsLags behind Claude/Gemini in EQ“Claude 4.5 for empathy; GPT-5.1 is just solid, not special.”

    This tension: Tech users love tweaks, but raw AI seekers feel alienated. OpenAI’s safety card addendum addresses mitigations.

    Comparisons and Broader Context

    GPT-5.1 vs. peers:

    • Vs. Claude 4.5 Sonnet: Edges in instruction-following but trails in writing/empathy; users switch for “human taste.”
    • Vs. Gemini 2.5/3.0: Quicker but less affable; timing counters competition.
    • Vs. GPT-4o/GPT-5: Warmer than GPT-5, but lacks 4o’s freedom, driving #keep4o. Official examples show clearer, empathetic responses vs. GPT-5’s formality.

    Links to ecosystems like Marble (3D) or agents hint at multi-modal roles. Finetuning experiments roll out gradually.

    A Polarizing Upgrade with Promise

    X’s vibe: Optimistic yet split—a “nice upgrade” for efficiency, “step back” for authenticity. Scores 7.5/10: Utility strong, soul middling. With refinements like Codex and ignoring #keep4o risks churn. AI progress balances smarts and feel. Test presets/prompts; personalization unlocks magic.

  • Inside Microsoft’s AGI Masterplan: Satya Nadella Reveals the 50-Year Bet That Will Redefine Computing, Capital, and Control

    1) Fairwater 2 is live at unprecedented scale, with Fairwater 4 linking over a 1 Pb AI WAN

    Nadella walks through the new Fairwater 2 site and states Microsoft has targeted a 10x training capacity increase every 18 to 24 months relative to GPT-5’s compute. He also notes Fairwater 4 will connect on a one petabit network, enabling multi-site aggregation for frontier training, data generation, and inference.

    2) Microsoft’s MAI program, a parallel superintelligence effort alongside OpenAI

    Microsoft is standing up its own frontier lab and will “continue to drop” models in the open, with an omni-model on the roadmap and high-profile hires joining Mustafa Suleyman. This is a clear signal that Microsoft intends to compete at the top tier while still leveraging OpenAI models in products.

    3) Clarification on IP: Microsoft says it has full access to the GPT family’s IP

    Nadella says Microsoft has access to all of OpenAI’s model IP (consumer hardware excluded) and shared that the firms co-developed system-level designs for supercomputers. This resolves long-standing ambiguity about who holds rights to GPT-class systems.

    4) New exclusivity boundaries: OpenAI’s API is Azure-exclusive, SaaS can run elsewhere with limited exceptions

    The interview spells out that OpenAI’s platform API must run on Azure. ChatGPT as SaaS can be hosted elsewhere only under specific carve-outs, for example certain US government cases.

    5) Per-agent future for Microsoft’s business model

    Nadella describes a shift where companies provision Windows 365 style computers for autonomous agents. Licensing and provisioning evolve from per-user to per-user plus per-agent, with identity, security, storage, and observability provided as the substrate.

    6) The 2024–2025 capacity “pause” explained

    Nadella confirms Microsoft paused or dropped some leases in the second half of last year to avoid lock-in to a single accelerator generation, keep the fleet fungible across GB200, GB300, and future parts, and balance training with global serving to match monetization.

    7) Concrete scaling cadence disclosure

    The 10x training capacity target every 18 to 24 months is stated on the record while touring Fairwater 2. This implies the next frontier runs will be roughly an order of magnitude above GPT-5 compute.

    8) Multi-model, multi-supplier posture

    Microsoft will keep using OpenAI models in products for years, build MAI models in parallel, and integrate other frontier models where product quality or cost warrants it.

    Why these points matter

    • Industrial scale: Fairwater’s disclosed networking and capacity targets set a new bar for AI factories and imply rapid model scaling.
    • Strategic independence: MAI plus GPT IP access gives Microsoft a dual track that reduces single-partner risk.
    • Ecosystem control: Azure exclusivity for OpenAI’s API consolidates platform power at the infrastructure layer.
    • New revenue primitives: Per-agent provisioning reframes Microsoft’s core metrics and pricing.

    Pull quotes

      “We’ve tried to 10x the training capacity every 18 to 24 months.”

      “The API is Azure-exclusive. The SaaS business can run anywhere, with a few exceptions.”

      “We have access to the GPT family’s IP.”

    TL;DW

    • Microsoft is building a global network of AI super-datacenters (Fairwater 2 and beyond) designed for fast upgrade cycles and cross-region training at petabit scale.
    • Strategy spans three layers: infrastructure, models, and application scaffolding, so Microsoft creates value regardless of which model wins.
    • AI economics shift margins, so Microsoft blends subscriptions with metered consumption and focuses on tokens per dollar per watt.
    • Future includes autonomous agents that get provisioned like users with identity, security, storage, and observability.
    • Trust and sovereignty are central. Microsoft leans into compliant, sovereign cloud footprints to win globally.

    Detailed Summary

    1) Fairwater 2: AI Superfactory

    Microsoft’s Fairwater 2 is presented as the most powerful AI datacenter yet, packing hundreds of thousands of GB200 and GB300 accelerators, tied by a petabit AI WAN and designed to stitch training jobs across buildings and regions. The key lesson: keep the fleet fungible and avoid overbuilding for a single hardware generation as power density and cooling change with each wave like Vera Rubin and Rubin Ultra.

    2) The Three-Layer Strategy

    • Infrastructure: Azure’s hyperscale footprint, tuned for training, data generation, and inference, with strict flexibility across model architectures.
    • Models: Access to OpenAI’s GPT family for seven years plus Microsoft’s own MAI roadmap for text, image, and audio, moving toward an omni-model.
    • Application Scaffolding: Copilots and agent frameworks like GitHub’s Agent HQ and Mission Control that orchestrate many agents on real repos and workflows.

    This layered approach lets Microsoft compete whether the value accrues to models, tooling, or infrastructure.

    3) Business Models and Margins

    AI raises COGS relative to classic SaaS, so pricing blends entitlements with consumption tiers. GitHub Copilot helped catalyze a multibillion market in a year, even as rivals emerged. Microsoft aims to ride a market that is expanding 10x rather than clinging to legacy share. Efficiency focus: tokens per dollar per watt through software optimization as much as hardware.

    4) Copilot, GitHub, and Agent Control Planes

    GitHub becomes the control plane for multi-agent development. Agent HQ and Mission Control aim to let teams launch, steer, and observe multiple agents working in branches, with repo-native primitives for issues, actions, and reviews.

    5) Models vs Scaffolding

    Nadella argues model monopolies are checked by open source and substitution. Durable value sits in the scaffolding layer that brings context, data liquidity, compliance, and deep tool knowledge, exemplified by Excel Agent that understands formulas and artifacts beyond screen pixels.

    6) Rise of Autonomous Agents

    Two worlds emerge: human-in-the-loop Copilots and fully autonomous agents. Microsoft plans to provision agents with computers, identity, security, storage, and observability, evolving end-user software into an infrastructure business for agents as well as people.

    7) MAI: Microsoft’s In-House Frontier Effort

    Microsoft is assembling a top-tier lab led by Mustafa Suleyman and veterans from DeepMind and Google. Early MAI models show progress in multimodal arenas. The plan is to combine OpenAI access with independent research and product-optimized models for latency and cost.

    8) Capex and Industrial Transformation

    Capex has surged. Microsoft frames this era as capital intensive and knowledge intensive. Software scheduling, workload placement, and continual throughput improvements are essential to maximize returns on a fleet that upgrades every 18 to 24 months.

    9) The Lease Pause and Flexibility

    Microsoft paused some leases to avoid single-generation lock-in and to prevent over-reliance on a small number of mega-customers. The portfolio favors global diversity, regulatory alignment, balanced training and inference, and location choices that respect sovereignty and latency needs.

    10) Chips and Systems

    Custom silicon like Maia will scale in lockstep with Microsoft’s own models and OpenAI collaboration, while Nvidia remains central. The bar for any new accelerator is total fleet TCO, not just raw performance, and system design is co-evolved with model needs.

    11) Sovereign AI and Trust

    Nations want AI benefits with continuity and control. Microsoft’s approach combines sovereign cloud patterns, data residency, confidential computing, and compliance so countries can adopt leading AI while managing concentration risk. Nadella emphasizes trust in American technology and institutions as a decisive global advantage.


    Key Takeaways

    1. Build for flexibility: Datacenters, pricing, and software are optimized for fast evolution and multi-model support.
    2. Three-layer stack wins: Infrastructure, models, and scaffolding compound each other and hedge against shifts in where value accrues.
    3. Agents are the next platform: Provisioned like users with identity and observability, agents will demand a new kind of enterprise infrastructure.
    4. Efficiency is king: Tokens per dollar per watt drives margins more than any single chip choice.
    5. Trust and sovereignty matter: Compliance and credible guarantees are strategic differentiators in a bipolar world.
  • All-In Podcast Breaks Down OpenAI’s Turbulent Week, the AI Arms Race, and Socialism’s Surge in America

    November 8, 2025

    In the latest episode of the All-In Podcast, aired on November 7, 2025, hosts Jason Calacanis, Chamath Palihapitiya, David Sacks, and guest Brad Gerstner (with David Friedberg absent) delivered a packed discussion on the tech world’s hottest topics. From OpenAI’s public relations mishaps and massive infrastructure bets to the intensifying U.S.-China AI rivalry, market volatility, and the surprising rise of socialism in U.S. politics, the episode painted a vivid picture of an industry at a crossroads. Here’s a deep dive into the key takeaways.

    OpenAI’s “Rough Week”: From Altman’s Feistiness to CFO’s Backstop Blunder

    The podcast kicked off with a spotlight on OpenAI, which has been under intense scrutiny following CEO Sam Altman’s appearance on the BG2 podcast. Gerstner, who hosts BG2, recounted asking Altman about OpenAI’s reported $13 billion in revenue juxtaposed against $1.4 trillion in spending commitments for data centers and infrastructure. Altman’s response—offering to find buyers for Gerstner’s shares if he was unhappy—went viral, sparking debates about OpenAI’s financial health and the broader AI “bubble.”

    Gerstner defended the question as “mundane” and fair, noting that Altman later clarified OpenAI’s revenue is growing steeply, projecting a $20 billion run rate by year’s end. Palihapitiya downplayed the market’s reaction, attributing stock dips in companies like Microsoft and Nvidia to natural “risk-off” cycles rather than OpenAI-specific drama. “Every now and then you have a bad day,” he said, suggesting Altman might regret his tone but emphasizing broader market dynamics.

    The conversation escalated with OpenAI CFO Sarah Friar’s Wall Street Journal comments hoping for a U.S. government “backstop” to finance infrastructure. This fueled bailout rumors, prompting Friar to clarify she meant public-private partnerships for industrial capacity, not direct aid. Sacks, recently appointed as the White House AI “czar,” emphatically stated, “There’s not going to be a federal bailout for AI.” He praised the sector’s competitiveness, noting rivals like Grok, Claude, and Gemini ensure no single player is “too big to fail.”

    The hosts debated OpenAI’s revenue model, with Calacanis highlighting its consumer-heavy focus (estimated 75% from subscriptions like ChatGPT Plus at $240/year) versus competitors like Anthropic’s API-driven enterprise approach. Gerstner expressed optimism in the “AI supercycle,” betting on long-term growth despite headwinds like free alternatives from Google and Apple.

    The AI Race: Jensen Huang’s Warning and the Call for Federal Unity

    Shifting gears, the panel addressed Nvidia CEO Jensen Huang’s stark prediction to the Financial Times: “China is going to win the AI race.” Huang cited U.S. regulatory hurdles and power constraints as key obstacles, contrasting with China’s centralized support for GPUs and data centers.

    Gerstner echoed Huang’s call for acceleration, praising federal efforts to clear regulatory barriers for power infrastructure. Palihapitiya warned of Chinese open-source models like Qwen gaining traction, as seen in products like Cursor 2.0. Sacks advocated for a federal AI framework to preempt a patchwork of state regulations, arguing blue states like California and New York could impose “ideological capture” via DEI mandates disguised as anti-discrimination rules. “We need federal preemption,” he urged, invoking the Commerce Clause to ensure a unified national market.

    Calacanis tied this to environmental successes like California’s emissions standards but cautioned against overregulation stifling innovation. The consensus: Without streamlined permitting and behind-the-meter power generation, the U.S. risks ceding ground to China.

    Market Woes: Consumer Cracks, Layoffs, and the AI Job Debate

    The discussion turned to broader economic signals, with Gerstner highlighting a “two-tier economy” where high-end consumers thrive while lower-income groups falter. Credit card delinquencies at 2009 levels, regional bank rollovers, and earnings beats tempered by cautious forecasts painted a picture of volatility. Palihapitiya attributed recent market dips to year-end rebalancing, not AI hype, predicting a “risk-on” rebound by February.

    A heated exchange ensued over layoffs and unemployment, particularly among 20-24-year-olds (at 9.2%). Calacanis attributed spikes to AI displacing entry-level white-collar jobs, citing startup trends and software deployments. Sacks countered with data showing stable white-collar employment percentages, calling AI blame “anecdotal” and suggesting factors like unemployable “woke” degrees or over-hiring during zero-interest-rate policies (ZIRP). Gerstner aligned with Sacks, noting companies’ shift to “flatter is faster” efficiency cultures, per Morgan Stanley analysis.

    Inflation ticking up to 3% was flagged as a barrier to rate cuts, with Calacanis criticizing the administration for downplaying it. Trump’s net approval rating has dipped to -13%, with 65% of Americans feeling he’s fallen short on middle-class issues. Palihapitiya called for domestic wins, like using trade deal funds (e.g., $3.2 trillion from Japan and allies) to boost earnings.

    Socialism’s Rise: Mamdani’s NYC Win and the Filibuster Nuclear Option

    The episode’s most provocative segment analyzed Democratic socialist Zohran Mamdani’s upset victory as New York City’s mayor-elect. Mamdani, promising rent freezes, free transit, and higher taxes on the rich (pushing rates to 54%), won narrowly at 50.4%. Calacanis noted polling showed strong support from young women and recent transplants, while native New Yorkers largely rejected him.

    Palihapitiya linked this to a “broken generational compact,” quoting Peter Thiel on student debt and housing unaffordability fueling anti-capitalist sentiment. He advocated reforming student loans via market pricing and even expressed newfound sympathy for forgiveness—if tied to systemic overhaul. Sacks warned of Democrats shifting left, with “centrist” figures like Joe Manchin and Kyrsten Sinema exiting, leaving energy with revolutionaries. He tied this to the ongoing government shutdown, blaming Democrats’ filibuster leverage and urging Republicans to eliminate it for a “nuclear option” to pass reforms.

    Gerstner, fresh from debating “ban the billionaires” at Stanford (where many students initially favored it), stressed Republicans must address affordability through policies like no taxes on tips or overtime. He predicted an A/B test: San Francisco’s centrist turnaround versus New York’s potential chaos under Mamdani.

    Holiday Cheer and Final Thoughts

    Amid the heavy topics, the hosts plugged their All-In Holiday Spectacular on December 6, promising comedy roasts by Kill Tony, poker, and open bar. Calacanis shared updates on his Founder University expansions to Saudi Arabia and Japan.

    Overall, the episode underscored optimism in AI’s transformative potential tempered by real-world challenges: financial scrutiny, geopolitical rivalry, economic inequality, and political polarization. As Gerstner put it, “Time is on your side if you’re betting over a five- to 10-year horizon.” With Trump’s mandate in play, the panel urged swift action to secure America’s edge—or risk socialism’s further ascent.

  • The Benefits of Bubbles: Why the AI Boom’s Madness Is Humanity’s Shortcut to Progress

    TL;DR:

    Ben Thompson’s “The Benefits of Bubbles” argues that financial manias like today’s AI boom, while destined to burst, play a crucial role in accelerating innovation and infrastructure. Drawing on Carlota Perez and the newer work of Byrne Hobart and Tobias Huber, Thompson contends that bubbles aren’t just speculative excess—they’re coordination mechanisms that align capital, talent, and belief around transformative technologies. Even when they collapse, the lasting payoff is progress.

    Summary

    Ben Thompson revisits the classic question: are bubbles inherently bad? His answer is nuanced. Yes, bubbles pop. But they also build. Thompson situates the current AI explosion—OpenAI’s trillion-dollar commitments and hyperscaler spending sprees—within the historical pattern described by Carlota Perez in Technological Revolutions and Financial Capital. Perez’s thesis: every major technological revolution begins with an “Installation Phase” fueled by speculation and waste. The bubble funds infrastructure that outlasts its financiers, paving the way for a “Deployment Phase” where society reaps the benefits.

    Thompson extends this logic using Byrne Hobart and Tobias Huber’s concept of “Inflection Bubbles,” which he contrasts with destructive “Mean-Reversion Bubbles” like subprime mortgages. Inflection bubbles occur when investors bet that the future will be radically different, not just marginally improved. The dot-com bubble, for instance, built the Internet’s cognitive and physical backbone—from fiber networks to AJAX-driven interactivity—that enabled the next two decades of growth.

    Applied to AI, Thompson sees similar dynamics. The bubble is creating massive investment in GPUs, fabs, and—most importantly—power generation. Unlike chips, which decay quickly, energy infrastructure lasts decades and underpins future innovation. Microsoft, Amazon, and others are already building gigawatts of new capacity, potentially spurring a long-overdue resurgence in energy growth. This, Thompson suggests, may become the “railroads and power plants” of the AI age.

    He also highlights AI’s “cognitive capacity payoff.” As everyone from startups to Chinese labs works on AI, knowledge diffusion is near-instantaneous, driving rapid iteration. Investment bubbles fund parallel experimentation—new chip architectures, lithography startups, and fundamental rethinks of computing models. Even failures accelerate collective learning. Hobart and Huber call this “parallelized innovation”: bubbles compress decades of progress into a few intense years through shared belief and FOMO-driven coordination.

    Thompson concludes with a warning against stagnation. He contrasts the AI mania with the risk-aversion of the 2010s, when Big Tech calcified and innovation slowed. Bubbles, for all their chaos, restore the “spiritual energy” of creation—a willingness to take irrational risks for something new. While the AI boom will eventually deflate, its benefits, like power infrastructure and new computing paradigms, may endure for generations.

    Key Takeaways

    • Bubbles are essential accelerators. They fund infrastructure and innovation that rational markets never would.
    • Carlota Perez’s “Installation Phase” framework explains how speculative capital lays the groundwork for future growth.
    • Inflection bubbles drive paradigm shifts. They aren’t about small improvements—they bet on orders-of-magnitude change.
    • The AI bubble is building the real economy. Fabs, power plants, and chip ecosystems are long-term assets disguised as mania.
    • Cognitive capacity grows in parallel. When everyone builds simultaneously, progress compounds across fields.
    • FOMO has a purpose. Speculative energy coordinates capital and creativity at scale.
    • Stagnation is the alternative. Without bubbles, societies drift toward safety, bureaucracy, and creative paralysis.
    • The true payoff of AI may be infrastructure. Power generation, not GPUs, could be the era’s lasting legacy.
    • Belief drives progress. Mania is a social technology for collective imagination.

    1-Sentence Summary:

    Ben Thompson argues that the AI boom is a classic “inflection bubble” — a burst of coordinated mania that wastes money in the short term but builds the physical and intellectual foundations of the next technological age.