PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: AI doomerism

  • Dan Shipper’s Most Contrarian AI Predictions for 2026: Why the Job Apocalypse Is a Myth, SaaS Will Boom, PMs and Designers Win, and CLIs Are Already Over

    Dan Shipper, the CEO and founder of Every, returned to Lenny’s Podcast for round two of AI predictions. His last appearance produced one of the most prescient calls of the year: that non-technical people would build serious work inside Claude Code. He was unbelievably right. This conversation is the follow-up, a tour of his most contrarian forecasts for how AI is actually changing the way we work, who wins, who loses, and what almost every commentator is getting wrong about the next twelve to twenty-four months.

    TLDW

    Shipper argues that the AI job apocalypse is a myth, that SaaS is going to boom rather than die, that product managers and full-stack designers are the biggest winners of the agent era, that personal agents inside Codex and Claude Code will quietly replace the browser as the primary work surface, that every company will run a single shared super-agent in Slack instead of a fleet of per-user bots, that the CLI moment is already over, that pull requests are going to flood organizations from non-technical staff, that forward-deployed engineers who garden company agents become the new senior role, that GPT-5.5 still cannot match a real senior engineer on architectural judgment, that AI-generated internal writing is fine and probably better than what most humans produce, that CEOs and middle managers have not adapted yet but soon will be forced to, that the edge of AI lives wherever a curious human is using it rather than in San Francisco, and that the only durable strategy is to ride the models and keep playing with whatever ships next. The whole conversation balances aggressive AI bullishness with an equally strong bet on humans, on creativity, and on the unavoidable need for someone to care for every agent that gets deployed.

    Thoughts

    The most useful frame Shipper gives is that models commoditize yesterday’s human competence. Every time a frontier model crosses a new bar, the work that used to define seniority becomes cheap. The senior engineer who could carry a refactor in their head, the PM who could write a coherent strategy doc, the designer who could ship a polished landing page in a week. That competence is now frozen, codified, and available on tap. The interesting question is not whether models will keep eating tasks. They will. The interesting question is what humans do with the suddenly cheap raw material underneath them. Shipper’s answer is that humans climb the stack: they go up a level, find a new problem worth framing, and use the commoditized competence as feedstock for something that did not exist before. That treadmill is the actual engine of value creation, and it is why he can be simultaneously AI pilled and bullish on hiring.

    His SaaS take is the spiciest call of the episode and probably the most defensible. The crowd consensus is that agents will gut SaaS because an AI can just write the form filler, the dashboard, the workflow. Shipper points out the obvious counterfactual: agents do not reduce the number of people using SaaS, they increase it. A marketing lead who could never touch the data warehouse can now stand up a PostHog query through Codex. A founder who never opened Vanta can run a SOC 2 prep through an agent. The result is more users, more accounts, and a much fatter top of funnel for every horizontal tool. The second-order effect is even more interesting. When the SaaS tool runs inside the user’s agent, the user supplies the tokens. Vendor margins improve, not collapse. If he is right, the next two years are going to be brutal for the SaaS-is-dead thesis pieces and very good for the public software multiples.

    The PM and designer bet is where this gets personal for anyone in product. For a decade the bottleneck in shipping anything was engineering capacity. A PM with spiky product sense had to negotiate their vision through a roadmap, a sprint, a review, and a release. Designers had to convince an engineer that the third state of the empty screen was actually worth building. Both of those constraints are dissolving fast. A PM who can prompt Codex into a working prototype on Friday afternoon, then iterate it live in front of a customer on Monday, is doing the job of a small team. A designer who can ship a fully functional landing page in their own style, without negotiating with anyone, is suddenly the most leveraged person in the company. The scarce skill is no longer execution. It is taste, judgment, and the willingness to decide what is worth building. That has always been the real PM and design job. AI just stripped away the parts that were not.

    The quietest but most important prediction is that agents need humans, permanently. Every benchmark advance reveals a new layer of judgment the model cannot frame on its own. When the agent finishes the task, there is always a senior human who sees the deeper problem the model patched over. Shipper calls this gardening, and it is the basis for the new forward-deployed engineer role. The companies winning right now are the ones that put a real person next to every agent, watching what it does, course-correcting in Slack, and noticing when the output drifts. The dream of autonomous AI workflows is a stage in a journey, not the destination. The destination looks more like a thoughtful operator with a small cluster of agents they trust and constantly tend. That is a much more humane future than the discourse suggests, and it is the one Every is already living.

    The final advice, ride the models, sounds glib but is the single most actionable line in the episode. Most professional anxiety about AI dissolves the moment you actually use the newest model on real work. Most professional advantage accrues to the people who do that one thing consistently. The edge does not live in San Francisco where the labs build the things. It lives wherever a curious human meets a real workflow and discovers something the labs have not noticed. A PM in Iowa willing to try Codex on a Tuesday night can be further ahead than a research engineer who has only used the model on its evals. Pair that with Shipper’s closing motto, do things worth writing about and write things worth reading, and you have a pretty complete operating system for the next two years.

    Key Takeaways

    • The AI job apocalypse narrative is wrong. Models commoditize yesterday’s competence, then humans climb the stack and find new work to do with the cheap raw material.
    • Every has roughly doubled headcount in the last year despite being one of the most AI-forward companies in the world. The lived data point cuts directly against the doom thesis.
    • Shipper’s dual stance: simultaneously extremely AI pilled and very bullish on humans. He treats this as the only intellectually honest position right now.
    • Work will bifurcate. Companies will run one shared super-agent in Slack for everyone, and individuals will run their own personal agent inside Codex or Claude Code on their machine.
    • The personal agent inside Codex effectively becomes the new operating system. Instead of putting AI in the browser, you put a browser inside the AI.
    • The super-agent pattern is already real: Shopify has River, Ramp has its own, and Every runs Claudie inside Slack for internal consulting.
    • SaaS is not dying. Agents increase the user base of SaaS tools because non-technical people can finally drive them. Shipper would buy SaaS stocks today.
    • When SaaS runs inside an agent, the user brings their own tokens. Vendor margins improve because they no longer eat inference costs on every interaction.
    • The CLI era is already over. The magic was never the terminal. It was the AI plus the ability to see what the agent is doing. A good GUI captures the same benefits and more.
    • Pull requests are about to flood every company. Non-engineers can now ship code, run queries, and open tickets. Reviewing the output becomes the new bottleneck.
    • Open-source maintainers are already living in the future. Some receive thousands of agent-generated PRs per day and spin up thousands of Codex instances just to triage them.
    • Forward-deployed engineers are the new senior role. They live in Slack, garden the company’s agents, fix broken flows, and keep non-technical staff from doing damage.
    • Product managers with spiky product sense plus a little Codex fluency become extremely dangerous. Marcus at Every, formerly a PM at Axios, is the archetype.
    • Full-stack designers are the other big winner. They can build distinctive interfaces end to end without negotiating with engineering. The bottleneck on taste-driven product work disappears.
    • Designer hiring data has not yet caught up to the prediction. Shipper notes this and says check back in a year.
    • Sales is the role least changed so far. Top of funnel research has been turbocharged by agents, but the actual relationship and closing work remains human.
    • AI-generated internal writing is going mainstream and that is a good thing. Most humans are bad at strategy docs, quarterly plans, and PRs. AI drafts a coherent first pass that a human can refine.
    • Shipper says most of his email is now written by GPT-5.5 and Codex. He would honestly prefer the signature to say so.
    • Public writing, newsletters, and published essays still demand a human voice. Internal communication does not.
    • CEOs and middle managers have largely not adapted yet because their staff still does the work. That window is closing fast and will become an obvious career liability.
    • Your company will only go as far as your CEO goes in AI. The leadership ceiling becomes the AI ceiling.
    • Shipper’s senior engineer benchmark scores GPT-5.5 at roughly 62 out of 100. Real senior engineers sit at 85 to 90. Progress is real, but the gap on architectural judgment remains.
    • Models tend to patch problems locally instead of rewriting from first principles. A senior human still sees the deeper rework that the model avoids.
    • Every uses Notion-based agents to draft quarterly plans. The human edits, approves, and stands behind the output.
    • The hard rule on AI-generated communication: you have to read it and stand behind it before sending it. Pasting unread output is the only true no-no.
    • Every agent needs a human. Automation is a lie in the strong sense. The story of automation is the story of new and different humans being needed alongside it.
    • The reach test, organic daily usage, is the real signal that an AI product works. Benchmark scores are noisy. Daily reach is not.
    • Cursor’s SpaceX acquisition is a tell. Harnesses around models, not the models themselves, are where the strategic value is concentrating.
    • The edge of AI is not in San Francisco. It is wherever a real human meets a real workflow and discovers something the labs have not noticed yet.
    • A PM in Iowa willing to ride the models can be further ahead than a researcher in SF who only uses them on internal evals.
    • Ride the models. Use them for whatever you do. Try every new release the day it ships. That single behavior compounds faster than any other AI career strategy.
    • Shipper got bursitis, which he calls vibe coder elbow, from too much rapid agent-assisted coding while debugging his markdown editor Proof.
    • The closing motto for the year: do things worth writing about and write things worth reading.
    • Lenny will re-interview Shipper in roughly May 2027 to score the predictions.

    Detailed Summary

    Why The AI Job Apocalypse Is The Wrong Frame

    Shipper opens with the headline contrarian call. Benchmarks keep climbing. Models can now sustain seventeen-hour autonomous tasks at fifty percent accuracy. The pace is real and accelerating. None of that translates cleanly into mass unemployment. His mechanism: models codify yesterday’s human competence and make it cheap. The act of compressing past expertise into an API call is genuinely deflationary for the work it captures, but it is also raw material for the next layer of human work. He uses Every as his own data point. The company has roughly doubled in the past year despite being one of the most AI-forward outfits in media. Hiring goes up because agents create new categories of work that need humans, not because the agents fail. The discourse, he argues, is stuck modeling AI as substitution. The reality looks much more like leverage.

    The Bifurcation: Super-Agents And Personal Agents

    Work splits into two surfaces. The first is the shared super-agent that lives in Slack and serves the whole company. Shopify has River. Ramp has its own. Every has Claudie. Each is a single, trusted, gardened agent that anyone in the company can talk to. The pattern has converged on one shared agent rather than one agent per person because agents need human attention to stay useful, and a single shared instance pools the gardening cost. The second surface is the personal agent inside Codex or Claude Code that runs on your machine and reaches into your local environment, your editor, your files, and through an embedded browser into the web. Shipper calls this the new operating system. Instead of the old paradigm of putting AI inside the browser, you put the browser inside the AI. The agent sees what you see, follows what you do, and works on your stuff in your context.

    The SaaS Bet: Up, Not Down

    The SaaS-is-dead thesis was the consensus call of late 2025. Shipper takes the other side and would buy software stocks now. Three arguments. First, agents make SaaS accessible to people who never could have used it directly. The total addressable user base inside every company goes up. Second, the business model improves when the user runs the SaaS through their own agent, because the user supplies the tokens. Vendors stop subsidizing inference. Third, SaaS spend in his observable universe is up, not down, and is concentrating on the tools that play well with agents. He frames the prediction as a sound bite for the cycle: buy SaaS stocks, the apocalypse is dumb.

    The CLI Era Is Already Over

    For a moment in early 2026 it looked like everyone was migrating to the terminal because Claude Code was a CLI. Shipper says the moment is finished. The actual leverage was never the terminal. It was the model plus the ability to watch and steer an agent live. A great GUI captures every advantage of the CLI without the friction. His own engineering team at Every has mostly moved off the CLI as their primary surface and onto Codex desktop. He frames it bluntly: we speed ran the CLI era, it was nice, and now we are done. Tooling for the next two years will be visual, multi-pane, multi-agent, and built around the human watching the work unfold.

    The Pull Request Flood And The Rise Of Forward-Deployed Engineers

    Once non-engineers can ship code, run queries, and file changes through agents, the volume of incoming work explodes. Open-source maintainers already report receiving thousands of agent-generated pull requests per day. Inside companies, the same thing happens to data teams, ops teams, and any function that owns a review gate. The bottleneck shifts from creation to evaluation. The job that emerges to absorb the flood is the forward-deployed engineer. This is a senior person who lives in Slack with the company’s agents, fixes their context, sharpens their instructions, and prevents non-technical colleagues from making well-meaning but incoherent changes. Nitesh at Every is the example Shipper returns to. The model is the same one the labs use internally: pair every important agent with a real engineer who gardens it.

    PMs And Full-Stack Designers Win The Decade

    The two roles Shipper is most bullish on are product manager and full-stack designer. For PMs, the entire job of coordinating a team to translate vision into code collapses into a Codex session. A PM with strong product instincts and a little technical literacy can now prototype, iterate, and even ship. The example is Marcus, formerly a PM at Axios, who took a year to fully internalize AI and now ships faster than most engineers. For designers, the model is similar. The Friday-night-side-project designer who used to be stuck explaining a vision can now build the vision themselves, with their own taste fully expressed. The scarce skill in both cases is the same: judgment about what to build and the courage to decide it is good. Execution capacity is no longer the constraint.

    The Senior Engineer Benchmark And What Models Still Miss

    Shipper has built his own benchmark to test whether coding models can actually do senior engineering work. GPT-5.5 scores around 62 out of 100. Real senior engineers sit closer to 85 or 90. The gap is not in syntax or test pass rates. It is in the willingness to step back, see that a piece of code is fundamentally the wrong shape, and rewrite it from first principles. Models almost universally patch locally. They take the instruction at face value, accept the existing code as a constraint, and optimize within it. A real senior engineer ignores the prompt when the prompt is wrong. This is the durable moat for senior technical judgment, and Shipper expects it to remain visible for at least another year of model releases.

    AI-Generated Writing Goes Mainstream

    Internal writing inside companies is quietly becoming AI-first and Shipper thinks it should. Quarterly plans, status updates, PR descriptions, strategy memos, recruiting outreach, most internal email. He runs his own inbox through GPT-5.5 and Codex and says he would honestly prefer if the recipient knew. The point is not that AI is a better writer in some absolute sense. The point is that most humans are not very good at these specific genres, and the model produces a coherent, structurally sound first draft that a human can guide and approve. The constraint is honesty: you read it, you understand it, you stand behind it. Public writing, like the newsletters Every publishes, still demands a human voice. Internal communication does not, and treating it as if it did is a tax on the organization.

    The CEO And Middle Manager Lag

    Shipper points to a population that has largely escaped AI adoption: senior leaders and middle managers. They have staff to do the work, so they have not been forced to pick up the tools personally. He thinks this is the single largest pocket of latent disruption coming in the next year. Your company will only go as far as your CEO goes in AI, because every decision about where to deploy agents, where to hire, and how to restructure work flows downstream from leadership taste. A leader who has not personally lived inside Codex or Claude Code for a few weeks cannot make those calls well. Expect this to flip fast and to become a visible career liability for executives who do not adapt.

    Ride The Models

    The closing advice is the simplest. Ride the models. Use AI for whatever you actually do. Try every new release the day it lands. Most of the professional anxiety around AI dissolves on contact with the work, and most of the durable advantage in the field belongs to the people who do this one thing consistently. Shipper notes that the edge of AI does not live in San Francisco. It lives wherever a curious operator meets a real workflow and notices something nobody at the labs has yet. A PM in Iowa willing to spend a Tuesday night exploring Codex can find capabilities researchers have not surfaced. Pair that with his motto, do things worth writing about and write things worth reading, and you have most of an operating system for the next two years.

    Notable Quotes

    “The AI job apocalypse is not really a thing. I am super super bullish on PMs and full-stack designers.”

    Dan Shipper, opening his contrarian thesis for the conversation

    “I’m simultaneously extremely AI pilled and very bullish on humans. Automation is a lie. Every agent needs a human.”

    Dan Shipper, on holding both sides of the AI debate at once

    “What models do in general is they make yesterday’s human competence cheap. And so, it becomes commoditized. It’s not valuable anymore. What humans do is we go in there and we’re like, yeah, we have all this frozen human competence from yesterday, how do I use this to make something new and interesting.”

    Dan Shipper, articulating the core engine behind his anti-apocalypse thesis

    “I would buy SaaS stocks right now. The SaaS apocalypse is dumb. What agents do is increase the number of users of SaaS, not get rid of it.”

    Dan Shipper, calling the consensus SaaS-is-dead thesis directly wrong

    “We speed ran the CLI era. It was nice while it lasted, but I think CLIs are over.”

    Dan Shipper, on why the terminal-first agent moment is already done

    “Most of my email is written by GPT-5.5 and Codex right now. And I honestly would prefer it to say that it’s coming from GPT-5.5.”

    Dan Shipper, on the new etiquette of AI-assisted communication

    “The edge of AI is not in San Francisco. The edge of AI is wherever AI meets a real human doing something.”

    Dan Shipper, on where the actual frontier of the field lives

    “The only thing you need to do is ride the models. And that means use them for whatever it is that you do.”

    Dan Shipper, distilling his career advice for the next two years

    “Do things worth writing about and write things worth reading.”

    Dan Shipper’s closing motto, lifted from his own operating system at Every

    Watch the full conversation with Dan Shipper on Lenny’s Podcast here. The re-interview to score these predictions is scheduled for roughly May 2027.

    Related Reading

    • Every. Dan Shipper’s company and the live laboratory for almost every prediction in this conversation, including Spiral, Cora, and Claudie.
    • The Allocation Economy by Dan Shipper. The earlier essay that frames humans as managers of AI labor and underpins much of the gardening-the-agent thesis here.
    • Claude Code by Anthropic. The agent surface Shipper called correctly last year and one of the two environments he predicts will become the new operating system for work.
    • Codex by OpenAI. Shipper’s current daily driver and the visual, multi-pane agent environment he uses for almost everything from coding to email.
    • The Writing Life by Annie Dillard. The book Shipper makes every Every employee read, and the source of the company’s stance on writing as a tool for noticing the future.
  • Jensen Huang at Stanford CS153 Frontier Systems on Co-Design, Agentic Computing, Vera Rubin, Open Models, and the Million-X Decade That Reshaped AI Infrastructure

    https://www.youtube.com/watch?v=tsQB0n0YV3k

    NVIDIA CEO Jensen Huang returned to Stanford for the CS153 Frontier Systems class (the room nicknamed itself “AI Coachella”) to lay out, in raw form, how he thinks about the computer being reinvented for the first time in over sixty years. Across roughly seventy minutes of student questions he walks through the codesign philosophy that gave NVIDIA a million-x decade, the architectural through-line from Hopper to Grace Blackwell to Vera Rubin to Feynman, the case for open source foundation models, the realities of tokens per watt and MFU, energy demand running a thousand times higher, the China and export-control debate, and his own biggest strategic mistakes. Watch the full conversation on YouTube.

    TLDW

    Huang argues every layer of computing has changed: the programming model, the system architecture, the deployment pattern, the economics. Co-design across CPUs, GPUs, networking, storage, switches and compilers gave NVIDIA roughly a million-x speed-up over ten years versus the ten-x Moore’s Law era, and that headroom is what let researchers say “just train on the whole internet.” Hopper was built for pre-training, Grace Blackwell NVLink72 for inference and reasoning (50x over Hopper in two years), Vera Rubin is built for agents that load long memory, call tools and need a low-latency single-threaded CPU bolted directly to the GPU, and Feynman extends that to swarms of agents that spawn sub-agents. Open weights matter because safety, sovereignty (230-plus languages no one else will fund) and domain models for biology, autonomy, robotics and climate need a foundation that NVIDIA is willing to seed. Compute is not really the scarce resource (Huang says place the order and the chips ship), the broken thing is institutional budgeting that can’t put a billion dollars into a shared university supercomputer. Energy demand is heading a thousand times higher and this is finally the moment market forces alone will fund sustainable generation. On geopolitics he rejects the GPUs-as-atomic-bombs framing and warns America will end up like its telecom industry if it cedes two thirds of the world. On career he advises seeking suffering on purpose. On strategy he says observe, reason from first principles, build a mental model, work backwards, minimize opportunity cost, maximize optionality.

    Key Takeaways

    • The computing model has been substantially unchanged since the IBM System 360, sixty-plus years ago. Huang’s first computer architecture book was the System 360 manual. AI is the first true reinvention.
    • Old computing was pre-recorded retrieval. New computing is generated, contextually aware and continuous. Cloud was on-demand. Agentic systems run continuously.
    • Codesign is NVIDIA’s central thesis. Inherited from the Hennessy and Patterson RISC era at Stanford, extended across CPUs, GPUs, networking, switches, storage, compilers and frameworks all optimized together.
    • The result of full-stack codesign: roughly 1,000,000x faster compute over ten years, versus a generous 10x to 100x for Moore’s Law in the same period. Dennard scaling effectively ended a decade ago.
    • That million-x speed-up is what unlocked “train on all of the internet” as a realistic AI strategy.
    • After GPT, Huang says it was obvious thinking was next. Reasoning is just generating tokens consumed internally, then using tools is generating tokens consumed externally. Agentic systems followed predictably.
    • Education needs AI baked into the curriculum, not just taught as a subject. Pre-recorded textbooks cannot keep pace with knowledge being generated in real time.
    • Huang says he cannot learn anymore without AI. He has the AI read the paper, then read every related paper, then become a dedicated researcher he can interrogate.
    • Mead and Conway and the first-principles methodology of semiconductor design are still worth learning even though most of the scaling tricks have been exhausted.
    • NVIDIA itself is one of the largest consumers of Anthropic and OpenAI tokens in the world. One hundred percent of NVIDIA engineers are now agentically supported. Huang recommends Claude and similar tools by name and says open-source downloads will not match the integrated product harness.
    • NVIDIA still invests heavily in open foundation models because language and intelligence represent the codification of human knowledge. Five pillars: Nemotron (language), BioNeMo (biology), Alphamayo (autonomous vehicles), Groot (humanoid robotics) and a climate science model (mesoscale multiphysics).
    • Sovereign language models matter. Roughly 230 world languages will never be a top priority for a commercial frontier lab. Nemotron is near-frontier and fully fine-tunable so any country can adapt it.
    • Safety and security require open weights. You cannot defend against or audit a black box. Transparent systems let researchers interrogate models and let defenders deploy swarms.
    • The future of cyber defense is not bigger-model-versus-bigger-model. It is trillions of cheap fast small models like Nemotron Nano surrounding the threat.
    • Domain models fuse language priors with world models. Alphamayo learned to drive safely on a few million miles instead of billions because it can reason like a human about the road.
    • MFU (Model Flops Utilization) is a misleading metric. Huang says he wants low MFU, because that means he over-provisioned every resource and never gets pinned by Amdahl’s law during a spike.
    • The xAI Memphis cluster running at 11 percent MFU is not necessarily a failure mode. In disaggregated prefill plus decode inference you can deliver very high tokens per watt with very low MFU.
    • The right metric is performance, ultimately tokens per watt as a proxy for intelligence per watt, and even that needs adjustment because not all tokens are equal. Coding tokens are worth more than other tokens.
    • Hopper was designed for pre-training. NVIDIA chose to build multi-billion-dollar systems when the largest existing scientific supercomputer cost $350 million, with no proven customer base. It worked.
    • Grace Blackwell NVLink72 was designed for inference, especially the high-memory-bandwidth decode phase. It is the world’s first rack-scale computer and delivered a 50x speed-up over Hopper in two years, against an expected 2x from Moore’s Law.
    • Vera Rubin is designed for agents. Long-term memory wired into storage and into the GPU fabric, working memory, heavy tool use, and Vera, a CPU optimized for low-latency multi-core single-threaded code so a multi-billion-dollar GPU system does not stall waiting on a slow tool call.
    • Feynman is being shaped for swarms of agents with sub-agents and sub-sub-agents, a recursive software topology that demands a new compute pattern.
    • Tokens per watt improved 50x in one generation. Compounding energy efficiency is the lever NVIDIA controls directly.
    • Total compute energy demand is heading roughly a thousand times higher than today, possibly two orders of magnitude beyond that. Huang says he would not be surprised if the estimate is low.
    • For the first time in history, market forces alone are enough to fund solar, nuclear and grid upgrades. Government subsidies are no longer required to make sustainable energy investment rational.
    • Copper interconnect is becoming a bottleneck. Photonics is moving from optional to structural inside racks and across them.
    • Comparing NVIDIA GPUs to atomic bombs, Huang says, is a stupid analogy. A billion people use NVIDIA GPUs. He advocates them to his family. He does not advocate atomic bombs to anyone.
    • If the United States cedes two thirds of the global market to competitors on policy grounds, the American technology industry will end up like American telecommunications, which was policied out of existence.
    • Huang directly rejects AI doom-by-singularity narratives. It is not true that we have no idea how these systems work. It is not true that the technology becomes infinitely powerful in a nanosecond. He calls the rhetoric irresponsible and harmful to the field students are about to enter.
    • On Stanford specifically: if the university president places an order, NVIDIA will deliver the chips. The bottleneck is that no university department has a billion-dollar compute budget because budgeting is fragmented across grants. Stanford’s $40 billion endowment is more than enough to fix that.
    • “It’s Stanford’s fault” is meant as empowerment. If something is your fault, you can solve it.
    • Career advice: do not optimize purely for passion. Most people do not yet know what they love. Pick the job in front of you and do it as well as possible. Even as CEO, Huang says, 90 percent of the work is hard and he suffers through it.
    • Suffering on purpose builds the muscle of resilience. When the company, the team or the family needs you to be tough, that muscle has to already exist.
    • NVIDIA’s first generation of products was technically wrong in nearly every dimension: curved surfaces instead of triangles, no Z-buffer, forward instead of inverse texture mapping, no floating point. The strategic recovery, not the technology, taught Huang the lessons that have lasted decades.
    • The biggest clean strategic mistake Huang names is the move into mobile chips (Tegra). It grew to a billion dollars then went to zero when Qualcomm’s modem dominance shut NVIDIA out of the 3G to 4G transition. The recovery into automotive and robotics (the Thor chip is the great great great grandson of that mobile lineage) was real, but Huang refuses to rationalize the original choice.
    • Forecasting framework: observe, reason from first principles, ask “so what” and “what next” until you have a mental model of the future, place your company inside that model, then work backwards while minimizing opportunity cost and maximizing optionality.
    • Best part of the CEO job: living at the intersection of vision, strategy and execution surrounded by people capable enough to make ambitious visions real. Worst part: the responsibility for everyone who joined the spaceship, especially in the near-death moments NVIDIA had four or five times early on.
    • Underrated insider note: Huang’s first apple pie with cheese, first hot fudge sandwich and first milkshake all happened at Denny’s. The Superbird, the fried chicken and a custom Superbird-style ham and cheese with tomato and mustard are his order.

    Detailed Summary

    Computing reinvented from the ground up

    Huang frames the moment as the first true rewrite of the computer in sixty-plus years. From the IBM System 360 forward, the mental model of writing code, running code, taking a computer to market and reasoning about applications stayed roughly constant. AI changes the programming model itself. Software is no longer a compiled binary running deterministically on a CPU. It is a neural network running on a GPU producing generated, contextual, real-time output. That cascades into how companies are organized, what tools developers use, what the network and storage stack look like, and what an application is even allowed to do. Robo-taxis, he notes, are an application no one would have attempted before deep learning unlocked perception.

    Codesign and the million-x decade

    Codesign is the philosophical center of the talk. Huang traces it to the RISC work of John Hennessy at Stanford, where simpler instruction sets won by being co-designed with the compiler rather than maximally optimized in isolation. NVIDIA extends the principle across every layer simultaneously: GPU architecture, CPU architecture, NVLink and NVSwitch fabrics, photonic interconnects, networking silicon, storage paths, CUDA libraries, frameworks and ultimately the model design. The numbers Huang gives are arresting. Moore’s Law in its prime delivered roughly 100x per decade. By the time Dennard scaling broke, real-world gains had compressed to roughly 10x. NVIDIA’s codesigned stack delivered between 100,000x and 1,000,000x over the same ten-year window. That non-linear speed-up is, in Huang’s telling, the precondition for modern AI: it is what allowed researchers to stop curating training sets and just feed the entire internet to the model.

    Education has to fuse first principles with AI tools

    Asked how curriculum should evolve, Huang argues AI must be integrated into the learning process, not just taught about. He recalls Hennessy writing his textbook by hand a chapter a week while Huang was a student, and says pre-recorded textbooks cannot keep up with the rate at which AI generates new knowledge. He describes his own learning workflow: hand the paper to an AI, then have it read the entire surrounding literature, then treat the AI as a dedicated researcher who can be interrogated. At the same time he defends the classics. Mead and Conway are still the foundation. Most modern semiconductor scaling tricks have been exhausted, but knowing where the field came from sharpens judgment when designing what comes next.

    Open source and the five domain pillars

    Huang gives one of the most detailed public accounts of why NVIDIA invests so heavily in open foundation models even while being a top customer of closed labs. He recommends Claude and OpenAI by name for production coding work, and says 100 percent of NVIDIA engineers are now agentically supported. The open-weights case rests on three legs. First, language is the codification of intelligence, and there are at least 230 languages that no commercial lab will ever prioritize. Nemotron is built near frontier and released so any country or community can fine-tune it. Second, the same representation-learning approach has to be replicated in domains where the data is not internet text, so NVIDIA seeded BioNeMo for biology, Alphamayo for autonomy, Groot for humanoid robotics and a climate model for mesoscale multiphysics. The economics of those fields would never produce a foundation model on their own. Third, safety and security require transparency. A black box cannot be defended or audited, and the future of cyber defense is not bigger-model-versus-bigger-model but swarms of cheap fast small models like Nemotron Nano surrounding the threat.

    MFU is the wrong metric, tokens per watt is closer

    A student raises the leaked memo that the xAI Memphis cluster is running at 11 percent Model Flops Utilization. Huang flips the framing. He says he would rather be at low MFU all the time, because that means he over-provisioned flops, memory bandwidth, memory capacity and network capacity. Bottlenecks shift constantly, so over-provisioning across every dimension is what lets the system absorb a spike without getting pinned by Amdahl’s law. In disaggregated inference, where prefill and decode are physically separated and decode is bandwidth-bound rather than flop-bound, NVLink72 can deliver extremely high tokens per watt while reporting very low MFU. Huang argues the right framing is performance, and ultimately tokens per watt as a rough proxy for intelligence per watt, adjusted for the fact that not all tokens are equal. A coding token is worth more than a generic token.

    Hopper, Grace Blackwell NVLink72, Vera Rubin, Feynman

    Huang gives the clearest public framing of NVIDIA’s roadmap as a sequence of architectural answers to evolving compute patterns. Hopper was built for pre-training, at a moment when NVIDIA chose to build multi-billion-dollar machines while the largest scientific supercomputer in the world cost $350 million and the marketplace for such systems was, on paper, zero. Grace Blackwell NVLink72 was the answer to inference and reasoning: a rack-scale computer that ganged 72 GPUs together because decode needs aggregate memory bandwidth far beyond a single chip. The generation-over-generation speed-up was 50x in two years, twenty-five times what Moore’s Law would have delivered. Vera Rubin is being built explicitly for agents. Agents load long-term memory from storage that has to be wired directly into the GPU fabric, they use working memory, they call tools that run on a CPU, and they wait. So the CPU has to be Vera, optimized for low-latency single-threaded code, because the multi-billion-dollar GPU system cannot afford to idle waiting on a slow tool call. Feynman extends the pattern to swarms of agents with sub-agents and sub-sub-agents, a recursive software topology that will demand its own compute pattern.

    Energy demand and the grid

    Huang’s energy projection is one of the most aggressive numbers in the talk. NVIDIA can compound tokens per watt by 50x per generation through codesign, but the total compute demand is heading roughly a thousand times higher, and Huang says he would not be surprised if the real figure is one or two orders of magnitude beyond that. The reason is structural: future computing is generative and continuous, not pre-recorded and on-demand. The good news, he argues, is that this is the best moment in the history of humanity to invest in sustainable generation. Market forces alone are now sufficient to fund solar, nuclear and grid upgrades. Government subsidies are no longer required to make the math work.

    Adversarial countries, export controls and the telecom warning

    This is the segment where Huang is visibly fired up. He attacks the GPUs-as-atomic-bombs framing on its face. NVIDIA GPUs power medical imaging, video games and soy sauce delivery. A billion people use them. He advocates them to his family. The analogy collapses at the first comparison. He attacks the second framing, that American companies should not compete abroad because they will lose anyway, as a self-fulfilling defeat. Competition makes the company better. The third framing, that depriving the rest of the world of general-purpose computing benefits the United States, also fails on first principles: it benefits one or two American companies at the cost of an entire industry. The cautionary parallel is telecommunications. The United States once had a leading position in telecom fundamental technology and policied itself out of it. Huang’s worry, voiced explicitly to a room of CS students, is that they will graduate into a shell of a computer industry if the same path is repeated.

    AI doom and rational optimism

    In the same arc Huang rejects the science-fiction framing of AI as a singularity that arrives suddenly on a Wednesday at 7pm and ends civilization. He calls those claims irresponsible, says they are not true, and points out that the people advancing them are believed by audiences who then make policy on that basis. It is not true that no one understands how these systems work. It is not true that intelligence becomes infinitely powerful instantaneously. It is not true that there is no defense. His framing, which the host echoes as “rational optimism,” is that the goal is to create a future where people care about computers because the technology students are learning is worth mastering.

    Stanford’s compute problem is Stanford’s fault

    A student presses on the scarcity of compute for independent researchers, startups and universities inside the United States. Huang’s answer is sharp: there is no shortage. Place the order and the chips will arrive. The actual broken thing is institutional. University grants are fragmented across departments. No researcher can raise enough on a single grant to fund a billion-dollar shared cluster, and no one shares. He compares it to showing up at the grocery store demanding a billion dollars of tomatoes today. The solution is planning, aggregation and a campus-scale supercomputer, the way Stanford once built the linear accelerator. The endowment is $40 billion. Pulling a billion off it, contracting cloud capacity and giving every student and researcher AI supercomputer access is, in Huang’s view, obviously doable. When he says “it is Stanford’s fault” the host laughs, but Huang clarifies: if it is your fault you have the power to fix it.

    Career, suffering and resilience

    Asked how a CS student should spend the next few years, Huang pushes back on the standard “follow your passion” advice. Most people do not know what they love yet, because no one knows what they do not know. The bar of demanding joy from every working day is too high. Whatever the job is, do it as well as you can. Even as CEO of NVIDIA he says he genuinely loves about 10 percent of his work. The other 90 percent is hard and he suffers through it. He recommends suffering on purpose, because resilience is a muscle that only builds under load, and when the company, the team or the family needs that muscle, it has to already exist. Earlier in his life that meant cleaning toilets and busing tables at Denny’s. He does it today running a multi-trillion-dollar company.

    The biggest mistakes

    Huang separates technical mistakes from strategic mistakes. NVIDIA’s first generation of products was technically wrong in almost every way: curved surfaces instead of triangles, no Z-buffer, forward instead of inverse texture mapping, no floating point inside. The company wasted two and a half years. But the strategic genius of the recovery, the reading of the market, the conservation of resources and the reapplication of talent, is what taught him strategy. The clean strategic mistake he names is mobile. NVIDIA’s Tegra line grew to a billion dollars of revenue and then collapsed to zero when Qualcomm’s modem dominance locked NVIDIA out of the 3G to 4G transition. Huang explicitly refuses the comforting rationalization that the Tegra effort fed the Thor automotive chip (“Thor is the great great great grandson”). The original decision, he says, was a waste of time. The lesson is to think one or two clicks further about whether a market is structurally winnable before committing the company.

    Forecasting under fog of war

    The final substantive exchange is on forecasting. Huang’s method has four steps. Observe what is actually happening (AlexNet crushing two decades of computer vision research in one shot, GPT producing reasoning by token generation). Reason from first principles about why it works. Ask “so what” and “what next” recursively until a mental model of the future emerges. Place the company inside that future and work backwards. Crucially, expect to be partly wrong. Some outcomes will absolutely happen, some will likely happen, some might happen, and the strategy has to be robust across that distribution. The real cost of any strategic choice is the opportunity cost of the alternatives you did not take, so the discipline is to minimize that cost and maximize optionality while letting the journey itself pay for the journey.

    Thoughts

    The most useful thing in this conversation is the explicit architectural mapping of compute patterns to chip generations. Hopper for pre-training. Grace Blackwell NVLink72 for inference, because decode is bandwidth-bound and a single chip cannot supply it. Vera Rubin for agents, because tool calls stall multi-billion-dollar GPU systems and so the CPU has to be optimized for low-latency single-threaded code. Feynman for swarms. That sequence is not marketing. It is a falsifiable thesis about where the bottleneck moves next, and every other infrastructure company should be measuring themselves against it. If Huang is right that swarms of sub-agents are the next dominant pattern, then the design pressure shifts from raw flops to fabric topology, memory hierarchy and storage-to-GPU latency. That has implications for everyone downstream, including the hyperscalers building competing accelerators.

    The MFU section is the most intellectually generous moment in the talk. The instinct in the AI ops community has been to chase MFU as if it were a virtue. Huang argues, persuasively, that low MFU is consistent with high tokens per watt in a disaggregated inference setup, and that bottlenecks rotate fast enough that over-provisioning every resource is the rational design. That reframing matters because it changes what “scarce” means. Compute is not scarce in the way the discourse treats it. What is scarce is a coherent system designed end-to-end. The xAI 11 percent number, in that frame, is not embarrassing. It is the natural reading of a workload that is mostly decode.

    The Stanford segment is the part most likely to be quoted out of context. “It’s Stanford’s fault” is a deliberately provocative line, but the underlying claim is correct and load-bearing. Compute is not gated by NVIDIA refusing to ship chips. It is gated by the fact that fragmented grant funding cannot aggregate into the billion-dollar order that NVIDIA can fulfill. The implication is that universities and national labs need a structural change in how they pool capital for compute, and that the current model of every researcher buying a handful of cards is genuinely obsolete. Huang’s nudge about pulling a billion off the endowment is concrete enough to be acted on, and other major research universities should read this segment as a direct prompt.

    The geopolitical segment is the highest-stakes one. The telecommunications comparison is correct as a historical pattern, and Huang is one of the very few executives in a position to deliver that warning credibly. The unresolved tension is that the argument applies symmetrically. If American AI dominance is built by selling globally, that includes selling into adversarial states, and the policy question is where the line falls. Huang does not answer that question. He attacks the framing that lets the question be answered badly. That is a meaningful contribution to the discourse even if it does not resolve the underlying tradeoff.

    The career advice section is the part the social-media clips will mishandle. “Seek suffering” reads as macho when extracted. In context it is a specific operational claim about how resilience compounds, and it is paired with the Tegra story where Huang himself paid the price of not thinking one more click ahead. That kind of self-implication is rare in CEO talks, and it is the reason the talk is worth listening to in full rather than only reading the recap.

    Watch the full Stanford CS153 Frontier Systems conversation with Jensen Huang here.