PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: DeepSeek

  • Krishna Rao on Anthropic Going From 9 Billion to 30 Billion ARR in One Quarter and the Compute Strategy Powering Claude

    Krishna Rao, Chief Financial Officer of Anthropic, sat down with Patrick O’Shaughnessy on Invest Like the Best for one of the most detailed public looks yet at the operating engine behind Claude. He covers how Anthropic compounded from $9 billion of run rate revenue at the start of the year to north of $30 billion by the end of Q1, why he spends 30 to 40 percent of his time on compute, the playbook for buying gigawatts of AI infrastructure across Trainium, TPU, and GPU platforms, how Anthropic prices its models, why returns to frontier intelligence keep climbing, and what the Mythos release tells us about the cyber capabilities of the next generation of Claude.

    TLDW

    Anthropic is the most compute fungible frontier lab in the world, with active deployments across AWS Trainium, Google TPU, and Nvidia GPU, and an internal orchestration layer that lets a chip serve inference in the morning and run reinforcement learning the same evening. Krishna Rao explains the cone of uncertainty that governs gigawatt scale compute procurement, the floor Anthropic refuses to drop below on model development compute, the Jevons paradox unlock from cutting Opus pricing, and the more than 500 percent annualized net dollar retention from enterprise customers. He also walks through the layer cake of long term deals with Google, Broadcom, and Amazon, the recent xAI Colossus tie up in Memphis, the phased release of the Mythos model in response to spiking cyber capabilities, the internal use of Claude Code to produce statutory financial statements and run a Monthly Financial Review skill, and why the team believes scaling laws are alive and well. The interview also covers fundraising history through Series D and Series E, the $75 billion already raised plus another $50 billion in commitments, talent density beating talent mass during the Meta poaching wave, and Rao’s belief that biotech and drug discovery represent the most exciting frontier for AI.

    Key Takeaways

    • Anthropic entered the year with about $9 billion of run rate revenue and ended the first quarter with north of $30 billion of run rate revenue, a more than 3x leap driven by model intelligence gains and the products built around them.
    • Compute is described as the lifeblood of the company, the canvas everything else is built on, and the most consequential class of decisions Rao makes. Buy too much and you go bankrupt. Buy too little and you cannot serve customers or stay at the frontier.
    • Rao spends 30 to 40 percent of his time on compute, even today, and the leadership team meets repeatedly on both procurement and ongoing compute allocation.
    • Anthropic is the only frontier language model lab actively using all three major chip platforms in production: AWS Trainium, Google TPU, and Nvidia GPU. Claude is also the only major model available on all three major clouds.
    • Flexibility is the central design principle. Anthropic builds flexibility into the deals themselves, into the orchestration layer that maps workloads to chips, and into compilers built from the chip level up.
    • The cone of uncertainty frames procurement. Small differences in weekly or monthly growth compound into wildly different two year outcomes, so the team plans across a range of scenarios rather than a single point estimate, and leans toward the upper end of that range while protecting the downside.
    • Compute allocation across the company sits in three buckets: model development and research, internal employee acceleration, and external customer serving. A non negotiable floor protects model development even when customer demand is tight.
    • Anthropic estimates that if it cut off internal employee use of its own models, the freed compute could serve billions of dollars of additional revenue. It chooses not to, because internal use compounds into better future models.
    • Intelligence is multi dimensional, not a single IQ score. Anthropic measures real world capability through customer feedback, long horizon task performance, tool use, computer use, and speed at agentic tasks, not just leaderboard benchmarks that have largely saturated.
    • Each Opus generation, 4 to 4.5 to 4.6 to 4.7, delivers both capability improvements and an efficiency multiplier on token processing. New models often serve customers at a fraction of the prior cost while doing more.
    • Reinforcement learning is described as inference inside a sandbox with a reward function, so model efficiency gains directly improve internal RL throughput. The flywheel is tightly coupled.
    • Over 90 percent of code at Anthropic is now written by Claude Code, and a large share of Claude Code itself is written by Claude Code.
    • Anthropic shipped roughly 30 distinct product and feature releases in January and the pace has accelerated since.
    • Scaling laws, in Anthropic’s internal data, are alive and well. The team holds itself to a skeptical scientific standard and still does not see them slowing down.
    • Anthropic recently signed a 5 gigawatt deal with Google and Broadcom for TPUs starting in 2027, plus an Amazon Trainium agreement for up to 5 gigawatts, totaling more than $100 billion in commitments. A significant portion lands this year and next year.
    • A new partnership for capacity at the xAI Colossus facility in Memphis was announced just before the interview, aimed at expanding consumer and prosumer capacity.
    • Pricing has been remarkably stable across Haiku, Sonnet, and Opus. The biggest deliberate change was lowering Opus pricing, which produced a textbook Jevons paradox: consumption rose by far more than the price fell, and the new Opus 4.6 and 4.7 slot in at the same price point.
    • Mythos is the first model Anthropic chose to release in a phased way because of a sharp spike in cyber capability. In an open source codebase where a prior model found 22 security vulnerabilities, Mythos found roughly 250.
    • The Mythos release framework focuses on defensive use first, expands access over time, and is presented as a template for future capability spikes.
    • Anthropic now sells to 9 of the Fortune 10 and reports net dollar retention above 500 percent on an annualized basis. These are not pilots. Rao describes signing two double digit million dollar commitments during a 20 minute Uber ride to the studio.
    • The platform strategy is mostly horizontal. Anthropic will go vertical with offerings like Claude for Financial Services, Claude for Life Sciences, and Claude Security where it can demonstrate the model’s capabilities, but expects most application value to accrue to customers building on top.
    • Anthropic has raised over $75 billion in equity since Rao joined, with another $50 billion in commitments tied to the Amazon and Google deals. Capital intensity is real, but the raises fund the upper end of the cone of uncertainty more than they fund current losses.
    • The Series E close coincided with the day the DeepSeek news broke, forcing investors to reassess their AI thesis in real time. Anthropic closed the round anyway.
    • Inside finance, Claude now produces statutory financial statements for every Anthropic legal entity, with a human checker. A library of more than 70 finance specific skills underpins workflows.
    • A custom Monthly Financial Review skill produces a 90 to 95 percent ready monthly close report, so leadership discussion shifts from reconciling numbers to debating implications.
    • An internal real time analytics platform called Anthrop Stats compresses weekly insight cycles from hours to about 30 minutes.
    • The biggest token user inside Anthropic’s finance team is the head of tax, focused on tax policy engines and workflow automation. The most senior people, not the youngest, are leading internal adoption.
    • Talent density beats talent mass. When Meta and others ran aggressive offer waves, Anthropic lost two people while peer labs lost dozens.
    • All seven Anthropic co founders remain at the company, as do most of the first 20 to 30 employees, which Rao credits to a collaborative, transparent, debate friendly culture and a real culture interview that can veto otherwise top tier candidates.
    • Dario Amodei holds an open all hands every two weeks, writes a short prepared document, and takes unscripted questions from anyone at the company.
    • AI safety investments in interpretability and alignment have a commercial side effect. Looking inside the model helps Anthropic build better models, and enterprises running sensitive workloads want to trust the lab they hand customer data to.
    • Anthropic explicitly identifies as America first in its approach to model development, and engages closely with the US administration on capability releases such as Mythos.
    • The longer term product vision is the virtual collaborator: an agent with organizational context, access to the company’s tools, persistent memory, and the ability to work on ideas, not just tasks, over long horizons.
    • CoWork, Anthropic’s extension of the Claude Code paradigm into general knowledge work, is being adopted faster than Claude Code itself when indexed to the same point in its launch curve.
    • Anthropic’s product teams ship daily, with a fleet of agents working across the company on specific tasks. Everyone effectively becomes a manager of agents.
    • The dominant downside risks to Anthropic’s high end forecast are slower customer diffusion of model capability into real workflows, scaling laws flattening unexpectedly, and Anthropic losing its position at the frontier.
    • Rao is most excited about biotech and healthcare outcomes, especially the prospect that AI could push drug discovery and lab throughput up 10x or 100x, turning currently incurable diagnoses into treatable ones within a patient’s lifetime.

    Detailed Summary

    Compute as Lifeblood and the Cone of Uncertainty

    Rao opens with the claim that compute is the most important resource at Anthropic, and the most consequential decision class in the company. You cannot buy a gigawatt of compute next week. You have to anticipate demand a year or two in advance, and the cost of being wrong in either direction is high. Buy too much and the unit economics collapse. Buy too little and you cannot serve customers or stay at the frontier, which are described as the same failure mode. To navigate this, the team uses a cone of uncertainty rather than point estimates. Small differences in weekly growth compound into vastly different two year outcomes, and Anthropic tries to position itself toward the upper end of that cone while preserving optionality. Rao notes he has had to consciously break a lifetime of linear thinking and force himself into exponential models.
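
    The compounding behind the cone of uncertainty can be illustrated with a minimal sketch. The weekly growth rates here are hypothetical and are not figures from the interview; only the $9 billion starting point comes from the summary. Under these assumptions, a two point gap in weekly growth separates the two year outcomes by more than 7x.

```python
# Hypothetical illustration: the weekly growth rates below are assumptions,
# not figures from the interview. Only the $9B starting point is from the text.

def project_run_rate(start_billion: float, weekly_growth: float, weeks: int) -> float:
    """Compound a starting run rate at a constant weekly growth rate."""
    return start_billion * (1.0 + weekly_growth) ** weeks


def cone(start_billion: float, weekly_rates: list[float], weeks: int) -> dict[float, float]:
    """Map each assumed weekly growth rate to its projected run rate."""
    return {rate: project_run_rate(start_billion, rate, weeks) for rate in weekly_rates}


if __name__ == "__main__":
    # 104 weeks is roughly two years.
    scenarios = cone(start_billion=9.0, weekly_rates=[0.01, 0.02, 0.03], weeks=104)
    for rate, value in scenarios.items():
        print(f"{rate:.0%} weekly growth -> ${value:,.1f}B run rate after two years")
```

    Planning across the whole dictionary of scenarios, rather than a single entry, is the point estimate versus range distinction described above.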

    Three Chip Platforms, One Orchestration Layer

    Anthropic uses Amazon’s Trainium, Google’s TPUs, and Nvidia’s GPUs fungibly. That was not free. Adopting TPUs at scale started around the third TPU generation, when outside observers thought it was a strange choice. Anthropic invested years into compilers and orchestration so workloads can flow across chips by generation and by job type. The team works deeply with Annapurna Labs at AWS to influence Trainium roadmaps because Anthropic stresses these chips harder than almost anyone. The result is what Rao believes is the most efficient utilization of compute across any frontier lab, with a dollar of compute going further inside Anthropic than anywhere else.

    Three Buckets and the Model Development Floor

    Compute gets allocated across model development, internal acceleration of employees, and customer serving. The conversations are collaborative rather than zero sum, but there is a hard floor on model development that the company refuses to cross even if it makes customer demand harder to serve in the short term. The thesis is simple. The returns to frontier intelligence are extremely high, especially in enterprise, so cutting model investment to chase near term revenue is a bad trade. Internal employee use is also explicitly protected. Rao notes that diverting that internal usage to external customers would unlock billions of additional revenue today, but the compounding benefit of accelerating researchers and engineers outweighs that.

    Intelligence Is Multi Dimensional

    Rao pushes back hard on the IQ framing of model progress. Benchmarks saturate quickly, and the real signal comes from how customers actually use the models. Anthropic looks at long horizon task completion, tool use, computer use, and time to result on agentic tasks. Two equally capable agents that differ only in speed produce dramatically different value, because the faster one compounds into more attempts and more outcomes. Frontier model leaps also come with better fuel efficiency. The sedan to sports car analogy breaks down because, unlike a faster car that burns more fuel, each Opus generation, 4 to 4.5 to 4.6 to 4.7, delivers both a step up in capability and a multiplier on per token efficiency.

    From 9 Billion to 30 Billion ARR in One Quarter

    The headline number for the quarter is a leap from about $9 billion of run rate revenue to over $30 billion, accomplished without onboarding a corresponding step up in compute, because new compute lands on ramps locked in 12 months prior. Rao attributes the leap to model capability gains, products that surface that intelligence in usable form factors, and an enterprise customer base that pulls more workloads onto Claude as each generation unlocks new use cases. Coding started the wave with Sonnet 3.5 and 3.6, and the same pattern is now playing out elsewhere in the economy.
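
    As a rough check on the headline numbers, the growth rate implied by moving from $9 billion to $30 billion can be backed out with a short sketch. The 13 week quarter and the assumption of smooth, constant weekly compounding are simplifications introduced here, not claims from the interview.

```python
# Sketch under stated assumptions: a 13 week quarter and constant weekly
# compounding. The $9B and $30B endpoints are the figures quoted in the text.

def implied_periodic_growth(start: float, end: float, periods: int) -> float:
    """Return the constant per-period growth rate that turns start into end."""
    return (end / start) ** (1.0 / periods) - 1.0


if __name__ == "__main__":
    weekly = implied_periodic_growth(start=9.0, end=30.0, periods=13)
    print(f"Overall multiple: {30.0 / 9.0:.2f}x")
    print(f"Implied constant weekly growth: {weekly:.1%}")
```

    Real revenue rarely grows this smoothly, so the result is best read as an order of magnitude rather than a measured weekly rate.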

    Recursive Self Improvement and Talent Density

    Over 90 percent of Anthropic’s code is now written by Claude Code, including most of Claude Code itself. Rao describes this as a structural reason to keep allocating internal compute to employees even when external demand is hungry. Recursive self improvement is not happening through models that need no humans. It is happening through researchers who set direction and use frontier models to compress months of work into days. Talent density beats talent mass. When Meta and other labs went after Anthropic researchers with very large packages, Anthropic lost two people while peer labs lost dozens.

    Procurement Strategy and the Layer Cake

    Compute lands as a layer cake. Last month Anthropic signed a 5 gigawatt TPU deal with Google and Broadcom starting in 2027, alongside an Amazon Trainium agreement for up to 5 gigawatts. The total is north of $100 billion in commitments. A new tie up with xAI’s Colossus facility in Memphis was announced just before the interview, intended for nearer term capacity to support consumer and prosumer growth. Anthropic evaluates near term and long term compute deals against the same set of variables: price, duration, location, chip type, and how efficiently the team can run it. The relationships are deeper than procurement. The hyperscalers are also distribution channels for the model.

    Platform First, Selective Vertical Bets

    Rao describes Anthropic as a platform first business, with most expected value accruing to customers building on the platform. The team will go vertical only when it can demonstrate capabilities that skate to where the puck is going, as Claude Code did before the models could fully support it, or when it wants to set a template for an industry vertical, as with Claude for Financial Services, Claude for Life Sciences, and Claude Security. He acknowledges that surprise capability jumps make customers anxious about the platform competing with them, and frames Anthropic’s mitigation as deeper partnerships, early access programs, and an emphasis on accelerating customer building rather than disintermediating it.

    Pricing, Jevons Paradox, and Return on Compute

    Pricing across Haiku, Sonnet, and Opus has been stable. The notable exception is Opus, which Anthropic deliberately repriced lower when launching Opus 4.5 because Opus class problems were being squeezed into Sonnet workloads. Efficiency gains made it possible to serve Opus profitably at the new level. The consumption response was a classic Jevons paradox, with usage rising far more than the price reduction would have predicted, and Opus 4.6 then slotted in at the same price with a capability bump. Margins are not framed as a per token markup. Compute is fungible across model development, internal acceleration, and customer serving, so Anthropic measures return on the entire compute envelope rather than software style variable cost per call.
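
    The pricing dynamic described above can be made concrete with a small sketch. The price and usage ratios are hypothetical, since the summary gives no actual figures, and treating "usage grows by more than the price falls" as the test for a Jevons style response is a simplification of the economic concept.

```python
# Hypothetical numbers only: the summary does not state the actual Opus price
# cut or the resulting usage change.

def revenue_multiple(price_ratio: float, usage_ratio: float) -> float:
    """New revenue as a multiple of old revenue.

    price_ratio: new price divided by old price (0.5 means a 50% cut).
    usage_ratio: new token volume divided by old token volume.
    """
    return price_ratio * usage_ratio


def is_jevons_like(price_ratio: float, usage_ratio: float) -> bool:
    """True when a price cut is more than offset by the rise in usage."""
    return price_ratio < 1.0 and revenue_multiple(price_ratio, usage_ratio) > 1.0


if __name__ == "__main__":
    for price_ratio, usage_ratio in [(0.5, 1.5), (0.5, 4.0)]:
        multiple = revenue_multiple(price_ratio, usage_ratio)
        label = "Jevons-like" if is_jevons_like(price_ratio, usage_ratio) else "not Jevons-like"
        print(f"price x{price_ratio}, usage x{usage_ratio}: revenue x{multiple:.2f} ({label})")
```

    Note that this looks only at revenue. The return on compute framing in the paragraph above would also need the cost of serving the extra tokens, which is not modeled here.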

    Fundraising, DeepSeek, and Capital Intensity

    Rao joined while Anthropic was closing its Series D, in the middle of a frontier model launch and during the FTX share liquidation. Investors initially questioned whether Anthropic needed a frontier model, whether AI safety and a real business could coexist, and why the sales team was so small. The Series E closed the same day the DeepSeek news broke, with markets violently re pricing AI in real time. Since Rao joined, Anthropic has raised over $75 billion, with another $50 billion tied to the Amazon and Google compute deals. The reason for the size of the raises is the cone of uncertainty, not current losses. Returns on compute today are described as robust.

    Mythos, Cyber Capability, and Phased Releases

    The Mythos release marks the first time Anthropic shipped a model under a deliberately phased rollout because of a specific capability spike. Cyber is the dimension that spiked. Where a prior model found 22 vulnerabilities in an open source codebase, Mythos found roughly 250. The defensive applications, automatically patching massive codebases, are genuinely valuable, but the offensive risk is real enough that Anthropic chose to release to a smaller group first and expand access over time. Rao positions this as a template for future capability spikes, not a permanent restriction. He also describes the relationship with the US administration as cooperative, including the Department of War interaction, with Anthropic supporting a regulatory framework that does not strangle innovation but takes responsibility seriously.

    Claude Inside Finance

    Anthropic’s finance team is one of the strongest internal case studies. Statutory financial statements for every legal entity are produced by Claude, with a human reviewer. A skill library of more than 70 finance specific skills underpins a Monthly Financial Review skill that drafts the monthly close at 90 to 95 percent ready, so leadership meetings shift from explaining the numbers to discussing what to do about them. An internal analytics platform called Anthrop Stats compresses weekly insight cycles from hours to 30 minutes. The biggest internal token user in finance is the head of tax, building policy engines, which Rao highlights as evidence that adoption is driven by the most senior people, not just younger engineers.

    Culture, Co Founders, and the Race to the Top

    Seven co founders should not, on paper, work as a leadership group. Rao argues it works because the culture was set early around collaboration, intellectual honesty, transparency, and humility. The culture interview is a real veto, not a checkbox. Dario Amodei runs an all hands every two weeks with a short written piece followed by unscripted questions, and decisions, once made, get clean alignment rather than residual politics. Anthropic frames its approach as a race to the top, where being a model for how to build the technology responsibly is itself a recruiting and retention advantage.

    The Virtual Collaborator and the Frontier Ahead

    The product vision Rao describes is the virtual collaborator. Not just a smarter chatbot, but an agent with organizational context, access to the company’s tools, memory, and the ability to work on ideas over long horizons. Coding was the first domain to feel this, but CoWork, Anthropic’s extension of the Claude Code pattern into general knowledge work, is being adopted faster than Claude Code was at the same age. Product development inside Anthropic already looks different. Teams ship daily, with fleets of agents working across the company, and individual humans increasingly act as managers of those fleets.

    Downside Risks and What Excites Him Most

    The three risks Rao names when asked to do a premortem on a softer year are slower customer diffusion of model capability into real workflows, scaling laws unexpectedly flattening, and Anthropic losing its frontier position to competitors. None of these is observed today, but he is unwilling to rule them out with certainty. On the upside, he is most excited about biotech and healthcare. Lab throughput rising 10x or 100x, paired with AI assisted clinical workflows, could turn currently incurable diagnoses into treatable ones within a patient’s lifetime. That is the outcome he wants the technology to chase.

    Thoughts

    The most consequential structural point in this interview is the framing of compute as a single fungible resource pool measured by return on the entire envelope, not as a variable cost per inference call. That accounting shift, if you accept it, breaks most of the bear cases about AI lab unit economics. The bear argument almost always assumes that a token served to a customer is the only thing the chip did that day. Rao’s version is that the same fleet trains models in the morning, runs reinforcement learning at lunch, serves customers in the afternoon, and accelerates internal engineers in the evening. If even half of that is real, the right comparison is total compute spend versus total enterprise value created by the platform, and on that ratio Anthropic looks structurally strong rather than weak.

    The Jevons paradox on Opus pricing is the most actionable insight for anyone running an AI product. Most teams default to either chasing premium pricing on the newest model or undercutting to chase volume. Anthropic did something more disciplined: it left Sonnet and Haiku alone, dropped the Opus price when efficiency gains made it servable, and watched aggregate usage rise by more than the price fell. The lesson is that frontier model pricing is not really a price problem. It is a capability access problem, and elasticity around the right tier is much higher than the standard SaaS playbook implies.

    The Mythos cyber jump deserves more attention than it has gotten. Going from 22 to 250 vulnerabilities found in the same codebase is the kind of capability discontinuity that genuinely changes the regulatory calculus. Anthropic is signaling that it can identify these discontinuities ahead of release and choose a deployment shape that respects them. Whether peer labs adopt similar discipline is the open question. Anthropic’s race to the top framing assumes they will be forced to. The competitive market may say otherwise.

    The hiring data point is the most underrated investor signal. Two departures while peer labs lost dozens, during the most aggressive talent war in tech history, is not a culture poster. It is a structural advantage that compounds every time another lab tries to buy its way to the frontier. Money can be matched. Conviction in the mission, transparent leadership, and a culture interview that can veto otherwise stellar candidates cannot. If you believe scaling laws hold, talent retention at this density is one of the few moats that actually scales with capital.

    Finally, the most interesting personal admission is that Krishna Rao, a finance leader trained at Blackstone and Cedar, is openly telling investors that linear thinking is the failure mode he had to break out of. The companies that pattern match this moment to prior technology waves are mispricing it, in both directions. The cone of uncertainty Anthropic uses internally is the right metaphor for everyone else too. If you are forecasting AI as if it is cloud in 2010, you are almost certainly wrong, and the magnitude of the error is much larger than it would be in any prior era.

    Watch the full conversation with Krishna Rao on Invest Like the Best here.

  • Jensen Huang on Nvidia’s Supply Chain Moat, TPU Competition, China Export Controls, and Why Nvidia Will Not Become a Cloud (Dwarkesh Podcast Summary)

    TLDW (Too Long, Didn’t Watch)

    Jensen Huang sat down with Dwarkesh Patel for over 90 minutes covering Nvidia’s supply chain dominance, the TPU threat, why Nvidia will not become a hyperscaler, whether the US should sell AI chips to China, and why Nvidia does not pursue multiple chip architectures at once. Jensen framed Nvidia’s entire business as transforming “electrons into tokens” and argued that Nvidia’s real moat is not any single technology but the full stack ecosystem it has built over two decades. He was blunt about his regret over not investing in Anthropic and OpenAI earlier, passionate about keeping the American tech stack dominant worldwide, and dismissive of the idea that China’s chip industry can be meaningfully contained through export controls.

    Key Takeaways

    1. Nvidia’s moat is the ecosystem, not the chip. Jensen repeatedly emphasized that Nvidia’s competitive advantage comes from CUDA, its massive installed base, its deep partnerships across the entire supply chain, and the fact that it operates in every cloud. The moat is not a single product but an interlocking system that took 20+ years to build.

    2. Supply chain bottlenecks are temporary, energy bottlenecks are not. Jensen argued that CoWoS packaging, HBM memory, EUV capacity, and logic fabrication bottlenecks can all be resolved in two to three years with the right demand signal. The real constraint on AI scaling is energy policy, which takes far longer to fix.

    3. TPUs and ASICs are not an existential threat to Nvidia. Jensen was emphatic that no competitor has demonstrated better price-performance or performance-per-watt than Nvidia, and challenged TPU and Trainium to prove otherwise on public benchmarks like InferenceMAX and MLPerf. He described Anthropic as a “unique instance, not a trend” for TPU adoption.

    4. Jensen regrets not investing in Anthropic and OpenAI earlier. He admitted he did not deeply internalize how much capital AI labs needed and that traditional VC funding was not sufficient for companies at that scale. He described this as a clear miss, though he said Nvidia was not in a position to make multi-billion dollar investments at the time.

    5. Nvidia will not become a hyperscaler. Jensen’s philosophy is “do as much as needed, as little as possible.” Building cloud infrastructure is something other companies can do, so Nvidia supports neoclouds like CoreWeave, Nebius, and Nscale instead of competing with them. Nvidia invests in ecosystem partners rather than vertically integrating into cloud services.

    6. Jensen is strongly against US chip export controls on China. This was the longest and most heated segment of the interview. Jensen argued that China already has abundant compute, energy, and AI researchers, and that export controls have accelerated China’s domestic chip industry while causing the US to concede the world’s second-largest technology market. He compared the situation to how US telecom policy allowed Huawei to dominate global telecommunications.

    7. AI will cause software tool usage to skyrocket, not collapse. Jensen pushed back on the narrative that AI will commoditize software companies. He argued that agents will use existing tools at massive scale, causing the number of instances of products like Excel, Synopsys Design Compiler, and other enterprise tools to grow exponentially.

    8. Nvidia does not pick winners among AI labs. Jensen explained that Nvidia invests across multiple foundation model companies simultaneously and refuses to favor any single one. He cited his own company’s unlikely survival story as the reason for this humility: Nvidia’s original graphics architecture was “precisely wrong” and would have been counted out by anyone picking winners.

    9. Nvidia added Groq for premium token economics. Nvidia recently acquired Groq and is folding it into the CUDA ecosystem because the market is now segmenting into different token tiers. Some customers will pay premium prices for faster response times even at lower throughput, creating a new segment of the inference market.

    10. Without AI, Nvidia would still be very large. Jensen was clear that accelerated computing, not AI specifically, is the foundational mission of the company. Molecular dynamics, quantum chemistry, computational lithography, data processing, and physics simulation all benefit from GPU acceleration regardless of deep learning.

    Detailed Summary

    Nvidia’s Real Business: Electrons to Tokens

    Jensen opened the conversation by reframing Nvidia’s entire value proposition. When Dwarkesh suggested that Nvidia is fundamentally a software company that sends a GDS2 file to TSMC for manufacturing, Jensen pushed back hard. He described Nvidia’s job as transforming electrons into tokens, with everything in between representing an “incredible journey” of artistry, engineering, science, and invention. He said the transformation is far from deeply understood and the journey is far from over, making commoditization unlikely.

    Jensen described Nvidia as operating on a philosophy of doing “as much as necessary and as little as possible.” Whatever Nvidia does not need to do itself, it hands to a partner and makes part of the broader ecosystem. This is why Nvidia has what Jensen called probably the largest ecosystem of partners in the industry, spanning the full supply chain upstream and downstream, application developers, model makers, and all five layers of the AI stack.

    On the question of whether AI will commoditize software companies, Jensen offered a contrarian take. He argued that agents are going to use software tools at unprecedented scale, meaning the number of instances of products like Excel, Cadence design tools, and Synopsys compilers will skyrocket. Today the bottleneck is the number of human engineers. Tomorrow, those engineers will be supported by swarms of agents exploring design spaces and using the same tools humans use today. Jensen said the reason this has not happened yet is simply that the agents are not good enough at using tools. That will change.

    The Supply Chain Moat

    Dwarkesh pressed Jensen on Nvidia’s reported $100 billion (and potentially $250 billion) in purchase commitments with foundries, memory manufacturers, and packaging companies. The question was whether Nvidia’s real moat for the next few years is simply locking up scarce upstream components so that no competitor can get the memory and logic they need to build alternative accelerators.

    Jensen confirmed this is a significant advantage but framed it differently. He said Nvidia has made enormous explicit and implicit commitments upstream. The implicit commitments matter just as much: Jensen personally meets with CEOs across the supply chain to explain the scale of the coming AI industry, convince them to invest in capacity, and assure them that Nvidia’s downstream demand is large enough to justify that investment. Nvidia’s GTC conference serves this purpose too, bringing the entire ecosystem together so upstream suppliers can see downstream demand and vice versa.

    Jensen described a process of systematically “prefetching bottlenecks” years in advance. CoWoS advanced packaging was a major bottleneck two years ago, but Nvidia swarmed it with repeated doubling of capacity until TSMC recognized it as mainstream computing technology rather than a specialty product. More recently, Nvidia has invested in the silicon photonics ecosystem through partnerships with Lumentum and Coherent, invented new packaging technologies, licensed patents to keep the supply chain open, and even invested in new testing equipment like double-sided probing.

    When Dwarkesh asked about the ultimate physical bottlenecks, Jensen surprised him. The hardest bottleneck to solve is not CoWoS or HBM or EUV machines. It is the plumbers and electricians needed to build data centers. Jensen used this as a launching point to criticize “doomers” who discourage people from pursuing careers in software engineering or radiology, arguing that scaring people out of these professions creates the real bottlenecks.

    On EUV and logic scaling specifically, Jensen was optimistic. He said no supply chain bottleneck lasts longer than two to three years. Once you can build one of something, you can build ten, and once you can build ten, you can build a million. The key is a clear demand signal. If TSMC is convinced of the demand, ASML will produce enough EUV machines. Meanwhile, Nvidia continues to improve computing efficiency by 10x to 50x per generation through architecture, algorithms, and system design.

    The TPU Question

    Dwarkesh pushed hard on whether Google’s TPUs represent a real threat, noting that two of the top three AI models (Claude and Gemini) were trained on TPUs. Jensen drew a sharp distinction between what Nvidia builds and what a TPU is. Nvidia builds accelerated computing, which serves molecular dynamics, quantum chromodynamics, data processing, fluid dynamics, particle physics, and AI. A TPU is a tensor processing unit optimized for matrix multiplies. Nvidia’s market reach is far greater than any TPU or ASIC can possibly have.

    Jensen emphasized programmability as Nvidia’s core architectural advantage. If you want to invent a new attention mechanism, build a hybrid SSM model, fuse diffusion and autoregressive techniques, or disaggregate computation in a novel way, you need a generally programmable architecture. The only way to achieve 10x or 100x performance leaps (versus the roughly 25% per year from Moore’s Law) is to fundamentally change the algorithm, and that requires the flexibility CUDA provides.
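
    To put the 25% figure in perspective, here is a minimal sketch of the compounding arithmetic. It assumes a steady 25% annual improvement (the rough Moore’s Law rate quoted above) and asks how long that pace alone would take to deliver a 10x or 100x gain. The function name `years_to_reach` is illustrative and does not come from the interview.

```python
import math

def years_to_reach(target_multiple: float, annual_gain: float = 0.25) -> float:
    """Years of steady compounding at `annual_gain` needed to reach `target_multiple`."""
    if target_multiple <= 0 or annual_gain <= 0:
        raise ValueError("target_multiple and annual_gain must be positive")
    # Solve (1 + annual_gain) ** years == target_multiple for years.
    return math.log(target_multiple) / math.log(1.0 + annual_gain)

# Assumption: a constant 25% per year, the rough figure cited in the text.
for multiple in (10, 100):
    print(f"{multiple}x at 25% per year: about {years_to_reach(multiple):.1f} years")
```

    Under that assumption, a 10x gain takes a little over ten years and a 100x gain a little over twenty, which is the gap Jensen argues only algorithmic change can close within a single product generation.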

    On the specific question of whether hyperscalers with huge engineering teams can simply write their own kernels and bypass CUDA, Jensen acknowledged they do write custom kernels but argued that Nvidia’s engineers still routinely deliver 2x to 3x speedups when they optimize a partner’s stack. He described Nvidia’s GPUs as “F1 racers” that anyone can drive at 100 mph, but extracting peak performance requires deep architectural expertise. Nvidia uses AI itself to generate many of its optimized kernels.

    Jensen was particularly blunt about public benchmarks. He pointed to Dylan Patel’s InferenceMAX benchmark and said neither TPU nor Trainium has been willing to demonstrate their claimed performance advantages on it. He said Nvidia’s performance-per-TCO is the best in the world, “bar none,” and challenged anyone to prove otherwise.

    Regarding Anthropic’s multi-gigawatt deal with Broadcom and Google for TPUs, Jensen called it “a unique instance, not a trend.” He said without Anthropic, there would be essentially no TPU growth and no Trainium growth. He traced this back to his own mistake: when Anthropic and OpenAI needed multi-billion dollar investments from their compute suppliers to get off the ground, Nvidia was not in a position to provide that capital. Google and AWS were, and in return, Anthropic committed to using their compute.

    Nvidia’s Investment Strategy and Regrets

    Jensen was unusually candid about his regret over not investing in foundation model companies earlier. He said he did not deeply internalize how different AI labs were from typical startups. A traditional VC would never put $5 to $10 billion into a single AI lab, but that was exactly what companies like OpenAI and Anthropic needed. By the time Jensen understood this, Nvidia was not in a financial or cultural position to make those kinds of investments.

    Now, Nvidia has invested approximately $30 billion in OpenAI and $10 billion in Anthropic. Jensen said he is delighted to support both and considers their existence essential for the world. But he acknowledged that these investments came at much higher valuations than would have been possible years earlier.

    Jensen explained Nvidia’s broader investment philosophy: support everyone, do not pick winners. If he invests in one foundation model company, he invests in all of them. This comes from hard-won humility. When Nvidia started, there were 60 3D graphics companies. Nvidia’s original architecture was “precisely wrong” and the company would have been at the top of most lists to fail. Jensen said he has enough humility from that experience to know that you cannot predict which AI company will ultimately succeed.

    Why Nvidia Will Not Become a Hyperscaler

    Dwarkesh pointed out that Nvidia has the cash to build and operate its own cloud infrastructure, bypassing the middleman ecosystem that converts CapEx into OpEx for AI labs. Jensen rejected this path based on his core operating philosophy.

    If Nvidia had not built its computing platform, NVLink, and the CUDA ecosystem, nobody else would have. Jensen is “completely certain” of that. These are things Nvidia must do. But the world has lots of clouds. If Nvidia did not build a cloud, someone else would show up. So the answer is to support the ecosystem instead: invest in CoreWeave, Nscale, Nebius, and others to help them exist and scale, rather than competing with them.

    Jensen was clear that Nvidia is not trying to be in the financing business either. When OpenAI needed a $30 billion investment before its IPO, Nvidia stepped up because OpenAI needed it and Nvidia deeply believed in the company. But these are targeted ecosystem investments, not a strategic pivot into cloud services.

    On GPU allocation during shortages, Jensen pushed back on the narrative that Nvidia strategically “fractures” the market by giving allocations to smaller neoclouds. He said the process is straightforward: you forecast demand, you place a purchase order, and it is first in, first out. Nvidia never changes prices based on demand. Jensen said he prefers to be dependable and serve as the foundation of the industry rather than extracting maximum short-term value.

    The China Debate

    The longest and most heated section of the interview was Jensen’s case against US chip export controls on China. This was a genuine debate, with Dwarkesh pushing the national security argument and Jensen pushing back forcefully.

    Jensen’s core argument rested on several pillars. First, China already has abundant compute. It manufactures 60% or more of the world’s mainstream chips, has massive energy infrastructure (including empty data centers with full power), and employs roughly 50% of the world’s AI researchers. The threshold of compute needed to build models like Anthropic’s Mythos has already been reached and exceeded by China’s existing infrastructure.

    Second, export controls have backfired. They accelerated China’s domestic chip industry, forced its AI ecosystem to optimize for domestic architectures instead of the American tech stack, and caused the United States to concede the second-largest technology market in the world. Jensen compared this directly to how US telecom policy allowed Huawei to dominate global telecommunications infrastructure.

    Third, Jensen argued that AI is a five-layer stack (energy, chips, computing platform, models, applications) and the US needs to win at every layer. Fixating on one layer (models) at the expense of another layer (chips) is counterproductive. If Chinese open source AI models end up optimized for non-American hardware and that stack gets exported to the global south, the Middle East, Africa, and Southeast Asia, the US will have lost something far more valuable than whatever marginal compute advantage the export controls provided.

    Dwarkesh countered with the Mythos example: Anthropic’s new model found thousands of high-severity zero-day vulnerabilities across every major operating system and browser, including one that had existed in OpenBSD for 27 years. If China had enough compute to train and deploy a model like Mythos at scale before the US could prepare, the cyber-offensive capabilities would be devastating.

    Jensen’s response was direct. Mythos was trained on “fairly mundane capacity” that is already abundantly available in China. The amount of compute is not the bottleneck for that kind of breakthrough. Great computer science is, and China has no shortage of brilliant AI researchers. He pointed to DeepSeek as evidence: most advances in AI come from algorithmic innovation, not raw hardware. If China’s researchers can achieve breakthroughs like DeepSeek with limited hardware, imagine what they could do with more.

    Jensen also argued for dialogue over confrontation. He said it is essential that American and Chinese AI researchers are talking to each other, and that both countries agree on what AI should not be used for. The idea that you can prevent AI risks by cutting off chip sales, when the real advances come from algorithms and computer science, reflects a fundamental misunderstanding of how AI progress works.

    The debate ended without resolution, but Jensen’s final point was sharp: “I’m not talking to somebody who woke up a loser. That loser attitude, that loser premise, makes no sense to me.”

    Why Not Multiple Chip Architectures?

    Near the end of the interview, Dwarkesh asked why Nvidia does not run multiple parallel chip projects with different architectures, like a Cerebras-style wafer-scale design or a Dojo-style huge package, or even one without CUDA.

    Jensen’s answer was simple: “We don’t have a better idea.” Nvidia simulates all of these alternative approaches in its internal simulators and they are provably worse. The company works on exactly the projects it wants to work on. If the workload were to change dramatically (not just the algorithms, but the actual market shape), Nvidia might add other accelerators.

    In fact, Nvidia recently did exactly this by acquiring Groq. The inference market is now segmenting into different tiers. Some customers will pay premium prices for extremely fast response times even if throughput is lower. This creates a new “high ASP token” segment that justifies a different point on the performance curve. But Jensen was clear: if he had more money, he would put it all behind Nvidia’s existing architecture, not diversify into alternatives.

    Nvidia Without AI

    Jensen closed by saying that even if the deep learning revolution had never happened, Nvidia would be “very, very large.” The premise of the company has always been that general-purpose computing cannot scale indefinitely and that domain-specific acceleration is the way forward. Molecular dynamics, seismic processing, image processing, computational lithography, quantum chemistry, and data processing all benefit from GPU acceleration regardless of AI. Jensen said the fundamental promise of accelerated computing has not changed, “not even a little bit.”

    Thoughts

    This interview is one of the most revealing Jensen Huang conversations in years, partly because Dwarkesh actually pushes back instead of lobbing softballs. A few things stand out.

    The Anthropic regret is real and significant. Jensen is essentially admitting that Nvidia’s biggest strategic miss of the AI era was not understanding that foundation model companies needed supplier-level capital commitments, not VC funding. The fact that Google and AWS used compute investments to lock in Anthropic’s architecture choices has had downstream consequences that Nvidia is still working to unwind. When Jensen says Anthropic is “a unique instance, not a trend” for TPU adoption, he is simultaneously downplaying the threat and revealing exactly how seriously he takes it.

    The China debate is the highlight. Jensen’s argument is more nuanced than it first appears. He is not saying “sell China everything.” He is saying the current binary approach of near-total restriction has backfired by accelerating China’s domestic chip industry and pushing the Chinese AI ecosystem away from the American tech stack. His comparison to the US telecom industry losing global market share to Huawei is pointed and historically grounded. Whether you agree with his conclusion or not, the framing of AI as a five-layer stack where the US needs to compete at every layer is a useful mental model.

    The “electrons to tokens” framing is Jensen at his best. It is a simple metaphor that captures something genuinely complex about where value is created in the AI supply chain. And his insistence that the transformation is “far from deeply understood” is a subtle way of arguing that Nvidia’s competitive position will be durable because the problem space is not close to being solved.

    The Groq acquisition reveal is interesting for what it signals about the inference market. If Nvidia is creating a separate product tier for premium-priced, low-latency tokens, it suggests the company sees inference economics fragmenting significantly. This aligns with the broader trend of AI becoming an enterprise product where different customers have wildly different willingness to pay based on how they use tokens.

    Finally, Jensen’s refusal to diversify chip architectures is a bold bet. “We simulate it all in our simulator, provably worse” is an incredibly confident statement. History is full of companies that were right until they were not. But Nvidia’s track record of 10x to 50x generation-over-generation improvements through co-design across processors, fabric, libraries, and algorithms is hard to argue with. The question is whether the current paradigm of transformer-based models on GPU clusters represents a local or global optimum for AI compute.

  • Jensen Huang on Lex Fridman: NVIDIA’s CEO Reveals His Vision for the AI Revolution, Scaling Laws, and Why Intelligence Is Now a Commodity

    A deep breakdown of Lex Fridman Podcast #494 featuring Jensen Huang, CEO of NVIDIA, covering extreme co-design, the four AI scaling laws, CUDA’s origin story, the future of programming, AGI timelines, and what it takes to lead the world’s most valuable company.

    TLDW (Too Long, Didn’t Watch)

    Jensen Huang sat down with Lex Fridman for a sprawling two-and-a-half-hour conversation covering the full arc of NVIDIA’s evolution from a GPU gaming company to the engine of the AI revolution. Jensen explains how NVIDIA now thinks in terms of rack-scale and pod-scale computing rather than individual chips, breaks down his four AI scaling laws (pre-training, post-training, test time, and agentic), and reveals the near-existential bet the company made putting CUDA on GeForce. He shares his views on China’s tech ecosystem, his deep respect for TSMC, why he turned down the chance to become TSMC’s CEO, how Elon Musk’s systems engineering approach built Colossus in record time, and why he believes AGI already exists. He also discusses why the future of programming is really about “specification,” why intelligence is being commoditized while humanity is the true superpower, and how he manages the enormous pressure of leading a company that nations and economies depend on. His core message: do not let the democratization of intelligence cause you anxiety. Instead, let it inspire you.

    Key Takeaways

    1. NVIDIA No Longer Thinks in Chips. It Thinks in AI Factories.

    Jensen’s mental model of what NVIDIA builds has fundamentally changed. He no longer picks up a chip to represent a new product generation. Instead, his mental model is a gigawatt-scale AI factory with power generation, cooling systems, and thousands of engineers bringing it online. The unit of computing at NVIDIA has evolved from GPU to computer to cluster to AI factory. His next mental “click” is planetary-scale computing.

    2. Extreme Co-Design Is NVIDIA’s Secret Weapon

    The reason NVIDIA dominates is not just better GPUs. It is the extreme co-design of the entire stack: GPU, CPU, memory, networking, switching, power, cooling, storage, software, algorithms, and applications. Jensen explains that when you distribute workloads across tens of thousands of computers and want them to go a million times faster (not just 10,000 times), every single component becomes a bottleneck. This is a restatement of Amdahl’s Law at scale. NVIDIA’s organizational structure directly reflects this co-design philosophy. Jensen has 60+ direct reports, holds no one-on-ones, and runs every meeting as a collective problem-solving session where specialists across all domains are present and contribute.

    3. The Four AI Scaling Laws Are a Flywheel

    Jensen outlined four distinct scaling laws that form a continuous loop:

    Pre-training scaling: Larger models plus more data equals smarter AI. The industry panicked when people said data was running out, but synthetic data generation has removed that ceiling. Data is now limited by compute, not by human generation.

    Post-training scaling: Fine-tuning, reinforcement learning from human feedback, and curated data continue to scale AI capabilities beyond what pre-training alone achieves.

    Test-time scaling: Inference is not “easy” as many predicted. It is thinking, reasoning, planning, and search. It is far more compute-intensive than memorization and pattern matching. This is why inference chips cannot be commoditized the way many predicted.

    Agentic scaling: A single AI agent can spawn sub-agents, creating teams. This is like scaling a company by hiring more employees rather than trying to make one person faster. The experiences generated by agents feed back into pre-training, creating a flywheel.

    4. The CUDA Bet Nearly Killed NVIDIA

    Putting CUDA on GeForce was one of the most consequential technology decisions in modern history. It increased GPU costs by roughly 50%, which crushed the company’s gross margins at a time when NVIDIA was a 35% gross margin business. The company’s market cap dropped from around $7-8 billion to approximately $1.5 billion. But Jensen understood that install base defines a computing architecture, not elegance. He pointed to x86 as proof: a less-than-elegant architecture that defeated beautifully designed RISC alternatives because of its massive install base. CUDA on GeForce put a supercomputer in the hands of every researcher, every scientist, every student. It took a decade to recover, but that install base became the foundation of the deep learning revolution.

    5. NVIDIA’s Moat Is Trust, Velocity, and Install Base

    Jensen was direct about NVIDIA’s competitive advantage. The CUDA install base is the number one asset. Developers target CUDA first because it reaches hundreds of millions of computers, is in every cloud, every OEM, every country, every industry. NVIDIA ships a new architecture roughly every year. No company in history has built systems of this complexity at this cadence. And the trust that NVIDIA will maintain, improve, and optimize CUDA indefinitely is something developers can count on. If someone created “GUDA” or “TUDA” tomorrow, it would not matter. The install base, velocity of execution, ecosystem breadth, and earned trust create a compounding advantage that is nearly impossible to replicate.

    6. Jensen Believes AGI Is Already Here

    When asked about AGI timelines, Jensen said he believes AGI has been achieved. His reasoning is practical: an agentic system today could plausibly create a web service, achieve virality, and generate a billion dollars in revenue, even if temporarily. This is not meaningfully different from many internet-era companies that did the same thing with technology no more sophisticated than what current AI agents can produce. He does not believe 100,000 agents could build another NVIDIA, but he believes a single agent-driven viral product is within reach right now.

    7. The Future of Programming Is Specification, Not Syntax

    Jensen believes the number of programmers in the world will increase dramatically, not decrease. His reasoning: the definition of coding is expanding to include specification and architectural description in natural language. This expands the population of “coders” from roughly 30 million professional developers to potentially a billion people. Every carpenter, plumber, accountant, and farmer who can describe what they want a computer to build is now a coder. The artistry of the future is knowing where on the spectrum of specification to operate, from highly prescriptive to exploratory and open-ended.

    8. China Is the Fastest Innovating Country in the World

    Jensen gave a nuanced and detailed explanation of why China’s tech ecosystem is so formidable. About 50% of the world’s AI researchers are Chinese. China’s tech industry emerged during the mobile cloud era, so it was built on modern software from the start. The country’s provincial competition creates an insane internal competitive environment. And the cultural norm of knowledge-sharing through school and family networks means China effectively operates as an open-source ecosystem at all times. This is why Chinese companies contribute disproportionately to open source. Their engineers’ brothers, friends, and schoolmates work at competing companies, and sharing knowledge is the cultural default.

    9. The Power Grid Has Enormous Waste That AI Can Exploit

    Jensen proposed a pragmatic solution to the energy problem for AI data centers. Power grids are designed for worst-case conditions with margin, but 99% of the time they run at around 60% of peak capacity. That idle capacity is simply wasted. Jensen wants data centers to negotiate flexible contracts where they absorb excess power most of the time and gracefully degrade during rare peak demand periods. This requires three things: customers accepting that “six nines” uptime may not always be necessary, data centers that can dynamically shift workloads, and utilities that offer tiered power delivery contracts instead of all-or-nothing commitments.

    10. Jensen Turned Down the CEO Role at TSMC

    In 2013, TSMC founder Morris Chang offered Jensen the chance to become CEO of TSMC. Jensen confirmed the story is true and said he was deeply honored. But he had already envisioned what NVIDIA could become and felt it was his sole responsibility to make that vision happen. He sees the relationship with TSMC as one built on three decades of trust, hundreds of billions of dollars in business, and zero formal contracts.

    11. Elon Musk’s Systems Engineering Approach Is Instructive

    Jensen praised Elon Musk’s approach to building the Colossus supercomputer in Memphis in just four months. He highlighted several principles: Elon questions everything relentlessly, strips every process down to the minimum necessary, is physically present at the point of action, and his personal urgency creates urgency in every supplier. Jensen drew a parallel to NVIDIA’s own “speed of light” methodology, where every process is benchmarked against the physical limits of what is possible, not against historical baselines.

    12. Intelligence Is a Commodity. Humanity Is Not.

    Perhaps the most philosophical takeaway from the conversation: Jensen argued that intelligence is a functional, measurable thing that is being commoditized. He has surrounded himself with more than 60 direct reports who are all “superhuman” in their respective domains, more educated and deeper in their specialties than he is. Yet he sits in the middle orchestrating all of them. To Jensen, this shows that intelligence alone does not determine success. Character, compassion, grit, determination, tolerance for embarrassment, and the ability to endure suffering are the real differentiators. Jensen wants the audience to understand that the word we should elevate is not intelligence but humanity.

    Detailed Summary

    From GPU Maker to AI Infrastructure Company

    The conversation opened with Jensen explaining NVIDIA’s evolution from chip-scale to rack-scale to pod-scale design. The Vera Rubin pod, announced at GTC, contains seven chip types, five purpose-built rack types, 40 racks, 1.2 quadrillion transistors, nearly 20,000 NVIDIA dies, over 1,100 Rubin GPUs, 60 exaflops of compute, and 10 petabytes per second of scale bandwidth. And that is just one pod. NVIDIA plans to produce roughly 200 of these pods per week.

    Jensen explained that extreme co-design is necessary because the problems AI must solve no longer fit inside a single computer. When you distribute a workload across 10,000 computers but want a million-fold speedup, everything becomes a bottleneck: computation, networking, switching, memory, power, cooling. This is fundamentally an Amdahl’s Law problem at planetary scale. If computation represents only 50% of the workload, speeding it up infinitely only doubles total throughput. Every layer must be co-optimized simultaneously.
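
    The 50% example follows directly from Amdahl’s Law, and a short sketch makes the ceiling visible. The function below is the standard textbook formula, not anything NVIDIA-specific; the name `amdahl_speedup` and the sample acceleration factors are illustrative assumptions.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the work runs `factor` times faster.

    Standard Amdahl's Law: 1 / ((1 - p) + p / s).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    untouched = 1.0 - accelerated_fraction  # share of the work that is not sped up
    return 1.0 / (untouched + accelerated_fraction / factor)

# Assumption from the text: computation is 50% of the workload.
for factor in (10, 1_000, 1_000_000, float("inf")):
    print(f"compute {factor}x faster -> overall {amdahl_speedup(0.5, factor):.4f}x")
```

    However large the factor gets, the overall speedup with half the workload untouched never exceeds 2x, which is why the remaining half (networking, memory, power, cooling) has to be optimized alongside the compute.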

    NVIDIA’s organizational structure is a direct reflection of this co-design philosophy. Jensen has more than 60 direct reports, almost all with deep engineering expertise. He does not do one-on-ones. Every meeting is a collective problem-solving session where the memory expert, the networking expert, the cooling expert, and the power delivery expert are all in the room together, attacking the same problem.

    The Strategic History of CUDA

    Jensen walked through the step-by-step journey from graphics accelerator to computing platform. The company invented a programmable pixel shader, then added IEEE-compatible FP32 to its shaders, then put C on top of that (called Cg), and eventually arrived at CUDA. The critical strategic decision was putting CUDA on GeForce, a consumer product.

    This was nearly an existential move. It increased GPU costs by roughly 50% and consumed all of the company’s gross profit at a time when NVIDIA was a 35% gross margin business. The market cap cratered from around $7-8 billion to approximately $1.5 billion. But Jensen understood a principle that many technologists overlook: install base defines a computing architecture. x86 survived not because it was elegant but because it was everywhere. CUDA on GeForce put a supercomputing capability in the hands of every gamer, every student, every researcher who built their own PC. When the deep learning revolution arrived, CUDA was already the foundation.

    How Jensen Leads and Makes Decisions

    Jensen described a leadership philosophy built on continuous reasoning in public. He does not make announcements in the traditional sense. Instead, he shapes the belief systems of his employees, board, partners, and the broader industry over months and years by reasoning through decisions step by step, using every new piece of external information as a brick in the foundation. By the time he formally announces a strategic direction, the reaction is not surprise but rather, “What took you so long?”

    He applies this same approach to his supply chain. He personally visits CEOs of DRAM companies, packaging companies, and infrastructure providers. He explains the dynamics of the industry, shares his vision of future demand, and helps them reason through why they should make multi-billion-dollar capital investments. Three years ago, he convinced DRAM CEOs that HBM would become mainstream for data centers, which sounded ridiculous at the time. Those companies had record years as a result.

    Jensen’s “speed of light” methodology is his framework for decision-making. Every process, every design, every cost is benchmarked against the physical limits of what is theoretically possible. He prefers this to continuous improvement, which he views as incrementalism. He would rather strip a 74-day process back to zero and ask, “If we built this from scratch today, how long would it take?” Often the answer is six days, and the remaining 68 days are filled with accumulated compromises that can be challenged individually.

    AI Scaling Laws and the Future of Compute

    Jensen broke down the four scaling laws in detail. The pre-training scaling law, which depends on model size and data volume, was thought to be hitting a wall when the industry worried about running out of high-quality human-generated data. Jensen argued this concern is misplaced. Synthetic data generation has effectively removed the ceiling, and the constraint is now compute, not data.

    Post-training continues to scale through fine-tuning and reinforcement learning. Test-time scaling was the most counterintuitive for the industry. Many predicted that inference would be “easy” and that inference chips would be small, cheap, and commoditized. Jensen saw this as fundamentally wrong. Inference is thinking: reasoning, planning, search, decomposing novel problems into solvable pieces. Thinking is much harder than reading, and test-time compute is intensely resource-hungry.

    Agentic scaling is the newest frontier. A single AI agent can spawn sub-agents, effectively multiplying intelligence the way a company scales by hiring. The experiences and data generated by agentic systems feed back into pre-training, creating a continuous improvement loop. Jensen described this as the reason NVIDIA designed the Vera Rubin rack architecture differently from the Grace Blackwell architecture. Grace Blackwell was optimized for running large language models. Vera Rubin is designed for agents, which need to access files, use tools, do research, and spin off sub-agents. NVIDIA anticipated this architectural shift two and a half years before tools like OpenClaw arrived.

    China, TSMC, and the Global Supply Chain

    Jensen provided a thoughtful analysis of China’s tech ecosystem. He identified several structural advantages: 50% of the world’s AI researchers are Chinese, the tech industry was born during the mobile cloud era (making it natively modern), provincial competition creates internal Darwinian pressure, and the culture of knowledge-sharing through school and family networks makes China effectively open-source by default.

    On TSMC, Jensen emphasized that the deepest misunderstanding about the company is that its technology is its only advantage. Their manufacturing orchestration system, which dynamically manages the shifting demands of hundreds of companies, is “completely miraculous.” Their culture uniquely balances bleeding-edge technology excellence with world-class customer service. And the trust that Jensen places in TSMC is extraordinary: three decades of partnership, hundreds of billions of dollars in business, and no formal contract.

    Jensen also discussed the AI supply chain more broadly. NVIDIA has roughly 200 suppliers contributing technology to each rack. Jensen personally manages these relationships, flying to supplier sites, explaining industry dynamics, and helping CEOs reason through multi-billion-dollar investment decisions. When asked if supply chain bottlenecks keep him up at night, he said no, because he has already communicated what NVIDIA needs, his partners have told him what they will deliver, and he believes them.

    The Energy Challenge and Space Computing

    On the energy front, Jensen proposed a practical approach to the power problem. Rather than waiting for new power generation, he wants to capture the enormous waste already present in the grid. Power infrastructure is designed for worst-case peak demand, but 99% of the time it runs far below capacity. AI data centers could absorb this excess capacity with flexible contracts that allow graceful degradation during rare peak periods.
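
    As a rough illustration of how much capacity that could represent, the sketch below works through the arithmetic using the figures cited in the takeaways (a grid running at around 60% of peak for 99% of the time). The 100 GW peak is a hypothetical number chosen only to make the example concrete, and the calculation assumes all idle headroom is usable, ignoring transmission limits and where the data centers sit.

```python
HOURS_PER_YEAR = 8760

def idle_headroom_gw(peak_gw: float, typical_utilization: float) -> float:
    """Capacity left unused when the grid runs at `typical_utilization` of peak."""
    return peak_gw * (1.0 - typical_utilization)

def flexible_energy_gwh(peak_gw: float, typical_utilization: float,
                        available_fraction: float,
                        hours: int = HOURS_PER_YEAR) -> float:
    """Energy a flexible load could absorb if headroom is free `available_fraction` of the time."""
    headroom = idle_headroom_gw(peak_gw, typical_utilization)
    return headroom * available_fraction * hours

peak = 100.0  # hypothetical regional peak capacity in GW, not from the interview
print(f"Idle headroom: {idle_headroom_gw(peak, 0.60):.0f} GW")
print(f"Absorbable energy per year: {flexible_energy_gwh(peak, 0.60, 0.99):,.0f} GWh")
```

    On those assumptions, a grid sized for 100 GW would leave about 40 GW unused for almost the entire year, which is the headroom flexible contracts are meant to unlock.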

    On space computing, NVIDIA already has GPUs in orbit for satellite imaging. Jensen acknowledged the cooling challenge (no conduction or convection in space, only radiation) but sees it as a future frontier worth cultivating. In the meantime, he is focused on the lower-hanging fruit of eliminating waste in the terrestrial power grid.

    On AGI, Jobs, and the Human Future

    Jensen stated directly that he believes AGI has been achieved, at least by the practical definition of an AI system capable of creating a billion-dollar company. He sees it as plausible that an agent could build a viral web service that briefly generates enormous revenue, just as many internet-era companies did with technology no more sophisticated than what current AI agents produce.

    On jobs, Jensen was both compassionate and clear-eyed. He told the story of radiology: computer vision became superhuman around 2019-2020, and the prediction was that radiologists would disappear. Instead, the number of radiologists grew because AI allowed them to study more scans, diagnose better, and serve more patients. The purpose of the job (diagnosing disease) did not change, even though the tools changed completely.

    He applied this principle broadly: the number of software engineers at NVIDIA will grow, not decline, because their purpose is solving problems, not writing lines of code. The number of programmers globally will grow because the definition of coding is expanding to include natural language specification, opening it up to potentially a billion people.

    His advice to anyone worried about their job is straightforward: go use AI now. Become expert in it. Every profession, from carpenter to pharmacist to lawyer, will be elevated by AI tools. The people who learn to use AI will be the ones who get hired, promoted, and empowered.

    Mortality, Succession, and Legacy

    The conversation closed with deeply personal reflections. Jensen said he really does not want to die. He sees the current moment as a “once in a humanity experience.” He does not believe in traditional succession planning. Instead, he believes the best succession strategy is to pass on knowledge continuously, every single day, in every meeting, as fast as possible. His hope is to die on the job, instantaneously, with no long period of suffering.

    He described a vision for a kind of digital continuity: sending a humanoid robot into space, continuously improving it in flight, and eventually uploading the consciousness derived from a lifetime of communications, decisions, and reasoning to catch up with it at the speed of light.

    On the emotional experience of leading NVIDIA, Jensen was candid about hitting psychological low points regularly. His coping mechanism is decomposition: break the problem into pieces, reason about what you can control, tell someone who can help, share the burden, and then deliberately forget what is behind you. He compared this to the mental discipline of great athletes who focus only on the next point.

    His final message was about the relationship between intelligence and humanity. Intelligence, he argued, is functional. It is being commoditized. Humanity, character, compassion, grit, tolerance for embarrassment, and the capacity for suffering are the true superpowers. The word society should elevate is not intelligence but humanity.

    Thoughts

    This is one of the most substantive CEO interviews of 2026. What makes it remarkable is not just the breadth of topics but the depth of reasoning Jensen demonstrates in real time. You can actually watch him think through problems on the spot, which is rare for someone at his level.

    A few things stand out. First, the CUDA origin story is one of the great strategic narratives in tech history. Deciding to absorb a 50% cost increase on a consumer product, watching your market cap collapse by 80%, and holding the course for a decade because you understood the power of install base is the kind of conviction that separates generational companies from everyone else.

    Second, Jensen’s framing of the four scaling laws as a flywheel is the clearest articulation anyone has given of why AI compute demand will continue to accelerate. Most people understand pre-training. Fewer understand test-time scaling. Almost nobody is thinking about agentic scaling as a compute multiplier. Jensen has been thinking about it for years and already designed hardware for it before the software ecosystem caught up.

    Third, the discussion on jobs deserves attention. The radiology example is powerful because it is a completed experiment, not a prediction. The profession that was supposed to be eliminated first by AI instead grew. The mechanism is straightforward: when you automate the task, you expand the capacity of the purpose, and demand for the purpose increases. This does not mean there will be no pain or dislocation. Jensen acknowledged that explicitly. But the historical pattern is clear.

    Finally, the philosophical distinction between intelligence and humanity is the kind of framing that could genuinely help people navigate the anxiety of this moment. If you define your value by your intelligence alone, AI commoditization is terrifying. If you define your value by your character, your compassion, your tolerance for suffering, and your willingness to keep going when everything goes wrong, then AI is just the most powerful set of tools you have ever been given.

    Jensen Huang is 62 years old, has been running NVIDIA for 34 years, and shows no signs of slowing down. If anything, his conviction about the future is accelerating alongside his company’s growth.

    Watch the full episode: Lex Fridman Podcast #494 with Jensen Huang

  • Beyond the Bubble: Jensen Huang on the Future of AI, Robotics, and Global Tech Strategy in 2026

    In a wide-ranging discussion on the No Priors Podcast, NVIDIA Founder and CEO Jensen Huang reflects on the rapid evolution of artificial intelligence throughout 2025 and provides a strategic roadmap for 2026. From the debunking of the “AI Bubble” to the rise of physical robotics and the “ChatGPT moments” coming for digital biology, Huang offers a masterclass in how accelerated computing is reshaping the global economy.


    TL;DW (Too Long; Didn’t Watch)

    • The Core Shift: General-purpose computing (CPUs) has hit a wall; the world is moving permanently to accelerated computing.
    • The Jobs Narrative: AI automates tasks, not purposes. It is solving labor shortages in manufacturing and nursing rather than causing mass unemployment.
    • The 2026 Breakthrough: Digital biology and physical robotics are slated for their “ChatGPT moment” this year.
    • Geopolitics: A nuanced, constructive relationship with China is essential, and open source is the “innovation flywheel” that keeps the U.S. competitive.

    Key Takeaways

    • Scaling Laws & Reasoning: 2025 proved that scaling compute still translates directly to intelligence, specifically through massive improvements in reasoning, grounding, and the elimination of hallucinations.
    • The End of “God AI”: Huang dismisses the myth of a monolithic “God AI.” Instead, the future is a diverse ecosystem of specialized models for biology, physics, coding, and more.
    • Energy as Infrastructure: AI data centers are “AI Factories.” Without a massive expansion in energy (including natural gas and nuclear), the next industrial revolution cannot happen.
    • Tokenomics: The cost of AI inference dropped 100x in 2024 and could drop a billion times over the next decade, making intelligence a near-free commodity.
    • DeepSeek’s Impact: Open-source contributions from China, like DeepSeek, are significantly benefiting American startups and researchers, proving the value of a global open-source ecosystem.
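
    To put the Tokenomics figure in perspective, here is a back-of-the-envelope sketch of the yearly rate implied by a billion-fold drop over ten years. It assumes the decline compounds at a constant annual factor, which is a simplification of ours and not something Huang states; the variable names are illustrative.

```python
# Sketch: implied constant yearly cost-reduction factor if inference cost
# falls a billion-fold over a decade. Constant compounding is an assumption.
total_drop = 1e9   # "a billion times", as quoted in the takeaway above
years = 10         # "over the next decade"

# Solve annual_factor ** years == total_drop for annual_factor.
annual_factor = total_drop ** (1 / years)

print(f"Implied yearly cost reduction: about {annual_factor:.1f}x")
```

    Under that assumption the implied pace is a little under 8x per year, well below the 100x drop cited for 2024 alone.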

    Detailed Summary

    The “Five-Layer Cake” of AI

    Huang explains AI not as a single app, but as a technology stack: Energy → Chips → Infrastructure → Models → Applications. He emphasizes that while the public focuses on chatbots, the real revolution is happening in “non-English” languages, such as the languages of proteins, chemicals, and physical movement.

    Task vs. Purpose: The Future of Labor

    Addressing the fear of job loss, Huang uses the “Radiologist Paradox.” While AI now powers nearly 100% of radiology applications, the number of radiologists has actually increased. Why? Because AI handles the task (scanning images), allowing the human to focus on the purpose (diagnosis and research). This same framework applies to software engineers: their purpose is solving problems, not just writing syntax.

    Robotics and Physical AI

    Huang is incredibly optimistic about robotics. He predicts a future where “everything that moves will be robotic.” By applying reasoning models to physical machines, we are moving from “digital rails” (pre-programmed paths) to autonomous agents that can navigate unknown environments. He foresees a trillion-dollar repair and maintenance industry emerging to support the billions of robots that will eventually inhabit our world.

    The “Bubble” Debate

    Is there an AI bubble? Huang argues “No.” He points to the desperate, unsatisfied demand for compute capacity across every industry. He notes that if chatbots disappeared tomorrow, NVIDIA would still thrive because the fundamental architecture of the world’s $100 trillion GDP is shifting from CPUs to GPUs to stay productive.


    Analysis & Thoughts

    Jensen Huang’s perspective is distinct because he views AI through the lens of industrial production. By calling data centers “factories” and tokens “output,” he strips away the “magic” of AI and reveals it as a standard industrial revolution—one that requires power, raw materials (data/chips), and specialized labor.

    His defense of Open Source is perhaps the most critical takeaway for policymakers. By arguing that open source prevents “suffocation” for startups and 100-year-old industrial companies, he positions transparency as a national security asset rather than a liability. As we head into 2026, the focus is clearly shifting from “Can the model talk?” to “Can the model build a protein or drive a truck?”

  • The BG2 Pod: A Deep Dive into Tech, Tariffs, and TikTok on Liberation Day

    In the latest episode of the BG2 Pod, hosted by tech luminaries Bill Gurley and Brad Gerstner, the duo tackled a whirlwind of topics that dominated headlines on April 3, 2025. Recorded just after President Trump’s “Liberation Day” tariff announcement, this bi-weekly open-source conversation offered a lengthy, insightful exploration of market uncertainty, global trade dynamics, AI advancements, and corporate maneuvers. With their signature blend of wit, data-driven analysis, and insider perspectives, Gurley and Gerstner unpacked the implications of a rapidly shifting economic and technological landscape. Here’s a detailed breakdown of the episode’s key discussions.

    Liberation Day and the Tariff Shockwave

    The episode kicked off with a dissection of President Trump’s tariff announcement, dubbed “Liberation Day,” which sent shockwaves through global markets. Gerstner, who had recently spoken at a JP Morgan Tech conference, framed the tariffs as a doctrinal move by the Trump administration to level the trade playing field, a philosophy he’d predicted as early as February 2025. The initial market reaction was volatile: S&P and NASDAQ futures spiked 2.5% on a rumored 10% across-the-board tariff, only to plummet 600 basis points as details emerged, including a staggering 54% total tariff on China (a new 34% on top of an existing 20%) and 25% auto tariffs targeting Mexico, Canada, and Germany.

    Gerstner highlighted the political theater, noting Trump’s invite to UAW members and his claim that these tariffs flipped Michigan red. The administration also introduced a novel “reciprocal tariff” concept, factoring in non-tariff barriers like currency manipulation, which Gurley critiqued for its ambiguity. Exemptions for pharmaceuticals and semiconductors softened the blow, potentially landing the tariff haul closer to $600 billion—still a hefty leap from last year’s $77 billion. Yet, both hosts expressed skepticism about the economic fallout. Gurley, a free-trade advocate, warned of reduced efficiency and higher production costs, while Gerstner relayed CEOs’ fears of stalled hiring and canceled contracts, citing a European-Asian backlash already brewing.

    US vs. China: The Open-Source Arms Race

    Shifting gears, the duo explored the escalating rivalry between the US and China in open-source AI models. Gurley traced China’s decade-long embrace of open source to its strategic advantage—sidestepping IP theft accusations—and highlighted DeepSeek’s success, with over 1,500 forks on Hugging Face. He dismissed claims of forced open-sourcing, arguing it aligns with China’s entrepreneurial ethos. Meanwhile, Gerstner flagged Washington’s unease, hinting at potential restrictions on Chinese models like DeepSeek to prevent a “Huawei Belt and Road” scenario in AI.

    On the US front, OpenAI’s announcement of a forthcoming open-weight model stole the spotlight. Sam Altman’s tease of a “powerful” release, free of Meta-style usage restrictions, sparked excitement. Gurley praised its defensive potential—leveling the playing field akin to Google’s Kubernetes move—while Gerstner tied it to OpenAI’s consumer-product focus, predicting it would bolster ChatGPT’s dominance. The hosts agreed this could counter China’s open-source momentum, though global competition remains fierce.

    OpenAI’s Mega Funding and CoreWeave’s IPO

    The conversation turned to OpenAI’s staggering $40 billion funding round, led by SoftBank, valuing the company at $260 billion pre-money. Gerstner, an investor, justified the 20x revenue multiple (versus Anthropic’s 50x and xAI’s 80x) by emphasizing ChatGPT’s market leadership (20 million paid subscribers, 500 million weekly users) and explosive demand, exemplified by a million sign-ups in an hour. Despite a projected $5-7 billion loss, he drew parallels to Uber’s turnaround, expressing confidence in future unit economics via advertising and tiered pricing.

    CoreWeave’s IPO, meanwhile, weathered a “Category 5 hurricane” of market turmoil. Priced at $40, it dipped to $37 before rebounding to $60 on news of a Google-Nvidia deal. Gerstner and Gurley, shareholders, lauded its role in powering AI labs like OpenAI, though they debated GPU depreciation, with Gurley favoring a shorter schedule and Gerstner citing seven-year lifecycles for older models like Nvidia’s V100s. The IPO’s success, they argued, could signal a thawing of the public markets.

    TikTok’s Tangled Future

    The episode closed with rumors of a TikTok US deal, set against the April 5 deadline and looming 54% China tariffs. Gerstner, a ByteDance shareholder since 2015, outlined a potential structure: a new entity, TikTok US, with ByteDance at 19.5%, US investors retaining stakes, and new players like Amazon and Oracle injecting fresh capital. Valued potentially low due to Trump’s leverage, the deal hinges on licensing ByteDance’s algorithm while ensuring US data control. Gurley questioned ByteDance’s shift from resistance to cooperation, which Gerstner attributed to preserving global value—90% of ByteDance’s worth lies outside TikTok US. Both saw it as a win for Trump and US investors, though China’s approval remains uncertain amid tariff tensions.

    Broader Implications and Takeaways

    Throughout, Gurley and Gerstner emphasized uncertainty’s chilling effect on markets and innovation. From tariffs disrupting capex to AI’s open-source race reshaping tech supremacy, the episode painted a world in flux. Yet, they struck an optimistic note: fear breeds buying opportunities, and Trump’s dealmaking instincts might temper the tariff storm, especially with China. As Gurley cheered his Gators and Gerstner eyed Stargate’s compute buildout, the BG2 Pod delivered a masterclass in navigating chaos with clarity.

  • How AI is Revolutionizing Writing: Insights from Tyler Cowen and David Perell

    TLDW/TLDR

    Tyler Cowen, an economist and writer, shares practical ways AI transforms writing and research in a conversation with David Perell. He uses AI daily as a “secondary literature” tool to enhance reading and podcast prep, predicts fewer books due to AI’s rapid evolution, and emphasizes the enduring value of authentic, human-centric writing like memoirs and personal narratives.

    Detailed Summary of Video

    In a 68-minute YouTube conversation uploaded on March 5, 2025, economist Tyler Cowen joins writer David Perell to explore AI’s impact on writing and research. Cowen details his daily AI use—replacing stacks of books with large language models (LLMs) like o1 Pro, Claude, and DeepSeek for podcast prep and leisure reading, such as Shakespeare and Wuthering Heights. He highlights AI’s ability to provide context quickly, reducing hallucinations in top models by over tenfold in the past year (as of February 2025).

    The discussion shifts to writing: Cowen avoids AI for drafting to preserve his unique voice, though he uses it for legal background or critiquing drafts (e.g., spotting obnoxious tones). He predicts fewer books as AI outpaces long-form publishing cycles, favoring high-frequency formats like blogs or Substack. However, he believes “truly human” works—memoirs, biographies, and personal experience-based books—will persist, as readers crave authenticity over AI-generated content.

    Cowen also sees AI decentralizing into a “Republic of Science,” with models self-correcting and collaborating, though this remains speculative. For education, he integrates AI into his PhD classes, replacing textbooks with subscriptions to premium models. He warns academia lags in adapting, predicting AI will outstrip researchers in paper production within two years. Perell shares his use of AI for Bible study, praising its cross-referencing but noting experts still excel at pinpointing core insights.

    Practical tips emerge: use top-tier models (o1 Pro, Claude, DeepSeek), craft detailed prompts, and leverage AI for travel or data visualization. Cowen also plans an AI-written biography by “open-sourcing” his life via blog posts, showcasing AI’s potential to compile personal histories.

    Article Itself

    How AI is Revolutionizing Writing: Insights from Tyler Cowen and David Perell

    Artificial Intelligence is no longer a distant sci-fi dream—it’s a tool reshaping how we write, research, and think. In a recent YouTube conversation, economist Tyler Cowen and writer David Perell unpack the practical implications of AI for writers, offering a roadmap for navigating this seismic shift. Recorded on March 5, 2025, their discussion blends hands-on advice with bold predictions, grounded in Cowen’s daily AI use and Perell’s curiosity about its creative potential.

    Cowen, a prolific author and podcaster, doesn’t just theorize about AI—he lives it. He’s swapped towering stacks of secondary literature for LLMs like o1 Pro, Claude, and DeepSeek. Preparing for a podcast on medieval kings Richard II and Henry V, he once ordered 20-30 books; now, he interrogates AI for context, cutting prep time and boosting quality. “It’s more fun,” he says, describing how he queries AI about Shakespearean puzzles or Wuthering Heights chapters, treating it as a conversational guide. Hallucinations? Not a dealbreaker—top models have slashed errors dramatically since 2024, and as an interviewer, he prioritizes context over perfect accuracy.

    For writing, Cowen draws a line: AI informs, but doesn’t draft. His voice—cryptic, layered, parable-like—remains his own. “I don’t want the AI messing with that,” he insists, rejecting its smoothing tendencies. Yet he’s not above using it tactically—checking legal backgrounds for columns or flagging obnoxious tones in drafts (a tip from Agnes Callard). Perell nods, noting AI’s knack for softening managerial critiques, though Cowen prefers his weirdness intact.

    The future of writing, Cowen predicts, is bifurcated. Books, with their slow cycles, face obsolescence—why write a four-year predictive tome when AI evolves monthly? He’s shifted to “ultra high-frequency” outputs like blogs and Substack, tackling AI’s rapid pace. Yet “truly human” writing—memoirs, biographies, personal narratives—will endure. Readers, he bets, want authenticity over AI’s polished slop. His next book, Mentors, leans into this, drawing on lived experience AI can’t replicate.

    Perell, an up-and-coming writer, feels the tension. AI’s prowess deflates his hard-earned skills, yet he’s excited to master it. He uses it to study the Bible, marveling at its cross-referencing, though it lacks the human knack for distilling core truths. Both agree: AI’s edge lies in specifics—detailed prompts yield gold, vague ones yield “mid” mush. Cowen’s tip? Imagine prompting an alien, not a human—literal, clear, context-rich.

    Educationally, Cowen’s ahead of the curve. His PhD students ditch textbooks for AI subscriptions, weaving it into papers to maximize quality. He laments academia’s inertia—AI could outpace researchers in two years, yet few adapt. Perell’s takeaway? Use the best models. “You’re hopeless without o1 Pro,” Cowen warns, highlighting the gap between free and cutting-edge tools.

    Beyond writing, AI’s horizon dazzles. Cowen envisions a decentralized “Republic of Science,” where models self-correct and collaborate, mirroring human progress. Large context windows (Gemini’s 2 million tokens, soon 10-20 million) will decode regulatory codes and historical archives, birthing jobs in data conversion. Inside companies, he suspects AI firms lead secretly, turbocharging their own models.

    Practically, Cowen’s stack (o1 Pro for queries, Claude for thoughtful prose, DeepSeek for wild creativity, Perplexity for citations) offers a playbook. He even plans an AI-crafted biography, “open-sourcing” his life via blog posts about childhood in Fall River or his dog, Spinoza. It’s low-cost immortality, a nod to AI’s archival power.

    For writers, the message is clear: adapt or fade. AI won’t just change writing—it’ll redefine what it means to create. Human quirks, stories, and secrets will shine amid the deluge of AI content. As Cowen puts it, “The truly human books will stand out all the more.” The revolution’s here—time to wield it.

  • Global Madness Unleashed: Tariffs, AI, and the Tech Titans Reshaping Our Future

    As the calendar turns to March 21, 2025, the world economy stands at a crossroads, buffeted by market volatility, looming trade policies, and rapid technological shifts. In the latest episode of the BG2 Pod, aired March 20, venture capitalists Bill Gurley and Brad Gerstner dissect these currents with precision, offering a window into the forces shaping global markets. From the uncertainty surrounding April 2 tariff announcements to Google’s $32 billion acquisition of Wiz, Nvidia’s bold claims at GTC, and the accelerating AI race, their discussion—spanning nearly two hours—lays bare the high stakes. Gurley, sporting a Florida Gators cap in a nod to March Madness, and Gerstner, fresh from Nvidia’s developer conference, frame a narrative of cautious optimism amid palpable risks.

    A Golden Age of Uncertainty

    Gerstner opens with a stark assessment: the global economy is traversing a “golden age of uncertainty,” a period marked by political, economic, and technological flux. Since early February, the NASDAQ has shed 10%, with some Mag 7 constituents—Apple, Amazon, and others—down 20-30%. The Federal Reserve’s latest median dot plot, released just before the podcast, underscores the gloom: GDP forecasts for 2025 have been cut from 2.1% to 1.7%, unemployment is projected to rise from 4.3% to 4.4%, and inflation is expected to edge up from 2.5% to 2.7%. Consumer confidence is fraying, evidenced by a sharp drop in TSA passenger growth and softening demand reported by Delta, United, and Frontier Airlines—a leading indicator of discretionary spending cuts.

    Yet the picture is not uniformly bleak. Gerstner cites Bank of America’s Brian Moynihan, who notes that consumer spending rose 6% year-over-year, reaching $1.5 trillion quarterly, buoyed by a shift from travel to local consumption. Conversations with hedge fund managers reveal a tactical retreat—exposures are at their lowest quartile—but a belief persists that the second half of 2025 could rebound. The Atlanta Fed’s GDP tracker has turned south, but Gerstner sees this as a release of pent-up uncertainty rather than an inevitable slide into recession. “It can become a self-fulfilling prophecy,” he cautions, pointing to CEOs pausing major decisions until the tariff landscape clarifies.

    Tariffs: Reciprocity or Ruin?

    The specter of April 2 looms large, when the Trump administration is set to unveil sectoral tariffs targeting the “terrible 15” countries—a list likely encompassing European and Asian nations with perceived trade imbalances. Gerstner aligns with the administration’s vision, articulated by Vice President JD Vance in a recent speech at an American Dynamism event. Vance argued that globalism’s twin conceits—America monopolizing high-value work while outsourcing low-value tasks, and reliance on cheap foreign labor—have hollowed out the middle class and stifled innovation. China’s ascent, from manufacturing to designing superior cars (BYD) and batteries (CATL), and now running AI inference on Huawei’s Ascend 910 chips, exemplifies this shift. Treasury Secretary Scott Bessent frames it as an “American detox,” a deliberate short-term hit for long-term industrial revival.

    Gurley demurs, championing comparative advantage. “Water runs downhill,” he asserts, questioning whether Americans will assemble $40 microwaves when China commands 35% of the global auto market with superior products. He doubts tariffs will reclaim jobs: automation might onshore production, but employment gains are illusory. A jump in tariff revenues from $65 billion to $1 trillion, he warns, could tip the economy into recession, a risk the U.S. is ill-prepared to absorb. Europe’s reaction adds complexity: The Economist’s Zanny Minton Beddoes reports growing frustration among EU leaders, hinting at a pivot toward China if tensions escalate. Gerstner counters that the goal is fairness, not protectionism (tariffs could rise modestly to $150 billion if reciprocal concessions materialize), though he concedes the administration’s bellicose tone risks misfiring.

    The Biden-era “diffusion rule,” restricting chip exports to 50 countries, emerges as a flashpoint. Gurley calls it “unilaterally disarming America in the race to AI,” arguing it hands Huawei a strategic edge—potentially a “Belt and Road” for AI—while hobbling U.S. firms’ access to allies like India and the UAE. Gerstner suggests conditional tariffs, delayed two years, to incentivize onshoring (e.g., TSMC’s $100 billion Arizona R&D fab) without choking the AI race. The stakes are existential: a misstep could cede technological primacy to China.

    Google’s $32 Billion Wiz Bet Signals M&A Revival

    Amid this turbulence, Google’s $32 billion all-cash acquisition of Wiz, a cloud security firm founded in 2020, signals a thaw in mergers and acquisitions. With projected 2025 revenues of $1 billion, Wiz commands a roughly 30x forward revenue multiple, steep against Google’s 5x, while adding just 2% to Google’s $45 billion cloud business. Gerstner hails it as a bellwether: “The M&A market is back.” Gurley concurs, noting Google’s strategic pivot. Barred by EU regulators from bolstering search or AI, and trailing AWS’s developer-friendly platform and Microsoft’s enterprise heft, Google sees security as a differentiator in the fragmented cloud race.
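
    The deal math quoted above is easy to sanity-check. The following is a minimal sketch using only the figures from this summary; the variable names are ours, and the inputs are the rounded numbers cited on the podcast rather than audited financials.

```python
# Sketch: check the Wiz deal ratios using the rounded figures quoted above.
deal_price_bn = 32.0     # Google's all-cash price for Wiz, in $ billions
wiz_revenue_bn = 1.0     # projected 2025 Wiz revenue, in $ billions
google_cloud_bn = 45.0   # Google's cloud business as quoted, in $ billions

# Price paid per dollar of projected forward revenue.
forward_multiple = deal_price_bn / wiz_revenue_bn

# Wiz revenue as a share of the existing cloud business.
revenue_uplift_pct = 100 * wiz_revenue_bn / google_cloud_bn

print(f"Forward revenue multiple: {forward_multiple:.0f}x")
print(f"Revenue added to cloud business: {revenue_uplift_pct:.1f}%")
```

    On these inputs the multiple works out to 32x and the revenue uplift to about 2.2%, in line with the rounded 30x and 2% cited in the episode.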

    The deal’s scale ($32 billion in five years) underscores Silicon Valley’s capacity for rapid value creation, with Index Ventures and Sequoia Capital notching another win. Gerstner reflects on Altimeter’s misstep with Lacework, a rival that faltered on product-market fit, highlighting the razor-thin margins of venture success. Regulatory hurdles loom: while new FTC chair Andrew Ferguson pledges swift action (“go to court or get out of the way”), a sharp departure from Lina Khan’s inertia, Europe’s penchant for thwarting U.S. deals could complicate closure, slated for 2026 with a $3.2 billion breakup fee at risk. Success here could unleash “animal spirits” in M&A and IPOs, with CoreWeave and Cerebras rumored next.

    Nvidia’s GTC: A $1 Trillion AI Gambit

    At Nvidia’s GTC in San Jose, CEO Jensen Huang, clad in a leather jacket evoking Steve Jobs, addressed 18,000 attendees, doubling down on AI’s explosive growth. He projects a $1 trillion annual market for AI data centers by 2028, up from $500 billion, driven by new workloads and the overhaul of x86 infrastructure with accelerated computing. Blackwell, 40x more capable than Hopper, powers everything from robotics (a $5 billion run rate) to synthetic biology. Yet Nvidia’s stock hovers at $115, 20x next year’s earnings, below Costco’s 50x, reflecting investor skittishness over demand sustainability and competition from DeepSeek and custom ASICs.

    Huang dismisses DeepSeek R1’s “cheap intelligence” narrative, insisting compute needs are 100x what was estimated a year ago. Coding agents, set to dominate software development by year-end per Zuckerberg and Musk, fuel this surge. Gurley questions the hype—inference, not pre-training, now drives scaling, and Huang’s “chief revenue destroyer” claim (Blackwell obsoleting Hopper) risks alienating customers on six-year depreciation cycles. Gerstner sees brilliance in Nvidia’s execution—35,000 employees, a top-tier supply chain, and a four-generation roadmap—but both flag government action as the wildcard. Tariffs and export controls could bolster Huawei, though Huang shrugs off near-term impacts.

    AI’s Consumer Frontier: OpenAI’s Lead, Margin Mysteries

    In consumer AI, OpenAI’s ChatGPT reigns with 400 million weekly users, supply-constrained despite new data centers in Texas. Gerstner calls it a “winner-take-most” market—DeepSeek briefly hit #2 in app downloads but faded, Grok lingers at #65, Gemini at #55. “You need to be 10x better to dent this inertia,” he says, predicting a Q2 product blitz. Gurley agrees the lead looks unassailable, though Meta and Apple’s silence hints at brewing counterattacks.

    Gurley’s “negative gross margin AI theory” probes deeper: many AI firms, like Anthropic via AWS, face slim margins due to high acquisition and serving costs, unlike OpenAI’s direct model. With VC billions fueling negative margins—pricing for share, not profit—and compute costs plummeting, unit economics are opaque. Gerstner contrasts this with Google’s near-zero marginal costs, suggesting only direct-to-consumer AI giants can sustain the capex. OpenAI leads, but Meta, Amazon, and Elon Musk’s xAI, with deep pockets, remain wildcards.

    The Next 90 Days: Pivot or Peril?

    The next 90 days will define 2025. April 2 tariffs could spark a trade war or a fairer field; tax cuts and deregulation promise growth, but AI’s fate hinges on export policies. Gerstner’s optimistic—Nvidia at 20x earnings and M&A’s resurgence signal resilience—but Gurley warns of overreach. A trillion-dollar tariff wall or a Huawei-led AI surge could upend it all. As Gurley puts it, “We’ll turn over a lot of cards soon.” The world watches, and the outcome remains perilously uncertain.

  • The AI Revolution Unveiled: Jonathan Ross on Groq, NVIDIA, and the Future of Inference


    TL;DR

    Jonathan Ross, Groq’s CEO, predicts inference will eclipse training in AI’s future, with Groq’s Language Processing Units (LPUs) outpacing NVIDIA’s GPUs in cost and efficiency. He envisions synthetic data breaking scaling limits, a $1.5 billion Saudi revenue deal fueling Groq’s growth, and AI unlocking human potential through prompt engineering, though he warns of an overabundance trap.

    Detailed Summary

    In a captivating 20VC episode with Harry Stebbings, Jonathan Ross, the mastermind behind Groq and Google’s original Tensor Processing Unit (TPU), outlines a transformative vision for AI. Ross asserts that inference—deploying AI models in real-world scenarios—will soon overshadow training, challenging NVIDIA’s GPU stronghold. Groq’s LPUs, engineered for affordable, high-volume inference, deliver over five times the cost efficiency and three times the energy savings of NVIDIA’s training-focused GPUs by avoiding external memory like HBM. He champions synthetic data from advanced models as a breakthrough, dismantling scaling law barriers and redirecting focus to compute, data, and algorithmic bottlenecks.

    Groq’s explosive growth, from 640 chips in early 2024 to over 40,000 by year-end with a target of 2 million in 2025, is propelled by a $1.5 billion Saudi revenue deal, not a funding round. Partners like Aramco fund the capital expenditure, sharing profits after a set return, liberating Groq from financial limits. Ross targets NVIDIA’s 40% inference revenue as a weak spot, cautions against a data center investment bubble driven by hyperscaler exaggeration, and foresees AI value concentrating among giants via a power law, yet Groq plans to join them by addressing unmet demands. Reflecting on Groq’s near-failure, salvaged by “Groq Bonds,” he dreams of AI enhancing human agency, potentially empowering 1.4 billion Africans through prompt engineering, while urging vigilance against settling for “good enough” in an abundant future.

    The Big Questions Raised—and Answered

    Ross’s insights raise broad questions about AI’s trajectory and humanity’s role. Here is what the discussion implicitly asks, paired with his responses:

    • What happens when creation becomes so easy it redefines who gets to create?
      • Answer: Ross champions prompt engineering as a revolutionary force, turning speech into a tool that could unleash 1.4 billion African entrepreneurs. By making creation as simple as talking, AI could shift power from tech gatekeepers to the masses, sparking a global wave of innovation.
    • Can an underdog outrun a titan in a scale-driven game?
      • Answer: Groq can outpace NVIDIA, Ross asserts, by targeting inference—a massive, underserved market—rather than battling over training. With no HBM bottlenecks and a scalable Saudi-backed model, Groq’s agility could topple NVIDIA’s inference share, proving size isn’t everything.
    • What’s the human cost when machines replace our effort?
      • Answer: Ross likens LPUs to tireless employees, predicting a shift from labor to compute-driven economics. Yet, he warns of “financial diabetes”—a loss of drive in an AI-abundant world—urging us to preserve agency lest we become passive consumers of convenience.
    • Is the AI gold rush a promise or a pipe dream?
      • Answer: It’s both. Ross foresees billions wasted on overhyped data centers and “AI t-shirts,” but insists the total value created will outstrip losses. The winners, like Groq, will solve real problems, not chase fleeting trends.
    • How do we keep innovation’s spirit alive amid efficiency’s rise?
      • Answer: By prioritizing human agency and delegation—Ross’s “anti-founder mode”—over micromanagement, he says. Groq’s 25 million token-per-second coin aligns teams to innovate, not just optimize, ensuring efficiency amplifies creativity.
    • What’s the price of chasing a future that might not materialize?
      • Answer: Seven years of struggle taught Ross the emotional and financial toll is steep—Groq nearly died—but strategic bets (like inference) pay off when the wave hits. Resilience turns risk into reward.
    • Will AI’s pursuit drown us in wasted ambition?
      • Answer: Partially, yes—Ross cites VC’s “Keynesian Beauty Contest,” where cash floods copycats. But hyperscalers and problem-solvers like Groq will rise above the noise, turning ambition into tangible progress.
    • Can abundance liberate us without trapping us in ease?
      • Answer: Ross fears AI could erode striving, drawing from his boom-bust childhood. Prompt engineering offers liberation—empowering billions—but only if outliers reject “good enough” and push for excellence.

    Jonathan Ross’s vision is a clarion call: AI’s future isn’t just about faster chips or bigger models—it’s about who wields the tools and how they shape us. Groq’s battle with NVIDIA isn’t merely corporate; it’s a referendum on whether innovation can stay human-centric in an age of machine abundance. As Ross puts it, “Your job is to get positioned for the wave”—and he’s riding it, challenging us to paddle alongside or risk being left ashore.

  • The DeepSeek Revolution: Financial Markets in Turmoil, a Sputnik Moment for AI and Finance

    On January 27, 2025, the financial markets experienced significant upheaval following the release of DeepSeek’s latest AI model, R1. This event has been likened to a modern “Sputnik moment,” highlighting its profound impact on the global economic and technological landscape.

    Market Turmoil: A Seismic Shift

    The unveiling of DeepSeek R1 led to a sharp decline in major technology stocks, particularly those heavily invested in AI development. Nvidia, a leading AI chip manufacturer, saw its shares tumble by approximately 11.5%, signaling a potential loss exceeding $340 billion in market value if the trend persists. This downturn reflects a broader market reassessment of the AI sector’s financial foundations, especially concerning the substantial investments in high-cost AI infrastructure.

    The ripple effects were felt globally, with tech indices such as the Nasdaq 100 and Europe’s Stoxx 600 technology sub-index facing a combined market capitalization reduction projected at $1.2 trillion. The cryptocurrency market was not immune, as AI-related tokens experienced a 13.3% decline, with notable losses in assets like Near Protocol and Internet Computer (ICP).

    DeepSeek R1: A Paradigm Shift in AI

    DeepSeek’s R1 model has been lauded for its advanced reasoning capabilities, reportedly surpassing established Western models like OpenAI’s o1. Remarkably, R1 was developed at a fraction of the cost, challenging the prevailing notion that only vast financial resources can produce cutting-edge AI. This achievement has prompted a reevaluation of the economic viability of current AI investments and highlighted the rapid technological advancements emerging from China.

    The emergence of R1 has also intensified discussions regarding the effectiveness of U.S. export controls aimed at limiting China’s technological progress. By achieving competitive AI capabilities with less advanced hardware, DeepSeek underscores the potential limitations and unintended consequences of such sanctions, suggesting a need for a strategic reassessment in global tech policy.

    Broader Implications: Economic and Geopolitical Considerations

    The market’s reaction to DeepSeek’s R1 extends beyond immediate financial losses, indicating deeper shifts in economic power, technological leadership, and geopolitical influence. China’s rapid advancement in AI capabilities signifies a pivotal moment in the global race for technological dominance, potentially leading to a reallocation of capital from Western institutions to Chinese entities and reshaping global investment trends.

    Furthermore, this development reaffirms the critical importance of computational resources, such as GPUs, in the AI race. The narrative that more efficient use of computing power can lead to models exhibiting human-like intelligence positions computational capacity not merely as a tool but as a cornerstone of this new technological era.

    DeepSeek’s Strategic Approach: Efficiency and Accessibility

    DeepSeek’s strategy emphasizes efficiency and accessibility. According to DeepSeek’s technical report, the precursor model R1-Zero was trained with pure reinforcement learning and no supervised fine-tuning, a departure from traditional pipelines that start from human-annotated data. This allowed reasoning behavior to emerge autonomously. R1 itself adds a small supervised “cold-start” stage before reinforcement learning to improve readability and stability.

    In terms of cost, DeepSeek’s R1 model offers a significantly more affordable option compared to its competitors. For instance, where OpenAI’s o1 costs $15 per million input tokens and $60 per million output tokens, DeepSeek’s R1 costs $0.55 per million input tokens and $2.19 per million output tokens. This cost-effectiveness makes advanced AI technology more accessible to a broader audience, including developers, businesses, and educational institutions.
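    To make the price gap concrete, here is a minimal sketch that applies the per-million-token prices quoted above to a single workload. The workload size (10 million input tokens, 2 million output tokens) and the helper name `workload_cost` are illustrative assumptions, not figures from either vendor.

```python
# Per-million-token prices in USD, as quoted in the paragraph above.
PRICES = {
    "o1": {"input": 15.00, "output": 60.00},
    "R1": {"input": 0.55, "output": 2.19},
}


def workload_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload for the given model (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Assumed workload for illustration only: 10M input tokens, 2M output tokens.
o1_cost = workload_cost("o1", 10_000_000, 2_000_000)
r1_cost = workload_cost("R1", 10_000_000, 2_000_000)
ratio = o1_cost / r1_cost

print(f"o1: ${o1_cost:.2f}  R1: ${r1_cost:.2f}  ratio: {ratio:.1f}x")
```

    At these list prices the same workload costs roughly 27 times less on R1, and the ratio stays near 27 for any input/output mix because both the input and output prices differ by about that factor.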

    Global Reception and Future Outlook

    The global reception to DeepSeek’s R1 has been mixed. While some industry leaders have praised the model’s efficiency and performance, others have expressed skepticism regarding its rapid development and the potential implications for data security and ethical considerations.

    Looking ahead, DeepSeek plans to continue refining its models and expanding its offerings. The company aims to democratize AI by making advanced models accessible to a wider audience, challenging the current market leaders, and potentially reshaping the future landscape of artificial intelligence.

    Wrap Up

    DeepSeek’s R1 model has not merely entered the market; it has redefined it, challenging established players, prompting a reevaluation of investment strategies, and potentially ushering in a new era where AI capabilities are more evenly distributed globally. As we navigate this juncture, the pertinent question is not solely who will lead in AI but how this technology will shape our future across all facets of human endeavor. Welcome to 2025, where the landscape has shifted, and the race is on.