Tag: AI 2026

Benedict Evans on Why AI Is Stuck in 1997: The Task vs the Job, Commodity Models, and Why the Jobs Apocalypse Is Overhyped
Benedict Evans, the former Andreessen Horowitz partner and independent analyst behind the annual “AI Eating the World” presentation, sat down with Lenny’s Podcast for what the host calls the most rational take on AI you will hear this year. Instead of either doom or hype, Evans argues that AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile, which means we are living through something closer to 1997 than to the singularity. The conversation moves through the jobs question, the difference between a task and a job, whether the model labs have any pricing power, the anti-AI backlash, and what people should actually do. You can watch the full conversation on YouTube here.

TLDW

Evans frames AI as a platform shift on the scale of the internet or mobile, with the crucial twist that almost nothing has been built yet, so we are in the 1997 moment where confident predictions about winners are usually wrong. He introduces his central tool, the distinction between the task and the job, to explain why “X percent of this profession is exposed to AI” studies are misleading, why the AI labs are paradoxically hiring forward deployed engineers and buying consultancies, and why accountants kept multiplying through every wave of automation (the lump of labour fallacy and Jevons paradox at work). On value capture he makes a deterministic bet that foundation models have no network effects, behave like a commodity, and will look more like cloud than like Windows, with the value moving up the stack to applications, much as it did in telecom, where a trillion-dollar industry grew data traffic thousands of times over while its stocks went nowhere. He covers distribution as the real moat, Apple Intelligence as the most compelling unshipped vision, the fuzzy anti-AI backlash (including the largely fake water panic and the very real harms of deepfakes), raising kids under radical uncertainty, and closes with the disarming admission that his own synthesis-heavy job is exactly the kind AI is currently worst at. His advice: presume radical uncertainty, dive in rather than sneer, and assume it will probably be okay.

Thoughts

The most useful thing in this conversation is a single question Evans keeps returning to: what is the task, and what is the job? A spreadsheet automated the arithmetic an accountant does, and the number of accountants went up for the next forty years. Claude Code can write the code, but deciding what to build, for whom, and why is the part nobody has automated. The reason the “this profession is X percent exposed to AI” studies feel hollow is that they assume a job is a neat stack of separable tasks. Evans argues, by analogy to the old expert-systems failure, that you simply cannot decompose a senior lawyer’s work that way. The 75-slide deck is the task. Walking your company, reading its politics, talking to your customers, and telling you the uncomfortable truth is the job, and that is what you actually paid McKinsey for.

The boldest and most falsifiable claim is that the foundation-model companies look more like cloud than like Windows. No network effects means no winner-take-all, which means durable competition, which means commodity pricing and compressed margins, with the real value accruing up the stack in applications that nobody at the labs is going to build. His telecom analogy is the one to sit with. A trillion-dollar industry grew mobile data traffic by 1,500 to 2,000 times in fifteen years, and the stocks went nowhere for a quarter century, because it was a low-margin utility while all the interesting value moved to Apple and the people building apps on top. If he is right, today’s token-burn economics, such as the person reportedly spending 1.5 million dollars a month on tokens, are the equivalent of a 50,000 dollar roaming bill in 2010, not the steady state. Evans flags openly that he could be completely wrong, which is the intellectually honest part and the part most forecasters skip.

“It depends” and “it will probably be okay” sound like evasions, and Evans leans into that. But the 1997 framing is doing real work. The point is not that AI is small, it is that the things that will end up mattering have not been built, and that anyone confidently naming the winners today is repeating the 1997 mistake of betting on Excite over a search company with a weird logo. The discipline he is selling is to presume radical uncertainty and act anyway, because the alternative, declaring the whole thing slop and shouting about it online, buys a great feeling of moral superiority and nothing else. His repeated insistence that you can see the job that goes away but never the new job, because it does not exist yet, is the load-bearing idea under his optimism.

The most disarming moment is the closing AI-corner answer, where the person whose entire brand is explaining AI admits he struggles to use it. His work is synthesis and precise information retrieval, and precise retrieval happens to be exactly what today’s models are worst at. He is, in his own words, the lawyer looking at VisiCalc: it is obviously transformative, and he just does not happen to make spreadsheets all day. That admission is worth more than any benchmark, because it locates the real variable. How much AI changes your life depends less on how good the model gets and more on whether your daily work sits on the part of the jagged frontier where it already works. That is a far more practical lens than arguing about whether AGI arrives in three years or thirty.

Key Takeaways
- Evans’s headline opinion is that AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile. Both halves of that sentence matter.
- If you make the internet comparison honestly, we are roughly in 1997: very exciting, most of it does not work yet, most of what people will build has not been built, and it is unclear how any of it will end up working.
- Adoption is spread across a very wide distribution. Even among teenagers, only something like 15 to 20 percent are daily active users and another 20 percent weekly, with the majority saying they do not use it at all.
- That spread maps onto the “jagged frontier” question of where AI works, where it does not, whether you can predict where it will work in advance, and whether you can even tell after the fact.
- Software developers are the accountants seeing VisiCalc: for them everything has already changed. Most other professions are watching, intrigued but unsure what to do with it.
- The AI labs are investing heavily in forward deployed engineers, consultancies, and professional services. Evans jokes that a forward deployed engineer is an Accenture outsourced developer who lives in San Francisco.
- Companies do not have spare people sitting around to reimagine every internal workflow, so reinventing a business around AI is itself a project that needs consultants, which is why the most cutting-edge labs are funding exactly the firms everyone assumed AI would kill.
- The central framework: separate the task from the job. Sometimes the task is the job (the elevator operator pressing a lever), and automating the task ends the job. Far more often, the task is only part of the job.
- Amazon gets you the SKU once you know which SKU you want. Knowing which one to buy is a different job. Claude Code writes the code, but knowing what code and what features to build is the job.
- A McKinsey or Bain engagement is not really about the deck. The deck is the task. The job is walking your enterprise, understanding the politics, talking to your customers, and telling you the truth.
- The Jevons paradox is just price elasticity applied to labour. Make something cheaper to produce and you usually do far more of it, not the same amount with fewer people.
- Excel did not give investment bankers shorter hours. iPhone SDKs did not shrink the number of engineers even though Apple writes 90 percent of the code for you. The number of accountants rose through every wave of automation.
- The lump of labour fallacy: since 1800, each technology automates jobs and unlocks new ones. You can always see the job that disappears and never the new job, because it does not exist yet.
- Evans is wary of argument from authority on jobs. He wants Dario Amodei’s view on where models go in the next 6 to 12 months, not necessarily his theory of labour markets and comparative advantage.
- The doomer scenario of every company buying ChatGPT and firing everyone in two weeks misunderstands how enterprises work. Enterprise sales cycles run 18 months or more. Nobody is ripping out SAP overnight. The full transformation takes 3 to 10 years, sector by sector.
- AGI and superintelligence are being quietly redefined to mean whatever works now. Larry Tesler’s theorem: AI is whatever machines cannot do yet, because once they can, people call it just software.
- We have no theory of human intelligence, no theory of why these models work, and no theory of how much better they will get, so everyone is vibes-forecasting. Even if progress stopped tomorrow, what exists is already transformative and will roll out for a decade.
- On value capture, Evans argues models show no network effects, so no single one runs away with the market. Persistent competition plus little real product differentiation means little pricing power.
- Sam Altman’s pitch of selling intelligence on a meter like electricity ignores the brutal margin structure of utilities. Your TV maker does not pay the power company a cut of your bill.
- The telecom analogy: a roughly trillion-dollar mobile industry spends 15 to 20 percent of revenue on capex, grew data consumption 1,500 to 2,000 times since 2010, and its stocks went nowhere for 25 years because it is a low-margin commodity utility.
- The elemental question: does the model do the whole thing, or does it need thousands of different apps built by different people? If it needs apps, the labs cannot build them all, just as Microsoft did not, so it looks more like AWS than like Windows.
- If the product is a commodity, distribution becomes the moat. Google pushes Gemini through its surfaces, Meta sprayed AI across its apps and quietly ranked between ChatGPT and Gemini in usage, and incumbents with distribution have a structural edge.
- Browsers are the warning: Microsoft used distribution to win the browser war, then it turned out winning browsers did not matter because the value was further up the stack.
- Apple Intelligence, as shown at WWDC 2024, was the most compelling vision of a personal AI assistant Evans has seen. Apple could not ship it, but neither could anyone else, because tool-using on-device agents with no hallucinations across thousands of apps is genuinely hard.
- The model is “the dumb thing underneath” that powers a feature. The same commodity model can sit beneath both Gemini on Android and Apple Intelligence on iOS while the products and distribution differ entirely.
- The anti-AI backlash is a big fuzzy mess. Some is real (local electricity bills, deepfakes, real job anxiety), some is sort of true, and some is simply false.
- The data-center water panic is largely fake. A Livermore lab study put US data-center water consumption at about 0.017 percent of US water use. Local well conflicts are planning problems, not data-center problems.
- We have shockingly little hard data. The model labs do not publish meaningful usage numbers. There is no public daily active user figure for ChatGPT, so economists are reverse-engineering effects from government surveys.
- Real new harms do appear with each wave. A teenager could not use Photoshop to make explicit fakes of every classmate and send them to the whole school in an afternoon. Now they can, and turn them into video.
- The UK Post Office Horizon scandal (buggy Fujitsu software wrongly showing cash shortfalls, leading to prosecutions, bankruptcies, and suicides) is a reminder that every technology brings new ways to ruin lives, by malice or by accident.
- You cannot reliably predict what gets exposed. In 1997 people thought taxis were safe from the internet and newspapers would be fine. The opposite happened. Today, “AI-proof” jobs like personal trainer may not be as safe as they look.
- Uber and Airbnb show that similar-sounding companies can have very different market impact. Uber demolished and then grew the taxi market, while Airbnb’s effect on hotels was fairly marginal because business travel still wants a hotel.
- Every new technology first lets you do the old thing but more, then unlocks things that were not possible before. Recorded music revenue is U-shaped: first “what if I do not pay 15 dollars for a CD,” then “what if 15 dollars a month gives me all the music there is.” Spotify is not an online music store, it is something else.
- Coding was supposed to be one of the last things automated, and instead it is the most transformed role of all, which is itself a lesson in how badly we predict exposure.
- Practical advice: do not stick your head in the sand. Dive in, submerge yourself, and come out understanding what you can do with it. Going into a shrinking job market announcing you will never use AI is not the right posture.
- Evans’s honest coda: he struggles to find AI use cases because his job is synthesis and precise retrieval, the things models are worst at. He uses it for proofreading, images, redecorating his apartment, and dictation. He is the lawyer looking at VisiCalc.

Detailed Summary

AI is as big as the internet, and we are living in 1997

Evans opens with the opinion he calls his most controversial: AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile. To some in tech that sounds dismissive, as if he is underrating a once-in-history event. His reply is that smartphones and the internet were themselves enormous, and we are talking over the internet right now. The deeper point is the comparison’s timing. If this is like the internet, then it is like the internet in 1997: thrilling, but most of it does not work yet, most of what will be built has not been built, and nobody knows how the pieces will fit. His latest 80-slide presentation, he jokes, is essentially 80 ways of saying “we do not know,” which is partly facetious and partly the entire point.

The jagged frontier and the wide spread of adoption

Adoption is not uniform, it is a wide distribution. Some people in tech have bought clusters of Mac minis and stopped using Google, while most people outside tech who use AI at all touch it once every week or two. Even among 13 to 18 year olds, daily active use sits around 15 to 20 percent, weekly use adds another 20 percent, and roughly 60 percent say they do not use it. That spread maps onto what Evans calls the jagged frontier: whether a given task works, whether you can predict in advance that it will work, whether it is intuitive, and whether you can even tell after the fact. Software developers are the accountants who just saw VisiCalc, living in a clear before-and-after. Everyone else is somewhere on the curve, picking it up to varying degrees and a little puzzled about what it is for.

Why the AI labs are buying consultancies

One of the most counterintuitive trends is that the leading labs are pouring money into forward deployed engineers and professional services, the very category many assumed AI would erase. Evans’s explanation is grounded in how companies actually operate. Firms do not keep spare people sitting around to redesign stores, hunt down churn, or rebuild a tech stack, which is exactly why they hire Bain, BCG, McKinsey, Accenture, or Infosys when a big project appears. Reimagining every internal workflow around AI, then actually plugging vertical and horizontal systems together and retraining people, is itself a multi-month project requiring people you do not have. So the work gets outsourced, and the most advanced labs are funding the firms that do it. His joke lands the point: a forward deployed engineer is a statistician, or an Accenture developer, who happens to work in San Francisco.

The task versus the job

This is the spine of the conversation. Ask what the hard part of a job really is. Sometimes the task is the job: the elevator attendant’s whole job was driving the car, the task got automated, the job ended. Much more often the visible task is only a slice. Amazon gets you the SKU once you know which SKU you want, but knowing what to buy is a separate job. Claude Code writes the code, but deciding what to build, for whom, and how to take it to market is the job. A consulting deck is the task, while the reason you pay Bain is for them to walk your company, understand its politics, talk to your customers, and tell you the truth. Evans notes you can already generate a bad McKinsey deck with AI, and the LinkedIn grifters who do are missing that the deck was never the thing you were buying.

Jevons paradox and the lump of labour fallacy

The Jevons paradox is just price elasticity applied to labour: make something cheaper to do and you usually do much more of it. Excel did not hand junior bankers their Friday afternoons off, it expanded the work. iPhone developers write a fraction of the raw code because Apple wrote the drivers and file system, yet the number of engineers did not fall to a tenth of what it was; there are far more of them. The count of accountants climbed through adding machines, punch cards, mainframes, databases, ERP, spreadsheets, and cloud. The lump of labour fallacy is the broader version: since 1800 every technology has removed jobs and unlocked new ones, the removed jobs usually look bad in hindsight, the new ones tend to be better, and GDP keeps rising. You can always see the job that disappears and never the one that does not exist yet.
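The elasticity point can be made concrete with a few lines of arithmetic. The sketch below is an illustration only: it assumes a textbook constant-elasticity demand curve, and the function names, the base price and quantity, and the elasticity values are all hypothetical choices, not figures from the conversation. It shows that halving the price of a unit of work shrinks total spending when demand is inelastic and grows it when demand is elastic, which is the Jevons case.

```python
def quantity_demanded(price, elasticity, base_price=1.0, base_quantity=100.0):
    """Constant-elasticity demand (an assumed model): Q = Q0 * (P / P0) ** -elasticity."""
    return base_quantity * (price / base_price) ** (-elasticity)


def total_spend(price, elasticity):
    """Total spending on the work at a given unit price: P * Q."""
    return price * quantity_demanded(price, elasticity)


if __name__ == "__main__":
    # Halve the unit price and compare total spending before and after.
    # The three elasticities are illustrative, not measured values.
    for elasticity in (0.5, 1.0, 2.0):
        before = total_spend(1.0, elasticity)
        after = total_spend(0.5, elasticity)
        print(f"elasticity={elasticity}: total spend {before:.1f} -> {after:.1f}")
```

With an elasticity of 2, halving the price quadruples the quantity, so total spending doubles. Whether any real profession has elasticity above 1 is an empirical question that this sketch does not answer.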

The jobs question, Dario, and the enterprise sales cycle

On the coming jobs apocalypse, Evans is cautious about argument from authority. Running an AI lab makes Dario Amodei worth listening to on where models go in the next 6 to 12 months, not necessarily on labour economics and comparative advantage. The doomer image of companies buying ChatGPT and firing everyone within weeks misreads reality: enterprise sales cycles run 18 months or longer, nobody is tearing out SAP overnight, and the full transformation will take 3 to 10 years, sector by sector, as people slowly work out what to do. He points to the lag in software itself. Many SaaS companies founded the day before ChatGPT launched could have been built a decade earlier, and were not, because the delay was someone realizing a problem existed and that this was the way to solve it.

Redefining AGI and superintelligence

Evans is skeptical of the moving terminology. He cites Larry Tesler’s line that AI is whatever machines cannot do yet, because the moment they can, people call it just software. Machine learning, image recognition, and sentiment analysis all got reclassified as not really AI once they worked, the same way jet airliners were once high technology and are now just planes. AGI is now often quietly redefined as doing some percentage of economically valuable work, which a 1975 mainframe also did, rather than anything about consciousness or a soul. Whether we reach human-level intelligence is, in his view, genuinely unknowable right now. The reassuring point is that you do not need to resolve it. Even if models hit a brick wall tomorrow, what already exists is transformative and will take a decade to deploy.

Where the value accrues: commodity models and the telecom analogy

Here Evans makes his most deterministic argument. Foundation models appear to lack network effects, so no single model runs away from the pack, competition persists, and product differentiation as users experience it is thin. Without differentiation or lock-in, where does pricing power come from? He skewers Sam Altman’s image of selling intelligence on a meter like electricity by pointing out that utilities have terrible margins and nobody pays the power company a cut of their TV. His telecom career supplies the analogy: mobile is a roughly trillion-dollar industry that spends 15 to 20 percent of revenue on capex, grew data traffic 1,500 to 2,000 times since 2010, and whose stocks went nowhere for 25 years because it is a low-margin commodity utility while the value sits up the stack with Apple and the app makers. If models are commodities and the real product is thousands of apps the labs will not build, the outcome looks like cloud, not like Windows.
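To get a feel for the telecom numbers, it helps to turn the growth multiple into an annual rate. This is a rough sketch under stated assumptions: the fifteen-year window comes from the figures quoted in this post, the function names are hypothetical, and the flat-revenue case is an assumption made purely for illustration (the conversation describes flat stock prices, not flat revenue).

```python
def annual_growth_rate(multiple, years):
    """Compound annual growth rate implied by growing `multiple` times over `years` years."""
    return multiple ** (1.0 / years) - 1.0


def unit_price_index(traffic_multiple, revenue_multiple=1.0):
    """Revenue per unit of traffic relative to the start (1.0 = unchanged).

    revenue_multiple=1.0 is an illustrative assumption of flat revenue.
    """
    return revenue_multiple / traffic_multiple


if __name__ == "__main__":
    years = 15  # window taken from the figures quoted in this post
    for multiple in (1500, 2000):
        rate = annual_growth_rate(multiple, years)
        price = unit_price_index(multiple)
        print(f"{multiple}x over {years} years: {rate:.1%} per year, "
              f"unit price index {price:.5f}")
```

Growth of 1,500 to 2,000 times over fifteen years works out to roughly 63 to 66 percent a year, compounded. Under the flat-revenue assumption, the price per unit of traffic falls to a small fraction of a percent of where it started, which is what a commodity curve looks like.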

Distribution as the moat

If the product is a commodity, distribution decides the winners. The web browser is the cautionary tale: the browser product is a thin wrapper around a rendering engine, tabbed browsing was the last real innovation more than 20 years ago, Microsoft used distribution to win, and then winning browsers turned out not to matter because the value was elsewhere. Now Google drives Gemini through its surfaces, and Meta sprayed AI across its apps and, in survey data, sat between ChatGPT and Gemini in usage despite the tech world writing it off. An adequate product with great distribution and brand becomes a big deal, which is why OpenAI spent last year trying everything to build a flywheel before the giants defaulted everyone onto their own offering. The power of the default and sheer inertia do a lot of work.

Apple Intelligence and the model as the dumb thing underneath

Evans calls the Apple Intelligence segment of WWDC 2024 the most compelling vision of a personal AI assistant he has seen: tool-using, on-device, agentic, with no prompt injection or hallucinations across a standardized API spanning thousands of apps. Apple could not ship it, but neither could anyone else, because that is genuinely hard. The episode illustrates his framing that the model is “the dumb thing underneath” that powers a feature. The same commodity model can sit beneath Gemini intelligence on Android and Apple Intelligence on iOS, with different products, different distribution, and different decisions about what the feature should be. Apple has a billion edge-capable devices, while Google’s “coming soon to our most powerful devices” really means it will not work on most Android phones.

The anti-AI backlash, water, and real harms

The backlash, Evans says, is a big fuzzy mess of very different things. Some is tangible, like a higher local electricity bill in a small number of places. Some is essentially fake, like the water panic. He dug into a Livermore lab study putting US data-center water use at about 0.017 percent of national consumption. Local well conflicts are planning failures, not data-center failures. The jobs piece is genuinely unresolved, with charts pointing both ways and a youth employment slowdown that shows up regardless of degree or AI exposure. He stresses how little hard data exists, since the labs publish no meaningful usage numbers and there is no public daily active user figure for ChatGPT. He compares the moment to the social media backlash, compressed, where some fears were true, some half true, and some simply false. The real new harms are real, though: deepfakes let a teenager generate explicit fakes of an entire school in an afternoon, and the UK Post Office Horizon scandal shows how buggy software plus institutional denial can destroy lives.

You cannot predict what gets exposed, and what to actually do

Evans dismisses the O*NET-style exercise of scoring what percentage of each profession AI can do as deluded, the modern version of the expert-systems problem, where you try to describe a job as 700 logical steps and it never works. You cannot say a senior partner’s work is 17 percent automatable. The history of prediction is humbling: in 1997 people thought taxis were safe from the internet and newspapers would simply save on printing, and both were wrong. Coding, supposedly one of the last things to automate, became the most transformed role of all. Personal trainers might be next once your phone can watch your form. His closing advice is to presume radical uncertainty and act anyway: do not retreat into sneering moral superiority, dive in, internalize what the tools can do, and make yourself a great hire. He ends with a candid admission that his own synthesis-and-retrieval job is exactly what AI is currently worst at, so he is the lawyer looking at VisiCalc, sure it changes everything while not personally making spreadsheets all day.

Notable Quotes

“My most controversial opinion is that I think that AI is as big a deal as the internet or mobile, and only as big a deal as the internet or mobile.”
Benedict Evans, stating the thesis that frames the whole conversation

“If you’re going to make the internet comparison, it’s like we’re in 1997. It’s very exciting. Most stuff kind of doesn’t work yet. Most of the stuff that people are going to do hasn’t been built yet.”
Benedict Evans, on why confident predictions about AI winners are usually wrong

“You can’t look at a senior partner at a law firm and say, well, 17 percent of their work could be automated. This is horseshit.”
Benedict Evans, on why O*NET-style job-exposure scoring fails

“Claude Code can write you the code, but what code do you want? It can make you the features, sure, but what features do you want? Who’s your customer? What’s the right product for that customer?”
Benedict Evans, drawing the line between the task and the job

“There’s this quote from Sam Altman where he said we’re going to be selling AI intelligence on a meter like water or electricity, and you look at this and think, my dear sweet child, you need me to explain the margin structure of the utility industry to you.”
Benedict Evans, on why model labs may lack pricing power

“The model is just the dumb thing underneath that powers the feature. The model is the commodity that powers different decisions about what the feature should be.”
Benedict Evans, on why value moves up the stack to applications

“Every time we have a new technology it automates away a bunch of jobs, and then that automation unlocks a bunch of new jobs, and you don’t know the new job because it doesn’t exist yet.”
Benedict Evans, on the lump of labour fallacy and 200 years of automation

“Don’t stick your head in the sand and say I hate all of this stuff. That gives you a great feeling of moral superiority, but that’s not going to help. What helps is you diving into this and coming out understanding what you can do with it.”
Benedict Evans, on what to actually do about AI right now

“AI is good at stuff that computers are bad at, and bad at stuff that computers are good at.”
Benedict Evans, quoting an observation that explains why he struggles to use AI in his own work

This is a curated set of pulls, not a transcript. To hear the full argument in context, including the telecom and recorded-music charts and the lightning round, watch the full conversation on YouTube here.

Related Reading
- Benedict Evans (ben-evans.com): the primary source for his weekly newsletter and the “AI Eating the World” presentations referenced throughout.
- Jevons paradox (Wikipedia): the price-elasticity idea that anchors his argument about why cheaper output tends to expand work rather than shrink it.
- Why Software Is Eating the World, by Marc Andreessen: the original thesis Evans builds on when he talks about ever-larger addressable markets.
- The British Post Office (Horizon) scandal (Wikipedia): the Fujitsu software failure he cites as proof that every technology brings new ways to ruin lives.
- Tesler’s theorem (Wikipedia): “AI is whatever hasn’t been done yet,” the line behind his point about constantly redefining AGI.

June 1, 2026

Vibe Coding Hardware: Naval, Guillermo Rauch, Blake Scholl, and Max Hodak on AI-Designed Jet Engines, Vertical Integration, China’s Open-Source Bet, and Why Humans Become Verifiers
This is part two of Naval Ravikant’s conversation with frontier founders Guillermo Rauch of Vercel, Blake Scholl of Boom Supersonic, and Max Hodak of Science. Where the first part argued that you should waste tokens to save time and that the job of an engineer is now to build the factory rather than the output, this segment drags that thesis out of pure software and into atoms. The question on the table is what happens to hardware when models can vibe code the spreadsheets, the simulations, and eventually the step files and PCB layouts that aerospace, semiconductors, and biotech are built on. This segment is one half of the discussion, and you can watch and read the full episode here. The full conversation is on the Naval Podcast YouTube channel.

TLDW

Blake Scholl describes how Boom Supersonic took hardware engineering workflows that used to live in siloed Excel spreadsheets and VBScript on individual laptops, with handoffs done by email like it was the 1990s, and turned them into versioned, testable software. The new model is that software engineers build the architectures and the tools while hardware engineers vibe code their own domain-specific pieces, which collapsed a turbine-blade analysis that once took one engineer one day per blade into something where two engineers can design an entire jet engine in real time. Naval generalizes this into the cataclysm of enterprise software: there is no longer a startup that can sell you hardware collaboration tools because companies just code the exact thing they need on demand, and even spreadsheets are cooked because they only existed as a proxy for custom software nobody could previously afford to build. Blake predicts that before the end of 2026 AI will move from generating software to generating step files and PCB layouts, which would reshape mechanical and electrical engineering. The group debates China’s open-source push as a way to neutralize Silicon Valley’s software advantage and protect its hardware and supply-chain superiority, and lands on the point that if you fall behind on generating software you fall behind on generating everything. Guillermo notes that frontier coding intelligence still dominates real usage while cheaper models like Gemini win at scale for support and browser automation. Max Hodak explains Science’s vertical integration, including a captive MEMS foundry on the East Coast, because the most innovative hardware cannot be bought off the shelf, and argues that software still needs hands since a model that cannot make physical things hits real boundaries. The conversation closes on the shift from writing to verifying: junior engineering work got absorbed by agents while the juniors themselves got promoted, much as paralegals could be seen as either fired or promoted, and humans across law, engineering, and operations are becoming the verifiers who sign off on systems they did not write line by line.

Thoughts

The most important shift in this segment is that vibe coding stops being a software-industry story and becomes a deep-tech story. In part one the examples were Postgres, ClickHouse, and deploy targets. Here Blake Scholl is talking about turbine blades that change shape when they heat up, and the brutal fact that converting between cold and hot geometry, and between aerodynamics and structures, used to eat one engineer for one full day per blade in an engine that has a thousand blades. That is the kind of math that quietly kills ambition. When he says two engineers can now design an entire jet engine because the structural and aerodynamic results update in real time as you change the geometry, that is not a productivity improvement, it is a change in what a small team is allowed to attempt. The interesting move is the division of labor: software engineers build the architecture and the framework because they understand systems and separation of concerns, and the hardware engineers vibe code the pieces only they understand. Nobody has to become both.

Naval’s “cataclysm of enterprise software” is the most investable idea in the episode, and it is darker than it sounds for anyone selling B2B tools. His claim is that the entire category of internal collaboration software is being eaten from the inside, because a company that can generate exactly the tool it needs on any given day will not pay a vendor for an approximation of that tool. His follow-on that even spreadsheets are cooked is the sharpest version of the point. The spreadsheet won for forty years precisely because it was the closest thing to custom software that a non-programmer could produce. Remove the constraint that custom software is expensive and the spreadsheet loses its reason to exist. The counterweight, which the group raised in part one with the block-economy thesis, is that the infrastructure primitives agents reach for get more valuable, not less. So the safe place to build is not the collaboration layer on top, it is the primitive underneath.

The China discussion is the geopolitical center of the conversation and it lands on a genuinely uncomfortable insight. The argument is that China leans into open-source models not only because it is a model or two behind, but because open weights neutralize Silicon Valley’s software advantage and let China lean on what it already dominates: hardware, supply chains, and component ecosystems. If software can be generated on demand from open models, then the country with the factories wins the stack. The sharpest line is that if you fall behind on the ability to generate software, you fall behind on the ability to generate everything, because software is now upstream of every hardware pipeline. That reframes the open-versus-closed debate as a question about who controls the means of producing the means of production. It also quietly flatters the American frontier labs, since the same logic says self-improvement requires frontier coding models, and on that narrow axis the consensus at the table is that the Chinese models are not yet in the race.

Max Hodak provides the necessary cold water, and it is the most grounding contribution in the episode. Everyone else is describing software eating the design layer, and Max points out that you still have to make the thing. Science owns a captive MEMS foundry on the East Coast not as a flex but because there was no other way to do the packaging and assembly for products that approach a single block of covalently bonded matter. His framing that the software still needs hands is the real boundary condition on all the AI-eats-everything talk: a model can be smarter than every engineer in the building and still be unable to deposit a layer, bond a wafer, or pass a regulatory inspection. The optimistic case, which he also makes, is that he has instrumented the foundry so that as models improve, the gains show up immediately in cell engineering and material science. The pessimistic reading is that the physical world remains a hard rate limiter, and the companies that own the atoms will capture more of the surplus than the companies that only own the bits.

The closing thread on verification is where the whole conversation resolves into a job description for humans. Guillermo’s point that the biggest problem in software is mountains of slop arriving as a pull request, and that the answer is not pretending to read every line but being able to say “I am signing off on the consequences of this PR, and I wrote the harness, the simulations, the proofs, and the type checkers that let me,” is the most practically useful idea in the episode. It generalizes cleanly. The lawyer you trust is not the one who wrote every clause by hand, it is the one putting their reputation on the line that the document is sound. The production engineer who gets paged at 3am is the one signing off that the system is safe to ship. As models absorb the junior tier of every knowledge profession, the surviving human role is the verifier who carries the accountability. That is a promotion for the people who can hold it and an extinction event for the people whose value was doing the work nobody now needs done by hand.

Key Takeaways
- The factory framing from part one carries straight into hardware: you are judged on whether you build the system that produces multiplicative outputs, not on the single artifact, and the real multiplier was always 100x or 1000x, not 10x.
- AI completely changes the role of software and hardware developers rather than just speeding either one up.
- A huge amount of hardware engineering lives in complex Excel spreadsheets and VBScript on individual engineers’ laptops, with no source control, no automated testing, and handoffs done manually over email. It is software that is not treated as software.
- Boom Supersonic’s move from day one was to turn traditional hardware engineering workflows into real software frameworks that are automatable and repeatable, to drive down the cost of iteration.
- The old bottleneck was that the company could never afford enough software engineers to build those frameworks. AI removes that constraint.
- The new model: software engineers create the architectures because they understand systems, algorithms, and separation of concerns, and hardware engineers vibe code the domain pieces only they understand.
- A turbine blade is cold when it starts and hot when it runs, so it changes shape, and you must design both the cold and hot geometry across aerodynamics and structures. Classically that was one engineer, one day, for one blade, in an engine with a thousand blades.
- With software and hardware people combined, you can now change blade geometry and see the structural and aerodynamic results in real time, which lets two engineers design an entire jet engine.
- Naval’s cataclysm of enterprise software: no startup can sell hardware collaboration tools anymore because companies just code the exact thing they need at any given time.
- Even spreadsheets are cooked. Spreadsheets won only because nobody could build custom software, so a spreadsheet full of VBScript was the closest available approximation. Remove the cost barrier and the approximation loses.
- Engineers are moving from Excel to Python models that produce believable simulations of physical systems.
- AI can generate software today, and by the end of 2026 it is expected to generate STEP files and PCB layouts, which opens up mechanical and electrical engineering as the next frontier.
- The software boon for hardware is biggest for small gadget and parts companies that historically shipped bad software because they could not afford good software. Now they can ship good-enough software, or skip the human front end entirely and expose hardware agentically for voice and agent control.
- China goes all in on open-source models partly to neutralize Silicon Valley’s software edge: if software can be generated on demand from open weights, China’s hardware and supply-chain superiority stops being offset by a software disadvantage.
- Other reasons cited for China’s open-source push: it is a model or two behind, it is distilling models, and the government has a history of funding efforts that lift the whole ecosystem, especially in network-effect businesses.
- Open-source heft is coming almost entirely from China. OpenAI is not open, Grok publishes models but is seen as a model or two behind, Google’s local models are not very competitive, and Anthropic is not known for open-source releases.
- Without frontier coding models you do not get self-improvement, and if you fall behind on generating software you fall behind on generating everything, because software now sits upstream of every hardware pipeline.
- Real AI gateway usage shows open models do get used, but the top is heavily dominated by frontier intelligence.
- Frontier intelligence at the right cost and performance slaps at scale. Gemini models are underrated and excel as industrial production models for support tasks and browser automation, even if they are not the top pick for coding.
- For pushing the frontier you need the best possible coding model, which is now only two or three models, and the Chinese models are not among them.
- One contrarian view at the table: use DeepSeek for 97% of tasks because it is cheap, run it repeatedly for harder problems, and reserve frontier models for the most advanced work. The counterargument: intelligence is an unalloyed good, mistakes are invisible and costly, and a smarter model is always cheaper than a person, so you default to the most intelligent option.
- Always wanting the most intelligent model risks creating a monopoly or oligopoly in AI, because when two models disagree you cannot tell which is right, so you trust the smarter one and stop asking the weaker one.
- Vertical integration is forced, not chosen: if you cannot buy it, you have to make it. The preference is always to buy when a vendor offers a service at a great price, like PCBs from Asia.
- The closer a product gets to a single block of covalently bonded matter, the better it performs: lower power, smaller, higher performance, longer lasting. The components for that level of integration simply are not available to buy.
- Science owns a captive MEMS foundry on the East Coast, bought because there was no other way to do the packaging and assembly the company needed.
- One of the biggest near-term AI impacts inside hardware companies is regulatory and documentation work: tracing which of thousands of ISO standards apply used to occupy a regulatory and quality team for months, and now AI just knows.
- Software still needs hands. A model can be smarter than us and still hit real boundaries if it cannot physically make things, which is why Science has instrumented its foundry so model improvements show up immediately in cell engineering and material science.
- Basic legal work is already going away. People have stopped asking lawyers for NDAs and routine agreements, because law is spaghetti code in English with no real APIs, and the basic tasks are handled by AI.
- Junior engineers got promoted to senior engineers while junior engineering itself got taken over by agents. The same framing applies to paralegals: fired, or promoted to senior lawyers who now spend their time thinking about the law.
- What you value in a lawyer is a trusted authority who puts their reputation on the line, not someone who read every clause. The same trust model is coming to engineering.
- The biggest problem in software engineering today is mountains of slop arriving as a pull request. The old norm of reading every line of a PR is gone.
- The new standard is being able to say “I understand and I am signing off on the consequences of this PR,” backed by the test harness, simulations, proofs, and type checkers you built, even without reading every line.
- Embrace a world where code is spaghetti you do not fully understand, but build the evaluators that give confidence, and rely on production engineers to sign off because someone gets paged if the system goes down.
- Creating software is easy from zero to one. The hard part is a thousand days from now: is it secure, tested, production grade, and performant, and are you still motivated to invest the tokens to maintain it in prod?
- Humans are becoming verifiers. The same way models are trained on good verification data, the old functions of lawyers, engineers, and operations people are moving to verifying the stack and standing behind it.

Detailed Summary

Turning Hardware Engineering Into Software

Blake Scholl opens by describing how AI completely changes the role of software and hardware developers at Boom Supersonic. From day one the company tried to take traditional hardware engineering workflows and turn them into software. For anyone who has not been around hardware engineering, he explains that an enormous amount of it happens in complex Excel spreadsheets on individual engineers’ laptops, sometimes with VBScript code, all of which is actually software but is not treated as software. There is no source control, no automated testing, and when an aerodynamicist hands work to a structures engineer it is done manually with a spreadsheet over email, like it is the 1990s. Boom started building software frameworks to automate and make those flows repeatable so the cost of iteration would drop, but progress was slow because the company could never afford enough software engineers.

Two Engineers, One Jet Engine

The mind-blowing change, in Blake’s words, is a new division of labor. Software engineers create the architectures because they understand systems, algorithms, and separation of concerns, and then hardware engineers vibe code the pieces that draw on what they uniquely know about hardware. The result is wildly different productivity for small teams. His example is the turbine blade: it starts cold and gets bigger as it heats up in operation, so you have to design both the cold shape and the hot shape, converting between them and between structures and aerodynamics. Classically that was one engineer, one day of analysis, for one blade, in a jet engine with a thousand blades, which means you simply could not do much. Now, with software and hardware people working together, you can change blade geometry and see the structural and aerodynamic results in real time, which allows two engineers to design an entire jet engine.
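
To make the cold-versus-hot conversion concrete, here is a minimal Python sketch. It assumes uniform linear thermal expansion, which is a deliberate simplification: real blade analysis also has to handle centrifugal loads and temperature gradients, and nothing here reflects Boom’s actual tooling. The `BladeSection` type, the expansion coefficient, and the temperature rise are illustrative assumptions; only the one-engineer-day-per-blade and thousand-blade figures come from the conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BladeSection:
    """One radial station of a blade (hypothetical type for this sketch)."""
    radius_mm: float
    chord_mm: float


def hot_from_cold(section: BladeSection, alpha_per_k: float, delta_t_k: float) -> BladeSection:
    """Grow a cold section to its hot shape, assuming uniform linear expansion."""
    scale = 1.0 + alpha_per_k * delta_t_k
    return BladeSection(section.radius_mm * scale, section.chord_mm * scale)


def cold_from_hot(section: BladeSection, alpha_per_k: float, delta_t_k: float) -> BladeSection:
    """Invert hot_from_cold: the shape to manufacture for a desired hot shape."""
    scale = 1.0 + alpha_per_k * delta_t_k
    return BladeSection(section.radius_mm / scale, section.chord_mm / scale)


# Assumed values, chosen only to make the example run.
ALPHA_PER_K = 13e-6   # linear expansion coefficient, per kelvin
DELTA_T_K = 700.0     # temperature rise from cold to operating condition

cold = BladeSection(radius_mm=300.0, chord_mm=40.0)
hot = hot_from_cold(cold, ALPHA_PER_K, DELTA_T_K)
round_trip = cold_from_hot(hot, ALPHA_PER_K, DELTA_T_K)

# Workload arithmetic from the conversation: one engineer-day per blade,
# about a thousand blades per engine.
BLADES_PER_ENGINE = 1000
ENGINEER_DAYS_PER_BLADE = 1
manual_engineer_days = BLADES_PER_ENGINE * ENGINEER_DAYS_PER_BLADE

print(f"hot radius: {hot.radius_mm:.2f} mm")
print(f"manual analysis: {manual_engineer_days} engineer-days per design iteration")
```

Once the conversion is a pure function rather than a spreadsheet, it can be versioned, unit tested with a round-trip check like the one above, and rerun for every blade whenever the geometry changes.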

The Cataclysm of Enterprise Software

Picking up on the point that software engineers now build the tools and architectures for everyone else, Naval names what he calls the cataclysm of enterprise software. There is no longer a startup that can build and sell hardware collaboration tools, because internally companies just code the right things they need at any given moment. Even spreadsheets are cooked, he argues, because the reason spreadsheets succeeded is that no one could build custom software, so a spreadsheet stuffed with VBScript functions was the closest available approximation. With that constraint gone, the proxy collapses. He notes he has personally moved almost entirely from Excel to Python models where he can get believable simulations of things.

Generating STEP Files and PCB Layouts

The next frontier, Blake suggests, is the thing AI has not reached yet but probably will by the end of 2026: today it can generate software, but soon it will generate STEP files and PCB layouts, and when it comes for mechanical and electrical engineering that will be a whole other thing nobody has seen yet. On the hardware side this is described as a particular boon for the many small gadget and parts companies that historically wrote bad software because they could not afford to make great software. Now they can make good-enough software, or skip a human front end entirely and expose the hardware agentically, so that an agent accesses it and a person controls the hardware by voice.
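
One way to picture exposing hardware agentically is a device that publishes a machine-readable list of its commands and accepts structured calls instead of shipping a human front end. The sketch below is hypothetical throughout: the `Lamp` class, the `TOOLS` description format, and the `dispatch` function are invented for illustration and do not follow any particular vendor’s schema or anything described in the episode.

```python
import json
from typing import Any, Dict


class Lamp:
    """Hypothetical gadget with one controllable setting."""

    def __init__(self) -> None:
        self.brightness = 0

    def set_brightness(self, percent: int) -> int:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.brightness = percent
        return self.brightness


# Machine-readable description an agent could read in place of a human UI.
# The shape of this schema is an assumption made for the sketch.
TOOLS = [
    {
        "name": "set_brightness",
        "description": "Set lamp brightness as a percentage.",
        "parameters": {"percent": "integer, 0 to 100"},
    }
]


def dispatch(device: Lamp, call_json: str) -> Dict[str, Any]:
    """Run one agent-issued call, given as JSON, against the device."""
    call = json.loads(call_json)
    allowed = {tool["name"] for tool in TOOLS}
    if call["name"] not in allowed:
        # Only advertised commands are reachable, never arbitrary attributes.
        return {"ok": False, "error": "unknown tool"}
    try:
        result = getattr(device, call["name"])(**call["arguments"])
    except (TypeError, ValueError) as exc:
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "result": result}


lamp = Lamp()
reply = dispatch(lamp, '{"name": "set_brightness", "arguments": {"percent": 40}}')
print(reply)
```

The allow-list check matters more than the schema: the agent can only reach what the device explicitly advertises, and bad arguments come back as structured errors it can react to.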

China’s Open-Source Bet and Hardware Superiority

This leads into one of the reasons China is described as going all in on open-source models. With hardware superiority, complex supply chains, and deep component chains, China’s logic is that if it can generate software on demand it no longer suffers a software disadvantage against Silicon Valley. That is framed as not the only reason: China is also a model or two behind, it is distilling models, and the government has a history of funding efforts that lift the entire ecosystem, especially in network-effect businesses. Ironically, the open-source heft comes from China precisely because OpenAI is not open, Grok publishes models but is a model or two behind, Google’s local models are not very competitive, and Anthropic is not known for open releases. The deeper point is that without great frontier coding models you do not get self-improvement, and if you fall behind on the ability to generate software you fall behind on the ability to generate everything, because generating software is embedded in every piece of the hardware pipeline.

Frontier Intelligence vs. Cheap Models

Naval raises a dinner-table argument from the night before, where someone claimed you will use DeepSeek for 97% of things because it is cheap, run it repeatedly when you need more intelligence, and reserve OpenAI or Anthropic for the most advanced tasks. Naval pushes back: intelligence is an unalloyed good, you always want more of it, model mistakes are invisible, and a smarter model is always cheaper than a real person in real time, so you default to the most intelligent model available. He notes the downside is that this tends toward a monopoly or oligopoly, because when two models give different answers you often cannot tell which is correct, so you trust the smarter one and gradually stop asking the weaker one. Guillermo confirms with AI gateway data that open models do get used, but the top is heavily dominated by frontier intelligence. His caveat is that frontier intelligence at the right cost and performance slaps at scale: Gemini models are underrated and are excellent industrial production models for support tasks and browser automation, while for pushing the frontier you need the best possible coding model, now only two or three models, and the Chinese models are not in that set.
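
The cheap-first position from the dinner-table argument can be sketched as a cascade: sample the cheap model several times, accept the answer if the samples agree, and escalate to a frontier model otherwise. This is a minimal Python illustration under stated assumptions. The `cascade` function, the agreement threshold, and both stub models are hypothetical, and no real model API is called.

```python
import itertools
from collections import Counter
from typing import Callable, Tuple

Model = Callable[[str], str]


def cascade(prompt: str, cheap: Model, frontier: Model,
            samples: int = 3, min_agreement: float = 1.0) -> Tuple[str, str]:
    """Return (answer, tier). Escalate when cheap samples disagree."""
    answers = [cheap(prompt) for _ in range(samples)]
    top_answer, votes = Counter(answers).most_common(1)[0]
    if votes / samples >= min_agreement:
        return top_answer, "cheap"
    return frontier(prompt), "frontier"


# Stub models standing in for real ones (assumptions for the sketch).
_inconsistent = itertools.cycle(["A", "B", "A"])


def cheap_stub(prompt: str) -> str:
    # Stable on easy prompts, inconsistent on hard ones.
    return next(_inconsistent) if "hard" in prompt else "42"


def frontier_stub(prompt: str) -> str:
    return "frontier answer"


easy = cascade("easy question", cheap_stub, frontier_stub)
hard = cascade("hard question", cheap_stub, frontier_stub)
print(easy, hard)
```

The sketch also shows where Naval’s objection bites: agreement among cheap samples is only a proxy for correctness. If the cheap model is consistently wrong, the cascade accepts the answer and the mistake stays invisible.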

Vertical Integration and the Captive MEMS Foundry

Asked about his push into vertical integration and extreme urgency, Max Hodak explains that for many things you cannot buy what you need, so you have to make it. The preference is always to buy when a vendor offers a service at a great price, and he points to PCBs, which are basically free and available in unlimited quantity from Asia. But the closer a product gets to being a single block of covalently bonded matter, the better it is: lower power, smaller, higher performance, longer lasting. The components for that level of integration are not available, so to innovate beyond piecing together off-the-shelf parts you have to learn to do it yourself, which shows up as vertical integration. Science owns a captive MEMS foundry on the East Coast, bought because there was no other way to do the packaging and assembly work the company wanted.

Software Still Needs Hands

Max expects AI to heavily affect all of this over the next few years, though it is not quite there yet. Ironically, one of the biggest impacts already seen is in regulatory interactions and documentation: figuring out which of thousands of ISO standards apply to a product change, and tracing it through, used to occupy a regulatory and quality team for months, and now the AI just knows. But for things like the surgical program or the MEMS fab, he argues the software still needs hands. It will be smarter than us, but if it cannot make things, those are real boundaries. Science has instrumented its foundry and many other parts of the company so that as models get better, the improvement shows up immediately in cell engineering and material science.

Lawyers, Paralegals, and the Promotion of Junior Work

The discussion turns to law as a parallel to engineering. It has been a while since anyone at the table generated a basic legal document using a lawyer. Routine work like NDAs and standard agreements is gone, because law is essentially spaghetti code that contradicts itself and has no real APIs, expressed in complicated English. Junior engineers got a promotion to senior engineers while junior engineering itself was taken over by agents, and the same framing applies to paralegals: you can say they were fired, or you can say they were promoted to senior lawyers who now spend their time thinking about the law. What you actually value in a lawyer is a trusted authority who went to law school and puts their reputation on the line when they tell you a document is legit.

Slop PRs, the Thousand-Day Problem, and Humans as Verifiers

Guillermo argues the biggest problem in software engineering today is mountains of slop ending up as a pull request. The old meme of reading every line of a PR is gone. In infrastructure he wants engineers to be able to say they understand and are signing off on the consequences of a PR, backed by the test harness, simulations, proofs, and type checkers they wrote, so they have confidence it will be safe in production even without reading every line. There is a world where everyone embraces that the code is spaghetti nobody fully understands, but builds the evaluators that give confidence and relies on production engineers to say it is fine to ship, because someone gets paged if the system goes down. The further warning is that creating software is easy from zero to one, but a thousand days from now you have to ask whether it is secure, tested, production grade, and performant, and whether you are still motivated to invest the tokens to maintain it in prod. The resolution is that humans are becoming verifiers, the same way models are trained on good verification data, and the old functions of lawyers, engineers, and operations people are moving to verifying the stack and standing behind it.
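
The sign-off idea can be illustrated with a small property-based harness: rather than reading an implementation line by line, the reviewer writes checks that any correct implementation must pass and approves based on those. The Python sketch below is an assumption-laden toy, not a description of how any team in the conversation works. `verify_sort`, `generated_sort`, and `sloppy_sort` are hypothetical names, and sorting stands in for whatever the PR actually changes.

```python
import random
from collections import Counter
from typing import Callable, List

SortFn = Callable[[List[int]], List[int]]


def verify_sort(impl: SortFn, trials: int = 200, seed: int = 0) -> bool:
    """Check properties every correct sort must satisfy, without reading impl."""
    rng = random.Random(seed)
    # Fixed edge cases first, then seeded random cases for reproducibility.
    cases = [[], [1], [3, 1, 2], [2, 2, 1]]
    cases += [
        [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        for _ in range(trials)
    ]
    for case in cases:
        out = impl(list(case))
        ordered = all(a <= b for a, b in zip(out, out[1:]))
        same_items = Counter(out) == Counter(case)  # nothing lost or invented
        if not (ordered and same_items):
            return False
    return True


def generated_sort(xs: List[int]) -> List[int]:
    """Stands in for agent-written code nobody read line by line."""
    return sorted(xs)


def sloppy_sort(xs: List[int]) -> List[int]:
    """Looks plausible but silently drops duplicates."""
    return sorted(set(xs))


approved = verify_sort(generated_sort)
rejected = not verify_sort(sloppy_sort)
print(approved, rejected)
```

The reviewer’s accountability moves into the harness: the quality of the sign-off is only as good as the properties checked, which is why the duplicate-dropping bug is caught here only because a case with repeated values was included.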

Notable Quotes

“What I found is it completely changes the role of software and hardware developers.”
Blake Scholl, on how AI reshaped engineering at Boom Supersonic.

“If you want to hand something off from like an aerodynamicist to a structures engineer that’s done manually with like a spreadsheet over email. It’s the 1990s. It’s terrible.”
Blake Scholl, describing the state of traditional hardware engineering workflows.

“It allows two engineers to design an entire jet engine, which is just wildly different.”
Blake Scholl, on collapsing turbine-blade analysis with real-time structural and aerodynamic feedback.

“Even spreadsheets are kind of cooked, right? Because the reason spreadsheets were successful is that no one could build custom software.”
Naval Ravikant, on the cataclysm of enterprise software.

“Right now it can generate software, but soon it’ll be able to generate STEP files and PCB layouts. And when it comes for mechanical and electrical engineering, that will be a whole other thing that we haven’t seen yet.”
Blake Scholl, on the next frontier for AI in hardware.

“If you fall behind on your ability to generate software, you fall behind on the ability to generate everything.”
Naval Ravikant, on why software now sits upstream of every hardware pipeline.

“Anytime I’m working to push the frontier you need the best possible coding model, and that’s basically now like two or three models, and the Chinese are certainly not in it.”
Guillermo Rauch, on where frontier coding intelligence actually lives.

“You can’t buy it, so you got to make it somehow. The closer that our products get to being like a single block of covalently bonded matter, the better they’ll be.”
Max Hodak, on why Science is forced into vertical integration.

“The software still needs hands. It’s going to be smarter than us, but if it can’t make things, then those are real real boundaries.”
Max Hodak, on the physical limits of AI in hardware.

“You need to be able to say I am signing off on understanding the consequences of this PR, or I wrote the test harness, the simulations, the proofs, the type checkers, to be able to say even without reading this, I have confidence it’s going to be safe in production.”
Guillermo Rauch, on what code review becomes in the age of slop PRs.

“Creating software is really easy 0 to one. But think about a thousand days from now. Is it secure? Is it tested? Is it production grade? And are you still motivated to invest all of those tokens in maintaining it in prod?”
On the long-term cost of software that is cheap to create and expensive to keep alive.

Watch the full conversation on the Naval Podcast here.

Related Reading
- Full episode: The AI Industrial Revolution, the complete hour-long conversation this clip is drawn from, covering software factories, hardware, regulation, healthcare economics, autonomous companies, and creativity.
- Part one: Waste Tokens to Save Time, the first half of this same conversation, where Naval, Guillermo Rauch, Blake Scholl, and Max Hodak argue that the job of an engineer is to build the factory and that pure software is not dead.
- Boom Supersonic, Blake Scholl’s company building supersonic civilian aircraft and its own jet engines, the source of the turbine-blade and two-engineers example.
- Science Corporation, Max Hodak’s company, whose captive MEMS foundry and surgical program anchor the vertical-integration argument.
- Vercel, Guillermo Rauch’s company, whose AI gateway data informs the point about frontier intelligence dominating real usage.
- Microelectromechanical systems (Wikipedia), background on the MEMS technology behind the captive foundry Max Hodak describes.
May 29, 2026
