API for agents – PJFP.com

Naval Ravikant gathers three frontier founders (Guillermo Rauch of Vercel, Blake Scholl of Boom Supersonic, and Max Hodak of Science) for a freewheeling conversation about how AI coding tools are reshaping what an engineer is, what software is worth, and where the moat goes when models speak English. The headline idea comes from Naval himself: waste tokens, save time. Stop measuring AI by tokens consumed or lines of code generated and start measuring it by the final output and the time you got back. The full conversation is on the Naval Podcast YouTube channel. This is part one of the discussion. Part two, on vibe coding hardware, follows the same group into jet engines, semiconductors, and biotech.

TLDW

The job of an engineer is shifting from shipping output to building the factory that ships the output. That means 10x engineers were never really 10x: in idea domains they were always 100x or 1000x, and AI leverage is making that obvious. Models now reflect back the judgment of the user, so a senior architect extracts dramatically more value than a junior, although the junior also writes code they could never have written alone. The frontier models have quietly graduated from junior coders to principal engineers, returning with intuitive plans and real tradeoffs (sometimes with hilariously bad time estimates) rather than just running away with the prompt. Naval has stopped learning prompt tricks, scaffolding tools, and Claude plan-mode rituals entirely. Instead he throws Codex, Claude, and Gemini at the same problem in parallel and brute forces his way through, because tokens are still cheaper than a human and the models keep getting better faster than tricks can. That leads to the bigger question on the table: is pure software still investable, or is it now just a free byproduct of hardware, models, and taste? The group lands on the block economy thesis (a tip of the hat to Mitchell Hashimoto): agents do not want to reinvent Postgres or BMQ on the fly, they want to grab the right reusable building block, so infrastructure software actually gets more valuable, not less. Max Hodak closes the loop with a personal data point: he has not written a line of code in years and has built more software since December than ever before, all through agents, because understanding APIs, data flow, and performance is what actually moves the work forward.

Thoughts

The “waste tokens, save time” line is the most important rhetorical move in this conversation, and it deserves to be unpacked beyond the soundbite. Naval is implicitly arguing that the entire token-economics debate (input cost, output cost, leaderboards, model arbitrage) is a category error in the same way that lines-of-code was a category error in the nineties. The thing being purchased is not tokens. It is a finished result delivered with less of your finite attention spent. If three parallel runs of Codex, Claude, and Gemini cost you a few dollars and one of them lands the answer in twenty minutes instead of you sweating the problem for two hours, the unit economics are not even close. The only people who care about the token bill are people who have not internalized that human time is the truly scarce resource. Once you do internalize it, the question is no longer “how do I prompt this more efficiently,” it is “how do I get out of my own way.”
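To put rough numbers on that comparison, here is a small worked calculation. The twenty minutes and two hours come from the paragraph above; the 5 dollar token bill (standing in for “a few dollars”) and the 100 dollar hourly rate are illustrative assumptions, not figures from the episode.

```python
def net_saving(token_cost_usd: float,
               minutes_with_agents: float,
               minutes_by_hand: float,
               hourly_rate_usd: float) -> float:
    """Dollar value of the time saved, minus what the tokens cost."""
    minutes_saved = minutes_by_hand - minutes_with_agents
    value_of_time = hourly_rate_usd * minutes_saved / 60
    return value_of_time - token_cost_usd


def break_even_rate(token_cost_usd: float,
                    minutes_with_agents: float,
                    minutes_by_hand: float) -> float:
    """Hourly rate at which the token bill exactly equals the time saved."""
    minutes_saved = minutes_by_hand - minutes_with_agents
    return token_cost_usd * 60 / minutes_saved


# Assumed inputs: $5 of tokens, 20 minutes with agents, 120 minutes by hand,
# and a hypothetical $100/hour value on the human's time.
saving = net_saving(5, 20, 120, 100)
rate = break_even_rate(5, 20, 120)
print(f"net saving: ${saving:.2f}, break-even rate: ${rate:.2f}/hour")
```

Under these assumptions the token bill only stops paying for itself if your time is worth less than about 3 dollars an hour, which is the sense in which the unit economics are “not even close.”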

The 100x and 1000x engineer point is the one most likely to enrage commenters, and it is also the one most worth taking seriously. Naval is right that the egalitarian flinch in software circles always sat awkwardly next to the empirical fact that one Carmack, one Brendan Eich, or one Satoshi creates more durable value than every mid-tier engineer on earth combined. What AI does is collapse the bottom of that distribution. The marginal junior engineer at a typical company is now competing with a model that costs a few dollars an hour and never sleeps. The remaining premium for human engineers is taste, judgment, and the rare ability to pick the right thing to build at all, which Naval correctly flags as the multiplier that dwarfs raw coding speed. “Just one who had a better judgment on what to work on in the first place” is the most underrated line in the whole episode.

Guillermo Rauch’s observation that the models have graduated from running away with your prompt to returning with three routes and a tradeoff matrix is the technical update most people have not actually felt yet. There was a real, qualitative shift when the model started saying “we don’t put high-cardinality telemetry into Postgres, you probably want ClickHouse or Athena.” That is not autocomplete. That is a peer. And the funny corollary, that the same model will then confidently tell you the work will take three weeks when it will take three hours, is not a knock on the model. It is a reminder that calibration is a separate skill from competence, and humans get this wrong constantly too. The right posture is to treat the model the way a good engineering manager treats a strong but cocky senior: take the architecture suggestions seriously, throw out the estimates.

The block-economy thread, riffing on Mitchell Hashimoto, is where this conversation quietly answers Naval’s “is pure software dead” question. Agents are insatiable consumers of reusable building blocks because reinventing infrastructure on every run is wasteful, brittle, and incompatible with the rest of the world. If your service is the canonical primitive an agent reaches for (the queue, the database, the auth layer, the deploy target), you are not commoditized by AI, you are amplified by it. Pure software is not dead. Pure software with no distribution, no defensibility, and no integration into the agent toolchain is dead. That is a much less catchy headline, but it is the real one. The takeaway for founders is not to abandon software, it is to ask whether your software is something an agent will reach for ten thousand times a day or something a human had to be talked into using once.

Max Hodak’s confession (no code written in years, more software built since December than ever before) is the empirical proof that this is not just theory. The skill that ports forward is not syntax. It is the engineering leader’s instinct for what an API is, how data flows, where performance matters, and what level of expectation to set. Guillermo’s framing of “vibe coding through people on Slack” as the original form of vibe coding is genuinely insightful. A good engineering manager has always been transmitting intent to other minds and letting them run. Doing it with agents is the same skill, just with a faster, cheaper, more literal counterparty. The engineers who will struggle in this transition are the ones whose identity was tied to writing the code themselves. The ones who will thrive are the ones who already thought of themselves as taste, judgment, and intent, with code as an implementation detail.

Key Takeaways

The engineer’s job has shifted from shipping output B to building the factory that produces outputs B through Z. You are now judged on the multiplicative system you create, not the single artifact you deliver.
10x engineers were always a misnomer. In idea-domains and digital domains, the real distribution has always been 100x or 1000x. AI just made that obvious enough that arguing about it is no longer fashionable.
Token consumption leaderboards are the new lines-of-code metric: a vanity number that measures activity, not value. Tokens are an input, your time is the constraint.
Naval’s core rule: waste tokens, save time. Tokens are still vastly cheaper than human hours, no matter how the pricing scares you.
Models tend to be about as good as you are in a given domain. The feedback you give them (the corrections, the redirections) is sporadic, but it powerfully shapes the quality of the output.
The quality of your reprompting matters enormously today, but will probably matter less over time as models get smarter and need less hand-holding.
Naval has refused to learn prompt scaffolding, plan-mode tricks, or named prompt frameworks. His bet is that the models will figure out how to use him faster than he can figure out how to use them.
His preferred technique: throw Codex, Claude, and Gemini at the same problem in parallel and brute force the answer. Time is the cost center, not API spend.
Lower quality first-draft code is not a blocker. When it is time to ship, throw more tokens at it for a hardening pass. Quality compounds across model generations.
Verifiable domains (problems with a clear right answer) are the ones the models will fully solve. Cutting-edge creativity work, the Terence Tao tier, still needs careful human collaboration.
Models have qualitatively shifted from “next-token autocomplete that runs away with your prompt” to “intuitive planning mode” where they return with multiple routes and explicit tradeoffs.
This is why people on social media say models are now PhD-level. It is not the raw output, it is the back-and-forth posture.
Models will confidently make terrible time estimates (“this is a three week project”). Treat them like a strong but miscalibrated senior engineer: trust the architecture, ignore the schedule.
Architect-level engineers are extracting much more value per session than junior engineers, but juniors are still leveling up because they can now write code far above their unaided ability.
The next career step for a junior engineer is moving from implementing features to picking technologies. Postgres vs ClickHouse, ZMQ vs other queues. The model can suggest, but a human still has to decide.
Taste and judgment remain the residual human advantage. Models will give you good tradeoffs if you ask, but knowing which tradeoff to take is still on you.
Concrete example: a recent model pushed back when asked to store high-cardinality telemetry in Postgres and recommended ClickHouse or Athena instead. Unprompted architectural judgment.
Humans are still completing the model for tasks like fetching API keys, moving capital, or performing real-world actions. That gap is temporary.
Every SaaS and hosting company will soon expose a CLI or API surface that agents can drive directly. Anything Unix-shaped and text-based, agents can already hack into a usable API themselves.
The missing piece for full autonomy is payments. Crypto, Bitcoin, or any programmable money lets the agent buy what it needs without a human in the loop.
The open question Naval poses: is pure software dead? We used to learn code to talk to machines. Now machines speak fuzzy, sloppy English back to us.
For hardware founders, AI is a massive boon. Software, for which it was always hard to hire the artists (per Patrick Collison’s “software is art” framing), is suddenly fast and cheap to produce alongside the hardware.
Model training, post-training, and fine-tuning may be the new “real software engineering” for those who want to work at the model layer.
Mitchell Hashimoto’s “block economy” thesis: agents need powerful, reusable, well-known building blocks. They should not reinvent message queues or databases every run.
Reinventing primitives is bad civic engineering. The value of “we both depend on Postgres 13.2” is interoperability with the rest of society and the rest of the toolchain.
Infrastructure software and reusable libraries are getting more valuable, not less, in the agentic era. Vercel’s bet is on being the layer agents reach for.
Useful metaphor: building blocks are like a token cache. Why churn through a trillion tokens to reproduce code that already exists when you can fork from a known starting point?
Max Hodak has not written a line of code in years but has shipped a huge volume of personal software since December, all through agents. Projects he had fantasized about for years are now actually running.
What still matters from a real software background: understanding what an API is, how data flows, performance expectations, and how to set the right level of demand on an operation.
A proficient engineering leader has always been “vibe coding through people” on Slack and in one-on-ones, transmitting intent and letting others execute. Doing it with agents is the same skill, faster and cheaper.
Naval personally went from twenty years of not coding to coding constantly through agents, leaning on first-principles software engineering and algorithms knowledge.
The friction that historically killed personal coding projects (latest framework, infra plumbing, deploy setup) is now mostly handled by the agent. Vercel makes it easier, agents make it trivial.
The single biggest change Max highlights: you do not get stuck anymore. The indefinite debugging spiral on some narrow obscure bug is largely gone.
The old mantra that learning to program means accepting intrinsic frustration (“nope, that’s part of the deal”) is no longer true. The frustration was incidental, not essential.
The frontier founder pattern on display in this episode: all three guests build their own factories (Vercel’s AI cloud, Boom’s supersonic jets and engines, Science’s biohybrid brain interface) rather than composing from off-the-shelf parts.

Detailed Summary

The Software Factory and the Hundredfold Engineer

Guillermo Rauch opens the substantive portion of the conversation with the framing he has been pushing publicly: the role of the engineer is moving from “ship output B” to “build the factory that ships outputs B through Z.” That reframes engineering judgment. You are no longer evaluated on the single deliverable, you are evaluated on the multiplicative system you put in place. Naval picks up the thread and points out that this also retires an old debate. Engineers used to argue about whether 10x engineers existed, with the egalitarian camp insisting that talent differences were marginal. The truth, Naval says, was always more extreme. In idea-domains, virtual domains, and intellectual domains, the distribution has always been 100x or 1000x, not 10x. Brendan Eich, Carmack, Satoshi, the canonical names, were 1000x programmers. AI has made the underlying distribution legible. And the multiplier on top of all of that is judgment: picking the right thing to work on in the first place is an infinity multiplier compared to picking the wrong thing, regardless of raw skill.

Token Leaderboards Are the New Lines of Code

Guillermo flags the current cultural confusion: people see their AI bills, see the token counts, and assume they should be optimizing for tokens-per-engineer or similar metrics. Max Hodak’s response cuts through it. Token consumption, like lines of code before it, is not a meaningful productivity metric. It is an activity metric, and activity metrics always mislead. Max adds his own field observation: the models tend to be roughly as good as you are in a given domain. A senior developer extracts genuinely powerful output, a junior gets junior-quality output back, because the feedback loop (the corrections, the redirections, the architectural pushback) is what shapes quality. The sporadic but high-leverage moments where the user redirects the model are doing more work than the prompt itself.

Naval’s Brute Force Doctrine: Waste Tokens, Save Time

Naval lays out his personal posture, which has become the title of the conversation. He has deliberately ignored all the prompting tricks, scaffolding tools, and named prompt frameworks (“use Ralph Wiggum, use OpenClaude, use Hermes, use plan mode”), on the bet that the models will figure out how to use him faster than he can figure out how to use them. He is ham-fisted with the models, gets frustrated, types less and less, and just brute forces his way through by running Codex, Claude, and Gemini at the same problem simultaneously. The justification is economic. No matter how expensive the models seem, they are still vastly cheaper than a human hour. Do not measure tokens as inputs or outputs. Measure your time and the final output. Even when the first-draft code is low quality, that is not a blocker. When the moment comes to ship, throw more tokens at it. The models will rewrite it, harden it, and they get better every generation. Naval explicitly excepts cutting-edge creative work (the Terence Tao tier of unsolved problems) where you still need to collaborate carefully and closely. Everywhere else, brute force is the dominant strategy.
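The parallel brute-force workflow can be sketched as a fan-out: hand the same task to several solvers at once and keep the first answer that passes a check you trust. This is a minimal illustration, not Naval’s actual setup. The three `model_*` functions are hypothetical stand-ins that return canned answers instead of calling any real provider API, and `passes_check` is a toy verifier invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional, Sequence


# Hypothetical stand-ins for three different models. A real version would call
# each provider's API; these just return canned candidate solutions.
def model_a(task: str) -> str:
    return "def add(a, b): return a - b"  # plausible but wrong


def model_b(task: str) -> str:
    return "def add(a, b): return a + b"  # correct


def model_c(task: str) -> str:
    return "def add(a, b) return a + b"  # does not even parse


def passes_check(candidate: str) -> bool:
    """Toy verifier: the candidate must define add() with add(2, 3) == 5.

    exec() is acceptable here only because the inputs are the canned strings
    above. Real model output should be run in a sandbox.
    """
    namespace: dict = {}
    try:
        exec(candidate, namespace)
        return namespace["add"](2, 3) == 5
    except Exception:
        return False


def brute_force(task: str,
                solvers: Sequence[Callable[[str], str]],
                check: Callable[[str], bool]) -> Optional[str]:
    """Run every solver on the same task in parallel and return the first
    candidate that passes the check, or None if none of them do."""
    with ThreadPoolExecutor(max_workers=len(solvers)) as pool:
        futures = [pool.submit(solver, task) for solver in solvers]
        for future in as_completed(futures):
            try:
                candidate = future.result()
            except Exception:
                continue  # one solver failing should not sink the others
            if check(candidate):
                # Leaving the with-block still waits for any stragglers.
                return candidate
    return None


winner = brute_force("write add(a, b)", [model_a, model_b, model_c], passes_check)
print(winner)
```

The design choice worth noticing is that the human effort goes into the check, not the prompts. The wasted runs cost tokens; only the verified one costs attention.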

From Junior Coder to Principal Engineer

Guillermo identifies a qualitative shift that has happened recently. Models used to do the classic next-token thing: take your prompt and run away with it in a direction you may not have wanted. Now they enter an intuitive planning posture without being told to plan. They come back and say “what you are asking has these three routes, here are the tradeoffs.” That, Guillermo argues, is the moment the model stopped being a junior engineer and became a principal engineer. The funny side effect is that they will then return preposterous time estimates (“this will take three weeks”) with full confidence. The conclusion is to treat the model as a peer for architecture and a baby for scheduling. Returning to the senior-versus-junior question that Max raised, Guillermo argues juniors clearly do level up because they write code well above their solo ability, but architects extract maybe 10x while juniors extract more like 2x. The juice scales with the user’s existing taste.

Taste, Judgment, and Architectural Decisions

Max names the residual human contribution: taste and judgment. Picking between Postgres and ClickHouse for high-cardinality telemetry data, picking between ZMQ and another queueing system. The models can recommend, but a human still has to call it. Guillermo offers a recent concrete example where a model pushed back unprompted: when asked to put high-cardinality telemetry into Postgres, the model responded “we don’t put that kind of data into Postgres, you should consider ClickHouse or Athena.” That is the new normal. The peer-level architectural pushback is happening unsolicited, which is genuinely impressive and a real shift from the deferential autocomplete of two years ago.

When the Human Becomes the Tool

Guillermo raises the inversion question: at what point does the model stop being the assistant and the human start being the assistant who fetches API keys, moves capital, and performs real-world actions on the model’s behalf? Naval treats it as a temporary aberration. Every serious SaaS and hosting provider will soon expose a CLI or API surface that agents can drive directly. Even when they do not, anything Unix-shaped and text-based can be hacked into an agent-usable interface by the agent itself. The missing piece is payments. Once you insert programmable money (Naval mentions Bitcoin and crypto tokens), the agent can buy what it needs and the human is no longer the bottleneck.
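As a rough illustration of why text-based, Unix-shaped tools are easy for agents to drive, the sketch below wraps an arbitrary command line as a function that returns a structured result. The names `ToolResult` and `run_cli_tool` are hypothetical, chosen for this example and not taken from any agent framework; the only real API used is Python’s standard `subprocess.run`.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class ToolResult:
    ok: bool      # True when the command exited with status 0
    stdout: str   # text the command wrote to standard output
    stderr: str   # text the command wrote to standard error


def run_cli_tool(argv: list[str], timeout_s: float = 30.0) -> ToolResult:
    """Run a command without a shell and report what happened as plain data.

    Failures (missing binary, timeout, non-zero exit) come back as a result
    with ok=False rather than an exception, so the caller can read the error
    text and decide what to try next.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return ToolResult(ok=False, stdout="", stderr=str(exc))
    return ToolResult(ok=proc.returncode == 0,
                      stdout=proc.stdout, stderr=proc.stderr)


# Demo: the current Python interpreter stands in for "some CLI".
result = run_cli_tool([sys.executable, "-c", "print('hello')"])
print(result.ok, result.stdout.strip())
```

Text in, text out, and an exit code is already most of an API. Payments, as the paragraph above notes, are the part this kind of wrapper cannot supply.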

Is Pure Software Dead?

Naval poses the biggest strategic question of the episode. If models now speak fuzzy, sloppy English the same way humans do, and the historical reason we learned to code was to talk to machines that did not understand English, is pure software still a viable thing to build a company around? His own framing of the answer: hardware founders win, because the historically hard problem of hiring software artists (per Patrick Collison’s “software is art” line) is now mostly solved by AI. Model builders win, because training, post-training, and fine-tuning may be the new “real software engineering.” But what about classic pure software companies? Naval lets the question hang, and Guillermo picks up the answer through a different door.

The Block Economy and the Future of Infrastructure Software

Guillermo cites Mitchell Hashimoto’s recent piece on the block economy (or “building block economy”). The argument: the most valuable thing for agents to have access to is powerful, reusable building blocks. You do not want your agent reinventing a queue system every time it needs to send an email. You want it to grab the right-sized block (BMQ, ClickHouse, whatever) and move on. Reinventing primitives is also a civic problem. The world only works because we all depend on the same Postgres 13.2, the same protocols, the same standard infrastructure. If every agent went off and invented its own bespoke universe, you would lose interoperability. So infrastructure software (which is, by self-admitted bias, what Vercel builds) becomes more valuable in the agentic era, not less. Guillermo extends the metaphor: reusable building blocks are like a token cache. Why burn a trillion tokens reproducing what already exists when the agent can fork from a known starting point? The block economy is the answer to “is pure software dead.” Pure software that becomes the canonical primitive an agent reaches for is more valuable than ever.

Max Hodak’s Personal Proof: Years Without Code, Tons of Software Shipped

Max grounds the discussion in his own experience. He learned to program young, got sucked into it in his teens and 20s, knew programming languages deeply. He has not written a line of code in quite a while. And yet since December he has built a huge amount of personal software, including projects he had fantasized about for years and now actually uses every day. He did not write any of it. He cannot imagine going back to writing code by hand. The skill that ports forward is not syntax, it is the understanding of how APIs work, how data flows, what level of performance to expect, and how to orient the model around the right expectations for an operation. Guillermo extends this with the most quotable framing of the episode: a proficient engineering leader has always been “vibe coding through people on Slack and in one-on-ones,” transmitting intent and letting others execute. Agents are the same modality with a faster, cheaper, more literal counterparty.

Naval’s Return to Coding After Twenty Years

Naval offers his own parallel. He went from not having written code in twenty years to coding constantly through agents. What carried him back in was first-principles knowledge of software engineering and algorithms, which gets you further than you would think. The reason he had stopped coding in the first place was not lack of ability, it was the friction of keeping up with the latest language, the latest architecture, and the constant infrastructure plumbing required to ship anything. Vercel made it easier. Agents made it trivial. Max closes with the most concrete benefit of all: you do not get stuck anymore. The indefinite debugging spiral on some obscure narrow problem, the thing that historically ate weekends and broke spirits, is largely gone. The old mantra that programming is intrinsically frustrating and that frustration is “part of the deal” turned out to be wrong. The frustration was incidental, not essential.

Notable Quotes

“The way that I’m judging you as an engineer is, are you producing the factory that will produce multiplicative outputs B through Z?”
Guillermo Rauch, reframing what an engineer is actually being measured on in the AI era.

“When you’re operating in idea domains, intellectual domains, virtual digital domains, it’s not even 10x, it’s 100x or 1000x. It always has been.”
Naval Ravikant, on why the old 10x engineer debate was always under-stating the real distribution.

“If you choose the right thing to work on versus the wrong thing to work on, that’s an infinity difference. It could just be one who had a better judgment on what to work on in the first place.”
Naval Ravikant, on judgment as the multiplier that dwarfs raw skill.

“I’ll throw Codex, Claude, and Gemini at the same problem over and over and just waste tokens to save time. No matter how expensive these models might seem, they’re still way cheaper than a human.”
Naval Ravikant, on his brute-force multi-model coding workflow.

“Just waste tokens, save time. Don’t look at the tokens either as inputs or outputs. Just look at your time and look at the final output.”
Naval Ravikant, delivering the title thesis of the episode.

“Clearly the models at some point graduated. They used to be junior engineers, now they’re principal engineers, because they come back to you with a set of tradeoffs.”
Guillermo Rauch, on the qualitative shift in how current frontier models respond to prompts.

“Bro, we don’t put that kind of data into Postgres, you should consider ClickHouse or Athena or whatever. That’s happened to me a lot, which is really impressive.”
Guillermo Rauch, recounting unprompted architectural pushback from a recent model.

“It’s like saying speaking English. We had to learn code to communicate with the models, now the models speak English. So where’s the moat?”
Naval Ravikant, raising the central strategic question about the future of pure software.

“I haven’t written a single line of code in quite a while. Since December, I’ve built a huge amount of software that I now use every day, projects I’ve fantasized about for years.”
Max Hodak, on what becomes possible when you stop writing code and start directing agents.

“A proficient engineering leader has been quote unquote vibe coding through people on Slack or one-on-ones, because you’re transmitting your will, your intent, your experience, and you’re letting others run with it. Now we do the same with agents.”
Guillermo Rauch, reframing leadership itself as the original form of vibe coding.

Watch the full conversation on the Naval Podcast YouTube channel.

Waste Tokens to Save Time: Naval, Guillermo Rauch, Blake Scholl, and Max Hodak on AI Software Factories, 1000x Engineers, and Whether Pure Software Is Dead