On June 12, 2026, Anthropic published a statement announcing that the US government, citing national security authorities, has issued an export control directive forcing the company to suspend all access to its newest frontier models, Claude Fable 5 and Claude Mythos 5. The order technically targets foreign nationals inside and outside the United States, including Anthropic’s own foreign national employees, but the practical effect is that both models are going dark for every customer worldwide. It is the first publicly known instance of the US government ordering a deployed frontier AI model offline, and Anthropic is complying while openly disputing the basis for the decision.
TLDR
The US government delivered an export control directive to Anthropic at 5:21pm ET on June 12, 2026, suspending all access to Fable 5 and Mythos 5 over an alleged jailbreak of Fable 5’s safeguards. Anthropic says the letter contained no specific details, that the only evidence shared was verbal, and that the technique in question amounts to asking the model to read a codebase and fix software flaws, a capability the company says is freely available from other models including OpenAI’s GPT-5.5 and used daily by cyber defenders. Anthropic defends its defense in depth strategy, notes that thousands of hours of red teaming by the US government, the UK AISI, and third parties found no universal jailbreak, and warns that recalling a commercial model over a narrow, non-universal jailbreak would effectively halt all new frontier model deployments if applied industry-wide. Access to all other Anthropic models, including Claude Opus, Sonnet, and Haiku, is unaffected, and the company says it believes the situation is a misunderstanding and is working to restore access, with more details promised within 24 hours.
Thoughts
This is a watershed moment regardless of how it resolves. Governments have blocked AI exports before, but ordering a deployed commercial model recalled out from under hundreds of millions of users is a new kind of intervention, closer to a product recall than a trade restriction. The mechanism matters too. Export control authority aimed at foreign nationals, including a company’s own employees, that cascades into a global shutdown is a blunt instrument doing the work of a regulatory regime that does not exist yet. The US has no statutory process for recalling an AI model, so the government reached for the closest tool on the shelf, and the result is a precedent built on improvisation.
There is real irony in who got hit first. Anthropic has spent years arguing, publicly and in Washington, that governments should have the power to block unsafe AI deployments. Now the company that asked for a referee is the first one whistled, and its complaint is not about the existence of the power but about the process: a letter at 5:21pm with no specifics, verbal evidence only, and no transparent or technically grounded procedure. That distinction is the whole ballgame for AI governance. A power to halt deployments without due process standards is not regulation, it is discretion, and discretion cuts in every direction depending on who holds it.
The technical dispute underneath is genuinely interesting because it exposes how unsettled the definition of a dangerous jailbreak is. Anthropic’s account of the offending technique, asking the model to read a specific codebase and fix any software flaws, describes something security teams do on purpose every single day. Vulnerability discovery is the canonical dual use capability: the same analysis that lets a defender patch a hole lets an attacker find one. If the bar for recall is that a model can be coaxed into doing competent security analysis, then every capable model on the market fails that bar, which is exactly Anthropic’s point about GPT-5.5. The hard question the directive dodges is not whether Fable 5 can find bugs but whether it provides meaningful uplift beyond what is already freely available, and Anthropic says it does not.
For builders, the immediate lesson is uncomfortable: model availability is now a political variable, not just an engineering one. Teams that built directly on Fable 5 lost a production dependency overnight through no fault of Anthropic’s infrastructure, their own code, or any terms of service violation. Multi-model fallback strategies, abstraction layers over providers, and graceful degradation paths just moved from nice-to-have to table stakes for anyone running serious workloads on frontier models. The companies that absorbed this outage gracefully are the ones that assumed any single model could vanish.
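One way to picture the fallback and graceful degradation idea is the minimal sketch below. It is illustrative only: `Provider`, `ModelUnavailable`, `complete_with_fallback`, and the two stand-in model functions are hypothetical names invented for this example, not any vendor's SDK, and a real integration would wrap actual API clients behind the same interface.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


class ModelUnavailable(Exception):
    """Raised by a provider callable when its model cannot serve requests."""


@dataclass(frozen=True)
class Provider:
    name: str                       # hypothetical label such as "primary"
    complete: Callable[[str], str]  # takes a prompt, returns text or raises


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, text) from the first that works."""
    failures = []
    for provider in providers:
        try:
            return provider.name, provider.complete(prompt)
        except ModelUnavailable as exc:
            failures.append(f"{provider.name}: {exc}")
    # Graceful degradation: every provider failed, so return a fixed notice
    # instead of letting the exception crash the caller.
    return "degraded", "Model unavailable (" + "; ".join(failures) + ")"


# Stand-ins for real clients (assumptions for illustration only).
def suspended_model(prompt: str) -> str:
    raise ModelUnavailable("access suspended")


def backup_model(prompt: str) -> str:
    return f"[backup] {prompt}"


chain = [Provider("primary", suspended_model), Provider("backup", backup_model)]
source, text = complete_with_fallback("Summarize the changelog", chain)
print(source, text)
```

The calling code depends only on the `Provider` interface, so dropping a suspended model is a one-line change to `chain` rather than a rewrite.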
The next 24 hours matter more than the directive itself. Anthropic has promised more details, and the government will face pressure to either substantiate a concern that justifies a global recall or quietly walk it back. Either outcome sets the real precedent. If the directive holds on thin evidence, every frontier lab now operates under the threat of arbitrary shutdown. If it collapses under scrutiny, the case for a formal, transparent statutory process for AI deployment decisions, which Anthropic explicitly endorses in its own statement, gets a lot stronger in Congress than it was a week ago.
Key Takeaways
The US government issued an export control directive on June 12, 2026 suspending all access to Claude Fable 5 and Claude Mythos 5, citing national security authorities.
The directive formally targets access by any foreign national, inside or outside the United States, including Anthropic’s own foreign national employees.
The net effect is that Anthropic must disable Fable 5 and Mythos 5 for all customers worldwide to ensure compliance, not just for foreign users.
Access to all other Anthropic models, including the Claude Opus, Sonnet, and Haiku families, is not affected by the order.
Anthropic received the directive at 5:21pm ET the same day it published its statement, and says the letter did not provide specific details of the national security concern.
Anthropic’s understanding is that the government believes it has become aware of a method of bypassing, or jailbreaking, Fable 5’s safeguards.
Anthropic reviewed a demonstration of the specific technique and says it surfaced only a small number of previously known, minor vulnerabilities.
The company says other publicly available models can discover the same vulnerabilities without requiring any bypass at all.
Before launch, Fable 5’s safeguards were red-teamed for thousands of hours in total by the US government, the UK AISI, multiple private third-party organizations, and internal teams.
No tester has found a universal jailbreak for Fable 5, meaning a method that broadly bypasses safeguards and unlocks a wide range of cyber capabilities.
Anthropic openly states that perfect jailbreak resistance does not appear possible for any model provider today, and that every safeguard in the industry is vulnerable to non-universal jailbreaks.
Fable 5 was deployed under a defense in depth strategy: make jailbreaks either narrow or very expensive to produce, then combine that with monitoring to quickly detect and shut down successful attacks.
Anthropic’s 30-day customer data retention requirement for Fable exists specifically to support jailbreak research and mitigation, a policy the company says carries real costs with customers.
Anthropic says it has not received any disclosure of a concerning non-universal jailbreak that led to a harmful result; disclosed potential jailbreaks were benign or provided no Mythos-specific uplift.
The only evidence the government has provided is verbal, describing a narrow, non-universal jailbreak that essentially consists of asking the model to read a specific codebase and fix any software flaws.
Anthropic reviewed a report it believes is the basis of the directive and validated that the capability level shown is widely available from other models, including OpenAI’s GPT-5.5, and is used every day by cyber defenders.
Anthropic is complying with the legal directive while explicitly disagreeing that a narrow potential jailbreak justifies recalling a commercial model deployed to hundreds of millions of people.
The company warns that if this recall standard were applied across the industry, it would essentially halt all new model deployments for every frontier model provider.
Anthropic supports government power to block unsafe deployments in principle, but only through a statutory process that is transparent, fair, clear, and grounded in technical facts, and says this action meets none of those principles.
Anthropic apologized to customers, called the situation a misunderstanding, said it is working to restore access as soon as possible, and promised more details within 24 hours.
Detailed Summary
What the directive actually does
The order arrived as a letter from the US government at 5:21pm ET on June 12, 2026, invoking national security authorities under export control law. On paper it suspends access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, a category that includes some of Anthropic’s own employees. In practice, Anthropic says compliance requires abruptly disabling both models for every customer, since there is no clean way to enforce a nationality-based access boundary across a global product. The letter did not spell out the specific national security concern. Everything else in Anthropic’s statement is the company’s own reconstruction of what prompted the action.
The jailbreak at the center of the dispute
Anthropic’s understanding is that the government became aware of a method for bypassing Fable 5’s safeguards. The company reviewed a demonstration of the technique and characterizes the results as a small number of previously known, minor vulnerabilities, all relatively simple, all discoverable by other publicly available models without any jailbreak at all. According to Anthropic, the government’s evidence so far has been entirely verbal, and the technique boils down to asking the model to read a specific codebase and fix any software flaws. The company reviewed a report it believes underlies the directive and validated that the displayed capability is widely available elsewhere, naming OpenAI’s GPT-5.5 directly, and noted that this exact kind of analysis is what defenders use to keep systems safe.
Anthropic’s defense in depth posture
The statement restates the safety posture Anthropic laid out at Fable 5’s launch. The safeguards around cybersecurity tasks are strong enough that users have complained they are overly broad. In the weeks before launch, the US government, the UK AISI, multiple private third-party organizations, and internal teams red-teamed the safeguards for thousands of hours combined, and those tests showed Fable’s protections to be substantially more effective than any previously deployed model. No tester found a universal jailbreak. Anthropic is candid that perfect jailbreak resistance is likely impossible for anyone today, which is why the strategy is defense in depth: keep jailbreaks narrow or expensive, monitor aggressively, and shut down attacks fast. The 30-day customer data retention requirement on Fable exists to support that monitoring and mitigation loop. The company says this posture makes Fable’s risks comparable to models already deployed across the industry.
Complying while disputing the standard
Anthropic is removing access for all users as legally required, but the statement draws a hard line on the principle. The company disagrees that a narrow potential jailbreak, one that produced no disclosed harmful result, justifies recalling a commercial model serving hundreds of millions of people. Its broader warning is that this standard, applied evenly, would halt all new frontier model deployments industry-wide, since every provider’s safeguards are vulnerable to narrow jailbreaks. Anthropic also turns its own policy position into a critique: the company has publicly supported giving government the ability to block unsafe deployments, but through a statutory process that is transparent, fair, clear, and grounded in technical facts, and it says this action does not adhere to those principles.
What happens next
Anthropic closed by apologizing to customers, calling the situation a misunderstanding, and committing to restore access as soon as possible. The company promised to share more details over the next 24 hours, which makes this a developing story. The open questions are whether the government substantiates its concern with written technical evidence, whether the directive survives that scrutiny, and whether this episode accelerates the formal statutory process for AI deployment decisions that Anthropic says should have governed the action in the first place.
Notable Quotes
“The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance.”
Anthropic, on why a directive aimed at foreign nationals becomes a global shutdown
“We received the directive from the government today at 5:21pm (ET). The letter did not provide specific details of its national security concern.”
Anthropic, on the abruptness and opacity of the order
“These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass.”
Anthropic, on its review of the demonstrated jailbreak technique
“We suspect that perfect jailbreak resistance is not currently possible for any model provider.”
Anthropic, restating the position it disclosed at Fable 5’s launch
“We stand by this defense in depth strategy. It reduces the risks posed by Fable, making them comparable to the risks of existing models already deployed across the industry.”
Anthropic, defending its layered safeguards approach
“To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws.”
Anthropic, describing the technique behind the directive
“However, we disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people.”
Anthropic, on complying while contesting the decision
“If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.”
Anthropic, on the industry-wide implications of the recall standard
“As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles.”
Anthropic, on the kind of oversight process it says should have governed the action
“We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible.”
Anthropic, apologizing to customers and describing its plan to restore access
Mark Zuckerberg, Priscilla Chan, and AI researcher Alex Rives sat down with the No Priors podcast to explain why CZI Biohub became the primary focus of their philanthropy, why they committed $500 million to a virtual biology initiative, and why they are giving the resulting AI models away as open source instead of building a company. The conversation moves from a goal that Nobel laureates once laughed at, curing, preventing, and managing all disease by the end of the century, to a concrete technical strategy: build world models of biology layer by layer, from proteins to cells to whole systems, and put them in every scientist’s hands.
TLDW
This is the clearest public articulation yet of how the Chan Zuckerberg Initiative thinks about AI and biology. The throughline starts a decade ago, when Zuckerberg and Chan asked scientists how to cure all disease and learned the real bottleneck was tooling, siloed labs, and unshared knowledge, not a lack of ambition. That insight produced the Human Cell Atlas, the CELLxGENE annotation tool, and a corpus of single-cell transcriptomics that large language models could finally make sense of. Now Biohub couples a frontier AI lab with frontier wet-lab biology in a single organization spanning San Francisco, New York, and Chicago, organized around the virtual biology initiative and the long-term goal of a virtual cell. Alex Rives, the AI researcher behind the ESM protein language models, walks through their newly released ESM-based world model of protein biology. Trained on billions of protein sequences, it predicts atomic-resolution structures at high speed, has been used to fold over 1.1 billion proteins, designs novel proteins and single-chain antibodies as an emergent property, and produced nanomolar binders from a single 96-well plate. The discussion covers mechanistic interpretability as a way to extract genuinely new biological knowledge, personalized medicine driven by understanding the chain from gene variant to protein to disease, predicting off-target toxicity before human trials, rare-disease patient organizing, the baby KJ CRISPR case, biosafety tradeoffs of open source, talent and why frontier biology plus frontier AI is a recruiting moat, and what success looks like five years out.
Thoughts
The most important claim in this conversation is also the easiest to miss because it is delivered casually: protein design is an emergent property of a model that was never asked to design proteins. Rives is explicit that they did not build a model for antibodies and did not build a model to bind a particular target. They built a model that understands proteins, trained on raw sequence with a next-token objective, and protein design, structure prediction, and antibody generation fell out of it. That is the language-model bet transplanted into biology, and the fact that it produced nanomolar binders, the threshold for actual therapeutic activity, in a single 96-well plate rather than a high-throughput screen of millions is the kind of result that quietly resets what a small team can attempt. If that generalizes, the cost curve for “design a molecule” bends the same way the cost curve for “write working code” did.
What makes the strategy coherent, rather than just a well-funded AI lab, is the insistence that the wet lab and the AI lab are a single effort. Most of biology’s useful data does not exist on the internet the way human language does. You cannot pay a factory to produce it. Someone has to invent the cellular engineering in New York, the inflammation-sensing devices in Chicago, the translucent-zebrafish imaging, and that is the actual product of frontier biology: new instruments that generate data nobody has ever seen, which in turn make new classes of models possible. This is the part venture-backed competitors will struggle to replicate, because it requires patience measured in 10 to 15 year horizons and a willingness to spend on data generation that has no business model attached. Zuckerberg is almost dismissive about it, noting they could probably run it as a business but that not having to think about monetization is strategically simplifying. The nonprofit structure is not charity window-dressing here. It is what lets them release the models as an open discovery engine and harness the entire academic and biotech field rather than competing with it.
The mechanistic interpretability thread deserves more attention than it will get. Interpretability has mostly been a safety and alignment story for language models, a way to peer inside the black box and check that the representations match our understanding of the world. Rives flips it: the protein models have been trained on both known and unknown biology, billions of sequences including proteins we understand nothing about, and they are building representations that connect the unknown proteins to the known ones through an underlying structural grammar. The promise is that interpretability becomes a discovery tool, not just an audit tool. You open the box and find biology the field has not characterized yet, the mechanism of action for a treatment, a system in the body nobody mapped. That is a fundamentally more optimistic use of the same toolkit, and it is the part of the launch Sarah Guo and Elad Gil both flag as the most interesting.
Chan’s framing of personalized medicine is worth sitting with because it reframes the entire goal away from “cure disease X.” She wants to treat the individual as an individual: understand this person’s genetics, their risk profile, the mechanistic chain from a specific gene variant through a protein to a disease process, and then design a drug bespoke to them. The current reality she describes, sitting in PubMed reading a paper’s supplement asking “am I represented in this cohort,” guessing whether a drug that kind of impacts a pathway that is probably implicated might do something, is a brutal and accurate picture of how non-standard cases are actually handled today. The vision is generalizable tools delivering personalized answers, which is the same put-the-tool-in-the-individual’s-hands philosophy Zuckerberg applies to open-source AI and, by his own analogy, to social media. Whether you find that analogy reassuring or not, the consistency of the worldview is real: they genuinely do not believe in a central super-intelligence solving science, and the whole architecture follows from that.
The honest gap they name is the clinic. Chan is candid that the science will start moving fast but that translating to patients requires changing how clinical research itself works, and that part is still shaping up. The most interesting near-term lever is not a virtual FDA trial but the recruitment and economics flip for rare disease: patient groups self-organizing registries, biobanks, and natural-history studies, compressing timelines from decades to a handful of years, paired with models that lower the cost of generating a candidate. The baby KJ case, a custom CRISPR therapeutic to edit a single mutation, delivered to liver cells specifically because that target was deliverable, is the proof of concept for why disease selection and delivery creativity matter as much as the molecule. The molecule is becoming the cheap part. The rest of the chain is where the next decade of work actually sits.
Key Takeaways
CZI Biohub is now the primary philanthropic focus of the Chan Zuckerberg Initiative, a shift the team formalized in the past year.
They committed $500 million to the virtual biology initiative, the unifying theme across the Biohubs.
The original goal, set roughly 10 years ago, was to cure, prevent, and manage all disease by the end of the century. Zuckerberg now thinks “end of the century” is too conservative.
Nobel Prize winning scientists initially laughed at the all-disease ambition. When pressed for why it was impossible, the real answers were silos, locked-up unpublished information, and the inability to build shared tools.
The recurring example: a postdoc builds a great tool, it lives on their computer, they graduate, and the tool is gone. Shared, durable tooling was the missing layer.
CZI is explicit that they are not the ones who will cure diseases. Their role is building tools that accelerate the entire scientific field so the field collectively cures them.
The first request for applications targeted single-cell sequencing, funding methods so scientists could share how to do it.
That work led to funding the Human Cell Atlas, now one of the largest databases of single-cell transcriptomics.
They built CELLxGENE, a simple annotation tool, around which a community formed and contributed data CZI had nothing to do with creating. It is now a corpus underpinning many transcriptomic models.
Critics called the data gathering “stamp collecting.” The arrival of large language models, which can make sense of large amounts of data, answered that critique.
The ambition is to move biology from a discovery-based science to an engineering-based science, systematically understanding how living cells work and why things go wrong.
Biohub couples a frontier AI lab with a frontier biology effort. Unlike language models, biology lacks abundant internet-scale data, so new science is required to generate the data the models need.
The Biohubs are specialized: New York focuses on cellular engineering and Chicago builds devices to measure things like inflammation, alongside imaging work and developmental studies in translucent zebrafish.
Alex Rives, who built the ESM protein language models and founded EvolutionaryScale after working at Meta FAIR, now leads the AI effort. The team raised venture capital before joining CZI’s nonprofit structure.
The strategy is hierarchical: model proteins first, then cells, then whole systems, because you cannot understand cells without understanding protein interactions.
They collect data strategically to bridge across the hierarchy, for example spatial transcriptomics showing where RNA localizes within a cell, and sensors that observe cell-to-cell communication.
The newly released ESM-based model is a world model of protein biology, trained on billions of protein sequences, predicting atomic-resolution structure extremely fast at a Pareto-optimal frontier of speed and accuracy.
They predicted structures for over 1.1 billion proteins and used mechanistic interpretability to identify features connecting them.
The model hits state of the art on structure prediction benchmarks, especially protein-protein and protein-antibody interactions, which are critical for therapeutic design.
Protein and antibody design are emergent properties. They designed a model to understand proteins, not to bind any specific target, and design capability fell out of it.
In one experiment, they selected from hundreds of thousands of digital trajectories, synthesized 96 proteins in a single 96-well plate, and found nanomolar binders, the threshold for therapeutic activity.
Results were validated with the Biohub’s cryo-EM microscopes and structural biology center, confirming function and atomic-resolution binding interfaces.
Mechanistic interpretability is reframed as a discovery tool: open the black box to find biology nobody has characterized, not just to audit the model.
Chan’s vision of personalized medicine: understand a person’s genetics, the mechanistic chain from gene variant to protein to disease, then design a bespoke drug and intervene.
A comprehensive model of how cells work could predict off-target effects, like a receptor on kidney cells causing renal toxicity, before human trials.
They study systems rather than individual diseases. Inflammation is a major Chicago focus because it connects to many diseases.
A typical drug trial runs about 15 years and $1.5 billion. Only roughly $50 million is the molecule and preclinical work. The other $1.45 billion is drug development, much of it gated on regulation, recruitment, and failures from toxicity or absorption.
The baby KJ case at CHOP delivered a custom CRISPR therapeutic to edit a single mutation, chosen carefully because his liver cells were a deliverable target.
CZI’s “Rare As One” program supports rare-disease patient groups self-organizing registries, biobanks, and even their own clinical trials, compressing gene-therapy timelines from decades to 3 to 5 years.
Letting people opt in to frontier trials, while preserving historical vetting for the general population, is named as a key shift that could accelerate biology.
The open-source philosophy mirrors Zuckerberg’s broader ethos: empower individuals with tools rather than centralizing power in a few institutions or a single super-intelligence.
Biosafety is acknowledged as a real consideration that open-source biology will need to balance and handle carefully.
On talent: AI researchers could join any frontier lab, but no other organization pairs frontier biology with frontier AI, which is the recruiting moat.
You do not need a huge team. Zuckerberg argues real AI progress can come from a strong group of a dozen or a couple dozen people.
Researchers have been connecting the released model to agentic systems to automate the entire protein design process.
The next big challenge is the virtual cell: a system that models the proteomic, genetic, and transcriptomic layers and connects them to phenotype, generalizing to interventions it was never trained on.
Like every lab, Biohub is compute and data constrained, constantly deciding whether to double down on proteins or push further into cellular work.
Five-year success: a hierarchical set of world models of biology, plus the highest-quality, uniquely contributive work in the world, produced by a setup the team believes no other organization has.
The biggest update of the past year: formalizing Biohub as the philanthropy’s core, and flipping leadership from biologists interested in technology to an AI researcher with a biology background.
Zuckerberg’s read on the broader industry: the exponential curve is on track and still accelerating, which validates making a very big long-term investment.
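The cost split in the drug-development takeaway above can be checked with a few lines of arithmetic. This sketch uses only the rounded figures cited in the conversation (about $1.5 billion in total, roughly $50 million for the molecule and preclinical work); the variable names are made up for illustration.

```python
TOTAL_COST_USD = 1_500_000_000     # about $1.5 billion, as cited
PRECLINICAL_COST_USD = 50_000_000  # roughly $50 million, as cited

# Everything that is not molecule and preclinical work.
development_cost_usd = TOTAL_COST_USD - PRECLINICAL_COST_USD

# Share of the total that cheaper molecule design could address directly.
preclinical_share = PRECLINICAL_COST_USD / TOTAL_COST_USD

print(f"development: ${development_cost_usd:,}")
print(f"molecule + preclinical share: {preclinical_share:.1%}")
```

On these figures, even a free molecule would trim only about 3 percent from the total, which is why the conversation keeps returning to regulation, recruitment, and trial design.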
Detailed Summary
From “cure all disease” to a tooling problem
The origin story is a decade old. Zuckerberg and Chan wanted to build an organization that could cure, prevent, and manage all disease by the end of the century, and a series of meetings with famous, Nobel Prize winning scientists produced laughter rather than encouragement. Instead of retreating, they kept asking why it was impossible. The answers, once scientists relented, were not about biology being too hard. They were about how science is organized: researchers work in silos, unpublished information stays locked up for long periods, and there is no good way to build and share durable tools. The image that stuck was a postdoc building an excellent tool that lives on a single computer and vanishes when that person graduates. The bottleneck was infrastructure and shared knowledge, and that is where CZI decided it could contribute.
The path from single-cell sequencing to a world model
The original Biohub model brought engineers and scientists together across universities for long-term tool development, and it worked. CZI’s first request for applications targeted single-cell sequencing, funding the methods so scientists could share how to read the RNA transcribed in individual cells. That seeded the Human Cell Atlas, now one of the largest single-cell transcriptomics databases. When annotation became a bottleneck, CZI built CELLxGENE, a simple annotation tool, and a community formed around it and contributed data CZI never funded. Critics dismissed it as stamp collecting, gathering bits of data without extracting wisdom. Then large language models arrived and demonstrated they could make sense of exactly that kind of large-scale data, and Chan describes the delight of realizing the missing engine had appeared.
Frontier AI married to frontier biology
The unifying theme is the virtual biology initiative, and the structural insight is that the AI effort and the wet-lab effort are a single integrated organization, not two collaborating ones. Biology lacks the internet-scale data that language models enjoy. You cannot buy the data from a factory. So Biohub invents the science that generates it: cellular engineering in New York to record what happens inside the body, devices in Chicago to measure inflammation, imaging to visualize the previously invisible, and translucent zebrafish to watch development unfold across cells as the brain forms. Each new instrument creates a new dataset, which enables a new class of model. Rives, who built the ESM models and founded EvolutionaryScale before joining, frames this as the start of a new era of science, where systems that predict the next token can learn world models of biology from the data, provided you build at the right scale with the right people.
Building biology hierarchically
The team is deliberate that each layer of biology is qualitatively different and must be built up in order. You cannot jump to cells without understanding protein interactions, and you cannot model the immune system without first understanding cells. So the approach starts with the building blocks, the proteins, and ladders upward. The advantage of a single integrated effort is the ability to gather data that connects the hierarchy: spatial transcriptomics that shows where RNA localizes inside a cell, sensors that capture cell-to-cell communication, developmental imaging in zebrafish. That connective tissue is what lets the modeling generalize across levels. The interviewer, a former wet-lab biologist with a PhD, notes that the reductionist and systems camps of biology historically never worked together deeply, and that bridging them is one of the genuinely novel things about the effort.
The ESM-based protein world model
The launch at the center of the conversation, roughly a week old at recording, is an open system for scientific discovery in protein biology: a language-model-based world model trained on billions of protein sequences. It learns emergent representations of protein biology and predicts atomic-resolution structure at blazing speed, sitting on a Pareto-optimal frontier of speed and accuracy. They folded over 1.1 billion proteins and used mechanistic interpretability to identify features connecting them. It reaches state of the art across structure-prediction benchmarks, with particular strength on protein-protein and protein-antibody interactions that matter for therapeutics. The headline result: they used the model to design proteins and single-chain antibodies digitally, selected from hundreds of thousands of trajectories, synthesized just 96 in a single well plate, and found nanomolar binders, replacing high-throughput screens of millions of antibodies. Validation came from the Biohub’s cryo-EM structural biology center, confirming both function and the atomic-resolution binding interfaces.
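For readers unfamiliar with the term, a Pareto-optimal frontier is the set of options that nothing else beats on both axes at once. The sketch below is a minimal illustration of that idea only. The model names and the speed and accuracy numbers are made-up placeholders, not benchmark results from the launch.

```python
def pareto_frontier(points):
    """Return the names of points that no other point dominates.

    Each point is (name, speed, accuracy); higher is better on both axes.
    A point is dominated if another point is at least as good on both axes
    and strictly better on at least one.
    """
    frontier = []
    for name, speed, accuracy in points:
        dominated = any(
            (s >= speed and a >= accuracy) and (s > speed or a > accuracy)
            for _, s, a in points
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Hypothetical models: (name, structures per second, accuracy score).
candidates = [
    ("A", 1.0, 0.90),   # slowest, most accurate
    ("B", 10.0, 0.88),  # a little less accurate, much faster
    ("C", 5.0, 0.80),   # slower AND less accurate than B, so dominated
    ("D", 50.0, 0.70),  # fastest, least accurate
]

frontier = pareto_frontier(candidates)
print(frontier)  # ['A', 'B', 'D']
```

Sitting on the frontier does not mean being best on either axis alone. It means there is no other model you could switch to that is both faster and more accurate.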
Interpretability as discovery, and personalized medicine
Rives reframes mechanistic interpretability, usually aimed at language models, as a way to extract new biological knowledge. The protein models are trained on both known and unknown biology and develop representations that connect uncharacterized proteins to understood ones through an underlying structural grammar. Opening that black box could reveal systems in the body or mechanisms of action for treatments that the field has never mapped. Chan connects this to a personalized-medicine vision: understand an individual’s genetics and the mechanistic chain from gene variant to protein to disease, then design a bespoke intervention. She contrasts it with today’s reality of reading PubMed supplements and guessing whether you are represented in a study cohort. For some diseases, simply knowing which gene variants cause disease is already empowering. For others, the chain is understood and the missing piece is the ability to change a protein’s function, which is where designed proteins could actually cure.
Drug development, off-target effects, and rare disease
The interviewers press on translation, noting that a typical trial runs 15 years and $1.5 billion, of which only about $50 million goes to the molecule and preclinical work. The rest is development, gated on regulation, recruitment, toxicity, and absorption failures. Chan’s hope is that comprehensive cell models predict off-target effects, like an unanticipated receptor on kidney cells causing renal toxicity, before human trials. They study systems such as inflammation and the immune system rather than chasing individual diseases. The baby KJ case at CHOP, a custom CRISPR therapeutic editing a single mutation delivered to liver cells, illustrates how careful disease and delivery selection unlocks first applications. The “Rare As One” program shows rare-disease patient groups self-organizing registries, biobanks, and trials, compressing timelines from decades to a few years, and once the molecule itself becomes cheap, the economics of the long tail of niche diseases flip.
Open source, talent, and the five-year view
Zuckerberg ties the open-source posture to a consistent worldview: empower individuals with tools rather than centralizing intelligence in a few institutions. He does not believe in a single super-intelligence solving all of science, and sees decentralization, the same instinct behind giving people a voice, as how progress is historically made, with biosafety as a real tradeoff to manage. On talent, the pitch is that frontier biology attached to frontier AI is work you cannot do anywhere else, and that meaningful progress needs only a dozen or two dozen strong people, not thousands. Researchers are already wiring the model into agentic systems to automate design. The next frontier is the virtual cell, modeling proteomic, genetic, and transcriptomic layers and connecting them to phenotype with enough generality to answer untrained questions. Five years out, success is a hierarchical set of world models and doing uniquely high-quality work, with Chan adding that the teams are now “arms linked,” directed and interlocked rather than merely moving in the same direction.
Notable Quotes
“We didn’t design a model for antibodies. We didn’t design a model to be able to bind one particular target. We just designed a model that could understand proteins.”
Alex Rives, on protein design emerging from a general model
“The theory isn’t that we’re going to cure the diseases. We’re not. It’s that we want to help accelerate the pace of progress for the whole scientific field.”
Mark Zuckerberg, on why CZI builds tools rather than cures
“My goal is to be able to treat the individual as an individual, understand the mechanisms and be able to intervene.”
Priscilla Chan, on the vision for personalized medicine
“It’s not just like there’s some factory somewhere that you can pay to produce the data. You actually need to invent new novel scientific approaches.”
Mark Zuckerberg, on why frontier biology has to generate its own data
“If we could design a protein to actually change the physiology, then we can actually cure someone.”
Priscilla Chan, on the payoff of protein design
“You open up the black box and you can actually understand the biology that the model is representing.”
Alex Rives, on mechanistic interpretability as a discovery tool
“We don’t believe in this like very centralized future where there should be a small number of institutions that basically are advancing all this stuff.”
Mark Zuckerberg, on the open-source ethos behind Biohub
“Before we had amazing teams moving generally in the same direction. But now we are arms linked moving together.”
Priscilla Chan, on how the Biohub teams now operate under Alex Rives
Watch the full conversation with Mark Zuckerberg, Priscilla Chan, and Alex Rives on the No Priors podcast here.
Related Reading
CZI Biohub Network: the official program page for the San Francisco, New York, and Chicago Biohubs discussed throughout.
EvolutionaryScale: Alex Rives’s lab and the home of the ESM protein language models behind the world model in this conversation.
Human Cell Atlas: the single-cell transcriptomics effort CZI funded that became foundational to modern cell modeling.
AlphaFold (Wikipedia): background on the protein-folding breakthrough referenced as an early proof that structure prediction was tractable at scale.
Rare As One: CZI’s program supporting patient-led rare-disease research organizations described near the end of the talk.
This is the full episode of Naval Ravikant’s conversation with three frontier founders: Guillermo Rauch of Vercel, Blake Scholl of Boom Supersonic, and Max Hodak of Science. The premise is that all three are building their own factories rather than assembling off-the-shelf parts, so the interesting question is not what they are building but what they are learning about how to build in the age of AI. Over roughly an hour the discussion moves from software factories and the thousand-x engineer into hardware, regulation, healthcare economics, autonomous companies, and a long closing argument about what humans can still uniquely do. Watch the full conversation on the Naval Podcast YouTube channel. We previously published two segments of this same discussion: part one, Waste Tokens to Save Time, on software factories and whether pure software is dead, and part two, Vibe Coding Hardware, on jet engines, vertical integration, and China’s open-source bet. This post covers the entire episode end to end.
TLDW
Four builders argue that AI has turned the engineer’s job from shipping output into building the factory that produces output, which is why token leaderboards are the new vanity metric and why you should waste tokens to save time. Guillermo Rauch frames the thousand-x engineer and the building-block economy, and asks whether pure software is dead now that models speak English. Blake Scholl shows how Boom turned hardware engineering into software, letting two engineers design an entire jet engine and collapsing months of regulatory compliance documentation into minutes. Max Hodak makes the case for extreme vertical integration, a captive MEMS foundry, and a sober counter to Silicon Valley deregulation triumphalism: the bottleneck is the voters and the regulator’s asymmetric incentives, not just bad rules. The group works through healthcare as a fixed-bucket non-market, China’s cost-reduction strategy and its approved implantable brain interface, autonomous software that runs site reliability and security research with thousands of concurrent agents, a company-wide hackathon where the receptionist shipped a real automation, and a long debate on creativity, out-of-distribution surprise, intent, attribution, and the definition of art. The throughline: humans become verifiers, value moves to creativity, taste, and agency, and the single best move is to get extremely good with the tools, because it is people with AI versus people without AI.
Thoughts
The strongest idea in the episode is the quiet redefinition of what an engineer is for. Rauch’s point is that you no longer judge a person by how well they ship a single output. You judge them by whether they can build the factory that produces outputs B through Z. That reframe instantly explains why token leaderboards are nonsense. Counting tokens consumed is the same category error as counting lines of code written, a measure of motion mistaken for a measure of progress. Naval’s “waste tokens, save time” is the correct response: tokens are cheaper than people, so optimize for your own wall-clock time and the final output, and throw three models at the same problem if that gets you unstuck faster. The uncomfortable corollary, which the group says out loud, is that leverage in idea domains was never linear. The hundred-x and thousand-x engineer is not a new phenomenon. AI just made it impossible to keep pretending otherwise.
The second thread that ties the whole hour together is verification. Everyone converges on the same future: humans stop producing the work directly and move up the stack to signing off on it. Rauch is precise about what that means. Saying “I understand this pull request” no longer requires reading every line. It requires being able to say you wrote the test harness, the proofs, the type checkers, and the simulations that let you stand behind it in production. That is a profound shift, because it accepts that the code may be spaghetti you do not fully understand while insisting that the evaluator around it is trustworthy. Blake extends the same logic to regulation, and this is the most underrated argument in the episode. If you treat a 200-page lightning-strike compliance document as a test suite and a regulation as an exit criterion for an agent loop, then a body of rules you once resented becomes a guard rail that lets you move faster, not slower. The cost of change collapses, change aversion drops, and you can finally afford to iterate on physical things.
Max Hodak is the adult in the room on regulation, and the episode is better for it. The Silicon Valley consensus is that regulation is simply friction to be deleted, and there is plenty of dysfunction to point at: the NRC permitting essentially zero nuclear plants for decades, the FDA’s asymmetric incentives where approving a bad drug ends a career but blocking a good one costs nothing visible. But Hodak keeps pulling the conversation back to the harder truth. This is where the voters are. If you removed the current regulatory package, something very similar would get voted right back in, because the asymmetry reflects how the public actually weighs a visible death against an invisible delay. Real reform is not “deregulate,” it is narrow and surgical: prohibit the FDA from drawing adverse inferences across different users of a compound, build innovation zones where people consent to different rules, or copy Europe’s notified-body model so review capacity can actually scale. That is a far more serious position than the usual abundance-or-bust framing.
The healthcare segment is the part of this conversation you will not find in the two clips, and it is the most heterodox. Hodak’s diagnosis is that healthcare is a fixed bucket of money that grows with tax receipts, not a technological growth industry where falling prices expand the market the way phones and laptops did. Because there is no real private market, you get a small communist society running inside a larger capitalist one, with the waiting lines and frozen product quality that implies. His prescription is not single payer and not insurance reform. It is to drive the cost of bringing devices and drugs to market so low that a patient can buy a restored sense or an extra decade of life on a credit card, the way they finance a car, and his warning is that China’s lower approval costs and its already-approved implantable brain interface put it on track to do exactly that. Whether or not you buy the twenty-percent-of-income deductible that Naval floats in response, the framing that a private market is the missing feedback loop is the kind of argument that gets too little airtime.
The closing debate on creativity is where the four of them disagree most productively, and they are careful enough to notice that their conclusions follow from their definitions. Hodak defines art as meaningful out-of-distribution behavior, which lets a military maneuver or a math proof count, and leads him to think a sufficiently capable model gets there too. Naval defines art as conveying an emotion with intent, which makes attribution load-bearing: the same photo down to the last pixel means more when a human took it, and a startup doing hardware attestation of human authorship suddenly has a real market. The shared observation that should worry every builder is that AI output collapses to a distribution mean. Every Claude-built website ends up the same serif font, the same brown and cream, the same monospace spacing, recognizable as slop precisely because it is in-distribution. The optimistic read, and the one Naval lands the episode on, is that this leaves an enormous and durable lane for humans who can step outside the system, and that the practical move for everyone is simply to become excellent with the tools, because the real divide is people with AI versus people without.
Key Takeaways
The job of an engineer has shifted from shipping a single output to building the factory that produces multiplicative outputs, so people are now judged on the leverage they create rather than the work they personally do.
There were always 10x engineers, and in idea, intellectual, and digital domains the real spread is 100x or 1000x. AI leverage just made that gap impossible to deny.
Token leaderboards and token consumption are the new lines-of-code: a measure of activity that does not map to value. Measure your own time and the final output instead.
Waste tokens to save time. Models are still far cheaper than a human, so throwing Codex, Claude, and Gemini at the same problem repeatedly is rational even when it looks wasteful.
Low-quality first-pass code is fine because you can spend more tokens later to harden it for production. The constraint is verifiable domains, not code quality.
A model is roughly as good as you are in a domain. The quality of your prompting and reprompting strongly determines the output, though this dependence should fade as models improve.
Models graduated from junior to principal engineers: they now return with multiple routes and tradeoffs rather than running away with the first idea, even if their time and cost estimates are often wrong.
A junior gets knowledge they could never have produced alone, but an experienced architect still extracts far more juice. Taste and judgment, like picking Postgres versus ClickHouse, remain the human’s edge.
Pure software’s moat is in question now that models speak fuzzy, sloppy English. For hardware founders this is a boon, since good software finally becomes cheap to produce.
The building-block economy, from Mitchell Hashimoto, argues agents need powerful reusable infrastructure rather than reinventing queues and databases every time. Shared dependencies are a cooperation value, like everyone depending on the same Postgres version.
Naval and Max both stopped writing code for years, then started building software they use daily through agents, on the strength of understanding how the pieces fit rather than syntax.
With agents you stop getting stuck on narrow debugging problems that used to consume indefinite time. The intrinsic frustration that was once “how you learn” is largely gone.
Boom turned siloed hardware engineering, much of it trapped in Excel and VBScript with no source control, into real software with automated testing and repeatable flows.
Software engineers now build the architectures and hardware engineers vibe code their pieces, letting two engineers design an entire jet engine. Previously, analyzing a single turbine blade took one engineer a full day, and an engine has about a thousand blades.
Enterprise collaboration software and even spreadsheets are getting cooked, because you can now code the exact custom tool you need instead of approximating it.
AI will soon generate STEP files and PCB layouts, bringing the current software boom to mechanical and electrical engineering, likely within the year.
China is betting on open-source models because its hardware and supply-chain superiority pairs with on-demand software generation to erase Silicon Valley’s software advantage. Fall behind on generating software and you fall behind on generating everything.
In real usage, frontier intelligence dominates the top. Gemini “slaps at scale” as an industrial production model for support and browser automation, while Chinese models are not in the frontier coding tier.
Intelligence is an unalloyed good. Because mistakes are invisible and models are cheaper than people, you reach for the smartest available model rather than running a weaker one many times.
Max’s vertical integration thesis: when you cannot buy a part, you make it. Science owns a captive MEMS foundry because tighter integration toward a single block of bonded matter yields lower power, smaller size, and longer life.
AI’s biggest near-term impact inside hardware companies is regulatory: generating documentation and tracing which of thousands of ISO standards apply, work that used to occupy a quality team for months.
Junior engineers got promoted to senior and junior engineering got handed to agents. The same pattern hits law, where basic NDAs and redlines no longer require a lawyer.
Humans are becoming verifiers. Signing off on a PR means standing behind its consequences via tests, proofs, and type checkers, not reading every line. Creating software is easy; keeping it secure, tested, and maintained 1000 days out is the real question.
A RAG over regulatory documents collapses a 200-page compliance test plan from months to minutes, which cuts change aversion: you can alter the airplane and regenerate compliance instead of crying over rework.
Regulations can act as a test suite and exit criteria for agent loops, as long as they are non-contradictory and reasonable. The alternative is shipping slop directly into the air.
Physical building is guilty until proven innocent, illustrated by the absurdity of pre-filing a driving plan before every trip. The fix is more enforcement-based regulation rather than pre-approval, though agents on both sides could trigger a Red Queen race and DDoS overwhelmed agencies.
Regulation often fails to make things safer, only slower: the 737 MAX shipped a single sensor with full authority over pitch, and the NRC kept us perfectly safe by approving almost no nuclear plants for decades.
The deeper problem is the voters and the regulator’s asymmetric incentives. Approve a bad thing and your career ends; block a good thing and nobody notices. Removing one agency just elects its replacement.
Targeted fixes beat blanket deregulation: bar adverse inferences across users of a compound, use single-patient IND pathways, create opt-in innovation and YIMBY zones, or adopt Europe’s competitive notified-body reviewers.
Healthcare is a fixed bucket of money tied to tax receipts, not a growth industry, so spending 10x more on it would be a catastrophe rather than a triumph. With no private market you run a small communist society inside a capitalist one.
The escape is lower cost-to-market, not single payer, so people can finance care like a car. China’s lower approval costs and its already-approved implantable BCI point that direction. LASIK, dental, and plastic surgery advance because patients pay directly.
N-of-1 medicine works at the high end, as with GitLab’s Sid Sijbrandij outliving his cancer prognosis through a self-built escalation ladder, but it demands enormous agency at the patient’s weakest moment. AI should democratize that knowledge.
Vercel automated much of site reliability engineering: anomalies fire alerts, an agent investigates, can open an incident, and begins remediation, stopping just short of changing production itself.
Running an open-sourced security tool against the whole monorepo with 10,000 concurrent agents produced several quarters of security research in a couple of days for about $14,000 in tokens. Code translation and optimization are similarly autonomous now.
Blake stopped all project work for a week and had everyone, receptionist to engineers, build something with AI and demo it. He expected mostly silly projects and got mostly needle movers, including a real automation from shipping and receiving.
The autonomous company of the future may have a workforce that trains the agents doing the work rather than doing it directly, with tooling that extracts reusable skills from your inputs and outputs.
Returns are shifting from intelligence toward agency for humans, since agents supply the intelligence. The people best fit for the future open a coding agent and ask what to build instead of defaulting to passive consumption.
Maybe 10x more people are coding than a year ago, yet around 99% of people still never will, because to a non-coder the starting step remains unimaginable. Vibe coding is described as more addictive and entertaining than video games, with real output.
AI video lacks taste and judgment for now, but by 2030 expect fan-made films: dozens of Lord of the Rings takes, or generating unmade seasons of The Expanse from the books. The bigger prize is a genuinely new imaginative work, not a remix.
What humans uniquely do is generate meaningful surprise out of the training distribution, with intent that makes it mean something. Gödel stepping outside the formal system is the archetype; Claude’s identical-looking websites are the counterexample of in-distribution slop.
Higher productivity historically means you hire more, not fewer, of the productive people. Expect a larger number of smaller teams, an entrepreneurship explosion, and generalists winning as credentials matter less than creativity, taste, and judgment.
The throughline is people with AI versus people without AI. The single best investment right now is getting genuinely good with the tools and learning the exact edges of what they can and cannot do.
Detailed Summary
Software Factories and the Thousand-X Engineer
Guillermo Rauch opens with the idea that has him “pilled”: the engineer’s job has changed from shipping output directly to building the factory that produces multiplicative outputs. That reframes how you evaluate people and surfaces an old, controversial truth. He used to get flamed on Twitter for asserting 10x engineers, since it offends an equality instinct, but in intellectual and digital domains the real spread is 100x or 1000x, and choosing the right thing to work on is an infinite multiplier on top. AI leverage makes this less controversial, except that people now confuse token spend for productivity. The group agrees token leaderboards are the new lines-of-code. Max Hodak adds that a model is about as good as you are in a domain, so a capable developer gets a powerful collaborator while a junior gets junior-grade help, and the sporadic feedback you give, the reprompting, disproportionately determines the result. Naval’s posture is the opposite of fussy: he ignored every prompt-engineering trick on the bet that the models would improve faster than he could learn to game them, types less and less, and brute-forces problems by throwing multiple models at them. Waste tokens, save time, because tokens are cheaper than people.
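The “throw multiple models at it” habit is easy to sketch. The version below is a minimal illustration, not anything shown in the episode: the three model functions are hypothetical stand-ins for real API calls, and the task and checks are toy examples. It sends the same task to every model in parallel and keeps the first answer that passes a verifier, which is the part that makes wasting tokens safe.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional

# Hypothetical stand-ins for three different coding models.
# A real version would call each provider's API here.
def model_a(task: str) -> str:
    return "def add(a, b): return a - b"   # plausible but wrong

def model_b(task: str) -> str:
    return "def add(a, b): return a + b"   # correct

def model_c(task: str) -> str:
    return "def add(a, b) return a + b"    # does not even parse

def passes_checks(candidate: str) -> bool:
    """Verifier: run the candidate and test the behavior we care about.

    exec() on model output is only acceptable in a sandbox; it is used
    here purely to keep the sketch short.
    """
    namespace: dict = {}
    try:
        exec(candidate, namespace)
        return namespace["add"](2, 3) == 5 and namespace["add"](-1, 1) == 0
    except Exception:
        return False

def first_verified(task: str, models: list) -> Optional[str]:
    """Send the same task to every model at once; keep the first answer that verifies."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = [pool.submit(model, task) for model in models]
        for future in as_completed(futures):
            try:
                candidate = future.result()
            except Exception:
                continue  # a failed call costs tokens, not your time
            if passes_checks(candidate):
                return candidate
    return None

answer = first_verified("write add(a, b)", [model_a, model_b, model_c])
print(answer)  # def add(a, b): return a + b
```

Two of the three calls are wasted by design. The wall-clock cost is roughly that of the slowest single call, which is the trade the group is recommending.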
Is Pure Software Dead, and the Building-Block Economy
Rauch describes models crossing from junior to principal engineer: they now return with several routes and explicit tradeoffs, push back when you try to jam high-cardinality telemetry into Postgres, and suggest ClickHouse or Athena instead. That elevates taste and judgment as the human contribution. He then poses the hard question: is pure software engineering obsolete now that models speak fuzzy, sloppy English and you no longer need code to communicate with them? For hardware founders it is a boon, echoing Patrick Collison’s line that software is art and artists are hard to hire. To temper the “agents reinvent everything” fantasy, he invokes Mitchell Hashimoto’s building-block economy: you do not want your agent rebuilding a queue from first principles every time it sends an email, and shared dependencies like a common Postgres version carry real cooperation value. Reusable infrastructure becomes more valuable in the agentic era, functioning like libraries and dependencies, or even a token cache, so models fork from existing starting points instead of burning a trillion tokens to recreate what exists. Naval and Max both note they had not written code in years and now build daily through agents, because understanding how APIs, data flow, and performance fit together matters more than syntax, and vibe coding is just transmitting intent the way a good engineering leader already did through people.
Vibe Coding Hardware at Boom Supersonic
Blake Scholl explains how AI changed the role of software and hardware developers at Boom. A great deal of hardware engineering lives in complex Excel spreadsheets and VBScript on individual laptops, with no source control and no automated testing, and handoffs happen manually over email like it is the 1990s. Boom had long tried to turn these flows into real software but could never afford enough software engineers. The new model is that software engineers create the architectures, because they understand systems, algorithms, and separation of concerns, and hardware engineers vibe code their own pieces. The result is mind-blowing productivity for small teams. His example: a turbine blade is cold at rest and expands when hot, so you must design both the cold and hot shapes and convert between structures and aerodynamics, work that took one engineer a full day per blade across the thousand blades in a jet engine. With a combined software-and-hardware tool you can now change blade geometry and see structural and aerodynamic results in real time, letting two engineers design an entire jet engine. The group extends this to the death of enterprise collaboration software and even spreadsheets, since you can now code the exact custom tool you need, and predicts AI will soon generate STEP files and PCB layouts, carrying the boom into mechanical and electrical engineering.
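To make the cold-versus-hot idea concrete, here is a deliberately simplified sketch. It assumes uniform linear thermal expansion along one dimension, which is a textbook approximation and not Boom’s tooling. The expansion coefficient and temperature rise are illustrative values, and a real blade analysis would also account for centrifugal loading and uneven temperatures.

```python
# Illustrative value only: an expansion coefficient in the range typical
# of nickel alloys, per kelvin. Not taken from the episode.
ALPHA_PER_K = 1.3e-5

def hot_from_cold(cold_mm: float, delta_t_k: float, alpha: float = ALPHA_PER_K) -> float:
    """Running (hot) length given the as-manufactured (cold) length."""
    return cold_mm * (1.0 + alpha * delta_t_k)

def cold_from_hot(hot_mm: float, delta_t_k: float, alpha: float = ALPHA_PER_K) -> float:
    """As-manufactured (cold) length needed to reach a target hot length."""
    return hot_mm / (1.0 + alpha * delta_t_k)

# Aerodynamics wants a specific shape at operating temperature,
# but the blade has to be machined at room temperature.
hot_span_mm = 100.0   # hypothetical target span while running
delta_t_k = 900.0     # hypothetical temperature rise

cold_span_mm = cold_from_hot(hot_span_mm, delta_t_k)
print(f"machine at {cold_span_mm:.2f} mm to run at {hot_span_mm:.2f} mm")
```

Even this toy version shows why the round trip is tedious by hand: every change to the hot aerodynamic shape means recomputing the cold manufacturing shape, for every blade.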
China, Open Source, and Which Models Actually Get Used
Naval argues China is going all-in on open-source models because its hardware and supply-chain superiority pairs naturally with on-demand software generation, which erases Silicon Valley’s software edge, and because the Chinese government has a history of funding ecosystem-wide efforts in network-effect businesses. Without frontier coding models there is no self-improvement, so a country that cannot generate frontier software falls behind on generating everything downstream. He notes the irony that almost all the open-source heft now comes from China, since OpenAI is not open, Grok and Google’s local models trail, and Anthropic ships no open models. On real usage, Rauch reports from Vercel’s AI gateway that frontier intelligence dominates the top, with a caveat: frontier intelligence at the right cost and performance, like Gemini, slaps at scale and is the best industrial production model for support and browser automation, while Chinese models are not in the frontier coding tier. Naval frames intelligence as an unalloyed good, since model mistakes are invisible and a smarter model is still cheaper than a person, which pushes everyone toward the most intelligent option and risks an oligopoly in AI.
Vertical Integration, Verifiers, and the Slop Problem
Max Hodak lays out Science’s vertical integration: the preference is always to buy, as with cheap PCBs from Asia, but when components do not exist you must make them, and the closer a product gets to a single block of covalently bonded matter the better it performs. Science owns a captive MEMS foundry on the East Coast because there was no other way to do the packaging and assembly it needed. He notes AI’s most surprising internal impact so far is regulatory: generating documentation and tracing which of thousands of ISO standards apply, work that once tied up a quality team for months. Rauch raises the slop problem: mountains of AI-generated code arriving as pull requests nobody can read line by line. His standard is that an engineer must be able to say they understand and will stand behind the consequences of a PR, backed by the test harness, proofs, and type checkers, even without reading it all. Naval generalizes this into humans becoming verifiers, with lawyers, engineers, and operators moving to verifying the stack and standing behind it, and Rauch warns that creating software is the easy zero-to-one part while keeping it secure, tested, performant, and maintained a thousand days later is the real test.
Regulation as Test Suite, and the Voter Problem
Blake describes building a RAG that compresses a 200-page lightning-strike compliance test plan from months of a “monkey at keyboard” engineer’s work into minutes, with a powerful second-order effect: change the airplane and you regenerate compliance in minutes instead of crying over months of rework, which slashes change aversion and lets a small number of creative engineers iterate. Max reframes regulations as potentially good guard rails, a test suite and exit criteria for agent loops, provided they are non-contradictory and reasonable, since the alternative is shipping slop into the air. Naval warns of a Red Queen race of agent-on-agent compliance and agencies getting DDoSed by clever entrepreneurs flooding them with documents. Blake pushes for enforcement-based rather than pre-approval regulation, using the analogy that we would never tolerate filing a driving plan before every trip, yet that is exactly how physical infrastructure works: guilty until proven innocent. He cites the 737 MAX’s single all-authority sensor and the NRC permitting almost no nuclear plants for decades as proof that this makes us slower, not safer. Hodak supplies the counterweight: the deeper issue is the voters and the regulator’s asymmetric incentives, where approving a bad thing ends a career and blocking a good thing goes unnoticed. Remove an agency and the electorate installs its twin. Naval and Max agree the real reforms are narrow, including innovation zones, opt-in YIMBY zones, and the experimental laboratory of fifty states.
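The “regulation as test suite and exit criterion” framing maps onto a very ordinary loop. The sketch below is a toy illustration under loud assumptions: the design fields, the two rules, and their thresholds are invented for the example and are not real certification requirements, and `propose_fix` is a stand-in for whatever an agent would actually do.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Design:
    # Hypothetical, simplified parameters; not real certification values.
    skin_thickness_mm: float
    bonding_straps: int

# Each "regulation" is a named, machine-checkable predicate over the design.
RULES: dict[str, Callable[[Design], bool]] = {
    "min_skin_thickness": lambda d: d.skin_thickness_mm >= 2.0,
    "min_bonding_straps": lambda d: d.bonding_straps >= 4,
}

def failing_rules(design: Design) -> list[str]:
    """The test suite: names of every rule the design currently violates."""
    return [name for name, check in RULES.items() if not check(design)]

def propose_fix(design: Design, failures: list[str]) -> Design:
    """Stand-in for the agent: nudge the design toward whatever rule failed."""
    if "min_skin_thickness" in failures:
        design = Design(design.skin_thickness_mm + 0.5, design.bonding_straps)
    if "min_bonding_straps" in failures:
        design = Design(design.skin_thickness_mm, design.bonding_straps + 1)
    return design

def agent_loop(design: Design, max_iterations: int = 10) -> tuple[Design, int]:
    """Iterate until every rule passes (the exit criterion) or the budget runs out."""
    for iteration in range(max_iterations):
        failures = failing_rules(design)
        if not failures:
            return design, iteration
        design = propose_fix(design, failures)
    raise RuntimeError(f"still failing after {max_iterations} iterations")

final, steps = agent_loop(Design(skin_thickness_mm=1.0, bonding_straps=2))
print(final, steps)  # Design(skin_thickness_mm=2.0, bonding_straps=4) 2
```

The proviso in the text shows up directly in the code. If two rules contradict each other, `failing_rules` can never return an empty list and the loop only ends by exhausting its budget.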
Drug Discovery, Healthcare Economics, and N-of-1 Medicine
Hodak explains why innovation zones do not solve drug discovery. The Right to Try Act and the single-patient IND pathway already exist, and the FDA approves over 99% of such requests, sometimes by phone, but dosing requires clinical-grade drug that only the IP owner has, and the FDA will draw an adverse inference against the whole program if a very sick patient does worse. A targeted fix is to prohibit adverse inferences across different users of a compound. He points to Europe’s notified-body system, private certifiers blessed by governments, as a way to scale review capacity, and to China’s CFDA, which already approved an implantable brain-computer interface and brings products to market far cheaper. His core economic argument is that healthcare is a fixed bucket of money that grows only with tax receipts, unlike phones and laptops where falling prices expanded the market, so spending 10x more on healthcare would be a catastrophe rather than the triumph that 10x AI spending would be. With no private market you run a small communist society inside a capitalist one, with the lines and frozen quality that implies. The way out is lower cost-to-market so patients can finance care like a car, which is the direction China is pushing. Naval’s twist is a healthcare plan where the first 20% of income is the deductible to recreate a private market, citing LASIK, dental, and plastic surgery as fields that advance because patients pay directly. The group closes the segment on GitLab’s Sid Sijbrandij, who outlived a rare-cancer prognosis by building his own escalation ladder of drugs, noting that N-of-1 medicine works at the high end but demands enormous agency exactly when a patient is weakest, which is where AI should democratize access to knowledge.
Autonomous Software, Hackathons, and the Autonomous Company
Asked how much autonomous software they run, Rauch describes Vercel automating much of site reliability engineering: instead of hand-set alarm thresholds, anomalies in error rate, latency, or throughput fire an alert, an agent investigates, can open an incident that loops in people, and begins remediation, stopping just short of changing production. Vercel also runs autonomous optimization and security research, and an open-sourced security tool run against the entire monorepo with 10,000 concurrent agents produced several quarters of security research in a couple of days for about $14,000 in tokens, the equivalent of months of red teaming. Max shares a vibe-coded bug-reporting queue where TestFlight users submit logs and screenshots, a daemon analyzes and fixes issues in the background, and ships him a build to try, raising the prospect of apps effectively built by their users, with the caveat that you would get a Homer Simpson car of every feature. Blake recounts stopping all project work for a week and requiring everyone, from the receptionist to the engineers, to build something with AI and demo it. He expected mostly silly projects and got mostly needle movers, including a genuinely useful automation from the shipping and receiving associate, concluding that most people have an idea worth building but cannot tell a good first idea from a bad one until they can iterate on a real thing. Rauch extends this to a workforce that trains the agents doing the work rather than doing it directly, and a coming feature to extract reusable skills from your inputs and outputs.
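The shift from hand-set alarm thresholds to anomaly-based alerts can be illustrated with a rolling z-score detector. This is a minimal sketch under stated assumptions, not Vercel's implementation: the class name, window size, threshold, and the triage hand-off are all hypothetical. The detector flags a sample that deviates sharply from its own recent baseline, so nobody has to pick a fixed limit per metric.

```python
from collections import deque
from statistics import mean, stdev


class RollingAnomalyDetector:
    """Flags samples that deviate sharply from a rolling baseline."""

    def __init__(self, window=30, z_threshold=3.0, min_samples=5):
        self.history = deque(maxlen=window)  # recent normal samples only
        self.z_threshold = z_threshold
        self.min_samples = min_samples

    def observe(self, value):
        """Return True if value is anomalous relative to recent history."""
        anomalous = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = stdev(self.history)
            if sigma == 0:
                anomalous = value != mu
            else:
                anomalous = abs(value - mu) / sigma > self.z_threshold
        # Keep anomalies out of the baseline so an incident
        # does not quietly become the new normal.
        if not anomalous:
            self.history.append(value)
        return anomalous


def triage(metric, value, detector):
    """Hypothetical hand-off: an anomaly becomes a task for an investigating
    agent, which stops short of changing production on its own."""
    if detector.observe(value):
        return {
            "metric": metric,
            "value": value,
            "action": "investigate",
            "change_production": False,
        }
    return None


if __name__ == "__main__":
    detector = RollingAnomalyDetector()
    for error_rate in [0.010, 0.012, 0.011, 0.009, 0.010, 0.011, 0.250]:
        alert = triage("error_rate", error_rate, detector)
        if alert:
            print(alert)
```

The same detector works unchanged for latency or throughput, since it compares each metric with its own history. A production system would likely need seasonality handling and a more robust baseline than a plain mean and standard deviation.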
Creativity, Out-of-Distribution Surprise, and What Humans Can Uniquely Do
On the intelligence-versus-agency split, Max suggests returns to humans tilt toward agency since agents supply intelligence, while Naval counters that you stay 99% intelligence and 1% agency because the agents exercise the agency for you. They agree the humans best suited to the future are the agentic ones who open a coding agent and ask what to build. Coding has perhaps 10x more participants than a year ago, yet roughly 99% of people still never will try it, because the first step is unimaginable to a non-coder, even as vibe coding proves more addictive and entertaining than video games while producing something real. On AI video, the group notes it still lacks taste and judgment, but expects fan-made films by 2030, such as dozens of Lord of the Rings takes or generated seasons of The Expanse, while prizing a genuinely new imaginative work over a remix. The long closing debate turns on definitions. Hodak defines art as meaningful out-of-distribution behavior, broad enough to include a military maneuver, and expects models to reach it. Naval defines art as conveying emotion with intent, which makes attribution decisive: the same photo means more taken by a human, and a hardware-attestation startup gains a real use case. They cite Gödel stepping outside the formal system as the human archetype and the identical look of every Claude-built website as in-distribution slop. Naval lands the episode on optimism: productivity gains mean hiring more, not fewer, of the creative and AI-fluent, the future is a larger number of smaller teams and an entrepreneurship explosion where generalists thrive and credentials fade, and the single best move is to get extremely good with the tools, because it is people with AI versus people without AI.
Notable Quotes
“Now clearly there’s 100x or a thousandx engineers and the world hasn’t fully adjusted to this.”
Guillermo Rauch, on why AI made the spread between engineers impossible to ignore
“Just waste tokens, save time. Don’t look at the tokens either as inputs or outputs. Just look at your time and look at the final output.”
Naval Ravikant, on the right way to measure AI’s return
“We had to learn code to communicate with the models. Now the models speak English and they speak fuzzy sloppy English like a human and they understand things.”
Guillermo Rauch, asking whether pure software engineering is now obsolete
“It allows two engineers to design an entire jet engine, which is just wildly different.”
Blake Scholl, on Boom turning hardware engineering into software
“You need to be able to say I am signing off on understanding the consequences of this PR.”
Guillermo Rauch, on what it means to stand behind code you did not read line by line
“That is absolutely the way we build physical infrastructure in this country. It’s guilty until proven innocent. And what we should actually do is make more of these things enforcement based rather than pre-approval based.”
Blake Scholl, comparing the permitting process to filing a driving plan before every trip
“You’re basically running a small communist society inside a larger capitalist society. And that’s what we’re doing in healthcare.”
Max Hodak, on why there is no real private market in healthcare
“I expected we would get a large number of silly projects and a small number of needle movers. And what we got was a large number of needle movers and a very small number of silly projects.”
Blake Scholl, on the week he had the whole company build with AI
“If a person takes the photo versus AI generates the exact same photo down to the last pixel, the person taking the photo will have more meaning for me.”
Naval Ravikant, on why intent and attribution make something art
“It’s about people with AI versus people without AI. And so the single best thing you can be doing right now for yourself is just getting really good with these tools.”
Naval Ravikant, closing the conversation on the only divide that matters
Part one: Waste Tokens to Save Time, our writeup of the first segment, on software factories, the thousand-x engineer, token leaderboards, and whether pure software is dead.
Part two: Vibe Coding Hardware, our writeup of the second segment, on AI-designed jet engines, vertical integration, China’s open-source bet, and humans as verifiers.
Naval Ravikant’s official site, the canonical home for Naval’s essays and podcast on technology, judgment, and leverage.
Boom Supersonic, Blake Scholl’s company building supersonic aircraft and its own jet engines, source of the turbine-blade and two-engineers example.
Science Corporation, Max Hodak’s brain-computer interface company, whose captive MEMS foundry and FDA arguments anchor the hardware and healthcare segments.
Vercel, Guillermo Rauch’s company, whose AI gateway data and autonomous SRE work inform the usage and automation discussion.
Mark Zuckerberg and Priscilla Chan discuss the Chan Zuckerberg Initiative’s mission to cure, prevent, or manage all diseases by 2100 using AI-driven tools like virtual cell models and cell atlases. They emphasize building open-source datasets, fostering cross-disciplinary collaboration, and leveraging AI to accelerate basic science. Worth watching? Absolutely yes: it’s packed with insightful, forward-thinking ideas on AI-biotech fusion, even if you’re skeptical of Big Tech philanthropy.
Detailed Summary
In this a16z podcast episode hosted by Ben Horowitz, Erik Torenberg, and Vineeta Agarwala, Mark Zuckerberg and Priscilla Chan outline the ambitious goals of the Chan Zuckerberg Initiative (CZI). Launched nearly a decade ago, CZI aims to empower scientists to cure, prevent, or manage all diseases by the end of the century. Chan, a pediatrician, shares her motivation from treating patients with unknown conditions, highlighting the need for basic science to create a “pipeline of hope.” Zuckerberg explains their strategy: focusing on tool-building to accelerate scientific discovery, as major breakthroughs often stem from new observational tools like the microscope.
They critique traditional NIH funding for being too fragmented and short-term, advocating for larger, 10-15 year projects costing $100M+. CZI fills this gap by funding collaborative “Biohubs” in San Francisco, Chicago, and New York, each tackling grand challenges like cell engineering, tissue communication, and deep imaging. The integration of AI is central, with Biohubs pairing frontier biology and AI to create datasets for models like virtual cells.
A key highlight is the Human Cell Atlas, described as biology’s “periodic table,” cataloging millions of cells in an open-source format. Initially an annotation tool, it grew via network effects into a community resource. Now, they’re advancing to virtual cell models for in-silico hypothesis testing, reducing wet lab costs and enabling riskier experiments. Models like VariantFormer (predicting CRISPR edits) and diffusion models (generating synthetic cells) are mentioned.
The couple announces big changes: unifying CZI under AI leadership with Alex Rives (from EvolutionaryScale) heading the Biohub, and doubling down on science as their primary philanthropy focus. They stress interdisciplinary collaboration, with biologists and engineers working side by side, and expanding compute over physical space. Success metrics include tool adoption, enabling precision medicine for “rare” diseases (treating common ones as individualized), and fostering an explosion of biotech innovations.
Challenges include bridging AI optimism with biological complexity, but they see AI as underestimated leverage. Viewer comments range from praise for open AI research to skepticism about non-scientists leading, but the discussion remains optimistic about AI democratizing science via intuitive interfaces.
Key Takeaways
Mission-Driven Philanthropy: CZI focuses on tools to accelerate science, not direct cures, addressing gaps in government funding for long-term, high-risk projects.
AI-Biology Fusion: Biohubs combine frontier AI and biology to build datasets and models, like virtual cells, for simulating biology and derisking experiments.
Human Cell Atlas: An open-source “periodic table” of biology with millions of cells, enabling precision medicine by linking mutations to cellular impacts.
Virtual Cells Promise: Allow in-silico testing to encourage bolder hypotheses, treating diseases as individualized (e.g., no more trial-and-error for hypertension).
Organizational Shift: Unifying under AI expert Alex Rives; expanding compute clusters (10,000+ GPUs) for collaborative research.
Interdisciplinary Collaboration: Success from co-locating biologists and engineers; lowering barriers via user-friendly interfaces to democratize science.
Broader Impact: AI could speed up the 2100 goal; enables startups and pharma to innovate faster using open tools.
Challenges and Feedback: Balancing ambition with realism; community adoption as success metric; envy of for-profit clarity but validation through tool usage.
Hyper-Compressed Summary
Zuckerberg/Chan: CZI uses AI + Biohubs to build virtual cells and atlases, accelerating cures via open tools and cross-discipline collab—targeting all diseases by 2100. Watch for biotech-AI insights.