US Government Orders Anthropic to Suspend Claude Fable 5 and Mythos 5: Inside the Export Control Directive, the Jailbreak Dispute, and What It Means for Frontier AI
On June 12, 2026, Anthropic published a statement announcing that the US government, citing national security authorities, has issued an export control directive forcing the company to suspend all access to its newest frontier models, Claude Fable 5 and Claude Mythos 5. The order technically targets foreign nationals inside and outside the United States, including Anthropic’s own foreign national employees, but the practical effect is that both models are going dark for every customer worldwide. It is the first publicly known instance of the US government ordering a deployed frontier AI model offline, and Anthropic is complying while openly disputing the basis for the decision.

TLDR

The US government delivered an export control directive to Anthropic at 5:21pm ET on June 12, 2026, suspending all access to Fable 5 and Mythos 5 over an alleged jailbreak of Fable 5’s safeguards. Anthropic says the letter contained no specific details and that the only evidence shared was verbal. By the company’s account, the technique in question amounts to asking the model to read a codebase and fix software flaws, a capability it says is freely available from other models, including OpenAI’s GPT-5.5, and used daily by cyber defenders. Anthropic defends its defense-in-depth strategy, notes that thousands of hours of red teaming by the US government, the UK AISI, and third parties found no universal jailbreak, and warns that recalling a commercial model over a narrow, non-universal jailbreak would effectively halt all new frontier model deployments if the standard were applied industry-wide. Access to all other Anthropic models, including Claude Opus, Sonnet, and Haiku, is unaffected. The company says it believes the situation is a misunderstanding and is working to restore access, with more details promised within 24 hours.

Thoughts

This is a watershed moment regardless of how it resolves. Governments have blocked AI exports before, but ordering a deployed commercial model recalled out from under hundreds of millions of users is a new kind of intervention, closer to a product recall than a trade restriction. The mechanism matters too. Export control authority aimed at foreign nationals, including a company’s own employees, that cascades into a global shutdown is a blunt instrument doing the work of a regulatory regime that does not exist yet. The US has no statutory process for recalling an AI model, so the government reached for the closest tool on the shelf, and the result is a precedent built on improvisation.

There is real irony in who got hit first. Anthropic has spent years arguing, publicly and in Washington, that governments should have the power to block unsafe AI deployments. Now the company that asked for a referee is the first one whistled, and its complaint is not about the existence of the power but about the process: a letter at 5:21pm with no specifics, verbal evidence only, and no transparent or technically grounded procedure. That distinction is the whole ballgame for AI governance. A power to halt deployments without due process standards is not regulation, it is discretion, and discretion cuts in every direction depending on who holds it.

The technical dispute underneath is genuinely interesting because it exposes how unsettled the definition of a dangerous jailbreak is. Anthropic’s account of the offending technique, asking the model to read a specific codebase and fix any software flaws, describes something security teams do on purpose every single day. Vulnerability discovery is the canonical dual use capability: the same analysis that lets a defender patch a hole lets an attacker find one. If the bar for recall is that a model can be coaxed into doing competent security analysis, then every capable model on the market fails that bar, which is exactly Anthropic’s point about GPT-5.5. The hard question the directive dodges is not whether Fable 5 can find bugs but whether it provides meaningful uplift beyond what is already freely available, and Anthropic says it does not.

For builders, the immediate lesson is uncomfortable: model availability is now a political variable, not just an engineering one. Teams that built directly on Fable 5 lost a production dependency overnight through no fault of Anthropic’s infrastructure, their own code, or any terms of service violation. Multi-model fallback strategies, abstraction layers over providers, and graceful degradation paths just moved from nice-to-have to table stakes for anyone running serious workloads on frontier models. The companies that absorbed this outage gracefully are the ones that assumed any single model could vanish.
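
The fallback idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor’s SDK: `Provider`, `ModelUnavailable`, `complete_with_fallback`, and the two stand-in model functions are hypothetical names invented for this example, and a real adapter would wrap an actual client library and translate its errors into `ModelUnavailable`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


class ModelUnavailable(Exception):
    """Hypothetical error an adapter raises when its model cannot serve requests."""


@dataclass(frozen=True)
class Provider:
    name: str                       # hypothetical label, e.g. "primary"
    complete: Callable[[str], str]  # adapter wrapping one vendor's client


@dataclass(frozen=True)
class Result:
    text: str
    served_by: Optional[str]  # None means every provider failed
    degraded: bool            # True when the caller received the static fallback


def complete_with_fallback(
    prompt: str,
    providers: Sequence[Provider],
    degraded_reply: str = "This feature is temporarily unavailable.",
) -> Result:
    """Try providers in priority order; degrade gracefully if all of them fail."""
    for provider in providers:
        try:
            return Result(provider.complete(prompt), provider.name, degraded=False)
        except ModelUnavailable:
            continue  # move on to the next provider in the chain
    return Result(degraded_reply, None, degraded=True)


def suspended_model(prompt: str) -> str:
    # Stand-in for a model that has been taken offline.
    raise ModelUnavailable("model suspended")


def backup_model(prompt: str) -> str:
    # Stand-in for a second vendor's model that is still reachable.
    return f"[backup] {prompt}"


chain = [Provider("primary", suspended_model), Provider("backup", backup_model)]
result = complete_with_fallback("Summarize the incident report.", chain)
print(result.served_by, result.degraded)
```

The application code only ever calls `complete_with_fallback`, so swapping or reordering vendors is a change to the `chain` list rather than to every call site. The `degraded` flag lets the caller decide how to present a reduced experience instead of surfacing a raw error.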

The next 24 hours matter more than the directive itself. Anthropic has promised more details, and the government will face pressure to either substantiate a concern that justifies a global recall or quietly walk it back. Either outcome sets the real precedent. If the directive holds on thin evidence, every frontier lab now operates under the threat of arbitrary shutdown. If it collapses under scrutiny, the case for a formal, transparent statutory process for AI deployment decisions, which Anthropic explicitly endorses in its own statement, gets a lot stronger in Congress than it was a week ago.

Key Takeaways
- The US government issued an export control directive on June 12, 2026 suspending all access to Claude Fable 5 and Claude Mythos 5, citing national security authorities.
- The directive formally targets access by any foreign national, inside or outside the United States, including Anthropic’s own foreign national employees.
- The net effect is that Anthropic must disable Fable 5 and Mythos 5 for all customers worldwide to ensure compliance, not just for foreign users.
- Access to all other Anthropic models, including the Claude Opus, Sonnet, and Haiku families, is not affected by the order.
- Anthropic received the directive at 5:21pm ET the same day it published its statement, and says the letter did not provide specific details of the national security concern.
- Anthropic’s understanding is that the government believes it has become aware of a method of bypassing, or jailbreaking, Fable 5’s safeguards.
- Anthropic reviewed a demonstration of the specific technique and says it only identified a small number of previously known, minor vulnerabilities.
- The company says other publicly available models can discover the same vulnerabilities without requiring any bypass at all.
- Before launch, Fable 5’s safeguards were red-teamed for thousands of hours in total by the US government, the UK AISI, multiple private third-party organizations, and internal teams.
- No tester has found a universal jailbreak for Fable 5, meaning a method that broadly bypasses safeguards and unlocks a wide range of cyber capabilities.
- Anthropic openly states that perfect jailbreak resistance does not appear possible for any model provider today, and that every safeguard in the industry is vulnerable to non-universal jailbreaks.
- Fable 5 was deployed under a defense in depth strategy: make jailbreaks either narrow or very expensive to produce, then combine that with monitoring to quickly detect and shut down successful attacks.
- Anthropic’s 30-day customer data retention requirement for Fable exists specifically to support jailbreak research and mitigation, a policy the company says carries real costs with customers.
- Anthropic says it has not received any disclosure of a concerning non-universal jailbreak that led to a harmful result; disclosed potential jailbreaks were benign or provided no Mythos-specific uplift.
- The only evidence the government has provided is verbal, describing a narrow, non-universal jailbreak that essentially consists of asking the model to read a specific codebase and fix any software flaws.
- Anthropic reviewed a report it believes is the basis of the directive and validated that the capability level shown is widely available from other models, including OpenAI’s GPT-5.5, and is used every day by cyber defenders.
- Anthropic is complying with the legal directive while explicitly disagreeing that a narrow potential jailbreak justifies recalling a commercial model deployed to hundreds of millions of people.
- The company warns that if this recall standard were applied across the industry, it would essentially halt all new model deployments for every frontier model provider.
- Anthropic supports government power to block unsafe deployments in principle, but only through a statutory process that is transparent, fair, clear, and grounded in technical facts, and says this action does not adhere to those principles.
- Anthropic apologized to customers, called the situation a misunderstanding, said it is working to restore access as soon as possible, and promised more details within 24 hours.

Detailed Summary

What the directive actually does

The order arrived as a letter from the US government at 5:21pm ET on June 12, 2026, invoking national security authorities under export control law. On paper it suspends access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, a category that includes some of Anthropic’s own employees. In practice, Anthropic says compliance requires abruptly disabling both models for every customer, since there is no clean way to enforce a nationality-based access boundary across a global product. The letter did not spell out the specific national security concern. Everything else in Anthropic’s statement is the company’s own reconstruction of what prompted the action.

The jailbreak at the center of the dispute

Anthropic’s understanding is that the government became aware of a method for bypassing Fable 5’s safeguards. The company reviewed a demonstration of the technique and characterizes the results as a small number of previously known, minor vulnerabilities, all relatively simple, all discoverable by other publicly available models without any jailbreak at all. According to Anthropic, the government’s evidence so far has been entirely verbal, and the technique boils down to asking the model to read a specific codebase and fix any software flaws. The company reviewed a report it believes underlies the directive and validated that the displayed capability is widely available elsewhere, naming OpenAI’s GPT-5.5 directly, and noted that this exact kind of analysis is what defenders use to keep systems safe.

Anthropic’s defense in depth posture

The statement restates the safety posture Anthropic laid out at Fable 5’s launch. The safeguards around cybersecurity tasks are strong enough that users have complained they are overly broad. In the weeks before launch, the US government, the UK AISI, multiple private third-party organizations, and internal teams red-teamed the safeguards for thousands of hours combined, and those tests showed Fable’s protections to be substantially more effective than those of any previously deployed model. No tester found a universal jailbreak. Anthropic is candid that perfect jailbreak resistance is likely impossible for anyone today, which is why the strategy is defense in depth: keep jailbreaks narrow or expensive, monitor aggressively, and shut down attacks fast. The 30-day customer data retention requirement on Fable exists to support that monitoring and mitigation loop. The company says this posture makes Fable’s risks comparable to those of models already deployed across the industry.

Complying while disputing the standard

Anthropic is removing access for all users as legally required, but the statement draws a hard line on the principle. The company disagrees that a narrow potential jailbreak, one that produced no disclosed harmful result, justifies recalling a commercial model serving hundreds of millions of people. Its broader warning is that this standard, applied evenly, would halt all new frontier model deployments industry-wide, since every provider’s safeguards are vulnerable to narrow jailbreaks. Anthropic also turns its own policy position into a critique: the company has publicly supported giving government the ability to block unsafe deployments, but through a statutory process that is transparent, fair, clear, and grounded in technical facts, and it says this action does not adhere to those principles.

What happens next

Anthropic closed by apologizing to customers, calling the situation a misunderstanding, and committing to restore access as soon as possible. The company promised to share more details over the next 24 hours, which makes this a developing story. The open questions are whether the government substantiates its concern with written technical evidence, whether the directive survives that scrutiny, and whether this episode accelerates the formal statutory process for AI deployment decisions that Anthropic says should have governed the action in the first place.

Notable Quotes

“The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance.”
Anthropic, on why a directive aimed at foreign nationals becomes a global shutdown

“We received the directive from the government today at 5:21pm (ET). The letter did not provide specific details of its national security concern.”
Anthropic, on the abruptness and opacity of the order

“These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass.”
Anthropic, on its review of the demonstrated jailbreak technique

“We suspect that perfect jailbreak resistance is not currently possible for any model provider.”
Anthropic, restating the position it disclosed at Fable 5’s launch

“We stand by this defense in depth strategy. It reduces the risks posed by Fable, making them comparable to the risks of existing models already deployed across the industry.”
Anthropic, defending its layered safeguards approach

“To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws.”
Anthropic, describing the technique behind the directive

“However, we disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people.”
Anthropic, on complying while contesting the decision

“If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.”
Anthropic, on the industry-wide implications of the recall standard

“As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles.”
Anthropic, on the kind of oversight process it says should have governed the action

“We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible.”
Anthropic, closing its statement to customers

Read the full statement on Anthropic’s site.

Related Reading
- Anthropic’s Claude Fable 5 and Mythos 5 launch announcement: the original deployment post that laid out the safeguards posture now at the center of the dispute.
- US Bureau of Industry and Security: the agency that administers US export controls, the kind of authority a directive like this one invokes.
- Export control (Wikipedia): background on how export control law works and why it can reach foreign nationals inside the United States.
- Prompt injection and jailbreaking (Wikipedia): a primer on the techniques used to bypass language model safeguards.
- UK AI Security Institute: one of the third-party organizations that red-teamed Fable 5’s safeguards before launch.
June 13, 2026
Dario Amodei on Policy for the AI Exponential: Anthropic’s Plan for AI Regulation, Job Displacement, Civil Liberties, and Democratic Leadership
Our Anthropic overlords deciding which prompts the peasants are allowed to use. pic.twitter.com/08YCSJcYSc
— Bojan Tunguz (@tunguz) June 10, 2026

In June 2026, Anthropic CEO Dario Amodei published “Policy on the AI Exponential”, a wide-ranging essay arguing that the gap between how fast AI is advancing and how slowly policy moves has become dangerous, and that the window to close it is open right now. He opens with a memorable image from The Lord of the Rings: the Hobbits trying to rouse Treebeard, the ancient tree who takes a full day just to say hello, to defend his forest before it is cut down. That mismatch in speed, he writes, is exactly the relationship between AI and our political institutions. This post breaks the essay down in full and adds analysis of where the argument lands.

TLDR

Amodei argues that AI’s scaling laws point toward “powerful AI,” a country of geniuses in a datacenter, within a few years, while legislation still moves on a timescale of years. For most of the last few years, safety advocates including Anthropic pushed only for optionality-preserving moves like transparency rules, chip export controls, and labor data collection, because the risks were not yet concrete. He says that has changed: events like Claude Mythos Preview proved frontier models are now tools of national strategic consequence, and the time for binding regulation has arrived. The essay covers five policy areas. First, regulation and public safety, where he proposes an FAA-style regime of mandatory third-party testing of frontier models above a compute threshold across four risks (cybersecurity, biological weapons, loss of control, and automated R&D), with government power to block unsafe deployments. Second, macroeconomics and tax policy, where AI could deliver hypergrowth and severe, enduring job displacement at the same time, demanding measurement, pro-employment incentives, and possibly UBI or universal capital accounts. Third, accelerating AI’s positive impact, where the danger is regulators like the FDA being too slow rather than too lax, and biomedical approval needs reform. Fourth, the state and civil liberties, where AI could become the ultimate tool of autocracy through autonomous weapons and mass surveillance, requiring new accountability rules, a domestic ban on autonomous weapons, closing the data broker loophole, and public rights to AI advice. Fifth, securing leadership by democracies through a values-based global coalition that controls the AI supply chain, coordinates on risk, shares benefits, and rejects AI-powered repression. He closes by rejecting the idea that public concern about AI is a PR problem to be marketed away, calling it democratic accountability working as it should.

Thoughts

The most important move in this essay is structural, not technical. Amodei is explicitly retiring the “preserve optionality” posture that defined Anthropic’s policy work through 2025 and replacing it with a call for binding rules. For years the argument from safety-minded labs was that the risks were too speculative to legislate against without doing more harm than good, an idea he grounds in the Collingridge dilemma and the Hayekian point that regulators lack the information to make good calls. That was a defensible hedge. What is striking here is the claim that the hedge has expired. He is saying the evidence is now concrete enough that continued caution about regulating has flipped from prudent to negligent. Whether you trust the underlying capability claims or not, that is a genuine change in position from one of the field’s most influential voices, and it deserves to be read as such.

The FAA analogy is doing enormous work, and it is worth poking at. Airplanes and drugs are mature technologies with stable physics and decades of incident data; the certification regime works because the failure modes are well understood. Frontier models are the opposite: the whole premise of the essay is that capabilities are changing faster than anyone can characterize them. Amodei half-acknowledges this when he warns that a fixed list of safety requirements tends to consume 95 percent of compliance effort on things that turn out not to matter while missing the real risks, a lesson he says Anthropic learned from its own Responsible Scaling Policy. So the proposal is really for an agency nimble enough to rewrite its own standards continuously, which is a much taller order than the FAA. The honest read is that he is proposing a regulator we do not yet know how to build, and betting that building it is still better than the alternative.

The economics section is where Amodei is most careful, and it is the part most likely to be misread. He goes out of his way to say enduring job displacement is undesirable and that warning about it is not the same as wanting it, a distinction critics of AI leaders often collapse. His real claim is subtle: that AI might jam the economic policy dial on a “hypergrowth, hyper-inequality” setting that is hard to unstick, because AI substitutes for human cognition broadly and faster than past technologies, potentially overwhelming the usual escape hatches like comparative advantage and Jevons paradox. If he is right, the political fight of the next decade is not about growth, which AI supplies, but about distribution, which it does not. His mention of UBI, universal capital accounts, and higher capital gains taxes is notable coming from a frontier CEO, even hedged as it is.

The civil liberties section is the one that should travel furthest beyond the AI-policy bubble, because it does not depend on accepting his most aggressive timelines. The data broker loophole, the idea that the government can simply buy the bulk data Americans hand to private companies and run mass analysis on it, is a problem that exists today; AI just raises the stakes by making that data vastly more revealing. Same with the proposal that anyone facing adverse government action should have access to AI at least as capable as what the government uses against them. These are concrete, near-term, and bipartisan in a way the abstract autonomy debates are not. The most candid line in the whole piece is his admission that AI cannot be safely entrusted to either governments or companies, an unusually direct acknowledgment that his own industry needs external checks, with Anthropic’s Long-Term Benefit Trust offered as one imperfect example rather than a solution.

The geopolitics section is the most contested terrain. Framing AI as a nuclear-scale reset of the game board, with a virtual country of 100 million geniuses divisible across military strategy and weapons R&D, leads naturally to a democratic coalition that hoards chips and denies them to adversaries. That logic is internally consistent, but it sits in tension with the benefit-sharing and “eventually the whole world joins” language elsewhere in the same section. Export controls that lock down the supply chain are, by design, a tool of exclusion, and reconciling that with broad diffusion of AI’s benefits to developing countries is the circle the coalition idea has to square. Amodei is clearly aware of the tension and bets that making membership attractive resolves it. The closing image is the one to remember: Treebeard waking up, with the warning that the goal is to channel real public concern into constructive policy rather than let it curdle into formless anger.

Key Takeaways
- The core tension of the essay is a mismatch in speed: AI advances exponentially while legislation moves on a multi-year timescale, dramatized by the Treebeard and Hobbits image from The Lord of the Rings.
- In only four years, AI models went from barely writing a coherent line of code to writing most of the code at major AI companies, with similar gains across biology, physics, math, finance, law, and translation.
- Scaling laws now have over a decade of empirical support, and if they continue another year or two they likely produce “powerful AI,” a country of geniuses in a datacenter.
- For the last few years, safety advocates including Anthropic focused on optionality-preserving policies: transparency legislation, chip export controls, and data collection on AI’s labor effects.
- Amodei argues that posture is no longer enough. Claude Mythos Preview revealed that frontier models pose real cybersecurity risks to the financial sector, critical infrastructure, and national security, and proved AI is now a tool of strategic consequence.
- He expects biological risks to follow cyber risks, with serious AI autonomy risks potentially not far behind.
- The essay covers five policy areas: regulation and public safety, macroeconomics and tax policy, accelerating AI’s positive impact, the state and civil liberties, and securing leadership by democracies.
- Alongside the essay, Anthropic released a legislative proposal on frontier model testing and a policy framework for job displacement, both with promised financial backing.
- On regulation, Amodei invokes the Collingridge dilemma and Hayek’s information problem to explain why pre-writing AI law in 2023 to 2024 was risky, then argues the situation has now changed.
- Anthropic’s 2025 answer was transparency, helping pass SB 53 in California, RAISE in New York, and SB 315 in Illinois, plus advocating a federal transparency standard.
- He now calls for binding regulation modeled on the FAA, where frontier models must pass technical testing and can have release blocked or reversed if they fail high safety standards.
- Models above a compute threshold should face mandatory third-party testing in four areas: cybersecurity, biological weapons, loss of control of AI systems, and automated R&D that accelerates the other three.
- Government should be able to block or deter deployment of models judged to present unacceptable risk, scoped to those four risks with protections against political favoritism.
- Evaluation could come from a government agency or from authorized and inspected private organizations under a “regulatory markets” approach.
- AI companies should have strong security to protect model weights, conduct regular red teaming and penetration testing, report safety incidents promptly, and work with government against major threat actors.
- He warns a time may come when the most powerful systems resemble weaponizable nuclear materials rather than airplanes, requiring more aggressive measures, but cautions against getting ahead of present dangers.
- On economics, AI could deliver extremely rapid growth via accelerated science and operational efficiency, supercharged by AI building better AI.
- The same properties make AI a broad substitute for human cognition that changes the economy faster than past technologies, risking large and potentially enduring labor market disruption.
- The feared outcome is a “hypergrowth, hyper-inequality” setting that is hard to unstick, where the challenge shifts from incentivizing growth to sharing its benefits.
- Amodei is emphatic that enduring job displacement is undesirable and dangerous, and that he warns about it to help society adapt, not as a prophet of doom.
- Anthropic says it works with customers to find new revenue and use cases rather than only cost cutting, and explores interaction paradigms that keep humans active alongside AI.
- He predicts AI will enable single individuals to build billion-dollar companies, noting teams of a few people already reach hundreds of millions in revenue, while admitting significant enduring job loss may be intrinsic to the technology.
- Any response must address both economic provision and the human need for meaning, purpose, and agency, with the latter ultimately more important and beyond what policy can directly deliver.
- Suggested economic interventions: better measurement and tracking (governments expanding statistics beyond Anthropic’s Economic Index), pro-employment incentives, and long-term macroeconomic support.
- Pro-employment ideas include wage insurance, retention tax incentives, workforce training grants, and employer-employee matching infrastructure.
- If displacement is large and permanent, mechanisms like universal basic income or universal capital accounts, financed through company taxes or higher capital gains taxes, may be necessary.
- He frames datacenter and energy-price backlash as largely a symbol of broader economic anxiety, and says AI companies should pay to absorb rate increases, a pledge Anthropic has already made.
- For technologies accelerated by AI, the bigger risk is regulators like the FDA being too slow, not too lax, because AI may make downstream tech safer in ways that violate skeptical regulatory assumptions.
- Biomedicine is the illustrative case: AI could flood the drug pipeline, raise effect sizes, treat previously untreatable diseases, and create whole new therapy categories, while the current FDA and EMA pipeline takes 7 to 8 years.
- Agencies should pre-approve standards for AI methods like PD/PK modeling, toxicology prediction, dose selection, biomarker validation, synthetic control arms, and surrogate endpoints, plus more flexible accelerated-approval mechanisms.
- On civil liberties, powerful AI in the wrong hands could be the ultimate tool of autocracy, and existing constitutional protections are not fully equipped to counter a surprise seizure of power.
- Threats named include fully automated drone armies that obey unlawful orders and surveillance AI that infers the innermost details of every citizen’s life from widely available data.
- Civil liberties proposals: accountability rules and an “off switch” for autonomous weapons, a domestic ban on fully autonomous weapons including in law enforcement, closing the data broker loophole, and public rights to AI advice during adverse government action.
- Amodei warns companies as well as governments can seize quasi-state power, citing the Gilded Age and the East India Company, and says AI cannot be safely entrusted to either alone.
- He offers Anthropic’s Long-Term Benefit Trust as one separation-of-power structure and urges the industry to explore mechanisms that go further.
- On geopolitics, he argues AI resets the geopolitical game board like nuclear weapons, becoming the dominant source of military and economic power for any nation that holds it.
- A nation with powerful AI versus one without it, or even one three years behind, could resemble WWII Marines facing medieval swordsmen.
- He calls for a democratic coalition that shares chips and semiconductor manufacturing equipment internally while denying them to adversaries, citing MATCH and OVERWATCH as good first steps.
- The coalition should coordinate risk policy, share benefits including harmonized medical approvals, provide mutual AI defense, reject AI-powered repression, and cooperate on macroeconomic stabilization.
- He rejects the idea that AI’s image is a PR problem, arguing public concern reflects real risks and is democratic accountability working as it should, with the task being to channel it into constructive solutions.

Detailed Summary

The speed mismatch between AI and policy

Amodei frames the entire essay around a single problem: AI advances at a lightning pace while policy, especially legislation, moves very slowly, often for good reasons since governments wield grave powers that should not be used hastily. He illustrates this with Treebeard, the sentient tree from The Lord of the Rings who takes a full day to say hello, as a stand-in for political institutions trying to respond to a technology that can go from amusing toy to a country of geniuses in the time it takes Congress to act. He recounts the dilemma responsible actors have faced: they could see where the exponential was headed, but to observers looking only at present capabilities, AI looked as mundane as the latest consumer app or cryptocurrency, making a laissez-faire attitude hard to argue against. The absence of AI’s radical effects, and uncertainty about their shape, made it genuinely difficult to design good policy even where the will existed.

That uncertainty, he says, is why safety advocates limited themselves to optionality-preserving measures like transparency rules, export controls, and labor data collection. But over the last few months the evidence of AI’s power and risk has become undeniable, with Claude Mythos Preview as the emblematic example: it scrambled the global cybersecurity landscape and proved AI models are now tools of global and national strategic consequence. He expects biological and autonomy risks to follow, and argues the world must now activate its slow, rickety policy apparatus to handle risks that will compound quickly. He worries current early actions are at least a year out of step with AI’s progress, and presents the essay as an attempt to close that gap across five policy areas, focused on US policy but relevant worldwide.

Regulation and public safety: an FAA for frontier models

Amodei opens by acknowledging the real costs of regulation: it can reduce a product’s benefits, disincentivize innovation, and suffer from the Hayekian problem that regulators lack the information for good tradeoffs, plus the Collingridge dilemma that a technology’s impacts are hard to anticipate until it is too late to manage them. In 2023 to 2024 these dynamics argued against pre-writing AI law, since the exact form of biological or autonomy risk, how to test for it, and how it would play out were all unclear, creating a high risk of low-value compliance requirements that miss the real dangers. Anthropic’s answer was transparency: requiring developers to disclose safety procedures, tests, and critical incidents, which is why it supported SB 53 in California, RAISE in New York, and SB 315 in Illinois in early 2026.

Now, he argues, the risks are clearly here and it is time for binding regulation. His analogy is to cars, airplanes, and drugs: powerful technologies essential to the economy but capable of killing many people if designed or operated poorly. He models AI regulation on the FAA, with frontier models required to pass testing and auditing and with release blocked or reversed if they fail high safety standards. His concrete proposal: mandatory third-party testing for models above a compute threshold across cybersecurity, biological weapons, loss of control, and accelerating automated R&D; government power to block deployment of unacceptably risky models, scoped narrowly with anti-favoritism protections; evaluation by either a government agency or authorized private organizations in a regulatory-markets model; strong weight security, red teaming, and penetration testing at AI companies; and prompt reporting of safety incidents. He notes a future may arrive when systems resemble weaponizable nuclear materials and demand harsher measures, but warns against designing for dangers that have not yet emerged.

Macroeconomics and tax policy: growth and displacement together

Here Amodei challenges the standard premise that growth is fragile and must be traded off against the drag of taxes or deficits to reduce inequality. Powerful AI, he suggests, may scramble that assumption by producing extremely rapid growth through accelerated science and efficiency, supercharged by AI building better AI, while simultaneously acting as a broad substitute for human cognition that reshapes the economy faster than any prior technology. The result could be a world stuck on a hypergrowth, hyper-inequality setting that is hard to unstick, where the central challenge is no longer incentivizing growth but sharing its benefits. He is careful to make two points clearly: first, enduring job displacement is undesirable and dangerous and should be minimized, and his warnings are meant to help society adapt, not to play prophet of doom; second, any response must address both economic provision and the deeper human need for meaning, purpose, and agency, which matters more and which policy cannot directly supply.

His policy menu starts with measurement and tracking, arguing good policy is impossible without accurate data, and that governments could expand economic statistics well beyond Anthropic’s Economic Index. Next come pro-employment incentives such as wage insurance, retention tax incentives, workforce training grants, and employer-employee matching, costs he says society should readily accept since they are likely offset by AI productivity gains. If displacement proves large and permanent, he says long-term income support like universal basic income or universal capital accounts may be needed, financed through taxes on relevant companies or higher capital gains taxes. He closes the section by reframing datacenter and energy-price backlash as mostly a symbol of broader economic anxiety, while saying AI companies should absorb rate increases, as Anthropic has pledged.

Accelerating AI’s positive impact: the slow-regulator problem

For technologies accelerated by AI, rather than AI itself, Amodei flips his concern: the bigger danger is regulatory systems designed for a slower pace failing to handle the deluge of new products, and AI making downstream technologies safer in ways that violate the skeptical assumptions baked into agencies like the FDA. He focuses on biomedicine as the area likely to produce AI’s biggest humanitarian benefits and where regulation is especially complex. AI could greatly increase the rate of new drug candidates, improve their effect sizes and safety profiles, treat previously untreatable diseases, and create entirely new therapy categories the way antibodies, peptides, and cell therapies did.

The current pipeline at the FDA and EMA takes 7 to 8 years, built on the pessimistic assumption that drug candidates usually fail and often carry safety problems even when they work. Without reform, AI will jam or overload that system. Amodei proposes that agencies develop standards now for accepting AI simulation and analysis, so they can be adopted quickly once proven rather than after years of unnecessary testing. Specific candidates include AI-based PD/PK modeling, toxicology prediction to reduce animal testing, more accurate dose selection, biomarker validation from large datasets, synthetic control arms, and surrogate endpoints (especially for aging and neurodegeneration). He urges more flexible accelerated-approval mechanisms generally, and notes biomedical acceleration may also reduce AI’s risks by aiding biodefense and improving mental health.

The state and civil liberties: guarding against AI-driven tyranny

Amodei describes the perennial balance between state power and individual liberty, enforced through machinery like the First, Fourth, and Fifth Amendments, the Posse Comitatus Act, and FISA, and argues AI threatens to upset that balance while raising its stakes. Powerful AI in the wrong hands could be the ultimate tool of autocracy, because the enormous returns to intelligence combined with AI’s pace create a perfect storm for a surprise seizure of power. The danger could take many forms but shares one feature: AI conferring sudden power while routing around democratic oversight. He cites two examples: a fully automated drone army that could obey unlawful orders where trained human soldiers might object, and a surveillance AI that analyzes widely available information at massive scale to infer the innermost details of every citizen’s life, an ability current civil liberties law never contemplated.

His proposals: create accountability rules for autonomous weapons so they respond to court orders, legislation, and human overseers rather than blindly following orders, possibly with a judicial finger on an off switch; ban domestic use of fully autonomous weapons, including in law enforcement, while allowing them against foreign adversaries; close the bulk-collection and data-broker loophole that lets the government buy and analyze data Americans share with private companies; and guarantee the public a right to AI advice at least as capable as what the government uses when it takes adverse action, as an extension of the Administrative Procedure Act, due process, or the Sixth Amendment. He closes by warning that companies, not just governments, can capture the state, citing the Gilded Age and the East India Company, and argues AI cannot be safely entrusted to either alone. Anthropic’s Long-Term Benefit Trust is offered as one accountability structure, with a call for the industry to go further.

Securing leadership by democracies: a values-based coalition

Amodei rejects treating AI as a mere instrument of trade policy to diffuse a tech stack worldwide. He believes AI resets the entire geopolitical game board like nuclear weapons, potentially even more so, becoming the dominant source of military and economic power for whoever holds it. In a virtual country of 100 million geniuses, millions could be assigned to military strategy, drone manufacture, weapons R&D, intelligence, and scientific advancement at once, so a nation with powerful AI facing one without it, or even three years behind, could be like WWII Marines against medieval swordsmen. Because powerful AI also enables deeper autocratic repression, it matters enormously that the world’s strongest nations are democracies.

His answer is a global coalition built on shared democratic values that draws in the rest of the world by making membership increasingly attractive and exclusion increasingly costly. Operating principles include: managing the AI supply chain by sharing chips and semiconductor manufacturing equipment within the coalition while denying them to adversaries, and by expanding and tightening export controls (he cites MATCH and OVERWATCH as good first steps); coordinating on biological, cyber, and autonomy risk to make compliance compatible and effective; sharing AI’s benefits, including harmonized medical approvals; mutual defense through collective AI cyberdefense, drones, manufacturing, compute, and intelligence; rejection of AI-powered repression; and macroeconomic cooperation against contagious employment crises. The coalition would respect each nation’s sovereignty, start with aligned democracies, and grow iteratively, ideally toward the whole world but at minimum to the point where democracies can contain and outcompete repressive regimes.

A window of opportunity

Amodei closes on cautious optimism. The same exponential that strains policymaking has created a unique opening: clear evidence of AI’s risks, an early taste of its value and disruption, and public backlash against unregulated approaches have left policymakers unusually open to forward-looking action. Treebeard and his forest are waking up. He firmly rejects the industry-circle view that this is a PR problem solved by better marketing, arguing people are worried because the risks are real, and that public concern in response to transparency is democratic accountability working as it should. The key challenge is focusing that concern into constructive solutions rather than letting it descend into formless anger and violence. He is optimistic because issues from job displacement to model testing to export controls have common-sense appeal across the political spectrum, and a broad nonpartisan coalition could adopt sane, forward-looking policy faster than usual.

Notable Quotes

“in only four years, AI models have gone from barely being able to write a coherent line of code to writing most of the code at major AI companies.”
Dario Amodei, on the pace of the AI exponential

“in the several years that it can take Congress to act, AI can go from an amusing toy to the full country of geniuses.”
Dario Amodei, on the mismatch between AI’s speed and the speed of legislation

“However, now the risks are clearly here. It is time to go beyond transparency to more serious and binding regulation of AI.”
Dario Amodei, marking the shift from transparency to binding rules

“enduring job displacement is undesirable and dangerous, and we should do everything we can to minimize or prevent it, not to bring it about.”
Dario Amodei, clarifying his stance on AI and jobs

“The key challenge in such a world won’t be incentivizing growth, but finding a way for everyone to share in the benefits.”
Dario Amodei, on a hypergrowth, hyper-inequality economy

“Powerful AI in the wrong hands could be the ultimate tool of autocracy, and our existing legal and constitutional protections are not fully equipped to counter this threat.”
Dario Amodei, on AI and civil liberties

“A nation that possesses powerful AI facing one without it … could be the equivalent of an army of World War II Marines facing an army of medieval swordsmen.”
Dario Amodei, on AI as the dominant source of geopolitical power

“People are worried about AI because they correctly perceive that its risks are real, not because AI CEOs have been insufficiently Panglossian.”
Dario Amodei, rejecting the idea that AI has a PR problem

“Treebeard and his forest are waking up.”
Dario Amodei, on policymakers’ new openness to acting on AI

“Policy on the AI Exponential” is a dense, structured argument from one of the most consequential figures in the field, and it rewards a full read in the original. The summary and analysis above are a guide, not a substitute. You can read the full essay here.

Related Reading
- Policy on the AI Exponential (full essay): the original source for this post, in Dario Amodei’s own words.
- Anthropic: the AI safety company Amodei leads, which released the accompanying model-testing and job-displacement proposals.
- The Collingridge dilemma (Wikipedia): the idea that a technology’s impacts are hard to predict until it is too late to easily control them, central to the regulation section.
- Federal Aviation Administration (Wikipedia): the safety-certification model Amodei proposes adapting for frontier AI.
- Universal basic income (Wikipedia): one of the long-term support mechanisms raised for large-scale labor displacement.
June 10, 2026
