PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Tag: claude opus

US Government Orders Anthropic to Suspend Claude Fable 5 and Mythos 5: Inside the Export Control Directive, the Jailbreak Dispute, and What It Means for Frontier AI
On June 12, 2026, Anthropic published a statement announcing that the US government, citing national security authorities, has issued an export control directive forcing the company to suspend all access to its newest frontier models, Claude Fable 5 and Claude Mythos 5. The order technically targets foreign nationals inside and outside the United States, including Anthropic’s own foreign national employees, but the practical effect is that both models are going dark for every customer worldwide. It is the first publicly known instance of the US government ordering a deployed frontier AI model offline, and Anthropic is complying while openly disputing the basis for the decision.

TLDR

The US government delivered an export control directive to Anthropic at 5:21pm ET on June 12, 2026, suspending all access to Fable 5 and Mythos 5 over an alleged jailbreak of Fable 5’s safeguards. Anthropic says the letter contained no specific details, that the only evidence shared was verbal, and that the technique in question amounts to asking the model to read a codebase and fix software flaws, a capability the company says is freely available from other models including OpenAI’s GPT-5.5 and used daily by cyber defenders. Anthropic stands by its defense in depth strategy, notes that thousands of hours of red teaming by the US government, the UK AISI, and third parties found no universal jailbreak, and warns that recalling a commercial model over a narrow, non-universal jailbreak would effectively halt all new frontier model deployments if applied industry-wide. Access to all other Anthropic models, including Claude Opus, Sonnet, and Haiku, is unaffected. The company says it believes the situation is a misunderstanding and is working to restore access, with more details promised within 24 hours.

Thoughts

This is a watershed moment regardless of how it resolves. Governments have blocked AI exports before, but ordering a deployed commercial model recalled out from under hundreds of millions of users is a new kind of intervention, closer to a product recall than a trade restriction. The mechanism matters too. Export control authority aimed at foreign nationals, including a company’s own employees, that cascades into a global shutdown is a blunt instrument doing the work of a regulatory regime that does not exist yet. The US has no statutory process for recalling an AI model, so the government reached for the closest tool on the shelf, and the result is a precedent built on improvisation.

There is real irony in who got hit first. Anthropic has spent years arguing, publicly and in Washington, that governments should have the power to block unsafe AI deployments. Now the company that asked for a referee is the first one whistled, and its complaint is not about the existence of the power but about the process: a letter at 5:21pm with no specifics, verbal evidence only, and no transparent or technically grounded procedure. That distinction is the whole ballgame for AI governance. A power to halt deployments without due process standards is not regulation, it is discretion, and discretion cuts in every direction depending on who holds it.

The technical dispute underneath is genuinely interesting because it exposes how unsettled the definition of a dangerous jailbreak is. Anthropic’s account of the offending technique, asking the model to read a specific codebase and fix any software flaws, describes something security teams do on purpose every single day. Vulnerability discovery is the canonical dual use capability: the same analysis that lets a defender patch a hole lets an attacker find one. If the bar for recall is that a model can be coaxed into doing competent security analysis, then every capable model on the market fails that bar, which is exactly Anthropic’s point about GPT-5.5. The hard question the directive dodges is not whether Fable 5 can find bugs but whether it provides meaningful uplift beyond what is already freely available, and Anthropic says it does not.

For builders, the immediate lesson is uncomfortable: model availability is now a political variable, not just an engineering one. Teams that built directly on Fable 5 lost a production dependency overnight through no fault of Anthropic’s infrastructure, their own code, or any terms of service violation. Multi-model fallback strategies, abstraction layers over providers, and graceful degradation paths just moved from nice-to-have to table stakes for anyone running serious workloads on frontier models. The companies that absorbed this outage gracefully are the ones that assumed any single model could vanish.
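To make the fallback idea concrete, here is a minimal sketch of a provider chain in Python. Everything in it is an assumption for illustration: the `Provider` wrapper, the `ModelUnavailable` exception, and the two stand-in model functions are hypothetical names, not any vendor's real SDK. The point is only the shape: try models in priority order, and fail loudly with a clear report when none can serve.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Raised by a provider callable when its model cannot serve requests."""


@dataclass(frozen=True)
class Provider:
    name: str                       # label used only for reporting
    complete: Callable[[str], str]  # hypothetical "prompt in, text out" call


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try each provider in priority order; return (provider_name, output)."""
    failures = []
    for provider in providers:
        try:
            return provider.name, provider.complete(prompt)
        except ModelUnavailable as exc:
            # Record why this provider was skipped, then move to the next one.
            failures.append(f"{provider.name}: {exc}")
    # Every provider failed: surface one explicit error instead of hanging.
    raise RuntimeError("all providers unavailable: " + "; ".join(failures))


# Stand-in providers (assumptions): one simulates a suspended model,
# the other a model that is still reachable.
def suspended_model(prompt: str) -> str:
    raise ModelUnavailable("access suspended")


def backup_model(prompt: str) -> str:
    return f"[backup] {prompt}"


chain = [Provider("primary", suspended_model), Provider("backup", backup_model)]
used, answer = complete_with_fallback("Summarize the incident report.", chain)
print(used, answer)
```

In a real system each `complete` callable would wrap a vendor client and translate that vendor's outage or access errors into `ModelUnavailable`, so the calling code never needs to know which model answered.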

The next 24 hours matter more than the directive itself. Anthropic has promised more details, and the government will face pressure to either substantiate a concern that justifies a global recall or quietly walk it back. Either outcome sets the real precedent. If the directive holds on thin evidence, every frontier lab now operates under the threat of arbitrary shutdown. If it collapses under scrutiny, the case for a formal, transparent statutory process for AI deployment decisions, which Anthropic explicitly endorses in its own statement, gets a lot stronger in Congress than it was a week ago.

Key Takeaways
- The US government issued an export control directive on June 12, 2026 suspending all access to Claude Fable 5 and Claude Mythos 5, citing national security authorities.
- The directive formally targets access by any foreign national, inside or outside the United States, including Anthropic’s own foreign national employees.
- The net effect is that Anthropic must disable Fable 5 and Mythos 5 for all customers worldwide to ensure compliance, not just for foreign users.
- Access to all other Anthropic models, including the Claude Opus, Sonnet, and Haiku families, is not affected by the order.
- Anthropic received the directive at 5:21pm ET the same day it published its statement, and says the letter did not provide specific details of the national security concern.
- Anthropic’s understanding is that the government believes it has become aware of a method of bypassing, or jailbreaking, Fable 5’s safeguards.
- Anthropic reviewed a demonstration of the specific technique and says the technique identified only a small number of previously known, minor vulnerabilities.
- The company says other publicly available models can discover the same vulnerabilities without requiring any bypass at all.
- Before launch, Fable 5’s safeguards were red-teamed for thousands of hours in total by the US government, the UK AISI, multiple private third-party organizations, and internal teams.
- No tester has found a universal jailbreak for Fable 5, meaning a method that broadly bypasses safeguards and unlocks a wide range of cyber capabilities.
- Anthropic openly states that perfect jailbreak resistance does not appear possible for any model provider today, and that every safeguard in the industry is vulnerable to non-universal jailbreaks.
- Fable 5 was deployed under a defense in depth strategy: make jailbreaks either narrow or very expensive to produce, then combine that with monitoring to quickly detect and shut down successful attacks.
- Anthropic’s 30-day customer data retention requirement for Fable exists specifically to support jailbreak research and mitigation, a policy the company says carries real costs with customers.
- Anthropic says it has not received any disclosure of a concerning non-universal jailbreak that led to a harmful result; disclosed potential jailbreaks were benign or provided no Mythos-specific uplift.
- The only evidence the government has provided is verbal, describing a narrow, non-universal jailbreak that essentially consists of asking the model to read a specific codebase and fix any software flaws.
- Anthropic reviewed a report it believes is the basis of the directive and validated that the capability level shown is widely available from other models, including OpenAI’s GPT-5.5, and is used every day by cyber defenders.
- Anthropic is complying with the legal directive while explicitly disagreeing that a narrow potential jailbreak justifies recalling a commercial model deployed to hundreds of millions of people.
- The company warns that if this recall standard were applied across the industry, it would essentially halt all new model deployments for every frontier model provider.
- Anthropic supports government power to block unsafe deployments in principle, but only through a statutory process that is transparent, fair, clear, and grounded in technical facts, and says this action meets none of those principles.
- Anthropic apologized to customers, called the situation a misunderstanding, said it is working to restore access as soon as possible, and promised more details within 24 hours.

Detailed Summary

What the directive actually does

The order arrived as a letter from the US government at 5:21pm ET on June 12, 2026, invoking national security authorities under export control law. On paper it suspends access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, a category that includes some of Anthropic’s own employees. In practice, Anthropic says compliance requires abruptly disabling both models for every customer, since there is no clean way to enforce a nationality-based access boundary across a global product. The letter did not spell out the specific national security concern. Everything else in Anthropic’s statement is the company’s own reconstruction of what prompted the action.

The jailbreak at the center of the dispute

Anthropic’s understanding is that the government became aware of a method for bypassing Fable 5’s safeguards. The company reviewed a demonstration of the technique and characterizes the results as a small number of previously known, minor vulnerabilities, all relatively simple, all discoverable by other publicly available models without any jailbreak at all. According to Anthropic, the government’s evidence so far has been entirely verbal, and the technique boils down to asking the model to read a specific codebase and fix any software flaws. The company reviewed a report it believes underlies the directive and validated that the displayed capability is widely available elsewhere, naming OpenAI’s GPT-5.5 directly, and noted that this exact kind of analysis is what defenders use to keep systems safe.

Anthropic’s defense in depth posture

The statement restates the safety posture Anthropic laid out at Fable 5’s launch. The safeguards around cybersecurity tasks are strong enough that users have complained they are overly broad. In the weeks before launch, the US government, the UK AISI, multiple private third-party organizations, and internal teams red-teamed the safeguards for thousands of hours combined, and those tests showed Fable’s protections to be substantially more effective than any previously deployed model. No tester found a universal jailbreak. Anthropic is candid that perfect jailbreak resistance is likely impossible for anyone today, which is why the strategy is defense in depth: keep jailbreaks narrow or expensive, monitor aggressively, and shut down attacks fast. The 30-day customer data retention requirement on Fable exists to support that monitoring and mitigation loop. The company says this posture makes Fable’s risks comparable to models already deployed across the industry.

Complying while disputing the standard

Anthropic is removing access for all users as legally required, but the statement draws a hard line on the principle. The company disagrees that a narrow potential jailbreak, one that produced no disclosed harmful result, justifies recalling a commercial model serving hundreds of millions of people. Its broader warning is that this standard, applied evenly, would halt all new frontier model deployments industry-wide, since every provider’s safeguards are vulnerable to narrow jailbreaks. Anthropic also turns its own policy position into a critique: the company has publicly supported giving government the ability to block unsafe deployments, but through a statutory process that is transparent, fair, clear, and grounded in technical facts, and it says this action does not adhere to those principles.

What happens next

Anthropic closed by apologizing to customers, calling the situation a misunderstanding, and committing to restore access as soon as possible. The company promised to share more details over the next 24 hours, which makes this a developing story. The open questions are whether the government substantiates its concern with written technical evidence, whether the directive survives that scrutiny, and whether this episode accelerates the formal statutory process for AI deployment decisions that Anthropic says should have governed the action in the first place.

Notable Quotes

“The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance.”
Anthropic, on why a directive aimed at foreign nationals becomes a global shutdown

“We received the directive from the government today at 5:21pm (ET). The letter did not provide specific details of its national security concern.”
Anthropic, on the abruptness and opacity of the order

“These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass.”
Anthropic, on its review of the demonstrated jailbreak technique

“We suspect that perfect jailbreak resistance is not currently possible for any model provider.”
Anthropic, restating the position it disclosed at Fable 5’s launch

“We stand by this defense in depth strategy. It reduces the risks posed by Fable, making them comparable to the risks of existing models already deployed across the industry.”
Anthropic, defending its layered safeguards approach

“To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws.”
Anthropic, describing the technique behind the directive

“However, we disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people.”
Anthropic, on complying while contesting the decision

“If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.”
Anthropic, on the industry-wide implications of the recall standard

“As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles.”
Anthropic, on the kind of oversight process it says should have governed the action

“We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible.”
Anthropic, closing its statement to customers

Read the full statement on Anthropic’s site here.

Related Reading
- Anthropic’s Claude Fable 5 and Mythos 5 launch announcement: the original deployment post that laid out the safeguards posture now at the center of the dispute.
- US Bureau of Industry and Security: the agency that administers US export controls, the kind of authority a directive like this one invokes.
- Export control (Wikipedia): background on how export control law works and why it can reach foreign nationals inside the United States.
- Prompt injection and jailbreaking (Wikipedia): primer on the techniques used to bypass language model safeguards.
- UK AI Security Institute: one of the third-party organizations that red-teamed Fable 5’s safeguards before launch.

June 13, 2026

Dan Shipper’s Most Contrarian AI Predictions for 2026: Why the Job Apocalypse Is a Myth, SaaS Will Boom, PMs and Designers Win, and CLIs Are Already Over
Dan Shipper, the CEO and founder of Every, returned to Lenny’s Podcast for round two of AI predictions. His last appearance produced one of the most prescient calls of the year: that non-technical people would build serious work inside Claude Code. It proved remarkably accurate. This conversation is the follow-up, a tour of his most contrarian forecasts for how AI is actually changing the way we work, who wins, who loses, and what almost every commentator is getting wrong about the next twelve to twenty-four months.

TLDW

Shipper argues that the AI job apocalypse is a myth, that SaaS is going to boom rather than die, and that product managers and full-stack designers are the biggest winners of the agent era. He predicts that personal agents inside Codex and Claude Code will quietly replace the browser as the primary work surface, that every company will run a single shared super-agent in Slack instead of a fleet of per-user bots, and that the CLI moment is already over. In his telling, pull requests are going to flood organizations from non-technical staff, and forward-deployed engineers who garden company agents become the new senior role. He also holds that GPT-5.5 still cannot match a real senior engineer on architectural judgment, that AI-generated internal writing is fine and probably better than what most humans produce, and that CEOs and middle managers have not adapted yet but soon will be forced to. His closing argument is that the edge of AI lives wherever a curious human is using it rather than in San Francisco, and that the only durable strategy is to ride the models and keep playing with whatever ships next. The whole conversation balances aggressive AI bullishness with an equally strong bet on humans, on creativity, and on the unavoidable need for someone to care for every agent that gets deployed.

Thoughts

The most useful frame Shipper gives is that models commoditize yesterday’s human competence. Every time a frontier model crosses a new bar, the work that used to define seniority becomes cheap. The senior engineer who could carry a refactor in their head, the PM who could write a coherent strategy doc, the designer who could ship a polished landing page in a week. That competence is now frozen, codified, and available on tap. The interesting question is not whether models will keep eating tasks. They will. The interesting question is what humans do with the suddenly cheap raw material underneath them. Shipper’s answer is that humans climb the stack: they go up a level, find a new problem worth framing, and use the commoditized competence as feedstock for something that did not exist before. That treadmill is the actual engine of value creation, and it is why he can be simultaneously AI pilled and bullish on hiring.

His SaaS take is the spiciest call of the episode and probably the most defensible. The crowd consensus is that agents will gut SaaS because an AI can just write the form filler, the dashboard, the workflow. Shipper points out the obvious counterfactual: agents do not reduce the number of people using SaaS, they increase it. A marketing lead who could never touch the data warehouse can now stand up a PostHog query through Codex. A founder who never opened Vanta can run a SOC 2 prep through an agent. The result is more users, more accounts, and a much fatter top of funnel for every horizontal tool. The second-order effect is even more interesting. When the SaaS tool runs inside the user’s agent, the user supplies the tokens. Vendor margins improve rather than collapse. If he is right, the next two years are going to be brutal for the SaaS-is-dead thesis and very good for public software multiples.

The PM and designer bet is where this gets personal for anyone in product. For a decade the bottleneck in shipping anything was engineering capacity. A PM with spiky product sense had to negotiate their vision through a roadmap, a sprint, a review, and a release. Designers had to convince an engineer that the third state of the empty screen was actually worth building. Both of those constraints are dissolving fast. A PM who can prompt Codex into a working prototype on Friday afternoon, then iterate it live in front of a customer on Monday, is doing the job of a small team. A designer who can ship a fully functional landing page in their own style, without negotiating with anyone, is suddenly the most leveraged person in the company. The scarce skill is no longer execution. It is taste, judgment, and the willingness to decide what is worth building. That has always been the real PM and design job. AI just stripped away the parts that were not.

The quietest but most important prediction is that agents need humans, permanently. Every benchmark advance reveals a new layer of judgment the model cannot frame on its own. When the agent finishes the task, there is always a senior human who sees the deeper problem the model patched over. Shipper calls this gardening, and it is the basis for the new forward-deployed engineer role. The companies winning right now are the ones that put a real person next to every agent, watching what it does, course-correcting in Slack, and noticing when the output drifts. The dream of autonomous AI workflows is a stage in a journey, not the destination. The destination looks more like a thoughtful operator with a small cluster of agents they trust and constantly tend. That is a much more humane future than the discourse suggests, and it is the one Every is already living.

The final advice, ride the models, sounds glib but is the single most actionable line in the episode. Most professional anxiety about AI dissolves the moment you actually use the newest model on real work. Most professional advantage accrues to the people who do that one thing consistently. The edge does not live in San Francisco where the labs build the things. It lives wherever a curious human meets a real workflow and discovers something the labs have not noticed. A PM in Iowa willing to try Codex on a Tuesday night can be further ahead than a research engineer who has only used the model on its evals. Pair that with Shipper’s closing motto, do things worth writing about and write things worth reading, and you have a pretty complete operating system for the next two years.

Key Takeaways
- The AI job apocalypse narrative is wrong. Models commoditize yesterday’s competence, then humans climb the stack and find new work to do with the cheap raw material.
- Every has roughly doubled headcount in the last year despite being one of the most AI-forward companies in the world. The lived data point cuts directly against the doom thesis.
- Shipper’s dual stance: simultaneously extremely AI pilled and very bullish on humans. He treats this as the only intellectually honest position right now.
- Work will bifurcate. Companies will run one shared super-agent in Slack for everyone, and individuals will run their own personal agent inside Codex or Claude Code on their machine.
- The personal agent inside Codex effectively becomes the new operating system. Instead of putting AI in the browser, you put a browser inside the AI.
- The super-agent pattern is already real: Shopify has River, Ramp has its own, and Every runs Claudie inside Slack for internal consulting.
- SaaS is not dying. Agents increase the user base of SaaS tools because non-technical people can finally drive them. Shipper would buy SaaS stocks today.
- When SaaS runs inside an agent, the user brings their own tokens. Vendor margins improve because they no longer eat inference costs on every interaction.
- The CLI era is already over. The magic was never the terminal. It was the AI plus the ability to see what the agent is doing. A good GUI captures the same benefits and more.
- Pull requests are about to flood every company. Non-engineers can now ship code, run queries, and open tickets. Reviewing the output becomes the new bottleneck.
- Open-source maintainers are already living in the future. Some receive thousands of agent-generated PRs per day and spin up thousands of Codex instances just to triage them.
- Forward-deployed engineers are the new senior role. They live in Slack, garden the company’s agents, fix broken flows, and keep non-technical staff from doing damage.
- Product managers with spiky product sense plus a little Codex fluency become extremely dangerous. Marcus at Every, formerly a PM at Axios, is the archetype.
- Full-stack designers are the other big winner. They can build distinctive interfaces end to end without negotiating with engineering. The bottleneck on taste-driven product work disappears.
- Designer hiring data has not yet caught up to the prediction. Shipper notes this and says check back in a year.
- Sales is the role least changed so far. Top of funnel research has been turbocharged by agents, but the actual relationship and closing work remains human.
- AI-generated internal writing is going mainstream and that is a good thing. Most humans are bad at strategy docs, quarterly plans, and PRs. AI drafts a coherent first pass that a human can refine.
- Shipper says most of his email is now written by GPT-5.5 and Codex. He would honestly prefer the signature to say so.
- Public writing, newsletters, and published essays still demand a human voice. Internal communication does not.
- CEOs and middle managers have largely not adapted yet because their staff still does the work. That window is closing fast and will become an obvious career liability.
- Your company will only go as far as your CEO goes in AI. The leadership ceiling becomes the AI ceiling.
- Shipper’s senior engineer benchmark scores GPT-5.5 at roughly 62 out of 100. Real senior engineers sit at 85 to 90. Progress is real, but the gap on architectural judgment remains.
- Models tend to patch problems locally instead of rewriting from first principles. A senior human still sees the deeper rework that the model avoids.
- Every uses Notion-based agents to draft quarterly plans. The human edits, approves, and stands behind the output.
- The hard rule on AI-generated communication: you have to read it and stand behind it before sending it. Pasting unread output is the only true no-no.
- Every agent needs a human. Automation is a lie in the strong sense. The story of automation is the story of new and different humans being needed alongside it.
- The reach test, organic daily usage, is the real signal that an AI product works. Benchmark scores are noisy. Daily reach is not.
- Cursor’s SpaceX acquisition is a tell. Harnesses around models, not the models themselves, are where the strategic value is concentrating.
- The edge of AI is not in San Francisco. It is wherever a real human meets a real workflow and discovers something the labs have not noticed yet.
- A PM in Iowa willing to ride the models can be further ahead than a researcher in SF who only uses them on internal evals.
- Ride the models. Use them for whatever you do. Try every new release the day it ships. That single behavior compounds faster than any other AI career strategy.
- Shipper got bursitis, which he calls vibe coder elbow, from too much rapid agent-assisted coding while debugging his markdown editor Proof.
- The closing motto for the year: do things worth writing about and write things worth reading.
- Lenny will re-interview Shipper in roughly May 2027 to score the predictions.

Detailed Summary

Why The AI Job Apocalypse Is The Wrong Frame

Shipper opens with the headline contrarian call. Benchmarks keep climbing. Models can now sustain seventeen-hour autonomous tasks at fifty percent accuracy. The pace is real and accelerating. None of that translates cleanly into mass unemployment. His mechanism: models codify yesterday’s human competence and make it cheap. The act of compressing past expertise into an API call is genuinely deflationary for the work it captures, but it is also raw material for the next layer of human work. He uses Every as his own data point. The company has roughly doubled its headcount in the past year despite being one of the most AI-forward outfits in media. Hiring goes up because agents create new categories of work that need humans, not because the agents fail. The discourse, he argues, is stuck modeling AI as substitution. The reality looks much more like leverage.

The Bifurcation: Super-Agents And Personal Agents

Work splits into two surfaces. The first is the shared super-agent that lives in Slack and serves the whole company. Shopify has River. Ramp has its own. Every has Claudie. Each is a single, trusted, gardened agent that anyone in the company can talk to. The pattern has converged on one shared agent rather than one agent per person because agents need human attention to stay useful, and a single shared instance pools the gardening cost. The second surface is the personal agent inside Codex or Claude Code that runs on your machine and reaches into your local environment, your editor, your files, and through an embedded browser into the web. Shipper calls this the new operating system. Instead of the old paradigm of putting AI inside the browser, you put the browser inside the AI. The agent sees what you see, follows what you do, and works on your stuff in your context.

The SaaS Bet: Up, Not Down

The SaaS-is-dead thesis was the consensus call of late 2025. Shipper takes the other side and would buy software stocks now. Three arguments. First, agents make SaaS accessible to people who never could have used it directly. The total addressable user base inside every company goes up. Second, the business model improves when the user runs the SaaS through their own agent, because the user supplies the tokens. Vendors stop subsidizing inference. Third, SaaS spend in his observable universe is up, not down, and is concentrating on the tools that play well with agents. He frames the prediction as a sound bite for the cycle: buy SaaS stocks, the apocalypse is dumb.

The CLI Era Is Already Over

For a moment in early 2026 it looked like everyone was migrating to the terminal because Claude Code was a CLI. Shipper says the moment is finished. The actual leverage was never the terminal. It was the model plus the ability to watch and steer an agent live. A great GUI captures every advantage of the CLI without the friction. His own engineering team at Every has mostly moved off the CLI as their primary surface and onto Codex desktop. He frames it bluntly: we speed ran the CLI era, it was nice, and now we are done. Tooling for the next two years will be visual, multi-pane, multi-agent, and built around the human watching the work unfold.

The Pull Request Flood And The Rise Of Forward-Deployed Engineers

Once non-engineers can ship code, run queries, and file changes through agents, the volume of incoming work explodes. Open-source maintainers already report receiving thousands of agent-generated pull requests per day. Inside companies, the same thing happens to data teams, ops teams, and any function that owns a review gate. The bottleneck shifts from creation to evaluation. The job that emerges to absorb the flood is the forward-deployed engineer. This is a senior person who lives in Slack with the company’s agents, fixes their context, sharpens their instructions, and prevents non-technical colleagues from making well-meaning but incoherent changes. Nitesh at Every is the example Shipper returns to. The model is the same one the labs use internally: pair every important agent with a real engineer who gardens it.

PMs And Full-Stack Designers Win The Decade

The two roles Shipper is most bullish on are product manager and full-stack designer. For PMs, the entire job of coordinating a team to translate vision into code collapses into a Codex session. A PM with strong product instincts and a little technical literacy can now prototype, iterate, and even ship. The example is Marcus, formerly a PM at Axios, who took a year to fully internalize AI and now ships faster than most engineers. For designers, the model is similar. The Friday-night-side-project designer who used to be stuck explaining a vision can now build the vision themselves, with their own taste fully expressed. The scarce skill in both cases is the same: judgment about what to build and the courage to decide it is good. Execution capacity is no longer the constraint.

The Senior Engineer Benchmark And What Models Still Miss

Shipper has built his own benchmark to test whether coding models can actually do senior engineering work. GPT-5.5 scores around 62 out of 100. Real senior engineers sit closer to 85 or 90. The gap is not in syntax or test pass rates. It is in the willingness to step back, see that a piece of code is fundamentally the wrong shape, and rewrite it from first principles. Models almost universally patch locally. They take the instruction at face value, accept the existing code as a constraint, and optimize within it. A real senior engineer ignores the prompt when the prompt is wrong. This is the durable moat for senior technical judgment, and Shipper expects it to remain visible for at least another year of model releases.

AI-Generated Writing Goes Mainstream

Writing inside companies is quietly becoming AI-first, and Shipper thinks it should: quarterly plans, status updates, PR descriptions, strategy memos, recruiting outreach, and most internal email. He runs his own inbox through GPT-5.5 and Codex and says he would honestly prefer that the recipient knew. The point is not that AI is a better writer in some absolute sense. The point is that most humans are not very good at these specific genres, and the model produces a coherent, structurally sound first draft that a human can guide and approve. The constraint is honesty: you read it, you understand it, you stand behind it. Public writing, like the newsletters Every publishes, still demands a human voice. Internal communication does not, and treating it as if it did is a tax on the organization.

The CEO And Middle Manager Lag

Shipper points to a population that has largely escaped AI adoption: senior leaders and middle managers. They have staff to do the work, so they have not been forced to pick up the tools personally. He thinks this is the single largest pocket of latent disruption coming in the next year. Your company will only go as far as your CEO goes in AI, because every decision about where to deploy agents, where to hire, and how to restructure work flows downstream from leadership taste. A leader who has not personally lived inside Codex or Claude Code for a few weeks cannot make those calls well. Expect this to flip fast and to become a visible career liability for executives who do not adapt.

Ride The Models

The closing advice is the simplest. Ride the models. Use AI for whatever you actually do. Try every new release the day it lands. Most of the professional anxiety around AI dissolves on contact with the work, and most of the durable advantage in the field belongs to the people who do this one thing consistently. Shipper notes that the edge of AI does not live in San Francisco. It lives wherever a curious operator meets a real workflow and notices something nobody at the labs has yet. A PM in Iowa willing to spend a Tuesday night exploring Codex can find capabilities researchers have not surfaced. Pair that with his motto, do things worth writing about and write things worth reading, and you have most of an operating system for the next two years.

Notable Quotes

“The AI job apocalypse is not really a thing. I am super super bullish on PMs and full-stack designers.”
Dan Shipper, opening his contrarian thesis for the conversation

“I’m simultaneously extremely AI pilled and very bullish on humans. Automation is a lie. Every agent needs a human.”
Dan Shipper, on holding both sides of the AI debate at once

“What models do in general is they make yesterday’s human competence cheap. And so, it becomes commoditized. It’s not valuable anymore. What humans do is we go in there and we’re like, yeah, we have all this frozen human competence from yesterday, how do I use this to make something new and interesting.”
Dan Shipper, articulating the core engine behind his anti-apocalypse thesis

“I would buy SaaS stocks right now. The SaaS apocalypse is dumb. What agents do is increase the number of users of SaaS, not get rid of it.”
Dan Shipper, calling the consensus SaaS-is-dead thesis directly wrong

“We speed ran the CLI era. It was nice while it lasted, but I think CLIs are over.”
Dan Shipper, on why the terminal-first agent moment is already done

“Most of my email is written by GPT-5.5 and Codex right now. And I honestly would prefer it to say that it’s coming from GPT-5.5.”
Dan Shipper, on the new etiquette of AI-assisted communication

“The edge of AI is not in San Francisco. The edge of AI is wherever AI meets a real human doing something.”
Dan Shipper, on where the actual frontier of the field lives

“The only thing you need to do is ride the models. And that means use them for whatever it is that you do.”
Dan Shipper, distilling his career advice for the next two years

“Do things worth writing about and write things worth reading.”
Dan Shipper’s closing motto, lifted from his own operating system at Every

Watch the full conversation with Dan Shipper on Lenny’s Podcast here. The re-interview to score these predictions is scheduled for roughly May 2027.

Related Reading
- Every. Dan Shipper’s company and the live laboratory for almost every prediction in this conversation, including Spiral, Cora, and Claudie.
- The Allocation Economy by Dan Shipper. The earlier essay that frames humans as managers of AI labor and underpins much of the gardening-the-agent thesis here.
- Claude Code by Anthropic. The agent surface Shipper called correctly last year and one of the two environments he predicts will become the new operating system for work.
- Codex by OpenAI. Shipper’s current daily driver and the visual, multi-pane agent environment he uses for almost everything from coding to email.
- The Writing Life by Annie Dillard. The book Shipper makes every Every employee read, and the source of the company’s stance on writing as a tool for noticing the future.
May 25, 2026
Anthropic’s Growth Strategy Explained: $1B to $19B in 14 Months, Automating Experiments With Claude, and Why Old Playbooks Are Dead (Lenny’s Podcast Recap)

TLDW (Too Long, Didn’t Watch)

Amol Avasare, Head of Growth at Anthropic, sat down with Lenny Rachitsky to explain how Anthropic grew from $1 billion to over $19 billion in annual recurring revenue in just 14 months. He breaks down their internal tool called CASH (Claude Accelerates Sustainable Hypergrowth) that automates growth experimentation, why 50 to 70 percent of traditional growth playbooks are now obsolete, why the PM-to-engineer ratio may need to flip, and how Anthropic’s early bet on AI coding created a research flywheel that competitors are only now starting to copy. He also shares how he cold emailed his way into the job, why activation is the single hardest problem in AI products, and how he uses Cowork to detect team misalignment across Slack channels automatically.

Key Takeaways

1. Anthropic’s growth trajectory is historically unprecedented. Revenue went from $1 billion at the start of 2025 to over $19 billion ARR by February 2026. That 19x growth in 14 months dwarfs companies like Atlassian, Snowflake, and Palantir, which took 15 to 20 years to reach $4.5 to $6 billion ARR. The number Amol quoted was already outdated by the time the episode aired.

2. Anthropic is automating growth experimentation with an internal tool called CASH. CASH stands for Claude Accelerates Sustainable Hypergrowth. The growth platform team uses Claude to identify opportunities, build experiments (mostly copy changes and minor UI tweaks so far), test them against quality and brand standards, and analyze results. Amol describes the current win rate as roughly equivalent to a junior PM with two to three years of experience, but notes it was not possible at all before Opus 4.5 and has improved significantly with Opus 4.6. Human review is still in the loop but decreasing week over week.

3. Activation is the single highest-leverage growth problem in AI. The core challenge is capability overhang: models are improving so fast that users do not know what they can do. By the time you have tested and optimized onboarding for one model’s capabilities, the next model has already shipped with entirely new features that make your learnings obsolete. Anthropic addresses this by adding intentional friction in onboarding to understand who users are and funnel them to the right products and features.

4. Anthropic indexes 70/30 toward big bets, the opposite of most growth teams. Traditional growth teams spend 60 to 70 percent of effort on small to medium optimizations. Anthropic flips that ratio because they believe the product value delivered two years from now will be 100x to 1,000x what it is today. In that exponential environment, micro-optimizations capture a negligible percentage of future value. Large strategic bets are where the leverage lives.

5. The PM-to-engineer ratio may need to flip. Engineers are getting 2 to 3x more productive with tools like Claude Code, effectively turning a team of 5 engineers into the equivalent of 15 to 20. But PMs and designers have not seen the same multiplier. The result is that product management and design are “absolutely squeezed.” Anthropic is responding by hiring more PMs and deputizing product-minded engineers to act as mini-PMs on projects under two weeks. The counterintuitive insight: companies may need more PMs, not fewer, as AI accelerates engineering output.

6. Cold emailing still works if you do it right. Amol got his job by cold emailing Mike Krieger, Anthropic’s Chief Product Officer (and co-founder of Instagram), at a time when no growth role was even listed. Key tactics: use a high-converting subject line you have tested over time, find personal email addresses instead of competing in crowded LinkedIn inboxes, keep the message extremely short, and follow up relentlessly until someone explicitly asks you to stop.

7. PRDs are largely obsolete at Anthropic. Amol estimates that 60 to 80 percent of what his team ships does not have a formal PRD. For small projects, coordination happens entirely in Slack. For larger initiatives, he will sometimes throw his thoughts into Cowork five minutes before a kickoff meeting to generate a rough document. His default philosophy: if you can skip the doc and jump straight to prototyping or action, do it.

8. The AI coding bet created a research flywheel. Anthropic’s deep focus on coding was not just a commercial play. A document written by co-founder Ben Mann in 2021, just months after the company was founded, laid out the case for focusing on AI coding because better coding models would accelerate their own researchers, which would produce better models, which would produce better coding tools, in a compounding loop. This is something competitors are only now starting to recognize and copy.

9. Cowork is being used to detect organizational misalignment. Amol runs a weekly scheduled task in Cowork that uses the Slack MCP to scan conversations across the company and surface areas of potential misalignment. He describes cases where this caught teams about to do overlapping work or spin their wheels on conflicting priorities. He also uses Cowork to simulate coaching sessions with his manager, Ami Vora, by asking Claude to analyze her public writing and internal Slack activity and then deliver feedback from her perspective.

10. Anthropic’s culture is its most defensible moat. Amol describes a culture where every single person is fully engaged, nobody is checked out, and there is radical transparency through “notebook channels” on Slack where anyone, including leadership, shares their thinking publicly. Employees openly challenge Dario Amodei in these channels after all-hands meetings. These notebook channels also serve a practical purpose: they become training data that helps Claude understand how different teams think and operate.

Detailed Summary

The Cold Email That Started It All

Amol Avasare was not recruited through a job listing, a referral, or a sourcing pipeline. He cold emailed Mike Krieger, Anthropic’s CPO and the co-founder of Instagram, with a short pitch: he loved the product, thought Anthropic badly needed a growth team, and wanted to talk. At the time, Anthropic had no growth roles posted. They were just beginning to think about it internally, and the timing was perfect.

Amol’s approach to cold email is methodical. He has a subject line formula he has refined over years of founder outreach that produces abnormally high open rates (he declined to share the exact copy). He targets personal email addresses rather than work inboxes or LinkedIn, where competition for attention is fierce. The message itself is brutally short: who he is, why he would be a fit, and a request to chat. His follow-up philosophy is to keep reaching out until someone tells him to stop. Krieger responded on the first attempt.

What $1B to $19B in 14 Months Actually Feels Like

From the inside, Anthropic’s growth does not feel like a victory lap. Amol describes it as the hardest job he has ever had, harder than being a founder and harder than investment banking. About 70 percent of his time goes to what the team calls “success disasters,” which are problems created by things going extremely well. All the charts are green and up and to the right, but the underlying infrastructure, processes, and systems are constantly breaking under the strain of hypergrowth.

The revenue trajectory tells the story: $0 to $100 million in 2023, $100 million to $1 billion in 2024, $1 billion to roughly $10 billion in 2025, and already $19 billion ARR by the end of February 2026. Amol notes that at the end of 2024, Dario Amodei was pushing for growth targets that the team thought were impossible. Those targets were hit and exceeded. The internal culture has adapted accordingly. Linear charts are considered uncool. Everything is presented on a log-linear scale.
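To make the scale of that trajectory concrete, here is a minimal sketch of the arithmetic, using only the figures quoted above. It assumes smooth, constant month-over-month compounding, which is a simplification and not something the interview claims. The function names are illustrative.

```python
import math

def implied_monthly_growth(start: float, end: float, months: int) -> float:
    """Constant month-over-month rate that compounds `start` into `end`."""
    return (end / start) ** (1 / months) - 1

def doubling_time_months(monthly_rate: float) -> float:
    """Months needed to double at a constant compounding monthly rate."""
    return math.log(2) / math.log(1 + monthly_rate)

# Figures quoted in the recap, in billions of dollars of ARR.
rate = implied_monthly_growth(start=1.0, end=19.0, months=14)
doubling = doubling_time_months(rate)
print(f"Implied growth: {rate:.1%} per month")
print(f"Implied doubling time: {doubling:.1f} months")

# Year-end milestones from the recap (the 2025 figure is "roughly" $10B).
# A 10x step each year is a constant step of 1.0 on a log10 axis,
# which is why a log scale turns this curve into a straight line.
year_end_arr = {2023: 0.1, 2024: 1.0, 2025: 10.0}
log_steps = [round(math.log10(v), 6) for v in year_end_arr.values()]
print(log_steps)
```

Under that assumption, 19x in 14 months works out to roughly 23 percent growth per month, or a doubling about every 3.3 months. The evenly spaced log10 values also show why a log scale suits a business that has grown about 10x a year.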

Why Activation Is the Hardest Problem in AI

The central growth challenge for AI products is not acquisition. It is activation: getting users to understand what the product can actually do for them. Amol frames this as a capability overhang problem. Models are improving so rapidly that even internal teams struggle to keep up with what is newly possible. If Anthropic employees have to carve out dedicated time to explore a new model’s capabilities, the average user is even further behind.

The danger is that someone signs up for Claude, asks it about the weather, and walks away thinking that is all it does. The product development cycle for onboarding is also under strain: by the time you have run tests, gathered learnings, and shipped an optimized activation flow for one model generation, the next model has shipped with capabilities that make your work irrelevant.

Anthropic’s approach borrows from Amol’s experience at Mercury and MasterClass. They add deliberate friction to the signup flow, asking users questions about who they are and what they want to accomplish. This allows them to route users to the right products and features. The data also feeds downstream into lifecycle marketing and ad targeting. Amol has seen this pattern work consistently across every company he has worked at: the right friction, applied at the right time, outperforms frictionless flows that dump users into a blank canvas with no guidance.

The CASH System: Automating Growth Experimentation

Anthropic’s growth platform team, led by Alexey Komissarouk (who teaches growth engineering at Reforge), has built an internal system called CASH. The name stands for Claude Accelerates Sustainable Hypergrowth.

CASH operates on a four-stage loop. First, Claude identifies growth opportunities by analyzing trends, metrics, and past experiment results. Second, Claude builds the actual feature or change. Third, Claude tests the output against quality and brand standards. Fourth, Claude analyzes the results and gathers learnings after the experiment ships.

Currently, CASH handles mostly copy changes and minor UI tweaks. The win rate is comparable to a junior PM with two to three years of experience. A senior PM would still do better. But the trajectory matters: this was not possible at all before Opus 4.5 launched, and results have improved meaningfully with Opus 4.6. Human approval is still required before shipping, but the amount of human time spent reviewing is decreasing week over week.
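The four stages and the human approval gate described above can be sketched as a simple pipeline. This is a hypothetical illustration only, not Anthropic's implementation: every name here (`Experiment`, `run_loop`, the metric names, the banned-word check) is an assumption, and each stage is a trivial stand-in for work the interview says Claude performs.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Experiment:
    idea: str
    variant: str = ""
    passed_checks: bool = False
    approved: bool = False
    learnings: list[str] = field(default_factory=list)

def identify(metrics: dict[str, float]) -> Experiment:
    """Stage 1 stand-in: target the weakest metric."""
    weakest = min(metrics, key=lambda name: metrics[name])
    return Experiment(idea=f"improve {weakest}")

def build(exp: Experiment) -> None:
    """Stage 2 stand-in: draft a copy change for the idea."""
    exp.variant = f"new headline to {exp.idea}"

def check(exp: Experiment, banned_words: set[str]) -> None:
    """Stage 3 stand-in: a toy brand check that rejects banned words."""
    words = set(exp.variant.lower().split())
    exp.passed_checks = words.isdisjoint(banned_words)

def analyze(exp: Experiment, lift: float) -> None:
    """Stage 4 stand-in: record what the shipped experiment taught us."""
    verdict = "win" if lift > 0 else "no win"
    exp.learnings.append(f"{exp.idea}: {verdict} (lift {lift:+.1%})")

def run_loop(
    metrics: dict[str, float],
    banned_words: set[str],
    human_approves: Callable[[Experiment], bool],
    measure_lift: Callable[[Experiment], float],
) -> Experiment:
    exp = identify(metrics)
    build(exp)
    check(exp, banned_words)
    if not exp.passed_checks:
        return exp  # failed quality or brand checks, never reaches a human
    exp.approved = human_approves(exp)
    if not exp.approved:
        return exp  # human gate: nothing ships without approval
    analyze(exp, measure_lift(exp))
    return exp

# Hypothetical metrics and a reviewer who approves everything.
result = run_loop(
    metrics={"signup_to_first_prompt": 0.42, "week_one_retention": 0.61},
    banned_words={"guaranteed"},
    human_approves=lambda exp: True,
    measure_lift=lambda exp: 0.03,
)
print(result.learnings)
```

The design point worth noticing is the order of the gates: automated checks run before the human sees anything, so reviewer time is spent only on candidates that already meet the bar. That is one plausible way review time could shrink week over week.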

The part that Claude still cannot handle well is cross-functional stakeholder management. Getting six people in a room to align on a decision remains a fundamentally human problem. As Amol’s head of design joked: “We will have AGI and it will still be impossible to get six people in a room to get aligned.”

Why the PM-to-Engineer Ratio Might Flip

This is one of the most counterintuitive insights from the conversation. The conventional assumption is that AI will reduce the need for PMs. Amol argues the opposite: companies may need more PMs, at least in the near term.

The math is straightforward. Tools like Claude Code are making engineers 2 to 3x more productive. By Amol’s estimate, a team of 5 engineers now produces the output of 15 to 20 engineers in the pre-AI era. PMs and designers have seen productivity gains, but not at the same multiplier. The result is a bottleneck: one PM managing the equivalent of 15 to 20 engineers’ worth of output, while also handling cross-functional coordination, stakeholder alignment, and strategic direction.
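As a quick check on those figures, here is a small sketch that assumes the productivity multiplier applies uniformly to every engineer on the team. The helper names are illustrative.

```python
def effective_engineers(team_size: int, multiplier: float) -> float:
    """Pre-AI-equivalent headcount for a team at a given productivity multiplier."""
    return team_size * multiplier

def implied_multiplier(team_size: int, effective: float) -> float:
    """Multiplier needed for a team to match a given pre-AI-equivalent headcount."""
    return effective / team_size

team = 5
low = effective_engineers(team, 2)
high = effective_engineers(team, 3)
print(low, high)

# Multiplier implied by the 15 to 20 engineer equivalence quoted above.
implied_low = implied_multiplier(team, 15)
implied_high = implied_multiplier(team, 20)
print(implied_low, implied_high)
```

Taken literally, 2 to 3x on a team of 5 gives 10 to 15 engineer-equivalents, while the 15 to 20 figure corresponds to a 3 to 4x multiplier. The quoted numbers are best read as rough estimates. The squeeze on a single PM holds under either reading.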

Anthropic’s response is twofold. First, they are actively hiring more PMs. Second, they have formalized a system where product-minded engineers act as mini-PMs on any project that is two engineering weeks or less. The engineer handles everything: talking to legal, talking to security, managing stakeholders. The PM only steps in if things go badly off track.

For larger projects, the PM remains squarely accountable. But the key insight is about leverage: if you are one PM managing 20 engineers, the highest-value use of your time is not shipping the 21st feature yourself. It is getting 5 percent better at guiding the team on what the right opportunities are and upleveling every engineer’s product thinking.

The Coding Flywheel That Changed Everything

Anthropic’s deep bet on coding was not obvious at the time. A document from co-founder Ben Mann, dated 2021, laid out the strategic logic just months after the company was founded. The argument was that investing heavily in AI coding would create a compounding flywheel: better coding models would help Anthropic’s own researchers write code more effectively, which would accelerate model development, which would produce even better coding tools.

This early focus gave Anthropic a structural advantage that competitors are only now trying to replicate. It also explains why the company went so deep on B2B and enterprise use cases rather than chasing consumer attention. The commercial opportunity of coding was large on its own, but the internal research acceleration made it doubly strategic.

Amol notes that this focus was partly born from constraint. Anthropic was the smallest, least well-funded player in the space for years. They did not have Meta’s distribution or Google’s cash flow or OpenAI’s first-mover advantage. That constraint forced extreme focus, which is a principle Amol applies broadly. He calls it “freedom through constraints.”

How Amol Uses AI to Manage His Day

Amol’s personal AI usage is extensive and worth documenting for anyone looking to see how a power user at the frontier actually operates.

Every morning, a scheduled Cowork task reviews 20 to 25 charts across Anthropic’s products and sends him a summary of what needs attention. The false positive and false negative rates are improving week over week, giving him increasing confidence in delegating this monitoring.

He uses Cowork to handle administrative tasks he hates: booking meeting rooms, first-pass email triage, filing expense reports in Brex and reimbursements in Benepass. None of this requires his attention anymore.

For management, he runs weekly Cowork tasks that review what his direct reports have done, cross-reference their work against team OKRs and meeting transcripts, and surface feedback he should give them. He also runs a parallel task for himself, asking Claude to impersonate his manager Ami Vora based on her public writing and internal Slack activity, and deliver feedback from her perspective.

Perhaps most powerfully, he runs a weekly misalignment detection task that scans Slack conversations across the company and surfaces areas where teams may be working at cross purposes. He describes cases where this caught potentially expensive coordination failures before they compounded.

Notebook Channels and the Culture Moat

Anthropic uses “notebook channels” on Slack, which function like internal Twitter feeds where employees share their thinking, priorities, and provocative ideas publicly. Everyone has one, from researchers to growth PMs to Dario Amodei himself. Employees openly disagree with leadership in these channels, and that is encouraged.

These channels serve a dual purpose. First, they help scale cultural values and operating principles as the company grows rapidly. When Amol posts about “the importance of being comfortable leaving money on the table,” every new engineer on the growth team absorbs that principle. Second, and perhaps more importantly for the long term, these channels become structured context that Claude can reference. The HR team has even documented which internal documents Claude should reference for specific topics. Amol sees this as something every company will eventually need to do: share thinking in a structured way so that the AI agents running throughout the organization have the context they need.

AI Safety as Commercial Strategy

Anthropic is incorporated as a Delaware Public Benefit Corporation (PBC) rather than a conventional corporation. This legally allows the company to weigh its stated public benefit alongside shareholder returns, rather than being bound solely to maximize shareholder value.

Amol says the company has repeatedly taken significant commercial hits for safety reasons, including delaying product launches when safety risks were identified. He also makes a striking claim: what Anthropic says publicly about AI risk is actually a softer version of what they believe internally. The internal view on the potential downsides of powerful AI is more aggressive than the public messaging.

From a growth perspective, Amol frames safety as a long-term competitive advantage. Growth teams at most companies try to squeeze every last dollar. Anthropic’s growth team is “very comfortable forgoing metric impact” to protect brand, quality, and safety. He argues this is how all the best products operate, and that as the stakes of AI get higher, Anthropic’s credible commitment to safety will become a moat.

Advice for Thriving in the AI Era

Amol’s advice for product managers and growth practitioners boils down to four points. First, stay on top of the tools. Try every new model release. Something that did not work three months ago may work now, and you will not know unless you go back and test it. Second, go deep on your unique spike rather than trying to be well-rounded. The PM who can also design is a unicorn. The engineer who thinks like a PM is a unicorn. Find your interdisciplinary edge and double down. Third, be radically adaptable. Amol estimates that 50 to 70 percent of how you operated in the past is now irrelevant. Clinging to old playbooks creates friction. Fourth, think in exponentials, not linear projections. If you are looking at the AI landscape through a linear lens, you will consistently underestimate how quickly things are moving.

Thoughts

This interview is one of the most information-dense conversations about growth strategy in AI that has been published so far. A few things stand out.

The CASH system is the most concrete example yet of a company using AI to automate its own growth loop. The fact that it currently performs at a junior PM level is almost beside the point. What matters is the trajectory: it went from impossible to functional in a few months. If models continue improving at their current pace, this system could be operating at a senior PM level within a year. Every growth team at every AI company should be building their own version of this right now.

The PM ratio insight is genuinely surprising and underreported. The default assumption in the tech industry is that AI will reduce headcount across all functions. Amol is making the case that in the near term, the opposite is true for PMs. Engineering output is exploding, and someone needs to direct all that output toward the right problems. That is a fundamentally human, organizational, political job that AI is not close to automating.

The coding flywheel story is also worth highlighting because it shows the power of strategic focus in a world of unlimited possibilities. Anthropic had a generalist technology that could do almost anything, and they deliberately narrowed their focus to one vertical. That decision, made in 2021 before anyone knew what the market would look like, is arguably the single most important strategic bet in the company’s history.

Finally, the notebook channels concept deserves more attention. The idea that employees should share their thinking in structured, searchable formats is not just a culture tool. It is an infrastructure investment for an AI-native future where agents need organizational context to be effective. Companies that build this habit early will have a significant advantage when agent-driven workflows become the norm.

The uncomfortable subtext of this entire conversation is that Anthropic’s growth team, as talented as they clearly are, is riding a wave created almost entirely by the research team. Several YouTube commenters pointed this out, and Amol himself acknowledges it directly. The models are the product. The growth team’s job is to make sure users discover and adopt what the models can do. That is not a small job, especially at this scale, but it is a fundamentally different job than driving growth at a product that does not sell itself.

April 5, 2026