PJFP.com

Pursuit of Joy, Fulfillment, and Purpose

Category: Articles

  • Subquadratic (SubQ) Explained: The First Fully Sub-Quadratic LLM with a 12M-Token Context Window, 50x Cost Reduction, and a Post-Transformer Architecture

    Subquadratic, the AI infrastructure company behind subq.ai, just emerged from stealth with a $29M seed round and a claim that should make every AI engineer pay attention: they have built the first large language model whose compute scales linearly, not quadratically, with context length. The result is SubQ, a frontier model with a 12 million token context window, roughly 50x lower cost than leading frontier models at 1M tokens, and benchmark numbers that put it ahead of Gemini 3.1 Pro, Claude Opus 4.6/4.7, and GPT-5.4/5.5 on key long-context tasks. This is a deep, opinionated breakdown of everything Subquadratic has published so far, who is behind it, why a sub-quadratic architecture matters, and what changes for developers, agents, and enterprise AI if the numbers hold up.

    TLDR

    Subquadratic is a Miami-based frontier AI lab that launched on May 5, 2026, with $29M in seed funding and a new LLM called SubQ. SubQ is the first fully sub-quadratic LLM, meaning attention compute grows linearly with context length instead of quadratically. The model offers a 12M token context window, around 150 tokens per second, roughly one-fifth the cost of leading frontier models, 95% accuracy on RULER 128K, and 92% accuracy at the full 12M tokens; the company is targeting 100M tokens by Q4 2026. Two products are launching in private beta: SubQ API (OpenAI-compatible, streaming, tool use) and SubQ Code (a CLI coding agent that plugs into Claude Code, Codex, and Cursor to load entire repositories into a single context window).

    Key Takeaways

    • SubQ is the first fully sub-quadratic LLM, with attention compute scaling at O(n) instead of the transformer’s O(n²).
    • The context window is 12 million tokens, enough to fit the entire Python 3.13 standard library (around 5.1M tokens) or roughly 1,050 React pull requests (around 7.5M tokens) in a single prompt.
    • At 12M tokens, SubQ reduces attention compute by almost 1,000x compared to other frontier models.
    • Pricing benchmarks: 95% accuracy on RULER 128K at $8 of compute, versus 94% accuracy at roughly $2,600 on Claude Opus, a 260x to 300x cost reduction.
    • Speed: about 150 tokens per second.
    • Cost: roughly one-fifth that of other leading LLMs at 1M tokens; launch coverage puts the reduction at more than 50x.
    • Two products in private beta: SubQ API (12M token window, streaming, tool use, OpenAI-compatible endpoints) and SubQ Code (one-line install CLI for coding agents, ~25% lower bills, 10x faster exploration, auto-redirects expensive model turns).
    • SubQ Code integrates with Claude Code, Codex, and Cursor, positioning Subquadratic as the long-context infrastructure layer beneath existing agent workflows rather than a competing chat product.
    • Architecture: a fully sub-quadratic sparse-attention design that learns which token relationships actually matter and skips the rest, redesigned from first principles.
    • Funding: a $29M seed round; investors include Javier Villamizar (former SoftBank Vision Fund partner) and Justin Mateen (Tinder co-founder, JAM Fund), alongside early investors in Anthropic, OpenAI, Stripe, and Brex.
    • Founders: Justin Dangel (CEO, five-time founder) and Alex Whedon (CTO, ex-Meta engineer, former Head of Generative AI at TribeAI). Research team includes PhDs from Meta, Google, Oxford, Cambridge, and BYU.
    • Headcount is in the 11 to 50 range; the company is headquartered in Miami, Florida, and is actively hiring for API engineering, developer advocacy, product design, sales, and people operations.
    • Tagline and thesis: “Efficiency is Intelligence.” The company argues that quadratic attention has been the real ceiling on AI applications, and breaking it unlocks workloads that were previously cost-prohibitive or architecturally impossible.

    Detailed Summary

    What is Subquadratic and what is SubQ?

    Subquadratic is a frontier AI research and infrastructure company. Their public homepage is intentionally minimal, with the single line “Efficiency is Intelligence.” and a contact email at [email protected]. The full product story lives on the launch demo site, where the company introduces SubQ as the first model built specifically for long-context tasks. The pitch is direct: SubQ is a sub-quadratic LLM built for 12M-token reasoning, allowing agents to work across full repositories, long histories, and persistent state without quality loss.

    Three numbers dominate the marketing copy. Context: 12M token reasoning. Speed: 150 tokens per second. Cost: one-fifth of other leading LLMs. Those three numbers, taken together, are why this launch matters. Until now, you could optimize for one of the three at a time. SubQ claims to push all three at once because the underlying architecture changed, not because the company applied better quantization or smarter caching on top of a transformer.

    The architecture: why “sub-quadratic” is the whole story

    Standard transformers, the architecture behind ChatGPT, Claude, Gemini, and almost everything else, use dense self-attention. Every token compares itself to every other token, which means compute scales as O(n²) in the context length n. Double the context, quadruple the compute. That single property is the reason context windows are usually capped at 128K tokens for open models and around 1M tokens for the most aggressive frontier offerings, and it is the reason most production AI systems lean on retrieval-augmented generation, chunking, agentic retrieval, and prompt engineering tricks to dodge the cost curve entirely.

    SubQ is built on a fully sub-quadratic sparse-attention architecture, redesigned from first principles. The argument from co-founder and CEO Justin Dangel is that LLMs waste compute by processing every possible token-to-token relationship when only a small fraction of those relationships actually matter for the task. SubQ learns to find and focus only on those relevant relationships, which is what brings the scaling behavior down from O(n²) to O(n). At 12M tokens, this design cuts attention compute by almost 1,000x compared to other frontier models. The research community has been chasing this for years through linear attention, state space models, Mamba, and various sparse attention variants. According to Subquadratic, the unsolved problem was never the idea, it was building a sub-quadratic architecture that did not sacrifice frontier-level accuracy. That is what their team spent the time on.
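
    To make the scaling claim concrete, here is a small back-of-envelope sketch in Python. The per-token budget k below is an invented illustrative constant, chosen only so the ratio at 12M tokens lands near the "almost 1,000x" figure Subquadratic quotes; nothing about SubQ's real sparsity pattern is public.

    ```python
    # Back-of-envelope comparison of dense O(n^2) attention versus a sparse
    # O(n) scheme where each token keeps roughly k relationships.
    # k = 12,000 is a made-up constant picked to reproduce the quoted ~1,000x
    # reduction at 12M tokens; it is not a published SubQ parameter.

    def dense_attention_pairs(n: int) -> int:
        return n * n

    def sparse_attention_pairs(n: int, k: int = 12_000) -> int:
        return k * n

    for n in (128_000, 1_000_000, 12_000_000):
        ratio = dense_attention_pairs(n) / sparse_attention_pairs(n)
        print(f"context {n:>12,} tokens -> dense/sparse compute ratio ~ {ratio:,.0f}x")
    ```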

    The benchmarks

    Subquadratic published a benchmark table comparing a SubQ 1M-Preview against Gemini 3.1 Pro, Claude Opus 4.6, Claude Opus 4.7, GPT-5.4, and GPT-5.5 across SWE-Bench Verified (real-world software engineering), RULER at 128K (long-context accuracy across 13 tests), and MRCR v2 8-needle at 1M (multi-round coreference resolution).

    • SWE-Bench Verified: SubQ scores 81.8%, ahead of Gemini 3.1 Pro at 80.6% and Opus 4.6 at 80.8%, with Opus 4.7 leading at 87.6%.
    • RULER at 128K: SubQ scores 95.0%, narrowly ahead of Opus 4.6 at 94.8% (internally evaluated). Other vendors did not report this benchmark.
    • MRCR v2 8-needle, 1M: SubQ scores 65.9%, behind Opus 4.6 at 78.3% and GPT-5.5 at 74.0%, but well ahead of GPT-5.4 at 36.6%, Opus 4.7 at 32.2%, and Gemini 3.1 Pro at 26.3%.
    • The launch blog post adds that on RULER 128K, SubQ scored 97% accuracy at $8 of compute, versus 94% on Claude Opus at roughly $2,600. That is a cost reduction of about 260x at superior accuracy.
    • On MRCR v2 specifically, the launch post lists SubQ at 83, Claude Opus at 78, GPT-5.4 at 39, and Gemini 3.1 Pro at 23.
    • At the full 12M token context, SubQ hits 92% on RULER while other frontier models reportedly break down well before reaching their stated 1M-token limit.
    • Subquadratic notes the SubQ results are third-party validated and a full technical report is forthcoming.

    The story these numbers tell is consistent: SubQ is competitive on traditional benchmarks like SWE-Bench, decisively better on long-context retrieval where compute economics dominate, and dramatically cheaper to run when the workload actually exercises a long context.

    The two products: SubQ API and SubQ Code

    SubQ ships in two flavors. The first is SubQ API, the full-context API for developers and enterprise teams. It exposes the 12M token context window, supports streaming and tool use, and uses OpenAI-compatible endpoints so existing client libraries and orchestration code can be repointed with minimal change. The product positioning is to process full repositories and pipeline states in a single API call at linear cost, rather than chunking inputs and stitching results.
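
    Because the endpoints are described as OpenAI-compatible, existing clients should work with little more than a base URL change. The sketch below shows the general pattern with the standard openai Python library; the base URL and model name are hypothetical placeholders, since Subquadratic has not published its actual endpoint or model identifiers.

    ```python
    # Hypothetical example: pointing the stock openai client at an
    # OpenAI-compatible endpoint. Base URL and model id are placeholders.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.subq.example/v1",  # placeholder endpoint
        api_key="YOUR_SUBQ_API_KEY",
    )

    huge_repo_text = "...entire repository concatenated into one string..."

    stream = client.chat.completions.create(
        model="subq-1m-preview",  # placeholder model id
        messages=[
            {"role": "system", "content": "You are a careful code reviewer."},
            {"role": "user", "content": huge_repo_text + "\n\nSummarize the riskiest modules."},
        ],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
    ```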

    The second is SubQ Code, a long-context layer designed specifically for coding agents. Instead of competing with Claude Code, Codex, or Cursor, SubQ Code plugs into them. It maps codebases, gathers context, and answers token-heavy questions faster than the host agent’s default model. According to Subquadratic, the integration delivers roughly 25% lower bills and around 10x faster exploration, auto-redirects the most expensive model turns to SubQ, and installs in a single line. The design implication is that agent builders do not have to switch ecosystems to benefit from a 12M token window. They keep their preferred agent and offload the heavy long-context work to SubQ.

    Both products are in private beta. Access is gated through a request early access form where applicants choose SubQ Code, SubQ API, or both, and provide context about their workload.

    What 12M tokens actually unlocks

    Subquadratic illustrates the size of the context window with two concrete examples. The entire Python 3.13 standard library is roughly 5.1M tokens, well under the limit. Six months of React pull requests, around 1,050 PRs against the React codebase, comes in around 7.5M tokens, also under the limit with room to spare. At this scale, the standard pattern of curating which files or chunks the model gets to see goes away. The model just sees everything.
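
    For readers who want to sanity-check whether their own corpus fits, a rough token count is easy to approximate. SubQ's tokenizer is not public, so the sketch below uses OpenAI's cl100k_base encoding via tiktoken as a stand-in; absolute counts will differ, but the order of magnitude is what matters.

    ```python
    # Rough estimate of whether a source tree fits in a 12M-token window,
    # using tiktoken's cl100k_base as a stand-in tokenizer.
    from pathlib import Path
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    CONTEXT_LIMIT = 12_000_000

    def count_repo_tokens(root: str, exts=(".py", ".md", ".rst", ".txt")) -> int:
        total = 0
        for path in Path(root).rglob("*"):
            if path.is_file() and path.suffix in exts:
                text = path.read_text(errors="ignore")
                total += len(enc.encode(text, disallowed_special=()))
        return total

    tokens = count_repo_tokens("cpython/Lib")  # example path: the Python stdlib source
    print(f"{tokens:,} tokens ({tokens / CONTEXT_LIMIT:.0%} of a 12M-token window)")
    ```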

    The downstream implications are significant. RAG pipelines, embedding stores, chunking heuristics, and multi-agent coordination layers exist primarily to compensate for short context windows and quadratic compute. If a model can ingest the whole corpus in one pass at linear cost, large parts of that workaround stack become optional. Long-running agents can preserve full state instead of summarizing it. Coding agents can reason about a refactor across an entire repository without juggling tool calls. Document-heavy workflows in legal, finance, and research can run on the source material directly. And once Subquadratic hits its 100M token target by Q4 2026, the design space shifts again toward applications that depend on persistent state and long time horizons.

    The economic argument

    Subquadratic’s framing is that cost has become the binding constraint on AI deployment, not capability. Many ideas never reach production because the unit economics do not work out. Quadratic attention is the structural reason for that. By breaking the scaling law, SubQ aims to make previously cost-prohibitive workloads viable at scale: high-volume inference, longer included context, and applications that rely on sustained interaction with the model. The 260x to 300x cost reduction reported on RULER 128K is the headline number that operationalizes this thesis.

    The team and the funding

    Subquadratic raised $29M in seed funding. Investors include Javier Villamizar, former partner at SoftBank Vision Fund, and Justin Mateen, co-founder of Tinder and founder of JAM Fund, alongside early investors in Anthropic, OpenAI, Stripe, and Brex. CEO Justin Dangel is a five-time founder with prior companies in health tech, insurance tech, and consumer goods. CTO Alex Whedon previously worked as a software engineer at Meta and led over 40 enterprise AI implementations as Head of Generative AI at TribeAI. The research team is built around PhDs and published researchers from Meta, Google, Oxford, Cambridge, and BYU. The company is headquartered in Miami, Florida, with a headcount in the 11 to 50 range.

    Public hiring lists show the company is staffing across API engineering, founding developer advocacy, principal full-stack engineering, technical copywriting, account executive roles for enterprise sales, senior product design for the Voice AI and API surface, and head of people and talent operations. The Voice AI mention is notable because the public homepage at subq.ai still references a Speech-To-Text API as a current product, suggesting Subquadratic is operating across both speech and language with the same architectural thesis.

    The site itself

    The current public site at subq.ai is deliberately spartan. Visitors see only the company name, the line “Efficiency is Intelligence.”, and a contact email. The full marketing surface lives at the launch demo URL, which acts as the de facto homepage for the launch and links out to the request early access flow, the introducing SubQ blog post, the LinkedIn page, the X account, the Discord community, careers, press contact at [email protected], terms of use, privacy policy, cookies policy, and acceptable use policy. The structure makes sense for a private beta launch: keep the apex domain minimal, push announcement traffic to a dedicated launch site, and gate product access behind a form.

    Thoughts

    The interesting part of Subquadratic’s pitch is not the context window. It is the implicit claim that the entire workaround economy built around transformers (RAG vendors, vector databases, chunking middleware, agentic retrieval frameworks, context compression startups) was always a tax paid because of one architectural property: O(n²). If SubQ’s numbers hold up under independent scrutiny, a meaningful slice of that ecosystem becomes optional rather than mandatory. That has product, infrastructure, and venture implications that go well beyond a faster, cheaper LLM.

    The product strategy is also notably humble in a smart way. Subquadratic is not trying to win the consumer chat war against ChatGPT, Claude, or Gemini. SubQ Code is positioned as a layer underneath Claude Code, Codex, and Cursor, and the API is OpenAI-compatible. That is a classic infrastructure play: do not ask developers to abandon their tools, just route the expensive long-context turns to you. The “auto-redirects expensive model turns” framing is essentially an economic argument about routing, aimed at agent builders who already feel the pain of paying frontier prices for high-token requests.

    There are open questions worth holding lightly. The MRCR v2 numbers in the public benchmark table show SubQ behind Opus 4.6 and GPT-5.5, even as the launch post emphasizes a higher relative score. The cost comparisons rely on a specific compute basis that the upcoming technical report will need to spell out. And the gap between strong RULER scores at 128K and the 92% claim at 12M tokens is a long way to extrapolate without external replication. None of this is unusual for a launch, but it is the right place to apply pressure once the technical report drops.

    The bigger architectural bet is the one that should hold attention. If sub-quadratic attention done well genuinely matches frontier accuracy, then context length stops being a meaningful product axis and a generation of brittle infrastructure built around context limits gets reconsidered. Subquadratic is making the strongest public case so far that the post-transformer era starts with attention scaling, not parameter count. The next twelve months (the technical report, third-party benchmarks, and the first real production deployments through SubQ Code) will tell us whether this is the inflection point or another promising direction that does not quite cross the line. Either way, “Efficiency is Intelligence” is the right frame for where AI economics are heading, and Subquadratic is one of the few companies whose architecture is consistent with the slogan.

  • Howard Marks on Why Most Investors Lose, the AI Bubble, India, and the Hunt for the $10 Bill Nobody Picked Up

    TLDW

    Howard Marks, co-founder of Oaktree Capital and the author of the memos every serious investor reads first, sat down with Nikhil Kamath for a wide-ranging conversation on his 50+ year career, the philosophy of Mujo (the inevitability of change), why he chose bonds over stocks, the difference between drifting down the river and seeing it, where we sit in the current cycle, AI as both threat and opportunity, why active management lost to indexation, and why the only way to outperform in a world full of smart, motivated, computer-literate competitors is “superior insight.” His core message: investing is a puzzle that cannot be solved by formula, and the only edge that lasts is being more right than the other person, more often, with the discipline to stay calm when everyone else is panicking or partying.

    Key Takeaways

    • Mujo is the operating system. Marks took Japanese literature at Wharton and walked away with one idea that shaped his whole career: change is inevitable, unpredictable, and uncontrollable. You cannot predict the future, but you can prepare for it.
    • Cycles are excesses and corrections, not ups and downs. The S&P 500 has averaged about 10% per year for 100 years, but it is almost never between 8% and 12% in any given year. The norm is not the average. Greed and fear push the pendulum past equilibrium every time.
    • The recovery is two years older. When asked where we are in the cycle, Marks notes the bull market continued from April 2024 through January 2026, so by definition we are deeper into the cycle, with a recovery distorted by the unique man-made COVID recession.
    • Drifting versus seeing the river. Marks describes the first 35 years of his career (roughly age 14 to 49) as drifting. Starting Oaktree in 1995 was the first truly intentional decision he made. Entrepreneurship forced proactivity on him.
    • Why bonds over equities. The contractual, predictable nature of debt suited his conservative temperament (his parents were adults during the Depression). He did not move to bonds by choice; a boss reassigned him in 1978, just in time for the birth of the high-yield bond market.
    • Distressed debt is the bigger story. Bruce Karsh joined in 1987 and has run roughly $70 billion in distressed debt since 1988, with profits well over 90% of the total profit and loss.
    • Excess return is getting paid more than the risk warrants. If the market thinks a borrower has a 5% default probability and you correctly conclude it is 2%, you collect interest priced for 5% risk while taking 2% risk. That gap is the alpha.
    • Oaktree’s default rate is about a third of the market. Over 40 years, roughly 3.6% to 3.7% of high-yield bonds default each year. Oaktree’s rate is roughly one-third of that, achieved through process discipline, institutional memory, and analysts who stay analysts for life.
    • If you are starting a career today, understand AI. Marks says the investor who will make the most money over the next 10 years is the one who best understands AI and its capabilities, whether they bet for or against it.
    • AI is excellent at pattern matching, but cannot create new patterns. Can AI pick the Amazon out of five business plans? The Steve Jobs out of five CEOs? Marks bets no. Most humans cannot either, which means there is still a role for exceptional people.
    • Indexation won because active management lost. Passive did not become dominant because it is brilliant. It dominated because most active managers failed and charged high fees for the privilege.
    • Bad times create openings for active managers, but most cannot take them. Panic drives prices down, but the same panic prevents most investors from buying. Wally Deemer: when the time comes to buy, you will not want to.
    • The job is simple but not easy. Find the best managers, the best companies, the best ideas. Charlie Munger told Marks: anyone who thinks it is easy is stupid.
    • Where is the $10 bill nobody picked up? Marks thinks it is around AI, but only for those with insight above the average. If you are average and you crowd into AI, you get average results in a bull case and worse in a bear case.
    • Quantitative information about the present cannot produce alpha. Andrew Marks (Howard’s son) pointed this out to his father during the COVID lockdown. Everyone has the same data. Outperformance has to come from somewhere else.
    • Buffett’s edge was reading Moody’s Manuals when nobody else would. The pre-internet research process favored those willing to do tedious work alone. The format of the edge changes; the fact that edge requires doing what others will not, does not.
    • You cannot coach height. Marks can tell you that second-level thinking, contrarian insight, and the ability to evolve at 80 are essential. He cannot tell you how to acquire any of them.
    • India: Marks declines to opine. He has deployed roughly $4 billion in India but refuses to claim expertise on the Indian stock market or recommend a sector.
    • History rhymes. Marks credits Mark Twain. The lessons that repeat are lessons of human nature, which changes incredibly slowly.
    • Investing is a puzzle, not dentistry. Quoting Taleb, Marks observes that engineers and dentists succeed by repeating the right answer. Investors face a problem with no certain solution. If you need to be right every time, do not become an investor.

    Detailed Summary

    From Queens to Wharton: The Accidental Investor

    Howard Marks grew up in Queens, New York, in a middle-class family. Neither of his parents went to college, but his father was an intelligent accountant. Marks discovered accounting in high school, fell in love with its orderliness, and chose Wharton because he was told it was the best undergraduate business school in America. Wharton required a class in the literature of a foreign country and a non-business minor. For reasons he no longer remembers, Marks chose Japanese studies, then took Japanese civilization and Japanese art. He calls it the most important academic decision of his life because of one concept he encountered: Mujo.

    Mujo, Independence of Events, and Why You Cannot Predict

    Mujo, the turning of the wheel of the law, teaches that change is inevitable, unpredictable, and uncontrollable, and that humans must accommodate it rather than try to control it. Marks pairs this with his deep belief in the independence of events: ten heads in a row do not change the odds on flip eleven. Roughly 20 years ago he wrote a memo titled “You Can’t Predict. You Can Prepare.” A portfolio cannot be optimized for both extreme upside and extreme downside, but it can be built to perform respectably across many possible futures, if you suboptimize for the middle of the probability distribution.

    Why Cycles Exist

    If GDP averages 2% growth, why is it never simply 2%? Marks’s answer is excesses and corrections. Optimism leads producers to overbuild and consumers to overspend, growth runs above trend, then satiation and oversupply pull it back below trend. The S&P 500 averages 10% per year over a century, but the return in any given year is almost never between 8% and 12%. The norm is not the average because human beings are not average; they are alternately greedy and fearful.

    Where Are We Now?

    Two years ago Marks told the Norwegian Sovereign Wealth Fund’s Nicolai Tangen that we were near the middle of the cycle. Two years later, the bull market in stocks continued through January 2026, so by simple math the recovery is older. The COVID recession was a man-made anomaly: one quarter of negative growth followed by the best quarter in history, triggered by a deliberate global shutdown rather than by accumulated excess. That distorts every traditional cycle metric.

    Drifting Versus Seeing the River

    One of the most personal moments in the conversation is Marks’s confession that he drifted for the first 35 years of his career. He did not pick his career, his first job, or his transition from equities to bonds in any deliberate way. Other people pushed him; he said yes. The first proactive decision of his life was co-founding Oaktree in 1995 at age 49, and even that came largely because his wife and his partner Bruce Karsh pushed him into it. Once he had to lead, he had to be intentional. Leadership cannot be passive.

    The Bond Decision

    Marks did not choose bonds; bonds chose him. In May 1978 his boss at Citibank moved him to the bond department to start a convertible fund. Three months later another phone call asked him to figure out something called high-yield bonds being run by a guy in California named Milken. Marks said yes both times. He arrived at the front of the line for high-yield in 1978 and has been there for 48 years.

    The conservative temperament fit. Marks’s parents were adults during the Depression, so he grew up hearing “don’t put all your eggs in one basket” and “save for a rainy day.” Bonds offered contractual, predictable returns. The phrase “junk bonds” was a bias that made the asset class cheaply available to anyone willing to do the analytical work.

    Distressed Debt and Excess Return

    When Bruce Karsh joined him in 1987, the two launched what Marks believes was the first distressed debt fund from a mainstream institution. Karsh has managed about $70 billion since 1988 with well over 90% of the total being profit. The core skill is predicting default probability better than the market. If consensus prices a borrower at a 5% default risk and you correctly assess 2%, the interest you receive is overpaid relative to actual risk. Marks calls this “excess return” and credits Mike Milken with the foundational insight: lend to borrowers others will not, demand more interest than the risk requires, and the math works.
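
    The arithmetic behind that gap is worth spelling out. The sketch below assumes a 40% recovery rate and treats the credit spread as exactly the market’s expected loss; Marks gives only the two default probabilities, so the other numbers are illustrative.

    ```python
    # Illustrative excess-return math: paid for 5% default risk, bearing 2%.
    # The 40% recovery rate is an assumption; Marks cites only the two
    # default probabilities.
    RECOVERY_RATE = 0.40

    def expected_annual_loss(default_prob: float) -> float:
        return default_prob * (1 - RECOVERY_RATE)

    priced_loss = expected_annual_loss(0.05)   # risk the market charges for
    actual_loss = expected_annual_loss(0.02)   # risk the better analyst expects
    excess = priced_loss - actual_loss

    print(f"spread compensating priced risk: {priced_loss:.1%} per year")
    print(f"expected loss at actual risk:    {actual_loss:.1%} per year")
    print(f"excess return from the gap:      {excess:.1%} per year")
    ```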

    Over 40 years, roughly 3.6% to 3.7% of high-yield bonds default annually on average. Oaktree’s default rate has been roughly one-third of that. Marks credits institutional culture (analysts who stay analysts for life), psychological stability in volatile periods, and a process that forces every analyst to ask the same eight questions of every company every time. In equity research, you can buy a stock for great management without examining the product, or for a great product without examining the management. In Oaktree’s bond process, you cover every base every time.

    Beginning a Career Today: The AI Question

    Asked what he would do today, Marks says the front of the line is AI. The investor who will succeed most over the next decade is the one who best understands AI, whether they bet for or against it. He notes that he was shocked by his own experience using Claude, but adds that he has not fired a single person and does not intend to.

    His view: AI excels at extracting patterns from history and applying them with discipline and without psychological wobble. But investing also requires creating new patterns. Can AI sit with five business plans and identify the future Amazon? Can it sit with five CEOs and pick Steve Jobs? Marks bets not. Then he adds the killer line: most humans cannot either. Which means the role for exceptional humans survives, but the bar gets higher.

    Why Indexation Won

    When Marks went to graduate school at the University of Chicago in 1968, his professor pointed out that most mutual funds underperformed the S&P after fees. Index funds did not exist yet; Jack Bogle launched the first one in 1974. Today, most equity mutual fund capital is passive. Marks’s controversial take: indexation did not win because it is great. It won because active management was so bad and so expensive. Even at equal fees, if active decisions are inferior, passive wins.

    Bad times create openings for active managers because panic drives prices down, but the same panic prevents most people from buying. Marks quotes the old trader Wally Deemer: when the time comes to buy, you will not want to. An AI nudge that says “this is one of those moments, get your ass in gear and buy something” might genuinely add value, because it removes the emotion.

    Second-Level Thinking and Why You Cannot Coach It

    Marks’s first book, The Most Important Thing, has 21 chapters, each titled “The Most Important Thing Is…” Each one is different because so many things matter. The chapter on second-level thinking came to him spontaneously while writing a sample chapter for Columbia University Press. The argument is simple: if you think like everyone else, you act like everyone else, and you get the same results. To outperform, you must deviate from the herd and be more right than the herd. Different is not enough. Different and better is the bar.

    Can AI become a contrarian thinker? You can prompt Claude to give you only non-consensus answers, but the catch is that consensus is often close to right because the people building consensus are intelligent, educated, computer-literate, and motivated. Forcing non-consensus often forces wrong. The real edge is being non-consensus AND correct, which is a much narrower target.

    The $10 Bill That Nobody Has Picked Up

    Marks references the joke about the efficient market hypothesis: there is no $10 bill on the sidewalk because if there were, somebody would have already picked it up. He then concedes that the bill is probably around AI today, but only for those whose insight rises above the average. If you are average and you crowd into AI, you go along with the tide if it works and get crushed if it does not. Quoting Garrison Keillor’s Lake Wobegon, “where all the children are above average,” Marks notes that the math does not allow it. Most investors will not be above average, and acknowledging that is the first step toward becoming one of the few who are.

    Learning From Andrew, Buffett, and Onion-Skin Manuals

    Marks lived with his son Andrew during COVID and wrote a memo about it called “Something of Value” in January 2021. Andrew’s most important contribution was a near-revelation: readily available quantitative information about the present cannot be the source of investment alpha because everyone has it. Buffett’s edge in the 1950s was reading Moody’s Manuals (giant books printed on onion-skin paper with tiny type and zero narrative) when nobody else would. The medium changes; the principle that edge requires doing what others will not, does not.

    India

    Kamath asks Marks directly about India. Marks has deployed roughly $4 billion there but politely declines to claim any expertise on the Indian stock market or recommend a sector. He cautions Kamath about taking advice from people who do not know what they are talking about, and includes himself in that category on the question of India. The honesty is striking and is itself an investment lesson.

    History Rhymes, and Final Advice

    Marks reads Andrew Ross Sorkin’s 1929 and references it in an upcoming memo on private credit. He likes Mark Twain’s reputed line that history does not repeat but it rhymes, and Napoleon’s line that history is written by the winners of tomorrow. The lessons that rhyme are lessons of human nature, which evolves incredibly slowly. Fight or flight from the watering hole still drives behavior in financial markets.

    His final advice: investing is a puzzle, not engineering. A civil engineer calculates steel and concrete, builds the bridge, and the bridge stands. Every time. A dentist fills the cavity correctly and it stays filled. Every time. If you need that kind of reliability in your work, become a dentist. Investing is the act of positioning capital for a future that cannot be predicted accurately. You will be wrong sometimes. If something in your makeup cannot tolerate being wrong sometimes, do not become an investor. The puzzle has no final solution, which is exactly what makes it endlessly interesting.

    Thoughts

    The most useful thing Marks does in this conversation is admit, repeatedly and without ego, what he does not know. He does not know whether AI models differ in real intelligence. He does not know which sector in India to bet on. He does not know how to teach second-level thinking. He drifted for 35 years and only began making intentional decisions at 49. This honesty is the inverse of every guru selling certainty, and it is the actual content of the lesson he is trying to convey: epistemic humility is the precondition for superior insight, because you cannot acquire what you already think you have.

    The deepest insight in the conversation might be the one Andrew Marks (Howard’s son) gave his father during COVID: readily available quantitative information about the present cannot produce alpha because everyone has it. This is devastating in the AI era. If everyone is asking the same large language model the same question, the answers converge, and convergence is consensus, and consensus does not pay. The arms race for proprietary data, novel framings, and unconventional questions is the only thing that can break the convergence.

    Marks’s framing of cycles as excesses and corrections rather than ups and downs is genuinely useful. It reframes volatility from something to fear into something to expect, and reframes the question from “where are we going?” to “how far past trend have we already gone?” The 8 to 12 percent observation about the S&P (that the average return is almost never the actual return) is the kind of fact that should be taught in every introductory finance class but is almost never mentioned.

    The most contrarian claim in the conversation is the one about indexation: that it won because active was bad, not because passive is great. This is a useful inversion. Most defenders of passive investing argue from efficient market theory; Marks argues from the empirical failure of active managers. The implication is that if you can find the small population of active managers who genuinely outperform, the indexation argument falls apart for that subset. Most cannot. The hardest job in investing is the meta-job of identifying the few who can.

    The exchange about AI as a contrarian engine is one of the most clarifying short discussions of AI’s investment limits I have read. Different from consensus is easy. Different and better is the actual goal. Forcing different gets you wrong more often than right because consensus, built by smart, motivated, educated competitors, is usually close to correct. This is why “use AI to find non-consensus ideas” is a worse strategy than it sounds.

    Finally, the Buffett-Moody’s-Manual story is the most quietly profound moment in the interview. The edge in 1955 was the willingness to read tiny type on onion-skin paper alone in an office in Omaha when no one else would. The edge in 2026 is whatever the modern equivalent of that is, and the only honest answer is: nobody knows yet, which is precisely why finding it is worth so much money.

  • Paul Tudor Jones on Macro Trading, Bitcoin, the AI Existential Threat, and Why the US Stock Market Is the Most Leveraged in History

    Legendary macro trader Paul Tudor Jones sat down with Patrick O’Shaughnessy on Invest Like the Best for a sweeping conversation that spans 50 years of trading, the 1980 silver collapse, the 1987 crash, his evolving admiration for Warren Buffett, his alarming view of AI safety, and a daily routine that starts at 2:30 AM. This is one of the most candid and useful conversations a working trader, investor, or builder can listen to right now.

    TLDW (Too Long, Didn’t Watch)

    Paul Tudor Jones believes the United States is sitting on the most leveraged equity market in history at 252% of GDP, dwarfing 1929 and 2000. He sees a sovereign debt bubble, a coming wave of IPO supply that could reverse a decade of buyback driven gains, and a dollar yen trade setting up as the next big macro opportunity. He calls Bitcoin the best inflation hedge that exists thanks to its finite supply, but flags real cyber and quantum tail risks. He apologizes publicly to Warren Buffett for years of doubting him and calls him the OG of compound interest. He thinks AI is being deployed without any meaningful safety regulation, that watermarking AI content should be mandated by law, and that humanity is sleepwalking into a tail risk that could cost hundreds of millions of lives. And he closes with a simple life formula: God, family, friends, fun, and service, with a daily intentional act of kindness as the secret to a meaningful life.

    Key Takeaways

    • The US equity market is at 252% of GDP, the highest in history. For context, 1929 peaked at 65%, 1987 around 85 to 90%, and 2000 around 170%. A standard mean reversion to long term PEs would be a 30 to 35% decline, which on this base would erase market cap equal to 80 to 90% of GDP.
    • We are in a sovereign debt bubble, not necessarily an equity bubble. But the country is over equitized, individual equity weightings are at all time highs, and private equity has more than doubled as a share of institutional portfolios since 2008.
    • IPO supply is about to flip the buyback math. Buybacks have been retiring roughly 2% of market cap per year for a decade. Contemplated IPOs in the next year could equal 5 to 6% of market cap, reversing a structural tailwind.
    • Hyperscaler capex will eat into tech cash flow, which is part of why tech has been dogging it and may continue to.
    • The buy and hold S&P 500 advice is dangerous at current valuations. Historically, buying the S&P at a PE of 22 has produced negative 10 year returns. Valuation matters even on long horizons.
    • Dollar yen is his current setup. The yen has been grossly undervalued for 24 months. Japan is the largest net international investment creditor, holding roughly $4.5 trillion mostly unhedged in dollars. The catalyst is a new Reagan or Thatcher style prime minister who Paul thinks will trigger a sharp yen rally.
    • Bitcoin is the best inflation hedge in existence because it is finite and decentralized, more scarce than gold. The two real risks are kinetic conflict triggering cyber warfare and the eventual arrival of quantum computing.
    • Every major crash he has lived through had the same DNA: leverage, usually derivative driven. 1987 was 100% portfolio insurance. 1998 was Long Term Capital and derivatives. 2000 was an IPO supply unlock cascade. Today combines all three risks with sovereign debt fragility on top.
    • Trading is boxing, not chess. Most days you are jabbing and feeling out the market. A few times per cycle there is a real opening. Bitcoin in 2020 was a knockout. Two year rates in 2022 was a knockout. The job is to be ready when the opening appears.
    • Great traders are 70% born, not made. Paul polled his top risk takers and the consensus was nature dominates nurture. The traits: type A, hyper curious, loves competition, loves games, intuitive grasp of probability.
    • Liquidity is everything. His grandfather told him as a kid, “you are only worth what you can write a check for tomorrow.” He watched Bunker Hunt go from richest man on earth to virtually bankrupt in six weeks during the 1980 silver collapse. The lesson stuck.
    • Warren Buffett apology. Paul publicly recants decades of skepticism, calling Buffett a flipping genius who understood compound interest at age nine and the OG of compounding.
    • AI safety is a five alarm fire. Paul attended a small conference with modelers from the four biggest model labs. The consensus answer to how AI safety gets resolved was, paraphrasing, when 50 to 100 million people die in an accident. He thinks this is insane.
    • Mandatory AI watermarking should be a campaign issue. He wants knowing violations made a felony after three offenses. He says deepfakes have already fooled serious people he knows twice this year.
    • The build, break, iterate model is fine for most technology and catastrophic for AI because the break in this case can be civilization scale. The Atomic Energy Commission was created 18 months after the bomb. We are three years into deployed AI with effectively zero regulation.
    • Daily routine for 50 years: wake at 6:15, work an hour, 45 minutes of hard cardio, screens for the open, meetings 10 to 12, lunch meeting, hour before close and hour after to plan the next day, walk with wife at 5, work, dinner, mindless TV, work 9:30 to 10:15, sleep, wake at 2:30 or 3 AM to watch the London open and do analytical work, then back to sleep until 6:15.
    • Information overload is now the bottleneck. He works harder today than 40 years ago because the volume of inputs has exploded. The challenge is preserving what he calls exquisite execution: buying when there is blood on the ground and selling at maximum elation.
    • Eli Tullis was his trading mentor. Tullis traded almost only cotton and was a master of executing at the maximum apogee of fear and greed. The biggest lesson came after a catastrophic loss when Tullis greeted his wife’s friends with a smile and total composure. When the going gets tough, the tough get going.
    • Robin Hood Foundation was born from a wrong call. Paul was convinced 1987 would trigger a depression. It did not. But the conviction launched what became one of the most influential anti poverty organizations in America.
    • Journalism 101 should be required at every college. Newspaper inverted pyramid writing taught him principal component analysis: lead with the most important fact, then the next, then the next. He says it is exactly how he ranks variables in a trade.
    • If you do not use it, you lose it. A Palm Beach doctor told him “you retire, you die” and it changed how he thinks about working into his 90s.
    • The principal components of a great life: God, family, friends, fun, service. Significance does not come from the trades. It comes from the people you loved and the people you served.
    • Kill them with kindness. One intentional act of kindness per day, repeated, rewires you. “I should” becomes “I am.” It is the closing message of the entire conversation.

    Detailed Summary

    The Kindest Thing: A Three Year Old Lost in a Vegetable Market

    Paul opens the conversation by insisting they reverse the usual order of the show and start with Patrick’s signature closing question: what is the kindest thing anyone has ever done for you? His earliest childhood memory is being separated from his mother around age two and a half at an outdoor produce market in Memphis in 1957. An elderly Black man took his hand, walked him up and down the aisles, and reunited him with his mother. When Paul’s mother tried to give the man five dollars, a meaningful sum at the time, he refused, saying he knew she would have done the same for his child. That night Paul began adding the unnamed man to his prayer list. He repeated that prayer roughly four to five thousand times over the next twelve years.

    Decades later, watching Harry Reasoner interview Eugene Lang on 60 Minutes, Paul saw the photo negative of his own story: an older man, this time helping kids of color in Harlem, promising to put them through college if they finished high school. Paul called Lang the next day and was redirected to Bedford Stuyvesant, the highest crime neighborhood in New York at the time. He adopted a class, ran after school programs, hired tutors, dealt with kids being murdered and with teen pregnancy, and learned by failing what it actually takes to defeat poverty. That work seeded the Robin Hood Foundation in 1987 and one of the first charter schools in New York, the Bedford Stuyvesant Charter School of Excellence, which became the number one ranked elementary school out of 543 in NYC within five years.

    Aim High and Shoot Straight

    Paul tells the story of his commencement address at what is now Rhodes College in Memphis. He polled the audience to see who remembered their own commencement speakers. Almost no one did. So he ended his speech by pulling out a bow, nocking an arrow, telling the graduates “whatever you do, aim high and shoot straight,” and shooting an apple off a table. Memorable.

    Trading vs Investing: A 50 Year Career in the Trenches

    Paul started in 1976 when inflation was raging and assets routinely doubled and halved in a single year. He cut his teeth on the floor of the cotton exchange and the COMEX, watching Bunker Hunt accumulate roughly 200 million ounces of silver at an average cost of $3.12 and ride it to roughly $50 an ounce, becoming worth $11 billion at the peak. When the COMEX restricted silver to liquidation only, the price collapsed from $50 to under $10 in eight weeks. Hunt was virtually bankrupt. The searing lesson: never trust permanence in any asset, and always preserve liquidity.

    He contrasts his own life with Warren Buffett’s. Paul’s BVI Fund has run for 40 years with a negative 0.12 correlation to the S&P 500, meaning 100% of returns are alpha. He compares trading to playing right guard in the NFL for 50 years, fighting in the trenches every single day, while Buffett’s belief in America gave him a different kind of strength: the ability to ride out a 50% drawdown in 2008 to 2009 without flinching. After listening to the Acquired podcast on Berkshire Hathaway, Paul realized Buffett understood compound interest at age nine and sought out Benjamin Graham at 17. He calls himself an idiot for ever doubting him.

    The AI Existential Risk Argument

    Paul attended a small conference around 18 months ago with roughly 35 to 40 attendees, including one modeler from each of the four largest AI labs. When he asked them point blank how they expected AI safety to get resolved, the consensus answer was, paraphrasing, that meaningful action would only happen after a mass casualty event of 50 to 100 million people. He has been alarmed ever since.

    His core critique is structural. The build, break, iterate cycle has been the engine of human invention since the beginning. The problem is that AI is the first technology where the tail event of a break could be civilizational. He compares the regulatory response unfavorably to the atomic bomb: the Atomic Energy Commission was stood up 18 months after Hiroshima. We are three years into widely deployed AI with no real regulation, no public referendum, and no convening with adversaries like China.

    His specific policy ask is mandatory watermarking of AI generated content, with knowing violations made a felony after three offenses. He says deepfakes have already deceived people he trusts twice this year and that restoring trust in a basic shared reality is foundational to fixing American discourse. He also notes that a meaningful share of senior AI scientists openly envision a future of brain implanted humans with inalienable rights. He thinks most humans, given a vote, would reject that path. His point is that there has been no vote.

    The Nature of Trading: Boxing, Not Chess

    Trading, Paul says, is more like classic boxing than chess. You are jabbing, feeling out the opponent, looking for an opening. Most days you are gathering information and not doing much. A few times per cycle there is a real opening that you can land hard. He cites Bitcoin in 2020 and two year rates in 2022 as recent knockouts.

    The genesis of every big move, he argues, is one of three things: the market got carried away, an imbalance went on too long, or a central bank or government did something they should not have. Right now he thinks dollar yen fits the pattern: the yen has been grossly undervalued for two years, Japan holds about $4.5 trillion in net international investment positions mostly unhedged in dollars, and the catalyst has arrived in a new prime minister he compares to Reagan, Thatcher, or Trump in his second term.

    Bitcoin as the Best Inflation Hedge

    Paul reiterates Bitcoin as superior to gold as an inflation hedge. Gold supply grows roughly two percent a year. Bitcoin’s supply is capped. Decentralization adds defensibility. The honest caveats: any kinetic global conflict will trigger cyber warfare, and electronic assets sit on the front line. Quantum computing, if and when it arrives, could enable hacks of any bank or any digital store of value. He is not predicting either tomorrow but he is unwilling to ignore them.

    Are We in a Bubble? Look at the Numbers

    The headline statistic is jaw dropping. Stock market capitalization to GDP is currently 252%. The 1929 peak was 65%. The 1987 peak was 85 to 90%. The 2000 peak was 170%. We have never been here before.

    Bear markets since 1970 have mean reverted on roughly a ten year cadence. A reversion to a normalized PE from current levels would imply a 30 to 35% decline. On a 250% of GDP base, that is 80 to 90 points of GDP in evaporated wealth. Capital gains tax revenue would crater, the deficit would explode, and the bond market would suffer a self reinforcing negative feedback loop.
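
    The arithmetic is simple enough to check directly, using only the figures quoted in the conversation.

    ```python
    # Quick check of the mean-reversion math quoted above.
    market_cap_to_gdp = 2.52            # 252% of GDP
    for decline in (0.30, 0.35):        # the 30 to 35% reversion range
        points_of_gdp = market_cap_to_gdp * decline * 100
        print(f"{decline:.0%} market decline -> ~{points_of_gdp:.0f} points of GDP erased")
    # Prints roughly 76 and 88 points, in line with the 80 to 90 points cited.
    ```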

    Add to this the IPO unlock schedule. Contemplated IPOs over the next year may equal 5 to 6% of market cap. For a decade, buybacks have removed roughly 2% per year. The math is about to flip. Hyperscaler capex commitments will further eat into the cash flow that funded the buybacks. Private equity has gone from 7% of institutional portfolios in 2007 to 16% today. Real estate and infrastructure allocations have grown. The system is dramatically more illiquid and more leveraged than it was in 2008.

    Paul’s specific warning to anyone telling clients to just buy the S&P: at a starting PE of 22, history shows negative 10 year returns. Valuation always matters.

    A Day in the Life of PTJ

    The schedule is monastic. Up at 6:15. Work an hour. 45 minutes of hard cardio. At the screens for the open. Meetings from 10 to 12. Lunch meeting. Afternoon meeting. An hour before the close and an hour after to plan tomorrow and think about what is coming overnight in Tokyo and Hong Kong. Home around 5. An hour walking with his wife. Another hour of work. Dinner. Mindless TV. Work again from 9:30 to 10:15. Sleep. Wake at 2:30 or 3 AM to watch the London open for 30 to 45 minutes and do analytical work in the quiet. Back to sleep. Wake at 6:15. Repeat for 40 years.

    He says he works harder now than ever before because of information overload. The opportunity cost of every distraction is exquisite execution: buying when there is blood on the ground, selling at peak euphoria.

    Eli Tullis and Executing at Maximum Pain

    Paul’s mentor Eli Tullis traded almost exclusively cotton. The defining moment came after Tullis was annihilated when a long awaited drought broke and cotton went limit down over a weekend. Paul watched in disbelief as Tullis welcomed his wife’s friends to a beautiful office for lunch with a smile, charm, and zero visible distress. The lesson, branded into Paul: when the going gets tough, the tough get going.

    Are Traders Born or Made

    Paul polled four or five of his best risk takers at a Christmas dinner. The unanimous answer: roughly 70% nature. The traits that recur: type A personality, hyper curiosity, love of competition, obsession with games, intuitive grasp of probability theory. Paul says he picked up the equivalent of a degree in probability theory without ever taking a math course. He played chess, backgammon, Monopoly, and gin rummy, gambled in college, and has never stopped playing bridge with friends.

    Why Keep Trading?

    Three reasons. First, his Palm Beach doctor told him retirement equals death. If you do not use it, you lose it. Second, his father lived to 100 and Paul wants to remain mentally sharp through his 90s. Third, and most importantly, he wants to make an absolute pot of money so he can give it away. The pursuit of nobility, as he calls it.

    The Workless World

    Paul used to despair about a future where AI does so much that humans no longer need to work. So much human significance comes from work. He has become more optimistic recently, watching how athletes find significance in sport and how he finds significance in bridge games with friends. Humans, he argues, are absurdly adaptable. We may find significance in something as small as a single intentional act of kindness per day.

    Why Journalism 101 Should Be Required

    Paul’s father ran a tiny trade finance legal paper in Memphis. Paul grew up writing for it and taking journalism classes. He argues that newspaper inverted pyramid writing should be mandatory in every college, more important than business school. Conclusion first. First sentence carries the most important fact. Who, what, where, when, why, how. Each subsequent paragraph drops one notch in importance. This is just principal component analysis applied to communication. It is also exactly how Paul ranks variables in a trade. At any given moment, ten things might matter, but only one is the catalytic variable today. The discipline of the inverted pyramid is the discipline of trading.

    The Principal Components of a Great Life

    Asked to apply the same framework to life, Paul answers without hesitation: God, family, friends, fun, service. He says he has actually thought about his own funeral with anticipation, partly because of the songs he has chosen. At the end, he says, no one thinks about the 1987 crash or Bitcoin. They think about who they loved, who loved them, what kind of relationships they had, and what they did to leave a legacy of betterment for others. Legacy, he insists, means deeds, not words.

    Kill Them With Kindness

    The closing message comes from his mother. Some days you will wake up in a bad mood. Something on TV will make you angry. The temptation today is to demonize the other side. The antidote is intentional: one simple act of kindness per day, transmitted outward, repeated. Reps matter. “I should” becomes “I am.” Over time you become an organically kind person. Your outlook brightens. Multiply that across a country and the country changes.

    Thoughts

    The 252% market cap to GDP figure is the single most important number in the conversation. Most listeners will gloss over it. They should not. The structural argument Paul lays out is internally consistent and uncomfortably specific: an over equitized country, a sovereign debt bubble, an IPO supply wave that flips a decade of buyback math, hyperscaler capex eating cash flow, private equity more than doubled as a portfolio share since 2008, and far less liquidity than 2008 to absorb a shock. None of these are predictions of an imminent crash. They are descriptions of the kindling.

    His Buffett apology is the kind of intellectual honesty that is rare in finance. Two operators with opposite styles can both be right for fifty years. Paul’s negative correlation to the S&P with 100% alpha and Buffett’s belief in America with patient compounding are not rival theories of investing. They are different jobs. Most retail investors are trying to do Buffett’s job with a trader’s emotional reflexes, which is why so few make it.

    The AI section is the part of the interview that should make builders pause. Paul is not an AI doomer in the online sense. He is a 50 year career risk manager applying the standard framework: what is the size of the tail, what is the regulatory containment, who has the kill switch. His answer is that the tail is potentially civilization scale, the containment is effectively zero, and there is no kill switch. The historical precedent he reaches for is not science fiction but the Atomic Energy Commission stood up 18 months after Hiroshima. The contrast with our current trajectory is uncomfortable.

    The watermarking proposal is unusually concrete for a trader and unusually politically tractable for an AI safety policy. It does not require slowing capability research. It does not require international coordination as a precondition. It restores the basic epistemic substrate of public discourse: knowing what is human and what is not. Whether you think AI risks are overblown or underrated, watermarking is a Pareto improvement.

    For builders shipping software in the AI era, the meta lesson is that we are running the build, break, iterate playbook on a system whose break radius is no longer contained by the founders. That is a different kind of responsibility than the one most engineers have ever held. It does not have a clean answer yet. But the question is now visible.

    The kindness frame at the start and end is not throat clearing. It is the actual operating system Paul has run on for 70 years. The four to five thousand prayer reps for an unnamed man who held his hand in a Memphis vegetable market produced a pattern interrupt 25 years later that founded one of the most effective anti poverty organizations in the country. Compound interest applies to acts as much as to dollars. That is the through line of the entire conversation, and it is the thing most listeners will forget by tomorrow morning. They should not.

  • Claude Opus 4.7 Released: Anthropic’s New Coding Powerhouse With xhigh Effort Mode, 3.75MP Vision, and State-of-the-Art Agentic Performance

    TLDR

    Anthropic released Claude Opus 4.7 on April 16, 2026, as a direct upgrade to Opus 4.6. It delivers major gains on the hardest coding tasks, introduces a new xhigh effort level, supports images up to 2,576 pixels on the long edge (roughly 3.75 megapixels), and ships with automatic cybersecurity safeguards. Pricing stays flat at $5 per million input tokens and $25 per million output tokens. Early testers at Cursor, Replit, Vercel, Notion, Devin, Harvey, Databricks, and Warp report double-digit benchmark jumps, stronger instruction following, better long-horizon autonomy, and a more opinionated model that pushes back instead of agreeing reflexively.

    Key Takeaways

    • Direct upgrade from Opus 4.6 at the same price point, available via API as claude-opus-4-7, plus Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.
    • New xhigh effort level slots between high and max, giving developers finer control over the reasoning-versus-latency tradeoff.
    • Vision gets a real jump: images up to 2,576 pixels on the long edge, more than 3x prior Claude models. XBOW reported 98.5% visual acuity versus 54.5% for Opus 4.6.
    • Coding benchmarks up across the board: Cursor saw 70% on CursorBench versus 58% for 4.6, Rakuten-SWE-Bench resolved 3x more production tasks, and GitHub measured a 13% lift on their 93-task benchmark.
    • Long-horizon autonomy is a headline theme. Devin says Opus 4.7 works coherently for hours. Genspark highlights loop resistance and the highest quality-per-tool-call ratio they have measured.
    • Instruction following is substantially tighter, which means old prompts written for loose-interpretation models may now behave unexpectedly. Re-tune prompts and harnesses.
    • Better memory across file-system-based workflows, reducing the need for up-front context in multi-session work.
    • Tokenizer changed: the same input can now map to 1.0x to 1.35x as many tokens. Opus 4.7 also thinks more at higher effort levels, so output token counts rise too.
    • Cybersecurity safeguards automatically detect and block prohibited or high-risk cyber requests. Legitimate security researchers can apply to the new Cyber Verification Program.
    • Claude Code gets /ultrareview, a dedicated review session that catches bugs and design issues. Pro and Max users get three free ultrareviews. Auto mode is extended to Max users.
    • State-of-the-art on GDPval-AA, a third-party evaluation of economically valuable knowledge work spanning finance, legal, and other domains.
    • Not the most capable overall model. That distinction still goes to Claude Mythos Preview, which also remains the best-aligned model Anthropic has trained.

    Detailed Summary

    What Claude Opus 4.7 Actually Is

    Claude Opus 4.7 is Anthropic’s latest generally available frontier model, positioned as a targeted upgrade to Opus 4.6 rather than a ground-up new generation. The focus is squarely on advanced software engineering, long-running agentic workflows, and higher-fidelity vision. Anthropic describes it as handling complex, long-running tasks with rigor and consistency, paying precise attention to instructions, and devising ways to verify its own outputs before reporting back.

    The positioning matters. Claude Mythos Preview, announced alongside Project Glasswing, remains the most powerful and best-aligned model Anthropic has trained. Opus 4.7 is the first release after Mythos Preview and serves a dual purpose: give developers a concrete upgrade today, and stress-test new cybersecurity safeguards on a less capable model before Anthropic attempts a broader release of Mythos-class systems.

    Coding and Agentic Performance

    The early-access testimonials read like a highlight reel of the agentic coding ecosystem. Cursor saw CursorBench scores jump from 58% on Opus 4.6 to over 70% on Opus 4.7. Rakuten measured 3x more resolved production tasks on Rakuten-SWE-Bench with double-digit gains in code quality and test quality. GitHub measured a 13% lift on a 93-task coding benchmark including four tasks that neither Opus 4.6 nor Sonnet 4.6 could solve. Notion observed a 14% improvement over Opus 4.6 at fewer tokens and a third of the tool errors, calling it the first model to pass their implicit-need tests.

    Devin emphasized sustained autonomy, saying the model works coherently for hours and pushes through hard problems rather than giving up. Warp reported that Opus 4.7 passed Terminal Bench tasks prior Claude models had failed, including a tricky concurrency bug Opus 4.6 could not crack. Vercel highlighted a behavior they had not seen before: the model actually does proofs on systems code before starting work, and is noticeably more honest about its own limits.

    A recurring theme across testimonials is that Opus 4.7 pushes back. Replit’s president said it feels like a better coworker because it challenges technical decisions instead of agreeing by default. Augment Code noted it brings a more opinionated perspective rather than simply agreeing with the user. For anyone building real engineering workflows, that pushback behavior is arguably more valuable than raw benchmark deltas.

    Vision: The Quiet Breakthrough

    The vision upgrade may be the most underappreciated change. Opus 4.7 now accepts images up to 2,576 pixels on the long edge, roughly 3.75 megapixels, which is more than three times the previous Claude limit. This is a model-level change, not an API parameter, so every image sent to Claude is processed at higher fidelity automatically.

    XBOW, which builds autonomous penetration testing agents that rely heavily on computer use, reported the most dramatic single number in the entire announcement: 98.5% on their visual acuity benchmark versus 54.5% for Opus 4.6. They described their single biggest Opus pain point as effectively disappearing, unlocking an entire class of work where they could not previously use Claude. Solve Intelligence reported major improvements in multimodal understanding for life sciences patent workflows, from reading chemical structures to interpreting complex technical diagrams.

    This unlocks computer-use agents reading dense screenshots, data extraction from complex diagrams, and any work requiring pixel-perfect references.
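
    Because the higher fidelity is a model-level change, nothing about the request format changes. Here is a minimal sketch of sending a dense screenshot with the Anthropic Python SDK, assuming the claude-opus-4-7 model id from the announcement; the file name and prompt are placeholders.

    import base64
    from anthropic import Anthropic

    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    screenshot_b64 = base64.standard_b64encode(open("dashboard.png", "rb").read()).decode()

    response = client.messages.create(
        model="claude-opus-4-7",   # model id from the announcement
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64", "media_type": "image/png", "data": screenshot_b64}},
                {"type": "text", "text": "Extract every metric and its value from this dashboard."},
            ],
        }],
    )
    print(response.content[0].text)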

    The New xhigh Effort Level

    Opus 4.7 introduces an xhigh (extra high) effort level that sits between high and max. This gives developers a new middle gear for the reasoning-versus-latency tradeoff on hard problems. In Claude Code, Anthropic raised the default effort level to xhigh across all plans. For coding and agentic use cases, Anthropic recommends starting with high or xhigh effort rather than defaulting to medium.

    Alongside effort controls, the Claude Platform is getting task budgets in public beta, letting developers guide Claude’s token spend so it can prioritize work across longer runs. This matters because Opus 4.7 thinks more at higher effort levels, particularly on later turns in agentic settings.
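
    For concreteness, here is a minimal sketch of requesting xhigh effort through the Anthropic Python SDK. The claude-opus-4-7 model id and the effort levels come from the announcement; the exact wire format of the effort control is not documented here, so passing it via extra_body is an assumption to check against Anthropic's migration guide.

    from anthropic import Anthropic

    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    response = client.messages.create(
        model="claude-opus-4-7",          # model id from the announcement
        max_tokens=8192,
        extra_body={"effort": "xhigh"},   # assumed field name/placement for the new effort level
        messages=[{"role": "user", "content": "Refactor this module and verify the invariants still hold."}],
    )
    print(response.content[0].text)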

    Token Usage Changes You Need to Plan For

    Two token-related changes affect migration. First, Opus 4.7 uses an updated tokenizer that improves how the model processes text, but the tradeoff is that the same input can map to 1.0x to 1.35x as many tokens depending on content type. Second, Opus 4.7 thinks more at higher effort levels, which means more output tokens on hard problems.

    Anthropic’s own internal coding evaluation shows the net effect is favorable when measured against quality delivered per token, but the recommendation is to measure the difference on real traffic rather than assume. Token usage can be controlled via the effort parameter, task budgets, or simply prompting the model to be more concise. Anthropic published a migration guide with tuning advice.
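
    A quick way to follow that advice is to count the same prompt under both model ids with the SDK's token-counting endpoint and compare. This is only a sketch: claude-opus-4-6 as the prior model id is an assumption based on the naming pattern, and sample_prompt.txt stands in for a representative prompt from your own traffic.

    from anthropic import Anthropic

    client = Anthropic()
    prompt = [{"role": "user", "content": open("sample_prompt.txt").read()}]

    old = client.messages.count_tokens(model="claude-opus-4-6", messages=prompt)  # assumed prior model id
    new = client.messages.count_tokens(model="claude-opus-4-7", messages=prompt)
    ratio = new.input_tokens / old.input_tokens
    print(f"{old.input_tokens} -> {new.input_tokens} input tokens ({ratio:.2f}x)")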

    Claude Code Updates: /ultrareview and Auto Mode

    Claude Code gets two meaningful additions. The new /ultrareview slash command produces a dedicated review session that reads through changes and flags bugs and design issues that a careful reviewer would catch. Pro and Max users get three free ultrareviews to try it out.

    Auto mode, a permissions option where Claude makes decisions on behalf of the user so longer tasks run with fewer interruptions, has been extended from Pro to Max users. The pitch is that auto mode is safer than skipping all permissions while still enabling long autonomous runs.

    Cybersecurity Safeguards and the Cyber Verification Program

    Opus 4.7 ships with safeguards that automatically detect and block requests indicating prohibited or high-risk cybersecurity uses. During training, Anthropic experimented with efforts to differentially reduce cyber capabilities, meaning Opus 4.7’s cyber ceiling is intentionally lower than Mythos Preview’s.

    For legitimate users, Anthropic launched a Cyber Verification Program for security professionals doing vulnerability research, penetration testing, and red-teaming. Real-world data from these safeguards will inform how Anthropic eventually releases Mythos-class models more broadly.

    Safety and Alignment

    Opus 4.7 shows a similar safety profile to Opus 4.6 overall. Honesty and resistance to prompt injection attacks improved. Some measures slipped modestly, notably a tendency to give overly detailed harm-reduction advice on controlled substances. Anthropic’s alignment assessment concluded the model is largely well-aligned and trustworthy, though not fully ideal. Mythos Preview still holds the crown as the best-aligned model according to Anthropic’s evaluations. The full Claude Opus 4.7 System Card has the complete breakdown.

    Real-World Work Beyond Code

    Opus 4.7 posts a state-of-the-art score on the Finance Agent evaluation and on GDPval-AA, a third-party evaluation of economically valuable knowledge work spanning finance, legal, and other domains. Harvey reported 90.9% on BigLaw Bench at high effort with noticeably smarter handling of ambiguous document editing tasks, including correctly distinguishing assignment provisions from change-of-control provisions. Databricks measured 21% fewer errors than Opus 4.6 on OfficeQA Pro document reasoning. Vercel went as far as calling it the best model in the world for building dashboards and data-rich interfaces.

    Pricing and Availability

    Pricing holds at $5 per million input tokens and $25 per million output tokens. Opus 4.7 is live today across all Claude products, the Claude API as claude-opus-4-7, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.
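
    At those rates, a rough per-request cost model is simple arithmetic. The sketch below only applies the published prices; the tokenizer_multiplier default reflects the 1.0x-1.35x range from the migration notes and should be replaced with a value measured on your own traffic.

    def opus_47_cost_usd(input_tokens: int, output_tokens: int, tokenizer_multiplier: float = 1.0) -> float:
        # $5 per million input tokens, $25 per million output tokens (published pricing)
        effective_input = input_tokens * tokenizer_multiplier
        return effective_input / 1e6 * 5.0 + output_tokens / 1e6 * 25.0

    # Example: a 200k-token context with 8k output tokens at the top of the tokenizer range
    print(f"${opus_47_cost_usd(200_000, 8_000, tokenizer_multiplier=1.35):.2f}")  # ~$1.55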

    Thoughts

    The most interesting thing about this release is not the benchmark deltas, which are strong but expected for a point-release. It is the behavioral shift. When a dozen independent companies describe the same model as opinionated, willing to push back, self-verifying, and honest about its limits, that is a different product category than “next version, slightly better.” That is a model optimized for being a collaborator rather than an autocomplete.

    For solo builders running long agentic sessions, the loop resistance and long-horizon autonomy claims are the ones worth taking seriously. Genspark’s framing is sharp: a model that loops indefinitely on 1 in 18 queries wastes compute and blocks users. If Opus 4.7 genuinely closes that failure mode, the economics of overnight autonomous runs change meaningfully.

    The vision jump is the sleeper feature. 3.75 megapixel support plus the XBOW acuity number suggests computer-use agents are about to get a lot more reliable at reading actual screens. Anyone building browser agents, automated QA, or visual data extraction pipelines should retest their stacks this week.

    The instruction-following tightening is a real gotcha. Prompts written against Opus 4.6’s looser interpretation habits may produce surprising results when the model now takes every word literally. Teams with production prompt libraries should budget time for re-tuning rather than expecting a drop-in swap.

    Finally, the strategic framing around Mythos Preview is worth noting. Anthropic is explicitly using Opus 4.7 as a safeguards testbed for eventually releasing more capable cyber-capable systems. That is an honest acknowledgment that capability and deployment readiness are separate problems, and it sets a template for how frontier releases may work going forward.

  • Marc Andreessen on Zero Introspection, Founders vs. Managers, and Why Elon Musk Invented a New School of Management

    Marc Andreessen sat down with David Senra for a nearly two-hour conversation that covered everything from caffeine-induced heart palpitations to the structural collapse of managerialism, Elon Musk’s radical management system, and why the greatest entrepreneurs in history share one counterintuitive trait: they don’t look inward.

    This is one of the most information-dense podcast conversations of 2025. Here’s everything worth knowing from it.

    TL;DR

    Marc Andreessen believes introspection is a trap. The greatest founders, from Sam Walton to Elon Musk to Mark Zuckerberg, don’t dwell on the past or second-guess themselves. They just build. In this wide-ranging conversation with David Senra, Andreessen lays out his worldview on founders vs. managers, explains how he and Ben Horowitz modeled a16z after Hollywood talent agency CAA and JP Morgan’s merchant banking model, tells the origin story of Mosaic and Netscape, argues that moral panics about new technology are a pattern as old as written language, and makes a case that Elon Musk has invented an entirely new school of management that may be the least studied and most important organizational innovation in the world today.

    Key Takeaways

    1. Zero Introspection Is a Founder Superpower

    Andreessen opens the conversation by declaring he has “zero” introspection, and he says it like it’s a badge of honor. His reasoning is straightforward: people who dwell on the past get stuck in the past. He traces the entire modern impulse toward self-examination back to Freud and the Vienna-based psychoanalytic movement of the 1910s and 1920s, calling it a manufactured construct that would have been unrecognizable to history’s great builders. Christopher Columbus, Alexander the Great, Thomas Jefferson, Henry Ford: none of them were sitting around in therapy.

    Andreessen links this trait to the personality dimension of neuroticism, noting that many of the best founders he’s backed score essentially zero on that scale. They just don’t get emotionally derailed. That said, he acknowledges that some outstanding entrepreneurs are in fact quite neurotic. It’s a nice-to-have, not a prerequisite.

    2. Psychedelics Are Draining Silicon Valley of Its Best Talent

    One of the more provocative segments: Andreessen describes a pattern he’s observed repeatedly in Silicon Valley where high-performing founders get overwhelmed, discover psychedelics, have a transformative experience, and then quit their companies to become surf instructors in Indonesia. He brought this complaint to Andrew Huberman, who gave him a characteristically wise response: how do you know they aren’t happier now? Maybe the thing driving them to build was actually deep insecurity, and the psychedelics simply resolved it.

    Andreessen’s response is honest and funny: “Yeah, but their company is failing.” He and Senra both agree they aren’t willing to risk whatever is on the other side of that door. Daniel Ek of Spotify gets a shoutout here. Senra cites Ek’s philosophy that the best entrepreneurs don’t optimize for happiness, they optimize for impact.

    3. The Founder vs. Manager Debate Is the Central Tension of Modern Capitalism

    This is the intellectual core of the conversation. Andreessen draws heavily on James Burnham’s 1941 book The Managerial Revolution to frame two competing models of organizational leadership that have existed throughout the history of capitalism.

    The first is what Burnham called “bourgeois capitalism,” where the founder runs the company, their name is on the door, and they drive the thing forward through sheer force of will. Henry Ford in the 1920s. Elon Musk today. This was the norm for thousands of years across business, government, religion, and military conquest.

    The second is “managerialism,” the rise of the professional manager as a distinct class, trained at business schools, and treated as interchangeable across industries. This model emerged between the 1880s and 1920s and eventually produced the conglomerate era of the 1970s, where the premise was that a sufficiently skilled manager could run any business regardless of domain expertise.

    Andreessen’s argument is that Burnham’s thesis has collapsed. Managers are fine when nothing changes, when soup is soup and banks are banks. But the moment the environment shifts, managerial training is useless. SpaceX is the clearest example: imagine being a professionally trained manager at a legacy rocket company when a “crazy guy in California” figures out how to land rockets on their tail. Your MBA isn’t going to help.

    The a16z founding thesis, then, is essentially this: it’s much more likely that you can take a founder and teach them to manage at scale than take a manager and teach them to be a founder. That insight has only gotten stronger over time as manager-led institutions across the West lose trust and credibility because they can’t adapt.

    4. How a16z Was Built: The CAA Playbook and the Barbell Theory

    Before starting a16z, Andreessen and Horowitz spent a year and a half studying how other relationship-driven industries had evolved, including private equity, hedge funds, investment banks, law firms, advertising agencies, management consultancies, and Hollywood talent agencies.

    Their key structural insight was what they call the “barbell” or “death of the middle.” In industry after industry, they saw the same pattern: the middle-market firms collapse, and what survives is either ultra-lean boutique operators on one side or scaled platforms with massive networks and deep resources on the other. Department stores like Sears and JCPenney died, replaced by Gucci stores (boutique) and Amazon (scale). Mid-market investment banks disappeared while Allen & Company (boutique, founded in the 1920s, deliberately stayed small) and Goldman Sachs / JP Morgan (scaled) survived.

    The same thing had happened in private equity (KKR scaling up while solo operators stayed small), hedge funds, and advertising (the story arc of Mad Men literally dramatizes this process).

    In venture capital circa 2009, every firm was still operating as a “tribe of lone wolves.” Partners didn’t collaborate. Secretly, many didn’t even like each other. They were all fighting for bigger slices of what they perceived to be a fixed pie. Generational succession was failing. Andreessen and Horowitz decided to build the first scaled venture platform.

    The most direct inspiration came from Michael Ovitz and CAA. When Ovitz started CAA in 1975, Hollywood talent agencies were collections of independent agents. Your agent knew who they knew, and nobody else at the firm was available to help you. Ovitz changed everything. He held his team meeting at 7am instead of the industry-standard 9am, made calls by 8am (two hours before competitors), and called not just his own clients but other agencies’ clients too. The compounding effect was devastating to competitors who were still running on decades-old assumptions.

    5. The Origin Story of Mosaic, Netscape, and the Commercial Internet

    Andreessen provides a detailed firsthand account of building Mosaic at the University of Illinois, the first graphical web browser, and then co-founding Netscape with Jim Clark. A few highlights that rarely get told:

    The internet was literally illegal to commercialize. The NSF’s “acceptable use policy” prohibited commercial activity on the network. Andreessen personally served as tech support for Mosaic, fielding emails from users who thought their CD-ROM tray was a cup holder. He created a deliberately ambiguous commercial licensing form and watched 400+ commercial licensing requests pile up. That was the signal that there was a real business.

    He met Jim Clark at a legendary dinner at an Italian restaurant in Palo Alto with a dozen potential recruits. Andreessen was the only one who said yes. He also got so drunk on red wine (his first time drinking it) that he ripped the entire front end off his new car pulling out of the parking garage.

    The conversation also covers the concept of “Eternal September,” the moment in September 1993 when AOL connected its two million users to the internet, permanently transforming it from an ivory-tower utopia of the world’s smartest people into the mainstream consumer platform we know today.

    6. Jim Clark Was the Elon Musk of the Early ’90s

    Andreessen gives a vivid portrait of Jim Clark, the founder of Silicon Graphics, who had the vision to predict both the GPU revolution (what became Nvidia) and the networked computing revolution (what became the internet) years before anyone else. Clark was volatile, brilliant, and charismatic. He tried to push SGI to build a consumer graphics chip and to pursue networked computing, but the professional CEO the VCs had installed wouldn’t budge. So Clark left and started Netscape.

    The Clark story maps perfectly onto Andreessen’s founders-vs.-managers thesis. Silicon Graphics was an incredible company, but it was the founder (Clark) who saw the future, and the manager who refused to act on it. The company that capitalized on Clark’s vision of putting 3D graphics on a cheap chip was Nvidia, which had to be a new company because SGI’s management wouldn’t go there.

    7. The Two Jims: How Andreessen Got His Dual Education

    Andreessen says his formative training came from two mentors who were “polar opposites”: Jim Clark (the ultimate founder archetype) and Jim Barksdale (the ultimate professional manager, who had run parts of IBM, AT&T, and FedEx before becoming Netscape’s CEO).

    Clark represented the “will to power” founder mentality, a fountain of creativity who would bludgeon the world into accepting his ideas. Barksdale represented operational discipline: systematizing, scheduling, building processes. The key was that Barksdale never shut down the innovation; he channeled it. One of the best anecdotes: Clark got heated during a staff meeting about wanting to pursue a new idea, and Barksdale pulled him aside and defused the tension with a perfectly timed Mississippi drawl one-liner that had Clark laughing. They got along great from that point forward.

    Andreessen sees himself and Ben Horowitz as a modern version of this dynamic, with Andreessen playing more of the Clark role (fountain of ideas) and Horowitz playing more of the Barksdale role (operational discipline), though both mix it up.

    8. Moral Panics Are a Permanent Feature of Human Civilization

    Andreessen runs through a history of technology-driven moral panics that stretches across millennia: Plato and Socrates arguing that written language would destroy oral knowledge transmission. The printing press. Playing cards. Novels. Bicycles (which produced the incredible “bicycle face” panic, where young women were warned that the physical exertion of cycling would freeze their faces in an ugly expression, permanently ruining their marriage prospects). Jazz. Rock and roll. Elvis Presley being filmed from the waist up. Comic books. The Walkman. Calculators. Dungeons & Dragons. Heavy metal. Hip-hop (Jimmy Iovine was literally compared to mustard gas in congressional hearings). The early internet.

    The point isn’t that technology doesn’t change society. It does. The point is that the panicked, apocalyptic reaction is the same every single time, and it has never been correct at the catastrophic level predicted.

    9. Edison Didn’t Know What the Phonograph Would Be Used For, and Neither Do AI Inventors

    Andreessen tells a favorite story: Thomas Edison invented the phonograph fully expecting it would be used for families to listen to religious sermons at home after a long day of work. Instead, people immediately used it for ragtime and jazz music, which horrified Edison. The lesson is that the inventors of a technology are often the least qualified people to predict its long-term societal implications, because they’re too buried in the technical specifics. He applies this directly to AI, specifically calling out Geoffrey Hinton as “an actual capital-S socialist” whose prediction that AI will cause mass unemployment requiring universal basic income is really just his pre-existing political ideology dressed up as technological forecasting.

    10. Elon Musk Has Invented a New School of Management

    The final major section is Andreessen’s detailed breakdown of what he calls Elon Musk’s management method, which he says may be the “least studied and understood thing” in the world right now, despite clearly producing the best results of any organizational method operating today.

    The method has several key components:

    Bypassing the management stack. Andreessen draws a contrast with IBM in the late 1980s, where he worked as an intern. IBM had 12 layers of management between the lowest employee and the CEO. Each layer lied to the one above it to look good. After 12 rounds of compounding lies, the CEO had absolutely no idea what was happening in his own company. IBM even had an internal term for this: “the big gray cloud,” the entourage of executives in gray suits who followed the CEO everywhere and prevented him from ever speaking to anyone actually doing the work. Musk does the exact opposite: he goes directly to the engineer working on the problem and sits down to solve it with them.

    Bottleneck-first thinking. Musk runs each of his companies as a production process. Every week, he identifies the single biggest bottleneck in each company’s production pipeline. Then he personally goes and fixes that bottleneck with the responsible engineer. At Tesla, this means he’s resolving the critical production bottleneck 52 times a year, personally. Legacy automaker CEOs are not doing anything remotely comparable.

    120 design reviews per day. Musk does approximately one full day per week at each company, running 12-14 hour stretches of design reviews at five minutes per engineer. That’s roughly 12 reviews per hour, 120 per day. Each review identifies whether the project is on track, and if not, whether the problem is the production bottleneck. If it is, that’s where Musk spends the rest of the night, sometimes until 2am, working hands-on with the engineer to fix it.

    Maneuver warfare speed. Andreessen compares Musk’s operating tempo to “maneuver warfare,” the military doctrine of acting faster than the opponent can react. Where a normal company might take six months to solve a production problem, Musk solves it in four hours. The cycle time gap is so massive it’s almost incomparable.

    Shocking competence through selection pressure. Someone Andreessen knows described joining SpaceX as “being dropped into a zone of shocking competence.” Two forces create this: Musk rapidly identifies and fires underperformers (which he can do because he’s personally talking to the people doing the work), and the world’s best engineers actively want to work for him because he’s the only CEO who can work alongside them as a genuine technical peer. What engineer wouldn’t want to design a rocket engine with Elon Musk as their engineering partner?

    Andreessen introduces a half-serious, half-brilliant metric for founders: the “milli-Elon.” One milli-Elon is one-thousandth of Elon Musk’s founder capacity. Ten milli-Elons would be fantastic. A hundred, meaning 10% of an Elon, would get you all the money in the world. Most people, he says, are operating at about one milli-Elon or 0.1 milli-Elons.

    11. Starlink Is the Craziest Side Project in Business History

    Andreessen ends the Musk discussion by noting that Starlink, now with over 10 million subscribers, is essentially a side project at SpaceX. Two previous attempts at satellite-based internet (Teledesic, backed by Bill Gates and Craig McCaw, and Motorola’s Iridium) were catastrophic failures and classic business school case studies in capital destruction. Musk looked at that track record and said he’d do attempt number three as a side project, using the logic that if SpaceX’s reusable rockets were going to be launching constantly, they might as well carry their own satellites providing consumer-priced internet access. The idea was considered insane by anyone who knew the history. And of course, it worked.

    Thoughts

    There’s a reason this conversation hit so hard. Andreessen isn’t just sharing opinions. He’s connecting a mental model of organizational theory that spans JP Morgan’s 1880s merchant bank, Michael Ovitz’s 1975 Hollywood disruption, James Burnham’s 1941 political theory, IBM’s 1989 collapse, and Elon Musk’s 2025 management operating system into a single coherent framework. Very few people have both the lived experience and the historical knowledge to draw those connections, and even fewer can articulate them this clearly in real time.

    The “zero introspection” thesis is going to bother a lot of people, and it should be provocative. But the nuance is there if you listen carefully. Andreessen isn’t saying self-awareness is bad. He’s saying that the specific mode of backward-looking, guilt-driven rumination that modern therapeutic culture encourages is antithetical to the builder personality type. The great founders aren’t unaware. They’re relentlessly forward-oriented.

    The founder vs. manager framework is the most underrated idea in business strategy right now. It explains why so many legacy institutions are failing simultaneously, not because the people running them are dumb, but because the managerial class was optimized for stability in a world that no longer rewards it. When the environment changes, and it’s changing faster than ever, the only people equipped to respond are founders.

    The Elon Musk management breakdown alone is worth the entire conversation. The concept of identifying and personally fixing the critical production bottleneck every single week, for every company, by going directly to the engineer rather than through layers of management, is so simple it’s almost embarrassing that no one else does it. But that’s Andreessen’s point: almost no one can do it, because it requires a CEO who is simultaneously a world-class manager and a world-class technologist. That combination barely exists.

    If you’re a founder, operator, or anyone trying to build something that matters, this is required listening.

  • John Arnold Interview: Legendary Energy Trader Reveals China Robotics & EV Secrets, US Energy Bottlenecks for AI Data Centers, Nuclear Future & Systems Philanthropy

    TL;DR: In a wide-ranging Invest Like The Best episode, legendary energy trader and philanthropist John Arnold shares explosive insights from his recent China trip — witnessing unmatched manufacturing speed, EV factories built in 17 months, and a robotics explosion with over 100 companies. He breaks down how he built the ultimate “seat” in natural gas trading (from Enron to Centaurus), the massive AI data center power surge reshaping US energy markets, why permitting reform and fighting NIMBYism are critical, the long-term promise of nuclear, geothermal, and solar+battery tech, plus his systems-thinking approach to reforming criminal justice, healthcare, education, and journalism. This is essential listening for investors, policymakers, and anyone tracking AI, China competition, and American infrastructure.

    Key Takeaways from the John Arnold Podcast

    • China has leapfrogged the West in manufacturing speed and scale thanks to a highly educated, entrepreneurial workforce, domestic supply chains within 200 miles, and government subsidies fueling intense robotics and EV competition.
    • NIO’s EV factory went from groundbreaking to first car in just 17 months with heavy robotics automation — compared to 40-100-year-old US auto plants — proving China’s ability to build quality products at unbeatable prices.
    • Over 100 robotics companies in China compete fiercely under five-year plans; winners get supported while losers face consolidation, creating better technology but overcapacity challenges.
    • John Arnold’s trading edge came from building the “best seat” in the industry: massive risk capital, top talent, proprietary data systems, 3-and-35 fees, and market-making dominance in natural gas futures and basis trades.
    • His entrepreneurial spark started with high-school baseball card arbitrage in the late 80s/early 90s — spotting geographic price differences on bulletin boards and scaling nationally before age 17.
    • US energy system goals: affordable, reliable, clean electrons with energy security and good jobs — but politics shift every 4-8 years while infrastructure takes decades to build.
    • AI data centers are the biggest new demand driver in decades — hyperscalers prioritize speed over price and will last at least through 2030, creating enormous opportunities in energy assets.
    • NIMBYism and outdated permitting are the #1 bottleneck; China builds without these delays, threatening US competitiveness unless federal reform passes this year.
    • Inter-regional transmission lines are a win-win-win solution (lower costs, higher reliability, fewer emissions) but remain nearly impossible to permit and build in the US.
    • Nuclear (traditional AP-1000 and new SMRs/fusion) is promising for clean baseload but extremely expensive and 10-15 years away at scale; advanced geothermal is the most exciting near-term play for data centers.
    • Solar panel costs keep falling (mostly made in China) but total delivered PPA prices are 50%+ higher than 2020 lows due to land, labor, transmission, and capital costs; robotics and co-located data centers are key innovations.
    • Foundations should intentionally get weaker over time because institutions become bureaucratic and risk-averse; true philanthropy takes risks on long-term systems change that governments and markets avoid.
    • Criminal justice reform should focus on increasing probability of getting caught (via tech, cameras, drones) rather than harsher penalties; cash bail is outdated — pretrial decisions should prioritize public safety and court appearance.
    • Healthcare is riddled with market failures (asymmetric information, third-party payers, regulatory gaming like skin-substitute loopholes) requiring heavy regulation that private actors constantly exploit.
    • EdTech and AI in education have shown almost no real-world outcome improvements despite 20 years of hype; the human teacher-student connection remains irreplaceable for engagement.
    • Journalism is the “fourth estate” and deserves philanthropic support for local investigative work that commercial models no longer sustain.

    Detailed Summary of the John Arnold Interview on Invest Like The Best

    John Arnold’s Eye-Opening Trip to China: Speed, Scale & Robotics Revolution

    Arnold spent a week touring factories and meeting executives across China. His biggest takeaway: the country has transformed in 30 years from copying the West to leapfrogging it. Key advantages include a highly educated, entrepreneurial population, rapid capital deployment, deep domestic market, and hyper-flexible skilled labor that can scale by the thousands overnight. Supply chains are incredibly tight — one battery executive noted every supplier is within 200 miles and reachable same-day.

    The NIO EV factory visit was the standout: from first shovel to first car rolling off the line in just 17 months. The plant uses extensive robotics while still employing workers, producing premium EVs ($40k-$80k range) plus a new model under $10k. Contrast this with US auto plants averaging 40 years old (some over 100). China now has over 100 EV manufacturers and over 100 robotics companies, fueled by provincial subsidies tied to five-year strategic plans. Intense competition drives innovation but also overcapacity — China is now shifting to support “winners” for global dominance.

    Flights between the US and China are down 70%, Western expats in Shanghai down 50-75%, and American students down 90%. Arnold sensed growing Chinese confidence: “We used to copy the West. Now we will teach the West.”

    How John Arnold Became the World’s Greatest Energy Trader: Discipline, Data & the “Best Seat”

    Starting at Enron in 1995 at age 21, Arnold built Centaurus Advisors after the 2001 collapse. He describes total immersion: 12-hour trading days, industry nights, and constant mental replay. His edge wasn’t just talent — it was engineering the ultimate industry “seat”: massive retained earnings from early success allowed 3-and-35 fees, top talent pay, proprietary data systems, custom trade-entry platforms, and the ability to market-make in natural gas futures at Henry Hub plus basis trades across regions.

    His teenage baseball card arbitrage business (late 80s/early 90s) taught him the same lessons: spotting geographic price discrepancies on early internet bulletin boards, scaling nationally, and knowing every product’s real-time value. This directly translated to knowing every natural gas month’s fair value better than anyone else.

    The State of US Energy Markets & the AI Data Center Tsunami

    Arnold outlines five core goals for the US energy system: affordable, reliable, clean, secure, and job-creating electrons. America has abundant resources (oil, gas, wind, solar) and innovation capacity, but politics reset priorities every election cycle while infrastructure takes decades.

    The wildcard: AI data centers. Hyperscalers (the most profitable companies ever) are pouring capital in with zero price sensitivity and maximum speed urgency. Demand visibility is clear through 2030 and will create massive opportunities for new energy technologies.

    NIMBYism, Permitting Reform & Transmission Bottlenecks

    The biggest threat to US competitiveness is our inability to build. NIMBY (“Not In My Backyard”) opposition, endless lawsuits, and multi-veto-point permitting have stalled projects for 10+ years. China faces none of this friction. Arnold started a transmission company five years ago because inter-regional lines deliver lower costs, higher reliability, lower emissions, and more jobs — yet private capital has largely given up. He remains optimistic about bipartisan federal permitting reform passing in 2026.

    Nuclear, Geothermal, Solar, Batteries & the Path to Energy Abundance

    Traditional nuclear (Vogtle AP-1000) proved we can build safe plants — but at enormous cost and labor intensity (peak 9,000 skilled workers). Small modular reactors (SMRs) and fusion are promising but 10-15 years from scale and must compete on pure economics. Advanced geothermal stands out as the most exciting near-term baseload solution leveraging existing oil-and-gas talent.

    Solar panels get cheaper every year (China-dominated manufacturing), but total delivered cost has risen 50%+ since 2020 lows due to land, labor, transmission, and higher capital costs. Battery prices are also now rising with lithium. Robotics for solar construction and co-locating data centers with generation are critical innovations.

    Housing Reform Parallels & the Broader Permitting Crisis

    Arnold draws direct parallels between energy and housing: YIMBY (“Yes In My Backyard”) movements in California, Austin, Montana, and the Northeast show bipartisan recognition that affordability is now a voter priority. Politicians face short-term pressure for subsidies instead of long-term deregulation — the same trap energy faces.

    John Arnold’s Revolutionary Approach to Philanthropy & Systems Reform

    Arnold’s foundation deliberately aims to become less powerful over time because institutions grow bureaucratic and risk-averse. Philanthropy’s unique role: take political and economic risks on long-term systems change that markets and governments avoid.

    In criminal justice, the focus is probability of getting caught over penalty severity. Reforms in New Jersey and Kentucky replaced cash bail with risk-based pretrial detention. Tech (cameras, drones, real-time crime centers) offers solutions but raises privacy trade-offs that wealthy communities have already embraced.

    Healthcare is plagued by asymmetric information and regulatory arbitrage (e.g., constant new skin-substitute products to reset pricing windows). Education outcomes have declined despite massive EdTech investment — AI + AR/VR may finally crack engagement, but 20 years of hype have delivered zero aggregate gains. Journalism, the “fourth estate,” needs philanthropic funding for local investigative work that commercial models abandoned.

    Final Thoughts: Why John Arnold’s Interview Matters in 2026

    John Arnold’s conversation is a masterclass in systems thinking applied across trading, energy, China geopolitics, and American reform. His China trip should serve as a wake-up call: while the US debates, China executes at unprecedented speed and scale. The AI data center boom gives America a once-in-a-generation chance to rebuild energy infrastructure — but only if we fix permitting, transmission, and NIMBY gridlock now.

    For investors, the signals are clear: advanced geothermal, solar robotics, and transmission technologies are poised for explosive growth. For policymakers, the message is urgent — energy abundance is national security in the AI age. And for philanthropists and operators everywhere, Arnold’s insistence that institutions must weaken to stay effective is a profound leadership lesson.

    If you’re tracking AI infrastructure, US-China competition, or systems-level change, this is one of the highest-signal conversations of 2026.

  • Karpathy Autoresearch Tutorial 2026: How AI Agents Run 100+ LLM Pretraining Experiments Overnight – Complete Guide & Setup

    Andrej Karpathy Autoresearch is the breakout open-source project of March 2026. Released just days ago, this ~630-line minimalist framework lets AI coding agents (Claude, GPT-4o, Gemini, etc.) autonomously run real LLM pretraining research experiments on a single GPU while you sleep. It’s already producing real improvements that transfer to bigger models and hitting new leaderboard entries.

    Karpathy’s original announcement post on X has already drawn 23k+ likes and 7M+ views.

    If you’re searching for the best “Karpathy Autoresearch tutorial”, “how to setup autoresearch”, or “AI agents LLM experiments overnight”, this is the most detailed, up-to-date guide on the internet.

    TL;DR – Karpathy Autoresearch in 60 Seconds

    Karpathy Autoresearch is a single-GPU agent-driven system where:

    • You write the high-level research goal in program.md
    • The AI agent only edits train.py
    • Every experiment runs for exactly 5 wall-clock minutes
    • Better val_bpb score? Keep the git commit. Worse? Auto-revert
    • ~100 experiments per night → real breakthroughs while you sleep

    Official repo: github.com/karpathy/autoresearch. Works today on NVIDIA, Mac, and Windows.

    What Is Andrej Karpathy Autoresearch and Why Everyone Is Talking About It

    In his viral X post Karpathy wrote: “One day, frontier AI research used to be done by meat computers… That era is long gone.”

    Autoresearch takes his famous nanochat training core, strips it to a single file, and hands the entire research loop to AI agents. The human only updates the strategy in program.md. The agent experiments with architecture, optimizers, attention variants, RoPE, batch sizes — everything — and keeps only improvements via git.

    Real results: improvements discovered on depth-12 models already transfer to depth-24 and have landed nanochat a new “time to GPT-2” leaderboard entry after ~650 experiments.

    How Autoresearch Actually Works (Step-by-Step)

    The system is deliberately minimal so agents can understand and modify it instantly:

    1. prepare.py (DO NOT TOUCH) – dataset + BPE tokenizer.
    2. program.md – your research instructions and constraints.
    3. train.py – full GPT model + Muon+AdamW optimizer + training loop. This is the ONLY file the agent edits.

    Agent loop (runs forever): read program.md → edit & commit → train 5 minutes → measure val_bpb → keep or revert. ~12 experiments per hour.
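
    The loop is simple enough to sketch in a few lines. This is not code from the repo, just an illustration of the keep-or-revert mechanic under stated assumptions: agent_edit_train_py and run_experiment are hypothetical stand-ins for whatever coding agent you drive and for the fixed 5-minute training run that returns val_bpb.

    import subprocess

    def git(*args: str) -> None:
        subprocess.run(["git", *args], check=True)

    def autoresearch_loop(agent_edit_train_py, run_experiment) -> None:
        best_bpb = run_experiment()                          # baseline: train current train.py for 5 minutes
        while True:
            agent_edit_train_py(open("program.md").read())   # agent proposes a change to train.py
            git("commit", "-am", "experiment: agent edit to train.py")
            bpb = run_experiment()                           # fixed 5-minute wall-clock budget
            if bpb < best_bpb:                               # lower val_bpb is better
                best_bpb = bpb                               # keep the commit
            else:
                git("revert", "--no-edit", "HEAD")           # auto-revert the regression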

    Karpathy Autoresearch Setup Guides – NVIDIA, Mac & Windows

    1. Official NVIDIA Setup (5 Minutes)

    curl -LsSf https://astral.sh/uv/install.sh | sh
    git clone https://github.com/karpathy/autoresearch.git
    cd autoresearch
    uv sync
    uv run prepare.py
    uv run train.py

    Then point Claude/GPT-4o (or your preferred coding agent) at the repo and say: “Have a look at program.md and let’s kick off a new experiment!”

    2. MacOS / Apple Silicon (MLX Fork)

    github.com/trevin-creator/autoresearch-mlx – same commands, optimized for M-series chips.

    3. Windows RTX Fork

    github.com/jsegov/autoresearch-win-rtx – full PowerShell compatibility.

    Key Takeaways – What You Need to Remember

    • Only ~630 lines of code and runs on one GPU
    • Humans edit only program.md; agents do everything else
    • Fixed 5-minute runs = fair comparisons
    • val_bpb metric is reliable and vocab-size independent (see the formula sketch after this list)
    • Real transferable gains proven (depth-12 → depth-24)
    • Mac, Windows, and NVIDIA versions all live today
    • Community already reporting 19%+ overnight gains
    • This is the seed of swarm-style AI research labs
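
    On the val_bpb point: bits-per-byte is the standard way to make validation loss comparable across tokenizers, because it normalizes by the raw bytes of the validation text rather than by tokens. A minimal sketch of the usual definition (not code lifted from the repo):

    import math

    def val_bpb(total_loss_nats: float, total_utf8_bytes: int) -> float:
        # total cross-entropy over the validation set, converted from nats to bits,
        # divided by the raw UTF-8 byte count of that text; the byte denominator is
        # what makes the number independent of tokenizer and vocab size
        return total_loss_nats / (math.log(2) * total_utf8_bytes)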

    Detailed Summary of Andrej Karpathy Autoresearch

    Released on March 7-8, 2026, Autoresearch is the minimal viable version of fully autonomous LLM research. Human strategy → AI agent execution → git-tracked progress. The entire system is designed to be forked and scaled into a massively collaborative platform where thousands of agents contribute “tiny papers” across branches.

    Karpathy’s Vision: From Solo Agents to Swarm Research

    In his follow-up X post Karpathy explained the bigger idea: asynchronously massively collaborative agent research (SETI@home style). The current code is just the seed — the real future is agents forking, discussing via GitHub, and accumulating discoveries across thousands of branches.

    Final Thoughts on Karpathy Autoresearch

    From the explosive announcement post to Mac/Windows forks appearing within 48 hours and real leaderboard improvements already confirmed, this project feels like the moment everything changed. Anyone with a GPU now has a full AI research team working overnight for almost zero cost. The human’s only job is writing the research strategy prompt. The era of manual experimentation is ending — and it’s ending fast. Download the repo tonight, read Karpathy’s original X posts for context, write your first program.md, and wake up to new discoveries.

    Ready to Start? Clone the repo, run the setup commands above, and let your agents take over. Bookmark this page — new forks and agent-generated papers will drop daily. Drop your overnight results in the comments!

  • Naval Ravikant 2026 Megasode: Every Lesson on Wealth, Happiness, Judgment & Truth (4-Hour Breakdown)

    TLDR

    Naval Ravikant sat down with Eric Jorgenson (author of The Almanack of Naval Ravikant) for a 4+ hour megasode on the Smart Friends podcast — his most comprehensive public conversation in years. Five years after the original Almanack, Naval updates and expands his thinking across five pillars: building wealth, building judgment, learning happiness, saving yourself, and philosophy. The biggest shifts? He now leans heavily on David Deutsch’s definition of wealth as “the set of physical transformations you can effect,” sees AI as the ultimate leverage tool (not a replacement for human judgment), and has moved past chasing happiness toward pursuing truth, love, and beauty. He’s working on a new stealth company, has met roughly a dozen people he considers genuinely enlightened, and believes the most important formula for life is: stay healthy, get wealthy, seek truth, give love, and create beauty.


    Key Takeaways

    On Wealth

    Deutsch’s definition is deeper than “assets that earn while you sleep.” Naval now defines wealth as the set of physical transformations you can effect — and the biggest driver of that capability is knowledge, not capital. If you removed Elon Musk from SpaceX, the wealth doesn’t just transfer. It disappears. The value is in the knowledge, not in the factory.

    Knowledge is the real multiplier. Ten modern humans can change more than ten paleolithic humans — not because of capital, but because of accumulated knowledge. As a society gains knowledge, it becomes wealthier. As an individual gains knowledge, they become wealthier. This is why Marx was fundamentally wrong: value is not in the capital. It’s in the people doing things.

    Ethical wealth creation is not only possible — it’s the norm in free markets. The common critiques of capitalism target cronyism, money printing, and government favoritism. None of that is free market capitalism. Real capitalism is a minimum structured set of rules that channels competitive energy into creating property instead of fighting over it.

    This is the greatest period for wealth creation in human history. More knowledge, more capital, more leverage than ever before. If you’re moderately intelligent, not afraid of hard work, and flexible, you can do extremely well. But it takes 10 to 30 years. There are no get-rich-quick schemes.

    AI is the ultimate leverage tool, not a replacement. Software engineers aren’t being replaced by AI — AI is letting software engineers replace everybody else. The people saying “programming is dead” are completely wrong. The most leveraged engineers are the ones building AI systems, then the ones using them. AI is great when wrong answers are okay. For anything requiring creativity or judgment at the edge, you still need humans.

    Good products are hard to vary. Drawing from David Deutsch’s epistemology, Naval argues that the best products — like the iPhone — are like good scientific explanations: you can’t change the details without breaking them. They encapsulate deep knowledge, have surprising reach into applications the creators never imagined, and exhibit winner-take-all network effects.

    On Judgment

    Judgment is the most valuable thing in the age of infinite leverage. The difference between a CEO who’s right 80% of the time and one who’s right 85% of the time is worth billions of dollars when you’re steering a multi-trillion dollar ship. Direction matters more than any other single thing.

    Judgment evolves into taste. First you reason through decisions logically. Then your subconscious enters into it (judgment). Then your whole body reacts to it (taste). The Rick Rubins and Steve Jobs of the world operate at the level of taste — they can’t fully explain why something is right, they just know. Naval says his investing is now “almost entirely taste.”

    It takes time to develop your gut, but once it’s developed, don’t listen to anything else. This applies to people, investments, products, and life decisions. Older people have very good judgment about other people because human interaction is the one area where everyone is constantly gaining experience.

    Learn from specific to general, not general to specific. This is Seneca’s insight: encounter reality, test it, learn from it, then generalize. Going the other way creates what Nassim Taleb calls “intellectual yet idiot” — someone overeducated and underpracticed. If you want to be a philosopher king, first be a king.

    Hard work is non-negotiable, but it shouldn’t feel like work. The most productive people work intensely on problems that fascinate them. The biggest breakthroughs come during deep immersion — 24-36 hour sessions where you can’t put the problem down. But if it feels like forced drudgery, you’ll lose to someone who finds it genuinely enjoyable.

    AI doesn’t have judgment. It has incredible information retrieval — the ability to cross-correlate all human knowledge and return the conventional correct answer. But for creative problems, novel situations, or anything requiring values and binding principles, AI falls short. It raises the tide for everyone, but there’s no “alpha” in the AI answer because everyone gets the same one.

    On Happiness

    Naval’s latest thinking: he’s not sure happiness exists. Happiness is a construct of the mind, a thought claiming to be a state. When the thought disappears, there’s no “you” there to be happy or unhappy. His focus has shifted from pursuing happiness to cultivating peace — being okay with things as they are, with few and consciously chosen desires.

    The three big ones are wealth, health, and happiness — pursued in that order, but their importance is reversed. Naturally happy people have the greatest gift and don’t need the others. Health matters more than wealth (a sick man only wants one thing). But most people will pursue them wealth-first simply because of energy, flexibility, and the practical reality of financial obligations when young.

    The more you think about yourself, the less happy you’ll be. Depressed people ruminate on themselves. Having motives larger than yourself — your mission, your children, your contribution — makes setbacks hurt less because they’re not personal. This is why Naval says: live for something larger than yourself, but only on your own terms.

    Chronic unhappiness is an ego trip. Acute unhappiness is real and useful — it’s a signal. But chronic unhappiness is wanting to feel more “you,” more separate, more important. Identity creates motivated reasoning. The thinner your identity, the more clearly you can see reality.

    The modern devil is cheap dopamine. Every deadly sin is a form of cheap dopamine. The direct pursuit of pleasure causes addiction and dopamine burnout. Virtues are the opposite — long-term individually beneficial behaviors that also create win-win outcomes for society. All virtues can be reinterpreted as long-term selfishness.

    Meditation isn’t about enlightenment — it’s about self-observation. When you’re more self-aware, you catch your mind doing things that aren’t in your long-term interest. You can reset, question whether a desire matters, and choose whether to reinterpret a situation or address the underlying problem.

    You don’t store memories — you store interpretations of memories. Changing those interpretations is what forgiveness actually is. Psychedelics, meditation, and honest introspection all work partly because they allow you to reprocess and reframe past experiences.

    On Saving Yourself

    Nobody is coming to save you. An ideal life is designed, not inherited. Naval claims his life is “really good” — at any given time he’s doing what he wants, nothing is obligatory, and if something stops being enjoyable, he changes it very quickly. This requires ruthless honesty about relationships, obligations, and what you actually want.

    Every relationship is transactional — and that’s okay. Naval draws a hard line against false obligations. He doesn’t attend obligatory events, weddings, or ritualistic celebrations. The result: he’s left with people who are similarly free, low-ego, and voluntarily present. Nobody takes each other for granted.

    The secret to a happy relationship is two happy people. You can’t be happy with your spouse if you’re not happy alone. Happiness is personal and must be tackled individually. Putting relationships ahead of your own inner work gets you neither.

    God, kids, or mission — find at least one. Naval has all three. His “God” is personal and unarticulated. Family is irreplaceable (expand your definition as you age). And mission means actively building — right now that’s a stealth company and this kind of conversation.

    Explore widely, then invest deeply. Modern society has made exploration easy, but all the benefits come from compound interest. You don’t learn through 10,000 hours — you learn through 10,000 honest iterations. Do, reflect, change, try again. Once your judgment tells you what fits, stop exploring and start compounding.

    The only true test of intelligence is whether you get what you want out of life. This is a two-part test: choosing what to want (the harder part) and then getting it. If you pass that test, there’s nothing to be envious of. Choose inspiration over envy — find the part of someone else’s success that resonates with something inside you.

    On Philosophy

    Naval’s philosophical foundation: evolution + Buddhism + Deutsch. Evolution explains humans. Buddhism is the most time-tested internal philosophy. David Deutsch’s epistemology — good explanations that are hard to vary, conjecture and criticism — provides the best framework for understanding progress in science, business, and society.

    Truth is a crystal in the multiverse. In the many-worlds interpretation, true knowledge replicates across more universes because it works. False knowledge is infinitely variable but gets eliminated. The “Rickiest of the Ricks” (from Rick and Morty) is the most truth-oriented version — lowest ego, least motivated reasoning, operating from the most universal principles.

    Enlightenment is binary, not a path. Naval has met about a dozen people he considers genuinely enlightened. They share one trait: persistent experience of “no self.” Nothing bothers them — not cancer diagnoses, not personal failures. It’s not that they lack desire or capability. They’re often more effective, not less. But they don’t take anything personally.

    The self is just a thought. When you look for the self — really look — you can never pin it down. It’s like a burning stick whirled in a circle that appears to be a flaming wheel. Just thoughts convincing you there’s someone there. Enlightened people have seen through this and their default state is pure awareness.

    The real truths are heresies. There’s a 2×2 matrix of truth vs. spreadability: conventional wisdom (true and spreads), fake news (false and spreads), nonsense (false and doesn’t spread), and heresies (true but don’t spread). Heresies don’t spread because any truth that lowers group cohesion gets suppressed. This is why the greatest philosophers are read long after their deaths — they told harsh truths while alive that society wasn’t ready to hear.

    Read the best 100 books over and over. Naval reads authors, not books. He reads philosophers, not authors. He’ll consume everything by Schopenhauer, Deutsch, Osho, Taleb, Krishnamurti — and until he’s finished everything by one thinker, he won’t move to the next. He judges philosophers by the outcomes they achieved in their own lives. A philosophy that led its creator to misery is suspect.

    Simulation theory is just modern religion. Every era maps its dominant technology onto religion — the sun god, the god-king, the mechanical universe, and now the computational universe. Naval finds understanding relativity, quantum physics, and cosmology more satisfying than saying “the universe is a computer.” He maps Buddhism onto simulation theory (the white room in the Matrix = pure consciousness = enlightenment) but considers sim theory unfalsifiable and reductive.


    Detailed Summary

    Part 1: Building Wealth (0:00 – 37:49)

    The conversation opens with Naval updating his definition of wealth through David Deutsch’s lens. Where he originally defined wealth as “assets that earn while you sleep” — a practical definition aimed at escaping the 9-to-5 trap — he now sees wealth more expansively as the set of physical transformations you can effect. This reframes wealth from a passive accumulation game to an active capability powered primarily by knowledge.

    Naval makes a forceful case that knowledge, not capital, is the real wealth multiplier. He uses SpaceX as his central example: remove Elon Musk and the wealth doesn’t just redistribute — it evaporates, because the knowledge that makes SpaceX valuable disappears with the people who hold it. This is why Marxism fundamentally fails. The value isn’t in the factories. You can’t slice it up and redistribute it like gold.

    He addresses the ethics of capitalism head-on, acknowledging that the majority of economic activity involves people fighting over existing wealth rather than creating new wealth (he draws an analogy to nature, where parasitic species outnumber standalone ones six to one). But he argues that free market capitalism, at its core, is the system that channels competitive energy into creation rather than destruction. The critiques of capitalism — bank bailouts, cronyism, government favoritism — target corruption of the system, not the system itself.

    On AI and leverage, Naval makes what may be his most quotable claim: “AI is not going to replace software engineers — AI is going to let software engineers replace everybody else.” He sees AI as an incredible information retrieval and calculation tool that raises the floor for everyone, but provides no lasting competitive edge because everyone has access to the same answers. The real edge comes from judgment, creativity, and taste — the things AI cannot provide.

    He connects Deutsch’s concept of “good explanations” to product building. Good products, like good scientific theories, are hard to vary — you can’t change the details without breaking them. The iPhone’s original form factor is still essentially unchanged because they nailed it. He notes that all technology has winner-take-all dynamics, and the best products amortize their development costs over the largest user base, making it impossible for any amount of money to buy a better alternative.

    Part 2: Building Judgment (37:49 – 1:12:30)

    Naval describes judgment as the single most important capability in an age of infinite leverage. He traces its development from conscious logical reasoning through subconscious intuition to full-body taste — the stage where you simply know what’s right without being able to articulate why.

    He quotes John Cleese on creative problem-solving: “You simply have to let your mind rest against the problem in a friendly, persistent way.” This captures Naval’s view that breakthroughs require both intense focus and a relaxed, non-forcing attitude. He shares his own experience writing a compiler in college, where his most productive sessions were 24-36 hour marathons because it took hours just to reload the problem into his head after time away.

    The section includes an important distinction between AI’s capabilities and human judgment. AI can cross-correlate all human knowledge and deliver the conventional correct answer for solved problems. But it lacks values, binding principles, and the ability to handle novel situations with idiosyncratic context. Naval sees AI as “magic” that looks like intelligence because of its staggering information retrieval, but it operates as a one-size-fits-all system trained on textbooks and data labelers’ opinions.

    He emphasizes learning from specific to general (Seneca’s principle), warns against academic over-education without practice (Taleb’s “intellectual yet idiot”), and shares how he now reads less but more deliberately — using reading to spark his own thinking rather than absorbing others’ ideas for regurgitation. He singles out Schopenhauer as a writer where every sentence is crafted and you get something different from the same essay on every re-read.

    Part 3: Learning Happiness (1:12:30 – 2:15:17)

    This is the most philosophical section, where Naval significantly updates his earlier thinking. He admits he’s “not sure happiness exists” as a distinct state, framing it instead as a thought that claims to be a state. When the thought disappears, there’s no observer left to be happy or unhappy. This is deeply Buddhist — the no-self doctrine applied to emotional states.

    His practical advice centers on cultivating peace rather than chasing happiness. He wants few, consciously chosen desires. He wants to act for reasons larger than himself (which paradoxically makes failure hurt less). And he wants to create space for authentic joy rather than ritualistic obligation.

    Naval introduces his framework of “truth, love, and beauty” as what remains after health and wealth are handled. Truth is pursued because even uncomfortable truths make life better (he uses The Matrix’s Neo vs. Cipher as his central illustration). Love is best experienced as giving rather than receiving — falling in love with someone or something is the high, not being loved. Beauty is creation — the highest human art form and what separates his view from pure Buddhist quietism.

    He discusses William Glasser’s choice theory at length, presenting the controversial view that depression often originates as a series of childhood behavioral choices that became unconscious habits. While acknowledging chemical components, he argues the explanation must be offered at the same level as the question — and that changing your brain through honest self-examination is more sustainable than long-term pharmaceutical intervention.

    The section on meditation is refreshingly honest: the first 20 minutes your mind goes berserk, then it calms, and most of the benefit comes from simply acknowledging emotions rather than solving them. He describes a personal experience of extreme unhappiness where a part of him was simultaneously watching and recognizing “there’s nothing actually here — you’re creating a drama to feel important.”

    Part 4: Saving Yourself (2:15:17 – 2:50:17)

    Naval gets deeply personal about how he’s designed his life. He claims to have “an amazing life” where at any given time he’s doing exactly what he wants. Nothing is obligatory. Every relationship is voluntary. He maintains zero estranged family members while refusing to attend weddings, obligatory events, or ritualistic celebrations.

    His stance on relationships is uncompromising: every relationship is transactional (providing mutual value), and pretending otherwise creates false obligations that breed resentment. He refuses to train his children to say “thank you” on command — if they feel genuine gratitude, it will emerge naturally. He believes the only real relationships are peer relationships, even employer-employee ones.

    The exploration-vs-investment framework is one of the most actionable parts of the conversation. Modern society has made exploration easy (you can fly anywhere, enter any career, date infinitely), but all benefits come from compound interest — which requires commitment. The key transition is recognizing when to stop exploring and start investing. Naval argues that learning happens through honest iterations (do, reflect, change, repeat), not hours logged.

    He names his sources of meaning: a personal relationship with “whatever this is” (God, loosely), his children and family, and his current stealth company. He explicitly says he doesn’t feel qualified to write a book about enlightenment because he hasn’t fully explored it himself — and he’s partly just lazy.

    Part 5: Philosophy (2:50:17 – End)

    The final section weaves together Naval’s philosophical commitments: evolution, Buddhism, and David Deutsch’s epistemology. He frames truth as “a crystal in the multiverse” — in the many-worlds interpretation, truth replicates because it works, while falsehood is infinitely variable but gets eliminated through skin-in-the-game dynamics.

    His account of enlightened people is fascinating and specific. He’s met about a dozen, verified to his own satisfaction through sustained observation (watching them encounter genuinely bad events without perturbation). They include well-known names like Rupert Spira, Mooji, and Sadhguru, plus personal friends and lesser-known figures. The key trait: a persistent experience of no self. It’s binary — not a gradient. They’re often more capable, not less. More authentic desires, less mimetic behavior, less ego-driven.

    He maps Buddhism onto simulation theory in an extended riff: breaking out of the Matrix is the quest for enlightenment, the white room is pure consciousness, and the boredom of the white room explains why consciousness generates infinite forms (why God forgets himself and goes back into the game). But he ultimately considers simulation theory a “lousy theory” — unfalsifiable, reductive, and just the latest version of mapping our dominant technology onto religion.

    The conversation closes with Naval’s 2×2 matrix of truth and spreadability (conventional wisdom, fake news, heresies, nonsense) and the observation that the only things that make it through the information environment are fake news — because conventional wisdom doesn’t need spreading, heresies can’t spread, and nonsense goes nowhere. The real truths, the heresies, can only be discovered, whispered, and perhaps read.


    Thoughts

    Five years after The Almanack of Naval Ravikant, this megasode feels like Naval 3.0. The original Naval (pre-Almanack) was focused on practical wealth creation and startup wisdom. Almanack Naval synthesized that with Eastern philosophy and general life principles. This version integrates David Deutsch’s epistemology into everything — wealth becomes knowledge creation, good products become good explanations, and even enlightenment gets framed through the multiverse.

    What strikes me most is the honesty about contradictions. Naval simultaneously says he’s “not sure happiness exists” while describing his life as amazing. He advocates dropping all obligations while maintaining zero estranged family members. He promotes laziness while admitting he’s working harder than ever on his new company. These aren’t inconsistencies — they’re the natural texture of a philosophy that’s been lived rather than theorized.

    The AI section is worth paying attention to. In a world where every AI influencer is either panicking about job replacement or promising utopia, Naval’s take is refreshingly grounded: AI is leverage, like every technology before it. It raises the floor for everyone. It provides no lasting edge because everyone gets the same answer. The edge comes from judgment, taste, and creativity — which are developed through experience, not downloaded from a model.

    His list of “enlightened” people is going to generate the most discussion and controversy. Claiming to have personally verified a dozen enlightened beings is a bold statement from someone who also says he’s “not sure there’s such a thing as enlightenment.” But it’s consistent with his framework: enlightenment isn’t a special state. It’s the absence of a constructed self. It’s binary. And it doesn’t prevent you from running a company, dating, or living a fully functional life.

    The deepest insight might be the simplest: stay healthy, get wealthy, seek truth, give love, and create beauty. If you internalize nothing else from these four hours, that five-part formula is worth the price of admission — which, in keeping with Naval’s philosophy, is free.


    This article is a summary and analysis of Naval Ravikant’s 4-hour megasode on the Smart Friends podcast with Eric Jorgenson, released January 2026. The full episode is available for free on YouTube and all major podcast platforms.

  • Claude Code Remote Control: How to Code From Your Phone in 2026 (Complete Setup Guide)

    TL;DR: Anthropic just launched Claude Code Remote Control, a feature that lets you control your local Claude Code terminal sessions from your phone, tablet, or any browser. Run claude remote-control in your terminal, scan a QR code, and you’ve got full control of your coding session from anywhere in your house — or anywhere with an internet connection. Your code stays local. Nothing moves to the cloud. Available now for Max subscribers ($100–$200/month) with Pro plan access ($20/month) rolling out soon.

    What Is Claude Code Remote Control?

    Claude Code Remote Control is a new feature from Anthropic that creates a secure bridge between your local Claude Code terminal session and any remote device. Think of it as a live window into your running coding session that you can access from your phone, a tablet, or another computer’s browser.

    The critical distinction here is that this is not cloud computing. When you use Remote Control, Claude continues running on your local machine. Your filesystem, your MCP servers, your tools, your environment variables, your project configuration — all of it stays exactly where it is. The remote device is simply a viewport and input mechanism for that local session.

    This matters because many developers have complex local setups with custom tooling, private repos, and Model Context Protocol (MCP) integrations that don’t exist in the cloud. Remote Control preserves all of that context while letting you walk away from your desk.

    Key Takeaways

    Instant setup with minimal friction. Run claude remote-control or type /rc inside an existing session. A session URL and QR code appear. Scan the code with your phone and you’re connected.

    Everything stays local. Your code, files, MCP servers, and project configuration never leave your machine. The remote device is just a control interface — Anthropic’s servers route messages between your devices over TLS, but your actual development environment stays put.

    Conversations sync across all devices. You can send messages from your terminal, then from your phone, then from a browser on a different computer. The session doesn’t care where the input comes from. Everything stays in sync.

    Auto-reconnect after interruptions. If your laptop goes to sleep or your network drops, the session automatically reconnects when your machine comes back online. You don’t lose your place.

    One remote session at a time. Each Claude Code session supports a single remote connection. Your terminal must stay open — if you close it or kill the Claude process, the session ends.

    Available for Max subscribers now. Remote Control requires a Pro or Max plan. API keys are not supported. Max users have access today, with Pro access rolling out soon. Team and Enterprise plans are not yet supported.

    Not the same as Claude Code on the web. Claude Code on the web runs on Anthropic’s cloud infrastructure and doesn’t need a local machine at all. Remote Control runs on your machine and gives you remote access to that local session. Different tools for different situations.

    How Claude Code Remote Control Works Under the Hood

    When you start a Remote Control session, your local machine initiates an outbound HTTPS connection to Anthropic’s API. No inbound ports are opened on your computer. Your machine registers with the API and polls for work. When you connect from a remote device — phone, tablet, browser — Anthropic’s server routes messages between the web/mobile client and your local session over a streaming connection.

    All traffic passes through Anthropic’s API over TLS using multiple short-lived credentials that are scoped to a single purpose and expire independently. Your files and MCP servers never leave your machine. Only chat messages and tool results flow through the encrypted bridge.

    This architecture means your session URL is effectively a credential. Anyone with that URL can interact with your local Claude Code session, including approving file changes. Treat it like a password.

    How to Set Up Claude Code Remote Control: Step by Step

    Prerequisites

    Before you start, make sure your environment meets these requirements (a short pre-flight example follows):

    Subscription: You need a Pro or Max plan on claude.ai. API keys won’t work here.

    Authentication: Run claude in your terminal and use /login to sign in through claude.ai if you haven’t already.

    Workspace trust: Run claude in your project directory at least once and accept the workspace trust dialog.

    Claude Code version: Update to version 2.1.52 or later.
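
    Putting the last three prerequisites together, a minimal first-time pre-flight might look like the following. The project path is just a placeholder, and /login is typed inside Claude Code once it starts (skip it if you are already signed in):

    cd ~/projects/my-app
    claude
    /login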

    Option 1: Start a New Remote Control Session

    Navigate to your project directory and run:

    claude remote-control

    The terminal displays a session URL and stays running, waiting for remote connections. Press spacebar to toggle a QR code display for quick phone access.

    This command supports several flags:

    --verbose shows detailed connection and session logs.

    --sandbox / --no-sandbox enables or disables filesystem and network isolation during the session. Sandboxing is off by default.
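
    As a quick illustration, the flags can be combined in a single invocation. This pairing is only an example, not a recommended default; sandboxing stays off unless you explicitly pass --sandbox:

    claude remote-control --verbose --sandbox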

    Option 2: Go Remote From an Existing Session

    If you’re already deep in a Claude Code conversation and want to continue it from another device, type:

    /remote-control

    Or use the shorthand:

    /rc

    This carries over your entire conversation history — every message, file edit, and tool call — and generates the session URL and QR code. The --verbose, --sandbox, and --no-sandbox flags are not available with this in-session command.

    Connecting From Your Phone or Another Device

    You have three ways to connect:

    Scan the QR code. This is the fastest path. Point your phone camera at the code and it opens directly in the Claude app if you have it installed.

    Open the session URL. The URL is displayed in your terminal alongside the QR code. Copy it and open it in any browser.

    Find it in the session list. Open claude.ai/code or the Claude mobile app and look for your session by name. Remote Control sessions display a computer icon with a green status dot when they’re online.

    Pro tip: Use /rename to give your session a descriptive name before going remote. Something like “Auth refactor – Feb 2026” is much easier to find than the default “Remote Control session.”
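
    Putting that tip together with the in-session shorthand, a typical “go remote” sequence looks roughly like this (the session name is illustrative, and whether /rename accepts the name inline or prompts you for it may vary with your Claude Code version):

    /rename Auth refactor – Feb 2026
    /rc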

    If you don’t have the Claude mobile app yet, type /mobile inside Claude Code to display a download QR code for iOS or Android.

    Enable Remote Control for All Sessions Automatically

    By default, Remote Control only activates when you explicitly run the command. To make every session remotely accessible automatically, run /config inside Claude Code and set “Enable Remote Control for all sessions” to true.

    Remote Control vs. Claude Code on the Web

    Both Remote Control and Claude Code on the web use the claude.ai/code interface, but they serve fundamentally different purposes.

    Remote Control executes on your machine. Your local MCP servers, tools, custom configurations, and entire filesystem remain available. Use this when you’re in the middle of local work and want to keep going from another device.

    Claude Code on the web executes on Anthropic-managed cloud infrastructure. Use this when you want to kick off a task without any local setup, work on a repo you don’t have cloned, or run multiple tasks in parallel.

    The choice is straightforward: if you have a complex local environment with MCP servers and custom tooling, use Remote Control. If you want zero-setup cloud execution, use Claude Code on the web.

    Real-World Use Cases for Remote Control

    Long-running refactors. Kick off a large refactoring task — say, migrating 40+ files from CSS modules to Tailwind. Instead of sitting there watching for 20 minutes, scan the QR code and monitor progress from your phone while you take a break.

    Build monitoring. Start a complex build that’s been failing intermittently. Claude investigates, reads logs, and tries fixes. You head to a meeting and check results from your phone during a quiet moment.

    Multi-session management. Run three separate Claude Code sessions — one fixing a production bug, one writing tests, one doing a dependency upgrade. Each gets its own Remote Control session. Switch between them on your phone like switching chat threads.

    End-of-day code review. You’ve been at your desk for hours and don’t want to stare at a monitor anymore, but you have generated code to review. Connect from your phone on the couch, scroll through file changes, and leave follow-up instructions.

    Limitations and Things to Watch Out For

    Your terminal must stay open. Remote Control runs as a local process. If you close the terminal or stop the Claude process, the session ends. Period.

    Your machine must be on and connected. If your home WiFi goes down, the session pauses. Claude isn’t doing work in the background while your machine is offline. It auto-reconnects when connectivity returns, but no work happens in the meantime.

    One remote connection per session. You can’t have multiple remote devices controlling the same session simultaneously in separate connections — though you can switch between devices since the conversation syncs.

    Mobile screen limitations. Reviewing diffs and detailed code on a phone screen has obvious constraints. Remote Control is great for monitoring, approvals, and simple instructions. Detailed code review is still better at your desk.

    Session URL security. Anyone with your session URL has full control. This includes the ability to approve file changes on your local machine. Don’t share it carelessly.

    Early bugs exist. As with any new feature, there are some edge cases. Some users have reported that remote sessions don’t always appear in the session list, making it hard to reconnect after navigating away from the app. Anthropic is actively working on fixes.

    The Bigger Picture: Why This Matters

    Remote Control might seem like a convenience feature, but it signals something important about where AI-assisted development is heading. Claude Code has hit a $2.5 billion annualized run rate as of February 2026 — more than doubling since the start of the year. It now powers an estimated 4% of all public GitHub commits worldwide and has reached 29 million daily installs in Visual Studio Code.

    The move to mobile access reflects a shift in how developers interact with AI coding agents. These aren’t autocomplete tools that need you staring at a screen. They’re autonomous agents that work through multi-file tasks over minutes or hours. The natural next step is letting developers supervise and direct that work from wherever they happen to be.

    Before Remote Control launched officially, developers were already hacking together mobile access using Tailscale for tunneling, Termius or Termux for mobile SSH, and tmux for session persistence. Some built custom WebSocket bridges. Anthropic essentially productized what power users were already doing — but with native integration, auto-reconnect, and proper security.

    Competitors are approaching this differently. GitHub Copilot’s coding agent can be assigned from GitHub Mobile, but it runs entirely in GitHub’s cloud via Actions. Cursor shipped a third-party iOS companion app in January 2026 that relays prompts to a Mac running the Cursor IDE. Claude Code’s differentiator is clear: local execution with the full environment preserved, accessible from any device.

    My Thoughts

    This feature solves a real problem that most Claude Code users have hit at least once: you’re mid-task, something interrupts you, and you either abandon context or chain yourself to your desk. Remote Control eliminates that tradeoff entirely.

    The fact that everything stays local is the right call architecturally. Developers with complex local setups — custom MCP servers, specific environment configurations, private tooling — would never trust a cloud handoff to preserve all of that context correctly. By keeping execution local and just proxying the interface, Anthropic avoids that trust problem entirely.

    The security model is sensible too. Outbound-only connections, TLS encryption, short-lived credentials — there’s nothing unusual here, and that’s the point. The session URL being the single credential to protect keeps the mental model simple.

    Where this gets really interesting is when combined with the auto-enable option. If every Claude Code session is automatically remotely accessible, developers can adopt a new workflow pattern: start tasks, walk away, check in periodically from wherever they are, and come back to completed work. That’s a meaningful change in how coding sessions are structured throughout a day.

    For solo developers and indie hackers juggling multiple projects, the ability to monitor and manage several concurrent Claude Code sessions from a phone is genuinely powerful. It turns dead time — waiting rooms, commutes, coffee breaks — into lightweight supervision time.

    The main concern is the early-stage bugs. Session reconnection issues and visibility in session lists need to be rock-solid for this to be a reliable part of anyone’s workflow. But those are solvable problems, and Anthropic’s pace of iteration on Claude Code has been consistently fast.

    Bottom line: if you’re a Claude Code user on a Max plan, there’s no reason not to try this today. It takes 30 seconds to set up and it fundamentally changes how you can structure your development sessions.

  • Boris Cherny Says Coding Is “Solved” — Head of Claude Code Reveals What Comes Next for Software Engineers

    Boris Cherny, creator and head of Claude Code at Anthropic, sat down with Lenny Rachitsky on Lenny’s Podcast to drop one of the most consequential interviews in recent tech history. With Claude Code now responsible for 4% of all public GitHub commits — and growing faster every day — Cherny laid out a vision where traditional coding is a solved problem and the real frontier has shifted to idea generation, agentic AI, and a new role he calls the “Builder.”


    TLDW (Too Long; Didn’t Watch)

    Boris Cherny, the head of Claude Code at Anthropic, hasn’t manually written a single line of code since November 2025 — and he ships 10 to 30 pull requests every day. Claude Code now accounts for 4% of all public GitHub commits and is projected to reach 20% by end of 2026. Cherny believes coding as we know it is “solved” and that the future belongs to generalist “Builders” who blend product thinking, design sense, and AI orchestration. He advocates for underfunding teams, giving engineers unlimited tokens, building products for the model six months from now (not today), and following the “bitter lesson” of betting on the most general model. The Cowork product — Anthropic’s agentic tool for non-technical tasks — was built in just 10 days using Claude Code itself. Cherny also revealed three layers of AI safety at Anthropic: mechanistic interpretability, evals, and real-world monitoring.


    Key Takeaways

    1. Claude Code’s Growth Is Staggering

    Claude Code now authors approximately 4% of all public GitHub commits, and Anthropic believes the real number is significantly higher when private repositories are included. Daily active users doubled in the month before this interview, and the growth curve isn’t just rising — it’s accelerating. SemiAnalysis predicted Claude Code will reach 20% of all GitHub commits by the end of 2026. Claude Code alone is generating roughly $2 billion in revenue, with Anthropic overall at approximately $15 billion.

    2. 100% AI-Written Code Is the New Normal

    Cherny hasn’t manually edited a single line of code since November 2025. He ships 10 to 30 pull requests per day, making him one of the most prolific engineers at Anthropic — all through Claude Code. He still reviews code and maintains human checkpoints, but the actual writing of code is entirely handled by AI. Claude also reviews 100% of pull requests at Anthropic before human review.

    3. Coding Is “Solved” — The Frontier Has Shifted

    In Cherny’s view, coding — at least the kind of programming most engineers do — is a solved problem. The new frontier is idea generation. Claude is already analyzing bug reports and telemetry data to propose its own fixes and suggest what to build next. The shift is from “tool” to “co-worker.” Cherny expects this to become increasingly true across every codebase and tech stack over the coming months.

    4. The Rise of the “Builder” Role

    Traditional role boundaries between engineer, product manager, and designer are dissolving. On the Claude Code team, everyone codes — the PM, the engineering manager, the designer, the finance person, the data scientist. Cherny predicts the title “Software Engineer” will start disappearing by end of 2026, replaced by something like “Builder” — a generalist who blends design sense, business logic, technical orchestration, and user empathy.

    5. Underfunding Teams Is a Feature, Not a Bug

    Cherny advocates deliberately underfunding teams as a strategy. When you assign one engineer to a project instead of five, they’re forced to leverage Claude Code to automate everything possible. This isn’t about cost-cutting — it’s about forcing innovation through constraint. The results at Anthropic have been dramatic: while the engineering team grew roughly 4x, productivity per engineer increased 200% in terms of pull requests shipped.

    6. Give Engineers Unlimited Tokens

    Rather than hiring more headcount, Cherny’s advice to CTOs is to give engineers as many tokens as possible. Let them experiment with the most capable models without worrying about cost. The most innovative ideas come from people pushing AI to its limits. Some Anthropic engineers are spending hundreds of thousands of dollars per month in tokens. Optimize costs later — only after you’ve found the idea that works.

    7. Build for the Model Six Months From Now

    One of Cherny’s most actionable insights: don’t build for today’s model capabilities — build for where the model will be in six months. Early versions of Claude Code only wrote about 20% of Cherny’s code. But the team bet on exponential improvement, and when Opus 4 and Sonnet 4 arrived, product-market fit clicked instantly. This means your product might feel rough at first, but when the next model generation drops, you’ll be perfectly positioned.

    8. The Bitter Lesson Applied to Product

    Cherny references Rich Sutton’s famous “Bitter Lesson” blog post as a core principle for the Claude Code team: the more general model will always outperform the more specific one. In practice, this means avoiding rigid workflows and orchestration scaffolding around AI models. Don’t box the model in. Give it tools, give it a goal, and let it figure out the path. Scaffolding might improve performance 10-20%, but those gains get wiped out with the next model generation.

    9. Latent Demand — The Most Important Product Principle

    Cherny calls latent demand “the single most important principle in product.” The idea: watch how people misuse or hack your product for purposes you didn’t design it for. That’s where your next product lives. Facebook Marketplace came from 40% of Facebook Group posts being buy-and-sell. Cowork came from non-engineers using Claude Code’s terminal for things like growing tomato plants, analyzing genomes, and recovering wedding photos from corrupted hard drives. There’s also a new dimension: watching what the model is trying to do and building tools to make that easier.

    10. Cowork Was Built in 10 Days

    Anthropic’s Cowork product — their agentic tool for non-technical tasks — was implemented by a small team in just 10 days, using Claude Code to build its own virtual machine and security scaffolding. Cowork was immediately a bigger hit than Claude Code was at launch. It can pay parking tickets, cancel subscriptions, manage project spreadsheets, message team members on Slack, respond to emails, and handle forms — and it’s growing faster than Claude Code did in its early days.

    11. Three Layers of AI Safety at Anthropic

    Cherny outlined three layers of safety: (1) Mechanistic interpretability — monitoring neurons inside the model to understand what it’s doing and detect things like deception at the neural level. (2) Evals — lab testing where the model is placed in synthetic situations to check alignment. (3) Real-world monitoring — releasing products as research previews to study unpredictable agent behavior in the wild. Claude Code was used internally for 4-5 months before public release specifically for safety study.

    12. Why Boris Left Anthropic for Cursor (and Came Back After Two Weeks)

    Cherny briefly left Anthropic to join Cursor, drawn by their focus on product quality. But within two weeks, he realized what he was missing: Anthropic’s safety mission. He described it as a psychological need — without mission-driven work, even building a great product wasn’t a substitute. He returned to Anthropic and the rest is history.

    13. Manual Coding Skills Will Become Irrelevant in 1-2 Years

    Cherny compared manual coding to assembly language — it’ll still exist beneath the surface, and understanding the fundamentals helps for now, but within a year or two it won’t matter for most engineers. He likened it to the printing press transition: a skill once limited to scribes became universal literacy over time. The volume of code created will explode while the cost drops dramatically.

    14. Pro Tips for Using Claude Code Effectively

    Cherny shared three specific tips: (1) Use the most capable model — currently Opus 4.6 with maximum effort enabled. Cheaper models often cost more tokens in the end because they require more correction and handholding. (2) Use Plan Mode — hit Shift+Tab twice in the terminal to enter plan mode, which tells the model not to write code yet. Go back and forth on the plan, then auto-accept edits once it looks good. Opus 4.6 will one-shot it correctly almost every time. (3) Explore different interfaces — Claude Code runs on terminal, desktop app, iOS, Android, web, Slack, GitHub, and IDE extensions. The same agent runs everywhere. Find what works for you.


    Detailed Summary

    The Origin Story of Claude Code

    Claude Code began as a one-person hack. When Cherny joined Anthropic, he spent a month building weird prototypes that mostly never shipped, then spent another month doing post-training to understand the research side. He believes deeply that to build great products on AI, you have to understand “the layer under the layer” — meaning the model itself.

    The first version was terminal-based and called “Claude CLI.” When he demoed it internally, it got two likes. Nobody thought a coding tool could be terminal-based. But the terminal form factor was chosen partly out of necessity (he was a solo developer) and partly because it was the only interface that could keep up with how fast the underlying model was improving.

    The breakthrough moment during prototyping: Cherny gave the model a bash tool and asked it what music he was listening to. The model figured out — without any specific instructions — how to use the bash tool to answer that question. That moment of emergent tool use convinced him he was onto something.

    The Growth Trajectory

    Claude Code was released externally in February 2025 and was not immediately a hit. It took months for people to understand what it was. The terminal interface was alien to many. But internally at Anthropic, daily active users went vertical almost immediately.

    There were multiple inflection points. The first major one was the release of Opus 4, which was Anthropic’s first ASL-3 class model. That’s when Claude Code’s growth went truly exponential. Another inflection came in November 2025 when Cherny personally crossed the 100% AI-written code threshold. The growth has continued to accelerate — it’s not just going up, it’s going up faster and faster.

    The Spotify headline from the week of recording — “Spotify says its best developers haven’t written a line of code since December, thanks to AI” — underscored how mainstream the shift has become.

    Thinking in Exponentials

    Cherny emphasized that thinking in exponentials is deep in Anthropic’s DNA — three of their co-founders were the first three authors on the scaling laws paper. At Code with Claude (Anthropic’s developer conference) in May 2025, Cherny predicted that by year’s end, engineers might not need an IDE to code anymore. The room audibly gasped. But all he did was “trace the line” of the exponential curve of AI-written code.

    The Printing Press Analogy

    Cherny’s preferred historical analog for what’s happening is the printing press. In mid-1400s Europe, literacy was below 1%. A tiny class of scribes did all the reading and writing, employed by lords and kings who often couldn’t read themselves. After Gutenberg, more printed material was created in 50 years than in the previous thousand. Costs dropped 100x. Literacy rose to 70% globally over two centuries.

    Cherny sees coding undergoing the same transition: a skill locked away in a tiny class of “scribes” (software engineers) is becoming accessible to everyone. What that unlocks is as unpredictable as the Renaissance was to someone in the 1400s. He also shared a remarkable historical detail — an interview with a scribe from the 1400s who was actually excited about the printing press because it freed them from copying books to focus on the artistic parts: illustration and bookbinding. Cherny felt a direct parallel to his own experience of being freed from coding tedium to focus on the creative and strategic parts of building.

    What AI Transforms Next

    Cherny believes roles adjacent to engineering — product management, design, data science — will be transformed next. The key technology enabling this is true agentic AI: not chatbots, but AI that can actually use tools and act in the world. Cowork is the first step in bringing this to non-technical users.

    He was candid that this transition will be “very disruptive and painful for a lot of people” and that it’s a conversation society needs to have. Anthropic has hired economists, policy experts, and social impact specialists to help think through these implications.

    The Latent Demand Framework in Depth

    Cherny credited Fiona Fung, the founding manager of Facebook Marketplace, for popularizing the concept of latent demand. The examples are compelling: someone using Claude Code to grow tomato plants, another analyzing their genome, another recovering wedding photos from a corrupted hard drive, a data scientist who figured out how to install Node.js and use a terminal to run SQL analysis through Claude Code.

    But Cherny added a new dimension specific to AI products: latent demand from the model itself. Rather than boxing the model into a predetermined workflow, observe what the model is trying to do and build to support that. At Anthropic they call this being “on distribution.” Give the model tools and goals, then let it figure out the path. The product is the model — everything else is minimal scaffolding.

    Safety as a Core Differentiator

    The interview made clear that safety isn’t just a talking point at Anthropic — it’s why everyone is there, including Cherny. He described the work of Chris Olah on mechanistic interpretability: studying model neurons at a granular level to understand how concepts are encoded, how planning works, and how to detect things like deception. A single neuron might correspond to a dozen concepts through a phenomenon called superposition.

    Anthropic’s “race to the top” philosophy means open-sourcing safety tools even when they work for competing products. They released an open-source sandbox for running AI agents securely that works with any agent, not just Claude Code.

    The Memory Leak Story

    One of the most memorable anecdotes: Cherny was debugging a memory leak the traditional way — taking heap snapshots, using debuggers, analyzing traces. A newer engineer on the team simply told Claude Code: “Hey Claude, it seems like there’s a leak. Can you figure it out?” Claude Code took the heap snapshot, wrote itself a custom analysis tool on the fly, found the issue, and submitted a pull request — all faster than Cherny could do it manually. Even veterans of AI-assisted coding get stuck in old habits.

    Personal Background and Post-AGI Plans

    In a touching segment, Cherny and Rachitsky discovered they’re both from Odessa, Ukraine. Cherny’s grandfather was one of the first programmers in the Soviet Union, working with punch cards. Before joining Anthropic, Cherny lived in rural Japan where he learned to make miso — a process that takes months to years and taught him to think on long timescales. His post-AGI plan? Go back to making miso.

    His book recommendations: Functional Programming in Scala (the best technical book he’s ever read), Accelerando by Charles Stross (captures the essence of this moment better than anything), and The Wandering Earth by Liu Cixin (Chinese sci-fi short stories from the Three Body Problem author).


    Thoughts and Analysis

    This interview is one of the most important conversations about the future of software engineering to come out in 2026. Here are some things worth sitting with:

    The “solved” framing is provocative but precise. Cherny isn’t saying software engineering is solved — he’s saying the act of translating intent into working code is solved. The thinking, architecting, deciding-what-to-build, and ensuring-it’s-correct parts are very much unsolved. This distinction matters enormously and most of the pushback in the YouTube comments misses it.

    The underfunding principle is genuinely counterintuitive. Most organizations respond to AI tools by trying to maintain headcount and “augment” existing workflows. Cherny’s approach is the opposite: reduce headcount on a project, give people unlimited AI tokens, and watch them figure out how to ship ten times faster. This is a fundamentally different organizational philosophy and one that most companies will resist until their competitors prove it works.

    The “build for six months from now” advice is dangerous and brilliant. Dangerous because your product will underperform for months and investors will get nervous. Brilliant because when the next model drops, you’ll have the only product that takes full advantage of it. This is how Claude Code went from writing 20% of Cherny’s code to 100% — the product was ready when the model caught up.

    The latent demand framework deserves serious study. The traditional version (watching users hack your product) is well-known from the Facebook era. The AI-native version (watching what the model is trying to do) is genuinely new. “The product is the model” is a deceptively simple statement that most AI product builders are still getting wrong by over-engineering workflows and scaffolding.

    The Cowork trajectory matters more than Claude Code. Claude Code transforms engineers. Cowork transforms everyone else. If Cowork delivers on even half of what Cherny describes — paying tickets, managing project spreadsheets, responding to emails, canceling subscriptions — then the total addressable market dwarfs coding tools. The fact that it was built in 10 days and was an immediate hit suggests Anthropic has found product-market fit for agentic AI beyond engineering.

    The safety discussion felt genuine. Cherny’s explanation of mechanistic interpretability — actually being able to monitor model neurons and detect deception — is one of the clearest public explanations of Anthropic’s safety approach. The fact that the safety mission is what brought him back from Cursor (where he lasted only two weeks) speaks to the culture. Whether you think safety is a genuine concern or a competitive moat, it’s clearly a core part of how Anthropic attracts and retains talent.

    The elephant in the room: this is the head of Claude Code at Anthropic telling you to use more tokens. Multiple YouTube commenters pointed this out, and they’re right to flag it. But the underlying logic holds: if a less capable model requires more correction rounds and more tokens to achieve the same result, then the “cheaper” model isn’t actually cheaper. That’s a testable claim, and most engineers using these tools regularly will tell you it checks out.

    Whether you agree with the “coding is solved” framing or not, the data is hard to argue with. Four percent of all GitHub commits. Two hundred percent productivity gains per engineer. A product that was built in 10 days and scaled to millions of users. These aren’t predictions — they’re measurements. And the curve is still accelerating.


    This article is based on Boris Cherny’s appearance on Lenny’s Podcast, published February 19, 2026. Boris Cherny can be found on X/Twitter and at borischerny.com.