PodMine
The MAD Podcast with Matt Turck • October 23, 2025

Are We Misreading the AI Exponential? Julian Schrittwieser on Move 37 & Scaling RL (Anthropic)

Julian Schrittwieser of Anthropic discusses the exponential trajectory of AI capabilities, predicting that models will complete full-day autonomous tasks by 2026 and reach expert-level performance across many professions by 2027. He also explores how pre-training combined with reinforcement learning could enable AI agents to make novel scientific discoveries and potentially earn Nobel Prizes.
Topics: AI & Machine Learning, Indie Hackers & SaaS Builders, Developer Culture
Mentioned: Demis Hassabis, Julian Schrittwieser, Matt Turck, OpenAI, Anthropic

Summary Sections

  • Podcast Summary
  • Speakers
  • Key Takeaways
  • Statistics & Facts
  • Compelling Stories (Premium)
  • Thought-Provoking Quotes (Premium)
  • Strategies & Frameworks (Premium)
  • Similar Strategies (Plus)
  • Additional Context (Premium)
  • Key Takeaways Table (Plus)
  • Critical Analysis (Plus)
  • Books & Articles Mentioned (Plus)
  • Products, Tools & Software Mentioned (Plus)

Timestamps are as accurate as they can be but may be slightly off. We encourage you to listen to the full context.

Podcast Summary

In this enlightening episode, Julian Schrittwieser, a top AI researcher at Anthropic and former contributor to DeepMind's landmark AlphaGo Zero and MuZero projects, unpacks his viral blog post "Failing to Understand the Exponential, Again" with host Matt Turck. Julian explains why talk of an AI bubble seems disconnected from what frontier labs are seeing, where consistent exponential progress continues unabated. (01:00) The conversation explores how task length is doubling every 3-4 months, suggesting AI agents capable of working autonomously for full days by 2026 and achieving expert-level breadth across multiple professions by 2027.

  • Main themes: The exponential trajectory of AI capabilities, the intersection of pre-training and reinforcement learning, and the practical implications of increasingly autonomous AI agents for productivity and society.

Speakers

Julian Schrittwieser

Julian is a leading AI researcher at Anthropic, previously at Google DeepMind where he was second author on AlphaGo Zero and lead author on MuZero. He contributed to some of the most groundbreaking AI projects in history, including AlphaGo, AlphaZero, AlphaCode, and AlphaTensor. Julian's work has fundamentally shaped our understanding of reinforcement learning and AI agents.

Matt Turck

Matt is Managing Director at FirstMark, a leading venture capital firm focused on enterprise technology and AI investments. He hosts the MAD Podcast and writes extensively about the AI landscape and emerging technologies on his blog.

Key Takeaways

Task Length as the Critical Metric for AI Progress

Julian emphasizes that the ability of AI models to work independently for extended periods is the key unlock for delegation and economic impact. (04:56) Current models can handle tasks lasting a few hours, but the exponential trend suggests full-day autonomous work by 2026. This metric matters because it determines what you can actually delegate to AI: agents that need frequent human intervention have limited practical utility, while agents that can work unsupervised for hours can multiply the productivity of entire teams.
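To get a feel for the doubling claim, a few lines of Python show how quickly a multi-hour task horizon compounds. The 2-hour starting point and the 3.5-month doubling period are illustrative assumptions (the episode cites a 3-4 month range), not figures quoted by Julian:

```python
# Illustrative sketch of the task-horizon doubling trend discussed in the
# episode. Starting horizon (2 hours) and doubling period (3.5 months, the
# midpoint of the 3-4 month range) are assumptions for illustration only.

def horizon_after(months: float, start_hours: float = 2.0,
                  doubling_months: float = 3.5) -> float:
    """Task horizon in hours after `months` of exponential growth."""
    return start_hours * 2 ** (months / doubling_months)

# Two doublings (~7 months) turn a 2-hour horizon into a full 8-hour workday.
for m in (0, 3.5, 7, 10.5, 14):
    print(f"month {m:>4}: ~{horizon_after(m):.0f}h of autonomous work")
```

Under these assumptions, two doublings (about seven months) are enough to go from a couple of hours to a full workday, which is why a trend of this kind supports a 2026 timeline even from a modest starting point.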

Pre-training Plus RL is the Winning Recipe

The combination of pre-training on vast human knowledge with reinforcement learning creates the most capable AI systems. (38:37) Pre-training provides an implicit world model similar to evolutionary encoding in animals, while RL teaches agents to correct their own errors and learn from their actual behavior distribution. This approach is more practical than training from scratch because pre-training brings immense value and creates agents with human-aligned values from the start.

Quality Over Quantity in RL Training Data

High-quality training data is crucial for stable reinforcement learning, as demonstrated by AlphaZero's success. (47:07) AlphaZero spent significant computation on planning and search to generate exceptional training data, resulting in incredibly stable RL training that could run across continents. Modern language model RL is less stable because the difference between model capability and training data quality is smaller, suggesting that improving reasoning capabilities to generate higher-quality data is a key scaling direction.

Internal Benchmarks Trump Public Leaderboards

Goodhart's Law applies heavily to AI benchmarks - any measure that becomes a target stops being a good measure. (54:29) Public benchmarks get gamed as teams optimize specifically for them, leading to misleading performance indicators. The solution is creating private, held-out evaluations that truly represent your use case. Companies should develop internal benchmarks based on their actual tasks rather than relying on public leaderboards for model selection.

AI Will Enhance Rather Than Replace Human Capabilities

Julian argues that AI will create complementary relationships rather than one-for-one job replacement, following the economic principle of comparative advantage. (63:57) AI excels at certain tasks while humans remain superior at others, leading to gradual productivity improvements rather than sudden displacement. This pattern mirrors chess and Go, where AI tools enhanced rather than eliminated human players, making the games more accessible and popular.

Statistics & Facts

  1. Task length capabilities are doubling every 3-4 months according to consistent benchmark improvements across multiple evaluations. (02:51) This exponential trend suggests AI agents will be capable of working autonomously for full days by 2026.
  2. OpenAI's GDP-Val evaluation collected real-world tasks from domain experts and compared model performance against human experts to provide accurate economic impact predictions. (07:20) This benchmark is designed to be representative of actual economic productivity rather than academic performance.
  3. Drug discovery costs have increased exponentially from single scientists discovering antibiotics by accident 100 years ago to billions of dollars required for new drug development today. (18:36) This illustrates how scientific fields typically require exponentially more research effort to maintain linear progress.
