The MAD Podcast with Matt Turck • October 2, 2025

Inside Anthropic’s Sonnet 4.5 — Sholto Douglas & the Race to AGI

Sholto Douglas from Anthropic discusses the rapid progress in AI, focusing on the release of Claude Sonnet 4.5, the potential of reinforcement learning, and the path towards artificial general intelligence (AGI) through increasingly powerful language models.

Timestamps are as accurate as possible but may be slightly off. We encourage you to listen to the full episode for context.

Podcast Summary

In this special episode of the MAD Podcast, host Matt Turck interviews Sholto Douglas, a leading AI researcher at Anthropic, following the release of Claude Sonnet 4.5. The conversation traces Sholto's journey from competitive fencing in Australia to becoming a key figure in AI research, including his move from Google's Gemini program to Anthropic. The discussion delves into the accelerating pace of AI progress, with models now capable of autonomous coding for up to 30 hours straight (23:01).

  • Key themes include the rapid advancement of AI coding capabilities, the shift from pretraining to reinforcement learning paradigms, and the potential path to AGI through current approaches

Speakers

Sholto Douglas

Sholto Douglas is a leading AI researcher at Anthropic who previously worked at Google on the Gemini program. He joined Google just a month before ChatGPT's release and played a crucial role in developing Gemini's inference stack, which saved hundreds of millions of dollars. Before his AI career, Sholto was a world-class fencer, ranked 43rd globally, and studied computer science and robotics in Australia.

Matt Turck

Matt Turck is a partner at FirstMark and hosts the MAD Podcast. He conducts in-depth interviews with leading figures in technology and AI research, focusing on making complex technical concepts accessible to a broader audience.

Key Takeaways

Bet on the Exponential Progress of AI

Douglas emphasizes that despite monthly claims of hitting plateaus, AI progress continues exponentially across every measurable domain. He observes that current AI training pipelines are primitive and held together by "duct tape and best efforts," indicating massive room for improvement (49:23). Entrepreneurs and professionals, he suggests, should position themselves for the capabilities that will exist six months from now, not just today's limitations.

Long-term Coherency Enables Transformative Applications

The breakthrough allowing AI agents to work autonomously for 30 hours represents a fundamental shift from requiring supervision every 30 seconds to every 10-20 minutes. Douglas explains this enables building "working software rather than demos": complete applications like functional Slack-like systems rather than simple prototypes (42:01). This extended operational capability opens entirely new categories of AI applications and business models.

Reinforcement Learning is the Key Missing Piece

Douglas explains that pretraining is like "skim reading every textbook" while RL is like "doing worked problems and getting feedback." Certain capabilities like learning to say "I don't know" can only emerge through RL, not pretraining (48:14). The combination of sufficient base model quality, adequate compute for RL, and simple approaches finally made RL work effectively with large language models in 2024.

Coding Represents the Optimal Training Ground for AGI

Anthropic's focus on coding stems from it being uniquely tractable - you can verify when code works, run tests in parallel, and iterate rapidly without real-world consequences. Unlike self-driving cars that must work perfectly the first time, coding agents can fail 100 times as long as they succeed once (30:15). This makes coding the fastest path to both economic impact and advancing toward more general AI capabilities.

Individual Leverage Will Dramatically Increase

Douglas predicts that individuals will soon manage teams of AI agents working 24/7, dramatically amplifying personal productivity and impact. He currently uses two coding agents to double his work output and expects this to scale significantly (66:04). This increased leverage should be channeled toward solving humanity's major challenges in health, housing, poverty, and other critical areas where the world remains "imperfect in so many ways."

Statistics & Facts

  1. Sonnet 4.5 achieved approximately 78% on the SWE-bench coding benchmark, up from roughly 72% with the previous version, representing substantial progress in AI coding capabilities (33:18).
  2. As recently as one year ago, the entire AI field was performing under 20% on SWE-bench, demonstrating the rapid acceleration in coding model performance over just 12 months.
  3. Douglas mentions that even top AI researcher Noam Shazeer estimates only about 10% of his research ideas actually work, establishing realistic expectations for innovation success rates in frontier AI research (24:53).

