PodMine
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis•September 20, 2025

Full-Stack AI Safety: Why Defense-in-Depth Might Work, with Far.AI CEO Adam Gleave

A wide-ranging discussion with Far.AI CEO Adam Gleave exploring AI safety, potential post-AGI futures, alignment strategies, and the organization's approach to developing technical and policy solutions across the entire AI safety ecosystem.
AI & Machine Learning
Tech Policy & Ethics
Developer Culture
Adam Gleave
Nathan Labenz
Zvi Mowshowitz
OpenAI
Anthropic

Summary Sections

  • Podcast Summary
  • Speakers
  • Key Takeaways
  • Statistics & Facts
  • Compelling Stories (Premium)
  • Thought-Provoking Quotes (Premium)
  • Strategies & Frameworks (Premium)
  • Similar Strategies (Plus)
  • Additional Context (Premium)
  • Key Takeaways Table (Plus)
  • Critical Analysis (Plus)
  • Books & Articles Mentioned (Plus)
  • Products, Tools & Software Mentioned (Plus)

Timestamps are as accurate as possible but may be slightly off. We encourage you to listen to the full episode for context.

Podcast Summary

In this conversation, Adam Gleave, CEO of Far.AI, shares his cautiously optimistic vision for the post-AGI world and outlines a comprehensive defense-in-depth approach to AI safety. He envisions a future where humans maintain high living standards but hold limited power, similar to "third sons of European nobility": well cared for but not in control of major world events. (07:36) The episode explores three key capability thresholds: powerful tool AIs (already here in some domains), autonomous agents capable of complex tasks (a median estimate of 5-7 years away), and full AI organizations that can outcompete human-led companies (a median estimate of around 14 years). (27:00)

  • Core theme: Building a practical path from today's AI systems to safe transformative AI through layered safety approaches, spanning foundational research to policy implementation

Speakers

Adam Gleave

Adam Gleave is the co-founder and CEO of Far.AI, an organization that spans the entire AI safety value chain, from foundational research to policy advocacy. He worked in quantitative finance before transitioning to AI safety research. He leads Far.AI's distinctive approach of building capabilities across research, engineering, field building, and policy work so that AI safety innovations actually get implemented in practice.

Key Takeaways

Defense-in-Depth Can Work With Proper Implementation

While current AI safety systems are vulnerable due to rushed implementation, Adam argues that well-designed defense-in-depth approaches have strong potential for success. (48:00) The key is making defensive components genuinely independent rather than using correlated models, similar to how multiple weak PIN digits combine to create strong security. Current systems fail because they provide attackers with information about which defenses triggered, but proper implementation can eliminate these signals and stack weak but independent layers into robust protection.
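
To see why independence is the crux, here is a minimal sketch comparing bypass probabilities. The layer count and per-layer catch rates are illustrative assumptions, not figures from the episode.

```python
# Illustrative sketch: why independence matters for defense-in-depth.
# The catch rates and layer count are invented for illustration.

def bypass_prob_independent(catch_rates):
    """Probability an attack slips past every layer, assuming each
    layer misses it independently of the others."""
    p = 1.0
    for c in catch_rates:
        p *= (1.0 - c)
    return p

# Four weak layers, each catching only 60% of attacks:
layers = [0.6, 0.6, 0.6, 0.6]

# Independent layers compound: 0.4^4, so only ~2.6% of attacks
# get through the full stack.
print(f"independent:      {bypass_prob_independent(layers):.4f}")

# Perfectly correlated layers (e.g., four fine-tunes of one base
# model sharing the same blind spots): an attack that fools one
# fools them all, so the stack is no stronger than its best member.
print(f"fully correlated: {1.0 - max(layers):.4f}")
```

Four layers that each miss 40% of attacks stop all but about 2.6% when independent, while a correlated stack still lets 40% through. This is also why leaking which defense triggered is so costly: it hands the attacker exactly the information needed to defeat the layers one at a time.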

AI Capability Growth Will Likely Be Spiky, Not Uniform

Unlike assumptions that AGI will be uniformly superhuman across all domains, Adam expects AI systems to have highly uneven skill profiles for years to come. (33:24) AIs will excel in areas with abundant training data and easily specified objectives while struggling with long-horizon, vague tasks like entrepreneurship. This spikiness means human-led organizations can remain competitive by leveraging areas where humans maintain advantages, particularly in sample-efficient learning and general-purpose decision making.

Just-in-Time Safety Is Dangerously Inadequate

Current AI development follows a "just-in-time" safety approach: identifying problems only when models are about to be deployed and rushing out patches. (49:05) Adam warns that this approach runs on dangerously small safety margins and doesn't build the kind of reliable systems needed for transformative AI. Instead, developers need to design safety measures from the ground up, conduct careful experimental validation, and be willing to accept performance trade-offs when necessary for safety.

Scalable Oversight Shows Promise for Reducing AI Deception

Far.AI's recent research demonstrates that training AI systems against lie detectors can significantly reduce deception rates and generalize to improved honesty across contexts. (63:54) Crucially, the training methodology matters enormously: off-policy reinforcement learning with human-anchored data shows better results than on-policy exploration, which can teach models to better fool the detectors. This suggests that with rigorous engineering approaches, we can make meaningful progress on core alignment problems like truthfulness.
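
To make the off-policy versus on-policy distinction concrete, here is a deliberately tiny toy, not Far.AI's setup: the detector, its keyword-based blind spot, and the sample transcripts are all invented for illustration.

```python
# Toy contrast between off-policy and on-policy training against a
# frozen lie detector. Everything here is an illustrative assumption,
# not Far.AI's actual method or code.

def detector(transcript: str) -> float:
    """Stand-in honesty classifier. Imperfect by construction: it
    wrongly passes any lie that merely sounds confident."""
    if "confident" in transcript:
        return 1.0  # the blind spot an optimizer can exploit
    return 0.9 if "honest" in transcript else 0.1

# Off-policy: the training signal comes from a fixed, human-anchored
# dataset. The model imitates transcripts humans wrote and the detector
# endorses; it never searches the detector for blind spots.
human_dataset = ["honest report", "honest summary", "fabricated claim"]
off_policy_targets = [t for t in human_dataset if detector(t) > 0.5]
print("off-policy imitates:  ", off_policy_targets)

# On-policy: the model samples its own transcripts and is rewarded for
# whatever the detector scores highest. Optimization pressure lands on
# the detector's blind spot, reinforcing the confident-sounding lie.
model_samples = ["honest report", "confident fabricated claim"]
print("on-policy reinforces: ", max(model_samples, key=detector))
```

The toy exaggerates the effect, but the structural point matches the episode's claim: on-policy exploration optimizes against the detector itself, while off-policy training anchored to human data cannot search out the detector's failure modes.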

Interpretability Should Focus on Application-Specific Understanding

Rather than pursuing the maximalist goal of fully reverse-engineering AI systems, interpretability research should target specific applications where understanding matters most. (74:16) Adam's team successfully reverse-engineered planning algorithms in game-playing models by focusing only on the components relevant to long-term planning while ignoring short-term heuristics. This approach can provide actionable insights, such as detecting when models use theory-of-mind reasoning in suspicious contexts, without requiring complete system comprehension.
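
One common concrete form of application-specific interpretability is a cheap linear probe trained to flag a single behavior of interest. The sketch below uses synthetic activations and an invented probe target; it is a generic illustration of the probe technique, not a reconstruction of Far.AI's planning work.

```python
# Sketch: application-specific interpretability via a linear probe.
# Rather than reverse-engineering the whole network, train a small
# classifier on hidden activations to flag one behavior of interest.
# The synthetic data and probed concept are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 64  # hidden dimension (arbitrary for this toy)

# Pretend these are activations from transcripts hand-labeled for the
# target behavior (e.g., theory-of-mind reasoning present vs. absent).
concept_direction = rng.normal(size=d)
X_present = rng.normal(size=(200, d)) + concept_direction
X_absent = rng.normal(size=(200, d))
X = np.vstack([X_present, X_absent])
y = np.array([1] * 200 + [0] * 200)

probe = LogisticRegression(max_iter=1000).fit(X, y)

# At inference time the probe runs alongside the model and raises a
# flag when the behavior shows up in a suspicious context, with no
# need for a complete mechanistic account of the network.
new_activation = rng.normal(size=(1, d)) + concept_direction
print("behavior flagged:", bool(probe.predict(new_activation)[0]))
```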

Statistics & Facts

  1. Adam estimates median timelines of 5-7 years for autonomous AI agents capable of complex tasks like full cybersecurity attack chains, and around 14 years for AI organizations that can outcompete human-led companies across all domains. (28:00)
  2. In Far.AI's deception detection research, proper training methodologies achieved significant reductions in AI deception rates, and lie-detector accuracy improved as models grew larger, rather than bigger models becoming harder to catch. (66:15)
  3. Far.AI plans to double in size over the next 12-18 months and has secured funding for the next few years; its most critical hire is a Chief Operating Officer to help scale operations. (85:24)
