PodMine
"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis • October 29, 2025

Is AI Stalling Out? Cutting Through Capabilities Confusion, w/ Erik Torenberg, from the a16z Podcast

Nathan Labenz discusses the ongoing progress in AI capabilities, countering arguments that AI is stalling, by highlighting advances in reasoning, context windows, multimodal abilities, and scientific contributions, while also exploring potential societal impacts and challenges in AI development.
AI & Machine Learning
Indie Hackers & SaaS Builders
Tech Policy & Ethics
Developer Culture
Sam Altman
Cal Newport
Nathan Labenz
Sergey Brin

Summary Sections

  • Podcast Summary
  • Speakers
  • Key Takeaways
  • Statistics & Facts
  • Compelling Stories (Premium)
  • Thought-Provoking Quotes (Premium)
  • Strategies & Frameworks (Premium)
  • Similar Strategies (Plus)
  • Additional Context (Premium)
  • Key Takeaways Table (Plus)
  • Critical Analysis (Plus)
  • Books & Articles Mentioned (Plus)
  • Products, Tools & Software Mentioned (Plus)

Timestamps are as accurate as they can be but may be slightly off. We encourage you to listen to the full context.


Podcast Summary

In this episode of The Cognitive Revolution, host Nathan Labenz debates Erik Torenberg over whether recent AI developments suggest progress is slowing or stalling, directly addressing arguments from Cal Newport and others. Nathan counters the "AI is stalling" narrative by highlighting significant qualitative advances, including 100X context window expansion, real-time interactive voice capabilities, improved reasoning leading to IMO gold medals, dramatic improvements in vision and tool use, and AI's growing contributions to the hard sciences. (03:49) The conversation explores AI's impact on labor markets, the potential for AI protectionism, concerns about recursive self-improvement, and the implications of China producing the world's best open-source models. Nathan argues that while frontier capabilities remain jagged, with embarrassing failures still common, every aggregate measure - from token volume to task complexity to industry revenue growth - suggests progress remains on trend, with frontier developers seeing a clear path to continued rapid advancement through 2027-2028.

  • Core theme: Separating AI capability advancement from concerns about current impact on learning and cognition

Speakers

Nathan Labenz

Nathan Labenz is the host of The Cognitive Revolution podcast and a leading AI industry analyst who provides in-depth coverage of AI capabilities and their implications. He works closely with AI companies and has extensive experience evaluating frontier models, making him a trusted voice for understanding the pace and direction of AI progress.

Erik Torenberg

Erik Torenberg hosts the a16z podcast and has been Nathan's podcast partner for The Cognitive Revolution. He brings a venture capital perspective to AI discussions, focusing on the intersection of technology development and market dynamics in the AI space.

Key Takeaways

Separate Capability Progress from Current Impact Assessment

One of the most critical insights from this discussion is the importance of distinguishing between whether AI is "good for us" versus whether AI capabilities are continuing to advance. (04:24) Nathan agrees with Cal Newport's concerns about students using AI as cognitive shortcuts and the potential negative impacts on attention spans and learning. However, he argues that these legitimate concerns about current AI usage don't negate the clear evidence of continued capability improvements. This distinction matters because conflating these separate issues can lead to dangerous complacency about preparing for more powerful future AI systems. The key insight is that one can simultaneously be concerned about AI's current educational impacts while recognizing that capabilities are advancing rapidly toward potentially transformative levels.

Context Window Expansion Represents Fundamental Progress

The expansion from GPT-4's initial 8,000-token context window to current models handling hundreds of thousands of tokens represents a qualitative leap in capability. (14:21) This improvement allows models to work with dozens of research papers simultaneously while maintaining high-fidelity reasoning across the entire context. Nathan explains that this effectively substitutes for having more facts baked into the model: instead of needing trillion-parameter models to memorize everything, smaller models can work dynamically with provided information. This advance enables new workflows, such as comprehensive literature reviews and complex document analysis, that were previously impossible, demonstrating that scaling isn't just about raw model size but about finding more efficient gradients of improvement.

AI is Beginning to Push the Frontier of Human Knowledge

A fundamental shift has occurred: AI systems are starting to contribute original insights to the hard sciences rather than just recombining existing knowledge. (22:14) Nathan highlights Google's AI co-scientist system, which generated novel hypotheses for previously unsolved virology problems, independently arriving at solutions that human researchers had recently discovered but not yet published. Recent breakthroughs also show AI systems achieving IMO gold-medal performance and making progress on frontier mathematics problems. This represents a qualitative change from GPT-4, which could not push the boundaries of human knowledge in any meaningful way. The implication is that we're transitioning from AI as a sophisticated information processor to AI as a genuine research collaborator capable of scientific discovery.

Labor Market Disruption Will Follow Predictable Patterns

AI's impact on employment will vary significantly by industry based on demand elasticity and task characteristics. (39:45) Nathan predicts that industries with relatively inelastic demand - like customer service where companies only want to handle necessary tickets, or accounting where businesses buy only required services - will see the most dramatic job displacement. Conversely, software development might maintain employment levels longer because there's potentially unlimited demand for software, allowing productivity gains to translate into expanded output rather than reduced headcount. However, he warns that even in high-elasticity industries, the ratios become challenging when AI handles 90%+ of work, making it mathematically difficult to absorb displaced workers even with increased demand.
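The ratio argument above can be made concrete with a back-of-the-envelope sketch. The numbers and function names here are illustrative, not from the episode: if AI automates a fraction of the work, human labor per unit of output falls proportionally, so demand must grow by the reciprocal of the remaining human share just to keep employment flat.

```python
def headcount_multiplier(automated_fraction: float, demand_multiplier: float) -> float:
    """Relative human headcount after automation, given demand growth.

    Human labor per unit of output scales by (1 - automated_fraction),
    so total headcount scales by (1 - automated_fraction) * demand growth.
    """
    return (1.0 - automated_fraction) * demand_multiplier


def demand_needed_to_hold_headcount(automated_fraction: float) -> float:
    """Demand growth required to keep employment constant."""
    return 1.0 / (1.0 - automated_fraction)


# At 50% automation, demand only needs to double to preserve jobs:
print(round(demand_needed_to_hold_headcount(0.5), 2))   # 2.0
# At 90% automation, demand must grow tenfold - plausible for software,
# far less so for customer service tickets or required accounting work:
print(round(demand_needed_to_hold_headcount(0.9), 2))   # 10.0
```

This is why the episode distinguishes elastic from inelastic demand: a 10x demand expansion is conceivable where appetite for output is effectively unlimited, but not where firms buy only what they must.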

Recursive Self-Improvement Poses Near-Term Concerns

The rapid advancement in AI's ability to contribute to its own development cycle represents a critical inflection point that may arrive sooner than expected. (44:28) Nathan cites OpenAI's system card showing o3 can successfully complete 40% of pull requests from OpenAI research engineers - jumping from single digits with previous models. This suggests AI is entering the steep part of an S-curve for automating AI research itself. The concerning scenario is companies tipping into recursive self-improvement regimes without adequate controls, potentially leading to rapid capability jumps that outpace human oversight. Nathan expresses particular worry about this transition happening while current models still exhibit unpredictable behaviors and alignment challenges, as it could dramatically accelerate progress beyond our ability to steer the overall process safely.

Statistics & Facts

  1. GPT-4.5 achieved a 65% score on the SimpleQA benchmark compared to o3's 50%, a 30% relative improvement in absorbing long-tail factual knowledge. (12:22) This demonstrates continued scaling benefits for pure knowledge absorption, even as the focus shifts toward reasoning capabilities.
  2. OpenAI's o3 model can successfully complete 40% of pull requests from OpenAI research engineers, jumping from low-to-mid single digits with previous models. (46:34) This metric indicates AI is entering the steep part of the S-curve for automating AI research itself.
  3. Intercom's fin agent now resolves 65% of customer service tickets automatically, up from 55% just three to four months earlier. (40:17) The rapid improvement trajectory suggests approaching 90%+ automation rates that would fundamentally restructure customer service employment.
