
Timestamps are as accurate as they can be but may be slightly off. We encourage you to listen to the full context.
In this thought-provoking episode from The Dwarkesh Podcast, host Dwarkesh Patel interviews Ilya Sutskever, cofounder of SSI and former OpenAI chief scientist, about the puzzling disconnect between AI models' impressive benchmark performance and their underwhelming real-world impact. (02:24) Sutskever explores why current AI systems excel on evaluations yet struggle with basic reliability issues, like repeatedly introducing the same bugs when asked to fix code. The conversation delves into fundamental questions about generalization, learning efficiency, and what's actually blocking progress toward artificial general intelligence. (24:00) Sutskever argues we're transitioning from the "age of scaling" back to an "age of research," where simply throwing more compute at problems won't solve the core challenges of building reliable, human-level learning systems.
Host of The Dwarkesh Podcast, known for conducting in-depth conversations with leading figures in AI, technology, and science. He has interviewed prominent researchers and industry leaders, building a reputation for thoughtful, technical discussions that explore the cutting edge of artificial intelligence research.
Cofounder of Safe Superintelligence Inc. (SSI) and former Chief Scientist at OpenAI, where he played a pivotal role in developing GPT models. Previously a research scientist at Google Brain and co-author of foundational papers, including the AlexNet paper. Widely regarded for his exceptional research taste in AI, he has contributed to many breakthrough developments in deep learning, from convolutional networks to large language models.
Sutskever identifies a fundamental disconnect between AI models' performance on benchmarks and in real-world applications. (02:54) He suggests the gap arises because reinforcement learning training inadvertently takes inspiration from the evaluations themselves: researchers want their models to perform well on evals, so they design RL environments that mirror those tasks. If models generalize inadequately, the result is systems that excel at specific benchmarks but fail in broader applications. Sutskever frames this as a form of "reward hacking" by human researchers, who become too focused on eval performance rather than genuine capability development.
Unlike AI models that require massive amounts of data and specific training environments, humans demonstrate remarkable sample efficiency and robustness across domains. (27:00) Sutskever uses the analogy of two competitive programming students - one who practices 10,000 hours on specific problems versus another who practices 100 hours but has better foundational understanding. The second student, despite less practice, will likely perform better in their career. This "it factor" in human learning represents a fundamental advantage in generalization that current AI systems lack, even in domains like mathematics and coding where humans couldn't have evolutionary priors.
Sutskever argues we're transitioning from 2020-2025's "age of scaling" back to an "age of research" similar to 2012-2020. (22:51) While scaling provided a reliable recipe for improvement (more data + compute + parameters = better results), we've now reached a point where simply scaling up may not yield transformative differences. The current landscape has "more companies than ideas" because scaling sucked all the air out of the room. Now that compute is abundant, the bottleneck has shifted from computational resources back to fundamental algorithmic insights and novel approaches to training.
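As background (the episode does not present it in this form), the "reliable recipe" of the scaling era is often summarized by empirical scaling laws, in which loss falls predictably as parameters and data grow. One common version is the Chinchilla-style fit from Hoffmann et al. (2022), where E, A, B, α, and β are constants fitted to training runs:

```latex
% Chinchilla-style parametric scaling law (Hoffmann et al., 2022), shown for context:
% expected loss L as a function of parameter count N and training tokens D,
% with E, A, B, \alpha, \beta fitted empirically.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```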
Current reinforcement learning approaches suffer from extremely sparse reward signals: models must complete entire trajectories before receiving any learning signal. (15:01) Value functions could provide intermediate feedback, allowing models to learn from mistakes much earlier in the process. For example, if a model is pursuing an unproductive coding solution, a value function could deliver negative feedback after 1,000 steps rather than waiting for the entire attempt to fail. This mirrors human learning, where we have an intuitive sense of progress and can course-correct quickly. Implementing effective value functions could dramatically improve learning efficiency.
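To make the contrast concrete, here is a minimal sketch (not from the episode): one toy attempt that only receives a reward at the very end of a trajectory, versus one where a stand-in value function is consulted periodically and lets the agent abandon an unpromising attempt early. The environment, value estimates, threshold, and step counts are all illustrative assumptions, not a real training setup.

```python
# Sketch: sparse end-of-trajectory reward vs. value-function-guided early feedback.
# All numbers and the value model are illustrative assumptions.

import random

TRAJECTORY_LENGTH = 5_000   # steps in one full solution attempt
CHECK_EVERY = 1_000         # how often the value function is consulted
ABANDON_BELOW = 0.2         # estimated success probability that triggers early abandonment


def sparse_reward_attempt() -> tuple[int, float]:
    """Whole-trajectory training: the only learning signal arrives at the very end."""
    steps = TRAJECTORY_LENGTH
    reward = 1.0 if random.random() < 0.1 else 0.0  # pretend 10% of attempts succeed
    return steps, reward


def value_function(step: int) -> float:
    """Stand-in value estimate of how promising the current trajectory looks.

    A real value function would be a learned predictor of expected future reward;
    this one simply decays with noise to simulate an attempt going off the rails.
    """
    return max(0.0, 0.9 - 0.0003 * step + random.uniform(-0.05, 0.05))


def value_guided_attempt() -> tuple[int, float]:
    """Consult the value function periodically and abandon bad trajectories early."""
    for step in range(CHECK_EVERY, TRAJECTORY_LENGTH + 1, CHECK_EVERY):
        if value_function(step) < ABANDON_BELOW:
            return step, -0.1  # intermediate negative feedback, far fewer wasted steps
    return TRAJECTORY_LENGTH, 1.0 if random.random() < 0.1 else 0.0


if __name__ == "__main__":
    random.seed(0)
    print("sparse reward:", sparse_reward_attempt())  # always pays the full 5,000 steps
    print("value guided: ", value_guided_attempt())   # can bail out thousands of steps sooner
```

The point of the sketch is only the shape of the feedback: the sparse version learns nothing until a full attempt ends, while the value-guided version gets a usable signal partway through and stops wasting steps on a doomed trajectory.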
Sutskever discusses a fascinating case study of a person who lost emotional processing due to brain damage. (12:57) Despite maintaining intellectual capabilities, this individual became unable to make basic decisions, spending hours choosing socks and making poor financial choices. This suggests emotions serve as a critical value function that guides human decision-making and learning. The robustness and simplicity of emotional systems, despite their ancient evolutionary origins, demonstrates how effective value functions can remain useful across vastly different environments - a principle that could inform AI system design.