Timestamps are as accurate as they can be but may be slightly off. We encourage you to listen to the full context.
Jerry Tworek, a legendary AI researcher who recently left OpenAI after seven years, joins the Core Memory podcast for his "exit interview." (03:35) Tworek joined OpenAI in 2019 when it was just 30 people and worked on many of the company's most consequential products, including the reasoning technology that evolved from Q* to Strawberry to o1. In this revealing conversation, he discusses why he left OpenAI to pursue research that he felt the company couldn't support, his views on the current state of AI development, and his plans for the future.
Jerry Tworek is a renowned AI researcher from Poland who spent seven years at OpenAI, joining in 2019 when the company had around 30 employees. He led or contributed to many of OpenAI's breakthrough products, most notably the reasoning technology that began as Q*, was later codenamed Strawberry, and shipped as the o1 models. Before AI, Tworek worked in high-frequency trading and is known for his high risk tolerance and focus on foundational research breakthroughs.
Ashlee Vance is the host of the Core Memory podcast and an author covering technology and innovation. He has extensive experience covering the AI industry and has followed the development of major AI labs and their key researchers for years.
Kylie Robinson is co-host of the Core Memory podcast, bringing a fresh perspective and energy to covering the rapidly evolving AI landscape. She focuses on the human and business dynamics within the AI industry.
Tworek emphasizes that meaningful research breakthroughs come from taking significant risks and maintaining deep focus on specific problems. (22:00) He argues that as AI companies grow larger and face commercial pressures, they naturally become more risk-averse, making it harder to pursue the kind of pioneering work that led to major breakthroughs like reinforcement learning scaling. The key is having conviction in your research direction and being willing to bet everything on it, even if it might fail. This approach contrasts sharply with spreading resources across many safe, incremental projects.
All major AI labs are essentially doing the same thing - scaling transformers - which Tworek finds deeply concerning. (16:36) He argues that while competition is healthy, having five major companies all pursuing identical approaches limits innovation. Most users can't even distinguish between different models, despite each team believing its work is meaningfully different. This lack of diversity in approaches means fewer opportunities for breakthrough discoveries that could fundamentally change the field.
Despite transformers' success over the past six years, Tworek believes the architecture has limitations and that new approaches are needed. (33:41) He advocates exploring novel architectures, whether they resemble transformers or depart from them entirely. In his view, the field has become too focused on incremental improvements to transformers rather than questioning whether this is the optimal architecture for all AI tasks. This represents one of his primary research interests going forward.
Current AI models operate with separate training and inference modes, unlike humans, who learn continuously. (33:55) Tworek identifies continual learning - the ability to learn from data in real time during operation - as one of the final crucial capabilities needed for AGI. Without it, models remain fundamentally limited and "dumb" compared to human intelligence. Successfully integrating continual learning into current models would represent a major step toward genuine artificial general intelligence.
Tworek sees video games as uniquely valuable training environments because they're designed to be interesting to human intelligence. (26:55) Games incorporate storytelling, problem-solving, resource allocation, and puzzle-solving in ways that are engaging and non-repetitive. Unlike early reinforcement learning approaches that trained from scratch, combining game-based training with strong world knowledge from pre-training could create more capable agents. This approach leverages the fact that games are crafted to challenge and develop human cognitive abilities.