
Timestamps are as accurate as they can be but may be slightly off. We encourage you to listen to the full context.
In this landmark episode, Sebastian Borgeaud, a pre-training lead for Gemini 3 at Google DeepMind and co-author of the seminal RETRO paper, gives his first-ever podcast interview. (00:58) He reveals that Gemini 3's remarkable performance comes from a deceptively simple formula: better pre-training and better post-training, achieved through the coordinated efforts of a team of 150-200 people working across data, models, infrastructure, and evaluations. (02:29) The conversation explores how the AI industry is shifting from an "infinite data" paradigm to a "data-limited regime," fundamentally changing research approaches and priorities. (04:44) Sebastian discusses the evolution from building individual models to constructing complete systems, the technical details behind Gemini 3's mixture-of-experts architecture, and why frontier research increasingly requires full-stack thinking that spans algorithms, engineering, and infrastructure.
Sebastian Borgeaud is a pre-training lead for Gemini 3 at Google DeepMind and co-author of the influential RETRO paper. Born in the Netherlands and educated across Europe, he earned his undergraduate and master's degrees at the University of Cambridge's Computer Laboratory before joining DeepMind in 2018 as a research engineer. He has been instrumental in developing major language models including Gopher, Chinchilla, and RETRO, and now coordinates the work of 150-200 people across data, models, infrastructure, and evaluations for Gemini's pre-training efforts.
Matt Turck is the Managing Director at FirstMark Capital and host of the MAD podcast. He focuses on investments in data infrastructure, AI, and enterprise technology, bringing deep industry expertise and insights to conversations with leading technologists and researchers.
Sebastian emphasizes that modern frontier AI development is no longer about training a single neural network architecture. (02:49) Instead, teams are building comprehensive systems that integrate models, data pipelines, infrastructure, and evaluation frameworks. This shift requires "research taste": the ability to balance performance improvements against system complexity and team productivity. (20:44) Sebastian explains that research ideas must "play well with everyone else's research" and integrate smoothly, because the cost of slowing down the broader team often outweighs an individual performance gain. This systems approach is what enabled Gemini 3's remarkable leap in capabilities through the coordinated work of hundreds of researchers and engineers.
A fundamental shift is occurring in AI research as the field moves from assuming unlimited data availability to operating within finite data constraints. (34:05) This paradigm change is driving renewed interest in techniques from pre-LLM computer vision research, where data scarcity was the norm. Sebastian notes this doesn't necessarily mean using less data, but rather optimizing within known data boundaries. This shift is catalyzing innovation in data curation, synthetic data generation, and architectural improvements that maximize learning efficiency from available datasets, fundamentally changing how researchers prioritize and approach problems.
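The tension Sebastian describes can be made concrete with a back-of-the-envelope calculation. The sketch below uses the Chinchilla rule of thumb of roughly 20 training tokens per parameter (from the Hoffmann et al. paper Sebastian co-authored); the model and corpus sizes are hypothetical numbers chosen for illustration, not figures from the episode.

```python
# Illustrative sketch of the data-limited regime: when the
# compute-optimal token budget exceeds the unique data available,
# teams must repeat data or invest in curation and synthetic data.

TOKENS_PER_PARAM = 20  # Chinchilla-style rule of thumb (~20 tokens/param)

def optimal_tokens(n_params: float) -> float:
    """Compute-optimal training-token budget for a given model size."""
    return TOKENS_PER_PARAM * n_params

def epochs_needed(n_params: float, unique_tokens: float) -> float:
    """How many passes over the unique data the optimal budget implies."""
    return optimal_tokens(n_params) / unique_tokens

# A hypothetical 1T-parameter model against a 10T-token curated corpus:
print(epochs_needed(1e12, 1e13))  # 2.0 -> the data must be repeated
```

Any result above 1.0 means training is operating inside known data boundaries, which is exactly the regime where curation and learning-efficiency improvements start to dominate.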
Contrary to widespread industry speculation about the "death of scaling laws," Sebastian confirms that scale continues to provide predictable improvements in model performance. (30:54) However, the research community has shifted away from viewing scale as the primary or only lever for advancement. Modern progress comes from the compounding effects of scaling, architectural innovations, and data improvements working together. (32:12) This balanced approach allows teams to optimize for multiple objectives simultaneously, including serving costs and inference efficiency, rather than pursuing scale at any cost.
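The "predictable improvements" Sebastian refers to are typically modeled with a parametric scaling law. The sketch below uses the functional form and published fitted constants from the Chinchilla paper (Hoffmann et al., 2022); those constants describe that paper's experiments, not Gemini, and are included purely to show the shape of the prediction.

```python
# Chinchilla-style parametric scaling law: predicted pre-training loss
# as a function of parameter count N and training tokens D.
# L(N, D) = E + A / N^alpha + B / D^beta

def predicted_loss(n_params: float, n_tokens: float) -> float:
    E, A, B = 1.69, 406.4, 410.7   # fitted constants from the paper
    alpha, beta = 0.34, 0.28       # exponents for params and tokens
    return E + A / n_params**alpha + B / n_tokens**beta

# Scale still pays off predictably: 10x more params and tokens
# lowers the predicted loss, with diminishing absolute gains.
small = predicted_loss(1e9, 2e10)    # ~1B params, 20B tokens
large = predicted_loss(1e10, 2e11)   # ~10B params, 200B tokens
print(small, large)
```

Note the irreducible term E: as both N and D grow, each extra order of magnitude buys a smaller loss reduction, which is one reason teams now compound scale with architecture and data improvements rather than relying on scale alone.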
Sebastian identifies evaluation (evals) as one of the most underestimated and challenging aspects of AI research. (41:00) Pre-training evaluation faces two critical gaps: evals must predict performance at scale (since day-to-day experiments use smaller models), and they must predict post-training performance (since models undergo additional training before deployment). (41:58) External benchmarks quickly become contaminated as they appear in training data, forcing teams to develop internal held-out evaluation suites. This evaluation challenge is particularly acute in pre-training because of the long iteration cycles and high costs of large-scale experiments.
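The contamination problem above can be illustrated with a toy overlap check. This is a hypothetical sketch, not a description of DeepMind's decontamination pipeline: real systems are far more sophisticated, but the basic idea of flagging eval items whose word n-grams also appear in the training corpus shows why external benchmarks degrade once they leak into training data.

```python
# Toy benchmark-contamination check: flag an eval question if any of
# its 8-gram word sequences also occurs in the training corpus.

def ngrams(text: str, n: int = 8) -> set:
    """All length-n word sequences in `text`, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(eval_item: str, corpus_ngrams: set, n: int = 8) -> bool:
    """True if the eval item shares any n-gram with the corpus."""
    return bool(ngrams(eval_item, n) & corpus_ngrams)

corpus = "the quick brown fox jumps over the lazy dog near the river bank"
eval_q = "quick brown fox jumps over the lazy dog near the river"
print(is_contaminated(eval_q, ngrams(corpus)))  # True: overlapping 8-grams
```

Because anything published eventually shows up in crawled data, checks like this push teams toward the internal held-out evaluation suites Sebastian describes.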
Sebastian advocates for a new type of researcher-engineer who can understand the entire technology stack from research concepts down to hardware implementation. (50:05) He describes this full-stack understanding as a "superpower" that enables researchers to identify opportunities across system layers and reason through the implications of research ideas all the way to the TPU level. (50:18) This systems awareness becomes increasingly critical as AI models become more complex and resource-intensive, requiring researchers who can balance algorithmic innovation with practical deployment constraints.