Timestamps are as accurate as they can be but may be slightly off. We encourage you to listen to the episode for full context.
In this engaging conversation, Nick Joseph, Head of Pre-training at Anthropic, provides an insider's perspective on the evolution of AI training and the future of artificial general intelligence. From his early days at Vicarious and OpenAI to leading one of the most critical teams in AI development, Nick shares candid insights about the technical challenges, strategic decisions, and philosophical considerations that shape modern AI systems. (03:03) The discussion covers the fundamentals of pre-training, the surprising dominance of next-token prediction over other approaches, and how Anthropic operates at unprecedented scales with distributed systems spanning thousands of GPUs.
Nick Joseph is the Head of Pre-training at Anthropic, where he leads the team responsible for training large language models like Claude. Before joining Anthropic at its founding, he worked at OpenAI on safety teams and code models, and before that at Vicarious on computer vision for robotics products. His path into AI began with an economics background and concerns about AI safety sparked by an internship at GiveWell, which led him to focus on the technical challenges of scaling AI systems rather than pursuing a traditional academic route.
Nick emphasizes that the bottleneck in AI progress isn't theoretical breakthroughs but engineering execution. (52:37) As he puts it, "Almost all is. Throughout the entire history of this field, it's the case that you throw more compute, the thing kinda works. The challenge is actually getting it correct isn't really an ML problem." The actual architectures are mathematically simple, but implementing them correctly at massive scale requires debugging skills across the entire technology stack, from high-level ML concepts down to network protocols and hardware failures. This insight challenges the common perception that AI teams need primarily PhD researchers, when in reality they need engineers who can solve extraordinarily complex distributed systems problems.
Despite the complexity of modern AI systems, Nick reveals that pre-training success still boils down to a single metric: driving down loss on next-token prediction. (19:23) He notes, "I think I'm still pushing down the exact same metric that I was on day one. There's like some loss function. Loss go down." This seemingly simple objective has proven remarkably robust across massive scaling efforts. While teams have grown more specialized and systems more complex, the fundamental goal remains unchanged, suggesting that this metric captures something fundamental about intelligence that scales predictably with compute and data.
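For readers unfamiliar with the objective, here is a minimal PyTorch sketch (not Anthropic's code) of what "loss on next-token prediction" means: the model's logits at each position are scored against the token that actually comes next, and training simply pushes the average cross-entropy down.

```python
# A minimal sketch of the next-token prediction objective ("loss go down").
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """logits: [batch, seq, vocab] model outputs; tokens: [batch, seq] token ids."""
    pred = logits[:, :-1, :]    # predictions for positions 0..seq-2
    target = tokens[:, 1:]      # the token that actually follows each position
    return F.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))

# Toy usage with random data: vocab of 100, batch of 2, sequences of 16 tokens.
logits = torch.randn(2, 16, 100)
tokens = torch.randint(0, 100, (2, 16))
print(next_token_loss(logits, tokens))  # this single number is the metric being pushed down
```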
One of Nick's most counterintuitive insights is that bugs, not theoretical problems, pose the greatest threat to AI progress. (48:46) He explains that "a single bug can derail you for months" because models take months to train, meaning you can "lose a whole generation off of something that just looks like, ah, you know, this piece of your code was incorrect." The challenge is compounded by the fact that traditional debugging approaches don't work at the scale of thousands of GPUs training for months. A subtle precision error deep in a kernel might only manifest after weeks of training, requiring engineers who can trace problems through tens of thousands of lines of code across multiple abstraction layers.
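As a purely hypothetical illustration (not an example from the episode), the toy NumPy snippet below shows how a quiet numerical error can hide in plain sight: accumulating many tiny values in fp16 silently stalls once the running total dwarfs each increment, while an fp32 accumulation stays close to the true sum.

```python
# Toy illustration of a quiet precision bug: fp16 vs fp32 accumulation.
import numpy as np

updates = np.full(100_000, 1e-3, dtype=np.float16)  # many tiny gradient-like values

acc16 = np.float16(0.0)
for u in updates:
    acc16 = np.float16(acc16 + u)   # low-precision accumulation, as a buggy kernel might do

acc32 = updates.astype(np.float32).sum()  # accumulate in fp32 instead

print(acc16)  # stalls far below the true total of ~100
print(acc32)  # close to 100, as expected
```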
Nick provides fascinating insight into how scaling laws work in practice, describing them as "really a power law plus constant" where loss decreases predictably with increased compute until you "curve off that power law and then you know something is wrong." (08:53) This creates a unique debugging challenge: when performance deviates from expected scaling, it could indicate either a fundamental limit or a subtle implementation bug. The predictability of these laws has enabled strategic planning around compute allocation and has been central to Anthropic's approach of testing strategies at small scale before scaling up proportionally across data, model size, and training time.
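A minimal sketch of that "power law plus constant" shape, using made-up numbers rather than Anthropic's data: fit loss = a * C^(-b) + c on small runs, extrapolate to the target compute, and treat a large deviation from the prediction as a cue to go bug-hunting rather than a proof you've hit a wall.

```python
# Illustrative fit of the "power law plus constant" form: loss(C) = a * C**(-b) + c.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(compute, a, b, c):
    # Loss falls as a power law in compute, down to an irreducible constant c.
    return a * compute ** (-b) + c

# Hypothetical losses measured on small runs; compute is in relative units.
compute = np.array([1.0, 8.0, 64.0, 512.0])
loss = np.array([4.80, 3.78, 3.11, 2.66])

(a, b, c), _ = curve_fit(scaling_law, compute, loss, p0=[1.0, 0.1, 1.0])

# Extrapolate to a much larger run; a real run landing well above this
# prediction has "curved off the power law" and deserves a bug hunt.
target = 32768.0
print(f"predicted loss at {target:.0f}x compute: {scaling_law(target, a, b, c):.2f}")
```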
While pre-training determines the model's fundamental capabilities, Nick explains that post-training is where teams can rapidly iterate on model personality and alignment. (45:19) The key advantage is speed: "your iteration, like the ability to make progress, is really fast. You can try something, you can try it again, you can try it again" in hours rather than months. This separation allows teams to de-risk behavioral changes before potentially incorporating them into expensive pre-training runs. However, some alignment properties may eventually need to be integrated into pre-training for greater robustness, creating an ongoing strategic tension between flexibility and stability.