
Timestamps are as accurate as they can be but may be slightly off. We encourage you to listen to the full context.
In this episode, a16z GP Martin Casado sits down with Sherwin Wu, Head of Engineering for OpenAI's Platform, to explore how OpenAI balances its dual nature as both a horizontal API platform and a vertical product company. The conversation reveals OpenAI's evolution from believing in "one model to rule them all" to embracing a portfolio of specialized models, each designed for specific use cases. (00:35)
Sherwin Wu leads the engineering team for OpenAI's developer platform, primarily focusing on the API that powers much of Silicon Valley's AI ecosystem. Before joining OpenAI in 2022, he spent six years at Opendoor working on ML-powered home pricing models, where incorrect predictions could cost millions. He began his career at Quora, working on newsfeed ranking alongside future founders of Perplexity, Scale AI, and other notable AI companies.
Martin Casado is a General Partner at Andreessen Horowitz (a16z), where he focuses on enterprise software and infrastructure investments. Previously, he was a co-founder and CTO of Nicira, which was acquired by VMware for $1.26 billion, and has extensive experience in networking and cloud infrastructure technologies.
Main themes covered:
Unlike traditional software that can be easily abstracted away, AI models resist layering and remain visible to end users. Wu explains that users develop relationships with specific models and can distinguish between them, making it nearly impossible to hide the underlying intelligence. (12:12) This fundamentally changes platform economics because developers can't easily swap models without users noticing, creating stickiness that traditional cloud platforms lack. The implication is that model providers have stronger competitive moats than previously assumed.
OpenAI has completely abandoned the belief that one general-purpose model will handle all tasks. Wu reveals that even within OpenAI just 2-3 years ago, the prevailing thinking was "one model to rule them all," but this has proven false. (17:28) Instead, they're seeing a proliferation of specialized models for coding, reasoning, speed, and other specific use cases. This shift has major implications for businesses planning AI strategies: rather than betting on a single super-intelligent model, companies should prepare for a diverse ecosystem of specialized tools.
The focus has shifted from crafting perfect prompts to designing comprehensive contexts that include the right tools, data, and capabilities. Wu describes how prompt engineering was expected to become obsolete as models improved, but instead evolved into "context design" - determining what tools to provide, when to pull in relevant data, and how to structure workflows. (24:05) This requires understanding not just what to ask the model, but how to architect the entire interaction environment.
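The shift Wu describes can be made concrete with a minimal sketch. This is not OpenAI's API; the function names (`build_context`, `search_docs`) and the record shapes are illustrative assumptions showing how a request might bundle tools, retrieved data, and instructions rather than a single prompt string.

```python
# Hypothetical sketch of "context design": the request assembles the whole
# interaction environment (tools, data, instructions), not just a prompt.

def search_docs(query: str) -> list[str]:
    """Stand-in retrieval step; a real system would query a vector store."""
    corpus = {
        "refunds": "Refunds are issued within 5 business days.",
        "shipping": "Orders ship within 48 hours.",
    }
    return [text for key, text in corpus.items() if key in query.lower()]

def build_context(user_message: str) -> dict:
    """Assemble tools, relevant documents, and instructions for one call."""
    return {
        "instructions": "Answer using only the provided documents.",
        "tools": [  # capabilities the model is allowed to invoke
            {"name": "lookup_order", "description": "Fetch order status by ID"},
        ],
        "documents": search_docs(user_message),  # pull in relevant data
        "input": user_message,
    }

ctx = build_context("What is your refunds policy?")
print(ctx["documents"])  # the refunds document is selected, shipping is not
```

The design choice this illustrates: the engineering effort moves from wording the question to deciding which tools and data belong in the context at each step of a workflow.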
Usage-based pricing has become a "one-way ratchet" in AI, with companies never returning to per-seat or deployment-based models once they experience the alignment with actual utility. Wu explains that this pricing model most closely matches how AI is actually consumed and provides the fairest cost structure. (33:15) However, implementing usage-based billing at OpenAI's scale (800 million weekly ChatGPT users) presents enormous technical challenges that require dedicated engineering teams to solve correctly.
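A toy version of the metering problem makes the engineering challenge easier to see. The rates, class, and record shape below are invented for illustration; production metering at the scale Wu describes also has to handle durability, idempotency, and aggregation across distributed services, none of which this sketch attempts.

```python
from collections import defaultdict

# Minimal sketch of usage-based billing: meter each call's token usage per
# customer and price it per million tokens. All rates are hypothetical.

RATE_PER_M_INPUT = 2.00   # assumed $ per 1M input tokens
RATE_PER_M_OUTPUT = 8.00  # assumed $ per 1M output tokens

class UsageMeter:
    def __init__(self):
        self.usage = defaultdict(lambda: {"input": 0, "output": 0})

    def record(self, customer: str, input_tokens: int, output_tokens: int):
        """Accumulate token counts for one API call."""
        self.usage[customer]["input"] += input_tokens
        self.usage[customer]["output"] += output_tokens

    def bill(self, customer: str) -> float:
        """Compute the charge from accumulated usage at the per-token rates."""
        u = self.usage[customer]
        return (u["input"] / 1e6) * RATE_PER_M_INPUT \
             + (u["output"] / 1e6) * RATE_PER_M_OUTPUT

meter = UsageMeter()
meter.record("acme", input_tokens=500_000, output_tokens=250_000)
print(round(meter.bill("acme"), 2))  # 3.0
```

The alignment Wu points to falls out directly: the bill is a pure function of consumption, so a customer who uses twice as much pays twice as much.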
OpenAI's node-based agent builder reflects the reality that much enterprise work requires procedural, SOP-driven automation rather than creative exploration. Wu distinguishes between knowledge work (like coding) that benefits from open-ended AI and procedural work (like customer support) that requires strict adherence to policies. (45:21) This insight reveals that enterprise AI adoption often needs guardrails and predictability more than pure intelligence, challenging assumptions about how AI will transform business operations.
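The procedural/knowledge-work distinction can be sketched as a tiny node-based workflow. This is not OpenAI's agent builder; the node names and the refund policy are hypothetical. The point is that each transition follows a fixed SOP rule, with guardrails (like the escalation threshold) enforced in code rather than left to open-ended model judgment.

```python
# Illustrative node-based, SOP-driven workflow: deterministic steps with
# hard-coded guardrails. All node names and thresholds are invented.

def classify(ticket):
    ticket["category"] = "refund" if "refund" in ticket["text"].lower() else "other"
    return "check_policy" if ticket["category"] == "refund" else "escalate"

def check_policy(ticket):
    # Guardrail: refunds over $100 always go to a human, per the SOP.
    return "auto_refund" if ticket["amount"] <= 100 else "escalate"

def auto_refund(ticket):
    ticket["resolution"] = "refunded"
    return None  # terminal node

def escalate(ticket):
    ticket["resolution"] = "human_review"
    return None  # terminal node

NODES = {"classify": classify, "check_policy": check_policy,
         "auto_refund": auto_refund, "escalate": escalate}

def run(ticket, start="classify"):
    """Walk the graph: each node mutates the ticket and names the next node."""
    node = start
    while node is not None:
        node = NODES[node](ticket)
    return ticket["resolution"]

print(run({"text": "Please refund my order", "amount": 40}))   # refunded
print(run({"text": "Please refund my order", "amount": 500}))  # human_review
```

An AI model could fill individual nodes (e.g. a smarter `classify`), but the graph itself stays fixed, which is what gives enterprises the predictability Wu contrasts with open-ended knowledge work.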