
Timestamps are as accurate as possible but may be slightly off. We encourage you to listen to the full episode for context.
In this episode, AI engineers Aishwarya Naresh Reganti and Kiriti Badam share hard-earned insights from launching over 50 AI products across OpenAI, Google, Amazon, and Databricks. The conversation centers on why traditional software development approaches fail for AI products and introduces a systematic framework for building reliable AI systems. (00:36)
• Key themes: The fundamental differences between AI and traditional software development, the importance of starting small with controlled autonomy, and building continuous feedback loops for behavior calibration.
Aishwarya is an AI researcher who previously worked on Amazon Alexa and at Microsoft, publishing over 35 research papers. She has led and supported AI product deployments across major companies including Amazon and Databricks, and co-teaches the top-rated AI course on Maven, focused on building successful AI products.
Kiriti currently works on Codex at OpenAI and has spent the last decade building AI and ML infrastructure at Google and Kumo. Together with Aishwarya, he has been instrumental in developing frameworks for enterprise AI adoption and has hands-on experience with the challenges of scaling AI systems in production.
The most successful AI products begin with minimal autonomy and maximum human oversight. (13:19) For example, in customer support, start with AI suggesting responses rather than automatically sending them. This approach allows teams to understand system behavior patterns before increasing autonomy. As Kiriti explains, when you start small, "it forces you to think about what is the problem that I'm gonna solve" rather than getting lost in solution complexity.
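To make the pattern concrete, here is a minimal sketch (not from the episode) of a suggest-then-approve loop for customer support: the model only drafts a reply, and a human agent decides whether and what to send. All names (`draft_reply`, `handle_ticket`, etc.) are hypothetical placeholders for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class Suggestion:
    ticket_id: str
    draft: str

def draft_reply(ticket_text: str) -> str:
    # Placeholder for an LLM call; in practice this would hit your model of choice.
    return f"Thanks for reaching out about: {ticket_text[:60]}"

def send_to_customer(ticket_id: str, text: str) -> None:
    print(f"[SEND] ticket={ticket_id}: {text}")

def log_rejection(ticket_id: str, draft: str) -> None:
    # Rejected drafts become calibration data before any autonomy is granted.
    print(f"[REJECTED] ticket={ticket_id}: {draft}")

def handle_ticket(ticket_id: str, ticket_text: str,
                  agent_review: Callable[[Suggestion], Tuple[bool, str]]) -> None:
    # Minimal autonomy: the model only drafts; a human decides what actually ships.
    suggestion = Suggestion(ticket_id, draft_reply(ticket_text))
    approved, final_text = agent_review(suggestion)
    if approved:
        send_to_customer(ticket_id, final_text)
    else:
        log_rejection(ticket_id, suggestion.draft)

# Example: an agent who edits the draft before approving it.
handle_ticket("T-42", "My invoice is wrong",
              lambda s: (True, s.draft + " We've corrected your invoice."))
```

Only after reviewing enough approved and rejected drafts would a team consider letting the system send some replies on its own.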
Unlike traditional software, AI systems require ongoing behavior calibration because they're inherently non-deterministic. (45:41) Successful teams establish feedback loops that capture both explicit user signals (thumbs up/down) and implicit signals (regenerating responses, switching off features). This continuous monitoring helps identify new error patterns that weren't anticipated during development.
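As a rough sketch of what such a feedback loop might record (assumed event names, not the speakers' implementation), both explicit ratings and implicit signals are logged against the same response ID so new error patterns can be mined later:

```python
import json
import time
from pathlib import Path

FEEDBACK_LOG = Path("feedback_events.jsonl")

def log_feedback(response_id: str, signal: str, value=None) -> None:
    """Append one feedback event. 'signal' can be explicit ('thumbs_up',
    'thumbs_down') or implicit ('regenerated', 'feature_disabled')."""
    event = {"ts": time.time(), "response_id": response_id,
             "signal": signal, "value": value}
    with FEEDBACK_LOG.open("a") as f:
        f.write(json.dumps(event) + "\n")

# Explicit signal: the user clicked thumbs down.
log_feedback("resp_123", "thumbs_down")
# Implicit signal: the user regenerated the response, a hint it missed the mark.
log_feedback("resp_123", "regenerated")
```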
Executive leadership engagement is the strongest predictor of AI adoption success. (26:31) As Aishwarya notes, the CEO of Rackspace blocks 4-6 AM daily for "catching up with AI" and has weekend coding sessions. Leaders need to rebuild their intuitions and be "comfortable with the fact that your intuitions might not be right" to guide effective AI decision-making.
The most successful AI implementations come from deep understanding of existing workflows rather than fascination with AI capabilities. (48:14) Enterprise data and infrastructure are messy, with complex taxonomies and undocumented rules. Teams that obsess over understanding these workflows can choose the right tool for each problem instead of defaulting to AI for everything.
Neither evaluations nor production monitoring alone can catch all AI system failures. (33:39) Evals catch known error patterns you've anticipated, while production monitoring reveals emerging behaviors you couldn't predict. Successful teams use both approaches: evals for regression testing and monitoring for discovering new failure modes in real user interactions.
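A minimal sketch of how the two complement each other (hypothetical function names, not from the episode): a fixed eval set guards against regressions on known failure patterns, while a monitoring pass over recent production traffic flags behaviors the evals never anticipated.

```python
def run_evals(model, eval_cases) -> float:
    """Regression testing: known, anticipated error patterns with expected answers."""
    passed = sum(1 for case in eval_cases
                 if case["expected"] in model(case["prompt"]))
    return passed / len(eval_cases)

def monitor_production(recent_interactions, alert_threshold=0.15) -> bool:
    """Discovery: surface emerging failure modes from real user behavior."""
    bad = [i for i in recent_interactions
           if i.get("regenerated") or i.get("thumbs_down")]
    rate = len(bad) / max(len(recent_interactions), 1)
    return rate > alert_threshold  # True -> investigate, then fold into new evals

# Toy usage with a stand-in "model".
model = lambda prompt: "Refunds are processed within 5 business days."
evals = [{"prompt": "How long do refunds take?", "expected": "5 business days"}]
print("eval pass rate:", run_evals(model, evals))
print("needs review:", monitor_production([{"thumbs_down": True}, {}]))
```

In this framing, monitoring feeds the eval set: each newly discovered failure mode becomes a permanent regression test.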