Timestamps are as accurate as possible but may be slightly off. We encourage you to listen to the episode for full context.
In this episode of Leaders of Code, Eira May and Natalie Rotnov dive deep into the 2025 Stack Overflow Developer Survey findings, specifically focusing on what business and tech leaders need to know about their developer teams. The conversation reveals a fascinating paradox: while AI adoption continues to surge, developer trust in AI tools is actually declining. (03:40) The discussion covers three critical areas impacting enterprise decision-making: the growing skepticism around AI-generated code quality, the persistent challenge of tool sprawl (with most developers using 6-10 tools), and the rising importance of human validation and community-driven problem-solving.
Eira May serves as the B2B Editor at Stack Overflow, focusing on enterprise content and insights for business leaders. She hosts Leaders of Code, Stack Overflow's podcast series dedicated to exploring how tech leaders build great teams and products.
Natalie Rotnov is a Senior Product Marketing Manager for Stack Overflow's Enterprise Product Suite, specializing in data licensing and Stack Overflow for Teams. She brings deep expertise in helping enterprise companies leverage Stack Overflow's 60+ million Q&A pairs and knowledge-sharing model to improve developer productivity and AI application performance.
The survey revealed that advanced questions on Stack Overflow have doubled since 2023, indicating that AI isn't solving complex, context-dependent problems. (07:27) Natalie emphasizes that enterprises need dedicated spaces where developers can "curate and validate new problems and solutions" in a structured format with metadata and quality signals. This isn't just about having a wiki or a Slack channel; it's about creating systems that capture the nuanced discussions and problem-solving approaches that AI tools currently struggle with. Companies should prioritize platforms that allow developers to build consensus, share perspectives, and validate solutions with proper tagging and voting mechanisms.
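To make that structured format concrete, here is a minimal sketch of what a curated record could look like in Python. The class and field names are hypothetical, not Stack Overflow's actual schema; the point is that tags, votes, and human validation travel with the content itself.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ValidatedAnswer:
    """A curated Q&A record carrying metadata and quality signals.

    All field names here are illustrative assumptions.
    """
    question: str
    answer: str
    tags: list[str] = field(default_factory=list)  # metadata for search and retrieval
    votes: int = 0                                 # community consensus signal
    validated_by: str | None = None                # human reviewer, if any
    updated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

record = ValidatedAnswer(
    question="How do we point staging at the new database?",
    answer="Update the connection string in the shared vault; see the runbook.",
    tags=["database", "staging"],
    votes=7,
    validated_by="platform-team",
)
```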
With 36% of professional developers actively learning about Retrieval Augmented Generation (RAG), and "searching for answers" being the most widely adopted AI use case in development workflows, companies need to double down on RAG implementations. (11:12) The key is ensuring your RAG system retrieves and summarizes well-structured internal knowledge sources with helpful metadata to avoid hallucinations. This directly addresses the top developer frustration: AI solutions that are "almost right, but not quite." Successful RAG implementation requires curated, tagged, and validated internal content that provides the contextual richness AI tools need to generate accurate responses.
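As a rough illustration of that grounding step, here is a minimal retrieval sketch assuming the sentence-transformers library. The corpus, tags, and vote counts are invented for the example; a real deployment would use a proper vector store and your own curated content.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical corpus of curated, tagged, human-validated internal answers.
docs = [
    {"text": "Rotate API keys through the vault CLI, then restart the service.",
     "tags": ["security"], "votes": 14},
    {"text": "Staging deploys run from the release branch via the deploy bot.",
     "tags": ["ci-cd"], "votes": 9},
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode([d["text"] for d in docs], normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[dict]:
    """Return the k most similar documents by cosine similarity."""
    q_vec = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec  # cosine similarity, since vectors are unit-norm
    return [docs[i] for i in np.argsort(-scores)[:k]]

# Ground the prompt in retrieved content plus its metadata, so the model
# generates from vetted context instead of guessing.
question = "How do I rotate an API key?"
context = "\n".join(
    f"[tags={d['tags']} votes={d['votes']}] {d['text']}"
    for d in retrieve(question)
)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```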
While only 48% of developers are actively using AI agents, those who do report significant benefits: 70% say agents reduce time on specific tasks and 69% report increased productivity. (19:15) Rather than rushing into complex agentic workflows, Natalie recommends starting with "low risk agentic use cases first and rolling these out iteratively." Consider piloting with newer developers or interns on contained projects. This approach acknowledges that reasoning models powering agentic systems are still immature, while allowing organizations to capture value where agents can deliver immediate impact without compromising critical workflows.
Model Context Protocol (MCP) servers are having a significant moment, offering a standardized way for AI tools to learn implicit organizational knowledge: the language, culture, and ways of working unique to your company. (20:25) Natalie explains that MCP servers can help AI agents understand context from comments and discussions, which represents "a gold mine for information and context for LLMs." For example, Stack Overflow for Teams' MCP server provides read-write access and can be integrated with tools like Cursor or Gemini, immediately grounding AI outputs in vetted organizational truth. Companies should either build MCP servers in-house or evaluate existing options for their current tool stack.
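For teams taking the build-in-house route, a basic server is small with the official MCP Python SDK. A minimal sketch, assuming the `mcp` package; `search_team_answers` is a hypothetical tool backed by your own knowledge base, not an actual Stack Overflow for Teams endpoint:

```python
from mcp.server.fastmcp import FastMCP

# A hypothetical MCP server exposing internal Q&A as a tool agents can call.
mcp = FastMCP("internal-knowledge")

@mcp.tool()
def search_team_answers(query: str) -> str:
    """Search the team's validated Q&A and return the top matches."""
    # Placeholder: in practice this would query your internal knowledge base.
    return f"Top validated answers for: {query!r}"

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default, so clients like Cursor can attach
```

A client such as Cursor would then register this server in its MCP configuration, after which its agent can call the tool to ground answers in vetted internal content.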
With the explosion of agentic AI, small language models (SLMs) fine-tuned for specific domains are gaining popularity because they're more cost-effective, environmentally friendly, and often more accurate for specialized tasks. (25:05) This is particularly relevant for companies in regulated industries like healthcare or finance where tasks require deep domain expertise. Rather than relying solely on general-purpose large language models, organizations should consider pre-trained domain-specific SLMs or building their own using proprietary internal data augmented with relevant third-party datasets. The key is ensuring this training data is "well structured and vetted by humans" to maintain accuracy and reliability.
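As one way to picture that last step, here is a minimal fine-tuning sketch using the Hugging Face transformers Trainer. The base model, file name, and hyperparameters are placeholder assumptions; a real run would add evaluation and the human review of training data described above.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "microsoft/phi-2"  # placeholder small model; pick one suited to your domain
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token  # many causal LMs ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Assume internal_qa.jsonl holds human-vetted records like {"text": "Q: ... A: ..."}.
ds = load_dataset("json", data_files="internal_qa.jsonl")["train"]
ds = ds.map(lambda ex: tok(ex["text"], truncation=True, max_length=512),
            remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-slm",
                           per_device_train_batch_size=2,
                           num_train_epochs=1,
                           learning_rate=2e-5),
    train_dataset=ds,
    # mlm=False gives standard causal-LM labels copied from the inputs.
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
trainer.save_model("domain-slm")
```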