
Timestamps are as accurate as possible but may be slightly off. We encourage you to listen to the episode for full context.
In this episode, Jacob Lieberman, Director of Enterprise Product Management at NVIDIA, returns to discuss the evolution of AI agent adoption in enterprises and introduces the groundbreaking AI Data Platform. While AI agents have advanced significantly with open models now matching the power of earlier commercial models, enterprises still face major challenges moving from proof-of-concept to production deployment. (02:04) The core issue revolves around data accessibility, as enterprise systems weren't originally designed for AI agents, and the majority of enterprise data remains unstructured and difficult to process. (04:04) Lieberman unveils NVIDIA's AI Data Platform - a GPU-accelerated storage solution that revolutionizes how enterprises prepare data for AI by bringing compute to the data rather than moving data to compute, eliminating security risks and inefficiencies of traditional data pipelines. (09:13)
Jacob Lieberman serves as Director of Enterprise Product Management at NVIDIA, where he focuses on enterprise AI solutions and agent deployment strategies. He specializes in helping organizations transition AI initiatives from proof-of-concept stages to full production deployment, with particular expertise in data platform architecture and GPU-accelerated enterprise solutions.
Noah Kravitz hosts the NVIDIA AI Podcast, where he explores cutting-edge developments in artificial intelligence and their real-world applications. He brings a journalistic approach to technical topics, making complex AI concepts accessible to business leaders and technology professionals.
While consumer AI agent adoption has flourished, enterprises struggle to move beyond proof-of-concept deployments because their existing systems weren't built for AI agents. (03:23) The fundamental challenge lies in securing access to accurate, recent data, as all AI applications - whether training models, fine-tuning, or retrieval-augmented generation - depend entirely on this foundation. Enterprise data is predominantly unstructured (PowerPoint presentations, PDFs, audio, and video files), requiring complex transformation pipelines to become "AI-ready" through processes like text extraction, semantic chunking, metadata enrichment, embedding, and vector database indexing.
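As a rough illustration of that transformation pipeline, the sketch below walks a set of files through extraction, chunking, metadata enrichment, and embedding. The extract_text and embed helpers and the in-memory chunk list are hypothetical stand-ins for whatever extraction services, embedding models, and vector database an enterprise actually uses; none of this reflects a specific NVIDIA interface.

```python
# Minimal sketch of an "AI-ready" transformation pipeline for unstructured files.
# extract_text() and embed() are hypothetical placeholders; the returned list of
# chunks stands in for rows written to a real vector database.
from dataclasses import dataclass, field
from pathlib import Path
import hashlib

@dataclass
class Chunk:
    doc_id: str                  # which source document the chunk came from
    text: str                    # the chunk's extracted text
    metadata: dict               # enrichment: source path, position, etc.
    embedding: list[float] = field(default_factory=list)

def extract_text(path: Path) -> str:
    """Hypothetical extractor; real pipelines dispatch on file type (PDF, PPTX, audio...)."""
    return path.read_text(errors="ignore")

def chunk(text: str, size: int = 800) -> list[str]:
    """Naive fixed-size chunking; semantic chunking would split on document structure instead."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> list[float]:
    """Hypothetical embedding call; a real pipeline would invoke an embedding model."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]  # toy 8-dim vector, for illustration only

def index_documents(paths: list[Path]) -> list[Chunk]:
    chunks = []
    for path in paths:
        text = extract_text(path)
        for position, piece in enumerate(chunk(text)):
            chunks.append(Chunk(
                doc_id=str(path),
                text=piece,
                metadata={"source": str(path), "position": position},
                embedding=embed(piece),
            ))
    return chunks  # in practice these records are indexed in a vector database
```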
Enterprises face the challenge of "data velocity" - the combined rate at which new data is created and existing data changes. (06:22) This isn't a one-time transformation but requires continuous reprocessing to keep data accurate and relevant. Most enterprises lack governance systems to track which specific data has changed, forcing them to reindex entire datasets repeatedly - like rewashing all the dishes because you're unsure which ones are dirty. This creates massive inefficiency and drains data science teams, who spend up to 80% of their time on data wrangling rather than on actual analysis.
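One way to avoid rewashing every dish is to keep a fingerprint of each file that has already been indexed and reprocess only what actually changed. The sketch below assumes a simple JSON manifest and a hypothetical reindex() hook into the pipeline above; real governance systems track changes through storage metadata rather than ad hoc files.

```python
# Sketch of change tracking for incremental reindexing: record a content
# fingerprint per file and reprocess only files whose fingerprint has changed.
# The manifest format and reindex() hook are illustrative assumptions.
import hashlib
import json
from pathlib import Path

MANIFEST = Path("index_manifest.json")  # hypothetical record of what was indexed

def fingerprint(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(paths: list[Path]) -> list[Path]:
    """Return only the files whose contents differ from the last indexed version."""
    seen = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    return [p for p in paths if seen.get(str(p)) != fingerprint(p)]

def mark_indexed(paths: list[Path]) -> None:
    """Record the current fingerprints so the next cycle skips unchanged files."""
    seen = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    seen.update({str(p): fingerprint(p) for p in paths})
    MANIFEST.write_text(json.dumps(seen, indent=2))

# Usage per cycle (reindex() is a hypothetical call into the earlier pipeline):
# stale = changed_files(all_source_files)
# for path in stale:
#     reindex(path)
# mark_indexed(stale)
```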
Traditional AI data preparation requires copying data multiple times through processing pipelines, creating significant security risks and governance challenges. (08:38) Each copy increases the attack surface, and when data is moved away from source systems, it becomes disconnected from permission changes and content updates. If an employee loses access to a document, they can still access all the AI-processed copies scattered across systems. Lieberman notes that enterprises typically end up with 7-13 copies of the same dataset across their data centers, all disconnected from the authoritative source of truth.
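One common mitigation for the permissions problem is to re-check the authoritative access controls at retrieval time, so derived chunks and embeddings never outlive a revoked grant. The sketch below uses a toy in-memory ACL as an assumed stand-in for the storage system's real permission store; it is not a description of NVIDIA's design.

```python
# Sketch of filtering retrieval results against source-of-truth permissions,
# so AI-processed copies cannot be read by users who lost access to the source.
# The ACL table and document ids are hypothetical examples.
from dataclasses import dataclass

@dataclass
class RetrievedChunk:
    doc_id: str
    text: str
    score: float

# Hypothetical authoritative permissions, keyed by source document id.
ACL: dict[str, set[str]] = {
    "finance/q3_forecast.pptx": {"alice"},
    "eng/design_doc.pdf": {"alice", "bob"},
}

def authorized(user: str, doc_id: str) -> bool:
    return user in ACL.get(doc_id, set())

def filter_results(user: str, results: list[RetrievedChunk]) -> list[RetrievedChunk]:
    """Drop any hit whose source document the user can no longer read."""
    return [r for r in results if authorized(user, r.doc_id)]

hits = [
    RetrievedChunk("finance/q3_forecast.pptx", "Q3 revenue projection...", 0.91),
    RetrievedChunk("eng/design_doc.pdf", "Storage controller layout...", 0.84),
]
print([h.doc_id for h in filter_results("bob", hits)])  # only the doc bob can still read
```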
NVIDIA's AI Data Platform reference design revolutionizes enterprise data management by bringing GPUs directly into storage systems rather than sending data to external processing. (11:55) This approach leverages "data gravity" - the principle that large, growing datasets are expensive and difficult to move. By processing data where it lives, enterprises can perform continuous AI preparation as background operations while maintaining security and governance controls. The GPU handles the entire pipeline - data discovery, text extraction, chunking, embedding, vector indexing, and semantic search - without creating vulnerable copies or disconnecting from source permissions.
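To round out the pipeline picture, the sketch below shows the query side: ranking stored chunks by cosine similarity to a query embedding. A production system would run this against a GPU-resident vector index inside the storage platform; this brute-force, in-memory scan with hand-made toy vectors only illustrates the data flow.

```python
# Sketch of semantic search over previously indexed chunks via cosine similarity.
# The IndexedChunk records and toy vectors are illustrative assumptions.
import math
from typing import NamedTuple

class IndexedChunk(NamedTuple):
    text: str
    embedding: list[float]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def semantic_search(query_vec: list[float], index: list[IndexedChunk], top_k: int = 3):
    """Rank stored chunks by similarity to the query embedding."""
    return sorted(index, key=lambda c: cosine(query_vec, c.embedding), reverse=True)[:top_k]

# Toy usage: a real deployment embeds the query with the same model used for the
# documents and searches a vector index rather than scanning a Python list.
index = [IndexedChunk("backup policy", [0.9, 0.1]), IndexedChunk("lunch menu", [0.1, 0.9])]
print(semantic_search([0.8, 0.2], index)[0].text)
```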
Beyond data preparation, storage-resident GPUs have sufficient compute capacity to run AI agents directly within the storage infrastructure, enabling what Lieberman calls "letting AI agents work from home." (19:41) These agents can perform sophisticated tasks like identifying documents that should be classified but aren't marked as such, or monitoring storage system telemetry to provide optimization recommendations to administrators. This approach provides agents with a controlled, secure environment where they understand the APIs, capabilities, and operating system, similar to how human workers often prefer the controlled environment of working from home.
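As a concrete flavor of the classification example, the sketch below scans documents for content that looks sensitive but carries no classification label and surfaces them for review. The keyword heuristic and the sidecar-file label check are illustrative assumptions; an actual storage-resident agent would reason over content with an LLM and use the storage system's metadata APIs.

```python
# Sketch of a storage-resident "agent" task: flag documents that look sensitive
# but are not marked as classified. Heuristics and label check are hypothetical.
from pathlib import Path

SENSITIVE_HINTS = ("confidential", "internal only", "do not distribute")

def is_labeled(path: Path) -> bool:
    """Hypothetical metadata check; here a '.classified' sidecar file stands in for a label."""
    return path.with_suffix(path.suffix + ".classified").exists()

def flag_unlabeled_sensitive(root: Path) -> list[Path]:
    flagged = []
    for path in root.rglob("*.txt"):
        text = path.read_text(errors="ignore").lower()
        if any(hint in text for hint in SENSITIVE_HINTS) and not is_labeled(path):
            flagged.append(path)  # surface to an administrator for review
    return flagged
```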