Timestamps are as accurate as they can be but may be slightly off. We encourage you to listen to the full context.
In this episode of The Next Wave, Maria Gharib interviews Victor Riparbelli, CEO and co-founder of Synthesia, diving deep into the evolution of AI-generated avatars and their transformative impact on business communication. (03:54) The conversation explores how AI video technology is progressing from simple avatar creation to interactive, conversational experiences that can fundamentally change how we learn, train, and communicate. Victor shares insights on Synthesia's Express 2 technology, which introduces full-body gestures and enhanced realism, making AI avatars suitable for public-facing content. (07:45) The discussion covers the shift from text-based corporate communication to video-first approaches, ethical considerations around deepfake technology, and the future of AI video agents that can engage in real-time conversations and role-playing scenarios.
Victor Riparbelli is the CEO and co-founder of Synthesia, which he has led since 2017 in pioneering AI-generated avatar technology. He has a background in AI video research and has grown Synthesia into an industry leader in creating lifelike, customizable avatars that can communicate in over 140 languages. Victor has delivered a TED Talk on the future of video and has established Synthesia as a trusted platform used by major enterprises like UBS, Heineken, and Zoom for business communication and training.
Maria Gharib is a podcast host and AI enthusiast who conducted this interview from London. She speaks three languages and has extensive experience exploring AI technologies and their practical applications. During this episode, she underwent the process of creating her own AI avatar at Synthesia's facilities, giving her firsthand insight into the technology she's discussing.
Victor emphasized that the most successful Synthesia implementations focus on the "middle layer" of corporate communication: replacing PowerPoint presentations, wiki pages, and training documents with engaging video content. (17:52) This approach delivers immediate ROI because video has higher information retention rates than text and can be consumed asynchronously. Companies like UBS have deployed 200+ analyst avatars to communicate market insights to customers, transforming text-based communications into a more engaging video format. Practical example: instead of sending a lengthy email about new pricing policies, sales managers can create a 5-minute avatar video that explains the changes with visual aids and examples.
Synthesia operates under the principle of "utility over novelty," advising customers to avoid overly ambitious projects and instead focus on proven use cases. (19:24) Victor warns against the rookie mistake of trying to create big advertising campaigns with current AI avatar technology, which is not yet developed for that purpose. The most impactful applications are in communication, product marketing, customer support videos, and training content, where the technology delivers tangible ROI today.
The realism of AI avatars hinges primarily on natural body language and accurate voice reproduction, including accent preservation. (12:20) Victor explained that humans are extremely sensitive to even slight imperfections in digital representations, and earlier versions with strange body language created an "uncanny valley" effect. Synthesia has developed proprietary voice technology specifically trained to preserve accents because even minor accent changes can make users reject their avatars. (13:49)
The future of video lies in interactive, conversational experiences rather than passive consumption. Victor describes Synthesia's agent product that allows viewers to ask questions, practice scenarios, and receive personalized coaching. (05:10) For example, after watching a sales training video, employees can engage in a 15-minute role-play session where the AI agent acts as a difficult customer, then provides specific coaching feedback. This transforms video from a broadcast medium into a personalized learning experience.
Synthesia operates under the "three Cs" ethical framework: consent (not creating avatars without permission), control (strict content moderation), and collaboration (working with partners on responsible AI). (20:19) The company deliberately errs on the side of being overly restrictive, particularly around political content, news, and controversial topics, believing this builds long-term trust with enterprise customers. Victor acknowledges this may limit short-term growth but considers it essential for sustainable business success.