Latent Space: The AI Engineer Podcast • December 31, 2025

[State of Evals] LMArena's $100M Vision — Anastasios Angelopoulos, LMArena

LMArena co-founder Anastasios Angelopoulos discusses the company's $100M raise, platform growth to 250M+ conversations, leaderboard integrity, expansion into expert and multimodal arenas, and the vision of becoming the industry's North Star for evaluating AI model capabilities through organic user feedback.
AI & Machine Learning
Developer Culture
B2B SaaS Business
Anastasios Angelopoulos
Anjney Midha
Naina
Google
Hugging Face

Timestamps are as accurate as they can be but may be slightly off. We encourage you to listen to the full context.

Podcast Summary

In this episode, Anastasios Angelopoulos returns to Latent Space to recap Arena's 2025 journey, from building LMArena in a Berkeley basement to raising $100M and becoming the de facto leaderboard for frontier AI models. (01:36) The conversation covers Arena's origin story as an academic project incubated by Anjney Midha at a16z, the decision to spin out as a company to reach the necessary scale, and how the $100M raise is being spent, primarily on inference costs and on migrating from Gradio to React. (03:38) Anastasios addresses the controversial "Leaderboard Illusion" paper that critiqued Arena and explains how, in his account, Arena's response rebutted the paper's factual errors and misrepresentations. (10:18) The discussion also explores Arena's scale of 250M+ conversations, its expansion into occupational verticals and multimodal capabilities, and the viral "Nano Banana" moment that changed Google's market share overnight and validated the economic importance of multimodal AI models.

  • Main theme: Arena's evolution from academic research project to $100M company serving as the industry's North Star for AI model evaluation through millions of real-world user conversations and votes

Speakers

Anastasios Angelopoulos

Co-founder of Arena (formerly LMArena), originally part of the LMSYS research group at Berkeley. Angelopoulos helped build what became the industry's most trusted AI model leaderboard from a basement project into a company that raised $100M and serves tens of millions of monthly conversations. He has extensive experience in machine learning evaluation and statistical analysis, particularly around response bias correction and benchmark integrity.

Key Takeaways

Platform Integrity Must Come First

Anastasios emphasizes that Arena's public leaderboard operates as a "charity" and loss leader, maintaining strict independence from financial influence. (17:23) Models cannot pay to get on the leaderboard or to get off it, and scores reflect millions of real votes from actual users. This principle keeps the leaderboard an unbiased North Star for the industry rather than a pay-to-play system like traditional analyst firms. Grounding rankings in real-world user feedback instead of manufactured benchmarks builds lasting trust and credibility in an industry where evaluation standards are constantly questioned.
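
The summary does not say how votes are turned into scores. As a minimal sketch of the general idea, assuming each vote is a head-to-head preference between two models, an Elo-style update is one common way to aggregate pairwise outcomes into a ranking. The model names, the `k` factor, and the starting rating below are hypothetical, and this is not presented as Arena's actual scoring method.

```python
from collections import defaultdict


def elo_ratings(votes, k=32.0, base=1000.0):
    """Return Elo-style ratings from an ordered list of (winner, loser) votes.

    Illustrative only: k and base are arbitrary choices, not Arena's settings.
    """
    ratings = defaultdict(lambda: base)
    for winner, loser in votes:
        # Probability the current ratings assigned to the winner winning.
        expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        # An upset (low expected) moves the ratings more than a predictable win.
        delta = k * (1.0 - expected)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


# Hypothetical votes: each tuple is (preferred model, other model).
votes = [("model_a", "model_b"), ("model_a", "model_c"), ("model_b", "model_c")]
ratings = elo_ratings(votes)
for name, score in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.1f}")
```

Because every update adds to one rating exactly what it subtracts from the other, the only way to move a score in this sketch is through votes, which mirrors the point that placement cannot be bought.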

Scale Requires Company Structure, Not Just Academic Idealism

The decision to spin out from Berkeley's LMSYS group came from recognizing that only a company structure could provide the resources necessary to scale Arena's mission. (02:47) Academic projects and nonprofits lacked the funding and operational capabilities needed to handle tens of millions of monthly conversations and maintain platform quality. This pragmatic approach enabled Arena to secure $100M in funding primarily for inference costs and technical infrastructure, demonstrating that impact-driven missions sometimes require commercial structures to achieve their goals.

Organic User Data Beats Synthetic Benchmarks

Arena's competitive advantage lies in users inputting their own real-world use cases rather than evaluating on pre-generated scenarios. (07:07) This organic approach provides a level of realism that distinguishes Arena from competitors who rely on synthetic benchmarks or consultant-generated test cases. The platform captures authentic user intent and demonstrates how models perform on actual problems people are trying to solve, creating more actionable insights for model developers and users alike.

Consumer Retention Must Be Earned Daily

Despite having tens of millions of users, Anastasios recognizes that consumer loyalty is fragile and temporary. (21:10) Users can leave at any moment, making retention a daily challenge that requires constant value delivery. Simple features like sign-in and persistent conversation history became major retention drivers, showing how basic user experience improvements can have outsized impact. This mindset of earning users every single day prevents complacency and drives continuous improvement in a competitive consumer landscape.

Multimodal AI Has Massive Economic Potential

The viral "Nano Banana" moment demonstrated how multimodal capabilities, particularly image generation, drive billions in market value and fundamentally change competitive dynamics. (13:20) What initially seemed like a non-essential AI capability proved economically critical for marketing, design, and content creation use cases. This shift in perspective highlights how consumer-facing AI features can have enterprise implications and how seemingly frivolous capabilities often unlock massive business value when they meet real user needs.

Statistics & Facts

  1. Arena has processed over 250 million conversations in total, with tens of millions more each month, making it one of the largest consumer platforms for LLMs after ChatGPT. (04:47) This volume gives the model rankings a very large sample of votes to draw on and shows the platform's reach across the AI community.
  2. 25% of Arena's users work in software development, and about half of all users are now logged in, which gives Arena demographic insight into its user base. (05:04) This concentration of professional users means the platform captures feedback from technically sophisticated people who can meaningfully evaluate model capabilities.
  3. Arena has released more real-world AI conversation data than essentially any other organization, contributing millions of authentic user interactions to the research community for studying real-world AI usage patterns. (16:35) This open data approach supports broader AI research while maintaining their competitive advantage in evaluation methodology.
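
The link between scale and statistical confidence in the first item can be illustrated with a small calculation. This is a sketch under simplifying assumptions (independent votes, a normal approximation, and a true win rate of 50%), not a description of how Arena computes its intervals. The function name and the vote counts are hypothetical.

```python
import math


def win_rate_margin(n_votes, win_rate=0.5, z=1.96):
    """Half-width of a normal-approximation 95% interval for a win rate."""
    return z * math.sqrt(win_rate * (1.0 - win_rate) / n_votes)


# The margin shrinks with the square root of the number of votes.
for n in (100, 10_000, 1_000_000):
    print(f"{n:>9} votes: +/- {win_rate_margin(n):.4f}")
```

Each hundredfold increase in votes narrows the interval by a factor of ten, which is why small differences between closely matched models only become distinguishable at large vote counts.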
