Timestamps are as accurate as possible but may be slightly off. We encourage you to listen to the episode for full context.
In this episode, Nathan explores Polish AI sovereignty with Marek Kozlowski, head of the AI Lab at Poland's National Information Processing Institute. Marek discusses Project PLLuM (Polish Large Language Models), which aims to create smaller, localized AI models that compete with frontier models by focusing on Polish language, culture, and values. (03:17) The conversation delves into how 90% of training data for major models is English and Chinese, leaving Polish with only about 1% representation. (08:24) Marek explains their strategy of language adaptation on base models like LLaMA and Mistral, combined with organic human-curated instruction data, to achieve competitive performance for Polish use cases while maintaining transparency, sovereignty, and cost advantages.
• Main themes: AI sovereignty through localized models, regulatory challenges in the EU, the technical approach of language adaptation, and the strategic importance of maintaining national AI capabilities
Marek Kozlowski serves as head of the AI Lab at Poland's National Information Processing Institute, where he leads Project PLLuM (Polish Large Language Models). He spearheads Poland's national AI sovereignty initiative, focusing on developing transparent, locally-controlled AI models adapted for Polish language and culture. His work involves coordinating a consortium of six to eight institutes and universities funded by Poland's Ministry of Digital Affairs to create competitive alternatives to global AI models.
Rather than compete directly with frontier models on general capabilities, Poland focuses on building smaller, domain-specific models that excel in Polish language and cultural contexts. (05:11) Marek argues that for specific business and government use cases, a well-trained 8B-parameter model, fine-tuned for particular tasks, can match the performance of much larger cloud-based models. This approach offers better cost control, data privacy, and regulatory compliance while serving actual user needs more effectively than general-purpose models.
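The episode does not spell out a full fine-tuning recipe, but the claim that a well-trained 8B model can match larger cloud models on narrow tasks usually rests on task-specific fine-tuning. Below is a minimal sketch of one common way to do this, parameter-efficient LoRA fine-tuning with Hugging Face transformers and peft; the model id, dataset file, and hyperparameters are illustrative assumptions, not PLLuM's actual setup.

```python
# Minimal sketch: task-specific LoRA fine-tuning of an ~8B open-weight model.
# The model id, dataset file, and hyperparameters are illustrative assumptions,
# not the PLLuM project's published recipe.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "meta-llama/Meta-Llama-3-8B"                 # assumed base model
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token

model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Hypothetical JSONL file of {"prompt": ..., "response": ...} pairs for one
# narrow task, e.g. answering questions about Polish administrative procedures.
ds = load_dataset("json", data_files="polish_task.jsonl", split="train")

def to_features(example):
    text = example["prompt"] + "\n" + example["response"] + tok.eos_token
    return tok(text, truncation=True, max_length=1024)

ds = ds.map(to_features, remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="out-task-lora",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=2e-4,
        bf16=True,
        logging_steps=20,
    ),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
trainer.save_model("out-task-lora")                 # save the trained adapter
```

Because only the low-rank adapter weights are trained, this kind of run fits on far less hardware than full fine-tuning, which is part of why small specialized models are attractive for on-premise use.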
The PLLuM project uses "language adaptation": continued pre-training of base models like LLaMA on Polish text corpora. (59:40) This technique significantly improves Polish language understanding and cultural knowledge while largely preserving the model's existing capabilities in other languages and domains. Though some forgetting occurs, the models retain competency in English and other areas while gaining native-level fluency in Polish idioms, cultural references, and domain-specific knowledge.
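Mechanically, language adaptation is just causal language modeling continued on a new corpus. The sketch below shows the idea on raw Polish text; the model id, corpus path, and settings are illustrative assumptions rather than PLLuM's published recipe, and a real run would need multi-GPU sharding and a carefully mixed corpus.

```python
# Minimal sketch of "language adaptation": continued causal-LM pre-training of
# an open base model on raw Polish text. Model id, corpus path, and settings
# are illustrative assumptions, not PLLuM's published recipe.
import torch
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "mistralai/Mistral-7B-v0.1"                  # assumed base model
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16).cuda()

# Raw Polish corpus, one document per line (placeholder path).
corpus = load_dataset("text", data_files="polish_corpus.txt", split="train")
corpus = corpus.filter(lambda ex: len(ex["text"].strip()) > 0)

def tokenize(batch):
    out = tok(batch["text"], truncation=True, max_length=2048)
    out["labels"] = [ids.copy() for ids in out["input_ids"]]   # next-token objective
    return out

corpus = corpus.map(tokenize, batched=True, remove_columns=["text"])
corpus.set_format("torch")
loader = DataLoader(corpus, batch_size=1, shuffle=True)

# A low learning rate is one common way to add Polish fluency without wiping
# out the base model's existing skills (the "forgetting" trade-off above).
optim = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for step, batch in enumerate(loader):
    batch = {k: v.cuda() for k, v in batch.items()}
    loss = model(**batch).loss
    loss.backward()
    optim.step()
    optim.zero_grad()
    if step % 50 == 0:
        print(f"step {step}: loss {loss.item():.3f}")
```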
Unlike many AI projects that rely heavily on synthetic data generation, PLLuM emphasizes "organic" instruction and preference data created and validated by humans. (19:17) Marek's team employs hundreds of annotators to manually write instructions and preference pairs, believing this approach produces higher linguistic quality than synthetic alternatives. This human-centric data curation is resource-intensive but crucial for achieving native-level language generation and avoiding the degradation that comes from low-quality synthetic training data.
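To make the "organic" pipeline concrete, the sketch below shows what human-curated instruction and preference records might look like, with a simple gate that exports only annotator-reviewed examples. The field names and the example content are illustrative assumptions, not the PLLuM project's actual schema.

```python
# Sketch of human-curated ("organic") instruction and preference records, with
# a gate that exports only examples that passed a second-annotator review.
# Field names and example content are assumptions, not PLLuM's actual schema.
import json
from dataclasses import dataclass, asdict

@dataclass
class InstructionRecord:
    instruction: str      # task written by a human annotator, in Polish
    response: str         # reference answer written or edited by a human
    annotator_id: str
    reviewed: bool        # a second annotator confirmed linguistic quality

@dataclass
class PreferenceRecord:
    prompt: str
    chosen: str           # answer the annotator preferred
    rejected: str         # the dispreferred alternative
    annotator_id: str
    reviewed: bool

def export_validated(records, path):
    """Write only records that passed second-annotator review, as JSONL."""
    kept = [asdict(r) for r in records if r.reviewed]
    with open(path, "w", encoding="utf-8") as f:
        for row in kept:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(kept)

# Hypothetical usage.
examples = [
    InstructionRecord(
        instruction="Wyjaśnij, czym jest mała ojczyzna.",  # "Explain what a 'little homeland' is."
        response="Mała ojczyzna to miejsce, z którym człowiek czuje się emocjonalnie związany.",
        annotator_id="ann-017",
        reviewed=True,
    ),
]
print(export_validated(examples, "organic_sft.jsonl"), "records exported")
```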
Beyond simply having national AI models, true sovereignty demands transparency in training processes, data sources, and model architectures. (16:02) PLLuM publishes detailed "cookbook" documentation of its training process and open-sources samples of its datasets, rather than releasing model weights alone. This transparency enables other countries to replicate the approach and ensures that Poland maintains genuine control over its AI infrastructure rather than depending on black-box systems from global providers.
Most business and government applications need models that excel at 10-20 specific tasks rather than general-purpose capabilities across thousands of tasks. (44:15) For on-premise deployments where organizations face GPU and energy constraints, smaller fine-tuned models often outperform few-shot approaches with large cloud models. This insight suggests that the future of enterprise AI may favor specialized local models over massive general-purpose systems, especially in regulated industries and government applications.
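As a concrete illustration of the on-premise pattern, the sketch below loads a hypothetical locally fine-tuned checkpoint with 4-bit quantization so it fits on constrained GPUs, instead of sending data to a large cloud model with few-shot prompts. The checkpoint path, prompt, and quantization settings are assumptions, not PLLuM's deployment stack.

```python
# Minimal sketch: on-premise inference with a small fine-tuned model, loaded
# in 4-bit to fit constrained GPUs. The checkpoint path and prompt are
# placeholders, not an actual PLLuM artifact or deployment stack.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_dir = "./pllum-8b-task-finetuned"       # hypothetical local checkpoint
tok = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16),
    device_map="auto",
)

# Example task: summarize an administrative application in three bullet points.
prompt = "Streść poniższy wniosek urzędowy w trzech punktach:\n[treść wniosku]"
inputs = tok(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Keeping the prompt short and the model specialized, rather than packing few-shot examples into every request to a general-purpose system, is what makes this pattern attractive under GPU, energy, and data-residency constraints.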