In this episode, Dean speaks with Michał Oleszak, an ML engineering manager at Solera. Michał shares insights into how his team is using machine learning to transform the automotive claims process, from recognizing vehicle damage in images to estimating repair costs. The conversation covers the challenges of deploying ML pipelines in production, managing data quality for computer vision tasks, and balancing technical implementation with business needs. Michał also discusses his approach to model evaluation, the benefits of monorepo architecture, and his views on exciting developments in self-supervised learning for computer vision.

Join our Discord community: https://discord.gg/tEYvqxwhah

---

Timestamps:

00:00 Introduction

00:42 Production for Machine Learning at Solera

03:49 Transitioning from Images to Structured Data

04:58 Combining Deep Learning and Non-Deep Learning Models

05:15 Deployment Process for Machine Learning Models

08:01 Challenges and Solutions in Monorepo Adoption

12:57 Evaluating Model and Pipeline Versions

21:57 Tools for ML Projects: Monorepo, Pants, GitHub Actions

24:04 Data Management and Data Quality

30:14 Challenges in ML Efforts: Data Quality

30:37 Excitement about Self-Supervised Learning and JEPA Architectures

34:45 Controversial Opinion: Importance of Statistics for ML

36:40 Recommendations

Links:

🌎 Prisoners of Geography by Tim Marshall: https://www.amazon.com/Prisoners-Geography-Explain-Everything-Politics/dp/1501121472

➡️ Michał Oleszak on LinkedIn – https://www.linkedin.com/in/michal-oleszak/

➡️ Michał Oleszak on Twitter – https://x.com/MichalOleszak

🌐 Check Out Our Website! https://dagshub.com

Social Links:

➡️ LinkedIn: https://www.linkedin.com/company/dagshub

➡️ Twitter: https://twitter.com/TheRealDAGsHub

➡️ Dean Pleban: https://twitter.com/DeanPlbn

The podcast and its accompanying cover art on this page belong to Dean Pleban @ DagsHub. The podcast's content is created by Dean Pleban @ DagsHub and not by, or in collaboration with, Poddtoppen.