In this episode, I'm speaking with Julien Chaumond from 🤗 HuggingFace about how they got started, getting large language models into production with millisecond inference times, and the CERN for machine learning.

Join our Discord community: https://discord.gg/tEYvqxwhah

---

Timestamps: 

01:00 - Guest intro

02:14 - Origin of HuggingFace

05:37 - Why the focus on NLP?

07:45 - The success of the HuggingFace community

13:14 - Reproducing models and scaling for the community

18:14 - Enabling large models in production

23:14 - How HuggingFace scales so many models

27:34 - The biggest challenge HuggingFace solved in MLOps

32:02 - How HuggingFace transitions from research to production

34:44 - Using notebooks vs Python modules

38:27 - The most interesting topic in ML production

40:10 - Fascinating ML research

45:24 - Learning new things

51:14 - Something that is true but most people disagree with

56:54 - Tips to organize research teams

1:00:05 - New features for accelerated inference

1:01:35 - Most common use case of HuggingFace

1:04:17 - Integrating search algorithms into the transformers library

1:05:09 - Integrating vision models

1:06:06 - Long term business model

1:10:55 - Automation and simplification of the process of building models

1:13:02 - Support for real-time inference

1:14:40 - Recommendations for the audience

---

Relevant Links:

FastDS: https://github.com/DAGsHub/fds

BigScience: https://bigscience.huggingface.co

https://www.linkedin.com/company/dagshub/

https://www.linkedin.com/company/huggingface/

https://twitter.com/TheRealDAGsHub

https://twitter.com/huggingface

The podcast and the accompanying cover art on this page belong to Dean Pleban @ DagsHub. The content of the podcast is created by Dean Pleban @ DagsHub and not by, or together with, Poddtoppen.