This week on Data in Biotech, Ryan Mork, Director of Data Science at Evozyne, joins host Ross Katz to discuss how data science and machine learning are being used in protein engineering and drug discovery.
Ryan explains how Evozyne uses large language models (LLMs) and generative AI (GenAI) to design new biomolecules, training the models on huge volumes of protein and biology data. He walks through the organization’s evolution-based design approach and how it leverages the evolutionary history of protein families.
Ross and Ryan dig into the different models Evozyne uses, including latent variable models and embeddings. They also discuss some of the challenges around testing the functionality of models and the approaches that can be used for evaluation.
Alongside the deep dive into data and modeling topics, Ryan also discusses the importance of relationships between the wet lab and data science teams. He emphasizes the need for mutual understanding of each role to ensure the entire organization pulls together towards the same goals.
Finally, Ross asks Ryan to opine on the future of GenAI and LLMs for biotechnology and how this area will develop over the next five years. Ross also finds out more about the R&D roadmap at Evozyne and its plans to play a part in moving GenAI for protein engineering forward.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:24] Introduction to Ryan, his career to date, and the focus of Evozyne.
[2:59] How the Evozyne data science team operates and the data sources it utilizes.
[4:22] Building models to develop synthetic proteins for therapeutic uses.
[9:10] Deciding which proteins to take into the lab for experimental validation.
[10:49] Taking an evolution-based design approach to protein engineering.
[14:34] Using latent variable models and embeddings to capture evolutionary relationships.
[18:01] Evaluating the functionality of generative models and the role of auxiliary models.
[24:24] The value of tight coupling and mutual understanding between wet lab and data science teams.
[28:07] Evozyne’s approach to developing and testing new data science tools, models, and technologies.
[31:35] Predictions for future developments in Generative AI for biotechnology.
[33:41] Evozyne’s goal to increase throughput and its planned approach.
[39:09] Where to connect with Ryan and keep up to date with news from Evozyne.