Part 2 of this series could easily have been renamed “AI for science: The expert’s guide to practical machine learning.” We continue our discussion with Christoph Molnar and Timo Freiesleben to look at how scientists can apply the supervised machine learning techniques from the previous episode in their research.
Show Notes
Introduction to supervised ML for science (0:00)
The model as the expert? (1:00)
- Evaluation metrics have profound downstream effects on all modeling decisions
- Data augmentation offers a simple yet powerful way to incorporate domain knowledge
- Domain expertise is often undervalued in data science despite being crucial
Measuring causality: Metrics and blind spots (10:10)
- Causality approaches in ML range from exploring associations to inferring treatment effects
Connecting models to scientific understanding (18:00)
- Interpretation methods must stay within realistic data distributions to yield meaningful insights
Robustness across distribution shifts (26:40)
- Robustness requires understanding which distribution shifts affect your model
- Pre-trained models and transfer learning provide promising paths to more robust scientific ML
Reproducibility challenges in ML and science (35:00)
- Reproducibility challenges differ between traditional science and machine learning
Do you have a question about supervised machine learning?
Go back to listen to part one of this series for the conceptual foundations that support these practical applications.
Check out Christoph and Timo's book, “Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box,” available online now.
Connect with The AI Fundamentalists to comment on your favorite topics:
- LinkedIn - Episode summaries, shares of cited articles, and more.
- YouTube - Was it something that we said? Good. Share your favorite quotes.
- Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.