Intro to Tidymodels

tidymodels
positron
Presentation materials from a recent talk I gave at Loyola Marymount University
Published: March 24, 2025

About Tidymodels

The tidymodels framework is a collection of R packages that brings tidy principles and a unified syntax to machine learning (“ML”), making ML workflows reproducible from end to end. I’ve been using this framework for five years and it continues to improve. Posit PBC funds a software engineering team dedicated to its development, so the packages are feature-rich, regularly maintained, and current with ML trends. For Python users unfamiliar with R tools, tidymodels fills much the same role that scikit-learn does in Python.

The core tidymodels packages include the following:

  • rsample: provides infrastructure for efficient data splitting and resampling

  • parsnip: a tidy, unified interface to models that can be used to try a range of models without getting bogged down in the syntactical minutiae of the underlying packages

  • recipes: a tidy interface to data pre-processing tools for feature engineering

  • workflows: bundles pre-processing, modeling, and post-processing into a single object, so the whole pipeline (not just the model) can be fitted and evaluated as one unit

  • dials: creates and manages tuning parameters and parameter grids

  • tune: helps you optimize the hyperparameters of your model and pre-processing steps

  • yardstick: measures the effectiveness of models using performance metrics

  • broom: converts the information in common statistical R objects into user-friendly, predictable formats
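To show how these packages fit together, here is a minimal sketch of a logistic regression workflow. It is not the example from the talk: the `two_class_dat` dataset (from the modeldata package, which is attached along with tidymodels), the 80/20 split, the seed, and the choice of normalizing the predictors are all assumptions made for illustration.

```r
library(tidymodels)

set.seed(123)  # arbitrary seed, only so the split is reproducible
data(two_class_dat, package = "modeldata")  # illustrative dataset: predictors A, B; outcome Class

# rsample: stratified train/test split
split <- initial_split(two_class_dat, prop = 0.8, strata = Class)
train <- training(split)
test  <- testing(split)

# recipes: pre-processing steps, estimated from the training data only
rec <- recipe(Class ~ A + B, data = train) |>
  step_normalize(all_numeric_predictors())

# parsnip: model specification, independent of the fitting engine
spec <- logistic_reg() |>
  set_engine("glm")

# workflows: bundle the recipe and the model into one object
wf <- workflow() |>
  add_recipe(rec) |>
  add_model(spec)

# Fit the whole pipeline on the training set
fit_wf <- fit(wf, data = train)

# broom: coefficients as a tidy tibble
tidy(fit_wf)

# yardstick: performance on the held-out test set
preds <- augment(fit_wf, new_data = test)
accuracy(preds, truth = Class, estimate = .pred_class)
roc_auc(preds, truth = Class, .pred_Class1)

# rsample + tune: 5-fold cross-validated estimate of the same metrics
folds <- vfold_cv(train, v = 5, strata = Class)
cv_results <- fit_resamples(wf, resamples = folds)
collect_metrics(cv_results)
```

Because the recipe lives inside the workflow, the normalization statistics are re-estimated within each resample, which helps avoid leaking information from the assessment folds. A plain `glm` fit has no hyperparameters to tune, so dials and `tune_grid()` do not appear here; they would come into play with a penalized engine such as glmnet.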

I was thrilled to present on tidymodels last week to the Department of Mathematics, Statistics and Data Science at Loyola Marymount University. The students and faculty were engaged, and I had a great time walking through a logistic regression problem with this framework.

Other Tools Explored

  • Positron: A new, free, source-available coding environment from Posit, purpose-built for data analysis and modeling, that combines many of the best features of VS Code and RStudio.

Embedded Presentation