R Introduction to Machine Learning with tidymodels: Parts 1-2

November 16, 2021, 1:00pm to November 18, 2021, 4:00pm

Please note: Everyone is placed on the waitlist at first. It may take up to 24 hours to confirm your UCB affiliation, after which you will receive a confirmation email and calendar invite. To finish your registration, fill out this form: https://dlab.berkeley.edu/affiliations

Location: Remote via Zoom. Link will be sent on the morning of the event.

Date & Time: This workshop is a 2-part series that runs from 1pm to 4pm on the following days:

  • Tuesday, November 16
  • Thursday, November 18

Start Time: D-Lab workshops start 10 minutes after the scheduled start time (“Berkeley Time”). We will admit all participants from the waiting room at that time.


Machine learning often evokes images of Skynet, self-driving cars, and computerized homes. However, these ideas are less science fiction than they are tangible phenomena predicated on description, classification, prediction, and pattern recognition in data.

During this two-part workshop, we will discuss basic features of supervised machine learning algorithms, including k-nearest neighbors, linear regression, decision trees, random forests, boosting, and ensembling, using the tidymodels framework.

To social scientists, such methods might be critical for investigating evolutionary relationships, global health patterns, voter turnout in local elections, or individual psychological diagnoses.

  • Background on machine learning

    • Classification vs regression

    • Performance metrics

  • Data preprocessing

    • Missing data

    • Train/test splits

  • Algorithm walkthroughs

    • Lasso

    • Decision trees

    • Random forests

    • Gradient boosted machines

    • SuperLearner ensembling

    • Principal component analysis

    • Hierarchical agglomerative clustering

  • Challenge questions

Prerequisites: D-Lab’s R Fundamentals or equivalent knowledge. Previous experience with base R and basic familiarity with the tidyverse are assumed.

Workshop Materials: https://github.com/dlab-berkeley/Machine-Learning-with-tidymodels

Software Requirements: Installation Instructions for getting started with this workshop using R and RStudio.

Feedback: After completing the workshop, please provide us with feedback using this form.

Questions? Email: dlab-frontdesk@berkeley.edu