Machine Learning

Need help with Machine Learning?

Visit Drop-in Hours or Schedule a Consultation.

Below are the consultants available for Machine Learning, along with their other areas of expertise.

Lance Santana

Consulting Drop-In Hours: By appointment only

Consulting Areas: APIs, ArcGIS Desktop - Online or Pro, Bayesian Methods, Cluster Analysis, Data Visualization, Databases and SQL, Excel, Git or GitHub, Java, Machine Learning, Means Tests, Natural Language Processing (NLP), Python, Qualtrics, R, Regression Analysis, Research Planning, RStudio, Software Output Interpretation, SQL, Survey Design, Survey Sampling, Tableau, Text Analysis

Quick-tip: the fastest way to speak to a consultant is to first ...

Alyssa Heinze

Consulting Drop-In Hours: By appointment only

Consulting Areas: Causal Inference, Data Visualization, Experimental Design, Focus Groups and Interviews, Git or GitHub, LaTeX, Machine Learning, Meta-Analysis, Mixed Methods, Qualitative Methods, Qualtrics, R, Regression Analysis, Research Design, RStudio, Stata, Survey Design, Text Analysis

Carl Illustrisimo

Consulting Drop-In Hours: By appointment only

Consulting Areas: Bash or Command Line, Cluster Analysis, Data Sources, Data Visualization, Digital Humanities, Excel, Git or GitHub, JavaScript, LaTeX, Machine Learning, Natural Language Processing (NLP), Python, Regression Analysis, RStudio, SQL, Text Analysis

Aidan Lee

Consulting Drop-In Hours: By appointment only

Consulting Areas: ArcGIS Desktop - Online or Pro, Bayesian Methods, Causal Inference, Cluster Analysis, Data Sources, Data Visualization, Databases and SQL, Digital Health, Excel, Experimental Design, Geospatial Data: Maps and Spatial Analysis, Git or GitHub, LaTeX, Machine Learning, Means Tests, Mixed Methods, Natural Language Processing (NLP), OCR, Python, Qualtrics, R, Regression Analysis, Research Design, Research Planning, RStudio, RStudio Cloud, SAS, Software Output Interpretation, SPSS, SQL,...

Jonathan Pedroza (JP)

Postdoctoral Scholar
Berkeley School of Education

JP is a postdoctoral scholar in Data Science Education. He received his PhD in Prevention Science from the University of Oregon. His research examines risk and protective factors of health disparities in Latina/o/x/e populations and investigates educational outcomes in underrepresented student populations. JP uses a social-ecological framework to address these interests.

Previously, he served as an adjunct lecturer at Cal Poly Pomona, teaching research methods and statistics, and as a data scientist at the University of Kansas' Accessible Teaching...

In Silico Approach to Mining Viral Sequences from Bulk RNA-Seq Data

October 28, 2025
by Carly Karrick. Viruses play important roles in evolution and influence ecosystems and host health. However, isolating and studying them can be difficult. In lieu of using resource-intensive methods to concentrate viruses into a “virome,” bulk sequencing methods include data from all biological entities present in a sample. In this tutorial, we explore an approach to mine viral sequences from publicly available bulk RNA-Seq data. The output from this analysis paves the way for future statistical analyses comparing viral communities in different contexts. This approach can be applied to other datasets, including studies of human health.

Python Introduction to Machine Learning: Parts 1-2

October 21, 2021, 1:00pm
This workshop introduces students to scikit-learn, a popular machine learning library for Python, and to TPOT, an auto-ML library built on top of scikit-learn. The focus will be on scikit-learn syntax and the tools available for applying machine learning algorithms to datasets. No theory instruction will be provided.

Python Introduction to Machine Learning: Parts 1-2

September 27, 2021, 2:00pm
This workshop introduces students to scikit-learn, a popular machine learning library for Python, and to TPOT, an auto-ML library built on top of scikit-learn. The focus will be on scikit-learn syntax and the tools available for applying machine learning algorithms to datasets. No theory instruction will be provided.

Forecasting Social Outcomes with Deep Neural Networks

October 7, 2025
by Paige Park. Our capacity to accurately predict social outcomes is increasing. Deep neural networks and artificial intelligence are crucial technologies pushing this progress along. As these tools reshape how social prediction is done, social scientists should feel comfortable engaging with them and meaningfully contributing to the conversation. But many social scientists are still unfamiliar with and sometimes even skeptical of deep learning. This tutorial is designed to help close that knowledge gap. We’ll walk step-by-step through training a simple neural network for a social prediction task: forecasting population-level mortality rates.

Maksymilian Jasiak

Data Science & AI Fellow 2025-2026
Civil and Environmental Engineering

Maksymilian Jasiak is a PhD Student in GeoSystems Engineering at the University of California, Berkeley. His research focuses on Distributed Fiber Optic Sensing (DFOS) for lifeline infrastructure monitoring. His work aims to advance critical infrastructure security and resilience. He holds a MS in GeoSystems Engineering from the University of California, Berkeley and a BS in Civil Engineering from the University of Illinois Urbana-Champaign.