Statistics

Michael Sholinbeck

Public Health Librarian
Bioscience, Natural Resources & Public Health Library

Michael has worked at the UC Berkeley Library since 2001 and is currently the Public Health Librarian and Liaison to the School of Optometry at the Bioscience, Natural Resources & Public Health Library. Michael coordinates public health instruction at the library and is responsible for the public health collection. Michael has an MLIS from San Jose State University, an MS in Geography from Oregon State University, and a BA in Geography from UC Berkeley. When not at work, he lives out his fantasy of being a rock and roll drummer.

Scarlet Sands-Bliss

Data Science & AI Fellow 2025-2026, Domain Consultant, Research IT
School of Public Health

Scarlet Bliss is an MS/PhD student in Epidemiology in the School of Public Health. Her work focuses on mixed methods approaches to characterizing and preventing the spread of antimicrobial resistance and other enteric pathogens via the environment. She has experience in statistical analysis and public health bioinformatics. She is interested in the ethical use of big data as it relates to epidemiologic research.

Armaan Hiranandani

Data Science & AI Fellow 2025-2026
School of Information

Armaan Hiranandani is a Master’s student in Data Science at UC Berkeley, where he also earned his B.S. in Industrial Engineering & Operations Research. Born and raised in Dubai, Armaan recently completed a software engineering internship at Netflix, working on the machine learning platform team. His interests include building scalable AI systems and applying data science to solve real-world problems.

Finley Golightly

IT Support & Helpdesk Supervisor
Applied Mathematics

Finley has been with D-Lab since Fall 2020, first as part of the UTech Management team and then as full-time staff since Fall 2023. They love the learning environment of D-Lab, and their favorite part of the job is their co-workers! In their free time, they enjoy reading, boxing, listening to music, and playing Dungeons & Dragons. Feel free to stop by the front desk to ask them any questions or just to chat!

A Practical Guide to Shift-Share Instruments (and What I Learned Replicating the China Shock)

November 26, 2025
by Jiayu Lai. Shift-share instruments are among the most widely used tools in applied economics, appearing in labor, trade, immigration, and policy evaluation research. But despite their popularity, many researchers still use them as black boxes — and risk invalid instruments as a result. In this blog post, I unpack how shift-share IVs actually work, why their validity depends on both the “shifts” and the “shares,” and what practical steps researchers should take to check assumptions. I also walk through how I used the Borusyak–Hull–Jaravel (2022, 2025) framework to reproduce the seminal Autor, Dorn, and Hanson (2013) China shock analysis.

Jonathan Pedroza (JP)

Postdoctoral Scholar
Berkeley School of Education

JP is a postdoctoral scholar in Data Science Education. He received his PhD in Prevention Science from the University of Oregon. His research interests include examining risk and protective factors of health disparities in Latina/o/x/e populations and investigating educational outcomes in underrepresented student populations. JP uses a social-ecological framework to address his research interests.

Previously, he has served as an adjunct lecturer at Cal Poly Pomona teaching research methods and statistics, as well as a data scientist at the University of Kansas' Accessible Teaching...

John Louis-Strakes Lopez

Postdoctoral Scholar
Berkeley School of Education

John Louis-Strakes Lopez is a Data Science Education postdoctoral scholar. He recently received his PhD in Education from the University of California, Irvine. John’s work looks at student epistemological development within data science contexts. He is also interested in designing and studying artificial intelligence and playful learning technologies for learning. John serves as a co-chair for the International Learning Sciences Student Association.

Beyond work, you will find John reading at a local coffee shop or eating a warm bowl of pho.

In Silico Approach to Mining Viral Sequences from Bulk RNA-Seq Data

October 28, 2025
by Carly Karrick. Viruses play important roles in evolution and influence ecosystems and host health. However, isolating and studying them can be difficult. In lieu of using resource-intensive methods to concentrate viruses into a “virome,” bulk sequencing methods include data from all biological entities present in a sample. In this tutorial, we explore an approach to mine viral sequences from publicly available bulk RNA-Seq data. The output from this analysis paves the way for future statistical analyses comparing viral communities in different contexts. This approach can be applied to other datasets, including studies of human health.

A brief primer on Hidden Markov Models

April 25, 2022
by Amy Van Scoyoc. For many data science problems, there is a need to estimate unknown information from a sequence of observed events. There are many ways to tackle these types of sequential input problems. In the data science world, there is a tendency to use machine learning approaches to search for relations in the dataset. But in many cases, we don’t have enough data, or the sequences are too long, to train recurrent neural networks (RNNs) effectively. In such cases, simpler is better. Enter the Hidden Markov Model.

Forecasting Social Outcomes with Deep Neural Networks

October 7, 2025
by Paige Park. Our capacity to accurately predict social outcomes is increasing. Deep neural networks and artificial intelligence are crucial technologies pushing this progress along. As these tools reshape how social prediction is done, social scientists should feel comfortable engaging with them and meaningfully contributing to the conversation. But many social scientists are still unfamiliar with and sometimes even skeptical of deep learning. This tutorial is designed to help close that knowledge gap. We’ll walk step-by-step through training a simple neural network for a social prediction task: forecasting population-level mortality rates.