Research Design

Seeing Behavior in Everyday Data

December 10, 2025
by Skyler Chen. This post discusses how my training in data science changed the way I think about behavioral research. I share how simply exploring everyday datasets and noticing small, unexpected patterns can spark new research questions, and how archival data and experiments each offer distinct yet complementary insights into how people make judgments and decisions. I also highlight the growing set of tools that help us understand behavior in richer ways.

Digitization of Historical Maps in the Age of AI

December 3, 2025
by Elena Stacy. Researchers today increasingly have access to a wealth of tools to streamline or automate labor-intensive data processing and generation tasks. When it comes to mapping, progress has been slower. This blog details the author's experience tackling the digitization of a historical map in the age of AI.

A Practical Guide to Shift-Share Instruments (and What I Learned Replicating the China Shock)

November 26, 2025
by Jiayu Lai. Shift-share instruments are among the most widely used tools in applied economics, appearing in labor, trade, immigration, and policy evaluation research. But despite their popularity, many researchers still use them as black boxes — and risk invalid instruments as a result. In this blog post, I unpack how shift-share IVs actually work, why their validity depends on both the “shifts” and the “shares,” and what practical steps researchers should take to check assumptions. I also walk through how I used the Borusyak–Hull–Jaravel (2022, 2025) framework to reproduce the seminal Autor, Dorn, and Hanson (2013) China shock analysis.

Beyond the Hype: How We Built AI Tools That Actually Support Learning

November 12, 2025
by Weiying Li. What does genuine partnership look like when building AI for education? Working with middle school teachers and computer scientists, we co-designed AI dialogs in which teachers help refine what the AI recognizes as valuable thinking. Through iterative refinement, teachers identified precursor ideas and observations that predicted future learning, and refined the design of guidance in the dialog. Our AI dialog sees learning the way teachers do, built through genuine collaboration in which model development, learning sciences theories, and teachers' classroom expertise work together from the start, not just at the end.

Umesh Singla

Consulting Drop-In Hours: By appointment only

Consulting Areas: Bash or Command Line, Bayesian Methods, Causal Inference, Data Visualization, Deep Learning, Diversity in Data, Git or GitHub, Hierarchical Models, High Dimensional Statistics, Machine Learning, Nonparametric Methods, Python, Qualitative Methods, Regression Analysis, Research Design

Quick-tip: the fastest way to speak to a consultant is to first ...

Alyssa Heinze

Consulting Drop-In Hours: By appointment only

Consulting Areas: Causal Inference, Data Visualization, Experimental Design, Focus Groups and Interviews, Git or GitHub, LaTeX, Machine Learning, Meta-Analysis, Mixed Methods, Qualitative Methods, Qualtrics, R, Regression Analysis, Research Design, RStudio, STATA, Survey Design, Text Analysis

Aidan Lee

Consulting Drop-In Hours: By appointment only

Consulting Areas: ArcGIS Desktop - Online or Pro, Bayesian Methods, Causal Inference, Cluster Analysis, Data Sources, Data Visualization, Databases and SQL, Digital Health, Excel, Experimental Design, Geospatial Data: Maps and Spatial Analysis, Git or GitHub, LaTeX, Machine Learning, Means Tests, Mixed Methods, Natural Language Processing (NLP), OCR, Python, Qualtrics, R, Regression Analysis, Research Design, Research Planning, RStudio, RStudio Cloud, SAS, Software Output Interpretation, SPSS, SQL,...

John Louis-Strakes Lopez

Postdoctoral Scholar
Berkeley School of Education

John Louis-Strakes Lopez is a Data Science Education postdoctoral scholar. He recently received his PhD in Education from the University of California, Irvine. John’s work looks at student epistemological development within data science contexts. He is also interested in designing and studying artificial intelligence and playful technologies for learning. John serves as a co-chair for the International Learning Sciences Student Association.

Beyond work, you will find John reading at a local coffee shop or eating a warm bowl of pho.

Jonathan Pedroza (JP)

Postdoctoral Scholar
Berkeley School of Education

JP is a postdoctoral scholar in Data Science Education. He received his PhD in Prevention Science from the University of Oregon. His research interests include examining risk and protective factors of health disparities in Latina/o/x/e populations and investigating educational outcomes in underrepresented student populations. JP uses a social-ecological framework to address his research interests.

Previously, he has served as an adjunct lecturer at Cal Poly Pomona teaching research methods and statistics, as well as a data scientist at the University of Kansas' Accessible Teaching...

In Silico Approach to Mining Viral Sequences from Bulk RNA-Seq Data

October 28, 2025
by Carly Karrick. Viruses play important roles in evolution and influence ecosystems and host health. However, isolating and studying them can be difficult. Instead of relying on resource-intensive methods to concentrate viruses into a “virome,” researchers can turn to bulk sequencing, which captures data from all biological entities present in a sample. In this tutorial, we explore an approach to mine viral sequences from publicly available bulk RNA-Seq data. The output from this analysis paves the way for future statistical analyses comparing viral communities in different contexts. This approach can be applied to other datasets, including studies of human health.