Data Science

The Importance of Design Plans for Data Science

April 20, 2021

Since becoming a Data Fellow at the D-Lab, I have had the opportunity to assist many talented social scientists through the D-Lab’s Consulting service. A regular consulting request is to help with the research design for a new project. These requests are understandable: for empirical researchers, the quality of the research design can make or break a project. In this post, I suggest a few benefits of writing a skeleton design plan before writing any code whatsoever.

One of the exciting aspects...

Handling Missing Data

May 4, 2021

I recently started working with a set of eviction data for a project on housing precarity at the Urban Displacement Project. As I began exploring the dataset, I was excited to find that it appeared to contain a wealth of historical data we could use to train a robust model for predicting eviction rates in urban neighborhoods. However, my excitement faded when a standard check for missing data revealed that many of the observations lacked values for precisely the variable we aimed to predict. I was now faced with the problem of what to do about this...
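
The excerpt does not show how the missing-data check was run. As a minimal sketch of what such a check might look like, the Python below computes the share of missing values per column over a few made-up records. The column names (`tract`, `median_rent`, `eviction_rate`), the helper functions, and the data are all hypothetical and are not taken from the Urban Displacement Project's dataset.

```python
import math
from typing import Any, Iterable, Mapping


def is_missing(value: Any) -> bool:
    """Treat None, NaN, and blank strings as missing (an assumed convention)."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    return False


def missing_share(rows: Iterable[Mapping[str, Any]], column: str) -> float:
    """Return the fraction of rows whose value in `column` is missing."""
    rows = list(rows)
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if is_missing(row.get(column)))
    return missing / len(rows)


# Hypothetical toy records, for illustration only.
records = [
    {"tract": "A", "median_rent": 1850.0, "eviction_rate": 2.1},
    {"tract": "B", "median_rent": 2100.0, "eviction_rate": None},
    {"tract": "C", "median_rent": None, "eviction_rate": None},
    {"tract": "D", "median_rent": 1700.0, "eviction_rate": 3.4},
]

# Share of missing values for every column, including the prediction target.
shares = {column: missing_share(records, column) for column in records[0]}

for column, share in shares.items():
    print(f"{column}: {share:.0%} missing")
```

Running a check like this on the outcome column before any modeling is what surfaces the kind of problem described above: a target variable with a large missing share limits how many observations can be used for training.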