RStudio

Christian Caballero

Data Science Fellow 2024-2025
Political Science

Christian Caballero is a Political Science PhD student at the University of California, Berkeley. His research focuses on American politics and political behavior. In particular, he studies the ways in which social networks influence processes of political persuasion and democratic deliberation, as well as how political ideologies develop within subcultures.

He holds a B.A. in Politics and Sociology from New York University and an M.A. in Political Science from the University of California, Berkeley.

Sahiba Chopra

Data Science Fellow 2024-2025
Haas

I'm a PhD student in the Management and Organizations (Macro) group at Berkeley Haas. I have a diverse professional background, primarily as a data scientist across numerous industries, including fintech, cleantech, and media. I hold a BA in Economics from the University of Maryland, an MS in Applied Economics from the University of San Francisco, and an MS in Business Administration from UC Berkeley.

My research focuses on the intersection of inequality, technology, and the labor market. I am particularly interested in understanding how to reduce inequality in...

Mingyu Yuan

Data Science for Social Justice Senior Fellow 2024
Linguistics

I am a Ph.D. candidate in Linguistics, with a focus on phonetics and phonology, specifically speech production in neuro-atypical populations. I use methods from Natural Language Processing in my day-to-day research.

Violet Davis

Data Science for Social Justice Senior Fellow 2024
MIDS

I am a Master's student studying Data Science with the School of Information. My research involves computational social science projects focused on social justice and equity.

Skyler Yumeng Chen

Data Science for Social Justice Fellow 2024
Haas School of Business

Skyler is a Ph.D. student in Behavioral Marketing at the Haas School of Business. Her research centers on consumer behavior and judgment and decision-making, with a keen interest in both experimental methods and data science techniques. She holds a B.A. in Economics and a B.S. in Data Science from New York University Shanghai.

Tracy Burnett

Data Science for Social Justice Fellow 2024
Department of Environmental Science, Policy, and Management

Tracy uses qualitative methods founded in complexity theory and hierarchy theory to model the interlinked scales of coupled social-ecological systems. She conducted the majority of her research among nomads in Amdo, Tibet. She works to develop both theoretical and technological tools that support linguistic diversity and cultural resilience.

Propensity Score Matching for Causal Inference: Creating Data Visualizations to Assess Covariate Balance in R

June 10, 2024
by Sharon Green. Although some people consider randomized experiments the gold standard, in many cases it would be highly unethical to assign individuals to harmful exposures to measure their effects. Modern causal inference techniques help scientists estimate treatment effects using observational data. In particular, propensity score matching does this by matching individuals so that the “treatment” and “control” groups are balanced on measured covariates. After implementing propensity score matching, data visualizations make it easier to assess the quality of the matches before estimating effects. This blog post is a tutorial for implementing propensity score matching and creating data visualizations to assess covariate balance, that is, to check visually whether the matched individuals are balanced with respect to measured covariates.

Introduction to Propensity Score Matching with MatchIt

April 1, 2024
by Alex Ramiller. When working with observational (i.e., non-experimental) data, it is often challenging to establish the existence of causal relationships between interventions and outcomes. Propensity Score Matching (PSM) provides a powerful tool for causal inference with observational data, enabling the creation of comparable groups that allow us to directly measure the impact of an intervention. This blog post introduces MatchIt – a software package that provides all of the necessary tools for conducting Propensity Score Matching in R – and provides step-by-step instructions on how to conduct and evaluate matches.

What Are Vowels Made Of? Graphing a Classic Dataset with R

February 13, 2024
by Anna Björklund. Vowels are all around us. Mainstream US English has around twelve unique vowels. How can our brains tell these sounds apart? This blog post will help you answer this question by plotting vowel data from a classic American English dataset by Peterson and Barney (1952).

Reine Ngnonsse

IUSE Undergraduate Advisory Board
Genetics and Plant Biology

Reine Ngnonsse, an enthusiast for math and technology, delved into tutoring math at a community college through the EOPS program. At UC Berkeley, while pursuing Genetics and Plant Biology, she explored R programming in a CRISPR project. As an intern at Health Career Connection, Reine expanded her coding skills in Python, R, and Tableau, igniting a passion for programming. With exposure to Python and JavaScript, she can't wait to merge mathematical prowess with coding finesse for innovative solutions.