Data Science

R Fundamentals: Parts 1-4

May 1, 2023, 10:00am
This workshop is a four-part introductory series that will teach you R from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the open-source RStudio software, understand data and basic manipulations, import and subset data, explore and visualize data, and understand the basics of automation in the form of loops and functions. After completing this workshop, you will have the foundational understanding needed to create, organize, and use workflows for your personal research.

Python Web APIs

February 8, 2024, 10:00am
In this workshop, we cover how to extract data from the web with APIs using Python. APIs are often official services offered by companies and other entities, which allow you to directly query their servers in order to retrieve their data. Platforms like The New York Times, Twitter and Reddit offer APIs to retrieve data.

Python Machine Learning Fundamentals: Parts 1-2

October 4, 2022, 2:00pm
This workshop introduces students to scikit-learn, the popular machine learning library in Python, as well as the auto-ML library built on top of scikit-learn, TPOT. The focus will be on scikit-learn syntax and available tools to apply machine learning algorithms to datasets. No theory instruction will be provided.

Skyler Yumeng Chen

Data Science for Social Justice Fellow 2024
Haas School of Business

Skyler is a Ph.D. student in Behavioral Marketing at the Haas School of Business. Her research centers on consumer behavior and judgment and decision-making, with a keen interest in both experimental methods and data science techniques. She holds a B.A. in Economics and a B.S. in Data Science from New York University Shanghai.

Grace Hu

Data Science for Social Justice Fellow 2024
Bioengineering

Grace is a 3rd year Bioengineering PhD candidate in the joint UC Berkeley-UCSF Graduate Program. Her research lies at the nexus of computational design and 3D-bioprinting to advance tissue engineering for regenerative medicine. She previously studied Materials Science and Engineering (B.S.) and Computer Science (M.S.) at Stanford University, where she investigated printable batteries to power an ultra-affordable scanning electron microscope and explored computer science education research by developing AI models to augment teaching ability.

In her free time she...

Taylor Galdi

Data Science for Social Justice Fellow 2024
Law (JSP)
Sociology
Social Psychology

Taylor is a dual JD/Ph.D. student in Berkeley Law's Jurisprudence and Social Policy Program. Broadly, she is interested in studying courts, social movements and social change, and the legal profession.

Jonathan Pérez

Data Science for Social Justice Fellow 2024
Education

Jonathan Pérez is a 4th-year PhD student in education with a designated emphasis in critical theory. His research focuses on how students understand their racialization, particularly on how California's Ethnic Studies Curriculum can equip students to better make sense of how schools and society racialize them.

Outside of UC Berkeley, Jonathan is an adjunct at San Francisco State University and works in curriculum design for The School of The New York Times.

Elizabeth Fajardo

Data Science for Social Justice Fellow 2024
Graduate Group in Ancient History and Mediterranean Archaeology

I am a PhD Student in Ancient History and Mediterranean Archaeology. I study the Roman Imperial Economy, particularly the development of human capital during the Imperial Period and the Roman monetary system.

My main research interests include political economy, labor, and economic metaphor in Ancient Rome, particularly highlighting the intersections of economic production and power.

Propensity Score Matching for Causal Inference: Creating Data Visualizations to Assess Covariate Balance in R

June 10, 2024
by Sharon Green. Although some people consider randomized experiments the gold standard, in many cases, it would be highly unethical to assign individuals to harmful exposures to measure their effects. Modern causal inference techniques help scientists to estimate treatment effects using observational data. In particular, propensity score matching helps scientists estimate causal effects using observational data by matching individuals so that the “treatment” and “control” groups are balanced on measured covariates. After implementing propensity score matching, data visualizations make it easier to assess the quality of the matches before estimating effects. This blog post is a tutorial for implementing propensity score matching and creating data visualizations to assess covariate balance–that is, visually assessing whether the matched individuals are balanced with respect to measured covariates.

Hugh Kadhem

Mathematics

Hugh Kadhem is a Ph.D. student in Applied Mathematics, with broad research interests in computational quantum physics and high-performance scientific computing.