Data Visualization

Sahiba Chopra

Data Science Fellow 2024-2025
Haas School of Business

I'm a PhD student in the Management and Organizations (Macro) group at Berkeley Haas. I have a diverse professional background, primarily as a data scientist across numerous industries, including fintech, cleantech, and media. I hold a BA in Economics from the University of Maryland, an MS in Applied Economics from the University of San Francisco, and an MS in Business Administration from UC Berkeley.

My research focuses on the intersection of inequality, technology, and the labor market. I am particularly interested in understanding how to reduce inequality in...

Nanqin Ying

Data Science Fellow 2024-2025
Goldman School of Public Policy

Nanqin Ying, a second-year graduate student at the Goldman School of Public Policy specializing in Development Practices, combines a robust nonprofit background with advanced data science techniques. She focuses on leveraging machine learning and big data to drive significant social change, aiming to transform insights into actionable, positive impacts on communities.

Jane (Mango) Angar

Data Science Fellow 2024-2025
Political Science

Hi! I am a PhD candidate in the Political Science Department at UC Berkeley. My dissertation traces the emergence of disability rights groups in Africa, focusing on Zambia and Malawi, and examines factors influencing their effectiveness. I use mixed methods, including archival work, field interviews, participant observation, and surveys for data collection.

My data analysis techniques include text analysis, social network analysis, means tests, and regressions. In my free time, I enjoy moderately difficult hikes, walks along the beach with my dog, Princess, and...

Jaewon Saw

Data Science Fellow 2024-2025
Civil and Environmental Engineering

I am a PhD candidate in Systems Engineering. My current research focuses on distributed acoustic sensing (DAS), a cutting-edge technology with diverse applications. I have used DAS to detect whale vocalizations in Monterey Bay, California, and to monitor roadways, water pipelines, and energy infrastructure.

I enjoy identifying and mitigating challenges that arise when applying new technologies by developing data tools, pipelines, and frameworks for real-world deployments. My work is driven by a keen interest in exploring and refining innovative...

Elijah Mercer

Data Science Fellow 2024-2025
School of Information

Elijah, originally from Newark, New Jersey, now resides in San Francisco, California, and is dedicated to social and juvenile justice. With a Criminology degree from American University, he began as a research intern at the Investigative Reporting Workshop, focusing on the Digital Divide.

Teaching in Baltimore with Teach for America reinforced his belief in research and data for marginalized communities. In roles at the Coalition Against Insurance Fraud, New York Police Department, and San Francisco District Attorney’s Office, Elijah used data to combat crime. Now...

Bruno Smaniotto

Data Science Fellow 2024-2025
Economics

I'm originally from Brazil, but I have been living in Berkeley for the last 5 years working towards my PhD in Economics. My main areas of interest are Behavioral Economics and Macroeconomics, mostly their intersection, but I'm excited about learning and working on empirical applications in different fields.

Amber Galvano

Data Science Fellow 2024-2025
Linguistics

I am a fourth-year PhD student in Linguistics, with a focus in sociophonetics and phonology. In my research, I'm interested in how understudied speech communities (Andalusians, southern Spain; Lobi and Tonko Limba, West Africa) and often-relegated aspects of social identity (sexuality, gender normativity) can inform new approaches to theory and methodology and how we conceptualize the interfaces between linguistic subfields.

I'm also involved in language documentation/revitalization work for Lobi and the development of automated phonetic methods, particularly for...

Why Data Disaggregation Matters: Exploring the Diversity of Asian American Economic Outcomes Using Public Use Microdata Sample (PUMS) Data

February 11, 2025
by Taesoo Song. Asian Americans are often overlooked in discussions of racial inequality due to their high average socioeconomic attainment. Many academic and policy researchers treat Asians as a single racial category in their analysis. However, this broad categorization can mask significant within-group disparities, leaving many disadvantaged individuals without access to vital resources and policy support. Song emphasizes the importance of data disaggregation in revealing Asian American inequalities, particularly in areas like income and homeownership, and demonstrates how breaking down these categories can lead to more targeted and effective policy solutions.
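To make the masking effect concrete, here is a minimal sketch of how a pooled statistic can hide subgroup differences. The records, subgroup labels, and the `median_by_group` helper are all hypothetical and invented for illustration; they are not PUMS data and not taken from the post. A real PUMS analysis would also need to apply the survey's person or household weights, which this sketch leaves out.

```python
from collections import defaultdict
from statistics import median

# Synthetic, made-up records of (subgroup label, household income).
# These are NOT PUMS values; they only illustrate the idea.
records = [
    ("subgroup_a", 120_000), ("subgroup_a", 135_000), ("subgroup_a", 150_000),
    ("subgroup_b", 40_000), ("subgroup_b", 45_000), ("subgroup_b", 52_000),
    ("subgroup_c", 80_000), ("subgroup_c", 90_000), ("subgroup_c", 95_000),
]


def median_by_group(rows):
    """Return the unweighted median income for each subgroup label."""
    grouped = defaultdict(list)
    for group, income in rows:
        grouped[group].append(income)
    return {group: median(values) for group, values in grouped.items()}


# Aggregated view: one number for the whole category.
pooled_median = median(income for _, income in records)

# Disaggregated view: one number per subgroup.
subgroup_medians = median_by_group(records)

print("pooled:", pooled_median)
for group, value in sorted(subgroup_medians.items()):
    print(group, value)
```

In this toy data the pooled median sits in the middle of the distribution, while one subgroup's median is half of it, which is the kind of gap an aggregated category would not show.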

Which Coin Should I Flip? The Multi-Arm Bandit

February 4, 2025
by Bruno Smaniotto. Consider the following game: You are given the option to choose between two coins to flip. These coins are possibly biased, so the probability of getting Heads for each coin might differ from 50/50. Each time that you flip Heads, you win one dollar. There are a total of 10 rounds. Which coin should you flip at each round? In this blog post, we will analyze this problem through the lens of a famous decision-making algorithm called the Multi-Arm Bandit, exploring how to structure the problem mathematically and how it can be solved for particular examples.
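The coin game can be simulated in a few lines. The sketch below uses Thompson sampling, a common bandit heuristic, chosen here only for illustration; the summary does not say which solution method the post uses, so this should not be read as the author's approach. The function name `thompson_two_coins`, the uniform Beta(1, 1) prior, and the example bias values are all assumptions.

```python
import random


def thompson_two_coins(p_heads, rounds=10, seed=0):
    """Play the coin game with Thompson sampling.

    p_heads: true probability of Heads for each coin (unknown to the player).
    Returns the list of coins chosen and the total dollars won.
    """
    rng = random.Random(seed)
    heads = [0] * len(p_heads)  # Heads observed so far, per coin
    tails = [0] * len(p_heads)  # Tails observed so far, per coin
    choices = []
    winnings = 0

    for _ in range(rounds):
        # Draw a plausible bias for each coin from its Beta posterior
        # (assumed Beta(1, 1) prior, i.e. uniform over [0, 1]).
        samples = [rng.betavariate(1 + h, 1 + t) for h, t in zip(heads, tails)]
        # Flip the coin whose sampled bias is highest.
        coin = max(range(len(p_heads)), key=lambda i: samples[i])
        choices.append(coin)

        if rng.random() < p_heads[coin]:
            heads[coin] += 1
            winnings += 1  # one dollar per Heads
        else:
            tails[coin] += 1

    return choices, winnings


# Hypothetical biases: coin 0 lands Heads 30% of the time, coin 1 70%.
choices, winnings = thompson_two_coins([0.3, 0.7], rounds=10, seed=0)
print("coins flipped:", choices)
print("dollars won:", winnings)
```

With only 10 rounds the player has little time to learn, so the early flips are mostly exploration; changing `rounds` shows how the choices concentrate on the better coin as evidence accumulates.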

Measuring Vowels Without Relying on Sex-Based Assumptions

April 8, 2025
by Amber Galvano. This tutorial builds on my previous post on Python for acoustic analysis, this time focusing on measuring vocal tract resonances without relying on sex-based assumptions. I demonstrate how to process audio files and vowel annotations using an adaptive method that optimizes the acoustic analysis across a recording. Instead of fixing parameters based on generalized vocal tract length correlations, this approach varies them within a defined range for greater accuracy. This not only enhances measurement precision but also avoids requiring (or assuming) speakers’ sex in data collection. Finally, I show how to filter for outliers and create high-quality vowel space visualizations.