Data Manipulation and Cleaning

Skyler Yumeng Chen

Data Science & AI Fellow 2025-2026, Data Science for Social Justice Fellow 2024
Haas School of Business

Skyler is a Ph.D. student in Behavioral Marketing at the Haas School of Business. Her research centers on consumer behavior and judgment and decision-making, with a keen interest in both experimental methods and data science techniques. She holds a B.A. in Economics and a B.S. in Data Science from New York University Shanghai.

Decision-Making Under Pressure during My PhD: Lessons from whale songs and ocean noise

May 6, 2025
by Jaewon Saw. This blog post shares a story from a field experiment using Distributed Acoustic Sensing (DAS) to detect whale vocalizations in Monterey Bay. Most of the data was overwhelmed by noise from boat engines, wave motion, and cable instability. On the final day, a spur-of-the-moment decision to add loops to the fiber optic cable dramatically improved signal quality.

Enrique Valencia López

Data Science Fellow 2022-2023
Graduate School of Education

Enrique Valencia López is a PhD student in the Policy, Politics and Leadership cluster at the Graduate School of Education. His research interests relate to three broad areas: the stratification of education by gender, immigration status and ethnicity; the measurement of teacher working conditions and well-being; and education in Latin America.

Before coming to Berkeley, Enrique worked for Mexico’s National Institute for Educational Evaluation and Assessment (INEE) in both the Policy and Indicators areas. During that time, he co-authored Mexico’s first report on the educational...

Jaewon Saw

Data Science Fellow 2024-2025
Civil and Environmental Engineering

I am a PhD candidate in Systems Engineering. My current research focuses on distributed acoustic sensing (DAS), a cutting-edge technology with diverse applications. I have used DAS to detect whale vocalizations in Monterey Bay, California, and to monitor roadways, water pipelines, and energy infrastructure.

I enjoy identifying and mitigating challenges that arise when applying new technologies by developing data tools, pipelines, and frameworks for real-world deployments. My work is driven by a keen interest in exploring and refining innovative...

Elijah Mercer

Data Science Fellow 2024-2025
School of Information

Elijah, originally from Newark, New Jersey, now resides in San Francisco, California, and is dedicated to social and juvenile justice. With a Criminology degree from American University, he began as a research intern at the Investigative Reporting Workshop, focusing on the Digital Divide.

Teaching in Baltimore with Teach for America reinforced his belief in research and data for marginalized communities. In roles at the Coalition Against Insurance Fraud, New York Police Department, and San Francisco District Attorney’s Office, Elijah used data to combat crime. Now...

Christian Caballero

Data Science Fellow 2024-2025
Political Science

Christian Caballero is a Political Science PhD student at the University of California, Berkeley. His research focuses on American politics and political behavior. In particular, he studies the ways in which social networks influence processes of political persuasion and democratic deliberation, as well as how political ideologies develop within subcultures.

He holds a B.A. in Politics and Sociology from New York University and an M.A. in Political Science from the University of California, Berkeley.

Why Data Disaggregation Matters: Exploring the Diversity of Asian American Economic Outcomes Using Public Use Microdata Sample (PUMS) Data

February 11, 2025
by Taesoo Song. Asian Americans are often overlooked in discussions of racial inequality due to their high average socioeconomic attainment. Many academic and policy researchers treat Asians as a single racial category in their analysis. However, this broad categorization can mask significant within-group disparities, leaving many disadvantaged individuals without access to vital resources and policy support. Song emphasizes the importance of data disaggregation in revealing Asian American inequalities, particularly in areas like income and homeownership, and demonstrates how breaking down these categories can lead to more targeted and effective policy solutions.

Suraj Nair

Data Science Fellow 2023-2024
School of Information

I am a PhD student at the School of Information. My research interests lie at the intersection of development economics and machine learning, with a focus on the use of large-scale digital data and new computational tools to study pressing issues in global development.

María Martín López

Data Science Fellow 2023-2024
Psychology

María Martín López is a PhD student in the Cognition area within the Department of Psychology. Her research relates to cognitive computational and quantitative models of individual differences in behaviors, thoughts, and emotions. She is particularly interested in how we can create and leverage novel algorithms to understand, measure, and predict processes relating to externalizing psychopathology (e.g., impulsivity, aggression, substance use). She answers these questions using a range of computational and quantitative models including AI, NLP, SEM, time series analysis, multi-level...

Kamya Yadav

Senior Data Science Fellow 2024-2025, Data Science Fellow 2023-2024
Political Science

Kamya is a third-year PhD student in the Department of Political Science. Using multimethod research, she studies gender, representation, and political parties in India to understand the barriers and pathways to women's political participation and representation. She has a BA in Politics from Princeton University.