Research Project

Why Data Disaggregation Matters: Exploring the Diversity of Asian American Economic Outcomes Using Public Use Microdata Sample (PUMS) Data

February 11, 2025
by Taesoo Song. Asian Americans are often overlooked in discussions of racial inequality due to their high average socioeconomic attainment. Many academic and policy researchers treat Asians as a single racial category in their analysis. However, this broad categorization can mask significant within-group disparities, leaving many disadvantaged individuals without access to vital resources and policy support. Song emphasizes the importance of data disaggregation in revealing Asian American inequalities, particularly in areas like income and homeownership, and demonstrates how breaking down these categories can lead to more targeted and effective policy solutions.

Claudia von Vacano, Ph.D.

Founding Executive Director, P.I., Research Director, FSRDC

Dr. Claudia von Vacano is the Founding Executive Director and Senior Research Associate of D-Lab and Digital Humanities at Berkeley and is on the boards of the Social Science Matrix and Berkeley Center for New Media. She has worked in policy and educational administration since 2000, and at the UC Office of the President and UC Berkeley since 2008. She received a Master’s degree from Stanford University in Learning, Design, and Technology. Her doctorate is in Policy, Organizations, Measurement, and Evaluation from UC Berkeley. Her expertise is in organizational theory and...

Field Experiments in Corporations

January 28, 2025
by Yue Lin. How do social science researchers conduct field experiments with private actors? Yue Lin provides a brief overview of recent developments in political economy and management strategy, with a focus on conducting field experiments within private corporations. Unlike conventional targets such as individuals and government agencies, private companies are an emerging sweet spot for scholars to test important theories on topics such as sustainability, censorship, and market behavior. After comparing the strengths and weaknesses of this powerful yet nascent method, Lin proposes some practical solutions to improve the success rate of field experimental studies. She aims to introduce a new methodological tool in a young research field and shed some light on improving experimental quality while adhering to ethical standards.

The Creation of Bad Students: AI Detection for Non-Native English Speakers

January 21, 2025
by Valeria Ramírez Castañeda. This blog post explores how AI detection tools in academia perpetuate surveillance and punishment, disproportionately penalizing non-native English speakers (NNES). It critiques rigid, culturally biased notions of originality and intellectual property, highlighting how NNES rely on AI to navigate the dominance of English in academic settings. Current educational practices often label AI use as dishonest, ignoring its potential to reduce global inequities. The post argues for a shift from punitive measures to integrating AI as a tool for inclusivity, fostering diverse perspectives. By embracing AI, academia can prioritize collaboration and creativity over control and discipline.

Fritz_X_DargesBlue42… Who Are You?

January 14, 2025
by Jonathan Pérez. Reflecting on the complexities of the human experience is paramount to conducting research. Jonathan Pérez, through his exploration of a conspiracy subreddit, reflects on his experience trying to find the human behind the datum. Jonathan critiques the harmful effects of dehumanizing rhetoric and the researcher’s responsibility to navigate ethical implications. In doing so, he establishes three guiding rules to support researchers seeking to humanize their analysis: 1) a researcher must always find the story behind the data; 2) a researcher must protect themselves; 3) a researcher must still humanize participants (even those who perpetuate harmful narratives).

Tom van Nuenen, Ph.D.

Data/Research Scientist, Senior Consultant, and Senior Instructor
D-Lab
Social Sciences
Digital Humanities

I work as a Lecturer, Data Scientist, and Senior Consultant at UC Berkeley's D-Lab. I lead the curriculum design for D-Lab’s data science workshop portfolio, as well as the Digital Humanities Summer Program at Berkeley.

Former research projects include a Research Associate position in the ‘Discovering and Attesting Digital Discrimination’ project at King’s College London (2019-2022) and a researcher-in-residence role for the UK’s National Research Centre on Privacy, Harm Reduction, and Adversarial Influence Online (2022). My research uses Natural Language Processing methods to
...

Institutional Review Board (IRB) Fundamentals

February 18, 2025, 9:00am
Are you starting a research project at UC Berkeley that involves human subjects? If so, one of the first steps you will need to take is getting IRB approval.

What are Time Series Made of?

December 10, 2024
by Bruno Smaniotto. Trend-cycle decompositions are statistical tools that help us understand the different components of Time Series – Trend, Cycle, Seasonal, and Error. In this blog post, we will provide an introduction to these methods, focusing on the intuition behind the definition of the different components, providing real-life examples and discussing applications.
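As a rough illustration of the components named above, the following is a minimal sketch of a classical additive decomposition, not the method used in the post. It assumes an odd seasonal period, estimates the trend with a centered moving average (so any slower cycle is folded into the trend), and uses a synthetic series; the function names and the data are hypothetical.

```python
from statistics import mean


def centered_moving_average(series, window):
    """Estimate the trend with a centered moving average (window must be odd)."""
    half = window // 2
    trend = [None] * len(series)  # edges have no full window, so they stay None
    for i in range(half, len(series) - half):
        trend[i] = mean(series[i - half:i + half + 1])
    return trend


def additive_decompose(series, period):
    """Split a series so that y = trend + seasonal + error (additive model)."""
    trend = centered_moving_average(series, period)
    detrended = [y - t if t is not None else None for y, t in zip(series, trend)]

    # Average the detrended values at each position within the seasonal period.
    seasonal_means = []
    for k in range(period):
        values = [d for i, d in enumerate(detrended)
                  if d is not None and i % period == k]
        seasonal_means.append(mean(values))

    # Center the seasonal pattern so it sums to zero over one period.
    centre = mean(seasonal_means)
    seasonal_means = [s - centre for s in seasonal_means]

    seasonal = [seasonal_means[i % period] for i in range(len(series))]
    error = [y - t - s if t is not None else None
             for y, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, error


# Synthetic example: a linear trend plus a repeating three-step pattern.
pattern = [1, -1, 0]
series = [2 * i + pattern[i % 3] for i in range(12)]
trend, seasonal, error = additive_decompose(series, period=3)
```

Because the synthetic series contains no noise, the error component is zero wherever the trend is defined; with real data it would hold whatever the trend and seasonal terms do not explain.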

Language Models in Mental Health Conversations – How Empathetic Are They Really?

December 3, 2024
by Sohail Khan. Language models are becoming integral to daily life as trusted sources of advice. While their utility has expanded from simple tasks like text summarization to more complex interactions, the empathetic quality of their responses is crucial. This article explores methods to assess the emotional appropriateness of these models, using word-overlap metrics such as BLEU and ROUGE alongside semantic similarity computed with Sentence Transformers. By analyzing models like LLaMA in mental health dialogues, we learn that while they score poorly on traditional word-based metrics, LLaMA's performance in capturing empathy through semantic similarity is promising. In addition, we must advocate for continuous monitoring to ensure these models support their users' mental well-being effectively.
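To see why word-based metrics can undervalue a reasonable reply, here is a minimal sketch of a unigram-overlap F1 score in the spirit of ROUGE-1. It is a simplified stand-in, not the evaluation code from the article, and the function name and example sentences are hypothetical.

```python
from collections import Counter


def unigram_f1(reference, candidate):
    """F1 over shared lowercase word counts (a simplified ROUGE-1-style score)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


reference = "that sounds really hard"
exact_match = unigram_f1(reference, "that sounds really hard")
paraphrase = unigram_f1(reference, "it must be so difficult")
```

The paraphrase shares no words with the reference, so it scores zero even though a human reader would judge the two replies similar in meaning. An embedding-based similarity is intended to close that gap.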

GitHub is Not Just for Coding: The Powerful Task Management Tool in Your Back Pocket

November 26, 2024
by Elena Stacy. This article introduces the use of GitHub as a task management tool for researchers in any field, even if your project doesn’t involve coding. GitHub is a free tool that many researchers already use in some capacity, and it can be easily adapted to task management to enable transparent project collaboration and documentation. We walk through the advantages of using GitHub for this purpose and provide a comprehensive tutorial on how to get up and running with GitHub as a task management tool for your own projects.