Qualitative Analysis

Leveraging Large Language Models for Analyzing Judicial Disparities in China

October 8, 2024
by Nanqin Ying. This study analyzes over 50 million judicial decisions from China’s Supreme People’s Court to examine disparities in legal representation and their impact on sentencing across provinces. Focusing on 290,000 drug-related cases, it employs large language models to distinguish private attorneys from public defenders and to assess the sentencing outcomes associated with each. The methodology combines advanced text processing with statistical analysis, using clustering to categorize cases by province and representation, and regression models to isolate the effect of legal representation from factors like drug quantity and regional policies. Findings reveal significant regional disparities in legal access driven by economic conditions, highlighting the need for reforms in China’s legal aid system to ensure equitable representation for marginalized groups and promote transparent judicial data for systemic improvements.
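The regression step described above can be illustrated with a small sketch. It is not the study's model or data: the records are synthetic, the coefficients (a 6-month reduction for private counsel, 0.5 months per gram) are invented for the simulation, and the names `ols` and `simulate` are hypothetical. It shows only how including drug quantity as a control lets ordinary least squares separate the representation effect from a confounder.

```python
import random


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=2000, seed=0):
    """Generate synthetic cases (assumption: all numbers here are invented)."""
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        quantity = rng.uniform(1, 50)  # drug quantity in grams, synthetic
        # Private counsel is made more likely in larger cases,
        # so a naive comparison of mean sentences is confounded.
        private = 1 if rng.random() < 0.2 + 0.01 * quantity else 0
        months = 24 + 0.5 * quantity - 6 * private + rng.gauss(0, 5)
        rows.append([1.0, float(private), quantity])  # intercept, private, quantity
        y.append(months)
    return rows, y


rows, y = simulate()
intercept, private_effect, quantity_effect = ols(rows, y)
print(f"private counsel effect (months): {private_effect:.2f}")
print(f"effect per gram (months): {quantity_effect:.2f}")
```

Because private counsel is simulated to be more common in larger cases, a raw comparison of mean sentences would understate the reduction; with quantity included as a control, the fitted `private_effect` should land near the simulated value of -6.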

Causal Inference in International Political Economy: Hurdles and Advancements

September 9, 2024
by Yue Lin. What are the key challenges and opportunities of applying experiments in International Political Economy (IPE) research? In this blog post, I review an enduring methodological battle between statistics and experiments, and point out that the difficulties of randomization and of locating credible counterfactuals have been the main hurdles keeping IPE scholars from widely adopting experimental tools. I then highlight recent progress in applying survey, field, and lab experiments in IPE scholarship. I conclude that future researchers need to think innovatively about how to combine different research methods to make causal claims in IPE studies.

Claudia von Vacano, Ph.D.

Availability: By appointment only

Consulting Areas: Digital Humanities, Mixed Methods, Qualitative Methods, Surveys, Sampling & Interviews, MaxQDA, Career Development

Seyi Olojo

Instructor, Researcher
School of Information

Seyi is a PhD student in the School of Information and a member of the Algorithmic Fairness and Opacity Group. Her research broadly explores the problem space of digital memory, specifically the social discourse surrounding algorithms, ethics, and engagement. Her work also often explores histories of quantification and the politics of categories within emerging technologies. She takes a mixed methods approach to research that includes ethnography, interviews, grounded theory, surveys, data analysis, and values-based design. Here at the D-Lab, she leads the qualitative...

Data for a Just U.S. - Using Data Science to Empower Marginalized Communities

September 3, 2024
by Elijah Mercer. In this blog post, I share how working with marginalized communities through data science has transformed my understanding of the field. My journey from crime analysis to founding Data for Just US reveals the profound impact data can have when used to empower and uplift underserved populations. I explore the challenges and rewards of this work, illustrating how data science can drive social change and foster a more equitable future.

Sahiba Chopra

Data Science Fellow 2024-2025
Haas

I'm a PhD student in the Management and Organizations (Macro) group at Berkeley Haas. I have a diverse professional background, primarily as a data scientist across numerous industries, including fintech, cleantech, and media. I hold a BA in Economics from the University of Maryland, an MS in Applied Economics from the University of San Francisco, and an MS in Business Administration from UC Berkeley.

My research focuses on the intersection of inequality, technology, and the labor market. I am particularly interested in understanding how to reduce inequality in...

Hellina Hailu Nigatu

Data Science for Social Justice Senior Fellow 2024
Electrical Engineering and Computer Science (EECS)

I am a PhD student at UC Berkeley in the EECS department co-advised by Prof. Sarah Chasins and Prof. John Canny. My research interest broadly lies in the intersection of AI and HCI, with a focus on making usable AI tools accessible to end users.

I am currently looking into making NLP tools usable and accessible for low-resourced languages. I am also interested in the impact of AI on society, specifically in how it affects Global Majority countries and communities. Outside of research, I like to read books, make and drink traditional Ethiopian coffee, knit,...

Minding the Gaps: Pay Equity in California

July 9, 2024
by Tonya D. Lindsey, Ph.D. The gender pay gap continues to reflect that, on average, men outearn women. California is among the states with the smallest pay gaps (outpacing the national number at 13%) and is unique in having enacted legislation aimed at eliminating pay gaps by sex and race categories. This blog post reflects on California’s pay gap as students study it in an undergraduate social statistics course. The independent variables represent three theoretical frameworks: 1) human capital, 2) occupational segregation, and 3) discrimination. While the students' work is rigorous and uses a representative sample of full-time, year-round California workers, there remains work to be done, and the data and analyses carry caveats.

Berkeley FSRDC Fundamentals

January 31, 2024, 11:00am
Interested in using restricted data from the Census Bureau or a partnering RDC agency (AHRQ, BLS, BEA, NCHS)? This one-hour introductory workshop will provide an overview of the Berkeley Federal Statistical Research Data Center, with no prior experience assumed. Attendees will learn about the national RDC network, how to access information online about restricted Census data, and how to navigate proposal development.

US Census Bureau Restricted-Access Research Data Center (FSRDC) Info Session

April 24, 2024, 11:00am
Interested in using restricted data from the Census Bureau or a partnering RDC agency (AHRQ, BLS, BEA, NCHS)? This one-hour introductory workshop will provide an overview of the Berkeley Federal Statistical Research Data Center, with no prior experience assumed. Attendees will learn about the national RDC network, how to access information online about restricted Census data, and how to navigate proposal development.