Data Sources

Python Web APIs

October 22, 2024, 2:00pm
In this workshop, we cover how to extract data from the web with APIs using Python. APIs are often official services offered by companies and other entities, which allow you to directly query their servers in order to retrieve their data. Platforms like The New York Times, Twitter and Reddit offer APIs to retrieve data.
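As a rough illustration of what querying a web API from Python can look like, here is a minimal sketch using only the standard library. It is not taken from the workshop materials. The endpoint `https://api.example.com/v1/articles`, the `q` and `page` parameters, and the `results`/`title` fields in the response are all hypothetical. Real APIs such as those mentioned above document their own URLs, parameters, and authentication requirements.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, used only for illustration; a real API documents its own URL.
BASE_URL = "https://api.example.com/v1/articles"


def build_query_url(base_url, params):
    """Attach URL-encoded query parameters to the endpoint."""
    return f"{base_url}?{urlencode(params)}"


def fetch_json(url, timeout=10):
    """Send a GET request and decode the JSON body of the response."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_titles(payload):
    """Collect the 'title' field from each record in a hypothetical 'results' list."""
    return [item["title"] for item in payload.get("results", [])]


if __name__ == "__main__":
    url = build_query_url(BASE_URL, {"q": "housing", "page": 1})
    print(url)
    # data = fetch_json(url)  # uncomment once BASE_URL points at a real API
    sample = json.loads('{"results": [{"title": "First"}, {"title": "Second"}]}')
    print(extract_titles(sample))
```

The network call is kept in its own function so that URL construction and response parsing can be tried out on sample data before any real request is sent.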

Leveraging Large Language Models for Analyzing Judicial Disparities in China

October 8, 2024
by Nanqin Ying. This study analyzes over 50 million judicial decisions from China’s Supreme People’s Court to examine disparities in legal representation and their impact on sentencing across provinces. Focusing on 290,000 drug-related cases, it employs large language models to differentiate between private attorneys and public defenders and assess their sentencing outcomes. The methodology combines advanced text processing with statistical analysis, using clustering to categorize cases by province and representation, and regression models to isolate the effect of legal representation from factors like drug quantity and regional policies. Findings reveal significant regional disparities in legal access driven by economic conditions, highlighting the need for reforms in China’s legal aid system to ensure equitable representation for marginalized groups and promote transparent judicial data for systemic improvements.

Anna Björklund

Senior Data Science Fellow 2024-2025, Data Science Fellow 2023-2024
Linguistics

I am a fifth-year PhD student in the Department of Linguistics with an areal interest in the Wintuan languages, traditionally spoken in the northern Sacramento Valley and now undergoing revitalization. My primary research interests are in leveraging archival recordings for the phonetic analysis of these under-documented languages, as well as designing tools to assist in their revitalization. I have worked as a linguistic consultant for the Paskenta Band of Nomlaki Indians since 2020 and the Wintu Tribe of Northern California since 2022. I received my MA in linguistics from UC...

Alex Ramiller

Senior Data Science Fellow 2024-2025, Data Science Fellow 2023-2024
City and Regional Planning

I am a PhD candidate in City and Regional Planning. My research focuses on the use of large administrative datasets to study residential mobility, neighborhood change, and housing access. I received a Master's in Geography from the University of Washington and a Bachelor's in Economics and Geography from Macalester College. I have also consulted on analytical projects for several organizations including the San Francisco Federal Reserve Bank, PolicyLink, and the City of Seattle.

Excel Data Analysis: Introduction

October 2, 2024, 2:00pm
This is a three-hour introductory workshop that will provide an overview of Excel, with no prior experience assumed. Attendees will learn how to use functions for handling data and making calculations, how to build charts and pivot tables, and more.

Excel Data Analysis: Charts, Pivot Tables, and VLOOKUP

October 7, 2024, 2:00pm
This three-hour workshop will cover charts in more detail, review pivot tables, and introduce the widely used VLOOKUP function. We recommend first taking the introductory workshop, Excel Data Analysis: Introduction.

Stephanie Andrews

Availability: By appointment only

Consulting Areas: Python, SQL, HTML / CSS, JavaScript, APIs, Databases & SQL, Data Manipulation and Cleaning, Data Science, Data Sources, Data Visualization, Digital Humanities, Machine Learning, Natural Language Processing, Software Tools, Text Analysis, Web Scraping, Bash or Command Line, Excel, Git or GitHub, Tableau

Stephanie Andrews

Consultant
Info & Data Science MIDS

Stephanie Andrews is currently studying data science in the MIDS program, having previously majored in Social Welfare as an undergraduate at Cal. After graduating, she worked as an advocate for survivors of gender-based violence, as a public policy analyst focusing on anti-trafficking initiatives, and as a software engineer for progressive and social impact organizations. She is now conducting research with the Human Rights Center's Investigations Lab, using OSINT and data science methods to investigate human rights violations.

Kurt Soncco Sinchi

Consultant
Civil Engineering

I am a first-generation student looking to apply core data science concepts to socially impactful projects and to leverage information from previous cases for better insight into society. My focus is on infrastructure and its impact during natural disasters.

Amanda Glazer

Instructor
Statistics

Amanda is a PhD candidate in the statistics department at Berkeley. Her research focuses on causal inference with applications in education, political science and sports. Previously she earned her Bachelor’s degree in mathematics and statistics, with a secondary in computer science, from Harvard.