Software Tools

Introduction to Item Response Theory

October 24, 2023
by Mingfeng Xue. Measurements (e.g., tests, surveys, questionnaires) inevitably involve various sources of error. Among the many psychometric theories, item response theory (IRT) stands out for its capacity for detailed item-level analysis and its potential to reduce some measurement error. This post first discusses the limitations of conventional summed and averaged scores, which motivate IRT models, and then introduces a basic form of the Rasch model, including the model's expressions, its underlying assumptions, some of its advantages, and software packages. Code examples are also provided.
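As a rough illustration of the basic Rasch model named in this summary (a minimal sketch, not code taken from the post itself): the probability of a correct response depends only on the difference between a person's ability and an item's difficulty, P(X = 1) = exp(theta - b) / (1 + exp(theta - b)). The function name and the ability and difficulty values below are hypothetical, chosen only for demonstration.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the basic Rasch model.

    theta: person ability (logit scale); b: item difficulty (logit scale).
    Algebraically equal to exp(theta - b) / (1 + exp(theta - b)).
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Hypothetical abilities and a single hypothetical item difficulty.
abilities = [-1.0, 0.0, 1.0]
difficulty = 0.0

for theta in abilities:
    p = rasch_probability(theta, difficulty)
    print(f"theta={theta:+.1f}, b={difficulty:+.1f} -> P(correct)={p:.3f}")
```

When ability equals difficulty, the probability is exactly 0.5. It rises above 0.5 as ability exceeds difficulty and falls below it otherwise.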

From paper to vector: converting maps into GIS shapefiles

April 11, 2023
by Madeleine Parker. GIS is incredibly powerful: you can transform, overlay, and analyze data with a few clicks. But sometimes the challenge is getting your data into a form you can use with GIS. Have you ever found a PDF or even a paper map of what you needed? Or googled your topic with “shapefile” after it to no avail? The process of transforming a PDF, paper, or even hand-drawn map with boundaries into a shapefile for analysis is straightforward but involves a few steps. I walk through the stages of digitization, georeferencing, and drawing, from an image to a vector shapefile ready to be used for visualization and spatial analysis.

Mapping Time-Series Satellite Images with Google Earth Engine API

July 17, 2023
by Meiqing Li. Remote sensing imagery has the potential to reveal land use patterns and human activities at a planetary scale. For example, nighttime light intensity extracted from satellite imagery can shed light on spatial patterns of human activities and settlements, especially in places where traditional data are scarce. This blog post introduces Google Earth Engine (GEE) as a general-purpose tool for extracting time-series remote sensing data from the GEE data catalog. I walk through using GEE to obtain data, filter it by time and geographic region, and visualize it on static and interactive maps.

Unlock the Joy and Power of Reading in Language Learning

August 21, 2023
by Bowen Wang-Kildegaard. I share my story of how reading for pleasure transformed my English speaking and writing skills. This experience inspired my passion to promote the joy and power of reading to all language learners. Using natural language processing techniques, I dive into the Language Learning subreddit, revealing a trend: Learners are often highly anxious about output practices, but are generally positive about input methods like reading and listening. I then distill complex language learning theories into actionable language learning tips, emphasizing the value of extensive reading for pleasure, pointing to potential methods like using ChatGPT for customization of reading materials, and advocating for joy in the learning journey.

My Summer Exploring Data Science for Social Justice: Learnings, Tensions & Recommendations

September 5, 2023
by Genevieve Smith. This summer I joined the D-Lab-hosted Data Science for Social Justice workshop at UC Berkeley, diving into Python (including TF-IDF, sentiment analysis, word embeddings, and more) with a lens towards leveraging data science for social justice. My team explored a Reddit channel on abortion and used computational analysis to answer key questions about abortion access before versus after Roe v. Wade was overturned. Computational social science (CSS) is incredibly powerful, but I continue to grapple with tensions, particularly as they relate to employing machine learning and large language models in international research. I end with key recommendations for CSS practitioners.

Michael Ruiz

IUSE Research Team
Psychology

Michael earned his B.A. in Psychology from UC Berkeley and currently works as the manager of Professor Okonofua's Equity, Diversity, and Empathy Navigation Sciences Lab in the UC Berkeley Psychology department.

Hikari Murayama

Senior Data Science Fellow, Senior Instructor
Digital Health Social Justice
Energy and Resources Group

Hikari is a graduate student in the Energy and Resources Group. Her research interests involve utilizing remote sensing and geospatial analysis to address pressing problems at the intersection of humans and climate. She recently served as a Data Science for Social Good Fellow at the University of Washington eScience Institute in the summer of 2020. She is experienced and happy to help in the areas of geospatial analysis, remote sensing, and other statistical analyses and methods. Hikari is devoted to helping community members realize their potential to conduct...

Cheng Ren

Senior Data Science Fellow
School of Social Welfare

Cheng Ren is a D-Lab Senior Data Science Fellow and a Ph.D. student at the School of Social Welfare. His research interests are community engagement and assessment, nonprofit development, community databases, computational social welfare, and data for social good.

Christopher Paciorek, Ph.D.

Research Computing Consultant, Adjunct Professor
Department of Statistics
Research IT

Chris Paciorek is an adjunct professor in the Department of Statistics, as well as the Statistical Computing Consultant in the Department's Statistical Computing Facility (SCF) and in the Econometrics Laboratory (EML) of the Economics Department. He is also a user support consultant for Berkeley Research Computing. He teaches and presents workshops on statistical computing topics, with a focus on R.

Frank Hidalgo Ruiz

Data Science Fellow
Chemistry

I am currently a 5th-year Chemical Biology Ph.D. student. My research focuses on understanding the mechanism by which mutations in a protein called Ras lead to tumorigenesis. More specifically, I aim to integrate high-throughput mutagenesis, coevolutionary analysis, and machine learning algorithms to generate a predictive model. Over the last year, I have built a Python package to process, analyze, and visualize Next Generation Sequencing datasets. I love collaborating across research fields and sharing my passion for data science.