Software Tools

R Machine Learning with tidymodels: Parts 1-2

February 27, 2024, 10:00am
Machine learning often evokes images of Skynet, self-driving cars, and computerized homes. However, these ideas are less science fiction than tangible phenomena predicated on description, classification, prediction, and pattern recognition in data. During this two-part workshop, we will discuss basic features of supervised machine learning algorithms, including k-nearest neighbors, linear regression, decision trees, random forests, boosting, and ensembling, using the tidymodels framework. For social scientists, such methods might be critical for investigating evolutionary relationships, global health patterns, voter turnout in local elections, or individual psychological diagnoses.

Python Data Visualization Pilot: Parts 1-2

March 5, 2024, 2:00pm
For this workshop, we'll provide an introduction to visualization with Python. We'll cover visualization theory and plotting with Matplotlib and Seaborn, working through examples in a Jupyter notebook.

Berkeley FSRDC Fundamentals

January 31, 2024, 11:00am
Interested in using restricted data from the Census Bureau or a partnering RDC agency (AHRQ, BLS, BEA, NCHS)? This one-hour introductory workshop will provide an overview of the Berkeley Federal Statistical Research Data Center, with no prior experience assumed. Attendees will learn about the national RDC network, how to access information online about restricted Census data, and how to navigate proposal development.

MaxQDA Fundamentals

February 6, 2024, 1:00pm
This two-hour introductory workshop will teach you MaxQDA from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the MaxQDA software, upload multiple forms of data, and use the manual and autocode features. We will review some of the additional analytic features, including the visual, memo, and Questions, Themes and Theories (QTT) tools. We will briefly touch on the MaxQDA Team cloud-based version. Instructors will share recommended resources.

Tracking Urban Expansion Through Satellite Imagery

December 12, 2023
by Leïla Njee Bugha. Among its many uses, remote sensing can prove especially useful for documenting changes and trends in eras or settings where traditional data sources are either nonexistent or infrequently collected. This is the case when one wants to study urban expansion in sub-Saharan countries over the past 20 years. To further remedy the lack of data on land cover from earlier time periods, classification methods can be used as well. Using easily accessible satellite imagery from Google Earth Engine, I provide here an example that combines remote sensing with classification to detect changes in land cover in Nigeria since 2000 due to urban expansion.

MaxQDA Fundamentals

January 10, 2024, 9:00am
This two-hour introductory workshop will teach you MaxQDA from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the MaxQDA software, upload multiple forms of data, and use the manual and autocode features. We will review some of the additional analytic features, including the visual, memo, and Questions, Themes and Theories (QTT) tools. We will briefly touch on the MaxQDA Team cloud-based version. Instructors will share recommended resources.

Processing Videos in Python with OpenCV

November 28, 2023
by Leah Lee. Videos and images are quickly becoming the most common type of data we store and interact with. Computer vision technologies derive useful information from these forms of data and are now commonly used in health care, agriculture, transportation, and security. OpenCV is a powerful tool for image processing and computer vision tasks. In this blog post, we will explore how we can use OpenCV in Python to carry out basic computer vision tasks. Specifically, we’ll focus on the simple task of identifying an object from a video and labeling a frame with a box around the object.

Hugh Kadhem

Data Science Fellow
Mathematics

Hugh Kadhem is a Ph.D. student in Applied Mathematics, with broad research interests in computational quantum physics and high-performance scientific computing.

Mapping Census Data with tidycensus

November 6, 2023
by Alex Ramiller. The U.S. Census Bureau provides a rich source of publicly available data for a wide variety of research applications. However, the traditional process of downloading these data from the census website is slow, cumbersome, and inefficient. The R package “tidycensus” provides researchers with a tool to overcome these challenges, enabling a streamlined process for quickly downloading numerous datasets directly from the census API (Application Programming Interface). This blog post provides a basic workflow for the use of the tidycensus package, from installing the package and identifying variables to efficiently downloading and mapping census data.

Introduction to Item Response Theory

October 24, 2023
by Mingfeng Xue. Measurements (e.g., tests, surveys, questionnaires) inevitably involve various sources of error. Among many psychometric theories, item response theory (IRT) stands out for its capability for detailed analysis at the item level and its potential to reduce some measurement error. This post first discusses the limitations of conventional sum and average scores, which motivate IRT models, and then introduces a basic form of the Rasch model, including the model's expressions, its underlying assumptions, some of its advantages, and software packages. Some code is also provided.
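
For readers unfamiliar with the Rasch model mentioned above, the following is a minimal sketch of its standard dichotomous form, in which the probability of a correct response depends only on the difference between a person's ability and an item's difficulty. It is not taken from the blog post itself; the function name `rasch_probability` and the example values are illustrative assumptions.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    theta: person ability (hypothetical example parameter name)
    b:     item difficulty (hypothetical example parameter name)
    Computes exp(theta - b) / (1 + exp(theta - b)) in its logistic form.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Illustrative values only: three abilities against one item of difficulty 0.
difficulty = 0.0
for theta in (-1.0, 0.0, 1.0):
    p = rasch_probability(theta, difficulty)
    print(f"theta={theta:+.1f}  P(correct)={p:.3f}")
```

When ability equals difficulty the probability is exactly 0.5, and it rises toward 1 as ability exceeds difficulty, which is the item-level behavior that summed or averaged scores do not capture.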