Programming Languages

Python Machine Learning Fundamentals: Parts 1-2

April 5, 2023, 10:00am
This workshop introduces students to scikit-learn, the popular machine learning library in Python, as well as the auto-ML library built on top of scikit-learn, TPOT. The focus will be on scikit-learn syntax and available tools to apply machine learning algorithms to datasets. No theory instruction will be provided.

Python Intermediate: Parts 1-3

February 12, 2024, 9:00am
This three-part interactive workshop series teaches intermediate Python programming to people with previous programming experience equivalent to our Python Fundamentals workshop. By the end of the series, you will be able to apply your knowledge of basic principles of programming and data manipulation to a real-world social science application.

R Data Visualization

September 19, 2022, 2:00pm
This workshop will provide an introduction to graphics in R with ggplot2. Participants will learn how to construct, customize, and export a variety of plot types in order to visualize relationships in data. We will also explore the basic grammar of graphics, including the aesthetics and geometry layers, adding statistics, transforming scales, and coloring or panelling by groups. You will learn how to make histograms, boxplots, scatterplots, lineplots, and heatmaps as well as how to make compound figures.

Geospatial Fundamentals with QGIS: Parts 1-2

March 1, 2022, 3:00pm
This workshop will introduce methods for working with geospatial data in QGIS, a popular open-source desktop GIS program that runs on PCs, Macs, and Linux computers. Participants will learn how to load, query, and visualize point, line, and polygon data. We will also introduce basic methods for processing spatial data, which are the building blocks of spatial analysis workflows. Coordinate reference systems and map projections will also be introduced.

Python Data Wrangling and Manipulation with Pandas

May 4, 2023, 1:00pm
Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It supports practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.

Git for Research Transparency and Reproducibility Training (RT2)

June 6, 2024, 3:15pm
This is a custom Git workshop for the 2024 Research Transparency and Reproducibility Training (RT2).

Stata for Research Transparency and Reproducibility Training (RT2)

June 7, 2024, 3:15pm
This is a custom Stata workshop for the 2024 Research Transparency and Reproducibility Training (RT2).

Processing Videos in Python with OpenCV

November 28, 2023
by Leah Lee. Videos and images are quickly becoming the most common type of data we store and interact with. Computer vision technologies derive useful information from these forms of data and are now commonly used in health care, agriculture, transportation, and security. OpenCV is a powerful tool for image processing and computer vision tasks. In this blog post, we will explore how we can use OpenCV in Python to carry out basic computer vision tasks. Specifically, we’ll focus on the simple task of identifying an object from a video and labeling a frame with a box around the object.

Mapping Census Data with tidycensus

November 6, 2023
by Alex Ramiller. The U.S. Census Bureau provides a rich source of publicly available data for a wide variety of research applications. However, the traditional process of downloading these data from the census website is slow, cumbersome, and inefficient. The R package “tidycensus” provides researchers with a tool to overcome these challenges, enabling a streamlined process for quickly downloading numerous datasets directly from the census API (Application Programming Interface). This blog post provides a basic workflow for the use of the tidycensus package, from installing the package and identifying variables to efficiently downloading and mapping census data.

Using Forest Plots to Report Regression Estimates: A Useful Data Visualization Technique

October 17, 2023
by Sharon Green. Regression models help us understand relationships between two or more variables. In many cases, results are summarized in tables that present coefficients, standard errors, and p-values. Reading these can be a slog. Figures such as forest plots can help us communicate results more effectively and may lead to a better understanding of the data. This blog post is a tutorial on two different approaches to creating high-quality and reproducible forest plots, one using ggplot2 and one using the forestplot package.