Data Sources

Python Web Scraping

March 7, 2022, 10:00am

In this workshop, we cover how to extract data from the web using Python. We focus on two approaches to extracting data from the web: leveraging application programming interfaces (APIs) and web scraping.
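As a rough illustration of the two approaches named above, the sketch below pulls the same kind of information from a JSON API response and from an HTML page. It is not taken from the workshop materials: the sample payload, the sample HTML, and the helper names (`titles_from_api`, `titles_from_html`, `TitleParser`) are all hypothetical, and inline strings stand in for real network responses so the example runs offline using only the standard library.

```python
import json
from html.parser import HTMLParser

# Hypothetical API response: structured JSON, as an API would typically return.
API_RESPONSE = '{"results": [{"title": "Python Web Scraping", "date": "2022-03-07"}]}'

# Hypothetical web page: the same kind of data, embedded in HTML markup.
PAGE_HTML = """
<html><body>
  <h2 class="title">Python Web Scraping</h2>
  <h2 class="title">Finding Health Statistics and Data</h2>
  <p class="title-note">not a heading</p>
</body></html>
"""


def titles_from_api(payload):
    """API approach: parse the JSON and read the fields directly."""
    data = json.loads(payload)
    return [item["title"] for item in data["results"]]


class TitleParser(HTMLParser):
    """Scraping approach: walk the HTML and collect text inside <h2 class="title">."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "h2" and ("class", "title") in attrs:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.titles.append(data.strip())


def titles_from_html(html):
    parser = TitleParser()
    parser.feed(html)
    return parser.titles
```

In practice, an API is usually the more stable choice when one exists, because scraping depends on page markup that can change without notice.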

Michael Sholinbeck

Public Health Librarian

Bioscience, Natural Resources & Public Health Library

Michael has worked at the UC Berkeley Library since 2001 and is currently the Public Health Librarian and Liaison to the School of Optometry at the Bioscience, Natural Resources & Public Health Library. Michael coordinates public health instruction at the library and is responsible for the public health collection. Michael has an MLIS from San Jose State University, an MS in Geography from Oregon State University, and a BA in Geography from UC Berkeley. When not at work, he lives out his fantasy of being a rock and roll drummer.

Finding Health Statistics and Data

March 17, 2022, 2:00pm

Participants in this workshop will learn about some of the issues surrounding the collection of health statistics, and will also learn about authoritative sources of health statistics and data. We will look at tools that let you create custom tables of vital statistics (birth, death, etc.), disease statistics, health behavior statistics, and more.

Ian Castro

D-Lab Alumni

School of Information

Ian is a graduate student in the Master of Information Management and Systems program at the School of Information with a focus on applied data science. He earned his B.A. in Media Studies and B.S. in Microbial Biology from UC Berkeley, and his research interests and work experience are in STEM education. He focuses on building courses and academic programs to make data and computing accessible to historically marginalized students and those without prior exposure to the field.

Python Data Wrangling and Manipulation with Pandas

February 15, 2022, 9:00am

Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It supports practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.
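To give a sense of what preparing data for analysis can look like, here is a minimal sketch of a few common cleaning steps. It is not drawn from the workshop itself: the table, its column names (`county`, `cases`), and the `prepare` helper are invented for illustration, and the particular steps shown are only one plausible sequence.

```python
import pandas as pd

# Invented example data with typical problems: inconsistent labels,
# numbers stored as text, a non-numeric placeholder, and a missing key.
raw = pd.DataFrame({
    "county": ["Alameda", "alameda ", "Marin", None],
    "cases": ["10", "5", "n/a", "7"],
})


def prepare(df):
    """Clean the hypothetical table and total cases per county."""
    # Drop rows that have no county label at all.
    out = df.dropna(subset=["county"]).copy()
    # Normalize labels: trim whitespace and standardize capitalization.
    out["county"] = out["county"].str.strip().str.title()
    # Convert text to numbers; values that cannot be parsed become NaN.
    out["cases"] = pd.to_numeric(out["cases"], errors="coerce")
    # Remove rows whose case count could not be parsed.
    out = out.dropna(subset=["cases"])
    # Aggregate to one row per county.
    return out.groupby("county", as_index=False)["cases"].sum()


summary = prepare(raw)
```

Whether to drop unparseable values, as done here, or to impute or flag them instead depends on the analysis; the sketch simply picks the simplest option.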