Data Sources

Python Web Scraping

March 7, 2022, 10:00am
In this workshop, we cover how to extract data from the web using Python. We focus on two approaches to extracting data from the web: leveraging application programming interfaces (APIs) and web scraping.

Michael Sholinbeck

Public Health Librarian
Bioscience, Natural Resources & Public Health Library

Michael has worked at the UC Berkeley Library since 2001, and is currently the Public Health Librarian and Liaison to the School of Optometry at the Bioscience, Natural Resources & Public Health Library. Michael coordinates public health instruction at the library, and is responsible for the public health collection. Michael has an MLIS from San Jose State University, an MS in Geography from Oregon State University, and a BA in Geography from UC Berkeley. When not at work, he lives out his fantasy of being a rock and roll drummer.

Finding Health Statistics and Data

March 17, 2022, 2:00pm
Participants in this workshop will learn about some of the issues surrounding the collection of health statistics, and will also learn about authoritative sources of health statistics and data. We will look at tools that let you create custom tables of vital statistics (birth, death, etc.), disease statistics, health behavior statistics, and more.

Ian Castro

D-Lab Alumni
School of Information

Ian is a graduate student in the Master of Information Management and Systems program at the School of Information with a focus on applied data science. He earned his B.A. in Media Studies and B.S. in Microbial Biology from UC Berkeley, and his research interests and work experience are in STEM education. He focuses on building courses and academic programs to make data and computing accessible to historically marginalized students and those without prior exposure to the field.

Python Data Wrangling and Manipulation with Pandas

February 15, 2022, 9:00am
Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It enables practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.

Python Data Wrangling and Manipulation with Pandas

November 15, 2021, 2:00pm
Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It enables practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.

Finding Health Statistics and Data

October 21, 2021, 11:00am
Participants in this workshop will learn about some of the issues surrounding the collection of health statistics, and will also learn about authoritative sources of health statistics and data. We will look at tools that let you create custom tables of vital statistics (birth, death, etc.), disease statistics, health behavior statistics, and more.

Python Data Wrangling and Manipulation with Pandas

October 19, 2021, 10:00am
Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It enables practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.

Python Data Wrangling and Manipulation with Pandas

November 1, 2021, 12:00pm
Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It enables practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.
See event details for participation information.