Python Data Wrangling and Manipulation with Pandas

February 15, 2022, 9:00am to 12:00pm

Trying to register, but not affiliated with the UCB campus? If you are from Berkeley Lab (LBL), UCSF, or CZ Biohub, please register via our partner portals here.

If you are from the UCB campus, there is no longer a waitlist. After registering above, please fill out the affiliations form if you have not done so at least once before: https://dlab.berkeley.edu/affiliations

Location: Remote via Zoom. Link will be sent on the morning of the event.

Date & Time: This workshop runs from 9am-12pm on Tuesday, February 15.

Start Time: D-Lab workshops start 10 minutes after the scheduled start time (“Berkeley Time”). We will admit all participants from the waiting room at that time.

Description

Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It supports practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the steps you might need to prepare data for analysis.

We will cover: 

  • Pandas data structures

  • Loading data

  • Subsetting and filtering

  • Calculating summary statistics

  • Dealing with missing values

  • Merging data sets

  • Creating new variables

  • Basic plotting

  • Exporting data
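For a rough sense of what these steps look like in code, here is a minimal sketch. It is not taken from the workshop materials: the city and country tables, the column names, and the variable names are all hypothetical and exist only for illustration. Plotting is left out so the example needs nothing beyond pandas itself.

```python
import io

import pandas as pd

# Hypothetical example data, invented for illustration only.
CITIES_CSV = """city,country,population_millions
Lagos,Nigeria,15.4
Lima,Peru,
Oslo,Norway,0.7
Kano,Nigeria,4.1
"""
COUNTRIES_CSV = """country,continent
Nigeria,Africa
Peru,South America
Norway,Europe
"""

# Loading data: read_csv accepts a file path or any file-like object.
cities = pd.read_csv(io.StringIO(CITIES_CSV))
countries = pd.read_csv(io.StringIO(COUNTRIES_CSV))

# Data structures: a DataFrame is a table; each of its columns is a Series.
population = cities["population_millions"]

# Dealing with missing values: count them, then drop the incomplete rows.
n_missing = int(population.isna().sum())
complete = cities.dropna(subset=["population_millions"])

# Subsetting and filtering: keep rows where a boolean condition holds.
large = complete[complete["population_millions"] > 1.0]

# Calculating summary statistics on a single column.
mean_population = complete["population_millions"].mean()

# Merging data sets: attach each city's continent via the shared key.
merged = complete.merge(countries, on="country", how="left")

# Creating new variables: derive a column from an existing one.
merged = merged.assign(population=merged["population_millions"] * 1_000_000)

# Exporting data: to_csv writes to a path or a file-like object.
buffer = io.StringIO()
merged.to_csv(buffer, index=False)
exported = buffer.getvalue()
```

With real data, the `io.StringIO` objects would typically be replaced by file paths passed to `read_csv` and `to_csv`.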

Prerequisites: D-Lab’s Python Fundamentals introductory series or equivalent knowledge.

GitHub Repository: https://github.com/dlab-berkeley/introduction-to-pandas

Software Requirements: Installation Instructions for Python Anaconda

Is Python not working on your laptop?

Attend the workshop anyway. We can provide you with a cloud-based solution until you resolve the problems with your local installation.

Feedback: After completing the workshop, please give us your feedback using this form.

Questions? Email: dlab-frontdesk@berkeley.edu