Cloud SQL Databases for Social Media Data

December 10, 2024, 10:00am to 12:00pm

REGISTRATION NOTES

Click Register, then use your @berkeley.edu or @lbl.gov email address.
If you have trouble, you may need to log out of Zoom and log back in.
For help, see: https://dlab.berkeley.edu/zoom-troubleshooting-tips

Location: Remote via Zoom. Link will be sent on the morning of the event.

Recordings: This D-Lab workshop will be recorded and made available to UC Berkeley participants for a limited time. Your registration for the event indicates your consent to having any images, comments and chat messages included as part of the video recording materials that are made available.

Start Time: D-Lab workshops start 10 minutes after the scheduled start time (“Berkeley Time”). We will admit all participants from the waiting room at that time.

Date & Time: Tue Dec 10, 2024, 10:00am to 12:00pm

Description

This is a hands-on workshop on analyzing social media data with cloud databases, specifically Google Cloud Platform's BigQuery. In this session, you'll learn how to use existing Reddit and other publicly available datasets in the cloud, import additional data, and perform analyses relevant to social science research. By the end of this workshop, you will:

• Understand the basics of Google Cloud Platform (GCP) and BigQuery.

• Explore and query public datasets on BigQuery, focusing on Reddit and Wikipedia data.

• Write complex SQL queries to extract insights from large datasets.

• Cross-reference Reddit and Wikipedia data with other public datasets.

• Import external data (e.g., data available through the UC Berkeley Library) into BigQuery.

• Use Python and PRAW (the Python Reddit API Wrapper) to collect recent Reddit data and import it into BigQuery.

• Develop skills in data analysis relevant to computational social science.

Questions? Email: dlab-frontdesk@berkeley.edu