Data Science

Python Machine Learning for Data Science Discovery

March 22, 2023, 7:00pm
An overview of machine learning, covering linear regression, logistic regression (classification), and data preprocessing. The workshop consists of a live coding demo and a live question-and-answer session.

Finley Golightly

UTech Management
Applied Mathematics

Finley is a fourth-year undergraduate student studying Applied Math at UC Berkeley. They are interested in entering a career in Data Science once they complete their Bachelor's degree.

They have been with D-Lab since Fall 2020 and are currently part of the UTech Management team. They love the learning environment of D-Lab and their favorite part of the job is their co-workers! In their free time, they enjoy reading, boxing, listening to music, and playing Dungeons & Dragons. Feel free to stop by the front desk to ask them any questions or just to chat...

Can Machine Learning Models Predict Reality TV Winners? The Case of Survivor

March 14, 2023

Since its creation in 2000, the show Survivor has dropped approximately 20 contestants in a remote location and forced them to hunt for their own food and build their own shelter. While the show is a test of physical endurance, its social dynamics and strategy are what make it stand out from other reality television shows. In this post, I lay out the rules of the show and...

Python Web APIs

March 14, 2023, 2:00pm
In this workshop, we cover how to extract data from the web with APIs using Python. APIs are often official services offered by companies and other entities, which allow you to directly query their servers in order to retrieve their data. Platforms like The New York Times, Twitter and Reddit offer APIs to retrieve data.

Data Science for Social Justice Workshop 2023

March 1, 2023, 12:00pm
This 8-week online workshop is open to currently enrolled UC Berkeley graduate students. You will learn the essential tools and methods of data science analysis and be introduced to critical frameworks that will enable you to create a project of your own design and tell stories that counter the market-first mentality of data science.
See event details for participation information.

Twitter Text Analysis: A Friendly Introduction, Part 2

March 7, 2023

The code for this blog post is available in this GitHub repository. You can also follow along in this ...

Twitter Text Analysis: A Friendly Introduction

October 25, 2022

Read part 2 here.

Introduction

Text analysis techniques, including sentiment analysis, topic modeling, and named entity recognition, are increasingly used to probe patterns in a variety of text-based documents, such as books and social media posts. This blog post introduces Twitter text analysis, but is not intended to cover all of the aforementioned topics. The tutorial is broken down into two parts. In this very first post, I...

Python Fundamentals Pilot: Parts 1-3

February 28, 2023, 2:00pm
This three-part interactive workshop series is a complete introduction to programming in Python for people with little or no previous programming experience. By the end of the series, you will be able to apply your knowledge of basic principles of programming and data manipulation to a real-world social science application.

R Fundamentals: Parts 1-4

February 27, 2023, 9:00am
This workshop is a four-part introductory series that will teach you R from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the open-source RStudio software, understand data and basic manipulations, import and subset data, explore and visualize data, and understand the basics of automation in the form of loops and functions. After completing this workshop, you will have the foundational understanding needed to create, organize, and use workflows for your own research.

Python Text Analysis: Topic Modeling

March 29, 2023, 2:00pm
In this part, we study unsupervised learning of text data. This is a standalone workshop that builds on the two-part text analysis series.