Data Science

R Fundamentals: Parts 1-4

October 18, 2022, 1:00pm
This workshop is a four-part introductory series that will teach you R from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the open-source RStudio software, understand data and basic manipulations, import and subset data, explore and visualize data, and understand the basics of automation in the form of loops and functions. After completing this workshop, you will have the foundational understanding needed to create, organize, and use workflows for your own research.

Python Data Wrangling and Manipulation with Pandas

November 15, 2023, 9:00am
Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It supports practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.
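As a rough illustration of the kind of preparation steps this description refers to, here is a minimal sketch in pandas. The table, its column names, and its values are made up for the example and are not taken from the workshop materials.

```python
import pandas as pd

# Hypothetical raw data: stray whitespace, a duplicate row,
# a missing value, and numbers stored as strings.
raw = pd.DataFrame({
    "name": [" Ada ", "Grace", "Alan", "Grace"],
    "dept": ["math", "cs", None, "cs"],
    "score": ["91", "88", "79", "88"],
})

clean = (
    raw.drop_duplicates()                          # remove the repeated row
       .assign(
           name=lambda d: d["name"].str.strip(),   # trim whitespace
           score=lambda d: pd.to_numeric(d["score"]),  # strings -> numbers
       )
       .dropna(subset=["dept"])                    # drop rows with no department
       .reset_index(drop=True)
)

# Summarize the cleaned data: mean score per department.
mean_by_dept = clean.groupby("dept")["score"].mean()
print(clean)
print(mean_by_dept)
```

Chaining the steps keeps each transformation visible on its own line, which makes it easier to add or remove a step while exploring a dataset.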

Python Text Analysis Fundamentals: Parts 1-2

November 1, 2022, 12:00pm
This two-part workshop series will prepare participants to move forward with research that uses text analysis, with a special focus on humanities and social science applications.

R Fundamentals: Parts 1-4

May 1, 2023, 10:00am
This workshop is a four-part introductory series that will teach you R from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the open-source RStudio software, understand data and basic manipulations, import and subset data, explore and visualize data, and understand the basics of automation in the form of loops and functions. After completing this workshop, you will have the foundational understanding needed to create, organize, and use workflows for your own research.

Python Introduction to Machine Learning: Parts 1-2

February 7, 2022, 10:00am
This two-part workshop introduces students to scikit-learn, the popular machine learning library in Python, as well as the auto-ML library built on top of scikit-learn, TPOT. The focus will be on scikit-learn syntax and available tools to apply machine learning algorithms to datasets. No theory instruction will be provided.

Python Web APIs

February 8, 2024, 10:00am
In this workshop, we cover how to extract data from the web with APIs using Python. APIs are often official services offered by companies and other entities, which allow you to directly query their servers in order to retrieve their data. Platforms like The New York Times, Twitter and Reddit offer APIs to retrieve data.
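To make the idea of querying a server concrete, here is a minimal sketch of the two halves of a typical API call: building a request URL from query parameters and pulling fields out of a JSON response. The endpoint `https://api.example.com/v1/articles`, the parameter names, and the response layout are all hypothetical; real services document their own URLs and usually require an API key. The sketch uses a canned response string so that it runs without network access.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.com/v1/articles"


def build_request_url(base_url, params):
    """Attach URL-encoded query parameters to a base URL."""
    return f"{base_url}?{urlencode(params)}"


def extract_titles(response_text):
    """Parse a JSON response body and collect the 'title' of each result."""
    payload = json.loads(response_text)
    return [item["title"] for item in payload.get("results", [])]


url = build_request_url(BASE_URL, {"q": "data science", "page": 1})

# Stand-in for the text a server might return (assumed layout).
sample_response = '{"results": [{"title": "First"}, {"title": "Second"}]}'
titles = extract_titles(sample_response)

print(url)
print(titles)
```

In practice the response text would come from an HTTP client rather than a hard-coded string; the parsing step stays the same.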

Democratizing Our Data

August 26, 2021, 10:00am
There is enormous interest in building a better understanding of how evidence and data can inform policy. New possibilities have opened up to enable data to be shared and used across states and agencies. One is a technical approach – the Administrative Data Research Facility – which provides a secure environment within which education, training, and workforce data can be shared across agencies and states. The other is human – the Applied Data Analytics training program – which trains government agency staff to combine and use the data to serve their agency missions. More than 650 participants from over 150 agencies have taken part, producing new products and new networks in the process. This presentation discusses the approach sponsored by the California Department of Social Services, jointly with the Department of Education and the Economic Development Department. The D-Lab worked with the Coleridge Initiative to successfully combine the two approaches. The presentation will also address the broader vision of how approaches like this can serve to democratize data for the United States.

Python Text Analysis Fundamentals: Parts 1-3

September 21, 2021, 10:00am
This three-part workshop series will prepare participants to move forward with research that uses text analysis, with a special focus on humanities and social science applications.

Qualtrics Fundamentals

April 21, 2022, 2:00pm
Qualtrics is a powerful online tool available to Berkeley community members that can be used for a range of data collection activities. Primarily, Qualtrics is designed to make web surveys easy to write, test, and implement, but the software can be used for data entry, training, quality control, evaluation, market research, pre/post-event feedback, and other uses with some creativity.

GPT Fundamentals

April 17, 2024, 3:00pm
This workshop offers a general introduction to GPT (Generative Pre-trained Transformer) models. We will explore how they reflect and shape our cultural narratives and social interactions, and what drawbacks and constraints they have.