Data Science

R Fundamentals: Parts 1-4

October 8, 2024, 5:00pm
This workshop is a four-part introductory series that will teach you R from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the open-source RStudio software, understand data and basic manipulations, import and subset data, explore and visualize data, and understand the basics of automation in the form of loops and functions. After completing this workshop, you will have the foundational understanding needed to create, organize, and use workflows for your own research.

Stata Fundamentals: Parts 1-3

October 8, 2024, 2:00pm
This workshop is a three-part introductory series that will teach you Stata from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the Stata software, understand data and basic manipulations, import and subset data, explore and visualize data, and understand the basics of automation in the form of loops and functions. After completing this workshop, you will have the foundational understanding needed to create, organize, and use workflows for your own research.

Causal Thinking in Thermal Comfort

September 17, 2024
by Ruiji Sun. We demonstrate the importance of causal thinking by comparing two linear regression approaches used in thermal comfort research: Approach (a), which regresses thermal sensation votes (y-axis) on indoor temperature (x-axis), and Approach (b), which does the reverse, regressing indoor temperature (y-axis) on thermal sensation votes (x-axis). From a correlational perspective, the two may appear interchangeable, but causal thinking reveals substantial practical differences between them. Using the same data, we found that Approach (b) yields a comfort zone 10 °C narrower than the one conventionally derived using Approach (a). This finding has important implications for occupant comfort and building energy efficiency. We highlight the importance of integrating causal thinking into correlation-based statistical methods, especially given the increasing volume of data in the built environment.
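
Why the two regression directions are not interchangeable can be illustrated with a minimal sketch. It is not the author's analysis: the data are synthetic, and the true slope, noise level, and the ±0.5 vote band used to define a "comfort zone" are all assumptions chosen for illustration. The sketch fits both directions by ordinary least squares and compares the temperature range each one implies.

```python
import random


def ols(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data (assumption): votes rise with temperature, plus noise.
rng = random.Random(0)
temps = [rng.uniform(18.0, 30.0) for _ in range(500)]
votes = [0.3 * (t - 24.0) + rng.gauss(0.0, 1.0) for t in temps]

# Approach (a): regress votes on temperature.
slope_a, intercept_a = ols(temps, votes)
# Approach (b): regress temperature on votes.
slope_b, intercept_b = ols(votes, temps)

# Hypothetical comfort band (assumption): votes between -BAND and +BAND.
BAND = 0.5
width_a = 2 * BAND / slope_a  # invert the fitted line from (a) to get a temperature range
width_b = 2 * BAND * slope_b  # read the temperature range directly from (b)

# The product of the two slopes equals the squared correlation, so it is at most 1.
r_squared = slope_a * slope_b

print(f"Approach (a) comfort-zone width: {width_a:.2f} °C")
print(f"Approach (b) comfort-zone width: {width_b:.2f} °C")
print(f"Ratio (b)/(a) = r^2 = {r_squared:.2f}")
```

Because the ratio of the two widths equals the squared correlation, the zone from Approach (b) is narrower whenever the data are noisy, and the two agree only when temperature and votes are perfectly correlated.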

Python Machine Learning Fundamentals: Parts 1-2

September 30, 2024, 9:00am
This workshop introduces students to scikit-learn, the popular machine learning library in Python, as well as the auto-ML library built on top of scikit-learn, TPOT. The focus will be on scikit-learn syntax and available tools to apply machine learning algorithms to datasets. No theory instruction will be provided.

Python Data Wrangling and Manipulation with Pandas

September 27, 2024, 9:00am
Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It lets you do practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.

Qualtrics Fundamentals

September 26, 2024, 3:00pm
Qualtrics is a powerful online tool available to Berkeley community members that can be used for a range of data collection activities. Primarily, Qualtrics is designed to make web surveys easy to write, test, and implement, but the software can be used for data entry, training, quality control, evaluation, market research, pre/post-event feedback, and other uses with some creativity.

R Fundamentals: Parts 1-4

September 17, 2024, 9:00am
This workshop is a four-part introductory series that will teach you R from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the open-source RStudio software, understand data and basic manipulations, import and subset data, explore and visualize data, and understand the basics of automation in the form of loops and functions. After completing this workshop, you will have the foundational understanding needed to create, organize, and use workflows for your own research.

Python GPT Fundamentals

September 26, 2024, 2:00pm
This workshop offers a general introduction to GPT (Generative Pre-trained Transformer) models. No technical background is required. We will explore the transformer architecture upon which GPT models are built, how transformer models encode natural language into embeddings, and how GPT predicts text.

Python Fundamentals: Parts 4-6

October 1, 2024, 2:00pm
This three-part interactive workshop series teaches intermediate Python programming to people with previous programming experience equivalent to our Python Fundamentals workshop. By the end of the series, you will be able to apply your knowledge of basic principles of programming and data manipulation to a real-world social science application.

Causal Inference in International Political Economy: Hurdles and Advancements

September 9, 2024
by Yue Lin. What are the key challenges and opportunities of applying experiments in International Political Economy (IPE) research? In this blog post, I review an enduring methodological battle between statistics and experiments, and point out that the difficulties of randomization and of locating credible counterfactuals have been the main hurdles keeping IPE scholars from widely adopting experimental tools. I then describe recent progress in applying survey, field, and lab experiments in IPE scholarship. I conclude that it is crucial for future researchers to think innovatively about how to combine different research methods to make causal claims in IPE studies.