Data Science

Using Forest Plots to Report Regression Estimates: A Useful Data Visualization Technique

October 17, 2023
by Sharon Green. Regression models help us understand relationships between two or more variables. In many cases, results are summarized in tables that present coefficients, standard errors, and p-values. Reading these can be a slog. Figures such as forest plots can help us communicate results more effectively and may lead to a better understanding of the data. This blog post is a tutorial on two different approaches to creating high-quality and reproducible forest plots, one using ggplot2 and one using the forestplot package.
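As a rough illustration of what a forest plot encodes, here is a minimal Python sketch using only the standard library. It is not the tutorial's ggplot2 or forestplot code. The term names, coefficients, and standard errors are invented for the example, and the 1.96 multiplier assumes a normal approximation for a 95% interval.

```python
# Hypothetical regression output: (term, coefficient, standard error).
# These numbers are made up for illustration only.
estimates = [
    ("age", 0.30, 0.10),
    ("income", -0.20, 0.15),
    ("education", 0.50, 0.05),
]

Z_95 = 1.96  # assumed normal critical value for an approximate 95% interval


def confidence_interval(coef, se, z=Z_95):
    """Return the (lower, upper) bounds of coef +/- z * se."""
    return coef - z * se, coef + z * se


def forest_rows(rows, lo=-1.0, hi=1.0, width=41):
    """Render each estimate as one text row of a forest plot.

    '-' spans the interval, '|' marks zero, and 'o' marks the point estimate.
    """
    def col(x):
        # Map a value in [lo, hi] to a column index, clamping at the edges.
        x = min(max(x, lo), hi)
        return round((x - lo) / (hi - lo) * (width - 1))

    lines = []
    for term, coef, se in rows:
        lower, upper = confidence_interval(coef, se)
        cells = [" "] * width
        for i in range(col(lower), col(upper) + 1):
            cells[i] = "-"
        cells[col(0.0)] = "|"   # reference line at zero (no effect)
        cells[col(coef)] = "o"  # point estimate drawn last so it stays visible
        bar = "".join(cells)
        lines.append(f"{term:<10}{bar}  {coef:+.2f} [{lower:+.2f}, {upper:+.2f}]")
    return lines


for line in forest_rows(estimates):
    print(line)
```

Each row puts the estimate and its interval on a shared axis, so a reader can see at a glance which intervals cross zero. A plotting library draws the same quantities graphically.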

Americanist Linguistics: on Ethics and Intent

October 17, 2023
by Anna Björklund. In this post, Anna Björklund investigates the origin of the linguistic study of indigenous American languages, its inextricable ties to settler-colonialism, and how linguistics can move forward as a field.

Python Intermediate: Parts 1-3

October 9, 2023, 1:00pm
This three-part interactive workshop series teaches intermediate Python programming to people with previous programming experience equivalent to our Python Fundamentals workshop. By the end of the series, you will be able to apply your knowledge of basic principles of programming and data manipulation to a real-world social science application.

FSRDC 2023 Annual Meeting and Research Conference

October 2, 2023
by Renee Starowicz. Renee Starowicz, Co-Executive Director of the Berkeley Federal Statistical Research Data Center, provides an overview of the takeaways from the 2023 Annual Federal Statistical Research Data Center Business Meeting and Annual Conference. She provides a brief overview of the Berkeley FSRDC. Then, she describes the priorities for collaboration across national directors to improve outreach to diverse researchers and transparency. Additionally, she points out the other key topics of conversation at this year’s meeting.

ADOPTING DATA SCIENCE CURRICULA: A STUDENT CENTRIC EVALUATION

Susan Wang
Vandana Janeja
David Harding, Ph.D.
Claudia von Vacano, Ph.D.
Daniel Lobo
2023

With the advent of data science as a new discipline with high demand for a skilled workforce, educators are increasingly recognizing the value of translating courses and programs that have been shown to be successful and sharing lessons learned in increasing diversity in data science education. In this paper, we describe and analyze our experiences translating a lower-division data science curriculum from one university, University of California, Berkeley’s Data8 course, to other settings with very different student populations and institutional contexts at University of Maryland,...

Critical Faculty and Peer Instructor Development: Core Components for Building Inclusive STEM Programs in Higher Education

Claudia von Vacano, Ph.D.
Michael Ruiz
Renee Starowicz, Ph.D.
Seyi Olojo
Arlyn Y. Moreno Luna
Evan Muzzall, Ph.D.
Rodolfo Mendoza-Denton, Ph.D.
David Harding, Ph.D.
2022

First-generation college students and those from ethnic groups such as African Americans, Latinx, Native Americans, or Indigenous Peoples in the United States are less likely to pursue STEM-related professions. How might we develop conceptual and methodological approaches to understand instructional differences between various undergraduate STEM programs that contribute to racial and social class disparities in psychological indicators of academic success such as learning orientations and engagement? Within social psychology, research has focused mainly on student-level mechanisms...

Python Data Wrangling and Manipulation with Pandas

October 10, 2023, 10:00am
Pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with 'relational' or 'labeled' data both easy and intuitive. It supports practical, real-world data analysis in Python. In this workshop, we'll work with example data and go through the various steps you might need to prepare data for analysis.

Stata Fundamentals: Parts 1-3

October 10, 2023, 2:00pm
This workshop is a three-part introductory series that will teach you Stata from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the Stata software, understand data and basic manipulations, import and subset data, explore and visualize data, and understand the basics of automation in the form of loops and functions. After completing this workshop, you will have the foundational understanding needed to create, organize, and use workflows for your own research.

Python Text Analysis: Topic Modeling

October 16, 2023, 2:00pm
In this part, we study unsupervised learning of text data. This is a stand-alone workshop that builds on the two-part text analysis series.