Data Science

R Bootcamp: Fall 2021

August 21, 2021, 8:30am
The workshop will be an intensive two-day introduction to R using RStudio. After the first morning session, the workshop will (staffing permitting) be split into two separate tracks. Co-sponsored by the UC Berkeley Statistics Department and the D-Lab.
See event details for participation information.

Python Fundamentals: Parts 1-4

August 19, 2021, 1:00pm
This four-part, interactive workshop series is a complete introduction to programming in Python for people with little or no previous programming experience. By the end of the series, you will be able to apply your knowledge of basic principles of programming and data manipulation to a real-world social science application.
Registration is unavailable.

R Fundamentals: Parts 1-4

August 19, 2021, 9:30am
Data are the foundations of the social and biological sciences and humanities. Familiarizing yourself with a programming language can help you better understand the roles that data play in your field. This workshop will teach you to use base R to build a programming vocabulary to develop and train your data skills! The D-Lab's R Fundamentals workshop is a four-part introductory series that will teach you R from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the open-source RStudio software, understand data and basic manipulations, import and subset data, explore and visualize data, and understand the basics of automation in the form of loops and functions. After completing this workshop, you will have the foundational understanding needed to create, organize, and use workflows for your personal research.
Registration is unavailable.

Project HOME: Modeling and Mapping Eviction Rates in California

August 18, 2021

Six months ago, the D-Lab community made possible a connection between the UC Berkeley School of Information, D-Lab Data Science Fellows, and the Urban Displacement Project (UDP). A summer of brainstorming, collaboration, and multiple Zoom sessions later, the team at Project HOME is excited to present our 5th Year Master of Information and Data...

Amanda Glazer

Instructor
Statistics

Amanda is a PhD candidate in the statistics department at Berkeley. Her research focuses on causal inference with applications in education, political science and sports. Previously she earned her Bachelor’s degree in mathematics and statistics, with a secondary in computer science, from Harvard.

Julia Lane, Ph.D.

Guest Speaker
Professor at the NYU Wagner Graduate School of Public Service
Professor at the NYU Center for Urban Science and Progress
NYU Provostial Fellow for Innovation Analytics

Julia Lane is a Professor at the NYU Wagner Graduate School of Public Service, at the NYU Center for Urban Science and Progress, and an NYU Provostial Fellow for Innovation Analytics. She cofounded the Coleridge Initiative, whose goal is to use data to transform the way governments access and use data for the social good through training programs, research projects and a secure data facility. The approach is attracting national attention, including the ...

What to do about Fairness in Machine Learning?

April 7, 2020

How many thousands of machine learning applications have been developed and gone to market in recent years? Feeding vast amounts of data into software to make decisions for us is a social paradigm the 21st century is embracing to the fullest.

I’m a graduate student of public health, but have a long history as a social worker, student of psychology, literature and the human condition. Since early childhood, one thing I have always been is a science fiction fanatic: human and societal relationships with technology have fascinated me to the core since before I can remember.

...

Democratizing Our Data

August 26, 2021, 10:00am
There is enormous interest in building a better understanding of how evidence and data can inform policy. New possibilities have opened up to enable data to be shared and used across states and agencies. One is a technical approach – the Administrative Data Research Facility – which provides a secure environment within which education, training, and workforce data can be shared across agencies and states. The other is human – the Applied Data Analytics training program – which trains government agency staff to combine and use the data to serve their agency missions. More than 650 people from over 150 agencies have participated and produced new products and new networks in the process. This presentation discusses the approach sponsored by the California Department of Social Services, jointly with the Department of Education and the Economic Development Department. The D-Lab worked with the Coleridge Initiative to successfully combine the two approaches. The presentation will also address the broader vision of how approaches like this can serve to democratize data for the United States.
See event details for participation information.

Handling Missing Data

May 4, 2021

I recently started working with a set of eviction data for a project on housing precarity at the Urban Displacement Project. As I began exploring the dataset, I was excited to find that it appeared to contain a wealth of historical data we could use to train a robust model for predicting eviction rates in urban neighborhoods. However, my initial excitement soon had to be scaled back when a standard check for missing data revealed that many of the observations lacked values for precisely the variable we aimed to predict. I was now faced with the problem of what to do about this...
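As a rough illustration of the kind of missing-data check described above, here is a minimal Python sketch using only the standard library. It is not the author's code or dataset: the records, the field names (`tract`, `median_rent`, `eviction_rate`), and the helper functions `missing_counts` and `missing_share` are all hypothetical, and missing values are assumed to be stored as `None`.

```python
from typing import Any, Dict, List

# Hypothetical toy records; field names and values are invented for illustration.
records: List[Dict[str, Any]] = [
    {"tract": "A", "median_rent": 1800, "eviction_rate": 2.1},
    {"tract": "B", "median_rent": 2100, "eviction_rate": None},
    {"tract": "C", "median_rent": None, "eviction_rate": None},
    {"tract": "D", "median_rent": 1650, "eviction_rate": 3.4},
]


def missing_counts(rows: List[Dict[str, Any]]) -> Dict[str, int]:
    """Count, per column, how many rows have None for that column."""
    counts: Dict[str, int] = {}
    for row in rows:
        for key, value in row.items():
            counts.setdefault(key, 0)
            if value is None:
                counts[key] += 1
    return counts


def missing_share(rows: List[Dict[str, Any]], column: str) -> float:
    """Return the fraction of rows where `column` is None or absent."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


counts = missing_counts(records)
share = missing_share(records, "eviction_rate")

print(counts)
print(f"{share:.0%} of rows lack a value for the target variable")
```

Running a check like this before any modeling shows how much of the target variable is actually usable, which is the situation the post goes on to discuss.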