Machine Learning

Need help with Machine Learning?

Visit Drop-in Hours or Schedule a Consultation.

Below are the consultants we have available with expertise in Machine Learning and related areas.

Python Introduction to Machine Learning: Parts 1-2

September 27, 2021, 2:00pm
This workshop introduces students to scikit-learn, the popular machine learning library in Python, as well as the auto-ML library built on top of scikit-learn, TPOT. The focus will be on scikit-learn syntax and available tools to apply machine learning algorithms to datasets. No theory instruction will be provided.

See event details for participation information.

Jordan Weiss, Ph.D.

Data Science Fellow
Demography

Jordan Weiss is a demographer who studies population health and inequality. A central theme of his work concerns the integration of theories across multiple disciplines with advances in statistical and computational science to inform research design and translate findings into actionable, policy-relevant information.

Jordan earned his Ph.D. in Demography and Sociology from the University of Pennsylvania where he also earned an MA in Statistics. He is currently a Postdoctoral Scholar in the Department of Demography and Data Science Fellow at the University of California,
...

Ilya Akdemir

Data Science Fellow
School of Law

Ilya is a JSD candidate at UC Berkeley School of Law. His research focuses on natural language processing and machine learning applications that are motivated by both theoretical and practical questions in the legal domain.

Amanda Glazer

Instructor
Statistics

Amanda is a PhD candidate in the statistics department at Berkeley. Her research focuses on causal inference with applications in education, political science, and sports. Previously, she earned her Bachelor’s degree in mathematics and statistics, with a secondary in computer science, from Harvard.

What to do about Fairness in Machine Learning?

April 7, 2020

How many thousands of machine learning applications have been developed and gone to market in recent years? Feeding vast amounts of data into software to make decisions for us is a social paradigm the 21st century is embracing to the fullest.

I’m a graduate student of public health, but I have a long history as a social worker and as a student of psychology, literature, and the human condition. Since early childhood, I have always been a science fiction fanatic: human and societal relationships with technology have fascinated me to the core for as long as I can remember.

...

Machine Learning in Atmospheric Science

April 20, 2021

The atmosphere is an incredibly complex (and fascinating, I would add!) chemical-physical system. Imagine it as a big mixture of gaseous molecules and liquid and solid particles, the latter commonly referred to as aerosol particles or particulate matter (PM). The chemical compounds in the air you are breathing at any given moment number in the thousands, and they can be either inorganic (like sea salt, dust, or volcanic ash) or organic (i.e., containing carbon and coming from sources such as engine exhausts, forests, scented candles, and oils secreted from your own skin). In...

Machine Learning in Poverty Measurement

May 11, 2021
According to the United Nations Sustainable Development Goals (SDGs), the first goal is to "end poverty in all its forms everywhere". However, poverty is commonly measured with census data or large-sample surveys, which are costly to conduct. The cost of conducting such surveys is even higher in low-income areas due to scarce infrastructure (Blumenstock, 2016; Jean et al., 2016; Perez et al., 2019; McBride & Nichols, 2015). As technology develops, scholars and researchers have begun to apply new techniques and massive machine-generated data sources to measure poverty. In this blog, I discuss three general trends in machine learning for poverty measurement and some concerns about current applications.

Brooks Jessup, Ph.D.

Data Science Fellow
History

Brooks received his Ph.D. in History from UC Berkeley and was trained in Data Science at General Assembly. His work applies digital tools and methods to the study of modern cities and urban issues. At D-Lab, he teaches and consults on data analytics, machine learning, geospatial analysis, and natural language processing with Python and SQL.

Handling Missing Data

May 4, 2021

I recently started working with a set of eviction data for a project on housing precarity at the Urban Displacement Project. As I began exploring the dataset, I was excited to find that it appeared to contain a wealth of historical data we could use to train a robust model for predicting eviction rates in urban neighborhoods. However, my initial excitement soon had to be scaled back when a standard check for missing data revealed that many of the observations lacked values for precisely the variable we aimed to predict. I was now faced with the problem of what to do about this...