Data Science

Bo Yun Park, Ph.D.

Postdoc
D-Lab

I am a Postdoctoral Scholar in the D-Lab at the University of California, Berkeley. My research lies at the intersection of political, cultural, and transnational sociology. I am particularly interested in dynamics of social inclusion and exclusion, social change, technology, and digital politics. My dissertation investigated how political strategists in France and the United States craft narratives of political leadership for presidential candidates in the digital age. I received my Ph.D. in Sociology at Harvard University, where I was affiliated with the Institute for Quantitative Social...

Ella Belfer

Consultant
Energy and Resources Group

Ella is a PhD student in the Energy and Resources Group. Her research examines water governance in a changing climate, drawing on geospatial techniques. Her past work includes applications of topic modelling in climate change adaptation research and inductive coding of semi-structured interviews.

Alex Bruefach

Discovery Graduate Fellow
Materials Science and Engineering

Alex is a PhD Candidate in materials science and engineering developing image processing and machine learning techniques for extracting information from electron microscopy datasets. Her primary focus is understanding what information is transferred from various feature representations of images. She has extensive experience collaborating across boundaries and is passionate about brainstorming innovative approaches to challenging data science problems!

Shusheng Li

UTech Management
Data Science
Economics

Shusheng is a fourth-year undergraduate student studying Data Science and Economics. He is currently part of the UTech Management team at D-Lab. Shusheng loves playing all types of sports because it's a great way to stay fit and spend time with friends. Working at the UTech front desk, Shusheng loves helping others and directing them to the right resources.

Josh Everts

School of Information

I'm a Master's student at the Berkeley School of Information in the MIMS program, studying Data Science. I am especially interested in applying the statistical and computational methods of Data Science to problems within the natural sciences and transportation. To this end I am currently helping with the data analysis of a spectroscopy experiment at SLAC National Lab. Outside of academic work I enjoy improving my cooking skills, biking, and learning about history and geography.

Bobo Kwok

UTech
Data Science

I am an undergraduate student studying Data Science with an emphasis in Applied Mathematics & Modeling. I enjoy storytelling through data visuals and learning new visualization tools.

What is MLOps? An Introduction to the World of Machine Learning Operations

May 10, 2022

More than ever, AI and machine learning (ML) are integral parts of our lives and are tightly coupled with the majority of the products we use on a daily basis. We use AI/ML in almost everything we can think of, from advertising to social media to simply going about our daily lives! With these tools and models so prevalent, it is essential that ML systems follow the path that IT systems and software took in the early 2000s, when their development, maintainability, and reliability became a disciplined practice. The field focused on developing such practices is currently loosely defined under many different titles (e.g., machine learning engineering, applied data science), but is most commonly known as MLOps, or Machine Learning Operations.

Scrollytelling through a look at food prices around the world

May 2, 2022

You have gathered the needed data to support your research, check. You have made some hypotheses about what you hope to conclude, check. You have spent time cleaning the data and organizing it in a manner that permits further exploration, check. You have sliced and diced the data with your favorite data exploration software packages or techniques and created some data visualizations that you feel confident about, quadruple check! You are now armed with insights that you hope to showcase to the world. What’s next? In this article, I would like to share some tips for creating a...

Excel Fundamentals: Lookups with INDEX-MATCH-MATCH

April 18, 2022

Last week marked the D-Lab’s inaugural “Excel Fundamentals” workshop, and to celebrate I am sharing one of my favorite Excel functions: INDEX-MATCH-MATCH. By combining the INDEX and MATCH functions, we can create a faster and more flexible lookup than the typical approach with VLOOKUP.

First, let’s explore the INDEX function and its three arguments: INDEX(where, down, across). It returns the value of a single cell within a block of data. It knows which cell we are...
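To make the idea concrete, here is a minimal Python sketch of the lookup pattern described above. It is an illustration only, not Excel's implementation: the `index` and `match` helpers, the table contents, and the labels are all hypothetical, and `match` assumes exact matching (as with a third MATCH argument of 0). In a spreadsheet with the same made-up layout, the equivalent formula might look like `=INDEX(B2:D4, MATCH("Bananas", A2:A4, 0), MATCH("Mar", B1:D1, 0))`.

```python
def index(where, down, across):
    """Mimic INDEX(where, down, across): return one cell from a block.

    Positions are 1-based, as in Excel.
    """
    return where[down - 1][across - 1]


def match(lookup_value, lookup_array):
    """Mimic an exact-match MATCH: 1-based position of the first match."""
    return lookup_array.index(lookup_value) + 1


# Hypothetical data: row labels, column labels, and the block of values.
row_labels = ["Apples", "Bananas", "Cherries"]
col_labels = ["Jan", "Feb", "Mar"]
prices = [
    [1.20, 1.25, 1.30],
    [0.50, 0.55, 0.60],
    [3.00, 3.10, 3.20],
]


def index_match_match(row_key, col_key):
    """Find the row with one MATCH, the column with another, then INDEX."""
    down = match(row_key, row_labels)
    across = match(col_key, col_labels)
    return index(prices, down, across)


print(index_match_match("Bananas", "Mar"))  # 0.6
```

Because each MATCH finds its position by label rather than by a hard-coded column number, the lookup keeps working if rows or columns are reordered, which is one reason the pattern is often described as more flexible than VLOOKUP.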

dbplyr: do we still need to learn SQL to create and manage databases?

April 11, 2022

How do we deal with datasets that are larger than our computer’s memory? Do we still need to learn Structured Query Language (SQL) to create and manage a database?

As a novice data analyst, one of my first major challenges was to build and manage a spatial database using PostGIS, open-source software that adds geographic support to PostgreSQL relational databases. I was given several text files on a hard drive that weighed approximately 10 GB each! My first reaction was to double click on the first text file that I saw… but this was clearly...