In a previous blog post I wrote about an exciting project on urban transportation that I’ve had the opportunity to work on with some colleagues. This time I continue with the topic from another perspective, focusing on a project that made waves not long ago thanks to its large-scale gathering (and sharing!) of transportation-related emissions data.




As mentioned in my previous blog post, the transportation sector plays a significant role in the emission of urban air pollutants. Degraded air quality is known to be a silent killer, with more than 7 million deaths globally attributed to the effects of fine particulate matter in air pollution. Understanding the emission sources, quantifying them, and spatially resolving their distribution is therefore an active and growing area of research, with repercussions across multiple dimensions such as health, transportation, electricity, and infrastructure.




Focusing again on the transportation sector, a project that gained much attention over the last two years was led by UC Berkeley alum (and now professor at UT Austin) Joshua Apte. By leveraging strategic partnerships with other universities, Google Street View, and Aclima, Inc., the project collected large volumes of air quality data at high spatial resolution across many streets in California.


The published work, “High-resolution air pollution mapping with Google Street View cars: exploiting big data”, appeared in Environmental Science & Technology in 2017. In summary, the team mounted lab-grade air quality monitoring instruments on a Google Street View car that drove through different cities (with an emphasis on Oakland) and mapped the distributions of three pollutants: black carbon (BC), nitric oxide (NO), and nitrogen dioxide (NO2).



Figure 1 – Visual abstract summary of the work by J. Apte et al., ES&T 51, 12 (2017).






One thing we’re excited to understand is how urban structure is associated with air quality characteristics. The possibilities, however, are vast, and we know that different departments and research groups could combine these data sets with others from their own fields of expertise to formulate and answer many interesting research questions.


The data set can be requested from Google by filling out a form that explains how you plan to use it. If your intended use falls within their areas of interest, they can grant you access to the 3.2 million rows of data for Oakland, CA, and to many millions of data points for other city regions (e.g., within LA).


At the D-Lab, we offer guidance on several Geographic Information Systems (GIS) software packages, including QGIS and ArcGIS, as well as Python and R packages such as GeoPandas, raster, and sf, all of which can be quite useful when evaluating large data sets that need to be spatially resolved. Make sure you check out our workshops calendar!
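As a small taste of what spatially resolving point measurements can involve, here is a minimal sketch that bins readings into a coarse latitude/longitude grid and takes the median per cell. It uses only the Python standard library. The sample values, the `cell_deg` grid size, and the function names are all made up for illustration, and a simple grid is just one possible aggregation, not necessarily the approach taken in the published work. In practice a package like GeoPandas would handle projections and joins to road geometries.

```python
import math
from collections import defaultdict
from statistics import median

# Hypothetical records: (latitude, longitude, NO2 reading in ppb).
# These values are invented for illustration; they are not from the real data set.
SAMPLES = [
    (37.8044, -122.2712, 12.0),
    (37.8045, -122.2713, 18.0),
    (37.8046, -122.2711, 15.0),
    (37.8101, -122.2654, 30.0),
    (37.8102, -122.2655, 34.0),
]


def grid_cell(lat, lon, cell_deg=0.001):
    """Return the integer (row, col) index of the grid cell containing a point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def median_by_cell(samples, cell_deg=0.001):
    """Group readings by grid cell and return {cell: median reading}."""
    readings = defaultdict(list)
    for lat, lon, value in samples:
        readings[grid_cell(lat, lon, cell_deg)].append(value)
    return {cell: median(values) for cell, values in readings.items()}


if __name__ == "__main__":
    for cell, value in sorted(median_by_cell(SAMPLES).items()):
        print(cell, value)
```

The median is used here because it is less sensitive than the mean to short-lived spikes, such as a reading taken right behind a passing truck. A fixed-degree grid is only a rough stand-in for real distances, so for actual analysis you would want a projected coordinate system.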

Lastly, if you’d like to chat about anything I wrote about above, please reach out to me or any of my colleagues for some consulting time.



Sergio Castellanos

Sergio is a researcher at the California Institute for Energy & Environment and the Energy and Resources Group, with a background in mechanical engineering (Ph.D.). His research interests are in energy systems modelling, sustainable transportation, energy justice, and analysis of integrated techno-economic models for policy making.