Impactful Data Science: What I Learned Through Data Science For Social Good

November 4, 2020

It’s times like these when I like to think about how I can bring together the technical work I like to do and the impact I want to make. As a graduate student in the Energy and Resources Group (ERG) focusing on the intersection of humans and climate, I firmly believe that creating a more just society requires work across disciplines to tackle society’s most pressing issues. This led me to join the Data Science for Social Good (DSSG) program at the University of Washington eScience Institute this past summer, where I worked as a fellow on a project for detecting voting rights dilution. In this post I’ll share what I learned about informed and ethical data science through my experience in the program. I hope some of what I share here will inspire you to work towards improving society through your own projects.

A RECIPE FOR INFORMED AND ETHICAL DATA SCIENCE

I don’t think I truly understood stakeholder engagement, social good, and impactful and ethical data science until I joined DSSG. (Never mind tackling something as looming and topical as voting rights dilution, which was definitely beyond my domain expertise!) The DSSG program gave me an opportunity to improve my data science skills and really lit a fire under me to continue my journey of becoming a scholar activist. Here are a few steps that I think belong in a “recipe” for responsible and effective data-intensive projects:

1. IOKN2K

D-Lab’s favorite phrase is definitely “It’s okay not to know (IOKN2K)” and it’s something I brought with me to DSSG. Did I know how to deal with publicly available data? Sure. Did I know what the Gingles Test was? No (it’s a three-pronged test for determining whether minority vote dilution exists). But I think the key part here is that IOKN2K is something DSSG helped me continue to embody. Without the support and patience of our project leads, it would have been impossible to do this work, let alone know how to start. With Dr. Matt Barreto of UCLA and the Voting Rights Project, and Dr. Loren Collingwood of UC Riverside, my team and I were able to ask all of our questions, from the statistical basis of the package we were trying to create to the current landscape of voting rights and what that looks like beyond 2020. On top of that, the data scientists on our team, Scott Henderson and Spencer Wood, were available over Slack and answered many of our technical questions. And my fellow fellows, Ari Decter-Frain, Juandalyn Burke, and Pratik Sachdeva, were always ready to have conversations and teach each other what we knew. IOKN2K has to be intentional, but there is so much to learn and so much room to grow when you embody this concept.

2. Identifying Stakeholders

Prior to the program, in my mind, a stakeholder was a vague, looming cloud of policymakers or other researchers who would benefit from the work I was doing. But I was definitely proven wrong and given a wakeup call. A stakeholder is “Any group or individual who can affect or is affected by the achievement of the organization’s objectives” (Freeman, R. E. (1984) Strategic Management: A Stakeholder Approach). For my team, that meant we had to think about more than just the people who would use our R package. We also needed to understand what would happen if the package were used in court. Is it streamlined enough to output results fast enough for litigation? Can it be explained to the entire room, from the judge, to the opposing side, to a jury? The program put on a workshop for us where we had the opportunity to map out our entire stakeholder landscape. We even used tools such as Judgement Call the Game, a game that guides teams to think about aspects such as equity, privacy, and accountability when forming projects. It’s actually quite fun! And it definitely helped us think about everything from comments we might get on our GitHub repository to what might be said in court.

3. Conversing with stakeholders

Our team was very fortunate to have the opportunity to talk to our stakeholders throughout the summer. We had discussions with folks from the American Civil Liberties Union, the National Democratic Redistricting Committee, and more. Each conversation opened our eyes to new capabilities we should include in our software package, and caveats we hadn’t thought about. Beyond that, these conversations helped us take a step back from staring at our code to think about the big picture. I’m sure those of you who work on data-intensive projects are also guilty of getting bogged down in the weeds and hyperfocused on your code. It’s always a good idea to ask yourself if you’re fulfilling the goals you originally set out to meet and if those goals still align with the impact you want to make.

4. Reminding yourself that you, your team, and your stakeholders are all people

Since we’re staring at screens all day, sometimes it’s hard to remember that (1) you’re not alone and (2) everyone is human. Now what do I mean by that? I think one of the biggest reasons I was able to cope with this Zoom- and screen-intensive summer project was that I remembered to keep it centered around relationships. By opening each meeting with an icebreaker, our team was able to create a bond and humanize our digital relationship. By having a Monday morning check-in with the whole program every week, we were reminded that we had a community that was there to support us. By conversing with stakeholders, we were both humbled and empowered by the impact our work could have. I think this is the easiest one to forget about, but arguably, it’s one of the most important.

Of course this list isn’t exhaustive, but in my mind these were the key ingredients in creating a successful, impactful, and responsible project.

WHAT DSSG MEANT TO ME AND WHAT’S AHEAD

As I reflect on my experience over the summer, immersing myself in the world of voting rights, it affirms my belief that technical work and impactful work overlap, and that the demand for work at that overlap isn’t necessarily being met. Back when I started my first job, I thought my love for data analysis and my love for community engagement were separate. I was naive and wasn’t brainstorming enough (and to be honest, probably lazy and comfortable in my job). As I plan to continue on to a PhD, it’s especially vital for me now to think about the problems I want to tackle. I also realize that as a graduate student, it’s so easy to get stuck in your own research bubble. That’s why, as I go along my journey, I aim to continue asking myself “Am I working on research that helps to solve a pressing issue?” and “How will I make sure I can translate my findings into something communities and policymakers can use?” Not to mention that there is so much to do beyond your own work! My team over the summer has inspired me to seek out ways to give my time to community initiatives. Whatever happens in the next couple of days, I hope you continuously ask yourself how you can improve society, whether through your own work or through other means.

Thank you to my fellow Voting Rights Dilution Project Team members: the fellows (Ari Decter-Frain, Juandalyn Burke, Pratik Sachdeva), the data scientists (Scott Henderson, Spencer Wood), and the project leads (Matt Barreto, Loren Collingwood). Thank you again to the University of Washington Data Science for Social Good program and the eScience Institute for this wonderful opportunity.

Learn more about the UW eScience Institute: https://escience.washington.edu

Learn more about the DSSG program: https://escience.washington.edu/dssg/

Learn about the Voting Rights Dilution project: https://rpvote.github.io/voting-rights/

Hikari Murayama