What to do about Fairness in Machine Learning?

April 7, 2020

How many thousands of machine learning applications have been developed and gone to market in recent years? Feeding vast amounts of data into software to make decisions for us is a social paradigm the 21st century is embracing to the fullest.

I’m a graduate student of public health with a long history as a social worker and as a student of psychology, literature, and the human condition. Since early childhood, one thing I have always been is a science fiction fanatic: human and societal relationships with technology have fascinated me to the core since before I can remember.

After teaching myself a bit about coding and managing to get accepted at UC Berkeley’s School of Public Health last year, I began exploring a landscape of inquiry into applied data science with the larger community at the d-lab. In this time, I’ve experienced a rapidly growing awareness around the implications of machine learning’s societal applications. I find myself here today: grappling with the large-scale inequities outlined in social epidemiology and the impact of unjust bias on human health, and discovering just how the mechanisms of social bias may be echoed, indeed amplified, through the employment of biased machine learning algorithms.

Where to begin? Our world is riven with unjust bias: historical bias, representational bias, evaluation bias, measurement bias, benchmarking, feedback loops; the list goes on and on. Observable phenomena like these are evidence that unjust biases are not only “far out” social constructs living in our collective consciousness; they also exist as latent variables inside datasets and as emergent properties of algorithmic design.

Algorithmic bias poses tremendous threats to equality and human fairness, not only in the future, but right now.

So what can be done? Here are four suggestions I offer us:

Establish theoretical frameworks to measure and visualize bias in machine learning algorithms. There may be no generalizable, perfect answer to the problem that bias in machine learning presents. Developing curricula that allow for direct inquiry into empirical, evidenced expressions of bias is key.
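To make "measure and visualize" a little more concrete, here is a minimal sketch of one possible measurement. It assumes a binary classifier and a single group label per person, and it uses the demographic parity difference (the gap between the highest and lowest rates of positive predictions across groups) purely as an illustration. The function names and the toy data are hypothetical, and this is only one of many fairness metrics, not a recommended standard.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the share of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += pred
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group selection rates."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 means the model approved, 0 means it denied.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(predictions, groups)
gap = demographic_parity_difference(predictions, groups)

# A crude text "visualization": one bar per group, 20 characters = 100%.
for group, rate in sorted(rates.items()):
    print(f"{group}: {'#' * round(rate * 20)} {rate:.2f}")
print(f"demographic parity difference: {gap:.2f}")
```

With this toy data, group "a" is approved 75% of the time and group "b" 25% of the time, a gap of 0.5. A gap of zero would mean equal approval rates, though equal rates alone do not show that a model is fair; that judgment still depends on the context and on which form of bias is in question.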

Learn to understand bias in its many forms. Being able to intuit bias as it is expressed in constructs of organized data and algorithmic logic is a necessary skill. This intuition requires us to be more than comfortable exploring difficult questions around human differences over time, and how those differences impact human experience today. An understanding like this also requires direct interaction with others to meet human needs, creating feedback partnerships with users and those impacted by the output of machine learning models.  

Recognize there is no technology sector. Machine learning is fast becoming a ubiquitous force in human society. If an algorithm influences our daily lives and areas of expertise, we should be able to know how it works. Students, researchers, and professionals in our respective fields must make a point of requiring algorithmic transparency, and in doing so embrace the complexities of machine learning models whenever we can. It’s important to emphasize that any model that cannot be audited because it is deemed too “technical” should no longer be considered valid.

Remove barriers to data science education. The more interdisciplinary and intersectional our understanding of machine learning, the stronger our ability to recognize and control for algorithmic threats to equality and human fairness. Cultivating a diversity of domain-specific knowledge in the development of machine learning models allows us to work towards the creation of stronger, wiser, and more effective algorithms to meet human needs.