Blog Post / July 2022

Machine Learning Fairness Bootcamp: Lessons after Two Years


You walk into a hospital. Unbeknownst to you, an algorithm assigns you a number: a “risk score.” This score will determine how much medical care you receive, and how urgently you receive it.

This is not science fiction. Medical risk-scoring algorithms are widely deployed. And, according to recent research from the University of California, Berkeley School of Public Health, these algorithms systematically underestimate your risk if you’re Black.

Real-world bias often makes its way into the data we collect, which results in models that learn and perpetuate that bias. This is an example of machine learning (ML) bias.

Part of the problem is that even well-intentioned engineers rarely learn how to deal with ML bias in school. Students may learn that it exists, but not how to tell whether a particular algorithm exhibits bias, let alone what to do to make that bias less harmful.

To help address this problem, I developed MLFailures, a series of labs that teach students how to identify and address bias in supervised learning algorithms. All of these labs are freely available and licensed under a Creative Commons (CC BY-NC-SA) license. Teachers worldwide are welcome to use and modify them. We will also soon post lectures that contextualize the labs with an introduction to the concept of fairness in algorithms.

I have now taught this bootcamp twice, both times in Professor Josh Blumenstock’s Applied Machine Learning course (INFO 251) at the UC Berkeley School of Information. Spring 2022 marked the bootcamp’s second year.

What did I learn this time around? Above all, UC Berkeley graduate students are far more aware of ML bias and fairness than they were even two or three years ago. In one sense, that is bad news: awareness has likely grown because the effects of ML bias are more widespread and more acutely felt. In another sense, the trend is positive, because knowledge is the precursor to action. Students who know what bias is, and who care about it, will likely find more value in our bootcamp and take more away from it.

This year, I also fell into a trap I myself have identified in the past: confusing concepts with the mathematical ideas that represent or stand in for them. In ML, there’s a term called the “privileged group.” In broad strokes, this term describes a group whose members receive a favorable outcome more often than they ought to. For example, say you trained a model to predict someone’s income based on several characteristics, including their gender. Under this formalism, men would be the “privileged” group, because the model would predict higher incomes for them than for women with otherwise identical characteristics.
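
To make the formalism concrete, here is a minimal sketch (not code from the MLFailures labs; the data and column names are hypothetical). In this statistical sense, the “privileged group” is simply the group with the higher rate of favorable model outcomes:

```python
import pandas as pd

# Toy predictions from a hypothetical income model: 1 = model predicts a high income.
preds = pd.DataFrame({
    "gender": ["male", "male", "male", "female", "female", "female"],
    "pred_high_income": [1, 1, 0, 1, 0, 0],
})

# Favorable-outcome rate for each group.
favorable_rates = preds.groupby("gender")["pred_high_income"].mean()
print(favorable_rates)  # female: 0.33, male: 0.67

# The group with the higher rate is "privileged" only in this formal, statistical
# sense: a statement about outcome frequencies, not about social power.
print("Formally 'privileged' group:", favorable_rates.idxmax())
```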


ML Fairness presentation slide
This slide, inspired by Lauren Chambers, clarifies the difference between privilege and the statistical formalism of a “privileged group.”

As Lauren Chambers identified, this formalism sweeps the true meaning of “privilege” under the rug: social power, held by demographic groups, that results in systematic inequality and disparate treatment. In the term “privileged group,” the word “privilege” means something subtly different: it describes one effect of privilege, but does so in a way that could mislead engineers into thinking social privilege means getting a favorable outcome at a greater statistical likelihood, which it does not. Privilege is about power, not the outcomes of stochastic processes.

If we want engineers who can identify and ameliorate bias, they must have a clear understanding of the social concepts that make bias harmful — concepts like privilege. We cannot let the jargon of ML confuse our understanding of the social structures we aim to alter.

Where to Now?

Recently, a Google engineer decided that Google’s chatbot had become sentient. Google then sidelined that engineer for his comments.

What do we take away from this incident? First, practicing ethics around algorithms isn’t always technical. Noticing what an algorithm is doing, and staying in dialogue with it in any sense, literal or otherwise, is the key to identifying bias.

Our labs give students a particular set of tools for staying in dialogue with algorithms. These include both statistical tools and a vocabulary (e.g., “disparate impact,” “privileged group”) that can help describe how and when these algorithms fail to live up to ethical standards. We don’t give all the statistical tools, let alone all the social ones. We give enough tools for the people who will build algorithms to begin their journey, and enough language (which, for better or worse, includes some confusing disciplinary jargon like “privileged group”) to educate themselves as needed.
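
For instance, one of the statistical tools behind the vocabulary above is the disparate impact ratio: the favorable-outcome rate of the unprivileged group divided by that of the privileged group. The sketch below uses hypothetical data and names, not the labs’ own code; a ratio well below 1.0, commonly benchmarked against the four-fifths rule, flags possible disparate impact.

```python
import pandas as pd

def disparate_impact(df: pd.DataFrame, group_col: str, pred_col: str,
                     unprivileged: str, privileged: str) -> float:
    """Ratio of favorable-outcome rates: unprivileged group over privileged group."""
    rates = df.groupby(group_col)[pred_col].mean()
    return rates[unprivileged] / rates[privileged]

# Hypothetical predictions: 1 = favorable outcome.
preds = pd.DataFrame({
    "gender": ["male"] * 5 + ["female"] * 5,
    "pred_favorable": [1, 1, 1, 1, 0, 1, 1, 0, 0, 0],
})

ratio = disparate_impact(preds, "gender", "pred_favorable",
                         unprivileged="female", privileged="male")
print(f"Disparate impact ratio: {ratio:.2f}")  # 0.40 / 0.80 = 0.50 here
```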

Second, raising issues about fairness has little effect if your employer ignores your concerns. If engineers come to expect that they will be fired for raising flags, there will be little amelioration of algorithmic harms. This is not the first time Google has fired someone for critiquing its AI products, and likely will not be the last.

Our labs do not cover worker organizing, but organizing will be a prerequisite to changing machine learning in practice. Only collective action can hold employers accountable to employees’ concerns. As such, an organized labor force capable of identifying and ameliorating ML bias will lead us to a safer world. The Tech Workers Coalition does a fantastic job of giving tech workers an on-ramp to organizing. Future versions of the MLFailures bootcamp will likely include details about worker organizing and the role it plays in machine learning ethics.

Teach It

As interface designer Bret Victor once wrote, “worrying about sentient AI as the ice caps melt is like standing on the tracks as the train rushes in, worrying about being hit by lightning.” I don’t disagree. But AI is dangerous even without sentience or existential risk. AI produces real harms today, like reinforcing racial profiling in police work. These issues may not involve computers that manipulate humans to achieve world domination, but they matter. The MLFailures labs aim to create a generation of engineers capable of identifying bias and making it less harmful.

If you teach a machine learning class of any sort and would like to integrate practical bias and fairness labs or lectures into your curriculum, download the course plan and modify the components at will. All materials are released under a CC BY-SA license.

Learn more about the ML Failures / ML Fairness labs