Esprii Chapman – Notes on You can’t eliminate bias from machine learning, but you can pick your bias – 12/16/20 (Extra Source 4 of 5)

In machine learning theory, if you can mathematically prove you don’t have any bias and if you find the optimal model, the value of the model actually diminishes because you will not be able to make generalizations. What this tells us is that, as unfortunate as it may sound, without any bias built into the model, you cannot learn. (Paragraph 2)
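A minimal sketch of the point above, reading "bias" here as inductive bias, i.e. the assumptions a learner makes about the shape of the answer. The learner names and the toy data are made up for illustration and are not from Berthold's article. One learner assumes nothing and only memorizes; the other assumes the relationship is a straight line.

```python
def memorizer(train):
    """Learner with no assumptions: it only remembers the pairs it has seen."""
    table = dict(train)
    # Returns None for any input outside the training set: no generalization.
    return lambda x: table.get(x)

def linear_learner(train):
    """Learner with a built-in bias: it assumes y = a*x + b (least squares)."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

# Toy training data that happens to follow y = 2x + 1 (an assumption of this sketch).
train = [(0, 1), (1, 3), (2, 5)]

no_bias = memorizer(train)
with_bias = linear_learner(train)

unseen = 3
print("memorizer on unseen input:", no_bias(unseen))
print("linear learner on unseen input:", with_bias(unseen))
```

The memorizer fits the training data perfectly yet has nothing to say about an input it has not seen. The linear learner can extrapolate only because of the assumption it was given, and it would be wrong whenever that assumption is.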

Well-intentioned organizations try to rectify or overcompensate for this by eliminating bias in machine learning models. What they don’t realize is that doing so can mess things up further. Why is this? Once you get into removing data categories, other components, characteristics, or traits sneak in.

Suppose, for example, you uncover that income is biasing your model, but there is also a correlation between income and where someone comes from (wages vary by geography). The moment you add income into the model, you need to de-discriminate that by putting origin in as well. It’s extremely hard to make sure that you have nothing discriminatory in the model. If you take out where someone comes from, how much they earn, where they live, and maybe what their education is, there’s not much left to allow you to determine the difference between one person and another. And still, there could be some remaining bias you haven’t thought about. (Paragraphs 4-5)
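The income and geography example can be sketched with made-up numbers. Everything below is a hypothetical illustration: the two regions, the income figures, and the helper names are assumptions of this sketch, not data from the article. It generates synthetic people whose income depends on region, then measures how strongly region alone tracks income.

```python
import random
import statistics

def make_people(n=1000, seed=0):
    """Synthetic data: region 1 has a higher typical income than region 0."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        region = rng.randint(0, 1)
        # Assumed wage gap between regions, plus individual variation.
        income = rng.gauss(40_000 + 30_000 * region, 8_000)
        people.append((region, income))
    return people

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

people = make_people()
regions = [r for r, _ in people]
incomes = [i for _, i in people]

# Even if the income column were dropped, region would still carry most of it.
proxy_strength = pearson(regions, incomes)
print(f"correlation between region and income: {proxy_strength:.2f}")
```

With a correlation this strong, a model that never sees income can still recover much of it from region, which is how a removed category sneaks back in through a correlated one.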

What does all of this mean in the practical sense? In a nutshell, data science is hard, machine learning is messy, and there is no such thing as completely eliminating bias or finding a perfect model. There are many, many more facets and angles we could delve into as machine learning hits its mainstream stride, but the bottom line is that we’re foolish if we assume that data science is some sort of a be-all and end-all when it comes to making good decisions. (Paragraph 14)

There simply needs to be more awareness of how bias functions, not just in society but also in the very different world of data science. When we bring awareness to data science and model creation, we can make informed decisions about what to include or exclude, understanding that there will be certain consequences, and sometimes accepting that some consequences will be worth it. (Paragraph 15)

Berthold, M. (2020, November 13). You can’t eliminate bias from machine learning, but you can pick your bias. Retrieved December 17, 2020, from
