Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas...

71
Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain

Transcript of Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas...

Page 1: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Fairness in Machine LearningFernanda Viégas @viegasfMartin Wattenberg @wattenbergGoogle Brain

Page 2: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

As AI touches high-stakes aspects of everyday life, fairness becomes more important

Page 3: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

How can an algorithm even be unfair?Aren't algorithms beautiful neutral pieces of mathematics?

The Euclidean algorithm (first discovered in 300 BCE) as described in Geometry, plane, solid and spherical, Pierce Morton, 1847.

Page 4: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

"Classic" non-ML problem: implicit cultural assumptions

Page 5: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Example: names are complex.

Brilliant, fun article.Read it! :)

Patrick McKenziehttp://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

...

Page 6: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

What's different with machine learning?

Algorithm, 300 BCE

Classical algorithms don’t rely on data

Page 7: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

What's different with machine learning?

ML systems rely on real-world data andcan pick up biases from data

Algorithm, 300 BCE

Algorithm, 2017 CE

Classical algorithms don’t rely on data

Page 8: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Sometimes bias starts before an algorithm ever runs…It can start with the data

Page 9: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Sometimes bias starts before an algorithm ever runs…It can start with the data

A real-world example

Page 10: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects
Page 11: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Can you spot the bias?

Page 12: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Can you spot the bias?

Page 13: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Model can’t recognize mugs with handle facing left

Page 14: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

How can this lead to unfairness?

Page 15: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects
Page 16: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

word embeddings

Page 17: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Word embeddings

Distributed Representations of Words and Phrases and their Compositionality

Mikolov et al. 2013

Page 18: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Meaningful directions(word2vec)

Page 19: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects
Page 20: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects
Page 21: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects
Page 22: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Can we "de-bias" embeddings?

Page 23: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Can we "de-bias" embeddings?

Bolukbasi et al.: this may be possible.

Idea: "collapse" dimensions corresponding to key attributes, such as gender.

Page 24: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

How can we build systems that are fair?

First, we need to decide what we mean by “fair”...

Page 25: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Interesting fact: You can't always get what you want in terms of “fairness”!

Page 26: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Fairness: you can't always get what you want!

https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

COMPAS (from company called Northpointe)● Estimates chances a defendant will be re-arrested

○ Issue: "rearrest" != "committed crime"● Meant to be used for bail decisions

○ Issue: also used for sentencing

Page 27: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

This conclusion came from applying COMPAS to historical arrest records.

Page 28: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Enter the computer scientists...

Page 29: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects
Page 30: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects
Page 31: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Low-riskFAIR

Page 32: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Low-riskFAIR

Medium-highFAIR

Page 33: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Low-riskFAIR

Medium-highFAIR

Page 34: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Low-riskFAIR

Medium-highFAIR

Did not reoffendUNFAIR

Black defendants who did not reoffend were more often labeled "high risk"

Page 35: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Low-riskFAIR

Medium-highFAIR

Did not reoffendUNFAIR

Unless classifier is perfect, can't all be fair due to different base rates

Page 36: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects
Page 37: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

https://www.propublica.org/article/bias-in-criminal-risk-scores-is-mathematically-inevitable-researchers-say

Page 38: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Lessons learned

● Fairness of an algorithm depends in part on how it's used ● In fairness, you can't always get (everything) you want

○ Must make a careful choice of quantitative metrics○ This involves case-by-case policy decisions○ These tradeoffs affect human decisions too!

● Improvements to fairness may come with their own costs to other values we want to retain (privacy, performance, etc.)

Page 39: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Can computer scientists do anything besides depress us?

Hardt, Price, Srebro (2016)On forcing a threshold classifier to be "fair" by various definitions:

● Group-unawareSame threshold for each group

● Demographic ParitySame proportion of positive classifications

● "Equal opportunity"Same proportion of true positives

Page 40: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Can computer scientists do anything besides depress us?

A visual explanation...

Page 41: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Would default on loan Would pay back loan

Page 42: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Would default on loan Would pay back loan

Credit Score 0 10 20 30 40 50 60 70 80 90 100

Page 43: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Would default on loan Would pay back loan

Credit Score 0 10 20 30 40 50 60 70 80 90 100

Score: 23 Score: 79

Page 44: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Would default on loan Would pay back loan

Credit Score 0 10 20 30 40 50 60 70 80 90 100

Page 45: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Would default on loan Would pay back loan

Credit Score 0 10 20 30 40 50 60 70 80 90 100

Page 46: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Would default on loan Would pay back loan

Credit Score 0 10 20 30 40 50 60 70 80 90 100

Page 47: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Would default on loan Would pay back loan

Credit Score 0 10 20 30 40 50 60 70 80 90 100

Lower score than threshold but would pay back loan

Page 48: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Would default on loan Would pay back loan

Credit Score 0 10 20 30 40 50 60 70 80 90 100

Would pay back but isn’t given a loan

Page 49: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Would default on loan Would pay back loan

Credit Score 0 10 20 30 40 50 60 70 80 90 100

Would pay back but isn’t given a loan

Would default and is given a loan

Page 50: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Page 51: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Profit: 1.2800

Page 52: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Profit: 1.2800

Page 53: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Attacking discrimination in ML mathematically

Profit: 1.2800

Page 54: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Multiple groups and multiple distributions

Page 55: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Multiple groups and multiple distributions

Page 57: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Case Study

Page 58: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Conversation AI /Perspective API (Jigsaw / CAT / others)

Jigsaw / CAT / others

Page 59: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Conversation AI /Perspective API

Page 60: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Conversation AI /Perspective API

Page 61: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

False "toxic" positives

Comment Toxicity scoreThe Gay and Lesbian Film Festival starts today. 82%Being transgender is independent of sexual orientation. 52%A Muslim is someone who follows or practices Islam. 46%

Page 62: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

How did this happen?

Page 63: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

How did this happen?

Page 64: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

One possible fix

Page 65: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

False positives - some improvement

Comment Old NewThe Gay and Lesbian Film Festival starts today. 82% 1%Being transgender is independent of sexual orientation. 52% 5%A Muslim is someone who follows or practices Islam. 46% 13%

Overall AUC for old and new classifiers was very close.

Page 66: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

A common objection...

● Our algorithms are just mirrors of the world. Not our fault if they reflect bias!

Page 67: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

A common objection...

● Our algorithms are just mirrors of the world. Not our fault if they reflect bias!

Some replies:

● If the effect is unjust, why shouldn't we fix it?● Would you apply this same standard to raising a child?

Page 68: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Another objection

● Objection: People are biased and opaque. ● Why should ML systems be any different?

○ True: this won't be easy○ We have a chance to do better with ML

Page 69: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Another objection

● Objection: People are biased and opaque. ● Why should ML systems be any different?

○ True: this won't be easy○ We have a chance to do better with ML

Page 70: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

What can you do?

1. Include diverse perspectives in design and development

2. Train ML models on comprehensive data sets

3. Test products with diverse users

4. Periodically re-evaluate and be alert to errors

Page 71: Fairness in Machine Learning - Tat's Revolution · Fairness in Machine Learning Fernanda Viégas @viegasf Martin Wattenberg @wattenberg Google Brain. As AI touches high-stakes aspects

Fairness in Machine LearningFernanda Viégas @viegasfMartin Wattenberg @wattenbergGoogle Brain