Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering....

50
Slides by: Joshua Kroll [email protected] ? Data Science 100 Lecture 27: Ethics in Data Science

Transcript of Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering....

Page 1: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Slides by:

Joshua Kroll

[email protected]

?

Data Science 100Lecture 27: Ethics in Data Science

Page 2: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 3: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 4: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 5: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 6: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 7: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 8: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Gender Shades

Page 9: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 10: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 11: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 12: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

12

Data science is everywhere…

Scoring Systems Lotteries Criminal Justice& National Security

News Lending Hiring

Page 13: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

13

Page 14: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

14

Page 15: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Data Science Life Cycle

?ContextQuestionRefine Question to an one answerable with data

DesignData CollectionData Cleaning

ModelingTest-train splitLoss function choiceFeature engineering

Transformations, Dummy Variables

Model selectionBest subset regressionCross-Validation

Model evaluationPrediction error

BIAS!

Page 16: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Context & Refinement

Page 17: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Context & Refinement

Page 18: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 19: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Proxies

Page 20: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 21: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Data Collection

Page 22: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Data Collection

Page 23: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

http://bit.ly/ds100-sp18-eth

Page 24: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

From Lum, Kristian, and William Isaac. "To predict and serve?." Significance 13,

no. 5 (2016): 14-19.

Page 25: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Data Cleaning

Page 26: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Data Cleaning

Page 27: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Data Cleaning

Page 28: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Test-Train Split

Page 29: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Loss Function Choice

Page 30: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 31: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 32: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 33: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 34: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

So why does ProPublica think COMPAS is biased?

Page 35: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model
Page 36: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Feature Engineering

Page 37: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Feature Engineering

Page 38: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Feature Engineering

Page 39: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Model Selection

Page 40: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Model Evaluation

Page 41: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Model Evaluation

Page 42: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

So what do we do about this?

Page 43: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Ethical Lessons

1. Know what the data are telling you• Understand your model, and validate it• Involve domain experts; “raw data” is an oxymoron• Communicate not just model capabilities, but also limits and assumptions

2. Data about people are about people• The data do not exist on their own, someone made them

3. Always consider why your model might be wrong in systematic ways• Don’t just trust the technology, make it trustworthy• Close the data science lifecycle: the world changes, and so should your models

Page 44: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Ethics beyond the Data Science Lifecycle

Page 45: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Should we deploy technology at all?

Page 46: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

When is it ethical to collect data?

Page 47: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

When is it ethical to collect sensitive data?

• Without knowing sensitive attributes, it’s not possible to know how to make good decisions.• E.g., in a credit-granting context, there’s no reason to believe that qualified

minority applicants will be similar to qualified members of the majority• But having sensitive data means it can be repurposed for other uses

Page 48: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Inclusion and Representation

Page 49: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

Resources

• Take a course at Berkeley on people and technology• Visit https://fatml.org and https://fatconference.org• Reading list• Principles for Accountable Algorithms

• Books• Weapons of Math Destruction, by Cathy O’Neil• How to Lie With Statistics, by Darrell Huff

• Courses around the world: tinyurl.com/ethics-classes

Page 50: Data Science 100 · Lecture 27: Ethics in Data Science. Gender Shades. 12 ... Feature Engineering. Feature Engineering. Feature Engineering. Model Selection. Model Evaluation. Model

“Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.”

-George E. P. Box