THE ISSUE OF BIAS


Transcript of THE ISSUE OF BIAS

Page 1: THE ISSUE OF BIAS

THE ISSUE OF BIAS

TRADEOFFS AND BALANCE IN ML

Prof. dr. Mireille Hildebrandt Interfacing Law & Technology Vrije Universiteit Brussel

Smart Environments, Data Protection & the Rule of Law Radboud University

Page 2: THE ISSUE OF BIAS

WHAT’S NEXT?

1.  Three Types of Bias
    1.  inherent bias
    2.  bias as unfairness
    3.  bias on prohibited grounds
2.  Profile Transparency
3.  Automated Decisions
4.  Purpose
5.  GDPR

17/11/16 Hildebrandt's KNUT MEMORIAL LECTURE 2016 2

Page 3: THE ISSUE OF BIAS

THREE TYPES OF BIAS

8/12/2016 Hildebrandt - NISP ML and the LAW 3

Page 4: THE ISSUE OF BIAS

Is ML neutral, objective, true?

■  three types of bias:
1.  bias inherent in any action-perception-system (APS)
2.  bias that some would qualify as unfair
3.  bias that discriminates on the basis of prohibited legal grounds


Page 5: THE ISSUE OF BIAS

INHERENT BIAS


Page 6: THE ISSUE OF BIAS

the difference that makes a difference (Bateson)

■  bias inherent in any action-perception-system (APS)
–  Thomas Nagel’s ‘What Is It Like to Be a Bat?’
–  the salience of the output of the APS depends on the agent & the environment
–  perception is a means to anticipate the consequences of action: ‘enaction’
–  there is no such thing as objective neutrality, but
–  this does not imply that anything goes
–  on the contrary: life and death may depend on getting it ‘right’


Page 7: THE ISSUE OF BIAS

Machine Learning (ML)

■  ML is about
–  choosing and pruning relevant, correct and sufficiently complete training sets
–  developing and training the right algorithm to detect the right mathematical function
–  ML is based on a productive bias, cp. Hume as well as Gadamer
–  optimization always depends on context, purpose, availability of training and test data
–  there are always trade-offs!
–  reliability depends on the extent to which the future confirms the past
–  David Wolpert’s no free lunch theorem should inform our assessment

27 October '16 Robolegal: paralegal or toplawyer? 7

Page 8: THE ISSUE OF BIAS

Hume, Gadamer, Wolpert: no free lunch theorem

Where
–  d = training set;
–  f = ‘target’ input-output relationships;
–  h = hypothesis (the algorithm’s guess for f made in response to d); and
–  C = off-training-set ‘loss’ associated with f and h (‘generalization error’)

How well you do is determined by how ‘aligned’ your learning algorithm P(h|d) is with the actual posterior, P(f|d). Check http://www.no-free-lunch.org
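In the notation above, Wolpert’s alignment claim can be sketched as follows (a schematic rendering, with E[·] the expectation and the sum running over hypotheses and targets):

```latex
% expected off-training-set loss, schematically, in Wolpert's framework
\mathbb{E}[C \mid d] \;=\; \sum_{h,\,f} \mathbb{E}[C \mid f, h, d]\; P(h \mid d)\; P(f \mid d)
```

The expected generalization error thus couples the learner, P(h|d), to the truth, P(f|d); averaged uniformly over all possible targets f, the result no longer depends on P(h|d), which is the ‘no free lunch’: no learning algorithm outperforms another across all targets.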


Page 9: THE ISSUE OF BIAS

Hume, Gadamer, Wolpert: no free lunch theorem

Implications:
–  The bias that is necessary to mine the data will co-determine the results
–  This relates to the fact that the data used to train an algorithm is finite
–  ‘Reality’, whatever that is, escapes the inherent reduction
–  Data is not the same as what it refers to or what it is a trace of

8 July 2016 Privacy Hub Summerschool 9

Page 10: THE ISSUE OF BIAS

“We shall see that most current theory of machine learning rests on the crucial assumption that the distribution of training examples is identical to the distribution of test examples. Despite our need to make this assumption in order to obtain theoretical results, it is important to keep in mind that this assumption must often be violated in practice.”

Tom Mitchell
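Mitchell’s point can be illustrated with a toy simulation (all numbers hypothetical, plain Python): a threshold classifier fit on one distribution degrades once the test distribution drifts away from the training distribution.

```python
import random

random.seed(0)

def sample(n, shift=0.0):
    """Draw (x, label) pairs: class 0 centred at 0, class 1 centred at 2 + shift."""
    data = []
    for _ in range(n):
        label = random.randint(0, 1)
        centre = 0.0 if label == 0 else 2.0 + shift
        data.append((random.gauss(centre, 0.5), label))
    return data

def fit_threshold(data):
    """Decision stump: split halfway between the two class means."""
    xs0 = [x for x, y in data if y == 0]
    xs1 = [x for x, y in data if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2

def accuracy(threshold, data):
    return sum((x > threshold) == (y == 1) for x, y in data) / len(data)

train = sample(2000)
t = fit_threshold(train)
acc_iid = accuracy(t, sample(2000))                 # test set matches training distribution
acc_shift = accuracy(t, sample(2000, shift=-1.5))   # class 1 has drifted toward class 0
print(acc_iid, acc_shift)
```

The identical-distribution test set scores high; once class 1 drifts, the same threshold misclassifies a large share of it, even though nothing about the model changed.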


Page 11: THE ISSUE OF BIAS

Michael Veale:
i.  ‘the common assumption that future populations are not functions of past decisions is often violated in the public sector;’

■  actually, present futures do co-determine the future present
–  predictions influence the move from training to test set
–  they change the probability and the hypothesis space
–  they enlarge both uncertainty and possibility

■  the point is about the distribution of both: who gets how much of what
–  this depends on who gets to act on the output
–  if machines define a situation as real it is real in its consequences
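Veale’s point, that predictions feed back into the data, can be sketched in a toy simulation (all numbers hypothetical): two districts have identical true incident rates, but incidents are only recorded where the model sends attention, so an initial one-record difference snowballs.

```python
import random

random.seed(1)

TRUE_RATE = 0.3      # identical underlying incident rate in both districts
counts = [1, 0]      # district 0 starts with a single extra recorded incident

for _ in range(500):
    # send attention wherever the recorded data looks worse ...
    target = 0 if counts[0] >= counts[1] else 1
    # ... and only incidents in the attended district get recorded
    if random.random() < TRUE_RATE:
        counts[target] += 1

print(counts)   # district 0 accumulates every record; district 1 stays at zero
```

The recorded data ends up ‘confirming’ the allocation that produced it: the future population is a function of past decisions, exactly the violated assumption.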


Page 12: THE ISSUE OF BIAS

us elections: data does NOT speak for itself


Page 13: THE ISSUE OF BIAS

us elections: data does NOT speak for itself


Page 14: THE ISSUE OF BIAS

Trustworthiness: Trade-offs

■  ML involves a training set, algorithms, a test set
–  whether supervised, reinforced or unsupervised
■  trade-offs are inevitable:
–  choice of training & test set: size, relevance, accuracy, completeness
–  choice of learning algorithms: clustering, decision trees, deep learning, random forests, back propagation, linear regression, etc.
–  speed of output (e.g. real-time)
–  accuracy of predictions
–  outlier detection
■  N=All is humbug, though it may apply in a specific sense under certain conditions
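One such trade-off can be made concrete (a hypothetical toy, not any particular system): moving a classifier’s decision threshold trades false positives against false negatives, and no single setting minimises both when the classes overlap.

```python
import random

random.seed(2)

# scores for 1000 negatives (centred at 0.4) and 1000 positives (centred at 0.6)
scores = [(random.gauss(0.4, 0.15), 0) for _ in range(1000)] + \
         [(random.gauss(0.6, 0.15), 1) for _ in range(1000)]

def rates(threshold):
    """False positives and false negatives at a given decision threshold."""
    fp = sum(1 for s, y in scores if y == 0 and s >= threshold)
    fn = sum(1 for s, y in scores if y == 1 and s < threshold)
    return fp, fn

for t in (0.3, 0.5, 0.7):
    print(t, rates(t))   # raising t lowers FP but raises FN
```

Which point on that curve counts as ‘optimal’ is not a mathematical given; it depends on context and purpose, and on who bears the cost of each error type.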


Page 15: THE ISSUE OF BIAS

the new catch 22

■  suppose:
–  experts train algorithms on relevant data sets
–  and keep on testing the output (reinforcement learning)
–  until the system does very well (e.g. Zeb, student paper grading, legal intelligence)
–  and the experts get bored and do other things (semiotic desensitization)?
–  while the systems start feeding increasingly on each other’s output

■  who can test whether the system is still doing well 2 years later?

■  e.g. medical diagnosis, legal intelligence, critical infrastructure
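The question whether a system is ‘still doing well’ admits at least a mechanical first answer (a minimal sketch; the function name and tolerance are hypothetical): keep labelling fresh samples and compare recent accuracy against the baseline accepted at deployment.

```python
def drifted(baseline_acc, recent_accs, tolerance=0.05):
    """Flag when mean accuracy on freshly labelled samples falls
    more than `tolerance` below the accuracy accepted at deployment."""
    recent = sum(recent_accs) / len(recent_accs)
    return baseline_acc - recent > tolerance

print(drifted(0.95, [0.94, 0.93, 0.95]))  # within the band
print(drifted(0.95, [0.85, 0.80, 0.82]))  # time to re-audit
```

The check is trivial; the catch 22 lies in who supplies the fresh labels two years on, once the experts have moved to other things.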


Page 16: THE ISSUE OF BIAS

the new catch 22: architecture is politics

■  who can test whether the system is still doing well 2 years later?

■  e.g. medical diagnosis, legal intelligence, critical infrastructure

■  what is ‘doing well’?

■  who gets to determine what it means to ‘do well’?

■  so, replacement is high-risk, high-gain in terms of functionality, fairness and our ability to cognize our environment – as this cognition is mediated by ML systems


Page 17: THE ISSUE OF BIAS

e.g. automated prediction of judgment (APoJ)

■  APoJ used as a means to provide feedback to lawyers, clients, prosecutors, courts

■  APoJ could involve a sensitivity analysis, modulating facts, legal precepts, claims

■  APoJ as a domain for experimentation, developing new insights, argumentation patterns, testing alternative approaches

■  APoJ could detect missing information (facts, legal arguments), helping to improve (instead of merely predict) the outcome of cases

■  APoJ can be used to improve the acuity of human judgment, if not used to replace it

■  if APoJ is used to replace, it should not be confused with law; then it becomes administration – the difference is crucial, critical and pertinent

■  cp. http://www.vikparuchuri.com/blog/on-the-automated-scoring-of-essays/


Page 18: THE ISSUE OF BIAS

BIAS AS UNFAIRNESS


Page 19: THE ISSUE OF BIAS

the difference that makes a difference

■  bias that some would qualify as unfair
–  this is a matter of ethics
–  we may not agree about goals (values), means (nudging, forcing, negotiating), evaluation:
–  deontological? utilitarian? virtue ethics? pragmatarian?
–  that is why we need law


Page 20: THE ISSUE OF BIAS

BIAS ON PROHIBITED GROUNDS


Page 21: THE ISSUE OF BIAS

the difference that makes a difference

■  bias that discriminates on the basis of prohibited legal grounds
–  this is unlawful and can result in legal redress:
–  fines, tort liability, compensation
–  invalidation of contracts or legislation


Page 22: THE ISSUE OF BIAS

PROFILE TRANSPARENCY


Page 23: THE ISSUE OF BIAS

detecting bias

■  explanation, interpretability: if you cannot test it you cannot contest it
–  flesh out the productive bias that ensures functionality: test & contest
–  figure out the unfairness in the training set & the algos: test & contest
–  infer discrimination on prohibited legal grounds: test & contest
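Testing the training set for unfairness can start very simply (a hypothetical example; the ‘four-fifths’ threshold is a US convention in employment testing, not an EU legal standard): compare favourable-outcome rates across groups.

```python
def disparate_impact(records, protected="B", reference="A"):
    """Ratio of favourable-outcome rates: protected group vs reference group.
    records is a list of (group, outcome) pairs, outcome 1 = favourable."""
    def rate(group):
        outcomes = [o for g, o in records if g == group]
        return sum(outcomes) / len(outcomes)
    return rate(protected) / rate(reference)

# hypothetical training data: 80% favourable for group A, 50% for group B
data = [("A", 1)] * 80 + [("A", 0)] * 20 + [("B", 1)] * 50 + [("B", 0)] * 50
ratio = disparate_impact(data)
print(ratio)   # 0.625 -- below the 0.8 'four-fifths' flag
```

A ratio like this is not proof of unlawful discrimination; it is a testable signal that makes the bias in the data contestable, which is the point of the slide.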


Page 24: THE ISSUE OF BIAS

the opacity conundrum

■  explanation, interpretability: if you cannot test it you cannot contest it:

1.  deliberate concealment: trade secrets, IP rights and public security

2.  we are not wired for understanding statistics, ML or cyber-physical infrastructures

3.  mismatch between high dimensional math and meaning attribution


Page 25: THE ISSUE OF BIAS

‘softwire’ verification

■  software verification: mathematical (intestines) & empirical (input-output)
■  ‘softwire’ verification: real life implications, safety and reliability issues
■  explorative experiment, a posteriori control (Schiaffonati & Amigoni), A/B testing
■  pTA: citizens’ juries, participatory social science research, Wynne’s public understanding of science, Stirling’s matrix of incertitude
■  need for agonistic discourse (Rip in STS, Mouffe in political theory)


Page 26: THE ISSUE OF BIAS

AUTOMATED DECISION RIGHTS


Page 27: THE ISSUE OF BIAS

automated decision rights

■  current choice architecture of AI:
–  ML, IoT, AI is meant to pre-empt our intent

■  to run smoothly under the radar of everyday life

■  it is all about continuous surreptitious automated decisions


Page 28: THE ISSUE OF BIAS

automated decision rights

= choice architecture for data subjects (EU legislation)
1.  the right not to be subject to automated decisions that have a significant impact
2.  the right to a notification, an explanation and anticipation if an exception applies
3.  the right to object against profiling based on the legitimate interest of the controller


Page 29: THE ISSUE OF BIAS

automated decision rights

= choice architecture for data subjects:
1.  the right not to be subject to automated decisions that have a significant impact, unless
    a.  necessary for contract
    b.  authorised by EU or MS law
    c.  explicit consent

■  under a and c: right to human intervention, possibility to contest

■  prohibition to make such decisions based on sensitive data


Page 30: THE ISSUE OF BIAS

automated decision rights

= choice architecture for data subjects:
2.  the right to a notification, an explanation and anticipation if an exception applies
–  existence of decisions based on profiling
–  meaningful explanation of the logic involved
–  significance and envisaged consequences of such processing


Page 31: THE ISSUE OF BIAS

automated decision rights

= choice architecture for data subjects:
3.  the right to object against profiling when based on interests of the controller
–  at any time against profiling for direct marketing
–  or on grounds relating to their particular situation


Page 32: THE ISSUE OF BIAS

DP & Privacy Law: the new choice architecture

■  individual citizens need:
–  the capability to reinvent themselves,
–  segregate their data-driven audiences,
–  have their human dignity respected by the data-driven infrastructures,
–  make sure ML applications don’t tell on them beyond what is necessary,
–  the capability to detect and contest bias in their data-driven environments


Page 33: THE ISSUE OF BIAS

DP & Privacy Law: the new choice architecture

■  the architects of our new data-driven world need to mind:
–  integrity of method: rigorously sound and contestable methodologies
–  accountability: (con)testability of both data sets and algorithms
–  fairness: testing bias in the training set, testing bias in the learning algorithm
–  privacy & data protection: reduce manipulability, go for participation and respect


Page 34: THE ISSUE OF BIAS

choice architecture is politics

■  law should enable (not force) companies to act ethically (Montesquieu)

■  need to create a level playing field that puts a threshold in the market

■  to the extent that one cannot give reasons for an automated decision, it can be contested


Page 35: THE ISSUE OF BIAS

law or ethics?

■  GDPR enforcement:
–  fines of up to 4% of global turnover
–  investigative powers of DPAs, including
■  access to any premises, data processing equipment and means

