What algorithms are, why they might need governing, and how
we might do it. Algorithm Workshop, University of Strathclyde, 15 Feb 2017
Michael Veale @mikarv | m.veale@ucl.ac.uk
Department of Science, Technology, Engineering & Public Policy (STEaPP)
University College London
GOVERNANCE
1. What are algorithms?
Algorithms: A broad view from computer science
Well-defined (not vague) instructions
that logically frame a problem
and specify an approach to tackle it
Alan says: “Everything computers
can do can be represented as a step-by-step
process”
Sources: doi: 10.1145/359131.359136
doi: 10.2307/2490013; doi: 10.1007/978-3-642-18192-4
Accounting algorithms & algorithmic controversies are old news
~3500BC ~1500AD
These algorithms can be long and complicated, but structured
Source: http://www.breezetree.com/articles/nassi-shneiderman-diagram.htm
Nassi-Shneiderman diagrams
What’s new(er)? Machine learning algorithms
Machine learning identifies and utilises patterns in data.
We say a machine ‘learns’ to perform a task when a measurement of its performance increases with new data.
Machine learning algorithms and models are step-by-step in a Turing sense, but we are more interested in their complex, emergent properties.
IBM Watson says: “Several modules of my
programme are not specified explicitly by humans, but
induced from data”
Sources: Mitchell, Tom M. "Machine learning" McGraw Hill (definition)
More complex algorithms
What is machine learning? Identifying and using patterns in data. Several types:
Supervised learning (most low-hanging fruit): give me labelled data, I learn to predict labels when they’re missing.
Unsupervised learning (useful, but less deployed): give me unlabelled data, I detect clusters and structure.
Variables                        Label
£££     age    edu.         repays loan?
28k     27     Degree       Y
22k     22     High Sch.    N
27k     34     Degree       N

        → learn a model (neural network / decision tree / support vector machine) →

Variables                        Label
£££     age    edu.         repays loan?
31k     22     Degree       Y (60% chance)
40k     29     High Sch.    Y (80% chance)
22k     31     Degree       N (55% chance)
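The supervised workflow above (fit on labelled rows, predict the missing labels) can be sketched in a few lines of scikit-learn. The numeric encoding (income in £k, age, degree as 1/0) is an illustrative assumption, not something from the slides:

```python
# Hypothetical sketch of the supervised-learning workflow on the toy loan table.
from sklearn.tree import DecisionTreeClassifier

# Labelled rows: [income_k, age, has_degree] -> repays loan? (1 = Y, 0 = N)
X_train = [[28, 27, 1], [22, 22, 0], [27, 34, 1]]
y_train = [1, 0, 0]

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# New applicants where the label is missing
X_new = [[31, 22, 1], [40, 29, 0], [22, 31, 1]]
print(model.predict(X_new))        # hard labels
print(model.predict_proba(X_new))  # per-class probabilities
```

Note that the slide’s “60% chance”-style outputs would come from a probabilistic model or an ensemble; a single unpruned tree is certain about everything.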
Complicated vs Complex algorithms
Complicated algorithms:
- Structure is opaque by its extent
- Linearity
- Deterministic outcomes
- Reductive characteristics: structures determine logic
- Logic first
Examples: if–else/while/foreach flowcharts, scorecards

Complex algorithms:
- Structure inherently opaque
- Non-linearities
- Probabilistic outcomes
- Emergent characteristics: structure and logics co-constitute each other
- Data first
Examples: machine learning models, neural networks, random forests, evolutionary algorithms, agent-based models
Source: Author
Let’s run a neural network, now (hopefully)
http://playground.tensorflow.org/
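As a rough stand-in for the playground demo, here is a minimal two-layer network learning a non-linear (XOR-style) boundary, the kind a linear model cannot draw. scikit-learn is assumed; the architecture and data are purely illustrative:

```python
# Illustrative sketch of a tiny neural network (cf. the TensorFlow Playground).
# XOR is not linearly separable, but one small hidden layer of non-linear
# units can learn it.
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]] * 10  # repeated to give the solver more data
y = [0, 1, 1, 0] * 10

net = MLPClassifier(hidden_layer_sizes=(8,), activation="tanh",
                    solver="lbfgs", max_iter=5000, random_state=0).fit(X, y)

print(net.predict([[0, 0], [0, 1], [1, 0], [1, 1]]))
```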
What can machine learning do fairly comfortably?
Modelling with reasonable amounts of well-ordered data:
• Model certain societal phenomena (crime, tax fraud, etc.)
• Predict characteristics (e.g. gender, income) from online behaviour
• Detect basic emotions from physiological data
• Detect anomalies
• Target advertising
Source: HunchLab, Azavea
Recent machine learning advances
Mostly in domains where we’ve now got so much more data:
• Voice recognition
• Image recognition (one or multiple objects in a scene)
• Playing games (learning from only pixels and scores)
• Understanding how meanings of words relate to each other
• Translation between languages
• Driving cars
Source: Google, https://www.tensorflow.org
Where does machine learning struggle?
Struggles:
• ‘Idiot savants’: multi-tasking and transferring learning
• Dealing with messy data
• Identifying context in images
• Summarising text
• Generalising from small amounts of data

Future directions:
• More unsupervised learning
• If this can’t be done, we restrict problems to only the situations where we can get lots of data
Source: xkcd, “Tasks”
Algorithms affect and are affected by people
Algorithms are actors that ‘do’ things in the world
We shape technologies and technologies shape us.
Technologies aren’t neutral, but political and distributive.
We should scrutinise the world views of ‘innovators’.
Bruno says: “When machines appear to
settle matters of fact, we start to look more at inputs and outputs
than their internals.”
Source: Latour, Bruno (1999). Pandora’s hope: Essays on the reality of science studies. Cambridge, MA: Harvard University Press.
Where are the system boundaries of ‘algorithm’?
From the model outwards:
models
→ designers; data sources
→ organisations: managers and maintainers; data collectors/cleaners; decision-support users; decision-subjects
→ the wider world: political, economic, legal, environmental systems

Disciplines with a stake: ethics, public policy, law, economics, organisational sciences, science and technology studies, business, sociology, operations research, human–computer interaction, requirements engineering, computer science, philosophy of science, statistics, mathematics

Health warning: this is indicative! In general, disciplines resist easy linear ordering, and many important ones are also missing.
Some reasons to govern algorithms
Fairness and discrimination
• Discrimination over protected characteristics
  • Actually including race, gender, etc. in a model
  • Indirectly including them through other variables, intentionally or not
• Unfairness over non-protected characteristics
  • Decisions on actions that seem arbitrary — being a tall redhead, having curiously searched for certain keywords
• Unfairness from entrenching inequality
  • Deciding people or areas will be bad tomorrow because they have been bad in the past
• Unfairness from algorithmic memory and resolution
  • Panoptical society/‘perfect discrimination’
  • No ‘clean slates’
Transparency (accountability as virtue)
• Can’t get information about the algorithm
  • IP or lack of open data/standards
• Aren’t equipped to understand the algorithm
  • Skill mismatch, or poorly commented code
• Algorithm too complex to understand conventionally
  • Machine learning system without human interpretation
Photo: Author
Accountability (as mechanism, or forum)
a relationship between an actor and a forum, in which the actor has an obligation to explain and to justify his or her conduct, the forum can pose questions and pass judgement, and the actor may face consequences (Bovens, 2007)
• No idea what people actually do with the algorithm
  • Part of an opaque decision-support/making system
• No idea if there even is an algorithm
  • Unclear process, or silent negative provision
• No due process or easy redress
  • No non-automated institutions that work at sufficient speed/scale
• Issues compounded by technical/institutional opacity
Source: doi 10.1111/j.1468-0386.2007.00378.x ; Photo: flickr : blondavenger CC BY-NC-ND
Reliability, resilience, robustness
• Does this algorithm even work?
  • Does performance justify fairness/accountability issues?
• Is the algorithm robust to change?
  • Will it stop working tomorrow, or on Wednesdays?
• Does the algorithm create problematic feedback loops?
  • Are future data collected a function of decisions that were made previously?
Photo: Italian Job (film), fair use
Security and gaming
Source: doi: 10.1145/2976749.2978392
• Can algorithms be gamed by malicious adversaries?
• Risk of manipulation: make biased, or make useless
• Risk of private data release
• Risk of interaction with other cyberphysical systems
Mythos of neutrality and objectivity
Algorithmic races and anti-competitive behaviour
• Algorithms power business models but are powered themselves by data
• Those that hold the most data can keep others out of the market
Photo: NASA, public domain
As Brixton Station sagely asks us…
Source: Photo, author. Art: Giles Round. See http://art.tfl.gov.uk/projects/design-work-leisure/
Technical governance: Removing discrimination
For more, see Kamiran, F. et al. (2012). Techniques for Discrimination-Free Predictive Models . doi: 10.1007/978-3-642-30487-3_12
What kind of discrimination?
- direct (use of protected characteristics)
- indirect (use of correlated characteristics)
- both (a mix)

Discrimination removal strategies, by stage of the pipeline (data → learning algorithm → model → output):
- preprocessing (massage the data)
- inprocessing (change the learning logic)
- postprocessing (alter the learned model)
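A minimal sketch of one preprocessing strategy, Kamiran-style reweighing: each training instance is weighted so that the protected attribute and the label become statistically independent in the training data. The toy data and variable names are assumptions for illustration, not a full implementation of the cited techniques:

```python
# Hypothetical sketch of 'reweighing' preprocessing. Each instance gets
# weight = expected frequency (if group and label were independent)
#          / observed frequency,
# so a downstream learner sees a dataset in which the protected group
# and the outcome are decoupled.
from collections import Counter

# Toy (protected_group, label) pairs; 1 = favourable outcome
data = [(0, 1), (0, 1), (0, 0), (1, 1), (1, 0), (1, 0), (1, 0), (0, 1)]

n = len(data)
group_counts = Counter(g for g, _ in data)
label_counts = Counter(y for _, y in data)
joint_counts = Counter(data)

weights = [
    (group_counts[g] / n) * (label_counts[y] / n) / (joint_counts[(g, y)] / n)
    for g, y in data
]
print(weights)
```

Any classifier that accepts per-instance weights (e.g. a `sample_weight` argument) could then be trained on the reweighted data.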
Technical governance: Removing discrimination, but…
Source: arXiv:1606.06121v1
Adding transparency
Sources: Tickle, Alan B., et al. "The truth will come to light: Directions and challenges in extracting the knowledge embedded within trained artificial neural networks." IEEE Transactions on Neural Networks 9.6 (1998): 1057-1068; Andrews, Robert, Joachim Diederich, and Alan B. Tickle. "Survey and critique of techniques for extracting rules from trained artificial neural networks." Knowledge-based systems 8.6 (1995): 373-389.
Two families of techniques:
- decompositional: make/use a more interpretable model (e.g. regression, decision trees)
- pedagogical/model-agnostic: wrap an uninterpretable model with a simpler one to estimate its logics (e.g. LIME (next slide), rule extraction)
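One way to see the pedagogical approach concretely: fit an uninterpretable model, then train a small, readable tree to mimic its predictions (a global surrogate). LIME itself works differently, perturbing inputs around a single prediction; this sketch, with assumed toy data, only illustrates the wrap-and-approximate idea:

```python
# Hedged sketch: approximate a black-box model with an interpretable surrogate.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data standing in for a real decision problem
X, y = make_classification(n_samples=500, n_features=4, random_state=0)

# The 'uninterpretable' model we want to explain
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# The surrogate learns from the black box's outputs, not the true labels
surrogate = DecisionTreeClassifier(max_depth=2, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Human-readable rules that approximate the forest's logic
print(export_text(surrogate))
```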
Adding transparency: emerging methods
Ribeiro, M.T. et al. (2016). "Why Should I Trust You?": Explaining the Predictions of Any Classifier. arxiv:1602.04938
Governance approaches: food for thought
Could we…
- Start a 3rd-party standards and certification scheme, with regular independent audits?
- Foster ‘chartered’ data scientists with better ethical training and professional codes?
- Better enforce the laws we have with a statutory algorithmic investigatory body/super-complaint watchdog?
- Do something completely different?
🤖
> fin # tweet me @mikarv > |