‘Big Data’ and the Challenge to Informed
Consent as a Basis for Privacy Protection
Talk for IEEE and CLPC, 22 May 2014, UNSW
David Vaile
Co-convenor, Cyberspace Law and Policy Community
Faculty of Law, University of New South Wales
http://cyberlawcentre.org/2014/IEEE/
Outline
About Big Data, Consent
Challenges for consent
Big Data
Distinguishing characteristics
Context
Good and bad consent
Zombie consent
Difficulties with scale
Need for consent rejected?
No purpose, causation?
Manipulation of consent?
Lessons
Welcome
I’ll give a talk touching on Consent issues raised by Big Data. It’s a first iteration, feedback welcome!
Lyria Bennett Moses will respond, and add observations from her research in technology regulation
Holly Raiche will explain the impact of the recent EU court decision which threw out the Data Retention directive
Questions of fact or clarification are OK during the talk or after. Main discussion at the end?
Background: About Big Data, about Consent
What is ‘Big Data’, after the Hype Cycle?
Partly hype and marketing, but real differences beyond scale
Many facets
A Technology, or combination of data and functionality, with certain technical features and characteristics
A ‘Frame’ or brand, a ‘Meme’ with its own rhetorical character and assumptions
Some of the key relevant uses and tools came from marketers with software engineering genius:
Google (core MapReduce tool)
Facebook (now reinventing Big Data data centre hardware)
Big Data as technology: distinguishing features cf. old dBs
PR: Velocity, volume, variety, variability, value
Huge, fast/near real time, heterogeneous…
Omnivorous:
Complete data set, not a sample
Every data set, not just one
All data types, not just obvious records
Adaptable: metadata as well as content
Low integrity: data need not be accurate, current, fit
Purposeless: no need for prior purpose to be designed in
Association not causation
Omnivorous, hungry for data for its own sake?
When “too much data is never enough”?
Many methods based on access to every record in a data set, not a sample or slice
Data takeup is very flexible, so more data sets are low cost to ingest, and thus attractive compared
Can work on both metadata and content data (in comms terms), doing pattern recognition on, say, movement and photographs
Dirty data is OK… until you send in the drones
There are clever means for dealing with both incomplete data and dirty, incorrect data
This is OK for some purposes (weather) but potentially not where an individual is identified and targeted for individual treatment
The less risky end of this is marketing: if dirty data means that some ads are a few % less persuasive than otherwise, little is lost. Key tools and uses came from this industry, or those with no Personal info link.
However, personally serious outcomes such as being refused health insurance at a viable price, or becoming a drone target,
‘Purposeless’: Outputs not designed in, self-modifying rules?
NOT: collection for a purpose, limited to that purpose, destroyed when purpose is over, stored in a silo secure for that purpose. [Ebay]
Assumption of a ‘fishing expedition’: something will come up, some new association, new insight, cannot pre-specify
‘We want everything because we want everything…’
Machine learning from any given data set, generates own rules, new associations
Exploitation of new functionality using old collections of data
Prefers longitudinal data retention in lakes to transient silos
Expertise not applicable? (after a certain point)
Is expertise superseded by Big Data systems (if they work)?
The scale of data is beyond human comprehension or analysis
The Algorithms similarly: Machine learning rules are written by machines, not programmers, using scale and probabilistic inputs which are beyond our ken
A good Big Data system is iteratively self improving (if you have the feedback correct), so may get better than any expert
At this point the expert’s view of what it is doing may become unreliable, and any possibility of auditing or correction lost
Deus ex Machina? Computer says no? Black box must be obeyed?
Context: Ask Forgiveness not Permission
Meme arising from early days of IT: Grace Hopper?
Chips Ahoy magazine, US Navy, July 1986,
Appropriate for fast, ‘Agile’, ‘Extreme’ software development to bypass bureaucracy
Assumes truly ‘disposable’, ‘throwaway’ prototypes. Fixed by v2
Also works for innovative business models, where failure is OK, test limits
FAIL: for personally significant information: v2 does not help the victim of unintended disclosure, publication or exposure
FAIL: if there is no effective enforcement (FB wrist slap 2011)
Context: ‘Cult of Disruption’ in key data driven firms
‘Forgiveness not permission’ (Google, many others)
‘Move fast and break things’ (FB, fudged last week)
Attractive to small start-ups and to the online giants
Often implies the key disruption is cool new technology
But often also relies on traditional risk-shifting, cost-evading and side-stepping obligations
Reluctance to accept obligations re Tax (Google, Apple…), Insurance (Uber), Wages, License fees, Compliance, etc.
Essentially inimical to idea of compliance
Consent
One legal basis for data processing is “freely-given, unambiguous and informed consent of the data subject to the specific processing operation.”
Article 2 (h), EU Data Protection Directive
Consent also works as the basis for entry into a contract
Consumer protection recognises contract law is often unfair to consumers because of gross disparity of knowledge and bargaining power with a big business
Precautionary Principle: if there is compelling info to suggest a path has an irrevocable step into a situation with real risk of serious harm, don’t proceed until you can clarify the risk and know it is OK.
Good and bad consent (thinking as a subject)
Informed, not ignorant (info suits your needs)
Unbundled, not bundled (holding you hostage to something essential, all or nothing)
Before the fact, not after
Explicit not implied
Revocable not permanent – this is your insurance (Google likes to think you get a chance to say yes, until you do, and no way back)
Consent needs proper information to reveal the ‘price’
A business assesses ‘cost and risk’ against benefit
Due diligence needs specific info to work out who to trust
You need info to help you appreciate risks, not just benefits, and assess probability and impact
Different people at different times need diff. info - Not beyond the power of Big Data firms
Potential reluctance to be specific, across the board: Information asymmetry: they know you, but not the reverse
Zombie consent: click that nice blue button
Many consumers, given little real choice, bundled consent, and confusing and meaningless info, just click the online consent button
Trained like rats or birds to click the button to get the reward
It says: “I have read and understood and agree”
It means: “I haven’t read and couldn’t understand, whatever”
The role of consent may be limited by both consumer behaviour (lying about their agreement) and the complicity of operators (who could offer
Recent developments
US: two reports to Obama – minimal consideration of consent
EU
ECJ ruling invalidating the Data Retention Directive - Holly
EDPS report, Privacy and competitiveness in the age of big data, March 2014
ECJ ruling requiring Google to offer a ‘right to be forgotten’, Spanish bankruptcy – spent convictions model – revocation?
Challenges for Big Data and Consent
Vast aggregations are difficult to explain for consent purposes
The complexity and extent of the functionality may present issues, especially if there is no constraint on use or purpose
But it could be done… If it mattered
Google is a master of translating complexity to comprehensible chunks
Data visualisation could help, key big data tool
Conscious decision not to try, to seek obfuscation?
Reluctance to accept transparency? Hiding behind complexity
Claims it’s too hard, Privacy is over, Consent irrelevant
From the cult of Disruption:
We are new, fast, smart, cool, so just get out of the way!
Respecting your wishes would cramp our style, so don’t make us ask
(Real issue: we don’t want to have to obey a refusal to consent)
Bundled consent: if we have to ask consent, ‘the terrorists will win’, or ‘you won’t have any friends’, or ‘no new toys’
Is this a real objection, or framing the question to get No?
Potential reluctance to learn from e-commerce, micro-transactions, Bitcoin, other new technologies, or even Big Data itself?
Consent v. Unequal bargaining power of Big Data ops
Have we stepped back into a contract-first world, before consumer protection stepped in to redress the imbalances?
Unilateral, non-negotiable, incomplete contracts
Swedish Data Protection Board 2013: Google refuses to negotiate on a contract that omits key data about who, where and for what purposes your personal info can be used
Compliance impossible to ascertain
So: Not suitable to sign!
The absence of key information is presented as a bluff. The Swedes called it, everyone else takes the sucker’s option
Role for consumer protection law to redress the balance?
‘Forgiveness, not permission’ = No consent?
The ‘forgiveness’ slogan appears to be fundamentally hostile in principle to idea that the data subject might have the prior right over what is done with their data – possession?
Conflates external regulation with personal permission and consent
Permission in this case is permission from the individual in the form of informed consent
Forgiveness often is sought from other than the affected subject, or only sought if caught
Hostility to any form of prior permission seeking?
When consent is reluctantly sought, it is formalistic, not aimed at enabling due diligence or real understanding
Association not Causation: should you ever consent to this?
Association, uncertainty, incompleteness, out-of-dateness, inapplicability to the purpose may all not be fatal flaws for the original task of marketing tweaks
But as soon as real decisions and risks are linked to an individual, the reality that possibly random associations are at the core, not falsifiable evidence-based understanding of a true deep causal connection,
Raises questions about whether anyone should be expected to accept this level of uncertainty
Especially when the means for auditing or verification or correction are absent
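A rough illustration of the 'possibly random associations' point, as a sketch only and not drawn from the talk: the function name, population size and attribute count below are all hypothetical. It generates an outcome by coin flip, then searches many equally random attributes for the one that best "matches" it. With enough attributes to search, something will appear to be associated with the outcome even though nothing causes anything here.

```python
import random

def strongest_chance_association(n_people=50, n_attributes=2000, seed=0):
    """Return the best agreement rate found between a random outcome
    and any of many random attributes (all values are coin flips)."""
    rng = random.Random(seed)
    # Hypothetical outcome per person, e.g. "flagged" or "not flagged".
    outcome = [rng.random() < 0.5 for _ in range(n_people)]
    best = 0.0
    for _ in range(n_attributes):
        # Each attribute is pure noise, unrelated to the outcome.
        attribute = [rng.random() < 0.5 for _ in range(n_people)]
        agreement = sum(a == o for a, o in zip(attribute, outcome)) / n_people
        # A strong mismatch is as "predictive" as a strong match.
        best = max(best, agreement, 1 - agreement)
    return best

print(strongest_chance_association())
```

A rate of 0.5 would mean no association at all. The more attributes are searched, the further above 0.5 the best chance match tends to sit, which is the concern once such a match is used to make a decision about an individual.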
No purpose = No information for consent?
OECD Principles-based Privacy law is based on permitting any reasonable use of your personal data, not getting in the way of specific necessary tasks
But it assumes you must be told the purpose for a collection, use and/or disclosure
This is so you understand what it’s for, and can decline if you are not happy with that purpose or use (even at some cost)
Search warrants are also issued for a specific purpose, and not for a ‘fishing expedition’
Big data purposes are often made up as you go along, precisely a ‘fishing expedition’ with machine learning and new associations, re-identification, new algorithms
Consent v. Deep understanding of what’s in your head
Psychographic profiling aims to ‘get inside your head’ by extracting insights from associations from data surveillance
A/B testing and other techniques are used to refine understanding of all the factors which affect the choice to click ‘yes’
Capacity to understand you, predict your behaviour or reactions
Capacity to persuade you, find neuro-linguistic keys to you
Capacity to frame a message irresistible to YOU
Flies under the radar, like subliminal advertising (illegal manipulation)
Potentially undermines basis for real consent?
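The A/B testing point can be sketched in a few lines. This is a simulation under stated assumptions, not a description of any firm's actual system: the variant names and the underlying 'yes' rates are invented for illustration. Note what the test optimises: the share of visitors who click 'yes', with no measure of whether they understood what they agreed to.

```python
import random

def run_ab_test(true_yes_rates, visitors_per_variant=10_000, seed=1):
    """Simulate showing each consent prompt variant to the same number of
    visitors and return the variant with the highest observed 'yes' rate."""
    rng = random.Random(seed)
    observed = {}
    for name, rate in true_yes_rates.items():
        # Each simulated visitor clicks 'yes' with the variant's assumed rate.
        clicks = sum(rng.random() < rate for _ in range(visitors_per_variant))
        observed[name] = clicks / visitors_per_variant
    winner = max(observed, key=observed.get)
    return winner, observed

# Hypothetical variants and rates, assumed purely for illustration.
variants = {"plain_explanation": 0.40, "friendly_blue_button": 0.55}
winner, observed = run_ab_test(variants)
print(winner, observed)
```

Repeating this cycle with new wordings, colours and placements steadily raises the 'yes' rate, which is why it bears on whether the resulting consent is real.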
Lessons
Too early to tell - real challenges for consent from Big Data?
Some may arise from the technology, or the business model
But some from the old-fashioned ‘cult of disruption’: uses technology to distract from unwillingness to meet obligations
Awareness of implications and risks is hard
There is a reluctance to assist understanding of this: denial, obfuscation, missing info, incomplete contracts
A poor basis for consumer friendly negotiation?
A poor basis for trust?
Questions?
David Vaile
Cyberspace Law and Policy Community
Faculty of Law, UNSW
http://cyberlawcentre.org/2014/IEEE/
0414 731 249