Privacy through Anonymisation in Large-scale Socio-technical Systems: The BISON Approach
Privacy through Anonymisation in Large-scale Socio-technical Systems
The BISON Approach
Claudia Cevenini, Enrico Denti, Andrea Omicini, Italo Cerno
{claudia.cevenini, enrico.denti, andrea.omicini, italo.cerno}@unibo.it
Dipartimento di Informatica – Scienza e Ingegneria (DISI)
Alma Mater Studiorum – Università di Bologna
DMI, Università di Catania
Catania, Italy, 25 July 2016
Cevenini, Denti, Omicini, Cerno (UniBo) Privacy through Anonymisation DMI, Catania, 29/07/2016 1 / 38
Outline
1 Scope & Goals
2 Legal Framework
3 Socio-Legal-Technical Analysis
4 Anonymisation Process
5 Anonymisation Process in BISON
6 Conclusion & Further Work
Scope & Goals
Scope & Goals Context & Motivation
Scope and Purpose
the research focusses on contact centres (CC) as relevant examples of knowledge-intensive sociotechnical systems (STS)
we discuss the multifaceted aspects of anonymisation
individual and organisational needs clash
only an accurate balancing between legal and technical aspects could possibly ensure system efficiency while preserving the individual right to privacy
we discuss first the overall legal framework, then the general theme of anonymisation in CC
we overview the technical process developed in the context of the BISON project
Contact Centres as STS
Typical technology issues of CC as STS
basic speech data mining technologies with multi-language capabilities
business outcome mining from speech
CC support systems integrating both speech and business outcome mining in a user-friendly way
Scaling up to big data processing clearly also scales up the privacy and data protection issues
Goal of the Research
to assess how complex legal issues at both national and international level can be dealt with while building a complex software infrastructure for CC, both in the development and in the subsequent business phases
to investigate how complex software infrastructures for CC may be developed and marketed in full respect of the data protection legal framework
to focus on anonymisation as a fundamental concept and tool to deal with the potential conflict between opposite rights and needs, especially in the research and development phase of a large-scale, knowledge-intensive STS
Law and IT: A Focal Point
Privacy vs. efficiency
finding a suitable compromise between law-abidingness and privacy on the one hand, and system / process efficiency on the other, is a relevant goal
not just for the legal analysis, but for the whole engineering process that leads to the construction of the CC infrastructure
a potential conflict of interests should become a composition of interests
the requirement of legal compliance can be exploited as a success factor instead of a source of delays and overheads
an issue going well beyond the CC case study
Legal Framework
Data Protection Directive (DPD)
DPD: the EU Data Protection Directive (Dir 95/46/EC) [DPD95] sets key principles for the fair and lawful processing of personal data, and for the technical and organisational security measures designed to guarantee that all personal data are safe from destruction, loss, alteration, unauthorised disclosure, or access during the entire data processing period
data processing requires even more care when it involves large amounts of personal and/or sensitive data
in particular, people should be able to manage the flow of their data across massive, third-party analytical systems, so as to have a transparent view of how their data will be used or sold
data transfer from the EU to non-EU countries, as well as the use of cloud services, is also a particularly hot topic, since non-EU countries might provide an insufficient level of protection for personal data
Personal Data
What is personal data?
any information relating to a natural person who can be identified, either directly or indirectly, by reference to one or more factors specific to his/her physical, physiological, mental, economic, cultural, or social identity
if the link between an individual and personal data never occurred, or is somehow broken and cannot be rebuilt in any way (such as with anonymised data), the DPD rules no longer apply
Roles in Personal Data Processing
Data controller vs. data processor
the data controller is in charge of personal data processing and takes any related decision
e.g., selection of data to be processed, purposes and means of processing, technical and organisational security, ...
the data processor is a legally separate entity that processes personal data on behalf of a controller, by virtue of a written agreement and following specific instructions
in other words, the controller processes data on its own behalf, while the processor always acts on behalf of a controller, from whom it derives its power and range of activity
for instance, a company acts as a controller in processing its own customers' data, whereas the CC entrusted with the same processing acts as a processor on behalf of the company
How to Process Personal Data According to the DPD
Processing personal data
Personal data must be
processed fairly and lawfully
collected for specified, explicit, and legitimate purposes and not further processed in a way incompatible with those purposes
further processing of data for historical, statistical or scientific purposes may not be considered as incompatible, provided that appropriate safeguards are in place
adequate, relevant and not excessive in relation to the purposes
accurate and, where necessary, kept up to date; inaccurate or incomplete data should be erased or rectified
kept in a form which permits identification of data subjects for no longer than is necessary for the purposes
Accountability
According to the accountability principle
data controllers must implement adequate technical and organisational measures to promote and safeguard data protection in their processing activities
controllers are responsible for the compliance of their processing operations with data protection law and should be able to demonstrate compliance with data protection provisions at any time; they should also ensure that such measures are effective
in case of larger, more complex, or high-risk data processing, the effectiveness of the measures adopted should be verified regularly, through monitoring, internal and external audits, etc.
Security Measures
Technical and organisational security measures should be adopted
to protect personal data
during all the processing period
against the risks related to the integrity and confidentiality of data
The level of data security required by the law is determined by different elements, such as
the nature (sensitive/non-sensitive) of the collected data
the concrete availability in the market of adequate security measures at the current state of the art
their cost, which should not be “disproportionate” with respect to the necessity
Big Speech Data Issues I
Speech
A large-scale STS infrastructure involves speech recordings, i.e. it processes biometric data (tone, pitch, cadence, and frequency of a person's voice) that can be used to determine the identity of a person.
Big Speech Data Issues II
from a data protection perspective, biometrics is linked to physical, physiological, behavioural, or even psychological characteristics of an individual, some of which may be used to reveal sensitive data
biometric data may also enable automated tracking, tracing, or profiling of persons: as such, their potential impact on privacy is high
biometric data are by nature irrevocable
→ the processing of biometric data is not only subject to the informed consent of the data subject, but may also require authorisations from, or notifications to, Data Protection Authorities, and is subject to strict rules on the security measures that must be adopted to protect data
Big Speech Data Issues III
Big Data
big data analytics can involve the repurposing of personal data
if an organisation has collected personal data for one purpose and then decides to start analysing it for another one (or to make it available for others to do so), data subjects need to be informed and a new, specific consent is needed
big data may in itself conflict with the principles of data minimisation and relevancy
the challenge for organisations is to focus on what they expect to learn or be able to do by processing big data before processing operations begin, thus verifying that the data serve the purpose(s) they are to be collected for and, at the same time, that they are relevant and not excessive in relation to such aim(s)
Socio-Legal-Technical Analysis
Relevant Principles I
the current legal framework foresees a set of essential principles that should inspire the design and development of any law-abiding information system processing personal data
while some of these principles derive directly from the DPD (namely, from the “Principles relating to data quality”), others concern the security measures that should be adopted, particularly with reference to the “Security of processing”
these principles are further strengthened and detailed in the “General Data Protection Regulation” (GDPR) [GDP16]
Relevant Principles II
Categories for principles
(a) principles about data processing
(b) principles about security measures
(c) other relevant principles
Principles of Data Processing
1 principle of lawfulness and fairness
2 principle of relevance and non-excessive use
3 principle of purpose
4 principle of accuracy
5 principle of data retention
Principles of Security Measures
1 principle of privacy by design
2 principle of appropriateness of the security measures
3 principle of privacy by default
Other Relevant Principles
1 principle of least privilege
2 principle of intentionality in performing any critical action
Technological Requirements for Anonymisation
Resulting requirements
personal data may be processed only to the extent they are needed to achieve specific purposes: whenever identifying data are not necessary, only anonymous data should be used
the DPD does not apply to data rendered anonymous such that the data subject is no longer identifiable: it does not set any prescriptive standard, nor does it describe the de-identification process, only its outcome, namely that re-identification must be reasonably impossible
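The first requirement can be illustrated with a minimal sketch. The purposes, field names and the `minimise` helper below are hypothetical and are not taken from BISON; they only show the idea of passing on the fields a stated purpose needs and nothing else.

```python
from typing import Any

# Hypothetical mapping from a processing purpose to the fields it needs.
FIELDS_BY_PURPOSE: dict[str, set[str]] = {
    "quality_statistics": {"call_duration_s", "language", "outcome"},
    "callback": {"customer_name", "phone_number", "outcome"},
}


def minimise(record: dict[str, Any], purpose: str) -> dict[str, Any]:
    """Return a copy of the record with only the fields the purpose requires."""
    allowed = FIELDS_BY_PURPOSE[purpose]
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "customer_name": "Jane Doe",
    "phone_number": "+39 000 0000000",
    "call_duration_s": 312,
    "language": "it",
    "outcome": "resolved",
}
stats_view = minimise(record, "quality_statistics")
```

Dropping direct identifiers in this way is only a first step: the remaining fields may still allow re-identification, so the result cannot be assumed to be anonymous in the legal sense.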
Anonymisation Process
How Should Data be Anonymised?
the DPD does not apply to data made anonymous in such a way that the data subject is no longer identifiable
however, it is difficult to create a truly anonymous dataset and, at the same time, to retain all the data required for a specific (organisational) task
on the other hand, irreversibly preventing identification requires data controllers to consider all the means which may likely reasonably be used for identification, either by the controller or by a third party
Article 29 Working Party
the Article 29 Working Party – Opinion on Anonymisation Techniques (Art. 29 WP henceforth) [Dir14] is an important reference for compliance in anonymisation issues
the criteria on which Art. 29 WP grounds its opinion on robustness focus on the possibility of
singling out an individual
linking records relating to an individual
inferring information concerning an individual
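As a rough illustration of the first criterion, the sketch below counts how many records share each combination of quasi-identifiers and reports the combinations that occur fewer than `k` times, in the spirit of a k-anonymity check. The field names, the threshold and the helper names are assumptions made for this example; the check is not the test prescribed by Art. 29 WP, and passing it does not rule out linkability or inference.

```python
from collections import Counter
from typing import Any, Sequence


def group_sizes(rows: Sequence[dict[str, Any]],
                quasi_identifiers: Sequence[str]) -> Counter:
    """Count the records sharing each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def singled_out(rows: Sequence[dict[str, Any]],
                quasi_identifiers: Sequence[str],
                k: int = 2) -> list[tuple]:
    """Return the combinations shared by fewer than k records."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [combo for combo, count in sizes.items() if count < k]


calls = [
    {"city": "Bologna", "language": "it", "outcome": "resolved"},
    {"city": "Bologna", "language": "it", "outcome": "escalated"},
    {"city": "Catania", "language": "en", "outcome": "resolved"},
]
at_risk = singled_out(calls, ["city", "language"], k=2)
```

Here the single English-language call from Catania forms a group of size one, so it is the only combination flagged.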
Anonymisation Process in BISON
Anonymisation in BISON
Fundamental distinction
research phase: when software and technologies are being developed and tested, but are not yet in actual production
business phase: the subsequent, foreseeable phase, when they actually deal with real customers' data
anonymisation is seen as the fundamental tool to set the industrial research phase free from the complex requirements imposed by the Data Protection rules, given that the DPD does not apply to anonymised data
at the same time, in the business phase that will follow the research project, the tool will have to deal with real user data, in compliance with applicable laws
The BISON Anonymisation Process: General Overview
Figure: Anonymisation during the Start-up stage and Research stage in BISON
The BISON Anonymisation Process: Stages
in the first stage of the BISON research, anonymisation is performed mostly with manual procedures, because of the limited data size and the initial lack of automatic tools
in the second stage, huge amounts of speech data need to be processed: automatic transcription, for all the supported languages, has to be put in place
automatic anonymisation is performed on the original audio file and may not be 100% effective
every effort should be made to reduce these errors to the minimum: the automatic anonymiser should be designed, trained, and tested according to the best available practices
the subsequent feature extraction helps to deal with this issue, because the extracted statistics make it (mostly) impossible to reconstruct the original audio file
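The two automatic steps can be sketched as follows, under strong assumptions: the audio is a plain list of samples, the time spans containing personal data have already been detected by some upstream component, and the features are simple summary statistics. None of these choices is taken from BISON, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def mute_spans(samples: list[float], sample_rate: int,
               spans: list[tuple[float, float]]) -> list[float]:
    """Return a copy of the audio with the given (start_s, end_s) spans silenced."""
    out = list(samples)
    for start_s, end_s in spans:
        first = max(0, int(start_s * sample_rate))
        last = min(len(out), int(end_s * sample_rate))
        for i in range(first, last):
            out[i] = 0.0
    return out


def summary_features(samples: list[float]) -> dict[str, float]:
    """Aggregate statistics only: the waveform cannot be rebuilt from them."""
    return {
        "mean": mean(samples),
        "std": pstdev(samples),
        "peak": max(abs(s) for s in samples),
    }


audio = [0.5] * 8                        # toy signal: 2 seconds at 4 samples/s
anonymised = mute_spans(audio, sample_rate=4, spans=[(0.5, 1.0)])
features = summary_features(anonymised)
```

The design point is the ordering: even if the muting step misses a span, only the aggregate features leave the pipeline, which limits what a residual error can expose.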
Technological Requirements
the BISON tool should adhere to strict security requirements: users' roles, rights, and restrictions should be tuneable on a fine-grained basis, and be further detailed case by case, based both on the actual needs and on the applicable national legal framework
on-the-fly anonymisation should be available to deal with the case in which some unexpected personal data are heard by the CC agent
in the final state of the system (ready-to-market), users will need to be enabled to anonymise personal data whenever not needed for the specific purposes of the processing, and they should be able to do so in a highly customisable way
the key challenge from this viewpoint is also to make anonymisation future-proof, both with respect to a continuously evolving legal scenario and to technological improvement, which evolves even faster
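The first requirement, combined with the principle of least privilege, could be sketched as a deny-by-default permission check. The roles and action names below are invented for the example and do not describe the actual BISON tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    """A named set of explicitly granted actions; anything else is denied."""
    name: str
    permissions: frozenset[str] = field(default_factory=frozenset)


# Hypothetical roles: each one receives only the rights its task requires.
AGENT = Role("agent", frozenset({"listen_live", "anonymise_on_the_fly"}))
ANALYST = Role("analyst", frozenset({"read_anonymised"}))


def is_allowed(role: Role, action: str) -> bool:
    """Deny by default: an action is allowed only if explicitly granted."""
    return action in role.permissions
```

For instance, `is_allowed(AGENT, "anonymise_on_the_fly")` holds, while `is_allowed(ANALYST, "listen_live")` does not. Per-country restrictions could then be expressed as different permission sets for the same role name.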
Conclusion & Further Work
Conclusions I
the practices of contemporary software engineering have to be extended to include non-computational issues such as normative, organisational, and societal aspects
this holds in particular for large-scale STS: for instance, the law-abidingness of complex software systems including both human and software agents is quite an intricate issue, to be faced in the requirement stage of any reliable software engineering process
in this work we have specifically addressed the anonymisation of speech data in CC, discussing the need for an accurate balancing between legal and technical aspects in order to ensure system efficiency while preserving the individual right to privacy, and showing how the legal framework can actually translate into requirements for the software engineering process
Conclusions II
by discussing the BISON approach, we show how the anonymisation process can be structured during the industrial research phase to enable the resulting system to deal with the amount of data actually required in the business operation phase
References
References I
[Dir14] Article 29 Data Protection Working Party. Opinion 05/2014 on anonymisation techniques. http://ec.europa.eu/justice/data-protection/article-29/, 18 April 2014. 0829/14/EN WP216.
[DPD95] Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. Official Journal of the European Communities, 38(L 281):31–50, 23 November 1995.
[GDP16] Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (text with EEA relevance). Official Journal of the European Union, 59(L 119):1–88, 4 May 2016.
Extras
URLs
Slides
on APICe
→ http://apice.unibo.it/xwiki/bin/view/Talks/BisonCatania2016
on SlideShare
→ http://www.slideshare.net/andreaomicini/privacy-through-anonymisation
Related paper
on APICe
→ http://apice.unibo.it/xwiki/bin/view/Publications/BisonInsci2016
on Springer
→ ?