De-Identification Tuesday 10 May 5 - 6.30 pm Level 5 Theatrette, 121 Exhibition Street Melbourne
Privacy and Data Protection Week 9-13 May 2016
Commissioner for Privacy and Data Protection
UNCLASSIFIED
Presenters (name, role and agency)
Siu Ming Tan: Chief Methodologist and General Manager of the Methodology Division at the Australian Bureau of Statistics (ABS)
Dr. Stephen Hardy: Group leader for Data Platform Engineering at Data61 in CSIRO
Fiona Dowsley: Chief Statistician of Victorian Crime Statistics Agency, and Acting Director of Strategic Planning at the Department of Justice & Regulation
Greg Gough: Manager of the DataVic Access Policy, Department of Treasury and Finance
What shall I cover?
Some Terminology
Legislative requirement on maintaining confidentiality
Data Utility & Disclosure Risk
The Five Safes Framework
Some Terminology
Privacy: Requirement to respect the private information of individuals
Confidentiality: Requirement that information, whether private or not, be stored, kept or released in such a manner that identifying whom the information refers to is not possible
Anonymisation: Process to remove the direct identifiers from information (e.g. name, address, ABN)
Un-identifiable info: Information treated in such a way that re-identification is not possible
The Census and Statistics Act, 1905
Ø Every ABS officer to sign an undertaking of fidelity and secrecy (section 7)
Ø Statistical information not to be disseminated in a manner likely to enable the identification of a particular person or organisation (subsection 12(2))
Ø De-identification is not sufficient to meet legislative requirements
Ø Release must not be likely to lead to re-identification
Data Utility versus Disclosure Risk (I)
Data Utility
Ø Ability to use the data to draw valid conclusions
Disclosure Risk
Ø Spontaneous recognition
Ø Matching risk
Ø Higher risk for unit record than aggregated data
Protections
Ø Perturbation
Ø Cell suppression
Ø Collapsing of categories
Ø Sampling
Ø Record masking
Ø Substitution of values
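Two of the protections listed above, collapsing of categories and cell suppression, can be sketched on a small frequency table. This is a minimal illustration only: the records, the age bands and the threshold of 3 are hypothetical and are not taken from the presentation or from any ABS rule.

```python
from collections import Counter

def collapse_age(age):
    """Collapse single years of age into broad bands (collapsing of categories)."""
    if age < 25:
        return "under 25"
    if age < 65:
        return "25-64"
    return "65+"

def suppress_small_cells(table, threshold=3):
    """Replace counts below the threshold with None (cell suppression)."""
    return {cell: (n if n >= threshold else None) for cell, n in table.items()}

# Hypothetical unit records: (age, region)
records = [(19, "North"), (23, "North"), (24, "North"), (31, "North"),
           (45, "South"), (52, "South"), (58, "South"), (71, "South")]

detailed = Counter(records)  # every (age, region) cell holds a single person
collapsed = Counter((collapse_age(age), region) for age, region in records)
released = suppress_small_cells(collapsed, threshold=3)

for cell, count in released.items():
    print(cell, "suppressed" if count is None else count)
```

The detailed table has a count of 1 in every cell, so each cell points at one person. Collapsing raises some counts, and suppression hides the cells that are still small, at the cost of detail. A real release would also need to check that suppressed cells cannot be recovered from published totals.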
Data Utility versus Disclosure Risk (II)
Ø Disclosure risk is reduced by applying more protections, but data utility is reduced as well
Ø Data utility is maximised if no protection is applied, but disclosure risk is significantly increased
Ø Where to strike the balance?
Ø Need to think beyond just applying data protections.
The Five Safes Framework
Safe people: Can the person be trusted to use the data appropriately?
Safe project: Is the specific use of the data appropriate?
Safe setting: How does the mode of access limit the risk of disclosure?
Safe data: How much protection is to be applied to the data?
Safe output: What controls are applied to ensure the output is non-disclosive?
A multidimensional approach to disclosure risk assessment
Netflix re-identification
100,000,000 movie ratings, 480,000 Netflix subscribers
2005, 10% sample
Anonymised: Id – movie – rating – date
Identified: Name – movie – rating – date
"Robust De-anonymization of Large Sparse Datasets", Narayanan and Shmatikov (2008)
Utility vs Privacy | Stephen Hardy
Rating Uniqueness
Eight ratings (two of which may be wrong), with dates known to within 2 weeks, uniquely identify 99% of the people in the Netflix database
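The idea behind the uniqueness figure can be sketched on toy data: count how many people are singled out once an adversary knows k of their ratings. This is a simplified sketch, not the method of Narayanan and Shmatikov. It uses exact matching only, ignores dates and wrong ratings, and the `release` dataset and function names are hypothetical.

```python
def matches(dataset, known):
    """Return the ids of all users whose records contain every known (movie, rating) pair."""
    return [uid for uid, ratings in dataset.items() if known <= ratings]

def unique_fraction(dataset, k):
    """Fraction of users singled out by k of their own ratings."""
    unique = 0
    for uid, ratings in dataset.items():
        known = set(sorted(ratings)[:k])  # the k pairs the adversary is assumed to know
        if matches(dataset, known) == [uid]:
            unique += 1
    return unique / len(dataset)

# Hypothetical anonymised release: id -> set of (movie, rating)
release = {
    "u1": {("A", 5), ("B", 3), ("C", 4)},
    "u2": {("A", 5), ("B", 3), ("D", 1)},
    "u3": {("A", 5), ("E", 2), ("F", 4)},
    "u4": {("A", 5), ("B", 2), ("C", 4)},
}

for k in (1, 2, 3):
    print(k, unique_fraction(release, k))
```

Even in this tiny example nobody is unique on one rating, half are unique on two, and everyone is unique on three. Sparse data becomes identifying very quickly as the adversary learns more.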
Mobility data
"Unique in the Crowd: The privacy bounds of human mobility", de Montjoye, Hidalgo, Verleysen, & Blondel (2013)
Uniqueness of 1.5 million users
Four location-and-time points uniquely characterise 95% of the people in a 1.5 million person mobility database
Utility vs Privacy
The more data that is linked together, the more unique it becomes
The more data that is linked together, the more useful it becomes
But… Because…
Current Approaches to Anonymisation
1. Masking
2. Generalisation + grouping
• Loses valuable information.
• Can still be re-identified in some cases.
Before masking:
First Name: John
Last Name: Smith
Email: [email protected]
Address1: 1 Smith St
Address2: Sydney, 2000
Last Travel Destination: Spain
Travel Date: January 2015
After masking:
First Name: John
Last Name: Smith
Email:
Address1:
Address2:
Last Travel Destination: Spain
Travel Date: January 2015
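The two approaches can be sketched as follows, together with a check of the smallest group size, which is the k in k-anonymity. All field names, the `postcode` field, the example email address and the generalisation rules are hypothetical choices for illustration and do not come from the slide.

```python
from collections import Counter

DIRECT_IDENTIFIERS = ("first_name", "last_name", "email", "address1")
QUASI_IDENTIFIERS = ("postcode", "travel_date")

def mask(record):
    """Masking: blank out the direct identifiers, keep everything else."""
    return {k: ("" if k in DIRECT_IDENTIFIERS else v) for k, v in record.items()}

def generalise(record):
    """Generalisation: coarsen quasi-identifiers (postcode to first digit, date to year)."""
    out = dict(record)
    out["postcode"] = record["postcode"][0] + "***"
    out["travel_date"] = record["travel_date"].split()[-1]
    return out

def smallest_group(records, keys):
    """Size of the smallest group of records sharing the same values for `keys`."""
    groups = Counter(tuple(r[k] for k in keys) for r in records)
    return min(groups.values())

def person(first, last, postcode, travel_date):
    """Build one hypothetical travel record."""
    return {"first_name": first, "last_name": last,
            "email": first.lower() + "@example.com", "address1": "1 Smith St",
            "postcode": postcode, "destination": "Spain", "travel_date": travel_date}

records = [person("John", "Smith", "2000", "January 2015"),
           person("Jane", "Doe", "2010", "March 2015"),
           person("Alex", "Lee", "2060", "July 2015")]

masked = [mask(r) for r in records]
grouped = [generalise(r) for r in masked]
print(smallest_group(masked, QUASI_IDENTIFIERS))   # each masked record is still unique
print(smallest_group(grouped, QUASI_IDENTIFIERS))  # all three records now share one group
```

Masking alone leaves every record unique on postcode and travel date, so linking to another dataset could still re-identify people. Generalisation puts all three into one group, but the month of travel and the exact postcode are lost, which is the information loss noted above.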
Differential Privacy
Original data + tuned random noise → noisy data
Original data with any one person removed + tuned random noise → noisy data
The two noisy outputs are indistinguishable!
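One common way to realise "tuned random noise" is the Laplace mechanism applied to a counting query. The sketch below is an assumption about how the idea could be implemented, not a description of the presenter's system. The data, the query and the value of `epsilon` are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) noise, drawn as the difference of two exponential variates."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_count(records, predicate, epsilon, rng):
    """Count matching records, then add noise tuned to the query's sensitivity.

    Adding or removing one person changes a count by at most 1, so the
    noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1 / epsilon, rng)

def is_over_40(age):
    return age > 40

rng = random.Random(0)                      # seeded only to make the sketch repeatable
ages = [23, 35, 41, 52, 67, 70, 29, 44]     # hypothetical original data
without_one = ages[:-1]                     # the same data with one person removed

print(noisy_count(ages, is_over_40, epsilon=0.5, rng=rng))
print(noisy_count(without_one, is_over_40, epsilon=0.5, rng=rng))
```

The true counts are 5 and 4, but each released answer is blurred by noise that is large relative to one person's contribution, so an observer cannot tell with confidence which dataset produced it. A smaller `epsilon` gives more privacy and less accurate answers.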
Anonalytix Privacy Technology
Creates Synthetic Data – With Privacy Level Guarantees
High Data Granularity (Unit Records) for specific analyses
Anonalytix: Privacy-Safe Data Release | Roksana Boreli
Confidential Computing
Encrypted data + Encrypted data → Encrypted Analysis → Decrypted Answers
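The slide does not name an encryption scheme. One family that fits the picture is additively homomorphic encryption, where sums can be computed on ciphertexts without decrypting them. The sketch below is a toy Paillier implementation under that assumption. The tiny primes and the use of Python's `random` module make it unsuitable for real use, and all function names are hypothetical.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy Paillier key pair from two small primes (real keys use far larger primes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; this shortcut relies on g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng):
    """Encrypt an integer m with 0 <= m < n under the public key n."""
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    n2 = n * n
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def add_encrypted(n, c1, c2):
    """Multiplying two ciphertexts yields an encryption of the sum of the plaintexts."""
    return (c1 * c2) % (n * n)

def decrypt(n, private_key, c):
    """Recover the plaintext using the private key (lam, mu)."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

rng = random.Random(1)  # not cryptographically secure; for illustration only
n, private_key = keygen()

c_a = encrypt(n, 42, rng)            # encrypted data from one source
c_b = encrypt(n, 58, rng)            # encrypted data from another source
c_sum = add_encrypted(n, c_a, c_b)   # encrypted analysis: no decryption needed
print(decrypt(n, private_key, c_sum))  # decrypted answer: 100
```

The party doing the addition only ever sees ciphertexts. Only the holder of the private key can read the answer, and it learns the sum without seeing either input.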
Tradeoffs
[Figure: techniques placed on Utility and Privacy axes: Raw data, Masking, k-Anonymity, Differential Privacy, Encrypted Computation]
Benefits of open data
• Increases productivity and improves personal and business decision making.
• Improves research outcomes.
• Improves the efficiency and effectiveness of government.
Economic Value
The Australian economy will grow by an extra $16 billion a year if government agencies make most of their data freely available to the public.
• Stimulates economic activity and drives innovation and new services.
DataVic Access Policy
• The default obligation under the Policy is for agencies to make de‑identified datasets available.
• If a dataset contains personally identifiable information, and cannot be de‑identified, it is not suitable for release under the Policy.
Open by design
• When developing or procuring a database or dataset, consideration should be given in the design phase to enabling public access to the data that is suitable for release under the Policy.
Just another way of looking at ‘Privacy by design’
Further information
• Websites: www.data.vic.gov.au www.dtf.vic.gov.au
• Email: [email protected]
• Twitter: @data_vic
• Phone: (03) 9651 1880
© State of Victoria 2016 You are free to re-use this work under a Creative Commons Attribution 4.0 licence, provided you credit the State of Victoria (Department of Treasury and Finance) as author, indicate if changes were made and comply with the other licence terms. The licence does not apply to any branding, including Government logos. Copyright queries may be directed to [email protected]