Dimensions of Privacy 18739A: Foundations of Security and Privacy Anupam Datta Fall 2009.


Dimensions of Privacy

18739A: Foundations of Security and Privacy

Anupam Datta

Fall 2009


Privacy in Organizational Processes

Achieve organizational purpose while respecting privacy expectations in the transfer and use of personal information (individual and aggregate) within and across organizational boundaries.

[Figure: information flows among Patient, Hospital, Insurance Company, Drug Company, and the PUBLIC. Flow labels: patient information; patient medical bills; advertising; aggregate anonymized patient information. The hospital is annotated "Complex Process within a Hospital".]


Dimensions of Privacy

What is Privacy?
  Philosophy, Law, Public Policy

Express and Enforce Privacy Policies
  Programming Languages, Logics, Usability

Database Privacy
  Statistics, Cryptography


Philosophical studies on privacy

Reading
  Overview article in the Stanford Encyclopedia of Philosophy: http://plato.stanford.edu/entries/privacy/
  Alan Westin, Privacy and Freedom, 1967
  Ruth Gavison, Privacy and the Limits of Law, 1980
  Helen Nissenbaum, Privacy as Contextual Integrity, 2004 (more on Nov 8)


Westin 1967: Privacy and control over information

“Privacy is the claim of individuals, groups or institutions to determine for themselves when, how, and to what extent information about them is communicated to others”

Relevant when you give personal information to a web site and agree to the privacy policy posted on that site

May not apply to your personal health information


Gavison 1980: Privacy as limited access to self

“A loss of privacy occurs as others obtain information about an individual, pay attention to him, or gain access to him. These three elements of secrecy, anonymity, and solitude are distinct and independent, but interrelated, and the complex concept of privacy is richer than any definition centered around only one of them.”

Basis for database privacy definition discussed later


Gavison 1980: On utility

“We start from the obvious fact that both perfect privacy and total loss of privacy are undesirable. Individuals must be in some intermediate state – a balance between privacy and interaction … Privacy thus cannot be said to be a value in the sense that the more people have of it, the better.”

This balance between privacy and utility will show up in data privacy as well as in privacy policy languages. For example, health data could be shared with medical researchers.


Contextual Integrity [Nissenbaum 2004]

Philosophical framework for privacy
Central concept: Context
  Examples: healthcare, banking, education
What is a context?
  Set of interacting agents in roles
    Roles in healthcare: doctor, patient, …
  Informational norms
    Doctors should share patient health information as per the HIPAA rules
    Norms have a specific structure (descriptive theory)
  Purpose
    Improve health
    Some interactions should happen: patients should share personal health information with doctors


Informational Norms

“In a context, the flow of information of a certain type about a subject (acting in a particular capacity/role) from one actor (could be the subject) to another actor (in a particular capacity/role) is governed by a particular transmission principle.”

Contextual Integrity [Nissenbaum 2004]


Privacy Regulation Example (GLB Act)

Financial institutions must notify consumers if they share their non-public personal information with non-affiliated companies, but the notification may occur either before or after the information sharing occurs

Exactly as CI says!

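The sentence above annotated with the components of an informational norm:
  Sender role: financial institutions
  Subject role: consumers
  Attribute: non-public personal information
  Recipient role: non-affiliated companies
  Transmission principle: the consumer is notified, either before or after the sharing occurs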
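The five components of an informational norm (sender role, subject role, attribute, recipient role, transmission principle) can be sketched as a small data structure. This is only an illustrative sketch, not the formal logic covered in the later lecture: the class names `Flow` and `Norm`, the function `allowed`, and the principle label "notify-subject" are all hypothetical, and the transmission principle is reduced to a named condition that either holds or does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One transfer of information, described by roles rather than individuals."""
    sender_role: str
    recipient_role: str
    subject_role: str
    attribute: str


@dataclass(frozen=True)
class Norm:
    """An informational norm: the flows it covers plus a transmission principle."""
    sender_role: str
    recipient_role: str
    subject_role: str
    attribute: str
    transmission_principle: str  # hypothetical label for a condition

    def covers(self, flow: Flow) -> bool:
        return (self.sender_role, self.recipient_role,
                self.subject_role, self.attribute) == (
            flow.sender_role, flow.recipient_role,
            flow.subject_role, flow.attribute)


def allowed(flow, norm, satisfied_principles):
    """Return True/False for a covered flow, None if the norm says nothing about it."""
    if not norm.covers(flow):
        return None
    return norm.transmission_principle in satisfied_principles


# The GLB Act sentence from the slide, encoded as one norm (assumed encoding).
GLBA_NOTICE = Norm(
    sender_role="financial institution",
    recipient_role="non-affiliated company",
    subject_role="consumer",
    attribute="non-public personal information",
    transmission_principle="notify-subject",
)

flow = Flow("financial institution", "non-affiliated company",
            "consumer", "non-public personal information")
print(allowed(flow, GLBA_NOTICE, {"notify-subject"}))  # notification happened
print(allowed(flow, GLBA_NOTICE, set()))               # no notification
```

A real policy language would also need to express the temporal part of the principle (notification before or after the flow), which this sketch leaves out.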


Privacy Laws in the US

HIPAA (Health Insurance Portability and Accountability Act, 1996)
  Protecting personal health information
GLBA (Gramm-Leach-Bliley Act, 1999)
  Protecting personal information held by financial service institutions
COPPA (Children's Online Privacy Protection Act, 1998)
  Protecting information posted online by children under 13

More details in later lecture about these laws and a formal logic of privacy that captures concepts from contextual integrity


Database Privacy

Releasing sanitized databases
1. k-anonymity [Samarati 2001; Sweeney 2002]
2. (c,t)-isolation [Chawla et al. 2005]
3. Differential privacy [Dwork et al. 2006] (next lecture)


Sanitization of Databases

[Figure: Real Database (RDB), e.g. health records, census data -> sanitization (add noise, delete names, etc.) -> Sanitized Database (SDB)]

Goals of sanitization:
  Protect privacy
  Provide useful information (utility)
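As a rough illustration of the RDB-to-SDB step, the sketch below deletes a name field and perturbs a numeric field with bounded random noise. The function name `sanitize`, the field names, the noise distribution, and the toy records are all assumptions made for this example. Nothing here is claimed to give a privacy guarantee; which sanitizations actually protect privacy is exactly the question the definitions in this lecture try to answer.

```python
import random


def sanitize(rdb, noise_scale=2.0, seed=0):
    """Return a sanitized copy of rdb: names deleted, ages perturbed.

    Illustrative only: uniform noise in [-noise_scale, +noise_scale]
    is an arbitrary choice, not a recommended mechanism.
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    sdb = []
    for record in rdb:
        clean = {k: v for k, v in record.items() if k != "name"}  # delete names
        clean["age"] = record["age"] + rng.uniform(-noise_scale, noise_scale)  # add noise
        sdb.append(clean)
    return sdb


# Hypothetical real database (RDB).
rdb = [
    {"name": "Person A", "age": 34, "diagnosis": "condition X"},
    {"name": "Person B", "age": 61, "diagnosis": "condition Y"},
]
sdb = sanitize(rdb)
print(sdb)
```

More noise protects individual values better but makes the released ages less useful, which is the privacy versus utility tension named on the slide.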


Re-identification by linking

Linking two sets of data on shared attributes may uniquely identify some individuals.

Example [Sweeney]: De-identified medical data was released. Sweeney purchased the voter registration list of MA, linked the two data sets, and re-identified the Governor's medical record.

87% of the US population is uniquely identifiable by 5-digit ZIP, sex, and date of birth (dob).
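The linking step can be sketched as a join on the shared attributes. The records below are invented for illustration, and the function name `link` and the field names are assumptions; the point is only that a "de-identified" record becomes identified as soon as its (ZIP, sex, dob) combination matches exactly one named record in the external list.

```python
def link(medical, voters, keys=("zip", "sex", "dob")):
    """Return (name, diagnosis) pairs for medical records whose
    shared attributes match exactly one voter record."""
    index = {}
    for v in voters:
        index.setdefault(tuple(v[k] for k in keys), []).append(v)

    matches = []
    for m in medical:
        candidates = index.get(tuple(m[k] for k in keys), [])
        if len(candidates) == 1:  # unique match: the record is re-identified
            matches.append((candidates[0]["name"], m["diagnosis"]))
    return matches


# Hypothetical de-identified medical release (no names).
medical = [
    {"zip": "02138", "sex": "M", "dob": "1950-01-01", "diagnosis": "condition A"},
    {"zip": "02139", "sex": "F", "dob": "1960-02-02", "diagnosis": "condition B"},
]

# Hypothetical voter list (names plus the same three attributes).
voters = [
    {"name": "Voter 1", "zip": "02138", "sex": "M", "dob": "1950-01-01"},
    {"name": "Voter 2", "zip": "02139", "sex": "F", "dob": "1960-02-02"},
    {"name": "Voter 3", "zip": "02139", "sex": "F", "dob": "1960-02-02"},
]

print(link(medical, voters))  # [('Voter 1', 'condition A')]
```

The second medical record is not re-identified because two voters share its attribute values, which is the intuition that k-anonymity turns into a requirement.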


1. K-anonymity

Quasi-identifier: set of attributes (e.g. ZIP, sex, dob) that can be linked with external data to uniquely identify individuals in the population
  Issue: How do we know what attributes are quasi-identifiers?

Make every record in the table indistinguishable from at least k-1 other records with respect to quasi-identifiers

Linking on quasi-identifiers then yields at least k records for each quasi-identifier value that appears in the table
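The k-anonymity condition can be checked by grouping records on their quasi-identifier values and confirming that every group has at least k members. The sketch below assumes a table stored as a list of dictionaries; the function name `is_k_anonymous` and the toy table (with ZIP and age already generalized) are hypothetical.

```python
from collections import Counter


def is_k_anonymous(table, quasi_identifiers, k):
    """True if every combination of quasi-identifier values that
    appears in the table is shared by at least k records."""
    counts = Counter(tuple(row[a] for a in quasi_identifiers) for row in table)
    return all(n >= k for n in counts.values())


# Hypothetical table after generalizing ZIP and age.
table = [
    {"zip": "130**", "age": "<30", "diagnosis": "condition A"},
    {"zip": "130**", "age": "<30", "diagnosis": "condition B"},
    {"zip": "148**", "age": ">=40", "diagnosis": "condition C"},
    {"zip": "148**", "age": ">=40", "diagnosis": "condition A"},
]

print(is_k_anonymous(table, ("zip", "age"), 2))  # True
print(is_k_anonymous(table, ("zip", "age"), 3))  # False
```

The check depends entirely on which attributes are passed as quasi-identifiers, which is the open issue raised above.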


K-anonymity and beyond

Provides some protection: linking on ZIP, age, nationality yields 4 records
Limitations: lack of diversity in sensitive attributes, background knowledge, subsequent releases on the same data set, syntactic definition
Utility: less suppression implies better utility

l-diversity, m-invariance, t-closeness, …
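The "lack of diversity in sensitive attributes" limitation can be made concrete: a table may be k-anonymous while every record in some group carries the same sensitive value, so linking still reveals that value. The sketch below counts distinct sensitive values per group, which corresponds to the simplest ("distinct") reading of l-diversity. The function name and the toy table are hypothetical, and the table is assumed to be non-empty.

```python
from collections import defaultdict


def min_distinct_sensitive(table, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values found in any
    group of records sharing the same quasi-identifier values."""
    groups = defaultdict(set)
    for row in table:
        groups[tuple(row[a] for a in quasi_identifiers)].add(row[sensitive])
    return min(len(values) for values in groups.values())


# Hypothetical 2-anonymous table: each (zip, age) group has two records.
table = [
    {"zip": "130**", "age": "<30", "diagnosis": "condition A"},
    {"zip": "130**", "age": "<30", "diagnosis": "condition A"},
    {"zip": "148**", "age": ">=40", "diagnosis": "condition B"},
    {"zip": "148**", "age": ">=40", "diagnosis": "condition C"},
]

# The first group is homogeneous, so anyone known to be in it
# is known to have condition A.
print(min_distinct_sensitive(table, ("zip", "age"), "diagnosis"))  # 1
```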


2. (c,t)-isolation

Mathematical definition motivated by Gavison’s idea that privacy is protected to the extent that an individual blends into a crowd.

Image courtesy of WaldoWiki: http://images.wikia.com/waldo/images/a/ae/LandofWaldos.jpg


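Definition of (c,t)-isolation

A database is represented by n points in high dimensional space (one dimension per column)

Let $y$ be any RDB point, and let $\delta_y = \|q - y\|_2$. We say that $q$ $(c,t)$-isolates $y$ iff $B(q, c\delta_y)$ contains fewer than $t$ points in the RDB, that is, $|B(q, c\delta_y) \cap \mathrm{RDB}| < t$.

[Figure: the point $q$, the RDB point $y$ at distance $\delta_y$ from $q$, the ball of radius $c\delta_y$ around $q$, and RDB points $x_1, x_2, \ldots, x_{t-2}$.]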
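The definition translates directly into a distance computation. In the sketch below, points are tuples of numbers, `math.dist` gives the Euclidean distance, and the ball B(q, cδ_y) is taken to be closed (points at distance exactly cδ_y count as inside); the slide does not say whether the ball is open or closed, so that choice is an assumption, as are the function name `isolates` and the toy points.

```python
import math


def isolates(q, y, rdb, c, t):
    """True if q (c,t)-isolates y: the ball of radius c * ||q - y||
    around q contains fewer than t points of the RDB."""
    delta_y = math.dist(q, y)          # delta_y = ||q - y||_2
    radius = c * delta_y
    # Closed ball assumed: distance <= radius counts as inside.
    inside = sum(1 for x in rdb if math.dist(q, x) <= radius)
    return inside < t


# Hypothetical RDB: one lone point and a small cluster.
rdb = [(0.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# q close to the lone point: only that point falls in the ball.
print(isolates(q=(0.0, 1.0), y=(0.0, 0.0), rdb=rdb, c=2, t=2))     # True

# q inside the cluster: y has company within the ball.
print(isolates(q=(10.0, 10.5), y=(10.0, 10.0), rdb=rdb, c=2, t=2))  # False
```

The first case matches the "blends into a crowd" intuition in reverse: a point with no neighbours nearby is easy to single out.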


Definition of (c,t)-isolation (contd)


Another influence

Next lecture: Issues with this definition of privacy (impossible to achieve for arbitrary auxiliary information) and an alternate definition (differential privacy)
