Privacy through Accountability - CyLab · Research Challenge Ensure organizations respect privacy...


Privacy through Accountability

Anupam Datta

Associate Professor

CSD, ECE, CyLab

Carnegie Mellon University


Personal Information is Everywhere


Research Challenge

Ensure organizations respect privacy expectations, regulations, and organizational policies in the collection, use, and disclosure of personal information

Programs and People


Web Advertising

Example privacy policies:

Not use detailed location (full IP address) for advertising

Not use health information for advertising


Privacy through Accountability: An Emerging Research Area

Privacy as a right to restrictions on personal information flow

Computational accountability mechanisms for enforcement

http://www.andrew.cmu.edu/user/danupam/privacy.html


Today: Focus on Web Privacy

1. Bootstrapping Privacy Compliance in Big Data Systems

Methodology

Tool and application to Bing’s advertising system

Focus on current policies

2. Information Flow Experiments

Methodology

Tool and application to Google’s advertising system

Focus on principles that go beyond current policies

Bootstrapping Privacy Compliance in Big Data Systems

With S. Sen (CMU) and S. Guha, S. Rajamani, J. Tsai, J. M. Wing (MSR)

2014 IEEE Symposium on Security & Privacy

(Best Student Paper Award)


Privacy Compliance for Bing

Setting:

Auditor has access to source code


The Privacy Compliance Challenge


Specification

Verification

Scale Compliance?

A Streamlined Audit Workflow

[Diagram: the policy is encoded in Legalease and refined over time; code analysis and developer annotations feed Grok, which yields annotated code; a checker compares the Legalease policy with the annotated code and reports potential violations, which are resolved by fixing code or updating Grok]


Workflow for privacy compliance

Legalease, usable yet formal policy specification language

Grok, bootstrapped data inventory for big data systems

Scalable implementation for Bing


Specification: Legalease

Usable. Usable by lawyers and privacy champs.

Expressive. Expressive enough for real-world policies.

Precise. Precise semantics for local reasoning.

Legalease: Example Policy

DENY Datatype IPAddress
     UseForPurpose Advertising
EXCEPT
  ALLOW Datatype IPAddress:Truncated

ALLOW UseForPurpose AbuseDetect
EXCEPT
  DENY Datatype IPAddress, AccountInfo

We will not use full IP Address for Advertising. IP Address may be used for detecting abuse. In such cases, it will not be combined with account information.

Legalease: Policy Checking

[Diagram: the example policy and its plain-language statement, checked against the program]

A Lattice of Policy Labels

[Diagram: lattice over the labels T, IPAddress, and IPAddress:Truncated, with IPAddress:Truncated below IPAddress]

• If “IPAddress” use is allowed then so is everything below it

• If “IPAddress:Truncated” use is denied then so is everything above it
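The two lattice rules and the nested exceptions of the example policy can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Legalease implementation: the names `Node`, `Clause`, `below`, `overlaps`, and `violates` are hypothetical, the lattice is assumed to be tree-shaped with `:` marking refinement, the example policy is read as two top-level clauses, and a node is treated as a violation when any applicable top-level clause evaluates to DENY.

```python
from dataclasses import dataclass, field
from typing import Optional


def below(a: str, b: str) -> bool:
    """True if label a sits at or below label b (assumed tree-shaped lattice)."""
    return a == b or a.startswith(b + ":")


def overlaps(a: str, b: str) -> bool:
    """True if the two labels are comparable, so one refines the other."""
    return below(a, b) or below(b, a)


@dataclass(frozen=True)
class Node:
    datatypes: frozenset  # data labels attached to a program node
    purpose: str          # purpose label of the program


@dataclass
class Clause:
    kind: str                        # "ALLOW" or "DENY"
    datatypes: tuple = ()            # every listed datatype must be present
    purpose: Optional[str] = None    # None means "any purpose"
    exceptions: list = field(default_factory=list)

    def applies(self, node: Node) -> bool:
        if self.purpose is not None and node.purpose != self.purpose:
            return False
        # ALLOW extends downward only; DENY also reaches labels above it.
        covers = below if self.kind == "ALLOW" else overlaps
        return all(any(covers(n, d) for n in node.datatypes)
                   for d in self.datatypes)

    def verdict(self, node: Node) -> Optional[str]:
        """Return "ALLOW", "DENY", or None if the clause does not apply."""
        if not self.applies(node):
            return None
        for exc in self.exceptions:  # an exception refines its immediate parent
            v = exc.verdict(node)
            if v is not None:
                return v
        return self.kind


# Assumed reading of the example policy as two top-level clauses.
POLICY = [
    Clause("DENY", ("IPAddress",), "Advertising",
           [Clause("ALLOW", ("IPAddress:Truncated",))]),
    Clause("ALLOW", (), "AbuseDetect",
           [Clause("DENY", ("IPAddress", "AccountInfo"))]),
]


def violates(node: Node) -> bool:
    return any(c.verdict(node) == "DENY" for c in POLICY)
```

Under these assumptions, a node that uses the full `IPAddress` for `Advertising` is flagged, a node that uses only `IPAddress:Truncated` is not, and abuse detection is flagged only when `IPAddress` is combined with `AccountInfo`.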


Designed for Precision


Designed for Expressivity (Bing, October 2013)


Designed for Expressivity (Google, October 2013)

Designed for Usability

Exceptions: How legal texts are structured

One-to-one correspondence

Local Reasoning: Each exception refines its immediate parent

Formally proven property

Independent of Code

H. DeYoung, D. Garg, L. Jia, D. Kaynar, and A. Datta, “Experiences in the logical specification of the HIPAA and GLBA privacy laws”


Legalease Usability

Survey taken by 12 policy authors within Microsoft

Encode Bing data usage policy after a brief tutorial

Time spent: 2.4 mins on the tutorial, 14.3 mins on encoding the policy

High overall correctness


Map-Reduce Programming Systems

Scope, Hive, Dremel

Data in the form of Tables

Code Transforms Columns to Columns

No Shared State

Limited Hidden Flows

[Diagram: Process 1 with Dataset A, Dataset B, and Dataset C]

Verification

Nightly audit of all jobs executed.

Static source code analysis.

What data, stored where? Who used.

Grok

[Diagram: data flow graph connecting Processes 1 to 6 and Datasets A to J]

Grok

Purpose Labels

Annotate programs with purpose labels

[Diagram: the data flow graph of Processes 1 to 6 and Datasets A to J, with processes labeled NewAcct, Login, Check Hijack, GeoIP, Check Fraud, and Reporting]

Grok

Initial Data Labels

Heuristics and Annotations

[Diagram: the data flow graph of Processes 1 to 6 and Datasets A to J, with dataset columns labeled Name, Age, IPAddress, IDX, Country, Timestamp, and Hash; two columns are still unlabeled (??)]

Grok

Flow Labels

Source labels propagated via data flow graph

[Diagram: the data flow graph of Processes 1 to 6 and Datasets A to J after propagation; the two previously unlabeled columns now carry the labels Profile and IDX]

D. E. Denning. “A lattice model of secure information flow”
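Propagating source labels along a data flow graph can be sketched as a simple worklist computation. The following is a minimal illustration, not Grok itself: it works at the granularity of whole datasets rather than columns, joins labels by set union (in the spirit of a Denning-style lattice), and the function name `propagate_labels` and the example wiring between datasets and processes are assumptions, since the edges of the diagram are not recoverable from the slide text.

```python
from collections import defaultdict, deque


def propagate_labels(edges, initial_labels):
    """Push labels forward along (source, destination) edges until stable.

    edges: iterable of (src, dst) pairs; data flows from src to dst.
    initial_labels: dict mapping a node to the labels known up front.
    Returns a dict mapping each reached node to its set of labels.
    """
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)

    labels = defaultdict(set)
    for node, known in initial_labels.items():
        labels[node] |= set(known)

    worklist = deque(initial_labels)
    while worklist:
        node = worklist.popleft()
        for nxt in successors[node]:
            # Only revisit a successor when it actually gains a label.
            if not labels[node] <= labels[nxt]:
                labels[nxt] |= labels[node]
                worklist.append(nxt)
    return dict(labels)


# Hypothetical wiring: the names echo the slide, the edges are assumed.
edges = [
    ("Dataset A", "Process 1"),
    ("Dataset B", "Process 1"),
    ("Process 1", "Dataset C"),
    ("Dataset C", "Process 2"),
    ("Process 2", "Dataset D"),
]
result = propagate_labels(edges, {"Dataset A": {"IPAddress"},
                                  "Dataset B": {"Age"}})
```

In this example every node downstream of Process 1 ends up carrying both `IPAddress` and `Age`, while the two source datasets keep only their initial labels.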

Nightly Compliance Process

Steps shown: static code analysis, generate report, manual audit

[Diagram: data flow graph of Processes 1 to 6 and Datasets A to J, with column names such as LiveId, Age, ss_user_ip, M_ANID, MCMUID, and msnANID; one process reads Dataset D and Dataset G, transforms the data, and writes Datasets H and I]

Positive Patterns (40 Taxonomy values, 400 patterns)

Negative Patterns (2500 total entries)

Granular Overrides (116 total entries)

-- DENY DataType UniqueIdentifier WITH PII InStore BingStore
SELECT * FROM
  (SELECT * FROM Report WHERE Taxonomy='ANID' AND Confidence>='High') AS ID
INNER JOIN
  (SELECT * FROM Report WHERE TaxonomyGroup='PII' AND Confidence>='High') AS P
ON ID.VC = P.VC

Stage              Count
files              25M+
schemas            2M+
privacy elements*  300K+
audit candidates   10K+
teams              8
audit items        1K+


Why Bootstrapping Grok Works

Pick the nodes which will label the most of the graph

~200 annotations label 60% of nodes

A small number of annotations is enough to get off the ground.
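One way to read "pick the nodes which will label the most of the graph" is as a greedy coverage heuristic: repeatedly annotate the node whose label would reach the most still-unlabeled nodes. The sketch below is an assumption about how such a choice could be made, not the procedure used for Grok; `reachable`, `pick_annotation_targets`, and the toy graph are hypothetical.

```python
def reachable(successors, start):
    """Return start plus every node reachable from it along data flow edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in successors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def pick_annotation_targets(successors, budget):
    """Greedily choose up to `budget` nodes to annotate by hand.

    Each round picks the node whose annotation would newly label the most
    nodes; ties are broken alphabetically so the result is deterministic.
    Returns (picks, covered nodes, fraction of the graph covered).
    """
    nodes = set(successors) | {m for targets in successors.values()
                               for m in targets}
    reach = {n: reachable(successors, n) for n in nodes}
    covered, picks = set(), []
    for _ in range(budget):
        best = max(sorted(nodes), key=lambda n: len(reach[n] - covered))
        if not reach[best] - covered:
            break  # nothing left to gain
        picks.append(best)
        covered |= reach[best]
    return picks, covered, len(covered) / len(nodes)


# Toy graph with assumed edges: two source datasets feed one pipeline,
# a third feeds a separate one.
toy = {"A": ["P1"], "B": ["P1"], "P1": ["C"], "C": ["P2"], "P2": ["D"],
       "E": ["P3"], "P3": ["F"]}
picks, covered, fraction = pick_annotation_targets(toy, budget=2)
```

With a budget of two annotations the sketch picks `A` and `E`, which together label eight of the nine nodes in the toy graph; only `B` is left for a later round.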


Scale

77,000 jobs run each day

By 7000 entities

300 functional groups

1.1 million unique lines of code

21% changes on avg, daily

46 million table schemas

32 million files

Manual audit infeasible

Information flow analysis takes ~30 mins daily


Information Flow Experiments Methodology

With Michael Carl Tschantz (CMU → UC Berkeley)

Amit Datta (CMU)

Jeannette M. Wing (CMU → Microsoft Research)

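Personalized Web Advertising

[Diagram: the user's browsing history, together with confounding inputs (other users, advertisers, websites), flows into Google, which serves ads to the user; a question mark marks the flow under study]

Probabilistic Interference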

Experimental Design

[Diagram: a scientist gives the drug to the experimental group and a placebo to the control group]

Information Flow Experiment (IFE)

[Diagram: Group 1 visits substance abuse websites and receives rehab ads; Group 2 idles and receives generic ads]

IFE Methodology

[Diagram: the experimenter uses a random permutation to assign the control treatment and the experimental treatment, interacts with the Internet, collects measurements, and applies significance testing to obtain a p-value]
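The random-permutation step of the methodology can be sketched as follows. This is a minimal illustration under assumptions: `assign_groups` is a hypothetical helper, the units are taken to be ten numbered browser instances, and the split is into two equal halves.

```python
import random


def assign_groups(unit_ids, seed=None):
    """Randomly split units into an experimental half and a control half.

    A seeded generator makes the assignment reproducible for auditing.
    """
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)  # the random permutation
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Ten browser instances, numbered 1 to 10 (an assumed setup).
experimental, control = assign_groups(range(1, 11), seed=0)
```

Because the assignment is random, any systematic difference later observed between the two groups can be attributed to the treatment rather than to how the units were chosen.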

Information Flow Experiments as Science

Experimental Science   Information Flow
Natural process        System in question
Population of units    Subset of interactions
…                      …
Causation              Information flow

Theorem: Pearl’s Causation = Probabilistic Interference


Information Flow Experiments

on Personalized Ad Settings: A Tale of Opacity, Choice and Discrimination

With Amit Datta (CMU) and

Michael Carl Tschantz (UC Berkeley)


Google Ad Settings


Goals

Study transparency, choice, fairness

Methodology and tool (AdFisher)

Automation, statistical rigor, scalability, explanations

[Diagram: Browsing Behavior, Internal State, Ads Received, Ad Settings]

Experiment 1: Opacity

Experimental group visits top 100 substance abuse sites

Control group idles

Then both groups visit Times of India and collect ads

Experiment 1: Significant Opacity

Substance abuse: significant effect on ads, no effect on ad settings

Disability: significant effect on ads, “unrelated” effect on ad settings

Treatment         p-value
Substance abuse   0.0000053
Disability        0.0000053
Mental disorder   0.053
Infertility       0.11
Adult websites    0.42

Statistical significance

Experiment 1: Opacity Explanation

Top ads for group visiting substance abuse webpages

The Watershed Rehab www.thewatershed.com/Help

Watershed Rehab www.thewatershed.com/Rehab

The Watershed Rehab Ads by Google

Veteran Home Loans www.vamortgagecenter.com

CAD Paper Rolls paper-roll.net/Cad-Paper

Top ads for control group

Alluria Alert www.bestbeautybrand.com

Best Dividend Stocks dividends.wyattresearch.com

10 Stocks to Hold Forever www.streetauthority.com

Delivery Drivers Wanted get.lyft.com/drive

VA Home Loans Start Here www.vamortgagecenter.com


Experiment 2: Choice

Experimental group visits top 100 dating sites; then removes dating interest from ad settings

Control group visits top 100 dating sites; then keeps dating interest

Then both groups visit Times of India and collect ads

Experiment 2: Choice Buttons have an Effect

Treatment     p-value
Opting out    0.0000053
Dating        0.0000053
Weight loss   0.041

Statistical significance

Experiment 2: Choice Explanation

Top ads for group keeping dating interest

Are You Single? www.zoosk.com/Dating

Top 5 Online Dating Sites www.consumer-rankings.com/Dating

Why can't I find a date? www.gk2gk.com

Latest Breaking News www.onlineinsider.com

Gorgeous Russian Ladies anastasiadate.com


Top ads for group removing dating interest

Car Loans w/ Bad Credit www.car.com/Bad-Credit-Car-Loan

Individual Health Plans www.individualhealthquotes.com

Crazy New Obama Tax www.endofamerica.com

Atrial Fibrillation Guide www.johnshopkinshealthalerts.com

Free $5 - $25 Gift Cards swagbucks.com


Experiment 3: Discrimination

Experimental group visits top 100 job sites with gender set to male in ad settings

Control group visits top 100 job sites with gender set to female in ad settings

Then both groups visit Times of India and collect ads

Experiment 3: Discrimination Explanation

Top ads for female group

Jobs (Hiring Now) www.jobsinyourarea.co

4Runner Parts Service www.westernpatoyotaservice.com

Criminal Justice Program www3.mc3.edu/Criminal+Justice

Goodwill - Hiring goodwill.careerboutique.com

UMUC Cyber Training www.umuc.edu/cybersecuritytraining


Top ads for male group

$200k+ Jobs - Execs Only careerchange.com

Find Next $200k+ Job careerchange.com

Become a Youth Counselor www.youthcounseling.degreeleap.com

CDL-A OTR Trucking Jobs www.tadrivers.com/OTRJobs

Free Resume Templates resume-templates.resume-now.com

Information Flow Experiments More on methodology

With Michael Carl Tschantz (CMU → UC Berkeley)

Amit Datta (CMU)

Jeannette M. Wing (CMU → Microsoft Research)

Google Exhibits Complex Behavior

[Plot: ad id (axis 0 to 45) against reload number (axis 0 to 200)]

Browser Instances are Not Independent

[Chart: ten values shown: 17, 13, 13, 13, 12, 11, 10, 10, 8, 7]

Which Statistical Test to Use?

Our Idea:

Use a non-parametric test

Does not require a model of Google

Specifically, a permutation test

Does not require independence among browser instances or the assumption that ads are independent and identically distributed

Permutation Test over Keywords

Keyword counts for browser instances 1 to 10: 0, 5, 6, 30, 30, 0, 19, 22, 31, 2

Sorted by count: instances 1, 6, 10, 2, 3, 7, 8, 4, 5, 9 with counts 0, 0, 2, 5, 6, 19, 22, 30, 30, 31

Actual assignment: instances 1, 6, 10, 2, 3 total 13; instances 7, 8, 4, 5, 9 total 132; difference 119

A permuted assignment: instances 9, 6, 10, 2, 3 total 44; instances 7, 8, 4, 5, 1 total 101; value shown: 67

[Figure: test statistic values across permutations; labeled values -57, 119, 67, 7]
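The worked example on these slides can be reproduced with an exact permutation test. The sketch below is an illustration under stated assumptions rather than the AdFisher implementation: it takes the ten keyword counts from the slide, assumes the test statistic is the difference between the two group totals, and enumerates every way of relabeling five of the ten instances as the first group. The names `counts`, `statistic`, and `exact_permutation_test` are hypothetical.

```python
from itertools import combinations

# Keyword counts per browser instance, read off the slide.
counts = {1: 0, 2: 5, 3: 6, 4: 30, 5: 30, 6: 0, 7: 19, 8: 22, 9: 31, 10: 2}

# Instances on the low-count side of the actual assignment.
first_group = (1, 2, 3, 6, 10)


def statistic(counts, group):
    """Difference between the total of the other group and the given group."""
    group_total = sum(counts[u] for u in group)
    other_total = sum(counts.values()) - group_total
    return other_total - group_total


def exact_permutation_test(counts, group):
    """Compare the observed statistic against every equal-size relabeling.

    Returns (observed statistic, relabelings at least as extreme,
    total number of relabelings). No independence or distributional
    assumption about the measurements is needed, only random assignment.
    """
    observed = statistic(counts, group)
    relabelings = list(combinations(sorted(counts), len(group)))
    extreme = sum(1 for g in relabelings if statistic(counts, g) >= observed)
    return observed, extreme, len(relabelings)


observed, extreme, total = exact_permutation_test(counts, first_group)
p_value = extreme / total
```

Here the observed difference is 119, and it is the only one of the 252 possible relabelings that reaches that value, giving a one-sided p-value of 1/252. With many more instances, enumerating all relabelings becomes infeasible and a random sample of permutations is a common substitute.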


Conclusion

A rigorous methodology for information flow experiments

1. Probabilistic interference = Pearl’s causation

2. Experimental design for causal determination

3. Significance testing with non-parametric statistics

An experimental study of Google Ads

1. AdFisher Tool

2. Findings of opacity, choice and discrimination

Prior Work on Behavioral Marketing

Authors           Test                     Limitation
Guha et al.       Cosine similarity        No statistical significance
Balebako et al.   Cosine similarity        No statistical significance
Wills and Tatar   Ad hoc examination       No statistical significance
Liu et al.        Process of elimination   No statistical significance
Barford et al.    χ² test                  Assumes ads identically distributed
Lécuyer et al.    Parametric Model         Correlation, not causation; assumes ads are independent

Privacy as Restrictions on Personal Information Flow

[Diagram: a map of approaches along two axes. Restrictions: Temporal; Purpose & Role based. Information Flow: Direct; Interference; Probabilistic Interference. Entries shown: EPAL; XACML; *-access control; Purpose Planning; FOTLs [Formal Contextual Integrity, Reduce audit algorithm, Basin et al.]; Grok + Legalease; Jif, FlowCaml, … [Hayati & Abadi]; Information Flow Experiments; Differential Privacy. Application areas: Web Privacy, Healthcare Privacy]

Summary

1. Information Flow Experiments

Methodology

Tool and application to Google’s advertising system with findings of opacity, choice and discrimination

2. Privacy Compliance in Big Data Systems

Methodology

Tool and application to Bing’s compliance workflow, privacy policies and advertising programs on production system

Privacy through Accountability: An Emerging Research Area

Privacy as a right to restrictions on personal information flow

Computational accountability mechanisms for enforcement

http://www.andrew.cmu.edu/user/danupam/privacy.html