Posted on 26-Dec-2015
Strategic Modeling of Information Sharing among Data Privacy Attackers
Quang Duong, Kristen LeFevre, and Michael Wellman
University of Michigan
Presented by: Quang Duong
Privacy-Sensitive Data Publication
Original data:
Name   Age  Zipcode  Disease
Alex   20   13456    AIDS
Bob    25   13457    cancer
Carol  32   12345    flu
After de-identification:
Age  Zipcode  Disease
20   13456    AIDS
25   13457    cancer
32   12345    flu
After generalization:
Age          Zipcode  Disease
Under 30     1345*    AIDS
Under 30     1345*    cancer
30 or above  1234*    flu
Target’s sensitive value
Attackers’ background knowledge is relevant to data publication
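The two publication steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `deidentify` and `generalize`, the age cutoff parameter, and the zipcode prefix length are assumptions chosen so that the output matches the example tables.

```python
# Records from the example slide. Field names are hypothetical.
records = [
    {"name": "Alex", "age": 20, "zipcode": "13456", "disease": "AIDS"},
    {"name": "Bob", "age": 25, "zipcode": "13457", "disease": "cancer"},
    {"name": "Carol", "age": 32, "zipcode": "12345", "disease": "flu"},
]

def deidentify(record):
    """Drop the explicit identifier (the name) and keep everything else."""
    return {key: value for key, value in record.items() if key != "name"}

def generalize(record, age_cutoff=30, zip_prefix_len=4):
    """Coarsen age into two buckets and mask the trailing zipcode digits."""
    if record["age"] < age_cutoff:
        age = "Under %d" % age_cutoff
    else:
        age = "%d or above" % age_cutoff
    masked = len(record["zipcode"]) - zip_prefix_len
    zipcode = record["zipcode"][:zip_prefix_len] + "*" * masked
    return {"age": age, "zipcode": zipcode, "disease": record["disease"]}

published = [generalize(deidentify(r)) for r in records]
for row in published:
    print(row["age"], row["zipcode"], row["disease"])
```

A smaller `zip_prefix_len` or a coarser age bucket would correspond to more generalization.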
How Much Generalization?
• Competing effects:
  • More generalization makes published data more resistant to privacy attackers
  • More generalization degrades information quality of published data
Need to model attackers’ background knowledge
Model of Privacy Attackers
• Main difference: network of attackers who share background knowledge
• Main contribution: a framework for constructing models that:
• capture information sharing activities among attackers
• estimate attackers’ background knowledge
Privacy Attacker Model’s Stages
1. ACQUIRE information separately
2. DECIDE how much and what to SHARE
3. ATTACK with their augmented knowledge
Data Privacy Attacker Model
Decision:
How much and what information to share
Tradeoff (of sharing background knowledge):
• Increase attack capability
• Decrease compromised data’s exclusiveness
Utility:
• (number of successful attackers)^(-2) if capable of compromising the dataset
• 0 otherwise
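The attacker utility can be written as a small function. This sketch reads the "-2" in the utility as an exponent, which is an assumption: that reading makes utility fall as more attackers succeed, matching the exclusiveness tradeoff. The name `attacker_utility` and its parameters are hypothetical.

```python
def attacker_utility(can_compromise, num_successful):
    """Utility of one attacker.

    can_compromise: whether this attacker can compromise the dataset.
    num_successful: how many attackers (including this one) succeed.
    Assumes the slide's utility is num_successful raised to the power -2.
    """
    if not can_compromise:
        return 0.0
    if num_successful < 1:
        raise ValueError("a successful attacker implies num_successful >= 1")
    return float(num_successful) ** -2

# A sole successful attacker keeps the full value; sharing success dilutes it.
print(attacker_utility(True, 1), attacker_utility(True, 2), attacker_utility(False, 2))
```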
Database Publisher Model
Decision:
How much generalization should be applied to the published data
Tradeoff (of generalizing data):
• Reduce privacy breach risk
• Induce more information loss
Utility:
(Linear) combination of privacy breach risk and information loss
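One way to picture the publisher's utility is as a negated weighted cost. The slide only says the combination is linear; the weight `risk_weight`, the sign convention, and the assumption that both inputs are already on comparable scales are illustrative choices, not details from the talk.

```python
def publisher_utility(breach_risk, info_loss, risk_weight=0.5):
    """Negated linear combination of privacy breach risk and information loss.

    risk_weight is a hypothetical parameter in [0, 1]; higher values mean the
    publisher cares more about breach risk than about information loss.
    """
    if not 0.0 <= risk_weight <= 1.0:
        raise ValueError("risk_weight must lie in [0, 1]")
    cost = risk_weight * breach_risk + (1.0 - risk_weight) * info_loss
    return -cost

# More generalization typically lowers breach_risk and raises info_loss,
# so the preferred level depends on risk_weight.
print(publisher_utility(breach_risk=0.2, info_loss=0.4))
```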
Two-Stage Game Model
1st stage: Publisher decides how to generalize the data set
2nd stage: Attacker 1, Attacker 2, …, Attacker n choose how much and what to share
We can reason about the attackers’ actions and background knowledge using different solution concepts, such as Nash equilibrium
Model Details: Background Knowledge
3 categories of background knowledge: [Chen et al. ‘07]
1. (L) values that the target doesn’t have:
   Alex does not have cancer
2. (K) sensitive info about individuals different from the target:
   Carol has flu
3. (M) relations between the target’s sensitive value and others’:
   If Carol has AIDS, Alex has AIDS
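To see how the three knowledge types narrow down a target's value, one can enumerate the assignments of a generalized group's published values to its members and discard those that contradict the attacker's knowledge. This is a brute-force sketch for tiny groups, not the method of Chen et al.; the function name `consistent_worlds` and the tuple encodings of L, K, and M knowledge are assumptions. Knowledge about people outside the group is ignored.

```python
from itertools import permutations

def consistent_worlds(people, values, negations=(), facts=(), implications=()):
    """Return assignments of `values` to `people` that fit the knowledge.

    negations:    (person, value) pairs, "person does not have value"   (L)
    facts:        (person, value) pairs, "person has value"             (K)
    implications: ((p1, v1), (p2, v2)), "if p1 has v1 then p2 has v2"   (M)
    """
    worlds = set()
    for perm in permutations(values):
        world = dict(zip(people, perm))
        if any(world.get(p) == v for p, v in negations):
            continue
        if any(p in world and world[p] != v for p, v in facts):
            continue
        if any(p1 in world and p2 in world
               and world[p1] == v1 and world[p2] != v2
               for (p1, v1), (p2, v2) in implications):
            continue
        worlds.add(tuple(sorted(world.items())))
    return [dict(w) for w in sorted(worlds)]

# Alex and Bob fall in the same generalized group (Under 30, 1345*).
group_people = ["Alex", "Bob"]
group_values = ["AIDS", "cancer"]
no_knowledge = consistent_worlds(group_people, group_values)
with_l = consistent_worlds(group_people, group_values,
                           negations=[("Alex", "cancer")])
print(len(no_knowledge), with_l)
```

Without background knowledge both assignments remain possible; knowing that Alex does not have cancer leaves only the assignment in which Alex has AIDS.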
Model Details: Attackers
1. Agent space: n attackers, each represented by its prior knowledge set (K, L, M)
2. Action space: each attacker decides how many and what instances to share (a_K, a_L, a_M)
3. Sharing mechanism:
   a. Pair-wise: direct exchange between every pair of attackers
   b. Reciprocal: exchange the same amount of information
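A pair-wise reciprocal exchange for a single knowledge type might look like the sketch below. The slide does not say how the common amount is determined or which instances are picked, so two assumptions are made here: the exchanged amount is the minimum of the two offers, and instances are taken in sorted order. The name `reciprocal_exchange` is hypothetical.

```python
def reciprocal_exchange(knowledge_a, knowledge_b, offer_a, offer_b):
    """Exchange the same number of instances between two attackers.

    knowledge_a, knowledge_b: sets of instances of one knowledge type.
    offer_a, offer_b: how many instances each attacker is willing to share.
    Returns the two augmented knowledge sets.
    """
    # Assumption: reciprocity caps the exchange at the smaller offer.
    amount = min(offer_a, offer_b, len(knowledge_a), len(knowledge_b))
    shared_by_a = sorted(knowledge_a)[:amount]
    shared_by_b = sorted(knowledge_b)[:amount]
    return set(knowledge_a) | set(shared_by_b), set(knowledge_b) | set(shared_by_a)

# L-type knowledge encoded as (person, value the person does not have).
attacker_1 = {("Alex", "cancer")}
attacker_2 = {("Alex", "flu")}
after_1, after_2 = reciprocal_exchange(attacker_1, attacker_2, offer_a=1, offer_b=1)
print(sorted(after_1), sorted(after_2))
```

If either attacker offers zero instances, nothing is exchanged and both sets stay unchanged.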
Example Model – Empirical Study
• Overview:
  • Data: 10 records, |domain of sensitive values| = 5
  • Attackers: 3, each has 1 instance of each knowledge type
  • Publisher: explicitly specifies her generalization method
Construct and estimate the game’s payoff matrix
• Testing scenarios:
  1. Attackers share all their knowledge
  2. No one shares
  3. Attackers play some Nash equilibrium (NE)
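Once a payoff matrix is available, pure-strategy Nash equilibria can be found by checking every action profile for profitable unilateral deviations. The sketch below does this on a toy two-attacker game; the payoffs are hypothetical placeholders, not numbers from the study. The toy assumes neither attacker can compromise the data alone, reciprocal sharing only happens when both choose to share, and a successful attacker earns (number of successful attackers) to the power -2.

```python
from itertools import product

def pure_nash_equilibria(actions, payoff):
    """List pure-strategy Nash equilibria.

    actions: one list of available actions per player.
    payoff:  function mapping an action profile (tuple) to a tuple of utilities.
    """
    equilibria = []
    for profile in product(*actions):
        utilities = payoff(profile)
        stable = True
        for i, own_actions in enumerate(actions):
            for alternative in own_actions:
                deviation = profile[:i] + (alternative,) + profile[i + 1:]
                if payoff(deviation)[i] > utilities[i]:
                    stable = False
                    break
            if not stable:
                break
        if stable:
            equilibria.append(profile)
    return equilibria

def toy_payoff(profile):
    """Hypothetical payoffs: both succeed only if everyone shares."""
    everyone_shares = all(action == "share" for action in profile)
    value = len(profile) ** -2 if everyone_shares else 0.0
    return tuple(value for _ in profile)

actions = [["share", "withhold"], ["share", "withhold"]]
equilibria = pure_nash_equilibria(actions, toy_payoff)
print(equilibria)
```

In this toy game both "everyone shares" and "no one shares" are equilibria, which illustrates why the publisher's best generalization can depend on which attacker behavior she expects.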
Outcomes under Different Attacker Action Scenarios
• Publisher’s actions (I, II, III…): each has 3 data points corresponding to 3 attacker action scenarios. Each point corresponds to the publisher and attackers’ actions
• Main result: the publisher may adopt different generalization strategies under different beliefs about attackers’ strategies
Concluding Remarks
Contributions:
• Propose a framework for reasoning about attackers’ actions
• Initiate a game-theoretic study of privacy attackers as a knowledge-sharing network
• Demonstrate that it matters to take into account attackers’ knowledge and their information-sharing activities
THANK YOU!