Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar...
Transcript of Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar...
![Page 1: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/1.jpg)
Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data
Nicolas Papernotjoint work with Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar
Google Brain
Nicolas is at Penn State, was an intern in Brain; Ian did part of the work at OpenAI.
PATE-
![Page 2: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/2.jpg)
Some challenges of learning from private data
2
?
Membership attacksShokri et al. (2016) Membership Inference Attacks against ML Models
Training-data extraction attacks Fredrikson et al. (2015) Model Inversion Attacks
?
???
![Page 3: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/3.jpg)
Types of adversaries and our threat model
3
In our work, the threat model assumes:
- Adversary can make a potentially unbounded number of queries- Adversary has access to model internals
Model inspection (white-box adversary)Zhang et al. (2017) Understanding DL requires rethinking generalization
Model querying (black-box adversary)Shokri et al. (2016) Membership Inference Attacks against ML ModelsFredrikson et al. (2015) Model Inversion Attacks
?Black-box
ML
![Page 4: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/4.jpg)
}Quantifying privacy
4
Randomized Algorithm
Randomized Algorithm
Answer 1Answer 2
...Answer n
Answer 1Answer 2
...Answer n
??? ?
![Page 5: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/5.jpg)
Our design goals
5
Preserve privacy of training data when learning classifiers
Differential privacy protection guarantees
Intuitive privacy protection guarantees
Generic* (independent of learning algorithm)Goals
Problem
*This is a key distinction from previous work, such as Pathak et al. (2011) Privacy preserving probabilistic inference with hidden markov models Jagannathan et al. (2013) A semi-supervised learning approach to differential privacy Shokri et al. (2015) Privacy-preserving Deep Learning Abadi et al. (2016) Deep Learning with Differential Privacy Hamm et al. (2016) Learning privately from multiparty data
![Page 6: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/6.jpg)
The PATE approach:
Private Aggregation of Teacher Ensembles
6
PÂTÉ
![Page 7: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/7.jpg)
Teacher ensemble
7
Partition 1
Partition 2
Partition n
Partition 3
...
Teacher 1
Teacher 2
Teacher n
Teacher 3
...
Aggregated Teacher
Training
Sensitive Data
Data flow
![Page 8: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/8.jpg)
Aggregation
8
Count votes Take maximum
![Page 9: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/9.jpg)
Intuitive privacy analysis
9
If most teachers agree on the label, it does not depend on specific partitions, so the privacy cost is small.
If two classes have close vote counts, the disagreement may reveal private information.
![Page 10: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/10.jpg)
Noisy aggregation
10
Count votes Add Laplacian noise Take maximum
![Page 11: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/11.jpg)
Teacher ensemble
11
Partition 1
Partition 2
Partition n
Partition 3
...
Teacher 1
Teacher 2
Teacher n
Teacher 3
...
Aggregated Teacher
Training
Sensitive Data
Data flow
![Page 12: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/12.jpg)
Student training
12
Partition 1
Partition 2
Partition n
Partition 3
...
Teacher 1
Teacher 2
Teacher n
Teacher 3
...
Aggregated Teacher Student
Training
Available to the adversaryNot available to the adversary
Sensitive Data
Public Data
Inference Data flow
Queries
![Page 13: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/13.jpg)
Why train an additional “student” model?
13
Each prediction increases total privacy loss.Privacy budgets create a tension between the accuracy and number of predictions.
Inspection of internals may reveal private data.Privacy guarantees should hold in the face of white-box adversaries.
1
2
The aggregated teacher violates our threat model:
![Page 14: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/14.jpg)
Student training
14
Partition 1
Partition 2
Partition n
Partition 3
...
Teacher 1
Teacher 2
Teacher n
Teacher 3
...
Aggregated Teacher Student
Training
Available to the adversaryNot available to the adversary
Sensitive Data
Public Data
Inference Data flow
Queries
![Page 15: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/15.jpg)
Deployment
15
Inference
Available to the adversary
QueriesStudent
![Page 16: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/16.jpg)
Differential privacy:A randomized algorithm M satisfies ( , ) differential privacy if for all pairs of neighbouring datasets (d,d’), for all subsets S of outputs:
Application of the Moments Accountant technique (Abadi et al, 2016)
Strong quorum ⟹ Small privacy cost
Bound is data-dependent: computed using the empirical quorum
Differential privacy analysis
16
![Page 17: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/17.jpg)
PATE-G: the generative variant of PATE
17
PATE-__
![Page 18: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/18.jpg)
Generative Adversarial Networks (GANs)
18
2 competing models trying to game each other:
Goodfellow et al. (2014) Generative Adversarial Networks
Generator:
Input: noise sampled from random distribution
Output: synthetic input close to the expected training distribution
Discriminator
Input: output from generator OR example from real training distribution
Output: in distribution OR fake
Gaussian sample Fake sample Sample P(real)=...
P(fake)=...
![Page 19: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/19.jpg)
GANs for semi-supervised learning
19
2 competing models trying to game each other:
Generator:
Input: noise sampled from random distribution
Output: synthetic input close to the expected training distribution
Discriminator
Input: output from generator OR example from real training distribution
Output: in distribution (which class) OR fake
Gaussian sample Fake sample Sample
P(real 0)=...P(real 1)=...
...P(real n)=...P(fake)=...
Salimans et al. (2016) Improved techniques for training GANs
![Page 20: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/20.jpg)
Student training in PATE-G
20
Partition 1
Partition 2
Partition n
Partition 3
...
Teacher 1
Teacher 2
Teacher n
Teacher 3
...
Aggregated Teacher
Queries
Training
Available to the adversaryNot available to the adversary
Sensitive Data
Inference Data flow
Generator
Discriminator
Public Data
![Page 21: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/21.jpg)
Deployment of PATE-G
21
Inference
Available to the adversary
QueriesDiscriminator
![Page 22: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/22.jpg)
Experimental results
22
![Page 23: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/23.jpg)
Experimental setup
23
Dataset Teacher Model Student Model
Student Public Data
TestingData
MNIST 2 conv + 1 relu GANs (6 fc layers) test[:1000] test[1000:]
SVHN 2 conv + 2 relu GANs (7 conv + 2 NIN) test[:1000] test[1000:]
UCI Adult RF (100 trees) RF (100 trees) test[:500] test[500:]
UCI Diabetes RF (100 trees) RF (100 trees) test[:500] test[500:]
/ /models/tree/master/differential_privacy/multiple_teachers
![Page 24: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/24.jpg)
Aggregated teacher accuracy
24
![Page 25: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/25.jpg)
Trade-off between student accuracy and privacy
25
![Page 26: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/26.jpg)
Trade-off between student accuracy and privacy
26
UCI Diabetes
1.44
10-5
Non-private baseline 93.81%
Student accuracy
93.94%
![Page 27: Google Brain Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal ... Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar Google Brain Nicolas is at Penn State, was an](https://reader035.fdocuments.in/reader035/viewer/2022081401/5f07ed2d7e708231d41f742a/html5/thumbnails/27.jpg)
?Come check out our poster C13(PATE-G swag will be distributed
while supplies last)
27
www.cleverhans.io@NicolasPapernot
got PÂTÉ?