Ulfar Erlingsson - Fujitsu
© 2019 FUJITSU
Ulfar Erlingsson
Senior Staff Research Scientist at Google
Heads a team within Google Brain doing research on privacy and security for machine learning.
Previously, he was a researcher at Microsoft Research, Silicon Valley, and an Associate Professor at Reykjavik University, Iceland.
Is privacy an obstacle? Where does it raise the biggest challenge?
Metaphor for Privacy (randomized response)
Microdata: An Individual’s Report
Each bit is flipped with probability 25%
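The bit-flipping rule above can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's implementation: the function names, the 70% population and the seed are hypothetical, and only the 25% flip probability comes from the slide. The second function shows why aggregate statistics survive the noise. If a true fraction p of bits are 1 and each bit is flipped with probability f, the expected observed fraction is p(1 - f) + (1 - p)f, which can be inverted to recover p.

```python
import random

FLIP_PROBABILITY = 0.25  # from the slide: each bit is flipped with probability 25%


def randomize_report(bits, flip_probability=FLIP_PROBABILITY, rng=random):
    """Return a copy of `bits` with each bit independently flipped
    with probability `flip_probability` (hypothetical helper)."""
    return [bit ^ 1 if rng.random() < flip_probability else bit for bit in bits]


def estimate_true_fraction(observed_fraction, flip_probability=FLIP_PROBABILITY):
    """Debias the observed fraction of 1s across many noisy reports.

    E[observed] = p * (1 - f) + (1 - p) * f, so p = (observed - f) / (1 - 2f).
    """
    return (observed_fraction - flip_probability) / (1 - 2 * flip_probability)


# Usage: an assumed population in which 70% of individuals hold a 1 bit.
rng = random.Random(0)
true_bits = [1] * 7000 + [0] * 3000
noisy_bits = randomize_report(true_bits, rng=rng)
observed = sum(noisy_bits) / len(noisy_bits)
estimate = estimate_true_fraction(observed)
print(f"observed fraction of 1s: {observed:.3f}, debiased estimate: {estimate:.3f}")
```

Any single noisy bit says little about the individual who reported it, but the debiased estimate over many reports should land close to the true fraction.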
Big Picture Remains!
Two common data sets in Machine Learning
MNIST dataset: 70,000 images
28⨉28 pixels each
CIFAR-10 dataset: 60,000 color images
32⨉32 pixels each
What are the utility benefits / costs of ML privacy?
Training ML models with privacy works and ensures strong generalization… and may help with data retention & removal concerns
But... training with privacy means the ML model cannot “see” unique outliers
Model can’t learn about truly weird data
Utility of privacy-preserving ML models may always be worse on real outliers
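The slides do not say which training method is meant. One common approach, assumed here purely for illustration, is differentially private SGD, which clips each example's gradient to a fixed norm and adds noise before averaging. The sketch below uses hypothetical names and toy numbers to show why a unique outlier barely moves such a model: after clipping, it counts for no more than any other example.

```python
import math
import random


def clip(gradient, max_norm):
    """Scale `gradient` down so its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in gradient))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in gradient]


def private_mean_gradient(per_example_gradients, max_norm, noise_multiplier, rng):
    """Clip every per-example gradient, sum, add Gaussian noise, then average.

    Assumption: noise standard deviation is noise_multiplier * max_norm,
    as in the usual DP-SGD formulation.
    """
    clipped = [clip(g, max_norm) for g in per_example_gradients]
    dim = len(clipped[0])
    noisy_sum = [
        sum(g[i] for g in clipped) + rng.gauss(0.0, noise_multiplier * max_norm)
        for i in range(dim)
    ]
    return [s / len(clipped) for s in noisy_sum]


# Toy data: 99 ordinary examples plus one extreme outlier.
gradients = [[0.0, 1.0]] * 99 + [[1000.0, 0.0]]
plain_mean = [sum(g[i] for g in gradients) / len(gradients) for i in range(2)]
clipped_mean = private_mean_gradient(gradients, max_norm=1.0, noise_multiplier=0.0,
                                     rng=random.Random(0))
print("plain mean:", plain_mean)
print("clipped mean (noise disabled for clarity):", clipped_mean)
```

Without clipping, the outlier dominates the first coordinate of the average. With clipping, its contribution is bounded like everyone else's, and once noise is added it is masked further, which is the privacy benefit and the utility cost described above.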