unige.it - Differential Privacy and Generalization: Sharper...

University of Genoa

Polytechnic School and the School of Science

Research and Technology Transfer Laboratory

Differential Privacy and Generalization: Sharper Bounds, Theoretically Grounded Algorithms, and Thresholdout

Luca Oneto

[email protected]

Summer School on Applied Harmonic Analysis, Genoa, Italy, 24th July 2017

SmartLab

24 July 2017, University of Genoa – DIBRIS – SmartLab, www.lucaoneto.com - [email protected]

• D. Anguita, Associate Prof.
• S. Ridella, Emeritus Prof.
• L. Oneto, Assistant Professor
• G. Clerico, Res. Assistant
• A. Lulli, Postdoc
• M. Cambiaso, Res. Assistant
• I. Orlandi, PhD Stud.
• E. Fumeo, PhD Stud.
• F. Cipollini, PhD Stud.
• P. Sanetti, Res. Assistant

Expertise


Scientific Research

• Topics
  • Neural Networks
  • Kernel Methods
  • Ensemble Methods
  • Statistical Learning Theory
  • Machine Learning
  • Data Mining
  • High Performance Computing
  • Big & Small Data Analysis
  • CBM, EDM, HAR, Sentiment Analysis, Cybersecurity
  • ···

• Publications
  • > 50 International High Ranked Journals
  • > 100 International High Ranked Conferences
  • ···

Technology Transfer

• aizoOn S.r.l.
• Ansaldo STS S.p.A.
• Brembo S.p.A.
• Bombardier Transportation
• Cetena S.p.A.
• Damen Shipyards Group
• Ferrari S.p.A. - Scuderia Ferrari
• VarGroup
• ···

European Projects

• Basic Research
  • EC NeuroNet I & II - Network of Excellence on Neural Networks
  • EC RAIN - Redundant Array of Inexpensive Workstations for Neurocomputing
  • EC EUNITE - European Network of Excellence on Information Technology for Smart Adaptive Systems
  • EC-FET NiSIS - Nature-inspired Smart Information Systems
  • ···

• Applied Research
  • EC-H2020 IN2DREAMS
  • EC-H2020 In2Rail
  • EC-FP7 MAXBE
  • ···

Privacy

• In recent years researchers have studied many ways to access data in a private way (aggregation, noise, etc.)

• From a data scientist's point of view, privacy is a limitation (we cannot access the data unless it is aggregated, etc.)

• The breakthrough was to find a way to exploit privacy as a new regularization method and as a tool for better assessing the generalization performance of a learning algorithm


Supervised Learning


The only thing available for learning is a set of examples of the mapping.

$$x \sim P_X, \qquad y = f(x)$$

[1] Vapnik, V.N., 1998. Statistical learning theory. Wiley New York.

Deterministic Functions/Learning Algorithms

Given a set of data, the algorithm always returns the same model.

The model is a function chosen from a set of functions: given the function and a point, the predicted output is always the same.

$$f = \mathscr{A}(s), \qquad x \sim P_X, \quad y = f(x)$$

Randomized Functions

Given a set of data the algorithm always returns the same model.

The model is a distribution over a set of functions. Given the model and a point, the predicted output may differ from one evaluation to the next.

$$\rho = \mathscr{A}(s), \qquad x \sim P_X, \quad f \sim \rho, \quad y = f(x)$$

Randomized Learning Algorithms

Given a set of data, the algorithm may return different models.

The model can be a deterministic or randomized function. In our case the function is deterministic.

$$F = \mathscr{A}(s), \qquad x \sim P_X, \quad y = F(x)$$

Notation


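[1] Oneto, L., Ridella, S., Anguita, D., 2017. Differential privacy and generalization: Sharper bounds with applications. Pattern Recognition Letters 89, 31–38.
[2] Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., Roth, A., 2015b. Preserving statistical validity in adaptive data analysis, in: Annual ACM Symposium on Theory of Computing.

Lowercase symbols are realizations, uppercase symbols are random variables.

$$x \in \mathcal{X}, \quad y \in \mathcal{Y}, \quad z \in \mathcal{Z} = \mathcal{X} \times \mathcal{Y}, \qquad P_X,\ P_Y,\ P_Z$$

$$\mathcal{Z}^n = \mathcal{S} \ni s = \{z_1, \dots, z_n\} = \{(x_1, y_1), \dots, (x_n, y_n)\}\ \text{i.i.d.}, \qquad P_S$$

$$\mathcal{Z} \ni Z \sim P_Z, \qquad \mathcal{S} \ni S \sim P_S$$

$$\breve{s} = \{z_1, \dots, z_{i-1}, \breve{z}_i, z_{i+1}, \dots, z_n\}, \qquad \breve{z}_i\ \text{i.i.d. with}\ z_i$$

$$\mathbb{S} \subseteq \mathcal{S}, \qquad f : \mathcal{X} \to \mathcal{Y},\ f \in \mathcal{F}, \qquad \mathbb{F} \subseteq \mathcal{F}, \qquad \mathscr{A} : \mathcal{S} \to \mathcal{F},\ P_{\mathscr{A}}$$

$$\mathscr{D} : \mathcal{F} \to \mathbb{S}, \qquad \ell : \mathcal{F} \times \mathcal{Z} \to [0, 1]$$

$$L(f) = \mathbb{E}_Z\, \ell(f, Z), \qquad V(f) = \mathbb{E}_Z\, [\ell(f, Z) - L(f)]^2$$

$$\hat{L}^s_n(f) = \frac{1}{n} \sum_{i=1}^{n} \ell(f, z_i), \qquad \hat{V}^s_n(f) = \frac{1}{n(n-1)} \sum_{i=1}^{n} \sum_{j=i+1}^{n} [\ell(f, z_i) - \ell(f, z_j)]^2$$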
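As a small illustration of the two empirical quantities in the notation (the empirical error and the pairwise empirical variance), here is a minimal Python sketch. The function names and the toy loss values are hypothetical, chosen only for this example.

```python
from itertools import combinations


def empirical_risk(losses):
    """Average of the per-example losses l(f, z_i)."""
    return sum(losses) / len(losses)


def empirical_variance(losses):
    """Sum over pairs i < j of squared loss differences, divided by n(n - 1)."""
    n = len(losses)
    pair_sum = sum((a - b) ** 2 for a, b in combinations(losses, 2))
    return pair_sum / (n * (n - 1))


# Hypothetical 0/1 losses of one model on four examples.
losses = [0.0, 1.0, 0.0, 1.0]
print(empirical_risk(losses), empirical_variance(losses))
```

The pairwise form coincides with the usual unbiased sample variance, which is why it needs no explicit mean.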

Goal

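Estimate the true (generalization) error of the model based on the empirical data:

$$\mathbb{P}\{|L(f) - \hat{L}_n(f)| \ge \epsilon\} \le \delta$$

that is, $|L(f) - \hat{L}_n(f)| \le \epsilon$ with probability at least $(1 - \delta)$, where $\delta \leftrightarrow \epsilon$ (each determines the other).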

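Differentially Private (DP) Randomized Learning Algorithms

A Randomized Learning Algorithm is $(\epsilon, \delta)$-DP if

$$\mathbb{P}_{\mathscr{A}}\{\mathscr{A}(s) \in \mathbb{F}\} \le e^{\epsilon}\, \mathbb{P}_{\mathscr{A}}\{\mathscr{A}(\breve{s}) \in \mathbb{F}\} + \delta, \qquad \forall\, \mathbb{F} \subseteq \mathcal{F},\ \forall\, s, \breve{s} \in \mathcal{S}$$

where $s$ and $\breve{s}$ differ in a single example.

[1] Dwork, C., Roth, A., 2014. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science 9, 1–277.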

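Differentially Private (DP) Randomized Learning Algorithms

A Randomized Learning Algorithm is $\epsilon$-DP if

$$\frac{\mathbb{P}_{\mathscr{A}}\{\mathscr{A}(s) = f\}}{\mathbb{P}_{\mathscr{A}}\{\mathscr{A}(\breve{s}) = f\}} \le e^{\epsilon}, \qquad \forall f \in \mathcal{F},\ \forall\, s, \breve{s} \in \mathcal{S}$$

where $s$ and $\breve{s}$ differ in a single example.

Proof:

$$\mathbb{P}_{\mathscr{A}}\{\mathscr{A}(s) \in \mathbb{F}\} = \int_{\mathbb{F}} \mathbb{P}_{\mathscr{A}}\{\mathscr{A}(s) = f\}\, df \le \int_{\mathbb{F}} e^{\epsilon}\, \mathbb{P}_{\mathscr{A}}\{\mathscr{A}(\breve{s}) = f\}\, df = e^{\epsilon}\, \mathbb{P}_{\mathscr{A}}\{\mathscr{A}(\breve{s}) \in \mathbb{F}\}$$

[1] Oneto, L., Ridella, S., Anguita, D., 2017. Differential privacy and generalization: Sharper bounds with applications. Pattern Recognition Letters 89, 31–38.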

Hold Out, Compression, Complexity, Stability, and… Privacy (I)

The first option for estimating the generalization performance of an algorithm is to split the data:

$$s = s_1 \cup s_2, \qquad s_1 \cap s_2 = \emptyset$$

$$\mathbb{P}_{S_2}\left\{L(\mathscr{A}(s_1)) - \hat{L}^{S_2}_{|S_2|}(\mathscr{A}(s_1)) \ge t\right\} \le e^{-2|s_2|t^2}$$

[1] Hoeffding, W., 1963. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 58 (301), 13–30.
[2] Anguita, D., Ghio, A., Oneto, L., Ridella, S., 2012. In-sample and out-of-sample model selection and error estimation for support vector machines. IEEE Transactions on Neural Networks and Learning Systems 23 (9), 1390–1406.
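To make the hold-out bound concrete: setting the right-hand side $e^{-2|s_2|t^2}$ equal to a confidence level $\delta$ gives $t = \sqrt{\ln(1/\delta)/(2|s_2|)}$. The sketch below turns that into an upper bound on the true error; the function name and the sample values are assumptions made for illustration.

```python
import math


def holdout_bound(holdout_losses, delta):
    """Upper bound on L(A(s1)) holding with probability at least 1 - delta.

    holdout_losses: losses in [0, 1] of the trained model on the held-out part s2.
    """
    m = len(holdout_losses)
    empirical = sum(holdout_losses) / m
    t = math.sqrt(math.log(1.0 / delta) / (2.0 * m))  # solves exp(-2 m t^2) = delta
    return min(1.0, empirical + t)


# Hypothetical case: no errors on 100 held-out examples, 95% confidence.
print(holdout_bound([0.0] * 100, 0.05))
```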

Hold Out, Compression, Complexity, Stability, and… Privacy (II)

Another option is to check how much the algorithm compresses the data:

$$s' \subseteq s$$

$$\mathbb{P}_{S}\left\{L(\mathscr{A}(S)) - \hat{L}^{S}_{n}(\mathscr{A}(S)) \ge t\right\} \le n \binom{n}{|s'|} e^{-2nt^2}$$

[1] Floyd, S., Warmuth, M., 1995. Sample compression, learnability, and the Vapnik-Chervonenkis dimension. Machine Learning 21 (3), 269–304.
[2] Langford, J., McAllester, D., 2004. Computable shell decomposition bounds. Journal of Machine Learning Research 5, 529–547.

Hold Out, Compression, Complexity, Stability, and… Privacy (III)

Another option is to check how large the function space from which the algorithm chooses the solution is:

$$C(\mathcal{F}) : \quad n^{d_{VC}(\mathcal{F})}, \quad e^{n R^2(\mathcal{F})}$$

$$\mathbb{P}_{S}\left\{L(\mathscr{A}(S)) - \hat{L}^{S}_{n}(\mathscr{A}(S)) \ge t\right\} \le c_2\, C(\mathcal{F})\, e^{-c_1 n t^2}$$

$$\mathbb{P}_{S}\left\{L(\mathscr{A}(S)) - \hat{L}^{S}_{n}(\mathscr{A}(S)) \ge t\right\} \le c_2\, C\left(\left\{f : f \in \mathcal{F},\ \hat{L}^{s}_{n}(f) \le c_3\right\}\right) e^{-c_1 n t^2}, \qquad c_3 = c_3(L, C, t, n)$$

[1] V. N. Vapnik, Statistical learning theory, Wiley-Interscience, 1998.
[2] L. Oneto, A. Ghio, S. Ridella, D. Anguita, Global Rademacher complexity bounds: From slow to fast convergence rates, Neural Processing Letters 43 (2) (2015) 567–602.
[3] L. Oneto, A. Ghio, S. Ridella, D. Anguita, Local Rademacher complexity: Sharper risk bounds with and without unlabeled samples, Neural Networks 65 (2015) 115–125.

Hold Out, Compression, Complexity, Stability, and… Privacy (IV)

Another way is to check how close the functions chosen by the algorithm are when the data change slightly:

$$\left\|\ell(\mathscr{A}(s), \cdot) - \ell(\mathscr{A}(\breve{s}), \cdot)\right\|_{\infty} \le \beta$$

$$\mathbb{P}_{S}\left\{L(\mathscr{A}(S)) - \hat{L}^{S}_{n}(\mathscr{A}(S)) \ge t\right\} \le c_2\, e^{n\beta^2 - c_1 n t^2}$$

[1] O. Bousquet, A. Elisseeff, Stability and generalization, The Journal of Machine Learning Research 2 (2002) 499–526.
[2] L. Oneto, A. Ghio, S. Ridella, D. Anguita, Fully empirical and data-dependent stability-based bounds, IEEE Transactions on Cybernetics 45 (9) (2015) 1913–1926.
[3] A. Maurer, A second-order look at stability and generalization, in: Conference on Learning Theory, 2017, pp. 1461–1475.

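DP Main Result

Let $\mathscr{A}$ be an $\epsilon$-DP algorithm and $F = \mathscr{A}(S)$.

$$\text{If}\quad \mathbb{P}_S\{S \in \mathscr{D}(f)\} \le \delta\ \ \forall f \in \mathcal{F} \quad\text{and}\quad \epsilon \le \sqrt{\frac{\ln(1/\delta)}{2n}} \quad\rightarrow\quad \mathbb{P}_{S,F}\{S \in \mathscr{D}(F)\} \le 3\sqrt{\delta}$$

Proof: rather technical…

[1] C. Dwork, V. Feldman, M. Hardt, T. Pitassi, O. Reingold, A. Roth, Preserving statistical validity in adaptive data analysis, in: Symposium on Theory of Computing, 2015.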

Hoeffding-type Bounds

$$\epsilon \le t \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{L(F) \ge \hat{L}^{S}_{n}(F) + t\right\} \le 3e^{-nt^2}$$

$$\epsilon \le \sqrt{t^2 - \frac{\ln(2)}{2n}} \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{|L(F) - \hat{L}^{S}_{n}(F)| \ge t\right\} \le 3\sqrt{2}\, e^{-nt^2}$$

Rate: $O(1/\sqrt{n})$

Proof:

$$\mathbb{P}_S\left\{L(f) - \hat{L}^{S}_{n}(f) \ge t\right\} \le e^{-2nt^2}, \qquad \mathscr{D}(f) = \left\{s \in \mathcal{S} : L(f) - \hat{L}^{s}_{n}(f) > t\right\}, \qquad \delta = e^{-2nt^2}$$

[1] W. Hoeffding, Probability inequalities for sums of bounded random variables, Journal of the American Statistical Association 58 (301) (1963) 13–30.
[2] C. Dwork, V. Feldman, M. Hardt, T. Pitassi, O. Reingold, A. Roth, Preserving statistical validity in adaptive data analysis, in: Symposium on Theory of Computing, 2015.
[3] L. Oneto, S. Ridella, D. Anguita, Differential privacy and generalization: Sharper bounds with applications, Pattern Recognition Letters 89 (2017) 31–38.
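A hedged numeric sketch of the one-sided Hoeffding-type bound for DP algorithms: solving $3e^{-nt^2} = \delta_{tot}$ for $t$ gives the deviation that holds with probability at least $1 - \delta_{tot}$, provided the privacy parameter satisfies $\epsilon \le t$. The function names and the example numbers are hypothetical; the classical fixed-function Hoeffding width is included only for comparison.

```python
import math


def dp_hoeffding_width(n, delta_total):
    """t such that 3 * exp(-n t^2) = delta_total (one-sided DP bound)."""
    return math.sqrt(math.log(3.0 / delta_total) / n)


def fixed_function_width(n, delta_total):
    """Classical Hoeffding width for a function chosen independently of the data."""
    return math.sqrt(math.log(1.0 / delta_total) / (2.0 * n))


def dp_bound_applies(epsilon, n, delta_total):
    """The DP bound requires epsilon <= t."""
    return epsilon <= dp_hoeffding_width(n, delta_total)


n, delta_total = 1000, 0.05
print(dp_hoeffding_width(n, delta_total), fixed_function_width(n, delta_total))
print(dp_bound_applies(0.05, n, delta_total))
```

Both widths shrink as $1/\sqrt{n}$; the DP version pays only a constant factor for letting the function depend on the data.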

Chernoff and Bennett-type Bounds

$$\epsilon \le t \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{L(F) \ge \hat{L}^{S}_{n}(F) + \sqrt{4L(F)}\, t\right\} \le 3e^{-nt^2}$$

$$\epsilon \le \sqrt{t^2 - \frac{\ln(2)}{2n}} \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{|L(F) - \hat{L}^{S}_{n}(F)| \ge \sqrt{6L(F)}\, t\right\} \le 3\sqrt{2}\, e^{-nt^2}$$

Rate: from $O(1/\sqrt{n})$ to $O(1/n)$

[1] H. Chernoff, A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations, The Annals of Mathematical Statistics 23 (4) (1952) 493–507.
[2] A. Maurer, M. Pontil, Empirical Bernstein bounds and sample variance penalization, in: Conference on Learning Theory, 2009.
[3] L. Oneto, S. Ridella, D. Anguita, Differential privacy and generalization: Sharper bounds with applications, Pattern Recognition Letters 89 (2017) 31–38.

Chernoff and Bennett-type Bounds

$$\epsilon \le \sqrt{t^2 - \frac{\ln(2)}{2n}} \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{L(F) \ge \hat{L}^{S}_{n}(F) + \sqrt{4\hat{V}^{S}_{n}(F)}\, t + \frac{14nt^2}{3(n-1)}\right\} \le 3\sqrt{2}\, e^{-nt^2}$$

$$\epsilon \le \sqrt{t^2 - \frac{\ln(3)}{2n}} \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{\left|L(F) - \hat{L}^{S}_{n}(F)\right| \ge \sqrt{4\hat{V}^{S}_{n}(F)}\, t + \frac{14nt^2}{3(n-1)}\right\} \le 3\sqrt{3}\, e^{-nt^2}$$

Rate: from $O(1/\sqrt{n})$ to $O(1/n)$

[1] H. Chernoff, A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations, The Annals of Mathematical Statistics 23 (4) (1952) 493–507.
[2] A. Maurer, M. Pontil, Empirical Bernstein bounds and sample variance penalization, in: Conference on Learning Theory, 2009.
[3] L. Oneto, S. Ridella, D. Anguita, Differential privacy and generalization: Sharper bounds with applications, Pattern Recognition Letters 89 (2017) 31–38.

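Clopper-Pearson (Binary Classification)

$$\epsilon \le \sqrt{\frac{\ln(1/\delta)}{2n}} \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{L(F) \ge Q\left[1 - \delta;\ n\hat{L}^{S}_{n}(F) + 1,\ n - n\hat{L}^{S}_{n}(F)\right]\right\} \le 3\sqrt{\delta}$$

$$\epsilon \le \sqrt{\frac{\ln(1/(2\delta))}{2n}} \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{L(F) \notin \Big[Q\left[\delta;\ n\hat{L}^{S}_{n}(F),\ n - n\hat{L}^{S}_{n}(F) + 1\right],\ Q\left[1 - \delta;\ n\hat{L}^{S}_{n}(F) + 1,\ n - n\hat{L}^{S}_{n}(F)\right]\Big]\right\} \le 3\sqrt{2\delta}$$

Rate: from $O(1/\sqrt{n})$ to $O(1/n)$

[1] C. J. Clopper, E. S. Pearson, The use of confidence or fiducial limits illustrated in the case of the binomial, Biometrika 26 (4) (1934) 404–413.
[2] L. Oneto, S. Ridella, D. Anguita, Differential privacy and generalization: Sharper bounds with applications, Pattern Recognition Letters 89 (2017) 31–38.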
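The Clopper-Pearson limits can be computed without special libraries. The sketch below assumes that $Q[p; a, b]$ denotes the $p$-quantile of a Beta$(a, b)$ distribution (the usual Clopper-Pearson form; the slide does not spell this out) and uses the identity between Beta and binomial tails to find it by bisection. All function names are hypothetical. The last helper applies the one-sided DP result: a failure probability of $3\sqrt{\delta}$ equals a target $\delta_{tot}$ when $\delta = (\delta_{tot}/3)^2$.

```python
import math


def binom_cdf(k, n, p):
    """P(Bin(n, p) <= k)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def cp_upper(k, n, delta, tol=1e-10):
    """Q[1 - delta; k + 1, n - k]: the p solving P(Bin(n, p) <= k) = delta."""
    if k >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > delta:
            lo = mid  # CDF still too large: the limit lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def cp_lower(k, n, delta, tol=1e-10):
    """Q[delta; k, n - k + 1]: the p solving P(Bin(n, p) >= k) = delta."""
    if k <= 0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 1 - binom_cdf(k - 1, n, mid) < delta:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def max_epsilon(n, delta):
    """Largest privacy parameter allowed by the one-sided DP result."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def dp_cp_upper(k, n, delta_total):
    """One-sided DP Clopper-Pearson limit with total failure probability delta_total."""
    return cp_upper(k, n, (delta_total / 3.0) ** 2)


# Hypothetical case: 3 errors out of 50 examples.
print(cp_lower(3, 50, 0.05), cp_upper(3, 50, 0.05), dp_cp_upper(3, 50, 0.05))
```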

Clopper-Pearson (Regression)

Proof:

[1] L. Oneto, S. Ridella, D. Anguita, Differential privacy and generalization: Sharper bounds with applications, Pattern Recognition Letters 89 (2017) 31–38.[2] X. Chen, A link between binomial parameters and means of bounded random variables, arXiv preprint arXiv:0802.3946.


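$$\mathbb{P}\{h \ge u\} = \mathbb{E}_{h,u}[h \ge u] = \mathbb{E}_h\{\mathbb{E}_u[h \ge u]\} = \mathbb{E}_h\{h\}$$

where $h \in [0, 1]$, $[\cdot]$ is the indicator of the event, and $u$ is uniform on $[0, 1]$:

$$p\{u = \alpha\} = \begin{cases} 1 & \alpha \in [0, 1] \\ 0 & \alpha \notin [0, 1] \end{cases}$$

$$\epsilon \le \sqrt{\frac{\ln(1/\delta)}{2n}} \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{L(F) \ge Q\left[1 - \delta;\ \sum_{i=1}^{n} [\ell(F, Z_i) \ge u_i] + 1,\ n - \sum_{i=1}^{n} [\ell(F, Z_i) \ge u_i]\right]\right\} \le 3\sqrt{\delta}$$

$$\epsilon \le \sqrt{\frac{\ln(1/(2\delta))}{2n}} \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{L(F) \notin \Big[Q\left[\delta;\ \sum_{i=1}^{n} [\ell(F, Z_i) \ge u_i],\ n - \sum_{i=1}^{n} [\ell(F, Z_i) \ge u_i] + 1\right],\ Q\left[1 - \delta;\ \sum_{i=1}^{n} [\ell(F, Z_i) \ge u_i] + 1,\ n - \sum_{i=1}^{n} [\ell(F, Z_i) \ge u_i]\right]\Big]\right\} \le 3\sqrt{2\delta}$$

Rate: from $O(1/\sqrt{n})$ to $O(1/n)$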

Example: DP Random Forest (RF) (I)

Datasets from www.openml.org

Abb. | ID   | Name                   | n      | d
D01  | 40   | sonar                  | 208    | 61
D02  | 59   | ionosphere             | 351    | 35
D03  | 785  | wind correlations      | 45     | 47
D04  | 882  | pollution              | 60     | 16
D05  | 1104 | leukemia               | 72     | 7130
D06  | 1446 | CostaMadre1            | 296    | 38
D07  | 1453 | PieChart3              | 1077   | 38
D08  | 1458 | arcene                 | 200    | 10001
D09  | 1485 | madelon                | 2600   | 501
D10  | 1566 | hill-valley            | 1212   | 101
D11  | 37   | diabetes               | 768    | 9
D12  | 1005 | glass                  | 214    | 10
D13  | 1494 | qsar-biodeg            | 1055   | 42
D14  | 1134 | OVA Kidney             | 1545   | 10937
D15  | 1217 | Click prediction small | 149639 | 12
D16  | 1149 | AP Ovary Kidney        | 458    | 10937
D17  | 907  | chscase census4        | 400    | 8
D18  | 976  | kdd JapaneseVowels     | 9961   | 15
D19  | 1443 | PizzaCutter1           | 61     | 38
D20  | 871  | pollen                 | 3848   | 6

Example: DP RF (II)

• Random Forests (RF): the original RF formulation
• Random Rotation Ensembles (RRE): a recent improvement over the original RF
• Random Decision Trees (RDT): a fully random RF implementation which is faster to train
• Differentially Private RDT (DPRDT): an RDT formulation which is also DP

[1] L. Breiman, Random forests, Machine Learning 45 (1) (2001) 5–32.
[2] R. Blaser, P. Fryzlewicz, Random rotation ensembles, Journal of Machine Learning Research 17 (4) (2015) 1–15.
[3] M. Bojarski, A. Choromanska, K. Choromanski, Y. LeCun, Differentially- and non-differentially-private random decision trees, arXiv preprint arXiv:1410.6973, 2014.


Example: DP RF (III)

• kCV for RF, RRE, and RDT
• DP for DPRDT

[Figure: generalization error (0 to 1) on datasets D01–D20 for RF, RRE, RDT, and DPRDT, with $n_t = 50$, $n_d = 30$, $k = 3$.]

Randomized Functions or Randomized Algorithms?

For studying Randomized Algorithms we have different options:
• Hold out
• Stability
• DP

For studying Randomized Functions, instead, we have only one powerful option:
• PAC-Bayes theory


PAC-Bayes Theory (I)

$$\pi : \text{prior over } \mathcal{F}, \qquad \rho : \text{posterior over } \mathcal{F}$$

$$G_\rho(X) : \quad x \sim P_X, \quad f \sim \rho, \quad y = f(x)$$

$$B_\rho(X) : \quad x \sim P_X, \quad y = \mathbb{E}_\rho\{f(X)\}$$

$$L(B_\rho) \le 2L(G_\rho)$$

[1] Germain, P., et al. Risk bounds for the majority vote: From a PAC-Bayesian analysis to a learning algorithm. Journal of Machine Learning Research 16 (1) (2015) 787–860.
[2] McAllester, D. A. Some PAC-Bayesian theorems. Computational Learning Theory. 1998.
[3] Langford, J. Tutorial on practical prediction theory for classification. Journal of Machine Learning Research. 2005.
[4] Germain, P., Lacasse, A., Laviolette, F., Marchand, M. PAC-Bayesian learning of linear classifiers. International Conference on Machine Learning. 2009.

PAC-Bayes Theory (II)

$$KL(\rho \| \pi) = \mathbb{E}_\rho\left\{\ln\left(\frac{\rho}{\pi}\right)\right\}$$

$$\mathbb{P}\left\{kl\left[\hat{L}^{S}_{n}(G_Q) \,\|\, L(G_Q)\right] \ge \frac{KL + \ln\left[\frac{2\sqrt{n}}{\delta}\right]}{n}\right\} \le \delta$$

Rate: from $O\left(\sqrt{\ln(n)/n}\right)$ to $O(\ln(n)/n)$

[1] Kullback, S., Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22(1), 79–86.
[2] Bégin, L., Germain, P., Laviolette, F., Roy, J. F. (2016). PAC-Bayesian bounds based on the Rényi divergence. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (pp. 435–444).
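The PAC-Bayes bound controls the binary relative entropy $kl[\hat{L} \| L]$, so an explicit bound on the true error requires inverting $kl$ numerically. The sketch below does this by bisection, assuming $kl[q \| p] = q \ln(q/p) + (1-q)\ln((1-q)/(1-p))$; the function names and the example values of the empirical Gibbs error and of $KL(\rho\|\pi)$ are hypothetical.

```python
import math


def kl_bernoulli(q, p):
    """Binary relative entropy kl[q || p], with arguments clamped away from 0 and 1."""
    eps = 1e-12
    q = min(max(q, eps), 1 - eps)
    p = min(max(p, eps), 1 - eps)
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))


def kl_inverse_upper(q, bound, tol=1e-10):
    """Largest p in [q, 1] with kl[q || p] <= bound, found by bisection."""
    lo, hi = q, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if kl_bernoulli(q, mid) <= bound:
            lo = mid
        else:
            hi = mid
    return lo


def pac_bayes_risk_bound(emp_gibbs_risk, kl_div, n, delta):
    """Upper bound on the true Gibbs error holding with probability at least 1 - delta."""
    rhs = (kl_div + math.log(2.0 * math.sqrt(n) / delta)) / n
    return kl_inverse_upper(emp_gibbs_risk, rhs)


# Hypothetical values: empirical Gibbs error 0.1, KL(rho || pi) = 5, n = 1000.
print(pac_bayes_risk_bound(0.1, 5.0, 1000, 0.05))
```

When the empirical error is close to zero the inverted bound behaves like the right-hand side itself, which is the source of the faster $\ln(n)/n$ end of the rate range.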

Catoni's Result: Catoni Randomized Function (CRF)

$$\pi(f) = \frac{1}{Z} \exp(-\gamma L(f)), \qquad \rho(f) = \frac{1}{Z'} \exp(-\gamma \hat{L}^{s}_{n}(f))$$

$$KL(\rho \| \pi) \le \frac{\gamma^2}{2n} + \gamma \sqrt{\frac{2 \ln\left[\frac{2\sqrt{n}}{\delta}\right]}{n}}, \qquad \text{with probability } (1 - \delta)$$

[1] O. Catoni. PAC-Bayesian supervised classification: The thermodynamics of statistical learning. arXiv preprint arXiv:0712.0248, 2007.
[2] Lever, G., Laviolette, F., Shawe-Taylor, J. Tighter PAC-Bayes bounds through distribution-dependent priors. Theoretical Computer Science. 2013.
[3] Oneto, L., Anguita, D., Ridella, S. PAC-Bayesian analysis of distribution dependent priors: Tighter risk bounds and stability analysis. Pattern Recognition Letters 80 (2016) 200–207.

Why not a Catoni Randomized Algorithm (CRA)?

Instead of building a CRF based on Catoni's posterior, we can consider a Randomized Algorithm which chooses, inside our space of functions, the best function (the one with the smallest empirical error) perturbed with Catoni's noise (CRA).

[1] L. Oneto, S. Ridella, D. Anguita, Differential privacy and generalization: Sharper bounds with applications, Pattern Recognition Letters 89 (2017) 31–38.


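CRA is DP

Let $\breve{s}$ differ from $s$ only in the $j$-th example, $\breve{z}_j$ in place of $z_j$. Then

$$\frac{\mathbb{P}\{\mathscr{A}(s) = f\}}{\mathbb{P}\{\mathscr{A}(\breve{s}) = f\}} = \frac{e^{-\frac{\gamma}{n} \sum_{i=1}^{n} \ell(f, z_i)}}{\sum_{f_1 \in \mathcal{F}} e^{-\frac{\gamma}{n} \sum_{i=1}^{n} \ell(f_1, z_i)}} \cdot \frac{\sum_{f_1 \in \mathcal{F}} e^{-\frac{\gamma}{n} \left(\sum_{i=1, i \ne j}^{n} \ell(f_1, z_i) + \ell(f_1, \breve{z}_j)\right)}}{e^{-\frac{\gamma}{n} \left(\sum_{i=1, i \ne j}^{n} \ell(f, z_i) + \ell(f, \breve{z}_j)\right)}}$$

$$\le \frac{e^{0} \sum_{f_1 \in \mathcal{F}} e^{-\frac{\gamma}{n} \sum_{i=1, i \ne j}^{n} \ell(f_1, z_i)}}{e^{-\frac{\gamma}{n}} \sum_{f_1 \in \mathcal{F}} e^{-\frac{\gamma}{n} \sum_{i=1, i \ne j}^{n} \ell(f_1, z_i)}} \cdot \frac{e^{0}}{e^{-\frac{\gamma}{n}}} = e^{\frac{2\gamma}{n}}.$$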
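The privacy guarantee of the CRA can be checked numerically on a finite function class. The sketch below assumes the CRA returns function $f$ with probability proportional to $\exp(-\frac{\gamma}{n}\sum_i \ell(f, z_i))$, as in the derivation on this slide, and verifies that changing one example moves every log-probability by at most $2\gamma/n$. The loss table is random toy data and all names are hypothetical.

```python
import math
import random


def cra_distribution(loss_table, gamma):
    """loss_table[f][i] is the loss in [0, 1] of function f on example i."""
    n = len(loss_table[0])
    weights = [math.exp(-gamma / n * sum(row)) for row in loss_table]
    total = sum(weights)
    return [w / total for w in weights]


def cra_sample(loss_table, gamma, rng=random):
    """Draw the index of one function according to the CRA distribution."""
    probs = cra_distribution(loss_table, gamma)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def max_log_ratio(table_a, table_b, gamma):
    """Largest absolute log-ratio between the two output distributions."""
    pa = cra_distribution(table_a, gamma)
    pb = cra_distribution(table_b, gamma)
    return max(abs(math.log(x / y)) for x, y in zip(pa, pb))


rng = random.Random(0)
n, gamma = 20, 3.0
losses = [[rng.random() for _ in range(n)] for _ in range(5)]
neighbor = [row[:] for row in losses]
for row in neighbor:
    row[7] = rng.random()  # replace the losses on example j = 7 only

print(max_log_ratio(losses, neighbor, gamma), "<=", 2 * gamma / n)
print(cra_sample(losses, gamma, rng))
```

Larger $\gamma$ concentrates the distribution on the empirical minimizer but weakens the privacy parameter $\epsilon = 2\gamma/n$.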

[1] L. Oneto, S. Ridella, D. Anguita, Differential privacy and generalization: Sharper bounds with applications, Pattern Recognition Letters 89 (2017) 31–38.

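CRF and CRA Generalization Properties

CRA:

$$\gamma = \frac{1}{2}\sqrt{n \ln\left(\frac{3\sqrt{2}}{2\delta}\right)} \quad\rightarrow\quad \mathbb{P}_{S,F}\left\{L(F) \notin \Big[Q\left[\tfrac{\delta^2}{18};\ n\hat{L}^{S}_{n}(F),\ n - n\hat{L}^{S}_{n}(F) + 1\right],\ Q\left[1 - \tfrac{\delta^2}{18};\ n\hat{L}^{S}_{n}(F) + 1,\ n - n\hat{L}^{S}_{n}(F)\right]\Big]\right\} \le \delta$$

Rate: from $O(1/\sqrt{n})$ to $O(1/n)$

CRF:

$$\gamma = \frac{1}{2}\sqrt{n \ln\left(\frac{3\sqrt{2}}{2\delta}\right)} \quad\rightarrow\quad \mathbb{P}_{S}\left\{kl\left[\hat{L}^{S}_{n}(G_Q) \,\|\, L(G_Q)\right] \ge \frac{\ln\left(\frac{3\sqrt{2}}{\delta}\right)}{8n} + \sqrt{\frac{2 \ln\left(\frac{3\sqrt{2}}{\delta}\right) \ln\left(\frac{2\sqrt{n}}{\delta}\right)}{4n}} + \frac{\ln\left(\frac{2\sqrt{n}}{\delta}\right)}{n}\right\} \le 2\delta.$$

Rate: from $O\left(\sqrt[4]{\ln(n)/n}\right)$ to $O\left(\sqrt{\ln(n)/n}\right)$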

Example: CRF and CRA

The function space consists of trees built with a hold-out set.

[Figure: generalization error (0 to 1) on datasets D01–D20, with $n_t = 50$, $k = 3$; legend entries: CRC, CPD, CRF, CRA.]

Non-Adaptive Data Analysis (I)

NAS: the non-adaptive setting is the case when the procedures for building our models exploit just the training set.

Algorithm 1: Union Bound for the NAS
Input: $s_t$, $s_h$, and $\mathscr{P}_1, \dots, \mathscr{P}_m$
Output: $\hat{L}^{s_h}_{n}(f_1), \dots, \hat{L}^{s_h}_{n}(f_m)$
for $i \leftarrow 1$ to $m$ do
    $f_i = \mathscr{P}_i(s_t)$ and compute $\hat{L}^{s_h}_{n}(f_i)$;

[1] Oneto, L., Ridella, S., Anguita, D. Differential privacy and generalization: Sharper bounds with applications. Pattern Recognition Letters 89 (2017) 31–38.

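Non-Adaptive Data Analysis (II)

NAS: the non-adaptive setting is the case when the procedures for building our models exploit just the training set.

Then we can use the Bonferroni Correction:

$$\mathbb{P}_{S_h}\left\{\exists i \in I_m : \left|L(\mathscr{P}_i(s_t)) - \hat{L}^{S_h}_{n}(\mathscr{P}_i(s_t))\right| \ge \sqrt{\frac{\ln\left(\frac{2m}{\delta}\right)}{2n}}\right\} \le \delta$$

Rate: from $O\left(\sqrt{\ln(m)/n}\right)$ to $O(\ln(m)/n)$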
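A minimal sketch of the non-adaptive procedure with the Bonferroni-corrected width $\sqrt{\ln(2m/\delta)/(2n)}$: every procedure sees only the training set, and all $m$ hold-out estimates share the same confidence width. The procedure signature `proc(s_train)`, the loss function, and the toy data are assumptions made for illustration.

```python
import math


def nas_bonferroni(procedures, s_train, s_holdout, loss, delta):
    """Return (hold-out error, width) for each procedure.

    With probability at least 1 - delta all true errors lie within
    `width` of the corresponding hold-out errors simultaneously.
    """
    m, n = len(procedures), len(s_holdout)
    width = math.sqrt(math.log(2.0 * m / delta) / (2.0 * n))
    results = []
    for proc in procedures:
        f = proc(s_train)  # the model depends on the training set only
        error = sum(loss(f, z) for z in s_holdout) / n
        results.append((error, width))
    return results


# Toy example: two constant classifiers and the 0/1 loss.
procedures = [lambda s: (lambda x: 0), lambda s: (lambda x: 1)]
zero_one = lambda f, z: float(f(z[0]) != z[1])
s_train = [(0, 0), (1, 1)]
s_holdout = [(0, 0), (1, 1), (2, 1), (3, 0)]
print(nas_bonferroni(procedures, s_train, s_holdout, zero_one, 0.05))
```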

Adaptive Data Analysis (I)

AS: the adaptive setting is the case when the procedures for building our models exploit both the training set and the performance of the procedures at the previous steps over the hold-out set.

Algorithm 1: Hold out for the AS
Input: $s_t$, $s_h$, and $\mathscr{P}_1, \dots, \mathscr{P}_m$
Output: $\hat{L}^{s_h^1}_{n}(f_1), \dots, \hat{L}^{s_h^m}_{n}(f_m)$
Split $s_h$ in $s_h^i$ with $i \in I_m$;
for $i \leftarrow 1$ to $m$ do
    $f_i = \mathscr{P}_i\left(s_t, \hat{L}^{s_h^1}_{n}(f_1), \dots, \hat{L}^{s_h^{i-1}}_{n}(f_{i-1})\right)$ and compute $\hat{L}^{s_h^i}_{n}(f_i)$;

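Adaptive Data Analysis (II)

AS: the adaptive setting is the case when the procedures for building our models exploit both the training set and the performance of the procedures at the previous steps over the hold-out set.

Then we need one test set at each step:

$$f_i = \mathscr{P}_i(s_t, \mathscr{P}_{i-1}, \dots, \mathscr{P}_1)$$

$$\mathbb{P}_{S_h^i}\left\{\exists i \in I_m : \left|L(f_i) - \hat{L}^{S_h^i}_{n}(f_i)\right| \ge \sqrt{\frac{m \ln\left(\frac{2}{\delta}\right)}{2n}}\right\} \le \delta$$

Rate: from $O\left(\sqrt{m/n}\right)$ to $O(m/n)$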
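To see how quickly the hold-out budget is consumed in the adaptive setting, the sketch below compares the two widths stated on these slides: $\sqrt{\ln(2m/\delta)/(2n)}$ for the non-adaptive Bonferroni correction and $\sqrt{m\ln(2/\delta)/(2n)}$ when the hold-out set is split into $m$ parts. The function names and the chosen values of $n$, $m$, $\delta$ are hypothetical.

```python
import math


def nas_width(n, m, delta):
    """Non-adaptive setting: one hold-out set of size n shared by m models."""
    return math.sqrt(math.log(2.0 * m / delta) / (2.0 * n))


def as_split_width(n, m, delta):
    """Adaptive setting with a fresh part of the hold-out set at each of m steps."""
    return math.sqrt(m * math.log(2.0 / delta) / (2.0 * n))


n, delta = 1000, 0.05
for m in (1, 10, 100):
    print(m, round(nas_width(n, m, delta), 4), round(as_split_width(n, m, delta), 4))
```

The first width grows with $\ln(m)$ while the second grows with $m$, which is the gap that Thresholdout aims to close.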

Thresholdout

The idea is to look at the test set error only when it is far from the one on the training set; and, when we do look at the error on the test set, we look at it in a “private way”.

Algorithm 1: Thresholdout for the AS
Input: $s_t$, $s_h$, $T$, $\sigma$, $B$, and $\mathscr{P}_1, \dots, \mathscr{P}_m$
Output: $a_1, \dots, a_m$
$\gamma \sim \mathrm{Lap}(2\sigma)$, $\hat{T} = T + \gamma$;
for $i \leftarrow 1$ to $m$ do
    if $B < 1$ then
        $a_i = \bot$;
    else
        $f_i = \mathscr{P}_i(s_t, a_1, \dots, a_{i-1})$, $\eta \sim \mathrm{Lap}(4\sigma)$;
        if $|\hat{L}^{s_h}_{n}(f_i) - \hat{L}^{s_t}_{n}(f_i)| \ge \hat{T} + \eta$ then
            $\xi \sim \mathrm{Lap}(\sigma)$, $\gamma \sim \mathrm{Lap}(2\sigma)$, $\hat{T} = T + \gamma$, $B = B - 1$;
            $a_i = \hat{L}^{s_h}_{n}(f_i) + \xi$;
        else
            $a_i = \hat{L}^{s_t}_{n}(f_i)$;

[1] Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., Roth, A., 2015c. The reusable holdout: Preserving validity in adaptive data analysis. Science 349, 636–638.
[2] Oneto, L., Ridella, S., Anguita, D. Differential privacy and generalization: Sharper bounds with applications. Pattern Recognition Letters 89 (2017) 31–38.
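The Thresholdout steps can be sketched in plain Python. This is an illustrative reading of the algorithm on this slide, not the authors' implementation: the procedure signature `proc(s_train, previous_answers)`, the `seed` argument, and the use of `None` for the symbol ⊥ are assumptions. Laplace noise with a given scale is drawn as the difference of two exponential variates, since the standard library has no Laplace sampler.

```python
import random


def laplace(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def thresholdout(procedures, s_train, s_holdout, loss, T, sigma, B, seed=0):
    """Answer m adaptive queries, touching the hold-out set at most B times."""
    rng = random.Random(seed)

    def empirical(f, sample):
        return sum(loss(f, z) for z in sample) / len(sample)

    answers = []
    T_hat = T + laplace(2 * sigma, rng)
    for proc in procedures:
        if B < 1:
            answers.append(None)  # budget exhausted: no answer
            continue
        f = proc(s_train, list(answers))  # may depend on earlier answers
        eta = laplace(4 * sigma, rng)
        err_h, err_t = empirical(f, s_holdout), empirical(f, s_train)
        if abs(err_h - err_t) >= T_hat + eta:
            # Training and hold-out errors disagree: release a noisy hold-out error.
            xi = laplace(sigma, rng)
            T_hat = T + laplace(2 * sigma, rng)
            B -= 1
            answers.append(err_h + xi)
        else:
            # They agree: the training error is returned and the hold-out stays hidden.
            answers.append(err_t)
    return answers


# Toy example: two constant classifiers that ignore their inputs, 0/1 loss.
procedures = [lambda s, a: (lambda x: 0), lambda s, a: (lambda x: 0)]
zero_one = lambda f, z: float(f(z[0]) != z[1])
s_train = [(0, 0), (1, 0), (2, 0)]
s_holdout = [(3, 0), (4, 0)]
print(thresholdout(procedures, s_train, s_holdout, zero_one, T=0.04, sigma=0.01, B=5))
```

The budget $B$ only decreases when the hold-out error is actually released, so analyses that do not overfit the training set cost nothing.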

Hoeffding-type Bound

We obtain a result which is analogous to the one of the NAS setting:

$$\mathbb{P}_{A_i, F_i}\left\{\exists i \in I_m : A_i \ne \bot \ \wedge\ |A_i - L(F_i)| \ge 40\sqrt{\frac{B \ln\left(\frac{12m}{\delta}\right)}{n}}\right\} \le \delta$$

Rate: $O\left(\sqrt{\ln(m)/n}\right)$

[1] Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., Roth, A., 2015c. The reusable holdout: Preserving validity in adaptive data analysis. Science 349, 636–638.
[2] Oneto, L., Ridella, S., Anguita, D. Differential privacy and generalization: Sharper bounds with applications. Pattern Recognition Letters 89 (2017) 31–38.

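Chernoff-type Bound

We can obtain a sharp result (at least asymptotically).

$$t = 40\sqrt{\frac{B \ln\left(\frac{12m}{\delta}\right)}{n}}$$

$$\mathbb{P}_{A_i, F_i}\left\{\exists i \in I_m : A_i \ne \bot \ \wedge\ |A_i - L(F_i)| \ge 30\sqrt{A_i}\, t + 50t^2\right\} \le \delta$$

Rate: from $O\left(\sqrt{\ln(m)/n}\right)$ to $O(\ln(m)/n)$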
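The constants in the two Thresholdout bounds are large, so it helps to evaluate them numerically. The sketch below computes $t = 40\sqrt{B\ln(12m/\delta)/n}$ and the Chernoff-type width $30\sqrt{A_i}\,t + 50t^2$; the function names and the sample sizes are hypothetical. It illustrates why the Chernoff-type result is described as sharp only asymptotically: it improves on $t$ when the released error $A_i$ is small and $n$ is very large.

```python
import math


def thresholdout_t(n, m, B, delta):
    """Hoeffding-type width t = 40 * sqrt(B ln(12 m / delta) / n)."""
    return 40.0 * math.sqrt(B * math.log(12.0 * m / delta) / n)


def chernoff_type_width(a_i, n, m, B, delta):
    """Chernoff-type width 30 * sqrt(a_i) * t + 50 * t^2."""
    t = thresholdout_t(n, m, B, delta)
    return 30.0 * math.sqrt(a_i) * t + 50.0 * t * t


m, B, delta = 10, 1, 0.05
for n in (10**6, 10**9):
    t = thresholdout_t(n, m, B, delta)
    print(n, t, chernoff_type_width(0.0, n, m, B, delta))
```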

Open Problems

• Is privacy reducing our ability to learn something from data?

• Can we improve the rate of convergence and the constants in the bounds?

• How can we exploit privacy to derive new learning algorithms?

• Can we improve the Thresholdout?

• How many times can we access the data without compromising the privacy?

