Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global...

33
Smooth Sensitivity and Sampling Sofya Raskhodnikova Penn State University Joint work with Kobbi Nissim (Ben Gurion University) and Adam Smith (Penn State University)

Transcript of Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global...

Page 1: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Smooth Sensitivity andSampling

Sofya RaskhodnikovaPenn State University

Joint work with Kobbi Nissim (Ben Gurion University)

and Adam Smith (Penn State University)

Page 2: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Our main contributions

• Starting point: Global sensitivity framework [DMNS06]

(Cynthia’s talk)

• Two new frameworks for private data analysis

• Greatly expand the types of information that can be

released

2

Page 3: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Road map

I. Introduction

• Review of global sensitivity framework [DMNS06]

• Motivation

II. Smooth sensitivity framework

III. Sample-and-aggregate framework

3

Page 4: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Model

...

x1

x2

xn

���� Trusted

agency

A�

�Compute f(x)

A(x) =f(x) + noise

Users

Each row is arbitrarily complex data supplied by 1 person.

For which functions f can we have:

• utility: little noise

• privacy: indistinguishability definition of [DMNS06]

4

Page 5: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Privacy as indistinguishability [DMNS06]

Two databases are neighbors if they differ in one row.

x = ...

x1

x2

xn

x′ = ...

x1

x′2

xn

Privacy definition

Algorithm A is ε-indistinguishable if

• for all neighbor databases x, x′

• for all sets of answers S

Pr[A(x) ∈ S] ≤ (1 + ε) · Pr[A(x′) ∈ S]

5

Page 6: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Privacy definition: composition

If A is ε-indistinguishable on each query,it is εq-indistinguishable on q queries.

...

x1

x2

xn

���� ε-indisting.

agency

A�

�Compute f(x)

A(x) =f(x) + noise

Users

6

Page 7: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Global sensitivity framework [DMNS06]

Intuition: f can be released accurately when

it is insensitive to individual entries x1, . . . , xn.

Global sensitivity GSf = maxneighbors x,x′

‖f(x) − f(x′)‖.

Example: GSaverage = 1n

if x ∈ [0, 1]n.

Theorem

If A(x) = f(x) + Lap(

GSf

ε

)then A is ε-indistinguishable.

7

Page 8: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Instance-Based Noise

Big picture for global sensitivity framework:

• add enough noise to cover the worst case for f

• noise distribution depends only on f , not database x

Problem: for some functions that’s too much noise

8

Page 9: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Instance-Based Noise

Big picture for global sensitivity framework:

• add enough noise to cover the worst case for f

• noise distribution depends only on f , not database x

Problem: for some functions that’s too much noise

Example: median of x1, . . . , xn ∈ [0, 1]

x = 0 · · · 0︸ ︷︷ ︸n−1

2

0 1 · · · 1︸ ︷︷ ︸n−1

2

x′ = 0 · · · 0︸ ︷︷ ︸n−1

2

1 1 · · · 1︸ ︷︷ ︸n−1

2

median(x) = 0 median(x′) = 1

GSmedian = 1

• Noise magnitude: 1ε.

8

Page 10: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Instance-Based Noise

Big picture for global sensitivity framework:

• add enough noise to cover the worst case for f

• noise distribution depends only on f , not database x

Problem: for some functions that’s too much noise

Example: median of x1, . . . , xn ∈ [0, 1]

x = 0 · · · 0︸ ︷︷ ︸n−1

2

0 1 · · · 1︸ ︷︷ ︸n−1

2

x′ = 0 · · · 0︸ ︷︷ ︸n−1

2

1 1 · · · 1︸ ︷︷ ︸n−1

2

median(x) = 0 median(x′) = 1

GSmedian = 1

• Noise magnitude: 1ε.

Our goal: noise tuned to database x

8

Page 11: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Road map

I. Introduction

• Review of global sensitivity framework [DMNS06]

• Motivation

II. Smooth sensitivity framework

III. Sample-and-aggregate framework

9

Page 12: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Local sensitivity

Local sensitivity LSf (x) = maxx′: neighbor of x

‖f(x) − f(x′)‖Reminder: GSf = max

xLSf (x)

Example: median for 0 ≤ x1 ≤ · · · ≤ xn ≤ 1, odd n

�0 1� �� ��x1 xnxm−1 xm+1xm. . . . . .

�median

LSmedian(x) = max(xm − xm−1, xm+1 − xm)

Goal: Release f(x) with less noise when LSf (x) is lower.

10

Page 13: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Local sensitivity

Local sensitivity LSf (x) = maxx′: neighbor of x

‖f(x) − f(x′)‖Reminder: GSf = max

xLSf (x)

Example: median for 0 ≤ x1 ≤ · · · ≤ xn ≤ 1, odd n

�0 1� �� ��x1 xnxm−1 xm+1xm. . . . . .

�median

new medianwhen x′

1 = 1

LSmedian(x) = max(xm − xm−1, xm+1 − xm)

Goal: Release f(x) with less noise when LSf (x) is lower.

10

Page 14: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Local sensitivity

Local sensitivity LSf (x) = maxx′: neighbor of x

‖f(x) − f(x′)‖Reminder: GSf = max

xLSf (x)

Example: median for 0 ≤ x1 ≤ · · · ≤ xn ≤ 1, odd n

�0 1� �� ��x1 xnxm−1 xm+1xm. . . . . .

�median

�new medianwhen x′

n = 0

new medianwhen x′

1 = 1

LSmedian(x) = max(xm − xm−1, xm+1 − xm)

Goal: Release f(x) with less noise when LSf (x) is lower.

10

Page 15: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Instance-based noise: first attempt

Noise magnitude proportional to LSf (x) instead of GSf?

No! Noise magnitude reveals information.

Lesson: Noise magnitude must be an insensitive function.

11

Page 16: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Smooth bounds on local sensitivity

Design sensitivity function S(x)

• S(x) is an ε-smooth upper bound on LSf (x) if:

– for all x: S(x) ≥ LSf (x)

– for all neighbors x, x′ : S(x) ≤ eεS(x′)

x

LSf (x)

Theorem

If A(x) = f(x) + noise

(S(x)

ε

)then A is ε′-indistinguishable.

Example: GSf is always a smooth bound on LSf (x)

12

Page 17: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Smooth bounds on local sensitivity

Design sensitivity function S(x)

• S(x) is an ε-smooth upper bound on LSf (x) if:

– for all x: S(x) ≥ LSf (x)

– for all neighbors x, x′ : S(x) ≤ eεS(x′)

x

LSf (x)

S(x)

Theorem

If A(x) = f(x) + noise

(S(x)

ε

)then A is ε′-indistinguishable.

Example: GSf is always a smooth bound on LSf (x)

12

Page 18: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Smooth Sensitivity

Smooth sensitivity S∗f (x)= max

y

(LSf (y)e−ε·dist(x,y)

)

LemmaFor every ε-smooth bound S: S∗

f (x) ≤ S(x) for all x.

Intuition: little noise when far from sensitive instances

database space

high local

sensitivity

low local

sensitivity

13

Page 19: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Smooth Sensitivity

Smooth sensitivity S∗f (x)= max

y

(LSf (y)e−ε·dist(x,y)

)

LemmaFor every ε-smooth bound S: S∗

f (x) ≤ S(x) for all x.

Intuition: little noise when far from sensitive instances

database space

high local

sensitivity

low local

sensitivity

low smooth sensitivity

13

Page 20: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Computing smooth sensitivity

Example functions with computable smooth sensitivity

• Median & minimum of numbers in a bounded interval

• MST cost when weights are bounded

• Number of triangles in a graph

Approximating smooth sensitivity

• only smooth upper bounds on LS are meaningful

• simple generic methods for smooth approximations

– work for median and 1-median in Ld1

14

Page 21: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Road map

I. Introduction

• Review of global sensitivity framework [DMNS06]

• Motivation

II. Smooth sensitivity framework

III. Sample-and-aggregate framework

15

Page 22: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

New goal

• Smooth sensitivity framework requires

understanding combinatorial structure of f

– hard in general

• Goal: an automatable transformation from

an arbitrary f into an ε-indistinguishable A

– A(x) ≈ f(x) for ”good” instances x

16

Page 23: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Example: cluster centers

Database entries: points in a metric space.x

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

x′

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

• Comparing sets of centers: Earthmover-like metric

• Global sensitivity of cluster centers is roughly the

diameter of the space. But intuitively, if clustering is

”good”, cluster centers should be insensitive.

• No efficient approximation for smooth sensitivity17

Page 24: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Example: cluster centers

Database entries: points in a metric space.x

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

����

x′

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

����

• Comparing sets of centers: Earthmover-like metric

• Global sensitivity of cluster centers is roughly the

diameter of the space. But intuitively, if clustering is

”good”, cluster centers should be insensitive.

• No efficient approximation for smooth sensitivity17

Page 25: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Example: cluster centers

Database entries: points in a metric space.x

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

��

��

����

x′

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

�� ����

��������

�������� ��

��

��

��

��

����

• Comparing sets of centers: Earthmover-like metric

• Global sensitivity of cluster centers is roughly the

diameter of the space. But intuitively, if clustering is

”good”, cluster centers should be insensitive.

• No efficient approximation for smooth sensitivity17

Page 26: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Sample-and-aggregate framework

Intuition: Replace f with a less sensitive function f̃ .

f̃(x) = g(f(sample1), f(sample2), . . . , f(samples))

� �

� � �

� �

x

xi1 , . . . , xit xj1 , . . . , xjt . . . xk1 , . . . , xkt

� � �

� � �

f f f

gaggregation function

18

Page 27: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Sample-and-aggregate framework

Intuition: Replace f with a less sensitive function f̃ .

f̃(x) = g(f(sample1), f(sample2), . . . , f(samples))

� �

� � �

� �

x

xi1 , . . . , xit xj1 , . . . , xjt . . . xk1 , . . . , xkt

� � �

� � �

f f f

gaggregation function

�� ��+

noise calibratedto sensitivity of f̃

output

18

Page 28: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Good aggregation functions

• average

– works for L1 and L2

• center of attention

– the center of a smallest ball containing a strict

majority of input points

– works for arbitrary metrics

(in particular, for Earthmover)

– gives lower noise for L1 and L2

19

Page 29: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Sample-and-aggregate results

TheoremIf f can be approximated on x

from small samples

then f can be released with little noise

20

Page 30: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Sample-and-aggregate results

TheoremIf f can be approximated on x within distance r

from small samples of size n1−δ

then f can be released with little noise ≈ rε

+ negl(n)

20

Page 31: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Sample-and-aggregate results

TheoremIf f can be approximated on x within distance r

from small samples of size n1−δ

then f can be released with little noise ≈ rε

+ negl(n)

• Works in all ”interesting” metric spaces

• Example applications

– k-means cluster centers (if data is separated a.k.a.[Ostrovsky Rabani Schulman Swamy 06])

– fitting mixtures of Gaussians (if data is i.i.d., using[Vempala Wang 04, Achlioptas McSherry 05])

– PAC concepts (Adam Smith’s talk)

20

Page 32: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Road map

I. Introduction

• Review of global sensitivity framework [DMNS06]

• Motivation

II. Smooth sensitivity framework

III. Sample-and-aggregate framework

21

Page 33: Smooth Sensitivity and SamplingCompThink/mindswaps/oct07/local... · • Review of global sensitivity framework [DMNS06] • Motivation II. Smooth sensitivity framework III. Sample-and-aggregate

Conclusion: fundamental question

Which computations are not too

sensitive to individual inputs?

Which functions f admit

ε-indistinguishable approximation A?

22