Polylogarithmic Private Approximations and Efficient Matching. Piotr Indyk (MIT), David Woodruff (MIT, Tsinghua). TCC 2006.

Polylogarithmic Private Approximations and Efficient Matching

Piotr Indyk, MIT

David Woodruff, MIT, Tsinghua

TCC 2006

Secure communication

Alice holds a ∈ {0,1}^n; Bob holds b ∈ {0,1}^n

• Want to compute some function F(a,b)
• Security: protocol does not reveal anything except for the value F(a,b)
  – Semi-honest: both parties follow protocol
  – Malicious: parties are adversarial
• Efficiency: want to exchange few bits

Secure Function Evaluation (SFE)

• [Yao, GMW]: If F is computed by a circuit C, then F can be computed securely with O~(|C|) bits of communication
• [GMW] + … + [NN]: can assume parties are semi-honest
  – A semi-honest protocol can be compiled to give security against malicious parties
• Problem: circuit size is at least linear in n

* O~() hides factors poly(k, log n)

Secure and Efficient Function Evaluation

• Can we achieve sublinear communication?

• With sublinear communication, many interesting problems can be solved only approximately.

• What does it mean to have a private approximation?

• Efficiency: want SFE with communication comparable to insecure case

Private Approximation

• [FIMNSW’01]: A protocol computing an approximation G(a,b) of F(a,b) is private if each party can simulate its view of the protocol given the exact value F(a,b)
• Not sufficient to evaluate a non-private G(a,b) using SFE: the output G(a,b) itself may leak information
• Example (Δ(a,b) denotes the Hamming distance):
  – Define G(a,b):
    • bin(G(a,b))_i = bin(Δ(a,b))_i if i > 0
    • bin(G(a,b))_0 = a_0
  – G(a,b) is a ±1-approximation of Δ(a,b), but not private
• Popular protocols for approximating Δ(a,b), e.g., [KOR98], are not private
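The example can be made concrete with a small sketch. The function names `hamming` and `leaky_approx` are hypothetical, and the code assumes bit 0 denotes the least significant bit of the binary representation; it only illustrates why an accurate output can still leak an input bit.

```python
def hamming(a, b):
    """Exact Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def leaky_approx(a, b):
    """G(a,b): the distance with its least significant bit replaced by a[0]."""
    d = hamming(a, b)
    return (d & ~1) | a[0]

a = [1, 0, 1, 1]
b = [0, 0, 1, 0]
g = leaky_approx(a, b)

# The output is within additive error 1 of the true distance ...
assert abs(g - hamming(a, b)) <= 1
# ... yet its low bit hands Alice's first input bit to whoever sees the output.
assert g & 1 == a[0]
```

Even if G is evaluated inside a perfectly secure circuit, Bob learns a_0 from the output alone, which the exact value of the distance would not determine.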

Approximating Hamming Distance

• [FIMNSW01]: A private protocol with complexity O~(n^{1/2}/ε)
  – Δ(a,b) small: compute Δ(a,b) exactly in O~(Δ(a,b)) bits
  – Δ(a,b) high: sample O~(n/Δ(a,b)) coordinates (a-b)_i, estimate Δ(a,b)
• Our main result:
  – Complexity: O~(1/ε^2) bits
  – Works even for L2 norm, i.e., estimates ||a-b||_2 for a,b ∈ {1…M}^n

* O~() hides factors poly(k, log n, log M, log 1/δ)

Crypto Tools

Efficient OT^n_1:
  – P1 has A[1] … A[n] ∈ {0,1}^m, P2 has i ∈ [n]
  – Goal: P2 privately learns A[i], P1 learns nothing
  – Can be done using O~(m) communication [CMS99, NP99]

Circuits with ROM [NN01] (augments [Yao86])
  – Standard AND/OR/NOT gates
  – Lookup gates:
    • In: i
    • Out: M_gate[i]
  – Can just focus on privacy of the output

Communication at most O~(m|C|)

High-dimensional tools

• Random projection:
  – Take a random orthonormal n×n matrix D, that is, ||Dx|| = ||x|| for all x.
  – There exists c > 0 s.t. for any x ∈ R^n and i = 1…n:
    Pr[ (Dx)_i^2 > k · ||Dx||^2 / n ] < e^{-ck}
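A small numerical sketch of this spreading effect follows. It builds a random orthonormal matrix by Gram-Schmidt on Gaussian vectors, which is one standard construction and an assumption here (the slides do not say how D is generated); the names `random_orthonormal` and `rotate` and the choice n = 64 are illustrative only.

```python
import math
import random

def random_orthonormal(n, rng):
    """Gram-Schmidt on i.i.d. Gaussian vectors; returns n orthonormal rows."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in rows:  # remove components along the rows found so far
            dot = sum(vi * ui for vi, ui in zip(v, u))
            v = [vi - dot * ui for vi, ui in zip(v, u)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-9:  # skip the (unlikely) degenerate draw
            rows.append([vi / norm for vi in v])
    return rows

def rotate(D, x):
    """Matrix-vector product Dx."""
    return [sum(dij * xj for dij, xj in zip(row, x)) for row in D]

rng = random.Random(0)
n = 64
D = random_orthonormal(n, rng)
x = [1000.0] + [0.0] * (n - 1)   # all mass on a single coordinate
y = rotate(D, x)

norm_sq_x = sum(v * v for v in x)
norm_sq_y = sum(v * v for v in y)
# Largest coordinate of Dx, measured in units of the average ||Dx||^2 / n.
max_share = max(v * v for v in y) * n / norm_sq_y
```

For the unrotated x the same ratio equals n, since one coordinate carries everything; after rotation `max_share` plays the role of k in the tail bound and is typically a small multiple of log n.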

Approximating ||a-b||

• Recall:
  – Alice has a ∈ [M]^n, Bob has b ∈ [M]^n
  – Goal: privately estimate ||a-b||; write x = a-b
  – Suffices to estimate ||a-b||^2

Protocol Intuition

1. Alice and Bob agree upon a random orthonormal matrix D
   • Efficient by exchanging a seed of a PRG
2. Alice and Bob rotate vectors a, b, obtaining Da, Db
   • ||Da-Db|| = ||a-b||
   • D “spreads the mass” of the difference vector uniformly across the n coordinates.
   • Can now try obliviously sampling coordinates as in [FIMNSW01]

Protocol Intuition Cont’d

1. Alice and Bob agree upon random orthonormal D

2. Alice and Bob rotate a,b, obtaining Da, Db

3. Use secure circuit with ROMs Da and Db to:

i. Circuit obtains (Da)i and (Db)i for many random indices i

Problem: Now what? Samples leak a lot of info!

Fix: - Suppose you know an upper bound T with T ≥ ||a-b||^2

- Flip a coin z with heads probability n((Da)_i - (Db)_i)^2/(kT), for a uniformly random index i

- Then, averaging over i, E[z] = n||Da-Db||^2/(nkT) = ||a-b||^2/(kT)

- E[z] only depends on ||a-b||, and z only depends on E[z]!
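The identity for E[z] can be checked exactly on toy numbers. In this sketch the integer vectors `y1` and `y2` are made-up stand-ins for Da - Db (a real rotated vector would not be integral), chosen to have the same squared norm but different structure; `heads_probabilities` and `expected_z` are hypothetical helper names.

```python
from fractions import Fraction

def heads_probabilities(y, k, T):
    """Pr[z = 1 | index i] = n * y_i^2 / (k*T) for each coordinate i."""
    n = len(y)
    return [Fraction(n * yi * yi, k * T) for yi in y]

def expected_z(y, k, T):
    """E[z] when the index i is drawn uniformly at random."""
    probs = heads_probabilities(y, k, T)
    # Each value must be a valid probability; this is where the spreading by D matters.
    assert all(0 <= p <= 1 for p in probs), "need n * y_i^2 <= k * T"
    return sum(probs) / len(y)

y1 = [3, -1, 2, 2]   # squared norm 18
y2 = [1, 4, 0, -1]   # squared norm 18, different structure
k, T = 8, 32

# Both vectors give E[z] = 18 / (k*T), so the coin reveals only the norm.
assert expected_z(y1, k, T) == expected_z(y2, k, T) == Fraction(18, k * T)
```

The per-index probabilities differ between the two vectors; only their average over a uniform index coincides, which is why the index must stay hidden inside the secure circuit.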

Protocol Intuition Cont’d

1. Alice and Bob agree upon random orthonormal D

2. Alice and Bob rotate a,b, obtaining Da, Db

3. Use secure circuit with ROMs Da, Db to:
   i. Obtain (Da)_i and (Db)_i for L random i
   ii. Generate Bernoulli z_1, …, z_L with E[z_i] = ||a-b||^2/(kT)
   iii. Output kT Σ_i z_i / L

Privacy: View only depends on ||a-b||

Problem: Correctness! A priori bound T = M^2 n, but ||a-b||^2 may be Θ(1), so Ω(n) samples required.

Fix: Private binary search on T

Protocol Intuition Cont’d

3. Use secure circuit with ROMs Da, Db to:
   i. Obtain (Da)_i and (Db)_i for L random i
   ii. Generate Bernoulli z_1, …, z_L with E[z_i] = ||a-b||^2/(kT)
   iii. Output kT Σ_i z_i / L

Fix: - Private binary search on T

- If many z_i = 0, then intuitively can replace T with T/2

- Eventually T = Θ~(||a-b||^2)

- We will show: final choice of T is simulatable!

One last detail

• Want to show final choice of T is simulatable
• Estimate is kT Σ_i z_i / L and we stop when “many” z_i = 1
• Recall E[z_i] = ||a-b||^2/(kT)

Key Observation: Since orthonormal D is uniformly random, can guarantee that if many z_i = 0, then ||a-b||^2 << T.

Note: - Suppose we didn’t use D, and a = (M, 0, …, 0), b = (0, …, 0)
- Then ||a-b||^2 = M^2 is large, but almost always z_i = 0, so you’ll choose T < ||a-b||^2.
- Not simulatable since T depends on the structure of a, b
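A quick worked calculation shows how lopsided the unrotated case is. The sketch below assumes indices are sampled uniformly and independently, and the values n = 10^6 and L = 1000 are illustrative; `prob_all_samples_miss` is a hypothetical name.

```python
from fractions import Fraction

def prob_all_samples_miss(n, L):
    """Pr[all L uniform indices miss the one nonzero coordinate of x = (M, 0, ..., 0)].

    A miss forces z = 0 regardless of T, so this is a lower bound on
    Pr[z_1 = ... = z_L = 0] when no rotation D is applied.
    """
    return (1 - Fraction(1, n)) ** L

p = prob_all_samples_miss(n=10**6, L=1000)

# With a million coordinates and a thousand samples, an all-zero round is near certain.
assert p > Fraction(998, 1000)
```

The halving of T would therefore continue well past ||a-b||^2 = M^2, and how far it continues depends on where the mass of a-b sits rather than on the norm alone.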

Algorithm vs. Simulation

SIMULATION
• Repeat
  – Generate L independent bits z_i such that Pr[z_i = 1] = ||a-b||^2/(Tk)
  – T = T/2
• Until Σ_i z_i ≥ Ω(L/k)
• Output E = (Σ_i z_i / L) · 2Tk as an estimate of ||a-b||^2

ALGORITHM
• Repeat
  – Generate L independent bits z_i such that Pr[z_i = 1] = ||D(a-b)||^2/(Tk)
  – T = T/2
• Until Σ_i z_i ≥ Ω(L/k)
• Output E = (Σ_i z_i / L) · 2Tk as an estimate of ||a-b||^2

Recall: ||D(a-b)|| = ||a-b||

Communication = O~(L) = O~(1/ε^2)
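The simulation side of the loop can be run in the clear as a rough sketch. It models only the statistics of the bits z_i, not the secure circuit or the ROM lookups. The function name `estimate_sq_norm`, the constant 1/2 standing in for the Ω(L/k) threshold, the clamp of the probability at 1, and all parameter values are assumptions made for illustration.

```python
import random

def estimate_sq_norm(sq_norm, T0, k, L, rng):
    """Halving loop from the slide, driven directly by the value ||a-b||^2."""
    if sq_norm <= 0:
        raise ValueError("sq_norm must be positive, or the loop never stops")
    T = float(T0)
    while True:
        p = min(1.0, sq_norm / (T * k))      # Pr[z_i = 1]; the clamp is an added guard
        hits = sum(rng.random() < p for _ in range(L))
        T = T / 2
        if hits >= L / (2 * k):              # assumed constant inside the Omega(L/k) test
            return hits / L * 2 * T * k      # 2T is the value of T the bits were drawn with

rng = random.Random(1)
estimate = estimate_sq_norm(sq_norm=250.0, T0=1e6, k=8, L=20000, rng=rng)
```

Because the loop consumes nothing but ||a-b||^2, a simulator holding only the exact answer can reproduce the distribution of the stopping value of T, which is the point of the side-by-side comparison.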

Other Results

• Use homomorphic encryption tricks to get better upper bounds for private nearest neighbor and private all-pairs nearest neighbors.

• Define private approximate nearest neighbor problem:

– Requires a new definition of private approximations for functionalities that can return sets of values.

– Achieve small communication in this setting.