Polylogarithmic Private Approximations and Efficient Matching Piotr Indyk MIT David Woodruff MIT, Tsinghua TCC 2006

Page 1

Polylogarithmic Private Approximations and Efficient Matching

Piotr Indyk (MIT)

David Woodruff (MIT, Tsinghua)

TCC 2006

Page 2

Secure communication

Alice holds a ∈ {0,1}^n; Bob holds b ∈ {0,1}^n.

• Want to compute some function F(a,b)
• Security: protocol does not reveal anything except for the value F(a,b)
  – Semi-honest: both parties follow protocol
  – Malicious: parties are adversarial
• Efficiency: want to exchange few bits

Page 3

Secure Function Evaluation (SFE)

• [Yao, GMW]: If F is computed by a circuit C, then F can be computed securely with O~(|C|) bits of communication
• [GMW] + … + [NN]: can assume parties are semi-honest
  – Semi-honest protocol can be compiled to give security against malicious parties
• Problem: circuit size is at least linear in n

* O~() hides factors poly(k, log n)

Page 4

Secure and Efficient Function Evaluation

• Can we achieve sublinear communication?

• With sublinear communication, many interesting problems can be solved only approximately.

• What does it mean to have a private approximation?

• Efficiency: want SFE with communication comparable to insecure case

Page 5

Private Approximation

• [FIMNSW01]: A protocol computing an approximation G(a,b) of F(a,b) is private if each party can simulate its view of the protocol given the exact value F(a,b)
• Not sufficient to compute a non-private G(a,b) using SFE, since the output G(a,b) itself can leak information
• Example, with Δ(a,b) the Hamming distance:
  – Define G(a,b):
    • bin(G(a,b))_i = bin(Δ(a,b))_i if i > 0
    • bin(G(a,b))_0 = a_0
  – G(a,b) is a ±1-approximation of Δ(a,b), but not private
• Popular protocols for approximating Δ(a,b), e.g., [KOR98], are not private
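The example above can be checked with a minimal Python sketch. The function names `hamming` and `leaky_approx` and the sample inputs are illustrative assumptions, not part of the talk; the code only shows that two inputs with the same Hamming distance to b can yield different outputs of G, so the output reveals Alice's bit a_0.

```python
def hamming(a, b):
    """Exact Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))


def leaky_approx(a, b):
    """G(a,b): the Hamming distance with its lowest bit overwritten by a[0]."""
    d = hamming(a, b)
    return (d & ~1) | a[0]


# Hypothetical inputs: both choices of a are at Hamming distance 2 from b.
b = [0, 1, 1, 0]
a_with_zero = [0, 0, 1, 1]  # a[0] = 0
a_with_one = [1, 1, 1, 1]   # a[0] = 1

for a in (a_with_zero, a_with_one):
    g = leaky_approx(a, b)
    # G is always within 1 of the true distance ...
    assert abs(g - hamming(a, b)) <= 1
    # ... but its lowest bit is exactly Alice's first input bit.
    assert g & 1 == a[0]
```

A simulator that is given only Δ(a,b) = 2 cannot tell which of the two outputs to produce, which is why G is not a private approximation.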

Page 6

Approximating Hamming Distance

• [FIMNSW01]: A private protocol with complexity O~(n^{1/2}/ε)
  – Δ(a,b) small: compute Δ(a,b) exactly in O~(Δ(a,b)) bits
  – Δ(a,b) high: sample O~(n/Δ(a,b)) coordinates (a-b)_i, estimate Δ(a,b)

• Our main result:
  – Complexity: O~(1/ε^2) bits
  – Works even for the L2 norm, i.e., estimates ||a-b||_2 for a,b ∈ {1…M}^n

* O~() hides factors poly(k, log n, log M, log 1/δ)

Page 7

Crypto Tools

Efficient 1-out-of-n oblivious transfer (OT^n_1):
  – P1 has A[1] … A[n] ∈ {0,1}^m, P2 has i ∈ [n]
  – Goal: P2 privately learns A[i], P1 learns nothing
  – Can be done using O~(m) communication [CMS99, NP99]

Circuits with ROM [NN01] (augments [Yao86]):
  – Standard AND/OR/NOT gates
  – Lookup gates:
    • In: i
    • Out: M_gate[i]
  – Can just focus on privacy of the output

Communication at most O~(m|C|)

Page 8

High-dimensional tools

• Random projection:
  – Take a random orthonormal n×n matrix D, so that ||Dx|| = ||x|| for all x.
  – There exists c > 0 s.t. for any x ∈ R^n and any i = 1…n,
      Pr[ (Dx)_i^2 > k · ||Dx||^2 / n ] < e^{-ck}
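The two properties on this slide can be illustrated with a small standard-library sketch. It assumes one common way of sampling a random orthonormal matrix (Gram-Schmidt on Gaussian vectors); the helper names `random_orthonormal` and `rotate`, the dimension, and the seed are illustrative choices, not taken from the talk.

```python
import math
import random


def random_orthonormal(n, rng):
    """Sample n orthonormal rows by Gram-Schmidt on Gaussian vectors."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the rows chosen so far.
        for u in rows:
            dot = sum(x * y for x, y in zip(v, u))
            v = [x - dot * y for x, y in zip(v, u)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-9:  # retry in the (unlikely) degenerate case
            rows.append([x / norm for x in v])
    return rows


def rotate(D, x):
    """Compute the matrix-vector product Dx."""
    return [sum(d * xj for d, xj in zip(row, x)) for row in D]


rng = random.Random(0)
n = 64
D = random_orthonormal(n, rng)

# A vector with all of its mass on a single coordinate.
x = [10.0] + [0.0] * (n - 1)
y = rotate(D, x)

norm_sq = sum(v * v for v in y)                # equals ||x||^2 up to rounding
max_share = max(v * v for v in y) * n / norm_sq  # largest (Dx)_i^2 relative to ||Dx||^2 / n

print(norm_sq, max_share)
```

Before rotation the heaviest coordinate carries n times the average squared mass; after rotation `max_share` plays the role of k in the tail bound above and is typically a small multiple of log n.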

Page 9

Approximating ||a-b||

• Recall:
  – Alice has a ∈ [M]^n, Bob has b ∈ [M]^n
  – Goal: privately estimate ||a-b||; write x = a-b
  – Suffices to estimate ||a-b||^2

Page 10

Protocol Intuition

1. Alice and Bob agree upon a random orthonormal matrix D
   • Efficient by exchanging a seed of a PRG
2. Alice and Bob rotate vectors a,b, obtaining Da, Db
   • ||Da-Db|| = ||a-b||
   • D “spreads the mass” of the difference vector uniformly across the n coordinates.
   • Can now try obliviously sampling coordinates as in [FIMNSW01]

Page 11

Protocol Intuition Cont’d

1. Alice and Bob agree upon random orthonormal D
2. Alice and Bob rotate a,b, obtaining Da, Db
3. Use secure circuit with ROMs Da and Db to:
   i. Obtain (Da)_i and (Db)_i for many random indices i

Problem: Now what? Samples leak a lot of info!

Fix:
- Suppose you know an upper bound T with T ≥ ||a-b||^2
- For a uniformly random index i, flip a coin z with heads probability n((Da)_i – (Db)_i)^2/(kT)
- Then E[z] = n||Da-Db||^2/(nkT) = ||a-b||^2/(kT)
- E[z] only depends on ||a-b||, and z only depends on E[z]!
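The expectation computed in the fix can be verified directly. In the sketch below, `u` and `v` stand in for the rotated vectors Da and Db; their values, the parameters k and T, and the function names are illustrative assumptions. Raising an error when the heads probability exceeds 1 is also an assumption of this sketch: the talk relies on the spreading property of D to keep every coordinate light enough.

```python
import random


def heads_probability(u, v, i, k, T):
    """Heads probability for coordinate i: n * (u_i - v_i)^2 / (k * T)."""
    n = len(u)
    p = n * (u[i] - v[i]) ** 2 / (k * T)
    if p > 1.0:
        raise ValueError("coordinate too heavy for this choice of k and T")
    return p


def expected_coin(u, v, k, T):
    """E[z] when the index i is uniform over the n coordinates."""
    n = len(u)
    return sum(heads_probability(u, v, i, k, T) for i in range(n)) / n


def flip(u, v, k, T, rng):
    """Sample one coin z: pick a random coordinate, then flip with its bias."""
    i = rng.randrange(len(u))
    return 1 if rng.random() < heads_probability(u, v, i, k, T) else 0


u = [3.0, 1.0, 2.0, 2.0]  # hypothetical Da
v = [1.0, 2.0, 1.0, 3.0]  # hypothetical Db
k, T = 4, 8.0             # T is an upper bound on ||u - v||^2 = 7

dist_sq = sum((x - y) ** 2 for x, y in zip(u, v))
# Averaging over i cancels the factor n and leaves ||u - v||^2 / (k * T).
assert abs(expected_coin(u, v, k, T) - dist_sq / (k * T)) < 1e-12
```

The bias of an individual coordinate depends on the structure of the vectors, but the bias of z, averaged over the random index, depends only on the squared distance.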

Page 12

Protocol Intuition Cont’d

1. Alice and Bob agree upon random orthonormal D
2. Alice and Bob rotate a,b, obtaining Da, Db
3. Use secure circuit with ROMs Da, Db to:
   i. Obtain (Da)_i and (Db)_i for L random i
   ii. Generate Bernoulli z_1, …, z_L with E[z_i] = ||a-b||^2/(kT)
   iii. Output kT Σ_i z_i / L

Privacy: View only depends on ||a-b||

Problem: Correctness! A priori bound T = M^2 n, but ||a-b||^2 may be Θ(1), so Ω(n) samples required.

Fix: Private binary search on T
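The correctness problem with a loose bound T can be seen in a short simulation. This sketch draws the coins z_i directly from their Bernoulli distribution rather than through a secure circuit; the function name `estimate_fixed_T`, the parameter values, and the seed are assumptions chosen for illustration.

```python
import random


def estimate_fixed_T(dist_sq, k, T, L, rng):
    """Output k*T*sum(z_i)/L, where each z_i is Bernoulli(dist_sq / (k*T))."""
    p = dist_sq / (k * T)
    hits = sum(rng.random() < p for _ in range(L))
    return k * T * hits / L


rng = random.Random(1)
k, L = 8, 2000
dist_sq = 50.0  # hypothetical value of ||a-b||^2

# T within a small factor of the true squared distance: many coins land heads.
tight = estimate_fixed_T(dist_sq, k, T=64.0, L=L, rng=rng)

# T far too large: heads are so rare that the estimate is either 0 or a
# large multiple of k*T/L, nowhere near dist_sq.
loose = estimate_fixed_T(dist_sq, k, T=64.0e6, L=L, rng=rng)

print(tight, loose)
```

With the tight bound, each run sees on the order of a couple of hundred heads and the estimate concentrates around 50; with the loose bound, a useful estimate would need a number of samples proportional to T divided by the true squared distance.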

Page 13

Protocol Intuition Cont’d

3. Use secure circuit with ROMs Da, Db to:
   i. Obtain (Da)_i and (Db)_i for L random i
   ii. Generate Bernoulli z_1, …, z_L with E[z_i] = ||a-b||^2/(kT)
   iii. Output kT Σ_i z_i / L

Fix:
- Private binary search on T
- If many z_i = 0, then intuitively can replace T with T/2
- Eventually T = Θ~(||a-b||^2)
- We will show: final choice of T is simulatable!

Page 14

One last detail

• Want to show final choice of T is simulatable
• Estimate is kT Σ_i z_i / L and we stop when “many” z_i = 1
• Recall E[z_i] = ||a-b||^2/(kT)

Key Observation: Since orthonormal D is uniformly random, can guarantee that if many z_i = 0, then ||a-b||^2 << T.

Note:
- Suppose we didn’t use D, and a = (M, 0, …, 0), b = (0, …, 0)
- Then ||a-b||^2 = M^2 is large, but almost always z_i = 0, so you’ll choose T < ||a-b||^2.
- Not simulatable since T depends on the structure of a, b
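The note above can be made concrete by comparing a concentrated difference vector with a perfectly spread one of the same norm. This is a sketch under stated assumptions: the flat vector stands in for an idealized rotated vector, heads probabilities above 1 are clamped to 1, and the name `all_zero_probability` and all parameter values are illustrative.

```python
import math


def all_zero_probability(x, k, T, L):
    """Probability that L coins, each on a uniformly random coordinate, are all 0."""
    n = len(x)
    # Per-coordinate heads probability n * x_i^2 / (k * T), clamped to 1
    # (the clamp is an assumption of this sketch).
    per_index = [min(1.0, n * xi * xi / (k * T)) for xi in x]
    p_heads = sum(per_index) / n
    return (1.0 - p_heads) ** L


n, M = 1024, 32.0
k, T, L = 4, 2048.0, 100

raw = [M] + [0.0] * (n - 1)     # a - b = (M, 0, ..., 0), squared norm M^2 = 1024
flat = [M / math.sqrt(n)] * n   # same squared norm, spread evenly

p_raw = all_zero_probability(raw, k, T, L)
p_flat = all_zero_probability(flat, k, T, L)

print(p_raw, p_flat)
```

Both vectors have the same distance, yet without spreading the protocol would usually see only zeros and keep halving T, while with spreading it almost never does. The stopping point therefore leaks structure unless D is applied first.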

Page 15

Algorithm vs. Simulation

SIMULATION
• Repeat
  – Generate L independent bits z_i such that Pr[z_i = 1] = ||a-b||^2/(Tk)
  – T = T/2
• Until Σ_i z_i ≥ Ω(L/k)
• Output E = (Σ_i z_i / L) · 2Tk as an estimate of ||a-b||^2

ALGORITHM
• Repeat
  – Generate L independent bits z_i such that Pr[z_i = 1] = ||D(a-b)||^2/(Tk)
  – T = T/2
• Until Σ_i z_i ≥ Ω(L/k)
• Output E = (Σ_i z_i / L) · 2Tk as an estimate of ||a-b||^2

Recall: ||D(a-b)|| = ||a-b||

Communication = O~(L) = O~(1/ε^2)
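The loop shared by the algorithm and the simulation can be sketched in the clear as follows. The coins are drawn directly from their Bernoulli distribution, so this models the simulator's side; the stopping constant `threshold_const`, the clamp on the probability, the rejection of a zero distance, and all parameter values are assumptions of this sketch, since the slide only specifies a threshold of order L/k.

```python
import random


def halving_estimate(dist_sq, T0, k, L, rng, threshold_const=0.25):
    """Halve T until enough coins land heads, then output (sum z_i / L) * 2Tk."""
    if dist_sq <= 0:
        raise ValueError("this sketch assumes a positive squared distance")
    T = T0
    while True:
        p = min(1.0, dist_sq / (T * k))  # Pr[z_i = 1], clamped (assumption)
        hits = sum(rng.random() < p for _ in range(L))
        T = T / 2
        if hits >= threshold_const * L / k:
            # T was already halved, so 2*T*k is the scale used for this round.
            return hits / L * 2 * T * k


rng = random.Random(7)
dist_sq = 50.0        # hypothetical ||a-b||^2
T0 = float(2 ** 20)   # hypothetical a priori bound, e.g. M^2 * n
estimate = halving_estimate(dist_sq, T0, k=8, L=4000, rng=rng)
print(estimate)
```

Because the function takes only the squared distance as input, running it with ||D(a-b)||^2 or with ||a-b||^2 gives identically distributed outputs, which is the point of the side-by-side comparison on this slide.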

Page 16

Other Results

• Use homomorphic encryption tricks to get better upper bounds for private nearest neighbor and private all-pairs nearest neighbors.

• Define the private approximate nearest neighbor problem:

– Requires a new definition of private approximations for functionalities that can return sets of values.

– Achieve small communication in this setting.