Private Information Retrieval

21
Private Information Retrieval Based on the talk by Yuval Ishai, Eyal Kushilevitz, Tal Malkin

description

Private Information Retrieval. Based on the talk by Yuval Ishai, Eyal Kushilevitz, Tal Malkin. Outline. Introduction Common approaches Information-theoretical Computational Summary. Private Information Retrieval (PIR):assumptions. Semi-honest assumption on servers - PowerPoint PPT Presentation

Transcript of Private Information Retrieval

Page 1: Private Information Retrieval

Private Information Retrieval

Based on the talk by

Yuval Ishai, Eyal Kushilevitz, Tal Malkin

Page 2: Private Information Retrieval

Outline Introduction Common approaches

Information-theoretical Computational

Summary

Page 3: Private Information Retrieval

Private Information Retrieval (PIR):assumptions

Semi-honest assumption on servers Server is trustable in terms of honestly

following the protocol Server knows every bit of the data Server may record client’s

requests/queries

Malicious servers Drop messages Change messages Collude with other parties

Page 4: Private Information Retrieval

Private Information Retrieval (PIR): intro

Goal: allow user to query database while hiding the identity of the data-items she is after.

Note: hides identity of data-items; not the existence of interaction with the user.

Motivation: patent databases; stock quotes; web access; many more....

Paradox(?): imagine buying in a store without the seller knowing what you buy.

(Encrypting requests is useful against third parties; not against the owner of data.)

Page 5: Private Information Retrieval

Modeling Server: holds n-bit string x n should be thought of as very large

User: wishes to retrieve xi

and to keep i private

Remark: it is the most basic version; the building block for involved retrieval.

Page 6: Private Information Retrieval

Server sends entire database x to User.

Information theoretic privacy.

Communication: n

SERVER

xi

USER

x =x1,x2 , . . ., xn

x1,x2 , . . ., xn

Trivial Private Protocol

Is this optimal?

Page 7: Private Information Retrieval

Obstacle

Theorem [CGKS]: In any 1-server PIR with informationtheoretic privacy the communication is

at least n.

Information theoretic privacy/security:The ciphertext gives no information about the

plaintext

Page 8: Private Information Retrieval

More “solutions” User asks for additional random indices.

Pick a few random indices to hide the real one Drawback: can be estimated

Employ general crypto protocols to compute xi privately. 1-out-N Oblivious Transfer Drawback: highly inefficient (polynomial in n).

Anonymity (e.g., via Anonymizers). Note: address different problems: hides

identity of user; not the fact that xi is retrieved.

Page 9: Private Information Retrieval

Two Approaches

Information-Theoretic PIR [CGKS95,Amb97,...] Replicate database among k servers.

servers do collude.

Computational PIR [CG97,KO97,CMS99,...] Computational privacy, based on cryptographic assumptions – NP hard to break the approach

Page 10: Private Information Retrieval

Known Communication Upper Bounds

Multiple servers, information-theoretic PIR: 2 servers, comm. n1/3 [CGKS95]

k servers, comm. n1/(k) [CGKS95, Amb96,…,BIKR02]

log n servers, comm. Poly( log(n) ) [BF90, CGKS95]

Single server, computational PIR: Comm. Poly( log(n) ), n is the # of items Under appropriate computational assumptions

[KO97,CMS99]

Page 11: Private Information Retrieval

Approach I: k-Server PIR

Correctness: User obtains xi

Privacy: No single server gets information about i

U

S1

x {0,1}n

S2x {0,1}n

i

x {0,1}n Sk

Page 12: Private Information Retrieval

Information-Theoretic 2-Server PIR

Best Known Protocol: comm. n1/3 Let’s look at an example with comm.

cost n1/2

Two protocols:Protocol I: n bit queries, 1 bit answersProtocol II: n1/2 bit queries, n1/2 bit

answers

Page 13: Private Information Retrieval

Protocol I: 2-server O(n) PIR

S2

i

U

i

n

Q1{1,…,n}

S1

Q2=Q1 i

0 0 1 1 0 011 10 00

*User sent O(n) bits

1 + 1 = 0

0 + 0 = 0

1 + 0 = 1

Meaning of Q2=Q1 i

Q1 is a random subset

Page 14: Private Information Retrieval

Protocol I: 2-server PIR

S2

i

U

i

n

Q1 {1,…,n}

S1

11

Qa x

Q2=Q1 i

0 1 0 0 1 1 0 1 0 0 010

*Server replies 1 bit

Page 15: Private Information Retrieval

Protocol I: 2-server PIR

S2

i

U

i

n

Q1 {1,…,n}

S1

a1a2=xi

11

Qa x

2

2Q

a x

Q2=Q1 i

0 1 0 0 1 0 1 0 0 01 110

Page 16: Private Information Retrieval

Protocol II: PIR with O(n1/2) Communication

S2

j,i

U

i

X

m=n1/2

m=n1/2

Q1 {1,…,m}

S1

1,1 1,2 1,, ,..., ma a a

Q2=Q1 i

j

2,1 2,2 2,, ,..., ma a a

1, ja

1,2a1,1a

a1,ja2,j=xj,i

Make the n-bit vector as a n1/2 * n1/2 matrix

Apply ex-or sum to each row

Page 17: Private Information Retrieval

Computational PIR with O(n1/2) Comm. Based on encryption

Quadratic Residue (QR)

N = p*q , p,q are primes.Q(y, N) = 0 iff exists w such that w^2 = y (mod N) 1 otherwise

Security:

if p, q is unknown, it is computationally impossible

to determine Q(y,N).

If p,q is known, Q(y,N) can be determined in O(|N|^3)

Understanding modulo: w^2 = y+k*N, k can be any integer

Example: 2^2 = 4 (mod 10), 4^2 = 6 (mod 10)

Page 18: Private Information Retrieval

Example Quadratic Residue:

E(0) QRN

E(1) NQRN

Properties:

QR QR = QR

NQR QR = NQR

For any y, y^2 is QR.

Page 19: Private Information Retrieval

Computational PIR with O(n1/2) Comm.

0 1 1 0 1 1 1 0 1 1 0 0 0 0 0 1

n1/2

n1/2

b

Goal: user wants to know entry M(a,b)

1.User picks N=pq and sends N to server

2.User picks uniformly at random s=n1/2 numbers, from the set Z={x|1<=x<=N, gcd(N, x)=1}, such that yb is NQR and yj (j!=b) is QR, and sends them to server

3. For each row r, server calculateW(r,j) = yj

2 if M(r,j) =0, yj if M(r,j)=1

a

),(1

jrwZt

jr

5. Zr is a QR iff M(r,b) =0

4. Server sends Z1,…,ZsTo User, and User checks with Za is QR

Page 20: Private Information Retrieval

Related work Can be a building block for high-level

security protocols Has connection with “Locally

Decodable Codes” (LDC) It was proved that the three

problems: PIR, LDC, and Circuit-based SMC are equivalent.

Page 21: Private Information Retrieval

Summary

Focus so far: communication complexityObstacle: time complexity All existing protocols require high

computation by the servers (linear computation per query).

Are there methods to reduce server cost?