When Are LDCs a False Promise? Moni Naor Weizmann Institute of Science.


When Are LDCs a False Promise?

Moni Naor

Weizmann Institute of Science

Talk Based on:

• The Complexity of Online Memory Checking [Naor and Rothblum]
• Fault Tolerant Storage and Quorum Systems [Nadav and Naor]
• On the Compressibility of NP Instances and Cryptographic Applications [Harnik and Naor]

Theme: cases where LDCs should be helpful, but are
• either provably not helpful
• or an open problem

Authentication

Verifying that a string has not been modified
– Central problem in cryptography
– Many variants

Our Setting:
• User works on a large file residing on a remote server
• User stores a small secret "fingerprint" (hash) of the file
– Used to detect corruption
• What is the size of the fingerprint?
– A well-understood problem

Online Memory Checking

Problem with the model:
• What if we don't want to read the entire file?
• What if we only want a small part? Read the entire file?!

Idea: Don't verify the entire file, verify what you need!
– How much of the file do you read per authenticated bit?
– How large a fingerprint do you need?

Online Memory Checkers

• User makes store and retrieve requests to memory
– Memory is a vector in {0,1}^n under the adversary's control
• Checker checks: answer to retrieve = last stored value
• Checker:
– Has secret reliable memory: space complexity s(n)
– Makes its own reads/writes: query complexity q(n)

Want small s(n) and small q(n)!

[Figure: the user sends store(i,b) and retrieve(i) requests to the memory checker C and receives b; C keeps a secret memory of s(n) bits and reads/writes q(n) bits of public memory per request.]

Memory Checker Requirements:

For ANY sequence of user requests and ANY responses from public memory:
• Completeness: If every read from public memory = last write
– Guarantee: user retrieve = last store (w.h.p.)
• Soundness: If some read from public memory ≠ last write
– Guarantee: user retrieve = last store or BUG (w.h.p.)

[Figure: on retrieve(i), the memory checker C, holding s(n) bits of secret memory, reads public memory and returns either b or BUG.]

[Blum, Evans, Gemmell, Kannan and Naor 1991]

Offline Memory Checkers:
• Detect errors only at the end of a long request sequence
• q(n) = O(1) (amortized), s(n) = O(log n)
• No crypto assumptions!

Online Memory Checkers:

Past Results:
• With one-way functions: q(n) = O(log n), s(n) = n^ε (for any ε > 0)
– Are they necessary?!
• No computational assumptions: q(n) arbitrary (any query complexity), s(n) = O(n/q(n)), so s(n) x q(n) = O(n)
– Very simple (in chunks)

Other Results:
• Optimal [Gemmell Naor 92]
• Must be invasive [Ajtai 2003]
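To make the q(n) = O(log n) regime concrete, here is a minimal sketch of a hash-tree online checker. It is an illustration under assumptions, not necessarily the construction the cited results use: SHA-256 stands in for the cryptographic primitive, the class and method names are hypothetical, and the query cost is counted in digests (one per tree level) rather than bits. The only secret state is the root digest.

```python
import hashlib


def _leaf(bit):
    # Digest of a single file bit (domain-separated from inner nodes).
    return hashlib.sha256(b"leaf" + bytes([bit])).digest()


def _node(left, right):
    # Digest of an inner node from its two children.
    return hashlib.sha256(b"node" + left + right).digest()


class HashTreeChecker:
    """Hypothetical hash-tree checker: secret memory is one root digest."""

    def __init__(self, n):
        self.n = n
        # Public (untrusted) memory: the file bits and all tree nodes.
        self.bits = [0] * n
        self.nodes = [b""] * (2 * n)          # nodes[1] is the root
        for i in range(n):
            self.nodes[n + i] = _leaf(0)      # leaf i lives at index n + i
        for v in range(n - 1, 0, -1):
            self.nodes[v] = _node(self.nodes[2 * v], self.nodes[2 * v + 1])
        # Secret (reliable) memory: just the root.
        self.root = self.nodes[1]

    def _siblings(self, i):
        # Read one sibling digest per level from public memory.
        v, out = self.n + i, []
        while v > 1:
            out.append(self.nodes[v ^ 1])
            v //= 2
        return out

    def _fold(self, i, bit, siblings):
        # Recompute the digests on the path from leaf i up to the root.
        v, digest = self.n + i, _leaf(bit)
        path = [digest]
        for sib in siblings:
            digest = _node(digest, sib) if v % 2 == 0 else _node(sib, digest)
            path.append(digest)
            v //= 2
        return path

    def retrieve(self, i):
        bit = self.bits[i]
        if self._fold(i, bit, self._siblings(i))[-1] != self.root:
            return "BUG"
        return bit

    def store(self, i, b):
        sibs = self._siblings(i)              # read siblings once
        if self._fold(i, self.bits[i], sibs)[-1] != self.root:
            return "BUG"                      # path was tampered with
        new_path = self._fold(i, b, sibs)     # reuse the verified siblings
        self.bits[i] = b
        v = self.n + i
        for digest in new_path:               # write back the updated path
            self.nodes[v] = digest
            v //= 2
        self.root = new_path[-1]
        return "OK"
```

Usage: after `c = HashTreeChecker(8)` and `c.store(3, 1)`, a call to `c.retrieve(3)` returns the stored bit; if the public copy `c.bits[3]` is then overwritten behind the checker's back, `c.retrieve(3)` returns `"BUG"`.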

Authenticators

Memory checkers allow reliable local decodability. What about reliable local testability?

Authenticators:
• Encode the file x ∈ {0,1}^n into:
– a large public encoding p_x
– a small secret encoding s_x. Space complexity: s(n)
• Decoding algorithm D:
– Receives a public encoding p and decodes it into a vector x ∈ {0,1}^n
• Consistency verifier: checks (repeatedly) whether the public encoding was (significantly) corrupted, reading only a few bits: t(n)
– If not corrupted: verifier should output "Ok"
– If verifier outputs "Ok", decoder can (w.h.p.) retrieve the file

Good example: Reed Solomon

Pretty Good Authenticator with computational assumptions

• Idea: encode file X using a good error-correcting code C
– Actually erasures are more relevant
– As long as a certain fraction of the symbols of C(X) is available, can decode X
• Add to each symbol a tag F_k(a,i), a function of
– secret information k ∈ {0,1}^s, seed of a PRF
– symbol a ∈ Σ
– location i
• Verifier picks a random location i, reads symbol a and tag t
– Checks whether t = F_k(a,i) and rejects if not
• Decoding process removes all symbols with inappropriate tags and uses the decoding procedure of C
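The tagging scheme above can be sketched as follows. This is a minimal illustration under assumptions: HMAC-SHA256 stands in for the PRF F_k, the codeword C(X) is taken as a given list of byte-string symbols (no particular code is implemented), and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def tag(key, symbol, index):
    # F_k(a, i): PRF of the symbol together with its location.
    msg = index.to_bytes(8, "big") + symbol
    return hmac.new(key, msg, hashlib.sha256).digest()


def encode_public(key, codeword):
    # Public encoding: every symbol of C(X) paired with its tag.
    return [(a, tag(key, a, i)) for i, a in enumerate(codeword)]


def verify(key, public, index=None):
    # Verifier: read one (symbol, tag) pair at a random location and check it.
    if index is None:
        index = secrets.randbelow(len(public))
    a, t = public[index]
    return hmac.compare_digest(t, tag(key, a, index))


def surviving_symbols(key, public):
    # First decoding step: symbols with bad tags become erasures (None);
    # the erasure decoder of C would then run on the result.
    return [
        a if hmac.compare_digest(t, tag(key, a, i)) else None
        for i, (a, t) in enumerate(public)
    ]
```

The design point is that a forged symbol is turned into an erasure rather than an error, which is why erasure correction is the relevant property of C.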

Memory Checker ⇒ Authenticator

If there exists an online memory checker with
– space complexity s(n)
– query complexity t(n)
then there exists an authenticator with
– space complexity O(s(n))
– query complexity O(t(n))

Idea: Use a high-distance code

Improve the Information Theoretic Upper Bound(s)?

Maybe we can use:

Locally Decodable Codes?

Locally Testable Codes?

PCPs of proximity?

The Lower Bound

Theorem 1 [Tight lower bound]: For any online memory checker secure against a computationally unbounded adversary,

s(n) x q(n) = Ω(n)

True also for authenticators

Memory Checkers and One-Way Functions

Breaking the lower bound implies one-way functions.

Theorem 2: If there exists an online memory checker
– working in polynomial time,
– secure against polynomial-time adversaries,
– with query and space complexity s(n) x q(n) < c · n (for a constant c > 0),
then there exist functions that are hard to invert for infinitely many input lengths ("almost one-way" functions).

This Talk:

• Will not say much about the proof
– It is involved

• Initial insight: connection to the simultaneous message model

Simultaneous Messages Protocols [Yao 1979]

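• For the equality function:
– |m_A| + |m_B| = Ω(√n) [Newman Szegedy 1996]
– |m_A| x |m_B| = Ω(n) [Babai Kimmel 1997]

[Figure: ALICE holds x ∈ {0,1}^n and BOB holds y ∈ {0,1}^n; each sends one message (m_A, m_B) to CAROL, who outputs f(x,y), here whether x = y.]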

Ingredients for Full Proof:

• Consecutive Messages Model: generalized communication complexity lower bound.
• Adversary "learns" public memory access distribution: Learning Adaptively Changing Distributions [NR06].
• "Bait and Switch" technique: handles adaptive checkers.
• One-way functions: breaking the generalized communication complexity lower bound in a computational setting requires one-way functions.

Conclusions for OMC

• Settled the complexity of online memory checking
• Characterized the computational assumptions required for good online memory checkers

Open Questions:
• Do we need logarithmic query complexity for online memory checking with computational assumptions?
• Understanding relationships of crypto/complexity objects
• Quantum memory checkers?

LDC


Goal
• Distributed file storage system
– Peer-to-peer environment
– Processors join and leave the system continuously
– Want to be able to store and retrieve files distributively
• Partial Solutions
– Distributed file sharing applications [Gnutella, Kazaa]
– Distributed Hash Tables [DH, Chord, Viceroy]: store (key, value) pairs and perform lookup on key

Fault-Tolerant Storage System
• Censor
– Aims to eliminate access to some files
– Can take down some servers
• Design Goal:
– A reader should be able to reconstruct each file with high probability even after faults have occurred
– Probability taken over the coins of the writer and reader

Adversarial Behavior

• How are the faulty processors chosen? What is the influence of the adversary?
• Type of faults
– Complete/partial control

Adversarial Model
• Adversary chooses the set of processors to crash
• Different degrees of adaptiveness
– Non-adaptive adversary: choice of faulty processors is not based on their content
– Adversary with a limited number of queries: may query some processors
• Fail-stop failures
– We do not consider Byzantine failures

Other Fault Models
• Random faults model:
– Examples: Distance Halving DHT, Chord
– Standard technique: replication to log(n) processors assures survival with high probability
• Adversarial faults [Fiat, Saia]
– Large fraction accessible after the adversary crashes a linear fraction of the processors
• Still, a censor can target a specific file

Measures of Quality
• Read/Write complexity:
– Average number of processors accessed during a read/write operation
• Number of rounds:
– Number of rounds required from an adaptive reader
• Blowup Ratio:
– Ratio between the total number of bits used for the storage of a file and its size

Connection to LDC

• If you are willing to have high write complexity:

• Can encode ALL the data with an LDC

• Parameters of the LDC determine how good the data storage is

Probabilistic storage system based on an intersecting quorum system

• Storage System:
– To store a file: pick a set of processors of size on the order of √n uniformly at random and replicate the file to all members of the quorum set
– Retrieval: choose a random set of size on the order of √n and probe its members
– Intersection follows from the birthday paradox
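The birthday-paradox claim can be checked empirically. The following is a small simulation sketch under assumptions: both the write set and the read set have size c·√n, where the constant c and the function name are hypothetical choices for illustration. For c·√n much smaller than n, the miss probability is roughly e^(-c²).

```python
import math
import random


def intersection_rate(n, c=2.0, trials=2000, seed=0):
    """Fraction of trials in which a random read set hits a random write set."""
    rng = random.Random(seed)
    k = min(n, math.ceil(c * math.sqrt(n)))   # quorum size, about c * sqrt(n)
    hits = 0
    for _ in range(trials):
        write_set = set(rng.sample(range(n), k))   # processors holding the file
        read_set = rng.sample(range(n), k)         # processors probed by the reader
        if any(p in write_set for p in read_set):
            hits += 1
    return hits / trials
```

With n = 10000 and c = 2 the reader finds a replica in the large majority of trials, while a much smaller constant such as c = 0.3 makes misses the common case, which is why both sets need size on the order of √n.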

Properties of the Probabilistic Storage System
• Pros:
– Simplicity
– Resilient against a linear number of faults, even if the processors are chosen by the adversary adaptively
– Adapted to a dynamic environment [Abraham, Malkhi]
• Cons:
– High read/write complexity
– High blowup ratio

Want a storage system with better parameters

Non-adaptive readers are wasteful!

• Non-adaptive reader:
– Processors are chosen without accessing any processor

Theorem: A fault-tolerant storage system, in the non-adaptive reader model, resilient against Θ(n) faults, cannot do better than the ε-intersecting storage system example.
• Read Complexity x Write Complexity is Ω(n)
• Blowup Ratio is Ω(√n)

Open Question

• Do the lower bounds for the case when both the reader and the adversary are non-adaptive hold when both are fully adaptive?

E for Effort


The Problem

Is it possible to have an efficient procedure that:
• Given CNF formulae φ1 and φ2 on the same variables and of the same length, comes up with a CNF formula that is:
1. Satisfiable if and only if φ1 ∨ φ2 is satisfiable
2. Shorter than |φ1|+|φ2|
– Sufficiently short to apply recursively: (1-ε)(|φ1|+|φ2|)

If yes: There is a construction of Collision Resistant Hash functions (CRH) from any one-way function (OWF)
– No "black box" construction of CRH from OWF [Simon98]
– Construction uses the code of the one-way function

If no: there is hope for:
• Efficient everlasting encryption in the hybrid bounded storage model
• Forward-Secure Storage [Dziembowski]
• Derandomization of Sampling [Dubrov-Ishai]

No Witness Retrievable Compression
• Given CNF formulae φ1 and φ2 on the same variables, come up with a formula that is:
1. Satisfiable if and only if φ1 ∨ φ2 is satisfiable
2. Shorter than |φ1|+|φ2|

Claim: if one-way functions exist, then a witness (satisfying assignment) for either φ1 or φ2 cannot efficiently yield a witness for the compressed formula.

Most natural ideas are witness retrievable.

Proof intuition based on broadcast encryption lower bounds

Garey and Johnson, 1979

I can't find an algorithm for the problem

Maybe I can approximate it

Solve it for some fixed parameters

Find an algorithm that usually works?

Solve it in time 2n

Could we just postpone it ?

Approaches for dealing with NP-complete problems:
• Approximation algorithms
• Sub-exponential time algorithms
• Parameterized complexity
• Average case complexity
• Save it for the future

Verdict on LDCs?

Uncompressed paper on compressibility: www.wisdom.weizmann.ac.il/~naor/PAPERS/compressibility.html

Compressed version FOCS 2006

THE END

Thank You

Slides for the Proof of OMC

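Consecutive Messages Protocols

Theorem (lower bound for CM protocols): For any equality protocol, as long as |m_P| ≤ n/100,

|m_A| x |m_B| = Ω(n)

[Figure: ALICE holds x ∈ {0,1}^n and BOB holds y ∈ {0,1}^n; the messages m_A, m_P (public) and m_B are sent consecutively, and CAROL decides whether x = y.]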

Program for This Talk:

• Define online memory checkers
• Review some past results
• Describe new results
• Proof sketch:
– Define communication complexity model
– Sketch lower bound for a simple case
– Ideas for extending to the general case

The Reduction

Use an online memory checker to construct a consecutive messages equality protocol:

Online Checker ⇒ (Reduction) ⇒ Equality Protocol
• Space: s(n) ⇒ Alice msg: s(n)
• Query: q(n) ⇒ Bob msg: O(q(n))

Conclusion: s(n) x q(n) = Ω(n) (from the communication complexity lower bound)

Simplifying Assumption

Assumption: checker chooses indices to read from public memory independently of secret memory (with loss of generality)

Checker Operation:
1. Get an index i in the original file
2. Choose which indices to read from the public memory, and read them
3. Get the secret memory
4. Retrieve the i-th bit or say BUG

The Reduction: Outline

Use online memory checker: construct a "random index" protocol, in which Bob chooses a random index i:
• If x = y, then Carol accepts
• If x_i ≠ y_i, then Carol rejects
Use the online checker to build this protocol.

Use error-correcting code: go from "random index" to equality testing:
• Alice and Bob encode their inputs and run the "random index" protocol
• If Alice's and Bob's inputs differ at even one index, their encodings differ at many indices
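The distance-amplification step can be illustrated with a toy code. This sketch uses the Hadamard code purely because it is short to write and has relative distance exactly 1/2; it is an assumption made for illustration, since the talk does not name the code, and its exponential length makes it unsuitable for the actual reduction. The function names are hypothetical.

```python
from itertools import product


def hadamard_encode(x):
    """Encode a bit tuple x as the inner products <x, a> mod 2 over all a."""
    n = len(x)
    return [
        sum(xi & ai for xi, ai in zip(x, a)) % 2
        for a in product((0, 1), repeat=n)
    ]


def disagreement_fraction(x, y):
    """Fraction of codeword positions on which the encodings of x and y differ."""
    cx, cy = hadamard_encode(x), hadamard_encode(y)
    return sum(u != v for u, v in zip(cx, cy)) / len(cx)
```

For inputs that differ in a single bit, such as (1,0,1,1) and (1,0,1,0), the encodings differ on half of all positions, so a uniformly random index exposes the difference with probability 1/2 instead of 1/4.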

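[Figure: the "random index" protocol. ALICE holds x ∈ {0,1}^n, runs the checker on store(x) to obtain secret memory S(x) and public memory P(x), and sends S(x) (s(n) bits) to CAROL. BOB holds y ∈ {0,1}^n, runs the checker on store(y) to obtain secret memory S(y) and public memory P(y), gets a random index i, runs retrieve(i), and sends i, y_i and the bits read for Carol (q(n)+1 bits). CAROL runs retrieve(i) with secret memory S(x) on these bits to obtain C_i = x_i/BUG and accepts if y_i = C_i. x = y ⇒ accept; x_i ≠ y_i ⇒ reject.]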

WANT: An adversary that can find bad x, y for the protocol can be used to find bad x, P(y), i for the memory checker

PROBLEM: Protocol adversary sees the randomness!

SOLUTION: Re-randomize!
• Alice re-computes S(x) with different randomness
• New S(x) is independent of the public randomness (given P(x))
• Requires an exponential-time Alice

Conclusion [Weak Theorem]:

For “restricted” online memory checkers

s(n) x q(n) = Ω(n)


Recall Simplifying Assumption

Assumption: checker chooses indices to read from public memory independently of secret memory

Do we really need the assumption?

Idea: If checker uses secret memory to choose indices, Adversary learns something about the secret memory from indices the checker reads.

Access Pattern Distribution

For a retrieve request:
• Access Pattern: bits of public memory accessed by the checker
• Access Pattern Distribution: distribution of the checker's access pattern (given its secret memory)
– Randomness: over the checker's coin tosses

Where Do We Go From Here?

Observation: If the adversary doesn't know the access pattern distribution, then the checker is "home free".

Lesson for adversary: Activate the checker many times, "learn" its access pattern distribution!

[NR05]: Learning to Impersonate.

Learning The Access Pattern Distribution

Theorem (Corollary from [NR05])

Learning algorithm for adversary:
– Adversary stores x; the checker's secret memory is s
– Adversary makes O(s(n)) retrieves
– p: final public memory (after the stores and retrieves)
– Adversary learns L, can generate distribution D_L(p)
– "Real" distribution is D_S(p)

Guarantee: With high probability, the distributions D_L(p) and D_S(p) are ε-close. L is of size O(q(n) x s(n)) bits.

Guarantee is only for the public memory p reached by the checker!

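[Figure: the protocol with learning. ALICE holds x ∈ {0,1}^n, runs the checker on store(x) and retrieves to obtain secret memory S(x) and public memory P(x), and sends S(x) (s(n) bits) to CAROL. BOB holds y ∈ {0,1}^n, runs the checker on store(y) and retrieves to obtain S(y) and P(y), gets a random index i, runs retrieve(i), and sends i, y_i and the bits for Carol (q(n)+1 bits). The learner is run with public coins on one side and with the same coins on the other, so both obtain the learned L (O(s(n) x q(n)) bits). CAROL computes C_i = x_i/BUG using S(x) and accepts if y_i = C_i. x = y ⇒ accept; x_i ≠ y_i ⇒ reject.]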

Completeness: An adversary that finds x s.t. Carol rejects when Alice's AND Bob's inputs are x also fools the memory checker.
– Access pattern distributions by "real" S and "learned" L are close on P(x).
– Protocol adversary sees L, checker adversary learns it!

Soundness: An adversary that finds x ≠ y s.t. Carol doesn't reject also fools the memory checker.

Does this work???

PROBLEM: distributions by "real" S and "learned" L are close on the original P(x)! They may be very far on P(y)!

Does it Work?

• Will the protocol work when y ≠ x?
• No! Big problem for the adversary: it can learn the access pattern distribution on correct and unmodified public memory, but really wants the distribution on different, modified memory!
• Learned information L may be:
– Good on unmodified memory (D_L(P(x)), D_S(P(x)) close)
– Bad on modified memory (D_L(P(y)), D_S(P(y)) far)
• Can't hope to learn the distribution on modified public memory

Bait and Switch

Carol knows S and L. If only she could check whether D_L(P(y)) and D_S(P(y)) are ε-close…

• If far: P(y) ≠ P(x) (not the "real" public memory)! Reject!
• If close: OK for Bob to use L for the access pattern!

Bob always uses L to determine the access pattern. This is a "weakening" of the checker.

Bait and Switch: Carol Approximates the Distance

Main Observation: Carol (computationally unbounded) can compute the probabilities of any access pattern for which all the bits read from P(y) are known (probabilities by both D_L(P(y)) and D_S(P(y))).

Solution: Sample O(1) access patterns by D_L(P(y)), use them to approximate the distance between the distributions.

In the protocol, Bob sends these samples to Carol, and she approximates the distance.
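One way to approximate the distance from samples is sketched below. It is an assumption about how the estimate could be formed, not the talk's stated procedure. It uses the identity TV(D_L, D_S) = E over a drawn from D_L of max(0, 1 - D_S(a)/D_L(a)), which needs only samples from D_L and the two probabilities of each sampled pattern. The function name and the dictionary representation of the distributions are hypothetical, and the number of samples is left as a parameter.

```python
import random


def estimate_distance(d_learned, d_real, samples, rng):
    """Estimate the statistical distance between two finite distributions.

    d_learned, d_real: dicts mapping an access pattern to its probability.
    Patterns are sampled from d_learned only; for each sampled pattern the
    estimator needs just its probability under both distributions.
    """
    patterns = [p for p, w in d_learned.items() if w > 0]
    weights = [d_learned[p] for p in patterns]
    total = 0.0
    for a in rng.choices(patterns, weights=weights, k=samples):
        ratio = d_real.get(a, 0.0) / d_learned[a]
        total += max(0.0, 1.0 - ratio)   # mass D_L puts on a beyond D_S
    return total / samples
```

Identical distributions give an estimate of exactly 0, and distributions with disjoint supports give exactly 1; in between, the estimate concentrates around the true distance as the number of samples grows.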

Putting It Together

From any memory checker, we get a CM protocol for equality with:

• Public message: length O(s(n) x q(n))
• Alice message: length s(n)
• Bob message: length O(q(n))

Conclusion: s(n) x q(n) = Ω(n)
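The arithmetic behind this conclusion can be written out as follows. This is a sketch that assumes the CM lower bound in the form stated earlier in the proof slides (public message of length at most n/100) and ignores the constant-factor blowup from the error-correcting code; the constant c is introduced here for illustration.

```latex
\begin{align*}
|m_P| &= O\bigl(s(n)\cdot q(n)\bigr), \qquad
|m_A| = s(n), \qquad
|m_B| = O\bigl(q(n)\bigr). \\
\intertext{Suppose $s(n)\cdot q(n) \le c\,n$ for a sufficiently small constant $c > 0$.
Then $|m_P| \le n/100$, so the lower bound for CM equality protocols applies:}
|m_A| \cdot |m_B| &= \Omega(n)
\;\Longrightarrow\;
s(n)\cdot O\bigl(q(n)\bigr) = \Omega(n)
\;\Longrightarrow\;
s(n)\cdot q(n) = \Omega(n).
\end{align*}
```

In the remaining case, s(n) x q(n) > c·n, the bound holds trivially, so s(n) x q(n) = Ω(n) either way.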
