Post on 30-Dec-2015
Talk based on:
• The Complexity of Online Memory Checking [Naor and Rothblum]
• Fault Tolerant Storage And Quorum Systems [Nadav and Naor]
• On the Compressibility of NP Instances and Cryptographic Applications [Harnik and Naor]
Theme: cases where LDCs should be helpful, but
• either are provably not helpful
• or it remains an open problem
Authentication
Verifying that a string has not been modified
– Central problem in cryptography
– Many variants
Our Setting:
• User works on a large file residing on a remote server
• User stores a small secret ‘fingerprint’ (hash) of the file
– Used to detect corruption
• What is the size of the fingerprint?
– A well-understood problem
Online Memory Checking
Problem with the model:
What if we don’t want to read the entire file?
What if we only want a small part? Read the entire file?!
Idea: Don’t verify the entire file, verify what you need!
– How much of the file do you read per authenticated bit?
– How large a fingerprint do you need?
Online Memory Checkers
User makes store and retrieve requests to memory:
a vector in {0,1}^n under the adversary’s control
Checker checks: answer to retrieve = last stored value
Checker:
– Has secret reliable memory: space complexity s(n)
– Makes its own reads/writes: query complexity q(n)
Want small s(n) and small q(n)!
[Figure: the user sends store(i,b) and retrieve(i) requests to the memory checker C and receives b. C keeps s(n) bits of secret memory and reads/writes (R/W) q(n) bits of public memory.]
Memory Checker Requirements:
For ANY sequence of user requests and ANY responses from public memory:
Completeness: If every read from public memory = last write
Guarantee: user retrieve = last store (w.h.p.)
Soundness: If some read from public memory ≠ last write
Guarantee: user retrieve = last store or BUG (w.h.p.)
[Figure: the user sends retrieve(i) to the memory checker C (s(n) bits of secret memory, plus public memory) and receives b or BUG.]
[Blum, Evans, Gemmell, Kannan and Naor 1991]
Offline Memory Checkers:
Detect errors only at the end of a long request sequence
q(n) = O(1) (amortized), s(n) = O(log n)
No crypto assumptions!
Online Memory Checkers:
Past Results:
With One-Way Functions:
q(n) = O(log n), s(n) = n^ε (for any ε > 0)
No Computational Assumptions:
q(n) (any query complexity)
s(n) = O(n/q(n))
Are they necessary?!
s(n) x q(n) = O(n)
Other Results:
Optimal [Gemmell Naor 92]
Must be invasive [Ajtai 2003]
Very Simple (in chunks)
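A hash tree is one standard way to get logarithmic query complexity under computational assumptions. The slides give only the parameters, so the sketch below is an illustrative assumption rather than the construction from the cited papers: SHA-256 stands in for the cryptographic primitive, the class name `MerkleChecker` is hypothetical, and the secret memory is just the root hash. Each request reads one hash per tree level.

```python
import hashlib

BUG = "BUG"

def leaf_hash(bit):
    # Domain-separated hash of a single data bit.
    return hashlib.sha256(b"\x00" + bytes([bit])).digest()

def node_hash(left, right):
    # Domain-separated hash of two child hashes.
    return hashlib.sha256(b"\x01" + left + right).digest()

class MerkleChecker:
    """Toy online memory checker for n = 2**depth bits (illustrative only)."""

    def __init__(self, depth):
        self.n = 1 << depth
        # Public (untrusted) memory: the data bits and a heap-ordered hash tree.
        self.bits = [0] * self.n
        self.nodes = [b""] * (2 * self.n)
        for i in range(self.n):
            self.nodes[self.n + i] = leaf_hash(0)
        for v in range(self.n - 1, 0, -1):
            self.nodes[v] = node_hash(self.nodes[2 * v], self.nodes[2 * v + 1])
        # Secret (trusted) memory: only the root hash.
        self.root = self.nodes[1]

    def _climb(self, i, bit, write):
        # Recompute the root from bit i and the sibling hashes in public memory.
        v = self.n + i
        h = leaf_hash(bit)
        if write:
            self.nodes[v] = h
        while v > 1:
            sibling = self.nodes[v ^ 1]
            h = node_hash(h, sibling) if v % 2 == 0 else node_hash(sibling, h)
            v //= 2
            if write:
                self.nodes[v] = h
        return h

    def retrieve(self, i):
        bit = self.bits[i]
        return bit if self._climb(i, bit, write=False) == self.root else BUG

    def store(self, i, bit):
        # Verify the current path first, then rewrite it and update the root.
        if self.retrieve(i) == BUG:
            return BUG
        self.bits[i] = bit
        self.root = self._climb(i, bit, write=True)
        return None
```

For example, after `c = MerkleChecker(3)` and `c.store(5, 1)`, overwriting `c.bits[5]` directly (playing the adversary) makes `c.retrieve(5)` return `BUG`.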
Authenticators
Memory checkers allow reliable local decodability. What about reliable local testability?
Authenticators:
• Encode the file x ∈ {0,1}^n into:
• a large public encoding px
• a small secret encoding sx. Space complexity: s(n)
• Decoding Algorithm D:
– Receives a public encoding p and decodes it into a vector x ∈ {0,1}^n
• Consistency verifier checks (repeatedly) the public encoding: was it (significantly) corrupted? It reads only a few bits: t(n).
– If not corrupted: verifier should output “Ok”
– If verifier outputs “Ok”, decoder can (w.h.p.) retrieve the file
Good example: Reed Solomon
Pretty Good Authenticator
with computational assumptions
• Idea: encode file X using a good error correcting code C
– Actually erasures are more relevant
– As long as a certain fraction of the symbols of C(X) is available, can decode X
• Add to each symbol a tag Fk(a,i), a function of
• secret information k ∈ {0,1}^s, seed of a PRF
• symbol a (from the code’s alphabet)
• location i
• Verifier picks a random location i, reads symbol a and tag t
– Checks whether t = Fk(a,i) and rejects if not
• Decoding process discards all symbols with invalid tags and uses the decoding procedure of C
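The tagging idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the construction from the talk: HMAC-SHA256 stands in for the PRF Fk, a 3-fold repetition code stands in for the "good" code C (a real instantiation would use a proper erasure code), and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets

REPS = 3  # toy repetition code: position i of the codeword holds x[i % len(x)]

def tag(key, symbol, loc):
    # Stand-in for F_k(a, i): a keyed tag over the location and the symbol.
    msg = loc.to_bytes(8, "big") + bytes([symbol])
    return hmac.new(key, msg, hashlib.sha256).digest()

def encode(x, key):
    # Public encoding: every codeword symbol is stored next to its tag.
    codeword = [x[i % len(x)] for i in range(REPS * len(x))]
    return [(a, tag(key, a, i)) for i, a in enumerate(codeword)]

def verify_random_location(public, key):
    # Consistency verifier: read one random (symbol, tag) pair and check it.
    i = secrets.randbelow(len(public))
    a, t = public[i]
    return hmac.compare_digest(t, tag(key, a, i))

def decode(public, key, n):
    # Treat symbols with invalid tags as erasures, then decode the toy code.
    out = [None] * n
    for i, (a, t) in enumerate(public):
        if hmac.compare_digest(t, tag(key, a, i)):
            out[i % n] = a
    if any(s is None for s in out):
        return None  # too many erasures for this toy code
    return bytes(out)
```

Because a modified symbol almost surely fails its tag check, corruption shows up as erasures, which matches the remark that erasures are the relevant error model here.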
Memory Checker ⇒ Authenticator
If there exists an online memory checker with
– space complexity s(n)
– query complexity t(n)
then there exists an authenticator with
– space complexity O(s(n))
– query complexity O(t(n))
Idea: Use a high-distance code
Improve the Information Theoretic Upper Bound(s)?
Maybe we can use:
Locally Decodable Codes?
Locally Testable Codes?
PCPs of proximity?
The Lower Bound
Theorem 1 [Tight lower bound]:
For any online memory checker secure against a computationally unbounded adversary
s(n) x q(n) = Ω(n)
True also for authenticators
Memory Checkers and One-Way Functions
Breaking the lower bound implies one-way functions.
Theorem 2:
If there exists an online memory checker:
– Working in polynomial time
– Secure against polynomial time adversaries
– With query and space complexity: s(n) x q(n) < c · n (for a constant c > 0)
Then there exist functions that are hard to invert for infinitely many input lengths (“almost one-way” functions)
This Talk:
• Will not say much about the proof
– It is involved
• Initial insight: connection to the simultaneous message model
Simultaneous Messages Protocols [Yao 1979]
• For the equality function:
– |mA| + |mB| = Ω(√n) [Newman Szegedy 1996]
– |mA| x |mB| = Ω(n) [Babai Kimmel 1997]
[Figure: ALICE holds x ∈ {0,1}^n and sends mA to CAROL; BOB holds y ∈ {0,1}^n and sends mB to CAROL; CAROL outputs f(x,y), here “x = y?”.]
Ingredients for Full Proof:
• Consecutive Messages Model: generalized communication complexity lower bound.
• Adversary “learns” public memory access distribution: Learning Adaptively Changing Distributions [NR06].
• “Bait and Switch” technique: handle adaptive checkers.
• One-Way functions: breaking the generalized communication complexity lower bound in a computational setting requires one-way functions.
Conclusions for OMC
Settled the complexity of online memory checking
Characterized the computational assumptions required for good online memory checkers
Open Questions:
• Do we need logarithmic query complexity for online memory checking with computational assumptions?
• Understanding relationships of crypto/complexity objects
• Quantum Memory Checkers?
LDC
Goal
• Distributed file storage system
– Peer-to-peer environment
– Processors join and leave the system continuously
Want to be able to store and retrieve files distributively
• Partial Solutions
– Distributed file sharing applications [Gnutella, Kazaa]
– Distributed Hash Tables [DH, Chord, Viceroy]
• Store (key, value) pairs and perform lookup on key
Fault-Tolerant Storage System
• Censor
– Aims to eliminate access to some files
– Can take down some servers
• Design Goal:
– A reader should be able to reconstruct each file with high probability even after faults have occurred
– Probability is taken over the coins of the writer and reader
Adversarial Behavior
• How are the faulty processors chosen? What is the influence of the adversary?
• Type of faults
– Complete/partial control
Adversarial Model
• Adversary chooses the set of processors to crash
• Different degrees of adaptiveness
– Non-adaptive adversary
• Choice of faulty processors is not based on their content
– Adversary with a limited number of queries
• May query some processors
• Fail-stop failures
– We do not consider Byzantine failures
Other Fault Models
• Random faults model:
– Examples: Distance Halving DHT, Chord
– Standard technique:
• Replication to log(n) processors
• Assures survival with high probability
• Adversarial faults [Fiat, Saia]
– Large fraction accessible after adversary crashes a linear fraction of the processors
• Still, a censor can target a specific file
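A quick calculation shows why logarithmic replication is enough against random faults. The sketch assumes, beyond what the slides state, that each processor fails independently with probability `p_fail`; the function name and the constant `c` are hypothetical.

```python
import math

def all_replicas_fail(n, p_fail, c=1.0):
    """Return (number of replicas, probability that every replica fails).

    Assumes independent failures with probability p_fail per processor
    and replication to ceil(c * log2(n)) processors.
    """
    replicas = max(1, math.ceil(c * math.log2(n)))
    return replicas, p_fail ** replicas
```

With `n = 1024`, `p_fail = 0.5` and `c = 2`, a file is replicated 20 times and is lost with probability 2^-20, which is 1/n^2. The same calculation says nothing about a censor who picks exactly the processors holding one file, which is the point of the last bullet.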
Measures of Quality
• Read/Write complexity:
– Average number of processors accessed during a read/write operation
• Number of rounds:
– Number of rounds required from an adaptive reader
• Blowup Ratio:
– Ratio between the total number of bits used for the storage of a file and its size
Connection to LDC
• If you are willing to have high write complexity:
• Can encode ALL the data with an LDC
• Parameters of the LDC determine how good the data storage is
Probabilistic Storage system based on intersecting quorum system
• Storage System:
– To store a file: pick a set of size Θ(√n) uniformly at random
• replicate the file to all members of the quorum set
– Retrieval: Choose a random set of size Θ(√n) and probe its members
– Intersection follows from the birthday paradox
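The birthday-paradox argument can be checked numerically. The sketch below assumes quorums of size ceil(c·√n) for a constant c (the constant and all function names are illustrative) and compares the exact miss probability with a simulation.

```python
import math
import random

def quorum_size(n, c):
    # Assumed quorum size: c * sqrt(n), rounded up.
    return math.ceil(c * math.sqrt(n))

def miss_probability(n, size):
    """Exact probability that a random read set avoids a fixed write set,
    when both sets have the given size and are drawn without replacement."""
    prob = 1.0
    for j in range(size):
        prob *= (n - size - j) / (n - j)
    return prob

def simulate_hit_rate(n, size, trials, seed=0):
    # Fraction of trials in which the read set meets the write set.
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        write_set = set(rng.sample(range(n), size))
        read_set = rng.sample(range(n), size)
        hits += any(p in write_set for p in read_set)
    return hits / trials
```

Each factor in `miss_probability` is at most 1 - size/n, so the miss probability is at most exp(-size²/n), roughly exp(-c²). Increasing c drives the failure probability down quickly, at the cost of larger read and write sets.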
Properties of the Probabilistic Storage System
• Pros:
– Simplicity
– Resilient against a linear number of faults
• Even if the processors are chosen by the adversary adaptively
– Adapted to a dynamic environment [Abraham, Malkhi]
• Cons:
– High read/write complexity
– High blowup ratio
Want a storage system with better parameters
Non-adaptive readers are wasteful!
• Non-adaptive reader:
– Processors are chosen without accessing any processor
Theorem: A fault tolerant storage system, in the non-adaptive reader model, resilient against a linear number of faults, cannot do better than the intersecting storage system example.
– Read Complexity · Write Complexity is Ω(n)
– Blowup Ratio is Ω(√n)
Open Question
• Do the lower bounds for the case when both the reader and the adversary are non-adaptive hold when both are fully adaptive?
E
For Effort
The Problem
Is it possible to have an efficient procedure:
• Given CNF formulae φ1 and φ2 on the same variables and of the same length, come up with a CNF formula ψ that is:
1. Satisfiable if and only if φ1 ∨ φ2 is satisfiable
2. Shorter than |φ1|+|φ2|
– Sufficiently short to apply recursively: (1-ε)(|φ1|+|φ2|)
If yes: There is a construction of Collision Resistant Hash functions from any one-way function
– No “black box” construction of CRH from OWF [Simon98]
– Construction uses the code of the one-way function
If no: there is hope for:
• Efficient everlasting encryption in the hybrid bounded storage model
• Forward-Secure-Storage [Dziembowski]
• Derandomization of Sampling [Dubrov-Ishai]
No Witness Retrievable Compression
• Given CNF formulae φ1 and φ2 on the same variables, come up with a formula ψ that is:
1. Satisfiable if and only if φ1 ∨ φ2 is satisfiable
2. Shorter than |φ1|+|φ2|
Claim: if one-way functions exist, then a witness (satisfying assignment) for either φ1 or φ2 cannot efficiently yield a witness for ψ.
Most natural ideas are witness retrievable
Proof intuition based on broadcast encryption lower bounds
Garey and Johnson, 1979
“I can’t find an algorithm for the problem”
• Maybe I can approximate it
• Solve it for some fixed parameters
• Find an algorithm that usually works?
• Solve it in time 2n
• Could we just postpone it?
Approaches for dealing with NP-complete problems:
• Approximation algorithms
• Sub-exponential time algorithms
• Parameterized complexity
• Average case complexity
• Save it for the future
Verdict on LDCs?
Uncompressed paper on compressibility: www.wisdom.weizmann.ac.il/~naor/PAPERS/compressibility.html
Compressed version FOCS 2006
Consecutive Messages Protocols
Theorem (lower bound for CM protocols):
For any equality protocol, as long as |mP| ≤ n/100,
|mA| x |mB| = Ω(n)
[Figure: ALICE holds x ∈ {0,1}^n, BOB holds y ∈ {0,1}^n, CAROL decides “x = y?”; the messages are mP, mA and mB.]
Program for This Talk:
• Define online memory checkers
• Review some past results
• Describe new results
• Proof sketch:
– Define communication complexity model
– Sketch lower bound for a simple case
– Ideas for extending to the general case
The Reduction
Use an online memory checker to construct a consecutive messages equality protocol
Online Checker: space s(n), query q(n)
Reduction gives
Equality Protocol: Alice msg s(n), Bob msg O(q(n))
Conclusion: s(n) x q(n) = Ω(n) (from the communication complexity lower bound)
Simplifying Assumption
Assumption: checker chooses indices to read from public memory independently of secret memory
Checker Operation:
1. Get an index i in the original file
2. Choose which indices to read from the public memory, and read them.
3. Get the secret memory
4. Retrieve i-th bit or say BUG
(With loss of generality)
The Reduction: Outline
Use online memory checker
Construct “random index” protocol, Bob chooses random index i:
– If x = y, then Carol accepts
– If xi ≠ yi, then Carol rejects
Use online checker to build this protocol
Use error correcting code
Go from “random index” to equality testing:
– Alice and Bob encode their inputs and run the “random index” protocol
– If Alice’s and Bob’s inputs differ at even one index, their encodings differ at many indices.
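The step from “random index” to equality relies only on the code having large distance. As a small illustration, the sketch below uses the Hadamard code, which is an assumption made for convenience: it is exponentially long, whereas the reduction needs a code of length O(n) with constant relative distance. The function names are hypothetical.

```python
def hadamard_encode(x):
    """Hadamard codeword of a bit list x: one parity <x, r> mod 2 per mask r."""
    n = len(x)
    return [
        sum(x[j] * ((r >> j) & 1) for j in range(n)) % 2
        for r in range(1 << n)
    ]

def differing_positions(x, y):
    # Number of codeword positions at which the two encodings disagree.
    cx, cy = hadamard_encode(x), hadamard_encode(y)
    return sum(a != b for a, b in zip(cx, cy))
```

Two inputs that differ in a single bit, such as `[1, 0, 1, 1]` and `[1, 0, 0, 1]`, have encodings that differ in exactly half of the 16 positions, so a uniformly random index catches the difference with probability 1/2.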
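[Figure: the “random index” protocol built from the checker. ALICE, with x ∈ {0,1}^n, runs the checker on store(x) to get secret memory S(x) and public memory P(x), and sends S(x) to CAROL (s(n) bits). BOB, with y ∈ {0,1}^n, runs the checker on store(y) to get S(y) and P(y), gets a random index i, runs retrieve(i), and sends yi and the bits for Carol (q(n)+1 bits). CAROL runs retrieve(i) with secret memory S(x), obtaining Ci = xi or BUG, and accepts if yi = Ci. If x = y, Carol accepts; if xi ≠ yi, Carol rejects.]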
WANT: An adversary that can find bad x, y for the protocol can be used to find bad x, P(y), i for the memory checker
PROBLEM: Protocol adversary sees randomness!
SOLUTION: Re-Randomize!
– Alice re-computes S(x) with different randomness
– New S(x) is independent of public randomness (given P(x))
– Requires an exponential-time Alice
Conclusion [Weak Theorem]:
For “restricted” online memory checkers
s(n) x q(n) = Ω(n)
Recall Simplifying Assumption
Assumption: checker chooses indices to read from public memory independently of secret memory
Do we really need the assumption?
Idea: If checker uses secret memory to choose indices, Adversary learns something about the secret memory from indices the checker reads.
Access Pattern Distribution
For a retrieve request
Access Pattern: bits of public memory accessed by checker
Access Pattern Distribution: distribution of the checker’s access pattern (given its secret memory)
Randomness: over checker’s coin tosses
Where Do We Go From Here?
Observation: If the adversary doesn’t know the access pattern distribution, then the checker is “home free”.
Lesson for adversary: Activate checker many times, “learn” its access pattern distribution!
[NR05]: Learning to Impersonate.
Learning The Access Pattern Distribution
Theorem (Corollary from [NR05])
Learning algorithm for adversary:
– Adversary stores x, secret memory s
– Adversary makes O(s(n)) retrieves; p: final public memory (after the stores and retrieves)
– Adversary learns L, can generate distribution DL(p).
– “Real” distribution is DS(p)
Guarantee: With high probability, the distributions DL(p) and DS(p) are ε-close.
L is of size O(q(n) x s(n)) bits.
Guarantee is only for the public memory p reached by checker!
[Figure: the protocol with learning. ALICE, with x ∈ {0,1}^n, runs the checker on store(x) and retrieves to get secret memory S(x) and public memory P(x), and sends S(x) to CAROL (s(n) bits). BOB, with y ∈ {0,1}^n, runs the checker on store(y) and retrieves to get S(y) and P(y), gets a random index i, runs retrieve(i), and sends yi and the bits for Carol (q(n)+1 bits). The learner is run with public coins, and again with the same coins, to produce the learned L (O(s(n) x q(n)) bits). CAROL uses S(x) to obtain Ci = xi or BUG and accepts if yi = Ci. If x = y, Carol accepts; if xi ≠ yi, Carol rejects.]
Completeness: An adversary that finds x s.t. Carol rejects when Alice AND Bob’s inputs are x, also fools the memory checker
– Access pattern distributions by “real” S and “learned” L are close on P(x).
– Protocol adversary sees L, checker adversary learns it!
Soundness: An adversary that finds x ≠ y s.t. Carol doesn’t reject, also fools the memory checker
Does this work???
PROBLEM: distributions by “real” S and “learned” L are close on original P(x)! They may be very far on P(y)!
Does it Work?
• Will the protocol work when y ≠ x?
• No! Big problem for the adversary:
Can learn the access pattern distribution on correct and unmodified public memory… really wants the distribution on different, modified memory!
• Learned information L may be:
– Good on unmodified memory (DL(P(x)), DS(P(x)) close)
– Bad on modified memory (DL(P(y)), DS(P(y)) far)
• Can’t hope to learn distribution on modified public memory
Bait and Switch
Carol knows S and L, if only she could check whether DL(P(y)), DS(P(y)) are ε-close…
If far: P(y) ≠ P(x) (not “real” public memory)! Reject!
If close: OK for Bob to use L for access pattern!
Bob always uses L to determine access pattern. This is a “weakening” of the checker.
Bait and Switch: Carol Approximates the Distance
Main Observation: Carol (computationally unbounded) can compute probabilities of any access pattern for which all the bits read from P(y) are known. (Probabilities by both DL(P(y)) and DS(P(y)))
Solution: Sample O(1) access patterns by DL(P(y)), use them to approximate the distance between the distributions.
In the protocol Bob sends these samples to Carol, she approximates the distance.
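One way to approximate the distance from samples, given the ability to compute probabilities under both distributions, is sketched below. It is an illustrative estimator rather than the one from the paper. It uses the identity TV(DL, DS) = E over a drawn from DL of max(0, 1 - DS(a)/DL(a)), with distributions represented as hypothetical dictionaries from access pattern to probability.

```python
import random

def estimate_distance(p_learned, p_real, num_samples, rng):
    """Estimate the statistical distance between two distributions.

    Samples are drawn from p_learned only; for every sampled pattern both
    probabilities are assumed to be computable, as they are for Carol.
    """
    patterns = [a for a, w in p_learned.items() if w > 0]
    weights = [p_learned[a] for a in patterns]
    total = 0.0
    for a in rng.choices(patterns, weights=weights, k=num_samples):
        total += max(0.0, 1.0 - p_real.get(a, 0.0) / p_learned[a])
    return total / num_samples
```

Each sample contributes a value in [0, 1], so by a standard Chernoff bound a number of samples depending only on the target accuracy and confidence suffices, consistent with the O(1) samples on the slide.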
Putting It Together
From any memory checker, we get a CM protocol for equality with:
• Public message: length O(s(n) x q(n))
• Alice message: length s(n)
• Bob message: length O(q(n))
Conclusion: s(n) x q(n) = Ω(n)
Conclusion
Settled the complexity of online memory checking
Characterized the computational assumptions required for good online memory checkers
Open Questions:
• Do we need logarithmic query complexity for online memory checking with computational assumptions?
• Understanding relationships of crypto/complexity objects
• Quantum Memory Checkers?