Post on 29-Mar-2015
PROOFS OF RETRIEVABILITY VIA HARDNESS AMPLIFICATION
Yevgeniy Dodis, Salil Vadhan and Daniel Wichs
Remote Data Storage
Average Computer User: Bob
Lots of data (music, photos, e-mails, forms…)
Lots of devices (desktop, laptop, music player, phone, camera…)
Accessibility: Wants to access all data at all times from all devices.
Reliability: Should never lose data.
Remote Storage Server:
Provides greater accessibility and reliability (for a cheap price).
Remote Data Storage
Bob: Does all of my data still exist? Is my data private? Is it authentic?
Encrypt and MAC data before storing it remotely.
Proofs of Retrievability (PoR)
Introduced by [Juels, Kaliski 07]. An audit protocol between Bob and the server in which Bob checks that his data is still retrievable.
Formalized using the extraction paradigm (as in proofs of knowledge).
Naïve Protocol: To run an audit, Bob downloads all his data and verifies a signature. Too costly! Bob does not actually need the data at the time of an audit.
Goal: An audit protocol with low communication complexity and locality (the server accesses only a few locations of the data).
Direct-Product Scheme (One Audit)
Enrollment
Bob: Encodes his file F with an error-correcting code (ECC) to obtain the server file S, which the remote storage server stores.
Bob: Stores t random blocks S[r1],…,S[rt].
Direct-Product Scheme (One Audit)
Audit
Bob → Remote Storage Server: challenge e = r1,…,rt.
Remote Storage Server → Bob: response S[r1],…,S[rt], read from the server file S.
Bob: Verifies that the received blocks are correct, using the t random blocks S[r1],…,S[rt] he stored.
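The enrollment and audit steps of the one-audit direct-product scheme can be sketched as follows. This is a minimal illustration, not the construction from the talk: `ecc_encode` is a placeholder repetition code, and all function names and the `bob_state` layout are assumptions made for this example.

```python
import secrets


def ecc_encode(file_blocks, copies=3):
    # Placeholder ECC (assumption): a plain repetition code stands in for
    # whatever error-correcting code a real scheme would use.
    return list(file_blocks) * copies


def enroll(file_blocks, t):
    """Bob encodes F into the server file S and keeps t random blocks of S."""
    S = ecc_encode(file_blocks)
    indices = [secrets.randbelow(len(S)) for _ in range(t)]
    bob_state = {"indices": indices, "blocks": [S[r] for r in indices]}
    return S, bob_state


def server_respond(S, challenge):
    """An honest server reads only the t challenged positions (locality t)."""
    return [S[r] for r in challenge]


def audit(bob_state, respond):
    """Bob sends e = (r1, ..., rt) and compares the answer with his stored blocks."""
    challenge = bob_state["indices"]
    response = respond(challenge)
    return response == bob_state["blocks"]
```

Because Bob reveals his stored positions during the audit, this state supports a single audit, which matches the "One Audit" label on the slide.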
Direct-Product Scheme (One Audit)
Intuition for security:
If the server knows enough blocks of the server file S, then it can decode F.
If the server knows too few blocks of S, then it cannot pass an audit.
Unfortunately, this intuition does not translate into a proof, since the server does not give us blocks of S.
Question 1: Is this scheme secure in general?
Question 2: Is the tradeoff between server storage overhead, communication, and locality optimal?
[Figure: the server file S, with each block marked “Know” or “Don’t know”.]
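The intuition can be made concrete for one very restricted kind of server: one that holds a fixed subset of the blocks of S and passes only when all t challenged positions fall inside that subset. Assuming the t positions are drawn independently and uniformly, such a server passes with probability (known fraction)^t. The sketch below checks this by simulation. It says nothing about arbitrary servers, which is exactly the gap noted on the slide, and all names are illustrative.

```python
import random


def pass_probability(known_fraction, t):
    """Chance that t independent uniform positions all land in the known set."""
    return known_fraction ** t


def simulate(n, known_fraction, t, trials=20000, seed=0):
    """Monte Carlo estimate for a server that knows a fixed subset of blocks."""
    rng = random.Random(seed)
    known = set(rng.sample(range(n), int(known_fraction * n)))
    passed = 0
    for _ in range(trials):
        # The server passes only if every challenged position is known to it.
        if all(rng.randrange(n) in known for _ in range(t)):
            passed += 1
    return passed / trials
```

For example, a server of this kind that knows half of the blocks passes an audit with t = 10 with probability 2^-10.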
Direct-Product Scheme (One Audit)
Intuition for security:
If the server knows enough blocks of the server file S, then it can decode F.
If the server knows too few blocks of S, then it cannot pass an audit.
Unfortunately, this intuition does not translate into a proof, since the server does not give us blocks of S.
Question 1: Is this scheme secure in general? How do we extract the file?
Question 2: Is the tradeoff between server storage, communication, and locality optimal?
Arbitrary Adversarial Server: On challenge e = (r1,…,rt), responds with C*(e). Answers an ε fraction of challenges correctly, i.e., with C*(e) = (S[r1],…,S[rt]).
Prior Work
The “direct-product” scheme was introduced by [Naor, Rothblum 05] in the context of sublinear authenticators. PoR schemes were studied by [Juels, Kaliski 07], [Ateniese et al. 07], [Shacham, Waters 08].
Question 1: Is the direct-product scheme secure? Yes if… [JK07]: Make simplifying assumptions on behavior of the adversary. [JK07,SW08]: Add MACs to authenticate the responses.
Good: gives us “many-time” scheme + proof of security. Bad: increased server storage overhead (and computation/communication).
Question 2: Is the tradeoff between server storage overhead, communication, and locality optimal? An optimization to the direct-product scheme appears as part of an optimized MAC/Sig based scheme of [SW08]. Nearly optimal parameters required Random Oracles.
Direct-Product Protocol (One Audit)
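Remote Storage Server: Holds the server file S together with Tags: each block S[r] has a tag σ[r] = mac_k(S[r]).
Bob → Remote Storage Server: challenge e = r1,…,rt.
Remote Storage Server → Bob: C(e) = S[r1],…,S[rt] together with the tags σ[r1],…,σ[rt].
Bob: Verifies that the received blocks are correct.
Bob: Store t random blocks S[r1],…,S[rt]. Store key k for a MAC.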
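A minimal sketch of the tagged variant, assuming HMAC-SHA256 as the MAC and assuming Bob keeps only the key k. One deliberate departure from the slide's notation: the tag here also covers the index r, so that a valid (block, tag) pair cannot be replayed at a different position. That binding, and every function name, is an assumption of this example.

```python
import hashlib
import hmac
import secrets


def tag(key, index, block):
    # Assumption: the index is bound into the MAC; the slide writes mac_k(S[r]).
    msg = index.to_bytes(8, "big") + block
    return hmac.new(key, msg, hashlib.sha256).digest()


def enroll(S):
    """Bob picks a MAC key and tags every block; the server stores S and the tags."""
    key = secrets.token_bytes(32)
    tags = [tag(key, r, block) for r, block in enumerate(S)]
    return key, tags


def server_respond(S, tags, challenge):
    """Honest server: return each challenged block with its tag."""
    return [(S[r], tags[r]) for r in challenge]


def verify(key, challenge, response):
    """Bob recomputes each tag from the key alone; no blocks are stored locally."""
    if len(response) != len(challenge):
        return False
    return all(
        hmac.compare_digest(tag(key, r, block), sigma)
        for r, (block, sigma) in zip(challenge, response)
    )
```

Since Bob can pick fresh positions for every audit, this is what makes the scheme "many-time", at the cost of the server storing one tag per block.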
Prior Work
The “direct-product” scheme was introduced by [Naor, Rothblum 05] in the context of sublinear authenticators. PoR schemes were studied by [Juels, Kaliski 07], [Ateniese et al. 07], [Shacham, Waters 08].
Question 1: Is the direct-product scheme secure? Yes if… [JK07]: Make simplifying assumptions on behavior of the adversary. [JK07,SW08]: Add MACs to authenticate the responses.
Good: gives us “many-time” scheme + proof of security. Bad: increased server storage overhead (and computation/communication).
Question 2: Is the tradeoff between server storage overhead, communication, and locality optimal? No. For example, optimizations to communication complexity appear in [SW08], but they rely on Random Oracles to get nearly optimal parameters. Can we remove the Random Oracles? Are further improvements possible?
Our Results
Introduce a new primitive called PoR codes.
Abstract the key component of PoR into a clean coding-theoretic problem.
Three ways to turn PoR codes into PoR schemes with various tradeoffs.
1. Security of PoR ⇔ efficient (list) decoding algorithms for such codes.
2. Efficiency of PoR ⇔ optimizing various parameters of PoR codes.
Construct nearly optimal PoR codes (and therefore PoR schemes).
Along the way, answer Questions 1 and 2.
Answer 1: The direct-product scheme is secure.
First storage-efficient PoR scheme (optimization of [JK07]) with full proof of security.
First information-theoretically secure PoR.
Answer 2: Further optimize all previous schemes.
In particular, remove Random Oracles from [SW08].
Key Step: Connect (list) decoding of PoR codes to the seemingly unrelated area of hardness amplification.
Our abstraction: PoR Codes
Bob’s file F → Server file S ∈ Π^n → PoR Codeword C ∈ Σ^N.
Coordinate C[e] corresponds to the server’s response on challenge e.
In particular, C can be exponentially long, as it is never stored explicitly.
Locality: C[e] can be computed from only a few positions in S.
Ignores how Bob decides whether responses are correct/incorrect.
Bob → Storage Server: challenge e. Storage Server → Bob: C[e].
Direct-Product PoR: F → (ECC) → S → (all t-tuples) → C.
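The point that C is never stored explicitly can be illustrated with a lazily evaluated direct-product codeword: each coordinate C[e] is computed on demand from t positions of S. The class and method names below are assumptions made for this sketch.

```python
class DirectProductCodeword:
    """Lazy view of the direct-product codeword C over a server file S."""

    def __init__(self, S, t):
        self.S = S
        self.t = t

    def num_challenges(self):
        # N = n^t: one coordinate per t-tuple of positions in S.
        return len(self.S) ** self.t

    def __getitem__(self, e):
        # C[e] reads exactly t positions of S (the locality of the code).
        if len(e) != self.t:
            raise ValueError("challenge must contain exactly t indices")
        return tuple(self.S[r] for r in e)
```

With n = 2^20 blocks and t = 100, the codeword has 2^2000 coordinates, yet answering any single challenge touches only 100 blocks.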
Decoding PoR Codes (Attempt)
Given oracle access to C* that is ε-close to C, decode F.
But we cannot uniquely decode when ε ≤ ½.
[Figure: the Decoder sends queries e to the Remote Storage Server, which holds an incorrect codeword C*, and receives C*(e).]
Decoding PoR Codes: Two variants
Error List Decoding: Given oracle access to C* that is ε-close to C, produce a (short) list containing F. Corresponds to the “basic” scheme.
Erasure Decoding: Given oracle access to C* that is ε-close to C and C*[e] ∈ {C[e], ⊥}, recover F. Corresponds to the MAC-based scheme.
Efficiency: Run-time poly(|F|, 1/ε).
[Figure: the Decoder sends queries e to the Remote Storage Server, which holds an incorrect codeword C*, and receives C*(e).]
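For the erasure variant applied to the direct-product code, the first half of a decoder is simple to sketch: query random challenges and, whenever the oracle answers (anything other than ⊥, modelled here as `None`), record the revealed blocks of S. The sketch stops at a partially recovered S; turning that into F is the job of the ECC's erasure decoder, which is not shown. The function name and the query budget are assumptions.

```python
import random


def recover_blocks(oracle, n, t, max_queries, seed=None):
    """Collect blocks of S from an erasure oracle C*.

    oracle(e) returns either the correct tuple (S[r1], ..., S[rt]) or None.
    Returns a dict {position: block}; missing positions are erasures.
    """
    rng = random.Random(seed)
    recovered = {}
    for _ in range(max_queries):
        e = tuple(rng.randrange(n) for _ in range(t))
        answer = oracle(e)
        if answer is None:
            continue  # erased coordinate: learn nothing from this query
        for r, block in zip(e, answer):
            recovered[r] = block
    return recovered
```

A server could refuse every challenge that touches certain positions, so some blocks may never be revealed; this is why S is an error-correcting encoding of F rather than F itself.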
PoR Schemes from PoR codes
Scheme 1: Bob stores (challenge, response) pairs locally. Good: Information-theoretic security. Optimal server storage. Bad: Bounded use. Large client storage.
Scheme 2: Offload storage to the server (encrypt/MAC). Good: Optimal client storage. Small additive overhead to server storage. Bad: Bounded use.
Scheme 3: Authenticate each block of the server file. Good: Unbounded use. Optimal client storage. Bad: Server storage roughly doubles.
Basic ideas of Schemes 1, 2, 3 come from [NR05], [JK07], [SW08].
Efficiency of all schemes is improved with optimized PoR codes.
Security of Schemes 1 & 2 requires error list-decoding, which was not known before (optimized or not).
List decoding “direct-product” codes
Bob’s file F → (ECC) → Server file S → (all t-tuples) → PoR Codeword C.
Given oracle access to C* which is ε-close to C, output a small list containing F.
Hardness Amplification (direct-product theorems): If S(r) is δ-hard, then the direct-product function C(e) = (S(r1),…,S(rt)), e = (r1,…,rt), is ε-hard, where ε ≪ δ.
List decoding “direct-product” codes
Hardness Amplification (direct-product theorems): ∃ adversary computing C(e) = (S(r1),…,S(rt)) on an ε-fraction of tuples ⇒ ∃ adversary that computes S(r) on a δ-fraction of inputs.
Bob’s file F → (ECC) → Server file S → (all t-tuples) → PoR Codeword C.
Given oracle access to C* which is ε-close to C, output a small list containing F.
List decoding “direct-product” codes
Bob’s file F → (ECC) → Server file S → (all t-tuples) → PoR Codeword C.
Given oracle access to C* which is ε-close to C, output a small list containing F.
Hardness Amplification (uniform direct-product theorems) [Trev05], [IJK06], [IJKW08]: Given oracle access to an adversary that computes C(e) = (S(r1),…,S(rt)) on an ε-fraction of tuples, construct a short list of adversaries, one of which computes S(r) on a δ-fraction of inputs.
List decoding “direct-product” codes
Bob’s file F → (ECC) → Server file S → (all t-tuples) → PoR Codeword C.
Step 1: C* ⇒ short list containing S* which is δ-close to S.
Step 2: Short list containing S* ⇒ short list containing F.
Hardness Amplification (uniform direct-product theorems) [Trev05], [IJK06], [IJKW08]: Given oracle access to an adversary that computes C(e) = (S(r1),…,S(rt)) on an ε-fraction of tuples, construct a short list of adversaries, one of which computes S(r) on a δ-fraction of inputs.
Parameters of Direct-Product Codes
Tradeoff between locality and server storage is optimal.
Easy to show that challenge/response size must be O(λ).
Does the challenge/response size need to depend on t?
Parameters:
Security parameter λ.
Server storage = γ|F|, for any γ ≥ 1.
Locality t = O(λ/(γ − 1)).
Challenge size = t log(n).
Response size = t log(|Π|).
Bob’s file F → (ECC) → Server file S ∈ Π^n → (all t-tuples) → PoR codeword C ∈ (Π^t)^N.
Challenge e = (r1,…,rt); response U = S[r1],…,S[rt].
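To get a feel for these parameters, the formulas above can be evaluated for concrete numbers. The constant hidden in the O(·) for the locality is not given on the slide; the sketch below exposes it as a hypothetical parameter `c` defaulting to 1, so the output is only indicative. Function and field names are assumptions.

```python
import math


def direct_product_parameters(file_bits, gamma, lam, n, alphabet_size, c=1.0):
    """Evaluate the direct-product parameter formulas for concrete values.

    c is the unknown constant hidden in t = O(lam / (gamma - 1)) (assumption).
    """
    if gamma <= 1:
        raise ValueError("the locality formula needs gamma > 1")
    t = math.ceil(c * lam / (gamma - 1))
    return {
        "server_storage_bits": gamma * file_bits,
        "locality_t": t,
        "challenge_bits": t * math.ceil(math.log2(n)),
        "response_bits": t * math.ceil(math.log2(alphabet_size)),
    }
```

For instance, with λ = 80, γ = 1.25, n = 2^20 blocks of one byte each and c = 1, the formulas give t = 320, a 6400-bit challenge and a 2560-bit response, which motivates asking whether the challenge and response sizes must depend on t.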
Two optimizations
Shorter Responses: Instead of sending the response U = (S[r1],…,S[rt]), ask the server to send a random position in an error-correcting encoding of U.
[SW08]: Implicitly uses Hadamard, which increases the challenge size. Can be replaced by Reed-Solomon.
Making this optimization work with the MAC-based scheme was a major contribution of [SW08].
Shorter Challenges: Use a randomness-efficient “hitter” to sample indices (r1,…,rt) with a shorter challenge.
Works for erasure decoding. Removes Random Oracles from [SW08].
Open for efficient error decoding (works for inefficient decoding).
Bob → Storage Server: challenge e = (r1,…,rt), p.
Storage Server → Bob: ECC(U)[p], where U = S[r1],…,S[rt] is read from S.
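The shorter-response idea with Reed-Solomon can be sketched as follows: treat the t challenged blocks as coefficients of a polynomial over a prime field and return its value at the point p, so the response is one field element instead of t blocks. Representing blocks as integers below a fixed prime, the choice of the prime 2^61 − 1, and the function names are all assumptions of this example.

```python
PRIME = 2 ** 61 - 1  # a Mersenne prime; blocks are assumed to be integers below it


def rs_symbol(U, p, prime=PRIME):
    """Position p of the Reed-Solomon encoding of U: evaluate the polynomial
    U[0] + U[1]*x + ... + U[t-1]*x^(t-1) at x = p, using Horner's rule."""
    acc = 0
    for coeff in reversed(U):
        acc = (acc * p + coeff) % prime
    return acc


def server_respond(S, indices, p):
    """Server reads U = (S[r1], ..., S[rt]) and returns only ECC(U)[p]."""
    U = [S[r] for r in indices]
    return rs_symbol(U, p)
```

Two different tuples U give polynomials of degree below t that agree on at most t − 1 points, so a server with a wrong U is caught for most choices of p.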
Conclusions
Introduce PoR codes. Give nearly optimal constructions.
Prove security of storage-efficient PoR schemes.
First information-theoretic scheme.
Remove the use of Random Oracles from [SW08].
Open questions:
Can we show efficient list-decoding for optimized PoR codes with a hitter?
Do unbounded-use schemes require poor server storage overhead?