CSE Identity-Based Distributed Provable Data

download CSE Identity-Based Distributed Provable Data

of 6

Transcript of CSE Identity-Based Distributed Provable Data

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    1/13

    Identity-Based Distributed Provable DataPossession in Multicloud Storage

    Huaqun Wang

    Abstract—Remote data integrity checking is of crucial importance in cloud storage. It can make the clients verify whether their 

    outsourced data is kept intact without downloading the whole data. In some application scenarios, the clients have to store their 

    data on multicloud servers. At the same time, the integrity checking protocol must be efficient in order to save the verifier’s cost. From

    the two points, we propose a novel remote data integrity checking model: ID-DPDP (identity-based distributed provable data

    possession) in multicloud storage. The formal system model and security model are given. Based on the bilinear pairings, a concrete

    ID-DPDP protocol is designed. The proposed ID-DPDP protocol is provably secure under the hardness assumption of the standard

    CDH (computational Diffie-Hellman) problem. In addition to the structural advantage of elimination of certificate management,

    our ID-DPDP protocol is also efficient and flexible. Based on the client’s authorization, the proposed ID-DPDP protocol can realize

    private verification, delegated verification, and public verification.

    Index Terms—Cloud computing, provable data possession, identity-based cryptography, distributed computing, bilinear pairings

    Ç

    1 INTRODUCTION

    OVER   the last years, cloud computing has become animportant theme in the computer field. Essentially, ittakes the information processing as a service, such asstorage, computing. It relieves of the burden for storagemanagement, universal data access with independentgeographical locations. At the same time, it avoids of capital expenditure on hardware, software, and personnelmaintenances, etc. Thus, cloud computing attracts moreintention from the enterprise.

    The foundations of cloud computing lie in the outsourc-ing of computing tasks to the third party. It entails thesecurity risks in terms of confidentiality, integrity andavailability of data and service. The issue to convince thecloud clients that their data are kept intact is especially vitalsince the clients do not store these data locally. Remote dataintegrity checking is a primitive to address this issue. Forthe general case, when the client stores his data onmulticloud servers, the distributed storage and integritychecking are indispensable. On the other hand, theintegrity checking protocol must be efficient in order tomake it suitable for capacity-limited end devices. Thus,

     based on distributed computation, we will study distrib-uted remote data integrity checking model and present thecorresponding concrete protocol in multicloud storage.

    1.1 Motivation

    We consider an ocean information service corporationCor   in the cloud computing environment.   Cor   canprovide the following services: ocean measurementdata, ocean environment monitoring data, hydrologicaldata, marine biological data, GIS information, etc. Besidesof the above services,   Cor   has also some privateinformation and some public information, such as the

    corporation’s advertisement. Cor

     will store these differentocean data on multiple cloud servers. Different cloudservice providers have different reputation and chargingstandard. Of course, these cloud service providers needdifferent charges according to the different security-levels. Usually, more secure and more expensive. Thus,Cor   will select different cloud service providers to storeits different data. For some sensitive ocean data, it willcopy these data many times and store these copies ondifferent cloud servers. For the private data, it will storethem on the private cloud server. For the publicadvertisement data, it will store them on the cheappublic cloud server. At last,  Cor  stores its whole data on

    the different cloud servers according to their importanceand sensitivity. Of course, the storage selection will takeaccount into the   Cor’s profits and losses. Thus, thedistributed cloud storage is indispensable. In multicloudenvironment, distributed provable data possession is animportant element to secure the remote data.

    In PKI (public key infrastructure), provable data pos-session protocol needs public key certificate distributionand management. It will incur considerable overheadssince the verifier will check the certificate when it checksthe remote data integrity. In addition to the heavycertificate verification, the system also suffers from the

    other complicated certificates management such as certifi-cates generation, delivery, revocation, renewals, etc. Incloud computing, most verifiers only have low computa-tion capacity. Identity-based public key cryptography can

    .   The author is with the School of Information Engineering, Dalian OceanUniversity, Dalian 116023, China, and also with the State Key Laboratoryof Information Security, Institute of Information Engineering, Chinese

     Academy of Sciences, 100049, China. E-mail: [email protected].

     Manuscript received 6 Jan. 2013; revised 5 Apr. 2013; accepted 16 Nov. 2013.Date of publication 10 Mar. 2014; date of current version 10 Apr. 2015.For information on obtaining reprints of this article, please send e-mail to:[email protected], and reference the Digital Object Identifier below.Digital Object Identifier no. 10.1109/TSC.2014.1

    1939-1374 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

    IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015328

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    2/13

    eliminate the complicated certificate management. In orderto increase the efficiency, identity-based provable datapossession is more attractive. Thus, it will be verymeaningful to study the ID-DPDP.

    1.2 Related Work 

    In cloud computing, remote data integrity checking is animportant security problem. The clients’ massive data isoutside his control. The malicious cloud server maycorrupt the clients’ data in order to gain more benefits.Many researchers proposed the corresponding systemmodel and security model. In 2007, provable data posses-sion (PDP) paradigm was proposed by Ateniese  et al.  [1].In the PDP model, the verifier can check remote dataintegrity with a high probability. Based on the RSA, theydesigned two provably secure PDP schemes. After that,Ateniese et al.  proposed dynamic PDP model and concretescheme [2] although it does not support insert operation.In order to support the insert operation, in 2009, Erway  et al.proposed a full-dynamic PDP scheme based on the authen-ticated flip table [3]. The similar work has also been done byF. Sebé  et al.  [4]. PDP allows a verifier to verify the remotedata integrity without retrieving or downloading the wholedata. It is a probabilistic proof of possession by samplingrandom set of blocks from the server, which drasticallyreduces I/O costs. The verifier only maintains smallmetadata to perform the integrity checking. PDP is aninteresting remote data integrity checking model. In 2012,

    Wang proposed the security model and concrete scheme of proxy PDP in public clouds [5]. At the same time, Zhu  et al.proposed the cooperative PDP in the multicloud storage [6].

    Following Ateniese   et al.’s pioneering work, manyremote data integrity checking models and protocolshave been proposed [7], [8], [9], [10], [11], [12]. In 2008,Shacham presented the first proof of retrievability (POR)scheme with provable security [13]. In POR, the verifier cancheck the remote data integrity and retrieve the remotedata at any time. The state of the art can be found in [14],[15], [16], [17]. On some cases, the client may delegate theremote data integrity checking task to the third party. It

    results in the third party auditing in cloud computing [18],[19], [20], [21]. One of benefits of cloud storage is to enableuniversal data access with independent geographicallocations. This implies that the end devices may be mobile

    and limited in computation and storage. Efficient integritychecking protocols are more suitable for cloud clientsequipped with mobile end devices.

    1.3 Contributions

    In identity-based public key cryptography, this paperfocuses on distributed provable data possession in multi-cloud storage. The protocol can be made efficient by

    eliminating the certificate management. We propose thenew remote data integrity checking model: ID-DPDP. Thesystem model and security model are formally proposed.Then, based on the bilinear pairings, the concrete ID-DPDPprotocol is designed. In the random oracle model, our ID-DPDP protocol is provably secure. On the other hand, ourprotocol is more flexible besides the high efficiency. Basedon the client’s authorization, the proposed ID-DPDPprotocol can realize private verification, delegated verifi-cation and public verification.

    1.4 Paper Organization

    The rest of the paper is organized as follows. Section 2formalizes the ID-DPDP model. Section 3 presents our ID-DPDP protocol with a detailed performance analysis.Section 4 evaluates the security of the proposed ID-DPDPprotocol. Finally, Section 5 concludes the paper.

    2 SYSTEM  MODEL AND  SECURITY MODELOF  ID-DPDP

    The ID-DPDP system model and security definition arepresented in this section. An ID-DPDP protocol comprisesfour different entities which are illustrated in Fig. 1. Wedescribe them below:

    1.   Client: an entity, which has massive data to bestored on the multicloud for maintenance andcomputation, can be either individual consumer orcorporation.

    2.   CS  (Cloud Server): an entity, which is managed bycloud service provider, has significant storage spaceand computation resource to maintain the clients’data.

    3.   Combiner: an entity, which receives the storagerequest and distributes the block-tag pairs to thecorresponding cloud servers. When receiving thechallenge, it splits the challenge and distributesthem to the different cloud servers. When receivingthe responses from the cloud servers, it combinesthem and sends the combined response to theverifier.

    4.   PKG   (Private Key Generator): an entity, whenreceiving the identity, it outputs the correspondingprivate key.

    First, we give the definition of interactive proof system.It will be used in the definition of ID-DPDP. Then, wepresent the definition and security model of ID-DPDPprotocol.

    Definition 1 (Interactive Proof System) [22]. Let c; s :N  ! R  be functions satisfying  cðnÞ 9 sðnÞ þ ð   1 pðnÞÞ  for some polynomial   pðÞ. An interactive pair   ðP; V Þ   is called a

    Fig. 1. System model of ID-DPDP.

    WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 329

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    3/13

    interactive proof system for the language  L, with complete-ness bound cðÞ and soundness bound  sðÞ, if 

    1.   Completeness: for every   x 2  L,   Pr½hP; V iðxÞ ¼ 1 cðjxjÞ.

    2.   Soundness: for every   x 62 L   and every interactivemachine B , Pr½hB; V iðxÞ ¼ 1  sðjxjÞ.

    Interactive proof system is used in the definition of ID-DPDP, i.e., Definition 2.

    Definition 2 (ID-DPDP). An ID-DPDP protocol is a collectionof three algorithms  ðSetup; Extract; TagGenÞ and an inter-active proof system   ðProofÞ. They are described in detailbelow.

    1.   Setupð1kÞ: Input the security parameter  k, it outputs thesystem public parameters params, the master public keympk and the master secret key  msk.

    2.   Extractð1k; params; mpk; msk; IDÞ: Input the public parameters   params, the master public key   mpk, the

    master secret key msk

    , and the identity ID

     of a client, itoutputs the private key skID that corresponds to the clientwith the identity ID.

    3.   TagGenðskID; F i; PÞ: Input the private key   skID, theblock F i  and a set of  CS  P ¼ fCS  jg, it outputs the tuplefi; ðF i; T iÞg, where   i   denotes the   i-th record of metadata, ðF i; T iÞ denotes the i-th block-tag pair. Denoteall the metadata  fig as  .

    4.   ProofðP ; C ðCombinerÞ; V ðVerifierÞÞ: i s a p ro to co lamong P , C  and  V . At the end of the interactive protocol,V   outputs a bit f0j1g denoting false or true.

    Besides of the high efficiency based on the communica-

    tion and computation overheads, a practical ID-DPDPprotocol must satisfy the following security requirements:

    1. The verifier can perform the ID-DPDP protocolwithout the local copy of the file(s) to be checked.

    2. If some challenged block-tag pairs are modified orlost, the response can not pass the ID-DPDP protocoleven if  P  and  C  collude.

    To capture the above security requirements, we definethe security of an ID-DPDP protocol as follows.

    Definition 3 (Unforgeability). An ID-DPDP protocol is

    unforgeable if for any (probabilistic polynomial) adversary A (malicious CS and combiner) the probability that  A  wins theID-DPDP game on a set of file blocks is negligible. The ID-DPDP game between the adversary  A   and the challenger  C can be described as follows:

    1.   Setup: The challenger   C   runs   Setupð1kÞ   and getsð params; mpk; mskÞ. It sends the public parametersand master public key ð params; mpkÞ to A while it keepsconfidential the master secret key  msk.

    2.   First-Phase Queries: The adversary  A  adaptively makesExtract; Hash; TagGen   queries to the challenger   C   as follows:

    .   Extract   queries. The adversary   A    queries the private key of the identity   ID. By runningExtractð params; mpk; msk; IDÞ, the challenger   C 

     gets the private key  skID  and forwards it to  A . LetS 1   denote the extracted identity set in the first- phase.

    .   Hash   queries. The adversary   A   queries   hash function adaptively.   C   responds the   hash   valuesto  A .

    .   TagGen   queries. The adversary  A  makes block-tag pair queries adaptively. For a block tag query  F i, the

    challenger calculates the tag  T i  and sends it back tothe adversary. Let   ðF i; T iÞ   be the queried block-tag pair for index  i  2 I1, where   I1  is a set of indices thatthe corresponding block tags have been queried in the first-phase.

    3.   Challenge:  C  generates a challenge  chal which defines aordered collection   fID; i1; i2; . . . ; icg, where   ID

    62 S 1,fi1; i2; . . . ; icg 6 I1, and   c   is a positive integer. Theadversary is required to provide the data possession proof  for the blocks  F i1 ; . . . ; F ic .

    4.   Second-Phase Queries: Similar to the First-PhaseQueries. Let the   Extract   query identity set be   S 

    2  and

    the  TagGen query index set be I2. The restriction is thatfi1; i2; . . . ; icg 6 ðI1 [ I2Þ and  ID

    62 ðS 1 [ S 2Þ.5.   Forge: The adversary   A   responses     for the challenge

    chal.

    We say that the adversary A  wins the ID-DPDP game if the response   can pass C ’s verification.

    Definition 3.  States that, for the challenged blocks, themalicious CS  or  C  cannot produce a proof of data possessionif the blocks have been modified or deleted. On the other hand,the definition does not state clearly the status of the blocks

    that are not challenged. In practice, a secure ID-DPDP protocol also needs to convince the client that all of hisoutsourced data is kept intact with a high probability. We givethe following security definition.

    Definition 4 (ð;  Þ  Security).  An ID-DPDP protocol isð;  Þ secure if the CS corrupted  fraction of the whole blocks,the probability that the corrupted blocks are detected is atleast  .

    The notations throughout this paper are listed in Table 1.

    3 THE  PROPOSED  ID-DPDP PROTOCOLIn this section, we present an efficient ID-DPDP protocol. Itis built from bilinear pairings which will be brieflyreviewed below.

    3.1 Bilinear Pairings

    Let  G 1  and  G 2  be two cyclic multiplicative groups with thesame prime order q . Let e  :  G 1  G 1 ! G 2  be a bilinear map[25] which satisfies the following properties:

    1. Bilinearity: 8g 1; g 2; g 3  2 G 1  and  a; b 2 Z q ,

    eðg 1; g 2g 3Þ ¼ eðg 2g 3; g 1Þ ¼ eðg 2; g 1Þeðg 3; g 1Þe g 1

    a; g 2b

    ¼ eðg 1; g 2Þ

    ab:

    2. Non-degeneracy: 9g 4; g 5 2 G 1 such that eðg 4; g 5Þ 6¼ 1G 2 .

    IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015330

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    4/13

    3. Computability:   8g 6; g 7 2 G 1, there is an efficientalgorithm to calculate eðg 6; g 7Þ.

    Such a bilinear map e can be constructed by the modifiedWeil [23] or Tate pairings [24] on elliptic curves. Our ID-

    DPDP scheme relies on the hardness of CDH (Computa-tional Diffie-Hellman) problem and the easiness of DDH(Decisional Diffie-Hellman) problem. They are defined below.

    Definition 5 (CDH Problem on  G 1). Let  g  be the generatorof   G 1. Given   g; g 

    a; g b 2 G 1   for randomly chosen   a; b 2 Z q ,calculate g ab 2 G 1.

    Definition 6 (DDH Problem on G 1). Let g  be the generator of G 1. Given   ðg; g 

    a; g b; ĝ Þ 2 G 41   for randomly chosen   a; b 2 Z q ,

    decide whether  g ab ¼? ĝ .

    In the paper, the chosen group   G 1   satisfies that CDHproblem is difficult but DDH problem is easy. The DDH

    problem can be solved by making use of the bilinearpairings. Thus,   ðG 1; G 2Þ   are also defined as GDH (Gap

    Diffie-Hellman) groups.

    3.2 The Concrete ID-DPDP Protocol

    This protocol comprises four procedures:   Setup,   Extract,TagGen, and   Proof. Its architecture can be depicted inFig. 2. The figure can be described as follows:

    1. In the phase Extract, PKG creates the private key forthe client.

    2. The client creates the block-tag pair and uploads itto combiner. The combiner distributes the block-tagpairs to the different cloud servers according to the

    storage metadata.3. The verifier sends the challenge to combiner and the

    combiner distributes the challenge query to thecorresponding cloud servers according to the stor-age metadata.

    4. The cloud servers respond the challenge and thecombiner aggregates these responses from the cloudservers.

    The combiner sends the aggregated response to theverifier. Finally, the verifier checks whether the aggregatedresponse is valid.

    The concrete ID-DPDP construction mainly comes from

    the signature, provable data possession and distributedcomputing. The signature relates the client’s identity withhis private key. Distributed computing is used to store theclient’s data on multicloud servers. At the same time,distributed computing is also used to combine the multi-cloud servers’ responses to respond the verifier’s chal-lenge. Based on the provable data possession protocol [13],the ID-DPDP protocol is constructed by making use of thesignature and distributed computing.

    Without loss of generality, let the number of stored blocks be n. For different block F i, the corresponding tupleðN i; CS li ; iÞ   is also different.   F i   denotes the   i-th block.

    Denote N i  as the name of  F i. F i is stored in CS li  where li  isthe index of the corresponding CS. ðN i; CS li ; iÞ will be usedto generate the tag for the block F i. The algorithms can bedescribed in detail below.

    TABLE 1Notations and Descriptions

    Fig. 2. Architecture of our ID-DPDP protocol.

    WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 331

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    5/13

    .   Setup: Let g  be a generator of the group  G 1 with theorder   q . Define the following cryptographic hashfunctions:

    H   : f0; 1g ! Z q 

    h : f0; 1g Z q  ! G 1

    h1  : f0; 1g ! Z q :

    Let   f   be a pseudo-random function and     be apseudo-random permutation. They can be describedin detail below

    f   : Z q   f1; 2; . . . ; ng ! Z q 

     : Z q   f1; 2; . . . ; ng ! f1; 2; . . . ; ng:

    PKG picks a random number  x  2 Z q  and calculatesY  ¼ g x. The parameters fG 1; G 2;e;q;g;Y;H;h;h1; f ; gare made public. PKG keeps the master secret key  xconfidential.

    .   Extract: Input the identity ID, PKG picks r  2 Z q  and

    calculates

    R ¼  g r;   ¼  r þ xH ðID;RÞ mod q:

    PKG sends the private key skID  ¼ ðR;  Þ to the client by the secure channel. The client can verify thecorrectness of the received private key by checkingwhether the following equation holds

    g   ¼?

    RY H ðID;RÞ:   (1)

    If the formula (1) holds, the client   ID  accepts theprivate key; otherwise, the client  ID  rejects it.

    .   TagGenðskID; F; PÞ: Split the whole file   F   into   n blocks, i.e.,  F  ¼ ðF 1; F 2; . . . ; F nÞ. The client preparesto store the block  F i  in the cloud server  CSli . Then,for 1    i    n, each block F i is split into s sectors, i.e.,F i  ¼ f  ~F i1;  ~F i2; . . . ;  ~F isg. Thus, the client gets   n ssectors   f ~F ijgi2½1;n;j2½1;s. Picks   s   random   u1; u2; . . . ;us  2 G 1. Denote u  ¼ fu1; u2; . . . ; usg. For F i, the clientperforms the procedures below:

    1. The client calculates   F ij  ¼  h1ð ~F ijÞ   for everysector   ~F ij; 1   j    s.

    2. The client calculates

    T i ¼   h N i; CS li ; ið ÞYs j¼1

    uF ij j ! 

    :

    3. The client adds the record i  ¼ ði;u;N i; CS li Þ  tothe table T cl. It stores T cl  locally.

    4. The client sends the metadata table   T cl   to thecombiner. The combiner adds the records of  T clto its own metadata table  T o.

    5. The client outputs T i  and stores  ðF i; T iÞ in  CSli .

    .   ProofðP ; C; VÞ: This is a 5-move protocol amongP ¼ fCS igi2½1;n̂),   C , and   V    with the input ( public parameters, T cl). If the client delegates the verification

    task to some verifier, it sends the metadata table  T clto the verifier. Of course, the verifier may be thethird auditor or the client’s proxy. The interactionprotocol can be given in detail below.

    1. Challenge 1   ðC   VÞ: the verifier picks thechallenge   chal ¼ ðc; k1; k2Þ   where   1   c    n; k1;k2 2 Z 

    q  and sends chal to the combiner  C ;

    2. Challenge 2   ðP  CÞ: the combiner calculatesvi  ¼  k1ðiÞ; 1   i    c  and looks up the table  T o  toget the records that correspond to   fv1; v2; . . . ;vcg ¼ M1

    SM1

    S

    SMn̂.   Mi   denotes the in-

    dex set where the corresponding block-tag pair

    is stored in   CSi. Then,   C   sends   ðMi; k2Þ   toCSi  2 P .

    3. Response1 ðP ! CÞ: For CSi  2 P , it performs theprocedures below:

    .   For vl  2 Mi, CSi splits F vl into s sectors F vl  ¼f ~F vl1;  ~F vl2; . . . ;  ~F vl sg   and calculates   F vl j ¼h1ð ~F vl jÞ   for   1   j    s. (Note:   F vl j ¼   h1ð ~F vl jÞcan also be pre-computed and stored by CS).

    .   CSi  calculates al ¼  f k2 ðlÞ; vl 2 Mi  and

    T ðiÞ ¼

    Yvl2MiT vl

    al :

    .   For  1    j    s, CSi  calculates

    F̂ ðiÞ

     j   ¼X

    vl2Mi

    alF vl j:

    Denote  F̂ ðiÞ

    ¼ ð F̂ ðiÞ

    1   ; . . . ; F̂ 

    ðiÞ

    s   Þ..   CSi  sends i  ¼ ðF̂ 

    ðiÞ; T ðiÞÞ to  C .

    4. Response2   ðC !  VÞ: After receiving all theresponses from   CSi  2 P , the combiner aggre-gates figCS i2P   into the final response as

    T  ¼YCS i T 

    ðiÞ

    ;  ^F l  ¼

    XCS i

    ^F 

    ðiÞ

    l   :

    Denote  F̂  ¼ ðF̂ 1;  F̂ 2; . . . ;  F̂ sÞ. The combiner sends ¼ ð F̂ ; T Þ to the verifier  V .

    5. After receiving the response    ¼ ðF̂ ; T Þ, theverifier calculates

    vi  ¼ k1ðiÞ

    hi  ¼ h N vi ; CS lvi ; vi

    ai  ¼ f k2 ðiÞ:

    Then, it verifies whether the following formulaholds.

    eðT; g Þ ¼?

    eYci¼1

    haiiYs j¼1

    uF̂  j

     j   ; RY H ðID;RÞ

    !:   (2)

    If the formula (2) holds, then the verifier outputs‘‘success’’. Otherwise, the verifier outputs‘‘failure’’.

    3.2.1 Notes 

    In   ProofðP ; C; VÞ, the verifier picks the challenge   chal ¼

    ðc; k1; k2Þ  where k1; k2  are randomly chosen from  Z q . Based

    on the challenge   chal, the combiner determines the chal-lenged block index set as   fv1; v2; . . . ; vcg   where   vi ¼ k1 ðiÞ;1   i    c. Since   k1 ðÞ   is a pseudo-random permutation

    IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015332

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    6/13

    determined by the random   k1, the challenged   c  block-tagpairs come from the random selection of the  n  stored block-tag pairs.

    3.2.2    Correctness

    An ID-DPDP protocol must be workable and correct. Thatis, if the PKG,   C ,   V   and   P   are honest and follow thespecified procedures, the response  can pass V ’s checking.The correctness follows from below:

    eðT; g Þ ¼ eYCS i

    T ðiÞ; g  !

    ¼ eYCS i

    Yvl2Mi

    T f k2 ðlÞvl   ; g 

    !

    ¼ eYCS i

    Yvl2Mi

    hall

    Ys j¼1

    ualF vl j

     j

    !; g  

    !

    ¼ eYci¼1

    haii

    !Ys j¼1

    uF̂  j

     j   ; RY H ðID;RÞ

    !:

    3.3 Performance Analysis

    First, we analyze the performance of our proposed ID-DPDP protocol from the computation and communicationoverhead. We compare our ID-DPDP protocol with theother up-to-date PDP protocols. On the other hand, ourprotocol does not suffer from resource-consuming certifi-cate management which is require by the other existingprotocols. Second, we analyze our proposed ID-DPDPprotocol’s properties of flexibility and verification. Third,we give the prototypal implementation of the proposed ID-DPDP protocol.

    3.3.1 Computation 

    Suppose there are n message blocks which will be stored in n̂cloud servers. The block’s sector number is s. The challenged block number is   c. We will consider the computationoverhead in the different phases. On the group  G 1, bilinearpairings, exponentiation, multiplication, and the hash func-tion   h1   (the input may be large data, such as, 1 G byte)contribute most computation cost. Compared with them, thehash function  h, the operations on  Z q  and G 2  are faster, thehash function  H  can be done once for all. Thus, we do notconsider the hash functions  h and  H , the operations on Z q and  G 2. On the client, the computation cost mainly comesfrom the procedures of  TagGen   and  Verification   (i.e., the

    phase 5 in the protocol ProofðP ; C; VÞ). In the phase TagGen,the client performs  nðs þ 1Þ  exponentiation,   ns  multiplica-tion on G 1, ns hash function h1 and  n hash function h. At thesame time, for every file, the corresponding record   i   is

    stored by the client and C . These stored metadata is small. Inthe phase   Proof , in order to respond the challengechal ¼ ðc; k1; k2Þ   and generate the response   ,   P   and thecombiner  C  perform  c  exponentiation,  c 1  multiplicationon the group G 1 and cs hash function h1. In the verification of the response   ,   V    performs   c þ s þ 1   exponentiation, 2pairings,   c þ s  multiplication on the group   G 1   and   c   hashfunction   h. The exponentiation and multiplication opera-tions are more efficient than pairing [26]. Based on the testpresented in [27], [28], [29], a Tate pairing operation

    consumes about 10 ms on a platform with PIII 3.0 GHz,30 ms on a platform with Pentium D 3.0 GHz and 170 ms onan iMote2 sensor working at 416 MHz. On the other hand, in2012, Zhu   et al.   proposed the cooperative provable datapossession for integrity in multicloud storage [6]. Almost atthe same time, Zhu   et al.   proposed the dynamic auditservices for outsourced storages in clouds [21]. In 2012,Barsoum et al.  proposed a provable multicopy data posses-sion protocol [30]. These three PDP protocols are designed inthe PKI. Thus, the certification verification is necessary whenthese three PDP protocols are performed. Compared withthem, our proposed ID-DPDP scheme is more efficient in the

    computation cost. The computation comparison can besummarized in Table 2. In Table 2,   C exp   denotes the timecost of exponentiation on the group   G 1;   C mul   denotes thetime cost of multiplication on the group  G 1;  C e  denotes thetime cost of bilinear pairing;   C h1  denotes the time cost of the hash function h1. In other schemes, the sector must be inZ q . Our scheme only requires the hash function  h1’s valuelies in Z q . Thus, the hash function h1 can be used to generateless block-tag pairs for the same file. Less block-tag pairsonly incur less computation cost. It shows that our protocolcan be implemented in mobile devices which have limitedcomputation power.

    3.3.2 Communication National Bureau of Standards and ANSI X9 have deter-mined the shortest key length requirements: RSA and DSAis 1024 bits, ECC is 160 bits [31]. Based on the requirements,we give our ID-DPDP protocol’s communication overhead.In the phase   Proof, the communication overhead mainlycomes from the challenge chal and response. The block-tagpairs are uploaded once and for all. After that, the phaseProof will be performed periodically. Thus, the communi-cation overheads mainly come from the ID-DPDP queriesand responses. Suppose there are   n   message blocks arestored in the CS.  G 1  and  G 2  have the same order  q . In ID-

    DPDP query, the verifier sends the challenge   chal ¼ ðc;k1; k2Þ   to combiner, i.e., the communication overhead islog2 n þ 2log2 q . In the response, the combiner responds 1element in G 1 and s elements in Z 

    q  to the verifier. On the other

    TABLE 2Comparison of Computation Cost

    WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 333

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    7/13

    hand, Zhu  et al.   [6], Zhu  et al.   [21], and Barsoum  et al.  [30]proposed three different provable data possession scheme.Compared with these three schemes, our ID-DPDP scheme ismore efficient in the communication cost. The communicationcomparison can be summarized in Table 3. In Table 3,  1G 1denotes one element of  G 1 and 1G 2 denotes one element of G 2.

    3.3.3 Flexibility 

    In the ID-DPDP protocol, one block is split into  s  sectors.Then, the hash function   h1  will be used on these sectors.

    These hash values are shorter than the order q  of G 1. Denotek ¼  log2 q . Let   h1   be   h1  :  f0; 1g

    ~k ! f0; 1gk. This propertymakes our ID-DPDP protocol more flexible betweenstorage overhead and computation overhead. For example,for the file F , the client divides it into n blocks which will besplit into   s  sectors. The hash function   h1   will be used onthese sectors. Thus,  ~ks ¼ jF j=n. In order to save the storagespace, the client will divide F  into less blocks because every block corresponds to one block-tag pair. Thus,  s ¼ jF j=n~kwill increase when   n  decreases. The picked public para-meters   u j; 1   j    s  will become more. These parameterswill incur more computation cost with less storage cost.And vice versa. When   s   decreases,   n   increases. Thecomputation cost will decrease along with the storagecost increases. Thus, our ID-DPDP protocol has theflexibility to adjust the computation cost and storage cost.

    3.3.4 Private Verification, Delegated Verification, and 

    Public Verification 

    Our proposed ID-DPDP protocol satisfies the privateverification and public verification. In the verificationprocedure, the metadata in the table   T cl   and   R   areindispensable. Thus, it can only be verified by the clientwho has   T cl   and   R   i.e., it has the property of privateverification. On some cases, the client has no ability to

    check its remote data integrity, for example, he takes part inthe battle in the war. Thus, it will delegate the third party toperform the ID-DPDP protocol. The third party may be thethird auditor or the proxy or other entities. The client willsend T cl and  R to the consignee. The consignee can performthe ID-DPDP protocol. Thus, it has the property of delegated verification. On the other hand, if the clientmakes T cl  and  R  public, every entity can perform the ID-DPDP protocol by himself. Thus, it has also the property of public verification.

    3.3.5 Notes 

    In the verification, the verifier must have   T cl   and   R.   T clstores the metadata which has no any information on theprivate key   ðR;  Þ. On the other hand, even if   R   is madepublic, the private key    still keeps secret. The private key

    extraction process Extract   is actually a modified ElGamalsignature scheme which is existentially unforgeable. Forthe identity ID, the corresponding private key   ðR;  Þ   is asignature on ID. Since it is existentially unforgeable, theprivate key      still keeps secret even if  R   is made public.Thus, the client can generate the tags for different files withthe same private key   even if it makes T cl  and  R  public.

    3.3.6 Simulation 

    In order to study its prototypal implementation, we havesimulated the proposed ID-DPDP protocol by using C

    programming language with GMP Library (GMP-5.0.5)[33], Miracl Library [34] and PBC Library (pbc-0.5.13) [35].In the simulation, CS works on an ASUS S56C Laptop withthe following settings:

    .   CPU: Intel Core [email protected] GHz;

    .   Physical Memory: 4 GB DDR3 1600 MHz;

    .   OS: Ubuntu 13.04 Linux 3.8.0-19-generic SMP i686.

    The client works on an PC Laptop with the followingsettings:

    .   CPU: CPU I PDC E6700 3.2 GHz;

    .

      Physical Memory: DDR3 2G;.   OS: Ubuntu 11.10 over VMware-workstation-

    full-8.0.0.

    In the simulation, we choose an elliptic curve with 160-bitgroup order, which provides a competitive security levelwith 1024-bit RSA. The certification verification schemecomes from the IEEE P1363 (Hybrid Schemes) [36].

    In order to describe our ID-DPDP protocol’s computa-tion performance, we compare our ID-DPDP with Zhu’sscheme [6]. Fig. 3 depicts the comparison on computationcost for TagGen based on the sector number  s  ¼  20. X-labeldenotes the block number  n  and Y-label denotes the time

    cost (second) in order to generate   n   tags. It is easilyobserved that our ID-DPDP protocol becomes moreefficient along with the block number   n   increases. Fig. 4depicts the comparison on computation cost for Verify

    TABLE 3Comparison of Communication Cost (Bits)

    Fig. 3. Comparison on computation cost for TagGen based on  s  ¼  20.

    IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015334

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    8/13

     based on the same sector number   s ¼  20. X-label denotesthe challenged block number c and Y-label denotes the timecost (second) in order to verify the response. Fig. 5 depictsthe comparison on CS computation cost for response basedon the sector number s = 20 and the CS number be n̂ ¼  10,i.e.,  jPj ¼  n̂. X-label denotes the challenged block numberc   and Y-label denotes the time cost (second) in order tocreate the response. They also show that our ID-DPDPscheme is more efficient than Zhu’s scheme [6]. Then, wedescribe our ID-DPDP protocol’s communication perfor-mance by comparing our ID-DPDP with Zhu’s scheme [6].Fig. 6 depicts the comparison on communication cost for

    challenge based on the challenged block number   c ¼  10.X-label denotes the whole block number   n   and Y-labeldenotes the communication cost (bit) of challenge. Fig. 7depicts the comparison on communication cost for re-sponse. X-label denotes the sector number   s  and Y-labeldenotes the communication cost (bit) of response from thecombiner. From Figs. 6 and 7, our ID-DPDP has shortercommunication length.

    In order to show the feasibility of our proposedscheme in concrete terms, we assume that the clientwill store 50 Tbyte data to CS. The order  q  of  G 1  and  G 2   is160 bit. We take use of the hash function  h  which comesfrom the [37]. Let the hash functions h1  be SHA 1 whichmaps 1 G byte to 160 bit. Let the hash function  H  be alsoSHA 1. Let the sector number s  be 1. Thus, 50 Tbyte datacan be split into 50 1024 blocks. The elliptic curve G 1 is theType A whose order is 160 bit and point size is 128 byte[35]. In the phase   TagGen, the whole computation cost is501024 ð2C exp þ C mul þ C h þ C h1Þ. Based on the 100 sim-ulation, the average computation cost is 17.5535 hours.

    Now, there are 50 1024 ¼  101200 blocks are stored in CS.Suppose there are 1000 block-tag pairs are corrupted. If 230 block-tag pairs are challenged, the corrupted block-tagpairs can be detected with the probability at least 90 percent(Theorem 2). Thus, the challenge is   chal ¼ ð230; k1; k2Þ. Inthe response, CS will perform c  ¼  230  exponentiation andc 1 ¼  229  multiplication on the group   G 1   (We omit the

    Fig. 4. Comparison on computation cost for verify based on  s  ¼ 20.

    Fig. 5. Comparison on computation cost for CS’s response based ons ¼  20,  n̂ ¼  10.

    Fig. 6. Comparison on communication cost for Chal based on  c  ¼  10.

    Fig. 7. Comparison on communication cost for response.

    WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 335

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    9/13

    hash function h1  since it can be pre-computed). The wholetime cost is 0.5 s. In the verification, the client will perform 2pairings,   c þ 1 ¼  231  exponentiation,   c þ 1 ¼  231   additionon the group G 1 and 1 hash function H . The whole time costis 1.08 s. On the other hand, we consider the communicationcost. For 50 Tbyte data, the whole block-tag pairs are50 10244 þ 50 1024 128   byte. The redundancy rate is1:2 107. In the local area network with the bandwidth

    1000 Mb/s, in order to upload these block-tag pairs, the timecost is 116.509 hours. The challenge size is   log 2ð101200Þþ160 þ 160 ¼ 337 bit and the response size is ð1G þ 128Þ 8 ¼8:6 109  bit. The whole time cost is 8.2 s. From thecomputation technology and communication technology,our proposed ID-DPDP scheme is feasible.

    4 SECURITY ANALYSIS

    The security of our ID-DPDP protocol mainly consists of the correctness and unforgeability. The property of correctness has been shown in the Section 3.2. On the other

    hand, we study the universal unforgeability. It means thatthe attacker includes all the cloud servers   P   and thecombiner   C . If our ID-DPDP protocol is universallyunforgeable, it can also prevent the collusion of maliciouscloud servers   P   and   C . Our proof comprises two parts:single block-tag pair is universally unforgeable; theresponse is universally unforgeable.

    Definition 7.  A forger  A ðt;;QH ; Qh; Qh1 ; QE ; QT Þ-breaks asingle tag scheme if it runs in time at most t ; A makes at mostQH  queries to the hash function H , at most Qh  queries to thehash function h, at most Qh1  queries to the hash function  h1,at most QE  queries to the extraction oracle  Extract, at mostQT  queries to the tag oracle TagGen ; and the probability thatA   forges a valid block-tag pair is as least   . A single tagscheme is   ðt;;QH ; Qh; Qh1 ; QE ; QT Þ-existentially unforge-able under an adaptive chosen-message attack if no forgerðt;;QH ; Qh; Qh1 ; QE ; QT Þ-breaks it.

    The following Lemma 1 shows that the single tagscheme is secure.

    Lemma 1.  Let  ðG 1; G 2Þ be a ðt0; 0Þ-GDH group pair of order  q .T he n t he t ag s ch em e o n   ðG 1; G 2Þ   i s   ðt;;QH ; Qh;Qh

    1

    ; QE ; QT Þ-secure against existential forgery under anadaptive chosen-block attack (in the random oraclemodel) for all   t   and     satisfying   0  1=9   and   t0 23ðQH þQhÞðtþOðQH ÞþOðQE ÞþOðQhÞþOðQh1 ÞþOðQT ÞÞ

     ð1ÞQE ð1 ÞQT   , where   0   G ;

       G  1. Based on the difficulty of the CDH problem, our singletag scheme is secure.

    Proof. Let the stored index-block pair set be fð1; F 1Þ; ð2; F 2Þ;. . . ; ðn; F nÞg. For  1   i    n, each block  F i   is split into  ssectors, i.e.,   F i  ¼ f ~F i1;  ~F i2; . . . ;  ~F isg. Then, the hashfunction   h1   is used on these sectors, i.e.,  F ij  ¼  h1ð ~F ijÞ.Then, we get the hash value set  fF ijg. Let the selected

    identity set be I ¼ fID1; ID2; . . . ; ID~ng. Weshow how toconstruct a  t0-algorithm B  that solves CDH problem onG 1  with probability at least  

    0. This will contradict thefact that ðG 1; G 2Þ are GDH group pair.

    Algorithm   B   is given   ðg; g a; g bÞ 2 G 31. Its goal is tooutput   g ab 2 G 1. Algorithm   B  simulates the challengerand interacts with forger  A  as follows.

    Setup. Algorithm B starts by setting the master publickey as  Y   ¼ g a while a  keeps unknown. It initializes thethree tables Tab1, Tab2, and Tab3. Then, it picks the twoparameters   and   from the interval (0, 1).

    H-Oracle. At any time algorithm   A   can query the

    random oracle H . To respond to these queries, algorithm B maintains a list of tuples ðID j; R j; H  jÞ as explained below.We refer to this list asTab1. When A queries the oracle H  atthe pair ðID j; R jÞ, algorithm B  responds as follows:

    1. If   ðID j; R j; Þ 2 Tab1, then algorithm B  retrieves thetuple   ðID j; R j; H  jÞ   and responds with   H  j, i .e.,H ðID j; R jÞ ¼ H  j.

    2. Otherwise, B  picks a random H  j  2 Z q . Then, it addsthe tuple ðID j; R j; H  jÞ to  Tab1 and responds to  A  bysetting H ðID j; R jÞ ¼ H  j.

    Extract-Oracle. At any time algorithm A  can query the

    oracle Extract. To respond to these queries, algorithm  B maintains a list of tuples  ðci; IDi; Ri; H i;  iÞ  as explained below. We refer to this list as Tab2. When A  queries theoracleExtract attheidentityIDi  2 I ,algorithm B picksabitci  2 f0; 1g according to the bivariate distribution function:Pr½ci  ¼  0 ¼  , Pr½ci  ¼  1 ¼  1 . Based on ci, B  respondsas follows:

    1. If    ci  ¼  1,   B   independently picks the random i; H i  2 Z q  and calculates Ri  ¼  g 

     i Y H i .

    .   If  ðIDi; Ri; Þ 62 Tab1, then algorithm  B  adds thetuple ðIDi; Ri; H iÞ to  Tab1 and the tuple ðci; IDi;

    Ri; H i;  iÞ to  Tab2..   If  ðIDi; Ri; Þ 2 Tab1, then algorithm B  picks the

    other random    i; H i  2 Z q   and calculates   Ri  ¼g  i Y H i until ðIDi; Ri; Þ 62Tab1. Then, algorithmB   adds the tuple   ðIDi; Ri; H iÞ   to  Tab1   and thetuple ðci; IDi; Ri; H i;  iÞ to  Tab2.

    .   Finally, algorithm B  responds with ðRi;  iÞ.

    2. If   ci  ¼  0, algorithm B  independently picks a randomRi  2 G 1  and calculates  H i  ¼  H ðIDi; RiÞ  which satis-fies  ðIDi; Ri; Þ 62 Tab1   (otherwise, picks another  Riand calculates   H ðIDi; RiÞ). Then,  B  adds the tupleðID

    i; R

    i; H 

     to Tab

    1 and the tuple ðc

    i; ID

    i; R

    i; H 

    i;  

    to  Tab2, where  i  ¼?. B  fails.

    h-Oracle. At any time algorithm   A   can query therandom oracle h. To respond to these queries, algorithmB maintains a list of tuples ðN i; CS li ; i ;z  i; bi; d iÞ as explained below. We refer to this list as Tab3. When A  queries theoracle h at the tuple ðN i; CS li ; iÞ, B  responds as follows:

    1. If    ðN i; CS li ; i ; z  i; bi; d iÞ 2 Tab3   and   1   i    n, thenalgorithm   B   responds with   z i

    Qs j¼1 u

    F ij j   , i . e .,

    hðN i; CS li ; iÞ ¼ z iQs

     j¼1 uF ij

     j   .2. Otherwise, algorithm B  picks a random bi  2 Z 

    q  and

    a random coin  d i  2 f0; 1g according to the bivariate

    distribution Pr½d i  ¼ 0 ¼  , Pr½d i  ¼ 1 ¼ 1  .

    .   If  d i  ¼  1, algorithm B calculates z i  ¼  g bi . If  d i  ¼ 0,

    algorithm B  calculates z i  ¼ ðg bÞ

    bi .

    IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015336

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    10/13

    .   Algorithm B  adds the tuple  ðN i; CS li ; i ; z  i; bi; d iÞto   Tab3   and responds with   z i

    Qs j¼1 u

    F ij j   , i.e.,

    hðN i; CS li ; iÞ ¼ z iQs

     j¼1 uF ij

     j   .

    h1-Oracle. Upon receiving the query  F̂ ,   B   takes arandom value from Z q  and responds it to  A . It is a realhash function.

    TagGen-Oracle. Let ðN i; CS li ; i ;F iÞ be a tag query issued

     by A . Algorithm B  responds to this query as follows:

    1. Algorithm B  runs the above algorithm for respond-ing to h-queries to obtain hðN i; CS li ; iÞ. Let ðN i; CS li ;i; z i; bi; ciÞ be the corresponding tuple in Tab3. If  d i  ¼0, then algorithm B  reports failure and terminates.

    2. If   d i  ¼ 1, define  i  ¼ ðg aÞbi . Observe that T i  ¼ ðg aÞ

    bi ¼ðg bi Þ

    a¼ ðhðN i; CS li ; iÞ

    Qs j¼1 u

    F ij j   Þ

    aand therefore T i is a

    valid tag on  F i  under the public key  ðg a; u1; ; usÞ.Algorithm B  gives T i  to algorithm A .

    Output. Eventually, on behalf of the client   ID  whoseprivate keyis ðR;  Þ, algorithmA producesa block-tag pair

    ðF w; T wÞ such that no tag query was issued for ðN w; CS lw ;w; F wÞ. If ðN w; CS lw ; w; Þ 62 Tab3, B issues a query itself forhðN w; CS lw ; wÞ   to ensure that such a tuple exists. Weassume T w is a valid tag on F w under the given public key;if it is not, B  reports failure and terminates. Next, B  findsthe tuple   ðN w; CS lw ;j ;z  w; bw; d wÞ   in   Tab3. If   d w ¼  1,   B reports failure and terminates. Otherwise, d w ¼  0, ðF w; T wÞsatisfies the following verification equation

    ðT w; g Þ ¼ e h N w; CS lw ; wð ÞYs j¼1

    uh1ðF wjÞ

     j   ; RY H ðID;RÞ

    !:

    This completes the description of algorithm B . It remainsto show that   B  solves the given instance of the CDHproblem with a high probability.

    Breaking CDH: Based on the oracle-replay technique[32], B  can get another block-tag pair  ðF w;  T̂ wÞ  on the same block F w by using different hash functions ĥ and  Ĥ . Then, wecan get eðT̂ w; g Þ ¼eðĥðN w; CS lw ; wÞ

    Qs j¼1 u

    h1ðF wjÞ j   ;RY 

    H ðID;RÞÞ.A c co r di n g t o t h e s i mu l at i on ,   B    k no w s t h e

    corresponding  bw; b̂w  that satisfy

    h N w; CS lw ; wð Þ ¼ ðg bÞ

    bwYs j¼1

    uh1ðF wjÞ

     j

    ĥ N w; CS lw ; wð Þ ¼ ðg bÞ^bwYs j¼1

    uh1ðF wjÞ j   :

    Thus, we can get

    eðT w; g Þ ¼ e   ðg bÞ

    bw; RY H ðID;RÞ

    eðT̂ w; g Þ ¼ e   ðg bÞ

    b̂w; RY 

     Ĥ ðID;RÞ

    e T w T̂ wbwb̂w ; g 

    ¼ e   ðg bÞ

    bw; Y H ðID;RÞ

     Ĥ ðID;RÞ

    :

    Finally, we get

    g ab ¼   T w T̂ wbwb̂w

      1bw   H ðID;RÞ Ĥ ðID;RÞð Þ

    :

    Probability Analysis. To evaluate the success probabil-ity for algorithm B , we analyze the four events neededfor  B  to succeed:

    .   E 1: B  does not abort as a result of any of  A ’s Extractqueries.

    .   E 2:   B   does not abort as a result of any of   A ’s tagqueries.

    .   E 3: A  generates a valid block-tag forgery ðF w; T wÞ.

    .   E 4: Event  E 3,   cw  ¼  0   for the tuple containing   F w   inTab2 and d w  ¼  0  for the tuple containing  F w in  Tab3.

    B   succeeds if all of these events happen. Theprobability Pr½E 1 ^ E 2 ^ E 3 ^ E 4 decomposes as

    Pr½E 1 ^ E 2 ^ E 3 ^ E 4

    ¼ Pr½E 1 Pr½E 2jE 1 Pr½E 3jE 1; E 2 Pr½E 4jE 1; E 2; E 3

    ¼ ð1 ÞQE  ð1  ÞQT  :

    Thus, in  B ’s simulation environment,  A  can forge avalid block-tag pair with probability

    ̂   P 1 ¼   ð1 ÞQE ð1  ÞQT :

    Algorithm B ’s running time is the same as A ’s runningtime plus the time which is taken to respond to   QH   H queries,  Qh  h queries, Qh1   h1   queries, QE  Extract  queriesand QT  tag queries. Hence, the total running time is at mostt̂   t þ OðQH Þ þ OðQE Þ þ OðQhÞ þ OðQh1Þ þ OðQT Þ. B ytaking use of the oracle replay technique [32],  A   can gettwo different tags on the same block and randomness withthe probability0  1=9withinthetime t0 23ðQH þQhÞt̂̂

    1,

    i.e.,   t0  23ðQH þQhÞðtþOðQH ÞþOðQE ÞþOðQhÞþOðQh1 ÞþOðQT ÞÞ

     ð1ÞQE ð1 ÞQT    . After

    obtaining the two different tags on the same block andrandomness, algorithm B  can break the CDH problem. Itcontradicts the assumption that  ðG 1; G 2Þ  is a GDH grouppair. This completes the proof.   Ì

    Lemma 1 states that the untrusted CS has no ability toforge the single tag. But, can the untrusted CS forge theaggregated block-tag pair to cheat the verifier? First, when allthe stored block-tag pairs are challenged, we study whetherthe untrusted CS can aggregate the fake block-tag pairs tocheat the verifier. It corresponds to Lemma 2. Second, whensome stored block-tag pairs are not challenged, we study

    whether the untrusted CS can substitute the valid unchal-lenged block-tag pairs for the modified block-tag pairs tocheat the verifier. It corresponds to Lemma 3.

    Lemma 2. If all the stored block-tag pairs are challengedand some block-tag pairs are modified, the malicious CSaggregates the fake block-tag pairs which are different fromthe challenged block-tag pairs. The combined block-tag pairðF̂ ; T Þ   (i.e., response) only can pass the verification withnegligible probability.

    Proof. We can use proof by contradiction to prove Lemma 2.For the challenge   chal ¼ ðc; k1; k2Þ, the following para-

    meters can be calculated

    vi ¼ k1 ðiÞ; hi  ¼  h N vi ; CS lvi ; vi

    ; ai  ¼ f k2 ðiÞ:

    WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 337

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    11/13

    Assume the combined block-tag pair   ð F̂ ; T Þ   is forgedand it canpass theverification, where  F̂  ¼ ð  F̂ 1; . . . ;  F̂ sÞ, i.e.,

    eðT; g Þ ¼ eYci¼1

    haiiYs j¼1

    uF̂  j

     j   ; RY H ðID;RÞ

    !

    ¼ e

    Yc

    i¼1

    haii Ys

     j¼1

    u

    Pci¼1

     ai F̂ vi j j   ; RY 

    H ðID;RÞ

    !:

    Then,

    Yci¼1

    e T vi ; g ð Þai ¼

    Yci¼1

    e hiYs j¼1

    uF̂ vi j

     j   ; RY H ðID;RÞ

    !ai:

    Let d  be a generator of  G 2. There exist xi; yi  2 Z q  withthe following properties:

    e T vi ; g ð Þ ¼ d xi ; e hi

    Ys j¼1

    uF̂ vi j

     j   ; RY H ðID;RÞ

    !¼ d yi :

    We can get

    d Pc

    i¼1 aixi ¼ d 

    Pci¼1

     ai yi

    Xci¼1

    aiðxi  yiÞ ¼ 0:   (3)

    Since the single tag is existentially unforgeable (Lemma1), there exist at least two different indices   i   such thatxi  6¼ yi. Suppose there exist  s   c  index pairs ðxi; yiÞ withthe property   xi  6¼ yi. Then, there exist   q 

    s1 tuplesða1; a2; . . . ; acÞ which satisfy the above equation (3). Sinceða1; a2; . . . ; acÞ  is a random vector, the equation (3) holdsonly with the probability less than q s1=q c  q c1=q c ¼ q 1.It is negligible. Thus, if some fake block-tag pairs areaggregated, the combined block-tag pair  ð  F̂ ; T Þ  only canpass the verification with negligible probability.   Ì

    When some modified block-tag pairs are challenged andsome valid block-tag pairs are unchallenged, the maliciousCS still can not cheat the verifier.

    Lemma 3.  If some challenged block-tag pairs are modified, themalicious CS substitutes the other valid block-tag pairs (whichare not challenged) for the modified block-tag pairs (which arechallenged). The combined block-tag pair ð F̂ ; T Þ (i.e., response)

    only can pass the verification with negligible probability.

    Proof. Define the modified block-tag pair index set as  Mand the valid block-tag index set as   M. Let the verifier’sc ha ll eng e b e   chal ¼ ðc; k1; k2Þ   and   S ¼ fk1 ð1Þ; . . . ;k1ðcÞg. For the CS set   P , suppose some challenged block-tag pairs   fðF l; T lÞ; l 2 S 1  Sg   are modified, i.e.,S 1 ¼ S 

    TM 6¼ F .   F    denot es t he empt y set. T he

    corresponding CSs (whose modified block-tag pairsare challenged) substitute the other valid block-tag pairsfðF ̂l; T ̂lÞ; l̂ 2 S 2  

      Mg   (which are not challenged) forfðF l; T lÞ; l 2 S 1g where jS 1j ¼ jS 2j. By making use of theforged i  ¼ ðF̂ 

    ðiÞ; T ðiÞÞ, the combined response  ¼ ðF̂ ; T Þ

    only can pass verification with negligible probability.

    If the challenged block-tag pairs   ðF l; T lÞ; l 2 S 1   aremodified,  P  substitutes the other valid block-tag pairs

    ðF ̂l; T ̂lÞ for them. For 1    i    c, the following parametersare calculated

    ai  ¼  f k2 ðiÞ; vi  ¼  k1 ðiÞ; hi  ¼  h N vi ; CS lvi ; vi

    :

    Based on the forgery process, we know the forgedresponse   ¼ ðF̂ ; T Þ satisfies the formulas below

    T  ¼ Yvi2S T

      M

    T aivi Yvi2S 2

    T aivi

    F̂  j ¼X

    vi2S T

      M

    aiF vi j þX

    vi2S 2

    aiF vi j;   1   j    s

    where  F̂  ¼ ð  F̂ 1;  F̂ 2; . . . ;  F̂ sÞ.If the forged response    can pass the verification, then

    eðT; g Þ ¼ eYci¼1

    haiiYs j¼1

    uF̂  j

     j   ; RY H ðID;RÞ

    !i.e.,

    eY

    vi2S T

      MT aivi

    Yvi2S 2

    T aivi ; g 0B@ 1CA

    ¼ eY

    vi2S T

      M

    haiiYs j¼1

    u

    Pvi2S T

      MaiF vi j

     j

    0B@

    1CA

    0B@

    Y

    vi 2S 1

    haiiYs j¼1

    u

    Pvi 2S 2

    ai F vi j

     j

    !; RY H ðID;RÞ

    !:

    (4)

    For the index in the set S T   M, we can get

    eY

    vi2S T

      M

    T aivi ; g 0B@ 1CA

    ¼ eY

    vi2S T

      M

    haiiYs j¼1

    u

    Pvi 2S 

    T  M

    aiF vi j

     j   ; RY H ðID;RÞ

    0B@

    1CA:(5)

    According to the formulas (4), (5), we can get theformula below

    e Yvi2S 2

    T aivi

    ; g  !¼ e Yvi2S 1

    haii Ys

     j¼1

    uPvi 2S 2 aiF vi j j   ; RY H ðID;RÞ !:(6)

    On the other hand, we can get the following formula by making use of the tag scheme

    eY

    vi2S 2

    T aivi ;g 

    !¼ e

    Yvi2S 2

    haiiYs j¼1

    u

    Pvi 2S 2

    aiF vi j

     j   ; RY H ðID;RÞ

    !:   (7)

    From the formulas (6), (7), we can get

    Yvi2S 1 haii   ¼ Yvi 2S 2 h

    aii   :   (8)

    Since   h   is a cryptographic hash function which iscollision-free, the probability that the equation (8) holds

    IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015338

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    12/13

    is  1=q , i.e., it is negligible. The combined block-tag pairð F̂ ; T Þ (i.e., response) only can pass the verification with

    negligible probability.   Ì

    Thus, from the above three lemmas (Lemma 1, Lemma 2,Lemma 3), we have the following claim.

    Theorem 1. The proposed ID-DPDP protocol is existentiallyunforgeable in the random oracle model if the CDH  problem on   G 1   is hard.

    Proof. Let the stored index-block pair set be fð1; F 1Þ; ð2; F 2Þ;. . . ; ðn; F nÞg   and the CS set be   P . For   1    i   n, each block   F i   is split into   s   sectors, i.e.,   F i  ¼ f ~F i1;  ~F i2; . . . ;~F isg. Then, the hash function   h1   is used on these

    sectors, i.e., F ij  ¼  h1ð~

    F ijÞ. Then, we get the hash value setfF ijg. Let the selected identity set be   I ¼ fID1; ID2;. . . ; ID~ng. Define the challenger as B and the adversary asA . Algorithm   B   is given   ðg; g a; g bÞ 2 G 1. Its goal is tooutput  g ab 2 G 1.

    The Setup; ExtractOracle; HOracle; hOracle, h1Oracle; TagGen Oracle procedures are the same as thecorresponding procedures in the proof of Lemma 1.

    Let the challenge be   chal ¼ ðc; k1; k2Þ  and the forgedresponse be    ¼ ðF̂ ; T Þ  where  F̂  ¼ ð  F̂ 1;  F̂ 2; ;  F̂ sÞ. As-sume that   can pass the verification, i.e.,

    eðT; g Þ ¼ eYci¼1 h

    ai

    iYs j¼1 u

    F̂  j

     j   ; RY H ðID;RÞ !

      (9)

    where hi  ¼  hðN k1 ðiÞ; CS lk1 ðiÞ; k1 ðiÞÞ, ai  ¼  f k2ðiÞ, 1    i    c.

    According to Lemma 1 and Lemma 2, if somechallenged block-tag pairs are modified, the verificationformula (9) only holds with negligible probability. Sincethe verifier does not store block-tag pairs,   A   cansubstitute the other valid block-tag pairs for themodified block-tag pairs. According to Lemma 3, thesubstituted response only can pass the verification withnegligible probability, i.e., the formula (9) only holdswith negligible probability.

    Thus, in the random oracle, except with negligibleprobability no adversary can cause the verifier to acceptthe challenged response if some challenged block-tagpairs are modified. Theorem 1 is proved.   Ì

    Theorem 2.  Suppose that  n  block-tag pairs are stored,   d  block-tag pairs are modified and  c   block-tag pairs are challenged.T h e n , o u r p r o p o s e d I D - D P D P p r o t o c o l i sðd=n; 1 ððn  d Þ=nÞcÞ-secure, i.e.,

    1   n  d 

    n

    c P X   1

      n c þ 1  d 

    n c þ 1

    c

    where   P X    denotes the probability of detecting themodification.

    Proof. Let the challenged block-tag pair set be  S  and themodified block-tag pair set be   M. Let the randomvariable   X   be   X  ¼ jS 

    TMj. According to the defini-

    tion of   P X , we can get

    P X  ¼ P fX    1g

    ¼ 1 P fX  ¼  0g

    ¼ 1 n  d 

    n

    n 1  d 

    n 1 

    n c þ 1  d 

    n c þ 1  :

    Thus, we can get

    1   n  d 

    n

    c P X   1

      n c þ 1  d 

    n c þ 1

    c:

    This completes the proof of the theorem.   Ì

    Fig. 8 shows the probability curve of P X . From the picture,we know that our protocol has high integrity checkingprobability.

    5 CONCLUSION

    In multicloud storage, this paper formalizes the ID-DPDPsystem model and security model. At the same time, wepropose the first ID-DPDP protocol which is provablysecure under the assumption that the CDH problem ishard. Besides of the elimination of certificate management,our ID-DPDP protocol has also flexibility and highefficiency. At the same time, the proposed ID-DPDPprotocol can realize private verification, delegated verifi-cation and public verification based on the client’sauthorization.

    ACKNOWLEDGMENT

    The author sincerely thanks the Editor for allocatingqualified and valuable referees. The author sincerely thanksthe anonymous referees for their very valuable comments.This work was partly supported by the National NaturalScience Foundation of China (No. 61272522) and by by theOpen Foundation of State Key Laboratory of InformationSecurity of China (No. 2013-3-4).

    REFERENCES

    [1] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner,

    Z. Peterson, and D. Song, ‘‘Provable Data Possession atUntrusted Stores,’’ in  Proc. CCS, 2007, pp. 598-609.[2] G. Ateniese, R. DiPietro, L.V. Mancini, and G. Tsudik, ‘‘Scalable

    and Efficient Provable Data Possession,’’ in   Proc. SecureComm,2008, pp. 1-10.

    Fig. 8. Probability curve P X  of detecting the corruption.

    WANG: IDENTITY-BASED DISTRIBUTED PROVABLE DATA POSSESSION IN MULTICLOUD STORAGE 339

  • 8/20/2019 CSE Identity-Based Distributed Provable Data

    13/13

    [3] C.C. Erway, A. Kupcu, C. Papamanthou, and R. Tamassia,‘‘Dynamic Provable Data Possession,’’ in   Proc. CCS, 2009,pp. 213-222.

    [4] F. Sebé, J. Domingo-Ferrer, A. Martı́nez-Ballesté, Y. Deswarte,and J. Quisquater, ‘‘Efficient Remote Data Integrity Checkingin Critical Information Infrastructures,’’   IEEE Trans. Knowl.Data Eng., vol. 20, no. 8, pp. 1034-1038, Aug. 2008.

    [5] H.Q. Wang. (2013, Oct./Dec.). Proxy Provable Data Possession inPublic Clouds. IEEE Trans. Serv. Comput.   [Online]. 6(4), pp. 551-559. Available: http://doi.ieeecomputersociety.org/10.1109/

    TSC.2012.35.[6] Y. Zhu, H. Hu, G.J. Ahn, and M. Yu, ‘‘Cooperative Provable DataPossession for Integrity Verification in Multicloud Storage,’’IEEE Trans. Parallel Distrib. Syst., vol. 23, no. 12, pp. 2231-2244,Dec. 2012.

    [7] Y. Zhu, H. Wang, Z. Hu, G.J. Ahn, H. Hu, and S.S. Yau, ‘‘EfficientProvable Data Possession for Hybrid Clouds,’’ in  Proc. CCS, 2010,pp. 756-758.

    [8] R. Curtmola, O. Khan, R. Burns, and G. Ateniese, ‘‘MR-PDP:Multiple-Replica Provable Data Possession,’’ in   Proc. ICDCS,2008, pp. 411-420.

    [9] A.F. Barsoum and M.A. Hasan, ‘‘Provable possession andreplication of data over cloud servers,’’ Centre Appl. Cryptogr.Res., Univ. Waterloo, Waterloo, ON, Canada, Rep. 2010/32.[Online]. Available: http://www.cacr.math.uwaterloo.ca/techreports/2010/cacr2010-32.pdf .

    [10] Z. Hao and N. Yu, ‘‘A Multiple-Replica Remote Data PossessionChecking Protocol with Public Verifiability,’’ in   Proc. 2nd Int.Symp. Data, Privacy, E-Comm., 2010, pp. 84-89.

    [11] A.F. Barsoum and M.A. Hasan, ‘‘On verifying dynamic multipledata copies over cloud servers,’’ Int. Assoc. Cryptol. Res., NewYork, NY, USA, IACR eprint Rep. 447, 2011. [Online]. Available:http://eprint.iacr.org/2011/447.pdf .

    [12] A. Juels and B.S. Kaliski, Jr., ‘‘PORs: Proofs of Retrievability forLarge Files,’’ in  Proc. CCS, 2007, pp. 584-597.

    [13] H. Shacham and B. Waters, ‘‘Compact Proofs of Retrievability,’’in Proc. ASIACRYPT , vol. 5350, LNCS, 2008, pp. 90-107.

    [14] K.D. Bowers, A. Juels, and A. Oprea, ‘‘Proofs of Retrievability:Theory and Implementation,’’ in Proc. CCSW , 2009, pp. 43-54.

    [15] Q. Zheng and S. Xu, ‘‘Fair and Dynamic Proofs of Retrievability,’’in Proc. CODASPY , 2011, pp. 237-248.

    [16] Y. Dodis, S. Vadhan, and D. Wichs, ‘‘Proofs of Retrievability viaHardness Amplification,’’ in   Proc. TCC, vol. 5444,   LNCS, 2009,pp. 109-127.

    [17] Y. Zhu, H. Wang, Z. Hu, G.J. Ahn, and H. Hu, ‘‘Zero-KnowledgeProofs of Retrievability,’’   Sci. Chin. Inf. Sci., vol. 54, no. 8,pp. 1608-1617, Aug. 2011.

    [18] C. Wang, Q. Wang, K. Ren, and W. Lou, ‘‘Privacy-PreservingPublic Auditing for Data Storage Security in Cloud Computing,’’in Proc. IEEE INFOCOM, Mar. 2010.

    [19] Q. Wang, C. Wang, K. Ren, W. Lou, and J. Li, ‘‘Enabling PublicAuditability and Data Dynamics for Storage Security in CloudComputing,’’   IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 5,pp. 847-859, May 2011.

    [20] C. Wang, Q. Wang, K. Ren, N. Cao, and W. Lou, ‘‘Toward Secureand Dependable Storage Services in Cloud Computing,’’   IEEETrans. Serv. Comput., vol. 5, no. 2, pp. 220-232, Apr./June 2012.

    [21] Y. Zhu, G.J. Ahn, H. Hu, S.S. Yau, H.G. An, and S. Chen. (2013,Apr./June). Dynamic Audit Services for Outsourced Storages inClouds.   IEEE Trans. Serv. Comput.  [Online].   6(2), pp. 227-238.Available: http://doi.ieeecomputersociety.org/10.1109/TSC.2011.51.

    [22] O. Goldreich,  Foundations of Cryptography: Basic Tools . Beijing,China: Publishing House of Electronics Industry, 2003,pp. 194-195.

    [23] D. Boneh and M. Franklin, ‘‘Identity-Based Encryption from theWeil Pairing,’’ in   Proc. CRYPTO, vol. 2139,   LNCS, 2001,pp. 213-229.

    [24] A. Miyaji, M. Nakabayashi, and S. Takano, ‘‘New ExplicitConditions of Elliptic Curve Traces for FR-Reduction,’’   IEICETrans. Fundam., vol. E84A, no. 5, pp. 1234-1243, May 2001.

    [25] D. Boneh, B. Lynn, and H. Shacham, ‘‘Short Signatures from theWeil Pairing,’’ in   Proc. ASIACRYPT , vol. 2248,   LNCS, 2001,pp. 514-532.

    [26] H.W. Lim, ‘‘On the Application of Identity-Based Cryptography

    in Grid Security,’’ Ph.D. dissertation, Univ. London, London,U.K., 2006.[27] S. Yu, K. Ren, and W. Lou, ‘‘FDAC: Toward Fine-Grained

    Distributed Data Access Control in Wireless Sensor Networks,’’IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 4, pp. 673-686,Apr. 2011.

    [28] S. Yu, K. Ren, and W. Lou, ‘‘Attribute-Based on-DemandMulticast Group Setup with Membership Anonymity,’’   Calc.Netw., vol. 54, no. 3, pp. 377-386, Feb. 2010.

    [29] P.S.L.M. Barreto, B. Lynn, and M. Scott, ‘‘Efficient Implementa-tion of Pairing-Based Cryptosystems,’’  J. Cryptol., vol. 17, no. 4,pp. 321-334, Sept. 2004.

    [30] A.F. Barsoum and M.A. Hasan, ‘‘Integrity Verification of Multiple Data Copies over Untrusted Cloud Servers,’’ in  Proc.12th IEEE/ACM Int. Symp. CCGRID, 2012, pp. 829-834.

    [31] C. Research ‘‘SEC 2: Recommended Elliptic Curve Domain

    Parameters.’’ [Online]. Available: http://www.secg.org/collateral/sec2_final.pdf .

    [32] D. Pointcheval and J. Stern, ‘‘Security Arguments for DigitalSignatures and Blind Signatures,’’   J. Cryptol., vol. 13, no. 3,pp. 361-396, July 2000.

    [33] The GNU Multiple Precision Arithmetic Library (GMP). [On-line]. Available: http://gmplib.org/.

    [34] Multiprecision Integer and Rational Arithmetic C/C++ Library(MIRACL). [ Online]. Available: http://certivox.com/.

    [35] The Pairing-Based Cryptography Library (PBC). [Online]. Avail-able: http://crypto.stanford.edu/pbc/howto.html.

    [36] IEEE P1363: Hybrid Schemes. [Online]. Available: http://grouper.ieee.org/groups/1363/StudyGroup/Hybrid.html.

    [37] B. Lynn, ‘‘On the implementation of pairing-based cryptosys-tems,’’ Ph.D. dissertation, Stanford Univ., Stanford, CA, USA,2008. [Online]. Available: http://crypto.stanford.edu/pbc/thesis.pdf .

    Huaqun Wang   received the BS degree inmathematics education from the ShandongNormal University, Jinan, China, in 1997, theMS degree in applied mathematics from the EastChina Normal University, Shanghai, China, in2000, and the PhD degree in cryptography fromNanjing University of Posts and Telecommuni-cations, Nanjing, China, in 2006. Since then, hehas been with Dalian Ocean University, Dalian,China, as an Associate Professor. His researchinterests include applied cryptography, network

    security, and cloud computing security. Dr. Wang has published morethan 40 papers. He has served in the program committee of severalinternational conferences and the editor board of international journals.

    .   For more information on this or any other computing topic,please visit our Digital Library at  www.computer.org/publications/dlib.

    IEEE TRANSACTIONS ON SERVICES COMPUTING, VOL. 8, NO. 2, MARCH-APRIL 2015340