Multi Cloud Data Storage System With High Level Authorized Deduplication


International Journal of Scientific and Technical Research in Engineering (IJSTRE)
www.ijstre.com, Volume 1, Issue 1, April 2016


SHOBANA K, M.E. (CSE), Jay Sriram Group of Institution, [email protected]

Mrs. GOKILAVANI A, Assistant Professor, Jay Sriram Group of Institution, [email protected]

Dr. RAJALAKSHMI, HOD (CSE), Jay Sriram Group of Institution, [email protected]

ABSTRACT: Cloud computing is an emerging technology that has recently drawn significant attention from both industry and academia. With the large amount of data that many clients store in the cloud, deduplication is a useful and efficient technique for making data management more scalable. Data deduplication is a specialized data compression method that removes duplicate copies of repeating data in cloud storage. To encrypt sensitive data, the convergent encryption method has been proposed. This is the first formal attempt to address the issue of secure authorized data deduplication. We propose a new deduplication construction that supports authorized duplicate check in a public multi-cloud architecture. The results show that the proposed authorized duplicate-check scheme incurs minimal overhead in a public multi-cloud architecture compared with normal operations.

KEYWORDS: Proof of Ownership, Storage Cloud Service Provider, Data Sharing Scheme, Oblivious Pseudo-Random Function, Management Console Security.

I. INTRODUCTION

Cloud computing, or something being "in the cloud," is an expression used to describe a variety of computing concepts that involve a large number of computers connected through a real-time communication network such as the Internet. In science, cloud computing is a synonym for distributed computing over a network and means the ability to run a program on many connected computers at the same time. The phrase is also commonly used to refer to network-based services that appear to be provided by real server hardware but are in fact served up by virtual hardware, simulated by software running on one or more real machines. Such virtual servers do not physically exist and can therefore be moved around and scaled up (or down) on the fly without affecting the end user, arguably rather like a cloud.

To make data management scalable in cloud computing, deduplication has become a well-known technique that has attracted increasing attention. Data deduplication is a specialized data compression technique for eliminating duplicate copies of repeating data in storage. The technique is used to improve storage utilization and can also be applied to network data transfers to reduce the number of bytes that must be sent. Instead of keeping multiple copies of data with the same content, deduplication eliminates redundant data by keeping only one physical copy and referring other redundant data to that copy. Deduplication can take place at either the file level or the block level. File-level deduplication eliminates duplicate copies of the same file; block-level deduplication eliminates duplicate blocks of data that occur in non-identical files. Traditional deduplication systems based on convergent encryption, although providing confidentiality to some extent, do not support the duplicate check with differential privileges: no differential privileges have been considered in convergent-encryption-based deduplication. Realizing both deduplication and differentially authorized duplicate check at the same time therefore appears contradictory.
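To make the mechanism concrete, the following is a minimal sketch of convergent encryption, the primitive the schemes discussed in this paper build on. It assumes the Python cryptography package; the fixed all-zero nonce is what makes encryption deterministic (tolerable here only because the key itself is unique per message), and the function names are illustrative, not from any cited system.

import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def convergent_encrypt(data: bytes):
    key = hashlib.sha256(data).digest()           # K = H(M): key derived from the content itself
    nonce = b"\x00" * 12                          # fixed nonce keeps the ciphertext deterministic
    ciphertext = AESGCM(key).encrypt(nonce, data, None)
    tag = hashlib.sha256(ciphertext).hexdigest()  # duplicate-check tag T = H(C)
    return key, tag, ciphertext

def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return AESGCM(key).decrypt(b"\x00" * 12, ciphertext, None)

# Two users holding the same file derive the same key, tag, and ciphertext,
# so the storage server can keep a single physical copy.
k1, t1, c1 = convergent_encrypt(b"same file contents")
k2, t2, c2 = convergent_encrypt(b"same file contents")
assert t1 == t2 and c1 == c2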

II. DATA DEDUPLICATION

A. DupLESS: Server-Aided Encryption for Deduplicated Storage

Cloud storage service providers such as Google Drive, Dropbox, and Mozy perform data deduplication to save space by storing only one copy of each piece of uploaded data. If users conventionally encrypt their data, however, these savings are lost. Message-locked encryption (the most prominent manifestation of which is convergent encryption) resolves this tension, but it is inherently subject to brute-force attacks that can recover data falling into a known set. The authors proposed an architecture that provides secure deduplicated storage resistant to brute-force attacks and realized it in a system called DupLESS. In DupLESS, users encrypt under message-based keys obtained from a key server via an oblivious PRF protocol. This enables users to store encrypted content with an existing service, have the service perform deduplication on their behalf, and still achieve strong confidentiality guarantees.
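As a rough illustration of the server-aided idea (not the actual DupLESS protocol), the sketch below replaces the oblivious PRF with a plain HMAC evaluated by a hypothetical key server. Real DupLESS blinds the request so the key server never learns the file hash, but the security benefit is the same in spirit: without the server's secret, an attacker cannot brute-force candidate files offline.

import hashlib
import hmac

SERVER_SECRET = b"key-server secret"  # hypothetical; held only by the key server

def key_server_eval(file_hash: bytes) -> bytes:
    # PRF under the server's secret: candidate-file guesses can no longer be
    # tested offline, because deriving a key requires querying the server.
    return hmac.new(SERVER_SECRET, file_hash, hashlib.sha256).digest()

def derive_message_key(data: bytes) -> bytes:
    # Message-based key, as in DupLESS: identical files still map to
    # identical keys, so deduplication of the ciphertexts keeps working.
    return key_server_eval(hashlib.sha256(data).digest())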

However, previous deduplication systems cannot support differential authorization of the duplicate check, which is important in many applications. In existing deduplication systems, the private cloud is involved as a proxy to allow data owners to securely perform the duplicate check with differential privileges, and the overhead of the existing prototype is high for file upload operations.

Fig. 1. Multi-cloud architecture for data deduplication.

B. Proofs of Ownership in Remote Storage Systems

Cloud storage systems introduce the notion of proofs of ownership (PoWs), which let a user efficiently prove to a server that the user holds a file, rather than just some short information about it. This work formalizes the notion of proof of ownership under rigorous security definitions and the efficiency requirements of petabyte-scale storage systems. It then presents solutions based on Merkle trees and specific encodings, analyzes their security, and implements one variant of the approach. Performance measurements indicate that the approach incurs only a small overhead compared with naive client-side deduplication.

This work puts forward the notion of proof of ownership, by which a user can prove to a server that it holds a copy of a file without actually sending the file. PoWs can be used to counter attacks on client-side deduplication systems in which the attacker obtains a "short summary" of a file and uses it to fool the server into thinking that the attacker owns the entire file.
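The following toy sketch illustrates the Merkle-tree idea behind such proofs (the helper names are ours, and real PoW schemes add specific encodings and leakage bounds): the server keeps only the root, challenges a randomly chosen block, and the prover must return that block together with its authentication path.

import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_levels(blocks):
    # Build the tree bottom-up; each entry of `levels` is one tree level.
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # duplicate the last node on odd levels
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def prove(blocks, index):
    # Return the challenged block plus its sibling path up to the root.
    levels, path, i = merkle_levels(blocks), [], index
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        path.append(level[i ^ 1])          # sibling of the current node
        i //= 2
    return blocks[index], path

def verify(root, index, block, path):
    node, i = h(block), index
    for sibling in path:
        node = h(node + sibling) if i % 2 == 0 else h(sibling + node)
        i //= 2
    return node == root

blocks = [bytes([b]) * 16 for b in range(8)]   # 8 toy file blocks
root = merkle_levels(blocks)[-1][0]            # the server stores only this root
blk, path = prove(blocks, 5)                   # answering requires holding the blocks
assert verify(root, 5, blk, path)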

C. Cloud Deduplication

ClouDedup is a secure and efficient storage service that ensures block-level deduplication and data confidentiality at the same time. Although based on convergent encryption, ClouDedup remains secure thanks to a component that implements an additional encryption operation and an access control mechanism. Furthermore, because block-level deduplication raises a problem with respect to key management, the design includes a new component that manages the keys of each block together with the actual deduplication operation.

The resulting system achieves confidentiality and block-level deduplication at the same time, and is built on top of convergent encryption. The authors showed that block-level deduplication is worth performing instead of file-level deduplication, since the gains in storage space are not offset by the overhead of metadata management, which is minimal. Additional layers of encryption are added by the server and an optional HSM; thanks to these components, secret keys can be generated in a hardware-dependent way by the device itself and need not be shared with anyone else. Because the additional encryption is symmetric, the impact on performance is negligible. The authors also showed that the design, in which no component is completely trusted, prevents any single component from compromising the security of the whole system, and that the solution prevents curious cloud storage providers from inferring the original content of stored data by observing access patterns or accessing metadata.


Furthermore, the authors showed that the solution can easily be implemented with existing, widespread technologies.
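To illustrate block-level deduplication in isolation (names such as BlockStore and BLOCK_SIZE are our illustrative choices, not ClouDedup's), a content-addressed block store keeps one physical copy per distinct block and represents files as lists of block fingerprints:

import hashlib

BLOCK_SIZE = 4096

class BlockStore:
    def __init__(self):
        self.blocks = {}   # fingerprint -> block bytes (one physical copy)
        self.files = {}    # file name   -> ordered list of fingerprints

    def put(self, name: str, data: bytes):
        refs = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(fp, block)   # duplicate blocks are not re-stored
            refs.append(fp)
        self.files[name] = refs

    def get(self, name: str) -> bytes:
        return b"".join(self.blocks[fp] for fp in self.files[name])

store = BlockStore()
store.put("a.bin", b"x" * 8192)
store.put("b.bin", b"x" * 8192)          # identical content adds no new blocks
assert store.get("b.bin") == b"x" * 8192
assert len(store.blocks) == 1            # both files share one physical block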

III. PROPOSED WORK

Hybrid clouds offer greater flexibility to businesses while offering choice in terms of keeping control and security. Hybrid clouds are usually deployed by organizations willing to push part of their workloads to cloud servers, either for cloud-bursting purposes or for projects requiring faster implementation.

We propose a secure data deduplication mechanism over a multi-cloud architecture, where the private cloud is integrated as an intermediary that permits users to securely perform the duplicate check with differential privileges. Our work focuses on reducing overhead by extending the hybrid cloud architecture for deduplication into a multi-cloud integration system: multiple public clouds are integrated with one another, and deduplication is applied in every public cloud upon receiving a user's duplicate-check request. The proposed architecture thus reduces overhead and executes deduplication efficiently. Secure data deduplication is the foremost objective of the proposed system; to this end, we distinguish sensitive from non-sensitive data at upload time and apply a cryptographic algorithm to the sensitive data, so that the data is secured and authorized. Finally, we show that our system provides a secure deduplication mechanism with improved storage space and reduced request/response load and bandwidth on the public cloud.

Data deduplication eradicates redundant data by storing only a single copy of each piece of data. Our system uses the convergent encryption technique to encrypt the data together with an authorized duplicate check, so that only an authorized user with the specified privileges can perform the duplicate check. Deduplication saves bandwidth, reduces storage space, and eradicates duplicates of data in cloud storage, and the system does not allow unauthorized users to steal information. It thus provides benefits in terms of confidentiality, authorized duplicate check, and cloud storage space. We also present several new deduplication constructions supporting authorized duplicate check in a multi-cloud architecture.

A. PUBLIC USER REGISTRATION AND LOGIN ACCESS

The private cloud takes charge of system parameter generation, user registration, proof of ownership, and maintenance of the users' privilege keys. In our setting, the private cloud acts as the administrator of the users and of the deduplication process; we therefore assume that the private cloud is fully trusted by the other parties. Users register with the private cloud in order to upload, download, or access data in the cloud. The private cloud distributes login credentials to every registered user, which are used to connect to the public cloud. Users' details, together with their specific privilege keys, are periodically revoked and refreshed in the cloud system to authorize members' proof-of-ownership requests.

B. PRIVATE CLOUD SETUP

Maintain User’s Privilege Keys

As cloud computing becomes prevalent, an increasing amount of data is being stored in the cloud and shared by users with specified privileges, which define the access rights of the stored data. The private keys for the privileges are therefore managed by the private cloud, which answers the file-token requests from users. The interface offered by the private cloud allows a user to submit files and queries to be securely stored and computed, respectively. The private keys for the privileges are not delivered to users directly; they are kept and managed by the private cloud server.

User Token Request

To get a file token, the user sends a request to the private cloud server; the token is needed to perform the duplicate check for a file. To support authorized deduplication, the tag of a requested file is determined by both the file and the user's privilege; to mark the difference from the traditional notion of a tag, we call it a file token. To enforce authorized access, a secret key is bound to each privilege and used to generate the file token.
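A minimal sketch of how such a privilege-bound token could be computed (the HMAC construction and the per-privilege key table are our illustrative assumptions, not the paper's exact scheme): the token depends on both the file tag and a secret key the private cloud keeps for each privilege, so users with different privileges obtain different tokens for the same file.

import hashlib
import hmac

# Hypothetical per-privilege secret keys, held only by the private cloud.
PRIVILEGE_KEYS = {"admin": b"k-admin", "staff": b"k-staff"}

def file_token(data: bytes, privilege: str) -> str:
    tag = hashlib.sha256(data).digest()                     # conventional file tag
    key = PRIVILEGE_KEYS[privilege]                         # never released to users
    return hmac.new(key, tag, hashlib.sha256).hexdigest()   # privilege-bound token

# Same file, different privileges -> different duplicate-check tokens.
assert file_token(b"report", "admin") != file_token(b"report", "staff")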

File Token Response

The private cloud server also verifies the user's identity upon receiving a request, before issuing the corresponding file token. Users have access to the private cloud server, a semi-trusted third party that aids deduplicable encryption by generating file tokens for the requesting users. The right to access a file is defined based on a set of privileges.


C. MULTI PUBLIC CLOUD SETUP

Authorized Deduplication (Token verification)

The user performs the authorized duplicate check for a file with the public cloud system before uploading it. Based on the outcome of the duplicate check, the user either uploads the file or runs PoW. After receiving the tag from the private cloud, the user sends a duplicate-check request with the tag to the S-CSP. If a duplicate of the file is found, the user runs the PoW (proof of ownership) protocol with the S-CSP (public cloud) to prove ownership of the file. Otherwise, if no duplicate is found, a proof from the S-CSP is returned and the user proceeds to upload.

File Upload Request/Response

After a no-duplicate response from the public cloud, the user encrypts the file with the convergent key using a symmetric encryption algorithm and uploads the encrypted file to the cloud server. The convergent key is stored locally by the user.
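Putting the pieces together, here is a hedged sketch of the client-side upload flow, reusing convergent_encrypt and file_token from the earlier sketches; the scsp object and its methods stand in for a hypothetical public-cloud (S-CSP) API.

local_keys = {}  # tag -> convergent key, kept on the user's machine

def upload(scsp, data: bytes, privilege: str):
    token = file_token(data, privilege)      # issued by the private cloud
    if scsp.duplicate_check(token):          # a copy already exists on the S-CSP
        scsp.run_pow(token, data)            # prove ownership instead of re-uploading
    else:
        key, tag, ciphertext = convergent_encrypt(data)
        scsp.store(token, ciphertext)        # only the ciphertext leaves the client
        local_keys[tag] = key                # the convergent key never leaves the user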

File Download Request/Response

Suppose a user wants to download a file. The user first sends a request and the file name to the S-CSP. Upon receiving the request and the file name, the S-CSP checks whether the user is eligible to download the file. If the check fails, the S-CSP sends back a signal to indicate the download failure; otherwise, it returns the corresponding ciphertext. Upon receiving the encrypted data from the S-CSP, the user recovers the original file with the convergent key stored locally.
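And the matching download sketch under the same assumptions, where the hypothetical scsp.fetch is presumed to return None when the eligibility check fails:

def download(scsp, name: str, tag: str) -> bytes:
    ciphertext = scsp.fetch(name)            # S-CSP verifies eligibility before answering
    if ciphertext is None:
        raise PermissionError("download refused by S-CSP")
    return convergent_decrypt(local_keys[tag], ciphertext)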

IV. CONCLUSION

In this project, the notion of authorized data deduplication was proposed to protect data security by including the differential privileges of users in the duplicate check. We also presented several new deduplication methods supporting authorized duplicate check in a hybrid cloud architecture, in which the duplicate-check tokens of files are generated by the private cloud server with private keys. Security analysis demonstrates that our approaches are secure against the insider and outsider attackers specified in the proposed security architecture. As a proof of concept, we implemented a prototype of the proposed authorized duplicate-check method and ran experiments on it. The results show that our authorized duplicate-check method incurs low overhead compared with convergent encryption and network transfer.

REFERENCES

[1] OpenSSL Project, 1998. [Online]. Available: http://www.openssl.org/
[2] P. Anderson and L. Zhang, "Fast and secure laptop backups with encrypted de-duplication," in Proc. 24th Int. Conf. Large Installation Syst. Admin., 2010, pp. 29–40.
[3] M. Bellare, S. Keelveedhi, and T. Ristenpart, "DupLESS: Server-aided encryption for deduplicated storage," in Proc. 22nd USENIX Conf. Sec. Symp., 2013, pp. 179–194.
[4] M. Bellare, S. Keelveedhi, and T. Ristenpart, "Message-locked encryption and secure deduplication," in Proc. 32nd Annu. Int. Conf. Theory Appl. Cryptographic Techn., 2013.
[5] M. Bellare, C. Namprempre, and G. Neven, "Security proofs for identity-based identification and signature schemes," J. Cryptol., vol. 22, no. 1, pp. 1–61, 2009.
[6] M. Bellare and A. Palacio, "GQ and Schnorr identification schemes: Proofs of security against impersonation under active and concurrent attacks," in Proc. 22nd Annu. Int. Cryptol. Conf. Adv. Cryptol., 2002, pp. 162–177.
[7] S. Bugiel, S. Nurnberger, A. Sadeghi, and T. Schneider, "Twin clouds: An architecture for secure cloud computing," in Proc. Workshop Cryptography Security Clouds, 2011.
[8] J. R. Douceur, A. Adya, W. J. Bolosky, D. Simon, and M. Theimer, "Reclaiming space from duplicate files in a serverless distributed file system," in Proc. Int. Conf. Distrib. Comput. Syst., 2002, pp. 617–624.
[9] D. Ferraiolo and R. Kuhn, "Role-based access controls," in Proc. 15th NIST-NCSC Nat. Comput. Security Conf., 1992, pp. 554–563.