AUTOMATION REPLICATION IN OPENSTACK · OpenStack dashboard by modify the suitable Command Line...

AUTOMATION REPLICATION IN OPENSTACK WAN NUR ATHIQAH SHAFIRAH BINTI WAN MUHAMMAD SHAFAWI BACHELOR OF COMPUTER SCIENCE (COMPUTER NETWORK SECURITY) WITH HONORS UNIVERSITI SULTAN ZAINAL ABIDIN 2018


AUTOMATION REPLICATION IN

OPENSTACK

WAN NUR ATHIQAH SHAFIRAH BINTI WAN MUHAMMAD

SHAFAWI

BACHELOR OF COMPUTER SCIENCE (COMPUTER

NETWORK SECURITY) WITH HONORS

UNIVERSITI SULTAN ZAINAL ABIDIN

2018


AUTOMATION REPLICATION IN

OPENSTACK

WAN NUR ATHIQAH SHAFIRAH BINTI WAN MUHAMMAD

SHAFAWI

BACHELOR OF COMPUTER SCIENCE (COMPUTER

NETWORK SECURITY) WITH HONORS

FACULTY OF INFORMATICS AND COMPUTING

UNIVERSITI SULTAN ZAINAL ABIDIN, TERENGGANU,

MALAYSIA

2018


DECLARATION

I hereby declare that this report is based on my original work except for quotations and

citations, which have been duly acknowledged. I also declare that it has not been previously

or concurrently submitted for any other degree at University Sultan Zainal Abidin or other

institutions.

____________________________

Name: Wan Nur Athiqah Shafirah binti Wan

Muhammad Shafawi

Date:


CONFIRMATION

This is to confirm that:

The research conducted and the writing of this report was under my supervision.

___________________________

Name: Prof. Madya Dr. Zarina binti Mohamad

Date:


DEDICATION

In the Name of Allah, the Most Merciful, the Most Compassionate. All praise be to Allah, the Lord of the worlds, and prayers and peace be upon Mohamed, His servant and messenger.

This thesis is dedicated to my beloved supervisor, Prof. Madya Dr. Zarina binti Mohamad, who has been a constant source of support and encouragement throughout the challenges of completing this project. I sincerely thank her for her guidance throughout this study and especially for her confidence in me. I would also like to warmly thank the panel members, Prof. Madya Dr. Mohamad Afendee bin Mohamed and Dr. Aznida Hayati binti Zakaria @ Mohamad, for their constructive criticism and ideas, which have provided a good basis for this thesis.

I would also like to express my deep and sincere gratitude to my beloved parents, Wan Muhammad Shafawi bin Wan Ibrahim and Rosmini binti Mat Jusoh, for their trust and non-stop encouragement to complete my studies. Without their support and understanding, it would have been impossible for me to finish this work.

Last but not least, I would like to thank all my friends for their kindness in sharing their knowledge, moral support and help in completing this thesis. My deepest thanks go to everyone who took part in making this thesis.


ABSTRACT

Nowadays, more and more devices such as mobile phones, tablet PCs and laptops are used in every field, and the large files they produce are conveniently stored or backed up in cloud storage. Cloud computing is a collection of computing resources provided over the Internet. The cloud offers benefits such as broad network access, cost reduction, ease of use, time savings, on-demand delivery of services and support for any device with an Internet connection. Optimizing the performance of cloud storage is therefore important for Internet development and for user convenience. This paper presents automation replication in OpenStack. OpenStack is an open-source platform that provides cloud as a service. Its cloud computing components provide many features that help users manage their own cloud storage from the OpenStack dashboard. This paper focuses mainly on the automation mechanism of cloud storage and on the replication methods used to process different files for the purpose of data availability in cloud storage. The contribution of this work is the implementation of an automation replication feature in the OpenStack dashboard, which the project developer achieves by modifying the suitable Command Line Interface (CLI). The OpenStack Swift system needs to be configured, and we need to analyse how the components relate to each other to achieve the expected objectives.


CONTENTS

DECLARATION

CONFIRMATION

DEDICATION

ABSTRACT

CONTENTS

LIST OF FIGURES

CHAPTER I

INTRODUCTION

1.1 BACKGROUND

1.2 PROBLEM STATEMENT

1.3 OBJECTIVE

1.4 SCOPE

1.5 LIMITATION OF WORK

CHAPTER II

LITERATURE REVIEW

2.1 INTRODUCTION

2.2 CLOUD COMPUTING

2.2.1 CLOUD COMPUTING CONCEPT

2.2.2 CLOUD ARCHITECTURE

2.3 SOFTWARE ON CLOUD

2.3.1 OPENSTACK

2.3.2 OPENSTACK ARCHITECTURE

2.4 RESEARCH ON RELATED TOPIC

2.4.1 AUTOMATED CLOUD COMPUTING SCENARIOS

2.4.2 REPLICATION CONCEPT

2.4.3 DATA AVAILABILITY

2.4.4 OPENSTACK OBJECT STORAGE (SWIFT)

2.6 SUMMARY

CHAPTER III

METHODOLOGY

3.1 INTRODUCTION

3.2 PROJECT METHODOLOGY

3.3 PROJECT FRAMEWORK

3.2.1 INSTALLATION AND CONFIGURATION OF SOFTWARE

3.2.2 SWIFT COMPONENT

3.2.3 ADD NEW PROGRAM

3.2.4 HORIZON COMPONENT

3.4 OPENSTACK ARCHITECTURE

3.5 OPENSTACK SWIFT OBJECT STORAGE SERVICE PLATFORM

3.6 DATA MODEL OF DATA REPLICATION IN OPENSTACK

3.7 PROOF OF CONCEPT

REFERENCES

LIST OF FIGURES

FIGURE TITLE

3.2 Methodology Phase for Automation Replication in OpenStack

3.3 Framework of Automation Replication in OpenStack

3.4.1 OpenStack Conceptual Architecture

3.4.2 OpenStack Logical Architecture

3.5 OpenStack Object Storage Service

3.6 Flowchart of Data Replication Conduct in Swift

3.7.1 Creating Packstack File

3.7.2 Configuration of OpenStack Component

3.7.3 Installation of OpenStack


CHAPTER I

INTRODUCTION

1.1 BACKGROUND

Cloud computing is an emerging computing paradigm that makes use of contemporary virtual machine technology. It allows users to access a pool of shared computing resources such as storage, networks, applications and services that can be rapidly provisioned [4]. The combination of Internet and virtual machine technologies enables cloud computing to emerge as a paradigm with promising prospects for facilitating the development of large-scale, flexible computing infrastructures that are available on demand to meet the computational requirements of e-Science applications. Cloud computing technology attracts ICT service providers because it offers tremendous opportunities for the online distribution of services. End users benefit from the convenience of accessing data and services globally, and from centrally managed backups, high computational capacity and flexible billing strategies. Cloud computing is a service in which data is remotely maintained, managed and backed up for users over a network. Several open-source platforms are available for developing a cloud computing service, such as OpenStack, Eucalyptus, CloudStack and OpenNebula.

OpenStack is a set of software tools for building and managing cloud computing platforms for public and private clouds. It is an open-source platform that is freely available to users, who can use it to deploy cloud service models. It is component-based software made up of various projects, each with its own code name, that together deliver cloud services. OpenStack is mainly used to build private clouds. It is primarily an Infrastructure as a Service (IaaS) platform, and Platform as a Service (PaaS) and Software as a Service (SaaS) offerings can be layered on top of it. It controls pools of resources such as storage, networking and processing throughout a datacenter, and it can be managed through a web-based dashboard or via the OpenStack API. The dashboard gives administrators control while empowering their users to provision resources through a web interface. OpenStack works with popular enterprise and open-source technologies, which makes it well suited to heterogeneous infrastructure. In April 2012, Rackspace announced that it would implement OpenStack Compute as the underlying technology for its Cloud Servers product. The change came with a new control panel as well as add-on cloud services offering databases, server monitoring, block storage and virtual networking. Today, OpenStack is one of the major open-source cloud platforms used in industry. It is made up of several components, each with a particular function, including Nova, Swift, Glance, Horizon and Keystone [6].

Swift is a two-tier storage system consisting of a proxy tier, which handles all incoming requests, and an object storage tier, where the actual data is stored. Swift uses a data structure called a ring to map the URL of an object to the particular location in the cluster where the object is stored. Swift is used for storing objects and files; it provides scalability and ensures data integrity and data replication. It enables users to store, retrieve and delete objects, together with their associated metadata, in containers via a RESTful HTTP API. Metadata is a set of data that describes and gives information about other data. By default, Swift keeps three replicas of each object [11]. When storing the copies, it tries to spread them out over different servers and disks so that the failure of a single component does not cause data loss. A cloud computing system needs at least twice as many storage devices as it requires to hold its information, so that it can tolerate failures that occur during data transmission across the network. It must make copies of all clients' information and store them on other devices. The copies enable the central server to access backup machines to retrieve data that would otherwise be unreachable. Making copies of data as a backup is called replication. Data replication reduces user waiting time, speeds up data access and increases data availability by providing the user with different replicas of the same service, all of them in a coherent state [7]. Replication is a frequently used technique in cloud systems such as GFS (Google File System) and HDFS (Hadoop Distributed File System).
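The ring lookup described above can be illustrated with a small sketch. It is a simplified model and not Swift's actual implementation: the device list, the partition power and the placement rule are assumptions made up for illustration. The real ring uses a precomputed partition-to-device table and also takes zones and device weights into account.

```python
import hashlib
import struct

PART_POWER = 4   # 2**4 = 16 partitions; an assumed, deliberately tiny value
REPLICAS = 3     # matches the three copies mentioned in the text

# Hypothetical (device, server) pairs invented for this sketch.
DEVICES = [
    ("sdb1", "server-a"), ("sdc1", "server-a"),
    ("sdb1", "server-b"), ("sdc1", "server-b"),
    ("sdb1", "server-c"), ("sdc1", "server-c"),
]


def partition_for(account, container, obj):
    """Hash the object's path and keep the top PART_POWER bits as its partition."""
    path = f"/{account}/{container}/{obj}".encode("utf-8")
    digest = hashlib.md5(path).digest()
    top32 = struct.unpack(">I", digest[:4])[0]
    return top32 >> (32 - PART_POWER)


def devices_for(partition):
    """Pick REPLICAS devices for a partition, each on a different server."""
    chosen, used_servers = [], set()
    start = partition % len(DEVICES)
    for step in range(len(DEVICES)):
        name, server = DEVICES[(start + step) % len(DEVICES)]
        if server not in used_servers:
            chosen.append((name, server))
            used_servers.add(server)
        if len(chosen) == REPLICAS:
            break
    return chosen


part = partition_for("AUTH_demo", "photos", "cat.jpg")
print(part, devices_for(part))
```

Because the partition is derived only from the object's path, any proxy node can compute the same location without consulting a central index, and placing each replica on a different server means that losing one server leaves two copies intact.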

In this project, we aim to improve data availability in cloud computing through the Swift component, using OpenStack as the cloud platform. OpenStack users will be able to manage the automation replication service through the web-based dashboard.

In conclusion, cloud computing is an Internet-based mode of computing that provides hardware and software resources and information according to need [1]. The OpenStack cloud platform enables consumers to rent and use infrastructure resources such as compute power, storage and networking on an as-needed basis. Based on their requirements, users can access and use the resources as a service instead of purchasing them. Swift is the open-source storage subproject of OpenStack. It is an object storage service that is provisioned in a scalable manner. It uses commodity servers to build a redundant, scalable, distributed object storage cluster whose capacity can reach the petabyte (PB) level [9]. The Swift component enables users to replicate data in cloud storage.

1.2 PROBLEM STATEMENT

OpenStack is advanced open-source cloud computing software that comes with various features and services for end users. One of the features of cloud computing is data availability, and it is covered in OpenStack. Replication, which is important in a cloud service for ensuring the availability of requested data, is included in OpenStack through the Swift component [17]. However, automation of replication has not been implemented alongside the other OpenStack features.


1.3 OBJECTIVE

There are three main objectives for this project, derived to overcome the problems stated above. The objectives are:

• To propose automation replication in OpenStack

• To design an automation replication framework in OpenStack

• To implement automation of replication in OpenStack

1.4 SCOPE

The project scope covers administrators and users. Administrators will provide maintenance to achieve the reliability and availability of data in OpenStack. For users, the scope covers how conveniently they can manage the application on their own.

1.5 LIMITATION OF WORK

During the development of this project, a few challenges might occur, including the large storage capacity required to install the necessary software.


CHAPTER II

LITERATURE REVIEW

2.1 INTRODUCTION

This chapter describes research conducted by other parties that is related to the ongoing project. Research is an important step in which a study is carried out to obtain information related to the proposed project. Through the study, we can examine all the available information related to the project. This chapter also describes the techniques, methods, equipment and technology that are suitable for implementing the project. The references used for this chapter include journal articles, Internet sources and theses.

2.2 CLOUD COMPUTING

2.2.1 CLOUD COMPUTING CONCEPT

Cloud computing is a new delivery and consumption model for IT services. It involves the provision of dynamically scalable and often virtualized resources, typically over the Internet. Cloud computing can also be described as “computing in an independent or remote location with shared resources available on demand” [12]. It developed as an emerging concept that combines many fields of computing. Cloud computing provides and enables the use of distributed computing, storage resources and services that have been developed, thoroughly tested and adopted by industry, science and government [2]. Nowadays more and more data is generated in different forms, and there is always a concern about how to store it. The cloud is therefore adopted as one solution for storing data.

Cloud computing allows users to access a pool of shared computing resources such as storage, networks, applications and services that can be rapidly supplied. It has three primary service delivery models: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). Its essential features include broad network access, on-demand self-service, resource pooling, measured service and rapid elasticity. The four deployment models are private cloud, public cloud, community cloud and hybrid cloud.

2.2.2 CLOUD ARCHITECTURE

Cloud computing architecture refers to the components and subcomponents required

for cloud computing.

The various cloud-based services each have their own distinct cloud architecture:

• Software as a Service (SaaS) involves software hosted and maintained on the Internet. With SaaS, users do not have to install the software locally.

• Development as a Service (DaaS) involves web-based development tools shared across communities.

• Platform as a Service (PaaS) provides users with application platforms and databases.

• Infrastructure as a Service (IaaS) provides infrastructure and hardware such as servers, networks and storage devices.


2.3 SOFTWARE ON CLOUD

S. Yadav (2013) describes cloud computing technology and its basic concepts, and explains the demand for private clouds and their deployment [10]. The paper describes three open-source software packages, OpenStack, Eucalyptus and OpenNebula, and shows how to deploy a private cloud using them. It presents a comparative study of the three based on their architecture, covering cloud implementation, programming language, database compatibility and OS compatibility. This summary and comparison help in choosing the most suitable service according to requirements.

2.3.1 OPENSTACK

OpenStack is an open-source platform for cloud computing that is freely available to users, who can use it to deploy cloud service models. It is mainly used to build private clouds. It is primarily an Infrastructure as a Service (IaaS) platform, and Platform as a Service (PaaS) and Software as a Service (SaaS) offerings can be layered on top of it. It controls resources such as storage, networking and processing throughout a datacenter, which users can manage through a web-based dashboard [4]. OpenStack has numerous components, each providing a different service. Developers and researchers can use it to set up and run different types of cloud.

S. Nivedha et al. (2015) describe how to build a private cloud using open-source software [13]. They also describe the various features of the OpenStack software and the services and capabilities it provides to end users. Dr. U. Pol (2014) states that OpenStack open-source software is used to provide three service delivery models: IaaS, PaaS and SaaS [19]. Various tools are provided on top of the existing system to create and manage virtual machines (VMs).

2.3.2 OPENSTACK ARCHITECTURE

A research paper by Tiago Rosado and Jorge Bernardino (2014) gives an overview of the functionality of the OpenStack software components, with the aim of designing and implementing cloud computing solutions that fit an enterprise's purposes [15]. The paper gives an updated and detailed overview of the OpenStack architecture, shows which services are essential to install, and explains how the available services work and adapt to any hardware infrastructure to provide a sustainable and robust private cloud solution.

The primary components of OpenStack are [14]:

Nova (Compute): OpenStack Compute (Nova) is a cloud computing fabric controller used to deploy and manage large numbers of virtual machines and other instances that handle computing tasks.

Swift (Object Storage): OpenStack Object Storage (Swift) is a scalable, redundant storage system for objects and files. Objects and files are written to multiple disk drives spread across servers in the datacenter, with the OpenStack software responsible for ensuring data replication and integrity across the cluster.

Cinder (Block Storage): OpenStack Block Storage (Cinder) is a block storage component, which is closer to the traditional notion of a computer accessing specific locations on a disk drive. It provides persistent block-level storage devices for use with OpenStack compute instances, and manages the creation, attachment and detachment of block devices to and from servers.

Neutron (Networking): OpenStack Networking (Neutron) provides the networking capability for OpenStack. It is a system for managing networks and IP addresses easily, quickly and efficiently.

Horizon (Dashboard): OpenStack Dashboard (Horizon) is the dashboard for OpenStack, which gives administrators and users a graphical interface to access, provision and automate cloud-based resources.

Keystone (Identity Service): OpenStack Identity (Keystone) provides identity services for OpenStack. It acts as a central directory of users mapped to the OpenStack services that they can access. It provides multiple means of access, acts as a common authentication system across the cloud operating system, and can integrate with existing backend directory services such as LDAP (Lightweight Directory Access Protocol).

Glance (Image Service): OpenStack Image Service (Glance) provides image services for OpenStack. It provides discovery, registration and delivery services for disk and server images, and allows these images to be used as templates when deploying new virtual machine instances.

Ceilometer (Telemetry Service): OpenStack Telemetry Service (Ceilometer) provides telemetry services that allow the cloud to bill individual users. It keeps a verifiable count of each user’s usage of each of the various components of an OpenStack cloud.

Heat (Orchestration): OpenStack Orchestration (Heat) is a service that allows developers to store the requirements of a cloud application in a file that defines the resources necessary for that application.

Trove (Database): OpenStack Database (Trove) is a database-as-a-service component that provides relational and non-relational database engines.
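To make the Swift entry in the list above more concrete, the sketch below shows the shape of the HTTP requests used to store, retrieve and delete an object. It only builds a description of each request and does not contact a server. The account name, container, object name and token are placeholders invented for illustration. The `/v1/{account}/{container}/{object}` path layout and the `X-Auth-Token` and `X-Object-Meta-` headers follow the Swift object storage API.

```python
from urllib.parse import quote


def object_request(method, account, container, obj, token, metadata=None):
    """Describe a Swift object request without sending it.

    method   -- "PUT" to store, "GET" to retrieve, "DELETE" to remove
    metadata -- optional dict sent as X-Object-Meta-<name> headers
    """
    path = f"/v1/{quote(account)}/{quote(container)}/{quote(obj)}"
    headers = {"X-Auth-Token": token}
    for name, value in (metadata or {}).items():
        headers[f"X-Object-Meta-{name}"] = value
    return {"method": method, "path": path, "headers": headers}


# Placeholder values: "AUTH_demo" and "example-token" are not real credentials.
store = object_request("PUT", "AUTH_demo", "photos", "cat.jpg",
                       "example-token", {"Owner": "athiqah"})
fetch = object_request("GET", "AUTH_demo", "photos", "cat.jpg", "example-token")
print(store["method"], store["path"])
print(fetch["method"], fetch["path"])
```

In a working deployment the token would be obtained from Keystone, and the request would be sent to the Swift proxy, which then writes the object to the storage nodes.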

2.4 RESEARCH ON RELATED TOPIC

2.4.1 AUTOMATED CLOUD COMPUTING SCENARIOS

Dr. H. Guruprasad (2014) and S. Nivedha and N. Saranya (2015) introduce cloud computing technology and OpenStack platform concepts [4] [13]. Two types of cloud, private and public, are described. These papers introduce a recent and powerful technique for creating a private cloud using the OpenStack open-source platform on the Ubuntu open-source operating system. They also give information on implementing a private cloud with open-source software, where the private cloud provides the Infrastructure as a Service and Platform as a Service models. An implementation of a private cloud should be able to launch different flavours of images, instances and services. With the help of the OpenStack dashboard, users can manage the different OpenStack services themselves.

The research of Vladimir Sobeslav and Ales Komarek (2014) covers the basic open-source automation and configuration management tools that can be used to alleviate common operational tasks and processes in cloud systems. It gives a quick survey of major cloud computing solutions and introduces a simple abstraction layer for the management of physical and virtual resources. The last chapter of the paper covers some common automation scenarios from both the cloud computing provider and consumer perspectives, along with possible open-source implementations [5].

Operating enterprise systems without automation is not sustainable in the long term. We can see rapid development of new cloud computing resources coping with the increasing demand to support these complex infrastructures. The open-source world offers enterprise-level solutions to many common automation scenarios. The ability to automate operations and transition processes through continuous integration has become crucial in the IT environment. With open-source automation, we utilize the laboratory hardware resources more efficiently by turning the system architecture into a fully fledged IaaS solution that can support educational as well as research projects. Having our own cloud computing platform allows us to automate the provisioning of new virtual servers and thus adopt the last missing step in continuously integrating the process. With this infrastructure, we can continuously test open-source automation scenarios involving the installation of the OpenStack platform on physical servers and the deployment of virtual servers for education and various distributed systems.

2.4.2 REPLICATION CONCEPT

Replication is a process in which a whole object is copied some number of times [8] to provide protection if a copy of the object is lost or unavailable. Replication is used to achieve reliability and availability of data in cloud storage. Availability can be achieved by ensuring that at least one copy of the files is accessible at all times. Replication can prevent data loss even if an interruption occurs during data transmission over the network. It also minimizes network delays and bandwidth usage. Related work considers both the energy efficiency and the bandwidth consumption of the system, in addition to the improved Quality of Service (QoS) obtained as a result of the reduced communication delays. Simulation results from such work may help to unveil performance and energy efficiency trade-offs as well as guide the design of future data replication solutions. Data replication across multiple datacenters is important to prevent data loss. In the event of a disaster in one area, data can still be accessed from other regions, and users would be unaware of any problems. For example, if a snowstorm in the Southeast caused a power supply failure, your data would be served from another datacenter. This illustrates why replication is a widely used mechanism for ensuring availability in the presence of failures.

According to David Richard Schafer, Kurt Rothermel and Muhammad Adnan Tariq (2018), the main purpose of replication is to improve availability [16]. They presented novel replication schemes for ensuring the availability of workflows executing in heterogeneous environments. Their evaluations showed that the replication schemes significantly increase availability in the presence of failures, while introducing low overhead and remaining scalable.

High availability, high fault tolerance and efficient access are significant issues in cloud data centers, where failures are normal rather than exceptional because of the large scale of the data they support. Data replication reduces user waiting time, speeds up data access and increases data availability by providing the user with different replicas of the same service, all of them in a coherent state. Replication is a frequently used technique in cloud systems such as GFS (Google File System) and HDFS (Hadoop Distributed File System).
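
The effect of the replica count on availability can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that every replica is unavailable independently and with the same probability. The function name and the 0.05 failure probability are hypothetical and are not taken from the cited works.

```python
def availability(replica_count: int, failure_probability: float) -> float:
    """Probability that at least one replica is reachable.

    Assumes replicas fail independently with the same probability,
    which is a simplification of real datacenter behaviour.
    """
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    # Data is unavailable only when every replica is down at the same time.
    return 1.0 - failure_probability ** replica_count


if __name__ == "__main__":
    # Hypothetical 5% chance that any single replica is unreachable.
    for n in (1, 2, 3):
        print(n, round(availability(n, 0.05), 6))
```

Under this assumption each additional replica multiplies the chance of total unavailability by the single-replica failure probability, which is why a small number of copies already gives a large gain.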

2.4.3 DATA AVAILABILITY

Data availability means that resources must be available to authorized users at all times, as and when required. They must remain accessible even during a network failure or when a whole datacenter has gone offline. It guarantees that the system works accurately and that the service is accessible to authorized users only [3]. According to Bhagyashri Kulkarni and Varsha Bhosale (2016), OpenStack is advanced open-source cloud computing software that comes with various features and services for end users. Replication is important in cloud services as it can ensure the availability of requested data [20].

2.4.4 OPENSTACK OBJECT STORAGE (SWIFT)

In OpenStack, the Swift component provides object storage capabilities. Swift uses replication as its strategy to achieve reliability, availability and fault tolerance: it keeps more than one copy of each object (typically three copies).

Lei Li, Dagang Li, Zhiliang Su, Lianwen Jin and Ganbo Huang present the technical details of OpenStack Swift, such as how the hardware configuration, the components and different parameter configurations affect performance [18]. They state that there are still many unknown environmental factors of cloud storage to be discovered in real deployments and evaluations. Thus, the evaluation of OpenStack Swift in a real commercial cloud storage deployment is needed in future work.

Previously, most file systems used the RAID (Redundant Array of Independent Disks) structure to achieve data replication. Although RAID can back up data onto different hard disks in a single server, it costs too much when the cluster is large, and it cannot distribute the backup files across different servers to avoid a single point of failure. Currently, a cloud storage system can provide a replica data store scheduler that places the backup files on different servers, so that the storage system stays in a consistent state in case of a temporary error such as a hard disk failure or power outage.

2.6 SUMMARY

From this chapter, we can conclude that reviewing previous research is an important element in developing a project because it provides more knowledge about the topic and shows how previous researchers conducted their research. It also serves as a guide to avoid repeating the same mistakes, or reusing ideas and techniques that have already been established. This chapter is intended to provide an overview of the concept of the project based on the explanations given.

CHAPTER III

METHODOLOGY

3.1 INTRODUCTION

This chapter reports on the approach, model development and application of a comprehensive framework used in the development of the system, application or implementation of the study. It contains the methods, techniques and approaches that will be used during the design and implementation of the project. Selecting a suitable methodology for project development is very important. Choosing an unsuitable methodology can cause many problems, and the project might not be completed on schedule. Moreover, the proposed project might fail completely, as the developer might lose direction while completing the project development. All the phases involved in this project are detailed for further understanding.

3.2 PROJECT METHODOLOGY

Figure 3.2: Methodology Phase for Automation Replication in OpenStack

Figure 3.2 shows the methodology phases for Automation Replication in OpenStack. It consists of five phases:

PHASE 1: INITIATING

The beginning phase of the project methodology defines the starting point of a new project or a new phase of an existing project. Information regarding the project was gathered before making any decision on the scope and topic of the project suitable to be proposed to the panels. This phase helps the developer to increase knowledge based on their interest.

PHASE 2: PLANNING

The background of the project scope was discussed to make sure the chosen topic satisfies the course needs, level, scope and rules that were stated during the first course briefing. The problem was identified, and we determined whether it satisfied the course scope. Then, we came up with several objectives and reviewed many references from previous researchers to compile the literature review. This is the stage where we plan the project based on the chosen topic.

PHASE 3: ANALYSIS AND DESIGN

This phase is the stage where we create a framework and data flow diagram based on the information gathered for this project development. The suitable method, technique and platform were analysed. The commands for the automation feature were also modified at this stage based on the requirements.

PHASE 4: TESTING

All modifications must be tested during the testing phase to check their functionality. If any errors occur, they will appear at this point of the process, so we still have a chance to correct and improve the commands. It is important to test the project progress to make sure the design can be implemented in the OpenStack application.

PHASE 5: DEPLOYMENT

After checking that the requirements satisfy the course scope, the project can proceed to deployment.

3.3 PROJECT FRAMEWORK

Figure 3.3: Framework of Automation Replication in OpenStack

3.2.1 INSTALLATION AND CONFIGURATION OF SOFTWARE

Figure 3.3 shows the overall framework of automation replication in OpenStack. The first image in the framework shows the software used to carry out this project, which includes VMware Workstation, CentOS and OpenStack. Firstly, we need to install and configure VMware Workstation as the virtual machine platform. Then, we install and configure CentOS in VMware Workstation before installing and configuring OpenStack on CentOS.

[Figure 3.3 labels: Installation and configuration of OpenStack in CentOS; Software; Add new program; Data management; Display]

3.2.2 SWIFT COMPONENT

Swift is one of the components included in OpenStack to provide an object storage service in the cloud system. OpenStack Swift is a distributed object storage system in which files are read and written using a REST (representational state transfer) API (application programming interface), enabling large quantities of data to be stored and shared among applications.
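
As an illustration of the REST interface, the sketch below builds, without sending, an HTTP request for a single object. The path layout of storage URL, container and object, and the X-Auth-Token header, follow the Swift object storage API. The helper name, endpoint, account and token are hypothetical values chosen for the example.

```python
from urllib.parse import quote
from urllib.request import Request


def build_object_request(method, storage_url, container, object_name,
                         token, data=None):
    """Build (but do not send) a request for one object.

    PUT uploads the object body given in `data`; GET downloads it.
    """
    if method not in ("GET", "PUT"):
        raise ValueError("only GET and PUT are sketched here")
    # Percent-encode the names so spaces and similar characters are safe.
    url = "{}/{}/{}".format(storage_url.rstrip("/"),
                            quote(container, safe=""),
                            quote(object_name))
    return Request(url, data=data, method=method,
                   headers={"X-Auth-Token": token})


if __name__ == "__main__":
    # Hypothetical endpoint, account and token, for illustration only.
    req = build_object_request(
        "PUT", "http://controller:8080/v1/AUTH_demo",
        "reports", "chapter 3.txt", "example-token", data=b"hello")
    print(req.get_method(), req.full_url)
```

Passing the returned request to `urllib.request.urlopen` would perform the call against a running Swift endpoint; the sketch stops before that step so that it runs without a deployment.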

3.2.3 ADD NEW PROGRAM

For this project, we are going to modify and add a new program related to the automation replication feature. The modification focuses on the Swift and Horizon components, but the other components in OpenStack also need to be analysed to determine whether they are related to the project progress.

3.2.4 HORIZON COMPONENT

Horizon is the dashboard component of OpenStack; it provides a web-based user interface to the OpenStack services. The automation replication feature that is developed will be displayed on the OpenStack dashboard via Horizon.

3.4 OPENSTACK ARCHITECTURE

Figure 3.4.1 represents the OpenStack conceptual architecture with all native software components, which are developed by companies and individual supporters, showing how they interact with each other.

Figure 3.4.1: OpenStack Conceptual Architecture

Figure 3.4.2 shows the OpenStack logical architecture with all native software components, which are developed by companies and individual supporters, showing how they interact with each other.

Figure 3.4.2: OpenStack Logical Architecture

3.5 OPENSTACK SWIFT OBJECT STORAGE SERVICE PLATFORM

Figure 3.5: OpenStack Object Storage Service

OpenStack Swift is a distributed, open-source object storage platform whose structure can be deployed in different ways to satisfy different requirements. It has several components which provide different functionalities such as file store location scheduling, lookup service, metadata management and failure recovery. It provides file availability by keeping more than one replica of the data on different storage servers. In the latest version of OpenStack, Swift supports different replication configurations, allowing users to select how many replicas of their files they want.

Figure 3.5 shows the components available in Swift that operate the system for the object storage service. The proxy server component in Swift performs the file location scheduling service and the lookup service, and it is the access entry of the storage system. It uses a hash-based mapping called the ring file to place each file in a suitable storage location. The ring file is a static file that cannot be modified at runtime. The proxy server can act as a load-balancing server in the system, scheduling files to be placed on different storage devices. If a failure occurs in any storage device, the proxy server handles the problem by selecting a handoff server from the ring and routing the files to it.
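
The hash-based placement described above can be sketched in a few lines. This is a simplified illustration and not Swift's actual ring code: the device names, the tiny partition count and the round-robin assignment table are all assumptions made for the example.

```python
import hashlib

PARTITION_POWER = 4          # 2**4 = 16 partitions (tiny, for illustration)
REPLICAS = 3
DEVICES = ["dev0", "dev1", "dev2", "dev3", "dev4"]  # hypothetical devices


def partition_for(path: str) -> int:
    """Map an object path to a partition using the top bits of its MD5."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    top32 = int.from_bytes(digest[:4], "big")
    return top32 >> (32 - PARTITION_POWER)


def primary_devices(partition: int) -> list:
    """Return REPLICAS distinct devices for a partition (round-robin table)."""
    return [DEVICES[(partition + i) % len(DEVICES)] for i in range(REPLICAS)]


def devices_for_write(path: str, failed=frozenset()) -> list:
    """Primary devices for the path, replacing failed ones with handoffs."""
    primaries = primary_devices(partition_for(path))
    handoffs = [d for d in DEVICES if d not in primaries and d not in failed]
    chosen = []
    for dev in primaries:
        if dev in failed and handoffs:
            chosen.append(handoffs.pop(0))   # route to a handoff device
        elif dev not in failed:
            chosen.append(dev)
    return chosen


if __name__ == "__main__":
    path = "/AUTH_demo/reports/chapter3.txt"   # hypothetical object path
    print(partition_for(path), devices_for_write(path))
```

Because the partition is derived only from the object path, any proxy holding the same table computes the same location without consulting a central index.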

OpenStack Swift simultaneously writes three or more copies of the data to different storage devices to guarantee the safety of the stored data. The system considers a file successfully stored once two copies have been written successfully. The replica server can check the integrity of files. For example, when a file is destroyed because of a power outage, a write error or hard disk damage, the replica server can copy the file from another storage device to restore the required number of replicas.
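
The repair behaviour of the replica server can be sketched as follows. This is a minimal illustration under stated assumptions, not Swift's replicator: copies are held in an in-memory dictionary, damage is detected by comparing MD5 checksums, and the function and device names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hex MD5 digest used here to detect a damaged copy."""
    return hashlib.md5(data).hexdigest()


def repair_replicas(replicas: dict, expected_checksum: str) -> list:
    """Overwrite damaged or missing copies from a healthy one.

    `replicas` maps a device name to the stored bytes, or None if the
    copy is missing. Returns the names of the devices that were repaired.
    """
    healthy = [d for d, data in replicas.items()
               if data is not None and checksum(data) == expected_checksum]
    if not healthy:
        raise RuntimeError("no healthy replica left to copy from")
    source = replicas[healthy[0]]
    repaired = []
    for device, data in replicas.items():
        if data is None or checksum(data) != expected_checksum:
            replicas[device] = source      # copy from the healthy device
            repaired.append(device)
    return repaired


if __name__ == "__main__":
    good = b"object payload"
    store = {"dev0": good, "dev1": b"corrupted", "dev2": None}
    print(repair_replicas(store, checksum(good)))
```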

When a user with a user account requests an object inside a container, the account server first looks up the account in its account database and finds the associated container. Then, the container server checks the container database to find out whether the requested object exists in the specified container, and finally the object server looks in the object databases to find retrieval information about the object. To retrieve an object, the proxy server needs to know which object server stores the object and the path of the object in the local filesystem of that server.
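
The lookup order described above, account first, then container, then object, can be sketched with in-memory dictionaries standing in for the three databases. All names, the server label and the file path below are hypothetical.

```python
# Hypothetical stand-ins for the account, container and object databases.
ACCOUNTS = {"AUTH_demo": {"reports"}}
CONTAINERS = {("AUTH_demo", "reports"): {"chapter3.txt"}}
OBJECTS = {
    ("AUTH_demo", "reports", "chapter3.txt"): {
        "server": "object-server-1",
        "path": "/srv/node/dev0/objects/chapter3.data",
    },
}


def locate_object(account: str, container: str, name: str) -> dict:
    """Resolve an object step by step and return its retrieval information."""
    if account not in ACCOUNTS:
        raise LookupError("account not found")
    if container not in ACCOUNTS[account]:
        raise LookupError("container not found")
    if name not in CONTAINERS[(account, container)]:
        raise LookupError("object not found")
    # The proxy needs both the object server and the local path.
    return OBJECTS[(account, container, name)]


if __name__ == "__main__":
    print(locate_object("AUTH_demo", "reports", "chapter3.txt"))
```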

3.6 DATA MODEL OF DATA REPLICATION IN OPENSTACK

Figure 3.6: Flowchart of Data Replication Conduct in Swift

[Figure 3.6 flowchart labels: User request; Proxy server; Authentication; Obtain container metadata; Get all possible nodes according to partitions and object ring files; Pick three nodes; Establish connection with three container nodes; Object server: handle writing data to disk and updating its metadata; Container server: update object record in the container; Proxy server: at least 2 replicas written (Yes/No); Update container information; Service storage]

Figure 3.6 represents the flowchart of how data replication is conducted in Swift. When a user requests data replication by entering a certain command, the request has to pass through several nodes and servers before the replicas are either successfully created or not. From the available OpenStack service, the user needs to use the GET and PUT commands to request and upload data in the object storage database provided in order to make data replicas. Therefore, through this project we are going to provide convenience for data management, especially for data replication, via automation on the OpenStack dashboard.
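
The success check in the flowchart, at least two of three replicas written, can be sketched as a majority rule. The function below is an illustration only: the caller supplies the function that writes to a node, and the node names are hypothetical. It assumes a simple majority, which gives two for three replicas as stated in the text.

```python
def write_with_quorum(data: bytes, nodes: list, write_to_node) -> bool:
    """Try to write `data` to every node and succeed on a majority.

    `write_to_node(node, data)` is a caller-supplied function that returns
    True when the node stored the data, and returns False or raises
    OSError otherwise.
    """
    quorum = len(nodes) // 2 + 1      # 2 of 3 replicas
    written = 0
    for node in nodes:
        try:
            if write_to_node(node, data):
                written += 1
        except OSError:
            pass                      # treat an I/O error as a failed write
    return written >= quorum


if __name__ == "__main__":
    storage = {}

    def fake_write(node, data):
        # Simulate one unavailable node out of three.
        if node == "node2":
            return False
        storage[node] = data
        return True

    print(write_with_quorum(b"payload", ["node1", "node2", "node3"], fake_write))
```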

3.7 PROOF OF CONCEPT

Figure 3.7.1 shows the commands used on the CentOS terminal: the first creates the OpenStack Packstack file, while the second is an instruction to open the Packstack file.

Figure 3.7.1: Creating Packstack File

Figure 3.7.2 represents the configuration of the OpenStack components after opening the Packstack file. The components needed for this project were configured by changing the ‘n’ notation to ‘y’.

Figure 3.7.2: Configuration of OpenStack Component
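
The manual change from ‘n’ to ‘y’ could also be scripted. The sketch below works on the text of the file and assumes it uses `KEY=n` lines; the key names shown, such as `CONFIG_SWIFT_INSTALL`, are assumptions about the file contents, since the figure itself is not reproduced here.

```python
import re


def enable_components(answer_text: str, keys: list) -> str:
    """Return the file text with each `KEY=n` line changed to `KEY=y`."""
    for key in keys:
        # Match the whole line so that similarly named keys are untouched.
        pattern = r"^{}=n[ \t]*$".format(re.escape(key))
        answer_text = re.sub(pattern, key + "=y", answer_text,
                             flags=re.MULTILINE)
    return answer_text


if __name__ == "__main__":
    # Assumed sample content, for illustration only.
    sample = ("CONFIG_SWIFT_INSTALL=n\n"
              "CONFIG_HORIZON_INSTALL=y\n"
              "CONFIG_HEAT_INSTALL=n\n")
    print(enable_components(sample, ["CONFIG_SWIFT_INSTALL",
                                     "CONFIG_HORIZON_INSTALL"]))
```

Keys that are already set to ‘y’ and keys that are not listed are left unchanged.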

Figure 3.7.3 shows the successful installation of OpenStack.

Figure 3.7.3: Installation of OpenStack

REFERENCES

[1] Zhang, R., & Liu, L. (2010, July). Security models and requirements for healthcare

application clouds. In Cloud Computing (CLOUD), 2010 IEEE 3rd International

Conference on (pp. 268-275). IEEE.

[2] Aiftimiei, C., Costantini, A., Bucchi, R., Italiano, A., Michelotto, D., Panella, M. &

Zizzi, G. (2017, October). Cloud Environment Automation: from infrastructure

deployment to application monitoring. In Journal of Physics: Conference Series (Vol.

898, No. 8, p. 082016). IOP Publishing.

[3] Gaikwad, C., Churi, B., Patil, K., & Tatwadarshi, P. N. (2017, March). Providing

storage as a service on cloud using OpenStack. In Innovations in Information,

Embedded and Communication Systems (ICIIECS), 2017 International Conference on

(pp. 1-4). IEEE.

[4] Girish, L. S., & Guruprasad, H. S. (2014). Building private cloud using OpenStack.

International Journal of Emerging Trends & Technology in Computer Science

(IJETTCS), 3(3).

[5] Sobeslav, V., & Komarek, A. (2015). Opensource automation in cloud computing. In

Proceedings of the 4th International Conference on Computer Engineering and

Networks (pp. 805-812). Springer, Cham.

[6] https://www.openstack.org/software, last access on: 03/10/2018.

[7] Sun, D. W., Chang, G. R., Gao, S., Jin, L. Z., & Wang, X. W. (2012). Modeling a

dynamic data replication strategy to increase system availability in cloud computing

environments. Journal of computer science and technology, 27(2), 256-272.

[8] Yadav, S. (2013). Comparative study on open source software for cloud computing

platform: Eucalyptus, OpenStack and OpenNebula. International Journal of

Engineering And Science, 3(10), 51-54.

[9] Lombardi, F., & Di Pietro, R. (2011). Secure virtualization for cloud computing.

Journal of network and computer applications, 34(4), 1113-1122.

[10] Cook, J. D., Primmer, R., & de Kwant, A. (2014). Compare cost and performance of

replication and erasure coding. Hitachi Review, 63, 304.

[11] http://searchcloudstorage.techtarget.com/feature/How-opensource-Swift-OpenStack-

Object-Storage-works last access on: 27/11/2018.

[12] Nunez, M., Nguyen, N. T., Camacho, D., & Trawinski, B. (Eds.). (2015).

Computational Collective Intelligence: 7th International Conference, ICCCI 2015,

Madrid, Spain, September 21-23, 2015, Proceedings (Vol. 9330). Springer.

[13] Saranya, N., & Nivedha, S. (2016, January). Implementing authentication in an

Openstack environment-survey. In Computer Communication and Informatics

(ICCCI), 2016 International Conference on (pp. 1-7). IEEE.

[14] Pol, D. U. (2014). Cloud Computing with Open Source Tool: OpenStack. American

Journal of Engineering Research (AJER), 3(9), 233-240.

[15] Rosado, T., & Bernardino, J. (2014, July). An overview of openstack architecture. In

Proceedings of the 18th International Database Engineering & Applications

Symposium (pp. 366-367). ACM.

[16] Schafer, D. R., Rothermel, K., & Tariq, M. A. (2018). Replication Schemes for

Highly Available Workflow Engines. IEEE Transactions on Services Computing, (1),

1-1.

[17] Kulkarni, B., & Bhosale, V. (2016, August). Efficient storage utilization using erasure

codes in OpenStack cloud. In Inventive Computation Technologies (ICICT),

International Conference on (Vol. 3, pp. 1-5). IEEE.

[18] Li, L., Li, D., Su, Z., Jin, L., & Huang, G. (2016). Performance analysis and

framework optimization of open source cloud storage system. China

Communications, 13(6), 110-122.

[19] Jain, P., Datt, A., Goel, A., & Gupta, S. C. (2016, September). Cloud service

orchestration based architecture of OpenStack Nova and Swift. In Advances in

Computing, Communications and Informatics (ICACCI), 2016 International

Conference on (pp. 2453-2459). IEEE.

[20] Kulkarni, B., & Bhosale, V. (2016, August). Efficient storage utilization using erasure

codes in OpenStack cloud. In Inventive Computation Technologies (ICICT),

International Conference on (Vol. 3, pp. 1-5). IEEE.