AUTOMATION REPLICATION IN
OPENSTACK
WAN NUR ATHIQAH SHAFIRAH BINTI WAN MUHAMMAD
SHAFAWI
BACHELOR OF COMPUTER SCIENCE (COMPUTER
NETWORK SECURITY) WITH HONORS
UNIVERSITI SULTAN ZAINAL ABIDIN
2018
AUTOMATION REPLICATION IN
OPENSTACK
WAN NUR ATHIQAH SHAFIRAH BINTI WAN MUHAMMAD
SHAFAWI
BACHELOR OF COMPUTER SCIENCE (COMPUTER
NETWORK SECURITY) WITH HONORS
FACULTY OF INFORMATICS AND COMPUTING
UNIVERSITI SULTAN ZAINAL ABIDIN, TERENGGANU,
MALAYSIA
2018
DECLARATION
I hereby declare that this report is based on my original work except for quotations and
citations, which have been duly acknowledged. I also declare that it has not been previously
or concurrently submitted for any other degree at Universiti Sultan Zainal Abidin or other
institutions.
____________________________
Name: Wan Nur Athiqah Shafirah binti Wan
Muhammad Shafawi
Date:
CONFIRMATION
This is to confirm that:
The research conducted and the writing of this report were under my supervision.
___________________________
Name: Prof. Madya Dr. Zarina binti Mohamad
Date:
DEDICATION
In the Name of Allah, the Most Merciful, the Most Compassionate. All praise be to
Allah, the Lord of the worlds, and prayers and peace be upon Mohamed, His servant and
messenger.
This thesis is dedicated to my beloved supervisor, Prof. Madya Dr. Zarina binti
Mohamad, who has been a constant source of support and encouragement throughout the
challenges of completing this project. I would like to sincerely thank her for her guidance
throughout this study and especially for her confidence in me. I would also like to warmly
thank the panel members, Prof. Madya Dr. Mohamad Afendee bin Mohamed and Dr. Aznida Hayati
binti Zakaria @ Mohamad, for the constructive criticism and ideas that have provided a good
basis for this thesis.
I would also like to express my deep and sincere gratitude to my beloved parents,
Wan Muhammad Shafawi bin Wan Ibrahim and Rosmini binti Mat Jusoh, for their trust and
constant encouragement to complete my studies. Without their support and understanding, it
would have been impossible for me to finish this work.
Last but not least, I would like to thank all my friends for kindly sharing their
knowledge, moral support and help in completing this thesis. My deepest thanks go to
everyone who took part in making this thesis possible.
ABSTRACT
Nowadays, more and more devices such as mobile phones, tablet PCs and laptops are used in
every field. As a result, large files are conveniently stored or backed up in cloud storage.
Cloud computing is a collection of computing resources provided over the Internet. The cloud
has various benefits such as broad network access, cost reduction, ease of use, time savings,
on-demand delivery of its services and support for any device with an Internet connection.
Optimizing the performance of cloud storage is very important for Internet development and
user convenience. This paper presents automation replication in OpenStack. OpenStack is an
open-source platform that provides cloud as a service. The OpenStack cloud computing
components provide many features that help users manage their own cloud storage from the
OpenStack dashboard. This paper mainly focuses on the automation mechanism of cloud storage
as well as the replication methods used to process different files for the purpose of data
availability in cloud storage. This work contributes by addressing the challenge of
implementing an automation replication feature in the OpenStack dashboard, which the project
developer does by modifying the suitable Command Line Interface (CLI). The OpenStack Swift
system needs to be configured, and we need to analyse how the components relate to each
other to achieve the expected objectives.
CONTENTS
DECLARATION
CONFIRMATION
DEDICATION
ABSTRACT
CONTENTS
LIST OF FIGURES
CHAPTER I
INTRODUCTION
1.1 BACKGROUND
1.2 PROBLEM STATEMENT
1.3 OBJECTIVE
1.4 SCOPE
1.5 LIMITATION OF WORK
CHAPTER II
LITERATURE REVIEW
2.1 INTRODUCTION
2.2 CLOUD COMPUTING
2.2.1 CLOUD COMPUTING CONCEPT
2.2.2 CLOUD ARCHITECTURE
2.3 SOFTWARE ON CLOUD
2.3.1 OPENSTACK
2.3.2 OPENSTACK ARCHITECTURE
2.4 RESEARCH ON RELATED TOPIC
2.4.1 AUTOMATED CLOUD COMPUTING SCENARIOS
2.4.2 REPLICATION CONCEPT
2.4.3 DATA AVAILABILITY
2.4.4 OPENSTACK OBJECT STORAGE (SWIFT)
2.6 SUMMARY
CHAPTER III
METHODOLOGY
3.1 INTRODUCTION
3.2 PROJECT METHODOLOGY
3.3 PROJECT FRAMEWORK
3.2.1 INSTALLATION AND CONFIGURATION OF SOFTWARE
3.2.2 SWIFT COMPONENT
3.2.3 ADD NEW PROGRAM
3.2.4 HORIZON COMPONENT
3.4 OPENSTACK ARCHITECTURE
3.5 OPENSTACK SWIFT OBJECT STORAGE SERVICE PLATFORM
3.6 DATA MODEL OF DATA REPLICATION IN OPENSTACK
3.7 PROOF OF CONCEPT
REFERENCES
LIST OF FIGURES
FIGURE TITLE
3.2 Methodology Phase for Automation Replication in OpenStack
3.3 Framework of Automation Replication in OpenStack
3.4.1 OpenStack Conceptual Architecture
3.4.2 OpenStack Logical Architecture
3.5 OpenStack Object Storage Service
3.6 Flowchart of Data Replication Conduct in Swift
3.7.1 Creating Packstack File
3.7.2 Configuration of OpenStack Component
3.7.3 Installation of OpenStack
CHAPTER I
INTRODUCTION
1.1 BACKGROUND
Cloud computing is an emerging computing paradigm that makes use of contemporary
virtual machine technology. Cloud computing allows users to access a pool of shared
computing resources such as storage, networks, applications and services that can be rapidly
provisioned [4]. The combination of Internet and virtual machine technologies enables cloud
computing to emerge as a paradigm with promising prospects for facilitating the development
of large-scale, flexible computing infrastructures that are available on demand to meet the
computational requirements of e-Science applications. Cloud computing technology attracts
ICT service providers because it offers tremendous opportunities for the online distribution
of services. End users benefit from the convenience of accessing data and services globally,
from centrally managed backups, high computational capacity and flexible billing strategies.
Cloud computing is a service in which data is remotely maintained, managed and backed up for
users over a network. Several open source platforms are available for building a cloud
computing service, such as OpenStack, Eucalyptus, CloudStack and OpenNebula.
OpenStack is a set of software tools for building and managing cloud computing
platforms for public and private clouds. It is an open source platform for cloud computing
that is easily available to users, who can use it to deploy service models. It is
component-based software made up of various projects, each with its own code name, that
together deliver cloud services. OpenStack is used to build private clouds that provide
three service delivery models: Infrastructure as a Service (IaaS), Software as a Service
(SaaS) and Platform as a Service (PaaS). It pools resources such as storage, networking
and processing throughout a datacenter. It can be managed through a web-based dashboard or
via the OpenStack API. It allows administrators to keep control through the dashboard while
empowering their users to provision resources through a web interface. OpenStack works with
popular enterprise and open source technologies, which makes it well suited to heterogeneous
infrastructure. In April 2012, Rackspace announced that it would implement OpenStack Compute
as the underlying technology for its Cloud Servers product. The change came with a new
control panel as well as add-on cloud services offering databases, server monitoring, block
storage and virtual networking. Nowadays, OpenStack is one of the major cloud tools used in
industry. OpenStack is made up of several components, each with a particular function,
including Nova, Swift, Glance, Horizon and Keystone [6].
Swift is a two-tier storage system consisting of a proxy tier, which handles all incoming
requests, and an object storage tier, where the actual data is stored. Swift uses a data
structure called a ring to map the URL of an object to the particular location in the
cluster where the object is stored. Swift is used for storing objects and files; it provides
scalability and ensures data integrity and data replication. Swift enables users to store,
retrieve and delete objects, together with their associated metadata, in containers via a
RESTful HTTP API. Metadata is a set of data that describes and gives information about other
data. Swift keeps three copies of each object [11]. When storing the copies, it tries to
spread them out over different servers and disks so that the failure of a single component
does not cause data loss. A cloud computing system needs at least twice the number of
storage devices required to hold its information so that it can tolerate failures, including
those that occur during data transmission across the network. A cloud computing system must
make a copy of all clients' information and store it on other devices. The copies enable the
central server to access backup machines to retrieve data that would otherwise be
unreachable. Making copies of data as a backup is called replication. Data replication
reduces user waiting time, speeds up data access and increases data availability by
providing the user with different replicas of the same service, all of them in a coherent
state [7]. Replication is a frequently used technique in cloud systems such as GFS (Google
File System) and HDFS (Hadoop Distributed File System).
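The ring lookup and replica placement described above can be sketched as follows. This is a minimal illustration, not Swift's actual ring code: the device list, zone names, partition power and the helper names partition_for and devices_for are all hypothetical, and the real ring also takes device weights and a cluster-specific hash prefix and suffix into account.

```python
import hashlib

PART_POWER = 4   # hypothetical value: 2**4 = 16 partitions
REPLICAS = 3     # three copies of each object, as stated in the text

# Hypothetical devices, given as (device name, zone) pairs.
DEVICES = [
    ("sdb1", "zone1"), ("sdc1", "zone1"),
    ("sdb2", "zone2"), ("sdc2", "zone2"),
    ("sdb3", "zone3"), ("sdc3", "zone3"),
]


def partition_for(account, container, obj):
    """Hash the object path and keep the top PART_POWER bits as the partition."""
    path = f"/{account}/{container}/{obj}".encode("utf-8")
    digest = hashlib.md5(path).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - PART_POWER)


def devices_for(partition):
    """Walk the device list from an offset and pick one device per zone."""
    chosen, used_zones = [], set()
    for step in range(len(DEVICES)):
        name, zone = DEVICES[(partition + step) % len(DEVICES)]
        if zone not in used_zones:
            chosen.append(name)
            used_zones.add(zone)
        if len(chosen) == REPLICAS:
            break
    return chosen


part = partition_for("AUTH_demo", "photos", "cat.jpg")
print(part, devices_for(part))
```

Because each copy lands in a different zone, losing a single disk or server in this toy layout still leaves two reachable copies of the object.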
In this project, we aim to improve data availability in cloud computing through the Swift
component, using OpenStack as the cloud platform. OpenStack users will be able to manage the
automation replication service through the web-based dashboard.
In conclusion, cloud computing is an Internet-based style of computing that provides
hardware and software resources and information according to users' needs [1]. The OpenStack
cloud platform enables consumers to hire and utilize infrastructure resources such as
compute power, storage and networking on an as-needed basis. Based on their requirements,
users can access and utilize the resources as a service instead of purchasing them. Swift is
the open source storage subproject of OpenStack. It is an object storage service that is
provisioned in a scalable manner by the storage component. It uses standard servers to
build a redundant, scalable, distributed object storage cluster whose storage capacity can
reach the petabyte (PB) level [9]. The Swift component enables users to replicate data in
cloud storage.
1.2 PROBLEM STATEMENT
OpenStack is advanced open source cloud computing software that offers various
features and services to end users. One important feature of cloud computing is the
availability of data, and OpenStack covers it. However, automation replication is not
implemented alongside the other OpenStack features. Replication is important in a cloud
service to ensure the availability of requested data. A replication technique is included
in OpenStack via the Swift component [17].
1.3 OBJECTIVE
There are three main objectives for this project. These objectives are derived to overcome
the problems stated above. The objectives are:
To propose an automation replication in OpenStack
To design an automation replication framework in OpenStack
To implement automation of replication in OpenStack
1.4 SCOPE
The project scope covers both administrators and users. The administrator provides
maintenance to achieve the reliability and availability of data in OpenStack. The users'
scope covers how conveniently users can manage the application on their own.
1.5 LIMITATION OF WORK
During the development of this project, a few challenges may occur, including the
large storage capacity required to install the software needed.
CHAPTER II
LITERATURE REVIEW
2.1 INTRODUCTION
This chapter describes existing research conducted by other parties that is related to
the ongoing project. Research is an important aspect in which a study is carried out to
obtain information related to the proposed project. When the study is conducted, we can
examine all the available information that is related to the project. This chapter also
describes the techniques, methods, equipment or technology that are suitable for the
implementation of the project or study. The references used for this chapter include journal
articles, Internet sources and theses.
2.2 CLOUD COMPUTING
2.2.1 CLOUD COMPUTING CONCEPT
Cloud computing is a new delivery and consumption model for IT services. It
involves the provision of dynamically scalable and often virtualized resources, typically
over the Internet. Cloud computing can also be described as “computing in an independent or
remote location with shared resources available on demand” [12]. Cloud computing is an
emerging concept that combines many fields of computing. It provides and enables the use of
distributed computing, storage resources and services that have been developed, thoroughly
tested and adopted by industry, science and government [2]. Nowadays, more and more data is
generated in different forms, and there is always a concern about how to store it.
Therefore, the cloud is adopted as one solution for storing data.
Cloud computing allows users to access a pool of shared computing resources such as
storage, networks, applications and services that can be rapidly supplied. Cloud computing
has three primary service delivery models: Infrastructure as a Service (IaaS), Software as a
Service (SaaS) and Platform as a Service (PaaS). The essential features of cloud computing
include broad network access, on-demand self-service, resource pooling, measured service and
rapid elasticity. The four deployment models of cloud are private cloud, public cloud,
community cloud and hybrid cloud.
2.2.2 CLOUD ARCHITECTURE
Cloud computing architecture refers to the components and subcomponents required
for cloud computing.
The various cloud based services have their own distinct and unique cloud architectures:
• Software as a Service (SaaS) involves software hosted and maintained on the Internet.
With SaaS, users do not have to install the software locally.
• Development as a Service (DaaS) involves web-based development tools shared
across communities.
• Platform as a Service (PaaS) provides users with application platforms and databases.
• Infrastructure as a Service (IaaS) provides infrastructure and hardware such as
servers, networks and storage devices.
2.3 SOFTWARE ON CLOUD
S. Yadav (2013) discusses cloud computing technology and its basic concepts, and
explains the demand for private clouds and their deployment [10]. The paper describes three
open source software packages and shows how to deploy a private cloud using open source
software such as OpenStack, Eucalyptus and OpenNebula. It presents a comparative study of
these three packages based on their architecture, covering the cloud implementation,
programming language, database compatibility and OS compatibility of each. This summary and
comparison helps in choosing the better service according to requirements.
2.3.1 OPENSTACK
OpenStack is an open source platform for cloud computing that is easily available to
users, who can use it to deploy service models. OpenStack is used to build private clouds
that provide three service delivery models: Infrastructure as a Service (IaaS), Software as
a Service (SaaS) and Platform as a Service (PaaS). It pools resources such as storage,
networking and processing throughout a datacenter, which users can manage through a
web-based dashboard [4]. OpenStack has numerous components, each providing a different
service. OpenStack is an open source software platform that developers and researchers can
use to set up and run different types of cloud. OpenStack is mainly used to implement a
private cloud.
S. Nivedha et al. (2015) describe how to build a private cloud using open source software
[13]. They also describe various features of the OpenStack software and the services and
capabilities it provides to end users. Dr. U. Pol (2014) states that OpenStack open source
software is used to provide three software deployment models, namely IaaS, PaaS and SaaS
[19]. Various tools are provided on top of the existing system to create and manage virtual
machines (VMs).
2.3.2 OPENSTACK ARCHITECTURE
The research paper by Tiago Rosado and Jorge Bernardino (2014) gives an overview of the
functionality of the OpenStack software components for designing and implementing unique
cloud computing solutions that fit an enterprise's purposes [15]. The paper gives an
updated and detailed overview of the OpenStack architecture and highlights the essential
services that need to be installed. It also describes the services available, how they work
and how they adapt to any hardware infrastructure to provide a sustainable and robust
private cloud solution.
The primary components of OpenStack are [14]:
Nova (Compute): OpenStack Compute (Nova) is a cloud computing fabric controller which is
used to deploy and manage large numbers of virtual machines and other instances to handle
computing tasks.
Swift (Object Storage): OpenStack Object Storage (Swift) is a scalable, redundant storage
system for objects and files. Objects and files are written to multiple disk drives spread
across servers in the datacenter, with the OpenStack software responsible for ensuring data
replication and integrity across the cluster.
Cinder (Block Storage): OpenStack Block Storage (Cinder) is a block storage component,
which is more analogous to the traditional notion of a computer being able to access specific
locations on a disk drive. It also provides persistent block-level storage devices for use with
OpenStack compute instances. The block storage system in OpenStack manages the creation,
attaching, detaching of the block devices to servers.
Neutron (Networking): OpenStack Networking (Neutron) provides the networking capability
for OpenStack. It is a system implemented for managing networks and IP addresses easily,
quickly and efficiently.
Horizon (Dashboard): OpenStack Dashboard (Horizon) is the dashboard behind OpenStack
which provides administrators and users a graphical interface to access, provision and
automate cloud-based resources.
Keystone (Identity Service): OpenStack Identity (Keystone) provides identity services for
OpenStack. It acts as a central directory of users mapped to the OpenStack services
that they can access. It provides multiple means of access, acts as a common
authentication system across the cloud operating system and can integrate with existing
backend directory services such as LDAP (Lightweight Directory Access Protocol).
Glance (Image Service): OpenStack Image Service (Glance) provides image services to
OpenStack application. It provides discovery, registration and delivery services for disk and
server images. It also allows these images to be used as templates when deploying new
virtual machine instances.
Ceilometer (Telemetry Service): OpenStack Telemetry Service (Ceilometer) provides
telemetry services which allow the cloud to provide billing services to individual users of the
cloud. It keeps a verifiable count of each user’s system usage of each of the various
components of an OpenStack cloud.
Heat (Orchestration): OpenStack Orchestration (Heat) is a service which allows developers
to store the requirements of a cloud application in a file that defines what resources are
necessary for that application.
Trove (Database): OpenStack Database (Trove) is a database-as-a-service component that
provides relational and non-relational database engines.
2.4 RESEARCH ON RELATED TOPIC
2.4.1 AUTOMATED CLOUD COMPUTING SCENARIOS
Dr. H. Guruprasad (2014) and S. Nivedha and N. Saranya (2015) introduce cloud
computing technology and OpenStack platform concepts [4] [13]. Two types of cloud, private
and public, are described. These papers introduce a recent and powerful technique for
creating a private cloud using the OpenStack open source platform on the Ubuntu open source
operating system. They also give information about implementing a private cloud using open
source software, where the private cloud provides the Infrastructure as a Service and
Platform as a Service models. An implementation of a private cloud should be able to launch
different flavours of images, instances and services. With the help of the OpenStack
dashboard, users can personally manage the different OpenStack services.
The research by Vladimir Sobeslav and Ales Komarek (2014) covers the basic open
source automation and configuration management tools that can be used to alleviate common
operations tasks and processes in cloud systems. It gives a quick survey of major cloud
computing solutions and introduces a simple abstraction layer for the management of physical
and virtual resources. The last chapter of the paper covers some common automation
scenarios from both the cloud computing provider and consumer perspectives, together with
their possible open-source implementations [5].
Operating enterprise systems without automation is not sustainable in the long
term. We can see new cloud computing resources developing rapidly to cope with the
increasing demand to support these complex infrastructures. The open-source world
offers enterprise-level solutions to many common automation scenarios. The ability to
automate operations and transition processes through continuous integration has become
crucial in the IT environment. With open-source automation, we utilize the laboratory
hardware resources more efficiently by turning the system architecture into a fully
fledged IaaS solution that can support educational as well as research projects. Our own
cloud computing platform allows us to automate the provisioning of new virtual servers and
thus adopt the last missing step needed to continuously integrate the process. With this
infrastructure, we can continuously test the open-source automation scenarios involving the
installation of the OpenStack platform on physical servers and the deployment of virtual
servers for education and various distributed systems.
2.4.2 REPLICATION CONCEPT
Replication is a process in which a whole object is replicated some number of times [8]
to provide protection if a copy of the object is lost or unavailable. Replication is used to
achieve reliability and availability of data in cloud storage. Availability can be achieved
by designating at least one person who would have access to the files at all times.
Replication can prevent data loss even if an interruption occurs during data transmission
over the network. It also minimizes network delays and bandwidth usage. We consider both the
energy efficiency and the bandwidth consumption of the system. This is in addition to the
improved Quality of Service (QoS) obtained as a result of the reduced communication delays.
The results obtained from the simulator might help to unveil performance and energy
efficiency trade-offs as well as guide the design of future data replication solutions. Data
replication across multiple datacenters is important to prevent data loss. In the event of a
disaster in one area, data could still be accessed from other regions and users would be
unaware of any problems. For example, if a disaster such as a snowstorm caused a power
supply failure in the Southeast, your data would be served from a datacenter in another
region. This illustrates why replication is a widely used mechanism for ensuring availability
in the presence of failures.
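The regional failover behaviour described above can be sketched in a few lines. This is only an illustration under stated assumptions: the region names, the in-memory dictionaries standing in for datacenters and the helper read_with_failover are hypothetical, and a real deployment would issue network requests rather than dictionary lookups.

```python
def read_with_failover(regions, key):
    """Return (region name, data) from the first reachable region holding key.

    regions is an ordered list of (name, store) pairs; a store of None
    models a region that is currently offline.
    """
    for name, store in regions:
        if store is None:
            continue  # region is down, try the next replica
        if key in store:
            return name, store[key]
    raise KeyError(key)


# The Southeast region is offline, so the read is served from another region.
regions = [
    ("southeast", None),
    ("northwest", {"report.pdf": b"replica bytes"}),
    ("central", {"report.pdf": b"replica bytes"}),
]
print(read_with_failover(regions, "report.pdf"))
```

The caller receives the same data regardless of which region answered, which is why users need not notice the outage.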
According to the research paper by David Richard Schafer, Kurt Rothermel and
Muhammad Adnan Tariq (2018), the main purpose of replication is to improve
availability [16]. They present novel replication schemes for ensuring the availability of
workflows executing in heterogeneous environments. Their evaluations show that the
replication schemes significantly increase availability in the presence of failures, while
introducing low overhead and remaining scalable.
High availability, high fault tolerance and highly efficient access are significant issues
for cloud data centers, where failures are normal rather than exceptional because of the
large scale of data they support. Data replication reduces user waiting time, speeds up data
access and increases data availability by providing the user with different replicas of the
same service, all of them in a coherent state. Replication is a frequently used technique in
cloud systems such as GFS (Google File System) and HDFS (Hadoop Distributed File System).
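A short calculation shows why additional replicas increase availability. The sketch below assumes, purely for illustration, that each replica is reachable with the same probability p and that replicas fail independently; neither assumption comes from the cited works, and the function name availability is hypothetical. Under those assumptions, the data is unavailable only when every replica is down at once.

```python
def availability(p, replicas):
    """Probability that at least one of the independent replicas is reachable."""
    return 1 - (1 - p) ** replicas


# With an assumed per-replica availability of 90%, compare one to three copies.
for n in (1, 2, 3):
    print(n, round(availability(0.9, n), 6))
```

In practice, failures are often correlated (for example, replicas sharing a rack or a power supply), which is one reason for spreading copies across different servers and regions.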
2.4.3 DATA AVAILABILITY
Data availability means that resources must be available to authorized users at all
times, as and when required. They must always be accessible, even during a network failure
or when a whole datacenter has gone offline. It guarantees that the system works accurately
and that the service is accessible to authorized users only [3]. According to Bhagyashri
Kulkarni and Varsha Bhosale (2016), OpenStack is advanced open source cloud computing
software that offers various features and services to end users. Replication is important
in a cloud service because it ensures the availability of requested data [20].
2.4.4 OPENSTACK OBJECT STORAGE (SWIFT)
In OpenStack, the Swift component provides object storage capabilities. Swift uses
replication as its strategy for achieving the reliability, availability and fault tolerance
of its storage: it keeps more than one copy of each object (typically three copies).
Lei Li, Dagang Li, Zhiliang Su, Lianwen Jin and Ganbo Huang present the technical
details of OpenStack Swift, such as how the hardware configuration, the components and
different parameter configurations affect performance [18]. They state that there are still
many unknown environmental factors of cloud storage to be discovered in real deployments
and evaluations. Thus, an evaluation of OpenStack Swift as a real commercial cloud storage
deployment is needed in future work.
Previously, most file systems used the RAID (Redundant Array of Independent Disks)
structure to achieve data replication. Although the RAID structure can back up data onto
different hard disks in a single server, it costs too much when the cluster is large, and it
cannot distribute the backup files across different servers to avoid a single point of
failure. Currently, a cloud storage system can provide a replica data store scheduler that
places the backup files on different servers, so that it can keep the storage system in a
consistent state in case of a temporary error such as a hard disk failure or power outage.
2.6 SUMMARY
From this chapter, we can conclude that reviewing previous research is an important
element in developing a project because it builds more knowledge about the topic and
shows how previous researchers carried out their research. It is important as a guide to
avoid repeating the same mistakes or re-establishing ideas and techniques that have already
been used. This chapter is intended to provide an overview of the concept of the project
based on the explanations given.
CHAPTER III
METHODOLOGY
3.1 INTRODUCTION
This chapter reports on the approach, model development and application of a
comprehensive framework used in the development of the system, application or
implementation of the study. It contains the methods, techniques or approaches that will be
used during the design and implementation of the project. The selection of a suitable
methodology for the project development is very important. Choosing the wrong methodology
can cause many problems, and the project might not be completed on schedule. Moreover, the
proposed project might fail completely because the developer might lose the guidance needed
to complete the project development. All the phases involved in this project are detailed
for further understanding.
3.2 PROJECT METHODOLOGY
Figure 3.2: Methodology Phase for Automation Replication in OpenStack
Figure 3.2 shows the methodology phases for Automation Replication in OpenStack. It
consists of five phases:
PHASE 1: INITIATING
The beginning phase of the project methodology defines the starting point of a new
project or a new phase of an existing project. Information regarding the project was gathered
before making any decision on the scope and topic of the project to be proposed to
the panels. This phase helps the developer to build knowledge based on their interest.
PHASE 2: PLANNING
The project scope and background were discussed to make sure the chosen topic satisfied
the course needs, level, scope and rules stated during the first course briefing. The
problem was identified, and we determined whether it satisfied the course scope. Then, we
came up with several objectives and reviewed many references from previous researchers
to compile the literature review. This is the stage where we planned the project based on
the chosen topic.
PHASE 3: ANALYSIS AND DESIGN
In this phase, we created a framework and data flow diagram based on the information
gathered for this project. The suitable methods, techniques and platform were analysed.
The commands for the automation feature were also modified at this stage based on the
requirements.
PHASE 4: TESTING
All modifications must be tested during the testing phase to check their functionality.
If any errors occur, they will appear at this point of the process, so we still have a
chance to correct and improve the commands. It is important to test the project at this
stage to confirm that the design can be implemented in OpenStack.
PHASE 5: DEPLOYMENT
After checking whether the requirements satisfy the course scope, the project can
proceed to deployment.
3.3 PROJECT FRAMEWORK
Figure 3.3: Framework of Automation Replication in OpenStack
3.3.1 INSTALLATION AND CONFIGURATION OF SOFTWARE
Figure 3.3 shows the overall framework of automation replication in OpenStack. The
first stage in the framework shows the software used to carry out this project, which includes
VMware Workstation, CentOS and OpenStack. First, VMware Workstation is installed and
configured to host the virtual machine. Then, CentOS is installed and configured in VMware
Workstation, before OpenStack is installed and configured on CentOS.
Stages shown in Figure 3.3: installation and configuration of OpenStack in CentOS (software);
add new program (data management); display.
3.3.2 SWIFT COMPONENT
Swift is one of the components included in OpenStack, and it provides the object storage
service in the cloud system. OpenStack Swift is a distributed object storage system in which
files are read and written through a REST (representational state transfer) API
(application programming interface), enabling large quantities of data to be stored and
shared among applications.
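As an illustration of the REST interface described above, the following minimal sketch builds
(but does not send) the PUT and GET requests for one object. The storage URL, token, container
and object names are hypothetical placeholders and not values from this project; the path
layout <storage URL>/<container>/<object> and the X-Auth-Token header follow the public Swift
API.

```python
from urllib.parse import quote
from urllib.request import Request


def object_url(storage_url, container, obj):
    """Return <storage_url>/<container>/<object> with URL-escaped names."""
    # Slashes inside the object name are kept so pseudo-folders still work.
    return "/".join([storage_url.rstrip("/"), quote(container, safe=""), quote(obj)])


def build_put(storage_url, token, container, obj, data):
    """Build a PUT request that would upload `data` as the object body."""
    return Request(
        object_url(storage_url, container, obj),
        data=data,
        method="PUT",
        headers={"X-Auth-Token": token},
    )


def build_get(storage_url, token, container, obj):
    """Build a GET request that would download the object."""
    return Request(
        object_url(storage_url, container, obj),
        method="GET",
        headers={"X-Auth-Token": token},
    )


if __name__ == "__main__":
    # Hypothetical endpoint and token, for illustration only.
    req = build_put("http://controller:8080/v1/AUTH_demo", "example-token",
                    "backups", "notes/report.txt", b"hello")
    print(req.get_method(), req.full_url)
```

Passing either request to urllib.request.urlopen would send it to a running Swift proxy
server; the sketch stops before that step so it can be read without a deployment.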
3.3.3 ADD NEW PROGRAM
For this project, we modify and add a new program related to the automation
replication feature. The modification focuses on the Swift and Horizon components, but the
other components in OpenStack also need to be analysed to determine whether they are related
to the project.
3.3.4 HORIZON COMPONENT
Horizon is the OpenStack dashboard component, which provides a web-based user interface
to OpenStack services. The automation replication feature that is developed will be
displayed on the OpenStack dashboard via Horizon.
3.4 OPENSTACK ARCHITECTURE
Figure 3.4.1 represents the OpenStack conceptual architecture with all native software
components, which are developed by companies and individual supporters, showing how they
interact with each other.
Figure 3.4.1: OpenStack Conceptual Architecture
Figure 3.4.2 shows the OpenStack logical architecture with all native software
components, which are developed by companies and individual supporters, showing how they
interact with each other.
Figure 3.4.2: OpenStack Logical Architecture
3.5 OPENSTACK SWIFT OBJECT STORAGE SERVICE PLATFORM
Figure 3.5: OpenStack Object Storage Service
OpenStack Swift is an open source distributed object storage platform that can be deployed
in different configurations to satisfy different requirements. It has several
components that provide different functions, such as the file store location scheduler,
lookup service, metadata manager and failure recovery. It provides file availability by
keeping more than one replica of the data on different storage servers. In the latest version
of OpenStack, Swift can provide different replication configurations to allow users to select
how many replicas of their files they want.
Figure 3.5 shows the components that Swift uses to operate the object storage service.
The proxy server component in Swift performs the file location scheduler service and the
lookup service, and it is the access entry point of the storage system. It uses a hash-based
mapping stored in the ring file to place each file in a suitable storage location. The ring
file is a static file that cannot be modified at runtime. The proxy server can act as a
load-balancing server in the system, scheduling files to be placed on different storage
devices. If a failure occurs in any storage device, the proxy server selects a handoff server
from the ring and routes the files to it.
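The hash-based placement described above can be sketched as follows. This is a simplified
illustration, not Swift's actual ring code: the partition power, the device names and the
round-robin assignment table are assumptions made for the example, and a real ring also mixes
a cluster-specific prefix and suffix into the hash and balances devices by weight and zone.

```python
import hashlib

PART_POWER = 4   # hypothetical: 2**4 = 16 partitions
REPLICAS = 3     # three copies per object, as described in the text
DEVICES = ["node1/sdb", "node2/sdb", "node3/sdb", "node4/sdb"]  # hypothetical names


def partition_for(account, container, obj, part_power=PART_POWER):
    """Hash the object path and keep the top `part_power` bits as the partition."""
    path = f"/{account}/{container}/{obj}".encode("utf-8")
    digest = hashlib.md5(path).digest()
    top32 = int.from_bytes(digest[:4], "big")
    return top32 >> (32 - part_power)


def build_table(part_power=PART_POWER, devices=DEVICES, replicas=REPLICAS):
    """Precompute a static partition -> devices table (simple round-robin)."""
    return [
        [devices[(part + r) % len(devices)] for r in range(replicas)]
        for part in range(2 ** part_power)
    ]


def nodes_for(account, container, obj, table, part_power=PART_POWER):
    """Look up the devices that should hold the replicas of one object."""
    return table[partition_for(account, container, obj, part_power)]


if __name__ == "__main__":
    table = build_table()
    print(nodes_for("AUTH_demo", "backups", "report.txt", table))
```

Because the table is computed once and only read afterwards, every proxy that holds the same
table sends a given object to the same devices without any central lookup.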
OpenStack Swift simultaneously writes three or more replicas of the data to different storage
devices to guarantee the safety of the stored data. With three replicas, the system treats a
file as successfully stored once two replicas have been written. The replica server can
detect damaged files. For example, when a file is destroyed because of a power outage,
a write error or hard disk damage, the replica server can copy the file from another storage
device to restore the required number of replicas.
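The success rule above (two of three replicas written) can be expressed as a majority check.
The sketch below assumes the rule generalises to "more than half of the replicas"; the
function names are illustrative and are not taken from the Swift source code.

```python
def quorum_size(replica_count):
    """Smallest number of successful writes that forms a majority."""
    return replica_count // 2 + 1


def write_succeeded(results):
    """`results` holds one boolean per replica write; True means it was written."""
    successes = sum(1 for ok in results if ok)
    return successes >= quorum_size(len(results))


if __name__ == "__main__":
    print(quorum_size(3))                        # majority of three replicas
    print(write_succeeded([True, True, False]))  # two of three written
    print(write_succeeded([True, False, False])) # only one of three written
```

A replica that failed to write is not lost permanently under this rule; it is the copy that
the replica server later restores from one of the successful writes.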
When a user with a corresponding user account requests an object inside a container,
the account server first looks for the account in its account database and finds the
associated container. Then, the container server checks the container database to find out
whether the requested object exists in the specified container, and finally the object server
looks in the object databases to find retrieval information about the object. To retrieve an
object, the proxy server needs to know which object server stores the object and the path of
the object in the local filesystem of that server.
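The lookup order described above (account, then container, then object) can be sketched with
in-memory dictionaries standing in for the three databases. All names, servers and paths
below are hypothetical and serve only to show the order of the checks.

```python
# Hypothetical stand-ins for the account, container and object databases.
ACCOUNT_DB = {"AUTH_demo": {"backups"}}
CONTAINER_DB = {("AUTH_demo", "backups"): {"report.txt"}}
OBJECT_DB = {
    ("AUTH_demo", "backups", "report.txt"): {
        "server": "object-server-1",
        "path": "/srv/node/sdb1/objects/report.txt",
    }
}


def locate(account, container, obj):
    """Return the server and local path of an object, checking each level in turn."""
    # Step 1: the account must exist and own the container.
    if container not in ACCOUNT_DB.get(account, set()):
        raise LookupError(f"container not found: {account}/{container}")
    # Step 2: the container must list the object.
    if obj not in CONTAINER_DB.get((account, container), set()):
        raise LookupError(f"object not found: {account}/{container}/{obj}")
    # Step 3: the object record says where the data is stored.
    return OBJECT_DB[(account, container, obj)]


if __name__ == "__main__":
    print(locate("AUTH_demo", "backups", "report.txt"))
```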
3.6 DATA MODEL OF DATA REPLICATION IN OPENSTACK
Figure 3.6: Flowchart of Data Replication Conduct in Swift
Figure 3.6 represents the flowchart of how data replication is conducted in Swift. When a
user requests data replication by entering a command, the request has to pass through several
nodes and servers before the replicas are either successfully created or not. With the
available OpenStack service, the user needs to use the GET and PUT commands to request and
upload data in the object storage database in order to create data replicas. Therefore,
through this project we are going to make data management more convenient, especially data
replication, through automation on the OpenStack dashboard.
Elements shown in Figure 3.6 include: user request; proxy server (service storage);
authentication; obtain container metadata; get all possible nodes according to partitions and
object ring files; pick three nodes; object server (handle writing data to disk and updating
its metadata); decision on whether at least 2 replicas were written (yes/no); establish
connection with three container nodes; container server (update object record in the
container); update container information; proxy server.
3.7 PROOF OF CONCEPT
Figure 3.7.1 shows two commands on the CentOS terminal: the first creates the OpenStack
Packstack file, and the second opens the Packstack file.
Figure 3.7.1: Creating Packstack File
Figure 3.7.2 represents the configuration of the OpenStack components after opening the
Packstack file. The components needed for this project were configured by changing the ‘n’
notation to ‘y’.
Figure 3.7.2: Configuration of OpenStack Component
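The manual edit of the Packstack file described above can also be scripted. The sketch below
is an illustration under assumptions: the report does not list which entries were changed, so
the two keys used here (CONFIG_SWIFT_INSTALL and CONFIG_HORIZON_INSTALL, chosen because the
project focuses on Swift and Horizon) and the sample file content are examples only.

```python
import re

# Assumed keys for the components this project focuses on (Swift and Horizon).
WANTED = {"CONFIG_SWIFT_INSTALL", "CONFIG_HORIZON_INSTALL"}


def enable_components(answer_text, keys=WANTED):
    """Return the file text with KEY=<anything> replaced by KEY=y for the given keys."""
    lines = []
    for line in answer_text.splitlines():
        match = re.match(r"^(\w+)=(.*)$", line)
        if match and match.group(1) in keys:
            line = f"{match.group(1)}=y"
        lines.append(line)  # comments and other keys are kept unchanged
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = (
        "# sample content, for illustration only\n"
        "CONFIG_SWIFT_INSTALL=n\n"
        "CONFIG_HORIZON_INSTALL=n\n"
        "CONFIG_HEAT_INSTALL=n\n"
    )
    print(enable_components(sample))
```

The function only transforms text, so it can be checked on a copy of the file before the
edited version is passed to the installer.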
Figure 3.7.3 shows the successful installation of OpenStack.
Figure 3.7.3: Installation of OpenStack
REFERENCES
[1] Zhang, R., & Liu, L. (2010, July). Security models and requirements for healthcare
application clouds. In Cloud Computing (CLOUD), 2010 IEEE 3rd International
Conference on (pp. 268-275). IEEE.
[2] Aiftimiei, C., Costantini, A., Bucchi, R., Italiano, A., Michelotto, D., Panella, M. &
Zizzi, G. (2017, October). Cloud Environment Automation: from infrastructure
deployment to application monitoring. In Journal of Physics: Conference Series (Vol.
898, No. 8, p. 082016). IOP Publishing.
[3] Gaikwad, C., Churi, B., Patil, K., & Tatwadarshi, P. N. (2017, March). Providing
storage as a service on cloud using OpenStack. In Innovations in Information,
Embedded and Communication Systems (ICIIECS), 2017 International Conference on
(pp. 1-4). IEEE.
[4] Girish, L. S., & Guruprasad, H. S. (2014). Building private cloud using OpenStack.
International Journal of Emerging Trends & Technology in Computer Science
(IJETTCS), 3(3).
[5] Sobeslav, V., & Komarek, A. (2015). Opensource automation in cloud computing. In
Proceedings of the 4th International Conference on Computer Engineering and
Networks (pp. 805-812). Springer, Cham.
[6] https://www.openstack.org/software, last accessed on: 03/10/2018.
[7] Sun, D. W., Chang, G. R., Gao, S., Jin, L. Z., & Wang, X. W. (2012). Modeling a
dynamic data replication strategy to increase system availability in cloud computing
environments. Journal of computer science and technology, 27(2), 256-272.
[8] Yadav, S. (2013). Comparative study on open source software for cloud computing
platform: Eucalyptus, OpenStack and OpenNebula. International Journal of
Engineering And Science, 3(10), 51-54.
[9] Lombardi, F., & Di Pietro, R. (2011). Secure virtualization for cloud computing.
Journal of network and computer applications, 34(4), 1113-1122.
[10] Cook, J. D., Primmer, R., & de Kwant, A. (2014). Compare cost and performance of
replication and erasure coding. Hitachi Review, 63, 304.
[11] http://searchcloudstorage.techtarget.com/feature/How-opensource-Swift-OpenStack-Object-Storage-works,
last accessed on: 27/11/2018.
[12] Nunez, M., Nguyen, N. T., Camacho, D., & Trawinski, B. (Eds.). (2015).
Computational Collective Intelligence: 7th International Conference, ICCCI 2015,
Madrid, Spain, September 21-23, 2015, Proceedings (Vol. 9330). Springer.
[13] Saranya, N., & Nivedha, S. (2016, January). Implementing authentication in an
Openstack environment-survey. In Computer Communication and Informatics
(ICCCI), 2016 International Conference on (pp. 1-7). IEEE.
[14] Pol, D. U. (2014). Cloud Computing with Open Source Tool: OpenStack. American
Journal of Engineering Research (AJER), 3(9), 233-240.
[15] Rosado, T., & Bernardino, J. (2014, July). An overview of openstack architecture. In
Proceedings of the 18th International Database Engineering & Applications
Symposium (pp. 366-367). ACM.
[16] Schafer, D. R., Rothermel, K., & Tariq, M. A. (2018). Replication Schemes for
Highly Available Workflow Engines. IEEE Transactions on Services Computing, (1),
1-1.
[17] Kulkarni, B., & Bhosale, V. (2016, August). Efficient storage utilization using erasure
codes in OpenStack cloud. In Inventive Computation Technologies (ICICT),
International Conference on (Vol. 3, pp. 1-5). IEEE.
[18] Li, L., Li, D., Su, Z., Jin, L., & Huang, G. (2016). Performance analysis and
framework optimization of open source cloud storage system. China
Communications, 13(6), 110-122.
[19] Jain, P., Datt, A., Goel, A., & Gupta, S. C. (2016, September). Cloud service
orchestration based architecture of OpenStack Nova and Swift. In Advances in
Computing, Communications and Informatics (ICACCI), 2016 International
Conference on (pp. 2453-2459). IEEE.
[20] Kulkarni, B., & Bhosale, V. (2016, August). Efficient storage utilization using erasure
codes in OpenStack cloud. In Inventive Computation Technologies (ICICT),
International Conference on (Vol. 3, pp. 1-5). IEEE.