CLOUD COMPUTING - JBIET
CLOUD COMPUTING
UNIT-I
Topic (Page No.)
1) Principles of Parallel and Distributed Computing (2-21)
2) Introduction to Cloud Computing (22-23)
3) Cloud Computing Architecture (24-25)
4) Cloud Concepts and Technologies (26-32)
5) Cloud Services and Platforms (32-37)
6) Cloud Models (37-40)
7) Cloud as a Service (40-45)
8) Cloud Solutions (46-53)
9) Cloud Offerings (53-58)
10) Introduction to Hadoop and MapReduce (59-63)
1. Principles of Parallel and Distributed Computing:
This unit presents the fundamental principles of parallel and distributed computing and
discusses the models and conceptual frameworks that serve as foundations for building
cloud computing systems and applications.
1.1 Eras of computing
The two fundamental and dominant models of computing are sequential and parallel.
The sequential computing era began in the 1940s; the parallel (and distributed)
computing era followed it within a decade (see Figure 2.1). The four key elements of
computing developed during these eras are architectures, compilers, applications, and
problem-solving environments.
The computing era started with developments in hardware architectures, which
actually enabled the creation of system software—particularly in the area of
compilers and operating systems—which support the management of such systems
and the development of applications. Every aspect of this era underwent a three-phase
process: research and development (R&D), commercialization, and
commoditization.
1.2 Principles of Parallel and Distributed Computing
The terms parallel computing and distributed computing are often used
interchangeably, even though they mean slightly different things. The term parallel
implies a tightly coupled system, whereas distributed refers to a wider class of
systems, including those that are tightly coupled.
The term parallel computing refers to a model in which the computation is divided
among several processors sharing the same memory. The architecture of a parallel
computing system is often characterized by the homogeneity of its components: each
processor is of the same type and has the same capability as the others. The shared
memory has a single address space, which is accessible to all the processors. Parallel
programs are then broken down into several units of execution that can be allocated
to different processors and can communicate with each other by means of the shared
memory. For example, a cluster of which the nodes are connected through an
InfiniBand network and configured with a distributed shared memory system can be
considered a parallel system.
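As a rough sketch (illustrative only, not from the text), the parallel model described above can be mimicked with Python threads, which are units of execution sharing a single address space:

```python
import threading

# Several units of execution share one address space and communicate through
# shared memory. Threads stand in for processors in this emulation.
data = list(range(1000))
num_workers = 4
chunk = len(data) // num_workers
partial = [0] * num_workers  # shared structure, visible to every thread

def worker(i):
    # each unit of execution processes its own slice of the shared data
    partial[i] = sum(data[i * chunk:(i + 1) * chunk])

threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(partial)  # combine partial results read back from shared memory
```

Each worker reads from and writes to the same objects directly; no messages are exchanged, which is the defining trait of the shared-memory approach.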
The term distributed computing encompasses any architecture or system that allows
the computation to be broken down into units and executed concurrently on different
computing elements, whether these are processors on different nodes, processors on
the same computer, or cores within the same processor. Therefore, distributed
computing includes a wider range of systems and applications than parallel
computing and is often considered a more general term. Classic examples of
distributed computing systems are computing grids or Internet computing systems,
which combine the widest variety of architectures, systems, and
applications in the world.
1.3 Elements of parallel computing
1.3.1 Parallel Processing:
Processing of multiple tasks simultaneously on multiple processors is called
parallel processing. The parallel program consists of multiple active processes (tasks)
simultaneously solving a given problem. A given task is divided into multiple
subtasks using a divide-and-conquer technique, and each subtask is processed on a
different central processing unit (CPU). Programming on a multiprocessor system
using the divide-and-conquer technique is called parallel programming.
The development of parallel processing is influenced by many factors. The
most prominent among them include the following:
• Computational requirements are ever increasing in the areas of both scientific and
business computing. The technical computing problems, which require high-speed
computational power, are related to life sciences, aerospace, geographical information
systems, mechanical design and analysis, and the like.
• Sequential architectures are reaching physical limitations as they are constrained by
the speed of light and the laws of thermodynamics. The speed at which sequential CPUs can
operate is reaching saturation point (no more vertical growth), and hence an
alternative way to get high computational speed is to connect multiple CPUs
(opportunity for horizontal growth).
• Hardware improvements in pipelining, superscalar execution, and the like are non-scalable and
require sophisticated compiler technology. Developing such compiler technology is a
difficult task.
• Vector processing works well for certain kinds of problems. It is suitable mostly for
scientific problems (involving lots of matrix operations) and graphical processing. It
is not useful for other areas, such as databases.
• The technology of parallel processing is mature and can be exploited commercially;
there is already significant R&D work on development tools and environments.
• Significant development in networking technology is paving the way for
heterogeneous computing.
1.3.2 Hardware Architecture for parallel processing:
The core elements of parallel processing are CPUs. Based on the number of
instruction and data streams that can be processed simultaneously, computing systems
are classified into the following four categories:
• Single-instruction, single-data (SISD) systems:
An SISD computing system is a uniprocessor machine capable of executing a single instruction,
which operates on a single data stream. In SISD, machine instructions are processed
sequentially; hence computers adopting this model are popularly called sequential
computers. All the instructions and data to be processed have to be stored in primary
memory. The speed of the processing element in the SISD model is limited by the rate at
which the computer can transfer information internally. Dominant representative SISD
systems are IBM PC, Macintosh, and workstations.
• Single-instruction, multiple-data (SIMD) systems:
An SIMD computing system is a multiprocessor machine capable of executing the same
instruction on all the CPUs but operating on different data streams. Machines based on an
SIMD model are well suited to scientific computing since they involve lots of vector and
matrix operations. For instance, statements such as Ci=Ai * Bi
can be passed to all the processing elements (PEs); organized data elements of vectors A
and B can be divided into multiple sets (N sets for N PE systems); and each PE can
process one data set. Dominant representative SIMD systems are Cray’s vector processing
machine and Thinking Machines’ cm*.
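The statement Ci = Ai * Bi above can be sketched in plain Python (an illustrative serial emulation; a real SIMD machine would apply the multiply to all elements in one simultaneous step):

```python
# One instruction (multiply) applied across all data elements; conceptually,
# each "PE" handles one index of the vectors A and B.
A = [1, 2, 3, 4]
B = [10, 20, 30, 40]
C = [a * b for a, b in zip(A, B)]  # same operation, different data per element
```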
• Multiple-instruction, single-data (MISD) systems:
An MISD computing system is a multiprocessor machine capable of executing different
instructions on different PEs, with all of them operating on the same data set (see Figure 2.4). For
instance, statements such as y = sin(x) + cos(x) + tan(x) perform different operations on the same
data set. Machines built using the MISD model are not useful in most applications; a few
machines were built, but none of them were available commercially. They became more of an
intellectual exercise than a practical configuration.
• Multiple-instruction, multiple-data (MIMD) systems:
An MIMD computing system is a multiprocessor machine capable of executing multiple
instructions on multiple data sets (see Figure 2.5). Each PE in the MIMD model has separate
instruction and data streams; MIMD machines are broadly categorized into shared-memory
MIMD and distributed-memory MIMD based on the way PEs are coupled to the main memory.
Shared memory MIMD machines:
In the shared memory MIMD model, all the PEs are connected to a single global memory and
they all have access to it (see Figure 2.6). Systems based on this model are also called tightly
coupled multiprocessor systems. The communication between PEs in this model takes place
through the shared memory. Dominant representative shared memory MIMD systems are Silicon
Graphics machines and Sun/IBM’s SMP (Symmetric Multi-Processing).
Distributed memory MIMD machines:
In the distributed memory MIMD model, all PEs have a local memory. Systems based on this
model are also called loosely coupled multiprocessor systems. The communication between PEs
in this model takes place through the interconnection network (the interprocess communication
channel, or IPC). The network connecting PEs can be configured as a tree, mesh, cube, and so on.
Each PE operates asynchronously, and if communication/synchronization among tasks is
necessary, they can do so by exchanging messages between them.
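A hedged sketch of the distributed-memory model: two threads stand in for separate PEs, each keeping private local state and coordinating only by exchanging messages over queues (emulating the IPC channel; a real system would communicate over a network):

```python
import threading
import queue

# Two "PEs" with private local memory that never touch shared variables;
# all coordination happens through message passing over the queues.
to_worker = queue.Queue()
to_master = queue.Queue()

def worker():
    local = []  # private local memory of this PE
    while True:
        msg = to_worker.get()
        if msg is None:                 # termination message
            to_master.put(sum(local))   # send the result back as a message
            return
        local.append(msg * msg)

t = threading.Thread(target=worker)
t.start()
for n in [1, 2, 3]:
    to_worker.put(n)                    # messages, not shared variables
to_worker.put(None)
result = to_master.get()
t.join()
```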
1.4 Approaches to parallel Programming:
A sequential program is one that runs on a single processor and has a single line of
control. To make many processors collectively work on a single program, the
program must be divided into smaller independent chunks so that each processor can
work on separate chunks of the problem. The program decomposed in this way is a
parallel program. A wide variety of parallel programming approaches are available.
The most prominent among them are the following:
• Data parallelism: In the case of data parallelism, the divide-and-conquer technique
is used to split data into multiple sets, and each data set is processed on different PEs
using the same instruction. This approach is highly suitable to processing on
machines based on the SIMD model.
• Process parallelism: In the case of process parallelism, a given operation has multiple
(but distinct) activities that can be processed on multiple processors.
• Farmer-and-worker model: In the case of the farmer-and-worker model, a job
distribution approach is used: one processor is configured as master and all other
remaining PEs are designated as slaves; the master assigns jobs to slave PEs and, on
completion, they inform the master, which in turn collects results.
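The farmer-and-worker model can be sketched with Python's standard thread pool, where the main thread plays the farmer that assigns jobs and collects results (an illustrative emulation, not from the text):

```python
from concurrent.futures import ThreadPoolExecutor

# The "farmer" (main thread) hands jobs to a pool of workers and
# collects each result on completion.
jobs = [(2, 3), (4, 5), (6, 7)]

def do_job(job):
    a, b = job
    return a * b        # each worker processes its assigned job

with ThreadPoolExecutor(max_workers=3) as farm:
    results = list(farm.map(do_job, jobs))  # farmer assigns jobs, gathers results
```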
1.4.1 Levels of parallelism:
Levels of parallelism are decided based on the size of the chunks of code (grain size) that are
potential candidates for parallelism. The common goal is to boost processor efficiency
by hiding latency. The idea is to execute concurrently two or more single-threaded
applications, such as compiling, text formatting, database searching, and device
simulation.
Parallelism within an application can be detected at several levels:
• Large grain (or task level)
• Medium grain (or control level)
• Fine grain (data level)
• Very fine grain (multiple-instruction issue)
1.5 Elements of distributed computing:
1.5.1 General concepts and definitions
As a general definition of the term distributed system, we use the one proposed by
Tanenbaum:
A distributed system is a collection of independent computers that appears to its users
as a single coherent system.
Communication is another fundamental aspect of distributed computing. Since
distributed systems are composed of multiple computers that collaborate,
it is necessary to provide some sort of data and information exchange between them,
which generally occurs through the network. This is captured by the definition
proposed by Coulouris:
A distributed system is one in which components located at networked computers
communicate and coordinate their actions only by passing messages.
1.5.2 Components of a distributed system
A distributed system is the result of the interaction of several components that traverse
the entire computing stack from hardware to software. It emerges from the
collaboration of several elements that—by working together—give users the illusion of
a single coherent system. Figure 2.10 provides an overview of the different layers that
are involved in providing the services of a distributed system.
At the very bottom layer, computer and network hardware constitute the physical
infrastructure; these components are directly managed by the operating system, which
provides the basic services for interprocess communication (IPC), process scheduling
and management, and resource management in terms of file systems and local
devices.
The use of well-known standards at the operating system level and even more at
the hardware and network levels allows easy harnessing of heterogeneous
components and their organization into a coherent and uniform system.
The middleware layer leverages such services to build a uniform environment for the
development and deployment of distributed applications. This layer supports the
programming paradigms for distributed systems. By relying on the services offered
by the operating system, the middleware develops its own protocols, data formats,
and programming language or frameworks for the development of distributed
applications. All of them constitute a uniform interface to distributed application
developers that is completely independent from the underlying operating system and
hides all the heterogeneities of the bottom layers.
The top of the distributed system stack is represented by the applications and services
designed and developed to use the middleware. These can serve several purposes and
often expose their features in the form of graphical user interfaces (GUIs) accessible
locally or through the Internet via a Web browser.
Figure 2.11 shows an example of how the general reference architecture of a
distributed system is contextualized in the case of a cloud computing system.
Note that hardware and operating system layers make up the bare-bone
infrastructure of one or more datacenters, where racks of servers are deployed and
connected together through high-speed connectivity. This infrastructure is managed by
the operating system, which provides the basic capability of machine and network
management. The core logic is then implemented in the middleware that manages the
virtualization layer, which is deployed on the physical infrastructure in order to maximize
its utilization and provide a customizable runtime environment for applications. The
middleware provides different facilities to application developers according to the type of
services sold to customers. These facilities, offered through Web 2.0-compliant
interfaces, range from virtual infrastructure building and deployment to application
development and runtime environments.
1.5.3 Architectural styles for distributed computing
Although a distributed system comprises the interaction of several layers, the
middleware layer is the one that enables distributed computing, because it provides a
coherent and uniform runtime environment for applications. There are many different
ways to organize the components that, taken together, constitute such an environment.
The interactions among these components and their responsibilities give structure to the
middleware and characterize its type or, in other words, define its architecture.
Architectural styles are mainly used to determine the vocabulary of components and
connectors that are used as instances of the style together with a set of constraints on
how they can be combined.
We organize the architectural styles into two major classes:
• Software architectural styles
• System architectural styles
The first class relates to the logical organization of the software; the second class
includes all those styles that describe the physical organization of distributed software
systems in terms of their major components.
1.5.3.1 Component and connectors:
We first describe components and connectors, since these are the basic building
blocks with which architectural styles are defined. A component represents a unit
of software that encapsulates a function or a feature of the system. Examples of
components can be programs, objects, processes, pipes and filters. A connector is
a communication mechanism that allows cooperation and coordination among
components. Unlike components, connectors are not encapsulated in a
single entity; rather, they are implemented in a distributed manner over many system
components.
1.5.3.2 Software architectural styles:
Software architectural styles are based on the logical arrangement of software
components. They are helpful because they provide an intuitive view of the
whole system, despite its physical deployment. They also identify the main
abstractions that are used to shape the components of the system and the
expected interaction patterns between them.
These models constitute the foundations on top of which distributed systems are
designed from a logical point of view, and they are discussed in the following
sections.
Data centered architectures:
These architectures identify the data as the fundamental element of the software
system, and access to shared data is the core characteristic of the data-centered
architectures.
The repository architectural style is the most relevant reference model in this
category. It is characterized by two main components: the central data structure,
which represents the current state of the system, and a collection of independent
components, which operate on the central data. In particular, repository-based
architectures differentiate and specialize further into subcategories according to
the choice of control discipline to apply for the shared data structure. Of
particular interest are databases and blackboard systems.
The blackboard architectural style is characterized by three main components:
• Knowledge sources. These are the entities that update the knowledge base that
is maintained in the blackboard.
• Blackboard. This represents the data structure that is shared among the
knowledge sources and stores the knowledge base of the application.
• Control. The control is the collection of triggers and procedures that govern the
interaction with the blackboard and update the status of the knowledge base.
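A minimal blackboard sketch (all names and rules are illustrative): knowledge sources inspect the shared data structure, and the control loop keeps triggering whichever source can contribute until none fires:

```python
# Shared data structure (the blackboard) holding the knowledge base.
blackboard = {"facts": {"x": 4}}

def square_source(bb):        # knowledge source 1: derives x squared
    f = bb["facts"]
    if "x" in f and "x2" not in f:
        f["x2"] = f["x"] ** 2
        return True
    return False

def add_one_source(bb):       # knowledge source 2: derives y from x squared
    f = bb["facts"]
    if "x2" in f and "y" not in f:
        f["y"] = f["x2"] + 1
        return True
    return False

sources = [square_source, add_one_source]
progress = True
while progress:               # the "control" component: trigger sources until quiescent
    progress = any(s(blackboard) for s in sources)
```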
Data-flow architectures:
In the case of data-flow architectures, it is the availability of data that controls the
computation. With respect to the data-centered styles, in which the access to data is the
core feature, data-flow styles explicitly incorporate the pattern of data flow.
Batch Sequential Style.
The batch sequential style is characterized by an ordered sequence of separate programs
executing one after the other. These programs are chained together by providing the output
generated by one program, upon its completion, as input to the next, most likely
in the form of a file.
Pipe-and-Filter Style:
The pipe-and-filter style is a variation of the previous style that expresses the activity of a
software system as a sequence of data transformations. Each component of the processing chain is
called a filter, and the connection between one filter and the next is represented by a data stream.
Filters generally do not have state, do not know the identity of either the previous or the next filter,
and are connected by in-memory data structures such as first-in/first-out (FIFO) buffers or
other structures. This particular sequencing is called pipelining and introduces concurrency in the
execution of the filters. Data-flow architectures are optimal when the system to be designed
embodies a multistage process that can be clearly decomposed into a collection of separate
components that need to be orchestrated together.
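The pipe-and-filter style maps naturally onto Python generators; in this hedged sketch each filter is a stateless transformation connected to the next by a lazy data stream:

```python
# Each filter consumes an upstream data stream and yields a transformed
# stream; no filter knows the identity of its neighbors.
def read_source():
    yield from [3, 2, 4]     # the initial data stream

def square(stream):          # filter 1: transform each element
    for x in stream:
        yield x * x

def keep_even(stream):       # filter 2: drop odd values
    for x in stream:
        if x % 2 == 0:
            yield x

# The processing chain: source -> square -> keep_even
pipeline = keep_even(square(read_source()))
result = list(pipeline)
```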
Virtual machine architectures:
The virtual machine class of architectural styles is characterized by the presence of an abstract
execution environment (generally referred to as a virtual machine) that simulates features that are
not available in the hardware or software. Applications and systems are implemented on top of
this layer and become portable over different hardware and software environments as long as
there is an implementation of the virtual machine they interface with.
Rule-Based Style:
This architecture is characterized by representing the abstract execution environment as an
inference engine. Programs are expressed in the form of rules or predicates that hold true. The
input data for applications is generally represented by a set of assertions or facts that the
inference engine uses to activate rules or to apply predicates, thus transforming data. The output
can either be the product of the rule activation or a set of assertions that holds true for the given
input data. The set of rules or predicates identifies the knowledge base that can be queried to
infer properties about the system. The use of rule-based systems can be found in the networking
domain: network intrusion detection systems (NIDS) often rely on a set of rules to identify
abnormal behaviors connected to possible intrusions in computing systems.
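A toy forward-chaining sketch of the rule-based style (the rules and facts are invented for illustration): the inference engine repeatedly fires any rule whose conditions hold against the asserted facts, adding new assertions until nothing changes:

```python
# Facts are a set of assertions; each rule is (conditions, conclusion).
facts = {"packet_from_blocked_ip"}
rules = [
    ({"packet_from_blocked_ip"}, "suspicious_traffic"),
    ({"suspicious_traffic"}, "raise_alert"),
]

changed = True
while changed:               # the inference engine loop (forward chaining)
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)   # rule activation transforms the knowledge base
            changed = True
```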
Interpreter Style.
The core feature of the interpreter style is the presence of an engine that is used to interpret a
pseudo-program expressed in a format acceptable for the interpreter. The interpretation of the
pseudo-program constitutes the execution of the program itself. Systems modeled according to
this style exhibit four main components: the interpretation engine that executes the core activity
of this style, an internal memory that contains the pseudo-code to be interpreted, a representation
of the current state of the engine, and a representation of the current state of the program being
executed.
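The four components named above (interpretation engine, pseudo-code memory, engine state, program state) can be sketched with a toy stack-machine interpreter (entirely illustrative):

```python
# The pseudo-program held in internal memory.
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
# Engine state (program counter) and program state (operand stack).
state = {"stack": [], "pc": 0}

# The interpretation engine: fetch, decode, and execute each instruction.
while state["pc"] < len(program):
    instr = program[state["pc"]]
    op = instr[0]
    if op == "push":
        state["stack"].append(instr[1])
    elif op == "add":
        b, a = state["stack"].pop(), state["stack"].pop()
        state["stack"].append(a + b)
    elif op == "mul":
        b, a = state["stack"].pop(), state["stack"].pop()
        state["stack"].append(a * b)
    state["pc"] += 1
# Computes (2 + 3) * 4.
```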
Virtual machine architectural styles are characterized by an indirection layer between
applications and the hosting environment. This design has the major advantage of decoupling
applications from the underlying hardware and software environment, but at the same time it
introduces some disadvantages, such as a slowdown in performance.
Call & return architectures
This category identifies all systems that are organized into components mostly connected
together by method calls. The activity of systems modeled in this way is characterized by a chain
of method calls whose overall execution and composition identify the execution of one or more
operations.
Top-Down Style.
This architectural style is quite representative of systems developed with imperative
programming, which leads to a divide-and-conquer approach to problem resolution. Systems
developed according to this style are composed of one large main program that accomplishes its
tasks by invoking subprograms or procedures. The components in this style are procedures and
subprograms, and connections are method calls or invocations. The calling program
passes information with parameters and receives data from return values or parameters. Method
calls can also extend beyond the boundary of a single process by leveraging techniques for
remote method invocation, such as remote procedure call (RPC) and all its descendants.
Object-Oriented Style.
This architectural style encompasses a wide range of systems that have been designed and
implemented by leveraging the abstractions of object-oriented programming (OOP). Systems are
specified in terms of classes and implemented in terms of objects. Classes define the type of
components by specifying the data that represent their state and the operations that can be done
over these data. One of the main advantages over the top-down style is that there is a coupling
between data and operations used to manipulate them.
Layered Style.
The layered system style allows the design and implementation of software systems in terms of
layers, which provide different levels of abstraction of the system. Each layer generally interacts
with at most two other layers: the one that provides a lower abstraction level and the one that
provides a higher abstraction level. Specific protocols and interfaces define how adjacent layers interact.
It is possible to model such systems as a stack of layers, one for each level of abstraction.
Architectural styles based on independent components
This class of architectural style models systems in terms of independent components that have
their own life cycles, which interact with each other to perform their activities. There are two
major categories within this class—communicating processes and event systems—which
differ in the way the interaction among components is managed.
Communicating Processes.
In this architectural style, components are represented by independent processes that leverage
IPC facilities for coordination management. This is an abstraction that is quite suitable to
modeling distributed systems that, being distributed over a network of computing nodes, are
necessarily composed of several concurrent processes. Each of the processes provides other
processes with services and can leverage the services exposed by the other processes.
Event Systems.
In this architectural style, the components of the system are loosely coupled and connected. In
addition to exposing operations for data and state manipulation, each component also publishes
(or announces) a collection of events with which other components can register. In general, other
components provide a callback that will be executed when the event is activated. During the
activity of a component, a specific runtime condition can activate one of the exposed events, thus
triggering the execution of the callbacks registered with it.
The advantage of such an architectural style is that it fosters the development of open
systems: new modules can be added and easily integrated into the system as long as they have
compliant interfaces for registering to the events.
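A minimal publish/subscribe sketch of the event style (names are illustrative): components register callbacks for named events, and activating an event triggers every registered callback:

```python
# Registry mapping event names to the callbacks registered with them.
registry = {}

def subscribe(event, callback):
    # a component registers a callback for a published event
    registry.setdefault(event, []).append(callback)

def publish(event, payload):
    # activating the event triggers all registered callbacks
    for cb in registry.get(event, []):
        cb(payload)

received = []
subscribe("data_ready", lambda d: received.append(d))
subscribe("data_ready", lambda d: received.append(d * 2))
publish("data_ready", 21)
```

Note how the publisher never names its subscribers: the loose coupling is what lets new modules join by merely registering.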
1.5.3.3 System architectural styles
System architectural styles cover the physical organization of components and processes over a
distributed infrastructure. They provide a set of reference models for the deployment of such
systems and help engineers not only have a common vocabulary in describing the physical layout
of systems but also quickly identify the major advantages and drawbacks of a given deployment
and whether it is applicable for a specific class of applications.
Client/server
This architecture is very popular in distributed computing and is suitable for a wide variety of
applications. As depicted in Figure 2.12, the client/server model features two major components:
a server and a client. These two components interact with each other through a network
connection using a given protocol. The communication is unidirectional: the client issues a
request to the server, and after processing the request the server returns a response. There could be
multiple client components issuing requests to a server that is passively waiting for them. Hence,
the important operations in the client-server paradigm are request and accept (client side), and listen
and response (server side).
• Thin-client model. In this model, the load of data processing and transformation is put on the
server side, and the client has a light implementation that is mostly concerned with retrieving and
returning the data it is being asked for, with no considerable further processing.
• Fat-client model. In this model, the client component is also responsible for processing and
transforming the data before returning it to the user, whereas the server features a relatively light
implementation that is mostly concerned with the management of access to the data.
Client/server model
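The request/accept/listen/response cycle described above can be sketched with Python's standard socket module (a single-request echo server; illustrative only):

```python
import socket
import threading

# Server side: listen, accept one request, return a response.
server = socket.socket()
server.bind(("127.0.0.1", 0))      # port 0 = let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

def serve_once():
    conn, _ = server.accept()       # accept the incoming connection
    request = conn.recv(1024)       # read the client's request
    conn.sendall(b"echo:" + request)  # return the response
    conn.close()

t = threading.Thread(target=serve_once)
t.start()

# Client side: connect, issue a request, read the response.
client = socket.socket()
client.connect(("127.0.0.1", port))
client.sendall(b"hello")
reply = client.recv(1024)
client.close()
t.join()
server.close()
```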
Peer-to-peer
The peer-to-peer model, depicted in Figure 2.13, introduces a symmetric architecture in
which all the components, called peers, play the same role and incorporate both client
and server capabilities of the client/server model. More precisely, each peer acts as a
server when it processes requests from other peers and as a client when it issues
requests to other peers. In contrast to the client/server model, which partitions the
responsibilities of the IPC between server and clients, the peer-to-peer model attributes
the same responsibilities to each component.
An interesting example of a peer-to-peer architecture is the Skype network.
The system architectural styles presented in this section constitute a reference model
that is further enhanced or diversified according to the specific needs of the application
to be designed and implemented. The server and client abstraction can be used in some
cases to model the macro scale or the micro scale of the systems.
1.5.4 Models for interprocess communication
Distributed systems are composed of a collection of concurrent processes interacting with each
other by means of a network connection. Therefore, IPC is a fundamental aspect of distributed
systems design and implementation. IPC is used to either exchange data and information or
coordinate the activity of processes. IPC is what ties together the different components of a
distributed system, thus making them act as a single system.
2.Introduction to Cloud Computing:
Cloud Computing provides us with a means of accessing applications as utilities over
the Internet. It allows us to create, configure, and customize applications online.
What is Cloud?
The term Cloud refers to a network or the Internet. In other words, we can say that the Cloud is
something that is present at a remote location. The Cloud can provide services over public or
private networks, i.e., WAN, LAN, or VPN. Applications such as e-mail, web conferencing, and
customer relationship management (CRM) all run in the cloud.
What is Cloud Computing?
Cloud Computing refers to manipulating, configuring, and accessing applications online. It
offers online data storage, infrastructure, and applications.
We need not install software on our local PC; this is how cloud computing
overcomes platform dependency issues. Hence, cloud computing makes business
applications mobile and collaborative.
CLOUD COMPUTING IN A NUTSHELL:
Cloud computing has been coined as an umbrella term to describe a category of sophisticated on-
demand computing services initially offered by commercial providers, such as Amazon, Google,
and Microsoft. It denotes a model in which a computing infrastructure is viewed as a “cloud,”
from which businesses and individuals access applications from anywhere in the world on
demand. The main principle behind this model is offering computing, storage, and software “as a
service.”
ROOTS OF CLOUD COMPUTING:
We can track the roots of cloud computing by observing the advancement of several
technologies, especially in hardware (virtualization, multi-core chips), Internet technologies
(Web services, service-oriented architectures, Web 2.0), distributed computing (clusters, grids),
and systems management (autonomic computing, data center automation).
Computing delivered as a utility can be defined as “on-demand delivery of infrastructure,
applications, and business processes in a security-rich, shared, scalable, and standards-based
computer environment over the Internet for a fee.”
This model brings benefits to both consumers and providers of IT services. Consumers can attain
a reduction in IT-related costs by choosing to obtain cheaper services from external providers as
opposed to investing heavily in IT infrastructure and personnel hiring. The “on-demand”
component of this model allows consumers to adapt their IT usage to rapidly increasing or
unpredictable computing needs.
3.Cloud Computing Architecture:
Cloud Computing is trending in today’s technology driven world. With the advantages of
flexibility, storage, sharing and easy accessibility, Cloud is being used by major players in IT.
Apart from companies, individuals also use Cloud technologies for various daily activities. From
using Google Drive to store files, to Skype to chat, to Picasa web albums, we use Cloud Computing
platforms extensively. Cloud Computing is a service provided via virtual networks, especially
the World Wide Web.
Cloud Computing architecture refers to the various components and sub-components of cloud
that constitute the structure of the system. Broadly, this architecture can be classified into two
sections:
-Front-end
-Back-end
The two ends are connected through a network, usually the Internet. The following diagram
shows the graphical view of cloud computing architecture:
The front-end and back-end are connected to each other via a virtual network or the internet.
There are other components like Middleware, Cloud Resources, etc, that are part of the Cloud
Computing architecture.
The front-end is the side that is visible to the client, customer or user. It includes the client's
computer system or network that is used for accessing the cloud system. Different cloud
computing systems have different user interfaces; email services, for example, are accessed
through web browsers such as Firefox, Chrome and Internet Explorer.
The back-end is the side used by the service provider. It includes the various servers, computers,
data storage systems, and virtual machines that together constitute the cloud of computing services.
This system can include different types of computer programs. Each application in this system is
managed by its own dedicated server. The back-end side has some responsibilities to fulfill for
the client.
-To provide security mechanisms, traffic control and protocols
-To employ protocols that connect networked computers for communication
Protocols
A single central server is used to manage the entire Cloud Computing system. This server is
responsible for monitoring traffic and making each end run smoothly without any disruption.
This process follows a fixed set of rules called protocols. Also, a special kind of software
called middleware is used to perform these processes; middleware connects networked computers
to each other.
Depending on the client's demand, adequate storage space is provided by the cloud computing
service provider. While some companies require a huge number of digital storage devices, others
do not require as much space. The cloud computing service provider usually has capacity for
twice the amount of storage space required by the client, in order to keep a copy of the client's
data safe in case of a system breakdown. Keeping copies of data for backup is known as
redundancy.
4.Cloud Concepts and Technologies:
This Chapter Covers
Concepts and enabling technologies of cloud computing including:
Virtualization
Load balancing
Scalability & Elasticity
Deployment
Replication
Monitoring

4.1 Virtualization

Virtualization refers to the partitioning of the resources of a physical system (such as computing,
storage, network and memory) into multiple virtual resources. Virtualization is the key
enabling technology of cloud computing and allows pooling of resources. In cloud computing,
resources are pooled to serve multiple users using multi-tenancy. Multi-tenant aspects of the
cloud allow multiple users to be served by the same physical hardware. Users are assigned
virtual resources that run on top of the physical resources. Figure 2.1 shows the architecture
of virtualization technology in cloud computing. The physical resources such as computing,
storage, memory and network resources are virtualized. The virtualization layer partitions the
physical resources into multiple virtual machines. The virtualization layer allows multiple
operating system instances to run concurrently as virtual machines on the same underlying
physical resources.
Hypervisor
The virtualization layer consists of a hypervisor or a virtual machine monitor (VMM). The
hypervisor presents a virtual operating platform to a guest operating system (OS). There are
two types of hypervisors, as shown in Figures 2.2 and 2.3. Type-1 hypervisors, or native
hypervisors, run directly on the host hardware, control the hardware, and monitor the guest
operating systems. Type-2 hypervisors, or hosted hypervisors, run on top of a conventional
(main/host) operating system and monitor the guest operating systems.
Guest OS
A guest OS is an operating system that is installed in a virtual machine in addition to the host or
main OS. In virtualization, the guest OS can be different from the host OS.
Various forms of virtualization approaches exist:
Full Virtualization
In full virtualization, the virtualization layer completely decouples the guest OS from the
underlying hardware. The guest OS requires no modification and is not aware that it is being
virtualized. Full virtualization is enabled by direct execution of user requests and binary
translation of OS requests. Figure 2.4 shows the full virtualization approach.
Para-Virtualization
In para-virtualization, the guest OS is modified to enable communication with the hypervisor to
improve performance and efficiency. The guest OS kernel is modified to replace nonvirtualizable
instructions with hypercalls that communicate directly with the virtualization layer hypervisor.
Figure 2.5 shows the para-virtualization approach.
Hardware Virtualization
Hardware-assisted virtualization is enabled by hardware features such as Intel's Virtualization
Technology (VT-x) and AMD's AMD-V. In hardware-assisted virtualization, privileged and
sensitive calls are set to automatically trap to the hypervisor. Thus, there is no need for either
binary translation or para-virtualization.
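Whether a host CPU offers these extensions can be checked from its advertised feature flags; on Linux, Intel VT-x appears as the vmx flag and AMD-V as svm in /proc/cpuinfo. A minimal sketch (the helper name is our own, not a standard API):

```python
def hw_virt_support(cpuinfo_text):
    """Return the hardware virtualization extension advertised in the
    CPU feature flags: 'vmx' (Intel VT-x), 'svm' (AMD-V), or None."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            if "vmx" in flags:
                return "vmx"  # Intel VT-x present
            if "svm" in flags:
                return "svm"  # AMD-V present
    return None
```

On a Linux host this would be called as hw_virt_support(open("/proc/cpuinfo").read()).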
4.2 Load Balancing
One of the important features of cloud computing is scalability. Cloud computing resources can
be scaled up on demand to meet the performance requirements of applications. Load balancing
distributes workloads across multiple servers to meet the application demand. The goals of
load balancing techniques are to maximize resource utilization, minimize response times, and
maximize throughput. Load balancing distributes the incoming user requests
across multiple resources. With load balancing, cloud-based applications can achieve high
availability and reliability. Since multiple resources under a load balancer are used to serve the
user requests, in the event of failure of one or more of the resources, the load balancer can
automatically reroute the user traffic to the healthy resources. To the end user accessing a cloud-
based application, a load balancer makes the pool of servers under the load balancer appear as a
single server with high computing capacity.
Round Robin
In round robin load balancing, the servers are selected one by one to serve the incoming requests
in a non-hierarchical circular fashion with no priority assigned to a specific server.
Weighted Round Robin
In weighted round robin load balancing, servers are assigned weights. The incoming
requests are proportionally routed using a static or dynamic ratio of the respective weights.
Low Latency
In low latency load balancing the load balancer monitors the latency of each server. Each
incoming request is routed to the server which has the lowest latency.
Least Connections
In least connections load balancing, the incoming requests are routed to the server with the least
number of connections.
Priority
In priority load balancing, each server is assigned a priority. The incoming traffic is routed to the
highest priority server as long as the server is available. When the highest priority server fails,
the incoming traffic is routed to a server with a lower priority.
Overflow
Overflow load balancing is similar to priority load balancing. When the incoming requests to
highest priority server overflow, the requests are routed to a lower priority server.
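The round robin, weighted round robin and least connections strategies can be sketched in a few lines of Python (the class names and the toy pick()/release() interface are our own illustration, not any particular load balancer's API):

```python
import itertools

class RoundRobin:
    """Select servers one by one in a circular order with no priority."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)
    def pick(self):
        return next(self._cycle)

class WeightedRoundRobin:
    """Route requests in proportion to static per-server weights."""
    def __init__(self, weighted_servers):   # e.g. {"s1": 2, "s2": 1}
        expanded = [s for s, w in weighted_servers.items() for _ in range(w)]
        self._cycle = itertools.cycle(expanded)
    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Route each request to the server with the fewest open connections."""
    def __init__(self, servers):
        self.conns = {s: 0 for s in servers}
    def pick(self):
        server = min(self.conns, key=self.conns.get)
        self.conns[server] += 1             # one more connection opened
        return server
    def release(self, server):
        self.conns[server] -= 1             # connection closed
```

A production balancer would additionally health-check servers and remove failed ones from rotation.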
Figure 2.6 depicts these various load balancing approaches. For session based applications, an
important issue to handle during load balancing is the persistence of multiple requests from a
particular user session. Since load balancing can route successive requests from a user session to
different servers, maintaining the state or the information of the session is important. Three
commonly used persistence approaches are described below:
Sticky sessions
In this approach all the requests belonging to a user session are routed to the same server. These
sessions are called sticky sessions. The benefit of this approach is that it makes session
management simple. However, a drawback is that if a server fails, all the sessions belonging to
that server are lost, since no automatic failover is possible.
Session Database
In this approach, all the session information is stored externally in a separate session database,
which is often replicated to avoid a single point of failure. Though this approach involves the
additional overhead of storing the session information, unlike the sticky session approach it
allows automatic failover.
Browser cookies
In this approach, the session information is stored on the client side in the form of browser
cookies. The benefit of this approach is that it makes the session management easy and has the
least amount of overhead for the load balancer.
URL re-writing
In this approach, a URL re-write engine stores the session information by modifying the URLs
on the client side. Though this approach avoids overhead on the load balancer, a drawback is that
the amount of session information that can be stored is limited.
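One common way to implement sticky routing without any server-side session table is to hash the session identifier onto a server, so every request in a session lands on the same machine. A sketch (the function is our own illustration):

```python
import hashlib

def sticky_route(session_id, servers):
    """Deterministically map a session ID to one server so that all
    requests belonging to that session are served by the same machine."""
    digest = hashlib.sha256(session_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

As the text notes, the drawback remains: if the chosen server fails, the session state it held is lost.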
4.3 Scalability & Elasticity
Multi-tier applications such as e-Commerce, social networking, business-to-business, etc. can
experience rapid changes in their traffic. Each website has a different traffic pattern which is
determined by a number of factors that are generally hard to predict beforehand. Modern web
applications have multiple tiers of deployment with varying number of servers in each tier.
Capacity planning is an important task for such applications. Capacity planning involves
determining the right sizing of each tier of the deployment of an application in terms of the
number of resources and the capacity of each resource. Capacity planning may be for computing,
storage, memory or network resources. Figure 2.7 shows the cost versus capacity curves for
traditional and cloud approaches.
Traditional approaches for capacity planning are based on predicted demands for applications
and account for worst case peak loads of applications. When the workloads of applications
increase, the traditional approaches have been either to scale up or scale out.
4.4 Deployment
Deployment prototyping can help in making deployment architecture design choices. By
comparing performance of alternative deployment architectures, deployment prototyping can
help in choosing the best and most cost effective deployment architecture that can meet the
application performance requirements. Deployment design is an iterative process that involves
the following steps:
Deployment Design
In this step the application deployment is created with the various tiers specified in the
deployment configuration. The variables in this step include the number of servers in each tier;
the computing, memory and storage capacities of the servers; the server interconnection; and the
load balancing and replication strategies. The deployment is created by provisioning the cloud.
Performance Evaluation
Once the application is deployed in the cloud, the next step in the deployment lifecycle is to
verify whether the application meets the performance requirements with the deployment. This
step involves monitoring the workload on the application and measuring various workload
parameters such as response time and throughput. In addition to this, the utilization of servers
(CPU, memory, disk, I/O, etc.) in each tier is also monitored.
Deployment Refinement
After evaluating the performance of the application, deployments are refined so that the
application can meet the performance requirements. Various alternatives can exist in this step,
such as vertical scaling (scaling up), horizontal scaling (scaling out), alternative server
interconnections, and alternative load balancing and replication strategies.
4.5 Replication
Replication is used to create and maintain multiple copies of the data in the cloud. Replication of
data is important for practical reasons such as business continuity and disaster recovery.
In the event of data loss at the primary location, organizations can continue to operate their
applications from secondary data sources. With real-time replication of data, organizations can
achieve faster recovery from failures. Traditional business continuity and disaster recovery
approaches don't provide efficient, cost effective and automated recovery of data. Cloud based
data replication approaches provide replication of data in multiple locations, automated recovery,
low recovery point objective (RPO) and low recovery time objective (RTO). Cloud enables rapid
implementation of replication solutions for disaster recovery for small and medium enterprises
and large organizations. With cloud-based data replication organizations can plan for disaster
recovery without making any capital expenditures on purchasing, configuring and managing
secondary site locations. Cloud provides affordable replication solutions with pay-per-use/pay-
as-you-go pricing models. There are three types of replication approaches as shown in Figure 2.9
and described as follows:
Array based replication
Host based replication
Network based replication
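Host-based replication, for instance, can be pictured as the write path fanning out from the host to a primary and one or more secondary sites, so a secondary can serve reads (and support recovery) if the primary is lost. A toy in-memory sketch (each "site" is just a dict; real products replicate over the network, often asynchronously):

```python
class ReplicatedStore:
    """Toy host-based replication: every write fans out synchronously to
    a primary and all secondary sites."""
    def __init__(self, primary, secondaries):
        self.primary = primary              # each "site" is just a dict here
        self.secondaries = list(secondaries)

    def put(self, key, value):
        # Write to the primary and every replica before acknowledging.
        for site in [self.primary, *self.secondaries]:
            site[key] = value

    def get(self, key):
        if key in self.primary:
            return self.primary[key]
        for site in self.secondaries:       # failover read from a replica
            if key in site:
                return site[key]
        raise KeyError(key)
```

Synchronous fan-out like this gives an RPO of zero for the replicated keys; asynchronous variants trade a small RPO for lower write latency.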
4.6 Monitoring
Cloud resources can be monitored by monitoring services provided by the cloud service
providers. Monitoring services allow cloud users to collect and analyze the data on various
monitoring metrics. Figure 2.10 shows a generic architecture for a cloud monitoring service. A
monitoring service collects data on various system and application metrics from the cloud
computing instances. Monitoring services provide various pre-defined metrics. Users can also
define their custom metrics for monitoring the cloud resources. Users can define various actions
based on the monitoring data, for example, auto-scaling a cloud deployment when the CPU
usage of monitored resources becomes high. Monitoring services also provide various statistics
based on the monitoring data collected. Table 2.4 lists the commonly used monitoring metrics.
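The auto-scaling action mentioned above can be sketched as a simple threshold rule over collected CPU samples (the thresholds and server limits below are illustrative defaults, not taken from any specific monitoring service):

```python
def autoscale(current_servers, cpu_samples, high=80.0, low=20.0,
              min_servers=1, max_servers=10):
    """Return the new server count for a deployment, given recent CPU
    utilization samples: scale out above `high`, scale in below `low`."""
    avg = sum(cpu_samples) / len(cpu_samples)
    if avg > high and current_servers < max_servers:
        return current_servers + 1   # scale out by one instance
    if avg < low and current_servers > min_servers:
        return current_servers - 1   # scale in by one instance
    return current_servers           # utilization within band: no action
```

Real services usually also require the threshold to be breached for a sustained period before acting, to avoid thrashing on short spikes.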
4.7 Software Defined Networking
Software-Defined Networking (SDN) is a networking architecture that separates the control
plane from the data plane and centralizes the network controller. Figure 2.11 shows the
conventional network architecture built with specialized hardware (switches, routers, etc.).
Network devices in conventional network architectures are getting exceedingly complex with the
increasing number of distributed protocols being implemented and the use of proprietary
hardware and interfaces. In the conventional network architecture the control plane and data
plane are coupled. Control plane is the part of the network that carries the signaling and routing
message traffic while the data plane is the part of the network that carries the payload data
traffic.
The limitations of the conventional network architectures are as follows:
• Complex Network Devices: Conventional networks are getting increasingly complex with
more and more protocols being implemented to improve link speeds and reliability.
Interoperability is limited due to the lack of standard and open interfaces. Network devices use
proprietary hardware and software and have slow product lifecycles limiting innovation. The
conventional networks were well suited for static traffic patterns and had a large number of
protocols designed for specific applications. With the emergence of cloud computing and the
proliferation of Internet access devices, traffic patterns are becoming more and more dynamic.
4.8 Network Function Virtualization
Network Function Virtualization (NFV) is a technology that leverages virtualization to
consolidate the heterogeneous network devices onto industry standard high volume servers,
switches and storage. NFV is complementary to SDN as NFV can provide the infrastructure on
which SDN can run. NFV and SDN are mutually beneficial to each other but not dependent.
Figure 2.16 shows the NFV architecture, as standardized by the European Telecommunications
Standards Institute (ETSI). Key elements of the NFV architecture are as follows:
Virtualized Network Function (VNF): VNF is a software implementation of a
network function which is capable of running over the NFV Infrastructure (NFVI).
NFV Infrastructure (NFVI): NFVI includes compute, network and storage
resources that are virtualized.
NFV Management and Orchestration: NFV Management and Orchestration
focuses on all virtualization-specific management tasks and covers the orchestration and
lifecycle management of the physical and/or software resources that support the
infrastructure virtualization, and the lifecycle management of VNFs.
NFV comprises network functions implemented in software that run on virtualized resources
in the cloud, enabling a separation of the network functions from the hardware on which they
conventionally run.
5.Cloud Services and Platforms:
There are various types of cloud services and for each category of cloud services, examples
of services are provided by various cloud service providers including Amazon, Google and
Microsoft.
5.1 Compute Services
Compute services provide dynamically scalable compute capacity in the cloud. Compute
resources can be provisioned on-demand in the form of virtual machines. Virtual machines can
be created from standard images provided by the cloud service provider (e.g. Ubuntu image,
Windows server image, etc.) or custom images created by the users. A machine image is a
template that contains a software configuration (operating system. application server, and
applications). Compute services can be accessed from the web consoles of these services that
provide graphical user interfaces for provisioning, managing and monitoring these services.
Cloud service providers also provide APIs for various programming languages (such as Java,
Python. etc. ) that allow developers to access and manage these services programmatically.
Features
• Scalable: Compute services allow rapidly provisioning as many virtual machine instances
as required. The provisioned capacity can be scaled-up or down based on the workload levels.
Auto-scaling policies can be defined for compute services that are triggered when the
monitored metrics (such as CPU usage, memory usage, etc.) go above pre-defined thresholds.
Flexible: Compute services give a wide range of options for virtual machines, with multiple
instance types, operating systems, zones/regions, etc.
Secure: Compute services provide various security features that control access to the
virtual machine instances, such as security groups, access control lists, network firewalls,
etc. Users can securely connect to the instances with SSH using authentication mechanisms
such as OAuth or security certificates and keypairs.
Cost effective: Cloud service providers offer various billing options, such as on-demand instances
which are billed per hour, reserved instances which are reserved after a one-time initial
payment, and spot instances for which users can place bids.
5.2 Storage Services
Cloud storage services allow storage and retrieval of any amount of data, at any time, from
anywhere on the web. Most cloud storage services organize data into buckets or containers.
Buckets or containers store objects, which are individual pieces of data.
Features
Scalability: Cloud storage services provide high capacity and scalability. Objects up to
several terabytes in size can be uploaded, and multiple buckets/containers can be created on
cloud storage.
Replication: When an object is uploaded it is replicated at multiple facilities and/or on
multiple devices within each facility.
Access Policies: Cloud storage services provide several security features such as Access
Control Lists (ACLs), bucket/container-level policies, etc. ACLs can be used to selectively
grant access permissions on individual objects. Bucket/container-level policies can also be
defined to allow or deny permissions across some or all of the objects within a single
bucket/container.
Encryption: Cloud storage services provide Server Side Encryption (SSE) options to encrypt
all data stored in the cloud storage.
Consistency: Strong data consistency is provided for all upload and delete operations. Therefore,
any object that is uploaded can be immediately downloaded after the upload is complete.
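The bucket/object model with per-object ACLs and read-after-write consistency can be illustrated with a toy in-memory bucket (the class and its methods are our own sketch, not any provider's SDK):

```python
class Bucket:
    """A toy cloud-storage bucket: named objects plus a per-object access
    control list (ACL). Writes are visible to permitted readers
    immediately, i.e. read-after-write consistency."""
    def __init__(self, name):
        self.name = name
        self._objects = {}
        self._acl = {}                    # key -> set of permitted readers

    def put(self, key, data, readers=()):
        self._objects[key] = data
        self._acl[key] = set(readers)     # selectively grant read access

    def get(self, key, user):
        if user not in self._acl.get(key, set()):
            raise PermissionError(f"{user} cannot read {key}")
        return self._objects[key]
```

A real object store would also version objects, encrypt them at rest (SSE), and replicate each put across facilities, as described above.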
5.3 Database Services
Cloud database services allow you to set-up and operate relational or non-relational databases in
the cloud. The benefit of using cloud database services is that it relieves the application
developers from the time consuming database administration tasks. Popular relational
databases provided by various cloud service providers include MySQL, Oracle, SQL Server,
etc. The non-relational (No-SQL) databases provided by cloud service providers are mostly
proprietary solutions. No-SQL databases are usually fully-managed and deliver seamless
throughput and scalability. The characteristics of relational and non-relational databases are
described in Chapter 5.
Features
Scalability: Cloud database services allow provisioning as much compute and storage resources
as required to meet the application workload levels. Provisioned capacity can be scaled-up or
down. For read-heavy workloads, read-replicas can be created.
Reliability: Cloud database services are reliable and provide automated backup and snapshot
options.
Performance: Cloud database services provide guaranteed performance, with options such as
guaranteed input/output operations per second (IOPS) which can be provisioned upfront.
Security: Cloud database services provide several security features to restrict the access to the
database instances and stored data, such as network firewalls and authentication mechanisms.
5.4 Application Services
In this section you will learn about various cloud application services such as application
runtimes and frameworks, queuing services, email services, notification services and media
services.
5.5 Content Delivery Services
Cloud-based content delivery services include Content Delivery Networks (CDNs). A CDN is a
distributed system of servers located across multiple geographic locations to serve content to
end-users with high availability and high performance. CDNs are useful for serving static content
such as text, images, scripts, etc., and streaming media. CDNs have a number of edge locations
deployed in multiple locations, often over multiple backbones. Requests for static or streaming
media content that is served by a CDN are directed to the nearest edge location. CDNs cache the
popular content on the edge servers which helps in reducing bandwidth costs and improving
response times.
5.6 Analytics Services
Cloud-based analytics services allow analyzing massive data sets stored in the cloud either in
cloud storages or in cloud databases, using programming models such as MapReduce. Using
cloud analytics services, applications can perform data-intensive tasks such as data
mining, log file analysis, machine learning, web indexing, etc.
5.7 Deployment & Management Services
Cloud-based deployment & management services allow you to easily deploy and manage
applications in the cloud. These services automatically handle deployment tasks such as capacity
provisioning, load balancing, auto-scaling, and application health monitoring.
5.8 Identity & Access Management Services
Identity & Access Management (IDAM) services allow managing the authentication and
authorization of users to provide secure access to cloud resources. IDAM services are useful for
organizations which have multiple users who access the cloud resources. Using IDAM services you
can manage user identifiers, user permissions, security credentials and access keys.
5.9 Open Source Private Cloud Software
In the previous sections you learned about popular public cloud platforms. This section covers
open source cloud software that can be used to build private clouds.
5.9.1 CloudStack
Apache CloudStack is an open source cloud software that can be used for creating private
cloud offerings. CloudStack manages the network, storage, and compute nodes that make
up a cloud infrastructure. A CloudStack installation consists of a Management Server and
the cloud infrastructure that it manages. The cloud infrastructure can be as simple as one host
running the hypervisor, or a large cluster of hundreds of hosts. The Management Server
allows you to configure and manage the cloud resources. Figure 3.21 shows the architecture
of CloudStack, which is centered on the Management Server. The Management Server
manages one or more zones, where each zone is typically a single datacenter. Each zone
has one or more pods. A pod is a rack of hardware comprising a switch and one or more
clusters. A cluster consists of one or more hosts and a primary storage. A host is a compute
node that runs guest virtual machines. The primary storage of a cluster stores the disk
volumes for all the virtual machines running on the hosts in that cluster.
5.9.2 Eucalyptus
Eucalyptus is an open source private cloud software for building private and hybrid clouds that
are compatible with Amazon Web Services (AWS) APIs. In the Eucalyptus architecture, the
Node Controller (NC) hosts the virtual machine instances and manages the virtual network
endpoints. The cluster level (availability zone) consists of three components - Cluster Controller
(CC), Storage Controller (SC) and VMware Broker. The CC manages the virtual machines and
is the front-end for a cluster. The SC manages the Eucalyptus block volumes and snapshots for
the instances within its specific cluster. The SC is equivalent to AWS Elastic Block Store (EBS).
The VMware Broker is an optional component that provides an AWS-compatible interface for
VMware environments. At the cloud level there are two components - Cloud Controller (CLC)
and Walrus. The CLC provides an administrative interface for cloud management and performs
high-level resource scheduling, system accounting, authentication and quota management.
5.9.3 OpenStack
OpenStack is a cloud operating system comprising a collection of interacting services that
control computing, storage, and networking resources. The OpenStack compute service (called
nova-compute) manages networks of virtual machines running on nodes, providing virtual
servers on demand. The network service (called nova-networking) provides connectivity
between the interfaces of other OpenStack services. The volume service (Cinder) manages
storage volumes for virtual machines. The object storage service (Swift) allows users to store and
retrieve files. The identity service (Keystone) provides authentication and authorization for other
services. The image registry (Glance) acts as a catalog and repository for virtual machine images.
The OpenStack scheduler (nova-scheduler) maps the nova-API calls to the appropriate
OpenStack components. The scheduler takes the virtual machine requests from the queue and
determines where they should run. The messaging service (rabbit-mq) acts as a central node for
message passing between daemons. Orchestration activities such as running an instance are
performed by nova-api, which accepts and responds to end-user compute API calls. The
OpenStack dashboard (called Horizon) provides a web-based interface for managing OpenStack
services.
6.Cloud Models:
Cloud Service Models
Cloud computing services are offered to users in different forms. NIST defines at least three
service models, as follows:
Infrastructure-as-a-Service (IaaS)
IaaS provides users the capability to provision computing and storage resources. These
resources are provided to the users as virtual machine instances and virtual storage. Users can
start, stop, configure and manage the virtual machine instances and virtual storage. Users can
deploy operating systems and applications of their choice on the virtual resources provisioned in
the cloud. The cloud service provider manages the underlying infrastructure. Virtual resources
provisioned by the users are billed based on a pay-per-use paradigm. Common metering metrics
used are the number of virtual machine hours used and/or the amount of storage space
provisioned.
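A pay-per-use charge computed from these two metering metrics is a simple product-and-sum; the rates below are made-up illustrative numbers, not any provider's pricing:

```python
def monthly_bill(vm_hours, storage_gb, vm_rate=0.05, storage_rate=0.10):
    """Pay-per-use bill: virtual machine hours times an hourly rate, plus
    provisioned storage (GB) times a per-GB rate. Rates are illustrative."""
    return round(vm_hours * vm_rate + storage_gb * storage_rate, 2)
```

For example, 100 VM-hours plus 50 GB of provisioned storage at these rates come to 10.0 currency units.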
Platform-as-a-Service (PaaS)
PaaS provides users the capability to develop and deploy applications in the cloud using the
development tools, application programming interfaces (APIs), software libraries and services
provided by the cloud service provider. The cloud service provider manages the underlying cloud
infrastructure, including servers, network, operating systems and storage. The users themselves
are responsible for developing, deploying, configuring and managing applications on the cloud
infrastructure.
Software-as-a-Service (SaaS)
SaaS provides users a complete software application, or the user interface to the application
itself. The cloud service provider manages the underlying cloud infrastructure, including
servers, network, operating systems, storage and application software, and the user is
unaware of the underlying architecture of the cloud. Applications are provided to the user
through a thin client interface (e.g., a browser). SaaS applications are platform independent and
can be accessed from various client devices such as workstations, laptops, tablets and
smartphones, running different operating systems. Since the cloud service provider manages both
the application and the data, users are able to access the applications from anywhere.
Cloud Deployment Models:
NIST also defines four cloud deployment models as follows:
Public cloud
In the public cloud deployment model, cloud services are available to the general public or a
large group of companies. The cloud resources are shared among different users (individuals,
large organizations, small and medium enterprises, and governments). The cloud services are
provided by a third-party cloud provider. Public clouds are best suited for users who want to use
cloud infrastructure for development and testing of applications, or to host applications in the
cloud to serve large workloads, without upfront investments in IT infrastructure.
Private cloud
In the private cloud deployment model, cloud infrastructure is operated for exclusive use of a
single organization. Private cloud services are dedicated for a single organization. Cloud
infrastructure can be set up on-premise or off-premise, and may be managed internally or by a
third party. Private clouds are best suited for applications where security is very important and
for organizations that want very tight control over their data.
Hybrid cloud
The hybrid cloud deployment model combines the services of multiple clouds (private or public).
The individual clouds retain their unique identities but are bound by standardized or proprietary
technology that enables data and application portability. Hybrid clouds are best suited for
organizations that want to take advantage of secured application and data hosting on a private
cloud, and at the same time benefit from cost savings by hosting shared applications and data in
public clouds.
Community cloud
In the community cloud deployment model, the cloud services are shared by several
organizations that have the same policy and compliance considerations. Community clouds are
best suited for organizations that want access to the same applications and data, and want the
cloud costs to be shared with a larger group.
7. CLOUD AS A SERVICE:
In today's economy, many businesses are faced with challenges like:
● "taking cost" out of their infrastructure
● delivering new, innovative business services
● "doing more with less"
● changing their IT infrastructure quickly
● optimizing IT resources and lowering costs
● adding rental-style capability to IT resource usage
7.6 Cloud Service Demand
8.Cloud Solutions:
8.1 Introduction
The cloud environment presents:
- A new opportunity to enhance the user experience by providing a broader communication
path for reaching out to the user.
- A way of providing a series of business services to the user via the application features.
Deploying an application to the cloud is somewhat different, since the deployment process is not
done locally within the enterprise; it relies on a provisioned image and a series of deployment
steps to deploy the application and validate the deployment. Development and testing
environments are readily available within the cloud environment. The advantages of these
environments, especially from a costing perspective, are numerous, as there is no need to
purchase any servers within the normal enterprise environment. If a proof of concept (POC) is
being developed and the project is cancelled, no software, hardware or development tools have to
be purchased only to be thrown away later, since the cloud supports the development and testing
of applications.
8.1.1 Cloud Application Planning
The design and development of a cloud application requires many unique considerations:
Business functions
Application architecture
Security for cloud computing
Cloud delivery model
User experience
Development, testing, and runtime environment
The application architecture is selected through some form of criteria evaluation. From a
security aspect, the key consideration is enhancing the existing security model, in particular
data protection and the isolation of data from the other areas of the cloud environment.
Encryption is one possibility for further enhancing the security model, though the enterprise
need not necessarily invoke that option.
8.1.2 Cloud Business and Operational Support Services (BSS & OSS)
Business Support Services (BSS) are the components that cloud operators use to run their
business operations. Such operations include taking customer orders, managing customer data, etc.
Operational Support Services (OSS) are the computer systems used by cloud service providers,
e.g., network services and provisioning services.
BSS and OSS need to be externalized so that they can be moved to cloud environments.
8.2 Cloud Ecosystem
Bringing any cloud service to market requires corresponding pre-investment along with
respective metering and charging models in support of the corresponding business.
8.3 Cloud Business Process Management
Business process management (BPM) governs an organization's cross-functional, customer-focused,
end-to-end core business processes. Its objective is to direct and deploy resources from across
the organization into efficient processes that create customer value. It focuses on driving
overall bottom-line success by integrating verticals and optimizing core work. Examples include:
order-to-cash
integrated product development
integrated supply chain
This is what differentiates BPM from traditional functional management disciplines. In addition,
intrinsic to BPM is the principle of continuous improvement, perpetually increasing value
generation and sustaining the market competitiveness (or dominance) of the organization. BPM
clearly defines and aligns operations, organizations and information technology.
The cloud environment can help BPM in several respects:
Integration of core processes.
Holistic: spanning cross-organizational functions and boundaries (height and breadth), and
including both business and technology.
Continuous: based on longer periods of intervals pertaining to the cloud business, with
continual improvement.
Cultural: cultural considerations of the organization and the geographical area are kept in mind
at the time of due diligence of the requirement.
8.3.1 Identifying BPM Opportunities
The following exploratory set of questions might uncover opportunities for using the cloud for
BPM:
What are the strategic value proposition and capabilities defined by the enterprise?
How do you manage core business processes?
How do your customers measure and assess performance?
Cloud application development offerings provide:
A cloud application reference architecture
Unmatched experience developing high-performing, secure applications across a wide range of the
cloud vendor's technologies
Unmatched application security expertise
Leadership in cloud-related technologies: multi-tenancy, virtualization, pervasive computing
Significant expertise with cloud business models
The ability to integrate a portfolio of related cloud services, e.g., Gmail
8.3.2 Cloud Technical Strategy
Cloud services enable users to build middleware clouds in their data centers and to utilize
public clouds. A cloud strategy enables organizations to do the following:
Build middleware clouds in their data centers.
Utilize public clouds, where it makes sense.
It does so by providing support for the following cloud-enabled middleware services:
Infrastructure services
Platform services
Application services
serving both on-premise and public clouds.
8.3.3 Cloud Service Management
A service management system provides the visibility, control and automation needed for efficient
cloud delivery in both public and private implementations:
Provisioning policies lower costs.
Automated provisioning and deprovisioning speeds service delivery.
Provisioning policies allow the release and reuse of assets.
System administrator productivity increases.
Cloud services are managed either by in-house teams or by cloud brokers. Every service-oriented
approach needs a mechanism to enable discovery and end-point resolution. Registry/repository
technology provides this where service delivery is inside the firewall. Cloud services delivered
across firewalls need something similar: a third party that serves as a "service broker".
8.4 Cloud Service Management: Cloud Brokers
These cloud intermediaries help companies choose the right platform, deploy applications across
multiple clouds, and perhaps even provide cloud arbitrage services that allow end users to shift
between platforms to capture the best pricing.
8.5 Cloud Service Management: Cloud Brokers - Categories of Opportunities
Cloud service intermediation: building services atop an existing cloud platform, such as
additional security or management capabilities.
Cloud aggregation: deploying customer services over multiple cloud platforms.
Cloud service arbitrage: supplying flexibility and "opportunities & choices" and fostering
competition between clouds.
8.6 Cloud Stack
CloudStack is a bundled offering that includes the hardware, software, and services needed to get
started with cloud computing. It includes all the elements in a service ecosystem: it has a
self-service portal, it includes automation, and it tracks and controls all the resources. It is
completely integrated, and on top of the included services users can add additional services to
do integration or other types of cloud work.
CloudStack is a pre-packaged private cloud offering that brings together the hardware, software
and services needed to establish a private cloud, accelerating your selling efforts and
effectiveness. The CloudStack solution is designed from client cloud implementation experience
and integrates the service management software system with servers, storage, and services to
enable a private cloud in the IT environment. CloudStack is "built for performance" and is based
on the architectures and configurations required by specific workloads. It enables the data
center to accelerate the creation of services for a variety of workloads with a high degree of
flexibility, reliability and resource optimization.
8.7 On-Premise Cloud Orchestration & Provisioning Engine
An on-premise cloud orchestration and provisioning engine can be a bundled offering that includes
the hardware, software and services one needs to get started with cloud computing. Orchestration
describes the automated arrangement, coordination, and management of complex computer systems,
middleware, and services. The engine should include all the elements in a service ecosystem. It
must have a service portal, include automation, and track and control all resources.
8.8 Computing on Demand (CoD)
On-demand computing is a necessity in today's enterprises, and virtualization helps in
implementing it. The cloud helps enterprises use resources without buying them. This enables an
enterprise to transfer workload outside when its own resources cannot support it, and to let
others use its resources when they are idle.
8.9 Cloud Sourcing
Cloud sourcing is the end-to-end solution of using cloud technology: public cloud, infrastructure
and platform. It is a planned approach comprising the whole service cycle of outsourcing business
with cloud principles, with the help of a strategized, connected cloud platform that matches the
overall enterprise requirements.
9. Cloud Offerings:
9.1 Introduction:
Information is pouring in faster than we can make sense of it. It is being authored by billions
of people and is flowing from a trillion intelligent devices, sensors and instrumented objects.
With 80% of new data growth existing as unstructured content, from music files to 3D images to
email keystrokes and more, the challenge is trying to pull it all together and make it useful.
Until now, organizations could not fully or quickly synthesize and interpret all the information
out there; they had to make decisions based largely on instinct. But now there is software that
can capture, organize and process all the data scattered throughout an organization, and turn it
into actual intelligence. This enables organizations to make better business decisions.
9.2 ILM Objectives
Cost reduction: controlling demand for storage.
Better system performance and personal productivity: doing the storage activities "right".
Increased effectiveness: doing the "right" storage activities.
Ways to generate, enhance and sustain higher savings:
Activities for gaining initial savings: reduce the amount of used storage as a result of an
initial clean-up.
Activities for maximizing savings: reconfigure the current storage environment effectively,
improving the available-to-raw utilization.
Activities for sustaining savings: develop a storage architecture governance model.
9.2 ILM Objectives: Cost Components
Operating cost categories: personnel, facilities, storage hardware maintenance, storage software
maintenance, outages.
Investment cost categories: new hardware required, new software required, hardware refresh,
transition services.
9.3 Information Management Points
Data
Information
ILM
Information taxonomy
Information classes
Value-driven data placement
Storage process
Storage service
Enterprise Class of Service (CoS)
9.4 Information Management Points (continued)
Storage service
Enterprise Class of Service (CoS)
Storage tier
Tiered storage infrastructure
Utility-based service delivery
9.5 Cloud Analytics
Cloud analytics is a new offering in the era of cloud computing. It will help the consulting
domain and will ensure better results. It provides users with better forecasting techniques to
analyze and optimize service lines, and provides a higher level of accuracy.
It also helps to apply analytics principles and best practices to analyse different business
consequences and achieve new levels of optimization. It can combine complex analytics with newer
software platforms, leading toward predictable business outcomes from every business insight.
9.6 Cloud Business Analytics Competencies:
Cloud analytics is supported by different types of competencies.
1. A cloud business analytics strategy that helps clients with analytics and optimization: it
provides different types of modeling techniques, deep computing and simulation techniques to run
different kinds of "what if" analyses to increase performance.
2. Business management and performance management, which helps increase performance by providing
accurate and on-time data reporting.
9.6.1 Cloud Business Analytics Competencies (continued)
3. Enterprise information management, which lets the user apply different architectures related
to data extraction, archival, retrieval, movement and integration.
4. Content management, which includes the different service architectures, technology
architectures and processes related to capturing, storing, preserving, delivering and managing
data. It also provides access in the global environment and makes it easy to share data with
stakeholders across the globe.
9.6.2 How It Works: Analytics
Analytics works with a combination of hardware, services and middleware. This expertise makes it
well suited to helping clients extract new value from their business information. Delivering
business analytics and information software requires a seamless flow of all forms of data
regardless of format, platform and location. A focus on open industry standards is the key to
this effort and gives significant advantages.
9.6.3 How It Works: Analytics (continued)
The system features include a platform that provides data reporting, text-based analytics, mining
activities, business intelligence, dashboards and predictive analytics techniques. It also takes
care of storage optimization and different high-performance data-warehouse management techniques.
9.6.4 How It Works: Analytics Business Outcomes
Analytics systems help to get the right information as and when required, identify how to get it,
and point out the right sources to get it from. Analytics therefore also helps in designing
policies faster, based on the information available in the organization, as decision-makers work
with the exploration services available within the organization. It also helps in gauging
business results by measuring the different metrics generated with the help of analytics. This
gives the organization options through which it can increase profitability, reduce cycle times
and reduce defects.
9.7 Testing under Cloud
Testing under cloud provides a good return on investment when moving a typical testing
environment to the cloud. It allows the flexibility to play with a surrogate of the real system
without the actual risk.
9.7.1 Benefits
Cut capital and operational costs without affecting mission-critical applications.
Offer new and innovative services to clients, and present an opportunity to speed the cycle of
innovation and improve solution quality.
Facilitate a test environment based on request and provide request-based services for storage,
network and OS.
9.7.2 Value Proposition
The business test cloud delivers an integrated, flexible and extensible approach to test resource
services and management with rapid time to value. It is an end-to-end set of services to
strategize, design and build request-driven delivery of test resources in a cost-effective,
efficient manner.
9.7.3 Biggest Beneficiaries
With the ability to deploy virtual environments quickly and automatically and to redirect
capacity as needed, cloud computing offers an ideal solution for testing and development.
10. Introduction to Hadoop and MapReduce:
Hadoop is an Apache open-source framework, written in Java, that allows distributed processing of
large datasets across clusters of computers using simple programming models. A Hadoop application
works in an environment that provides distributed storage and computation across clusters of
computers. Hadoop is designed to scale up from a single server to thousands of machines, each
offering local computation and storage.
Hadoop Architecture
The Hadoop framework includes the following four modules:
Hadoop Common: the Java libraries and utilities required by the other Hadoop modules. These
libraries provide filesystem and OS-level abstractions and contain the necessary Java files and
scripts required to start Hadoop.
Hadoop YARN: a framework for job scheduling and cluster resource management.
Hadoop Distributed File System (HDFS™): a distributed file system that provides high-throughput
access to application data.
Hadoop MapReduce: a YARN-based system for parallel processing of large data sets.
The following diagram depicts these four components available in the Hadoop framework.
Since 2012, the term "Hadoop" often refers not just to the base modules mentioned above but also
to the collection of additional software packages that can be installed on top of or alongside
Hadoop, such as Apache Pig, Apache Hive, Apache HBase, Apache Spark, etc.
MapReduce
Hadoop MapReduce is a software framework for easily writing applications that process vast
amounts of data in parallel on large clusters (thousands of nodes) of commodity hardware in a
reliable, fault-tolerant manner.
The term MapReduce actually refers to the following two different tasks that Hadoop programs
perform:
The Map Task: the first task, which takes input data and converts it into another set of data,
where individual elements are broken down into tuples (key/value pairs).
The Reduce Task: takes the output from a map task as input and combines those data tuples into a
smaller set of tuples. The reduce task is always performed after the map task.
Typically both the input and the output are stored in a file system. The framework takes care of
scheduling tasks, monitoring them and re-executing failed tasks.
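These two tasks can be sketched as a local simulation in plain Python (an illustration of the concept, not the actual Hadoop Java API), using the classic word-count example:

```python
from itertools import groupby
from operator import itemgetter

def map_task(line):
    # Break an input record into (key, value) tuples: one ("word", 1) per word.
    return [(word, 1) for word in line.split()]

def reduce_task(word, counts):
    # Combine all tuples sharing a key into a single, smaller tuple.
    return (word, sum(counts))

def mapreduce(lines):
    # Map phase: apply map_task to every input record.
    pairs = [pair for line in lines for pair in map_task(line)]
    # Shuffle/sort phase: group intermediate pairs by key.
    pairs.sort(key=itemgetter(0))
    # Reduce phase: one reduce_task call per distinct key.
    return [reduce_task(word, [c for _, c in group])
            for word, group in groupby(pairs, key=itemgetter(0))]

print(mapreduce(["big data big clusters", "big data"]))
# [('big', 3), ('clusters', 1), ('data', 2)]
```

In real Hadoop the map and reduce calls run on different machines, and the framework, not the program, performs the sort-and-group step between them.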
The MapReduce framework consists of a single master JobTracker and one slave TaskTracker per
cluster node. The master is responsible for resource management, tracking resource
consumption/availability, and scheduling the jobs' component tasks on the slaves, monitoring them
and re-executing failed tasks. The TaskTracker slaves execute the tasks as directed by the master
and provide task-status information to the master periodically. The JobTracker is a single point
of failure for the Hadoop MapReduce service: if the JobTracker goes down, all running jobs are
halted.
Hadoop Distributed File System
Hadoop can work directly with any mountable distributed file system such as Local FS, HFTP
FS, S3 FS, and others, but the most common file system used by Hadoop is the Hadoop
Distributed File System (HDFS).
The Hadoop Distributed File System (HDFS) is based on the Google File System (GFS) and
provides a distributed file system that is designed to run on large clusters (thousands of
computers) of small computer machines in a reliable, fault-tolerant manner.
HDFS uses a master/slave architecture in which the master is a single NameNode that manages the
file system metadata, and the slaves are one or more DataNodes that store the actual data.
A file in an HDFS namespace is split into several blocks, and those blocks are stored in a set of
DataNodes. The NameNode determines the mapping of blocks to DataNodes. The DataNodes take care of
read and write operations with the file system, as well as block creation, deletion and
replication, based on instructions given by the NameNode.
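As an illustration of this split, the following Python sketch (with hypothetical helper names; real HDFS placement is far more sophisticated, e.g. rack-aware) divides a file into fixed-size blocks and assigns each block to a replication-factor's worth of DataNodes:

```python
BLOCK_SIZE = 128   # in MB; HDFS defaults to 128 MB blocks
REPLICATION = 3    # default HDFS replication factor

def split_into_blocks(file_size_mb):
    # NameNode-side bookkeeping: the block sizes a file occupies.
    return [min(BLOCK_SIZE, file_size_mb - i)
            for i in range(0, file_size_mb, BLOCK_SIZE)]

def place_replicas(num_blocks, datanodes):
    # Toy placement policy: round-robin each block onto REPLICATION nodes.
    mapping = {}
    for b in range(num_blocks):
        mapping[b] = [datanodes[(b + r) % len(datanodes)]
                      for r in range(REPLICATION)]
    return mapping

blocks = split_into_blocks(300)   # a 300 MB file
print(blocks)                     # [128, 128, 44]
print(place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"]))
```

The NameNode holds only this kind of mapping (metadata); the bytes of each block live on the DataNodes themselves.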
HDFS provides a shell like any other file system, and a list of commands is available to interact
with the file system. These shell commands will be covered in a separate chapter along with
appropriate examples.
How Does Hadoop Work?
Stage 1
A user/application can submit a job to Hadoop (via a Hadoop job client) for the required
processing by specifying the following items:
1. The location of the input and output files in the distributed file system.
2. The Java classes, in the form of a JAR file, containing the implementation of the map and
reduce functions.
3. The job configuration, set through different parameters specific to the job.
Stage 2
The Hadoop job client then submits the job (jar/executable etc) and configuration to the
JobTracker which then assumes the responsibility of distributing the software/configuration to
the slaves, scheduling tasks and monitoring them, providing status and diagnostic information
to the job-client.
Stage 3
The TaskTrackers on different nodes execute the task as per MapReduce implementation and
output of the reduce function is stored into the output files on the file system.
Advantages of Hadoop
The Hadoop framework allows the user to quickly write and test distributed systems. It is
efficient, and it automatically distributes the data and work across the machines, in turn
utilizing the underlying parallelism of the CPU cores.
Hadoop does not rely on hardware to provide fault tolerance and high availability (FTHA);
rather, the Hadoop library itself has been designed to detect and handle failures at the
application layer.
Servers can be added or removed from the cluster dynamically and Hadoop continues to
operate without interruption.
Another big advantage of Hadoop is that, apart from being open source, it is compatible with all
platforms, since it is Java-based.
Cloud Computing UNIT I
Short Answer Questions
1. Define parallel computing.
2. List the types of distributed computing on the basis of architectural style.
3. Define cloud computing.
4. Define the MapReduce technique.
Descriptive Questions
1. Write short notes on compute services, storage services and database services.
2. What is Hadoop? How does it work? Explain the architecture of Hadoop
with neat diagram.
3. Write down the broad approaches of migrating into cloud?
4. What is virtualization? Explain the taxonomy of virtualization techniques.
Assignment Questions
1. Give the differences between parallel computing and distributed computing.
2. Give the deployment models of cloud.
3. Give the architecture of cloud.
4.Write about Amazon cloud.
Objective questions
1. _________ model consists of the particular types of services that you can
access on a cloud computing platform. a) Service b) Deployment c) Application
d) None of the mentioned
2. Point out the correct statement : a) The use of the word “cloud” makes
reference to the two essential concepts b) Cloud computing abstracts systems
by pooling and sharing resources c) cloud computing is nothing more than the
Internet d) All of the mentioned
3. ________ refers to the location and management of the cloud’s
infrastructure. a) Service b) Deployment c) Application d) None of the
mentioned
4. Which of the following is deployment model ? a) public b) private c) hybrid
d) all of the mentioned
5. ________ as a utility is a dream that dates from the beginning of the
computing industry itself. a) Model b) Computing c) Software d) All of the
mentioned
6. Point out the wrong statement : a) All applications benefit from deployment
in the cloud b) With cloud computing, you can start very small and become big
very fast c) Cloud computing is revolutionary, even if the technology it is built
on is evolutionary d) None of the mentioned
7. ________ has many of the characteristics of what is now being called cloud
computing. a) Internet b) Software’s c) Web Service d) All of the mentioned
8. Which of the following is related to service provided by Cloud ? a) Sourcing
b) Ownership c) Reliability d) AaaS
9. The ________ cloud infrastructure is operated for the exclusive use of an
organization. a) Public b) Private c) Community d) All of the mentioned
10. A ____________ cloud combines multiple clouds where those clouds retain
their unique identities, but are bound together as a unit. a) Public b) Private c)
Community d) Hybrid
Answer the following questions either True or False:
11. Scalability in the cloud allows users to expand or contract when they need
to.
12. Cloud load balancers typically have built-in redundancy.
13. Cloud computing refers to applications and services that run on a
distributed network using virtualized resources.
14. Productivity is an essential concept related to the Cloud.
15. Virtualization cloud concept is related to pooling and sharing of resources.
16. Intranet can be identified as cloud.
17. Cloud computing is an abstraction based on the notion of pooling physical
resources and presenting them as a Virtual resource.
18. AWS is Cloud Platform by Amazon.
19. Deployment refers to the location and management of the cloud’s
infrastructure.
20. Amazon has built a worldwide network of data centers to service its search
engine.
UNIT TEST QUESTIONS
1. Give the different types of parallel computing.
2. Explain the IaaS feature of cloud.
3. Describe the Hadoop architecture.
Web Link:
https://www.tutorialspoint.com/cloud_computing/cloud_computing_tutorial.pdf
https://www.guru99.com/cloud-computing-for-beginners.html
PPTS
NA
Videos
NA