CS6703 - GRID AND CLOUD COMPUTINGpit.ac.in/pitnotes/uploads/CS6703_QB_CSE.pdf · 2019-07-16 ·...


CS6703 - GRID AND CLOUD COMPUTING


OBJECTIVES

The student should be made to:

Understand how Grid computing helps in solving large scale scientific problems.

Gain knowledge on the concept of virtualization that is fundamental to cloud computing.

Learn how to program the grid and the cloud.

Understand the security issues in the grid and the cloud environment.

UNIT I - INTRODUCTION


Evolution of Distributed computing: Scalable computing over the Internet – Technologies for

network based systems – clusters of cooperative computers - Grid computing Infrastructures –

cloud computing - service oriented architecture – Introduction to Grid Architecture and standards

– Elements of Grid – Overview of Grid Architecture.

UNIT II - GRID SERVICES

Introduction to Open Grid Services Architecture (OGSA) – Motivation – Functionality

Requirements –Practical & Detailed view of OGSA/OGSI – Data intensive grid service models –

OGSA services.

UNIT III - VIRTUALIZATION

Cloud deployment models: public, private, hybrid, community – Categories of cloud computing:

Everything as a service: Infrastructure, platform, software - Pros and Cons of cloud computing –

Implementation levels of virtualization – virtualization structure – virtualization of CPU,

Memory and I/O devices – virtual clusters and Resource Management – Virtualization for data

center automation.

UNIT IV - PROGRAMMING MODEL

Open source grid middleware packages – Globus Toolkit (GT4) Architecture , Configuration –

Usage of Globus – Main components and Programming model - Introduction to Hadoop

Framework - Mapreduce, Input splitting, map and reduce functions, specifying input and output

parameters, configuring and running a job – Design of Hadoop file system, HDFS concepts,

command line and java interface, dataflow of File read & File write.

UNIT V - SECURITY

Trust models for Grid security environment – Authentication and Authorization methods – Grid

security infrastructure – Cloud Infrastructure security: network, host and application level –

aspects of data security, provider data and its security, Identity and access management

architecture, IAM practices in the cloud, SaaS, PaaS, IaaS availability in the cloud, Key privacy

issues in the cloud.


TOTAL: 45 PERIODS

OUTCOMES:

At the end of the course, the student should be able to:

Apply grid computing techniques to solve large scale scientific problems.

Apply the concept of virtualization.

Use the grid and cloud tool kits.

Apply the security models in the grid and the cloud environment.

TEXT BOOK:

1. Kai Hwang, Geoffrey C. Fox and Jack J. Dongarra, “Distributed and Cloud Computing: Clusters, Grids, Clouds and the Future of Internet”, First Edition, Morgan Kaufmann Publishers, an Imprint of Elsevier, 2012.

REFERENCES:

1. Jason Venner, “Pro Hadoop: Build Scalable, Distributed Applications in the Cloud”, Apress, 2009.

2. Tom White, “Hadoop: The Definitive Guide”, First Edition, O'Reilly, 2009.

3. Bart Jacob (Editor), “Introduction to Grid Computing”, IBM Redbooks, Vervante, 2005.

4. Ian Foster, Carl Kesselman, “The Grid: Blueprint for a New Computing Infrastructure”, 2nd Edition, Morgan Kaufmann.

5. Frederic Magoules and Jie Pan, “Introduction to Grid Computing”, CRC Press, 2009.

6. Daniel Minoli, “A Networking Approach to Grid Computing”, John Wiley Publication, 2005.

7. Barry Wilkinson, “Grid Computing: Techniques and Applications”, Chapman and Hall, CRC, Taylor and Francis Group, 2010.

COURSE OUTCOMES:

Upon completion of the course, the students will have

CO1 An ability to apply grid computing techniques to solve large scale scientific problems.


CO2 An ability to understand grid services and their functionalities.

CO3 An ability to understand and apply the concept of virtualization.

CO4 An ability to design and develop applications using grid and cloud tool kits.

CO5 An ability to analyze the security models in the grid and the cloud environment.

UNIT I INTRODUCTION


Evolution of Distributed computing: Scalable computing over the Internet – Technologies for

network based systems – clusters of cooperative computers - Grid computing Infrastructures –

cloud computing - service oriented architecture – Introduction to Grid Architecture and standards

– Elements of Grid – Overview of Grid Architecture.

PART A

1. List out the three computing paradigms. R

The growth of Internet clouds as a new computing paradigm.

The maturity of radio-frequency identification (RFID), Global Positioning System (GPS), and sensor technologies has triggered the development of the Internet of Things (IoT).

2. Define distributed computing. R

Distributed computing is a field of computer science/engineering that studies distributed systems. A distributed system consists of multiple autonomous computers, each having its own private memory, communicating through a computer network. Information exchange in a distributed system is accomplished through message passing. A computer program that runs in a distributed system is known as a distributed program. The process of writing distributed programs is referred to as distributed programming.
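The message-passing idea in the answer above can be illustrated with a minimal sketch. It is only a single-process simulation: a thread stands in for an autonomous computer and two queues stand in for the network, and the names `node` and `run_demo` are made up for this illustration. The node keeps its state in a local variable and interacts with the outside world only by receiving and sending messages.

```python
import queue
import threading


def node(name, inbox, outbox):
    """Simulated node: private state, communication only through messages."""
    private_total = 0                      # local state, never shared directly
    while True:
        message = inbox.get()              # block until a message arrives
        if message == "stop":
            # Report the result by sending a message back.
            outbox.put((name, "total", private_total))
            break
        private_total += message           # update local state only


def run_demo():
    """Send three numbers to one simulated node and collect its reply."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=node, args=("node-1", inbox, outbox))
    worker.start()
    for value in (1, 2, 3):
        inbox.put(value)                   # "send" a message over the network
    inbox.put("stop")
    worker.join()
    return outbox.get()


print(run_demo())
```

On a real distributed system the queues would be replaced by network communication, for example through an MPI library or sockets.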

3. What is meant by cloud computing?U

An Internet cloud of resources can be either a centralized or a distributed computing

system. The cloud applies parallel or distributed computing, or both. Clouds can be built with

physical or virtualized resources over large data centers that are centralized or distributed. Some

authors consider cloud computing to be a form of utility computing or service computing.

4. What is SOA?R(Nov/Dec 2018)

A service-oriented architecture (SOA) is an architectural pattern in computer software

design in which application components provide services to other components via a

communications protocol, typically over a network. The principles of service-orientation are

independent of any vendor, product or technology


5. What is meant by a virtual machine? U

Virtual machines (VMs) offer novel solutions to underutilized resources, application

inflexibility, software manageability, and security concerns in existing physical machines.

6. Can you define cluster?R

A computing cluster consists of interconnected stand-alone computers which work

cooperatively as a single integrated computing resource. In the past, clustered computer systems

have demonstrated impressive results in handling heavy workloads with large data sets.

7. What is meant by SSI?U

An ideal cluster should merge multiple system images into a single-system image (SSI).

Cluster designers desire a cluster operating system or some middleware to support SSI at various

levels, including the sharing of CPUs, memory, and I/O across all cluster nodes. An SSI is an

illusion created by software or hardware that presents a collection of resources as one integrated,

powerful resource.

8. Define Peer-to-Peer Network.R

A well-established distributed system architecture is the client-server architecture. In this scenario, client machines (PCs and workstations) are connected to a central server for compute, e-mail, file access, and database applications. In contrast, the P2P architecture offers a distributed model of networked systems: a P2P network is client-oriented instead of server-oriented, and every node acts as both a client and a server.

9. Define Overlay Networks.R

An overlay network is a virtual network built on top of the physical network. There are two types: unstructured and structured. An unstructured overlay network is characterized by a random graph. There is no fixed route to send messages or files among the nodes. Often, flooding is applied to send a query to all nodes in an unstructured overlay, thus resulting in heavy network traffic and nondeterministic search results. Structured overlay networks follow a certain connectivity topology and rules for inserting and removing nodes (peer IDs) from the overlay graph.
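The cost of flooding in an unstructured overlay can be sketched as follows. The overlay graph, the peer names, the file name and the time-to-live (TTL) hop limit are all hypothetical values chosen for this illustration. The function counts every forwarded message, and whether the query succeeds depends on how far it is allowed to spread.

```python
from collections import deque

# Hypothetical unstructured overlay: peer -> list of neighbouring peers.
OVERLAY = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def flood_query(overlay, start, wanted, holders, ttl):
    """Flood a query from `start`.

    Returns (peers reached that hold `wanted`, number of messages sent).
    """
    seen = {start}
    frontier = deque([(start, ttl)])
    messages = 0
    hits = []
    while frontier:
        peer, hops_left = frontier.popleft()
        if wanted in holders.get(peer, ()):
            hits.append(peer)
        if hops_left == 0:
            continue                      # TTL exhausted, stop forwarding
        for neighbour in overlay[peer]:
            messages += 1                 # every forward costs one message
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, hops_left - 1))
    return hits, messages


holders = {"E": {"song.mp3"}}
print(flood_query(OVERLAY, "A", "song.mp3", holders, ttl=3))  # (['E'], 9)
print(flood_query(OVERLAY, "A", "song.mp3", holders, ttl=2))  # ([], 6)
```

Even in this five-peer graph a single lookup costs nine messages, and lowering the hop limit makes the same query miss the file. A structured overlay avoids this by routing according to its topology rules instead of flooding.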

10. List out three clouds services.R

Infrastructure as a Service (IaaS)

Software as a Service (SaaS)

Platform as a Service (PaaS)

11. Define Message-Passing Interface (MPI).R


This is the primary programming standard used to develop parallel and concurrent

programs to run on a distributed system. MPI is essentially a library of subprograms that can be

called from C or FORTRAN to write parallel programs running on a distributed system. The idea

is to embody clusters, grid systems, and P2P systems with upgraded web services and utility

computing applications

12. List out the types of resource used in grid computing?R

Computation

Storage

Communications

Software and licenses

13. List the Grid software components.R

Management components

Distributed grid management

Donor software

Submission software

Schedulers

14. What are all the Grid Topologies? R

Intragrid
– Local grid within an organization
– Trust based on personal contracts

Extragrid
– Resources of a consortium of organizations connected through a (Virtual) Private Network
– Trust based on Business to Business contracts

Intergrid
– Global sharing of resources through the internet
– Trust based on certification

15. Analyse all the elements involved in Grid?AN

Resource sharing
– Computers, data, storage, sensors, networks, …
– Sharing always conditional: issues of trust, policy, negotiation, payment, etc.

Coordinated problem solving
– Beyond client-server: distributed data analysis, computation, collaboration, etc.

Dynamic, multi-institutional virtual organizations
– Community overlays on classic org structures
– Large or small, static or dynamic

16. Bring out the differences between private cloud and public cloud. AN (Nov/Dec 2016)

Characteristics | Public cloud | Private cloud
Scalability | Very high | Limited
Security | Good, but depends on the security measures of the service provider | Most secure
Performance | Low to medium | Very good
Reliability | Medium; depends on Internet connectivity and service provider availability | High, as all equipment is on-premises
Cost | Very good; pay-as-you-go model and no need for on-premises storage infrastructure | Good, but requires on-premises resources such as data center space

17. Highlight the importance of the term "cloud computing". AN(Nov/Dec 2016)

Cloud computing is a computing paradigm, where a large pool of systems are connected

in private or public networks, to provide dynamically scalable infrastructure for application,

data and file storage. With the advent of this technology, the cost of computation, application

hosting, content storage and delivery is reduced significantly.

18. Tabulate the difference between high performance computing and high throughput

computing. AN (April/May 2017)

S. No. | High Performance | High Throughput
1. | Granularity largely defined by the algorithm | Granularity can be selected to fit the environment
2. | Load balancing difficult | Load balancing easy
3. | Hard to schedule different workloads | Mixing workloads is easy
4. | Reliability is all important | Sustained throughput is the key goal


19. Give the operations of a VM. R (April/May 2017).

Virtualization is the enabling technology and creates virtual machines that allow a single

machine to act as if it were many machines.

Runs several applications at the same time on a single physical server by hosting each of

them inside their own virtual machine.

By running multiple virtual machines simultaneously, a physical server can be utilized

efficiently

20. “Grid inherits features of P2P and Cluster Computing Systems”. Is the statement true?

Validate your answer. AN(Nov/Dec 2017)

Yes. Grid inherits features of P2P and cluster computing systems. Cluster computing is the base of all distributed computing paradigms: it aggregates resources locally and shares the load. Grid computing is an extended version of the cluster, in which resources are provisioned through the Internet. Cloud sits on top of both; it provides more or less the same functionality as the above two, but delivers it in the form of services and bills for it.

21. Differentiate between grid and cloud computing. AN (Nov/Dec 2017)

Description | Grid | Cloud
Underlying concept | Utility computing | Utility computing
Main benefit | Solve computationally complex problems | Provide a scalable standard environment for network-centric application development, testing and deployment
Resource distribution/allocation | Negotiate and manage resource sharing; schedulers | Simple user-provider model; pay per use
Domains | Multiple domains | Single domain

22. "Networks are backbones of grid computing". Justify the statement. AN (April/May

2018).

In grid computing, the network interconnects the various resources of the grid, providing a path for the exchange of information between different nodes.

23. Differentiate GRIS with GIIS with an illustration. AN (April/May 2018).

Grid Resource Information Service (GRIS)

• Associated with each resource.

• Answers queries from client/user about the particular resource.

• Accesses an “information provider” deployed on that resource for requested information.

Grid Index Information Service (GIIS)

• A directory service that collects (“pulls”) information from GRISs.

• A “caching” service.

• Provides indexing and searching functions.

24. Outline any two advantages of distributed computing. R (Nov/Dec 2018)

Advantages

Shareability

Expandability

Local autonomy

Improved performance

Improved reliability and availability

Potential cost reductions

PART-B

1. Explain Evolution of Distributed computing? U

2. Explain Technologies for network based system? U

3. Design neat diagram for Grid computing Infrastructures and explain? C

4. Explain Grid computing elements? U

5. Explain SOA? U

6. Illustrate the architecture of virtual machine and brief about the operations. (AP) (Nov/Dec

2016)

7. Write short notes on: (i) cluster of cooperative computers. (8) (ii) service oriented

architecture. (8) (R )(Nov/Dec 2016)

8. (i) Demonstrate in detail about internet of things and cyber physical systems.(8) AP

(ii) Examine the memory, storage and wide area networking technology in network based

system. (8)

9. Analyze in detail about the GPU programming model.(16) (AN)

10. i)Explain the layered architecture of SOA for web services.(8) (AN)

ii) Compare the features of grid versus cloud. (8) (AN)

11. Brief the interaction between the GPU and CPU in performing parallel execution of operations. (U) (April/May 2017)

12. Illustrate with a neat sketch, the grid computing Infrastructures. (R ) (April/May 2017)

13. (i) Describe the infrastructure requirements for grid computing. (U )(Nov/Dec 2017)

(ii) What are the issues in cluster design? How can they be resolved?

14.(i) Describe the layered grid architecture. How does it map onto internet protocol

architecture? U (Nov/Dec 2017)

(ii) Describe the architecture of a cluster with suitable illustrations. U

15. Explain in detail the layered architecture of a grid environment and the

functionalities of a grid server. (U) (April/May 2018)

16. Discuss the evolution path of cloud computing. Also, express the difference between

grid and distributed computing. (U) (April/May 2018)

17.(i)Outline the architecture of a cluster of cooperative computers with a diagram.(7)

(R ) (Nov/Dec 2018)

(ii) Outline the similarities and differences between distributed computing grid

computing and cloud computing.(6)( AN) (Nov/Dec 2018)

18. What is grid computing? Draw a typical view of a grid environment and outline the key

elements of grid. (13) (U) (Nov/Dec 2018)

UNIT II GRID SERVICES

Introduction to Open Grid Services Architecture (OGSA) – Motivation – Functionality

Requirements – Practical & Detailed view of OGSA/OGSI – Data intensive grid service models

– OGSA services.

PART-A

1. Define OGSA? R

• OGSA defines what Grid services are, what they should be capable of, and what types of technologies they should be based on. OGSA does not give a technical and detailed specification. It uses WSDL.

• OGSI is a formal and technical specification of the concepts described in OGSA.

• The Globus Toolkit 3 is an implementation of OGSI.

2. List the OGSA Design Goals. R

• Operations are grouped to form interfaces, and interfaces are combined to specify a service.

• Encourages code reuse.

• Simplifies application design.

• Ease of composition of services.

• Service virtualization: isolate users from details of service implementation and location.

3. Why IDL and Service Virtualization? AN

Service discovery: allows clients to query and find suitable services in an unfamiliar environment.

Service composition: code reuse and dynamic construction of complex systems from simple components.

Specialization: use of different implementations of a service interface on different platforms.

Interface extension: allows extensions to specialized service interfaces.

4. List the OGSA Components. R

Open Grid Services Infrastructure (OGSI)

OGSA services

OGSA schemas

Built on Web services

5. What is the role of OGSA? U

OGSI does not define everything:
– How to establish identity and authenticate?
– How is policy expressed/negotiated?
– How do I discover services? …

OGSA needs to pick up the slack:
– Define additional services
– Define standard schema to achieve interoperability

6. What is the use of WSDL? U

WSDL is used by OGSA to describe software components independent of any programming language/implementation. A WSDL service definition is encoded using XML. The service description defines the service interface. The implementation details describe how the interface maps to protocol messages and concrete endpoint addresses.
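The split between interface description and endpoint address can be seen in a small parsing sketch. The WSDL document below is a heavily trimmed, hypothetical WSDL 1.1 fragment written for this illustration (the service name, operations and URL are made up); only Python's standard `xml.etree.ElementTree` module is used to read it.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of WSDL 1.1 and its SOAP binding extension.
NAMESPACES = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Hypothetical, trimmed service description for illustration only.
SAMPLE_WSDL = """\
<definitions name="CounterService"
             targetNamespace="http://example.org/counter"
             xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             xmlns:tns="http://example.org/counter">
  <portType name="CounterPortType">
    <operation name="add"/>
    <operation name="getValue"/>
  </portType>
  <service name="CounterService">
    <port name="CounterPort" binding="tns:CounterBinding">
      <soap:address location="http://example.org/counter/endpoint"/>
    </port>
  </service>
</definitions>
"""


def describe(wsdl_text):
    """Return (operation names in the interface, concrete endpoint addresses)."""
    root = ET.fromstring(wsdl_text)
    operations = [
        op.get("name")
        for op in root.findall("wsdl:portType/wsdl:operation", NAMESPACES)
    ]
    endpoints = [
        addr.get("location")
        for addr in root.findall("wsdl:service/wsdl:port/soap:address", NAMESPACES)
    ]
    return operations, endpoints


print(describe(SAMPLE_WSDL))
```

The `portType` element carries the abstract interface, while the `service` and `port` elements carry the concrete address, so a client written in any language can read the same XML.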

7. What are the Features of OGSI? R

• Grid service description and instances: distinguish between definition and instances

• Service state, metadata, introspection: allows clients to receive states of a particular service

• Naming conventions/resolution

• Service life cycle management

• Fault type: standard base type for fault messages

• Service groups

8. List out Functional requirements for OGSA platform. R

Discovery and brokering

Data sharing

Virtual organizations

Monitoring

Policy

9. List out Resource Management Functions. R

Advanced reservation

Scheduling

Load balancing

Notification/messaging

Logging

Fault tolerance

Disaster recovery

10. What are all the services of OGSA? R

Web services with improved characteristics and services.

Provides a set of well-defined interfaces and conventions.

Interface Addresses:

a) Discovery, dynamic Service Creation.

b) Lifetime Management, notification.

Conventions include:

a) Naming Services

b) Upgrade ability.

11. Analyse the functional requirement of OGSA? AN

Resource sharing requirements include:

Global name space.


Metadata services.

Site autonomy.

Resource usage data

12. List out the services of Execution Management (OGSA-EMS)? R

Execution Management Services (OGSA-EMS) are concerned with the problems of

instantiating and managing, to completion, units of work. Examples of units of work may include

either OGSA applications or legacy (non-OGSA) applications (a database server, a servlet

running in a Java application server container, etc).

13. Explain the EMS Services classes? U

Resources that model processing, storage, executables, resource management, and

provisioning;

Job management; and

Resource selection services that collectively decide where to execute a unit of work.

14. Analyse the requirements of Data services. AN

• Data storage

• Data access.

• Data transfer

• Data location management & Data federation

16. What are the security concerns associated with the grid? R (Nov/Dec 2016)

Multiple security infrastructures

Perimeter security solutions

Authentication, Authorization, and Accounting

Encryption

Application and Network-Level Firewalls

Certification

17. What do you understand by the term data intensive? U (April/May 2017)

• Data intensive grid service models need to handle large volume of data.

• So the grid systems designed must be able to discover, transfer, and manipulate these

massive data sets

• Desirable properties

– Less time-consuming


– Low storage costs

– High-speed data movement

18. Define “OGSA”. R (April/May 2017)

Open Grid Services Architecture (OGSA) is a set of standards defining the way in which

information is shared among diverse components of large, heterogeneous grid systems. In this

context, a grid system is a scalable wide area network that supports resource sharing and

distribution. OGSA is a trademark of the Open Grid Forum

19. Compare GSH with GSR. AN (Nov/Dec 2017)

GSH:

A GSH is a globally unique name that distinguishes a specific grid service instance from

all others. The OGSA employs a “handle-resolution” mechanism for mapping from a GSH to a

GSR. The GSH must be globally defined for a particular Instance.

GSR:

Describes how a client can communicate with a Grid service instance. The HandleMap

interface allows a client to map from a GSH to a GSR. While the GSH represents name only, the

GSR includes binding information for transport protocol and data encoding format.
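The handle-resolution step described above can be sketched as a simple lookup table. This is a toy stand-in, not the OGSI interface itself: the class `ToyHandleMap`, its methods, the handle string and the endpoint URLs are all hypothetical names invented for this illustration. The handle is only a name, while the reference carries the binding details a client needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridServiceReference:
    """Binding information a client needs to talk to a service instance."""
    endpoint: str       # where the instance can be reached
    transport: str      # transport protocol
    encoding: str       # data encoding format


class ToyHandleMap:
    """Hypothetical in-memory stand-in for mapping a GSH to a GSR."""

    def __init__(self):
        self._table = {}

    def register(self, gsh, gsr):
        self._table[gsh] = gsr          # (re)bind the handle to a reference

    def resolve(self, gsh):
        return self._table[gsh]         # raises KeyError for unknown handles


handle_map = ToyHandleMap()
gsh = "gsh://example.org/counter/instance-42"      # globally unique name only

handle_map.register(
    gsh, GridServiceReference("http://host-a.example.org/counter", "HTTP", "SOAP"))
first = handle_map.resolve(gsh)

# Assumption for the sketch: the instance later becomes reachable elsewhere;
# the handle stays the same and only the reference is replaced.
handle_map.register(
    gsh, GridServiceReference("http://host-b.example.org/counter", "HTTP", "SOAP"))
second = handle_map.resolve(gsh)

print(first.endpoint, "->", second.endpoint)
```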

20. What is the purpose of grid service description? U (Nov/Dec 2017)

The purpose of the OGSI document is to specify the (standardized) interfaces and

behaviours that define a grid service. In brief, a grid service is a WSDL-defined service that

conforms to a set of conventions relating to its interface definitions and behaviours. Grid services

provide for the controlled management of the distributed and often long-lived state that is

commonly required in sophisticated distributed applications. OGSI also introduces standard

factory and registration interfaces for creating and discovering grid services.

21. Justify that Web and Web architecture are SOA based. AN(April/May 2018)

The technology of Web services is the most likely connection technology of service-oriented architectures. A service consumer sends a request message to a service provider, and the service provider returns a response message to the service consumer. The request and subsequent response connections are defined in some way that is understandable to both the service consumer and the service provider. A service provider can also be a service consumer.

22. List the services provided by a grid infrastructure. R (April/May 2018)

OGSA SERVICES:

• Common Management Model (CMM)

• Service domains

• Distributed data access and replication.

• Policy, security

• Provisioning and resource management

23. Define the term Web service. R (Nov/Dec 2018)

A Web service is a software service used to communicate between two devices on a

network. More specifically, a Web service is a software application with a standardized way of

providing interoperability between disparate applications. It does so over HTTP using

technologies such as XML, SOAP, WSDL, and UDDI.

24. What is a data grid? U (Nov/Dec 2018)

A data grid is an architecture or set of services that gives individuals or groups of users

the ability to access, modify and transfer extremely large amounts of geographically distributed

data for research purposes. Data grids make this possible through a host of middleware

applications and services that pull together data and resources from multiple administrative

domains and then present it to users upon request.

PART-B

1. Explain functionality requirements of OGSA? U

2. Explain in detail about OGSA services? U

3. Explain Open Grid Services Architecture (OGSA) with neat diagram. U

4. Explain detailed view of OGSA/OGSI? U

5. With a neat sketch, discuss the OGSA framework. U (Nov/Dec 2016)

6. Explain the data intensive grid service models with suitable diagrams. U(Nov/Dec 2016)

7. i)Analyze the set of services for the building blocks of OGSA based grid. (8) AN

ii) Explain the services provided by OGSA architecture. (8)

8. i) Explain the OGSA grid service interfaces developed by the OGSA working group. (8) AN


ii) Analyze the difference between service oriented architecture and OGSA. (8)

9. i) Tabulate the web service resource frame work and its related specifications. (8) R

ii) Examine the reasons involved in adopting OGSA as a grid architecture by number of

projects. R (8)

10. Write a detailed note on OGSA security models. U(April/May 2017)

11. Explain how migrations of grid services are handled. U(April/May 2017)

12. “Data produced by the Large Hadron Collider may exceed several petabytes”. What type of grid service model(s) will you suggest for such an application? Illustrate with diagrams. AN (Nov/Dec 2017)

13. What is OGSA? Explain open grid services architecture in detail with the functionalities of

the components. U(Nov/Dec 2017)

14. Explain in detail the OGSA security architecture and its security services.

(U) (April/May 2018)

15. What is the purpose of OGSI? Describe the ports and interfaces defined in OGSI along with

its inheritance hierarchy. U (April/May 2018)

16. What is open grid services architecture? Present a detailed view of open grid services

architecture. (13) U (Nov/Dec 2018)

17. What is open grid services infrastructure? Outline the open grid services infrastructure with a

diagram. U (Nov/Dec 2018)

UNIT III VIRTUALIZATION

Cloud deployment models: public, private, hybrid, community – Categories of cloud computing:

Everything as a service: Infrastructure, platform, software - Pros and Cons of cloud computing

– Implementation levels of virtualization – virtualization structure – virtualization of CPU,

Memory and I/O devices – virtual clusters and Resource Management – Virtualization for data

center automation

PART-A


1. What are the cloud deployment models? R

• Private Cloud

• Public Cloud

• Hybrid Cloud

• Community Cloud

2. Define private cloud. R

On-site Private Cloud

• Applies to private clouds implemented at a customer's premises.

Outsourced Private Cloud

• Applies to private clouds where the server side is outsourced to a hosting company.

3. Define public cloud. R

The most ubiquitous, and almost a synonym for, cloud computing. The cloud

infrastructure is made available to the general public or a large industry group and is owned by

an organization selling cloud services.

Examples of Public Cloud: Google App Engine

4. Define Hybrid cloud. R

The cloud infrastructure is a composition of two or more clouds (private, community, or

public) that remain unique entities but are bound together by standardized or proprietary

technology that enables data and application portability (e.g., cloud bursting for load-balancing

between clouds).
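Cloud bursting, mentioned in the definition above, can be sketched as a placement rule: work stays on the private cloud while capacity remains and overflows to the public cloud otherwise. The function name, the request tuples and the capacity figure are hypothetical, and a real hybrid cloud would also weigh cost, data location and policy.

```python
def place_requests(requests, private_capacity):
    """Place (name, load) requests; overflow beyond private capacity bursts out."""
    placement = {"private": [], "public": []}
    used = 0
    for name, load in requests:
        if used + load <= private_capacity:
            placement["private"].append(name)   # fits on the private cloud
            used += load
        else:
            placement["public"].append(name)    # burst to the public cloud
    return placement


print(place_requests([("a", 4), ("b", 4), ("c", 4), ("d", 2)], private_capacity=10))
# {'private': ['a', 'b', 'd'], 'public': ['c']}
```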

5. Write a short note on community cloud. R (Nov/Dec 2018)

The cloud infrastructure is shared by several organizations and supports a specific

community that has shared concerns (e.g., mission, security requirements, policy, and

compliance considerations). Government departments, universities, central banks etc. often find

this type of cloud useful.

6. What are the cloud service models? R

1. SaaS(Software as a service)

2. PaaS(platform as a service)

3. IaaS(Infrastructure as a service)

7. What is meant by SaaS? U

Software as a Service (SaaS) is a software distribution model in which applications are

hosted by a vendor or service provider and made available to customers over a network, typically

the Internet.

8. What is meant by PaaS? U

Platform as a service (PaaS) is a cloud computing model that delivers applications over

the Internet. In a PaaS model, a cloud provider delivers hardware and software tools -- usually

those needed for application development -- to its users as a service. A PaaS provider hosts the

hardware and software on its own infrastructure. As a result, PaaS frees users from having to

install in-house hardware and software to develop or run a new application.

9. What is meant by IaaS? U

Infrastructure as a Service (IaaS) is a service model that delivers computer infrastructure

on an outsourced basis to support enterprise operations. Typically, IaaS provides hardware,

storage, servers and data center space or network components; it may also include software

10. Analyse the Pros and Cons of cloud computing. AN

Advantages:-

Easy implementation

Accessibility

Flexibility for growth

Efficient recovery

Disadvantages:-

No longer in control

May not get all the features

Doesn't mean you should do away with servers.

No Redundancy

Bandwidth issues

11. Define Virtualization in cloud. R

In computing, virtualization means to create a virtual version of a device or resource,

such as a server, storage device, network or even an operating system where the framework

divides the resource into one or more execution environments.

12. List the types of Virtualization in cloud. R

Server Virtualization

Network Virtualization

Storage Virtualization

Desktop Virtualization

Application Virtualization

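13. How is the CPU virtualized? AN

The critical instructions that a VMM must handle fall into three categories:

Privileged instructions execute only in privileged mode and are trapped if executed outside it.

Control-sensitive instructions attempt to change the configuration of the resources used.

Behavior-sensitive instructions behave differently depending on the configuration of

resources, including load and store operations over virtual memory.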

14. Discuss the benefits of Clustering. AN

Scientific applications

Large ISPs and e-Commerce

Graphics rendering

Fail over clusters

High availability

15. List out the three resource managers. R

Instance Manager: controls the execution, inspection, and terminating of VM instances on the

host where it runs.

Group Manager: gathers information about and schedules VM execution on specific instance

managers, as well as manages virtual instance network.

Cloud Manager: is the entry-point into the cloud for users and administrators. It queries node

managers for information about resources, makes scheduling decisions, and implements them by

making requests to group managers.

16. Why is OS-level virtualization required? AN

Operating system virtualization inserts a virtualization layer inside an operating system to

partition a machine's physical resources. It enables multiple isolated VMs within a single

operating system kernel. This kind of VM is often called a virtual execution environment (VE),

Virtual Private System (VPS), or simply container. From the user's point of view, VEs look like

real servers. This means a VE has its own set of processes, file system, user accounts, network

interfaces with IP addresses, routing tables, firewall rules, and other personal settings.

17. Give the role of a VM. (Nov/Dec 2016) R

VMs have several qualities that make them an appealing technology in Grid systems:

Security and isolation

Customization of execution environment

Resource control

Site independence

18. Why do we need a hybrid cloud? AN (Nov/Dec 2016)

The hybrid cloud enables the enterprise to allocate its data, applications, and other

computing resources to either its own dedicated private cloud or to third-party public cloud

infrastructures. This flexibility helps organizations achieve a wide range of business goals,

including efficiency, availability, reliability, security, and cost efficiency.

19. Mention the characteristic features of the cloud. R (April/May 2017)

On-demand self-service: consumers can acquire the necessary computational resources

without having to interact with human service providers.

Ubiquitous network access: cloud features don't require special devices – laptops, mobile

phones, etc. are generally supported.

Resource pooling: cloud resources are pooled to serve many customers “… using a

multi-tenant model, with different physical and virtual resources…”

Rapid elasticity: resources can be allocated and de-allocated quickly as needed.

Measured service: resource use is measured and monitored; charges are made based on

usage and service type (e.g., storage, CPU cycles, etc.)
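As a rough illustration of measured service, the sketch below meters two kinds of usage and computes a charge from them. The unit prices and the `UsageRecord` and `compute_bill` names are hypothetical and are not taken from any provider's price list.

```python
from dataclasses import dataclass

# Hypothetical unit prices, chosen only for illustration.
RATES = {"cpu_hours": 0.05, "storage_gb_month": 0.02}


@dataclass
class UsageRecord:
    resource: str    # key into RATES
    quantity: float  # metered amount of that resource


def compute_bill(records):
    """Sum quantity * unit price over all metered records."""
    return round(sum(r.quantity * RATES[r.resource] for r in records), 2)


usage = [UsageRecord("cpu_hours", 100), UsageRecord("storage_gb_month", 50)]
bill = compute_bill(usage)
print(bill)
```

The point of the sketch is only that the charge follows from what was metered, per resource type, rather than from a flat fee.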

20. Summarize the differences between PaaS and SaaS. AN (April/May 2017)

PaaS provides a framework for quickly developing and deploying applications by

automating infrastructure provisioning and management. PaaS products include APIs and tools

that enable developers to hook in features such as traffic splitting, monitoring, and version

control systems.

SaaS providers host an application and make it available to users through the internet,

usually a browser-based interface. Users most commonly interact with SaaS applications such as

Gmail, Dropbox, Salesforce, or Netflix. Eliminating the need to install and run programs on

individual devices, SaaS makes applications available through the internet.

21. Distinguish between physical and virtual clusters. AN (Nov/Dec 2017)

A physical cluster is a collection of servers (physical machines) interconnected by a

physical network such as LAN.

Virtual clusters are built with VMs installed at distributed servers from one or more

physical clusters.

The VMs in a virtual cluster are interconnected logically by a virtual network across

several physical networks.

22. List the requirements of VMM. R (Nov/Dec 2017)

VM Monitor

Presents software interface to guest software.

Isolates guests' states from each other.

Guest software should behave exactly as if running on native HW.

Guest Software should not be able to change allocation of real system resources directly.

The VMM must remain in control of all real system resources.

23. How is performance enhanced by virtualizing the data center? AN (April/May 2018)

Virtualization promises to radically transform computing through better utilization of the

resources available in the data center, reducing overall costs and increasing agility. It

reduces operational complexity and maintains flexibility in selecting software and

hardware platforms and product vendors. It also increases agility in managing

heterogeneous virtual environments.

24. “Although virtualization is widely accepted today, it does have its limits”. Comment

on the statement. AN (April/May 2018)

It can have a high cost of implementation.

It creates a security risk.

It creates an availability issue.

It creates a scalability issue.

It requires several links in a chain that must work together cohesively.

25. Define the term virtual cluster. R (Nov/Dec 2018)

In a virtual cluster, virtual machines are grouped. Virtual clusters are built with VMs

installed at distributed servers from one or more physical clusters.

The VMs in a virtual cluster are interconnected logically by a virtual network across

several physical networks.

PART-B

1. Explain Cloud deployment models with neat diagrams? U

2. Describe Categories of cloud computing everything as a service: Infrastructure, platform,

software. U

3. Explain Implementation Levels Of Virtualization. U

4. Give detailed information about Virtualization Structures. R

5. Explain virtualization of CPU, Memory and I/O devices. U

6. Explain physical versus virtual clusters. U

7. Explain Virtual storage management. U

8. List the cloud deployment models-and give a detailed note about them. R (Nov/Dec 2016)

9. Give the importance of cloud computing and elaborate the different: types of services offered

by it. AN (Nov/Dec 2016)

10. Analyze the uses of AN

i) Infrastructure as a service (8)

ii) Platform as a service (8)

iii) Software as a service (8)

11. Discuss how virtualization is implemented in different layers. AN (April/May 2017)

12. What do you mean by data centre automation using virtualization? U (April/May 2017)

13. Describe service and deployment models of a cloud computing environment with

illustrations. How do they fit in NIST cloud architecture? U (Nov/Dec 2017)

14. What is virtualization? Describe para and full virtualization architectures. Compare and

contrast them. U (Nov/Dec 2017)

15. With architecture, elaborate the various deployment models and reference models of

cloud computing. U (April/May 2018)

16. “Virtualization is the wave of the future”. Justify. Explicate the process of CPU, memory

and I/O device virtualization in data center. AN (April/May 2018)

17. (i) What are the pros and cons for public, private and hybrid cloud? (7) U (Nov/Dec 2018)

(ii) Explain virtualization of I/O devices with an example. (6)

18. What is a data center? Outline the issues to be addressed with respect to virtualization for

data center automation. U (Nov/Dec 2018)

UNIT IV - PROGRAMMING MODEL

Open source grid middleware packages –Globus Toolkit (GT4) Architecture, Configuration –

Usage of Globus –Main components and Programming model -Introduction to Hadoop

Framework - Mapreduce, Input splitting, map and reduce functions, specifying input and output

parameters, configuring and running a job – Design of Hadoop file system, HDFS concepts,

command line and java interface, dataflow of File read & File write.

PART A

1. What is open source? R

Open source refers to any program whose source code is made available for use or

modification as users or other developers see fit. Open source software is usually developed as a

public collaboration and made freely available.

2. What is Middleware? R

Middleware is a general term for software that serves to "glue together" separate, often

complex and already existing, programs. Some software components that are frequently

connected with middleware include enterprise applications and Web services.

3. Define Globus Toolkit. R

The Globus Toolkit is a collection of grid middleware that allows users to run jobs,

transfer files, track file replicas, publish information about a grid, and more. All of these

facilities share a common security infrastructure called GSI that enables single sign-on.

4. What are the key terms in Globus? R

Three key terms:

Endpoint - a logical address for a GridFTP server, similar to a domain name for a web

server. Data is transferred between Globus endpoints.

Globus Connect Personal – for individual users – a client for communicating with other

GridFTP servers, via your local computer using Globus. Installing Globus Connect

Personal on your computer creates your own endpoint that you can use to transfer data to

and from your computer.

Globus Connect Server – for multiuser environments – a Linux package that sets up a

GridFTP server for use with Globus. Once installed on a server, those with access to the

server can move data to and from this location.

5. What is the use of the Globus Toolkit? U

The Globus Toolkit is a result of the Grid community's attempts to solve real

problems that are encountered by real application projects. It contains components that have

proven useful in addressing the challenging problems that come up when implementing Grid

applications and systems. The components have been generalized so that they make sense

within a wide variety of applications.

6. What are the standards supported by Globus Toolkit? R

Globus Toolkit adheres to or provides implementations of the following standards:

Open Grid Services Architecture (OGSA).

Open Grid Services Infrastructure (OGSI), originally intended to form the basic

“plumbing” layer for OGSA, but has been superseded by WSRF and WS-Management.

Web Services Resource Framework (WSRF).

Job Submission Description Language (JSDL).

Distributed Resource Management Application API (DRMAA).

WS-Management.

WS-Base Notification.

SOAP.

Web Services Description Language.

Grid Security Infrastructure (GSI).

7. What is a programming model? R

“A model is an abstract machine providing certain operations to the programming

level above and requiring implementations on all of the architectures.”

8. Define Apache Hadoop. U

Apache Hadoop is an open-source software framework for distributed storage and

distributed processing of very large data sets on computer clusters built from commodity

hardware. All the modules in Hadoop are designed with a fundamental assumption that

hardware failures are common and should be automatically handled by the framework.

9. Define MapReduce. U

MapReduce is a programming model and an associated implementation for processing

and generating large data sets with a parallel, distributed algorithm on a cluster.

MapReduce data processing is driven by this concept of input splits. The number of

input splits that are calculated for a specific application determines the number of mapper

tasks. Each of these mapper tasks is assigned, where possible, to a slave node where the input

split is stored.
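The link between input splits and mapper tasks can be sketched with a little arithmetic. This is a simplified illustration that assumes the split size equals a 64 MB block size and ignores record boundaries; the function name `num_input_splits` is hypothetical and is not part of the Hadoop API.

```python
import math

MB = 1024 * 1024


def num_input_splits(file_size_bytes, split_size_bytes):
    """Number of splits needed to cover the file; the last split may be shorter."""
    if file_size_bytes == 0:
        return 0
    return math.ceil(file_size_bytes / split_size_bytes)


# A 200 MB file with 64 MB splits: 64 + 64 + 64 + 8 MB.
splits = num_input_splits(200 * MB, 64 * MB)
mapper_tasks = splits  # one mapper task per input split
print(splits, mapper_tasks)
```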

10. Illustrate the relationship between data blocks and input splits. AP

11. What are the advantages of HDFS blocks? U

The benefits of HDFS blocks are:

The blocks are of fixed size, so it is very easy to calculate the number of blocks that can

be stored on a disk.

The HDFS block concept simplifies storage on the datanodes. The datanodes do not need

to concern themselves with block metadata such as file permissions. The namenode maintains

the metadata of all the blocks.

If the size of the file is less than the HDFS block size, then the file does not occupy the

complete block storage.

As the file is chunked into blocks, it is easy to store a file that is larger than the disk size

as the data blocks are distributed and stored on multiple nodes in a hadoop cluster.

Blocks are easy to replicate between the datanodes and thus provide fault tolerance and

high availability. Hadoop framework replicates each block across multiple nodes (default

replication factor is 3). In case of any node failure or block corruption, the same block can

be read from another node.
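A small calculation may make two of these points concrete: a file smaller than the block size does not fill a whole block, and replication multiplies the raw storage used. The sketch assumes a 64 MB block size and the default replication factor of 3 mentioned above; `hdfs_storage` is a hypothetical helper, not a Hadoop function.

```python
import math


def hdfs_storage(file_size_mb, block_size_mb=64, replication=3):
    """Return (block count, size of the last block in MB, raw MB stored across replicas)."""
    if file_size_mb <= 0:
        raise ValueError("file size must be positive")
    blocks = math.ceil(file_size_mb / block_size_mb)
    last_block_mb = file_size_mb - (blocks - 1) * block_size_mb
    raw_mb = file_size_mb * replication  # each block is stored `replication` times
    return blocks, last_block_mb, raw_mb


print(hdfs_storage(200))  # four blocks, the last one only partly filled
print(hdfs_storage(10))   # one block that holds just 10 MB of data
```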

12. What is HDFS File system? R

Filesystem Blocks: A block is the smallest unit of data that can be stored or retrieved

from the disk. Filesystems deal with the data stored in blocks. Filesystem blocks are normally

a few kilobytes in size. Blocks are transparent to the user who is performing filesystem

operations like read and write.

13. Why use the command line interface? AN

It's faster to create a prototype with, at least when the UI is not the prototype you're trying

to show.

It's easier to debug.

It's more powerful than UIs, which often don't implement all the options in order to keep the UI

simple and usable. It's just a matter of UX.

Sometimes you have to. This is the case if you're running a Linux server.

14. Why are HDFS blocks large in size? AN

The main reason for having large HDFS blocks is to reduce the cost of seek

time. In general, the seek time is 10 ms and the disk transfer rate is 100 MB/s. To make the seek

time 1% of the transfer time, the block size should be 100 MB. The default HDFS

block size is 64 MB in Hadoop 1.x and 128 MB from Hadoop 2.x onward.
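The arithmetic behind the 100 MB figure can be checked with a short sketch. It uses the 10 ms seek time and 100 MB/s transfer rate quoted above; the function name `block_size_for_seek_ratio` is hypothetical.

```python
def block_size_for_seek_ratio(seek_time_s, transfer_rate_mb_s, seek_fraction):
    """Block size (MB) at which the seek time is `seek_fraction` of the transfer time."""
    transfer_time_s = seek_time_s / seek_fraction  # 0.010 s / 0.01 = 1 s of transfer
    return transfer_rate_mb_s * transfer_time_s    # 100 MB/s * 1 s = 100 MB


size_mb = block_size_for_seek_ratio(0.010, 100, 0.01)
print(size_mb)
```

With a smaller block, the same 10 ms seek would be a larger share of the total read time, which is why HDFS favours large blocks.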

15. Write the use of Hadoop file system shell commands. R

Hadoop file system shell commands are used to perform various file operations like

copying files, changing permissions, viewing the contents of a file, changing ownership of

files, and creating directories. The syntax of an HDFS shell command is: hadoop fs <args>
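The operations listed above map onto `hadoop fs` subcommands such as `-mkdir`, `-put`, `-cat`, `-chmod` and `-chown`. The sketch below only builds and prints the command lines; the paths, the file name and the `demo` user are hypothetical, and the helper `hadoop_fs` is not part of Hadoop.

```python
import shlex


def hadoop_fs(*args):
    """Build the argument list for a `hadoop fs <args>` invocation."""
    return ["hadoop", "fs", *args]


commands = [
    hadoop_fs("-mkdir", "/user/demo"),                    # create a directory
    hadoop_fs("-put", "local.txt", "/user/demo/"),        # copy a local file into HDFS
    hadoop_fs("-cat", "/user/demo/local.txt"),            # view the contents of a file
    hadoop_fs("-chmod", "644", "/user/demo/local.txt"),   # change permissions
    hadoop_fs("-chown", "demo", "/user/demo/local.txt"),  # change ownership
]

for cmd in commands:
    print(shlex.join(cmd))
```

On a machine with Hadoop installed, each list could be passed to `subprocess.run(cmd, check=True)` to execute it.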

16. Name any four services offered in GT4. (Nov/Dec 2016) R

Resource management: Grid Resource Allocation & Management Protocol (GRAM)

Information Services: Monitoring and Discovery Service (MDS)

Security Services: Grid Security Infrastructure (GSI)

Data Movement and Management: Global Access to Secondary Storage (GASS) and

GridFTP

17. What are the advantages of using Hadoop? U(Nov/Dec 2016)

Hadoop is a highly scalable storage platform

Hadoop also offers a cost effective storage solution for businesses' exploding data sets.

Flexible & Fast

A key advantage of using Hadoop is its fault tolerance.

18. Write the significant use of GRAM. R(April/May 2017)

The Globus Toolkit includes a set of service components collectively referred to as the

Globus Resource Allocation Manager (GRAM). GRAM simplifies the use of remote systems

by providing a single standard interface for requesting and using remote system resources for

the execution of "jobs". The most common use (and the best supported use) of GRAM is

remote job submission and control. This is typically used to support distributed computing

applications.

19. Name the different modules in Hadoop framework. U(April/May 2017)

Hadoop Common: the common utilities that support the other Hadoop modules

Hadoop Distributed File System (HDFS): a distributed file system that provides

high-throughput access to application data

Hadoop YARN: a framework for job scheduling and cluster resource management

Hadoop MapReduce: a YARN-based system for parallel processing of large data

sets

20. “HDFS is fault tolerant.” Is it true? Justify your answer. AN (Nov/Dec 2017)

HDFS is highly fault tolerant. It handles faults by creating replicas.

Replicas of the user's data are created on different machines in the HDFS cluster, so

if any machine in the cluster goes down, the data can be accessed from another

machine that holds a copy of the same data. HDFS also maintains the replication

factor by creating new replicas of the data on other available machines in the cluster if

a machine suddenly fails.
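The replica behaviour described above can be imitated with a toy simulation. This is only a sketch: it places replicas at random, whereas real HDFS uses a rack-aware placement policy, and the function names and datanode ids are hypothetical.

```python
import random


def place_replicas(nodes, replication=3, rng=random):
    """Pick `replication` distinct datanodes for one block (random placement)."""
    return set(rng.sample(sorted(nodes), replication))


def handle_node_failure(placement, failed, live_nodes, replication=3, rng=random):
    """Remove the failed node, then re-replicate any block that fell below the factor."""
    for holders in placement.values():
        holders.discard(failed)
        candidates = sorted(live_nodes - holders)
        while len(holders) < replication and candidates:
            holders.add(candidates.pop(rng.randrange(len(candidates))))
    return placement


nodes = {"dn1", "dn2", "dn3", "dn4", "dn5"}
rng = random.Random(0)  # fixed seed so the run is repeatable
placement = {blk: place_replicas(nodes, rng=rng) for blk in ("blk_1", "blk_2")}

# dn1 goes down: every block should end up with 3 replicas on the surviving nodes.
handle_node_failure(placement, "dn1", nodes - {"dn1"}, rng=rng)
print(placement)
```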

21. What is the purpose of heartbeat in Hadoop? U (Nov/Dec 2017)

In Hadoop, the name node and data nodes communicate using heartbeats. A

heartbeat is the signal that a data node sends to the name node at regular

intervals to indicate its presence, i.e. to indicate that it is alive.

If the name node does not receive a heartbeat from a data node within a certain

time, that data node is declared dead. The default

heartbeat interval is 3 seconds.
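A minimal sketch of heartbeat-based liveness tracking follows. The 3-second interval comes from the text above; the 30-second timeout and the `NameNodeMonitor` class are assumptions made for this illustration and do not reflect Hadoop's actual configuration or code.

```python
HEARTBEAT_INTERVAL_S = 3  # default interval stated above
DEAD_TIMEOUT_S = 30       # hypothetical threshold for this sketch


class NameNodeMonitor:
    def __init__(self, timeout_s=DEAD_TIMEOUT_S):
        self.timeout_s = timeout_s
        self.last_seen = {}  # datanode id -> time of its last heartbeat

    def heartbeat(self, datanode, now):
        """Record that `datanode` reported in at time `now` (seconds)."""
        self.last_seen[datanode] = now

    def dead_nodes(self, now):
        """Datanodes whose last heartbeat is older than the timeout."""
        return sorted(dn for dn, t in self.last_seen.items() if now - t > self.timeout_s)


monitor = NameNodeMonitor()
for t in range(0, 60, HEARTBEAT_INTERVAL_S):
    monitor.heartbeat("dn1", t)      # dn1 reports every interval
    if t < 12:
        monitor.heartbeat("dn2", t)  # dn2 falls silent after t = 9

dead = monitor.dead_nodes(now=60)
print(dead)
```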

22. How does the divide-and-conquer strategy relate to the MapReduce paradigm? AN (April/May 2018)

In MapReduce, you divide the work up serially, execute work packets in parallel, and tag

the results to indicate which results go with which other results. The merging is then serial for all

the results with the same tag, but can be executed in parallel for results that have different tags. In

most previous systems, the merge step became a bottleneck for all but the most truly trivial tasks.

With MapReduce it can still be if the nature of the tasks requires that all merging be done serially.

If, however, the task allows some degree of parallel merging of results, then MapReduce

gives a simple way to take advantage of that possibility.
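The divide, tag and merge steps described above can be sketched with a word count, where the word itself is the tag. This is a single-process illustration in plain Python, not Hadoop code; on a cluster, the map calls and the per-key reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group values by key so that results with the same tag end up together."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Merge all values that share a key."""
    return key, sum(values)


chunks = ["the grid and the cloud", "the cloud scales"]  # the work, already divided
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Because each key is reduced independently, the merge for `the` does not have to wait for the merge for `cloud`, which is the parallel merging the answer refers to.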

23. Brief out the main components of Globus toolkit. U (April/May 2018)

Common runtime components

• Security

• Data management

• Information services

• Execution management

24. What is distributed file system? (Nov/Dec 2018)

Distributed File System (DFS) is a set of client and server services that allow an

organization using Microsoft Windows servers to organize many distributed SMB file

shares into a distributed file system. DFS has two components to its service: Location

transparency (via the namespace component) and Redundancy.

25. How does the MapReduce framework execute user jobs? (Nov/Dec 2018)

The user runs a MapReduce program on the client node which instantiates a Job client

object. Next, the Job client submits the job to the Job Tracker. Then the job tracker creates a

set of map and reduce tasks which get sent to the appropriate task trackers.

PART B

1. Explain MapReduce in detail. U

2. Explain the main components and programming model in detail. U

3. What is the Hadoop framework? Explain in detail. U

4. Write short notes on HDFS. R

5. What is a file system? Explain in detail the dataflow of file read and file write. U

6. Draw and explain the Globus Toolkit architecture. U (Nov/Dec 2016)

7. Give a detailed note on Hadoop framework. R (Nov/Dec 2016)

8. Discuss MAPREDUCE with suitable diagrams. U (April/May 2017)

9. Elaborate HDFS concepts with suitable illustrations. AN(April/May 2017)

10. Illustrate dataflow in HDFS during file read/write operation with suitable diagrams.

U (Nov/Dec 2017)

11. What is GT4? Describe in detail the components of GT4 with a suitable diagram.

U (Nov/Dec 2017)

12. List the characteristics of Globus Toolkit. With a neat sketch describe the

architecture of Globus GT4 and the services offered. R (April/May 2018)

13. With an illustration, emphasize the significance of MapReduce paradigm in Hadoop

framework. List out the assumptions and goals set in HDFS architecture for processing

the data based on divide-and-conquer strategy. AN (April/May 2018)

14. Explain the main components and programming model of Globus Toolkit. (13) U (Nov/Dec 2018)

15. Explain the Hadoop distributed file system architecture with a diagram. (13) U (Nov/Dec 2018)

UNIT V - SECURITY

Trust models for Grid security environment – Authentication and Authorization methods – Grid

security infrastructure – Cloud Infrastructure security: network, host and application level –

aspects of data security, provider data and its security, Identity and access management

architecture, IAM practices in the cloud, SaaS, PaaS, IaaS availability in the cloud, Key privacy

issues in the cloud.

Part A

1. Illustrate Grid security infrastructure. U

2. What are the security challenges faced in a Grid environment? R

Security challenges faced in a Grid environment can be grouped into three categories:

(a) integration solutions where existing services need to be used, and interfaces should be

abstracted to provide an extensible architecture.

(b) interoperability solutions so that services hosted in different virtual organizations that

have different security mechanisms and policies will be able to invoke each other.

(c) solutions to define, manage and enforce trust policies within a dynamic Grid

environment.

3. Differentiate between Authentication and Authorization. AN

Authentication:

Authentication means validating whether the application accepts the right users and rejects invalid /

anonymous users.

Ex: 1) Logging in to internet banking with valid login credentials (the application has to accept the

user).

2) Logging in to internet banking with invalid login credentials (the application has to reject

the user).

Authorization:

Authorization means validating whether the application grants the right permissions to the right

users.

Ex: 1) Agent: After logging in to the IRCTC website, an agent has permissions such as booking more

than 5 tickets, cancelling tickets, and editing information.

2) User: After logging in to the IRCTC website, a user cannot book more than

5 tickets and can only book or cancel tickets, but cannot edit information as an

agent can.
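The two checks can be separated in code along the lines of the ticket-booking example. This is a minimal sketch: the user store, passwords, roles and limits are all hypothetical, and a real system would store hashed passwords rather than compare plain text.

```python
# Hypothetical user store and role permissions, for illustration only.
USERS = {"agent1": ("s3cret", "agent"), "user1": ("pass123", "user")}
PERMISSIONS = {
    "agent": {"book", "cancel", "edit"},
    "user": {"book", "cancel"},
}
MAX_TICKETS = {"agent": None, "user": 5}  # None means no limit in this sketch


def authenticate(username, password):
    """Authentication: is this a valid user? Returns the user's role, or None."""
    record = USERS.get(username)
    if record is None or record[0] != password:
        return None
    return record[1]


def authorize(role, action, tickets=0):
    """Authorization: may a user with this role perform this action?"""
    if action not in PERMISSIONS.get(role, set()):
        return False
    limit = MAX_TICKETS.get(role)
    return limit is None or tickets <= limit


role = authenticate("user1", "pass123")
print(role, authorize(role, "book", tickets=3), authorize(role, "edit"))
```

Authentication runs first and yields an identity; authorization then decides what that identity is allowed to do.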

4. What is identity and access management in a cloud environment? U(Nov/Dec 2018)

An identity access management (IAM) system is a framework for business processes that

facilitates the management of electronic identities. The framework includes the technology

needed to support identity management.

5. List out the Cloud security issues.R

Cloud security issues fall primarily into three areas:

Data Residency: Many companies face legislation by their country of origin or the local

country that the business entity is operating in, requiring certain types of data to be kept

within defined geographic borders. There are specific regulations that must be followed,

centered around data access, management and control.

Data Privacy: Business data often needs to be guarded and protected more stringently

than non-sensitive data. The enterprise is responsible for any breaches to data and must be

able to ensure strict cloud security in order to protect sensitive information.

Industry & Regulation Compliance: Organizations often have access to and are

responsible for data that is highly regulated and restricted. Many industry-specific regulations

such as GLBA, CJIS, ITAR and PCI DSS, require an enterprise to follow defined standards

to safeguard private and business data and to comply with applicable laws.

6. What is Software as a Service (SaaS) ? U

Software as a Service (SaaS) is a software distribution model in which

applications are hosted by a vendor or service provider and made available to

customers over a network, typically the Internet.

7. What is Platform as a Service (PaaS)? R

Platform as a service (PaaS) is a category of cloud computing services that provides a

platform allowing customers to develop, run, and manage applications without the

complexity of building and maintaining the infrastructure typically associated with

developing and launching an app.

8. What is Infrastructure as a Service (IaaS)? R

Infrastructure as a Service (IaaS) is a form of cloud computing that provides virtualized

computing resources over the Internet. IaaS is one of three main categories of cloud

computing services, alongside Software as a Service (SaaS) and Platform as a Service

(PaaS).

9. Define Cloud computing security. U

Cloud computing security or, more simply, cloud security is an evolving sub-domain

of computer security, network security, and, more broadly, information security. It refers to

a broad set of policies, technologies, and controls deployed to protect data, applications, and

the associated infrastructure of cloud computing.

10. What are Cloud Security Controls? R

1. Deterrent controls

These controls are intended to reduce attacks on a cloud system. Much like a

warning sign on a fence or a property, deterrent controls typically reduce the threat level

by informing potential attackers that there will be adverse consequences for them if they

proceed. (Some consider them a subset of preventive controls.)

2. Preventive controls

Preventive controls strengthen the system against incidents, generally by reducing

if not actually eliminating vulnerabilities. Strong authentication of cloud users, for

instance, makes it less likely that unauthorized users can access cloud systems, and more

likely that cloud users are positively identified.

3. Detective controls

Detective controls are intended to detect and react appropriately to any incidents

that occur. In the event of an attack, a detective control will signal the preventative or

corrective controls to address the issue. System and network security monitoring,

including intrusion detection and prevention arrangements, are typically employed to

detect attacks on cloud systems and the supporting communications infrastructure.

4. Corrective controls

Corrective controls reduce the consequences of an incident, normally by limiting

the damage. They come into effect during or after an incident. Restoring system backups

in order to rebuild a compromised system is an example of a corrective control.

11. What are aspects of data security? R

1. Data Confidentiality

Data confidentiality is the property that data contents are not made available or

disclosed to illegal users. Outsourced data is stored in a cloud and out of the owners'

direct control. Only authorized users can access the sensitive data while others, including

CSPs, should not gain any information of the data.

2. Data Access Controllability

Access controllability means that a data owner can perform the selective restriction of

access to his data outsourced to cloud. Legal users can be authorized by the owner to access

the data, while others can not access it without permissions.

3. Data Integrity

Data integrity demands maintaining and assuring the accuracy and completeness of data.

A data owner always expects that his data in a cloud can be stored correctly and

trustworthily.

12. What are the advantages and disadvantages of PaaS? R

The advantages to PaaS are primarily that it allows for higher-level programming

with dramatically reduced complexity; the overall development of the application can be

more effective, as it has built-in infrastructure; and maintenance and enhancement of the

application is easier. It can also be useful in situations where multiple developers are

working on a single project involving parties who are not located nearby.

One disadvantage of PaaS offerings is that developers may not be able to use a full

range of conventional tools (e.g. relational databases with unrestricted joins). Another

possible disadvantage is being locked in to a certain platform. However, most PaaSes are

relatively lock-in free.

13. What is Grid Security Infrastructure (GSI)? U

The Grid Security Infrastructure (GSI), formerly called the Globus Security

Infrastructure, is a specification for secret, tamper-proof, delegable communication

between software in a grid computing environment. Secure, authenticable communication

is enabled using asymmetric encryption.

14. What are Security Mechanisms in Grid Security Infrastructure (GSI)? R

Transport Layer Security (TLS) can be used to protect the communication channel

from eavesdropping or man-in-the-middle attacks.

Message-Level Security can be used (although currently it is much slower than TLS).

15. What are the derivatives of grid computing? R

There are 8 derivatives of grid computing.

a)Compute grid

b)Data grid

c)Science grid

d)Access grid

e)Knowledge grid

f)Cluster grid

g)Terra grid

h)Commodity grid

16. Mention the importance of Transport Level Security. AN (Nov/Dec 2016)

Transport level security is based on Secure Sockets Layer (SSL) or Transport Layer

Security (TLS) that runs beneath HTTP. SSL and TLS provide security features including

authentication, data protection, and cryptographic token support for secure HTTP

connections.
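As an illustration of the authentication and data protection features mentioned above, here is a minimal sketch using Python's standard `ssl` module. The function name `make_client_context` and the choice of TLS 1.2 as the minimum version are assumptions made for this example, not part of the original answer.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context (illustrative helper, hypothetical name)."""
    # create_default_context() enables certificate verification and
    # hostname checking, which is how the server is authenticated.
    ctx = ssl.create_default_context()
    # Assumption for this sketch: refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
print(ctx.verify_mode, ctx.check_hostname)
```

A socket wrapped with this context would encrypt the HTTP traffic that runs above it and would reject a server whose certificate cannot be verified.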

17. Discuss the application and use of identity and access management. AN (Nov/Dec 2016)

An identity access management (IAM) system is a framework for business processes that

facilitates the management of electronic identities. The framework includes the technology

needed to support identity management. The system is compatible with Microsoft SQL and

Oracle database systems.

18. What are the various challenges in building the trust environment? U (April/May 2017)

The first challenge is integration with existing systems and technologies. The second

challenge is interoperability with different “hosting environments”. The third challenge is to

construct trust relationships among interacting hosting environments.

19. Write a brief note on the security requirements of grid. U (April/May 2017)

Authentication

Authorization

Assurance/Accreditation

Accounting

Audit

20. List any four host security threats in public IaaS. (Nov/Dec 2017)

Host security at IaaS Level

a. Virtualization software security

i. Hypervisor security

ii. Threats: Blue Pill attack on the hypervisor

b. Customer guest OS or virtual server security

i. Attacks on the guest OS: e.g., stealing keys used to access and manage the

hosts

21. Identify the trust model based on a site’s trustworthiness. U (Nov/Dec 2017)

In a reputation-based model, jobs are sent to a resource site only when the site is

trustworthy enough to meet users' demands. The site trustworthiness is usually calculated from the

following information: the defense capability, direct reputation, and recommendation

trust. The defense capability refers to the site's ability to protect itself from danger.
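The answer above names three inputs to site trustworthiness but does not say how they are combined. The sketch below assumes a simple weighted sum and a fixed acceptance threshold purely for illustration; the weights, the threshold, the [0, 1] score range, and the names `SiteTrust`, `trust_index`, and `is_trustworthy` are all hypothetical and are not taken from the source.

```python
from dataclasses import dataclass


@dataclass
class SiteTrust:
    """Scores for one resource site, each assumed to lie in [0, 1]."""
    defense_capability: float
    direct_reputation: float
    recommendation_trust: float


def trust_index(site: SiteTrust, weights=(0.4, 0.3, 0.3)) -> float:
    """Combine the three scores with a weighted sum (an assumed aggregation rule)."""
    w_defense, w_direct, w_recommend = weights
    return (w_defense * site.defense_capability
            + w_direct * site.direct_reputation
            + w_recommend * site.recommendation_trust)


def is_trustworthy(site: SiteTrust, threshold: float = 0.7) -> bool:
    """Dispatch jobs to a site only if its index reaches the assumed threshold."""
    return trust_index(site) >= threshold


strong_site = SiteTrust(0.9, 0.8, 0.6)
weak_site = SiteTrust(0.5, 0.5, 0.5)
print(is_trustworthy(strong_site), is_trustworthy(weak_site))
```

A scheduler following this model would send jobs to `strong_site` and skip `weak_site`. Real systems may use a different aggregation method than the weighted sum assumed here.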

22. On what basis are trust models set for a grid environment? AN (April/May 2018)

A trust relationship must be established before the entities in the grid interoperate

with one another. The entities have to choose other entities that can meet the

requirements of trust to coordinate with. The entities that submit requests should believe

the resource providers will try to process their requests and return the results with a

specified QoS.

To create the proper trust relationship between grid entities, two kinds of trust

models are often used. One is the PKI-based model, which mainly exploits the PKI to

authenticate and authorize entities. The other is the reputation-based model.

23. State how the CIA triad plays a vital role in managing cloud security. AN (April/May 2018)

There are three crucial components that make up the elements of the CIA triad, the

widely-used model designed to guide IT security. Those components are confidentiality,

integrity, and availability.
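To make one component of the triad concrete, the following sketch shows an integrity check with a keyed hash (HMAC-SHA256) from Python's standard library. It covers integrity only, not confidentiality or availability. The key, the messages, and the function names `sign` and `verify` are assumptions made for this example.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return an HMAC-SHA256 tag for the message as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"demo-shared-key"          # hypothetical key, for illustration only
original = b"stored cloud object"  # hypothetical data
tag = sign(key, original)

print(verify(key, original, tag))                # unmodified data
print(verify(key, b"tampered cloud object", tag))  # modified data
```

Any change to the stored data produces a different tag, so the second check fails and the tampering is detected.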

24. Write a short note on Kerberos. R (Nov/Dec 2018)

Kerberos (protocol) is a computer network authentication protocol that works on

the basis of tickets to allow nodes communicating over a non-secure network to prove

their identity to one another in a secure manner.

PART B

1. What are Security Mechanisms in Grid Security Infrastructure (GSI)? Explain. U

2. Explain in detail Cloud computing security. U

3. Write short notes on: R

SaaS

PaaS

IaaS

4. Explain in detail Authentication and Authorization. U

5. Explain trust models for grid security environment. U (Nov/Dec 2016)

6. Write in detail about cloud security infrastructure. R (Nov/Dec 2016)

7. (i) Analyze in detail about the IAM Standards, Protocols, and Specifications for Consumers. (8) AN

(ii) Compare the Enterprise and Consumer Authentication Standards and Protocols. (8) AN

8. (i) Analyze the infrastructure security of cloud at host level. (8) AN

(ii) Explain in detail about virtual server security of cloud. (8) U

9. (i) Tabulate in detail about the Comparison of SPI maturity models in the context of IAM. (8) AN

(ii) Tabulate the Comparison of maturity levels for IAM components in detail. (8) AN

10. Write a detailed note on identity and access management architecture. U (April/May 2017)

11. Explain grid security infrastructure. U (April/May 2017)

12. What is the purpose of GSI? Describe the functionality of various layers in GSI. U (Nov/Dec 2017)

13. What is the purpose of IAM? Describe its functional architecture with an illustration.

U (Nov/Dec 2017)

14. "In today's world, infrastructure security and data security are highly challenging at the

network, host and application levels". Justify and explain the several ways of

protecting the data in transit and at rest. AN (April/May 2018)

15. Explain the baseline Identity and Access Management (IAM) factors to be practiced by

the stakeholders of cloud services and the common key privacy issues likely to happen in

the environment. AN (April/May 2018)

16. Define authentication and authorization. Outline authentication and authorization in grids

with relevant examples. (13) U (Nov/Dec 2018)

17. Describe Infrastructure-as-a-Service (IaaS), Platform-as-a-Service (PaaS) and Software-as-a-Service (SaaS)

with an example. (13) U (Nov/Dec 2018)
