1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June...

33
1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011

Transcript of 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June...

Page 1: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.1

Introduction to Grid Computing

© 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011

Page 2: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.2

“The grid virtualizes heterogeneous geographically disperse resources” from "Introduction to Grid Computing with Globus," IBM

Redbooks

• Using geographically distributed and interconnected computers together for computing and for resource sharing.

Grid Computing

Page 3: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.3

Need to harness computers

Original driving force behind Grid computing same as behind the early development of networks that became the Internet:

– Connecting computers at distributed sites for high performance computing.

Page 4: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1a-1.4

However, Grid computing is about collaborating and resource sharing as much as it is about high performance computing.

Page 5: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1a-1.5

Virtual Organizations

Grid computing offerspotential of virtual organizations:

– groups of people, both geographically and organizationally distributed, working together on a problem, sharing computers AND other resources such as databases and experimental equipment.

Page 6: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Different organizations can supply resources and personnel.

Concept has many benefits, including:

•Problems that could not be solved previously for humanity because of limited computing resources can now be tackled.

Examples

• Understanding the human genome • Searching for new drugs … .

Continued.

1-1.6

Page 7: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

• Users can have access to far greater computing resources and expertise than available locally.

• Inter-disciplinary teams can be formed across different institutions and organizations to tackle problems that require expertise of multiple disciplines.

• Specialized localized experimental equipment can be accessed remotely and collectively.

• Large collective databases can be created to hold vast amounts of data.

• Unused compute cycles can be harnessed at remote sites, achieving more efficient use of computers.

1-1.7

Page 8: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Crosses multiple administrative domains.

• Another hallmark of larger Grid computing projects.

• Resources being shared owned either by members of virtual organization or donated by others.

• Introduces challenging technical and social-political challenges.

• Requires true collaboration.

1-1.8

Page 9: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

• Some key features we regard as indicative of Grid computing:

– Shared multi-owner computing resources

– Uses Grid computing software, with security and cross-management mechanisms in place

– Tools to bring together geographically distributed computers owned by others.

1-1.9

Page 10: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.10

Shared Resources

Can share much more than just computers:

• Storage

• Sensors for experiments at particular sites

• Application Software

• Databases

• Network capacity, …

Page 11: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

History of distributed computingCertainly one can go back a long way to trace the history of distributed computing.

Types of distributed computing existed in 1960s.

Many people interested in connecting computers together for high performance computing.

From connecting processors/computers together locally that began in earnest in the 1960s and 1970s, distributed computing now extends to connecting computers that are geographically distant - Grid computing.

1-1.11

Page 12: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Distributed computing technologies that underpin Grid computing developed concurrently and rely upon each other.

Three concurrent interrelated paths:

• Networks• Computing platforms• Software techniques

1-1.12

Page 13: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.13

Networks1960s - Development of packet switched networks.1969 - ARPNET network became operational.

4 nodes, Univ. of California at Los Angeles, Stanford Research Institute, Univ. of California at Santa Barbara, and Univ. of Utah. Design speed of 50 Kbits/sec.1974 - TCP (Transmission Control Protocol) 1978 - TCP/IP (Transmission Control Protocol/Internet Protocol).

TCP a protocol for reliable communicationIP for network routing. IP addresses identify hosts on the

Internet Ports identify end points (processes) for communication purposes.Early 1970s - Ethernet for interconnecting computers on local networks. Early 1980s - Internet. Uses the TCP/IP protocol.1990s - Internet developed into World-Wide Web.

Browser and HTML markup language introduced.

Page 14: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.14

Computing Platforms

1960 onwards - Recognized that increased speed could potentially be obtained by having more than one processor inside a single computer system

Parallel computer coined to describe such systems.

1970s and 1980s - many parallel computer projects especially with advent of low cost microprocessors.

1990s - cluster computing, a group of computers inter connected through a network switch to form a computing platform

Commodity computers (PCs) provided cost-effective solution.

Page 15: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.15

Typical cluster

computing configuration

Programming clusters: Message passing programming -- Messages between processes specified by programmer using message-passing routines:

Late 1980s - early 1990s - PVM (Parallel Virtual Machine)

Late 1990s - MPI (Message Passing Interface)

Page 16: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Late 1980’s onwards – Condor

To harness “unused” cycles of networked computers for high performance computing.

A collection of computers could be given over to remote access automatically when not being used locally.

Widely used as a job scheduler for clusters in addition to its original purpose of using laboratory computers collectively.

We will consider Condor in the light of Grid computing.

1-1.16

Page 17: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Software TechniquesMid 1980s - Remote procedure call (RPC) for invoking a procedure on a remote computer.

Service registry - introduced with RPC to locate remote services.

1990s - Object-oriented versions of RPC:

CORBA (Common Request Broker Architecture)

Java Method Invocation (RMI).

2000 - Web service

Provide remote actions as RPC but invoked through standard protocols and Internet addressing.

Use XML (eXtensible Markup Language), also introduced in 2000.

Web services and XML adopted into Grid computing soon after their introduction

1-1.17

Page 18: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.18

Grid Computing History

• Began in mid 1990s with experiments using computers at geographically dispersed sites.

• Seminal experiment – “I-way” experiment at 1995 Supercomputing conference (SC’95), using 17 sites across US running:

– 60+ applications.– Existing networks (10 networks).

Page 19: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

http://globus.org/

Middleware software Grid computing toolkit.

Led by Ian Foster, a co-developer of I-Way demo and founder of Grid computing concept.

Evolved through several versions although basic structural components remained essentially same.

Globus Toolkit

We will describe Globus in detail later.

Page 20: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Other grid computing middleware software

Although Globus widely adopted and basis of our course, there are other software infrastructure projects. Europe:

1990s UNICORE (UNiform Interface to COmputing REsources)European grid computing project.Initially funded by German Ministry for Education and Research. Continued with other European funding.Basis of several European efforts and elsewhere. Many similarities to Globus. Still active. http://www.unicore.eu/index.php

2002-2010 EGEE, Enabling Grids for E-Science, continued as EGI, European Grid Infrastructure

1-1.20

Page 21: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

gLite

Lightweight Middleware forGrid Computing

A framework for building grid applicationspart of EGEE project, continued in EMI.

http://glite.cern.ch/

Page 22: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

http://www.ige-project.eu/

Page 23: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.23

Key concepts in the history of Grid computing

Fig. 1.2

Page 24: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.24

Applications• Originally e-Science applications

– Computational intensive• Traditional high performance computing

addressing large problems• Not necessarily one big problem but a

problem that has to be solved repeatedly with different parameters.

– Data intensive• Computational but emphasis on large

amounts of data to store and process

– Experimental collaborative projects

Page 25: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

1-1.25

• Now also e-Business applications–To improve business models and

practices.

–Sharing corporate computing resources and databases

–On-demand Grid computing …

–Indirectly led to cloud computing.

Page 26: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Grid Computing verse Cluster Computing

• Important not to think of Grid computing simply as large cluster because potential and challenges different.

• Courses on Grid computing and on cluster computing are quite different.

1-1.26

Page 27: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Cluster computing course• One learns about :

– Message passing programming using tools such as MPI, and

– Shared memory programming using threads and OpenMP, given that most computers in a cluster today now multi-core shared memory systems.

– Parallel algorithms (lots)

• Network security is not a big issue. – Usually an ssh connection to front node of cluster

sufficient. – User logging onto a single compute resource.

• Computers connected together locally under one administrative domain

1-1.27

Page 28: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Grid computing course• Learn about running jobs of remote machines,

scheduling jobs and distributed workflow

• Learn in detail underlying Grid infrastructure

• How Internet technologies applied to Grid computing

• Grid computing software and standards

• Security is an issue.

1-1.28

Page 29: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Grid Computing verse Cluster Computing

• Of course, there are things in common

• Both courses hands-on with programming experiences.

• Both use multiple computers

• Both require job scheduler to place jobs.

1-1.29

Page 30: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Cloud computing

• Lot of hype on Cloud computing at the moment.

• Business model in which services provided on servers that can be accessed through Internet.

• Lineage of cloud computing can be traced back to on-demand Grid computing in the early 2000s.

1-1.30

Page 31: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Cloud computing using virtualized resources

1-1.31Fig. 1.3

Page 32: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

• Common thread between Grid computing and cloud computing is use of Internet to access resources.

• Cloud computing driven by widespread access that Internet provides and Internet technologies.

• However cloud computing quite distinct from original purpose of Grid computing.

1-1.32

Page 33: 1-1.1 Introduction to Grid Computing © 2011 B. Wilkinson/Clayton Ferner. Modification date: June 20, 2011.

Grid Computing verse Cloud Computing

• Whereas Grid computing focuses on collaborative and distributed shared resources,

Cloud computing concentrates upon placing services for users to pay to use.

• Technology for cloud computing emphases:– use of software as a service (SaaS)– virtualization (process of separating particular

user’s software environment from underlying hardware).

1-1.33