Clouds, Interoperation and PRAGMA
Philip M. Papadopoulos, Ph.D.
University of California, San Diego
San Diego Supercomputer Center / Calit2
Remember the Grid Promise?
"The Grid is an emerging infrastructure that will fundamentally change the way we think about - and use - computing. The word Grid is used by analogy with the electric power grid, which provides pervasive access to electricity and has had a dramatic impact on human capabilities and society."
The Grid: Blueprint for a New Computing Infrastructure, Foster & Kesselman. From the preface of the first edition, Aug 1998.
Some Things that Happened on the Way to Cloud Computing
• Web Version 1.0 (1995)
• 1 cluster on Top 500 (June 1998)
• Dot Com Bust (2000)
• Clusters > 50% of Top 500 (June 2004)
• Web Version 2.0 (2004)
• Cloud Computing (EC2 beta, 2006)
• Clusters > 80% of Top 500 (Nov. 2008)
[Gartner Emerging Technologies hype cycle charts: 2005, 2008, 2010]
What is Fundamentally Different about Cloud Computing vs. Grid Computing?
• Cloud computing – you adapt the infrastructure to your application
  – Should be less time consuming
• Grid computing – you adapt your application to the infrastructure
  – Generally is more time consuming
• Cloud computing has a financial model that seems to work; the Grid never had a financial model
  – The Grid "barter" economy was only valid for provider-to-provider trade. Pure consumers had no bargaining power.
IaaS – Of Most Interest to PRAGMA
Run (virtual) computers to solve your problem, using your software.
[Slide shows IaaS providers: Amazon EC2, Sun, 3Tera, IBM, Rackspace, GoGrid]
Cloud Hype
• "Others do all the hard work for you"
• "You never have to manage hardware again"
• "It's always more efficient to outsource"
• "You can have a cluster in 8 clicks of the mouse"
• "It's infinitely scalable"
• …
Amazon Web Services
• Amazon EC2 – the catalytic event in 2006 that really started cloud computing
• Web Services access for:
  – Compute (EC2)
  – Storage (S3, EBS)
  – Messaging (SQS)
  – Monitoring (CloudWatch)
  – + 20 (!) more services
• "I thought this was supposed to be simple"
Basic EC2
[Diagram: Amazon Machine Images (AMIs) held in Amazon cloud storage (S3 – Simple Storage Service; EBS – Elastic Block Store) and run in the Elastic Compute Cloud (EC2)]
Copy AMI & Boot
• AMIs are copied from S3 and booted in EC2 to create a "running instance"
• When an instance is shut down, all changes are lost
  – Can save as a new AMI
Basic EC2
• An AMI (Amazon Machine Image) is copied from S3 to EC2 for booting
  – Can boot multiple copies of an AMI as a "group"
  – Not a cluster: all running instances are independent
  – Cluster instances are about $2/hr (8 cores), or ~$17K/year
• If you make changes to your AMI while running and want them saved
  – Must repack to make a new AMI
  – Or use Elastic Block Store (EBS) on a per-instance basis
Some Challenges in EC2
1. Defining the contents of your virtual machine (software stack)
2. Preparing, packing, and uploading the image
3. Understanding limitations and the execution model
4. Debugging when something goes wrong
5. Remembering to turn off your VM
   – The smallest 64-bit VM is ~$250/month running 24x7
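The cost figures above are simple rate-times-hours arithmetic. A sketch: the $2/hr cluster-instance rate is from the slide, while the $0.34/hr small-instance rate is an assumption chosen to be consistent with the ~$250/month figure (actual 2011 pricing varied).

```python
# Cost arithmetic behind the figures above. The $2/hr rate is quoted
# on the slide; the $0.34/hr rate is an ASSUMPTION consistent with
# the ~$250/month figure, not official Amazon pricing.
HOURS_PER_YEAR = 24 * 365    # 8760
HOURS_PER_MONTH = 24 * 30    # 720

cluster_rate = 2.00          # $/hr for an 8-core cluster instance
cluster_yearly = cluster_rate * HOURS_PER_YEAR     # $17,520 -> "~$17K/year"

small_rate = 0.34            # $/hr (assumed) for the smallest 64-bit VM
small_monthly = small_rate * HOURS_PER_MONTH       # ~$245 -> "~$250/month"

print(f"Cluster instance, 24x7: ${cluster_yearly:,.0f}/year")
print(f"Smallest 64-bit VM, 24x7: ${small_monthly:,.0f}/month")
```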
One Problem: Too Many Choices
[Chart: counts of public Amazon AMIs on four dates (7-Sep-10, 30-Dec-10, 20-Apr-11, 28-Nov-11), broken down by virtualization type (paravirtual vs. HVM), root storage (instance store/S3 vs. EBS), operating system (Ubuntu, Windows, CentOS, Fedora, Debian, Arch Linux, RedHat/RHEL, OEL, other/unclassified), and application ("Cloud", Oracle, Zeus, Hadoop, NCBI, other/unclassified). Totals ranged from roughly 5,500 to 8,200 public AMIs across the period.]
Reality for Scientific Applications
• The complete software stack is critical to proper operation
  – Libraries
  – Compiler/interpreter versions
  – File system locations
  – Kernel
• This is the fundamental reason that the Grid is hard: my cluster is not the same environment as your cluster
  – Electrons are universal; software packages are not
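One way to make the stack-mismatch point concrete is to capture a manifest of the environment a code actually sees; two clusters are interchangeable only if their manifests match. A minimal sketch (the fields are chosen for illustration, mirroring the list above):

```python
# Capture the parts of the software stack the slide lists as critical:
# interpreter version, kernel, OS, and architecture.
import platform
import sys

def stack_manifest():
    """Return a dict describing this host's software environment."""
    return {
        "interpreter": sys.version.split()[0],   # e.g. "3.11.4"
        "kernel": platform.release(),            # e.g. "5.15.0-91-generic"
        "os": platform.system(),                 # e.g. "Linux"
        "machine": platform.machine(),           # e.g. "x86_64"
    }

# In practice two clusters' manifests rarely agree, which is why
# "my cluster is not the same environment as your cluster".
print(stack_manifest())
```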
People and Science are Distributed
• PRAGMA – Pacific Rim Applications and Grid Middleware Assembly
  – Scientists are from different countries
  – Data is distributed
• Cyberinfrastructure to enable collaboration
• When scientists are using the same software on the same data
  – Infrastructure is no longer in the way
  – It needs to be their software (not my software)
PRAGMA's Distributed Infrastructure: Grids/Clouds
26 institutions in 17 countries/regions; 23 compute sites, 10 VM sites:
UZH (Switzerland); NECTEC, KU (Thailand); UoHyd (India); MIMOS, USM (Malaysia); HKU (Hong Kong); ASGC, NCHC (Taiwan); HCMUT, HUT, IOIT-Hanoi, IOIT-HCM (Vietnam); AIST, OsakaU, UTsukuba (Japan); MU (Australia); KISTI, KMU (Korea); JLU, CNIC, LZU (China); SDSC, IndianaU (USA); UChile (Chile); CeNAT-ITCR (Costa Rica); BESTGrid (New Zealand); ASTI (Philippines); UValle (Colombia)
Our Goals
• Enable specialized applications to run easily on distributed resources
• Investigate virtualization as a practical mechanism
  – Multiple VM infrastructures (Xen, KVM, OpenNebula, Rocks, WebOS, EC2)
• Use GeoGrid applications as a driver of the process
GeoGrid Applications as Driver
I am not part of GeoGrid, but PRAGMA members are!
Deploy Three Different Software Stacks on the PRAGMA Cloud
• QuiQuake
  – Simulator of the ground-motion map when an earthquake occurs
  – Invoked when a big earthquake occurs
• HotSpot
  – Finds high-temperature areas from satellite data
  – Runs on a daily basis (when ASTER data arrives from NASA)
• WMS server
  – Provides satellite images via the WMS protocol
  – Runs on a daily basis, but the number of requests is not stable
Source: Dr. Yoshio Tanaka, AIST, Japan
Example of Current Configuration
[Diagram: fixed sets of nodes assigned to the WMS Server, QuiQuake, and HotSpot]
• Nodes are fixed to each application
• Should be more adaptive and elastic according to the requirements
Source: Dr. Yoshio Tanaka, AIST, Japan
1st Step: Adaptive Resource Allocation in a Single System
[Diagram: node allocations across the WMS Server, QuiQuake, and HotSpot shift when a big earthquake strikes or WMS requests increase]
• Change nodes for each application according to the situation and requirements
Source: Dr. Yoshio Tanaka, AIST, Japan
2nd Step: Extend to Distributed Environments
[Diagram: satellite data (TDRS, Terra/ASTER, ALOS/PALSAR) from NASA, ERSDAC, and JAXA feeding distributed sites: AIST, UCSD, OCC, NCHC]
Source: Dr. Yoshio Tanaka, AIST, Japan
What are the Essential Steps?
1. AIST/GeoGrid creates their VM image
2. Image is made available in "centralized" storage
3. PRAGMA sites copy GeoGrid images to local clouds
   a. Assign IP addresses
   b. What happens if the image is in KVM format and the site runs Xen?
4. Modified images are booted
5. GeoGrid infrastructure is now ready to use
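The per-site portion of the steps above can be sketched as a small decision procedure: copy the image, assign an address, convert only if the image format and the site's hypervisor disagree, then boot. All function and field names here are hypothetical, for illustration only:

```python
# Sketch of steps 3-4 above (names are invented; this is not the
# actual PRAGMA tooling).
def deploy_image(image, site):
    """Copy a GeoGrid image to a local cloud and prepare it to boot."""
    local = f"{site['staging_dir']}/{image['name']}.img"
    actions = [f"copy {image['path']} -> {local}"]        # step 3
    actions.append(f"assign ip {site['next_free_ip']}")   # step 3a
    if image["format"] != site["hypervisor"]:             # step 3b
        actions.append(f"convert {image['format']} -> {site['hypervisor']}")
    actions.append("boot")                                # step 4
    return actions

site = {"staging_dir": "/state/partition1", "next_free_ip": "10.1.255.250",
        "hypervisor": "xen"}
image = {"name": "geogrid", "path": "gfarm:/AIST/geogrid.img.gz",
         "format": "kvm"}
print(deploy_image(image, site))
```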
Basic Operation
• VM image is authored locally, then uploaded into a VM-image repository (Gfarm, from U. Tsukuba)
• At local sites:
  – Image is copied from the repository
  – The local copy is modified (automatically) to run on the specific infrastructure
  – The local copy is booted
• For running in EC2, adapted methods were automated in Rocks to modify, bundle, and upload after a local copy to UCSD
VM Deployment Phase I – Manual
http://goc.pragma-grid.net/mediawiki-1.16.2/index.php/Bloss%2BGeoGrid
[Diagram: the Geogrid+Bloss image is downloaded from a website to a VM development server, then placed on the frontend's vm-containers (vm-container-0-0, 0-1, 0-2, …)]
Manual steps on the frontend:
# rocks add host vm container=…
# rocks set host interface subnet …
# rocks set host interface ip …
# rocks list host interface …
# rocks list host vm … showdisks=yes
# cd /state/partition1/xen/disks
# wget http://www.apgrid.org/frontend...
# gunzip geobloss.hda.gz
# lomount -diskimage geobloss.hda -partition1 /media
# vi /media/boot/grub/grub.conf
# vi /media/etc/sysconfig/network-scripts/ifc…
# vi /media/etc/sysconfig/network
# vi /media/etc/resolv.conf
# vi /etc/hosts
# vi /etc/auto.home
# vi /media/root/.ssh/authorized_keys
# umount /media
# rocks set host boot action=os …
# rocks start host vm geobloss…
What We Learned in the Manual Approach
AIST, UCSD, and NCHC met in Taiwan for 1.5 days to test, in Feb 2011
• Much faster than Grid deployment of the same infrastructure
• It is not too difficult to modify a Xen image and run it under KVM
• Nearly all of the steps could be automated
• Need a better method than "put the image on a website" for sharing
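Part of why the Xen-to-KVM modification is manageable: much of it is mechanical rewriting of paravirtual device names inside the image's config files. A sketch of that one step, assuming Xen paravirtual disks appear as /dev/xvd* and KVM virtio disks as /dev/vd* (real conversions may also need kernel, initrd, and bootloader changes):

```python
# Rewrite Xen paravirtual disk names to KVM virtio names in a
# config file such as /etc/fstab. Illustrative only; a full
# conversion touches more than device names.
import re

def xen_to_kvm_devices(config_text):
    """Map /dev/xvdX... to /dev/vdX... throughout a config file."""
    return re.sub(r"/dev/xvd([a-z])", r"/dev/vd\1", config_text)

fstab = ("/dev/xvda1  /     ext3  defaults  1 1\n"
         "/dev/xvda2  swap  swap  defaults  0 0")
print(xen_to_kvm_devices(fstab))
```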
Centralized VM Image Repository
[Diagram: a Gfarm meta-server and Gfarm file servers holding shared VM images (QuickQuake, Geogrid+Bloss, Nyouga, Fmotif), catalogued in vmdb.txt and accessed by Gfarm clients using native tools]
VM image depository and sharing.
VM Deployment Phase II – Automated
http://goc.pragma-grid.net/mediawiki-1.16.2/index.php/VM_deployment_script
[Diagram: a Gfarm cloud holds the Quiquake, Nyouga, and Fmotif images; the vm-deploy script on the frontend consults vmdb.txt, pulls the image via a Gfarm client on the VM development server, and places it on a vm-container]
$ vm-deploy quiquake vm-container-0-2
vmdb.txt entries:
quiquake, xen-kvm, AIST/quiquake.img.gz, …
fmotif, kvm, NCHC/fmotif.hda.gz, …
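The vmdb.txt catalog above appears to map an image name to its hypervisor format and Gfarm path. A minimal parser sketch; the field meanings are inferred from the two sample lines, and the trailing fields are unknown, so only the first three are used:

```python
# Parse vmdb.txt-style lines: "name, format, gfarm_path, ...".
# Field names are guesses from the sample entries, not a spec.
def parse_vmdb(text):
    """Return {name: {"format": ..., "path": ...}} from catalog text."""
    entries = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, fmt, path = fields[0].lower(), fields[1], fields[2]
        entries[name] = {"format": fmt, "path": path}
    return entries

vmdb = ("quiquake, xen-kvm, AIST/quiquake.img.gz\n"
        "Fmotif, kvm, NCHC/fmotif.hda.gz")
print(parse_vmdb(vmdb))
```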
[Diagram: six PRAGMA sites tied together by the GFARM Grid FileSystem (Japan): SDSC (USA, Rocks Xen), NCHC (Taiwan, OpenNebula KVM), LZU (China, Rocks KVM), AIST (Japan, OpenNebula Xen), IU (USA, Rocks Xen), and Osaka (Japan, Rocks Xen). VM images copied from Gfarm include AIST HotSpot + Condor, AIST QuickQuake + Condor, AIST Geogrid + Bloss, AIST Web Map Service + Condor, UCSD Autodock + Condor, and NCHC Fmotif. A Condor master coordinates slaves across the sites. Legend: gFC = Grid Farm Client, gFS = Grid Farm Server, S = vm-deploy script.]
Put It All Together
• Store VM images in Gfarm systems
• Run vm-deploy scripts at PRAGMA sites
• Copy VM images on demand from Gfarm
• Modify/start VM instances at PRAGMA sites
• Manage jobs with Condor
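Managing jobs with Condor, as above, amounts to submitting work to whichever VMs have joined the pool. An illustrative submit file for one of the daily HotSpot-style runs (the executable and file names are hypothetical):

```
universe     = vanilla
executable   = hotspot.sh            # hypothetical application wrapper
arguments    = $(Process)
output       = hotspot.$(Process).out
error        = hotspot.$(Process).err
log          = hotspot.log
requirements = (OpSys == "LINUX")
queue 4
```

Condor then matches the four queued jobs against any slave in the pool, local or cloud-hosted, that satisfies the requirements expression.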
Moving More Quickly with the PRAGMA Cloud
• PRAGMA 21 – Oct 2011
  – 4 sites: AIST, NCHC, UCSD, and EC2 (North America)
• SC'11 – Nov 2011
  – New sites:
    • Osaka University
    • Lanzhou University
    • Indiana University
    • CNIC
    • EC2 – Asia Pacific
Condor Pool + EC2 Web Interface
• 4 different private clusters
• 1 EC2 data center
• Controlled from the Condor manager at AIST, Japan
PRAGMA Compute Cloud
[Map: UoHyd (India), MIMOS (Malaysia), NCHC (Taiwan), AIST and OsakaU (Japan), SDSC and IndianaU (USA), CNIC, LZU, and JLU (China), ASTI (Philippines)]
Cloud Sites Integrated in Geogrid Execution Pool
Roles of Each Site: PRAGMA + GeoGrid
• AIST – Application driver, with a naturally distributed computing/people setup
• NCHC – Authoring of VMs in a familiar web environment; significant diversity of VM infrastructure
• UCSD – Lower-level details of automating VM "fixup" and rebundling for EC2
We are all founding members of PRAGMA
NCHC WebOS/Cloud Authoring Portal
Users start with a well-defined base image, then add their software.
Getting Things Working in EC2
• Short background on Rocks clusters
• Mechanisms for using Rocks to create an EC2-compatible image
• Adapting the methodology to support non-Rocks-defined images
Rocks – http://www.rocksclusters.org
• Technology transfer of commodity clustering to application scientists
• Rocks is a cluster/system configuration on a CD
  – Clustering software (PBS, SGE, Ganglia, Condor, …)
  – Highly programmatic software configuration management
  – Put CDs in raw hardware, drink coffee, have cluster
• Extensible using "Rolls"
• Large user community
  – Over 1 PFlop of known clusters
  – Active user/support list of 2000+ users
• Active development
  – 2 software releases per year
  – Code development at SDSC
  – Other developers (UCSD, Univ. of Tromso, external Rolls)
• Supports RedHat Linux, Scientific Linux, CentOS, and Solaris
• Can build real, virtual, and hybrid combinations (2 – 1000s of nodes)
Rocks core development: NSF award #OCI-0721623
Key Rocks Concepts
• Define components of clusters as logical appliances (Compute, Web, Mgmt, Login, DB, PFS Metadata, PFS Data, …)
  – Share common configuration among appliances
  – Graph decomposition of the full cluster software and configuration
  – Rolls are the building blocks: reusable components (package + config + subgraph)
• Use the installer's (RedHat Anaconda, Solaris Jumpstart) text format to describe an appliance configuration
  – Walk the Rocks graph to compile this definition
• Heterogeneous hardware (real and virtual) with no additional effort
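The graph walk described above can be pictured as collecting configuration from an appliance node plus everything it inherits. A toy sketch; the graph contents and field names are invented for illustration, not the actual Rocks graph format:

```python
# Toy appliance-graph walk: every appliance inherits from "base",
# and compiling an appliance means gathering its own packages plus
# those of everything it inherits. (Invented example data.)
def walk(graph, node, seen=None):
    """Depth-first walk collecting packages for an appliance."""
    if seen is None:
        seen = set()
    if node in seen:
        return []           # avoid revisiting shared ancestors
    seen.add(node)
    packages = list(graph[node]["packages"])
    for parent in graph[node]["inherits"]:
        packages += walk(graph, parent, seen)
    return packages

graph = {
    "base":    {"inherits": [],       "packages": ["ssh", "ganglia"]},
    "compute": {"inherits": ["base"], "packages": ["sge-execd"]},
    "login":   {"inherits": ["base"], "packages": ["sge-submit"]},
}
print(walk(graph, "compute"))   # ['sge-execd', 'ssh', 'ganglia']
```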
Triton Resource
A mid-sized cluster resource that includes computing, database, storage, virtual clusters, login, and management appliances.
[Diagram:
• Large-memory PSDAF: 256 GB & 512 GB nodes (32 cores), 8 TB total, 128 GB/sec, ~9 TF (x28)
• Shared resource cluster: 16 GB/node, 4–8 TB total, 256 GB/sec, ~20 TF (x256)
• Large-scale storage (delivery by mid May): 2 PB (384 TB today), ~60 GB/sec (7 GB/s today), ~2600 disks (384 now)
• Connected to UCSD research labs via the Campus Research Network]
http://tritonresource.sdsc.edu
What’s in YOUR cluster?
How Rocks Treats Virtual Hardware
• It's just another piece of hardware
  – If RedHat supports it, so does Rocks
• Allows a mixture of real and virtual hardware in the same cluster
  – Because Rocks supports heterogeneous hardware clusters
• Re-use of all of the software configuration mechanics
  – E.g., a compute appliance is a compute appliance, regardless of "hardware"
• Virtual hardware must meet minimum hardware specs:
  – 1 GB memory
  – 36 GB disk space*
  – Private-network Ethernet
  – + public network on the frontend
* Not strict – EC2 images are 10 GB
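The minimum-spec rule above is easy to mechanize: check memory, disk, and networking, and relax the disk check for cases like the 10 GB EC2 images. A sketch with the thresholds from the slide (the field names are invented for illustration):

```python
# Check a VM definition against the Rocks minimum hardware specs
# listed above. Field names are hypothetical.
MIN_MEMORY_GB = 1
MIN_DISK_GB = 36     # "not strict" per the slide: EC2 images are 10 GB

def meets_minimum(vm, strict_disk=True):
    """Return True if the VM satisfies the minimum HW specs."""
    if vm["memory_gb"] < MIN_MEMORY_GB:
        return False
    if strict_disk and vm["disk_gb"] < MIN_DISK_GB:
        return False
    return vm["has_private_ethernet"]

# An EC2-style image passes only once the disk rule is relaxed.
ec2_style = {"memory_gb": 1.7, "disk_gb": 10, "has_private_ethernet": True}
print(meets_minimum(ec2_style, strict_disk=False))  # True
print(meets_minimum(ec2_style))                     # False
```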
Extended Condor Pool (Very Similar to AIST GeoGrid)
[Diagram: a Rocks frontend and nodes 0…n on a cluster private network (e.g., 10.1.x.n), all running identical system images, with a Condor collector/scheduler accepting job submissions; the Condor pool spans both local resources and cloud resources (Cloud 0, Cloud 1)]
[Diagram: EC2 bundling workflow on local hardware. The Rocks frontend kickstarts a guest VM in a VM container, producing a "compiled" VM image on disk storage, which is then pushed to the Amazon EC2 cloud:]
1. Kickstart guest VM (ec2_enable=true)
2. Bundle as S3 image
3. Upload image to Amazon S3
4. Register image as EC2 AMI
5. Boot AMI as an Amazon instance
Optional: test and rebuild the image.
Complete Recipe
At the command line (provided by the Rocks EC2 and Xen Rolls):
1. rocks set host boot action=install compute-0-0
2. rocks set host attr compute-0-0 ec2_enable true
3. rocks start host vm compute-0-0
   – After reboot, inspect; then shut down
4. rocks create ec2 bundle compute-0-0
5. rocks upload ec2 bundle compute-0-0 <s3bucket>
6. ec2-register <s3bucket>/image.manifest.xml
7. ec2-run-instances <ami>
Modified to Support Non-Rocks Images for the PRAGMA Experiment
[Diagram: the same local-hardware-to-EC2 workflow, but the image comes from Gfarm and is adapted before bundling:]
1. vm-deploy nyouga2 vm-container-0-20
2. Makeec2.sh <image file>
3. Bundle as S3 image
4. Upload image to Amazon S3
5. Register image as EC2 AMI
6. Boot AMI as an Amazon instance
Observations
• This is much faster than our Grid deployments
• Integration of private and commercial clouds is at a proof-of-principle state
• We haven't scratched the surface of what happens when one expands into an external cloud
• Networking among instances in different clouds has pitfalls (firewalls, addressing, etc.)
• Users can focus on the creation of their software stack
Heterogeneous Clouds
More Information Online
Revisit Cloud Hype
• "Others do all the hard work for you" → others do some of the hard work
• "You never have to manage hardware again" → you still have to manage hardware
• "It's always more efficient to outsource" → it's sometimes more efficient
• "You can have a cluster in 8 clicks of the mouse" → but it may not have your software
• "It's infinitely scalable"
• Location of data is important
• Interoperability across cloud infrastructures is possible
• …
Thank You!