Cloud Computing Development. Shallow Introduction.

Post on 30-Mar-2015

Cloud Computing Development

Shallow Introduction

Introduction

What is cloud computing?

Is it computing while in flight?

Image Courtesy SevensHeaven.nl

No.

What is it about then?

Cloud computing is the consumption of computing resources without worrying about the specifics, as well as the ability to add or remove resources according to demand.

It is similar to the power grid and the telephone network.

How does it work?

‣ Consumer signs up for the service (the same as getting a mobile phone plan)

‣ Consumer uses services according to their needs

‣ Provider sends the bill at the end of the cycle

‣ Consumer pays

Provider Models

Software As A Service (SAAS)

‣ Email

‣ CRM

‣ Office Apps

Platform As A Service (PAAS)

‣ Application Servers

‣ Databases

‣ Middleware

Infrastructure As A Service (IAAS)

‣ Bare Hardware (sort of)

Providers

Software As A Service (SAAS)

‣ Google (GMail)

‣ Salesforce

‣ Microsoft (Office Live)

Platform As A Service (PAAS)

‣ Google App Engine

‣ Heroku / Engine Yard (Rails)

‣ Windows Azure (.NET)

Infrastructure As A Service (IAAS)

‣ Amazon AWS

‣ Rackspace

‣ GoGrid

Provider: Windows Azure

‣ Platform as a service

‣ Windows based

‣ Storage provided through blob storage, drives, SQL Azure

‣ State is stored and propagated with Queues and Tables

‣ Integrated with Visual Studio

‣ Eclipse plug-in for PHP

Slide courtesy Vlad Vinogradsky from Microsoft

Provider: Google App Engine

• Platform as a service

• Python or Java based

• Storage provided through BigTable

• Automatically scales web nodes

Provider: Rackspace

• Infrastructure as a service

• Very basic: just a few Linux or Windows images

• Provides storage with CloudFiles

• Very Cheap

• Open source API

• Relatively New

Provider: Amazon AWS

• Oldest on the market

• Many services / Images / Third party providers

• Provides computation through EC2 / EMR

• Provides state / storage through S3, SQS, RDS, SimpleDB

• Multiple APIs

Sample Prices

Provider    Compute (per VM-hour)   Storage (per GB-month)   Bandwidth (per GB transferred)
Amazon      $0.10+                  $0.15+                   $0.15+
Rackspace   $0.02+                  $0.15+                   $0.22+
Microsoft   $0.12                   $0.15                    $0.15
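
To make the sample prices listed above concrete, here is a rough sketch of a monthly bill estimate. The `estimate_monthly_cost` helper and the workload figures in the usage note are illustrative assumptions; the rates are the sample prices from the slide, and the entries marked "+" are lower bounds, so a real bill would likely be higher.

```python
from decimal import Decimal

# Sample prices from the slide, in USD. Entries shown with "+" on the
# slide are minimums, so these figures are a floor, not a quote.
SAMPLE_PRICES = {
    "Amazon": {
        "vm_hour": Decimal("0.10"),
        "gb_month": Decimal("0.15"),
        "gb_transfer": Decimal("0.15"),
    },
    "Rackspace": {
        "vm_hour": Decimal("0.02"),
        "gb_month": Decimal("0.15"),
        "gb_transfer": Decimal("0.22"),
    },
    "Microsoft": {
        "vm_hour": Decimal("0.12"),
        "gb_month": Decimal("0.15"),
        "gb_transfer": Decimal("0.15"),
    },
}


def estimate_monthly_cost(provider, vm_hours, stored_gb, transferred_gb):
    """Pay-per-use estimate: compute + storage + bandwidth (hypothetical helper)."""
    price = SAMPLE_PRICES[provider]
    return (
        price["vm_hour"] * vm_hours
        + price["gb_month"] * stored_gb
        + price["gb_transfer"] * transferred_gb
    )
```

For example, `estimate_monthly_cost("Amazon", 720, 100, 50)` prices one VM running for a 30-day month with 100 GB stored and 50 GB transferred. `Decimal` is used instead of floats so that the cents add up exactly.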

Development

Practical Considerations

•Cloud development is slightly different from the traditional in-house model.

•Everything is virtualized (most of the time)

•Everything is distributed

•Per instance reliability is much lower

•Overall reliability is much higher

Cloud Programming Model

‣Compute and interface nodes are not reliable; they can crash and disappear at any time.

‣Storage and State are reliable and heavily distributed.

‣At any time we can start more compute or interface nodes and shut them down when demand subsides.
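
Because nodes are disposable and state lives elsewhere, the number of running nodes can simply follow demand. Below is a minimal sketch of such a scaling rule. The `desired_node_count` function, its parameters, and the default limits are assumptions made for illustration and are not part of any provider's API.

```python
import math


def desired_node_count(pending_tasks, tasks_per_node, min_nodes=1, max_nodes=20):
    """Return how many compute nodes to run for the current backlog.

    pending_tasks  -- amount of work waiting in the reliable state layer
    tasks_per_node -- how much of that work one node is assumed to handle
    """
    if tasks_per_node <= 0:
        raise ValueError("tasks_per_node must be positive")
    needed = math.ceil(pending_tasks / tasks_per_node)
    # Never drop below the floor, never exceed the budget ceiling.
    return max(min_nodes, min(max_nodes, needed))
```

A controller could call this periodically, start nodes when the result exceeds the current count, and shut nodes down when it falls below. Shutting a node down is safe only because no state is kept on it.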

Cloud Programming Model on Azure

‣ Compute : Worker Nodes

‣ State: Tables / Queues / SQL

‣ Storage: SQL / Tables / Blobs / Drives

‣ Client Interface: Web Nodes

Cloud Programming Model on AWS

‣Compute : EC2 Instances

‣State: S3 / Queues / SimpleDB / RDS

‣Storage: S3 / SimpleDB / RDS

‣Client Interface: S3 / EC2 / CloudFront

AWS Details: S3

‣S3 = Simple Storage Service

‣Guaranteed to be reliable

‣Simple {Key, Value} storage

‣Keys are stored within buckets

‣Values could be as large as 5GB

‣Default Storage Mechanism for AWS
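
The bucket / key / value model described above can be pictured with a small in-memory stand-in. This is only a sketch of the data model, not of the S3 API: the `ToyObjectStore` class and its method names are hypothetical, and the 5 GB limit from the slide is treated as 5 × 1024³ bytes, which is an assumption.

```python
MAX_VALUE_BYTES = 5 * 1024 ** 3  # per-value limit quoted on the slide (assumed GiB)


class ToyObjectStore:
    """In-memory imitation of buckets holding {key: value} pairs."""

    def __init__(self):
        self._buckets = {}  # bucket name -> {key -> bytes}

    def create_bucket(self, bucket):
        self._buckets.setdefault(bucket, {})

    def put(self, bucket, key, value):
        if len(value) > MAX_VALUE_BYTES:
            raise ValueError("value exceeds the per-object size limit")
        self._buckets[bucket][key] = value

    def get(self, bucket, key):
        return self._buckets[bucket][key]

    def list_keys(self, bucket, prefix=""):
        # Keys are flat strings; "folders" are just shared prefixes.
        return sorted(k for k in self._buckets[bucket] if k.startswith(prefix))
```

Note that there is no directory hierarchy: a key such as `logs/2010/01.txt` is a single string, and listing by prefix is what makes it look like a folder.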

AWS Details: SimpleDB

‣Schemaless database

‣The main storage unit is a domain (similar to a table)

‣Each record can have many attributes; new attributes can be added at any time

‣Similar to LISP / Scheme attributes

‣Can query a domain for records containing a particular attribute

‣No Joins / Unions with other domains

‣Eventual Consistency
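
The schemaless domain idea can be sketched in a few lines. The `ToyDomain` class below is a hypothetical in-memory model, not the SimpleDB API, and it deliberately ignores eventual consistency: every write is visible immediately.

```python
class ToyDomain:
    """A domain holds records; each record is a free-form bag of attributes."""

    def __init__(self):
        self._items = {}  # record name -> {attribute name -> set of values}

    def put_attributes(self, item_name, **attributes):
        # No schema: any record may gain any attribute at any time.
        item = self._items.setdefault(item_name, {})
        for name, value in attributes.items():
            item.setdefault(name, set()).add(value)

    def items_with(self, attribute, value=None):
        """Names of records that have the attribute (optionally with a given value)."""
        return sorted(
            name
            for name, attrs in self._items.items()
            if attribute in attrs and (value is None or value in attrs[attribute])
        )
```

Two records in the same domain need not share any attributes, and a query only ever looks inside one domain, which mirrors the "no joins / unions" restriction.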

AWS Details: RDS

‣ RDS = Relational Database Service

‣ MySQL in a cluster mode

‣ Preferred to simply running DB server within instance (ask me why for details)

AWS Details: SQS

‣SQS = Simple Queue Service

‣Massively scalable

‣Lets you put a message in the queue and retrieve it later

‣Retrieving a message hides it from other consumers

‣Once a message has been processed, the consumer deletes it from the queue

‣If a message is not deleted before the timeout, it becomes visible in the queue again
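
The receive / hide / delete / timeout cycle above is what lets an unreliable worker crash without losing work. Here is a minimal in-memory sketch of that behaviour. The `ToyQueue` class, its method names, and the 30-second default are assumptions for illustration and do not mirror the real SQS API.

```python
import time
from collections import OrderedDict


class ToyQueue:
    """In-memory imitation of the receive / delete / timeout cycle."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self._timeout = visibility_timeout
        self._clock = clock  # injectable so the timeout can be simulated
        self._messages = OrderedDict()  # id -> [body, hidden_until]
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        # A new message is visible immediately.
        self._messages[self._next_id] = [body, float("-inf")]
        return self._next_id

    def receive(self):
        """Return (id, body) of the first visible message and hide it, else None."""
        now = self._clock()
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self._timeout  # hide from other consumers
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Called by the consumer only after processing has succeeded."""
        self._messages.pop(msg_id, None)
```

If a worker receives a message and then crashes, it never calls `delete`, so the message reappears after the timeout and another worker picks it up. The price of this design is that a message may be processed more than once, so processing should be safe to repeat.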

AWS Details: EC2

‣EC2 = Elastic Compute Cloud

‣Lets you run arbitrary virtual machines, provided they are compatible with Amazon’s modified Xen

‣Kernels and Startup Disks are stored in S3

‣Also have large local storage

‣Machines are not exactly like physical machines

‣Local storage is not persistent: when a machine is shut down, all local data disappears

‣Hardware TCP [No packet layer / No Broadcast ]

‣Can launch many copies of the machine at the same time

‣Lots of preconfigured machines

AWS Details: Other Services

‣EMR = Elastic MapReduce: lets you run Hadoop jobs on EC2

‣CloudFront Content Delivery Network

‣ELB = Elastic Load Balancer

‣EBS = Elastic Block Store: S3-backed persistent storage

‣Public Data Sets: lots of publicly available data, such as Census (1980, 1990, 2000), Wikipedia logs, Freebase dumps, and genetic and chemistry data
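
Hadoop jobs, which EMR runs on EC2, follow the map / reduce pattern. The sketch below shows that pattern on a single machine with a word count. The function names are made up for illustration, and nothing here uses Hadoop or EMR; a real job would spread the map and reduce steps across many nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_phase(word, counts):
    """Combine all values emitted for one key."""
    return word, sum(counts)


def run_job(lines):
    # Shuffle step: group every emitted value under its key.
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_phase(line):
            groups[key].append(value)
    return dict(reduce_phase(key, values) for key, values in groups.items())
```

Each call to `map_phase` depends on only one line, and each call to `reduce_phase` on only one key, which is why the work can be split across as many instances as are available.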

Starting Up

•Amazon Account

•Credentials KeyID : SecretKey

•X.509 Certificate
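
The KeyID / SecretKey pair is generally used to sign requests: the KeyID says who is calling, and the SecretKey proves it without ever being sent over the wire. The sketch below shows only the general HMAC idea using Python's standard library. The `sign_request` and `verify_request` helpers and the string being signed are illustrative assumptions and do not reproduce Amazon's exact request format.

```python
import base64
import hashlib
import hmac


def sign_request(secret_key, string_to_sign):
    """Return a base64 HMAC-SHA256 signature of the request description."""
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_request(secret_key, string_to_sign, signature):
    """What the server side does: recompute and compare in constant time."""
    expected = sign_request(secret_key, string_to_sign)
    return hmac.compare_digest(expected, signature)
```

In practice the libraries listed later handle signing for you; the point here is only that the secret stays on the client and that any change to the request invalidates the signature.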

Helpful Tools

•S3Fox - Firefox extension for browsing S3

•Elasticfox - Firefox extension for operating EC2

•Transmit - Mac utility for S3 ($)

•RightScale - Web-based platform for managing everything (Free / $)

Libraries

•Official Amazon Libraries (Java)

•Unofficial Libraries - .Net / Ruby / Perl

•AWS4C - C/C++/Objective C

•Boto - Very popular Python library (official Hadoop/EC2 library)

Demo

Running Hadoop on EC2

Questions

????