Basics of Cloud Computing –Lecture 7 Cloud Computing ...€¦ · partition (k’, number of...

Basics of Cloud Computing – Lecture 7: Cloud Computing – Summary and Research at UT. Satish Srirama


Basics of Cloud Computing – Lecture 7

Cloud Computing – Summary and Research at UT

Satish Srirama


Outline

• Cloud computing – Summary

• Research at UT on cloud

– SciCloud

– Mobile Cloud

– D2CM

23.05.2012 2/57Satish Srirama


What is Cloud Computing?

• Computing as a utility

– Consumers pay based on their usage

• Cloud Computing characteristics

– Illusion of infinite resources

– No up-front cost

– Fine-grained billing (e.g. hourly)

• Gartner: “Cloud computing is a style of computing where massively scalable IT-related capabilities are provided ‘as a service’ across the Internet to multiple external customers”


Cloud Computing – Services – Recap

• Software as a Service – SaaS

– A way to access applications hosted on the web through your web browser

– Examples: Facebook, Flickr, Myspace.com, Google Maps API, Gmail

• Platform as a Service – PaaS

– A pay-as-you-go model for IT resources accessed over the Internet

– Examples: Google App Engine, Force.com, Hadoop, Azure, Amazon S3, etc.

• Infrastructure as a Service – IaaS

– Use of commodity computers, distributed across the Internet, to perform parallel processing, distributed storage, indexing and mining of data

– Virtualization

– Examples: Amazon EC2, SciCloud, Joyent Accelerators, Nirvanix Storage Delivery Network, etc.

[Figure: SaaS, PaaS and IaaS stacked by level of abstraction]


Scientific Computing Cloud (SciCloud)

• Distributed Systems Group owned private cloud infrastructure

• Eucalyptus setup

• Goal of the project

– To establish a private cloud at the university

– To efficiently use the already existing resources of universities

– To address computationally intensive scientific, mathematical, and academic problems

• Hope it was a pleasant experience !!!


Economics of Cloud Providers – Failures – Recap

• Cloud Computing providers bring a shift from high reliability/availability servers to commodity servers

– At least one failure per day in large datacenter

• Caveat: User software has to adapt to failures

• Solution: Replicate data and computation

– MapReduce & Distributed File System

• MapReduce = functional programming meets distributed processing on steroids

– Not a new idea… dates back to the 50’s (or even 30’s)


MapReduce

• Programmers specify two functions:

map (k, v) → <k’, v’>*

reduce (k’, v’) → <k’, v’>*

– All values with the same key are reduced together

• The execution framework handles everything else…

• Not quite…usually, programmers also specify:

partition (k’, number of partitions) → partition for k’

– Often a simple hash of the key, e.g., hash(k’) mod n

– Divides up key space for parallel reduce operations

combine (k’, v’) → <k’, v’>*

– Mini-reducers that run in memory after the map phase

– Used as an optimization to reduce network traffic
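The four functions above can be illustrated with a small in-memory word count. This is only a sketch in plain Python, not Hadoop code: the driver `run_job`, the byte-sum hash in `partition_fn`, and the assumption that each input record is handled by its own map task are all made up for illustration.

```python
from collections import defaultdict

def map_fn(key, value):
    """map (k, v) -> <k', v'>*: emit (word, 1) for every word in a line."""
    for word in value.split():
        yield word.lower(), 1

def combine_fn(key, values):
    """combine (k', v') -> <k', v'>*: pre-sum the counts on the map side."""
    yield key, sum(values)

def partition_fn(key, num_partitions):
    """partition (k', n): a simple deterministic hash of the key, mod n."""
    return sum(key.encode("utf-8")) % num_partitions

def reduce_fn(key, values):
    """reduce (k', v') -> <k', v'>*: all values of one key arrive together."""
    yield key, sum(values)

def run_job(records, num_partitions=2):
    """Hypothetical driver that plays the role of the execution framework."""
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for key, value in records:            # one map task per record (assumption)
        local = defaultdict(list)
        for k, v in map_fn(key, value):
            local[k].append(v)
        for k, vs in local.items():       # combiner runs in memory after map
            for ck, cv in combine_fn(k, vs):
                partitions[partition_fn(ck, num_partitions)][ck].append(cv)
    output = {}
    for part in partitions:               # one reduce task per partition
        for k in sorted(part):            # keys reach a reducer in sorted order
            for rk, rv in reduce_fn(k, part[k]):
                output[rk] = rv
    return output

counts = run_job([(0, "the cloud the grid"), (1, "the cloud")])
```

With the combiner, the first record sends ("the", 2) across the simulated network instead of two separate ("the", 1) pairs, which is the traffic reduction the slide refers to.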


Hadoop Processing Model

• Create or allocate a cluster

• Put data onto the file system (HDFS)

– Data is split into blocks

– Replicated and stored in the cluster

• Run your job

– Copy Map code to the allocated nodes

• Move computation to data, not data to computation

– Gather output of Map, sort and partition on key

– Run Reduce tasks

• Results are stored in the HDFS


MapReduce Examples

• Distributed Grep

• Count of URL Access Frequency

• Reverse Web-Link Graph

• Inverted Index

• Distributed Sort
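As a reminder of how one of these examples looks, here is a minimal inverted index sketch in plain Python. The function names and the in-memory grouping step (standing in for the shuffle) are assumptions for illustration, not the code used in the course.

```python
from collections import defaultdict

def map_index(doc_id, text):
    """Map: emit (term, doc_id) once for every distinct term in a document."""
    for term in set(text.lower().split()):
        yield term, doc_id

def reduce_index(term, doc_ids):
    """Reduce: collect the sorted posting list for one term."""
    return term, sorted(doc_ids)

def build_inverted_index(docs):
    grouped = defaultdict(list)           # stands in for shuffle and sort
    for doc_id, text in docs.items():
        for term, d in map_index(doc_id, text):
            grouped[term].append(d)
    return dict(reduce_index(t, ids) for t, ids in grouped.items())

index = build_inverted_index({1: "cloud computing", 2: "mobile cloud"})
```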


Synchronization in Hadoop

• Approach 1: turn synchronization into an ordering problem

– Sort keys into correct order of computation

– Partition key space so that each reducer gets the appropriate set of partial results

– Hold state in reducer across multiple key-value pairs to perform computation

– Illustrated by the “pairs” approach in calculating conditional probability of words

• Approach 2: construct data structures that “bring the pieces together”

– Each reducer receives all the data it needs to complete the computation

– Illustrated by the “stripes” approach
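Both approaches can be sketched for the conditional probability P(b | a) = count(a, b) / count(a). The code below is a single-process simulation, not Hadoop code. It assumes words consist of lowercase letters, so the special marker "*" sorts before every real word, and it treats all words in a sentence as co-occurring.

```python
from collections import Counter, defaultdict

def neighbors(words):
    """Yield (word, co-occurring word) for every ordered pair in a sentence."""
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            if i != j:
                yield a, b

def pairs_approach(sentences):
    counts = Counter()
    for s in sentences:                       # map: emit ((a, b), 1) and ((a, *), 1)
        for a, b in neighbors(s.split()):
            counts[(a, "*")] += 1             # marginal count for a
            counts[(a, b)] += 1
    probs, marginal = {}, 0
    for (a, b), c in sorted(counts.items()):  # sort order delivers (a, *) first
        if b == "*":
            marginal = c                      # state held across key-value pairs
        else:
            probs[(a, b)] = c / marginal
    return probs

def stripes_approach(sentences):
    stripes = defaultdict(Counter)
    for s in sentences:                       # map: emit (a, {b: 1}); reduce: sum stripes
        for a, b in neighbors(s.split()):
            stripes[a][b] += 1
    probs = {}
    for a, stripe in stripes.items():         # the reducer for a sees its whole stripe
        total = sum(stripe.values())
        for b, c in stripe.items():
            probs[(a, b)] = c / total
    return probs

sentences = ["a b", "a c", "a b"]
```

In the pairs version the reducer depends on the sort order to see the marginal before the joint counts. In the stripes version one reducer call already holds everything needed for a given word.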


Information Retrieval

• We have checked Boolean retrieval

• We also have seen Ranked retrieval in corpus

– TF.IDF

– We have seen how to calculate similarity between documents

• How several MapReduce jobs can be used to solve a particular problem


Cloud Providers

• Seen several services from Amazon

– Amazon EC2, S3, EBS

– Amazon CloudFormation and others

• Google AppEngine

• Aneka framework

• Force.com


RESEARCH AT UT


SciCloud [Srirama et al, CCGrid 2010] – Research topics

• Have customized images supporting

– Scientific Computing

• NumPy

• SciPy

• Enterprise computing

– Ongoing research with auto scaling

– Porting enterprise application onto the cloud

– Ongoing research with load balancing

• Clouds promise virtually infinite resources

– Probably good for HPC!!! Are they?


Troubles with HPC on Cloud

• Shift from high reliability/availability servers to commodity servers

• Communication is a major problem [Srirama et al, SPJ 2011]

– Virtualization technology is the culprit

– Thoroughly analyzed MPI on cloud

• Tests conducted with NAS PB


Adapting computing problems to cloud

• Reducing the algorithms to cloud computing frameworks like MapReduce [Srirama et al, FGCS 2012]

• Designed a classification on how the algorithms can be adapted to MR

– Algorithm → single MapReduce job

– Algorithm → n MapReduce jobs

– Each iteration in algorithm → single MapReduce job

– Each iteration in algorithm → n MapReduce jobs


Example algorithms for classification [Srirama et al, FGCS 2012]

• Conjugate Gradient (CG) – Class 4

• K-medoid clustering

– Partitioning Around Medoids (PAM) – Class 3

– Clustering large applications (CLARA) – Class 2

• Factoring integers – Class 1


Conjugate Gradient

• Iterative method for solving systems of linear equations

• Makes an initial guess of the solution

• Applies different matrix and vector operations at each iteration to improve the guess

• CG is a complex algorithm; it is not possible to reduce the whole algorithm to the MapReduce model

• Adapted the matrix and vector operations used by CG instead
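To show which operations are meant, here is a minimal sequential sketch of the textbook CG method in plain Python, assuming a symmetric positive definite matrix. It is not the MapReduce implementation from the paper. The helpers `dot`, `matvec` and `axpy` mark the kind of matrix and vector operations that would each become a separate MapReduce job inside every iteration.

```python
def dot(u, v):
    """Vector dot product."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, x) for row in A]

def axpy(alpha, x, y):
    """Return alpha * x + y."""
    return [alpha * a + b for a, b in zip(x, y)]

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    x = [0.0] * len(b)          # initial guess
    r = list(b)                 # residual b - A x for x = 0
    p = list(r)                 # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = axpy(alpha, p, x)   # improve the guess
        r = axpy(-alpha, Ap, r)
        rs_new = dot(r, r)
        p = axpy(rs_new / rs_old, p, r)
        rs_old = rs_new
    return x

solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

Every pass through the loop needs one matrix-vector product, two dot products and three vector updates, which is why the algorithm falls into the class where each iteration maps to n MapReduce jobs.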


PAM

• Iterative k-medoid clustering algorithm

• Map

– Find the closest medoid and assign the object to the cluster of the medoid

– Input: (cluster id, object)

– Output: (new cluster id, object)

• Reduce

– Find which object is the most central and assign it as the new medoid of the cluster

– Input: (cluster id, (list of all objects in the cluster))

– Output: (cluster id, new medoid)
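The Map and Reduce steps above can be sketched as follows. This is a single-process simulation with Euclidean distance. Passing the current medoids to the map function and the loop in `pam` that repeats the job until the medoids stop changing are assumptions about how the iteration is driven.

```python
import math
from collections import defaultdict

def pam_map(point, medoids):
    """Map: assign the object to the cluster of its closest medoid."""
    cluster_id = min(range(len(medoids)),
                     key=lambda i: math.dist(point, medoids[i]))
    return cluster_id, point

def pam_reduce(cluster_id, objects):
    """Reduce: the most central object becomes the new medoid of the cluster."""
    medoid = min(objects, key=lambda c: sum(math.dist(c, o) for o in objects))
    return cluster_id, medoid

def pam(points, medoids, max_iter=20):
    for _ in range(max_iter):               # one MapReduce job per iteration
        clusters = defaultdict(list)
        for p in points:
            cid, obj = pam_map(p, medoids)
            clusters[cid].append(obj)
        new_medoids = list(medoids)
        for cid, objs in clusters.items():
            new_medoids[cid] = pam_reduce(cid, objs)[1]
        if new_medoids == medoids:          # converged
            break
        medoids = new_medoids
    return medoids

points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
result = pam(points, [(0, 0), (2, 0)])
```

Note that the reduce step compares every object of a cluster with every other object, so one very large cluster can dominate the running time of an iteration.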


CLARA in MapReduce

• The algorithm can be reduced to 2 MapReduce jobs

• MapReduce job I (Finding the candidate sets)

– Map: (Assign random key to each point)

• Input: < key, point >

• Output: < random key, point >

– Reduce: (Pick first S points and use PAM on them)

• Output: < key, PAM(S points) >

– Result of PAM() is a set of k medoids


CLARA in MapReduce - continued

• MapReduce job II (Measuring the quality)

– Map: (Find each point's distance from its closest medoid, for each candidate set)

• Input: < key, point >

• Output: < candidate set key, distance >

– Reduce: (Sum distances)

• C different sums, one for each candidate set
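The two jobs can be put together in a short single-process sketch. Several details are assumptions made for illustration: `small_pam` is a simplified stand-in for PAM on the sample, each candidate set is produced by an independent random keying of the points, and the final step of picking the candidate with the smallest sum is done in the driver.

```python
import math
import random

def closest_distance(point, medoids):
    return min(math.dist(point, m) for m in medoids)

def small_pam(sample, k, max_iter=20):
    """Simplified stand-in for PAM(S points); returns k medoids."""
    medoids = sample[:k]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in sample:
            idx = min(range(k), key=lambda i: math.dist(p, medoids[i]))
            clusters[idx].append(p)
        new = [min(c, key=lambda x: sum(math.dist(x, o) for o in c)) if c
               else medoids[i] for i, c in enumerate(clusters)]
        if new == medoids:
            break
        medoids = new
    return medoids

def job1_candidates(points, k, sample_size, num_candidates, rng):
    candidates = []
    for _ in range(num_candidates):
        keyed = sorted((rng.random(), p) for p in points)  # map: random key
        sample = [p for _, p in keyed[:sample_size]]       # reduce: first S points
        candidates.append(small_pam(sample, k))
    return candidates

def job2_quality(points, candidates):
    sums = [0.0] * len(candidates)
    for p in points:                     # map: emit (candidate set key, distance)
        for c, medoids in enumerate(candidates):
            sums[c] += closest_distance(p, medoids)        # reduce: sum per key
    return sums

def clara(points, k, sample_size, num_candidates, seed=0):
    rng = random.Random(seed)
    candidates = job1_candidates(points, k, sample_size, num_candidates, rng)
    sums = job2_quality(points, candidates)
    best = min(range(len(candidates)), key=sums.__getitem__)
    return candidates[best], sums[best]

pts = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
medoids, cost = clara(pts, k=2, sample_size=4, num_candidates=3, seed=1)
```

Unlike PAM, nothing here loops over the full data set more than a fixed number of times, which matches the observation that CLARA needs only two MapReduce jobs.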


PAM vs CLARA


Observations

• CG

– Operations used inside each iteration are reduced to MapReduce model

– Low efficiency and scalability

• PAM

– Content of a whole iteration can be reduced to MapReduce model

– Low efficiency and scalability

• CLARA

– Iterations are removed, everything can be reduced to two different MapReduce jobs

– Good scalability and efficiency

• Factoring integers

– Everything can be reduced to one MapReduce job

– Very good efficiency and scalability


Results of the study

• Adapting scientific computing problems to cloud environments is not a trivial task

• MapReduce is suited better for embarrassingly parallel algorithms

• CG and PAM have serious problems with job latency

– Most of the time is spent on the background tasks

• Reading data from the HDFS

• Input needs to be read again and again, even if much of it does not change after each MapReduce job

• Approximately 18 seconds latency with Hadoop for job initiation
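A quick back-of-the-envelope calculation shows why the job initiation latency matters for iterative algorithms. The iteration count used below is a hypothetical figure chosen for illustration, not a measurement from the study.

```python
def startup_overhead_seconds(iterations, jobs_per_iteration, job_startup_s=18.0):
    """Time spent only on starting jobs, before any useful work is done."""
    return iterations * jobs_per_iteration * job_startup_s

# Hypothetical run: 100 iterations with one MapReduce job per iteration
overhead = startup_overhead_seconds(iterations=100, jobs_per_iteration=1)
```

For this hypothetical run the startup overhead alone is 1800 seconds (30 minutes), regardless of how small the input is.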


Alternative MapReduce: Twister

• Designed for iterative algorithms, by providing long running Map and Reduce tasks

• Ability to keep input data in memory across multiple MapReduce executions

• Disadvantages:

– Not perfect fault tolerance

– No distributed file system integrated

– Has stability issues

– Input data must fit into the collective memory of the cluster

– Startup time (~3 sec) is much shorter than Hadoop's, but still too high for real-time applications


Stratus

• Building our own framework for iterative problems [Jakovits and Srirama, CCGrid 2012]

• Based on Bulk Synchronous Parallel

– Consists of a sequence of super steps

– Concurrent tasks execute the same code on local data partition

– Send messages to other tasks, which are received at the next super step

– Barrier synchronization at the end of every super step
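The Bulk Synchronous Parallel model described above can be simulated in a few lines. This sketch is not the Stratus API: `run_bsp`, the signature of the compute function and the global sum example are all made up for illustration, and the tasks run one after another instead of concurrently.

```python
def run_bsp(partitions, compute, num_supersteps):
    """Run the same compute function on every partition for several super steps."""
    n = len(partitions)
    states = [None] * n
    inbox = [[] for _ in range(n)]
    for step in range(num_supersteps):
        outbox = [[] for _ in range(n)]

        def send(dest, msg):
            outbox[dest].append(msg)

        for task_id in range(n):          # concurrent tasks in a real system
            states[task_id] = compute(step, task_id, partitions[task_id],
                                      states[task_id], inbox[task_id], send, n)
        inbox = outbox                    # barrier: messages arrive at next super step
    return states

def global_sum(step, task_id, data, state, messages, send, n):
    """Example task: every task ends up with the sum over all partitions."""
    if step == 0:
        local = sum(data)                 # work on the local data partition
        for dest in range(n):
            send(dest, local)
        return local
    return sum(messages)                  # combine what the other tasks sent

totals = run_bsp([[1, 2], [3], [4, 5, 6]], global_sum, num_supersteps=2)
```

The local data stays with its task between super steps, which is the property that makes this model attractive for iterative problems.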


MOBILE CLOUD


Mobile is 7th Mass Media Channel

Tomi T Ahonen, Mobile as 7th of the Mass Media, http://mobile7th.futuretext.com/


Some numbers

• There are a lot of mobile phones

– 5.6 billion subscriptions with global population of 6.97 billion

– > 3.6 billion people with at least one mobile

• Mobile data services revenue totals $314.7 billion

http://www.gartner.com/it/page.jsp?id=1759714


Popular consumer mobile applications

• Location-based services (LBSs)

– Deliver services to users based on their location

• Mobile social networking

– Most popular social networking platforms have apps for mobiles

• Mobile commerce

– An extension of e-commerce

• Mobile payment

– Near field communication (NFC) payment


Popular consumer mobile applications - continued

• Context-aware services

– Context means person's interests, history, environment, connections, preferences etc.

– Proactively serve up the most appropriate content, product or service


• Mobile instant messaging (MIM)

– Skype for mobiles

• Mobile e-mail

• Mobile video


Mobile Cloud Lab

• Interested in developing mobile applications

– Social networks, m-learning, m-health etc.

• Work with multiple platforms

– Android, iOS, Windows Phone 7, Sensor kits etc.

– Have ~50 latest phones

• Offered courses

– Mobile Application Development - MTAT.03.262 (Fall 2012)

– Mobile Application Development - Projects (MTAT.03.266) (Spring & Fall 2012)


Mobile Cloud Computing

• One can do interesting things on mobiles directly

– Today’s mobiles are far more capable

• However, some applications need to offload certain activities to servers

– Processing sensor data

• Resource-intensive processing on the cloud

– To enrich the functionality of mobile applications


Mobile Cloud Applications

• Mobile applications gain significant advantages by becoming cloud-aware

– Increased data storage capacity

– Availability of unlimited processing power

– Extended battery life [Rudenko et al., SIGMOBILE 1998]

• Mobile Cloud based mash-up applications

– Mobiles need to possess multiple APIs

– Cloud interoperability is a major problem


Mobile Cloud Middleware

[Srirama et al, ICIW 2006]

[Flores et al, MoMM 2011]


MCM – enables

• Interoperability between different Cloud Services (IaaS, SaaS, PaaS) and Providers (Amazon, Eucalyptus, etc)

• Bringing the benefits of the Cloud to the Mobile Domain

• Composition of different Cloud Services

• Asynchronous communication between the device and MCM


Technological Choices – Asynchronous Notification

• Via third party services

– Apple Push Notification Service (APNS)

– Android Cloud to Device Messaging Framework (C2DM)

– Microsoft Push Notification Service (MPN)

• Mobile Host [Srirama et al, 2006]

– Providing web services from smart phones

– MCM can directly send the response back to the device

– Support for Android, iOS and J2ME


CroudSTag – Scenario

• CroudSTag takes the pictures/videos from the cloud and tries to recognize people

– Pictures/Videos are actually taken by the phone

– Processes the videos

– Recognizes people using facial recognition technologies

• Reports to the user a list of people recognized in the pictures

• The user decides whether to add them or not to the social group

• The people selected by the user receive a message in Facebook inviting them to join the social group


CroudSTag [Srirama et al, PCS 2011]

• Cloud services used

– Media storage on Amazon S3

– Processing videos on Elastic MapReduce

– face.com to recognize people on Facebook

– Starting social group on Facebook

[Figure: CroudSTag application flow in nine numbered steps. Recoverable labels: Main Menu, Selecting Cloud, Facebook Login, Facebook Authentication, Taking picture/video using the camera, Storage Services, Start Process, Facial Recognition Process, Send Asynchronous Notification and Results, Selecting people, Send invitation to the social group, Send next invitation, Back to Menu]


D2CM


Desktop to Cloud Migration [Srirama et al, FGCS 2013/X]

• D2CM is a tool developed at the University of Tartu as part of an FP7 project called REMICS

– “Reuse and Migration of Legacy Applications to Interoperable Cloud Services”

• Enables scientists to deploy scientific experiments on Cloud infrastructure (IaaS)

– By migrating local desktop virtual machines to the cloud

– Scenario

• Main assumptions:

– Non-computer scientists do not have significant knowledge of computer science, clouds and migration procedures

– They are only interested in submitting an experiment and collecting the results after some time


D2CM Goals

• Simplify the use of on-demand cloud resources

• Support full start-to-end migration of scientific experiments to the cloud

• Retain total control over the environment and installed libraries

– Choosing correct optimization libraries is vital for HPC

• Easy to use


D2CM Goals - continued

• Enable scaling of scientific experiments

– Vertical scalability

– Horizontal scalability

• Concurrent execution of parallel subtasks

– MPI, OpenMP, ...

• Batch processing

– Multiple experiments at once


D2CM execution flow

• Enter cloud credentials

• Add a desktop image

– Where everything needed is installed

– Currently we are using VirtualBox

• Migrate the image to a specific cloud


Migrate image

• Started with migration to Amazon and Eucalyptus

– Both based on Xen

– Proof-of-Concept implementation

• Transform it into a compatible cloud image

– Extract the file system

– Package kernels

– Install additional software

• Move it to the target cloud


Describe the deployment

• Create deployment template to describe the configuration of the IaaS resources

– Define roles

• Instance type, what resources allocated

• Number of instances

– Define actions for each role

• Uploads

• Initialization commands

• Run commands

• Deployment ending conditions

• Downloads


Deploy experiments

• Start the required instances

– Certain number for each role

– Certain type of cloud instance

• Execute the role-specific commands

• Track the ending conditions

• Shut down the instances


Monitor experiments

• Uses CollectD to keep track of Linux Virtual machines

– Memory usage

– Processor load

– Network

• Combined and separate graphical views for roles

• Scientist can see:

– how well the resources are utilized

– is the deployment working as intended

– was the deployment chosen correctly


Monitoring


Cost prediction

• Monetary cost is important when running experiments in a public cloud

– Price is charged per hour

– Estimating the total runtime based on previous experiments
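The idea can be sketched as a small calculation. This is not the D2CM prediction model. The sketch assumes that the average of earlier runtimes is a usable estimate, that every started hour is billed in full, and that all instances run for the whole experiment. The runtimes and the price below are hypothetical.

```python
import math

def predict_cost(previous_runtimes_h, price_per_hour, num_instances):
    """Estimate the cost of the next run from the runtimes of previous runs."""
    expected_h = sum(previous_runtimes_h) / len(previous_runtimes_h)
    billed_h = math.ceil(expected_h)      # hourly billing rounds up (assumption)
    return billed_h * price_per_hour * num_instances

# Hypothetical history of three runs, 4 instances at 0.10 per instance-hour
estimate = predict_cost([2.5, 3.0, 2.0], price_per_hour=0.10, num_instances=4)
```

Because of the rounding, an experiment that is expected to take 2.5 hours is charged like one that takes 3 hours, so shaving a few minutes off the runtime only pays off when it crosses an hour boundary.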


FURTHER RESEARCH ON CLOUD


Data Mining on the Cloud

• Data storage and retrieval

– Also interested in processing, analysis, mining, warehousing and presentation

• Research interests

– NoSQL and cloud scale data storage solutions

– Migrating relational databases to non-relational systems

– Implementing graph algorithms on graph databases


Further research interests

• Cloud economics

– Predicting costs based on previous executions

– Providing best estimates on ideal deployment configuration, given problem size and resource costs

• Have the setup to experiment with load balancers

• Distributed troubleshooting [Shor & Srirama, CLOSER 2011; Shor et al, DOA-SVI 2011]

– Finding memory leaks in distributed applications

– Adapting the tools to cloud solutions

• Bioinformatics on the cloud

• Refactoring enterprise applications for cloud


How can you contribute?

• Several open Bachelor/Master theses

http://ds.cs.ut.ee/theses

• Your course projects can be extended to theses

• You can also do an internship in the summer

• All new interesting ideas are welcome

• Good students always have paid positions

– Contact srirama AT ut.ee


Any Questions?

• All the best with exam next week


References

• S. N. Srirama, P. Jakovits, E. Vainikko: Adapting Scientific Computing Problems to Clouds using MapReduce, Future Generation Computer Systems Journal, 28(1):184-192, 2012. Elsevier press. DOI 10.1016/j.future.2011.05.025

• H. Flores, S. N. Srirama, C. Paniagua: A Generic Middleware Framework for Handling Process Intensive Hybrid Cloud Services from Mobiles, The 9th International Conference on Advances in Mobile Computing & Multimedia (MoMM-2011), December 5-7, 2011, pp. 87-95. ACM.

• S. N. Srirama, C. Paniagua, H. Flores: CroudSTag: Social Group Formation with Facial Recognition and Mobile Cloud Services, The 8th International Conference on Mobile Web Information Systems (MobiWIS 2011), September 19-21, 2011, v. 5 of Procedia Computer Science, pp. 633-640. Elsevier. doi: 10.1016/j.procs.2011.07.082.

• More details at http://math.ut.ee/~srirama/publications.html
