cloud computing alcances e implementacion

Cloud Computing Implementation: Scope and Technology. Lic. Jorge Guerra Guerra, Universidad Nacional Mayor de San Marcos. XVII Congreso Nacional de Estudiantes de Ingeniería de Sistemas y Computación, 6 August 2010. http://sites.google.com/site/jguerra91/home/

description

Cloud computing talk, August 2010

Transcript of cloud computing alcances e implementacion

Page 1: cloud computing alcances e implementacion

Cloud Computing Implementation: Scope and Technology

Lic. Jorge Guerra Guerra
Universidad Nacional Mayor de San Marcos

XVII Congreso Nacional de Estudiantes de Ingeniería de Sistemas y Computación
6 August 2010

http://sites.google.com/site/jguerra91/home/

Page 2: cloud computing alcances e implementacion

Agenda

• Definitions
• Taxonomy
• Costs
• Implementations

Page 3: cloud computing alcances e implementacion

What is cloud computing?

There is no consistent answer…

“It's nothing new”
“... we've redefined cloud computing to include everything that we already do ... I don't understand what we would do differently ... other than change the wording of some of our ads.”
Larry Ellison, CEO, Oracle (Wall Street Journal, Sept. 26, 2008)

“It's a trap”
“It's worse than stupidity: it's a marketing hype campaign. Somebody is saying this is inevitable, and whenever you hear somebody saying that, it's very likely to be a set of businesses campaigning to make it true.”
Richard Stallman, Founder, Free Software Foundation (The Guardian, Sept. 29, 2008)

Page 4: cloud computing alcances e implementacion

Everyone has a lot of data to process!

• Wayback Machine has 2 PB + 20 TB/month (2006)
• Google processes 20 PB per day (2008)
• “All words ever spoken by human beings” ~ 5 EB
• NOAA has ~1 PB of climate data (2007)
• CERN’s LHC generates 15 PB per year (2008)

Maximilien Brice, © CERN

Some material adapted from slides by Jimmy Lin, Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creative Commons Attribution 3.0 License)

Page 5: cloud computing alcances e implementacion

Evolution toward the cloud

Source: http://news.cnet.com

Page 6: cloud computing alcances e implementacion

What is Cloud Computing?

• Old ideas:
– Grids, vector supercomputers
– Software as a Service (SaaS)
• Def: delivering applications over the Internet
• Recently: “[Hardware, Infrastructure, Platform] as a service”
– Poorly defined, so "X as a service" terms are best avoided
• Utility Computing: pay-as-you-go computing
– Illusion of infinite resources
– No up-front cost
– Fine-grained billing (e.g. hourly)

Page 7: cloud computing alcances e implementacion

Formal definitions

• A style of computing where massively scalable IT-based capabilities are provided "as a service" over the network (IBM)

Page 8: cloud computing alcances e implementacion

Characteristics

• Virtual – Physical location and details of the underlying infrastructure are transparent to users
• Scalable – Able to break complex workloads into pieces that are served across an incrementally expandable infrastructure
• Efficient – Service-Oriented Architecture for dynamic provisioning of shared computing resources
• Flexible – Can serve a variety of workload types, both consumer and enterprise

Page 9: cloud computing alcances e implementacion

User perception

Page 10: cloud computing alcances e implementacion

How users see cloud computing

• “I only care about results, not how the IT capabilities are implemented”
• “I want to pay for what I use, like a utility”
• “I can access services from anywhere, from any device”
• “I can scale capacity up or down as needed”

Page 11: cloud computing alcances e implementacion

Laird's Cloud/SaaS map

Page 12: cloud computing alcances e implementacion

Gartner's cloud evolution curve

Page 13: cloud computing alcances e implementacion

Cloud implementations

Page 14: cloud computing alcances e implementacion

Types of implementation

Page 15: cloud computing alcances e implementacion

SaaS

Page 16: cloud computing alcances e implementacion

Wolosky's SaaS map (2008)

Page 17: cloud computing alcances e implementacion

Types of cloud computing

Page 18: cloud computing alcances e implementacion

Types

Page 19: cloud computing alcances e implementacion

Enabling Technology: Virtualization

[Figure: Traditional Stack (apps on a single operating system on hardware) next to Virtualized Stack (apps on several guest OSes running on a hypervisor on hardware)]

Some material adapted from slides by Jimmy Lin, Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creative Commons Attribution 3.0 License)

Page 20: cloud computing alcances e implementacion

Many Types of Virtualization

• Full virtualization
– Sensitive instructions (discovered statically or dynamically at run time) are replaced by binary translation, or trapped by hardware into the VMM for software emulation
– Any OS can run in the VM
– Examples: IBM’s CP/CMS, Oracle (Sun) VirtualBox, VMware Workstation
• Hardware-assisted virtualization (IBM S/370, Intel VT, or AMD-V)
– The CPU traps sensitive instructions; the guest OS runs unmodified
– Examples: VMware Workstation, Linux Xen, Linux KVM, Microsoft Hyper-V
• Para-virtualization
– Presents a software interface to VMs that is similar but not identical to that of the underlying hardware, requiring guest OSes to be adapted
– Examples: early versions of Xen
• Operating system-level virtualization
– The OS kernel allows multiple isolated user-space instances, instead of just one
– Each instance looks and feels like a real server
– Examples: Solaris Zones, QEMU, BSD Jails, OpenVZ
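The trap-and-emulate idea mentioned for full and hardware-assisted virtualization can be illustrated with a toy model. This is only a sketch: the instruction names, the `SENSITIVE` set and the `ToyVMM` class are hypothetical and do not describe any real hypervisor.

```python
# Toy model of trap-and-emulate; every name here is hypothetical.
SENSITIVE = {"write_cr3", "halt", "out"}  # assumed set of sensitive instructions


class ToyVMM:
    """Keeps the guest's virtual state and emulates trapped instructions."""

    def __init__(self):
        self.virtual_state = {}  # what the guest believes is hardware state
        self.trap_count = 0

    def emulate(self, instr, operand):
        # The VMM updates virtual state instead of touching real hardware.
        self.trap_count += 1
        self.virtual_state[instr] = operand


def run_guest(program, vmm):
    """Run (instruction, operand) pairs; sensitive ones trap to the VMM."""
    executed_directly = 0
    for instr, operand in program:
        if instr in SENSITIVE:
            vmm.emulate(instr, operand)  # trap: control passes to the VMM
        else:
            executed_directly += 1       # harmless: runs without intervention
    return executed_directly


vmm = ToyVMM()
program = [("add", 1), ("write_cr3", 0x1000), ("mul", 2), ("out", 0x3F8)]
direct = run_guest(program, vmm)
print(direct, vmm.trap_count)
```

In this model the guest program needs no changes, which is the property the slide attributes to full and hardware-assisted virtualization; para-virtualization would instead require the guest to call the VMM explicitly.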

Page 21: cloud computing alcances e implementacion

What about the Grid?

Hitachi SR8000 – Leibniz-Rechenzentrum

2 TFlop/s (2 × 10^12)

Page 22: cloud computing alcances e implementacion

Grid Computing

• Grid Computing Criteria (Ian Foster 2004)
– Coordination: A grid must coordinate resources that are not subject to centralized control
– Open APIs: A grid must use standard, open, general-purpose protocols and interfaces
– QoS: A grid must deliver nontrivial qualities of service (e.g., relating to response time, throughput, availability, and security) for co-allocating multiple resource types to meet complex user demands
• Promise of ubiquitous grid computing (utility)
– Reality is specialized grids
• TeraGrid, Open Science Grid, LHC Grid
– Grid provides “library level” service customized to HW
• Ensuring consistent libraries across HW is hard!

Page 23: cloud computing alcances e implementacion

Cloud Computing vs. Grid Computing

Page 24: cloud computing alcances e implementacion

The datacenter is the new “server”

• “Program” = Web search, email, map/GIS, …
• “Computer” = 1000’s of computers, storage, network
• Warehouse-sized facilities and workloads
• New datacenter ideas (2007-2008): truck container (Sun), floating (Google), datacenter-in-a-tent (Microsoft)
• How to enable innovation in new services without first building and capitalizing a large company?

photos: Sun Microsystems & datacenterknowledge.com

Page 25: cloud computing alcances e implementacion

Datacenter Architectures

• Major engineering design challenges in building datacenters
– One of Google’s biggest secrets and challenges
– Read: https://groups.google.com/group/google-appengine/browse_thread/thread/a7640a2743922dcf
– Very hard to get everything correct!
• Some issues
– Network access, physical security, power
– And there’s all the software…

Page 26: cloud computing alcances e implementacion

Some with very secure fiber access…

Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking

Page 27: cloud computing alcances e implementacion

Some with less than that

Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking

Page 28: cloud computing alcances e implementacion

Security infrastructure

• 24x7 Manned
• Access: Biometrics, card keys
• Video Surveillance

Sliding Glass

Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking

Page 29: cloud computing alcances e implementacion

Some very secure…

http://www.thebunker.net

Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking

Page 30: cloud computing alcances e implementacion

Others look as if a hurricane had passed through…

Source: Build vs. Buy: Internet Datacenter, W. B. Norton and M. Lucking

Page 31: cloud computing alcances e implementacion

Datacenter Architectures

• Let’s look at an example from telco professionals
• Example: AT&T Miami, Florida Tier 1 datacenter
– Redundant dual uplinks to AT&T global backbone
– Minimum N+1 redundancy factor on all critical infrastructure systems

Page 32: cloud computing alcances e implementacion

AT&T Internet Data Center Security

• Hardened facilities protected by multiple security measures:
– 24x7x365 on-premise support
– Continuous CCTV surveillance, security breach alarms, electronic card key access, biometric palm scan and individual personal access code
– Secured cage and cabinet environment

AT&T Enterprise Hosting Services briefing 10/29/2008

Page 33: cloud computing alcances e implementacion

AT&T Internet Data Center Power

[Figure: power path, with labels Commercial Power Supply, Transformer, Paralleling Switch Gear / Manual Switch, Diesel Fuel Tanks, Generators, Batteries, UPS Systems, Power Distribution Units, Remote Power Panels]

AT&T Enterprise Hosting Services briefing 10/29/2008

Page 34: cloud computing alcances e implementacion

AT&T Internet Data Center Power

Commercial Power Feeds

• 2 Commercial Feeds, Each At 13,800 V
• Located Near Substation Supplied From 2 Different Grids
• All Cable Routed Underground for Protection

AT&T Enterprise Hosting Services briefing 10/29/2008

Page 35: cloud computing alcances e implementacion

AT&T Internet Data Center Power

Emergency Power Switch

• Paralleling Switch Gear
• Automatically Powers Up All Generators When Commercial Power is Interrupted for More Than 7 Seconds
– Generators are Shed to Cover Load as Needed
– Typical Transition Takes Less Than 60 Seconds
• Manual Override Available to Ensure Continuity if Automatic Start-Up Should Fail

AT&T Enterprise Hosting Services briefing 10/29/2008

Page 36: cloud computing alcances e implementacion

AT&T Internet Data Center Power

UPS Batteries

• Four (4) Battery Strings To Support The UPS Systems
• Battery Strings Contain Flooded Cell Batteries
• A Minimum of Fifteen (15) Minutes of Battery Backup Available At Full Load
• Hydrogen Sensors Monitoring
• Remote Status Monitoring of Battery Strings

AT&T Enterprise Hosting Services briefing 10/29/2008

Page 37: cloud computing alcances e implementacion

AT&T Internet Data Center Power

Uninterruptible Power Supply (UPS)

• Four UPS Modules connected in a Ring Bus configuration
• Each Module rated at 1000 kVA
• Rotary Type UPS by Piller
• Eliminate Spikes, Sags, Surges, Transients, And All Other Over/Under Voltage And Frequency Conditions, Providing Clean Power To Connected Critical Loads

AT&T Enterprise Hosting Services briefing 10/29/2008

Page 38: cloud computing alcances e implementacion

AT&T Internet Data Center Power

Back-up Power – Generators and Diesel Fuel

• Four (4) 2,500 kW Diesel Generators Providing Standby Power, capable of producing 10 MW of power

• Two (2) 33,000 Gallon Aboveground Diesel Fuel Storage Tanks

AT&T Enterprise Hosting Services briefing 10/29/2008

Page 39: cloud computing alcances e implementacion

Typical Tier-2 One Megawatt Datacenter

[Figure: power distribution hierarchy, with labels Main Supply, Transformer, Generator, ATS, Switch Board (1000 kW), UPS (two units), STS, PDU (200 kW), Panel (50 kW), Rack Circuit (2.5 kW)]

• Reliable Power: Mains + Generator, Dual UPS
• Units of Aggregation
– Rack (10-80 nodes) → PDU (20-60 racks) → Facility/Datacenter

X. Fan, W-D Weber, L. Barroso, “Power Provisioning for a Warehouse-sized Computer,” ISCA’07, San Diego, (June 2007).
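The power figures on this slide suggest a simple fan-out calculation. The sketch below assumes that 1000 kW, 200 kW, 50 kW and 2.5 kW are the capacities of the facility, a PDU, a panel and a rack circuit respectively, and that capacity divides evenly between levels; both are assumptions made for illustration, not statements from the cited paper.

```python
# Capacities as labeled on the slide; which level each label belongs to
# is an assumption made for this sketch.
LEVELS = [
    ("facility", 1000.0),
    ("PDU", 200.0),
    ("panel", 50.0),
    ("rack circuit", 2.5),
]


def fan_out(levels):
    """Return how many units of each level fit under one unit of the level above."""
    return {
        child: parent_kw / child_kw
        for (_, parent_kw), (child, child_kw) in zip(levels, levels[1:])
    }


ratios = fan_out(LEVELS)
total_circuits = LEVELS[0][1] / LEVELS[-1][1]

for name, count in ratios.items():
    print(f"{count:g} x {name} per parent")
print(f"{total_circuits:g} rack circuits in the facility")
```

Under these assumptions a one-megawatt facility feeds a few hundred rack circuits, which is consistent in order of magnitude with the "20-60 racks per PDU" range on the slide.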

Page 40: cloud computing alcances e implementacion

Systems & Power Density

• Estimating DC power density is hard
– Power is 40% of DC costs
• Power + Mechanical: 55% of cost
– Shell is roughly 15% of DC cost
– Cheaper to waste floor than power
• Typically 100 to 200 W/sq ft
• Rarely as high as 350 to 600 W/sq ft
• Over 20% of entire DC costs is in power redundancy
– Batteries able to supply 13 megawatts for 12 min
– N+2 generation (11 x 2.5 megawatt)

James Hamilton talk, 1/17/2007
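The redundancy figures above can be turned into energy and capacity numbers with a few lines of arithmetic. The sketch below assumes that "N+2" means two generators beyond the N needed to carry the full load; that reading is an assumption, as are all the variable names.

```python
# Arithmetic on the slide's figures; the reading of "N+2" is an assumption.
BATTERY_POWER_MW = 13
BATTERY_MINUTES = 12
GENERATORS = 11
GENERATOR_MW = 2.5
SPARES = 2  # assumed: two generators beyond the N needed for full load

# Energy the batteries must hold to bridge the gap until generators start.
battery_energy_mwh = BATTERY_POWER_MW * BATTERY_MINUTES / 60

# Capacity with all generators running, and with both spares out of service.
installed_mw = GENERATORS * GENERATOR_MW
load_covered_mw = (GENERATORS - SPARES) * GENERATOR_MW

print(f"battery energy: {battery_energy_mwh:.1f} MWh")
print(f"installed generation: {installed_mw} MW")
print(f"load covered with {SPARES} generators down: {load_covered_mw} MW")
```

The gap between installed and required generation is the kind of spare capacity behind the slide's point that a large share of datacenter cost goes to power redundancy.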

Page 41: cloud computing alcances e implementacion

Why now (and not before)?

• Commoditization of HW & SW
– x86 as universal ISA, plus fast virtualization
– Standard software stack, largely open source (LAMP)
– Bet: Can statistically multiplex multiple instances onto a single box without interference between instances
• Novel economic model: fine grain billing
– Earlier examples: Sun, Intel Computing Services (longer commitment, more $$$/hour)
• Infrastructure software: e.g. Google FileSystem
• Operational expertise: failover, DDoS, firewalls...
• More pervasive broadband Internet

Page 42: cloud computing alcances e implementacion

Classifying Clouds

App Model for Utility Computing

[Figure: spectrum of offerings from lower-level, less managed (“flexibility/portability”) to higher-level, more managed (“more built-in functionality”)]

• Amazon EC2: Close to Physical Hardware; User Controls Most of Stack; Hard to Auto Scale and Failover
• Windows Azure: .NET and CLR, ASP.NET Support; More Constraints on User Stack; Auto Provisioning of Stateless App
• Google AppEngine: App Specific Traditional Web App Model; Constrained Stateless/Stateful Tiers; Auto Scaling and Auto High-Availability
• Something New: ???

Constraints on App Model Offer Tradeoffs… Lots of Ongoing Innovation…

• Instruction Set VM (Amazon EC2, 3Tera)
• Managed runtime VM (Microsoft Azure)
• Framework VM (Google AppEngine, Force.com)

Page 43: cloud computing alcances e implementacion

Killer web applications

• Mobile and web applications
• Extensions of desktop software
– Matlab, Mathematica
• Batch processing / MapReduce
– Oracle at Harvard, Hadoop at NY Times

Page 44: cloud computing alcances e implementacion

Cloud Application Demand

• Many cloud applications have cyclical demand curves
– Daily, weekly, monthly, …
• Workload spikes are more frequent and more significant
– Death of Michael Jackson:
• 22% of tweets, 20% of Wikipedia traffic, Google thought it was under attack
– Obama inauguration day: 5x increase in tweets

[Figure: resource demand plotted against time]

Page 45: cloud computing alcances e implementacion

Economics of Cloud Users

• Pay by use instead of provisioning for peak
• Remember: a DC costs > $150M and takes 24+ months to design and build
• How do you pick a capacity level?

[Figure: two panels, "Static data center" and "Data center in the cloud", each plotting resources against time with demand and capacity curves; the gap between capacity and demand in the static case is labeled "Unused resources"]

Page 46: cloud computing alcances e implementacion

Economics of Cloud Users

• Risk of over-provisioning: underutilization
• Huge sunk cost in infrastructure

[Figure: static data center, plotting resources against time with demand and capacity curves; the gap between them is labeled "Unused resources"]

Page 47: cloud computing alcances e implementacion

Economics of Cloud Users

• Heavy penalty for under-provisioning
• Risk of under-utilization if peak predictions are too optimistic: wasted CapEx
• Very hard to provision for peak workloads

[Figure: three panels plotting resources against time (days 1 to 3) with demand and capacity curves; where demand exceeds capacity the panels are labeled "Lost revenue" and "Lost users"]
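The over- and under-provisioning risks described on these slides can be made concrete with a small calculation. This is a sketch under stated assumptions: the demand series and the two capacity levels are hypothetical numbers in arbitrary resource units, and `provisioning_outcome` is an illustrative helper, not something from the talk.

```python
def provisioning_outcome(demand, capacity):
    """Compare a fixed capacity against a demand series (same arbitrary units)."""
    served = [min(d, capacity) for d in demand]
    unused = sum(capacity - s for s in served)           # paid for but idle
    unmet = sum(d - s for d, s in zip(demand, served))   # demand turned away
    utilization = sum(served) / (capacity * len(demand))
    return unused, unmet, utilization


# Hypothetical demand over six periods with a single peak.
demand = [20, 40, 100, 60, 30, 20]

over = provisioning_outcome(demand, capacity=100)   # provision for the peak
under = provisioning_outcome(demand, capacity=50)   # provision below the peak

print("provision for peak:  unused=%d unmet=%d utilization=%.2f" % over)
print("provision below peak: unused=%d unmet=%d utilization=%.2f" % under)
```

Provisioning for the peak leaves most capacity idle, while provisioning below it turns demand away; a pay-by-use model aims to track the demand series instead of fixing a single capacity.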

Page 48: cloud computing alcances e implementacion

Utility Computing Arrives

• Amazon Elastic Compute Cloud (EC2)
• “Compute unit” rental: $0.10-0.80 → $0.085-0.68/hour
– 1 CU ≈ 1.0-1.2 GHz 2007 AMD Opteron/Intel Xeon core

Prices for the Northern VA cluster:

Instance          Price/hour        Platform  Units  Memory  Disk
Small             $0.10 → $0.085    32-bit    1      1.7GB   160GB
Large             $0.40 → $0.35     64-bit    4      7.5GB   850GB – 2 spindles
X Large           $0.80 → $0.68     64-bit    8      15GB    1690GB – 4 spindles
High CPU Med      $0.20 → $0.17     64-bit    5      1.7GB   350GB
High CPU Large    $0.80 → $0.68     64-bit    20     7GB     1690GB
High Mem X Large  $0.50             64-bit    6.5    17.1GB  1690GB
High Mem XXL      $1.20             64-bit    13     34.2GB  1690GB
High Mem XXXL     $2.40             64-bit    26     68.4GB  1690GB

• No up-front cost, no contract, no minimum
• Billing rounded to nearest hour (also regional, spot pricing)
• New paradigm(!) for deploying services?, HPC?
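Hourly, per-instance billing can be sketched as a short cost function. The prices below are the lower figures from the slide; the rounding rule (every started hour billed in full) is an assumption for this sketch and may differ from the provider's actual policy, and `instance_cost` is a hypothetical helper.

```python
import math

# Hourly prices taken from the slide (lower figure where two are listed).
PRICE_PER_HOUR = {"small": 0.085, "large": 0.35, "xlarge": 0.68}


def instance_cost(instance_type, runtime_minutes, count=1):
    """Cost in dollars, assuming each started hour is billed as a full hour."""
    billed_hours = math.ceil(runtime_minutes / 60)
    return round(billed_hours * PRICE_PER_HOUR[instance_type] * count, 2)


# One small instance for 100 hours versus 100 small instances for 1 hour.
one_for_long = instance_cost("small", 100 * 60, count=1)
many_for_short = instance_cost("small", 60, count=100)

# A 61-minute run is billed as two hours under the rounding assumption.
partial = instance_cost("large", 61)

print(one_for_long, many_for_short, partial)
```

With no up-front cost and no minimum, the two usage patterns cost the same, so a job that parallelizes well can finish sooner for the same money.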

Page 49: cloud computing alcances e implementacion

Economics of Cloud Providers

• Microsoft and Google race to build next-gen DCs

(Jan’07)

– Microsoft announces a $550 million DC in Texas

– Google confirms plans for a $600 million site in North Carolina
– Google two more DCs in South Carolina; may cost another $950 million – about 150,000 computers each
• Power availability drives deployment decisions

Page 50: cloud computing alcances e implementacion

Hidden costs of the cloud

Page 51: cloud computing alcances e implementacion

Google Oregon Datacenter

Source: Harper’s (Feb, 2008)


Page 52: cloud computing alcances e implementacion

Containerized Datacenters

[Figure: photos labeled Nortel Steel Enclosure (containerized telecom equipment), Sun Black Box (242 systems in 20’), Rackable Systems (1,152 systems in 40’), Rackable Systems Container Cooling Model]

James Hamilton talk, 1/7/2007

Page 53: cloud computing alcances e implementacion

Unit of Data Center Growth

• One at a time:
– 1 system
– Racking & networking: 14 hrs ($1,330)
• Rack at a time:
– ~40 systems
– Install & networking: .75 hrs ($60)
• Container at a time:
– ~1,000 systems
– No packaging to remove
– No floor space required
– Power, network, & cooling only
– Weatherproof & easy to transport
• Data center construction takes 24+ months
– Both new build & DC expansion require regulatory approval
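The installation figures on this slide can be compared on a per-system basis. The sketch below uses only the slide's numbers; the dictionary layout and names are illustrative, and the container case is left out because the slide gives no labor figure for it.

```python
# Labor figures from the slide; system counts are the slide's approximations.
UNITS = {
    "one at a time": {"systems": 1, "labor_usd": 1330},
    "rack at a time": {"systems": 40, "labor_usd": 60},
}

per_system = {
    name: unit["labor_usd"] / unit["systems"] for name, unit in UNITS.items()
}
ratio = per_system["one at a time"] / per_system["rack at a time"]

for name, cost in per_system.items():
    print(f"{name}: ${cost:,.2f} per system")
print(f"per-system labor is about {ratio:.0f}x lower when installing by rack")
```

The same reasoning is what motivates the container as an even larger unit of growth.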

Page 54: cloud computing alcances e implementacion

Sun Modular Datacenter “BlackBox” (GreenBox)

• Delivered June 9th, operational in September
– Significant challenges with cooling reliability
• 7.5 40U racks
– Power and cooling equivalent to all Soda machine rooms

Page 55: cloud computing alcances e implementacion

Economics of Cloud Providers

Economies of Scale for Humongous Datacenters (1,000’s to 10,000’s of commodity computers)

• Electricity: Put Datacenters at Cheap Power
• Network: Put Datacenters on Main Trunks
• Operations: Standardize and Automate Ops
• Hardware: Containerized Low-Cost Servers

5 to 7 Times Reduction in the Cost of Computing…

• Economy of scale vs. provisioning a medium-sized (100’s machines) facility
– Public (utility) vs. private clouds issue
• Build-out driven by demand growth (more users)

Page 56: cloud computing alcances e implementacion

Power and cooling are expensive!

Power and cooling infrastructure costs A LOT

• Infrastructure PLUS Energy > Server Cost Since 2001
• Infrastructure Alone > Server Cost Since 2004
• Energy Alone > Server Cost Since 2008

Energy savings mean infrastructure savings!

• Cost Effective to Discard Inefficient Servers
– Like Airlines Retiring Fuel-Guzzling Airplanes
• Willing to pay more $/server for more efficient, more powerful servers

Belady, C., “In the Data Center, Power and Cooling Costs More than IT Equipment it Supports”, Electronics Cooling Magazine (Feb 2007)

Page 57: cloud computing alcances e implementacion

Public vs. Private Clouds

• Building a Very Large-Scale Datacenter Is Very Expensive
– $100+ Million (Minimum)
• Large Internet Companies Already Building Huge DCs
– Google, Amazon, Microsoft…
• Large Internet Companies Already Building Software
– MapReduce, GoogleFS, BigTable, Dynamo

Technology      Cost in Medium-Sized DC        Cost in Very Large DC           Ratio
Network         $95 per Mbit/sec/month         $13 per Mbit/sec/month          7.1
Storage         $2.20 per GByte/month          $0.40 per GByte/month           5.7
Administration  ≈ 140 Servers / Administrator  > 1000 Servers / Administrator  7.1

Huge DCs 5-7X as Cost Effective as Medium-Scale DCs

James Hamilton, Internet Scale Service Efficiency, Large-Scale Distributed Systems and Middleware (LADIS) Workshop, Sept ‘08

Page 58: cloud computing alcances e implementacion

Extra Benefits for Cloud Providers

• Amazon: uses idle capacity
• Microsoft: sells .NET tools
• Google: reuses existing infrastructure

Page 59: cloud computing alcances e implementacion

Platform - Amazon Web Services

• Elastic Compute Cloud (EC2)
– Rent computing resources by the hour
– Basic unit of accounting = instance-hour
– Additional costs for bandwidth
• Simple Storage Service (S3)
– Persistent storage
– Charge by the GB/month
– Additional costs for bandwidth

Page 60: cloud computing alcances e implementacion

Platform - Amazon Web Services (EC2)

• Infrastructure as a Service provider, and current market leader.
• Data centers in USA and Europe
• Different regions and availability zones
• Uses Xen hypervisor
• Users provision instances in classes, with different CPU, memory and I/O performance.

Page 61: cloud computing alcances e implementacion

Platform - Amazon Web Services (EC2)

• Users provision instances with an Amazon Machine Image (AMI), a packaged virtual machine.
– Instances ready in 10-20 seconds.
– Amazon provides a range of AMIs
• Users can upload and share custom AMIs,
– preconfigured for different roles.
• Supports Windows, OpenSolaris and Linux
• Control interface
– HTTP REST/SOAP API
– Command line tools
• Able to implement external monitoring and scaling using interface.

Page 62: cloud computing alcances e implementacion

Platform - Amazon Web Services (EC2)

• Flexible, but low-level (roll-your-own)
• No built-in load balancing or scaling (yet)
• Integrated with services:
– Simple Storage Service (S3)
– Simple Queue Service (SQS)
– SimpleDB
• Pricing based on instance hours
– + bandwidth charges
– + service charges (S3, SQS etc.)

Page 63: cloud computing alcances e implementacion
Page 64: cloud computing alcances e implementacion

Platform – Windows Azure

• Platform as a Service (in pre-release)
– “Cloud OS”
– .NET libraries for managed code like C#
– Web and worker roles (w/queues)
• Topology described in metadata
• Live upgrades (w/upgrade zones)

Page 65: cloud computing alcances e implementacion
Page 66: cloud computing alcances e implementacion

Platform – Google App Engine

• Platform as a Service

• Target: Web applications

• Provides custom Python runtime environment, with a

specialized version of the Django framework.

• Integrated with Google data store (Bigtable), and other “Internet-scale” infrastructure.

• Now also supports Java technology.

Page 67: cloud computing alcances e implementacion
Page 68: cloud computing alcances e implementacion

Cloud Computing Infrastructure

• Computation model: MapReduce*

• Storage model: HDFS*

• Other computation models: HPC/Grid Computing

• Network structure

*Some material adapted from slides by Jimmy Lin, Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creative Commons Attribution 3.0 License)

Page 69: cloud computing alcances e implementacion

Cloud Computing Computation Models

• Finding the right level of abstraction
– von Neumann architecture vs cloud environment
• Hide system-level details from the developers
– No more race conditions, lock contention, etc.
• Separating the what from how
– Developer specifies the computation that needs to be performed
– Execution framework (“runtime”) handles actual execution

Page 70: cloud computing alcances e implementacion

“Big Ideas”

• Scale “out”, not “up”
– Limits of SMP and large shared-memory machines
• Idempotent operations
– Simplifies redo in the presence of failures
• Move processing to the data
– Cluster has limited bandwidth
• Process data sequentially, avoid random access
– Seeks are expensive, disk throughput is reasonable
• Seamless scalability for ordinary programmers
– From the mythical man-month to the tradable machine-hour
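Why idempotent operations simplify redo can be shown with a minimal example. The two helper functions and the in-memory dictionaries below are hypothetical; the loop simply replays each operation once, as a framework might after losing an acknowledgement.

```python
def apply_increment(store, key, amount):
    """Not idempotent: replaying the operation double-counts."""
    store[key] = store.get(key, 0) + amount


def apply_set(store, key, value):
    """Idempotent: replaying the operation leaves the same final state."""
    store[key] = value


counter, snapshot = {}, {}
for _ in range(2):  # second pass simulates a retry after an assumed failure
    apply_increment(counter, "hits", 5)
    apply_set(snapshot, "hits", 5)

print(counter["hits"], snapshot["hits"])
```

Because the idempotent version ends in the same state however many times it runs, a scheduler can re-execute a failed or slow task without tracking whether the first attempt partly succeeded.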

Page 71: cloud computing alcances e implementacion

Typical Large-Data Problem

• Iterate over a large number of records
• Extract something of interest from each
• Shuffle and sort intermediate results
• Aggregate intermediate results
• Generate final output

Key idea: provide a functional abstraction for the extract and aggregate operations: MapReduce (Dean and Ghemawat, OSDI 2004)
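The five steps above can be sketched as a single-process word count. This is an illustration of the programming model only, not of Google's or Hadoop's implementation; the function names are chosen for this example and everything runs in memory on one machine.

```python
from collections import defaultdict


def map_phase(record):
    """Extract something of interest: emit (word, 1) for each word."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group intermediate values by key and sort the groups by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Aggregate the intermediate results for one key."""
    return key, sum(values)


def word_count(records):
    # Iterate over records, map each one, shuffle, then reduce per key.
    intermediate = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(intermediate))


counts = word_count(["the cloud is the datacenter", "the datacenter is the computer"])
print(counts)
```

The developer writes only `map_phase` and `reduce_phase`; in a real framework the iteration, shuffle and distribution across machines are handled by the runtime.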

Page 72: cloud computing alcances e implementacion

Google MapReduce

Simplified Data Processing on Clusters/Clouds

• http://labs.google.com/papers/mapreduce.html
• This is a dataflow model between services, where services can run useful document-oriented, data-parallel applications, including reductions
• The decomposition of services onto cluster engines (clouds) is automated
• The large I/O requirements of datasets change the efficiency analysis in favor of dataflow
• Services (count words in example) can obviously be extended to general parallel applications
• There are many alternative languages for expressing dataflow, parallel operations, and/or workflow

Page 73: cloud computing alcances e implementacion

Roots in Functional Programming

[Figure: Map applies a function f to each element of a list; Fold combines the elements of a list using a function g]
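Map and fold as they appear in functional programming can be shown with Python's built-in `map` and `functools.reduce`. The functions f and g on the slide are unspecified; squaring and addition are used here purely as stand-ins.

```python
from functools import reduce

values = [1, 2, 3, 4, 5]

# Map: apply f to each element independently of the others.
squared = list(map(lambda x: x * x, values))

# Fold: combine elements with g, threading an accumulator from left to right.
total = reduce(lambda acc, x: acc + x, squared, 0)

print(squared, total)
```

Since each application of f is independent, the map step can be spread across machines, while the fold step is where results are brought together.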

Page 74: cloud computing alcances e implementacion

Putting everything together…

[Figure: a namenode running the namenode daemon, a job submission node running the jobtracker, and three slave nodes, each running a tasktracker and a datanode daemon on top of the Linux file system]

Page 75: cloud computing alcances e implementacion

MapReduce/GFS Summary

• Simple but powerful programming model
• Scales to handle petabyte+ workloads
– Google: six hours and two minutes to sort 1PB (10 trillion 100-byte records) on 4,000 computers
– Yahoo!: 16.25 hours to sort 1PB on 3,800 computers
• Incremental performance improvement with more nodes
• Seamlessly handles failures, but possibly with performance penalties
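The two sort results can be converted into average per-computer throughput. The sketch below assumes 1 PB = 10^15 bytes for both runs (which matches "10 trillion 100-byte records" for the Google figure and is an assumption for the Yahoo! one) and 1 MB = 10^6 bytes; the helper name is illustrative.

```python
PETABYTE = 10 * 10**12 * 100  # 10 trillion 100-byte records = 10**15 bytes


def per_node_mb_per_s(total_bytes, hours, nodes):
    """Average sort throughput per computer in MB/s (1 MB = 10**6 bytes)."""
    seconds = hours * 3600
    return total_bytes / seconds / nodes / 1e6


google = per_node_mb_per_s(PETABYTE, 6 + 2 / 60, 4000)
yahoo = per_node_mb_per_s(PETABYTE, 16.25, 3800)

print(f"Google: {google:.1f} MB/s per computer")
print(f"Yahoo!: {yahoo:.1f} MB/s per computer")
```

These are end-to-end averages over the whole job, so they include reading, shuffling and writing the data and say nothing about peak disk or network rates.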

Page 76: cloud computing alcances e implementacion

Implementation

Page 77: cloud computing alcances e implementacion

Commercial strategies

• Microsoft: Software plus Services
– Use of .NET and Windows
• IBM: Transformation through Customer Implementations
– Implementation built with customer participation
• Cisco: Evolving Interoperability
– Provides tools based on Web 2.0

Page 78: cloud computing alcances e implementacion

Implementation methodology

Page 79: cloud computing alcances e implementacion

Define use cases

Page 80: cloud computing alcances e implementacion

Assess infrastructure

Page 81: cloud computing alcances e implementacion

Implement

Page 82: cloud computing alcances e implementacion

Issues to consider

Page 83: cloud computing alcances e implementacion

Issues to consider

Page 84: cloud computing alcances e implementacion

Best practices

Page 85: cloud computing alcances e implementacion

Criteria to consider

Page 86: cloud computing alcances e implementacion

Summary

• Many benefits of cloud computing:
– Shift from CapEx to OpEx, and scale OpEx with demand
– Startups and prototyping, one-off tasks (Wash. Post)
– Cost associativity
– Research at scale
• Many cloud computing challenges:
– Availability
– Data in the cloud can be “heavy” ($$$ to move)

Page 87: cloud computing alcances e implementacion

References

• http://en.wikipedia.org/wiki/Cloud_computing
– Includes references to Amazon, Apple, Dell, Enomalism, Globus, Google, IBM, KnowledgeTreeLive, Nature, New York Times, Zimdesk
– Others like Microsoft Windows Live Skydrive important
• http://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud
• http://uc.princeton.edu/main/index.php?option=com_content&task=view&id=2589&Itemid=1
– Policy Issues
• http://www.cra.org/ccc/home.article.bigdata.html
– Hadoop (MapReduce) and “Data Intensive Computing”
– See Data intensive computing minitrack at HICSS-42 January 2009
• http://ianfoster.typepad.com/blog/2008/01/theres-grid-in.html
– OGF Thought Leadership blog
• OGF22 talks by Charlie Catlett and Irving Wladawsky-Berger