1 Want Sustainable and Green? Seed Computing Clouds Dr Lee Gillam [email protected].


Want Sustainable and Green? Seed Computing Clouds

Dr Lee Gillam

[email protected]

What is our scope?
• Green IT and Sustainability often used interchangeably:
  • Typically, with focus on natural resources – more compute power consumes more materials and energy;
  • Waste products throughout the product lifecycle, and familiar (Greenpeace) images including those of children burning plastic wires to obtain valuable copper.
  • Highly toxic elements in laptops and mobile phones.
• But we probably also need to consider the sustainability of IT, particularly in the UK.
  • Fewer “IT” graduates, but can technology reduce numbers needed?
  • Graduates need to know the business impacts of such changes
• Cloud Computing demonstrates a lot of what we can be thinking about.
• So, just by seeding a few Clouds ….

Sustainability and Green IT?


A pile of electronic waste on a roadside in Guiyu, China

Electronic waste in Guangdong, China

As much as 4,000 tonnes of toxic e-waste are discarded every hour. Vast amounts are routinely and often illegally shipped as waste from Europe, USA and Japan to places where unprotected workers recover parts and materials.

Sustainability of UK IT?
• Number of HE students and graduates at a record high but HE Computing student numbers and graduations are falling.
  • Students: 137,650 (2003-4); 106,910 (2006-7)
  • Graduations: 37,445 (2004-5) down to 31,270 (2006-7)
• IT labour market seen growing.
  • Continuing demand for staff in the IT labour market leading to a net growth in the number employed.
  • Single largest area of growth: Software Professionals, a role for which a high degree of technical knowledge, capability and training is required.
• Decline in computing graduates results in fewer “new entrants” from HE with necessary deep-based technical skills.

Source: CPHC report: A study on the IT labour market in the UK.(2008)

Structure of Talk
• Introducing Cloud Computing
• Software as a Service (SaaS)
• Platform as a Service (PaaS)
• Infrastructure as a Service (IaaS)
• Concluding Remarks

So, what are these Clouds?

Already an environmental concern:

“Make IT Green: Cloud Computing and its Contribution to Climate Change”

http://www.greenpeace.org/raw/content/usa/press-center/reports4/make-it-green-cloud-computing.pdf

- 12 pages, 7MB.

Spot the Cloud

What is the Cloud?

Microsoft’s $500 million Chicago data center, one of the largest data centers ever built, spanning more than 700,000 square feet (Man Utd pitch about 80,000 sq.ft).

How might Clouds be built?

See Microsoft’s video at: http://www.microsoft.com/showcase/en/us/details/36db4da6-8777-431e-aefb-316ccbb63e4e

Environmentally-friendly Clouds


In June 2007 Google completed a 1.6MW solar installation - the largest U.S. corporate installation at that time. 9,212 solar panels cover the rooftops of eight buildings and two solar carports at the Googleplex.

Produces enough electricity to power 30% of Google's peak electricity demand

http://www.google.com/corporate/solarpanels/home

Power usage effectiveness (PUE), which gives a ratio of total energy in to energy used for compute, does not appear to incorporate how “green” the energy is.
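The ratio above can be written out as a small calculation. This is a minimal sketch only: the function names, the energy figures and the carbon-intensity values are illustrative assumptions, not measurements from any provider, and `emissions_per_it_kwh` is a hypothetical add-on used to show why PUE alone says nothing about how “green” the energy is.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total energy in / energy used for compute."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def emissions_per_it_kwh(pue_value: float, grid_kg_co2_per_kwh: float) -> float:
    """Hypothetical metric: kg CO2 emitted per kWh delivered to IT equipment."""
    return pue_value * grid_kg_co2_per_kwh


# Assumed figures for two imaginary data centres (not measured values).
coal_heavy = pue(total_facility_kwh=1200.0, it_equipment_kwh=1000.0)
hydro_heavy = pue(total_facility_kwh=1500.0, it_equipment_kwh=1000.0)

# The site with the "better" PUE can still emit more CO2 per unit of compute.
print(coal_heavy, emissions_per_it_kwh(coal_heavy, 0.9))
print(hydro_heavy, emissions_per_it_kwh(hydro_heavy, 0.05))
```

Under these assumed numbers, the data centre with the lower PUE is the worse one for emissions, which is the gap the slide points at.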

Environmentally-friendly Clouds


Activity | Equivalent Google searches
CO2 emissions of an average daily newspaper (100% recycled paper) | 850
A glass of orange juice | 1,050
One load of dishes in an EnergyStar dishwasher | 5,100
A five mile trip in the average U.S. automobile | 10,000
A cheeseburger | 15,000
Electricity consumed by the average US household in one month | 3,100,000

http://www.google.com/corporate/green/datacenters/

What is the Cloud?
• It’s not just data centres, but “reselling” capacity in data centres is key:
  • Advantage of economies of scale to reduce the cost-per-unit service provision and maximize the opportunities for efficient energy use in power and cooling systems – machine room vs. broom cupboard?
  • Remove redundant server components during production (e.g. sound cards, USB ports).
  • Usage can be maximized – “salesforce.com serves more than 1.5M users (and 55,000 enterprise customers) every day with less than a 1,000 servers” (500 used to mirror data).
  • “As of January 31, 2009, … a single, third-party Web hosting facility located on the west coast of the United States, leased from Equinix, Inc. …. replicated in near real-time in a separate [Equinix] back-up facility located on the east”.
  • What infrastructure duplication would occur if each enterprise customer had their own in-house system?
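One rough way to answer the duplication question is a back-of-envelope calculation. The customer, user and server counts come from the salesforce.com quote; the sizing rule (one server plus one mirror per customer) is purely an assumption for illustration, and real in-house deployments would vary widely.

```python
ENTERPRISE_CUSTOMERS = 55_000   # from the salesforce.com quote
SHARED_SERVERS = 1_000          # "less than a 1,000 servers", used as an upper bound
USERS = 1_500_000               # "more than 1.5M users", used as a lower bound


def in_house_servers(customers: int, servers_each: int = 1, mirrored: bool = True) -> int:
    """Servers needed if every customer ran its own system (assumed sizing)."""
    per_customer = servers_each * (2 if mirrored else 1)
    return customers * per_customer


duplicated = in_house_servers(ENTERPRISE_CUSTOMERS)
print(duplicated)                    # servers under the in-house assumption
print(duplicated / SHARED_SERVERS)   # multiple of the shared-infrastructure figure
print(USERS / SHARED_SERVERS)        # users per shared server (at least)
```

Even with the smallest plausible in-house footprint, the assumed duplication is two orders of magnitude above the shared figure.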

http://www.slideshare.net/AmazonWebServices/london-aws-exec-wv, slide 16.

Traditional IT – unanticipated success can be disastrous: can’t scale to meet demand.

http://www.slideshare.net/kvjacksn/aws-2008-cloud-computing-ncoic-presentation, Slide 27

“Cloud” PUEs

Source: Make IT Green: Cloud Computing and its Contribution to Climate Change

Where is the Cloud?
• Amazon offers use of data centres in four regions (US East/West, Ireland, Singapore), each with its own availability zones.
• Amazon.com also uses other data centres.
• In 2003, Amazon began to use Equinix as a data centre provider. Equinix has 50 data centres in 19 locations worldwide, including 5 data centres, among them the company’s 50th, in and around London.
• The Salesforce CRM runs in Equinix facilities; further Equinix customers include (or have included) Myspace.com, NASA, Fox Sports Interactive Media and Sandisk.

Where is the Cloud?
• Facebook is building its first data centre, scheduled for completion in 2011, in Prineville, Oregon with an estimated cost around $200m
• Facebook has used Digital Realty Trust, who also provide a data centre facility to Terremark and report some 50 Fortune 500 customers, and DFT data centres, and has been alongside Twitter in Fortune Data Centers.
• Terremark own and operate NAP of the Americas – Miami, a facility that houses, amongst others, NewServers.
• ElasticHosts use Peer 1 and BlueSquare Data. Peer 1 has data centres in 16 locations, 14 in America/Canada, a London data centre, and a mainland European data centre in Amsterdam.
• NewServers, Terremark and ElasticHosts are all “Cloud” providers (more about them later also).

What is the Cloud?
• “Reselling” capacity in data centres.
• Buyer beware: For Cloud Computing users, care needs to be taken in selecting multiple Cloud providers in the hope of constructing redundant systems. It is possible that the physical infrastructures that underlie some of these systems are highly co-located.
• Cloud Storage solutions, JungleDisk and ElephantDrive, are both reselling Amazon storage (S3) with bespoke functionality at slightly higher prices, perhaps with some “free” space.
• Need to know data centre location and service provider to ensure adherence e.g. to the Data Protection Act 1998, additional regulatory requirements, AND because I may think I’m building redundancy in when I’m not.
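The “buyer beware” check can be sketched as a simple overlap test. The provider-to-infrastructure mapping below records only the relationships named in these slides and is likely incomplete; the function name and data layout are illustrative assumptions.

```python
from itertools import combinations

# Provider -> underlying facilities/operators, as named in these slides only.
UNDERLYING = {
    "JungleDisk": {"Amazon S3"},
    "ElephantDrive": {"Amazon S3"},
    "Salesforce CRM": {"Equinix"},
    "NewServers": {"Terremark NAP of the Americas"},
    "ElasticHosts": {"Peer 1", "BlueSquare Data"},
}


def shared_infrastructure(providers):
    """Return {(a, b): shared facilities} for every pair of providers that overlap."""
    overlaps = {}
    for a, b in combinations(sorted(providers), 2):
        common = UNDERLYING[a] & UNDERLYING[b]
        if common:
            overlaps[(a, b)] = common
    return overlaps


# Two storage "providers" that are not independent of each other:
print(shared_infrastructure(["JungleDisk", "ElephantDrive"]))
# A pair with no known overlap (which is not proof of independence):
print(shared_infrastructure(["JungleDisk", "ElasticHosts"]))
```

An empty result only means no overlap is known; the physical facilities may still be co-located.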

Business sustainability?
• Some organizations need never purchase a server, own software licenses, or worry (so much) about:
  • whether server software licenses are current?
  • when to schedule next software upgrade?
  • what if the hardware fails at inconvenient times?
  • how do I manage technology assets?
  • what to do with old hardware?
  • how to treat depreciation of IT assets?
  • when can I afford to add capacity?


Traditional vs Cloud

Traditional Systems | Cloud Systems
File server | Google Docs
MS Outlook, Apple Mail | Gmail, Yahoo!, MSN
SAP CRM/Oracle CRM/Siebel | SalesForce.com
Quicken/Oracle Financials | Intacct/NetSuite
Microsoft Office/Lotus Notes | Google Apps
Off-site backup | Amazon S3
Server, racks, and firewall | Amazon EC2, GoGrid, Mosso

NB: Just a sample! – from Reese, Table 1-1

SaaS

SaaS
• SaaS probably the most mature *aaS of Cloud Computing
• For some, SaaS began with mainframes!
• Software as a package, with license, distributed on media, installed by individuals / corporates – costs all round.
• Software becomes a download installed by individuals / corporates – cheaper for producer.
• Software becomes large, difficult to install and configure, and needs new hardware – more expensive for consumer.
  • Company costs increased by software “bloat”; updates become expensive to install, test, deploy, remove (on failure) etc.
• Software usable over the internet.


SaaS

Advantages?
• Bespoke infrastructure not required
• Reduced system administrator loading
• No upfront software costs
• Users supported by the organisation that knows the product and how to make it work well
• BUT, some setup costs necessary. Training required, but system consistency preferred, so incremental changes may reduce training overhead.

Disadvantages?
• Costs may be “per seat” rather than “per use”.
• Always-on connectivity / “thin pipes” problem.
• Vendor lock-in?
• Trust in third party?
• Limited support in Service Level Agreements?
• Emotional attachment to physical systems?

SaaS: Google Mail / Apps
• Trinity College Dublin
  • Legacy email system that lacked the features that students wanted; achieved cost savings associated with labour and operational efficiencies
• University of Westminster
  • Reported £1 million savings on IT costs
• Universities of Portsmouth, Sunderland, Loughborough...
  • http://www.google.com/a/help/intl/en/edu/customers.html
• Los Angeles City Council (2nd largest US city) outsources e-mail to Google; $7.25-million contract (5 years) to move all 30,000 city employees.


Other SaaS

Application | Examples
Calendar | Google, Yahoo, Windows Live, CalendarHub, Hunt Calendars, Calendar Net
Schedules | Diarised, Windows Live Events, Schedulebook, AppointmentQuest
Planning / Task Management | Bla-bla List, Hiveminder, Remember the Milk, Tudu List, HiTask, Zoho Planner
Event Management | Conference.com, RegOnline, Event Wax
Project Management | BaseCamp, Project Drive, Zoho Projects, onProject
Web Databases | Zoho Creator / Zoho DB & Reports, MyWebDB, Cebase, QuickBase, Lazybase
Bookmarking | BlinkList, Clipmarks, del.icio.us, Tagseasy
Photo Editing | FotoFlexer, Preloadr, Snipshot
Photo Sharing | dotPhoto, Flickr, Photobucket, Picasa Web Albums
Desktops | ajaxWindows, eyeOS, g.ho.st, YouOS
Web Conferencing | Genesys Meeting Center, IBM Lotus Sametime, Microsoft Office Live Meeting, WebEx, Zoho Meeting
Groupware | Contact Office, Google Sites, Project Spaces, teamspace
Blogs and Wikis | Blogger, TypePad, WordPress, Pbwiki, wikihost.org, Wikispaces, Zoho Wiki


PaaS

Google App Engine
• Google has lots of data centres ....

From: Using Google App Engine

Where is my data??

Google App Engine
• Google keeps buying computers.
• Add or adjust network and computing resources to meet the demand.
• Even Google may not be able to track what is running and where in real time!
• Systems must continue running, as far as possible, so independence of the physical system vital.
• Google has a software framework, an abstraction layer, that hides detail about where data is located or which software is running on which server in which data centre.
• Abstraction layer provides flexibility in terms of dynamic reallocation of resources to meet changing needs and demands.
• Data, software, and computation can “follow the sun”
  • Where people are sleeping, data centres can undertake batch work such as Web crawling, building search indexes, backups, maintenance updates, or supporting load balancing for systems where the sun is.
• Google’s competitive advantage here is cost-efficient scaling

Adapted from: Using Google App Engine
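The “follow the sun” idea can be illustrated with a toy scheduler. This is a sketch under stated assumptions, not a description of Google's scheduling: the site names, whole-hour UTC offsets and the 22:00 to 06:00 “asleep” window are all hypothetical, and daylight saving is ignored.

```python
def local_hour(utc_hour: int, utc_offset: int) -> int:
    """Local hour (0-23) at a site, given the UTC hour and a whole-hour offset."""
    return (utc_hour + utc_offset) % 24


def role_for(utc_hour: int, utc_offset: int, night=(22, 6)) -> str:
    """Return 'batch' while local users are assumed asleep, otherwise 'serving'."""
    start, end = night
    hour = local_hour(utc_hour, utc_offset)
    asleep = hour >= start or hour < end
    return "batch" if asleep else "serving"


# Hypothetical sites with whole-hour UTC offsets.
SITES = {"us-west": -8, "europe": 1, "asia": 8}

for utc_hour in (3, 12, 20):
    print(utc_hour, {name: role_for(utc_hour, off) for name, off in SITES.items()})
```

As the UTC hour advances, the “batch” role moves from site to site, so crawling, indexing and backups land where demand is lowest.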

GAppE: What is it?
• So, if you have a good software framework running in your data centres, why not let others use it?
• Applications run in a sandbox.
• Storage is Google’s Datastore (abstracted away from physical storage)
• Can retrieve URLs, but not open direct network connections
• Applications are monitored – performance can be catered for; bad applications can be stopped.
• At specific levels of use (application popularity), charges apply

Adapted from: Using Google App Engine

GAppE: How does it work?
• Google cloud like a mobile phone network:
  • Programs and data “roam” – your software could be anywhere in the world at any given time.
  • Web requests (calls) find your software, regardless of where your software happens to be running.
  • IP address of your Google application may be different, depending on where you hook in.
• Google’s system determines which data center(s) can run your application or perhaps are already running your application.
  • Proximity, loading, or storage associations (where the data is “at rest” at the moment) may be factors.
  • If your application experiences a sudden spike of traffic in the United Kingdom, Google will likely copy your program and some of your data to one of its data centers there, start your application in that data center, and partition requests.
• Scaling just happens!

Adapted from: Using Google App Engine
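The placement factors mentioned above (proximity, loading, storage association) can be combined in a toy selection function. This is a hypothetical sketch: the `DataCentre` fields, the weights and the example sites are assumptions for illustration and say nothing about how Google's system actually decides.

```python
from dataclasses import dataclass


@dataclass
class DataCentre:
    name: str
    distance_km: float   # proximity to the requesting user
    load: float          # fraction of capacity in use, 0.0-1.0
    has_data: bool       # is the application's data "at rest" here?
    running_app: bool    # is the application already started here?


def score(dc: DataCentre) -> float:
    """Penalty for serving from this site; lower is better. Weights are arbitrary."""
    penalty = dc.distance_km / 1000.0 + 10.0 * dc.load
    if not dc.has_data:
        penalty += 3.0   # data would have to be copied first
    if not dc.running_app:
        penalty += 1.0   # application would have to be started
    return penalty


def choose(candidates):
    """Pick the data centre with the lowest penalty, skipping fully loaded ones."""
    usable = [dc for dc in candidates if dc.load < 1.0]
    if not usable:
        raise RuntimeError("no data centre has spare capacity")
    return min(usable, key=score)


centres = [
    DataCentre("us-east", 5500, 0.30, True, True),
    DataCentre("uk", 50, 0.40, False, False),
    DataCentre("europe", 400, 0.95, True, True),
]
print(choose(centres).name)
```

With these made-up numbers a nearby site wins even though the program and data must first be copied there, which mirrors the United Kingdom traffic-spike example.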

GAppE: How does it work?
• Popular applications may be running in parallel in a number of different data centers.
• Unpopular applications may not be running until requested.
• You don’t know where your application is, or if it is running. Google hides all these details from you. Users don’t really notice.

“Somehow, it should just work”

Adapted from: Using Google App Engine

GAppE: What else does it do?
• When you run on your own web servers (“heavy lifting”):
  • Which O/S, version, patchlevel? (How many?) How patched? (if Windows, reboot??)
  • Antivirus? Firewall? (D)DoS – have to contact Google; may cost you.
  • Which DB, version, patchlevel?
  • DB on same machine as web server? DB across multiple machines? DB as a bottleneck?
  • Support for peak demand? Unexpected demand? (When?)
  • When is an upgrade to any kind of capacity needed?
• Under GAppE, Google’s problems!
  • Of course, costs start to apply (but economics need to be considered anyhow).
  • Can you do it cheaper than Google can? Can another provider …?

IaaS

IaaS
• The cloud competes against two approaches to IT:
  • Internal IT infrastructure and support
    • ownership, no matter where it is located
    • pay staff
    • buy replacements (or cannibalise existing machines)
    • requires Capital Expenditure (CapEx)
  • Outsourcing to managed services
    • rental (fees – Operational Expenditure, OpEx) - someone else owns your servers and keeps them running
    • rental company pays staff
    • managing infrastructure is their problem - replacement depends on the service-level agreement (SLA).

Adapted from Reese, Ch1.
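The CapEx versus OpEx trade-off above can be sketched as a break-even calculation. All monetary figures here are made up for illustration, and the model is deliberately crude: it ignores depreciation, financing and the cost of adding capacity.

```python
def owned_cost(months: int, capex: float, monthly_running: float) -> float:
    """Cumulative cost of owning: upfront purchase plus staff, power, replacements."""
    return capex + monthly_running * months


def rented_cost(months: int, monthly_fee: float) -> float:
    """Cumulative cost of renting: fees only (OpEx)."""
    return monthly_fee * months


def break_even_month(capex, monthly_running, monthly_fee, horizon=120):
    """First month in which owning is no more expensive than renting, or None."""
    for month in range(1, horizon + 1):
        if owned_cost(month, capex, monthly_running) <= rented_cost(month, monthly_fee):
            return month
    return None


# Made-up figures purely for illustration.
print(break_even_month(capex=12_000, monthly_running=300, monthly_fee=800))
print(break_even_month(capex=12_000, monthly_running=900, monthly_fee=800))
```

When running costs exceed the rental fee, as in the second call, ownership never breaks even within the horizon and the function returns `None`.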

Amazon Web Services (AWS)
• EC2: APIs for provisioning, managing, and deprovisioning virtual servers inside the Amazon cloud.
  • Any application anywhere on the Internet can launch a virtual server in the Amazon cloud with a single web services call.
• Several “data centres” (availability zones)
  • Amazon’s EC2 U.S. footprint spans three (or more) data centers on the East Coast of the U.S. and two (or more) in Western Europe.
  • You cannot mix and match U.S. and European environments, though you can run traffic between them.

Adapted from Reese, Ch1.
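To make the “single web services call” concrete, the sketch below builds the query string for the EC2 Query API `RunInstances` action using only the standard library. It is illustrative only: the request is unsigned, so AWS would reject it as written (real calls need authentication such as Signature Version 4, normally handled by an SDK), and the image ID, instance type and helper name are placeholder assumptions.

```python
from urllib.parse import urlencode

EC2_ENDPOINT = "https://ec2.amazonaws.com/"   # generic endpoint; regional ones also exist


def run_instances_url(image_id: str, instance_type: str, count: int = 1) -> str:
    """Build an unsigned EC2 Query API URL for the RunInstances action."""
    params = {
        "Action": "RunInstances",
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        "Version": "2016-11-15",
    }
    return EC2_ENDPOINT + "?" + urlencode(params)


# "ami-00000000" is a placeholder, not a real machine image.
print(run_instances_url("ami-00000000", "m1.small"))
```

The point is the shape of the call: one HTTP request naming an action, an AMI and a count is all that provisioning a server requires from the caller's side.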

Amazon Web Services (AWS)
• Servers run a highly customized version of the Open Source Xen hypervisor using paravirtualization.
  • Provides isolated computing environment for guest servers.
• Guest servers are set up as Amazon machine images (AMI) with an operating system and a set of software; you will need a defined software stack for the functionality that you require.
• Three kinds of storage directly relevant to EC2:
  • Ephemeral storage
  • Block storage (SAN-like) and persistent across time
  • Simple Storage Service (S3) – can be used for cloud-based persistent storage; used for AMI storage – AMI staged to disk from S3 (KVM)
• Competitors may also provide persistent internal storage for nodes to make them operate more like a traditional data center.

Amazon Web Services (AWS)
• EBS and research data – move the analysis to the data....


IaaS
• Amazon Web Services (AWS) – including EC2, S3, ….
• Rackspace
• Flexiscale
• GoGrid
• Joyent
• Terremark vCloud Express
• ElasticHosts
• NewServers

Why buy another machine??

Concluding Remarks

Buyer Beware
• The market is full of supposed “Cloud” providers – but it might not be “Cloud” as you expect it or if you subscribe to NIST/Gartner (etc.) definitions or Open Cloud Manifesto.
• A variety of pricing structures exists.
• Everybody has a faster Cloud than everybody else.
• Exercise for the reader – try and figure out what the Cloud offering is from these providers.
  • Carpathia Hosting Inc: http://www.carpathiahosting.com/
  • Layered Technologies: http://www.layeredtech.com/
  • 3Tera: http://www.3tera.com/
  • Skytap: http://www.skytap.com/

New Clouds env. + ?
• Google Patent Application: “A system includes a floating platform-mounted computer data center comprising a plurality of computing units, a sea-based electrical generator in electrical connection with the plurality of computing units, and one or more sea-water cooling units for providing cooling to the plurality of computing units.”
• An “offshore” data centre – but how far off shore?! (Tax avoidance vs legislative difficulties?)

New Clouds env. + ?
• Some combination of:
  • on the site of a Nuclear Reactor?
  • at the poles to provide free cooling – or out in space to provide “really” cool computers?
  • on oil rigs, using “flare gas” for energy?
  • with less-predictable provisioning out at sea to make use of hydroelectric and wind power?

Environmentalists and legislators will be looking at this for years to come!

Cloud implies various issues
• Green
• SaaS, PaaS, IaaS
• V12n
• Multi-tenant
• Clusters, HPC, HTC, P2P, Grid, Web services
• Load balancing, metering and monitoring, utilization
• Exotic architectures
• Distributed data – ACID? Federation, Replication
• Economics – CapEx and OpEx, billing and TCO
• International Laws – Copyright, DP/Privacy, Licences
• Ethics?
• … “heavy lifting” – gone
• Availability, segmentation, redundancy and backup (and security)
• Different problems emerge
• data-centre centricity (where and which?)

Additional Reading

Reese, G. (2009) “Cloud Application Architectures: Building Applications and Infrastructure in the Cloud”. O'Reilly Media, Inc. ISBN(13): 978-0596156367, 204 pages.

Above the Clouds: http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf

Amazon AWS: http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/

Sun’s Introduction: http://www.sun.com/featured-articles/CloudComputing.pdf

Cloud Computing and the Law: http://www.law.ed.ac.uk/ahrc/gikii/docs3/mowbray.pdf


Shameless Plug

Nikolaos Antonopoulos and Lee Gillam (Eds.) (2010): Cloud Computing: Principles, Systems and Applications. Springer. ISBN: 978-1-84996-240-7. Due: August 2, 2010

Thank You

Dr Lee Gillam [email protected]