Cloudy Operations - OSCON 2010

Post on 08-May-2015

OSCON 2010 Cloud Summit presentation. How to operate in a cloudy world.

Transcript of Cloudy Operations - OSCON 2010

Copyright © 2010 Opscode, Inc - All Rights Reserved

John Willis
VP of Services
john@opscode.com
twitter.com/botchagalupe

Thursday, July 29, 2010

IT Management Podcast

DevopsCafe

CloudCafe

Managed hosting

Virtualization

Private Public

SaaS

PaaS PaaS

IaaS IaaS

If you want to talk clouds, pick one first.

Slide courtesy Alistair Croll - alistair@rednod.com

Infrastructure as a Service (IaaS)

Amazon EC2, Rackspace Cloud, Terremark, GoGrid, Joyent (and nearly every private cloud built on XenServer or VMware).

Slide courtesy Alistair Croll - alistair@rednod.com

Cloudy Operations

Pixie Dust!

Did They Lie?

I did not have “cloudy” relations with that provider

Infrastructure is Hard!

Fully Automated Infrastructure

[Diagram. Components shown: Release Control, Orchestration, Dispatcher, Provisioning, Deploy, Config management, OS boot/install, Artifact repository, Build, CI Server, Issue tracker, SCM Repository, Model, Asset inventory, Host naming, Identity, CMDB, Monitoring, Events, Trending, Reporting, Workflows, Resources, Topology, Configuration, Code, Sources, Scheduler]

Network Operations, Systems Administrators, Software Developers, Database Administrators, Storage Management, Project Management, Change Management, Continuity Planning, Risk Management, Web Design, Performance, Compliance, Architecture, Tooling, Testing, Security, Reporting, Facilities

(* in no particular order)

SNAFU

Cloudy Monitoring

Performance

Log

Alerts

Event

Correlation

Capacity

Analytics

Nagios

Collectd

jcollectd

Ganglia

Zenoss

JMX

OpenNMS

Munin

Cloudy Provisioning

Provisioning

Configuration

Systems Integration

Nodes

opslb01

opsws01

opsws02

opsdm01

opsds01

opsds02

Provisioning

Roles

loadbalancer

webserver

dbmaster

dbslave

Configuration Management

Load Balancer

Web Server, Web Server

DB Master, DB Slave, DB Slave

Disk, Disk, Disk

Recipes

haproxy

apache2

mysql

Systems Integration

name "webserver"
description "Systems that serve HTTP traffic"

run_list(
  "role[base]",
  "recipe[apache2]",
  "recipe[apache2::mod_ssl]"
)

default_attributes(
  "apache" => {
    "listen_ports" => [ "80", "443" ]
  }
)

override_attributes(
  "apache" => {
    "max_children" => "50"
  }
)

Role Based Configuration
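The webserver role sets both default and override attributes. As a rough illustration of why the distinction matters, here is a plain-Ruby sketch of a simplified precedence order (default, then node-level "normal", then override). It does not use Chef itself, real Chef has more precedence levels than these three, and the helper names `deep_merge` and `effective_attributes` as well as the node-level values are assumptions made up for the example.

```ruby
# Simplified sketch of attribute precedence: default < normal < override.
# Real Chef has additional levels; this only illustrates the idea.

# Recursively merge two hashes; on conflict the value from `high` wins.
def deep_merge(low, high)
  low.merge(high) do |_key, low_val, high_val|
    if low_val.is_a?(Hash) && high_val.is_a?(Hash)
      deep_merge(low_val, high_val)
    else
      high_val
    end
  end
end

# Combine the three levels, lowest precedence first.
def effective_attributes(default:, normal: {}, override: {})
  deep_merge(deep_merge(default, normal), override)
end

# Values taken from the webserver role on the slide.
role_default  = { "apache" => { "listen_ports" => ["80", "443"] } }
role_override = { "apache" => { "max_children" => "50" } }

# Hypothetical node-level settings, invented for this example.
node_normal = { "apache" => { "max_children" => "10", "listen_ports" => ["8080"] } }

result = effective_attributes(default: role_default,
                              normal: node_normal,
                              override: role_override)
puts result.inspect
```

In this sketch the node's own `listen_ports` replaces the role default, while the role's override for `max_children` wins over the node's value.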


Load Balancer Example


Devops

• Culture

• Automation

• Measurement

• Sharing

What Do Developers Need?

For Developers...

• Self Service Operations

• The infrastructure is the application (and vice versa)

• Minimize Bottlenecks

• The “Right” Tools

What Does Operations Need?

Operations

• Say “Yes”.

• You never liked rack and stack that much anyway.

• You have never been more critical.

• Just get out of the way.

http://covers.oreilly.com/images/9780596007836/lrg.jpg

Lean into it appears courtesy of Cliff Moon, of Dynomite fame: http://twitter.com/moonpolysoft

Agile Infrastructure

Development

• Team focus

• IDE/Workbench

• Agile methodology

• Source Control

Operations

• Individual focus

• Script/vi based

• Source control?

• Waterfall

Infrastructure as Code

http://www.flickr.com/photos/wonderlane/2306082998/

Infrastructure as Code is...

A technical domain revolving around building and managing infrastructure programmatically

http://www.flickr.com/photos/kwerfeldein/2634561264/sizes/o/

Enable the reconstruction of the business from nothing but a source code repository, an application data backup, and bare metal resources.


A Tornado Hits Your Data

• Pause your movie

• Sign into your cloud provider

• Upload your offsite backups

• Provision, config and integrate the new servers

• Change DNS to point to “Hit by Tornado” page

• Restore the customer and application data

• Remove the “Hit by Tornado” page

• Unpause movie

http://www.flickr.com/photos/gi/518613153/sizes/o/

Chapter 5, Infrastructure as Code - Adam Jacob
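The recovery steps above only work as a calm checklist if each one is scripted. A minimal plain-Ruby sketch of that idea follows: an ordered runbook that stops at the first failed step. The `Step` struct, `run_runbook`, and `recovery_steps` are hypothetical names, and every action is a stand-in that simply returns true; nothing here calls a real cloud provider, DNS, or Chef API.

```ruby
# Hypothetical sketch: the recovery checklist as an ordered, scripted runbook.
# Each step pairs a name with a callable. The callables are stand-ins that
# only report success; they do not touch any real provider or tool.

Step = Struct.new(:name, :action)

# Run steps in order and record the outcome of each one.
# Stops at the first failure so later steps never run on a broken base.
def run_runbook(steps, log = [])
  steps.each do |step|
    ok = step.action.call
    log << [step.name, ok ? :done : :failed]
    break unless ok
  end
  log
end

# The technical steps from the slide, in the order given there.
def recovery_steps
  [
    Step.new("sign in to cloud provider",                   -> { true }),
    Step.new("upload offsite backups",                      -> { true }),
    Step.new("provision, configure, integrate new servers", -> { true }),
    Step.new("point DNS at the outage page",                -> { true }),
    Step.new("restore customer and application data",       -> { true }),
    Step.new("remove the outage page",                      -> { true })
  ]
end

run_runbook(recovery_steps).each { |name, status| puts "#{status}: #{name}" }
```

Halting on the first failure is a design choice for the sketch: restoring data onto servers that failed to provision would only hide the real problem.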

In a Cloudy World Your Prime Constraint Should Be

The time it takes to restore your application data

http://www.flickr.com/photos/visualage/2126833132/sizes/o/

Infrastructure as Code


Recipes

• Apply resources in the order they are specified.

• Can include other recipes.

• Written in a Ruby DSL.

http://www.flickr.com/photos/roadsidepictures/2478953342/sizes/o/
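To make the first two properties concrete, here is a toy model in plain Ruby: a recipe is a block evaluated in a small DSL that collects resources in declaration order, and `include_recipe` pulls another recipe's resources in at that point. This is a sketch, not Chef's implementation. `ToyRunContext`, `toy_recipes`, and `compile` are invented names, the include-once behaviour is a choice made for this sketch, and the `mod_ssl` package name is only a placeholder.

```ruby
# Toy model of a recipe: a block evaluated in a small DSL that collects
# resources in the order they are declared. All names are illustrative
# and are not Chef's internals.

class ToyRunContext
  attr_reader :resources

  def initialize(recipes)
    @recipes = recipes   # recipe name => block
    @resources = []      # ordered list of [type, name]
    @included = []       # recipes already pulled in
  end

  def package(name)
    @resources << [:package, name]
  end

  def service(name)
    @resources << [:service, name]
  end

  # Evaluate another recipe at this point; each recipe is included once.
  def include_recipe(name)
    return if @included.include?(name)
    @included << name
    instance_eval(&@recipes.fetch(name))
  end
end

# Two hypothetical recipes; the second includes the first.
def toy_recipes
  {
    "apache2" => proc {
      package "apache2"
      service "apache2"
    },
    "apache2::mod_ssl" => proc {
      include_recipe "apache2"
      package "mod_ssl" # placeholder package name
    }
  }
end

# Build the ordered resource list for one recipe.
def compile(recipe_name)
  ctx = ToyRunContext.new(toy_recipes)
  ctx.include_recipe(recipe_name)
  ctx.resources
end

compile("apache2::mod_ssl").each { |type, name| puts "#{type}[#{name}]" }
```

Because the included recipe is evaluated where `include_recipe` appears, its resources land ahead of anything declared after that line.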

Tale of Two Startups

[Chart, from http://radar.oreilly.com/archives/2007/10/operations-advantage.html. Two panels: “Traditional” Operations and Operations - The “Secret Sauce”. Axes: # of Hours, Servers, Week #. Legend: Upkeep, Config, OS Install, Hardware; Existing, New.]

This is the secret of Cloud Computing. Every other virtue stems from here.

A Period of Combinatorial Innovation

• Abstract and fault tolerant components

• Integrated network accessible services

• Unlimited infrastructure

Industry Shifts

Be bold - and mighty forces will come to your aid

Basil King