Making your Business Unstoppable
Angela Osorio
HPS Solution Manager
Evolution of Business Continuity
Decades compared: ’80s, ’90s, ’00s
Decision: Optional; Mandatory
Recovery: Hardware; Hardware, Data; Hardware, Data, Applications
Expectation: Days/Hours; Minutes/Seconds; Minutes/Seconds
Magnified by: Disaster; Absence of “Bricks & Mortar”; Dependence on Computers
Driven by: Regulation; e-Commerce; Competition
Requirements: Restore, Recover; High Availability; 24 x 7, Scalable
Business Focus: Traditional; Dot.com; e-Business
Changing Concept of Business Continuity
Availability, Accessibility, Quality
Drivers of Data and Information Flow
[Diagram: Yesterday versus Today. Labels: Company, Customer, ASP, MFG, DIST, ISP, SSP, SC, Credit]
Risks to information availability
The Failure Event Spectrum
Component
Administrative Intervention
Building Level Incident
Metropolitan Area Event
Regional Event
Global Event
Source: Gartner Group
Causes of DOWNTIME
1 Planned maintenance
2 Application failure
3 Operator error
4 Operating system failure
5 Hardware failure
6 Power outage
7 Natural disaster
Source: Contingency Planning Research, 2000
Financial cost of downtime is relative to who feels the pain
Industry         Application             Average cost per hour of downtime (US$)
Financial        Brokerage operations    $7,840,000
Financial        Credit card sales       $3,160,000
Media            Pay-per-view            $183,000
Retail           Home shopping (TV)      $137,000
Retail           Catalog sales           $109,000
Transportation   Airline reservations    $108,000
Entertainment    Tele-ticket sales       $83,000
Shipping         Package shipping        $34,000
Financial        ATM fees                $18,000
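The hourly figures listed above can be turned into a rough estimate for a single outage. The sketch below is illustrative only: it assumes the cost accrues linearly with time, which the source does not state, and the function name outage_cost is hypothetical.

```python
# Average cost per hour of downtime (US$), copied from the figures above.
COST_PER_HOUR = {
    "Brokerage operations": 7_840_000,
    "Credit card sales": 3_160_000,
    "Pay-per-view": 183_000,
    "Home shopping (TV)": 137_000,
    "Catalog sales": 109_000,
    "Airline reservations": 108_000,
    "Tele-ticket sales": 83_000,
    "Package shipping": 34_000,
    "ATM fees": 18_000,
}


def outage_cost(application: str, minutes_down: float) -> float:
    """Estimate the cost of one outage, assuming cost accrues linearly."""
    if minutes_down < 0:
        raise ValueError("minutes_down must be non-negative")
    return COST_PER_HOUR[application] * minutes_down / 60.0


if __name__ == "__main__":
    for app in ("Brokerage operations", "ATM fees"):
        cost = outage_cost(app, 30)
        print(f"{app}: 30 minutes down costs about ${cost:,.0f}")
```

Even a half-hour outage lands in very different places depending on the application, which is the point of the slide: the pain is relative to who feels it.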
Disasters are defined by you
Which systems are critical to your business?
– Those which are customer facing are usually more important
What happens if data becomes unavailable?
– Is it merely inconvenient or aggravating?
– Is it life or death?
One person’s inconvenience may be another’s disaster
More disastrous results
Loss of customer service satisfaction
Cost and time of rebuilding lost data
Possible fines and penalties imposed by regulatory agencies
Idle time of employees
Fines and penalties imposed for not meeting contracted delivery times or SLAs
Movement of your customers to your competitor
High Availability and Disaster Tolerance
Disaster Tolerance tends to be:
– Data-centric
– Data integrity-focused
– Geographical
– Recovery point focused
– Longer time horizon
High Availability tends to be:
– Transaction-centric
– Transaction integrity-focused
– Local
– Recovery time focused
– Very short time horizon
Protect your business…
Protect your information
The stakes are high!
“Nearly half the companies that lose their data through disaster, never re-open, and 90% are out of business within two years.”
Source: University of Texas Center for Research on Information Systems
Site goes down
Shares down 30 pts.
$4B in stock value lost
Anticipated problems driving need for High Availability and Disaster Tolerance
What types of problems does/will your plan anticipate?
Percent of respondents, three data series per item in source order:
Network failure: 86.9%, 89.5%, 87.6%
Hardware component failure: 84.8%, 78.9%, 89.3%
Natural disasters: 84.4%, 77.9%, 90.9%
Operating system fault/failure: 77.6%, 77.9%, 76%
Software viruses: 75.5%, 83.2%, 69.4%
Application failure: 70.9%, 71.6%, 69.4%
Malicious physical and computer security breaches (external): 68.4%, 67.4%, 68.6%
Malicious physical and computer security breaches (internal): 59.1%, 56.8%, 59.5%
Acts of man (war, terrorism, etc.): 57.8%, 56.8%, 60.3%
Service provider failure: 56.1%, 60%, 53.7%
Accidental employee-initiated outages: 55.3%, 47.4%, 61.2%
Attack on company Web site: 53.6%, 56.8%, 52.9%
Legend: Under $20M in Revenue, Over $20M in Revenue
CIO Insight study on Disaster Recovery – November 2001
Events that actually forced companies to declare a disaster
Power Outage
Hardware Failure
Fire
Flood
Earthquake
Hurricane
Software Error
Bombing
Snow/Wind Storm
Network Failure
Contamination
Burst Pipe
Forced Evacuation
HVAC Failure
Delayed Relocation
Riot
DR Testing went wrong
Source: Disaster Recovery Journal
High Availability & Disaster Tolerance
It’s about data and keeping it available
What Is Your Specific Situation?
Questions to ask yourself
– What is your business?
– What is your application?
– What is your environment (flood zone, earthquake)?
– What risks are you willing to take?
– What’s happened in the past?
– What if your critical systems were lost?
Evaluating RPO and RTO
Recovery point objective
– How fresh is your data?
Not all data needs to be recovered to the same point
The quicker your required recovery time and the more thorough and accurate your recovery point, the more robust a solution is required
Recovery time objective
– How soon after an event do you need to be running?
Not all applications need to come up at the same time
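The recovery point and recovery time objectives above can be checked against what a given recovery technique delivers. The following is a minimal sketch: the class names, the two techniques, and every number in it are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    """Recovery targets for one application, in minutes."""
    rpo_minutes: float  # how much recent data may be lost
    rto_minutes: float  # how soon the application must be running again


@dataclass(frozen=True)
class Technique:
    """What a recovery technique is assumed to deliver, in minutes."""
    name: str
    data_loss_minutes: float
    restart_minutes: float


def meets(objective: Objective, technique: Technique) -> bool:
    """True when the technique stays within both the RPO and the RTO."""
    return (technique.data_loss_minutes <= objective.rpo_minutes
            and technique.restart_minutes <= objective.rto_minutes)


# Hypothetical figures, not taken from the presentation.
NIGHTLY_TAPE = Technique("nightly tape, stored off site", 24 * 60, 48 * 60)
REMOTE_MIRROR = Technique("synchronous remote mirror", 0, 15)

PAYROLL = Objective(rpo_minutes=24 * 60, rto_minutes=72 * 60)
ORDER_ENTRY = Objective(rpo_minutes=5, rto_minutes=30)

if __name__ == "__main__":
    for label, objective in (("payroll", PAYROLL), ("order entry", ORDER_ENTRY)):
        for technique in (NIGHTLY_TAPE, REMOTE_MIRROR):
            verdict = "meets" if meets(objective, technique) else "misses"
            print(f"{technique.name} {verdict} the {label} objectives")
```

Under these assumed numbers the tape rotation is enough for payroll but not for order entry, which illustrates why not all data and not all applications need the same solution.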
Rules Of Thumb
[Chart scale: More Forgiving to Less Forgiving]
Disaster Tolerance Methodology
Backup and drive tape across town
Campus-Wide Clusters
Environment
Emergency 911
Telecommunications
Defense
Financial Transactions
Healthcare
eCommerce
Accounting
Data Warehousing
Payroll
Discrete Mfg
Tech Pubs
High Availability & Disaster Tolerant responses are a balance of three aspects
Technology
Services
Procedures and discipline
Find the balance of three aspects
Techniques to eliminate system downtime
Data Protection
Remote Log Shipping
Data Replication Manager
Campus-Wide Clusters
Reliable Transaction Router
Custom Systems
Technology and Services
– Insurance
– Assets Recovery
– Cold-site, Mobile recovery
– Stand-Alone systems
– Business Protection Service
– Distributed & Networked systems
– Disaster recovery hot-site
– Redundancy, Hot Swap components, RAID
– Availability clusters
– Data mirroring, SMART
– Dual host/redundancy
– Shared Data clusters
– FDDI, ATM switching
Procedures & Discipline
– Plan
– Question
– Exercise
– Document procedures
– Eliminate single points of failure
– Rolling Upgrades
– Provide shared, direct access to storage
– Minimize environmental risks
– Practice!
Nominal Justifiable Cost of Plan
Does cost of recovery exceed the losses?
[Chart: Money against Time to recover, with COST and LOSS curves. Marked: Maximum cost of plan, Acceptable downtime]
Evaluate Alternatives
Does your plan make financial sense?
[Chart: Plan I, Plan II, Plan III, Plan IV compared by Cost and Loss reduction (savings). Marked: Maximum cost of plan, Acceptable downtime]
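The two questions above (does the cost of recovery exceed the losses, and does the plan make financial sense) can be sketched as a simple comparison. This is only one possible reading of the charts: it assumes loss grows linearly with downtime, and every figure, along with the names Plan, expected_loss, and justifiable, is hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Plan:
    name: str
    annual_cost: float     # assumed yearly cost of the plan (US$)
    recovery_hours: float  # assumed time to recover under the plan


def expected_loss(recovery_hours: float, loss_per_hour: float,
                  outages_per_year: float) -> float:
    """Yearly loss, assuming loss accrues linearly while systems are down."""
    return recovery_hours * loss_per_hour * outages_per_year


def justifiable(plan: Plan, baseline_hours: float, loss_per_hour: float,
                outages_per_year: float, acceptable_hours: float) -> bool:
    """A plan is justifiable if it recovers within the acceptable downtime
    and costs no more than the loss it avoids (its savings)."""
    savings = (expected_loss(baseline_hours, loss_per_hour, outages_per_year)
               - expected_loss(plan.recovery_hours, loss_per_hour, outages_per_year))
    return plan.recovery_hours <= acceptable_hours and plan.annual_cost <= savings


# Hypothetical inputs, for illustration only.
PLANS: List[Plan] = [
    Plan("Plan I", 50_000, 24),
    Plan("Plan II", 200_000, 8),
    Plan("Plan III", 600_000, 1),
    Plan("Plan IV", 5_500_000, 0.25),
]
candidates = [p for p in PLANS
              if justifiable(p, baseline_hours=48, loss_per_hour=100_000,
                             outages_per_year=1, acceptable_hours=8)]
cheapest = min(candidates, key=lambda p: p.annual_cost)

if __name__ == "__main__":
    print("Justifiable:", [p.name for p in candidates])
    print("Cheapest justifiable:", cheapest.name)
```

With these assumed numbers, Plan I recovers too slowly and Plan IV costs more than it saves, so the choice falls between the two plans in the middle.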
E-business… putting all of your “eggs-in-a-basket”
[Chart axes: Dependency on Technology, Risk Level]
Tools to Make Your Business Unstoppable
Preventing a Disaster
You Need:
– copy of applications
– copy of application data
  current: no, or predictable degree of, data loss
  consistent: write ordering across related replicas
– systems to restart and run applications
– reestablished client communications
Spectrum of recovery techniques
– trade off cost, recovery time, data currency
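The "current" and "consistent" requirements above can be illustrated with a small check on a replica. The sketch assumes the primary numbers its writes 1, 2, 3, and so on in the order they were issued; that numbering scheme and the function names are assumptions made for illustration, not a description of any product named in this deck.

```python
from typing import Iterable


def consistent_prefix_length(applied: Iterable[int]) -> int:
    """Largest k such that the replica holds every write 1..k."""
    seen = set(applied)
    k = 0
    while k + 1 in seen:
        k += 1
    return k


def is_write_order_consistent(applied: Iterable[int]) -> bool:
    """Consistent means the replica holds writes 1..k with no gaps,
    so no later write is present without the writes that preceded it."""
    seen = set(applied)
    return len(seen) == consistent_prefix_length(seen)


def writes_lost(total_written: int, applied: Iterable[int]) -> int:
    """Currency: how many primary writes fall outside the usable prefix."""
    return total_written - consistent_prefix_length(applied)


if __name__ == "__main__":
    # The primary issued 5 writes; the replica received the first three.
    print(is_write_order_consistent([1, 2, 3]), writes_lost(5, [1, 2, 3]))
    # Write 4 arrived without write 3: the replica is not consistent.
    print(is_write_order_consistent([1, 2, 4]), writes_lost(5, [1, 2, 4]))
```

A replica that is behind but gap-free gives a predictable degree of data loss; a replica with a gap may hold data that depends on a write it never received.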
AVAILABILITY… open all night long
Making online healthy and beautiful
“High availability is as important to eCommerce as breathing is to humans. Our Compaq servers stay highly available to customers, giving us an advantage for eCommerce.”
Kal Raman
Chief Information Officer
Drugstore.com, Inc.
SECURITY… solving a devilish problem
“At the Vatican... security was our first criterion in choosing a partner; our second critical factor was availability; another was high performance.”
Stefano Pasquini
IT Planner
Internet Office of the Holy See
God knows what else you need…
Professional Services
Business Continuity Methodologies
[Chart scale: Asynchronous to Synchronous]
Technology
Simple Backup & Remote Storage Site
Campus-Wide Clusters
Remote Log Shipping
SANworks Data Replication Manager
Data Protection Technologies
Reliable Transaction Router
Application
Emergency 911
Telecommunications
Defense
Financial Transactions
Healthcare
eCommerce
Accounting
Data Warehousing
Payroll
Discrete Mfg
Tech Pubs