Introduction to Business Continuity

Page 1: Introduction to Business Continuity

© 2009 EMC Corporation. All rights reserved.

EMC Proven Professional

The #1 Certification Program in the information storage and management industry

Introduction to Business Continuity

Chapter 11

Section 3 : Business Continuity

Page 2: Introduction to Business Continuity

Chapter Objective

After completing this chapter, you will be able to:

o Define Business Continuity and Information Availability

o Detail impact of information unavailability

o Define BC measurement and terminologies

o Describe BC planning process

o Detail BC technology solutions

Page 3: Introduction to Business Continuity

What is Business Continuity (BC)

o Business Continuity is preparing for, responding to, and recovering from an application outage that adversely affects business operations

o Business Continuity solutions address unavailability and degraded application performance

o Business Continuity is an integrated, enterprise-wide process and set of activities that ensures “information availability”

Page 4: Introduction to Business Continuity

What is Information Availability (IA)

o IA refers to the ability of an infrastructure to function according to business expectations during its specified time of operation

o IA can be defined in terms of three parameters:

  o Reliability: The components delivering the information should be able to function without failure, under stated conditions, for a specified amount of time

  o Accessibility: Information should be accessible at the right place and to the right user

  o Timeliness: Information must be available whenever required

Page 5: Introduction to Business Continuity

Causes of Information Unavailability

Disaster (<1% of Occurrences)

  o Natural or man-made: flood, fire, earthquake, contaminated building

Unplanned Outages (20%)

  o Failure: database corruption, component failure, human error

Planned Outages (80%)

  o Competing workloads: backup, reporting, data warehouse extracts, application and data restore

Page 6: Introduction to Business Continuity

Impact of Downtime

Know the downtime costs (per hour, day, two days...)

Lost Productivity

• Number of employees impacted (x hours out * hourly rate)

Lost Revenue

• Direct loss
• Compensatory payments
• Lost future revenue
• Billing losses
• Investment losses

Damaged Reputation

• Customers
• Suppliers
• Financial markets
• Banks
• Business partners

Financial Performance

• Revenue recognition
• Cash flow
• Lost discounts (A/P)
• Payment guarantees
• Credit rating
• Stock price

Other Expenses

• Temporary employees, equipment rental, overtime costs, extra shipping costs, travel expenses...

Page 7: Introduction to Business Continuity

Impact of Downtime

o Average cost of downtime per hour = average productivity loss per hour + average revenue loss per hour

o Where:

  o Average productivity loss per hour = (total salaries and benefits of all employees per week) / (average number of working hours per week)

  o Average revenue loss per hour = (total revenue of an organization per week) / (average number of hours per week that an organization is open for business)
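The downtime-cost formula above can be sketched in a few lines of Python. The function names and the weekly figures for the sample organization are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def productivity_loss_per_hour(weekly_salaries_and_benefits, working_hours_per_week):
    """Average productivity loss per hour, as defined on the slide."""
    return weekly_salaries_and_benefits / working_hours_per_week


def revenue_loss_per_hour(weekly_revenue, open_hours_per_week):
    """Average revenue loss per hour, as defined on the slide."""
    return weekly_revenue / open_hours_per_week


def downtime_cost_per_hour(weekly_salaries_and_benefits, working_hours_per_week,
                           weekly_revenue, open_hours_per_week):
    """Average cost of downtime per hour = productivity loss + revenue loss."""
    return (productivity_loss_per_hour(weekly_salaries_and_benefits, working_hours_per_week)
            + revenue_loss_per_hour(weekly_revenue, open_hours_per_week))


# Hypothetical organization: 200,000 in weekly salaries and benefits over a
# 40-hour working week, and 840,000 in weekly revenue while open 24x7 (168 hours).
cost = downtime_cost_per_hour(200_000, 40, 840_000, 168)
print(f"Average cost of downtime per hour: {cost:,.2f}")
```

With these sample inputs each term contributes 5,000 per hour, so the average cost of downtime is 10,000 per hour.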

Page 8: Introduction to Business Continuity

Measuring Information Availability

o MTBF (Mean Time Between Failure): Average time available for a system or component to perform its normal operations between failures

o MTTR (Mean Time To Repair): Average time required to repair a failed component

IA = MTBF / (MTBF + MTTR) or IA = uptime / (uptime + downtime)

Figure. Incident timeline between two incidents. Stages: detection, diagnosis, repair, recovery, restoration. Intervals: detection elapsed time, response time, repair time, recovery time.

MTTR – Time to repair or ‘downtime’

MTBF – Time between failures or ‘uptime’
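As a minimal sketch, the availability formula IA = MTBF / (MTBF + MTTR) can be computed as follows. The function name and the sample MTBF and MTTR values are assumptions for illustration only.

```python
def information_availability(mtbf_hours, mttr_hours):
    """Return IA = MTBF / (MTBF + MTTR) as a fraction between 0 and 1."""
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTBF and MTTR must be non-negative")
    if mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF + MTTR must be greater than zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical component: on average 999 hours of uptime between failures
# and 1 hour to repair each failure.
ia = information_availability(999.0, 1.0)
print(f"Information availability: {ia:.3%}")
```

For this hypothetical component the ratio is 999 / 1000, which is 99.9% availability.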

Page 9: Introduction to Business Continuity

Availability Measurement – Levels of ‘9s’ Availability

% Uptime     % Downtime   Downtime per Year   Downtime per Week
98%          2%           7.3 days            3 hrs 22 min
99%          1%           3.65 days           1 hr 41 min
99.8%        0.2%         17 hrs 31 min       20 min 10 sec
99.9%        0.1%         8 hrs 45 min        10 min 5 sec
99.99%       0.01%        52.5 min            1 min
99.999%      0.001%       5.25 min            6 sec
99.9999%     0.0001%      31.5 sec            0.6 sec
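The figures in the table follow directly from the downtime percentage applied to the hours in a year or a week. The sketch below reproduces that arithmetic. It assumes a 365-day year, and the constant and function names are illustrative rather than taken from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)
HOURS_PER_WEEK = 7 * 24    # 168 hours


def downtime_hours(uptime_percent, period_hours):
    """Return the hours of downtime in a period at a given uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100.0 - uptime_percent) / 100.0 * period_hours


for uptime in (98.0, 99.0, 99.8, 99.9, 99.99, 99.999, 99.9999):
    per_year = downtime_hours(uptime, HOURS_PER_YEAR)
    per_week = downtime_hours(uptime, HOURS_PER_WEEK)
    print(f"{uptime}% uptime: {per_year:.4f} h/year, {per_week * 60:.4f} min/week")
```

For example, 99% uptime leaves 1% of 8,760 hours, which is 87.6 hours or 3.65 days of downtime per year.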

Page 10: Introduction to Business Continuity

BC Terminologies

o Disaster recovery

  o Coordinated process of restoring systems, data, and infrastructure required to support ongoing business operations in the event of a disaster

  o Restoring previous copy of data and applying logs to that copy to bring it to a known point of consistency

  o Generally implies use of backup technology

o Disaster restart

  o Process of restarting from disaster using mirrored consistent copies of data and applications

  o Generally implies use of replication technologies

Page 11: Introduction to Business Continuity

BC Terminologies (Cont.)

Recovery Point Objective (RPO)

o Point in time to which systems and data must be recovered after an outage

o Amount of data loss that a business can endure

Recovery Time Objective (RTO)

o Time within which systems, applications, or functions must be recovered after an outage

o Amount of downtime that a business can endure and survive

Figure. Recovery-point objective and recovery-time objective scales, each marked in seconds, minutes, hours, days, and weeks. Technologies shown for RPO: tape backup, periodic replication, asynchronous replication, synchronous replication. Technologies shown for RTO: tape restore, disk restore, manual migration, global cluster.
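One way to read RPO and RTO together is as two limits that a protection option must satisfy at once. The sketch below illustrates that check. The class, the function, and every number in it are hypothetical: actual data-loss windows and restore times depend on the environment and are not given in the slides.

```python
from dataclasses import dataclass


@dataclass
class ProtectionOption:
    name: str
    worst_case_data_loss_hours: float  # assumed: time since the last usable copy
    expected_restore_hours: float      # assumed: time to bring the service back


def meets_objectives(option, rpo_hours, rto_hours):
    """True if the option stays within both the RPO and the RTO."""
    return (option.worst_case_data_loss_hours <= rpo_hours
            and option.expected_restore_hours <= rto_hours)


# Hypothetical options with made-up figures.
nightly_tape = ProtectionOption("nightly tape backup", 24.0, 48.0)
async_replication = ProtectionOption("asynchronous replication", 0.1, 1.0)

# Hypothetical business requirement: lose at most 1 hour of data (RPO)
# and be back in service within 4 hours (RTO).
for option in (nightly_tape, async_replication):
    verdict = "meets" if meets_objectives(option, 1.0, 4.0) else "does not meet"
    print(f"{option.name} {verdict} RPO=1h / RTO=4h")
```

An option has to satisfy both limits: a fast restore does not help if the restored copy is older than the RPO allows.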

Page 12: Introduction to Business Continuity

Business Continuity Planning (BCP) Process

o Identifying the critical business functions

o Collecting data on various business processes within those functions

o Business Impact Analysis (BIA)

o Risk Analysis

o Assessing, prioritizing, mitigating, and managing risk

o Designing and developing contingency plans and disaster recovery plan (DR Plan)

o Testing, training and maintenance

Page 13: Introduction to Business Continuity

Business Continuity (BC) Planning Lifecycle

BC planning must follow a disciplined approach like any other planning process. Organizations today dedicate specialized resources to develop and maintain BC plans. From the conceptualization to the realization of the BC plan, a lifecycle of activities can be defined for the BC process. The BC planning lifecycle includes five stages:

1. Establishing objectives
2. Analyzing
3. Designing and developing
4. Implementing
5. Training, testing, assessing, and maintaining

Page 14: Introduction to Business Continuity

Business Continuity (BC) Planning Lifecycle

Figure. BC planning lifecycle

Page 15: Introduction to Business Continuity

Establishing objectives

o Determine BC requirements.

o Estimate the scope and budget to achieve requirements.

o Select a BC team by considering subject matter experts from all areas of the business, whether internal or external.

o Create BC policies.

Page 16: Introduction to Business Continuity

Analyzing

o Collect information on data profiles, business processes, infrastructure support, dependencies, and frequency of using business infrastructure.

o Identify critical business needs and assign recovery priorities.

o Create a risk analysis for critical areas and mitigation strategies.

o Conduct a Business Impact Analysis (BIA).

o Create a cost and benefit analysis based on the consequences of data unavailability.

o Evaluate options.

Page 17: Introduction to Business Continuity

Designing and developing

o Define the team structure and assign individual roles and responsibilities. For example, different teams are formed for activities such as emergency response, damage assessment, and infrastructure and application recovery.

o Design data protection strategies and develop infrastructure.

o Develop contingency scenarios.

o Develop emergency response procedures.

o Detail recovery and restart procedures.

Page 18: Introduction to Business Continuity

Implementing

o Implement risk management and mitigation procedures that include backup, replication, and management of resources.

o Prepare the disaster recovery sites that can be utilized if a disaster affects the primary data center.

o Implement redundancy for every resource in a data center to avoid single points of failure.

Page 19: Introduction to Business Continuity

Training, testing, assessing, and maintaining

o Train the employees who are responsible for backup and replication of business-critical data on a regular basis or whenever there is a modification in the BC plan.

o Train employees on emergency response procedures when disasters are declared.

o Train the recovery team on recovery procedures based on contingency scenarios.

o Perform damage assessment processes and review recovery plans.

o Test the BC plan regularly to evaluate its performance and identify its limitations.

o Assess the performance reports and identify limitations.

o Update the BC plans and recovery/restart procedures to reflect regular changes within the data center.

Page 20: Introduction to Business Continuity

BC Technology Solutions

o The following are the solutions and supporting technologies that enable business continuity and uninterrupted data availability:

  o Fault-tolerant configuration, to avoid single points of failure

  o Multi-pathing software

  o Backup and replication

    o Backup recovery

    o Local replication

    o Remote replication

Page 21: Introduction to Business Continuity

Implementation of Fault Tolerance

Figure. Fault-tolerant configuration. Labeled components: client, IP network, redundant network, clustered servers with heartbeat connection, redundant ports, redundant paths, FC switches, redundant FC switches, storage arrays, redundant arrays, remote site.

Page 22: Introduction to Business Continuity

Multi-pathing Software

o Configuration of multiple paths increases data availability

o Even with multiple paths, if a path fails, I/O will not reroute unless the system recognizes that it has an alternate path

o Multi-pathing software recognizes and uses alternate I/O paths to data

o Multi-pathing software also provides load balancing

o Load balancing improves I/O performance and data path utilization
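The two roles of multi-pathing software described above, failover to an alternate path and load balancing across paths, can be illustrated with a toy model. This is a simplified sketch and not the behavior of any particular multi-pathing product: the class name, the round-robin policy, and the path labels are all assumptions.

```python
class MultipathDevice:
    """Toy model of a device reachable over several I/O paths."""

    def __init__(self, paths):
        if not paths:
            raise ValueError("at least one path is required")
        self._paths = list(paths)
        self._failed = set()
        self._next = 0  # index of the next path to try (round-robin)

    def mark_failed(self, path):
        """Record that a path is unusable so I/O is rerouted around it."""
        self._failed.add(path)

    def mark_restored(self, path):
        """Return a repaired path to the rotation."""
        self._failed.discard(path)

    def select_path(self):
        """Pick the next healthy path in round-robin order."""
        for _ in range(len(self._paths)):
            path = self._paths[self._next]
            self._next = (self._next + 1) % len(self._paths)
            if path not in self._failed:
                return path
        raise RuntimeError("no healthy path to the device")


# Hypothetical path labels: two host bus adapters wired to two array ports.
device = MultipathDevice(["hba0:portA", "hba1:portB"])
print([device.select_path() for _ in range(4)])  # alternates between both paths
device.mark_failed("hba0:portA")
print([device.select_path() for _ in range(2)])  # only the surviving path is used
```

Round-robin is used here only because it is the simplest policy to show. Real multi-pathing software may weigh paths by queue depth, latency, or other criteria.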

Page 23: Introduction to Business Continuity

Backup and Replication

o Local Replication

  o Data from the production devices is copied to replica devices within the same array

  o The replicas can then be used for restore operations in the event of data corruption or other events

o Remote Replication

  o Data from the production devices is copied to replica devices on a remote array

  o In the event of a failure, applications can continue to run from the target device

o Backup/Restore

  o Backup to tape has been a predominant method to ensure business continuity

  o Frequency of backup depends on RPO/RTO requirements

Page 24: Introduction to Business Continuity

Chapter Summary

Key points covered in this chapter:

o Importance of Business Continuity

o Types of outages and their impact to businesses

o Information availability measurements

o Definitions of disaster recovery and restart, RPO and RTO

o Business Continuity technology solutions overview

Page 25: Introduction to Business Continuity

Check Your Knowledge

o Which concerns do business continuity solutions address?

o “Availability is expressed in terms of 9s.” Explain the relevance of the use of 9s for availability, using examples.

o What is the difference between RPO and RTO?

o What is the difference between Disaster Recovery and Disaster Restart?

o Provide examples of planned and unplanned downtime in the context of storage infrastructure operations.

o What are some of the Single Points of Failure in a typical data center environment?

Page 26: Introduction to Business Continuity

#1 IT company

For more information visit http://education.EMC.com