Business Continuity and Disaster Recovery Strategies
October 2, 2002
Paul DiGiacomo, Product Management Director
AT&T Business - Managed Services
AT&T Ultravailable® And Recovery [email protected], 973-644-6340
2
Topics
Key Drivers
AT&T BC/DR Best Practices Governance and Execution Process
AT&T BC/DR Best Practices Architecture
AT&T BC/DR Internal and Client Experiences
Summary
3
CxO Drivers Today
Customer / Market Trends
- eBusiness
- 24x7, Always On
- Globalization
Technology Trends
- Improved Network Intelligence
- Exponential Data Growth
- Emerging Protocols
- Application Management
Financial Trends
- Cost Reduction
- TCO / ROI Focus
- CapEx Reduction
- Managed Services
Organizational Trends
- Productivity Focus
- Scarce Qualified Resources
- Internet Time
- Consolidation
Risk Trends
- Broader Threats
- Greater Risk
- Greater Exposure
Stakeholder Trends
- Volatile Markets
- Officer Liability
- HIPAA
- OHS / SEC / Comptroller
4
Industry Trends Shaping Customer Needs
Trends: Customer / Market Trends, Technology Trends, Structural / Organizational Trends, Regulatory / Shareholder Trends, Risk Trends
Business and technology leaders today are facing highly demanding performance requirements:
Availability
Scalability
Reliability
Accuracy
Integrity
Quality
Accessibility
Security
Continuity
Performance
Recoverability
Predictability
5
AT&T BC/DR Best Practices
Acknowledge Importance of BC/DR
Create a BC/DR Governance Structure
Develop & Deploy Standards & Policies
Monitor Progress of BC/DR Program
Assess Threats, Vulnerabilities, Risks & Exposures
Simulate / Test / Exercise
Respond, Repair, & Recover
Develop & Deploy BC/DR Plans & Assets
Assess & Prioritize Key Areas of the Business
Transfer Risk
Mitigate Risk Based on Business Case
Accept Risk
Monitor & Manage Events
7
BC/DR: Key Drivers, Triggers, Enablers
Culture and Heritage of Quality and Reliability Planning
Audit or Risk Committee Findings / Concerns
Reliability Differentiator in the Marketplace
Legal and Regulatory Requirements
Due Diligence / Insurability
Competitive Benchmarks
Recent Outage(s) or Disaster(s)
Stakeholder Expectations
Data Protection Requirements
Process Availability Requirements
Data Center Migration / Consolidation
9
An Exemplary Approach to Governance
[Organization chart; box labels as extracted]
Chairman / CEO, Board of Directors, Audit Committee
BC/DR Steering Committee
BC/DR Council
Corporate Support Team
Site Incident Mgt Teams: Site 1, Site 2, … Site n
Other boxes: Corp. Officer (& BC Champion), Corp. Officer (x2), Senior Officer (x3), BC/DR Officer, Public Relations, Legal, Real Estate, IT Operations (x2), Network Operations (x2), Business Units, Business Unit (x3), HR, Security, Finance, Application Development
11
BC/DR Standards and Policies
Certification and Assurance Standards
Incident Management Process Standards
Response Planning Standards
Risk Management Standards
Disaster Recovery Planning Standards
– Integrated Planning Process Standard
– Classifications Standard
– Funding Standard
– Component DR Plan Content Standard
– Data Backup / Offsite Storage Standard
– Disaster Recovery Planning Tool Standard
– Security Standard
– Levels of Service Standard
– Plan Exercise (Test) Standard
– Approval Standard
– Plan Maintenance and Change Control Standard
– Training and Awareness Standard
– Risk Acceptance for Non-Compliance Standard
– Plan Distribution Standard
– Exceptions Standard
– Work Center DR Plan Standard
– Application DR Plan Standard
– Data Retrieval DR Plan Standard
– Network DR Plan Standard
– AT&T Core Network DR Plan Standard
– AT&T Internal Data Network DR Plan Standard
– Platform DR Plan Standard
– Recovery Management DR Plan Standard
– DR Integrated Planning Process Flow
– DR Teams Roles and Responsibilities
– Component DR Plan Content
13
Potential Key Processes / Functions
Leadership / Strategy
Customer Care / Support
Production / Operations
Research & New Product Development
Supplier Management
Marketing & Sales
Billing / Collections
Support Functions: HR / Legal / Finance / IR / PR
Communications & IT
14
Example Impact Assessment Questions
Process Description: Primary functions, responsibilities, and accountabilities
Regulatory Reporting: Types of reports and frequency
Operational Impacts: Impact (Service Level Agreements) and relative importance
Financial Impacts: Lost revenue and other financial impacts
Technology Resources: Communications and applications
Work Inflows / Outflows: Internal and external process inputs / outputs
Outage Tolerance: How long could your Process be completely idled?
Impact Profiles by Time: Impact based on: monthly, weekly, daily and hourly
Work Backlogs: Backlog, normal and seasonal
Special Requirements: Any one-of-a-kind items required to conduct business
Backups: Frequency of and access to backups
Work Around Procedures: Are there work around procedures? How good are they?
Workload Shifting: What percentage of workload can be shifted to vendors, and for how long?
Disruption Experience: History and type of process disruptions
Process Vulnerability: Vulnerability of your Process to a prolonged disruption or outage
Restoration Complexity: How difficult is it to recover to an acceptable level after a disruption?
Recovery Time Objective: What is the optimal Recovery Time Objective (RTO) for your Process?
15
AT&T BC/DR Best Practices
Reduce the Threat, Vulnerability, Risk or Exposure
16
Best Practices Risk Assessment Approach
Information Technology: Server and client platforms, LAN, MAN, WAN, data security, data management, applications, voice, storage, Disaster Recovery plans…
Facility Security: Perimeter security, entrances / exits, loading dock, security cameras, alarm methods, remote monitoring, guard staffing, interior security systems…
Information Security: Servers, routers, firewalls, applications, intrusion detection systems, security policies and standards, organization structure, training and policy deployment…
Infrastructure / Environmentals: Physical building structure, location, HVAC, water / plumbing, environment inspection…
Facilities Safety: Fire exits, emergency procedures, fire suppression equipment (extinguishers, Halon, FM-200, dry & charged sprinklers), emergency lighting, test schedules…
Power: Grounding, distribution, switching, dual grids, UPS installation, maintenance, capacity, load testing, generator installation, maintenance, DC battery plant…
Other: Organizational structure, training & education, customer / supplier contracts…
[Diagram labels: Current Risk Mitigation Investments; Plan A; Plan B]
17
Typical Client Scenario
Centralized Data Center / Work Center / Call Center
Single Location for Mission Critical Data
Single Location for Mission Critical Computing
Single Location for Mission Critical Applications
Single Points of Failure for Network Access
Unacceptable Concentration of Risk
The Enterprise MAY NOT Survive a Disaster
18
Costs of Outages & Disasters
Quantitative Direct
- Revenue Impact
- Opportunity Cost
- Penalty Clauses
- Fines
- …
+ Quantitative Indirect
- Market Share Loss
- Customer Share Loss
- Litigation
- ...
+ Strategic
- Potential for Total Business Failure
- Brand Equity Loss
- Market Cap Loss
- ...
+ Physical Loss
- IT/Network Assets
- Lost Data
- Buildings, Vehicles, Furniture
- …
+ Recovery & Restoration
- Relief & Recovery Operations
- Interim Operations
- Replacement
- ...
= Total Business Impact
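The additive structure of this slide can be sketched in a few lines of Python. This is only an illustration, not a costing model from the presentation: the class name, field names, and every dollar figure below are hypothetical placeholders.

```python
from dataclasses import dataclass, fields


@dataclass
class OutageCosts:
    """The five cost categories named on the slide (all amounts hypothetical)."""
    quantitative_direct: float       # revenue impact, opportunity cost, penalties, fines
    quantitative_indirect: float     # market share loss, customer share loss, litigation
    strategic: float                 # brand equity loss, market cap loss
    physical_loss: float             # IT/network assets, lost data, buildings
    recovery_and_restoration: float  # relief, interim operations, replacement

    def total_business_impact(self) -> float:
        # Total Business Impact is the sum of all five categories.
        return sum(getattr(self, f.name) for f in fields(self))


# Hypothetical figures, chosen only to exercise the arithmetic.
example = OutageCosts(
    quantitative_direct=250_000.0,
    quantitative_indirect=100_000.0,
    strategic=400_000.0,
    physical_loss=150_000.0,
    recovery_and_restoration=75_000.0,
)
print(f"Total business impact: ${example.total_business_impact():,.0f}")
```

Keeping the categories as separate fields makes it easy to see which one dominates before deciding where mitigation spending is justified.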
19
Costs of Outages & Disasters are Time and Severity Dependent
[Chart: Dollar Cost of Outage ($0, $1K, $1M, $1B) versus outage duration (seconds, minutes, hours, days, weeks, months, years). Labels: Time Dependent Costs (Shorter to Longer); Severity Dependent Costs (Smaller to Larger).]
21
Disaster Recovery Approach
Typical Approach Elements:
– Off-site data vaulting
– Shared IT Resources
– Permanent Primary Site, Shared Subscription to Temporary Recovery Site
Some Down Time
Some Data Loss
Lower Cost
Best Where Investment in Duplication Would Exceed Importance of Process / Service / Asset
Not Network-Centric
System Disaster Recovery: Deferrable Workload Strategy
[Diagram labels: Production Sites (Production Site 1, Production Site 2, Production Site 3); Primary Recovery Site(s); Secondary Recovery Site (Deferrable Workload); Vendor Location (x2); Critical Business Apps; Test / Dev / Deferrable Apps; Vaulted Data; Data]
- AT&T Production Site 3 is Primary Recovery Site
- Test / Development / Deferrable Workload Moved to Vendor Site
- Guaranteed Recovery to Second Vendor Location
- Full IT Recovery Environment with Extended Stay Potential
23
Disaster Recovery Timeline
[Timeline diagram labels: Data Backup; Data Vaulted; Data in Transit; Disaster; Notification, Damage Assessment & Declaration; Response; Relief; Recovery; Restoration; Incident Management; Recovery Point Objective (RPO); Recovery Time Objective (RTO)]
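The two objectives on this timeline can be checked with simple date arithmetic. The sketch below assumes the usual definitions: the RPO bounds how far back the last recoverable data copy may lie before the disaster, and the RTO bounds the time from disaster to recovered service. The function names and all timestamps are hypothetical and do not come from the presentation.

```python
from datetime import datetime, timedelta


def data_loss_window(last_vaulted_backup: datetime, disaster: datetime) -> timedelta:
    """Time between the last recoverable backup and the disaster (compared to RPO)."""
    return disaster - last_vaulted_backup


def recovery_duration(disaster: datetime, service_recovered: datetime) -> timedelta:
    """Time between the disaster and recovered service (compared to RTO)."""
    return service_recovered - disaster


def meets_objectives(last_vaulted_backup: datetime, disaster: datetime,
                     service_recovered: datetime,
                     rpo: timedelta, rto: timedelta) -> bool:
    """True only if both the data-loss window and the recovery time are within bounds."""
    return (data_loss_window(last_vaulted_backup, disaster) <= rpo
            and recovery_duration(disaster, service_recovered) <= rto)


# Hypothetical event times for illustration only.
last_backup = datetime(2002, 10, 1, 23, 0)
disaster_time = datetime(2002, 10, 2, 9, 0)
recovered_time = datetime(2002, 10, 3, 15, 0)

print(data_loss_window(last_backup, disaster_time))      # 10 hours of data at risk
print(recovery_duration(disaster_time, recovered_time))  # 30 hours to recover
print(meets_objectives(last_backup, disaster_time, recovered_time,
                       rpo=timedelta(hours=24), rto=timedelta(hours=48)))
```

Data still in transit to the vault at the moment of the disaster would not count as recoverable, so the backup timestamp used here should be the time the data was actually vaulted.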
24
Fundamental Continuity Strategy: Network-Based Geographic Dispersion
Distant Enough for Safety
Close Enough for Cost-Effective Performance
The needs of today’s customers for Business Continuity mandate a network-centric, geographically-dispersed infrastructure
25
Disaster Recovery vs. Business Continuity
Disaster Recovery
Typical Approach Elements:
– Off-site data vaulting
– Shared IT Resources
– Permanent Primary Site, Shared Subscription to Temporary Recovery Site
Some Down Time
Loses Data
Lower Cost
Best Where Investment in Duplication Would Exceed Importance of Process / Service / Asset
Not Network-Centric
Business Continuity
Typical Approach Elements:
– Data Mirroring
– Computing Fail-over
– Multiple Permanent Sites
No Down Time
No Data Loss
Higher Investment
Best for Mission Critical Processes / Services / Assets
Highly Network-Centric
26
Four Major Availability Levels / Strategies
Standard Availability
– Computing: Single Server
– Data: Single Storage Device
– Network: Legacy LAN/MAN/WAN Connectivity
Disaster Recovery
– Computing: Server w/ Hot-Site Subscription
– Data: Storage Device w/ Off-Site Vaulting
– Network: Trailerized NDR Resources
High Availability
– Computing: Local Cluster
– Data: Local RAID Striping / Mirroring
– Network: Unprotected DWDM Services
Ultravailable (99.999%)
– Computing: Dispersed Cluster w/ Failover
– Data: Synchronous Remote Mirroring
– Network: Protected Metro Ring Services
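To put the 99.999% figure in perspective, an availability percentage can be converted into allowed downtime per year. The sketch below is plain arithmetic; only the 99.999% value comes from the slide, and the other percentages are illustrative assumptions, since the presentation gives no figures for the lower levels.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum downtime per year implied by an availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return (1.0 - availability_percent / 100.0) * MINUTES_PER_YEAR


# 99.999 is the slide's figure; 99.0 and 99.9 are assumed comparison points.
for percent in (99.0, 99.9, 99.999):
    minutes = downtime_minutes_per_year(percent)
    print(f"{percent}% available: about {minutes:.1f} minutes of downtime per year")
```

At 99.999% the allowance is a little over five minutes per year, which is why that level pairs dispersed clustering with synchronous mirroring and protected network paths rather than a recover-after-the-fact approach.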
27
AT&T Ultravailable® Network Services
Diverse Routing with Automatic Protection Switching / Optical Path Failover for 99.999% Availability
Gigabit Ethernet, Fibre Channel, FICON, ESCON, OCx, D1 Digital Video
64 Unprotected or 32 Protected Wavelengths per Fiber for High Bandwidth, Rapid Provisioning, & Low Cost
Client- or AT&T-Facility Based
Secure, Conditioned Network Nodes
Data Rates of 2.5 Gb/s -> 10 Gb/s, Evolving to 40 Gb/s and up
Multiprotocol for Flexibility and Future-Proofing
24x7 Centralized Monitoring & Management for Service Assurance
Non-Switched All-Optical for Low Latency
Dual Laterals, Dual Risers to Eliminate SPOFs
28
AT&T Ultravailable® Suite
MAN-Area Server Clustering / Fail-Over
Synchronous Mode Data Mirroring based on Leading Vendor Disk Arrays for Zero Data Loss
Remote Server-Based or Serverless Backup / Restore to Automated Tape Libraries for Data Protection
24x7 Centralized Monitoring & Management for Service Assurance
Network Agnostic:
- Fibre Channel / FICON / ESCON over Ultravailable Dedicated or Wavelength DWDM
- IP / GigE over Metro Ethernet Services
- Channel Extension over T1/T3/OCx
[Diagram labels: 99.999% Data Availability; Private or Hosted Storage; SANs; Primary Storage]
29
AT&T Continuity, Recovery, Hosting, & Security Services
[Diagram labels: NDR Trailers; CO (x3); AT&T CO (x2); AT&T Hosting Location; Client Location; SANs; NAS; Ultravailable Data; Ultravailable Tape; Ultravailable Managed Hosted Data; Managed Primary Storage; Managed Services; Client Portal; GCSC; GNOC; Ultravailable Computing; Managed Hosting; Managed Token Authentication; Managed VPN; Managed Intrusion Detection & Scanning Services; Managed Internet Services; Managed Firewall Services; Ultravailable Network and Wavelength Services]
30
Example Applications
LAN Bridging
Remote Disk Mirroring
Remote Back-up
SAN Extension
Server Failover / Remote Clustering
Interlocation Trunking
Multimedia Conferencing
Content Distribution
Database Replication
32
Example DR Plan Timeline
ID  Task Name
1   Deploy the SDR Process
2   Run DR Merge
3   Run Audits & Correct Errors
4   Emergency ODA Retrofit
5   Forward Prioritization List
6   Apply Joker Equipment Options
7   Swing / Build A Links
8   Relocate ASTN NI Backup
9   Build ASTN NI F Links
10  Assign ISDN D Link Nodes
11  Re-Engineer Switched T1s
12  Deploy the NDR Trailers
13  Prep the NDR Trailers for Recovery
14  Connect Fiber and T3 Facilities
15  Test and Turn Up Technologies
16  Trunk Recent Changes
17  Trunk Status Evaluation
18  Provide Status Reports
19  Overall Service Evaluation
33
Process Architecture Performance: Four Parallel Example Activities
Bring up OS: 10 Hours
Swing WAN: 10 Hours
Retrieve Vaulted Tapes: 10 Hours
Test LAN: 10 Hours
How long will this take?!?
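One way to explore the question is to compare the planned durations with what happens once each duration varies. In the sketch below, the 10-hour figures come from the slide; the normal distribution with a standard deviation of 5 hours, the clamping of negative draws to zero, the seed, and all function names are assumptions made for illustration.

```python
import random

PLANNED_HOURS = {
    "Bring up OS": 10.0,
    "Swing WAN": 10.0,
    "Retrieve Vaulted Tapes": 10.0,
    "Test LAN": 10.0,
}


def parallel_elapsed(durations) -> float:
    """Activities run side by side, so the slowest one sets the elapsed time."""
    return max(durations)


def serial_elapsed(durations) -> float:
    """Activities run one after another, so the elapsed time is the sum."""
    return sum(durations)


def mean_parallel_elapsed(trials: int, mean: float = 10.0,
                          std_dev: float = 5.0, seed: int = 0) -> float:
    """Average elapsed time when each of four parallel durations is random."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Assumed variability: normal draws, clamped so no duration is negative.
        draws = [max(0.0, rng.gauss(mean, std_dev)) for _ in PLANNED_HOURS]
        total += parallel_elapsed(draws)
    return total / trials


print("Planned, in parallel:", parallel_elapsed(PLANNED_HOURS.values()), "hours")
print("Planned, in series:", serial_elapsed(PLANNED_HOURS.values()), "hours")
print("Simulated parallel average:", round(mean_parallel_elapsed(10_000), 1), "hours")
```

On paper the four parallel activities finish in 10 hours, but because the slowest activity governs, the simulated average comes out noticeably longer than any single activity's mean.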
34
Importance of Exercising Process
[Chart: Likelihood of Meeting or Beating Recovery Time (0%, 50%, 90%, 100%) versus recovery time (0 hours to 72 hours). Marked points: Minimum Recovery Time; Mode; Recovery Time Objective; 90% Confidence Recovery Time.]
35
Process Architecture Performance: Four Serial Normally Distributed Activities
Sorted Results of 100 Trials
Each normally distributed activity has a mean of 10 and a standard deviation of 5.
Results:
Minimum: 23.9027
Maximum: 67.7011
Mean: 41.0530
Median: 40.1506
95% Confidence: 57.8925
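A simulation of this kind can be sketched as follows. This is not the tool that produced the figures above; it only assumes the stated setup (four serial activities, each normal with mean 10 and standard deviation 5, 100 trials). The function names, the seed, and the nearest-rank method for the 95% level are assumptions, and the numbers it prints will differ from the slide's because the random draws differ.

```python
import math
import random
import statistics


def simulate_serial(trials: int = 100, activities: int = 4,
                    mean: float = 10.0, std_dev: float = 5.0,
                    seed=None) -> list:
    """Return sorted total durations; each trial sums independent normal draws."""
    rng = random.Random(seed)
    totals = [sum(rng.gauss(mean, std_dev) for _ in range(activities))
              for _ in range(trials)]
    return sorted(totals)


def summarize(results, confidence: float = 0.95) -> dict:
    """Summary statistics plus the duration not exceeded in `confidence` of trials."""
    ordered = sorted(results)
    rank = math.ceil(confidence * len(ordered))  # nearest-rank percentile
    return {
        "minimum": ordered[0],
        "maximum": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "confidence_level": ordered[rank - 1],
    }


for name, value in summarize(simulate_serial(seed=2002)).items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the total is itself normally distributed with a mean of 40 and a standard deviation of 10, so a plan built on the mean alone would be missed in roughly half of the trials.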
Certification and Assurance Metrics
Assurance Level F
No Plan
Assurance Level E
Plan Documented:
- Local Format
- Data Identified and Backed up
- Plan Updated within the Last 12 months
Assurance Level D
Exercise Ready:
- Required Content
- LDRPS
- Plan Maintenance Validated
- Paper Walk Thru Completed
Assurance Level C
Simulation Exercise Conducted:
- Component Simulation Exercise Conducted
- Copy of Critical Data should be Off-site
- Critical MRs / deficiencies occurred in Exercise
Assurance Level B
Simulation Exercise Conducted - All Critical Deficiencies / Critical MRs Corrected:
- Component simulation exercise completed
- Copy of Critical data must be off-site
- Critical MRs / deficiencies corrected and closed by Post Review
- The exercise was completed in no more than double the time specified by the RTO
Assurance Level A
Certification:
- Process Owner requirements for RTO/RPO ** and service level met/ensured
- No critical deficiencies/MRs
Certification Achieved
Assessment types: Unit Self Assessments; Unit and C&A Joint Assurance Assessments
Likelihood ranges shown: 5-8%, 9-34%, 35-50%, 51-65%, 66-89%, 90-95%
% Reflects Estimated Likelihood of Recovery.
** RTO - Recovery Time Objective; RPO - Recovery Point Objective
37
AT&T Experience
38
Network Disaster Recovery
[Diagram labels: AT&T Switched Network; 4ESS (x3); End Offices; Disaster Site; Joker 4ESS; DR Access Trailer; DR Intertoll Trailers; AT&T Disaster Recovery]
39
Network Disaster Recovery
40
NDR Mobile Recovery Assets
Access Trailers
Digital Access and Cross-Connect Systems Trailers
DTMS/FASTAR® Trailers
Lightwave Trailers
5ESS Switch Recovery Platform Trailers
DMS500 Switch Recovery Platform Trailers
Lightguide Regeneration Trailers
Digital Radio Trailers
Power Generation
Digital Radio Recovery Trailers
Portable Radio Towers
Emergency Communications Vehicles
41
NDR Exercises
Training and Field Exercises Conducted Quarterly
Test, Exercise, and Develop Capabilities
– Declaration / Deployment / Transportation / Set-Up
– Technology
– Teams
– Processes
Sample of Exercises Conducted Since 1997
’01 Tampa, FL
’01 Denver, CO
’00 St. Louis, MO
’00 White Plains, NY
’00 Phoenix, AZ
’99 San Antonio, TX
’99 Lodi, CA
’99 Atlanta, GA
’98 Salt Lake City, UT
’98 Kansas City, MO
’98 Arlington, VA
’97 Oakbrook, IL
42
Recent NDR Deployments
Date | Situation | Location | Assets | Purpose
9/2001 | WTC Disaster | NYC, NY & Northern NJ | Technology Trailers, Satellite Units | Communications, NYPD Support, Humanitarian Relief
6/2001 | Flooding (Tropical Storm Allison) | Houston, TX | Satellite Unit, Technology Trailers | Communications, Humanitarian Relief
2/2000 | Tornado | Camilla, GA | Satellite Unit | Humanitarian Relief
9/1999 | Flooding (Hurricane Floyd) | Tarboro, NC | Satellite Unit | Humanitarian Relief
9/1999 | Flooding (Hurricane Floyd) | Rochelle Park, NJ | Satellite Unit, Technology Trailers | Communications, Humanitarian Relief
5/1999 | Tornadoes | Oklahoma City, OK | Satellite Unit | Humanitarian Relief
7/1998 | Forest Fires | Brevard Co., FL | Satellite Unit | Humanitarian Relief
2/1998 | Tornado | Lake Mary & Kissimmee, FL | Satellite Unit | Humanitarian Relief
9/1997 | Gas Line Break | Scranton, PA | Satellite Unit | Humanitarian Relief
9/1997 | Train Derailment | Dunkirk, NY | Regenerator Trailer | Communications
4/1997 | Flood | Grand Forks, ND | Satellite Unit | Humanitarian Relief
3/1997 | Floods | Ohio, Kentucky, West Virginia | Satellite Units, Lightguide Trailer | Communications, Humanitarian Relief
2/1997 | Flood | Lodi, CA | Regenerator Trailer | Communications
43
NDR Recovery Site Work
45
WTC: AT&T Perspective
9/11 8:48AM AA 11 Hits WTC North Tower
9/11 8:53AM Stories Begin to Air on GNOC-Monitored Broadcast Networks
9/11 8:53AM Network Duty Officer, GNOC Aware
9/11 8:53AM Automatic Network Controls, RTNR Automatically Reroutes Traffic
9/11 8:55AM Targeted GNOC monitoring of NYC
9/11 8:58AM GNOC Detects Unusual Call Volume
9/11 8:58AM Manual Network Controls Instituted – Limits on NYC-Inbound Calls
9/11 9:00AM NDR NE and SE Region Pre-Activation
9/11 9:21AM Management Control Bridge Activated
47
AT&T Ultravailable® Network Services
[Diagram labels: WTC South Tower; CO; AT&T CO (x2); GCSC; Ultravailable Network and Wavelength Services]
9/11 9:05AM UA 175 Hits South Tower
9/11 9:59AM South Tower Collapses
9/11 9:59:35AM South Tower Transport Node Crushed
9/11 9:59:35AM Ultravailable Client Traffic Fails Over Successfully
48
Network Disaster Recovery
9/11 10:20AM All NYC AT&T Offices Ordered to Evacuate All Non-Essential, Non-Network Personnel
9/11 10:45AM MCB Orders Satellite Phones Readied Nationally
9/11 11:00AM Mid-Atlantic Emergency Operations Center Activated
9/11 11:30AM Southeast Emergency Operations Center Activated
9/11 11:50AM NDR Equipment Deployment Initiated
49
AT&T 9/11 Network Disaster Recovery
9/12 4:00AM NDR Team Assembles at Staging Area
9/12 12:00PM Emergency Communications Vehicle To One Police Plaza
9/12 7:00PM Recovery Location Selected
9/12 10:00PM Location Secured, Trailers Depart Staging Area
9/12 10:30PM Positioning and Leveling Begins
9/13 2:50AM Fiber Spliced to Recovery Location
9/13 8:45AM Grounding Complete
9/13 12:00PM Power Cabling Complete - 500KW
9/13 1:55PM Transport, Digital Cross Connect Up
9/21 ECV Moved From NYPD To Support Relief Workers
51
Questions to Consider
Can you identify who is in charge of Business Continuity? What are the strategic continuity objectives, and what organizational structure is in place to guarantee their achievement?
Do you know which processes, services, and/or assets are most critical?
Do you know the threats, vulnerabilities, risks and financial impact to those processes, services, and assets?
Have you developed a sound business and financial analysis of alternatives?
How confident are you that your plan will meet objectives?
Have you balanced classic strategies such as DR against state-of-the-art high availability architectures? What SLAs are required?
52
Summary
Business Continuity is a critical imperative in today’s world
A successful corporate Business Continuity program needs:
– A comprehensive, closed-loop governance, planning and execution process across multiple lines of business and functional areas
– A geographically-dispersed, hardened infrastructure, integrated and synchronized by the network, for physical threat protection
– A robust information security strategy and architecture for logical threat protection
The Network is central to physical and logical protection
– Network access and transport
– Network security
– Hardened facilities
53
The needs of today’s customers for Business Continuity require a network-centric approach to protecting critical infrastructure components.