Examining End-User Standardisation Needs for Disaster Resilience
Business Resilience End-to-End
GMAC-RFC Case Study
Chuck Wachter, CDRP
BRM Program Manager
952-857-6384
Traditional BCP/DR Approach
Recovery Time Objective
Recovery Point Objective
Lost Data
Resume Business
Return Home
Notifications
Restore Communications
Restore Technology Capability
Restore Business Functions
Move to Alternate Site
Vital Records
Data Synchronization
Systems, Applications, Data
Relocate Office Equipment / Supplies
Work Flow
Arriving at BRM
• Significant growth created a complex and interdependent business and IT systems environment.
• Analysis concluded that unacceptable loss would occur from a significant outage lasting more than 24 hours.
• Proposed external vendor solutions could not mitigate the problem and meet recovery requirements.
• Required an integrated, sustaining, all-encompassing approach.
• Determined an internal recovery solution would meet requirements, create a lower CODB, and provide additional benefits:
– Addressed day-to-day impacts up to catastrophic versus catastrophic only
– Minimize vendor contractual limitations
– Process enhancements: change management, testing, service delivery, Incident Response etc.
Industry Best Practices
• Over 60% of G2000 organizations are implementing a dual data center strategy to support continuous availability
• Business and IT availability classification requirements aligned with associated cost, application and system architecture requirements
• Companies that manage resilience internally achieve a greater degree of maturity and business process alignment with their supporting IT systems
• Financial services industry is under increased scrutiny by regulatory agencies due to the critical nature and impact of its services to the economy. Their response includes:
– Investment in failover capabilities to significantly reduce the time for recovery and to respond to widespread/regional disruptions
– A realignment from the traditional approach of recovering technology and facilities toward a full business resumption model
– Increased frequency of exercising business and IT resilience capabilities
As real-time business requirements become more pervasive, business continuity must be integrated throughout the corporate culture and
business processes, with clear accountability and measurement defined to align with acceptable corporate risk.
Business Resilience Defined
Resilience is the ability and capacity to withstand and adapt to new risk environments. A resilient organization effectively aligns its strategy, operations, business systems, governance structure, and decision-support capabilities so that it can uncover and adjust to continually changing risks, endure disruptions to its primary earnings drivers, and create advantages over less adaptive competitors.
Program Mission
The Business Resiliency program manages the organization's capabilities to continue to provide services at any time, regardless of the event and impact. Prioritization of investments in people, processes, technology and facilities is based on business risk and criticality. Comprehensive testing continuously validates the recovery capabilities and an integrated governance model assures transparent coordination and reporting.
Best Practice: Resiliency Model
Business Processes
Suppliers
Infrastructure
IT Services
BCDR Program
BCDR Framework
Operational Management
Internal Sponsorship & Governance
Capabilities
Business/IT Alignment Model
Business-Driven Strategy
• Business defines requirements & strategy
• Focus is on making “today’s and future business better”
• Accomplished in conjunction with Governance, business process improvements and capabilities
• BCDR requirements are understood and integrated in business process, valuable and cost effective
IT-Enabled Strategy
• IT aligns and offers BCDR strategic capabilities to enable new growth, products/services, channels
• Creates BCDR “portfolio visibility” for business leverage
• Provides responsive, flexible technology environment
• IT BCDR Services are cost effective, responsive and measured
Leadership needs to ensure that business and IT BCDR strategies are continuously aligned to create value
Approach
ORGANIZATION
• Roles
• Responsibilities
• Skills
• Cross Organizational Cooperation
PROCESS
• BUSINESS PROCESS
– BCP Mgmt
– Risk Mgmt
– Product Delivery & Mgmt
– Information Mgmt
• IT PROCESS
– Application Development
– Application Operations Mgmt
– Operations Service Delivery
– Operations Service Mgmt IT
• CROSS-FUNCTIONAL PROCESS
– Business Process Integration
– Partner Controls & Integration
– Overall Life-cycle Integration
STRATEGY
• Governance
• Continuity Strategy
• Availability Strategy
• Recovery Strategy
• Communications
• Risk Management
APPLICATIONS and DATA
• Application Architecture / Design
• Application Availability / Recovery
• Application Integration
• Data Backup/Recovery
• Data Security
ARCHITECTURE & TECHNOLOGY
• Platforms & Networks
• Systems Software
• Storage
• Middleware
• Standards
FACILITIES
• Data Center Infrastructure
• Workspace Infrastructure
• Physical Security
• Environmental (Power, HVAC)
Business resiliency is incorporated throughout the corporate culture and business processes. Every level of the organization has been evaluated, with new models and methodologies developed and integrated into daily processes, ensuring resiliency and compliance with regulatory requirements at every layer.
Program Vision
• The vision for Business Resilience Management is:
– Establish BRM processes and services that are well-integrated with business and IT planning, development and operational processes such that enterprise-wide BRM implementation, testing and compliance are ensured, and support business objectives.
• Our Strategy for accomplishing this is:
– Adopt an internal strategy to deploy BRM solutions and develop industry partnerships to support business expansion and growth.
– Deploy a dual data center infrastructure, storage and data architecture to support business continuous availability needs and flexible, scalable and agile BRM solutions.
– Eliminate BRM gaps through investment into development of internal capabilities.
BRM Building Blocks
[Figure: BRM Building Blocks. Recoverable labels: BRM Governance & Coordination; System Remediation.]
Details behind the Building Blocks
BRM Program Framework
BRM Strategic Investments
[Figure: BRM spend chart. Axis labels: Resources; $MM; years 2003, 2004, 2005, 20xx. Series: Industry Recommended BRM Spend Based on IT Budget; Projected BRM spend.]
Vendor Solution
• Does not address BRM requirements
BRM Program
• Driven by Business Requirements
• Aligning Business and IT
• Value based BRM Investments
Resiliency Tier Framework (RTF)
Provides a common dialogue for Business & IT recoverability

Recovery Time Objective (RTO), matrix columns:

Tier     Availability   RTO      RTO Minimum Resiliency
TIER 0   Vital          8 Hrs    Dual DC Automated Failover
TIER 1   Critical       24 Hrs   Dual DC Manual Failover
TIER 2   Important      48 Hrs   Dual DC Standby Cold Restore
TIER 3   Deferred       72 Hrs   Dual DC Drop Ship Cold Restore

Recovery Point Objective (RPO), matrix rows:

Data class           RPO      RPO Minimum Resiliency               Matrix cells
Vital Data (A)       4 Hrs    Dual DC Remote Data Replication      TIER 0A  TIER 1A  TIER 2A  TIER 3A
Critical Data (B)    8 Hrs    Dual DC Remote Data Replication      TIER 0B  TIER 1B  TIER 2B  TIER 3B
Important Data (C)   24 Hrs   Virtual Vault Storage Backup/Restore TIER 0C  TIER 1C  TIER 2C  TIER 3C
Deferred Data (D)    48 Hrs   Offsite Tape Backup/Restore          TIER 0D  TIER 1D  TIER 2D  TIER 3D
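As an illustration of how the RTF matrix can serve as a shared vocabulary, the following minimal Python sketch maps a business RTO/RPO requirement to a matrix cell. Only the hour thresholds come from the slide; the function names and the rule used here (pick the least demanding tier whose limit still meets the requirement, and clamp anything stricter than Tier 0 or class A to that cell) are assumptions for illustration, not the GMAC-RFC classification procedure.

```python
from bisect import bisect_right

# Hour thresholds copied from the RTF matrix; everything else is illustrative.
RTO_LIMITS = [8, 24, 48, 72]   # TIER 0, 1, 2, 3
RPO_LIMITS = [4, 8, 24, 48]    # data class A, B, C, D
DATA_CLASSES = "ABCD"


def _slot(limits, required_hours):
    """Index of the largest limit that does not exceed the requirement."""
    # Requirements stricter than the first limit are clamped to slot 0
    # (an assumption: the most resilient cell is the best available match).
    return max(bisect_right(limits, required_hours) - 1, 0)


def rtf_tier(rto_hours, rpo_hours):
    """Map a business RTO/RPO requirement (in hours) to an RTF cell label."""
    tier = _slot(RTO_LIMITS, rto_hours)
    data_class = DATA_CLASSES[_slot(RPO_LIMITS, rpo_hours)]
    return f"TIER {tier}{data_class}"


print(rtf_tier(24, 8))    # requirement sits exactly on the Tier 1 / class B limits
print(rtf_tier(30, 12))   # a 48-hour tier would miss a 30-hour RTO, so Tier 1 applies
```

Under this rule a 30-hour RTO lands in Tier 1 rather than Tier 2, because a 48-hour recovery would not satisfy it.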
(3) Remediation and Resiliency Solutions
Tier 0-A Applications: High Availability/Redundant Platforms, Data Replication/Recovery RPO, DDC automated recovery, Full Application Recovery Plan & Test

BRM Tier Application Recovery Categorization Framework

ENTERPRISE SYSTEMS CLASSIFICATIONS                          TIER 0-A
FACILITIES ENVIRONMENT
  Application & Data Deployment Across DDCs                 Full Application & Data DDC Deploy
  Office Recovery Site                                      Fixed Secondary Office Site, or Commercial Hotsite
  XSP Data Center Site Resiliency                           Resilient SLAs
NETWORK ENVIRONMENT
  Site/Campus & Edge Data Network Resiliency                HA Redundancy
  XSP WAN/VAN/MAN/ISP/Voice Network Resiliency              HA Redundancy
  Voice & TCOMM Network                                     HA Redundancy
PLATFORM ENVIRONMENT
  App & DB Server Platform Resiliency                       HA Redundancy
  Workstation Recovery                                      Prebuilt Spares
STORAGE ENVIRONMENT
  Online SAN Replication/Restore                            Local & Remote DDC
  Offline Tape Backup/Restore                               In Failover Mode
DATA MGMT ENVIRONMENT
  Database Resiliency                                       Local & Remote DDC
  File Storage Resiliency                                   Local & Remote DDC
APPLICATION ENVIRONMENT
  Application Architecture Resiliency                       Local & Remote DDC
APPLICATION INTEGRATION ENVIRONMENT
  Middleware Architecture Resiliency                        Local & Remote DDC
SECURITY MGMT ENVIRONMENT
  Security Controls & Process                               Highest

(2) Business Process, Application and System Resiliency Requirements
[Figure: Resiliency Tier Framework (RTF) matrix, repeated on this slide.]
RTF Certification Standard: Tier Alignment
(1) Business RTO/RPO Requirements
Business Exposure and Impact Analytics
GMAC-RFC consistently assesses and determines required recovery capabilities for processes and new initiatives. The assessment is based on an analytical model consisting of quantitative and qualitative measures. The assessment and analysis process is structured in four phases, designed to conduct a comprehensive analysis for people, process, technology, facilities and interdependencies.
Business Application Resilience (BAR) Planning Methodology
Business Resiliency Planning
[Figure: application suites numbered 1 to 10 positioned on the Resiliency Tier Framework (RTF) matrix. Legend: Target State; Planned; Best Effort.]
Systems Availability & Recovery Gaps
[Figure: BRM Gap Scorecards for Application 1, Application 2, Application 3 and Application 4. Each chart shows RTF Compliance (0% to 100%) for High Availability Capability and for Data and Application Recovery Capability.]
Critical Service Portfolio
The Vital and Critical portfolio management process has been established to prioritize resiliency requirements and enhancements for business processes based on criticality. It is also designed to eliminate a functional, siloed view by combining process components into one integrated profile known as a ‘Recovery Domain’. Recovery Domains enable a structured process to continuously enhance resiliency capabilities and provide a strong foundation for the highest availability of our business processes.
Why Use Recovery Domains?
The volume and complexity of business systems require that recovery parameters are understood to ensure recoverability.
What is a Recovery Domain?
A method for aligning business functions and supporting applications and infrastructure into logical groups that enable resumption of target business or systems functions.
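One way to picture a Recovery Domain is as the set of systems reachable from a target business function through its data dependencies. The sketch below is a minimal Python illustration of that idea; the system names and the dependency map are hypothetical and are not taken from the GMAC-RFC environment, and the traversal is a plain breadth-first search rather than the program's actual definition standard.

```python
from collections import deque


def recovery_domain(target, depends_on):
    """Return every system that must be recovered to resume `target`."""
    domain = {target}
    queue = deque([target])
    while queue:
        system = queue.popleft()
        # Follow each upstream dependency once (breadth-first traversal).
        for upstream in depends_on.get(system, ()):
            if upstream not in domain:
                domain.add(upstream)
                queue.append(upstream)
    return domain


# Hypothetical dependency map: each key needs the listed systems to function.
DEPENDS_ON = {
    "investor_reporting": ["loan_accounting"],
    "loan_accounting": ["servicer_feed", "general_ledger"],
    "servicer_feed": ["file_transfer"],
}

print(sorted(recovery_domain("investor_reporting", DEPENDS_ON)))
```

Grouping by dependency closure rather than by owning department is what removes the silo view: a function is only recoverable when everything in its closure is.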
Recovery Domain Definition Standard
Recovery Domains
[Figure: data-flow diagram of business systems and feeds, with numbered Recovery Domain boundaries (1 to 5). Systems named in the diagram include Servicer, Compass, DMS, Workflow, DF Letters, IMS, HIP, HIP HSS, CSS, ACQ, DRT, DDS, MULSOR, MULREO, INTEX, VISION, BOS, EAGLE, COLOAN, Midnet, Monet, Peoplesoft, PROD, Investor Portal, Bank, and Trustee (Investors). Feeds shown include daily and month-end servicer files, loan-level and pool-level data, distribution calculation results, GL entries, and FHLMC and FNMA reporting data; several are marked as manual.]
Legend
Interface / Data feed
Data dependency
Recovery Domain Boundary
Excluded from Recovery Boundary
Process Integration and Improvement
• Integrate BRM Resiliency oversight, standards and best practices into RCG People, Process & Technology areas:
– New Application Development (SDLC)
– Business Impact Assessment and Planning
– Existing Business and IT System remediation
– Annual Operating Planning
– IT Frameworks
– Delivery Assurance Processes and procedures
– IT Operations Service Management
– Education and cross-training
– Improve resiliency maturity and metrics scorecards
– Etc.
Process Integration and Improvement: BRM Alignment with DA Framework
Phases: Plan, Define, Construct, Test, Deploy
Project Controls
• Estimates
• Issue & Risk Tracking
• Work Planning & Tracking
• Status & Cost Reporting
• Change Control
• Resource Planning
Required RCG-IT Deliverables
• Project Charter
• Project Plan / SOW
• Business Requirements
• System Requirements
• Requirements Traceability
• Architecture Design Document
• Define Test Strategy/Approach
• System Design Document
• UAT Test Plan
• System Deployment Plan
• Source Code/Unit Test
• System Test Plan
• Operations & Support Plan
• System Test Cases
• System Test Readiness Review
• System Test Summary Report
• UAT Summary Report
• System Test Defect Log
• Release Notes
• Installation Guide
• Operations Guide
Delivery Assurance Tollgates
• Project SOW, Review Checklist
• Rqmts Definition Checklist
• Business RTO/RPO Rqmts
• Design Review, Data Arch Review, Sys Arch Review, SysArch Spec, Devlp Review Checklists
• Failover Test Plan, Application Recovery Plan (ARP), Infrastructure Recovery Plan (IRP)
• Deployment Review, Production Certification Review, EPT Doc Docs
BRM content included at the tollgates
• Includes resiliency scope & costs
• Includes resiliency requirements
• Includes Business availability Recovery requirements
• Includes RTF Tier requirements, OLAs/SLAs
• Includes Local HA/Failover & ARP, IRP test plans
• Includes Local & DDC Configuration Deployment EPT
• Recovery test approach
Service Management Framework
BRM Alignment with IT Service Mgmt
Technology Services Group
IT Service Management
Policy
Incident Management
Problem Management
Change Management
Release Management
Configuration Management
Service Level Management
Availability Management
Capacity Management
IT Financial Management
IT Service Continuity Management
Process Standards & Guidelines (Overview & Glossary)
BRM Resiliency Stds
BRM Resiliency (People, Process, Tools)
Dual Data Center (DDC)
GMAC-RFC adopted a geographically dispersed Dual Data Center (DDC) resiliency strategy. Vital and Critical applications are required to have full fail-over capability within the DDC architecture.
WHY:
• Geographically redundant DC reduces risks
• Internal self-sufficiency and capabilities enable business resiliency
• Standard availability and recovery solutions reduce complexity and costs
• Standard support and SLAs meet business recovery objectives
• Shared infrastructure enables long-term economies of scale and reuse
[Diagram: two data centers, each labeled Production/Recovery; one is labeled Dallas DDC2.]
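The failover requirement for Vital and Critical applications lends itself to a simple inventory check. The following Python sketch flags applications that fall short of it; the `Application` record, its field names, and the sample inventory are hypothetical and stand in for whatever application catalogue an organization actually keeps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Application:
    name: str
    availability: str        # "Vital", "Critical", "Important" or "Deferred"
    has_ddc_failover: bool   # True if full fail-over exists across both data centers


# Availability classes that the DDC strategy requires to have full fail-over.
REQUIRES_FAILOVER = {"Vital", "Critical"}


def failover_gaps(apps):
    """Names of applications that need DDC fail-over but do not have it yet."""
    return [
        app.name
        for app in apps
        if app.availability in REQUIRES_FAILOVER and not app.has_ddc_failover
    ]


# Hypothetical inventory used only to exercise the check.
inventory = [
    Application("payments", "Vital", True),
    Application("loan_accounting", "Critical", False),
    Application("intranet_wiki", "Deferred", False),
]

print(failover_gaps(inventory))
```

Applications in the Important and Deferred classes are ignored by this check because the stated requirement covers only Vital and Critical applications.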
Deploy Tiered Storage Architecture Standard for improved RPO Resiliency and Recovery
• Provide tiered Storage options to support business RPOs
• Provide Local DC and Remote DC Data Replication and Recovery
• Provide Resilient Storage Architecture
• Integrate backup and recovery architecture
• Employ Information Mgmt practices to enable data recovery
Data Resilience
BRM Governance & Coordination
BRM Governance: An established set of methods by which Business Areas address their business resilience needs.
The governance model is designed to provide centralized oversight and to enable business ownership. Consistent program tools allow for prioritized assessment, analysis, evaluation and decision-making processes depending on criticality across the enterprise. Defined roles and responsibilities assure consistent business resiliency planning and execution.
BRM Program
- Builds BRM Framework
- Transitions ownership to the BRM Operations Team
- Enables Business & IT to achieve risk goals
Risk Committee
Business Resilience Management Committee (BRMC)
BRM Operations Team
- BRM Program Manager
- BRM Architect
- BRM Specialist
- BCP Site Coordinator
Business Risk
Business Units
- Stakeholders
- Projects
Standards?
• type
Recovery Plans
The plan structure is designed so that all plans are integrated in an efficient manner. Data content flows between plans are documented, ensuring that all required data is captured and non-essential data is minimized. Each plan is assigned an owner.
BRM Program Objectives
Implement BRM practices into way of doing business
BRM Program
Projects
Stages: Build, Transition, Sustaining Model
Transition – Consultants
Sustaining Model – Employees
- Operational program supported by RFC associates and integrated into business processes.
- BRM Project generates Artifacts
- RFC staff provides input & approves project work
- RFC Owner & Stakeholders are identified
- Ownership of Artifacts is transferred to the RFC Owner
- BRM Team consults and helps nurture the Artifacts
BRM Artifacts
[Figure: collage of BRM artifacts. Site map with Production and Redundant labels for MSP, Burbank, San Diego and Dallas (city labels: Burbank, San Diego, Minneapolis, Dallas); a thumbnail of the Resiliency Tier Framework (RTF) matrix.]
RTF
BAR Methodology
Recovery Domains
DDC
SDLC Checklists
BRM Planning Overview
• Operational Readiness
– Establish Governance
– Train Staff
– Assess and Adopt BRM Framework
– Establish BRM Resiliency Baseline
• Execution
– Maintain Recovery Plans
– Exercise Recovery Plans
– Execute Resiliency Risk Reduction Projects
• Oversight and Compliance
• Annual Operating Plan
– Assess
– Prioritize
– Plan
Review
• Develop a continuity framework that addresses all levels of the organization: facilities, technology, applications, data, processes, governance, strategy.
• Integrate all elements of the framework.
• Establish a governance committee, placing responsibility within the business.
• BRM Operations maintains the framework, tools, methodologies, artifacts.
• Incorporate BRM processes and capabilities into day-to-day processes.
• Invest internally to improve processes rather than externally, which leaves processes as-is.
Value Proposition
• Reduce business risks
• Enable business growth
• Invest in gap reduction
• Take complexity out
• Create choice and flexible BRM options
BRM Program
Business strategy alignment and cost optimization through
implementation of the BRM strategies provide the company with a
range of options to improve business value