CIS 3500 · rowdysites.msudenver.edu/~fustos/cis3500/pdf/chapter23.pdf
CIS 3500 1
Incident Response, Disaster Recovery, and Continuity of Operations
Chapter #23:
Risk Management
Chapter Objectives
n Understand the incident response process
n Learn the incident response procedures
n Explore disaster recovery preparation
n Examine the process of continuity of operations
Incident Response, Disaster Recovery, and Continuity of Operations
n Normal operations in an IT enterprise include preparing for
when things go wrong
n things are not operating correctly
n incident response process
n disaster
n Requires preparation and readiness
Incident Response Plan
n An incident response plan describes the steps
n low-impact incident – may not result in significant exposure
n moderate-risk – requires greater scrutiny and response
n high-level risk – requires the greatest scrutiny and response
n Two major elements to determine the level:
n information criticality - comes from the data classification and
the quantity of data involved
n how the incident potentially affects operations
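The two elements above can be combined into a single incident level. A minimal sketch, assuming a three-step scale for each element and a "take the higher rating" rule; the names `Level` and `incident_level` and the rule itself are illustrative assumptions, not part of the slides.

```python
from enum import IntEnum


class Level(IntEnum):
    """Assumed three-step scale, matching the low/moderate/high wording above."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


def incident_level(information_criticality: Level, operational_impact: Level) -> Level:
    """Return the higher of the two ratings (an assumed, conservative rule)."""
    return max(information_criticality, operational_impact)
```

With this rule, an incident that touches low-criticality data but halts operations would still be treated as high level.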
Documented Incident Types/Category Definitions
n Documented incident types/category definitions - set of
preplanned scripts that can be applied quickly
n interruption of service, malicious communication, data
exfiltration, malware delivery, phishing attack - customizable
to meet the IT needs of each organization.
n The amount of work scales with the size of IT and services
Roles and Responsibilities
n A critical step is to define the roles and responsibilities
n May vary slightly based on the incident
n permissions to cut connections, change servers, and start/stop
services defined in advance to prevent time-consuming
approvals
n Team leader, the team communicator, IR team members
Reporting Requirements/Escalation
n Planning the desired reporting requirements including
escalation steps is an important part of the operational plan
n Who talks for the incident and to whom, and what do they say?
n How does the information flow?
n Who needs to be involved?
n When does the issue escalate to higher levels of management?
n Reporting requirements can refer to industry, regulatory, and
statutory requirements in addition to internal communications
Cyber-Incident Response Teams
n The cyber-incident response team - designated to respond
to an incident
n The incident response plan (IRP) should identify the membership
and backup members and their duties
n The team leader is typically a member of management –
understands the enterprise IT environment and IR process
n Subject matter experts on the various systems
n Team is responsible for all phases of the incident response
Exercise
n Plan needs to be tested
n Exercises come in many forms and functions
n Tabletop exercise where planning and preparation steps are
tested
n Team has to practice the process on the systems of the
enterprise
Incident Response Process
n The incident response process - actions security personnel
perform in response to triggering events
n Incident response activities at times are closely related to
other IT activities involving IT operations
n They can be similar to disaster recovery and business
continuity operations
n Incident response activities are connected to many
operational procedures, and this is key to system efficiency
Preparation
n Preparation – occurs before a specific incident
n Tasks needed to be organized and ready to respond
n Incident response should be a manageable task
n Success is a direct result of proper preparation
n ensuring the correct data events are being logged
n reporting of potential incidents is happening
n people are trained with respect to IR process and their
personal responsibilities
Identification
n Identification is the process where a team member
suspects that a problem is bigger than an isolated incident
and notifies the incident response team
n An incident is defined as a situation that departs from
normal, routine operations
n Some training is required to prevent false alarms
n The team will process the information and determine
whether or not to invoke incident response processes
Containment
n Once the IR team has determined that an incident has in
fact occurred and requires a response, their first step is to
contain the incident and prevent its spread
n Containment is the set of actions taken to constrain the
incident to the minimal number of machines
n This preserves as much of production as possible
n Can be complex, and requires fully understanding the
problem, its root cause, and the vulnerabilities involved
Eradication
n Eradication involves removing the problem – this may
mean rebuilding a clean machine
n Key part is the prevention of reinfection
n One of the strongest value propositions for virtual machines
is the ability to rebuild quickly, making the eradication step
relatively easy
Recovery
n Recovery is the process of returning the asset into the
business function and normal business operations
n Eradication removes the problem, but the system will be
isolated
n The recovery process includes steps to return the systems
and applications to operational status
n After recovery, the team moves to document the lessons
learned from the incident
Lessons Learned
n A post-mortem session should collect lessons learned
n Might assign action items to correct weaknesses and to
suggest ways to improve
n Two distinct purposes
n document what went wrong and allowed the incident to occur
n failure to correct this means a sure repeat
n examine the incident response process itself
n continuous improvement of the actual incident response
process is an important task
Disaster Recovery
n Disaster recovery is the process that the organization uses
to recover from events that disrupt normal operations.
n Examples of such events: fire, flood, tornado, hurricane, political unrest/riot, earthquake, electric storm, blizzard, gas leak/explosion
Recovery Sites
n Restoration services will be located close to the location of
backup storage
n If the organization has suffered physical damage, computing
facilities similar to those used in normal operations are
required
n These sites are referred to as recovery sites
Hot Sites
n A hot site is a fully configured environment
n It can be operational immediately or within a few hours
Warm Sites
n A warm site is partially configured, usually having the
peripherals and software but perhaps not the more
expensive main processing computer
n It is designed to be operational within a few days
Cold Sites
n A cold site will have the basic environmental controls
necessary to operate
n Few of the computing components necessary for processing
n Getting a cold site operational may take weeks
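The trade-off between hot, warm, and cold sites can be sketched as a lookup against the downtime the business can tolerate. This is only an illustration: the hour values in `READY_HOURS` are assumed stand-ins for "a few hours", "a few days", and "weeks", and the sketch also assumes that a less-prepared site is cheaper to maintain.

```python
# Assumed time-to-operational for each site type, in hours. The slides only say
# "immediately or within a few hours", "within a few days", and "weeks".
READY_HOURS = {"cold": 14 * 24, "warm": 3 * 24, "hot": 4}


def cheapest_site(max_downtime_hours: float) -> str:
    """Pick the least-prepared (assumed cheapest) site that is ready in time."""
    for site in ("cold", "warm", "hot"):
        if READY_HOURS[site] <= max_downtime_hours:
            return site
    # Nothing slower fits, so fall back to the fastest option listed.
    return "hot"
```

For example, a business that can tolerate 100 hours of downtime would get a warm site under these assumed numbers.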
Order of Restoration
n Part of the planning for a disaster is to decide the order of
restoration
n There are a couple of distinct factors to consider
n dependencies
n criticality to the enterprise
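One way to combine the two factors is to restore dependencies first and, among the systems that are ready, pick the most critical one. A minimal sketch using Python's standard `graphlib` module (Python 3.9+); the system names and criticality scores are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical systems: each maps to the systems it depends on.
depends_on = {
    "network": [],
    "database": ["network"],
    "email": ["network"],
    "order_app": ["database"],
}
# Hypothetical criticality scores; higher means restore sooner when possible.
criticality = {"network": 5, "database": 4, "email": 1, "order_app": 5}


def restoration_order(depends_on, criticality):
    """Restore dependencies first; among ready systems, most critical first."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    ready, order = [], []
    while sorter.is_active():
        # Add systems whose dependencies have all been restored.
        ready.extend(sorter.get_ready())
        ready.sort(key=lambda name: -criticality[name])
        nxt = ready.pop(0)
        order.append(nxt)
        sorter.done(nxt)
    return order
```

With the sample data, `restoration_order(depends_on, criticality)` returns `["network", "database", "order_app", "email"]`: email waits because it is the least critical, even though it became restorable earlier.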
Backup Concepts
n Backups are key elements in business continuity/disaster recovery (BC/DR)
n Backups are critical when security measures have failed
n Data backup strategy:
n How frequently should backups be conducted?
n How extensive do the backups need to be?
n What is the process for conducting backups?
n Who is responsible for ensuring backups are created?
n Where will the backups be stored?
n How long will backups be kept?
n How many copies will be maintained?
Differential
n In a differential backup, only the files that have changed
since the last full backup was completed are backed up
n The frequency of the full backup versus the differential
backups depends on the organization and needs
n Restoration requires two steps:
n first the last full backup needs to be loaded, and then
n the last differential backup performed can be applied
n The system has to have a method to determine which files
have changed
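One possible method for detecting changed files is to compare each file's modification time with the time of the last full backup. A minimal sketch under that assumption; the function name and the input format (a mapping of path to modification timestamp) are illustrative.

```python
def differential_set(mtimes: dict[str, float], last_full_backup: float) -> list[str]:
    """Return the paths modified after the last full backup.

    mtimes maps each file path to its modification time (seconds since epoch).
    """
    return sorted(path for path, mtime in mtimes.items() if mtime > last_full_backup)
```

In practice the timestamps could be collected with `os.stat(path).st_mtime`. Because the comparison is always against the last full backup, each differential grows until the next full backup is taken.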
Incremental
n The incremental backup backs up only files that have
changed since the last full or incremental backup
n Less information will be stored in each backup
n Occasional full backup
n To restore a system requires quite a bit more work:
n first need to go back to the last full backup
n then you have to update the system with every incremental
backup that has occurred since the full backup
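The restore steps above can be sketched as selecting a chain of backups from a chronological history. The history format, a list of `(label, kind)` pairs, is an assumption made for illustration.

```python
def restore_chain(history: list[tuple[str, str]]) -> list[str]:
    """Return the backup labels to apply, in order, to reach the latest state.

    history is chronological; kind is "full" or "incremental".
    Raises ValueError if the history contains no full backup.
    """
    last_full = max(i for i, (_, kind) in enumerate(history) if kind == "full")
    # Apply the last full backup, then every incremental taken after it.
    return [label for label, _ in history[last_full:]]
```

For a history of a Sunday full backup followed by Monday and Tuesday incrementals, the chain is all three backups in order. A differential scheme would need only the full backup plus the latest differential.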
Snapshots
n A snapshot is a copy of a virtual machine at a specific
point in time
n Created by copying the files that store the virtual machine
n Reverting to an earlier snapshot is easy
Full
n In a full backup, all files and software are copied
n Restoration — copy all the files back onto the system
n Copying this amount of data takes time
Geographic Considerations
n An important element is the cost of the backup strategy
n A simple strategy might be to store all backups together –
not a good idea
n Keep copies of backups in separate locations
n The most recent copy can be stored locally
n Online backup services alleviate some issues with physical
movement of more traditional storage media
Off-Site Backups
n Off-site backups are stored in a separate location
n A building fire, a hurricane, a tornado …
n Having backups off-site alleviates the risk of losing backups
Distance
n The distance associated with an off-site backup is a logistics
problem – physical movement increases recovery time
n Distance is also critical when examining the reach of a
disaster
n Far enough away that it is not affected by the same
incident – this includes the physical location of a cloud
storage provider
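Whether a backup location is "far enough away" can be checked roughly by comparing its great-circle distance from the primary site with the expected reach of a disaster. A minimal sketch using the haversine formula; the reach value is something each organization would have to estimate, and the function names are illustrative.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in degrees (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def outside_disaster_reach(primary, backup, reach_km):
    """True if the backup site is farther from the primary than the assumed reach."""
    return distance_km(*primary, *backup) > reach_km
```

Each site is passed as a `(latitude, longitude)` pair. Straight-line distance is only a first check; shared power grids, flood plains, or network providers can link sites that look far apart.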
Location Selection
n Physical safety of the backup media
n HVAC, potential flooding, theft, protecting the backup
media, ability to move the backups in and out of storage
n Cloud storage: high-speed networks, reasonably priced
storage, ability to store backups in a redundant array
across multiple sites, protecting the information via
encryption
Legal Implications
n Legal implications of where the data would actually be
stored
n Different jurisdictions have different laws, rules, and
regulations concerning core tools such as encryption
n Some countries require storage of data concerning their
citizens to be done within their borders – jurisdiction
n Other countries may have different regulations concerning
privacy that would impact the security of the data
Data Sovereignty
n Data sovereignty – countries mandate that data stored
within their borders is subject to their laws, and data
originating within their borders must be stored there
n With the global Internet this has become a problem
n Firms have changed their business strategies and offerings,
and abandoned markets
Continuity of Operation Planning
n Continuity of operations is a business imperative
n The overall goal of continuity of operation planning is to determine which
operations need to be continued during disruption
n Comprehensive plan
n Identifying critical assets (including key personnel), critical systems, and
interdependencies, and ensuring their availability during a disruption
n Joint effort between the business and the IT team
n Business -> which functions are critical for continuity
n IT team -> equipment, data, services and IT functions
n Major decisions regarding risk balance versus cost versus criticality when
examining strategies
Exercises/Tabletop
n Once a continuity of operations plan is in place, a tabletop
exercise should be performed to walk through all of the
steps
n It is a critical final step to validate the planning
n This exercise is not a one-time event; it should be repeated
after major changes to systems
n Major corporations regularly exercise on a calendar-based
schedule, rotating through day and night shifts and systems
After-Action Reports
n Identifying and documenting lessons learned is a key
element in after-action reports
n document the level of operations upon transfer to the backups
n document how the actual change from normal operations to continuity
systems occurred - what went right and what went wrong?
Failover
n Failover is the process for moving from a normal operational
capability to the continuity-of-operations capability
n The required speed and flexibility of the failover depends on
the business type
n It can be technology driven where if one system fails, a
redundant system takes its place without notice
n The return to normal operations is more complicated, but the
good news is that it can be performed at a time of the
organization’s choosing
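Technology-driven failover can be pictured as always routing work to the first healthy system in a priority list. A minimal sketch, assuming a caller-supplied health check; the names `active_system` and `is_healthy` are illustrative and do not refer to any particular product.

```python
from typing import Callable, Optional, Sequence


def active_system(
    systems: Sequence[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first healthy system in priority order, or None if all are down."""
    for name in systems:
        if is_healthy(name):
            return name
    return None
```

With `systems = ["primary", "standby"]`, the standby is chosen only while the primary fails its health check. Real failover also has to deal with data synchronization and with the planned return to the primary, which this sketch leaves out.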
Alternate Processing Sites
n Solid, cost-effective continuity of operations - plan for an
alternate processing site
n The action that triggered the shift could also have rendered
the original physical location unusable
n Consider the scale and volume of transactions
n Consider the operators and temporarily move them
n With multiple sites a different geographic office can cover the
one that is lost using local people for continuity processes
Alternate Business Practices
n Because continuity of operations maintains only key
systems, the business practices will most likely be different
n This leads to alternate business practices
n Need to meet the objectives of the continuity of operations
Stay Alert!
There is no 100 percent secure system, and
there is nothing that is foolproof!