Resiliency and Automation Strategies (rowdysites.msudenver.edu/~fustos/cis3500/pdf/chapter16.pdf)
CIS 3500 1
Chapter #16: Resiliency and Automation Strategies
Architecture and Design
Chapter Objectives
n Learn how resiliency strategies reduce risk
n Discover automation strategies to reduce risk
Resiliency and Automation Strategies
n Resilient systems are those that can return to normal
operating conditions after a disruption
n Goal is to reduce risk associated with failures
n Configuration and setup strategies
n Automation is used to improve efficiency and accuracy
when administering machines using commands
Automation/Scripting
n Automation and scripting: using tools and methods to perform
tasks otherwise performed manually
n This improves efficiency and accuracy and reduces risk
n Prewritten and tested scripts reduce the chance of user error
n Scripts can be chained together to provide a means of automating
complex actions
n Automation via scripts can save significant time
n Automation is a major element of an enterprise security program,
supported by protocols, standards, methods, and architectures
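The idea of chaining pre-tested scripts can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a production tool: the step names (`check_disk`, `rotate_logs`), the `run_chain` helper, and the 10 percent threshold are all hypothetical stand-ins for real, separately tested scripts.

```python
from typing import Callable, List, Tuple

# Each step is a small, separately testable function that returns
# (ok, message). All names here are hypothetical placeholders.
Step = Callable[[dict], Tuple[bool, str]]


def check_disk(state: dict) -> Tuple[bool, str]:
    # Assumed rule: at least 10 percent of the disk must be free.
    ok = state.get("disk_free_pct", 0) >= 10
    return ok, ("disk ok" if ok else "disk space low")


def rotate_logs(state: dict) -> Tuple[bool, str]:
    # Stand-in for a real log rotation script.
    state["logs_rotated"] = True
    return True, "logs rotated"


def run_chain(steps: List[Step], state: dict) -> List[str]:
    """Run steps in order and stop at the first failure."""
    log = []
    for step in steps:
        ok, message = step(state)
        log.append(f"{step.__name__}: {message}")
        if not ok:
            break
    return log
```

Stopping at the first failed step keeps a later script from acting on a system that an earlier check has already flagged.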
Automated Courses of Action
n Scripts allow automated courses of action
n They can be tested and approved before use in production
n They are referenced in NIST Special Publication 800-53, which
specifies security and privacy controls for U.S. federal
information systems
n Automated courses reduce errors and can save time
Continuous Monitoring
n Continuous monitoring: a system that has monitoring built
into it
n Continuous monitoring is an operational process by which
you can monitor controls and determine if they are
functioning in an effective manner
n Automated dashboards and alerts show out-of-standard
conditions
n This allows operators to focus on the parts that need attention
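A small Python sketch of the alerting idea: compare each reading against an accepted range and report only what falls outside it. The metric names and ranges in `STANDARDS` are hypothetical values chosen for illustration, not recommended thresholds.

```python
from typing import Dict, List, Tuple

# Hypothetical acceptable ranges (low, high) for each monitored metric.
STANDARDS: Dict[str, Tuple[float, float]] = {
    "cpu_pct": (0.0, 85.0),
    "failed_logins_per_min": (0.0, 5.0),
}


def out_of_standard(readings: Dict[str, float]) -> List[str]:
    """Return one alert line for each reading outside its range."""
    alerts = []
    for name, value in sorted(readings.items()):
        low, high = STANDARDS[name]
        if not (low <= value <= high):
            alerts.append(f"ALERT {name}={value} outside [{low}, {high}]")
    return alerts
```

Readings that are within range produce no output, so an operator sees only the conditions that need attention.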
Configuration Validation
n Configuration validation: validating that the configuration meets
security standards, with no added functionality
n Extra ports, services, and accounts are disabled, removed, or
turned off, and ACLs are correct and working as designed
n After changes, software patches, and additions to or removals
from the system, the configuration needs to be
re-validated
n Automated testing can scale to cover this and help resolve issues
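Automated validation can be as simple as comparing what is actually present against an approved baseline. The sketch below assumes a hypothetical `BASELINE` of allowed ports and accounts; a real check would collect the actual values from the system rather than take them as an argument.

```python
from typing import Dict, List, Set

# Hypothetical hardened baseline: only these ports and accounts are allowed.
BASELINE: Dict[str, Set] = {
    "open_ports": {22, 443},
    "accounts": {"admin", "svc_web"},
}


def validate_config(actual: Dict[str, Set]) -> List[str]:
    """List every item present on the system but absent from the baseline."""
    findings = []
    for key, allowed in BASELINE.items():
        extra = actual.get(key, set()) - allowed
        for item in sorted(extra, key=str):
            findings.append(f"unexpected {key}: {item}")
    return findings
```

An empty result means nothing extra was found. Because the check is a function, it can be rerun after every patch or change.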
Templates
n Templates are master recipes for building servers,
programs, or even entire systems
n Templates make Infrastructure as a Service possible
n They enable setting up standard business arrangements
and technology stacks
n They allow rapid, error-free creation of configurations,
connection of services, testing, deployment
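As a rough illustration of a template as a "master recipe", the sketch below fills a fixed server definition from a few parameters using Python's standard `string.Template`. The field names (`hostname`, `role`, `open_ports`) and the `render_server` helper are hypothetical and do not follow any particular provisioning tool's format.

```python
from string import Template

# Hypothetical server recipe; every server built from it has the same shape.
SERVER_TEMPLATE = Template("hostname=$name\nrole=$role\nopen_ports=$ports\n")


def render_server(name: str, role: str, ports: list) -> str:
    """Fill the recipe; substitute() raises KeyError if a field is missing."""
    return SERVER_TEMPLATE.substitute(
        name=name,
        role=role,
        ports=",".join(str(p) for p in ports),
    )
```

Because every configuration comes from the same recipe, a fix to the template reaches every system built from it afterwards.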
Master Image
n A master image is a premade, fully patched image of a system
n A VM can be configured and deployed in seconds
n It provides a true clean backup of the operating system and
applications: everything but the data
n Should an error be found, you have one image to fix and then
redeploy
n Master images work well for enterprises with multiple desktops
Non-persistence
n Non-persistence is when a change to a system is not permanent
n Can be useful when you wish to prevent certain types of malware
attacks
n A system that cannot preserve and save changes cannot have
persistent files added into its operations
n A simple reboot wipes out the new files, malware, etc.
n There are utility programs that can freeze a machine against change
Snapshots
n Snapshots: instantaneous savepoints on virtual machines
n They allow you to restore the VM to a previous point in time
n Snapshots work because a VM is just a file on a machine: setting
the file back to a previous version reverts the VM
n Snapshots can be used to roll a system back, undo operations, or
provide a quick means of recovery
n They act as a form of backup, very useful in reducing risk
n They can act as a non-persistence mechanism
n Any user data stored on the system after the snapshot was taken will be lost on revert
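The snapshot-and-revert behavior can be mimicked with a toy model in which the whole "VM" is a single Python dictionary. `TinyVM` is a hypothetical class for illustration only and does not reflect any hypervisor's API.

```python
import copy


class TinyVM:
    """Toy stand-in for a VM whose entire state is one dictionary."""

    def __init__(self):
        self.state = {"os_patch_level": 1, "files": []}
        self._snapshots = {}

    def snapshot(self, label):
        # Save an independent copy of the current state under a label.
        self._snapshots[label] = copy.deepcopy(self.state)

    def revert(self, label):
        # Replace the current state with the saved copy.
        self.state = copy.deepcopy(self._snapshots[label])
```

If a snapshot is taken, files are then added to `state["files"]`, and `revert` is called, the list is empty again. That removes unwanted files and legitimate user files alike.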
Revert to Known State
n Reverting to a known state is an operating system
capability
n Many OSs now have the capability to create a restore point
- a copy of key files that change upon updates to the OS
n You can roll back the clock on the OS and restore to an
earlier time at which you know the problem did not exist
n This feature only protects the OS and associated files, and
a roll back does not result in loss of a user’s files
Rollback to Known Configuration
n Rollback to a known configuration is another way of saying revert
to a known state
n In Windows 7 and earlier, the “Last Known Good Configuration” option during
boot rolls back the registry to the last values that properly
completed a boot cycle
n From Windows 8 forward, pressing F8 on bootup is not an option
unless you change to Legacy mode
n The proper method of backing up and restoring registry settings in
Windows 8 through 10 is through a system restore point
Live Boot Media
n Live boot media is an optical disc or USB device that
contains a complete bootable system
n This may be used as a recovery mechanism
n If the internal drive is encrypted, you will need backup keys
to access it
n This is also a convenient method of booting to a
task-specific operating system, forensic tools, or incident
response tools
Elasticity
n Elasticity: the ability of a system to dynamically increase its
workload capacity using additional hardware resources
added on demand
n This can be set to occur automatically
n It is one of the strengths of cloud environments: you pay only for
the resources you actually use
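One common way to automate elasticity is a target-tracking rule: size the pool so that average utilization lands near a chosen target. The sketch below is a simplified illustration. The function name `desired_nodes`, the 60 percent target, and the node limits are assumptions, not values from any cloud provider.

```python
import math


def desired_nodes(current_nodes: int, avg_cpu_pct: float,
                  target_cpu_pct: float = 60.0,
                  min_nodes: int = 1, max_nodes: int = 10) -> int:
    """Size the pool so that average CPU moves toward the target."""
    needed = math.ceil(current_nodes * avg_cpu_pct / target_cpu_pct)
    # Clamp to the allowed pool size.
    return max(min_nodes, min(max_nodes, needed))
```

The same rule shrinks the pool when load drops, which is what makes paying only for resources in use possible.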
Scalability
n Scalability is a design element that enables a system to
accommodate larger workloads by adding resources, either by
making hardware stronger (scale up) or by adding
nodes (scale out)
n Commonly used in server farms and database clusters
n Both elasticity and scalability have an effect on system
availability and throughput, which can be significant
security- and risk-related issues
Distributive Allocation
n Distributive allocation is the transparent allocation of requests
across a range of resources
n Distributive allocation handles the assignment of jobs
n When the jobs are stateful, the process ensures that the
subsequent requests are distributed to the same server to
maintain transactional integrity
n When the system is stateless, other load-balancing routines are
used to spread the work
n Distributive allocation addresses the availability aspect of security
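The stateful and stateless cases can be contrasted in a short Python sketch. Hashing the session identifier is one possible way to keep a session on the same server, and round-robin is one possible stateless routine. The server names and function names are hypothetical.

```python
import hashlib
from itertools import count

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical server pool
_round_robin = count()


def pick_stateful(session_id: str) -> str:
    """Map a session to a fixed server so its requests stay together."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:4], "big") % len(SERVERS)]


def pick_stateless() -> str:
    """Rotate through the pool for requests that carry no session state."""
    return SERVERS[next(_round_robin) % len(SERVERS)]
```

Note that simple modulo hashing reassigns many sessions whenever the pool size changes, so this sketch suits a fixed pool only.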
Redundancy
n Redundancy: the use of multiple, independent elements to perform a
critical function, so that if one fails, there is another that can
take over the work
n Redundancy includes the use of redundant servers, redundant
connections, and redundant ISPs
n Having hardware (or software) spares for critical
functions in the organization can greatly facilitate maintaining
business continuity in the event of software or hardware
failures
High Availability
n One of the objectives of security is the availability of data
and processing power when an authorized user desires it
n High availability: the ability to maintain availability of data and
operational processing (services) despite a disrupting event
n This requires redundant systems (power and processing)
n High availability is more than data redundancy; it requires
that both data and services be available
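The effect of redundancy on availability can be estimated with a standard reliability formula: if each of n redundant elements is available with probability a, and failures are assumed to be independent, at least one is available with probability 1 - (1 - a)^n. The sketch below is a back-of-the-envelope aid. The independence assumption rarely holds exactly, because shared power or network faults affect several elements at once.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant elements, assuming independent failures."""
    return 1.0 - (1.0 - single) ** copies
```

For example, two independent elements that are each 99 percent available give a combined availability of about 99.99 percent.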
Fault Tolerance
n Fault tolerance has the same basic goal as high
availability: uninterrupted access to data and services
n It can be accomplished by the mirroring of data and
hardware systems
n Should a “fault” occur, the mirrored system provides the
requested data with no apparent interruption
n Certain systems are more critical to business operations
and should be the object of fault-tolerant measures
RAID
n One way to increase reliability in disk storage is to employ a
Redundant Array of Independent Disks (RAID)
n It takes data that is normally stored on a single disk and
spreads it out among several others
n In redundant RAID levels, if any single disk is lost, the data can
be recovered from the other disks where the data also resides
n RAID can also increase the speed of data retrieval, as
multiple drives can be busy retrieving requested data
RAID
n RAID 0 (striped disks) simply spreads the data across several disks with no
redundancy offered
n RAID 1 (mirrored disks) copies the data from one disk onto two or more disks
n RAID 2 (bit-level error-correcting code) stripes data across the drives at the bit level
as opposed to the block level
n RAID 3 (byte-striped with error check) spreads the data across multiple disks at the
byte level with one disk dedicated to parity bits
n RAID 4 (dedicated parity drive) stripes data across several disks but in larger stripes
than in RAID 3, and it uses a single drive for parity-based error checking
n RAID 5 (block-striped with error check) is a commonly used method that stripes the
data at the block level and spreads the parity data across the drives – reliability and
increased speed performance (minimum of three drives)
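Parity-based recovery, as used in the parity RAID levels above, rests on XOR: the parity block is the XOR of the data blocks, so any single missing block equals the XOR of the blocks that remain. The Python sketch below shows this for one stripe with made-up four-byte blocks. It does not model how RAID 5 rotates parity across drives, and `xor_blocks` is a hypothetical helper name.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus their parity block.
data = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(data)

# Simulate losing the second block and rebuilding it from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Here `rebuilt` equals the lost block `data[1]`. Losing two blocks from the same stripe cannot be recovered this way, which is why single-parity levels tolerate only one failed drive.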
Stay Alert!
There is no 100 percent secure system, and
there is nothing that is foolproof!