IT security testing, a practical guide — Part 6: Failure mode testing
April 1993 Computer Audit Update
software house should be prevented from selling software which in any way discloses the purchaser's confidential information.
If ownership is to remain with the software house the purchaser may be able to recover some of the development costs charged to it by agreeing repayments are made to it should the contractor make subsequent sales of the software to a third party.
To successfully acquire bespoke software there are many points which need to be included in a purchase agreement. The lessons of the London Ambulance Service and the Taurus experience are clear. As the London Ambulance Inquiry Report concluded when the system was fully implemented last Autumn: "The software was not complete, not properly tuned and not fully tested". The purchasers of commissioned software are advised to ensure that the same does not apply to any software bought by them.
IT SECURITY TESTING, A PRACTICAL GUIDE - PART 6
FAILURE MODE TESTING
Bernard Robertson and David Pullen PA Consulting Group
Failure mode testing is a specialized form of testing which is extensively used with defence and safety critical systems. Within security testing, however, it is appropriate for systems which perform critical functions, e.g. financial systems, systems processing confidential data or systems which must provide high levels of availability. The objective of failure mode testing is to determine the effect of failures within system components on the entire system (e.g. the failure of the disk drive onto which the audit log is being written). In all cases the component should 'fail safe', i.e. failure should not result in any security exposures. Failures which could be initiated either accidentally or deliberately in the live system should be created, since an attacker who is caught could claim that the failure was an accident.
SCOPE
Failure mode testing overlaps with other test areas such as hardware and software testing, and the testing of contingency and disaster recovery plans and procedures. In this article, failure mode testing is defined to exclude failures in design or input data, which should be covered by hardware and software testing.
TEST PROCESS
It is essential that failure mode testing is carefully structured to be as effective and efficient as possible. Failure mode testing should be directed to those areas most likely to fail in an insecure way and should adhere to the general structure suggested for security testing in an earlier article in this series, in particular:
• System familiarization.
• Test scripting and logging.
• Problem reporting.
The test process consists of 4 phases described below.
1. Identify points of failure
All the possible points of failure of the system should be identified. This process is best undertaken by starting at the highest level within the system and successively breaking it down into smaller components. For example an office LAN could be broken down into subgroups within the hardware and software areas as follows.
Hardware
    Terminals
        Screen
        Keyboard
        Mouse
        Processing units
        Storage devices
            Floppy diskettes
            Hard disk, etc.
    File servers
        Screen
        Processing unit
        Storage devices, etc.
    Print servers
        Screen
        Processing unit, etc.
    Printers
    PC communications card
    Interface between PC and communications card
    LAN cables and connectors
    LAN gateways
Software
    Local
    Central
    System
    Communications software

©1993 Elsevier Science Publishers Ltd
Once the lowest relevant level of component has been reached then the ways in which the component can fail should be considered. For example the PC communications card could fail in any of the following ways:
• Unrecoverable component failure, which is likely to cause a complete communications failure for the PC.
• Intermittent component failure, which is likely to cause spurious communications failures.
• Loose connection, which is likely to cause intermittent communications failures.
• Loss of the LAN address, which will result in a communications failure.
• Corruption of the firmware, which could result in unpredictable behaviour.
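The breakdown described above can be sketched as a small tree whose leaves carry their failure modes. This is only an illustration: the `Component` class and the `points_of_failure` function are hypothetical names introduced here, and the tree holds just a fragment of the office LAN example.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A node in the system breakdown; leaves are the lowest relevant level."""
    name: str
    children: list["Component"] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)


def points_of_failure(node, path=()):
    """Yield (location, failure_mode) for every failure mode of every leaf."""
    here = path + (node.name,)
    if not node.children:
        for mode in node.failure_modes:
            yield " / ".join(here), mode
        return
    for child in node.children:
        yield from points_of_failure(child, here)


# Illustrative fragment of the office LAN breakdown from the text.
lan = Component("Office LAN", [
    Component("Hardware", [
        Component("PC communications card", failure_modes=[
            "Unrecoverable component failure",
            "Intermittent component failure",
            "Loose connection",
            "Loss of the LAN address",
            "Corruption of the firmware",
        ]),
        # Failure modes for this leaf have not been listed yet.
        Component("LAN cables and connectors"),
    ]),
])

failures = list(points_of_failure(lan))
for location, mode in failures:
    print(f"{location}: {mode}")
```

A leaf with no failure modes recorded produces no rows, which makes gaps in the analysis easy to spot by comparing the leaves against the output.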
2. Identify the effects of failure
The possible effects of all of the identified failures should be defined. It is useful to identify the effects under the three security headings of confidentiality, integrity and availability. For example the following effects may be defined for an office LAN system which provides:
• Office facilities (e.g. word processing, spreadsheets, E-mail, etc.).
• A personnel performance recording system.
• Software and design documents for a proprietary application.
Confidentiality
    Expose sensitive information
        Expose personnel details
        Disclose password
Integrity
    Unauthorized data field change
        Modify personnel record
    Corruption of data
        Proprietary software design corrupted
Availability
    User disabled
        Password forgotten
        Password changed
    Terminal not available
        Communications failure
    System not available
        File server not available
        Communications failure
        System files corrupted (resulting in loss of the system)
    Data files lost
        Lose personnel database, etc.
3. Generate failure/effect matrix
Once the failures and effects have been defined, a matrix may be generated with the failures on one axis and the effects on the other. Each failure/effect combination may then be scored in terms of:
• The likelihood of the failure causing the effect.
• The impact of the effect.
The scoring system should be kept as simple as possible; a scale of 0 to 5 for each parameter is recommended. The overall score for each failure/effect combination may then be derived as the product of the two scores. A section of the matrix for the office LAN described above may look as shown in Figure 1.
The figure shows three types of failure for the PC communications card on the LAN. For each of these failures three effects are defined (one under each of the headings confidentiality, integrity and availability). The likelihood of the failure causing the effect is scored in the top left hand box for each failure/effect combination. For example an unrecoverable component failure on the PC communications card is very likely to cause the terminal not to be available and so scores 5 out of 5. The impact of the effect is scored in the top right hand box. In the example, the impact of a terminal not being available is likely to be small since there will be other terminals on the LAN which could be used. The impact is therefore given a score of 2. The product of the two scores is placed in the bottom box.
PC communications card
(each cell: likelihood | impact, as in the two top boxes, then = product, as in the bottom box)

                                   Conf.             Integ.               Avail.
                                   Expose            Unauth. modif.       Terminal
Failure                            personnel info    personnel details    not available

Unrecoverable component failure    0 | 3 = 0         0 | 2 = 0            5 | 2 = 10
Intermittent component failure     0 | 3 = 0         2 | 2 = 4            4 | 2 = 8
Loose connection                   0 | 3 = 0         2 | 2 = 4            4 | 2 = 8

Figure 1: Failure/effects matrix for Office LAN
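As a rough sketch of the scoring arithmetic, the matrix can be held as a mapping from failure/effect pairs to (likelihood, impact) scores, with the overall score computed as their product. The values below are the ones in Figure 1 as far as they can be read from it; the names `scores`, `overall` and `matrix` are illustrative only.

```python
UNRECOVERABLE = "Unrecoverable component failure"
INTERMITTENT = "Intermittent component failure"
LOOSE = "Loose connection"

EXPOSE = "Expose personnel info"
MODIFY = "Unauthorized modification of personnel details"
TERMINAL = "Terminal not available"

# (likelihood, impact), each on the recommended scale of 0 to 5.
scores = {
    (UNRECOVERABLE, EXPOSE): (0, 3),
    (INTERMITTENT, EXPOSE): (0, 3),
    (LOOSE, EXPOSE): (0, 3),
    (UNRECOVERABLE, MODIFY): (0, 2),
    (INTERMITTENT, MODIFY): (2, 2),
    (LOOSE, MODIFY): (2, 2),
    (UNRECOVERABLE, TERMINAL): (5, 2),
    (INTERMITTENT, TERMINAL): (4, 2),
    (LOOSE, TERMINAL): (4, 2),
}


def overall(likelihood, impact):
    """Overall score for one cell: the product of the two scores."""
    for value in (likelihood, impact):
        if not 0 <= value <= 5:
            raise ValueError(f"score {value} is outside the 0 to 5 scale")
    return likelihood * impact


matrix = {cell: overall(*pair) for cell, pair in scores.items()}

for (failure, effect), score in matrix.items():
    print(f"{failure:32} {effect:48} {score:2}")
```

Because each parameter is at most 5, no cell can score more than 25, which gives a fixed ceiling against which the scores can be compared.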
4. Selection of test areas
Once the failure/effect matrix is completed the test areas can be selected by referring to the scores. In practice there are unlikely to be many large final scores in the matrix. The tests may be listed in priority order based on the scores from the matrix. If there is limited resource available to undertake the testing then the tests may be executed in order of importance until the available resources are exhausted.
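One way to picture the selection step is to sort the candidate tests by score and take them in order until the resource budget runs out. The sketch below assumes each test has an effort estimate in days; those estimates, the budget and the function name `select_tests` are hypothetical and not part of the method as described.

```python
def select_tests(candidates, budget_days):
    """Pick tests in descending score order until resources are exhausted.

    candidates: list of (name, score, effort_days) tuples.
    """
    # A score of 0 means no likelihood or no impact, so there is nothing to test.
    worthwhile = [c for c in candidates if c[1] > 0]
    ranked = sorted(worthwhile, key=lambda c: c[1], reverse=True)

    selected = []
    remaining = budget_days
    for name, score, effort in ranked:
        if effort > remaining:
            break  # resources exhausted: stop rather than skip ahead
        selected.append(name)
        remaining -= effort
    return selected


# Scores follow Figure 1; the effort figures are invented for illustration.
candidates = [
    ("Loose connection -> terminal not available", 8, 1.0),
    ("Unrecoverable failure -> terminal not available", 10, 2.0),
    ("Intermittent failure -> unauthorized modification", 4, 3.0),
    ("Unrecoverable failure -> expose personnel info", 0, 1.0),
]

plan = select_tests(candidates, budget_days=4.0)
print(plan)
```

Stopping at the first test that does not fit keeps strictly to priority order; a tester might instead prefer to skip it and fit in smaller, lower-scoring tests, which is a judgment the text leaves open.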
TEST AREAS
Failure mode testing is difficult to perform efficiently and effectively because it is often complex and requires the disruption of the normal operation of the component under test. Following the structured approach outlined above will assist the tester in defining the tests which should be performed. A few of the areas to focus on are outlined below.
1. PC card failures
PC cards may fail in a number of different ways, in particular:
• The contacts may fail causing spurious errors.
• Integrated circuits may fail.
• Batteries may fail or lose contact.
• Tamper resistant areas may fail to erase secret data under environmental extremes.
Cards which are usually of particular interest are:
• Communications cards (e.g. Ethernet, X.25).
• PC access control cards (e.g. PC Guard).
2. Communication line failures
Communications lines and connectors are susceptible to intermittent and complete failures, for example:
• Cables may be accidentally or deliberately cut resulting in loss of availability.
• Cables may be accidentally or deliberately damaged resulting in a short circuit or an intermittent connection.
• Connectors may become worn resulting in poor electrical contact and intermittent failures.
3. Access token failures
Tokens and their associated readers used for access control purposes may fail or become unreliable in operation through general wear and tear or deliberate abuse. Failures of access control tokens or readers could result in:
• Incorrect user identification.
• Failure to acknowledge termination of a user session or the commencement of a new session.
4. Failure of storage media
Storage media may fail completely, intermittently or partially. For example the read/write head on a disk pack may 'crash' rendering the pack unusable, the connectors to the storage media may become worn resulting in intermittent read/write errors, or sectors of a disk may become corrupted and unusable. If the storage media is used for a critical security function (e.g. access control lists or audit trail logs) then the failure may result in security exposure (e.g. audit logging may cease or a user may be allowed access to applications to which access is usually denied).
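The audit log case can be illustrated with a simulated storage failure, which also avoids damaging real media. In this sketch the classes `FailingDisk` and `MemoryDisk` and the function `perform_audited` are hypothetical stand-ins; a real system would have its own logging interface, and the test would be run against that.

```python
class MemoryDisk:
    """Stand-in for healthy storage: keeps audit records in memory."""

    def __init__(self):
        self.records = []

    def write(self, record):
        self.records.append(record)


class FailingDisk:
    """Stand-in that simulates a storage failure on every write."""

    def write(self, record):
        raise OSError("simulated disk failure")


def perform_audited(user, action, log, operation):
    """Run `operation` only after the audit record has been written.

    If the log cannot be written the request is refused, so a storage
    failure cannot be used to act without leaving an audit trail.
    """
    try:
        log.write(f"{user}: {action}")
    except OSError:
        return "denied: audit log unavailable"
    return operation()


healthy = MemoryDisk()
normal = perform_audited("alice", "read payroll", healthy, lambda: "granted")
faulty = perform_audited("alice", "read payroll", FailingDisk(), lambda: "granted")

print(normal)  # the operation runs and the record is kept
print(faulty)  # the operation is refused because logging failed
```

A failure mode test would assert the second outcome: if the operation were still granted with the log unavailable, the component would not be failing safe.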
5. Cryptographic process failures
The cryptographic processes in a system may fail completely or intermittently and messages (which may include encryption keys) may be passed in cleartext if the failure is not properly detected by the system. Encryption keys may be compromised if tamper resistant areas fail to operate correctly and erase the secret information when attacked.
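The cleartext risk can be shown with a simulated encryption fault. The sketch below is an assumption-laden illustration: `send_protected`, `CryptoFailure`, `toy_cipher` and `bypassed_cipher` are hypothetical names, and the XOR routine is a placeholder for a real encryption unit, not real encryption. Comparing output with input only catches a complete bypass; it is not a general test of cryptographic correctness.

```python
class CryptoFailure(Exception):
    """Raised when a message cannot be protected and so must not be sent."""


def send_protected(message, encrypt, transmit):
    """Encrypt a non-empty message and transmit it, refusing on any fault."""
    try:
        ciphertext = encrypt(message)
    except Exception as exc:
        raise CryptoFailure("encryption failed; message not sent") from exc
    if ciphertext == message:
        # The unit handed the input straight back: treat this as a failure.
        raise CryptoFailure("output equals cleartext; message not sent")
    transmit(ciphertext)


def toy_cipher(message):
    """Placeholder transformation only. NOT real encryption."""
    return bytes(b ^ 0x5A for b in message)


def bypassed_cipher(message):
    """Simulated fault: the encryption unit passes data through unchanged."""
    return message


line = []  # stands in for the communications line
secret = b"session key 1234"

send_protected(secret, toy_cipher, line.append)

try:
    send_protected(secret, bypassed_cipher, line.append)
    leaked = True
except CryptoFailure:
    leaked = False

print(len(line), leaked)
```

The test passes only if nothing reaches the line when the fault is injected, which is the fail-safe behaviour the tester is looking for.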
ISSUES
There are a number of issues which a tester should bear in mind when embarking on failure mode testing; they are listed below.
1. Minimize destructiveness
Many failure mode tests are destructive in nature. Wherever possible the tester should seek to simulate the failure without causing any irreparable damage to the component under test. However in some circumstances this approach may not be possible and the tester should consider whether it is appropriate to undertake the test on a real system or attempt a theoretical analysis.
2. Dedicated test system
Many of the tests will cause serious disruption to the system under test. It will be necessary to use a dedicated test system to avoid disrupting any other testing programmes.
3. Gain management support
It is essential to gain management support for failure mode testing because of the two issues described above. The tester is unlikely to be able to get the level of resourcing to successfully undertake the tests until management are thoroughly convinced of the value of the testing. Testers should undertake failure mode testing only on those systems where it is appropriate and for which a convincing case for testing can be made. Testers should also reassure management that they will minimize the level of disruption to other test programmes. Management should be kept informed throughout the test process to ensure that they understand what is happening. Useful methods for keeping management informed include information feedback sessions, and distribution of the test specification and the test report.
4. Realistic tests
The tester should be realistic about the level of failure mode testing required. It is very easy to get carried away with esoteric tests which simulate failures which are unlikely to occur in practice (whether initiated accidentally or deliberately). The tester should continually ask the questions: "How could this failure occur?", and "How likely is the failure in reality?".
This article has described the testing of systems using failure mode techniques. The next article will examine the specification, development and use of test tools.
Bernard Robertson is a principal consultant in the Security Consulting Practice of PA Consulting Group. He has extensive experience in performing a range of security testing programmes for public and financial sector clients. Bernard is a regular speaker on IT security issues and holds degrees in economics and business administration. David Pullen is a senior consultant within the same Security Consulting Practice. Over the last five years he has conducted several security testing projects, including one lasting two years with a team of 15 security testers. David is a physics graduate and a qualified teacher who has produced a wide range of educational material on security testing.
RISK ASSESSMENT FOR EDI IMPLEMENTATION
Dr Brian S Collins
Background
Risk analysis methodologies are well established in the IT security world. Their application to communication systems however is less mature and even less so in the context of value added services such as EDI. The principles are nevertheless much the same and the topics