Maintaining and testing a disaster recovery plan


Transcript of Maintaining and testing a disaster recovery plan

Page 1: Maintaining and testing a disaster recovery plan

Vol. 10, No. 6, Page 12

To meet this need, and to offset its own high costs of data processing, the State Bank of New South Wales is offering customers the use of its new computer disaster recovery centre, located in Sydney's central business district, at a cost of between 1% and 2% of their annual data processing budget. The Bank says the centre gives access to almost a complete duplication of service.

Frank Rees, Melbourne, Australia

MAINTAINING AND TESTING A DISASTER RECOVERY PLAN

A comprehensive disaster recovery plan can only be completed at considerable expense and effort to the organization concerned. The continually changing business and computing environment will dictate that, to maintain the viability of the plan, updating procedures must be in place. The consequences of omitted or out-of-date information within the plan are correctly considered to be the failure of part of, or at worst, all of the recovery capability.

The overall effectiveness of the recovery plan will be severely impacted by changes in the operational environment which the plan was originally created to protect. Major factors which would affect the plan include the introduction of new equipment, departmental and staff organizational changes, and the introduction of new applications.

Procedures must be in place to ensure that all changes which could impact the plan are systematically recorded and passed to a central point where plan amendment is carried out. This can only be achieved by way of established procedures. It is essential that there is regular feedback both from those who are introducing the changes, in the form of improved systems, etc., and from those who can recognize that changes have evolved, e.g. changes in personnel, suppliers, etc.

The person responsible for the plan, and for its implementation when needed, is unlikely, except in the smallest installations, to be aware of all amendments as they occur. Assistance from those with a vested interest in the plan will be required in the majority of cases.

The updating procedure could be organized in the following way. The Recovery Team Leaders should communicate all changes to their relevant areas (e.g. operations, data communications, facilities, hardware) to a central point. It may be that the post of Recovery Plan Manager is a full-time one, in which case all changes would be presented to, and approved by, him.

Alternatively, it may be decided to assign a portion of the task (e.g. the assembly and distribution of plan updates) to a member of staff, as a purely administrative role. The leader of the recovery process, often referred to as the Recovery co-ordinator, would have the final responsibility, in addition to his everyday duties, of approving the amendment prior to authorizing its distribution.
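The updating procedure described above can be pictured as a small register held at the central point. The following is a minimal sketch only: the class names, field names and the single approval step are assumptions made for illustration and are not prescribed by the article.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Amendment:
    """A proposed change to the recovery plan (field names are illustrative)."""
    area: str                 # e.g. "operations", "data communications", "hardware"
    description: str
    submitted_by: str         # the Recovery Team Leader reporting the change
    approved_by: Optional[str] = None


@dataclass
class PlanRegister:
    """Hypothetical central point where changes are recorded before distribution."""
    pending: List[Amendment] = field(default_factory=list)
    distributed: List[Amendment] = field(default_factory=list)

    def submit(self, amendment: Amendment) -> None:
        # Team leaders pass every change in their area to the central point.
        self.pending.append(amendment)

    def approve_and_distribute(self, coordinator: str) -> List[Amendment]:
        # The co-ordinator approves each amendment before it is distributed.
        released = list(self.pending)
        for amendment in released:
            amendment.approved_by = coordinator
        self.distributed.extend(released)
        self.pending.clear()
        return released
```

In this sketch nothing reaches the distributed list without a named approver, which mirrors the requirement that the co-ordinator authorizes each amendment before it is circulated.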

© 1988 Elsevier Science Publishers B.V., Amsterdam./88/$0.00 + 2.20

COMPUTER FRAUD & SECURITY BULLETIN

No part of this publication may be reproduced, stored in a retrieval system, or transmitted by any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the publishers. (Readers in the U.S.A. - please see special regulations listed on back cover.)

Page 2: Maintaining and testing a disaster recovery plan

Vol. 10, No. 6, Page 13

In addition to the on-going amendments, a review of the plan should take place with all interested parties, at least once a year. It may be decided that a test is to be carried out of specific areas immediately after each review, in which case the results of the test and subsequent lessons learnt could considerably enhance the overall plan; testing objectives and methods will be discussed later.

Even with the most efficiently organized maintenance mechanism, the need to incorporate recovery awareness into all aspects of the data processing facility is of prime importance. Upon the introduction of major changes, recovery capability should be uppermost, and not merely an afterthought.

The true effectiveness of the plan, despite constant updates and reviews, can only be gauged by a programme of systematic testing with actual results tabulated and compared with expectations.

There often exists a certain reluctance to test recovery plans; the argument is that the test would cause too much disruption to normal day-to-day operations. Provided that test procedures receive the same degree of attention as other aspects of the plan, a high degree of confidence will result.

The number of tests which should be performed will be in direct proportion to the volume of changes in the established environment. However, it is recommended that a minimum of two tests are carried out annually. A portion of the recovery planning budget should be set aside for testing purposes, particularly if a special team is to be set up both to design the test and to examine the results. An increased degree of realism can be achieved if the team consists of, for example, auditors or users, rather than data processing management.

A considerable amount of preparation for the test will need to take place. Factors which will need to be taken into consideration are:

1. The specific objectives of the test, for example:

a) to confirm the compatibility and capability of designated standby equipment to process one or more critical applications.

b) to assess the ability to modify the existing network in order to support critical applications.

c) to bring weaknesses and omissions to light so that rectification can take place.

d) to use the test, not only to confirm the plan, but also as a means of familiarizing and training staff in recovery techniques.

e) to confirm the adequacy of vital storage and the efficiency of retrieval methods.

Page 3: Maintaining and testing a disaster recovery plan

Vol. 10, No. 6, Page 14

f) to ensure that the recovery procedures can be initiated and performed within defined timescales.

g) to reinforce the correct attitude among the recovery personnel by demonstrating, by way of the test, that the plan is to be taken seriously.

h) to increase the level of confidence among both recovery personnel and senior management that critical applications can be recovered within acceptable timescales.

i) to demonstrate to senior management the level of available safeguards and recovery capability, and the possible need for further measures.

2. The description, nature and extent of the proposed disaster (e.g. major or minor fire, hardware failure, loss of key personnel), and the effect on normal processing.

3. The set of rules by which the test is to be carried out (e.g. denial of access to normal data files, or a restriction on the availability of key personnel).

4. The method of reviewing the effectiveness of the recovery personnel and the performance of individuals involved, once the test has been completed. In particular, the method of deciding the degree of success or failure should be determined.
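One possible way of fixing the measure of success in advance is to record a target timescale for each objective and compare it with what was actually observed. The sketch below assumes that every objective can be expressed as a time in hours; the names `Measurement`, `tabulate` and `success_rate` are illustrative, and the article does not prescribe this or any other scoring method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Measurement:
    objective: str
    target_hours: float   # timescale agreed before the test
    actual_hours: float   # time observed during the test

    @property
    def met(self) -> bool:
        return self.actual_hours <= self.target_hours


def tabulate(results: List[Measurement]) -> str:
    """Lay out actual results against expectations, one objective per row."""
    lines = [f"{'Objective':<32}{'Target':>8}{'Actual':>8}  Result"]
    for m in results:
        outcome = "met" if m.met else "MISSED"
        lines.append(
            f"{m.objective:<32}{m.target_hours:>8.1f}{m.actual_hours:>8.1f}  {outcome}"
        )
    return "\n".join(lines)


def success_rate(results: List[Measurement]) -> float:
    """Fraction of objectives achieved within their target timescale."""
    if not results:
        raise ValueError("no measurements recorded")
    return sum(m.met for m in results) / len(results)
```

A table of this kind could also form the core of the written report to senior management, since it shows at a glance which timescales were missed.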

An effective method of testing is the limited disaster test: only one application system is involved at any one time, which limits the disruption caused to other areas for the duration of the test and adds a sense of realism to the exercise. An example of the stages of a limited disaster test is as follows:

* Select an application for recovery testing and set a time for the "disaster" to strike. Particularly busy periods should be avoided for initial tests so that the simulated disaster does not cause a real-life one! It is important that the most easily tested portions of the plan are tested first, so that if amendments are required, they can be implemented quickly. Similarly, any known areas of weakness should be avoided at first so that an immediate expected failure does not result.

* Declare a disaster, describing the actual circumstances, without prior warning to the individuals concerned.

* Remove, or prevent access to, all normal data files, operating documentation and other back-up material relating to the application.

* Instruct recovery personnel to carry out emergency processing of the application using only back-up data and supplies. Variations of the simulated disaster could further involve the use of standby equipment.


Page 4: Maintaining and testing a disaster recovery plan

Vol. 10, No. 6, Page 15

* Ensure that the established recovery procedures are adhered to during the test itself, without the introduction of amendments.
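On a modern file-based system, the stages listed above could be rehearsed for a single application along the following lines. This is a simplified sketch, not a procedure from the article: it assumes the application's data is a flat directory of files, that the back-up is expected to match the live data exactly, and that moving the live directory aside is an acceptable way of denying access. The function names and directory layout are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def run_limited_drill(live_dir: Path, backup_dir: Path, workspace: Path) -> bool:
    """Simulate losing one application's files and recovering from back-up only.

    Returns True if every file present before the "disaster" is recovered intact.
    """
    # Record what the application held before the disaster is declared.
    expected = {p.name: p.read_bytes() for p in live_dir.iterdir() if p.is_file()}

    # Deny access to the normal data files by moving them out of reach.
    quarantine = workspace / "quarantine"
    shutil.move(str(live_dir), str(quarantine))

    try:
        # Emergency processing: rebuild the live area from back-up material only.
        shutil.copytree(backup_dir, live_dir)
        recovered = {p.name: p.read_bytes() for p in live_dir.iterdir() if p.is_file()}
        return recovered == expected
    finally:
        # End of test: discard the recovered copy and put the real files back.
        shutil.rmtree(live_dir, ignore_errors=True)
        shutil.move(str(quarantine), str(live_dir))


def demo() -> bool:
    """Run the drill against throwaway directories with one sample file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        live, backup, work = root / "live", root / "backup", root / "work"
        for directory in (live, backup, work):
            directory.mkdir()
        (live / "ledger.dat").write_text("balance=100")
        (backup / "ledger.dat").write_text("balance=100")
        return run_limited_drill(live, backup, work)
```

In practice a back-up usually lags behind the live data, so a real drill would compare against the state expected at the last back-up rather than demand an exact match.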

Whichever approach is taken, it is particularly important that certain elements of the recovery procedures are scrutinized. These are:

1. The ability and efficiency with which essential data can be recovered from the security archive.

2. The effectiveness of the procedures to alert recovery personnel, particularly out of normal working hours.

3. The capability of recovery procedures to be implemented by possibly untrained staff, in the absence of key personnel.

4. The ability of the organization to continue functioning during the recovery period when non-critical applications may not be processed.

The results of the test should be presented to senior management in the form of a written report. Not only will this create a realistic picture of the effectiveness of the recovery procedures, but it will also serve to illustrate the need for further resources to be applied to identified weaknesses. The report should also positively indicate the level of success or failure of the plan which, until tested, will always be surrounded by a degree of uncertainty.

Steve Watt, Alkemi Ltd, UK

UK NATIONAL CERTIFICATION SCHEME FOR ACCOUNTING SOFTWARE

The potential losses that can be caused by the use of faulty computerized accounting systems can easily equal or exceed the damage incurred through deliberate acts of fraud. It is not unknown for fundamental faults in such systems to go unnoticed until extremely serious financial damage has been done, unlike the traditional "paper and ledger" accounting system, where such things as incorrect entries at least stand a chance of being detected at a reasonably early stage.

With this in mind, the UK's National Computing Centre (NCC) has inaugurated a scheme designed to test all accounting programs, using a system devised in collaboration with the UK Institute of Chartered Accountants. Previous efforts to establish such a UK national testing and certification system had been opposed by leading software houses, partly because of the NCC's initial intention to conduct all testing itself. A compromise now seems to have been reached, however, whereby individual manufacturers will test their own accounting packages according to specifications supplied by the NCC. Manufacturers will report results direct to the NCC, which retains the right to conduct spot checks to ensure compliance. The NCC's testing service and its certificate of approval will take up to six weeks to complete, at a cost to the software supplier of between £1000 and £3500.
