CS, AU, Henrik Bærbak Christensen: Critical Systems (Sommerville 7th Ed, Chapter 3)

Page 1:

Critical Systems

Sommerville 7th Ed

Chapter 3

Page 2:

Critical Systems

  Sommerville: Critical system = dependability is the most important quality

  Three main types:

– Safety-critical systems: A system whose failure may result in injury, loss of life or serious environmental damage

– Mission-critical systems: A system whose failure may result in the failure of some goal-directed activity

– Business-critical systems: A system whose failure may result in very high costs for users of the system.

Page 3:

Dependability

  Dependability equals trustworthiness

– Degree of user confidence that the system will operate as they expect

– Not a numerical/quantitative measure but a relative/perceived measure (from very high to very low)

  Engineering dependable systems often

– Are conservative: only use proven methods

– Are more costly: may, for example, use formal methods

– Must consider the socio-technical system

• Humans to handle errors; humans as source of errors

Page 4:

Dependability subqualities

Page 5:

Dependability

  Availability:

– Probability that the system will be able to deliver useful service at any given time

  Reliability:

– Probability that the system will correctly deliver services as expected over a given period of time

  Safety:

– Judgment of how likely it is that the system will cause damage to people or the environment

  Security:

– Judgment of how likely it is that the system can resist accidental or deliberate intrusions

Page 6:

Dependability versus performance

  Dependability costs performance, but usually dependability is more important than performance:

– Undependable systems are unused

– Failure may cost fortunes

– Dependability cannot be retrofitted

– Lack of performance can be compensated

– Untrustworthy systems may lose information

Page 7:

Measuring

  Two of the four aspects, security and safety, are measured qualitatively, that is, based upon judgment.

  Often one talks about integrity levels.

– Level 1 is better than level 2, etc.

  Example: NASA Space Shuttle mission software

– Fault severity levels

• Level 0 = Loss of craft and crew

• Level 1 = Failure of mission

• Level 2 …

Page 8:

Measuring

  Two of the four aspects may be measured quantitatively:

– Availability: Probability that a system at a point in time will be operational and able to provide services

– Reliability: Probability that a software system will not cause the failure of the system for a specified time under specified conditions.
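Both measures are often written with the textbook formulas below. They are not taken from the slides: MTTF (mean time to failure), MTTR (mean time to repair) and the constant failure rate \lambda are assumptions introduced here for illustration, and the reliability formula only holds if failures occur at a constant rate.

```latex
A = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}
\qquad
R(t) = e^{-\lambda t}, \quad \lambda = \frac{1}{\mathrm{MTTF}}
```

Here A is the long-run fraction of time the system is operational, and R(t) is the probability of running for a period t without failure.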

Page 9:

Exercises

  How do availability and reliability, as defined by Sommerville, fit the definitions by IEEE and Bass?

  Why does Bass not cover qualities such as Safety and Reliability?

  An available system – does it really require “at any time?”

Page 10:

Other sub qualities

  Other sub qualities of dependability

– Repairability: time to repair

– Maintainability: cost of introducing change

– Survivability: ability to continue to deliver services while under attack or while part of the system is disabled. [particularly important to web systems]

– Error tolerance: the extent to which the system has been designed so that user input errors are avoided and tolerated.

Page 11:

Cost

  Dependable systems are costly!

Page 12:

Reliability and Availability

Page 13:

The two

  These two qualities are similar but not the same.

– Both are probabilities, but

– A system can be highly available without being highly reliable

• Telephone switch systems: no dial tone, just try again

– A connection may fail, but if reconnecting is quick, no harm is done

  Availability depends on the time to fix the error

– A: Fails once a year, fixing takes three days

– B: Fails once a month, fixing takes 10 minutes

– A is the more reliable, B is the more available
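The A/B comparison can be checked with a little arithmetic. The following is a minimal sketch, assuming a 365-day year and outages that never overlap; the function name `availability` and its parameters are hypothetical, introduced only for this illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def availability(failures_per_year, repair_minutes):
    """Fraction of the year the system is up, assuming outages do not overlap."""
    downtime = failures_per_year * repair_minutes
    return 1 - downtime / MINUTES_PER_YEAR


# System A: fails once a year, each repair takes three days.
system_a = availability(failures_per_year=1, repair_minutes=3 * 24 * 60)
# System B: fails once a month, each repair takes 10 minutes.
system_b = availability(failures_per_year=12, repair_minutes=10)

print(f"A: {system_a:.4%}")
print(f"B: {system_b:.4%}")
```

Under these assumptions A comes out at about 99.18% and B at about 99.98%, so B is the more available system even though it fails twelve times as often.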

Page 14:

The two

  However, of course they are related

– An unreliable system will most certainly be unavailable…

  Why does Bass not mention reliability but does mention availability?

Page 15:

Ensuring reliability

  Reliability is compromised by failures, so reliability can be enhanced by several measures.

– Fault avoidance: simply avoid introducing defects!

– Fault detection and removal: Find and remove the defects before they cause failures.

– Fault tolerance: Ensure that faults do not lead to failures.

Page 16:

Run-time cycle Revisited

  Faults cause failures when faulty code is executed with inputs that expose the fault.

– I_e: the inputs that will lead the system into an error state

[Figure: input space with the subset I_e; program execution; state with error states]
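A small sketch may make the idea of I_e concrete. The function `mean` and the wrapper `tolerant_mean` are hypothetical examples, not taken from Sommerville: the fault is that `mean` does not handle the empty list, so here I_e contains exactly that one input.

```python
def mean(values):
    # Fault: the empty list is not handled, so I_e = {[]}.
    return sum(values) / len(values)


def tolerant_mean(values, default=0.0):
    """Fault tolerance: the fault stays in mean(), but it no longer causes a failure."""
    try:
        return mean(values)
    except ZeroDivisionError:
        # Error state detected; recover with a hypothetical default value.
        return default


print(mean([2, 4, 6]))    # input outside I_e: the fault is never exposed
print(tolerant_mean([]))  # input in I_e: the error is tolerated
```

Calling `mean([])` directly raises `ZeroDivisionError`, which is the failure the user would observe; for every other list of numbers the fault stays hidden.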

Page 17:

How does each technique cope?

  A) Avoidance?

  B) Detection and removal?

  C) Tolerance?

[Figure: input space with the subset I_e; program execution; state with error states]

Page 18:

Safety

Page 19:

Terminology

  Safety brings its own vocabulary

– Accident: Unplanned event or series of events which results in death, injury, damage to property or environment.

– Hazard: A condition with the potential for causing or contributing to an accident.

– Damage: A measure of the loss resulting from the accident.

– Hazard severity: Assessment of the worst damage resulting from a hazard.

– Hazard probability: Probability of the events occurring which create a hazard.

Page 20:

Exercise

  Therac-25 Cancer Radiation Therapy

– Malfunction 54…

  A software error killed Cox and Kidd. It involved the apparently straightforward operation of switching the machine between two operating modes. Linear accelerators, including the Therac-25, can produce two kinds of radiation beams: electron beams and X-rays. Patients are treated with both kinds. First, an electron beam is generated. It may irradiate the patient directly; alternatively, an X-ray beam can be created by placing a metal target into the electron beam: as electrons are absorbed in the target, X-rays emerge from the other side. However, the efficiency of this X-ray-producing process is very poor, so the intensity of the electron beam has to be massively increased when the target is in place. The electron beam intensity in X-ray mode can be over 100 times as great as during an electron beam treatment.

  However, if the operator selected X-rays by mistake, realized her error, and then selected electrons--all within 8 seconds [1, 13]--the target was withdrawn but the full-intensity beam was turned on. This error--trivial to commit--killed Cox and Kidd. Measurements at Tyler by physicist Fritz Hager, in which he reproduced the accident using a model of a patient called a "phantom," indicated that Kidd received a dose of about 25,000 rads--more than 100 times the prescribed dose [1, 2, 5].

  What are the accident, hazard, damage, hazard severity, hazard probability…

Page 21:

Techniques

  Hazard avoidance

  Hazard detection and removal

  Damage limitation
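As an illustration of hazard detection, the sketch below checks for one hazardous condition before acting: a high-intensity beam with no metal target in place. It is a hypothetical toy model, not the actual Therac-25 software; the names `MachineState`, `is_hazardous` and `fire_beam` and the string labels are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class MachineState:
    target_in_place: bool  # is the metal target in the beam path?
    beam_intensity: str    # "low" or "high" (hypothetical labels)


def is_hazardous(state: MachineState) -> bool:
    """Hazard detection: high-intensity beam without the metal target."""
    return state.beam_intensity == "high" and not state.target_in_place


def fire_beam(state: MachineState) -> str:
    # Interlock: refuse to fire while the hazardous condition holds.
    if is_hazardous(state):
        return "blocked"
    return "fired"


print(fire_beam(MachineState(target_in_place=True, beam_intensity="high")))
print(fire_beam(MachineState(target_in_place=False, beam_intensity="high")))
```

The check runs on the machine state at the moment of firing, so the hazard is caught regardless of which sequence of operator actions produced it.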

Page 22:

Security

Page 23:

Terminology

  Security also brings its own vocabulary

– Exposure: Possible loss or harm to the system.

– Vulnerability: A weakness in the computer-based system that can be exploited to cause harm or loss.

– Attack: An exploitation of a vulnerability.

– Threats: Circumstances that have the potential to cause loss or harm. (Vulnerability subjected to attack)

– Control: Protective measure that reduces a system’s vulnerability.

Page 24:

Types of damage

  Denial of service: The system is forced into a state where its normal services become unavailable.

  Corruption of programs or data: Software components are altered in unauthorized ways.

  Disclosure of confidential information: An attack exposes confidential information to unauthorized personnel.

Page 25:

Techniques

  Vulnerability avoidance

  Attack detection and neutralization