Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf ·...

33
Safety Analysis: FMEA Risk analysis Lecture 8

Transcript of Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf ·...

Page 1: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Safety Analysis: FMEARisk analysis

Lecture 8

Page 2: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Failure modes and effect analysis (FMEA)• Why: to identify contribution of components failures to

system failure

• How: progressively select the individual components or functions within a system and investigate their possible modes of failure

• Information analyzed: possible failure modes, possible causes, local and system effect, how to fix (remedial actions)

Page 3: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

What is the proper level?

• Depends at which design stage: might be very general might be very detailed

• Hardware – To off-the-shelf components– Or field-replaceable assemblies for which failure modes

are available

• Software as a single component – Failure modes as worst possible effects

– Does not include human

Page 4: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Example of hardware-oriented FMEA

Page 5: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Evaluation of FMEA• “+”• Allows to identify redundancy, single-point

failure, inspection points and how often the system needs to be serviced

• Technique is complete• “-”• Time consuming• Does not consider effect of multiple or

common-cause failures

Page 6: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Some notes about FMEA

Very often hardware-oriented FMEA formulates software requirements very vaguely, e.g., “modify software to detect failure”.

How to do it better?Find a common model, i.e., the model which

would be a middle-hand between safety analysis and software requirements

Page 7: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Example – a conveyor system

A conveyor system consists of a feed belt and an elevating rotary table. The feed belt transports objects placed on its left end to the right end and then on the table. The table then elevates and rotates an object to make it available for processing by further machines. The belt has photo-electric cells, which signals when an object has arrived at its ends. The motor of the belt may be switched on and off: it has to be on while waiting for a new object and has to be switched off when an object is at the end of the belt but cannot be delivered onto the table because the table is not in proper position. The table lifts and rotates an object clockwise to a position for further processing. When an object is taken the table moves down and counterclockwise to accept another object. For the brevity we omit the details of table’s implementation. We merely observe that the table moves between two positions: one for loading an object from the feed belt and another for unloading an object for further machines. Initially the table is in its loading position and the feed belt is running empty while waiting for an object to be placed.

Page 8: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Hazards:

– Object is jammed between the belt and the table– Objects are piled up

Page 9: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Safety requirements

• Safety requirement1: The feed belt may onlyconvey an object through its exit sensor, if thetable is in the loading position.

• Safety requirement2. A new object may onlybe put on the feed belt after the exit sensorconfirms that the last one has arrived at theend of the feed belt

Page 10: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Statechart of fault free system

Page 11: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Safety requirements in terms of statecharts

• Safety requirement1:– FB is in Delivering implies TAB is in

ReadyForLoading

• Safety requirement2: – EntrySen_On arrives only when FB is in VACANT

Page 12: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Analysis

• The model of “reality” which controller has is actually a statechart model

• Controller keeps record of state and updates the state upon arrival of every event

• We introduce variables to model states of components

Page 13: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Example of “traditional” FMEA

Unit Feed belt entry sensorFailure mode

Stuck at zero

Possible cause

Primary sensor failure

Local effects

Sensor sends zero signal constantly

System effects

Arrival of an object is undetected. No control over the distance between arriving objects. Danger to pile up.

Remedial action

Ensure that the fault is always detected. Modify software to detect the fault. If fault occurs then switch on alarm and stop the system

Page 14: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Formalizing FMEA

• The main idea is to use statecharts to express how each error should be detected and mitigated

• Two types of detection: Aberrant event and Timeout

Page 15: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Examples of formalized FMEA

Page 16: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components
Page 17: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Statechart model of fail-safe conveyor system (see appendix 3)

Page 18: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Fail-safe systems

A fail-safe system upon detection of an error is shut down.

Page 19: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Fail-safe controller

Constants/*Maximal time to reload object from feed belt to table */MaxDeliveringTime /*Maximal time for object to come from beginning to end of feed belt */MaxTranspDelivTime Procedures…/* Halts feed belt */HaltFB = if FB=VACANT ∧ FB_st=VACANT then FB := HALTINGV || FB_st := HaltedVacelseif FB=FBLOADED ∧ FB_st=StartTransporting then FB := HALTINGL || FB_st := HStartTransporting/* Immediately stops feed belt */StopFB = if (FB=VACANT ∨ FB=FBLOADED) then FB := FBSTOPPED || FB_ST := FBSTOPPED

/* Outputs message to an operator */Warning (Msg: String) = output(Msg)

Page 20: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

/* Timers are active then ON */DeliveryTimer, TranspDelivTimer : {ON, OFF}

/*Time stamps fix time when timers are activated */DeliveryTimerStamp, TranspDelivTStamp : INT

/*Object arrived at the beginning—activate timer TranspDelivTimer */

E=EntrySen_ON ∧ FB=VACANT ∧ FB_st=VACANT → E :=NIL || FB := FBLOADED || FB_st := StartTransporting || TranspDelivTimer := ON || TranspDelivTStamp := t

Page 21: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

/*Object arrived at the end -- deactivate timer TranspDelivTimer */E=ExitSen_ON ∧ FB= FBLOADED || FB_st = Transporting ∧ TAB=TVACANT ∧TAB_st = ReadyToLoad → FB_st := Delivering || TranspDelivTimer := OFF

[] E=ExitSen_ON ∧ FB= FBLOADED ∧ FB_st = Transporting ∧ (TAB=TVACANT ∨TAB=TLOADED) ∧ TAB_st ≠ ReadyToLoad → E :=NIL || FB_st := Waiting || TranspDelivTimer := OFF

/* Object passed the exit sensor while the motor was supposed to be OFF */

E=ExitSen_OFF ∧ FB=FBLOADED ∧ FB_st=Waiting → E :=NIL || FB := FBFAILED || FB_ST := FBFAILED || call Warning(“Feed belt motor fails to stop”) || call StopTAB

Page 22: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

Conclusions on safety analysis• Do not be scared by hardware terms in fault trees and FMEA! Your

knowledge of simple failure modes of sensors and actuators which we have studied already will be sufficient.

• But always deduce software requirements from safety analysis: overlooked safety requirements is a greatest threat!

• Always start safety analysis from identifying hazards, i.e., asking yourself what I would like to avoid happening in this system

• Draw fault tree to analyse how it can happen• Conduct FMEA to see which components in which failure modes are

contributing to hazards• Derive specifications of error detection procedures, remedial

actions and make sure you implement them correctly!

Page 23: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

23

Hazard and accident

• Hazard is a potential for an accident• How do we judge whether a hazard is

acceptable? • The importance of a hazard is related to the

accidents that may result from it.• Accident is an unintended event or sequence of

events that causes death, injury, environmental or material damage

Page 24: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

24

Risk

• Two factors are significant for an accident:– the potential consequences of any accident

that might result from the hazard– the frequency (or probability) of such an

accident occurring• Risk is a combination of the frequency

or probability of a specified hazardous event, and its consequence.

Page 25: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

25

Example

• Failure of a particular component is likely to result in an explosion that could kill 100 people. It is estimated that this component will fail once in every 10000 years. What is the risk associated with that component?

• Risk= severity x frequency =100 x 0.0001 = 0.01 deaths per year

Page 26: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

26

Categories of severity for military systems

Category DefinitionCatastrophic Multiple deathsCritical A single death, and/or multiple severe

injuries or severe occupational illnessesMarginal A single severe injury or occupational

illness, and/or multiple minor injuries or minor occupational illnesses

Negligible At most a single injury or minor occupational illness

Page 27: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

27

Accident probability ranges for military systems

Accident Occurrence during operational life frequency considering all instances of the system

Frequent Likely to be continually experiencedProbable Likely to occur oftenOccasional Likely to occur several timesRemote Likely to occur some timeImprobable Unlikely, but may exceptionally occurIncredible Extremely unlikely that the event will

occur at all

Page 28: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

28

Risk classification

Severity of hazardous event

Frequency of hazardous event

Risk classification

Page 29: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

29

Why classifying risks?

• Risks can be expressed qualitatively and quantitatively

• Calculation of risk results in a risk class (or risk level).

• Most standards define a number of risk classes and then set out development and design techniques appropriate for each category of risk

Page 30: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

30

Risk classes and interpretations for military systems

Page 31: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

31

As Low As is Reasonably Practicable (ALARP) principle

Page 32: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

32

The process of risk reduction

Page 33: Safety Analysis: FMEA Risk analysis - Åbo Akademiusers.abo.fi/etroubit/SWS10Lecture8.pdf · Failure modes and effect analysis (FMEA) • Why: to identify contribution of components

33

Difference between criticality of systems

• Both an electric toaster and a nuclear reactor protection system should be adequately safe but meaning of “adequately” would be different for these two cases

• Hence the importance of safe operation differswidely between applications

• Different safety requirements for different projects mean different levels of risk reduction required