About FMEAs -ASQ(handout)ascendantconsulting.net/ftpdocs/pdfs/About FMEAs... · Of the three...

Informational Brief

2012, All Rights Reserved

Some Things You May Not Know about FMEAs

• Introduction (critique of UA Flight 232)
• Risk Management Approaches
• Issues with the FMEA Approach
• Suggestions for Improvement

“It is far better to grasp the universe as it really is than persist in delusion, however satisfying and reassuring.”

Carl Sagan, Astronomer and Writer

As quality professionals, we are responsible for assuring the overall quality performance of the business we support. This responsibility covers not only the quality of the products, services, and processes of the business but also the risk management of the systems that are tightly linked to quality performance. It is imperative that we consider not only the improvement of the business systems we support, but also the tools and methods we use to guide our decisions about those improvements.

The purpose of this presentation is to give attendees some insight into the methods and tools used to manage risk in their organizations. One of the more popular methods of risk management, in use since the early 1960s, is Failure Mode and Effects Analysis, or FMEA. This presentation focuses on some of the well-documented shortcomings of the FMEA method and the ramifications of poorly assessed conditions of risk. At the close of this presentation we offer some suggestions for how the FMEA method might be improved to provide better risk determinations and better management decisions based on them.

Slide 2

Executive Summary:

Some Things You May Not Know About FMEAs*

• RPNs provide limited Risk Discrimination.
• Prediction overconfidence is common.
• Expert judgments and claims are not consistent.
• No empirical evidence that Risk Rating methods yield useful decision-making information.

*Failure Mode and Effects Analysis

Rather than hold the audience in suspense throughout the presentation, this slide lists four areas of concern that many users of the FMEA method may not know about. There are other concerns about the FMEA method, but we felt the four shown in this slide cover the largest factors affecting risk assessment with the FMEA approach.

In this presentation we explain the limitations of Risk Priority Numbers (RPNs) as indicators of risk, discuss some foundational work on human predictions and the pitfall of overconfidence, look at the use of Subject Matter Expert (SME) claims and the challenges therein, and close with a brief discussion of the lack of informative feedback supporting risk rating methods. We hope you find this presentation both thought-provoking and insightful.

Introduction

“Everyone is perfectly willing to learn from unpleasant experiences—if only the damage of the first lesson could be repaired.”

Lichtenberg, Scientist and Satirist

Slide 4

Basic Definition of Risk

• In the beginning:
  Chance occurrence beyond the realm of human control that could cause loss or harm… (initially concerned with games of chance)

• Long definition:
  The probability (or chance) and magnitude of loss (severity) of an unplanned and/or undesirable event.

• Short definition:
  The chance that something bad could happen.

Since the beginning, mankind has struggled to understand the nature of chance occurrences and their relationship to loss or harm. It is useful to understand that there are many definitions of risk. Some are better than others, and some are simply confusing.

We use this slide to bound the definition of risk as a foundation for the discussion.

Slide 5

Definition of Risk Management

• Long definition:
  The identification, assessment and prioritization of risk, and the structured and economical application of resources to minimize and control the probability and impact of undesirable events.

• Short definition:
  Being smart about taking chances…

Combining the definition of risk with the definition of management gives us a sense of our mission as quality professionals. We trust this definition does not surprise you, and hope that you carry it with you as a reference throughout the remainder of this presentation.

Slide 6

A Difficult Story

• On July 19, 1989, UA Flight 232 headed out from Denver, CO to Chicago, IL.
• In the early afternoon the plane lost use of the rear engine and control of ALL flight surfaces.
• All maneuvering control of the plane had to be made using thrust changes with the wing-mounted engines.
• UA 232 was rerouted to Sioux City, Iowa for an emergency landing.

We want to start the presentation with a problem in order to establish the importance of this discussion. In essence, this first section should answer the question, “What’s in it for me?”

Without memorializing this horrific accident, we want to use it as a point of departure for the role and potential peril of risk management efforts. The rest of the slides should be self-explanatory.

Slide 7

• The crash of UA 232 resulted in 111 fatalities, with 185 passengers and crew surviving.
• The apparent cause was a failure of the rear engine fan disk. The engine had 17 years of service on it…
• A comprehensive failure assessment found that shrapnel from the fan disk failure severed a key line linking all three redundant hydraulic control systems.
• The claimed cause was human error in the inspection of the fan disk during service maintenance of the engine.

Common-Mode Failure

Results of the US NTSB assessment of the UA 232 crash.

Slide 8

DC-10 Tail View

[Figure: DC-10 tail drawing, labeled “Common-Mode Failure” and “Common-Mode System Failure”]

It is instructive to recognize that hidden within this simple drawing is a disaster waiting to happen.

The Common-Mode Failure of the hydraulic system was due to the severing of a section of piping common to all three “redundant” hydraulic systems that control the flight surfaces.

How did this happen? Why wasn’t it addressed during the design stage of the DC-10 aircraft? Why was there a belief that the chance of this line being damaged was one in a billion? What factors contributed to reducing the chance of complete hydraulic failure? What is the role of risk management during the design stage?

Slide 9

Underestimation of Failure Likelihood

Do you think the risk management methods used by McDonnell Douglas should have identified this design flaw?

Certainly there were indications of design problems prior to the initial production of the DC-10. Here is an example of a similar failure that occurred about 4 years prior to the crash of UA Flight 232.

Slide 10

Causal Ladder of Failure

Plane Unable to Land Safely

Loss of Control of all Flight Surfaces

Debris from Fan Disk Damaged Key Control Hydraulics

Stress Cracks in Disk Blades Caused Failure of Fan Disk

Stress Cracks in Disk Blades Missed During Maintenance Inspection

Risk Assessment Methods used to Identify Limitations of Human Inspection of Fan Disk were Ineffective!

Likely Common-Mode Failure: Not a One-in-a-billion likelihood!

Proposed Common-Mode Failure: Cause Attributed to Human Error!

What were the underlying causal elements that led to the failure of UA Flight 232? The US NTSB report focused on “human error” as the predominant cause, even though it admitted that the stress crack on the Fan Disk would have been difficult to observe.

Given the nature of the common-mode failure, shouldn’t a team of risk professionals have caught this flaw at the design stage? What do you think?

Slide 11

Other Common-Mode Failures

• Hurricane Katrina.
• The Financial Crisis of 2008/09.
• On-board microprocessors in automotive, aviation, and other transportation applications.
• Embedded software applications in all computer-controlled devices and equipment.
• Climate Change.
• Supply disruption of raw resources: food, fuel, and water.
• Poor government policies that impact society.

Slide 12

Dramatization of UA Flight 232

• Follow the DiscoveryChannel.ca story of UA Flight 232 via this link:

http://watch.discoverychannel.ca/mayday/season-11/mayday-impossible-landing/#clip662372

Risk Management Approaches

“There is perhaps no beguilement more insidious and dangerous than an elaborate and elegant mathematical process built upon unfortified premises.”

Thomas C. Chamberlin, Geologist and Writer (1899)

In this section we take a quick look at three approaches commonly used to manage risk: Failure Mode, Effects and Criticality Analysis (FMECA), Fault Tree Analysis (FTA), and Failure Mode and Effects Analysis (FMEA).

Our focus in this section is on the FMEA approach. Feel free to view the slides in the Appendix for details on the other two approaches.

Slide 14

Two General Ways to Manage Risk

• Choose the areas to “optimize” risk reduction, subject to various constraints:
  - Budget constraints
  - Dependencies and interactions among sources, targets, and consequences
  - Dependencies among countermeasures
• Identify, document, and rank risk concerns, then tackle the largest perceived risks first:
  - Using Rating Scales of risk as a guide
  - Using Risk Matrices to identify key concerns

Adapted from webinar of 8-Nov-2012 by Dr. Tony Cox of Cox-Associates

This slide describes the two general ways of conducting risk management today.

The first way seeks to choose actions, called countermeasures, that provide the greatest risk reduction possible for the money spent. This approach is an optimization problem, and as such requires calculating the size of each risk reduction and how much of the budget is required to achieve it. Using this approach a team has a wide variety of options for employing countermeasures. For example, the team can consider the effects of a given countermeasure on identified risks other than the one it targets, and in doing so can optimize cost and risk reduction simultaneously.

The second way ranks the risks identified in the system and addresses the largest first. This is done because there are simply too many potential risks for the budget to handle. In this approach each potential risk is treated independently of the others, without considering any dependencies between risk events or interactions among the sources or consequences of the countermeasures. This approach lacks any real optimization heuristics, but is considered much easier and less complicated than the first.
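The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the countermeasure names, risk sizes, costs, and reductions are invented, the budget is hypothetical, and the optimization is a brute-force search that ignores the dependencies among countermeasures mentioned on the slide.

```python
from itertools import combinations

# Hypothetical countermeasures: (name, size of the risk addressed, cost, risk reduction).
# All numbers are invented for illustration only.
COUNTERMEASURES = [
    ("A", 90, 8, 40),
    ("B", 60, 3, 30),
    ("C", 50, 3, 25),
    ("D", 40, 4, 30),
]
BUDGET = 10


def rank_and_fund(items, budget):
    """Type 2: tackle the largest perceived risks first until the money runs out."""
    funded, remaining = [], budget
    for name, _risk, cost, reduction in sorted(items, key=lambda i: i[1], reverse=True):
        if cost <= remaining:
            funded.append((name, reduction))
            remaining -= cost
    return funded


def optimize(items, budget):
    """Type 1: pick the affordable subset with the greatest total risk reduction."""
    best = ()
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            cost = sum(i[2] for i in subset)
            if cost <= budget and sum(i[3] for i in subset) > sum(i[3] for i in best):
                best = subset
    return [(i[0], i[3]) for i in best]


ranked = rank_and_fund(COUNTERMEASURES, BUDGET)
optimal = optimize(COUNTERMEASURES, BUDGET)
print("Ranked first:", ranked, "total reduction", sum(r for _, r in ranked))
print("Optimized:   ", optimal, "total reduction", sum(r for _, r in optimal))
```

With these made-up numbers, funding the largest risk first uses most of the budget on one countermeasure, while the optimized selection spreads the same budget over several smaller ones.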

Slide 15

Typical Risk Management Methods

• Typical Risk Management Methods in use today:
  - Failure Modes and Effects Criticality Analysis (FMECA)
  - Fault Tree Analysis (FTA)
  - Failure Modes and Effects Analysis (FMEA)
• Over 75% of applications today use the FMEA approach.

[Labels at the far left of the slide, under the heading TYPE: Combined, Type 1, Type 2]

This slide shows the three typical methods used in industry and government today to manage risk. It also cross-classifies each method against the two general types of risk management approach, shown at the far left.

Of the three methods shown, FMEA embodies all of the characteristics of a Type 2 risk management approach, using a rank ordering of risk classifications as the basis for identifying the largest risks in a system. FMEA is also the most popular of the three. This is due to its simplicity of use and its lack of complex probabilistic mathematics, which is often challenging for those untrained in probability theory.

That said, beneath the ranking categories lies an implicit sense of probability and uncertainty, which is masked by the seemingly simple process of selecting a rating value. Despite the ranking system, the complexity of failure rates, hazard functions, and probability still prevails.

Slide 16

Risk as Defined by FMEA

Risk in FMEA is defined as:

• Severity (of failure effect),
• Occurrence (of cause or failure),
• Detection (of cause or failure)

The FMEA method of risk determination defines risk along the three dimensions shown in this slide.

In combination, these three dimensions address the components of risk as defined by the FMEA method: the potential for a failure to occur (Occurrence), the relative level of hazard (Severity), and the ability to detect or prevent the failure before it happens (Detection). Ideally, risk is reduced by gaining a clear and accurate understanding of the failure mechanisms, or by reducing the uncertainty associated with each dimension.

Slide 17

AIAG Guidelines:

Severity Ranking Criteria

Reprinted from www.TheNewExcellence.com

This slide shows the ranking criteria for the Effect of a failure mode on the system or user. It ranks the lowest level of risk a “1” and the highest level of risk a “10.” These criteria were developed by the Automotive Industry Action Group (AIAG) to support the organization’s management of both supplier quality and process/product design via potential risk factors in a given system.

Slide 18

AIAG Guidelines:

Occurrence Ranking Criteria

Reprinted from www.TheNewExcellence.com

This slide shows an example table of risk ranking criteria for the Occurrence of a failure mode or the cause of a failure mode. Again, a “1” is considered low risk and a “10” is considered high risk.

Please note column 3 of this table, labeled Ppk. In reviewing these tables online I noticed mixed use of the indices Ppk and Cpk in this column. Ppk is the Process Performance Index and Cpk is the Process Capability Index; both are used as a soft probability measure of failure rates. In my AIAG reference this table uses Cpk.

There is great confusion in the automotive and other industries about the value and use of these two process measures. The confusion is so great that they are often used interchangeably even though they measure entirely different things. In the next slide we try to explain these two estimates in an effort to minimize the confusion.

Slide 19

Basis for Computation:

Performance vs. Capability Indices

[Figure: subgroup distributions at Time 1 through Time 5 along a Time axis, annotated with Within Group Variation, Between Group Variation, and Total Variation (Pp, Ppk)]

When making predictions of the future, the measures we use should be reliable. A reliable measure is one that is consistent over time. If measures are not reliable, then their utility in prediction is limited.

When we measure products, parts, services, or any array of items considered to be identical, we don’t usually obtain the same value for every piece of work. Instead, we measure a range of values around a common aim for the process that produced the work. This range of values is called the observed or “total” variation. The components of total variation, as shown, are called “within” and “between” group variation. Others commonly refer to the average “within” group variation as short-term variation and the “total” variation as long-term variation. These are non-standard terms that serve to confuse the purpose of breaking the variation into its components.

Slide 20

Basis for Computation:

Performance vs. Capability Indices

[Figure: subgroup distributions at Time 1 through Time 5 along a Time axis, annotated with Total Variation (Pp, Ppk) and with Cp, Cpk]

If process is in a state of control:

Pp ≡ Cp

Typical practice is to compute the Process Performance Indices using the “total” variation and the Process Capability Indices using the average “within” group variation. As you can see, these are two different components of variation, which can yield two different results. Looking at the slide we notice a third component of variation called “between” group variation. If the “between” group variation is too great, the process is considered unreliable. In essence, large shifts in between-group variation indicate that multiple isolated causal elements are present in the process. If the between-group variation is too great, then estimates of the Process Performance Indices, Pp and Ppk, will be poor predictors of the future performance of the process.

Better predictors of future process performance are Cp and Cpk. These two indices require that the process first achieve a state of statistical or stationary control before they are computed. As such, all references to Pp and Ppk in an FMEA exercise should be exchanged for Cp and Cpk. A better approach would be to use failure rates and probability of failure directly, if possible.
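To make the distinction concrete, here is a minimal sketch of how the two families of indices might be computed. The handout does not give formulas, so the conventional ones are assumed: the spread index is (USL - LSL) / 6σ and the centred index is min(USL - mean, mean - LSL) / 3σ, with σ taken from the “within” group variation for Cp and Cpk and from the “total” variation for Pp and Ppk. The subgroup data, the specification limits, and the use of a pooled standard deviation for the “within” estimate are all assumptions.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical subgroups (one per time period) and specification limits.
SUBGROUPS = [
    [9.9, 10.0, 10.1],
    [10.4, 10.5, 10.6],
    [9.4, 9.5, 9.6],
    [10.0, 10.1, 9.9],
    [10.2, 10.3, 10.1],
]
LSL, USL = 8.0, 12.0


def indices(subgroups, lsl, usl):
    values = [x for group in subgroups for x in group]
    centre = mean(values)
    # "Within" sigma: pooled standard deviation of equal-sized subgroups
    # (an assumption; range-based estimates are also common in practice).
    sigma_within = sqrt(mean(variance(g) for g in subgroups))
    # "Total" sigma: standard deviation of all values taken together.
    sigma_total = stdev(values)

    def pair(sigma):
        spread = (usl - lsl) / (6 * sigma)
        centred = min(usl - centre, centre - lsl) / (3 * sigma)
        return spread, centred

    cp, cpk = pair(sigma_within)
    pp, ppk = pair(sigma_total)
    return {"Cp": cp, "Cpk": cpk, "Pp": pp, "Ppk": ppk}


result = indices(SUBGROUPS, LSL, USL)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because the invented subgroup averages shift from one time period to the next, the “total” sigma is larger than the “within” sigma, and Pp and Ppk come out well below Cp and Cpk.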

Slide 21

Type 2 RM Approach:

Example of a Process FMEA

PROCESS: Engage Transportation Mode → Take Route Over North Bridge → Cross Major Downstream Intersection → Arrive at Work On Time

[Figure: process flow (four connectors labeled “Yes”) with Process FMEA worksheet, dated October 10, 2010]

This is a simple example used to illustrate the correct application of a Process FMEA.

Please note that the correct use of this tool requires you to list the process steps, not process variables, in the far left column. The focus with this tool is on the potential failure modes of each process step and on the relative risk of each cause of a given failure mode, indicated as “RPN,” which stands for Risk Priority Number.
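A Process FMEA worksheet of this kind might be represented as follows. This is a sketch only: the process steps come from the slide, but the `FmeaRow` structure and every failure mode, cause, and rating are hypothetical, since the worksheet itself did not survive in this handout.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    step: str          # process step (far-left column of a Process FMEA)
    failure_mode: str  # hypothetical failure mode for that step
    cause: str         # hypothetical cause of the failure mode
    severity: int      # 1 (lowest) to 10 (highest)
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


# Process steps come from the slide; every failure mode, cause, and rating
# below is invented for illustration.
ROWS = [
    FmeaRow("Engage Transportation Mode", "Car will not start", "Dead battery", 8, 2, 3),
    FmeaRow("Take Route Over North Bridge", "Bridge closed", "Accident on bridge", 6, 3, 7),
    FmeaRow("Cross Major Downstream Intersection", "Long delay", "Signal outage", 4, 2, 5),
]

# List the causes from highest to lowest RPN, as a Type 2 ranking would.
for row in sorted(ROWS, key=lambda r: r.rpn, reverse=True):
    print(f"{row.rpn:4d}  {row.step}: {row.failure_mode} ({row.cause})")
```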

Slide 22

Example of a Design FMEA

PRODUCT: Lid, Handle, Body

This is a simple example used to illustrate the correct application of a Design FMEA supporting a product.

Please note that the correct use of this tool requires you to list the product components, not process steps, in the far left column. Using this tool we focus on the potential failure modes of each product component and on the component-to-component interactions, with the goal of establishing mitigating design controls or redesigning the product so that we can eliminate high-risk failure modes.

Some Issues with the FMEA Approach

“Quality improvement will result from people improving their processes and from management improving the system.” T. Pyzdek

Slide 24

Four Issues with FMEAs + a bonus

• RPNs provide limited Risk Discrimination.
• Prediction overconfidence is common.
• Expert judgments and claims are not consistent.
• No empirical evidence that Risk Rating methods yield useful decision-making information.
• FMEA risk claims are rarely verified with actual follow-up data.

Here again are the four issues cited earlier, plus one additional issue for good measure.

The next slides discuss each of these issues in some detail to give the reader insight into the possible weak areas of a risk assessment using Failure Mode and Effects Analysis.

Slide 25

Calculation of Risk Priority Number (RPN)

RPN = Severity Rating * Occurrence * Likelihood of Detection

- How many total RPN values are available for the FMEA analysis?

- How many unique RPN values are available for the analysis?

For those unfamiliar with FMEAs, we show the simple calculation used to compute a Risk Priority Number, or RPN.

Using a scale of 1 to 10 for each risk area, how many RPN values do you believe are available for the FMEA analysis? Now, don’t cheat and look ahead. Try instead to think this answer through.

Of the total number of RPN values from the previous question, how many do you believe are unique? This is a tougher question, but try to think it through…

Slide 26

Risk Ranking Scale of 1 to 10:

Enumerating RPN Classifications

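S | O | D | Calculated RPN | RPN Value Order | Observed Number of Classifications
0 | 0 | 0 | 0 | 0 | 0
1 | 1 | 1 | 1 | 1 | 1
1 | 1 | 2 | 2 | 2 | 3
1 | 1 | 3 | 3 | 3 | 3
1 | 1 | 4 | 4 | 4 | 6
1 | 1 | 5 | 5 | 5 | 3
1 | 1 | 6 | 6 | 6 | 9
1 | 1 | 7 | 7 | 7 | 3
1 | 1 | 8 | 8 | 8 | 10
1 | 1 | 9 | 9 | 9 | 6
1 | 1 | 10 | 10 | 10 | 9
1 | 2 | 1 | 2 | 11 | 0
1 | 2 | 2 | 4 | 12 | 15
1 | 2 | 3 | 6 | 13 | 0
1 | 2 | 4 | 8 | 14 | 6
1 | 2 | 5 | 10 | 15 | 6
1 | 2 | 6 | 12 | 16 | 12
1 | 2 | 7 | 14 | 17 | 0
1 | 2 | 8 | 16 | 18 | 15
1 | 2 | 9 | 18 | 19 | 0
1 | 2 | 10 | 20 | 20 | 15
1 | 3 | 1 | 3 | 21 | 6
1 | 3 | 2 | 6 | 22 | 0
1 | 3 | 3 | 9 | 23 | 0

[Embedded Microsoft Excel Worksheet]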

To help answer the previous questions, let’s open Excel or another spreadsheet program and enumerate all of the possible combinations of RPN. This slide illustrates how to set up the evaluation. If you have access to our spreadsheet, open it and look it over. Now, can you answer the previous questions?
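For readers without the spreadsheet, the same enumeration can be sketched in a few lines of Python. The variable names are ours, and the script assumes the 1 to 10 scale for each of the three ratings described above. Run it only after you have tried to answer the questions yourself.

```python
from itertools import product

SCALE = range(1, 11)  # rating values 1 through 10

# Every (Severity, Occurrence, Detection) combination and its RPN.
combinations_rpn = {(s, o, d): s * o * d for s, o, d in product(SCALE, repeat=3)}
unique_rpns = sorted(set(combinations_rpn.values()))

print("Total S-O-D combinations:", len(combinations_rpn))
print("Distinct RPN values:     ", len(unique_rpns))
print("Smallest and largest RPN:", unique_rpns[0], unique_rpns[-1])
```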

Slide 27

Plot of RPN Class Counts

[Figure: “Risk Priority Value Plot” (based on a 1 to 10 scale); x-axis: RPN Values, 0 to 1000; y-axis: Number of Classifications, 0 to 25]

This slide shows a plot of the enumerated RPN values for three ranked classifications on a 1 to 10 scale.

What do you see in this slide? Do you notice that the greatest number of RPN values are clustered around an RPN of about 100? How many RPN values are available above 500?

The distribution of RPN values shows a bias towards the lower range of all possible values. Also note that each bar of this plot indicates the number of duplicate RPN classifications available. These are non-unique classifications. Did you know that RPN values behaved this way prior to our discussion? How might this behaviour affect the risk assessment process?
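The counts behind a plot like this one can be tallied directly. The sketch below assumes the same 1 to 10 scales; the thresholds of 100, 250, and 500 are arbitrary choices of ours for summarizing the skew, not values taken from the slide.

```python
from collections import Counter
from itertools import product

SCALE = range(1, 11)

# Number of S-O-D combinations that produce each RPN value.
counts = Counter(s * o * d for s, o, d in product(SCALE, repeat=3))

# RPN values shared by the most S-O-D combinations.
print("Most duplicated RPN values:", counts.most_common(3))

# Share of all combinations that fall at or below selected RPN thresholds.
total = sum(counts.values())
for threshold in (100, 250, 500):
    share = sum(n for rpn, n in counts.items() if rpn <= threshold) / total
    print(f"RPN <= {threshold}: {share:.0%} of combinations")

# Integers from 1 to 1000 that no combination can produce.
gaps = [n for n in range(1, 1001) if n not in counts]
print("Unreachable values between 1 and 1000:", len(gaps))
```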

Slide 28

Summary:

Enumerated RPN Classes with a 1 to 10 Scale

FMEAs have an extremely limited capability to discriminate Risk Classifications

This slide summarizes the plot shown previously. Now you are able to answer the earlier questions we asked about the measurement used to quantify risk with the FMEA method.

Out of 1,000 possible unique risk classifications (10 × 10 × 10), how many are actually available for use in the FMEA risk assessment method?

Out of the actual number available, how many RPNs provide unique classifications? This information speaks directly to the ability of FMEA to discriminate between different risk conditions. We illustrate this effect in the next slide.

Slide 29

Some FMEA Examples

• Suppose S=10, O=9, and D=4 for a given failure mode. What actions might you consider?
• Suppose S=4, O=9, and D=10 for a given failure mode. What actions might you consider?
• Suppose a Hazardous failure with a chance of Occurrence ≅ 30%, and the Ability to Detect in Production is Variable. What actions might you consider?

Look at each of the three risk classifications shown in this slide. Given the components of risk shown, try to get a sense of the actions you might take to reduce the potential risks.

Would you take the same actions for each of the three listed risk conditions? If so, why, when the individual rankings differ so greatly? If not, why not, when the summarized risk in the form of RPN is the same for all three risk conditions?

Using RPN as the primary measure in risk management seems to present a few challenges. Let’s look at the entire range of possible risk classifications for RPN = 360.

Illustration:

Fifteen “Equivalent” Rankings of RPN=360

ID | Severity | Ranked Value | Likelihood of Occurrence | Ranked Value | Likelihood of Detection | Ranked Value
1 | Hazardous | 10 | Very High | 9 | Moderate High | 4
2 | Hazardous | 10 | Moderate | 6 | Low | 6
3 | Hazardous | 10 | Moderate | 4 | Very Remote | 9
4 | Hazardous | 9 | Very High | 10 | Mod. High | 4
5 | Hazardous | 9 | High | 8 | Moderate | 5
6 | Hazardous | 9 | Moderate | 5 | Remote | 8
7 | Hazardous | 9 | Moderate | 4 | Impossible | 10
8 | Hazardous | 8 | Very High | 9 | Moderate | 5
9 | Hazardous | 8 | Moderate | 5 | Very Remote | 9
10 | Moderate | 6 | Very High | 10 | Low | 6
11 | Moderate | 6 | Moderate | 6 | Impossible | 10
12 | Low | 5 | Very High | 9 | Remote | 8
13 | Low | 5 | High | 8 | Very Remote | 9
14 | Very Low | 4 | Very High | 10 | Very Remote | 9
15 | Very Low | 4 | Very High | 9 | Impossible | 10

Note the range of Severity, Occurrence, and Detection values in the table. All 15 combinations represent the same risk when measured using RPN.
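The fifteen combinations can be reproduced with a short search over all ratings. This is a sketch assuming 1 to 10 scales for each rating; `TARGET_RPN` is simply our name for the value under study.

```python
from itertools import product

TARGET_RPN = 360
SCALE = range(1, 11)

# All (Severity, Occurrence, Detection) triples whose product equals the target.
matches = [
    (s, o, d)
    for s, o, d in product(SCALE, repeat=3)
    if s * o * d == TARGET_RPN
]

for s, o, d in sorted(matches, reverse=True):
    print(f"S={s:2d}  O={o:2d}  D={d:2d}  RPN={s * o * d}")
print("Combinations found:", len(matches))
```

Changing `TARGET_RPN` shows how many different risk profiles hide behind any other RPN value.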

Slide 31

Poor Risk Discrimination

• Risk Priority Numbers are the products of three ordinal-scale values!

Scale | Ordinal Scale Data | Interval Scale Data | Ratio Scale Data
Characteristics | Values ranked in a logical order | Values that possess both ordering and defined distance | Values that possess ordering, distance, and an absolute zero
Example | Places in a contest such as 1st, 2nd, and 3rd | Temperature scales of °F or °C w/o an absolute zero | Temperature scales with an absolute zero, i.e. Kelvin
Legal Mathematical Operations | Ranking or grouping to define an ordering among categories | Addition and subtraction operations | Multiplication and division operations

Adapted from D.J. Wheeler, The Six Sigma Practitioner’s Guide to Data Analysis, 2005

So, what is the reason for the observed behaviour of RPNs? This slide provides a clue, as shown in the second column of the table.

What is the difference between Ordinal and Ratio scale data? How are RPNs calculated?
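One way to see why multiplying ordinal values is troublesome: an ordinal scale only fixes the order of its categories, so any relabeling that preserves that order is equally valid. The sketch below assumes two hypothetical failure modes and a hypothetical order-preserving relabeling of the 1 to 10 ranks (neither comes from the presentation) and shows that the RPN ordering of the two modes can reverse.

```python
def rpn(severity, occurrence, detection, scale=None):
    """Product of the three ranks, optionally passed through a relabeling."""
    if scale is None:
        scale = {k: k for k in range(1, 11)}
    return scale[severity] * scale[occurrence] * scale[detection]

# Hypothetical relabeling: still strictly increasing, so the ordinal
# information (which rank is worse than which) is unchanged.
relabel = dict(zip(range(1, 11), [1, 2, 3, 4, 5, 6, 7, 8, 20, 30]))

mode_a = (9, 2, 2)  # hypothetical failure mode A: (S, O, D)
mode_b = (3, 4, 4)  # hypothetical failure mode B: (S, O, D)

print(rpn(*mode_a), rpn(*mode_b))                                # 36 48
print(rpn(*mode_a, scale=relabel), rpn(*mode_b, scale=relabel))  # 80 48
```

With the original labels B outranks A; with the relabeled ranks A outranks B, although no ordinal judgment changed. A ratio-scale quantity would not behave this way.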


Slide 32

Prediction Overconfidence

• Buried beneath the risk rankings for Occurrence and Detection is an estimation of probability.

• Most of us don’t understand how probabilities work, and instead relegate our estimates to ranked values.

• These mental gymnastics carry with them some hidden problems.

• For years research psychologists have known that everyone is naturally “overconfident” in their predictions.

• Let’s illustrate this effect in the next slide, where we will ask a few trivia TRUE/FALSE questions.

The claims on this slide are supported by the early ground-breaking work of psychologists Amos Tversky and Daniel Kahneman.

This work is so well known in the field of decision science that no one questions its validity. Unfortunately, few in industry have heard of this work or understand its ramifications.

If we have the time during the presentation we will attempt a limited calibration exercise. If we are unable to conduct this exercise due to time constraints, you can conduct it yourself. You can check the Appendix for the answers once you complete the first part of the exercise.


Simple Calibration Exercise

No. | Statement | Answer (T or F) | Confidence that You are Correct
---|---|---|---
1 | The ancient Romans were conquered by the ancient Greeks. | | 50% 60% 70% 80% 90% 100%
2 | There is no species of three-humped camel. | | 50% 60% 70% 80% 90% 100%
3 | A liter of oil weighs less than a liter of water. | | 50% 60% 70% 80% 90% 100%
4 | Mars is always further away from the Earth than Venus. | | 50% 60% 70% 80% 90% 100%
5 | M is one of the three most commonly used letters. | | 50% 60% 70% 80% 90% 100%
6 | Napoleon was born on the island of Corsica. | | 50% 60% 70% 80% 90% 100%
7 | One meter equals 37.39 inches. | | 50% 60% 70% 80% 90% 100%
8 | In 2002, the price of a new desktop computer was under $1,500. | | 50% 60% 70% 80% 90% 100%
9 | Modern humans first appeared on the earth about 200,000 years ago. | | 50% 60% 70% 80% 90% 100%
10 | The first six digits of the constant pi are 3.14139. | | 50% 60% 70% 80% 90% 100%

Exercise Instructions:

1. Read the statement.

2. Decide whether the statement is True or False.

3. Circle how confident you feel about your answer.

4. Complete all 10 statements.

Find the answers in the Appendix.


Slide 34

Results: Prediction Calibration

• In subjective assessments an evaluator is considered calibrated if the proportion of true assessments equals the average weighted confidence assigned by the evaluator.

• As an example, suppose in our exercise you gave 6 correct answers out of a possible 10, or 60%, and the average confidence of the 6 correct answers was 75%.

• In the long run you claim 75% confidence in achieving correct answers, but you actually answered only 60% correctly. Therefore:

  - If %Actual Correct < %Confidence: Overconfident

  - If %Actual Correct > %Confidence: Under-confident

Follow the guidance in this slide to compute the percent of correct answers and the average confidence for correct answers. Make the comparison shown at the bottom of the slide.

Please bear in mind this is a simple exercise containing a sample of only 10 questions. Its ability to determine “subjective assessment” performance is extremely limited.

If you wanted a reasonable estimate of “assessment” performance you would need a minimum of 50 calibration questions to start. So, don’t worry if you did not do well with this exercise. Accept that it is just an indicator and realize that you may be capable of overconfident responses.
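The comparison described above can be sketched in a few lines. This is a minimal illustration that follows the slide's guidance (percent correct versus the average confidence of the correct answers); the function name and the sample responses are hypothetical, chosen to reproduce the 6-of-10, 75% example.

```python
def calibration_check(is_correct, confidences):
    """Compare percent correct with the average confidence of correct answers.

    is_correct  -- list of booleans, one per statement
    confidences -- list of circled confidence values in percent (50..100)
    """
    pct_correct = 100.0 * sum(is_correct) / len(is_correct)
    conf_of_correct = [c for c, ok in zip(confidences, is_correct) if ok]
    avg_conf = sum(conf_of_correct) / len(conf_of_correct) if conf_of_correct else 0.0

    if pct_correct < avg_conf:
        verdict = "overconfident"
    elif pct_correct > avg_conf:
        verdict = "under-confident"
    else:
        verdict = "calibrated"
    return pct_correct, avg_conf, verdict

# Hypothetical respondent: first six statements right, last four wrong.
is_correct = [True] * 6 + [False] * 4
confidences = [70, 80, 70, 80, 70, 80, 90, 90, 60, 100]

print(calibration_check(is_correct, confidences))
```

For this respondent the percent correct is 60 and the average confidence of the correct answers is 75, so the verdict is overconfident. With only 10 statements the result is an indicator, not a measurement.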


Slide 35

Inconsistent Expert Claims*

• Like the rest of us, Subject Matter Experts (SMEs) tend to make “overconfident” claims.

• Unlike the rest of us, SMEs can often make overconfident claims outside their areas of experience and training.

• This overconfidence can present a problem to an assessment team when evaluating subjective risk.

• There is a strong tendency among team members to give SMEs far more latitude in making claims than other members.

• Additionally, SMEs provide expert advice on an inconsistent basis. Care should be taken when using uncalibrated expert advice without question.

*Thoroughly researched by Tversky, Kahneman, Lichtenstein, Fischhoff, and Phillips

This slide offers some additional insight from the work of the listed researchers. The take-away is to realize that SMEs are prone to the same judgement errors as the rest of us. Consider this possibility the next time you receive expert advice from anyone, including me!


Slide 36

Work Conducted by the US Navy in 1981

All of the previous information has been known since the mid-1970s and has been codified in many US military references and guidance.


Slide 37

Limited Empirical Evidence for FMEAs

• A review of the literature, past and present, provides little quantitative empirical evidence of FMEA effectiveness.

• Most companies do not track or collect this information.

• There is substantial research showing the effectiveness of Probabilistic Risk Assessment (PRA) over Risk Ranking Methods (RR).

• Many US government agencies have returned to PRA over Risk Ranking methods since the late 70s. (see Appendix for additional details)

It is difficult to find any literature supporting empirical studies on the use and effectiveness of the FMEA method of risk management.


Slide 38

Excerpt from 2009 IEEE Journal Article

• Meshkat, Leila PhD, Probabilistic Risk Assessment for Decision Making during Spacecraft Operations, IEEE, 2009. Page 1, Sect. 1.1 Quantitative Risk Assessment (QRA):

This excerpt gives additional support that US government agencies, once enamored of the FMEA method, are moving back to more conventional Type 1 risk assessment methods.


Slide 39

FMEA Risk Claims Rarely Verified

• This comment is supported by the previous one.

• There is usually no closed-loop evaluation of FMEA risk claims against actual warranty returns, field issues, etc.

• Without this information it is difficult for an operation to know whether its risk assessment efforts actually manage product and process risk.

• Without this knowledge, the operation is unable to address any glaring issues with its risk assessment efforts and take the needed improvement actions.

An FMEA assessment is a predictive evaluation of the system(s) under study. In any empirical scientific endeavor we gain feedback from the systems we study and compare our predictions to the actual performance.

For some reason, this doesn’t seem to happen with FMEA work in most companies. I’m not sure why this is the case, but now that you understand this gap, perhaps you might consider including the feedback loop in the risk management process at your company. There is no other way to uncover the short-comings of this method and make the necessary corrections.

Please note the last bullet point on this slide, and feel free to look at the timeline of risk management methods in the Appendix when you have a chance.
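A feedback loop of this kind can start very simply. The sketch below is one possible approach, not a method from the presentation: it assumes hypothetical records pairing each failure mode's FMEA Occurrence rank with the number of field returns later observed, and flags every pair of failure modes whose predicted ordering contradicts the observed ordering.

```python
from itertools import combinations

# Hypothetical records: (failure mode, Occurrence rank from the FMEA,
# field returns actually observed). All values are illustrative.
records = [
    ("seal leak", 8, 42),
    ("connector corrosion", 6, 5),
    ("firmware lockup", 3, 17),
    ("housing crack", 2, 1),
]

def discordant_pairs(records):
    """Pairs of failure modes ranked one way in the FMEA but observed the other way."""
    flagged = []
    for (name1, rank1, count1), (name2, rank2, count2) in combinations(records, 2):
        # A negative product means the rank difference and the observed
        # count difference point in opposite directions.
        if (rank1 - rank2) * (count1 - count2) < 0:
            flagged.append((name1, name2))
    return flagged

print(discordant_pairs(records))
```

Here the only flagged pair is connector corrosion and firmware lockup: the FMEA ranked corrosion as more likely, but the field data show more lockups. Flagged pairs are candidates for revisiting the original Occurrence estimates.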


Suggestions for Improvement


Slide 41

Suggestions for Improvement of RAs

• Consider phasing out the use of RPNs when conducting FMEAs.

• Consider sorting the risk evaluations in FMEAs by Severity, then Occurrence, then Detection, and working with the ranked failure modes directly. (This increases risk discrimination.)

• Consider a move to Probabilistic Risk Assessment (PRA) methods in the future as rate data become available:

  - Use of Monte-Carlo Simulations (uses knowledge of input distributions)

  - Use of Bayesian Inversion Analysis (uses past reliability performance)

• If PRA methods are not viable for your work, consider adjusting the RPN to a Corrective Priority Number by dividing the RPN by the unit cost to implement the corrective action or detection method. See the next slide for an example.

This is a subject of its own. If interest exists, we can work on a separate presentation covering the bullet points in this slide.
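The sorting suggestion can be illustrated with a short sketch. It assumes hypothetical failure modes stored as (ID, Severity, Occurrence, Detection) tuples on 1 to 10 scales, with a higher Detection rank meaning harder to detect; the IDs are placeholders.

```python
# Four hypothetical failure modes that all share RPN = 360.
failure_modes = [
    ("A", 4, 9, 10),
    ("B", 10, 6, 6),
    ("C", 9, 8, 5),
    ("D", 10, 9, 4),
]

# Sort by Severity first, then Occurrence, then Detection, worst first.
ranked = sorted(failure_modes, key=lambda m: (m[1], m[2], m[3]), reverse=True)

for mode_id, s, o, d in ranked:
    print(f"{mode_id}: S={s} O={o} D={d} (RPN={s * o * d})")
```

The RPN cannot separate these four modes, while the lexicographic sort puts the two Severity 10 modes at the top (D, then B), followed by C and A.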


Slide 42

Estimating a Corrective Action Index

An example illustrating the last bullet point in the previous slide. The use of CAI as shown in this slide is adapted from the recent work of Dr. Tony Cox.
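The slide's own worked example is not reproduced in this text, so the following is only a sketch of the calculation as described in the previous slide: divide each RPN by the unit cost of its corrective action and rank by the result. The failure modes, ranks, and costs are hypothetical, and the function name is illustrative.

```python
def corrective_priority(severity, occurrence, detection, unit_cost):
    """RPN divided by the unit cost of the corrective action or detection method."""
    return (severity * occurrence * detection) / unit_cost

# Hypothetical entries: (ID, Severity, Occurrence, Detection, unit cost in dollars).
candidates = [
    ("A", 10, 9, 4, 12000.0),
    ("B", 9, 8, 5, 1800.0),
    ("C", 6, 6, 10, 600.0),
]

# Highest index first: the most RPN addressed per dollar spent.
by_priority = sorted(
    candidates,
    key=lambda c: corrective_priority(c[1], c[2], c[3], c[4]),
    reverse=True,
)

for mode_id, s, o, d, cost in by_priority:
    print(f"{mode_id}: RPN={s * o * d} cost={cost:.0f} index={corrective_priority(s, o, d, cost):.3f}")
```

All three entries share RPN = 360, yet the cost adjustment orders them C, B, A. The index inherits the ordinal-scale limitations of the RPN itself, so it is best read as a rough prioritization aid.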


Slide 43

Selected References

D. H. Stamatis, Failure Mode and Effect Analysis, (1995), copyright ASQ/ASQC Quality Press.

Automotive Industry Action Group, Potential Failure Mode and Effects Analysis – Reference Manual, (February 1995), Second Edition.

Automotive Industry Action Group, Statistical Process Control (SPC) - Reference Manual, (March 1995), Second Printing.

D. W. Hubbard, The Failure of Risk Management – Why It’s Broken and How to Fix It, (2009), copyright John Wiley & Sons.

D. J. Wheeler, The Six Sigma Practitioner’s Guide to Data Analysis, 311-315 (2005), copyright SPC Press.

A. Tversky and D. Kahneman, Judgment Under Uncertainty: Heuristics and Biases, Science 185, 1124-1131 (1974), copyright 1974 AAAS.

L. Meshkat, Probabilistic Risk Assessment for Decision Making during Spacecraft Operations, (2009), IEEE Journal.

Louis Anthony (Tony) Cox, Improving Risk Management, Comparisons and Decisions, November 2012 SIRA Meeting Webinar, Web Link: http://vimeo.com/53151221


Slide 44

Selected References

US DoD Information Analysis Center, Failure Mode, Effects, and Criticality Analysis (FMECA), (1993), Reliability Analysis Center, Rome, NY.

The New Excellence, AIAG FMEA Severity, Occurrence, and Detection Ranking Guidelines, (2009), Web link: www.TheNewExcellence.com

S. Lichtenstein, B. Fischhoff, and L. D. Phillips, Calibration of Probabilities: The State of the Art to 1980, (1981), Perceptronics, Inc. sponsored by the US Office of Naval Research.

G. Keren, On the Calibration of Probability Judgments: Some Critical Comments and Alternative Perspectives, (1997), Journal of Behavioral Decision Making, Vol. 10, 269-278.

G. E. Apostolakis, How Useful is Quantitative Risk Assessment?, (2004), Risk Analysis, Vol. 24, No. 3, 515-520.

http://livingsta.hubpages.com/hub/20-Worst-Accidents-Involving-US-Carriers, 20 Worst Accidents Involving US (Aviation) Carriers.

Louis Anthony (Tony) Cox Jr., Risk Analysis of Complex and Uncertain Systems, (2009), copyright Springer.


Appendix


Slide 46

Other UA Flight 232 Photos


Slide 47

Timeline of Risk Management Methods

Use of FMECAs

1949 – Use of FMECA first standardized in MIL-P-1629

1980 – MIL-P-1629, supporting FMECA, revised to MIL-STD-1629A

1984 – US government support of FMECA MIL-STD-1629 canceled*

Use of FMEAs

1960s – Contractors of NASA developed and used variants of FMECA referred to as FMEA

1967 – SAE published ARP-926 supporting FMEA approach

1967 – US Civil Aviation industry adopts FMEA approach supported by SAE

1970 – Automotive industry began widespread use of FMEA

1973 – US EPA adopts use of FMEA approach for Risk assessment

1993 – AIAG publishes first FMEA standard

1994 – SAE publishes first FMEA standard

Use of FTAs and PRA

1962 – Bell Labs develops the Fault Tree Analysis approach

1970 – US FAA includes use of FTA in 14 CFR 25.1309 for all Transport Category Aviation

1971 – Widespread use of Probabilistic Risk Assessment and FTA begins in the Aviation industry*

1981 – Mandatory use of PRA and FTA by US Nuclear power industry*

1987 – Mandatory use of PRA and such tools as FTA and FMECA by NASA after shuttle disaster in 1986*

2010 – Widespread use of PRA methods by the US Dept. of Homeland Security*

*Major changes in Risk Management Application


Slide 48

A Combined RM Approach: Failure Modes and Effects Criticality Analysis

[Diagram labels:]

Hardware and Products, Processes, Product Applications, Service Systems, etc.

Failure Effects, Detection Methods, Compensating Provisions, Severity Class

Failure Rate, Mission Time, Modal Criticality Number


Slide 49

FMEA Part:

Failure Modes and Effects Criticality Analysis


Slide 50

Criticality Analysis:

Failure Modes and Effects Criticality Analysis


Slide 51

FMECA Criticality Matrix:

Failure Modes and Effects Criticality Analysis


Slide 52

Criticality Matrix:

Failure Modes and Effects Criticality Analysis

Some Potentially Useful Analysis Guidance


Slide 53

Type 2 RM Approach: FTA - Fault Tree Analysis

[Fault tree diagram. Gates: OR1, OR2, OR3, AND1, AND2. Basic events: Faults, Errors, Malfunctions, etc.]

Pa = p1 + p2

Pb = p3*p4*p5

PA2 = p7*p8

Pc = p6 + (p7*p8)

Psystem = Pa + Pb + Pc

A Fault Tree Analysis allows a quantitative measure of process risk by bounding the uncertainty associated with complex undesirable events that are linked together logically in a process. If one can assess the frequency of occurrence for the components of each undesirable event, then one can estimate the chance that a Fault or Failure will occur.

This slide illustrates a simple example using Fault Tree Analysis to understand the logical structure associated with the error of excluding a buffer during one step in a biopharmaceutical process.
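The slide's formulas can be evaluated directly once basic-event probabilities are available. The sketch below assumes hypothetical values for p1 through p8 (the slide's actual numbers are not reproduced in this text) and assumes the basic events are independent. It compares three estimates: the slide's formulas, in which OR gates add probabilities (an approximation that works well when the probabilities are small); the exact OR-gate result under independence; and a simple Monte-Carlo simulation of the same tree.

```python
import random

# Hypothetical basic-event probabilities p1..p8 (not taken from the slide).
p = {1: 0.01, 2: 0.02, 3: 0.1, 4: 0.2, 5: 0.05, 6: 0.005, 7: 0.1, 8: 0.03}

def slide_formulas(p):
    """Psystem as written on the slide: OR gates add, AND gates multiply."""
    pa = p[1] + p[2]
    pb = p[3] * p[4] * p[5]
    pa2 = p[7] * p[8]
    pc = p[6] + pa2
    return pa + pb + pc

def or_gate(*probs):
    """Exact probability that at least one of several independent events occurs."""
    none_occur = 1.0
    for x in probs:
        none_occur *= 1.0 - x
    return 1.0 - none_occur

def exact_independent(p):
    """Same tree, using the exact OR-gate formula for independent events."""
    pa = or_gate(p[1], p[2])
    pb = p[3] * p[4] * p[5]
    pc = or_gate(p[6], p[7] * p[8])
    return or_gate(pa, pb, pc)

def monte_carlo(p, trials=100_000, seed=1):
    """Estimate the top-event probability by sampling the basic events."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        e = {k: rng.random() < v for k, v in p.items()}
        a = e[1] or e[2]
        b = e[3] and e[4] and e[5]
        c = e[6] or (e[7] and e[8])
        hits += a or b or c
    return hits / trials

print(slide_formulas(p), exact_independent(p), monte_carlo(p))
```

With these inputs the additive formulas give 0.039, slightly above the exact value of about 0.0385, and the simulation lands close to the exact value. The same simulation structure extends to cases where each probability is itself drawn from an input distribution.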


Slide 54

AIAG Guidelines:

Detection Ranking Criteria

Reprinted from www.TheNewExcellence.com

Example of the risk ranking table supporting the Detection risk as provided by AIAG.


Answers to Simple Calibration Exercise

No. | Statement | Answer (T or F)
---|---|---
1 | The ancient Romans were conquered by the ancient Greeks. | F
2 | There is no species of three-humped camel. | T
3 | A liter of oil weighs less than a liter of water. | T
4 | Mars is always further away from the Earth than Venus. | F
5 | M is one of the three most commonly used letters. | F
6 | Napoleon was born on the island of Corsica. | T
7 | One meter equals 37.39 inches. | F
8 | In 2002, the price of a new desktop computer was under $1,500. | T
9 | Modern humans first appeared on the earth about 200,000 years ago. | T
10 | The first six digits of the constant pi are 3.14139. | F

Answers to the calibration questions provided in the body of this presentation.