FMEA - Failure Modes and Effects Analysis + Criticality (FMECA)

A Core Component of RCM

Presented By: Tim Bair, Research Engineer

The Applied Research Laboratory at The Pennsylvania State University



What is the Role of Maintenance?

• Maintenance helps to ensure that assets continue to fulfill their intended functions:

– Failure: The system completely loses functionality, or its operational capability falls below the minimum standard desired by the user.

• Minimum Standard: either the initial designed capability or a desired performance level below the design capability

• In order to determine the need for maintenance, the function and performance standards for the asset must be defined.

[Figure: performance axis showing Initial Capability, Margin for Deterioration, Acceptable Performance, and Failure]


What is a FMEA?

Failure Modes and Effects Analysis (FMEA) is a tool that is an integral part of the RCM process.

• The FMEA process is a systematic method to identify:

– Primary and secondary functions of the system and the failure modes that prevent the system from completing its designed purpose.

• Process Objective:

– Identify and prioritize the failure modes and their subsequent effects on the system, to help eliminate or minimize catastrophic and critical failure modes through the most appropriate type of maintenance methodology:

• Predictive Maintenance

• Preventative Maintenance

• No Scheduled Maintenance


Sample FMECA Table


The Major Elements of the Basic RCM Process

• RCM Establishment and Planning

• Analysis:

– Define the function and functional failures of a specific platform, system or component.

– Then conduct a Failure Modes and Effects Analysis

– Identify the failure consequences

– Determine maintenance tasks and intervals.

• Analysis Audit

• Implementation

• Sustaining the RCM Program:

– RCM is a ‘Living Program’

– Implement an RCM management, training, benchmarking, and review process to provide feedback and measurement of progress toward asset management goals


Determine Scope of Analysis

• The ‘Scope of Analysis’ identifies the systems and sub-systems to be analyzed and describes the level of detail to which each will be analyzed.

• The system is partitioned, and the level and extent of analysis necessary to meet program objectives are identified.

– Define reasonable boundaries so that the system includes the necessary inputs and outputs but is not so large that it is difficult to analyze.

– Include all failure modes that are ‘reasonably likely’ to cause functional failure.


Defining System and Boundaries: Aircraft Hydraulic System

• A system is a user-defined group of:

– Components

– Systems

– Equipment

that support an operational requirement.

• Boundaries are selected to divide a complex system into manageable sub-systems.

– A boundary or interface should define the inputs and outputs of the system.


Level and Extent of the Analysis: How Deep Do You Drill Down?

[Figure: drill-down hierarchy with the levels Aircraft; Hydraulic System; Control Surfaces; Hydraulic Pumps; and Components (Shaft, Impellor, Bearings, Lubrication, Seals). Level markers: Maintain Level, CBM Level, Root Cause of Failure Analysis.]

Failure modes shown at the lowest level: Cracked Blade; Broken Blade; Corrosion; Shaft Broken; Coupling Broken; No Coupling Lubrication; Splines Broken; Bolts Broken; Inner Race Failure; Outer Race Failure; Element Failure; Cage Failure; Seal Failure; Lubrication Failure; Viscosity Breakdown; Contamination; Loss of Lubrication; Material Failure; Excessive Wear.

The goal is to isolate the failure mode to the lowest level that allows for the most effective application of the maintenance policy.


Reliability Block Diagram: Fault Tree Analysis

Hydraulic Pump/Turbine Engine

[Figure: fault tree. Top event: Total Loss of Fluid Flow (failure mode λ1). Left side: Left Pump Failure (λ2) and Left Engine Failure (λ3). Right side: Right Pump Failure (λ9) and Right Engine Failure (λ10). Lower-level events on each side: Shaft Failure, Bearing Seizure, Fluid Leak (λ8 on the left, λ13 on the right), Air Failure, and Fuel Failure. The remaining events are labeled λ4 to λ7, λ11, λ12, λ14 and λ15; nodes are numbered 01 to 06.]


Steps for the Analysis Process: Information and Decision

1. Identify System Functions: What does the user need the system to do in its current operating context?

2. Identify Functional Failures: In what way can the system fail (or fail to fulfill its function)?

3. Identify the Failure Modes: What causes the failures?

4. Identify the Failure Effects: What happens when failures occur and what are the symptoms of failure?

5. Identify Failure Consequences: How and why does the failure matter?

• Frequency of occurrence

• Severity of the failure mode

Reference: TACOM ILSC CBM+, Reliability Centered Maintenance Process Overview, C1061-03-0004


Defining Function 2.1.1

• What are the functions and associated desired standards of performance of the asset in its present operating context?

– Identify all primary and secondary functions of the asset or system in terms of performance.

• Primary Functions: Speed, Output, Storage Capacity, Product Quality

• Secondary Functions: Safety, Control, Containment, Comfort, Economy, Environmental Compliance

– Functional statements should contain a verb, an object and a performance standard.

• The turbine engine is designed to provide 65000 lbs of thrust @ 21000 rpm.

• The platform is designed to transport 100 soldiers and 10,000 lbs. of cargo at a maximum speed of 650 mph.

– Performance standards should be at a level desired for the operational context.


Defining Function – Operating Context

• The operating context may influence the primary and secondary functions.

– Operating Environment:

• The operating environment for one piece of equipment may be unique and differ from that of other, similar equipment.

– Safety and Environmental Standards:

• Equipment may have different safety and environmental standards based on how it is operated with respect to humans.

– Regular or Intermittent Use:

• Does the equipment or platform deploy regularly, or is it used only in unique circumstances (e.g. cold weather kits, bridge layer platforms)?

– System Redundancies:

• With backup systems, one piece of equipment may operate continuously while the redundant system remains on stand-by.


Defining Function – Performance Standard

• Many systems will have multiple performance standards.

– The engine must operate at 3000 RPM and 1000 HP continuously.

• Quantitative vs. Qualitative: be as precise as possible when defining a performance standard.

– Absolute standard: exact specification

• To contain 40 gallons of fluid or to contain fluid.

– Variable standard: mean with upper and lower limits

• To contain an average of 40 gallons of fluid +/- 1 gallon.

• Example: Hydraulic PTO Pump – What is its function in its operating context?

– To be able to move hydraulic fluid on demand at a flow rate ranging from 550 to 650 cfm, at a pressure range from 600 to 2000 psig and at a temperature ranging from -50 F to +220 F.
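A functional statement like the pump example can be expressed as a set of acceptance ranges and checked against measured values. The following is a minimal Python sketch, not a definitive implementation: the names `Range`, `PTO_PUMP_STANDARD` and `out_of_range` are hypothetical, and the sample readings are invented for illustration. Only the limits come from the pump example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Inclusive lower and upper limits for one performance parameter."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Limits taken from the hydraulic PTO pump example; the dictionary keys are
# illustrative names, not part of any standard.
PTO_PUMP_STANDARD = {
    "flow_cfm": Range(550, 650),
    "pressure_psig": Range(600, 2000),
    "temperature_f": Range(-50, 220),
}


def out_of_range(readings: dict[str, float]) -> list[str]:
    """Return the names of parameters that fall outside the standard."""
    return [
        name
        for name, limits in PTO_PUMP_STANDARD.items()
        if not limits.contains(readings[name])
    ]


# Hypothetical readings: flow has dropped below the required rate.
sample = {"flow_cfm": 500, "pressure_psig": 1200, "temperature_f": 150}
print(out_of_range(sample))
```

Any parameter name returned by `out_of_range` marks a candidate functional failure, because the asset is no longer performing to the defined standard.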


Defining Function – Hidden Function

• A function whose failure does not become apparent to the operating crew under normal circumstances.

– Equipment Protective Devices (Safety)

• Provide a warning indicator of an abnormal condition

• Shut down equipment in case of failure (avoidance of catastrophic failure)

• Provide redundant control in case of primary control failure

• Guards to prevent physical harm

• Example: Hydraulics – What are the hidden functions?

– Relief valve must activate at 2000 psi (OEM setting for maximum payload) as a safety device to protect the system from overloading.

– Relief Valve Failure Mode: Due to normal valve wear (spring sag) the relief valve will begin to activate at 1600 psi, which will limit the maximum lift capability.

• This will result in a functional failure of the hydraulic system.
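Because a hidden function does not reveal its failure during normal operation, one way to expose it is a periodic check of the device against its setting. The sketch below illustrates such a check in Python. The function name and the 5% tolerance are assumptions made for illustration; only the 2000 psi setting and the 1600 psi drifted value come from the relief valve example.

```python
def relief_valve_ok(measured_psi: float, setpoint_psi: float = 2000.0,
                    tolerance: float = 0.05) -> bool:
    """True if the measured activation pressure is within tolerance of the setpoint.

    The 5% tolerance is a hypothetical value, not an OEM figure.
    """
    return abs(measured_psi - setpoint_psi) <= tolerance * setpoint_psi


print(relief_valve_ok(2000.0))  # valve activating at the OEM setting
print(relief_valve_ok(1600.0))  # valve with spring sag from the example
```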


Defining Functional Failure 2.1.2

• In what ways can a system fail to fulfill its function to the standard of performance required by the user?

– Performance standards for the asset must be well defined and agreed upon by operations and maintenance.

– Functional Failure: Once the function of the asset has been established, the inability of the asset to perform to the defined standard constitutes a failure.

– List all failed states associated with each function.

• Total and Partial Failures

• Limit Exceedance (Operational Performance)

• Operational Metric Displays (Gauges and Indicators)

• Example: What are the Pump Functional Failures?

1. Total loss of flow

2. No indication of operational parameters

3. Flow below required rate

4. Unable to contain fluid

5. Fluid pressure out of range

6. Inability to control pump operation

7. Fluid temperature above required range


Defining Failure Modes 2.1.3

• What causes each functional failure?

– List all failure modes reasonably likely to cause each functional failure.

• Decreasing Capability:

– Deterioration: fatigue, corrosion, wear and tear

– Lubrication Failures: root cause of many failures

– Contamination: causes excessive wear

– Human Errors: manual devices not properly operated

• Desired Performance Exceeds Capability:

– Intentional or Unintentional

– Sustained or Intermittent


Defining Failure Modes

• Failure modes should be defined in enough detail to select the most appropriate failure management policies.

– Failure modes should be identified in terms of ‘cause and effect’ if possible.

• Bearing Failure: caused by a manufacturing defect or by lubrication loss; the two causes need to be addressed separately.

– List failure modes with the highest probability of occurrence.

• Start with failure modes that have occurred on similar equipment.

• Use experience to estimate which failure modes are most likely to occur.

– List failure modes that result in the most severe consequences.

• Engine coolant pump failure is more severe than AC pump failure.

– Do not get bogged down with too many failure modes.

• The SAE standard stipulates that only failure modes that are likely to occur in the operating context should be included.

• Example: What are the Pump Failure Modes for Total Loss of Flow?

1. Pump Shaft Failure

2. Failure of Lubrication (foreign material in the fluid)

3. Pump Impellor Failure

4. Failure of the Seal

5. Pump Bearing Seizure (manufacturing defect)


Defining Failure Effects 2.1.4

• What are the functional consequences when each failure mode occurs?

– Effects: events that are a direct result of the failure mode.

• What evidence shows that the failure has occurred?

– Equipment stops rotating or an alarm light comes on.

• What safety or environmental threat exists?

– Fire may occur or hazardous material may no longer be contained.

• How has the mission, production or operation been affected?

– Another vehicle may need to remain with the failed vehicle.

• What damage is caused by the failure mode?

– Bearing fault will cause impellor damage.

• What must be done to repair the equipment?

– Engine needs to be removed to replace transmission: 8 hours


Defining Failure Effects

– Secondary consequences to the system as a result of the failure mode:

• Backup systems must be operated: is the operation, production or mission affected while running on backup?

• Are spares available? What is the delay time to receive replacement parts?

• Example: What is the Failure Effect for a PTO Pump Bearing Seizure?

– Pump rotor is unable to continue to rotate, which causes the fluid to no longer flow.

– Vehicle incurs 5 hours downtime to replace pump.

– Another vehicle must be activated to complete the mission.


Example FMEA Format

• There are many FMEA formats. Choose one that fits your analysis needs best.
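One possible way to hold an FMEA worksheet in software is one record per failure mode, with a field for each analysis step. The sketch below is only an illustration: the class name `FmeaRow` and its field names are assumptions, since formats vary, and the sample row is filled in from the PTO pump bearing seizure example used in this presentation.

```python
from dataclasses import dataclass, field


@dataclass
class FmeaRow:
    """One worksheet row; the field names are illustrative, not a standard."""
    function: str
    functional_failure: str
    failure_mode: str
    failure_effects: list[str] = field(default_factory=list)


row = FmeaRow(
    function="Move hydraulic fluid on demand within the required flow, "
             "pressure and temperature ranges",
    functional_failure="Total loss of flow",
    failure_mode="Pump bearing seizure (manufacturing defect)",
    failure_effects=[
        "Pump rotor is unable to rotate, so fluid no longer flows",
        "Vehicle incurs 5 hours downtime to replace pump",
        "Another vehicle must be activated to complete the mission",
    ],
)

print(f"{row.functional_failure}: {row.failure_mode}")
for effect in row.failure_effects:
    print(f"  - {effect}")
```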


Criticality Analysis (Assessing Risk) and Pareto Analysis (Identifying ‘Dominant’ Failure Modes)

Useful for Prioritizing, Making Decisions and Focusing the Analysis Efforts


Criticality 2.1.5

• How likely is the failure mode to occur?

– The SAE standard stipulates that only failure modes that are reasonably likely to occur in the context of operations should be considered for the FMEA.

– Ideally this probability should be quantified in the RCM analysis.

• What are the Consequences and is the Risk Tolerable?

– Risk is measured by multiplying the probability of the failure mode by the severity of the failure mode.

– Deciding what risk is tolerable is specific to the individual and the organization.

Reference: SAE Standard JA1012
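The risk calculation described above (probability multiplied by severity) can be sketched in a few lines of Python. Everything numeric here is hypothetical: the probabilities are invented, and the severity scale of 1 (minor) to 4 (catastrophic) is an assumed numeric encoding rather than a scale defined in this presentation.

```python
def risk_score(probability: float, severity: float) -> float:
    """Risk as the product of failure-mode probability and severity."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * severity


# Hypothetical failure modes: (probability over the period of interest,
# severity on an assumed 1 (minor) to 4 (catastrophic) scale).
failure_modes = {
    "Pump bearing seizure": (0.02, 3),
    "Seal failure": (0.10, 2),
    "Pump shaft failure": (0.005, 4),
}

# Rank failure modes from highest to lowest risk.
ranked = sorted(
    failure_modes,
    key=lambda name: risk_score(*failure_modes[name]),
    reverse=True,
)
for name in ranked:
    print(f"{name}: {risk_score(*failure_modes[name]):.3f}")
```

Whether a given score is tolerable remains an organizational decision; the sketch only produces the ranking.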


Critical and Dominant Failure Modes

• Critical Failure Modes have significant effects with high-level safety, environmental, or mission consequences.

• A Dominant Failure Mode is a single failure mode that accounts for a significant portion of the failures of a complex system.

[Figure: motor component breakdown. Labels: Motor; Rotor; Stator; Shaft; Bearings; Lubrication; Coupling; Inner Race; Ball or Roller Elements; Outer Race; Seals.]


Severity/Consequences

• Need to define the mission function, safety and environmental consequence in terms of severity levels to categorize each failure mode.

• Severity levels will be used to rank and prioritize each failure mode.

Reference: MIL-STD-882C, System Safety Program Requirements


Probability of Occurrence

• Need to qualify the probability that each failure mode will occur to rank and prioritize each failure mode.

• Fleet or Inventory: Normal total run hours to number of items.

Reference: MIL-STD-882C, System Safety Program Requirements


Pareto Analysis

• An effective method for classifying and prioritizing information.

• Failure data analysis helps to identify the largest portion of the reliability issues, which can then be addressed efficiently with the most economical use of resources.

• Failure data can consist of:

– Cost (operation or repair)

– Probability

– Frequency of Occurrence

– Consequence of failure (severity)

In some cases a large number of failure modes are the result of only a few causes.

Reference: Operating Equipment Asset Management: Your 21st Century Competitive Necessity, by John S. Mitchell

[Figure: chart of pump failure modes]
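A Pareto analysis of failure data can be sketched as a sort by count followed by a running cumulative percentage. The Python example below is a minimal illustration; the function name `pareto` and all failure counts are hypothetical and are not taken from the chart or from any real data set.

```python
def pareto(counts: dict[str, int]) -> list[tuple[str, int, float]]:
    """Sort categories by count (descending) with cumulative percentage."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("at least one failure must be recorded")
    rows = []
    running = 0
    for name, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((name, count, 100.0 * running / total))
    return rows


# Hypothetical failure counts, for illustration only.
pump_failures = {
    "Seal failure": 40,
    "Bearing seizure": 30,
    "Lubrication failure": 20,
    "Shaft failure": 6,
    "Impellor failure": 4,
}

for name, count, cumulative in pareto(pump_failures):
    print(f"{name:<20} {count:>3} {cumulative:6.1f}%")
```

Reading down the cumulative column shows how few failure modes account for most of the recorded failures, which is where analysis effort is best focused.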


Example FMECA Format

• Criticality Information: Frequency of Occurrence and the Severity.

• Including this information provides another parameter to aid decision making.


Criticality Analysis: For Ranking and Prioritizing Failure Modes

• Criticality Analysis: provides a relative measure of significance of the effect that a failure mode has on the successful operation and safety of the system.

• Calculation is based on definition of failure, severity categories and part failure rate information.

• Two approaches:

– Quantitative: when historic operational failure rate or test failure rate data exists

– Qualitative: when little to no failure rate data exists

Failure Mode, Effects and Criticality Analysis (FMECA), Reliability Analysis Center


Criticality Analysis: Quantitative Approach

• Failure Mode Criticality Number: provides a quantitative criticality rating for the component or function.

$C_m = \beta \, \alpha \, \lambda_p \, t$

where $C_m$ = criticality number for the m-th failure mode

$\alpha$ = failure mode ratio (for a specific item)

$\beta$ = conditional probability of loss of function

$\lambda_p$ = part failure rate (failures/million hours)

$t$ = operating time or number of operating cycles

Practical Reliability Engineering, by Patrick D.T. O’Connor
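The failure mode criticality number is the product of the conditional probability of loss of function (β), the failure mode ratio (α), the part failure rate (λp) and the operating time (t). The following Python sketch shows the arithmetic under stated assumptions: the function and parameter names are hypothetical, the input values are invented, and the operating time is assumed to be in hours so that the failure rate in failures per million hours can be converted consistently.

```python
def criticality_number(beta: float, alpha: float,
                       lambda_p_per_million_hours: float, t_hours: float) -> float:
    """Failure mode criticality number C_m = beta * alpha * lambda_p * t.

    lambda_p is given in failures per million hours, so it is converted to
    failures per hour before multiplying by the operating time in hours.
    """
    lambda_p_per_hour = lambda_p_per_million_hours / 1_000_000
    return beta * alpha * lambda_p_per_hour * t_hours


# Hypothetical inputs for a single failure mode.
c_m = criticality_number(beta=0.5, alpha=0.3,
                         lambda_p_per_million_hours=20.0, t_hours=1000.0)
print(f"C_m = {c_m:.4f}")
```

With these invented inputs the product is 0.5 × 0.3 × (20 / 1,000,000) × 1000 = 0.003. Comparing such numbers across failure modes gives a relative ranking, not an absolute measure of risk.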


Summary

• FMEA: An excellent systematic method for identifying and organizing the issues that lead to low operational reliability.

– Function of an Asset:

• How the asset can fail to perform to the required standard.

– Failure Modes and Effects:

• How failures prevent an asset from functioning correctly and what happens when specific failures occur.

• Criticality: A portion of the FMECA that provides additional information for decision making.

– Consequences and Probability of Failure:

• How severe the consequences of failure are for the operation and mission, and what the likelihood of occurrence is.

– Pareto Analysis:

• Methods for evaluating and prioritizing failure modes to determine which would benefit most from the implementation of the RCM process.