7/30/2019 Reliability Maintenance Operational Management
Concept of
ALL RIGHTS RESERVED 2007
Rev. 1 : August 07, 2007
RELIABILITY MAINTENANCE
OBJECTIVES OF THIS COURSE :
- Provide a basic understanding of equipment failure
- Understand the different maintenance strategies at hand
- Learn when to use the different maintenance matrix and indices
- Understand what Root Cause Failure Analysis is and how far we should analyze the problem
- Share the lessons on Reliability and Maintenance
RELIABILITY MAINTENANCE COURSE MODULES :
Module 1 : Understanding Equipment Failure
- The truth about equipment failures
- What can maintenance do after all ?
- Understanding the patterns of failure
- Why is Preventive Maintenance limited ?
Module 2 : Understanding The Different Maintenance Strategies
- Reactive Maintenance
- Preventive Maintenance
- Predictive Maintenance
- Proactive Maintenance
Module 3 : Maintenance Matrix, Indices and KPIs
- Understanding MTBF and MTTF
- Understanding MTTR
Module 4 : Root Cause Failure Analysis
- 3 Levels of Root Cause Failure Analysis
- Sample Case Study : Pump Failure
Module 5 : Lessons On Reliability
When we set out to maintain something
- What is the existing state that we wish to preserve ?
- What is it that we wish to cause to continue ?
Hence, when we maintain an asset
- Someone wants it to do something
- They expect it to fulfil a specific function
- Because it is what the users want it to do
By definition :
Maintenance is ensuring that physical assets continue to do what the users want them to do.
RELIABILITY DEFINED
FAILURE simply means the inability of a piece of equipment to perform its required function. The failure of a component is viewed as terminating its life.
RELIABILITY, on the other hand, is the probability that no failure will occur throughout a prescribed operating period.
BAZOVSKY states that . . . .
The modern concept of reliability, in popular language, is simply the capability of equipment not to break down in operation. When equipment works well and does the job for which it was designed, such equipment is said to be reliable.
MODULE 1
Understanding Equipment Failures
INTRODUCTION : DOMINO EFFECT OF BEING REACTIVE
Start here : a belief that all parts will wear
- Backlog grows and PM is missed
- More failures occur
- Resources are taken down by breakdowns
- Get the line going, temporary repairs
- More repeat work, working long hours
- Pressure on maintenance to keep the machine OK
- Operations copes with backlog, won't give time for PM
- Spares and budget grow on maintenance
- Morale declines and standards drop
SIMPLE FLOWCHART FOR REACTIVE ENVIRONMENT
[Flowchart of YES / NO decisions. Questions : Is it working ? Did you mess with it ? Does anyone else know ? Can you blame someone else ? Will it blow up in your hands ? Outcomes : Don't mess with it ! / Look the other way / Hide it now, quick / You better watch out !!! / You better not cry !!! / No problem !]
LOOKING AT THE REAL MEANING OF FAILURE
Is failure GOOD or BAD ?
Don't we learn from failure ?
Can we succeed without failing ?
Is failure telling us something ?
Is it OK to fail ?
Failure is not bad after all
Failure is our greatest teacher
Failure is our key to success
Yes, there are lessons to be learned from it
I really don't see anything wrong with failure if we accept it positively
What they say about failures :
Thomas Alva Edison (1847 - 1931)
"Don't call it a mistake, call it an education . . ."
- Only 3 months of schooling
- First incandescent electric bulb lit on Oct. 21st, 1879, for 40 hrs
- When he died he held over 1368 separate US & foreign patents
Henry Ford (1863 - 1947)
"Failure is only the opportunity to begin again more intelligently"
- Had only limited schooling
- He produced an affordable car, paid high wages and helped create a middle class
Soichiro Honda (1906 - 1991)
"What people see of my success is only 1 percent, but what they don't see is the 99% which are my failures"
- Today, Honda Corporation employs over 100,000 people in the USA and Japan, and is one of the world's largest automobile companies
THE TRUTH ABOUT MACHINERY AND EQUIPMENT
CAN WE REALLY ELIMINATE FAILURES ?
A piece of equipment is composed of the following
- Electronic parts (100,000 pcs)
- Electrical parts (30,000 pcs)
- Mechanical parts (5,000 pcs)
The 2 important questions for maintenance to raise will be
1) Exactly which part will fail ?
2) When will that part fail ?
THE TRUTH ABOUT MACHINERY AND EQUIPMENT
- But we have around 100 similar machines & 10 types of equipment
- Each piece of equipment has more than 130,000 components in it
- We only have 5 maintenance craftspeople per shift for all our equipment
- How do we know which parts will fail, on which machine and when ?
- Can we accept the fact that failures are really meant to happen after all ?
THE TRUTH ABOUT MACHINERY AND EQUIPMENT
TO ADDRESS THIS ISSUE :
- Many people are deployed to perform full-time repair work
- We have some form of Preventive Maintenance that schedules the equipment for some form of maintenance work
- Inspections are added from time to time, increasing the amount of work for maintenance
- Maintenance is measured by how fast repairs are performed
- Maintenance will only focus on failed parts that will stop the equipment from running & will likely ignore failures of secondary functions
Despite these very noble efforts the machine still fails, RIGHT ?
"I guess that's the way it is, boss !"
THE TRUTH ABOUT MACHINERY AND EQUIPMENT
FACT 1
We use machinery and equipment to perform a particular function. If it cannot provide that function, we say that our equipment has failed or that a breakdown has occurred.
FACT 2
Equipment does not fail; some parts of the equipment have failed. Once we have identified the failed part and replaced it, the machine will be running again.
Although we might be using some statistics & history records as a baseline, the fact still remains : we do not know exactly which parts are going to fail or precisely when they will fail, but we certainly know that one day our car will run dead, our computer will stop working and our equipment will stop working due to a failure or breakdown . . . . .
THE TRUTH ABOUT MACHINERY AND EQUIPMENT
Therefore, the aim of maintenance is to control the timing of failure so that we can select or perform a task before failure happens.
The best that we can do for our equipment will be to :
1st - Extend the length of time between failures
2nd - Prevent failures by replacing the most worrisome components before they fail
3rd - Monitor for failures by looking for signs and symptoms that parts are on the verge of failing; this is possible by determining the condition of the equipment
Making equipment more reliable is about extending the life & the time between failures (MTBF) as well as preventing failures by replacing parts & components. This is what maintenance is all about . . . . .
FAILURE (TIP OF THE ICEBERG)
TYPICAL CAUSES OF FAILURES
FRACTURE       VIBRATION      DIRT / DUST    ABRASION
HUMAN ERROR    LOOSENESS      LEAKAGE        CONTAMINATION
CORROSION      DEFORMATION    TEMPERATURE    LUBRICATION
LOOSE BOLTS    MISALIGNMENT   FATIGUE        ENVIRONMENT
Common Belief : Do all parts wear out ?
Maintenance people believe that ALL parts, after consistent use, will reach a point of wear and tear. Hence, overhauling or replacing the part before it fails, on a specific fixed schedule, will ensure the reliability of the equipment, and therefore the concept of Preventive Maintenance will solve the problem of unexpected failures. RIGHT or WRONG ?
[Figure : deterioration versus time. Labels : natural deterioration, accelerated deterioration, failure line, failed state / run to fail, point at which the part is expected to reach failure, time-based and condition-based maintenance, Points 1 to 4.]
Most Manufacturing Industries Experience . . .
It is also borne out by the machine operator who says that every time maintenance works on it over the weekend, it takes up to Wednesday to get it going again.
Reference : page 143, RCM by John Moubray
It is this belief that led to the idea that the more often an item is overhauled, the less likely it is to fail . . .
Scheduled overhauls / Preventive Maintenance increase overall failures by introducing infant mortality into an otherwise stable system.
The resulting schedules are used for all similar assets, again without considering that different consequences apply in different operating contexts.
This results in a large number of schedules which are wasted, not because they are wrong in the technical sense, but because in reality they achieve nothing.
What Stanley Nowlan and the late Howard Heap discovered
2 discoveries emerged which changed the evolution and thinking of maintenance systems worldwide . . . . .
First, scheduled maintenance has little or no effect on the reliability of a complex item unless the item has a dominant failure mode.
Second, there are many items for which there is no effective form of scheduled maintenance.
UNDERSTANDING BREAKDOWN
HARD FACTS ABOUT EQUIPMENT FAILURES
- Not all failures will result in downtime
- Failures occur in 3 patterns : Infant Mortality, Random Failure & Age-Related Failure, and most of the failures we encounter are either random or infant mortality failures
- Increasing the amount of Preventive Maintenance activity on the equipment will likewise increase the chances of Infant Mortality Failures, and the only way to reduce Infant Mortality Failures is to reduce the amount of work in our PM
- Not all failures can be eliminated. The best that maintenance can actually do is to control the timing of failure, and reducing the consequences of failure is more feasible than trying to eliminate the failure itself
UNDERSTANDING BREAKDOWN
HARD FACTS ABOUT EQUIPMENT FAILURES
- Preventive Maintenance can only capture wear-out or age-related failures. When failure is random in nature, PM is at its weakest point and is not feasible to use
- Not all failures are created equal, yet all failures will have some degree of consequence. Hence, the degree of maintenance required should be based upon the consequences of the failure itself. When a failure has little or minor consequences, it is a good decision to allow the failure to occur
MACHINE 1
1 Failure / Mo 1 Failure / Mo No Failures No Failures
9 Failures / Mo 8 Failures / Mo 1 Failure / Mo No Failures No Failures
1 Failure / Mo
MACHINE 2 MACHINE 3 MACHINE 4 MACHINE 5
MACHINE 6 MACHINE 7 MACHINE 8 MACHINE 9 MACHINE 10
Will these 10 machines require the same amount of PM ?
Which machines will require the greater amount of maintenance ?
Should we follow the specs or apply common sense to maintenance ?
THE TRUTH ABOUT MACHINERY AND EQUIPMENT
COMPARING RANDOM AND AGE-RELATED FAILURES
CASE 1 : RANDOM FAILURES
Ex : 100 failures encountered on a ball bearing over a span of 9 years, distributed as follows
PERIOD OR LIFE    1    2    3    4    5    6    7    8    9
FAILURES          5   15   10   20   10    5   15   10   10
CONCLUSION : Failure distribution is not symmetrical, PM is not applicable
CASE 2 : AGE-RELATED FAILURES
PERIOD OR LIFE    1    2    3    4    5    6    7    8    9
FAILURES          2    1    0    0    0    2    1   94    0
CONCLUSION : Failure distribution is almost entirely age-related; for this case the best period to perform replacement is the 8th period
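One way to read the two cases above is to ask how much of the total failure count falls in the single worst period. The sketch below does that with the slide's own numbers. The function names and the 0.5 cut-off are assumptions made for illustration only, not part of the course material.

```python
# Failure counts per period, copied from Case 1 and Case 2 above.
random_case = [5, 15, 10, 20, 10, 5, 15, 10, 10]
age_related_case = [2, 1, 0, 0, 0, 2, 1, 94, 0]

# Hypothetical cut-off: call a pattern age-related when one period
# holds at least half of all failures. This value is an assumption.
AGE_RELATED_THRESHOLD = 0.5


def peak_share(failures):
    """Return (1-based period with the most failures, its share of all failures)."""
    total = sum(failures)
    peak = max(failures)
    return failures.index(peak) + 1, peak / total


def looks_age_related(failures, threshold=AGE_RELATED_THRESHOLD):
    """True when a single period dominates the failure distribution."""
    _, share = peak_share(failures)
    return share >= threshold


for name, data in [("Case 1", random_case), ("Case 2", age_related_case)]:
    period, share = peak_share(data)
    print(name, "peak period:", period, "share:", share,
          "age-related:", looks_age_related(data))
```

For Case 1 the worst period holds only 20 of 100 failures, so no replacement age stands out. For Case 2 the 8th period holds 94 of 100, which is why a time-based replacement is worth considering there.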
There is a belief that all items have a life and that installing a new part before that life is reached will automatically restore the equipment to its original basic condition = FALSE
This will lead us to the conclusion that the truth is . . . . .
MORE PM MEANS MORE PROBLEMS
LESS PM MEANS FEWER PROBLEMS
CHANGING THE WAY WE THINK ABOUT FAILURES
We need to understand that failures occur in 3 ways . . . . .
1st - INFANT MORTALITY : Failure can occur at the beginning
2nd - RANDOM FAILURES : Failure can occur at any period
3rd - AGE-RELATED FAILURES : Failure occurs through wear due to age
Most maintenance focuses only on the 3rd type of failure, neglecting the fact that infant mortality failures & random failures occur more frequently than wear-out failures.
[Figure : BATHTUB CURVE with three regions : infant mortality, random failures, wear-out failures. Occurrences of random and infant mortality failures are more frequent than wear-out failures.]
MISCONCEPTION ABOUT PREVENTIVE MAINTENANCE ?
Can all failures be captured by Preventive Maintenance ?
ANSWER : Despite the best efforts & structure in Preventive Maintenance, failures are still inevitable & will not be captured solely by PM. Zeroing out all breakdowns is like catching lightning with a Polaroid camera . . .
Why won't PM capture all failures ?
ANSWER : Typically only around 20% of component failures are wear-out failures or are directly related to the age of the equipment, and around 80% of all failures fit the random and infant mortality patterns.
And when the failure is random in nature, there is no amount of PM that can address this issue. This is where PM is at its weakest; hence, let us not misuse this strategy.
MODULE 2
Understanding The Different Maintenance Strategies
REACTIVE MAINTENANCE
PREVENTIVE MAINTENANCE
PREDICTIVE MAINTENANCE
PROACTIVE MAINTENANCE
REACTIVE MAINTENANCE :
Maintenance is done only at the point of repair or actual breakdown.
It occurs when repair action is taken on a problem only when the problem results in machine failure and unplanned downtime. In its simplest definition, breakdown maintenance simply means fixing it when it fails.
Reactive Maintenance is also known as :
- Band-Aid Maintenance
- No Scheduled Maintenance
- Firefighting
- Run-to-destruction
- Run-to-fail
REACTIVE MAINTENANCE :
"If it ain't broke, don't fix it; when it breaks, we'll fix it"
When this is the sole type of maintenance practiced
- High percentage of unplanned activities
- High replacement and parts inventories
- High pressure to keep equipment running
A purely reactive maintenance strategy ignores opportunities to influence equipment reliability and survivability.
Justifiable in particular instances if it :
- Does not produce critical delays
- Does not sacrifice people's safety
- Does not significantly increase costs
- Has redundant functions or a standby
RUN TO FAIL
If the failure is evident and does not affect safety or the environment, or if it is hidden but does not affect safety or the environment, then the default decision is No Scheduled Maintenance.
RUN TO FAIL MAINTENANCE IS VALID IF :
- A suitable scheduled task cannot be found for a hidden function
- A cost-effective preventive task cannot be found for failures which have operational or non-operational consequences
WHEN REACTIVE MAINTENANCE CAN BE JUSTIFIED
IS MONITORING, SCHEDULED MAINTENANCE OR INSPECTION REQUIRED FOR SAFETY OR ENVIRONMENTAL COMPLIANCE ?
NO
WILL THE BREAKDOWN BE MORE COSTLY THAN THE TASKS OF PREVENTING THE FAILURE ITSELF ?
NO
IS THE EQUIPMENT IN THE CRITICAL PATH IN MANUFACTURING OR CONSIDERED A BOTTLENECK EQUIPMENT OR PROCESS ?
NO
IS BACK-UP EQUIPMENT UNAVAILABLE ?
NO
WILL THE BREAKDOWN ADVERSELY AFFECT DELIVERY OR CUSTOMER SERVICE OR CAUSE ANY DELAYS ?
NO
WILL THE BREAKDOWN FURTHER DAMAGE THE EQUIPMENT OR CAUSE SECONDARY DAMAGE ?
NO
THEN REACTIVE MAINTENANCE IS JUSTIFIED
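The checklist above can be sketched as a simple screening function : reactive maintenance is treated as justified only when every question is answered NO. The class name, field names and example asset below are hypothetical and exist only to illustrate the logic of the slide.

```python
from dataclasses import dataclass, astuple


@dataclass
class ReactiveChecklist:
    # One field per question on the slide; True means the answer is YES.
    compliance_work_required: bool          # safety / environmental monitoring needed?
    breakdown_costlier_than_prevention: bool
    critical_path_or_bottleneck: bool
    backup_unavailable: bool
    affects_delivery_or_service: bool
    causes_secondary_damage: bool


def reactive_maintenance_justified(answers: ReactiveChecklist) -> bool:
    """Justified only if every question in the checklist is answered NO."""
    return not any(astuple(answers))


# Hypothetical example: a non-critical light fitting with spares on hand.
light_bulb = ReactiveChecklist(False, False, False, False, False, False)
print(reactive_maintenance_justified(light_bulb))  # True

# Same item, but now it sits on a bottleneck process.
bottleneck_item = ReactiveChecklist(False, False, True, False, False, False)
print(reactive_maintenance_justified(bottleneck_item))  # False
```

A single YES is enough to rule out a run-to-fail policy in this sketch, which mirrors the chain of NO answers on the slide.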
RUN TO FAIL EXAMPLES
- Electronic circuit boards
- When the consequences of failure and the cost of repair are minimal
- Busted light bulb failures
- Parts with redundancy or stand-by items such as pumps & motors
- Spare parts & component failures that will limit the failure to the component itself with no chance of secondary failures
- Overstocked inventories that can cover the repair time itself
PREVENTIVE MAINTENANCE :
Preventive Maintenance is simply performing maintenance on a fixed interval, which may be in the form of time, number of strokes or frequency.
- Time-Based
- Running Hours
- Scheduled Discard / Replace Parts
- Scheduled Restoration / Overhaul
- Stroke-Based
- Calendar-Based
PREVENTIVE MAINTENANCE :
- Also known as Time-Based or Calendar-Based Maintenance
- Maintenance activities are performed on a calendar or fixed operating schedule in order to extend the life of the equipment and prevent failures
- Maintenance is performed without regard to equipment condition
- Assumes that the condition of the machine and the need for maintenance are correlated with time, which means that the item can be expected to operate reliably for a certain amount of time and is then expected to wear out
- Failure rate and history records are used to establish the best frequency
PREVENTIVE MAINTENANCE :
Stress causes an asset to deteriorate by lowering its resistance. Exposure to stress includes output, distance traveled, operating cycles, calendar time and running time.
Trademark for Patterns A, B, and C
WHEN PREVENTIVE MAINTENANCE IS FEASIBLE
- When the part or component wears out in direct relation to its operating age
- These parts will survive to this defined age. Ex. 98 % of impellers were replaced after the end of 2 years
- The part or component will have a normal rate of wear; the TPM term is natural deterioration. A more technical term is normal fatigue
- Fatigue happens when the stress exceeds the strength of the material of the spare part or component
- Application of Preventive Maintenance tasks will only be worth doing and feasible for parts that have normal wear or deterioration
WHY PREVENTIVE MAINTENANCE IS LIMITED ?
A common problem with mature maintenance programs is that, if they are not correctly designed, between 40 and 60% of the PM tasks serve very little purpose. Evaluating our current Preventive Maintenance system should therefore lead us to find that :
- Many tasks duplicate other tasks
- Some tasks are done too often while others are not done often enough
- Some tasks serve no purpose whatsoever
- Many tasks are intrusive (forced) and overhaul-based whereas they should be condition-based
- Some tasks cost more to do than the failure they are meant to prevent
- Maintenance is costly because perfectly good parts are replaced, since replacement is time-based
John Moubray (1997), author of Reliability-Centered Maintenance
WHY PREVENTIVE MAINTENANCE IS LIMITED ?
Should maintenance or replacement be carried out on a piece of equipment ? If the equipment is in good condition, then it should remain in service.
Preventive Maintenance does not guarantee that the parts to be replaced really need to be replaced.
Why don't PMs significantly reduce the amount of reactive maintenance being performed in your plant ? The answer is simple. PMs were designed around the theory that equipment failures are directly related to the age of the equipment. Since only 20 percent of equipment failures fit this pattern, that means that 80 percent of equipment failures are not being effectively managed by doing time-based PMs.
PREDICTIVE MAINTENANCE :
Predictive Maintenance aids in detecting potential failures in equipment with the aid of specialized instruments. Maintenance is based on the condition of the equipment, which differentiates it from Preventive Maintenance.
- Equipment Monitoring Technique
- Just In Time Maintenance
- On-Condition Tasks
- Reliability-Based Maintenance
- Equipment Diagnostic Technique
- Condition-Based Maintenance
- On-Line Monitoring Equipment
PREDICTIVE MAINTENANCE DEFINED
A person is gifted with 5 senses : smell, touch, taste, hearing and sight. He can use these senses to detect problems on the equipment.
Condition-Based Monitoring checks the condition of equipment through the use of sophisticated, high-precision measuring instruments. Predictive Maintenance instruments are a higher form of the human senses.
CONDITION-BASED MAINTENANCE DEFINED
CBM tasks entail checking for potential failures, so that action can be taken to prevent the functional failure or to avoid the consequences of a functional failure.
P-F INTERVAL
When to use the CBM technique ?
POTENTIAL FAILURE :
Is defined as an identifiable physical condition which indicates that a functional failure is either about to occur or is in the process of occurring
FUNCTIONAL FAILURE :
Is defined as the inability of an item to meet a specific performance standard
P-F INTERVAL :
Is the interval between the emergence of the Potential Failure and its decay into a Functional Failure
DETERMINING POTENTIAL FAILURES
Predictive Maintenance aids us in determining the potential failure, or the symptoms that a piece of equipment is in the process of failing. A change or increase in any of the following can denote a potential failure. Specialized diagnostic instruments can aid in detecting the following :
- Heat or temperature
- Increase in noise
- Vibration
- Pressure change
- Flow rate change
- Lubricant contamination
- Wall thickness decrement
- Rate of corrosion
- Leak detection
- Crack detection
For electrical equipment we have :
- Changes in resistance
- Changes in conductivity
- Changes in dielectric strength
WHY PDM IS BETTER THAN PM ?
Preventive Maintenance : Overhauls are performed on a fixed interval, whether time-based or running hours
Predictive Maintenance : Overhauls are performed only if a potential failure is detected

Preventive Maintenance : Performed when the machine is stopped
Predictive Maintenance : Can be performed while the machine is running

Preventive Maintenance : Parts are replaced on a fixed interval, after reaching their specified time or running hours
Predictive Maintenance : Parts are only replaced if a specific potential failure is present; if nothing is wrong, then no replacement takes place

Preventive Maintenance : Parts are utilized based on the frequency of replacement; parts will be replaced even when good, to conform
Predictive Maintenance : More cost effective than preventive since the part is utilized for almost its entire life span

Preventive Maintenance : Possibility of replacing good parts
Predictive Maintenance : Parts with potential failures are replaced

Preventive Maintenance : Cannot detect the exact location of the problem
Predictive Maintenance : Infra-red cameras can detect the exact location of the temperature rise
PROACTIVE MAINTENANCE :
- Proactive Maintenance is about analyzing why failures occur so that their recurrence is finally eliminated, thereby extending the life of the part or component
- Proactive Maintenance is when maintenance or a cross-functional team analyzes the failure. Analytical techniques such as Root Cause Failure Analysis, FMEA, Kepner-Tregoe, P-M Analysis, Fault-Tree Analysis etc. are used to better understand why the failure occurred in the first place.
- In Preventive Maintenance we replace the part that we think is in the process of wearing out. Our thinking is that replacing the part will bring the equipment back to its original condition; we have not taken into account the need to analyze further why a certain part keeps on failing.
Troubleshooting is no longer an effective strategy. In today's competitive world, the analysts find real solutions . . . .
PROACTIVE MAINTENANCE :
REDESIGN or MODIFICATION
- Includes changing the specification of a component
- Adding a new item
- Replacing an entire machine with a different type
- Relocating a machine
- Change in process or procedure which affects operation
SAFETY & ENVIRONMENTAL ASPECTS
- Reduce the probability of the failure mode occurring to a level which is acceptable
- Replace the component with a stronger or more reliable one, making the failure no longer a threat to safety and the environment
PROACTIVE MAINTENANCE :
OPERATIONAL & NON-OPERATIONAL CONSEQUENCES
- Reduce the number of times the failure occurs
- Reduce or eliminate the consequences of a failure (for example through redundancy)
- Preventive tasks are not cost effective, hence the alternative solution is to redesign
FACTORS CONSIDERED IN REDESIGN :
1. Does the failure involve major operational consequences ?
2. Is the cost of scheduled or breakdown maintenance high ?
3. Are there specific costs which can be eliminated by the design change ?
4. Is the design free of harmful effects which could be generated afterwards ?
5. Is there an economic trade-off study on expected cost savings ?
6. Is the asset to stay in use for a long time rather than be decommissioned ?
IF YOUR ANSWER TO THESE QUESTIONS IS YES, THEN REDESIGN IS RECOMMENDED.
WORLD CLASS MAINTENANCE EXCELLENCE :
Is your company adopting Reliability-Centred Maintenance ?
Level 1 : Reactive Maintenance (10 - 15 %)
- Band-Aid Maintenance
- Breakdown Maintenance
- Run to Fail / Destruction
- No Scheduled Maintenance
Level 2 : Preventive Maintenance (20 - 30 %)
- Scheduled Overhauls
- Scheduled Discards
- Outage Schedules
- Time-Based Maintenance
- Stroke-Based / Running Hrs
- Scheduled and Fixed Intervals
Level 3 : Predictive Maintenance (40 - 50 %)
- Condition-Based Maintenance
- Use of Diagnostic Tools
- Specialized Equipment
- Predict Imminent Failure
- Early Alert / Detection
Level 4 : Proactive Maintenance (10 - 20 %)
- P-M Analysis
- Root Cause Failure Analysis
- Failure Mode & Effect Analysis
- Failure Analysis
Level 5 : Maintenance Prevention (5 % and more)
- Maintenance Free
- Plug and Play
- Longer Lifespan
MODULE 3
MAINTENANCE MATRIX,
KPIs and INDICES
MEAN TIME BETWEEN FAILURE
MTBF is a reliability engineering term that means the average amount of operating time between the occurrence of breakdowns that require repair.
MTBF simply means the average time between failures. It is based on historical data or estimated by vendors and is used as a benchmark for reliability.
MTBF = OPERATING TIME / NUMBER OF FAILURES
WHERE : OPERATING TIME = LOADING TIME - MACHINE RELATED DOWNTIME (MDT)
LOADING TIME = AVAILABLE TIME - NON-MACHINE RELATED DOWNTIME (NMDT)
COMPUTE THE MTBF IF THE BREAKDOWN OCCURRENCE (BDO) IS 6 :
AVAILABLE TIME = 168 hrs
NMDT = 40 hrs
MDT = 72 hrs (6x)
OPERATING TIME = AVAILABLE TIME - NMDT - MDT
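The exercise above can be worked through directly from the two definitions on the slide. The sketch below is one possible way to do the arithmetic; the function and parameter names are chosen here for illustration and are not taken from the course.

```python
def mtbf_hours(available_time, non_machine_downtime, machine_downtime, failures):
    """MTBF = operating time / number of failures, per the slide's definitions."""
    # Loading time = available time - non-machine related downtime (NMDT)
    loading_time = available_time - non_machine_downtime
    # Operating time = loading time - machine related downtime (MDT)
    operating_time = loading_time - machine_downtime
    return operating_time / failures


# Figures from the slide: 168 hrs available, 40 hrs NMDT, 72 hrs MDT, 6 breakdowns.
result = mtbf_hours(168, 40, 72, 6)
print(round(result, 2))  # 9.33
```

Loading time works out to 128 hrs and operating time to 56 hrs, so the machine runs on average about 9.33 hrs between breakdowns.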
MEAN TIME BETWEEN FAILURE
- The MTBF trend is : the higher the value, the more reliable the machine or part
- In a case where there is no breakdown or failure, an MTBF of infinity will be obtained. This does not indicate that there is anything wrong with the equation. Either prolong the period over which MTBF is measured or, when there is no failure, assume a denominator of 1 to obtain a value
- If we buy a component with an MTBF of 30,000 hours, it means that on average the part will run for 3.42 years without failure
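The two points above, the zero-failure case and the 30,000 figure, can be checked with a short sketch. It assumes the 30,000 figure is in hours and that the part runs continuously, 24 hours a day for 365 days a year; both are assumptions needed to arrive at 3.42 years. The helper names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, assuming continuous operation


def mtbf_in_years(mtbf_hours):
    """Convert an MTBF given in hours into years of continuous running."""
    return mtbf_hours / HOURS_PER_YEAR


def mtbf_with_fallback(operating_time, failures):
    """Follow the slide's convention: with no failures, use a denominator of 1."""
    return operating_time / max(failures, 1)


print(round(mtbf_in_years(30_000), 2))  # 3.42
print(mtbf_with_fallback(56, 0))        # 56.0
```

With the fallback, a period with no failures reports the whole operating time as the MTBF instead of dividing by zero.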
MTBF VARIATIONS
MTBF can be computed on the following basis :
MTBF BY CRITICAL COMPONENT
To determine on an average when a particular critical component will fail
MTBF BY SUB-ASSEMBLY
To determine which sub-assembly fails frequently on a machine
MTBF BY PROCESS OR LINE
To determine which equipment fails frequently
and identify the bottleneck area in a process
MTBF BY MACHINE
To determine the MTBF of a particular machine
MTBF BY GROUP OF MACHINES
To determine the machine w/ the lowest
MTBF and perform improvements
MEAN TIME TO FAILURE
MTBF is a key reliability metric for systems that can be repaired or restored. MTTF is the expected time to failure of a system. Non-repairable systems can fail only once; hence, for non-repairable items, MTTF is equivalent to the mean of the failure time distribution. Repairable systems can fail several times.
[Figure : timeline of a repairable part. The 1st failure occurs and the time to repair (MTTR) runs until point A, where a new part is installed. MTTF is the total time it takes for the new part to fail, running from A to point B, where the new part fails again (the 2nd failure). MTBF spans from the 1st failure to the 2nd failure.]
HENCE : MTBF = MTTR + MTTF
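The relation MTBF = MTTR + MTTF can be checked on a small timeline. The timestamps below are made-up values for illustration only : each failure is followed by a repair that ends when the new part is installed, and the new part then runs until the next failure.

```python
from statistics import mean

# Hypothetical clock readings, in hours, for one repairable part.
failure_times = [100.0, 210.0, 320.0]   # when each failure occurs
restored_times = [110.0, 220.0]         # when the new part is installed

# Time to repair: from a failure to the installation of the new part.
repair_times = [r - f for f, r in zip(failure_times, restored_times)]

# Time to failure: from installation of the new part to its next failure.
times_to_failure = [nxt - r for r, nxt in zip(restored_times, failure_times[1:])]

# Time between failures: from one failure to the next.
times_between = [b - a for a, b in zip(failure_times, failure_times[1:])]

mttr = mean(repair_times)
mttf = mean(times_to_failure)
mtbf = mean(times_between)

print(mttr, mttf, mtbf)  # 10.0 100.0 110.0
```

Here each failure-to-failure span is exactly one repair plus one run of the new part, so the averages add up as the slide states.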
WHEN TO USE MEAN TIME BETWEEN FAILURE
- When the rate of equipment breakdown or failure is high
- When we want to improve a design weakness in a critical component of a piece of equipment
- To determine the main contributor to why equipment keeps on failing (PARETO)
- To determine the frequency of replacement for parts which have symmetrical or linear failures; not recommended for parts that fail randomly (Patterns D, E and F)
- To compare 2 identical parts from different vendors
WHY MEASURE REPAIR TIME ?
- When a failure occurs, it is critical to restore the equipment as soon as possible. Typically much of the repair time is spent in determining the cause of the problem
- The traditional trend is to apply a fix and never get to the root cause
- Repairs should be performed in the shortest possible time, and our goal will be to put the equipment back in its operating state
- For failures that keep repeating over and over, the best strategy is to address the real root cause of the problem and prevent it from recurring
MTTR DEFINED
MTTR is defined as the total time required to repair the equipment divided by the Breakdown Occurrence
MTTR = Repair Time / Breakdown Occurrence
MACHINE DOWNTIME runs from the moment the MACHINE STOPS through the following steps :
1. Find a person who can repair it
2. Diagnose the fault
3. Find the spare parts
4. Repair the fault
5. Revalidate / test run the machine
6. Endorse the machine to the operator
[Figure : the repair time and MTTR are marked as spans within the machine downtime.]
Downtime means the total amount of time the asset would normally be out of service, from the time it fails until it is fully operational.
MTTR varies from one company to another; hence, there must be a clear understanding of what MTTR constitutes.
"When the system fails, and it will fail, how easy will it be to recover?"
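The MTTR formula above can be sketched with a few numbers. The six repair durations below are hypothetical; they were chosen so that they total 72 hrs over 6 breakdowns, matching the machine downtime figure used in the earlier MTBF exercise, on the assumption that all of that downtime counts as repair time.

```python
def mttr_hours(repair_times):
    """MTTR = total repair time / number of breakdown occurrences."""
    return sum(repair_times) / len(repair_times)


# Hypothetical repair durations in hours, one entry per breakdown.
repairs = [10, 14, 12, 8, 16, 12]

print(sum(repairs))         # 72
print(mttr_hours(repairs))  # 12.0
```

What counts as repair time (for example, whether waiting for spare parts is included) changes the result, which is why the slide asks for a clear agreement on what MTTR covers.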
MTTR DEFINED
- MTTR (Mean Time To Repair) is the average time required to repair a component
- Other terms used are Mean Time To Restore and Mean Time To Recover
- The MTTR trend is : the lower the value, or the shorter the time to repair, the better. Improving the MTTR means shortening the time to repair the machine
MTTR DEFINED
MTTR (Mean Time To Repair) is the average time required to perform corrective maintenance or repair on all of the removable items in a product or system. MTTR analyzes how long repairs & maintenance tasks will take in the event of a system failure
MTTR may be defined as the time it will take to bring a failed system back to its available or operating status again.
If an Ethernet card in your computer fails and it takes 3 hrs to purchase and install a new card, the MTTR for your computer will be 3 hrs, but the Ethernet card is still broken and may never be repaired, hence the MTTR for the Ethernet card is forever
UNDERSTANDING MTTR
A true and correct MTTR starts at the time of failure and continues until the system is operational again, regardless of whether a system part or component is available or not
MTTR is also difficult to estimate since one must consider a variety of repairs. An engine repair may range from tightening a drain plug bolt to overhauling an entire engine assembly
MTTR and MTBF are limited to consideration of predictable failures of parts or systems from operation-related causes. Equipment failures due to war, vehicle collision, fires, terrorism, lightning and sabotage are generally ignored
MTTR TO IMPROVE REPAIR TIME
MTTR can be used to track the level of skill of maintenance personnel and technicians in performing repairs, and to improve upon it
Example : monitoring the MTTR for a certain group composed of 20 people from the maintenance department, Bob is found to have the lowest MTTR when performing repairs. Therefore, we can define proper repair procedures based on Bob's practices that can be followed by other people, thereby avoiding trial & error. The goal is to improve the repair time achieved by the other maintenance craftspersons
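The comparison described above could be sketched like this. The repair records are invented for illustration (only the name Bob comes from the example), and `mttr_by_technician` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical repair records: (technician, repair hours for one job).
repairs = [
    ("Bob", 1.0), ("Rico", 2.5), ("Bob", 1.5),
    ("Uma", 2.0), ("Rico", 1.5), ("Uma", 3.0),
]

def mttr_by_technician(records):
    """Average repair time per technician."""
    hours = defaultdict(list)
    for technician, repair_hours in records:
        hours[technician].append(repair_hours)
    return {tech: sum(values) / len(values) for tech, values in hours.items()}

# Lowest MTTR first: the top entry is the candidate whose practices get documented.
ranking = sorted(mttr_by_technician(repairs).items(), key=lambda item: item[1])
for technician, value in ranking:
    print(f"{technician:5s} MTTR = {value:.2f} h")
```

Such a ranking is only meaningful when the technicians handle comparable jobs, since repairs can range from a minor adjustment to a full overhaul.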
Planned Maintenance Skills Evaluation : Gearing Towards A Pro-Active Maintenance System
Division : Central Equipment Engineering    Team name : The Untouchables    Equipment type handled : All Types
Station : PLCC    Department Leader : Sam Milby

Legend :
CLASS D : Knowledge & Skill not Satisfactory (0 points)
CLASS C : Knowledge Satisfactory (0.50 points)
CLASS B : Skill Satisfactory (0.75 points)
CLASS A : Knowledge and Skill both Satisfactory (1 point)

Columns : Classification, No., Knowledge / Skill Item, Training Attended (Yes / No), then one rating column per Planned Maintenance member : SAM, BOB, RICO, RACQUEL, CAS, SAY, UMA, NENE, FRANZIN, JB

BASIC MACHINE FUNCTION
 1  Basic Machine Function
 2  Machine Specs, Parts and Function
 3  Knowledge in Actual Set-up and Conversion
 4  Basic Lubrication Knowledge
 5  Basic Repair and Troubleshooting
ANALYTICAL SKILLS ENHANCEMENT
 8  Failure Mode and Effect Analysis
 9  Root Cause Failure Analysis
10  P-M Analysis
11  MTBA Snapshot and Analysis
12  Sequence Of Events Analysis
PNEUMATICS & HYDRAULICS
13  Knowledge and use of FRL's
14  Knowledge and use of Pipings and Connectors
15  Knowledge and use of Cylinders
16  Knowledge and use of Filtration
17  Knowledge and use of Speed Controllers
18  Leaks and Seals
OTHERS
19  Bearing Failures and Causes
20  Sensors Technology
21  Motors and Pumps
22  Screws and Fasteners
23  Spare Parts Management
24  RCM and OER Strategy
25  Maintenance Indices and Measurements
PREDICTIVE MAINTENANCE (Specialization)
26  Knowledge on Vibration Monitoring
27  Principles of Heat and Thermography
28  Oil Analysis and Tribology
29  Ultrasonic Monitoring
30  CMMS Structure and System
5-03 Total Points
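The Total Points row can be illustrated with a short sketch that applies the legend above (Class D = 0, C = 0.50, B = 0.75, A = 1). The ratings shown are hypothetical and cover only three of the items; they are not taken from the actual evaluation form.

```python
# Points per class, as given in the legend of the evaluation form.
CLASS_POINTS = {"D": 0.0, "C": 0.50, "B": 0.75, "A": 1.0}

# Hypothetical ratings for three of the items on the form.
ratings = {
    "SAM": {
        "Basic Lubrication Knowledge": "A",
        "Root Cause Failure Analysis": "B",
        "Oil Analysis and Tribology": "C",
    },
    "BOB": {
        "Basic Lubrication Knowledge": "A",
        "Root Cause Failure Analysis": "A",
        "Oil Analysis and Tribology": "D",
    },
}

def total_points(member_ratings):
    """Sum the class points over every rated knowledge / skill item."""
    return sum(CLASS_POINTS[cls] for cls in member_ratings.values())

totals = {member: total_points(items) for member, items in ratings.items()}
for member, points in totals.items():
    print(f"{member}: {points:.2f} points")
```

Items rated Class D or C point to where training should be scheduled for that member.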
Module 4
Understanding Root Cause Failure Analysis
Root Cause Analysis Defined :
Root Cause Failure Analysis is trying to UNDERSTAND
why something went wrong . . . . .
RCFA provides a methodology for investigating, categorizing and eliminating the root cause of incidents with safety, quality, reliability & manufacturing process consequences . . .
Root Cause Failure Analysis identifies the basic source or origin of the problem so that recurrence of the problem may be prevented
Identifying the root cause of a failure event allows us to explain the WHAT, HOW and WHY of the failure
Root Cause Analysis Defined :
Proper Root Cause Analysis identifies the basic source or the origin of the problem . . . .
The root cause analysis methodology provides a specific & solid foundation for preventing the recurrence of the problem or failure
Every system, spare or component failure happens for a reason. There is a specific succession of events that leads to a failure. RCFA follows the cause and effect path from the final failure back to its origin
Root cause analysis is a tool to better explain what happened, to determine how it happened and to better understand why it happened . . . . .
Root Cause Analysis Defined :
Root Cause Analysis separates the facts from hearsay. RCFA is not about trial and error and seeing what works and what does not
While there are many techniques for analyzing a problem which provide a quick answer, it does not mean that the answer is correct every time. A true and meaningful Root Cause Failure Analysis takes the time to prove that what we say is fact & supports our hypothesis with evidence before we spend our money to improve the design of the equipment
When the facts are backed up by evidence & science and are separated from the fiction, we have a better understanding of the real root cause of the problem
RCFA CASE STUDY : MISSING MONEY
CASE STUDY :
In the problem below, a car wash manufacturer sold one of his complete, turn-key car wash systems to a client in Maryland. This includes the change machines for the people who wish to get change to wash their cars. The new owner recognizes that he is losing a significant amount of money from this change machine and insinuates that the manufacturer's employees have a spare key and are stealing the money.
The problem started when the new owner complained to Bill that he was losing significant amounts of money from his coin machines each week. Bill just can't believe that his people were stealing the money since he has known them for many years
Bill then formed an RCFA team to get to the bottom of the problem
The group decided to install a surveillance camera to find out who was stealing the money
RCFA CASE STUDY : MISSING MONEY
Logic Tree Diagram
Missing Money (Money from the Change Machine was missing)
    Change Machine Malfunction (Not working properly)
    Money was stolen from the machine (There's a thief)
        Stolen by someone
        Stolen by something
    Money was never there (Customers not paying)
The video surveillance showed the customers entering the car wash; hence, the hypothesis that customers were not paying was disregarded
The owner tested the machine by placing some coins in it and the machine worked properly, so Change Machine Malfunction was not the problem. It is clear to them that someone is stealing the money, but who . . .
RCFA CASE STUDY : MISSING MONEY
But the RCFA group did not give up. They kept monitoring the surveillance camera and found out . . .
That's a bird sitting on the change slot of the machine, and it had to go down into the machine, but why ?
That's 3 quarters it has in its beak. Another amazing thing is that it was not just one bird but several of them
There goes another bird, this time taking only 1 quarter
Once they identified the thieves, they found over $ 4,000.00 on the roof of the car wash and more under a nearby tree. Therefore, the case of the stolen money was solved thanks to Root Cause Analysis . . .
Understanding Why-Why Analysis :
Level 1 : Kingdom is lost. Why is the kingdom lost ?
Level 2 : King is killed. Why is the king killed ?
Level 3 : King fell off the horse. Why did the king fall off the horse ?
Level 4 : Horseshoe comes off. Why did the horseshoe come off ?
Level 5 : 1 nail short on shoe. Why is it that one nail is short on the horseshoe ?
Level 6 : Shortage of nails. Why is there a shortage of nails ?
Level 7 : Prepare horses for battle. Why prepare horses for battle ?
If the city had been defended even after the king was dead, then it might not have been captured
If the king had not been killed, then the kingdom might not have been captured
If the horseshoe had not come off, the king might not have fallen to the ground and might not have been killed
If the king's horseshoe had all its nails, then it might not have come off at all
The groomsman might have prevented the king from riding the horse because of the missing nail and its implications
Understanding Why-Why Analysis :
The story is told that before an important battle a king sent his horse with a groomsman to the blacksmith for shoeing. But the blacksmith had used all the nails shoeing the knights' horses for battle and was one short. The groomsman tells the blacksmith to do as good a job as he can. But the blacksmith warns him that the missing nail may allow the shoe to come off. The king rides into battle not knowing of the missing horseshoe nail. In the midst of the battle he rides toward the enemy. As he approaches them the horseshoe comes off the horse's hoof, causing it to stumble, and the king falls to the ground. The enemy is quickly onto him and kills him. The king's troops see the death, give up the fight and retreat. The enemy surges onto the city and captures the kingdom. The kingdom is lost because of a missing horseshoe nail.
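The why-why chain in the story can be walked mechanically by asking "why" until no further cause is recorded. The sketch below is only an illustration: the mapping paraphrases the seven levels of the story, and `why_chain` is a name introduced here.

```python
# Each effect maps to the cause named in the story (Level 1 down to Level 7).
CAUSE_OF = {
    "Kingdom is lost": "King is killed",
    "King is killed": "King fell off the horse",
    "King fell off the horse": "Horseshoe comes off",
    "Horseshoe comes off": "One nail short on shoe",
    "One nail short on shoe": "Shortage of nails",
    "Shortage of nails": "Horses prepared for battle",
}

def why_chain(problem, cause_of):
    """Follow the recorded causes from the problem until no deeper cause is known."""
    chain = [problem]
    while chain[-1] in cause_of:
        chain.append(cause_of[chain[-1]])
    return chain

chain = why_chain("Kingdom is lost", CAUSE_OF)
for level, event in enumerate(chain, start=1):
    print(f"Level {level}: {event}")
```

The walk stops at the deepest cause that has been recorded, so the quality of the result depends on how far the team kept asking why.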
EXERCISES : Let's Determine The Sequence Of Events
Bearing Failure
Corrosion Present
Excessive Moisture
Seal was damaged
Leak in the seal
High Acidity Level
Lack of Lubricant

Bearing Failure
Lack of Lubricant
Corrosion Present
Excessive Moisture
Leak in the seal
Seal was damaged
High Acidity Level
Determine the problem and ask why to determine the sequence of events in these samples
Physical, Human and Latent Causes :
PROBLEM
PHYSICAL CAUSE (Layer 1)
How did the incident occur ? The physics of the incident. This usually explains how the failure occurred; for example, a bearing failed due to fatigue. This mostly explains the metallurgical factors behind why the failure occurred
HUMAN CAUSE (Layer 2)
What is the error committed that led to the physical cause ? Either someone did something wrong or did the wrong thing
LATENT CAUSE (Layer 3)
We ask what caused the person to commit this mistake. These are the management system weaknesses. These include training, policies, procedures & specifications. People make decisions based on these, and if the system is flawed, the decision will be in error and will be the triggering mechanism that causes the mechanical failure to occur
RCFA LOGIC TREE DIAGRAM
DESCRIBE THE FAILURE EVENT
DESCRIBE THE FAILURE MODE
HYPOTHESIS
VERIFY HYPOTHESIS
DETERMINE PHYSICAL ROOTS & VERIFY
DETERMINE HUMAN ROOTS & VERIFY
DETERMINE LATENT ROOTS & VERIFY
In RCFA, a Logic Tree is used to work through a failure
The failure event is placed on top, followed by all failure modes or possible causes of breakdowns
Each of the causes is a hypothesis that needs to be verified so that we understand which of the causes actually led to the problem
The next step consists of determining and verifying the physical roots, human roots and latent roots behind the failure. The final cause will always have to do with the latent cause of failures
Physical, Human and Latent Cause :
Problem : Cylinder does not operate smoothly
WHY 1 : Why is it that the cylinder does not operate smoothly ?
Strainer was clogged
WHY 2 : Why is the strainer clogged ?
Oil was dirty
WHY 3 : Why is the oil dirty ?
Dirt entered the tank
WHY 4 : Why did the dirt enter the tank ?
Upper plate in the tank had a hole and gap - Physical Cause
WHY 5 : Why was there a hole and gap in the tank ?
Repair error during maintenance work - Human Cause
WHY 6 : Why was there a repair error ?
No procedure to follow - Latent Cause
Evidence of dirt from Oil Analysis
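The cylinder example can be recorded as data so that the physical, human and latent layers are explicit. This is a minimal sketch; the `Why` record and the two helper functions are hypothetical names, and the rule that an analysis is complete only when a latent cause is reached follows the course's own statement about RCFA.

```python
from dataclasses import dataclass

@dataclass
class Why:
    question: str
    answer: str
    layer: str = ""  # "physical", "human", "latent", or "" for an intermediate step

whys = [
    Why("Why does the cylinder not operate smoothly?", "Strainer was clogged"),
    Why("Why is the strainer clogged?", "Oil was dirty"),
    Why("Why is the oil dirty?", "Dirt entered the tank"),
    Why("Why did the dirt enter the tank?",
        "Upper plate in the tank had a hole and gap", "physical"),
    Why("Why was there a hole and gap in the tank?",
        "Repair error during maintenance work", "human"),
    Why("Why was there a repair error?", "No procedure to follow", "latent"),
]

def root_causes(chain):
    """Collect the answers that were classified as physical, human or latent."""
    return {why.layer: why.answer for why in chain if why.layer}

def is_complete(chain):
    """Treat the analysis as complete only once a latent cause has been identified."""
    return "latent" in root_causes(chain)

for layer, answer in root_causes(whys).items():
    print(f"{layer:8s}: {answer}")
print("Analysis complete:", is_complete(whys))
```

Stopping at WHY 5 would leave the analysis at the human cause, and `is_complete` would report it as unfinished.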
ROOT CAUSE IS LIKE A ROADMAP
PROBLEM
Root Cause
In performing Root Cause Failure Analysis, we are interested to know the real cause of a particular failure by verifying each hypothesis until we reach the final cause of the failure . . . . .
WHAT SEPARATES RCFA FROM THE REST
PROBLEM SOLVING TOOLS
ISHIKAWA / FISHBONE
WHY-WHY ANALYSIS
BRAINSTORMING
PARETO ANALYSIS
FMEA / FMECA
FAULT TREE ANALYSIS
P-M ANALYSIS
PROCESS MAPPING
These techniques mostly conclude on the physical and human causes only
FAILURE ANALYSIS
IN-DEPTH ANALYSIS
RCFA
PHYSICAL CAUSE
HUMAN CAUSE
LATENT CAUSE
Root Cause Failure Analysis is always based upon pure evidence and takes the time to verify each failure mode to determine the real cause of the problem. RCFA only concludes once the latent cause has been identified
RCFA WORKSHOP 1 :
CASE STUDY :
A pump was declared failed since it was not discharging fluid at all. The pump failed due to a failure of the bearing. The maintenance team decided to perform a Root Cause Analysis on the failed bearing to determine the real cause of the problem and had the failed bearing analyzed in a metallurgical laboratory. Arrange the causes in sequence to determine the real root cause of the problem
Clues :
There are 6 or 7 levels in the logic tree
The metallurgical lab report indicates that the bearing failed due to fatigue, which is a type of wear
The last level (bottom part) will be the real root cause of the problem
INSTRUCTION :
Brainstorm and analyze the case study, rearrange the set of cards and prepare an RCFA Logic Tree Diagram
ANALYZING THE BEARING FAILURE LOGIC TREE
LEVEL 1
The pump may fail for a variety of reasons. In this case it is evident to maintenance that the cause of the pump failing to fulfill its function of discharging fluid is bearing failure.
What will maintenance do ?
A typical job of maintenance is to replace the bearing with a new one since the part had evidently failed, and production is up and running again. But the question is asked, did the problem go away ? No, it will recur at a given time
What will the engineers do ?
When we have our engineers take a look at the failed bearing, they look at the failure history and data of the pump and conclude that a different, more heavy duty type of bearing should be installed. We would then get a heavy duty bearing and install it with the new design, and again the question is asked, did the problem go away ?
ANALYZING THE BEARING FAILURE LOGIC TREE
Logic Tree Diagram
Pump Failure (No discharge at all) : Functional Failure
    LEVEL 1 (Failure Modes) : Bearing Failure, Valve Is Shut, Motor Burned Out
Let's analyze the failure of a pump
The pump failed since it is not discharging fluid at all
All causes are hypotheses and must be proven to exist
The motor was checked and it was working; therefore, motor burned out was disregarded
The valve was open; therefore, valve shut was disregarded
The bearing was analyzed and it was evident that there was bearing failure. We now ask why the bearing had failed
ANALYZING THE BEARING FAILURE LOGIC TREE
LEVEL 2 : DIRT/DEBRIS and WEAR
The bearing may fail for a variety of reasons, such as dirt entry or ingression which may have caused the accelerated wear of the bearing. All are probable causes and are still considered hypotheses. Hence, to distinguish the facts from hearsay, the bearing was sent to a metallurgical lab for further analysis to determine how the bearing failed to fulfill its function.
LEVEL 3 : WEAR DUE TO FATIGUE
The bearing was analyzed and reviewed by a metallurgist and the report concluded that there is strong evidence of FATIGUE. Now that the other probable causes have been eliminated, we ask ourselves : how can fatigue occur on the bearing ?
ANALYZING THE BEARING FAILURE LOGIC TREE
Logic Tree Diagram
Pump Failure (No discharge at all) : Functional Failure
    LEVEL 1 (Failure Modes) : Bearing Failure, Valve Is Shut, Motor Burned Out
        How ?
        LEVEL 2 : Dirt / Debris, Lack of Lubrication, Overloading, Wear
            LEVEL 3 : Adhesive, Abrasive, Erosive, Fatigue, Corrosive
            Have the bearing analyzed by a metallurgical lab to find out why it failed
Lubrication in the bearing was checked and found to be sufficient
Vibration monitoring shows there is no indication of overloading
The only possibilities left were Dirt/Debris and Wear, so the team decided to have the bearing tested in a metallurgical laboratory
ANALYZING THE BEARING FAILURE LOGIC TREE
LEVEL 4 : HIGH VIBRATION
In Level 4 of our analysis we ask ourselves : how can fatigue occur on the bearing ? We hypothesize that it can come from high vibration. We check our vibration monitoring records and we are certain that there is evidence of excessive vibration. Excessive amplitude in our vibration data supports our hypothesis that fatigue occurred on the bearing due to high or excessive vibration
LEVEL 5 : MISALIGNMENT
As we dig deeper into the root cause, again we hypothesize : how can we have excessive vibration ? The possibilities are that it can come from imbalance, resonance or misalignment
Again the vibration analyst checks his vibration records and finds that resonance and imbalance are not a major cause of the excessive vibration. We called the maintenance person who aligned the pump to align it again and we observed his practices. From our observation we are certain that he does not know how to align the pump properly
ANALYZING THE BEARING FAILURE LOGIC TREE
LEVEL 6 : NO PROCEDURE / NO TRAINING / IMPROPER TOOLS
We asked the mechanic if he had been trained in proper alignment and he said that he was never trained in how to align, and that there was no procedure for the alignment or for how frequently it should be performed
People often misalign because they were never trained in proper alignment practices, no procedure exists outlining alignment as a required practice with specifications, or the current alignment equipment we are using is worn out or inadequate for the application
THIS IS THE LATENT CAUSE
ANALYZING THE BEARING FAILURE LOGIC TREE
Logic Tree Diagram
Pump Failure (No discharge at all) : Functional Failure
LEVEL 1 (Failure Modes) : Bearing Failure, Valve Is Shut, Motor Burned Out
How ?
LEVEL 2 : Dirt / Debris, Lack of Lubrication, Overloading, Wear
LEVEL 3 : Adhesive, Abrasive, Erosive, Fatigue, Corrosive
Have the bearing analyzed by a metallurgical lab to find out why it failed
How ?
LEVEL 4 : High Vibration
How ?
LEVEL 5 : Imbalance, Misalignment, Resonance
LEVEL 6 : No Procedure, No Training, No Alignment Tools (Real Root Cause of the Problem)
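The logic tree above can be sketched as a small data structure in which each hypothesis is marked verified or not, and the root cause is found by following only verified branches. The `Node` class and `verified_paths` function are illustrative names, and the verified flags reflect one reading of the case narrative (the mechanic confirmed missing training and procedure; the alignment tools were not confirmed), which is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One hypothesis in the logic tree; verified=True means evidence supports it."""
    name: str
    verified: bool = False
    children: list = field(default_factory=list)

def verified_paths(node, trail=()):
    """Return every chain of verified hypotheses from this node down to a verified leaf."""
    if not node.verified:
        return []
    trail = trail + (node.name,)
    supported = [child for child in node.children if child.verified]
    if not supported:
        return [trail]
    paths = []
    for child in supported:
        paths.extend(verified_paths(child, trail))
    return paths

# Verified flags are an assumption based on the narrative of the case study.
latent = [Node("No Procedure", True), Node("No Training", True),
          Node("No Alignment Tools")]
vibration = Node("High Vibration", True, [
    Node("Imbalance"), Node("Resonance"), Node("Misalignment", True, latent)])
wear = Node("Wear", True, [
    Node("Adhesive"), Node("Abrasive"), Node("Erosive"), Node("Corrosive"),
    Node("Fatigue", True, [vibration])])
bearing = Node("Bearing Failure", True, [
    Node("Dirt / Debris"), Node("Lack of Lubrication"), Node("Overloading"), wear])
pump = Node("Pump Failure (No discharge at all)", True, [
    bearing, Node("Valve Is Shut"), Node("Motor Burned Out")])

paths = verified_paths(pump)
for path in paths:
    print(" -> ".join(path))
```

Hypotheses that were checked and disregarded, such as the shut valve or the burned-out motor, simply never appear in a returned path.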
WITHOUT RCFA WHAT DO THEY DO TO SOLVE THE PROBLEM
FROM A PREVENTIVE MAINTENANCE VIEWPOINT
Maintenance will merely change or replace the bearing. If this part fails frequently, then the boss makes sure that there is enough stock in the warehouse
FROM A PREDICTIVE MAINTENANCE VIEWPOINT
Our CBM group can warn operations of an impending failure brought about by excessive vibration in the pump. Although the failure is predicted, the problem still does not go away
FROM AN ENGINEERING VIEWPOINT
Modify or change the bearing to a more heavy duty one and put it in service. In short, we conclude at once to change out the bearings with a New Design
FROM TOP MANAGEMENT VIEWPOINT
We penalize the culprits and even threaten to cut off their 13th month pay if the same problem arises in the future, or get another guy that can do the job better.
FROM A CONTINUOUS IMPROVEMENT VIEWPOINT
Brainstorming teams gather together with the past history and performance data of the pump and see a variety of causes. However, they are not certain which is the real cause, so they all agree that it was due to the change in the lubricant
FROM AN OPERATIONS VIEWPOINT
Hold countless hours of meetings blaming maintenance for not doing their job
MODULE 5
LESSONS ON RELIABILITY
LESSON # 1 ON RELIABILITY
Focus must be on RELIABILITY & not cost, because if RELIABILITY starts to improve, COST will definitely go down. There will be times that focusing on COST will tend to hurt RELIABILITY; it cannot be the other way around. Having low cost maintenance is a consequence of good maintenance practice
The goal of any maintenance is to improve the equipment's reliability. Once reliability starts to improve, cost goes down & it's not the other way around. Cutting cost on maintenance will definitely not improve reliability.
Reducing cost has been a focus for most maintenance managers, and perhaps we need to learn from the lessons of history. Cost must be studied thoroughly, based not just on the initial cost but on the entire life cycle cost of the equipment . . . . .
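The difference between initial cost and life cycle cost can be shown with a small worked sketch. All figures are hypothetical, the cost model (purchase price plus yearly maintenance and downtime cost, with no discounting) is a deliberate simplification, and `life_cycle_cost` is a name introduced here.

```python
def life_cycle_cost(purchase, annual_maintenance, annual_downtime_cost, years):
    """Simplified life cycle cost: purchase price plus yearly running costs, undiscounted."""
    return purchase + years * (annual_maintenance + annual_downtime_cost)

# Hypothetical comparison over a 10 year service life.
cheap = life_cycle_cost(purchase=10_000, annual_maintenance=3_000,
                        annual_downtime_cost=5_000, years=10)
reliable = life_cycle_cost(purchase=25_000, annual_maintenance=1_500,
                           annual_downtime_cost=1_000, years=10)

print(f"Low initial cost option : {cheap:,}")
print(f"More reliable option    : {reliable:,}")
```

With these assumed figures, the option that costs more to buy is the cheaper one over the life of the equipment, because fewer failures mean lower maintenance and downtime cost every year.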
LESSON # 2 ON RELIABILITY
When we get really good at doing things, then something is wrong, because we are doing them too often. When we expect a different result from the same tasks we are doing, this is simply not possible; the Chinese called this INSANITY . . . . .
Never ever accept failures in your plant. Troubleshooting is no longer an effective strategy. In today's competitive world, the analyst finds real solutions to the problems
The new paradigm is that FAILURES MUST NOT BE ACCEPTED. They can be eliminated if we know the right tools to address them. The true job of maintenance is to eliminate failures & not to fix them all the time . . . . .
LESSON # 3 ON RELIABILITY
The best time to address a problem is when it is small. It is very hard to advance to any form of specialized maintenance activities and improvement efforts if the equipment's Basic Condition has not been well established. Always remember that our equipment is a shared responsibility of both operators & maintenance people, a lesson we must all learn from the Japanese.
Performing maintenance on the equipment is not the sole responsibility of the maintenance department; this should be a shared responsibility of operations and maintenance . . . . .
LESSON # 4 ON RELIABILITY
In a REACTIVE ENVIRONMENT, we always complain that we lack manpower resources to address failures, but once equipment starts to improve we always wonder where they have been in the first place . . .
In reality maintenance is not outnumbered, they are just too busy working with breakdowns. Maintenance is not measured by how fast we repair but by how well we are able to eliminate the failure itself
LESSON # 5 ON RELIABILITY
Every failure has a specific set of consequences. Being PROACTIVE has to do with reducing the consequences of failure to a minimum, or eliminating them, rather than completely eliminating the failure itself . . . .
The best maintenance strategy to adopt will always have to be based upon the consequences of the failure itself
The first thing to ask in the event of a failure is : what are the consequences of the failure if it occurs on its own, and will the failure be acceptable to the user or not . . . .
LESSON # 6 ON RELIABILITY
A question on why industry remains reactive may lead to a thousand reasons or more, & those who fear that improving reliability may lead to the elimination of jobs are right only to the point where they resist change. Increasing reliability is not achieved by cutting manpower, nor are the two contrasting goals. Increasing reliability means slowly getting out of the repair business so that new doors will open to the maintenance function
The best positions in industry always belong to the maintenance function; however, most industries groom their people to be mechanics rather than maintenance professionals. Always be proud that you belong to the maintenance function . . . .
POSITIONS ON MAINTENANCE
Maintenance Positions :
Spare Parts Manager
Tribologist
Fractographer
Reliability Expert
Technical Trainer
Ultrasonic Analyst
Thermographer
Vibration Analyst
CMMS Specialists
Failure Analyst
Oil / Lube Analyst
Preventive Maintenance
LESSON # 7 ON RELIABILITY
The real mission of the maintenance department is to provide reliable physical assets & excellent support for its customers by reducing and eliminating the need for maintenance. Do not treat maintenance as synonymous with repair; these 2 are entirely different.
The distinction between a true-blooded maintenance person & a mechanic is that a maintenance person uses more of his brain than his hands, while a mechanic uses his hands much of the time. Let us treat our people as maintenance & not as mere mechanics
LESSON # 8 ON RELIABILITY
There is no silver bullet program or strategy that can transform a plant's reliability overnight. All will start with its basic foundation, and that is EDUCATION, the most powerful weapon to change the mindset of our people
Reliability is not a program with an end but a culture without an end; it's the same as any continuous improvement philosophy . . . .