Systems Prognostic Health Management April 1, 2008


Christopher Thompson, IBM Global Business Services

FCS LDMS Program

Systems Engineering Program

Disclaimer: This briefing is unclassified and contains no proprietary information. Any views expressed by the author are his, and in no way represent those of Lockheed Martin Corporation.

2

My Engineering Experience

IBM Global Business Services, Dallas TX
Requirements Lead/Prognostics SME - FCS Logistics Data Management Service (LDMS)

Lockheed Martin Missiles and Fire Control, Dallas TX
Senior Systems Engineer - Multifunction Utility/Logistics Equipment (MULE)

Lockheed Martin Aeronautics, Fort Worth TX
Vehicle Systems - Prognostic Health Management - F-35 Joint Strike Fighter (Lightning II)

Lockheed Martin Missiles and Fire Control, Dallas TX
Reliability Engineer - Army Tactical Missile System (TACMS)

SMU School of Engineering, Dallas TX
TA for Dr. Stracener

3

My Education

B.S. in Electrical Engineering, SMU (1997)

M.S. in Mechanical Engineering, SMU (2001) - Major: Fatigue/Fracture Mechanics

M.S. in Systems Engineering (2002) - Major: Reliability, Statistical Analysis

Ph.D. in Applied Science (anticipated ~ 2009) - Major: Systems Engineering (PHM)

4

My Dissertation

Fleet Based Analysis of Mission Equipment Sensor Configuration and Coverage Optimization for Systems Prognostic Health Management

5

Sensor Tradeoffs

As more sensors are added to your system:

GOOD
- AO increases
- P(FDI) increases
- P(Prog) increases
- MTBUMA increases
- MTTR decreases
- Life cost decreases

BAD
- weight increases
- volume increases
- power increases
- cabling increases
- AUPC increases
- R(t) decreases

TRADEOFFS

6

PHM Optimization

[Figure: Operational Availability (AO) and Cost* ($: AUPC, LCC) plotted against the number of sensors N, starting at 0, with the optimum marked.]

* Other metrics will include Weight/Volume, Power (K/W), Specants (Computing Power)

7

PHM Optimization

[Figure: Probability of Detection of Crack in a structural element, plotted against x = mean distance between sensors (from 0 to X); equivalently N = # of sensors, running from ∞ down to 0. The optimum solution is marked.]

8

For a common LRU (such as an engine), plotting engine power against an environmental measure (such as temperature) over time:

[Figure: Engine Power and Engine Internal Temp. versus Time, with regions labeled No Damage, Mild Damage, Moderate Damage, and Severe Damage, and lines at the 100%, 125%, 150%, and 200% spec limits.]

9

Estimating the damage accumulated (or life consumed)

[Figure: damage weighting by region: Mild Damage (x1), Moderate Damage (x2), Severe Damage (x4 or more), with the 150% and 200% specification limits marked.]
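As a rough illustration of the weighting idea on this slide, the sketch below sums weighted exposure time over a series of load samples. The x1/x2/x4 multipliers come from the slide; the band edges (which percentage of the spec limit starts each band), the sampling interval, and the sample history are assumptions made only for the example.

```python
# Assumed band edges (percent of spec limit) paired with the slide's multipliers.
# The slide gives x1, x2, and x4 (or more); the edges chosen here are hypothetical.
BANDS = [
    (200.0, 4.0),  # severe damage: above 200% of spec
    (150.0, 2.0),  # moderate damage: above 150% of spec
    (100.0, 1.0),  # mild damage: above 100% of spec (assumed lower edge)
]


def damage_weight(load_pct):
    """Return the damage multiplier for one load sample (percent of spec limit)."""
    for lower_edge, weight in BANDS:
        if load_pct > lower_edge:
            return weight
    return 0.0  # at or below the spec limit: treated as no damage


def life_consumed(samples, dt_hours):
    """Weighted exposure: each sample counts dt_hours times its damage multiplier."""
    return sum(damage_weight(p) * dt_hours for p in samples)


# Hypothetical load history, one sample every half hour.
history = [90, 110, 160, 210, 95]
print(life_consumed(history, dt_hours=0.5))
```

A real model would calibrate both the band edges and the multipliers against failure data; the point here is only that time spent above a limit counts for more than clock time.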

Hypothetical engine air/oil/fuel filter performance over its life

[Figure: filter performance (flow rate) versus filter life (in miles). Performance bands: optimal performance, acceptable performance, degraded performance, hazardous performance. Also marked: engine system damage likely, engine system failure likely, the MTBF, and the distribution of failure times.]

11

Common LRU used on multiple vehicle types with platform specific (hidden) failure modes

Why is the Failure Rate for the LRU in Platform 4 higher?

What is different about Platform 4?

[Figure: Failure Rate for Platform 1, Platform 2, Platform 3, and Platform 4, with a statistically significant difference marked.]
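One way to check whether a difference like this is statistically significant is sketched below. It is not taken from the slides: it assumes failures arrive as a Poisson process, so that under equal rates, and conditional on the total failure count, the target platform's count is binomial with probability equal to its share of the operating hours. All counts and hours are made up.

```python
from math import comb


def rate_excess_p_value(k_target, t_target, k_rest, t_rest):
    """One-sided p-value for 'the target platform has a higher failure rate'.

    Under equal Poisson rates, conditional on n = k_target + k_rest failures,
    the target's count is Binomial(n, t_target / (t_target + t_rest)).
    The p-value is the probability of seeing k_target or more failures.
    """
    n = k_target + k_rest
    p = t_target / (t_target + t_rest)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_target, n + 1))


# Hypothetical fleet data: (failures, operating hours) for the same LRU.
platforms_1_to_3 = (12, 3000.0)  # pooled
platform_4 = (12, 1000.0)

p_value = rate_excess_p_value(platform_4[0], platform_4[1], *platforms_1_to_3)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value only says the difference is unlikely to be chance. Finding what is different about the platform (mounting, duty cycle, environment) is still an engineering question.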

12

Standard oil filter used in engines across FCS vehicles, replaced at a scheduled time/miles

[Figure: Actual Condition of the oil filter (Wear-out / Life Consumption) versus Time for Vehicle 1, Vehicle 2, and Vehicle 3, with the Scheduled Replacement marked. Outcomes shown: wasted filter life, correct action, and increased engine life consumption.]

13

Standard structural element across several vehicles (under cyclic loading)

[Figure: Damage Accumulation versus Time (or miles, or load cycles, or on/off cycles, …) for several LRU life histories, with the fleet-based estimate marked and some LRUs showing repair needed before the estimate.]

14

The MULE Program

Future Combat Systems Multifunction Utility/Logistics Equipment

15

Keys to the Success of FCS

• Reducing Logistics footprint
• Increasing Availability
• Reducing Total Cost of Ownership
• Implementing Performance Based Logistics
• Improvements in the ‘ilities’ (RAM-T)
    – Reliability
    – Availability
    – Maintainability
    – Testability
    – Supportability

16

Prognostics

Of or relating to prediction; a sign of a future happening; a portent.

The process of calculating an estimate of remaining useful life for a component, within sufficient time to repair or replace it before failure occurs.
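A minimal sketch of that definition, assuming a single health indicator that grows roughly linearly toward a known failure threshold: fit a least-squares line to the history, project when it crosses the threshold, and compare the remaining life with the lead time needed to repair or replace. The linear trend, the threshold, the lead time, and the sample data are all assumptions; real PHM models are usually more elaborate.

```python
def remaining_useful_life(times, values, failure_threshold):
    """Estimate time from the last sample until a fitted line reaches the threshold.

    Assumes the indicator (e.g. wear) grows toward the threshold. Returns None
    when the fitted trend is flat or decreasing, since no crossing can be projected.
    """
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    if slope <= 0:
        return None
    t_cross = (failure_threshold - intercept) / slope
    # A crossing time in the past means the fit is already at or past the threshold.
    return max(t_cross - times[-1], 0.0)


# Hypothetical wear measurements (mm) at operating hours.
hours = [0, 100, 200, 300]
wear = [1.0, 2.0, 3.0, 4.0]
rul = remaining_useful_life(hours, wear, failure_threshold=10.0)

lead_time_hours = 72  # assumed time needed to get a part and schedule the repair
print(f"estimated RUL: {rul:.0f} h, enough warning: {rul > lead_time_hours}")
```

The comparison with the lead time is what separates a prognostic from a plain trend: the estimate is only useful if it arrives early enough to act on.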

17

Prognostic Health Management (PHM)

PHM is the integrated system of sensors which:
• Monitors system health, status and performance
• Tracks system consumables
    oil, batteries, filters, ammunition, fuel…
• Tracks system configuration
    software versions, component life history…
• Isolates faults/failures to their root causes
• Calculates remaining life of components

18

Diagnostics

The identification of a fault or failure condition of an element, component, sub-system or system, combined with the deduction of the lowest measurable cause of that condition through confirmation, localization, and isolation.

• Confirmation is the process of validation that a failure/fault has occurred, the filtering of false alarms, and assessment of intermittent behavior.

• Localization is the process of restricting a failure to a subset of possible causes.

• Isolation is the process of identifying a specific cause of failure, down to the smallest possible ambiguity group.
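The localization and isolation steps can be sketched with a simple fault-signature table, where each candidate fault lists the built-in tests it would cause to fail. The fault names, test names, and the exact-match rule are hypothetical, chosen only to show how an ambiguity group arises when two faults share a signature.

```python
# Hypothetical fault signatures: the set of tests each fault would cause to fail.
SIGNATURES = {
    "pump bearing wear": {"vibration", "flow"},
    "clogged filter": {"flow", "pressure"},
    "pressure sensor drift": {"pressure"},
    "loose connector": {"pressure"},
}


def localize(failed_tests):
    """Localization: keep every fault that could explain at least one failed test."""
    failed = set(failed_tests)
    return sorted(name for name, sig in SIGNATURES.items() if sig & failed)


def isolate(failed_tests):
    """Isolation: keep only faults whose signature matches the failed tests exactly.

    The returned list is the ambiguity group; a single entry means the cause
    has been isolated as far as these tests allow.
    """
    failed = set(failed_tests)
    return sorted(name for name, sig in SIGNATURES.items() if sig == failed)


print(localize({"pressure"}))          # three candidate causes remain
print(isolate({"pressure"}))           # ambiguity group of two
print(isolate({"flow", "pressure"}))   # isolated to one cause
```

Shrinking an ambiguity group like the two-fault one above generally needs an additional test or sensor that responds differently to the two faults.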

19

Faults and Failures

Fault: A condition that reduces an element’s ability to perform its required function at desired levels, or degrades performance.

Failure: The inability of a component, sub-system or system to perform its intended function. Failure may be the result of one or more faults.

Failure Cascade: The result when a failure occurs in a system where the successful operation of a component depends on a preceding component, so that a failure can trigger the failure of successive parts and amplify the result or impact.

20

Classes of Failures

Design Failures: These take place due to inherent errors or flaws in the system design.

Infant Mortality Failures: These cause newly manufactured systems to fail, and can generally be attributed to errors in the manufacturing process, or poor material quality control.

Random Failures: These can occur at any time during the entire life of a system. Electrical systems are more likely to fail in this manner.

Wear-Out Failures: As a system ages, degradation will cause systems to fail. Mechanical systems are more likely to fail in this manner.
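Three of these classes are often illustrated with the Weibull hazard rate, whose shape parameter sets whether the failure rate falls, stays constant, or rises with age. This is a standard reliability model rather than something stated on the slide, and the shape and scale values below are arbitrary.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


# Arbitrary illustrative parameters (scale in operating hours).
cases = {
    "infant mortality (shape < 1, falling rate)": 0.5,
    "random (shape = 1, constant rate)": 1.0,
    "wear-out (shape > 1, rising rate)": 3.0,
}

for label, shape in cases.items():
    early = weibull_hazard(10.0, shape, scale=1000.0)
    late = weibull_hazard(900.0, shape, scale=1000.0)
    print(f"{label}: h(10 h) = {early:.2e}, h(900 h) = {late:.2e}")
```

Design failures do not map onto a shape value in this picture, since they stem from the design itself rather than from age.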

21

The Ultimate Goal of Prognostics

The aim of Prognostics is to maximize system availability and life consumption while minimizing Logistical Downtime and Mean Time To Repair, by predicting failures before they occur. This is a notional diagram indicative of a wear-out failure.

22

What is PHM?

Prognostic Health Management (PHM) is the integrated hardware and software system which:
• Monitors system health, status and performance
• Tracks system consumables
    oil, batteries, filters, ammunition, fuel…
• Tracks system configuration
    software versions, component life history…
• Diagnoses/Isolates faults/failures to their root causes
• Calculates remaining life of components
• Predicts failures before they occur
• Continually updates predictive models with failure data

23

What is PHM?

Prognostic Health Management is a methodology for establishing system status and health, and projecting remaining life and future operational condition, by comparing sensor-based operational parameters to threshold values within knowledge base models. These PHM models utilize predictive diagnostics, fault isolation and corroboration algorithms, and knowledge of the operational history of the system. They allow users to make appropriate decisions about maintenance actions based on system health, logistics and supportability concerns, and operational demands, in order to optimize such characteristics as availability or operational cost.

24

PHM Stakeholders

[Figure: stakeholder disciplines and the PHM activities they contribute to.]

Disciplines:
SYSTEMS ENGINEERING
SOFTWARE & SIMULATION
TEST ENGINEERING
MECHANICAL ENGINEERING
ELECTRICAL ENGINEERING
TRAINING & PROD. SUPP.

Activities:
PHM Model Design
Interface Management
Requirements Development
Sensor Optimization
CAIV/WAIV Analysis
Prognostic Trending
System Architecture
PHM Model Integration
Software Interfaces
Fault/Failure Simulation
Continuous BIT/PHM
Test Planning
Fault/Failure Criticality
Fault/Failure Propagation
Fault/Failure Simulation
Platform Integration
Crack Growth Sensing
Stress/Strain Sensing
Corrosion Sensing
Vibration Sensing
Consumables Monitoring
Acoustic Sensing
Thermal Sensing
Sensor Implementation
Sensor Integration
Data Management
Data Architecture
Reliability/Failure Modes
Maintainability & Testability
Logistics & Sustainment
Training
Safety

25

PHM Design Methodology

26


27


28


29


30

Availability Analysis

• Availability, Achieved

$$A_A = \frac{\text{Time Up}}{\text{Time Up} + \text{Time Down}} = \frac{MTBF}{MTBF + MTTR}$$

where

MTBF = Mean Time Between Failure

MTTR = Mean Time To Repair

31

• Availability, Operational

$$A_O = \frac{\text{Time Up}}{\text{Time Up} + \text{Time Down}} = \frac{MTBUMA}{MTBUMA + ALDT + MTTR}$$

where

MTBUMA = Mean Time Between Unscheduled Maintenance Actions

ALDT = Administrative Logistical Down Time

MTTR = Mean Time To Repair
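A short numeric sketch of the achieved and operational availability ratios, each taken as up time over up time plus down time. The function names and the input values (in hours) are made up for illustration.

```python
def achieved_availability(mtbf, mttr):
    """A_A = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


def operational_availability(mtbuma, aldt, mttr):
    """A_O = MTBUMA / (MTBUMA + ALDT + MTTR)."""
    return mtbuma / (mtbuma + aldt + mttr)


# Hypothetical values, all in hours.
mtbf, mtbuma, aldt, mttr = 500.0, 300.0, 24.0, 2.0

print(f"A_A = {achieved_availability(mtbf, mttr):.4f}")
print(f"A_O = {operational_availability(mtbuma, aldt, mttr):.4f}")
```

With numbers like these, the administrative and logistical delay dominates the down time, which is why it appears as a separate term in the operational figure.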


32

• MTBUMA = Mean Time Between Unscheduled Maintenance Actions

$$MTBUMA = \frac{1}{\dfrac{1}{MTBF} + \dfrac{1}{MTBM_{\text{induced}}} + \dfrac{1}{MTBM_{\text{no defect}}}}$$

where

MTBF = Mean Time Between Failures

MTBM = Mean Time Between Maintenance
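The MTBUMA relation combines three event rates (inherent failures, induced maintenance, and no-defect maintenance) by adding their reciprocals. A minimal sketch with made-up values in hours:

```python
def mtbuma(mtbf, mtbm_induced, mtbm_no_defect):
    """MTBUMA = 1 / (1/MTBF + 1/MTBM_induced + 1/MTBM_no_defect)."""
    return 1.0 / (1.0 / mtbf + 1.0 / mtbm_induced + 1.0 / mtbm_no_defect)


# Hypothetical values, all in hours.
baseline = mtbuma(mtbf=500.0, mtbm_induced=2000.0, mtbm_no_defect=1000.0)

# Halving the no-defect removal rate (doubling its MTBM), e.g. by cutting false alarms.
fewer_false_alarms = mtbuma(mtbf=500.0, mtbm_induced=2000.0, mtbm_no_defect=2000.0)

print(f"baseline MTBUMA: {baseline:.1f} h")
print(f"with fewer no-defect removals: {fewer_false_alarms:.1f} h")
```

Because the rates add, MTBUMA is always shorter than the shortest of the three intervals, and raising any one of them raises MTBUMA.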


33

• How can we improve AO?

- By decreasing Administrative & Logistical Down Time (ALDT)

- By increasing Mean Time Between Failures (MTBF)

- By decreasing Mean Time To Repair (MTTR)

- By increasing Mean Time Between Unscheduled Maintenance Actions (MTBUMA) [by increasing MTBM induced and MTBM no defect]


34

• How can we decrease ALDT?

- By improving Logistics
    Improve scheduling of inspections
    Improve commonality of parts
    Decrease time to get replacements

- By improving Prognostics
    Replace parts before they fail, not after
    Maximize use of component life
    Improve off-board prognostics trending
    More sensors!!


35

• How can we increase MTBF?

- By improving Reliability
    Select more rugged components
    Improve life screening and testing
    Improve thermal management

- By improving Quality
    Better parts screening
    Better manufacturing processes

- By adding Redundancy
    At the cost of Size, Weight and Power!


36

• How can we decrease MTTR?

- By improving Maintainability
    Improve quality and efficacy of training
    Simplify fault isolation
    Decrease number of tools and special equipment
    Decrease access time (panels, connectors…)
    Improve Preventative Maintenance

- By improving Diagnostics
    Improve BIT and BITE
    Decrease ambiguity group size
    Improve maintenance manuals and training


37

• How can we increase MTBM (induced/no defect)?

- By improving Safety
    Limit the potential for accidental damage

- By improving Prognostics
    Improve PHM models to monitor induced damage

- By improving Diagnostics
    Lower the false alarm rate
    Don’t repair/replace things which aren’t broken!

