Why HALT Won't Give You an MTBF (And Why You Shouldn't Care)

Post on 24-Apr-2015

description

Describes the concepts of Highly Accelerated Life Test (HALT) and Mean Time Between Failures (MTBF) and why each is not related to the other.

Why HALT Won’t Give You An MTBF And Why You Shouldn’t Care

Mark L. Morelli

Reliability & Test Engineer (Retired)

Hobbs Engineering Webinar June 4, 2014

Author’s Bio

• Recently retired after 32-year career in reliability and test engineering
  – Aerospace, military, and commercial building systems
  – Applied HALT to > 200 products (> 500 separate testing activities)
  – Applied HASS to ~ 10 product lines (many thousands of units tested)
• BSEE, Univ. of Hartford
• Adjunct Professor, Univ. of Hartford
• Authored and presented numerous technical papers
• Presently a freelance writer for The Motley Fool (finance site) with a focus on technology and innovation

Agenda

• Discussion of MTBF

• Why HALT doesn’t provide an MTBF value

• Why HALT will improve reliability

Mean Time Between Failures (MTBF)

• MTBF = mean elapsed time between inherent failures
  – Assumes a renewal process (system repaired upon failure)
• Does not provide a failure distribution or pattern
• Most prediction methods (e.g. MIL-HDBK-217) use “old” data

MTBF = Cumulative Fleet Operating Time ÷ Number of Failures

MTBF is a time-based parameter, but it does not account for product failure distributions or patterns
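
As a minimal sketch of the fleet formula above, the following Python function divides cumulative operating hours by the failure count. The function name `fleet_mtbf` and the example fleet (10 units at 2,000 hours each, 4 failures) are illustrative assumptions, not data from the presentation.

```python
def fleet_mtbf(unit_hours, failures):
    """Return MTBF in hours: cumulative fleet operating time / number of failures."""
    if failures <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return sum(unit_hours) / failures


# Hypothetical fleet: 10 units with 2,000 operating hours each, 4 failures in total.
hours_per_unit = [2000] * 10
print(fleet_mtbf(hours_per_unit, 4))  # 5000.0
```

The calculation uses only the total hours and the failure count, so it returns the same value no matter when in those hours the failures occurred.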

MTBF and failure distributions

T (hrs) | System 1 failures | System 2 failures | System 3 failures

50 x

100 X (2)

200 x

500

1000

2000 x

5000

10000 x

15000 x X (2)

20000 x X (2)

All systems have same MTBF = 20,000 ÷ 4 = 5,000 hours
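
The point of the table can be checked numerically. The sketch below assumes one plausible reading of the flattened table, since the transcript does not preserve which marks belong to which system: one system fails early (50, 100 twice, 200 hours), one fails at scattered times (2,000, 10,000, 15,000, 20,000 hours), and one fails late (15,000 twice, 20,000 twice). The labels `early`, `spread`, and `late`, the helper names, and the 500-hour cutoff are all illustrative assumptions.

```python
OBSERVATION_HOURS = 20_000  # each system is observed for 20,000 hours

# Assumed failure times in hours; "X (2)" in the table is read as two failures.
systems = {
    "early": [50, 100, 100, 200],
    "spread": [2_000, 10_000, 15_000, 20_000],
    "late": [15_000, 15_000, 20_000, 20_000],
}


def mtbf(failure_times, observed_hours=OBSERVATION_HOURS):
    """MTBF = observed operating time / number of failures."""
    return observed_hours / len(failure_times)


def failures_before(failure_times, cutoff_hours):
    """Count failures that occurred at or before cutoff_hours."""
    return sum(1 for t in failure_times if t <= cutoff_hours)


for name, times in systems.items():
    print(name, mtbf(times), failures_before(times, 500), min(times))
# early 5000.0 4 50
# spread 5000.0 0 2000
# late 5000.0 0 15000
```

Under this reading, all three systems report 5,000 hours, yet one has had every failure within the first 500 hours and another has had none before 15,000 hours.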

MTBF and failure distribution

Why HALT ≠ MTBF

• HALT is an exploratory process used on electronics that seeks to identify weaknesses through application of increasing and varying stress
  – Difficult to correlate to a precise time period, but the process does identify most failure types that occur during the product life cycle
  – Difficult to calculate an acceleration factor between test and deployment
• Typically a small number of test articles are used, and not every sample will have all stresses (temperature, vibration, electrical, etc.) applied
  – Nearly impossible to accurately measure reliability with low sample sizes
  – My experience indicates that many (if not most) product failures are due to lot-related part defects or process variations that cannot be found in a single test at a single point in time

HALT = stress-based tool that addresses typical product failure types
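
To put a number on the low-sample-size point, here is a minimal sketch using the standard zero-failure (success-run) binomial bound, which is a swapped-in textbook technique and not a method from the presentation: if n units all pass a test, the reliability demonstrated at confidence C is R = (1 - C)^(1/n). HALT deliberately drives units to failure, so this is only a best-case comparison. The function names and sample sizes are illustrative assumptions.

```python
import math


def demonstrated_reliability(n_units, confidence):
    """Lower bound on reliability when n_units are tested with zero failures."""
    if n_units < 1 or not 0.0 < confidence < 1.0:
        raise ValueError("need n_units >= 1 and 0 < confidence < 1")
    return (1.0 - confidence) ** (1.0 / n_units)


def units_needed(reliability, confidence):
    """Smallest zero-failure sample size that demonstrates the given reliability."""
    if not 0.0 < reliability < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("reliability and confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


# Sample sizes chosen for illustration; 3 to 5 units is a hypothetical small test.
for n in (3, 5, 22):
    print(n, round(demonstrated_reliability(n, 0.90), 3))
print(units_needed(0.90, 0.90))
```

With five units and no failures, the bound at 90% confidence is only about 0.63, and demonstrating 0.90 reliability at that confidence takes 22 failure-free units.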

Example: Capacitor failures

[Chart: number of failures (vertical axis, 0 to 16) by month from Aug-93 to Dec-96, with two series: When Produced and When Failed]

Design testing completed Jan 1994: No capacitor failures

What HALT does do

• Makes product more robust
  – Can withstand factory and field environments that are sometimes unknown
• Allows development of ongoing reliability tests (ORT), including a production screening regimen (e.g. HASS)
  – Not all failures are design-related; some creep into the product over time

Ongoing reliability testing

• Needed to account for issues creeping into product over time
  – Lot-related (e.g. capacitor) problems
  – Process (e.g. solder) variation
• Periodic re-HALT
• Production screening
  – HASS

Summary

• MTBF is a time-based parameter that is unrelated to a failure distribution

• HALT is a stress-based tool that is related to failure patterns (and distributions)

Contact information

Mark L. Morelli
mathman6577@gmail.com
Twitter: @mathman6577
LinkedIn: Mark Morelli (Greater NYC area)