
Electronic Part Failure Analysis Tools and Techniques

Walter Willing

Jonathan Fleisher

Michael Cascio

Northrop Grumman Corporation

7323 Aviation Blvd,

Baltimore, MD, 21090, USA

e-mail: walter.willing@ngc.com




The current emphasis on Physics of Failure (PoF) and accurate Root Cause Analysis (RCA) highlights the need for effective electronic part failure analysis processes and capabilities. Failure analysis can be as simple as visually inspecting a part or as extensive as performing sub-micron level cross-sectioning of silicon die using Focused Ion Beam (FIB) technology. This tutorial presents a process, as well as the tools and techniques, required to perform effective failure analyses on electronic components. In addition, the common failure mechanisms found in electronic hardware are explained and illustrated with a case study.

Walter Willing

Mr. Willing is a Senior Advisory Reliability Engineer within the Northrop Grumman Corporation Electronic Systems Sector, System Supportability Engineering Department. Mr. Willing has over 30 years of experience in space systems reliability. He received a BSEE from the University of Delaware and an MSEE from the Loyola College of Maryland. He is active in the IEEE (Sr. Member, Vice Chairman of the Baltimore Section) and IEST, and serves on the RAMS Management Committee. He has authored five peer-reviewed technical papers and one RADC publication.

Jonathan Fleisher

Mr. Fleisher is a Principal Reliability Engineer within the Northrop Grumman Corporation Electronic Systems Sector, System Supportability Engineering Department. Mr. Fleisher received a BSME and an MSIE from New Mexico State University. He has 16 years of engineering experience on a variety of defense related programs, with multiple Systems Engineering responsibilities, including Environmental Qualification Lead on several radar programs. During the last several years, he has focused on reliability engineering for NGC Space Programs.

Michael Cascio

Mr. Cascio is a Failure Analysis and Reliability Engineer within the Product Integrity Department of the Northrop Grumman Electronics Systems Sector in Baltimore, Maryland. Mr. Cascio received a BSEE from The Pennsylvania State University. He has over 20 years of electronics experience in Radar, Reliability and Failure Analysis. He spent eleven years in the United States Air Force, where he managed operations, maintenance and support equipment for 20 two-million-dollar radars. He also directed research and development upgrades to enhance radar systems. At Northrop Grumman he has 10 years of engineering experience in Failure Analysis and Reliability.

Table of contents

1. Introduction

2. Importance of Effective Failure Analysis

3. Basic Failure Analysis Techniques

4. Suggestions For Your Own Failure Analysis Capabilities

5. Understanding Electronic Part Failure Mechanisms

6. Failure Analysis Case Study

7. Conclusions

8. References



1. Introduction

Organizations that produce electronic hardware should have some level of electronic part failure analysis capability and should know where to go for extended failure analysis. The order in which the failure analysis is performed is equally important. First, verify and characterize the failure via electrical test. Subsequent steps should involve non-invasive examinations such as microscopic visual inspection, X-ray and hermetic seal tests. Finally, after all non-invasive tests are completed, devices can be de-lidded (or de-capsulated) and silicon die inspections and evaluations can be performed.

This tutorial discusses the fundamental electronic part failure analysis processes, methods, tools and techniques that can be used to accurately determine why devices fail. This tutorial is an expansion of the 1997 O.A. Plait award-winning tutorial “Understanding Electronic Part Failure Mechanisms”, sections of which are repeated in this tutorial (refer to Section 5). It is important to know the common part failure modes as well as the failure analysis techniques used to find them.

Understanding the cause of a part failure allows for effective corrective action and the prevention of future occurrences. Suggestions for several levels of failure analysis capability are presented (Basic, Moderate, Advanced), along with examples of actual failure analyses that illustrate what occurs in failed hardware.


2. Importance of Effective Failure Analysis

When electronic parts fail, it’s important to understand why they failed. Effective root cause analysis of part failures is required to ensure that proper corrective action can be implemented to prevent recurrence. Determination of root cause is especially important for High Reliability systems such as implantable medical devices, space satellite systems and deep well drilling systems, where failures are critical, as well as for consumer products, where a single failure mode can be replicated across many units and its cost multiplied accordingly.

The process of determining root cause and applying corrective action is commonly called FRACAS (Failure Reporting, Analysis and Corrective Action System). Failure analysis is the crucial part of the FRACAS process.

Failure analysis must be performed correctly to ensure that the failure mechanism is preserved and not lost due to carelessness, bypassing critical measurements or performing destructive analyses in an incorrect sequence. For example, once wirebonds are removed, it may no longer be possible to test the part electrically. Furthermore, parts removed for failure analysis may “Re-Test OK” (RTOK) because the wrong part was removed, because testing does not properly capture the part’s failure mode (such as a subtle parameter shift), or because the failure appears only under a particular sensitivity (such as gain vs. temperature).
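The gain vs. temperature case can be illustrated with a minimal sketch. The measurements, limits and function name below are hypothetical and chosen only for illustration; the point is that a part checked at room temperature alone can re-test OK while the same limits applied across a temperature sweep expose the failure.

```python
def out_of_limit_points(measurements, lower, upper):
    """Return the (temperature, value) pairs that fall outside the limits.

    measurements: list of (temperature_c, measured_value) tuples.
    lower, upper: acceptance limits for the measured parameter.
    """
    return [(t, v) for t, v in measurements if not lower <= v <= upper]


# Hypothetical gain readings (dB) for one part at three temperatures.
gain_vs_temp = [(-40, 19.1), (25, 20.0), (85, 17.2)]
LOWER_DB, UPPER_DB = 18.0, 22.0  # hypothetical limits, not from any datasheet

# Room-temperature-only retest: nothing is flagged, so the part "RTOKs".
room_only = [(t, v) for t, v in gain_vs_temp if t == 25]
print(out_of_limit_points(room_only, LOWER_DB, UPPER_DB))

# Full temperature sweep: the hot-temperature reading is flagged.
print(out_of_limit_points(gain_vs_temp, LOWER_DB, UPPER_DB))
```

In practice the test conditions would be chosen to reproduce the environment in which the failure was originally observed.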

Since it is important to preserve and characterize the failure mode to the greatest extent possible, this tutorial presents a suggested failure analysis flow, starting with full part failure characterization, followed by non-invasive and finally invasive failure analysis techniques.
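As a rough sketch of this flow, a planned analysis can be checked so that no step is scheduled after a more destructive one. The class and function names are hypothetical; the three stages and the example steps are taken from the text above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    """Analysis stages, ordered from least to most destructive."""
    ELECTRICAL_CHARACTERIZATION = 1
    NON_INVASIVE = 2
    INVASIVE = 3


@dataclass(frozen=True)
class Step:
    name: str
    stage: Stage


def order_steps(steps):
    """Return steps sorted so that less destructive stages come first.

    sorted() is stable, so steps within a stage keep their listed order.
    """
    return sorted(steps, key=lambda s: s.stage)


def check_sequence(steps):
    """Return the names of steps scheduled after a more destructive step."""
    out_of_order = []
    highest = Stage.ELECTRICAL_CHARACTERIZATION
    for step in steps:
        if step.stage < highest:
            out_of_order.append(step.name)
        highest = max(highest, step.stage)
    return out_of_order


# A deliberately mis-ordered plan: the part is de-lidded first.
planned = [
    Step("de-lid and inspect die", Stage.INVASIVE),
    Step("electrical test", Stage.ELECTRICAL_CHARACTERIZATION),
    Step("X-ray", Stage.NON_INVASIVE),
    Step("microscopic visual inspection", Stage.NON_INVASIVE),
    Step("hermetic seal test", Stage.NON_INVASIVE),
]

print(check_sequence(planned))  # steps that would run too late
for step in order_steps(planned):
    print(step.stage.name, "-", step.name)
```

The ordering within a stage is left to the analyst here; the sketch only enforces that characterization precedes non-invasive work, which in turn precedes invasive work.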

The following sections address basic failure analysis techniques. Additional information on failure analysis methods can be found in Mil-Std-883 and Mil-Std-1580. While these specifications define test and evaluation methods, the “requirements” and methods within these standards provide a good baseline for evaluating failed parts. For example, when evaluating the wirebonds on a failed part, the pull test limits in Mil-Std-883 (Method 2011) can provide insight as to whether the failed part has good wirebonds. The internal visual inspection criteria of Mil-Std-883 (Methods 2010 and 2017) help determine whether any anomalies are actually defects or allowed process variations.
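A minimal sketch of the wirebond comparison follows. The function name, bond labels, measured forces and the limit value are all hypothetical placeholders; the actual minimum pull force must be looked up in Method 2011 for the wire material and diameter in question.

```python
def evaluate_bond_pulls(pull_forces, min_pull_force):
    """Split bonds into those meeting and those below the minimum pull force.

    pull_forces: dict mapping a bond label to its measured pull force.
    min_pull_force: minimum acceptable force in the same units, taken from
        the applicable standard (not supplied by this sketch).
    """
    passed = {b: f for b, f in pull_forces.items() if f >= min_pull_force}
    failed = {b: f for b, f in pull_forces.items() if f < min_pull_force}
    return passed, failed


# Hypothetical measurements in grams-force; the limit is a placeholder only.
measured = {"bond 1": 5.2, "bond 2": 1.1, "bond 3": 4.8}
PLACEHOLDER_LIMIT = 3.0

passed, failed = evaluate_bond_pulls(measured, PLACEHOLDER_LIMIT)
print("below limit:", failed)
```

Bonds that fall below the limit point toward a bonding problem, while a full set of passing bonds suggests the failure lies elsewhere.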

For further investigation into advanced failure analysis techniques and component failure modes, the reader is encouraged to become familiar with the International Reliability Physics Symposium (IRPS) as well as other venues.

The following are some of the top causes of component failures experienced on various types of electronic equipment:

1) Electrical Overstress: During board level testing, it’s quite common to experience electrical overstress due to transients related to test setups. All power inputs to electronic assemblies should be properly controlled to protect against fault conditions and unintended transients. Inadvertent connections or rapid switching to full amplitude voltage levels can lead to inrush or high transient conditions that can damage components. Human body electrostatic discharge (ESD) overstress is also a well-known and documented mechanism that damages components. ESD sensitive integrated circuits (ICs) are the most commonly affected. ICs rated below 250 V for ESD are easily damaged by human handling without adequate ESD controls.

2) Contamination: Contamination is one of the more common causes of latent failure. It ultimately leads to failures stemming from corrosion or from degradation of active elements such as semiconductors. Contamination can also rapidly destroy wire bond interconnects and metallization. Sources of contamination can typically be traced to either human by-products (such as spittle) or chemicals used in the assembly process.

3) Solder joint failure: Solder joint workmanship is the most common issue related to initial assembly or board fabrication. It is also commonly responsible for latent failures due to joint fatigue driven by thermal cycling. Non-compliant or leadless ceramic components larger than 0.25 inch are the parts most susceptible to solder joint wear-out failures. Examples of solder joint failures are shown in Figure 1.

4) Cracked Ceramic Packages: Ceramics are used for the majority of high reliability military and space applications. However, the packages are very brittle and susceptible to cracking due to stress risers from either surface anomalies or general mounting. Root cause for these issues can typically be traced to either design implementation or process control.

5) Timing Issues: Inadequate timing margins are sometimes misdiagnosed as intermittent component behavior. Thorough timing analysis should be part of any design, particularly when asynchronous signals are present.
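As one small illustration of the ESD point in item 1, a parts list can be screened for ICs whose ESD rating falls below the 250 V figure mentioned above. The reference designators and ratings are hypothetical, as is the function name.

```python
HANDLING_THRESHOLD_V = 250  # figure quoted in item 1 above


def esd_sensitive(parts, threshold_v=HANDLING_THRESHOLD_V):
    """Return reference designators of parts rated below the threshold.

    parts: dict mapping a reference designator to its ESD rating in volts.
    """
    return sorted(ref for ref, rating in parts.items() if rating < threshold_v)


# Hypothetical board parts list with ESD ratings in volts.
board_parts = {"U1": 100, "U2": 2000, "U3": 250, "U4": 200}

print(esd_sensitive(board_parts))
```

Parts flagged this way are candidates for extra handling precautions and are worth examining first when an overstress failure is suspected.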

Figure 1. Examples of solder joint failures.