Failure analysis semiconductor


  • 7/27/2019 Failure analysis semiconductor


    Tutorial Notes 2012 AR&MS

    2012 Annual RELIABILITY and MAINTAINABILITY Symposium

    Electronic Part Failure Analysis Tools and Techniques

    Walter Willing, Jonathan Fleisher & Michael Cascio

    Northrop Grumman Corporation

    7323 Aviation Blvd,Baltimore, MD, 21090, USA

    e-mail: [email protected], [email protected] & [email protected]


    SUMMARY & PURPOSE

    The current emphasis on Physics of Failure (PoF) and accurate Root Cause Analysis (RCA) highlights the need for

    effective electronic part failure analysis processes and capabilities. Failure analysis can be as simple as visually inspecting a

    part and as extensive as performing sub-micron level cross-sectioning of silicon die using Focused Ion Beam (FIB) technology.

    This tutorial presents a process as well as the tools and techniques required to perform effective failure analyses on electronic

    components. In addition, the common failure mechanisms found in electronic hardware are explained and emphasized with a

    case study.

    Walter Willing

    Mr. Willing is a Senior Advisory Reliability Engineer within the Northrop Grumman Corporation Electronic Systems

    Sector, System Supportability Engineering Department. Mr. Willing has over 30 years of experience in space systems reliability.

    He received a BSEE from the University of Delaware and an MSEE from the Loyola College of Maryland. He is active in the

    IEEE (Sr. Member, Vice Chairman of the Baltimore Section), IEST and serves on the RAMS Management Committee. He has

    authored five peer reviewed technical papers and one RADC publication.

    Jonathan Fleisher

    Mr. Fleisher is a Principal Reliability Engineer within the Northrop Grumman Corporation Electronic Systems Sector,

    System Supportability Engineering Department. Mr. Fleisher received a BSME and an MSIE from New Mexico State

    University. He has 16 years of engineering experience on a variety of defense related programs, with multiple Systems

    Engineering responsibilities, including Environmental Qualification Lead on several radar programs. During the last several

    years, he has focused on reliability engineering for NGC Space Programs.

    Michael Cascio

    Mr. Cascio is a Failure Analysis and Reliability Engineer within the Product Integrity Department of the Northrop

    Grumman Electronic Systems Sector in Baltimore, Maryland. Mr. Cascio received a BSEE from The Pennsylvania State

    University. He has over 20 years of electronics experience in Radar, Reliability and Failure Analysis. He spent eleven years in

    the United States Air Force where he managed operations, maintenance and support equipment for 20 two-million-dollar radars.

    He also directed the research and development upgrades on the enhancement of radar systems. At Northrop Grumman he has

    10 years of engineering experience in Failure Analysis and Reliability.

    Table of Contents

    1. Introduction

    2. Importance of Effective Failure Analysis

    3. Basic Failure Analysis Techniques

    4. Suggestions for Your Own Failure Analysis Capabilities

    5. Understanding Electronic Part Failure Mechanisms

    6. Failure Analysis Case Study

    7. Conclusions

    8. References

    9. Tutorial Visuals


    1. INTRODUCTION

    Organizations that produce electronic hardware should

    have some level of electronic part failure analysis capability

    and knowledge of where to go for extended failure analysis.

    The failure analysis process is also important. First, it is

    important to verify and characterize the failure via electrical

    test. Subsequent steps should involve non-invasive

    examinations such as microscopic visual inspection, X-ray

    and hermetic seal tests. Finally, after all non-invasive tests are

    completed, devices can be de-lidded (or de-capsulated) and

    silicon die inspections and evaluations can be performed.

    This tutorial discusses the fundamental electronic part

    failure analysis processes, methods, tools and techniques that

    can be utilized to accurately determine why devices fail. This

    tutorial is an expansion of the 1997 O.A. Plait award winning

    tutorial Understanding Electronic Part Failure Mechanisms,

    sections of which are repeated in this tutorial (refer to Section

    5). It is important to know what the common part failure

    modes are as well as the failure analysis techniques used to

    find them.

    Understanding the cause of the part failure allows for

    effective corrective action and the prevention of future occurrences. Suggestions for several levels of failure analysis

    capabilities will be presented (Basic, Moderate, Advanced) as

    well as some examples of actual failure analyses to illustrate

    what actually occurs in failed hardware.

    2. IMPORTANCE OF EFFECTIVE FAILURE ANALYSIS

    When electronic parts fail, it's important to understand

    why they failed. Effective root cause analysis of part failures

    is required to assure proper corrective action can be

    implemented to prevent reoccurrence. Determination of root

    cause is also important for High Reliability systems such as

    implantable medical devices, space satellite systems, deep

    well drilling systems, etc., where failures are critical, as well as consumer products where the cost of a single failure mode can

    be replicated multiple times.

    The process of root cause

    determination and corrective action is commonly called

    FRACAS (Failure Reporting, Analysis and Corrective Action

    System). Failure Analysis is the crucial part of the FRACAS

    process.

    Failure Analysis must be performed correctly to assure

    the failure mechanism is preserved, not "lost" due to

    carelessness, bypassing critical measurements or performing

    destructive analyses in an incorrect sequence. For example,

    once wirebonds are removed, the part may not be able to be

    electrically tested. Furthermore, parts removed for failure analysis may Re-Test OK (RTOK) because the wrong

    part was removed, because testing does not properly

    capture the part's failure mode (such as a subtle parameter

    shift), or because a particular failure sensitivity (such as gain vs. temperature)

    exists.

    Since it is important to preserve and characterize the

    failure mode to the greatest extent possible, this tutorial

    presents a suggested failure analysis flow, starting with full

    part failure characterization, followed by non-invasive and

    finally invasive failure analysis techniques.

    The following sections herein address basic failure

    analysis techniques. Additional information on failure

    analysis methods can be found in Mil-Std-883 and Mil-Std-

    1580. While these specifications define test and evaluation

    methods, the requirements and methods within these

    standards provide a good baseline for evaluating failed parts.

    For example, when evaluating the wirebonds on a failed part,

    the pull test limits in Mil-Std-883 (Method 2011) can provide insight as to whether the failed part has good wirebonds. The

    internal visual inspection criteria of Mil-Std-883 (Methods

    2010 and 2017) help determine whether any anomalies are

    actually defects or allowed process variations.

    For further investigation into advanced failure analysis

    techniques and component failure modes, the reader is

    encouraged to become familiar with the International

    Reliability Physics Symposium (IRPS) as well as other

    venues.

    The following are some top causes for component failures

    experienced on various types of electronic equipment:

    1) Electrical Overstress: During board level testing, it's quite

    common to experience electrical overstress due to transients related to test setups. All power inputs to

    electronic assemblies should be properly controlled to

    protect against fault conditions and unattended transients.

    Inadvertent connections or rapid switching to full

    amplitude voltage levels can lead to inrush or high

    transient conditions that can damage components.

    Human body electrostatic discharge (ESD) overstress

    is also a well-known and documented mechanism that

    damages components. ESD sensitive integrated circuits

    (IC) are the most commonly affected. ICs rated below

    250V for ESD are easily damaged by human handling

    without adequate ESD controls.

    2) Contamination: One of the more common causes of latent

    failure is contamination. Contamination

    ultimately leads to failures stemming from corrosion or

    degradation related to active elements such as

    semiconductors. Contamination can also rapidly destroy

    wire bond interconnects and metallization. Sources of

    contamination can typically be traced to either human by-

    products (spittle) or chemicals used in the assembly

    process.

    3) Solder joint failure: Solder joint workmanship is the most

    common issue related to initial assembly or board

    fabrication. It is also commonly responsible for latent

    failures due to joint fatigue driven by thermal cycling. Non-compliant or leadless ceramic type components of

    >0.25 inch size are the parts that are most susceptible to

    solder joint wear out failures. Examples of solder joint

    failures are shown in Figure 1.

    4) Cracked Ceramic Packages: Ceramics are used for the

    majority of high reliability military and space

    applications. However, the packages are very brittle and

    susceptible to cracking due to stress risers from either

    surface anomalies or general mounting. Root cause for


    these issues can typically be traced to either design

    implementation or process control.

    5) Timing Issues: Inadequate timing margins are sometimes

    misdiagnosed as intermittent component behavior.

    Thorough timing analysis should be part of any design,

    particularly when asynchronous signals are present.

    Figure 1. Defective Solder Joints

    6) Power Sequencing Issues: Many of the IC technologies are

    susceptible to damage if bias voltages are not properly

    applied prior to control or data input voltages.

    7) Design Implementation: Often component failures are

    related to poor design implementation rather than random

    defects in the components themselves. Examples include

    inadequate derating (voltage, power, and thermal),

    floating CMOS inputs, improper reset sequencing, or

    applying low bias voltages. The most common of these is

    mismanaging component thermal conditions and

    operating parts outside their rated power dissipation

    limits.

    3. BASIC FAILURE ANALYSIS TECHNIQUES

    The basic flow for effective part failure analysis starts

    before the component is removed from the board. Upon

    completion of the board troubleshooting and fault isolation

    process, the cognizant Failure Analysis engineer should

    review the troubleshooting results while the part is still on the

    board, witnessing any in-situ part measurements (for later

    verification in the FA lab) and noting any anomalies that exist

    on the board which may potentially have contributed to the

    part failure. Prior to removing a part from the board, it is also

    recommended to photograph the part as installed for future

    reference. Photos should be taken from various angles to capture the details of the installation, such as the solder

    attachment. In addition, contacting the vendor before

    removing high value parts is advised. Reviewing the failure

    data with the vendor can often identify external interfaces as

    the culprit rather than the suspected part. As some devices can

    cost many thousands of dollars to replace, it is highly

    recommended that all resources available be used prior to

    replacing them.

    The Failure Analyst should also be consulted on the safest

    means for removing the part to preserve it to the greatest

    extent possible. Once the part is removed for failure analysis,

    three (3) basic processes should be followed:

      1) Electrical testing and part characterization

      2) Non-invasive tests

      3) Invasive tests

    This general failure analysis process is illustrated in Table 1.

    Additional details pertaining to these tests and methods are

    discussed in this section.

    Table 1. General Failure Analysis Process

    Electrical Testing / Characterization:

      - Test / Characterize over temperature
      - Curve Tracer I-V check of Inputs

    Non-Invasive Tests:

      - External Microscopic Exam / Photo
      - Fine & Gross Leak
      - Vacuum Bake (Non-Hermetic Parts)
      - X-ray
      - PIND
      - XRF
      - SAM / C-SAM

    Invasive Tests:

      - Lid Removal / Decapsulate
      - Die Examination
      - Die Probing
      - IR Microscopic Exam
      - Liquid Crystal
      - Cross-Sectioning
      - SEM
      - EDS/EDX
      - FIB
      - Auger
      - SIMS
      - FTIR
      - TEM/STEM
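
    The ordering in Table 1 can be treated as a simple rule: no technique from an earlier phase should be scheduled after a technique from a later phase, since the later step may destroy the evidence the earlier one needs. The sketch below is only an illustration of that rule. The technique keys and the three-level ranking are hypothetical names chosen for this example, not part of the tutorial.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Phases of the general failure analysis process, in required order."""
    ELECTRICAL = 1      # electrical testing / characterization
    NON_INVASIVE = 2    # external exam, leak tests, X-ray, PIND, ...
    INVASIVE = 3        # de-lid, die exam, cross-sectioning, ...


# Hypothetical mapping of a few Table 1 techniques to their phase.
TECHNIQUE_PHASE = {
    "curve_tracer": Phase.ELECTRICAL,
    "external_visual": Phase.NON_INVASIVE,
    "fine_gross_leak": Phase.NON_INVASIVE,
    "x_ray": Phase.NON_INVASIVE,
    "pind": Phase.NON_INVASIVE,
    "delid": Phase.INVASIVE,
    "die_exam": Phase.INVASIVE,
    "cross_section": Phase.INVASIVE,
}


def out_of_sequence(plan):
    """Return the techniques scheduled after a later-phase technique."""
    violations = []
    highest = Phase.ELECTRICAL
    for step in plan:
        phase = TECHNIQUE_PHASE[step]
        if phase < highest:
            violations.append(step)
        highest = max(highest, phase)
    return violations


# An X-ray planned after de-lidding is flagged as out of sequence.
print(out_of_sequence(["curve_tracer", "delid", "x_ray"]))
```

    A plan that follows the table order returns an empty list, so a check like this could be run on an analysis plan before any destructive step is started.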

    3.1 Electrical Part Testing and Characterization

    Electrical part testing and characterization is important, as it is necessary to confirm the part has indeed failed (if not, the

    fault may still exist at the board level) and to determine if

    there are any temperature, voltage or clock speed sensitivities

    associated with the part's performance. All parts should be

    fully electrically tested at ambient, cold and hot temperatures

    to determine if the failure is sensitive to temperature. Another

    step in part characterization is to perform a curve tracer

    current vs. voltage (IV) characterization of each input signal


    (typically to ground) to determine if any input overstress has

    occurred. The IV characteristics of the failed part can be

    compared to a known good part with any deviations noted and

    recorded for later die examination.

    Electrical Testing / Characterization Outline:

      - Test / characterize over temperature, voltage and clock speed

      - I/O curve tracer assessments

      - Compare to known good devices
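
    The comparison against a known good device can be sketched as a point-by-point check of two I-V sweeps. This is a minimal illustration, assuming both sweeps were taken at the same voltage points. The tolerance values and the sample currents are made up for the example and are not limits from the tutorial or from any standard.

```python
def iv_deviations(good, suspect, rel_tol=0.10, abs_tol=1e-9):
    """Compare two I-V sweeps taken at the same voltage points.

    good, suspect: lists of (voltage_V, current_A) tuples.
    Returns the (voltage, good_current, suspect_current) points where the
    suspect current differs from the known good current by more than
    rel_tol (fractional) plus abs_tol (amperes).
    """
    if [v for v, _ in good] != [v for v, _ in suspect]:
        raise ValueError("sweeps must use the same voltage points")
    flagged = []
    for (v, i_good), (_, i_sus) in zip(good, suspect):
        if abs(i_sus - i_good) > rel_tol * abs(i_good) + abs_tol:
            flagged.append((v, i_good, i_sus))
    return flagged


# Illustrative (made-up) input-to-ground sweeps for one pin.
known_good = [(0.0, 0.0), (0.2, 1e-9), (0.4, 5e-8), (0.6, 2e-5), (0.8, 1e-3)]
failed_part = [(0.0, 0.0), (0.2, 4e-6), (0.4, 9e-6), (0.6, 3e-5), (0.8, 1.05e-3)]

for v, i_good, i_sus in iv_deviations(known_good, failed_part):
    print(f"{v:.1f} V: good {i_good:.2e} A, suspect {i_sus:.2e} A")
```

    The flagged points are the deviations that would be noted and recorded for later die examination, for example excess leakage at low voltage on an input that may have seen overstress.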

    3.2 Non-Invasive Examinations

    Once the failed components have been fully characterized

    via electrical testing, non-invasive examinations can be

    performed. It is important to perform all necessary non-

    invasive tests and examinations first, so as to not destroy any

    evidence until a good set of non-invasive characteristics

    have been defined for the failed part.

    3.2.1 External Microscopic exam / Photo

    Using a stereo microscope, a thorough external visual

    examination of the suspect part should be performed early in

    the failure analysis process. Typical inspection scopes range

    from 10X to 30X magnification, which is usually sufficient to

    identify such items as external contamination and/or solder

    balls (possibly shorting out pins on the device), damaged leads

    or package seals, gross cracks in the package, etc.

    Magnification levels up to 100X can be employed to further

    examine any anomalies identified. The following conditions

    should be specifically looked for:

      - Contamination

      - Mechanical damage

      - Thermal or electrical damage

      - Seal integrity

      - Lead integrity

    Photographs should be taken to document the condition of the part and to record any anomalies.

    3.2.2 Fine & Gross seal tests for hermetic devices

    Hermeticity testing (refer to Mil-Std-883 Method 1014)

    should be performed on hermetic parts to ensure no leaks that

    could have allowed moisture to enter the package exist. Any

    internal moisture might result in possible corrosion or provide

    a conductive path on the semiconductor die surface, thereby

    causing a failure. A fine leak test often involves placing the

    part in a pressurized helium (He) chamber in an attempt to force

    He into the device cavity through any leak sites, then moving

    the part to a Helium detection chamber to see if any He leaks

    out. Gross leak testing involves placing the part in a heated fluorocarbon bath and literally "looking for bubbles". The

    heated bath causes the atmosphere within the package to

    expand, forcing it through any large leak sites. It is important

    to perform both Fine Leak and Gross leak testing, as a Gross

    Leak site may be large enough to allow a full venting of the

    pressurized He, subsequently resulting in a false pass for the

    fine leak test. It is also important that the failed part be clean

    of any external epoxy or contamination that could absorb the

    He and provide a false positive reading. Newer optical leak

    test equipment using laser imaging of package lid deflection to

    confirm hermeticity is also available.
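
    The reason both leak tests are needed can be captured in a small decision function. This is a sketch of the reasoning above only. The verdict strings are hypothetical, and the actual pass/fail criteria come from Mil-Std-883 Method 1014, not from this code.

```python
def hermeticity_verdict(fine_leak_pass, gross_leak_pass):
    """Combine fine and gross leak results into one verdict.

    Each argument is True (passed), False (failed) or None (not run).
    """
    if fine_leak_pass is None or gross_leak_pass is None:
        # A fine leak pass alone is not trusted: a gross leak can vent the
        # helium before detection and produce a false pass.
        return "inconclusive: run both fine and gross leak tests"
    if not gross_leak_pass:
        return "leaker (gross)"
    if not fine_leak_pass:
        return "leaker (fine)"
    return "hermetic"


# A part that passed fine leak but was never gross leak tested.
print(hermeticity_verdict(True, None))
```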

    3.2.3 Vacuum Baking

    If a non-hermetic part or cable is suspected to have a

    moisture related issue, a vacuum bake can be performed to

    drive out any residual moisture. If the problem disappears

    after the vacuum bake process, humidity could have been the

    cause. The authors were recently involved with a case where trapped moisture affected the performance of an RF cable.

    3.2.4 X-ray (Film, Real-time, 3D)

    Radiography (refer to Mil-Std-883 Method 2012), often

    referred to as X-ray, is a very powerful tool for non-invasive

    failure analysis as X-ray can detect actual or potential defects

    within enclosed packages. There are multiple types of X-ray

    equipment available, from the basic film X-ray systems to

    real-time and 3-D X-ray systems. While film X-rays can be

    useful, the modern real-time X-ray provides a more extensive

    capability. Basic X-rays allow internal part examination

    looking for:

      - Internal particles

      - Internal wire bond dress (i.e., to make sure the wire bonds are not touching each other or package lids)

      - Die attach quality (voiding, die attach perimeter)

      - Solder joint quality for connectors

      - Insufficient or excessive solder

      - Substrate or printed wiring board trace integrity

      - Obvious voids in the lid seal

      - Foreign metallic particles within the package

      - Internal part orientation, etc.

    The resolution of a basic film X-ray is typically to a 1 mil

    particle size, or bond wires to 1 mil diameter. The principal

    limitation of film X-ray is that it only allows one exposure

    level at a time. Not all characteristics can be observed at a

    single exposure level. Conversely, real-time X-ray typically

    has a resolution range from 1 µm to 0.4 µm and allows for a

    continuous adjustment of exposure levels and conditions, as

    well as real time part rotation to obtain the most revealing X-

    ray view. Special digital filtering and image processing can

    also be used to detect possible delineations in the image not

    otherwise observable on the image screen.
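
    Because the film figure is quoted in mils and the real-time figure in micrometers, the two are easier to compare after a unit conversion. The short calculation below uses only the resolution values quoted above and the standard conversion of 1 mil = 25.4 µm.

```python
MICRONS_PER_MIL = 25.4  # 1 mil = 0.001 inch = 25.4 micrometers

film_resolution_um = 1.0 * MICRONS_PER_MIL   # "1 mil" quoted for film X-ray
realtime_resolution_um = (1.0, 0.4)          # range quoted for real-time X-ray

for res in realtime_resolution_um:
    ratio = film_resolution_um / res
    print(f"real-time at {res} um resolves features {ratio:.1f}x smaller than film")
```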

    3.2.5 Particle Impact Noise Detection (PIND) Test

    Cavity device failures can be caused by internal

    conductive particles shorting adjacent conductors. While X-ray techniques can be used to detect internal particles, another

    method is Particle Impact Noise Detection (PIND), refer to

    Mil-Std-883 Method 2020. PIND Testing can be subjective

    and may not be easily performed on complex hybrids.

    However, it can provide evidence of internal particles. A

    common technique employed is to perform X-ray and PIND

    together; first an X-ray is taken, then the part is PIND tested,

    and then a second X-ray is taken. This allows one to identify

    particles that are free-floating within the package.


    Section 5.5.3 discusses loose particles detected during

    PIND test.
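
    The X-ray / PIND / X-ray sequence amounts to comparing particle positions before and after the part is shaken. A minimal sketch of that comparison follows. The particle labels, coordinates and movement tolerance are hypothetical, and treating a particle that is missing from the second image as loose is an assumption made for this example.

```python
import math


def moved_particles(before, after, tol_mm=0.1):
    """Flag particles whose position changed between two X-ray images.

    before, after: dicts mapping a particle label to its (x_mm, y_mm)
    position in the pre-PIND and post-PIND X-ray. A particle that moved
    more than tol_mm, or that is missing from the second image, is
    treated as loose.
    """
    loose = []
    for label, (x0, y0) in before.items():
        if label not in after:
            loose.append(label)
            continue
        x1, y1 = after[label]
        if math.hypot(x1 - x0, y1 - y0) > tol_mm:
            loose.append(label)
    return loose


# Made-up positions: particle A is fixed in place, particle B has moved.
pre_pind = {"A": (1.00, 2.00), "B": (3.50, 0.80)}
post_pind = {"A": (1.02, 2.01), "B": (0.40, 2.90)}
print(moved_particles(pre_pind, post_pind))
```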

    3.2.6 X-ray Fluorescence (XRF)

    X-ray Fluorescence (XRF) is a non-destructive technique

    used to determine the elemental composition of solid and

    liquid samples. The X-rays excite atoms in the sample,

    causing them to emit X-rays with energies characteristic of

    each element present. The XRF equipment measures the

    energy and intensity of these X-rays and is capable of detecting elements from Al to U in the periodic table. XRF

    can determine concentrations ranging from parts per million to

    100% at depths as great as 10m. Using reference standards,

    XRF can accurately quantify the elemental composition of the

    samples. XRF is commonly used to examine platings for pure

    tin content, as well as for cadmium and zinc [1].
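
    Element identification from characteristic energies can be illustrated by matching measured peak energies against a table of known emission lines. The sketch below is an illustration only. The line energies are rounded reference values, Cu and Pb are included as extra examples beyond the elements named above, and the matching tolerance is an arbitrary assumption. Real XRF software also handles overlapping lines and quantification against reference standards, which this does not attempt.

```python
# Approximate characteristic X-ray line energies in keV (rounded values).
REFERENCE_LINES_KEV = {
    "Sn K-alpha": 25.27,
    "Cd K-alpha": 23.17,
    "Zn K-alpha": 8.64,
    "Cu K-alpha": 8.05,
    "Pb L-alpha": 10.55,
}


def identify_peaks(peaks_kev, tol_kev=0.15):
    """Match each measured peak energy to the nearest reference line.

    Returns a dict {peak_energy: line_name or None}; None means no
    reference line lies within tol_kev of the peak.
    """
    matches = {}
    for peak in peaks_kev:
        name, energy = min(REFERENCE_LINES_KEV.items(),
                           key=lambda item: abs(item[1] - peak))
        matches[peak] = name if abs(energy - peak) <= tol_kev else None
    return matches


# Made-up peak list from a plating measurement.
print(identify_peaks([25.3, 10.5, 15.0]))
```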

    3.2.7 Acoustic tests (SAM / C-SAM)

    Acoustic testing is a popular test method to look for voids

    and delaminations or cracks in Plastic Encapsulated

    Microcircuits (PEMs) and ceramic capacitors. Acoustic tests

    rely on acoustic energy transfer through the part. If there is a

    void, the acoustic energy is blocked and voids can be detected. The acoustic tests can also be tuned to attempt to determine

    the depth of any void. Acoustic tests involve either reflected

    acoustic energy or energy transmitted through the part. Since

    the energy transmission medium is typically deionized water,

    parts to be examined must withstand exposure to water.
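
    For the reflected-energy mode, one common way to estimate the depth of a reflecting interface is from the echo's round-trip time. The sketch below assumes a simple pulse-echo measurement in a single material with a known sound velocity. The velocity and delay used in the example are hypothetical, and the tutorial does not specify how a particular instrument determines depth.

```python
def echo_depth_mm(round_trip_time_us, velocity_m_per_s):
    """Depth of a reflecting interface from pulse-echo time of flight.

    The pulse travels to the interface and back, so depth = v * t / 2.
    """
    depth_m = velocity_m_per_s * (round_trip_time_us * 1e-6) / 2.0
    return depth_m * 1e3


# Hypothetical numbers: 3000 m/s in the package material, 1.0 us echo delay.
print(f"{echo_depth_mm(1.0, 3000.0):.2f} mm")
```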

    3.2.8 Residual Gas Analysis, internal water vapor content

    Before transitioning to invasive examinations, Residual Gas

    Analysis (RGA) should be considered for a hermetic part

    suspected of having an internal moisture issue, once all

    non-invasive tests are performed. If a part only fails at cold

    temperature, an RGA test should be considered as cold

    temperature failures may be a result of excessive internal

    moisture condensing on the die surface. RGA (refer to Mil-

    Std-883 Method 1018) involves "poking a hole" through the

    device lid, using a vacuum to remove the interior gas and

    performing a spectral analysis of the internal gases to

    determine their content. RGA can detect most of the gasses

    found within devices and report their individual

    concentrations. For water vapor, the maximum allowed

    concentration is typically 5000 ppm. This corresponds to the

    dew point (sublimation point) of -2 °C where the partial

    pressure of the H2O prevents any liquid condensation.
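
    The link between 5000 ppm and a frost point near -2 °C can be checked with a short calculation. The sketch below assumes the package cavity is at one standard atmosphere and uses a Magnus-type approximation for saturation vapor pressure over ice. Both the pressure assumption and the specific coefficients are choices made for this example, not values from the tutorial or from Method 1018.

```python
import math


def frost_point_c(water_ppmv, total_pressure_pa=101325.0):
    """Frost point (deg C) for a given water vapor concentration.

    Uses a Magnus-type approximation for saturation vapor pressure over
    ice: e_s(T) = 611.2 Pa * exp(22.46 * T / (272.62 + T)), T in deg C.
    Solving e_s(T) = partial pressure for T gives the expression below.
    """
    partial_pressure_pa = water_ppmv * 1e-6 * total_pressure_pa
    x = math.log(partial_pressure_pa / 611.2)
    return 272.62 * x / (22.46 - x)


# 5000 ppmv at 1 atm gives a frost point slightly below -2 deg C.
print(f"{frost_point_c(5000):.1f} C")
```

    Under these assumptions, moisture at or below the limit can only deposit as frost rather than condense as liquid water, which is consistent with the statement above.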

    3.3 Invasive Examinations; Part De-Lid / De-Process

    After all non-invasive examinations have been

    performed, it's time to "bite the bullet" and dig deeper into

    the part. For cavity parts, this often involves a process called

    "delidding", where the device lid is removed, often by grinding

    down the lid around the seal ring or weld seal. For plastic

    parts, a chemical vapor deprocessing (dissolving) of the

    encapsulant material must be performed. In either case, the

    goal is to expose the top chip surface to allow for visual

    examination. As Flip Chip devices become more popular,

    chip to substrate de-stacking will be required. For this

    process, sending the parts back to the original manufacturer is

    recommended. If a cavity device has been determined to

    contain an internal particle via X-ray or PIND testing, one

    technique that can be used to capture the particle is to first

    grind down the lid in one corner to the point where the cover

    thickness in the corner is very thin, then try to shake the

    particle down to that corner. Finally the corner can be

    carefully peeled back, exposing the particle of interest. A

    second option is to punch a small hole in the thinned lid and cover it with adhesive tape. The part can then be run on the

    PIND tester until the noise stops. This procedure results in the

    particle being stuck on the tape.

    Figure 2 presents a part with the lid removed, for a failure

    associated with a melted wire bond.

    Figure 2. Device with lid removed, revealing open wire

    bond.

    3.3.1 Die Exams

    Once the top surface of the die is exposed, a microscopic

    die exam should be performed to look for obvious issues, such

    as damaged metal traces, die cracks, broken or damaged

    wirebonds, etc.

    These examinations are typically performed using a

    microscope at magnifications of 100X to 1000X. Deep UV

    optical microscopes can reach 16,000X magnification and are

    capable of resolving 10 microns. Microscopes equipped with

    both dark and light field illumination are helpful, as changing the lighting conditions can help reveal anomalies.

    Photographs should be taken to document the condition of the

    die and to record any anomalies.

    3.3.2 Die Probing

    If the failure analyst is familiar with the part die, probing

    using micro-manipulators and special probes can be

    performed to determine if any die metallization traces are


    shorted or open or to confirm an internal bias level. Detailed

    knowledge of the die design is necessary when performing

    this type of probing.

    3.3.3 Thermal Imaging of Die

    Quite often, defects on semiconductor die are associated

    with hot spots. These hot spots can be associated with

    shorts or circuits that are otherwise operating hotter than

    expected. There are two commonly used techniques to look

    for hot spots: an IR microscope or liquid crystal die thermal mapping. Both techniques require the die to be biased, so it

    needs to be in a state where the leads can be connected or the

    die pads can be probed and voltages applied. The resolution

    of IR microscopes is on the order of 1 to 5 microns. The more

    accurate technique, especially when looking for point site

    defects, is the liquid crystal die thermal mapping. While a

    calibrated IR microscope can provide an actual die thermal

    measurement, the liquid crystal technique shows a relative

    hotspot as the liquid crystals change color with temperature.

    It has a higher resolution to determine exactly where the

    hotspot exists on the die. Once the hot spot is located, it can

    be further examined using high power microscope

    examinations, SEM or FIB, as discussed below.

    3.3.4 Wire Bond Pull Test (NDPT and DPT)

    As part of the invasive failure analysis examination, wire

    bonds should be checked, especially if a bad interconnect is

    suspected. A non-destructive pull test (NDPT) can be

    performed first (refer to Mil-Std-883 Method 2023) followed

    by an electrical retest of the part (if necessary). If a high

    resistance bond is still suspected, a destructive bond pull test

    (DPT) should be performed (refer to Mil-Std-883 Method

    2011). Wire bond pull strength depends on the type (Au, Al,

    etc.) and diameter of the wire. To gauge the proper bond pull

    strength, the post-seal bond strength requirements of

    Method 2011 should be considered (~ 80% of initial pull

    strength), to allow for some loss of bond strength with time

    and thermal exposure. For thermo-compression or thermo-

    sonic ball bonds, any bond pull failure where the entire ball

    bond lifts off of the pad should be examined in more detail.

    These kinds of ball lifts are quite often a result of

    Kirkendall voiding and could represent a fundamental wire

    bond issue with the part. Section 5.1.1 discusses additional

    wire bond issues.
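
    The way a pull result is judged can be sketched as two checks: the measured force against a post-seal limit, and the failure mode. This is an illustration only. The 80% retention factor is the approximate figure quoted above, the pre-seal limit passed in is a hypothetical placeholder (the real value depends on wire type and diameter and must be taken from Method 2011), and the failure mode strings are invented for the example.

```python
def post_seal_limit(pre_seal_limit_gf, retention=0.80):
    """Approximate post-seal minimum bond strength in grams-force."""
    return retention * pre_seal_limit_gf


def assess_pull(pull_force_gf, pre_seal_limit_gf, failure_mode):
    """Classify one destructive pull result.

    failure_mode: description of where the bond broke, for example
    "wire break" or "ball lift".
    """
    findings = []
    if pull_force_gf < post_seal_limit(pre_seal_limit_gf):
        findings.append("below post-seal limit")
    if failure_mode == "ball lift":
        # Ball lifts warrant closer examination regardless of pull force.
        findings.append("ball lift: examine pad for voiding")
    return findings or ["acceptable"]


# Hypothetical 3.0 gf pre-seal limit; a strong pull that still lifted the ball.
print(assess_pull(5.0, 3.0, "ball lift"))
```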

    3.3.5 Cross Sectioning

    Cross-Sectioning is a very important means of failure

    analysis. It is often used for connector, printed wiring board, substrate, solder joint, capacitor, resistor, transformer,

    transistor and diode failure analysis. Cross-sectioning of

    semiconductor die can also be performed using a Focused Ion

    Beam (FIB). More information on FIB techniques is

    discussed in section 3.3.8. Prior to cross-sectioning, the

    sample is usually potted in a hard setting acrylic or polyester

    resin. Cross-Sectioning is exactly as the name implies; the

    failed item is literally cut in a cross-sectioned fashion then

    highly polished to allow detailed microscopic examinations to

    be made. The potted sample can be cut in half initially to

    target the failure site, or the cross-section can commence at

    one end of the sample and then progressively continue up to

    and through the failure site. This progressive cross-sectioning

    can provide a 3D view of the failure site. Of course,

    photographs should be taken at all cross-section points for

    documentation. Figure 3 is a cross-section of a solder joint.

    Figure 3. Solder Joint Cross-Section

    3.3.6 Scanning Electron Microscope (SEM)

    A Scanning Electron Microscope is an important tool for

    semiconductor die failure analysis, as well as metallurgical

    failure analysis. The SEM can provide detailed images of up

    to 120,000X magnification, with typical magnifications of

    50,000X to 100,000X and feature resolution down to 25

    Angstroms. NANO SEMs can resolve features down to 10

    Angstroms.

    With a SEM image, the depth of field is fairly large,

    thereby providing a better overall three-dimensional view of

    the sample. While high power microscopes can reach 1000 X,

    the depth of field is usually very small and only features in a

    single plane can be examined. SEM examinations are often

    used to verify semiconductor die metallization integrity and

    quality (refer to Mil-Std-883 Method 2018). Figure 4

    presents a SEM photo of a FET gate metallization structure.

    3.3.7 EDS/EDX

    Energy dispersive X-ray analysis, alternately known as

    EDS, EDAX or EDX, is a technique used along with a SEM to identify the elemental composition of a sample. During

    EDS, a sample is exposed to an electron beam inside the SEM.

    These electrons collide with the electrons within the sample,

    causing some of them to be knocked out of their orbits. The

    vacated positions are filled by higher energy electrons that

    emit X-rays in the process.

    By spectrographic analysis of the emitted X-rays, the

    elemental composition of the sample can be determined. EDS

    is a powerful tool for microanalysis of elemental constituents

    [2].
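
    As a rough illustration of how emitted X-ray energies map to
    elements, the sketch below matches measured peak energies against a
    small table of approximate characteristic line energies. The table
    is an illustrative subset, and the function name and the 0.03 keV
    tolerance are assumptions; commercial EDS software uses full line
    libraries and also handles peak overlaps.

```python
# Approximate characteristic X-ray line energies in keV (illustrative subset).
CHARACTERISTIC_LINES_KEV = {
    ("C", "K-alpha"): 0.277,
    ("O", "K-alpha"): 0.525,
    ("Al", "K-alpha"): 1.487,
    ("Si", "K-alpha"): 1.740,
    ("Au", "M-alpha"): 2.123,
    ("Cl", "K-alpha"): 2.622,
    ("Ag", "L-alpha"): 2.984,
    ("Sn", "L-alpha"): 3.444,
    ("Cu", "K-alpha"): 8.048,
}


def identify_peaks(peak_energies_kev, tolerance_kev=0.03):
    """Map each measured peak to the table lines within the tolerance.

    The tolerance is a hypothetical value chosen for this sketch.
    """
    results = {}
    for peak in peak_energies_kev:
        results[peak] = [
            (element, line)
            for (element, line), energy in CHARACTERISTIC_LINES_KEV.items()
            if abs(energy - peak) <= tolerance_kev
        ]
    return results


# Hypothetical peak list: aluminum, silicon, and one unmatched peak.
for peak, matches in identify_peaks([1.49, 1.74, 5.0]).items():
    print(peak, matches if matches else "no match in table")
```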

    Figure 4. SEM photo of a FET gate metallization structure

    3.3.8 Focused Ion Beam (FIB)

    The Focused Ion Beam is a tool where an ion beam

    (typically from a Gallium Liquid Metal Ion Source (LMIS)) is used

    to microscopically mill or ablate (e.g., ion milling) material

    away to allow for cross-sectioning of semiconductor die.

    Tungsten ion beams may also be used. The FIB cross-sections

    can be examined by Scanning Electron Microscope (SEM) to

    see features such as die metallization construction, pinholes in

    dielectrics (oxides/nitrides), and any EOS or ESD damage sites.

    The FIB cross sections are highly polished, revealing features

    at 100 Angstrom resolution. The FIB can also be used to cut

    semiconductor metallization lines to isolate circuitry on the

    die and, if necessary, a Platinum ion beam can be used to

    actually deposit metallization and create new circuit traces. In

    this case, die level design changes (known as Device

    Editing) can be implemented to allow for a design try-out.

    Figure 5 presents a FIB cross-section of a FET gate structure

    (see cut-out site in Figure 4).

    3.3.9 Auger Electron Spectroscopy (AES)

    Auger (O-J) analysis is a technique where samples are

    exposed to an electron beam designed to dislodge secondary

    electrons (otherwise known as Auger electrons) from the

    materials being examined. The materials can be identified by

    the different energy level spectra unique to each material's

    valence bands. Auger detection systems are useful for

    detecting organic materials on the surface of the die since

    Auger is more sensitive to lighter elements than EDS.

    While some depth profiling is possible, it is usually useful

    only to about 1 µm deep. Auger, like EDS, is an elemental technique that

    provides little compound information, but is most useful

    because it analyzes only the near surface region (~50

    Angstroms analysis depth). Figure 6 presents an Auger profile

    of the contamination on the surface of a wire bond pad.

    Figure 5. FIB cross-section of a FET gate structure

    (see cut-out site in Figure 4)

    Figure 6. Auger profile of contamination on the surface of a

    wire bond pad.

    Secondary Ion Mass Spectrometry (SIMS) is a technique that can detect very low

    concentrations of dopants and impurities. By ion milling

    deeper into the sample, SIMS can provide elemental depth

    profiles over a depth range from a few angstroms to tens of

    microns. SIMS works by sputtering the sample surface with a

    beam of primary ions. Secondary ions formed during

    sputtering are analyzed with a mass spectrometer. Detection

    limits can range down to sub-parts-per-million trace

    levels [3].

    Advanced SIMS analyses, such as Time-of-Flight SIMS

    (TOF-SIMS) and Dynamic SIMS (D-SIMS), provide

    additional means of elemental detection and resolution.
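
    A SIMS depth profile is recorded against sputter time, so the time
    axis must be converted to depth. The sketch below assumes a constant
    sputter rate derived from a separately measured final crater depth;
    that assumption, the function names, and the sample numbers are
    illustrative and not taken from the tutorial.

```python
def sputter_rate_nm_per_s(crater_depth_nm, total_sputter_time_s):
    """Average sputter rate, assuming the rate was constant during the run."""
    if total_sputter_time_s <= 0:
        raise ValueError("total sputter time must be positive")
    return crater_depth_nm / total_sputter_time_s


def depth_profile(times_s, counts, crater_depth_nm):
    """Pair each secondary-ion count with an estimated depth in nm.

    times_s must be increasing and end at the total sputter time.
    """
    rate = sputter_rate_nm_per_s(crater_depth_nm, times_s[-1])
    return [(t * rate, c) for t, c in zip(times_s, counts)]


# Hypothetical data: 100 s of sputtering produced a 200 nm deep crater.
profile = depth_profile([0, 50, 100], [1000, 400, 10], 200.0)
for depth_nm, count in profile:
    print(f"{depth_nm:6.1f} nm: {count} counts")
```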

    3.3.10 Fourier Transform Infrared Spectroscopy (FTIR)

    Fourier Transform Infrared Spectroscopy is an analytical

    technique used primarily to identify organic materials, such as

    solder flux contamination associated with a part failure. The

    FTIR reveals infrared absorption spectra that provide

    information about the chemical bonds and molecular structure

    of a material. The FTIR spectrum is like a "fingerprint" of the

    material; however, the fingerprint itself is not like a typical

    spectrum with known peaks for each element. When running

    an FTIR analysis, it helps to compare FTIR spectra to

    known samples as it can be difficult to determine the exact

    components of the material just from the spectra themselves.

    Cataloged FTIR spectra exist to help identify the materials.

    FTIR samples of the materials most suspected to be the culprit

    are often taken and then compared to the contamination

    sample's FTIR fingerprint. Unfortunately, most FTIR

    equipment requires a fairly large sample of the material in

    question, which is often not available with typical failures [4].
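
    The fingerprint comparison described above can be sketched as a
    similarity score between the unknown spectrum and each reference
    spectrum. This sketch assumes all spectra are absorbance values
    sampled on the same wavenumber grid and uses cosine similarity as
    the score; the metric, the names, and the numbers are illustrative
    placeholders, not measured data or the method of any particular
    FTIR package.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length absorbance vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must share the same wavenumber grid")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(unknown, references):
    """Return the best-scoring reference name and all scores."""
    scores = {
        name: cosine_similarity(unknown, spectrum)
        for name, spectrum in references.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical five-point spectra, for illustration only.
references = {
    "solder flux (reference)": [0.10, 0.80, 0.20, 0.05, 0.60],
    "silicone (reference)": [0.70, 0.10, 0.05, 0.90, 0.10],
}
unknown = [0.12, 0.75, 0.25, 0.04, 0.55]

name, scores = best_match(unknown, references)
print("closest reference:", name)
```

    A high score only suggests a candidate; confirmation still relies on
    comparing the peak positions by eye against the suspect material.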

    3.3.11 TEM (transmission electron microscopy) /

    STEM (scanning transmission electron microscopy)

    Transmission Electron Microscopy (TEM) and Scanning

    Transmission Electron Microscopy (STEM) use a high energy

    electron beam to image through an ultra-thin sample, thereby

    allowing for image resolutions on the order of 1 - 2

    Angstroms. S/TEM has better spatial resolution than a

    standard SEM and is capable of additional analytical

    measurements. However, S/TEM requires significantly more

    sample preparation, as samples need to be very thin and are

    typically prepared using FIB techniques.

    S/TEM provides outstanding image resolution, making it

    possible to characterize crystallographic phase and

    crystallographic orientation (both by diffraction mode

    experiments), produce elemental maps (using EDS), and

    generate images that highlight elemental contrast (dark field

    mode), all from nm-sized areas that can be precisely located

    [5].

    3.3.12 ESD Testing

    If a part is suspected to be damaged by Electrostatic

    Discharge (ESD), it is advisable to subject a known good part

    to ESD testing and compare the results to the failed device in

    question (Reference Mil-Std-883 Method 3015, JEDEC and

    ESD Association Std ANSI/ESDA/JEDEC JS-001-2010).
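
    When planning such a comparison test, it can help to estimate the
    stress a given test voltage applies. The sketch below assumes the
    idealized Human Body Model network (a 100 pF capacitor discharged
    through a 1.5 kOhm resistor) and ignores the device impedance and
    tester parasitics, so the results are rough estimates only. The
    function names are illustrative.

```python
HBM_CAPACITANCE_F = 100e-12   # 100 pF, idealized HBM capacitor
HBM_RESISTANCE_OHM = 1500.0   # 1.5 kOhm, idealized HBM series resistor


def hbm_peak_current_a(test_voltage_v):
    """Approximate peak current into a short circuit."""
    return test_voltage_v / HBM_RESISTANCE_OHM


def hbm_stored_energy_j(test_voltage_v):
    """Energy stored on the capacitor before discharge."""
    return 0.5 * HBM_CAPACITANCE_F * test_voltage_v ** 2


def hbm_time_constant_s():
    """RC decay time constant of the idealized network."""
    return HBM_RESISTANCE_OHM * HBM_CAPACITANCE_F


for voltage in (500.0, 2000.0, 4000.0):
    print(
        f"{voltage:6.0f} V: peak ~{hbm_peak_current_a(voltage):.2f} A, "
        f"energy ~{hbm_stored_energy_j(voltage) * 1e6:.1f} uJ"
    )
```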

    4. SUGGESTIONS FOR YOUR OWN FAILURE ANALYSIS CAPABILITIES

    This section provides some suggestions for establishing

    Failure Analysis capabilities for a typical electronics firm.

    Three levels of Failure Analysis capabilities are suggested:

    Basic, Moderate and Advanced. Beyond these three levels,

    one might consider using commercial failure analysis

    laboratories for the more esoteric capabilities such as TEM,

    STEM or SIMS. Usually it is more cost effective to

    subcontract those types of analyses than to establish the

    capabilities in-house.

    Basic Failure Analysis Lab

    Basic Meters (DVMMs)

    Stereo Microscope (10X to 30X)

    (Preferably with digital camera)

    Cross Sectioning Equipment

    Power Supplies / Signal generator

    Oscilloscope

    Moderately Equipped Failure Analysis lab

    SEM

    Curve Tracer

    Metallurgical Microscope (1000X)

    (Preferably with digital camera)

    Chemical hood with decapsulating chemicals

    Die Probe Station

    Liquid Crystal Film X-ray

    Advanced Failure Analysis lab

    Real Time X-ray

    SEM/EDS

    FIB

    Auger Analysis System

    RF Test Equipment (If necessary)
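
    The three lab levels above can be captured as a simple lookup, for
    example to check which level a planned analysis would require. This
    is only a sketch: the equipment names are shortened from the lists
    above, and the function name is illustrative.

```python
# Equipment lists transcribed from the three lab levels in this section.
LAB_TIERS = {
    "Basic": [
        "Basic Meters",
        "Stereo Microscope",
        "Cross Sectioning Equipment",
        "Power Supplies / Signal Generator",
        "Oscilloscope",
    ],
    "Moderate": [
        "SEM",
        "Curve Tracer",
        "Metallurgical Microscope",
        "Chemical Hood",
        "Die Probe Station",
        "Liquid Crystal Film X-ray",
    ],
    "Advanced": [
        "Real Time X-ray",
        "SEM/EDS",
        "FIB",
        "Auger Analysis System",
        "RF Test Equipment",
    ],
}


def minimum_tier(equipment):
    """Return the lowest lab level that lists the equipment, or None."""
    wanted = equipment.strip().lower()
    for tier, items in LAB_TIERS.items():  # dicts keep insertion order
        if any(item.lower() == wanted for item in items):
            return tier
    return None  # not listed, e.g. TEM or SIMS, which the text suggests outsourcing


print(minimum_tier("FIB"))
print(minimum_tier("TEM"))
```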

    5. UNDERSTANDING ELECTRONIC PART FAILURE MECHANISMS [6]

    Excerpts from the 1997 Alan O. Plait Award for Tutorial

    Excellence

    This section describes failure mechanisms commonly

    encountered with electronic parts. Figure 7 illustrates three

    common part styles: a Transistor, a Hybrid, and an Integrated

    Circuit (IC). The Hybrid contains multiple devices, including

    resistors and capacitors, along with semiconductors and ICs.

    Figure 7. Typical Transistor, Hybrid and IC

    In this section, examples of failures specific to each part

    type are reviewed, with guidelines to help choose the most

    effective corrective action. There are five subjects covered:

    Interconnects

    Semiconductor elements

    Passive elements

    Substrates

    Packages.

    5.1 Interconnects

    Interconnects within components connect circuit elements

    and substrates to each other and to the device package. Wire

    bonding is used to electrically connect circuit elements to

    substrates, to package pins, and to other circuit elements

    within a package. Soldering is used both to physically attach

    circuit elements to substrates or package headers and to

    physically attach substrates to package headers. It also

    provides a thermal path for heat dissipation. In many cases,

    soldering also serves to establish an electrical connection.

    Epoxy serves the same basic function as solder, to attach

    circuit elements to substrates or headers. Conductive epoxy is

    used in place of nonconductive epoxy when an electrical

    connection is also needed.

    5.1.1 Wire Bonding

    Wire bonding in microelectronics is generally performed

    in one of two ways: thermo-sonic ball and stitch bonding or

    ultrasonic wedge bonding. In thermo-sonic wire bonding, fine

    gold wire (typically 1 mil diameter) is used on a heated stage

    (~150°C). A ball is formed at the end of the wire via an

    electric arc (older machines used a hydrogen gas flame) and

    the ball is bonded to the contact bond pad by the heat of the

    stage and by the force and ultrasonic energy applied by the wire

    bonding machine capillary. This is called a ball-bond. The

    capillary is then raised and moved to the next bonding site

    where temperature and pressure form another bond (called a

    stitch bond). In ultrasonic bonding, aluminum wire is

    generally used. There is no heated stage used in this process

    and the pressure of the wire bonding machine on the wire is

    incidental. Most of the energy is supplied by high-frequency

    acoustical movement of the wire against the bonding area.

    This energy is sufficient to break through the oxides

    surrounding the wire or bonding surface. The wire is cut

    instead of being flamed off.

    The reliability of a wire bond using any of these methods

    is affected by bond placement, wire dress, bonding energy,

    bonding temperature, bondability of the surface, and any

    dissimilar metals used.

    Incorrect bond placement on a bonding pad can result in

    shorts to nearby metallization tracks. This can also result from

    using a wire of too large a diameter for the bonding target. Wire

    dress refers to how wire bonds are routed and to the amount of

    stress relief used in the wire. Improper routing can cause wire

    bonds to short to other wire bonds or to conductors in a

    package. Insufficient stress relief can cause wires to break or

    lift off of the bond pad during thermal excursions. Excessive

    stress relief can allow a wire bond to short to the lid of the

    package.

    Bonding energy is the amount of energy used to form the

    bond. In ultrasonic wire bonding, excessive bonding energy

    (ultrasonic) can result in an unacceptable thinning of the wire

    at the heel or in microcracking in the underlying silicon. This

    could lead to a break in the wire at the heel or a chipout at the

    bond pad. In thermo-sonic wire bonding, too much pressure

    can deform the ball and cause damage to the bond pad.

    Insufficient bonding energy can cause weak bonds with all

    technologies. The bonding temperature is important in

    thermo-sonic bonding. If the bonding temperature is too low, a

    weak bond may result. The use of dissimilar metals, usually

    gold and aluminum, can also be a source of failures. While the

    formation of gold/aluminum intermetallics is necessary to

    form a metallurgical bond between the two metals, voiding at

    the intermetallic sites (Kirkendall voiding) can cause high

    electrical resistance and low mechanical strength. Bondability

    refers to the ability of the two bonding surfaces to form a good

    bond. Contamination by foreign substances, incomplete

    photoresist removal, incomplete oxide removal, or incomplete

    nitride removal all affect bondability. This may result in the

    inability to form a bond or in a weak, highly resistive bond

    that will eventually fail. Contamination can greatly increase

    the formation of Kirkendall voids in a bimetallic system.

    5.1.2 Soldering

    Soldering is used in microelectronic parts to attach circuit

    elements to a substrate or a package header and substrates to

    package headers. Eutectic bonding, the attachment of circuit

    elements to a package header or substrate using a eutectic

    material system, will also be discussed in this section. The

    eutectic composition of a material system (if there is one) is

    the composition of elements that gives the lowest melting

    temperature. The most common eutectic attachment system

    used in microelectronics is the gold/silicon system, which

    melts at about 370°C. Die attach serves three basic functions

    in a part: it physically attaches the circuit elements to a

    substrate or header, it provides a thermal path for heat

    dissipation, and in many cases, provides an electrical

    connection for the circuit. The optimum die attach would have

    100% of the die's underside in contact with the header or

    substrate. In reality, due to either surface irregularities (die,

    substrate), a die attachment process problem, or

    contamination, the die attach usually contains some voiding.

    The voids interrupt the thermal path used to remove the heat

    from the die. Depending on the severity of the voiding and the

    power dissipation in the die, the die may fail from

    overheating. In extreme cases, poor die attach can result in an

    electrically open condition and the die breaking free of the

    header or substrate (refer to Figure 8).
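
    The severity of die attach voiding is often expressed as the
    fraction of the die area that is voided. The sketch below computes
    that fraction from a binary map in which 1 marks a voided cell. How
    such a map is obtained (for example by thresholding an X-ray image)
    is an assumption outside this sketch, and no acceptance limit is
    implied.

```python
def void_fraction(void_map):
    """Fraction of cells under the die that are marked as voided.

    void_map is a list of rows; a truthy cell means "void".
    """
    total_cells = sum(len(row) for row in void_map)
    if total_cells == 0:
        raise ValueError("void map is empty")
    voided_cells = sum(1 for row in void_map for cell in row if cell)
    return voided_cells / total_cells


# Hypothetical 4 x 4 map of the die attach area (1 = void).
example_map = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(f"voided area: {100 * void_fraction(example_map):.1f}%")
```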

    Substrate attach using solder is similar to die attach with

    solder. Various active and passive elements are bonded to a

    substrate that is then soldered to a package. Substrate attach

    affords the substrate the same benefits that die attach affords

    the die in that it provides the substrate with physical

    attachment, a thermal path, and in some cases, an electrical

    path. Voiding in the substrate attach solder is a major

    concern.

    Corrosion of indium solder joints, used for their ductile

    property, can occur when subjected to high humidity

    environments. Therefore, it is important to assemble the

    device in a dry environment and ensure it is contained in a

    hermetically sealed package.

    Indium and gold solder joints also form extremely brittle

    intermetallics when exposed to temperatures above 70 to 80°C,

    under humid or dry conditions.

    Figure 8. Poor Die Attachment

    5.1.3 Epoxy

    Epoxy can be used instead of solder in many

    microelectronic part assembly processes. Epoxies, both

    conductive (usually silver filled) and nonconductive, can be

    applied to accomplish die attach and/or substrate attach and

    have become more popular as the quality of micro-electronic

    grade epoxies has improved.

    Conductive epoxy is selected when an electrical

    connection is also required. The advantages of using epoxy

    include ease of application, low temperature curing, and

    reworkability. Epoxies do, however, display several failure

    mechanisms. Improperly cured epoxy can outgas inside a

    hermetic package after it has been sealed, releasing moisture

    and ionic contaminants into the internal cavity of the package.

    Because of their inherent charge, these ionic contaminants

    may shift the electrical parameters of electronic devices in the

    package. This is of particular concern when Metal Oxide

    Semiconductor (MOS) devices are present. Adhesive ionic

    contaminant issues can be mitigated by selecting epoxies that

    meet Mil-Std-883 Method 5011 requirements. Poor adhesion

    of an epoxy to either the die or the substrate is another failure

    mechanism for epoxy. This type of failure is usually caused by

    improper cleaning or abrading of either joining surface.

    If stable electrical resistance of the attachment is critical

    to circuit performance, conductive epoxy may not be the best

    choice as earlier formulations exhibited changes in the

    electrical resistance over time. The resistance can also be affected by

    factors such as temperature and humidity. Electrolytic

    corrosion can occur in silver filled conductive epoxy when

    sufficient moisture is present in a package. The silver from the

    epoxy is corroded by the moisture and by other substances in

    the epoxy. It can then be transported under the influence of an

    electric field in the package and cause shorting to adjacent

    metallization tracks or components.

    5.2 Semiconductor Elements

    Semiconductor elements include discrete diodes, discrete

    transistors, and integrated circuits. The semiconductor

    elements can be packaged individually or grouped together in

    a hybrid configuration. Semiconductor element failures can be

    broken down into three categories: metallization failures,

    oxide failures, and failures induced by overstress.

    5.2.1 Metallization

    Metallization on a semiconductor element is a thin film

    pattern of metal deposited on a chip to connect electronic

    components contained on the chip or to establish contacts that

    may be connected externally. Metallization failures generally

    result in electrical opens, although shorts may also be

    experienced. Metallization failures can be divided into the

    following specific categories: step coverage, electromigration,

    misalignment, corrosion, mechanical damage, and stress

    voiding.

    Step coverage on a semiconductor element refers to the

    thickness of a material deposited on an area with an uneven

    topography. A change in the vertical direction is called a step.

    Thinning of the metallization (usually aluminum) over a step

    is allowed down to 50% of the metal thickness over a flat

    area. If step coverage is poor (less than 50%), open circuits

    can result. Modern ICs have multilayer planarized

    metallization which eliminates many of the step issues.
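
    The 50% step coverage criterion above reduces to a simple ratio of
    the metal thickness measured over the step to the thickness over a
    flat area (for example, from a SEM or FIB cross-section). A minimal
    sketch, with illustrative function names and sample thicknesses:

```python
def step_coverage_percent(thickness_over_step, thickness_on_flat):
    """Metal thickness over a step as a percentage of the flat thickness.

    Both thicknesses must be in the same unit.
    """
    if thickness_on_flat <= 0:
        raise ValueError("flat-area thickness must be positive")
    return 100.0 * thickness_over_step / thickness_on_flat


def step_coverage_acceptable(thickness_over_step, thickness_on_flat,
                             minimum_percent=50.0):
    """Apply the 50% criterion described in the text."""
    coverage = step_coverage_percent(thickness_over_step, thickness_on_flat)
    return coverage >= minimum_percent


# Hypothetical measurements in micrometers.
print(step_coverage_acceptable(0.50, 1.00))
print(step_coverage_acceptable(0.25, 1.00))
```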

    Electromigration of metal results in an open circuit

    condition. Electromigration is caused by a thermal activation

    of aluminum ions that are physically moved by momentum

    exchange with flowing electrons. Electromigration failures

    are a function of the current density in an aluminum conductor

    and its temperature. Usually, design rules preclude this

    current density from being exceeded. Mil-Prf-38535, for

    example, specifies that the current density for glassivated

    aluminum metallization shall not exceed 5 x 10^5 A/cm^2 for case

    operating temperatures up to 125°C. Defects in the

    metallization, such as poor step coverage or voiding, can

    allow localized areas of current constriction to occur.

    Misapplication of a device in a circuit can also lead to

    excessive current densities. Misaligned metallization on an

    integrated circuit can result in poor contact to active circuit

    elements or to other metallization levels. This type of defect

    is caused by poorly aligned masks during fabrication. Failures

    in the form of opens can result from this defect.
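
    To check a conductor against the current density limit quoted above
    for glassivated aluminum, divide the current by the conductor
    cross-section. The sketch below assumes a rectangular cross-section
    and uniform current flow; the function names and the example
    dimensions are illustrative. Note that a local defect that thins or
    narrows the line raises the density at that spot.

```python
LIMIT_A_PER_CM2 = 5e5  # glassivated aluminum limit quoted in the text

CM_PER_UM = 1e-4  # 1 micrometer = 1e-4 cm


def current_density_a_per_cm2(current_a, width_um, thickness_um):
    """Current density in a conductor of rectangular cross-section."""
    area_cm2 = (width_um * CM_PER_UM) * (thickness_um * CM_PER_UM)
    if area_cm2 <= 0:
        raise ValueError("conductor dimensions must be positive")
    return current_a / area_cm2


def exceeds_limit(current_a, width_um, thickness_um, limit=LIMIT_A_PER_CM2):
    """True if the computed current density is above the limit."""
    return current_density_a_per_cm2(current_a, width_um, thickness_um) > limit


# Hypothetical line: 2 um wide, 1 um thick.
for current_ma in (5.0, 20.0):
    density = current_density_a_per_cm2(current_ma / 1000.0, 2.0, 1.0)
    status = "exceeds" if density > LIMIT_A_PER_CM2 else "within"
    print(f"{current_ma:4.1f} mA -> {density:.2e} A/cm^2 ({status} limit)")
```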

    Corrosion of aluminum metallization is another failure

    mechanism. Corrosion can occur due to the introduction of

    contaminants during processing or due to moisture penetrating

    into the cavity of a non-hermetic package. Aluminum bond

    pads are especially susceptible because they are not

    passivated. Corrosion can also occur if moisture is

    inadvertently sealed in a hermetic package.

    Mechanical damage to metallization can be introduced

    during probing or handling. This is especially true in hybrid

    microcircuits, which are exposed to a large number of

    assembly steps. Mechanical damage to metallization can

    result in shorts or opens. Stress voiding is a relatively new

    [...]

    include poor adhesion, cracking, EOS, and ESD due to their

    thin film nature.

    5.4 Substrates

    Substrates are used in microelectronics, particularly when

    manufacturing hybrids, to mount circuit elements onto and to

    make electrical interconnections. Substrates, typically formed

    out of a ceramic material, save space inside a package and

    reduce its weight. Substrates can fail from several different

    mechanisms as discussed below.

    5.4.1 Cracking

    Cracking in a substrate can cause a failure if the substrate

    crack propagates through a metallization stripe. A crack in a

    substrate can also propagate through an attached component (a

    die, for example) causing the component to fail. Cracks in a

    substrate can be caused by a thermal coefficient of expansion

    mismatch between a substrate and a package header. They can

    also be introduced by mechanical damage.

    5.4.2 Metallization

    Metallization failures, which were shown to occur on

    semiconductor die, also occur on substrates. Lifting of the

    metallization from the substrate can occur, usually resulting

    from an improperly cleaned substrate prior to metallization

    application. Poor metallization coverage is also a failure

    mechanism. Leaching of gold metallization into solders can

    also occur if the proper barrier metals are not used.

    5.4.3 Multilayer Substrates

    Multilayer substrates (substrates with two or more levels

    of metallization) suffer from the same failure mechanisms as

    single layer substrates, with two additions. Incomplete via fills

    (a via is an internal connection between two metallization

    layers) occur during substrate fabrication and result in open

    circuits. Shorts between metallization layers also happen

    during substrate fabrication.

    5.5 Packages

    Packages physically protect circuit elements from the

    external environment. They also allow for electrical

    connection to other packages in an electrical system. The

    failure of a package to protect its internal components from

    the external environment can result in device failure. Package

    failures can be classified as hermeticity failures, insulation

    resistance failures, or failures caused by loose particles within

    the package.

    5.5.1 Hermeticity

    Microelectronics packages are either hermetic or

    nonhermetic. Hermetic packages effectively seal the internal

    components from the external atmosphere. Nonhermetic

    packages (plastic packages) allow outside air to penetrate the

    package.

    Moisture can lead to many forms of corrosion inside a

    package and is one of the most important contaminants to seal

    out. Hermetic seals require the use of some combination of

    metal, glass, and/or ceramic in the package seal. Devices that

    fail hermeticity are referred to as fine leakers or gross leakers.

    A fine leak is defined as a leak rate that is greater than 1 x 10^-7

    atm cc/sec (however this rate does depend on the package

    volume). A gross leak is any leak rate greater than 1 x 10^-5

    atm cc/sec, usually detectable by looking for bubbles from a

    package while immersed in a hot fluorocarbon.
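
    The leak rate definitions above can be expressed as a small
    classifier. In this sketch the fine leak threshold is a parameter
    because, as noted, the applicable rate depends on package volume;
    the function name and the sample leak rates are illustrative.

```python
GROSS_LEAK_THRESHOLD = 1e-5  # atm cc/sec, from the text
DEFAULT_FINE_LEAK_THRESHOLD = 1e-7  # atm cc/sec; varies with package volume


def classify_leak(leak_rate_atm_cc_per_s,
                  fine_threshold=DEFAULT_FINE_LEAK_THRESHOLD):
    """Classify a measured leak rate as 'gross leak', 'fine leak', or 'pass'."""
    if leak_rate_atm_cc_per_s > GROSS_LEAK_THRESHOLD:
        return "gross leak"
    if leak_rate_atm_cc_per_s > fine_threshold:
        return "fine leak"
    return "pass"


# Hypothetical measurements.
for rate in (5e-9, 5e-7, 1e-4):
    print(f"{rate:.0e} atm cc/sec: {classify_leak(rate)}")
```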

    5.5.2 Insulation Resistance

    Insulation resistance between package pins and leads must

    be maintained for a device to function properly.

    Contamination on the exterior of a package can cause the

    insulation resistance to fail. Leaching of lead from the glass

    sealing material has historically caused insulation resistance

    failures.

    5.5.3 Loose Particles

    Loose particles inside a package can cause a failure. This

    is especially true if the particles are conductive. A loose

    conductive particle in a package can cause a failure by

    creating a short between other conductors inside the package.

    Particle Impact Noise Detection (PIND) testing is used to

    detect loose particles inside a package. Radiographic

    examination can then be used to verify the size and density of

    the particle before the device is delidded.

    6. FAILURE ANALYSIS CASE STUDY

    This section discusses the failure analysis performed on a

    hermetically packaged integrated circuit (multiplexer). The

    failure was caused by corrosion within the package. This

    example presents the types of problems that are encountered

    and how proper failure analysis can help implement effective

    corrective action.

    6.1 Mux Failure Analysis

    A multiplexer Integrated Circuit (Mux IC) failure was

    first discovered during a system level electrical test. The

    microcircuit used a standard high reliability package design

    consisting of a ceramic housing with a hermetic seal. Prior to

    the failure, the multiplexer was exposed to multiple

    temperature performance and environmental screening tests at

    the component and board level assembly. It was not until

    integration at a higher level assembly that an anomaly arose.

    The initial trouble shooting quickly isolated the problem to the

    multiplexer. At the time, the anomalous behavior was seen

    only during electrical testing below 0°C. After careful

    assessment of the part as installed on the board, it was

    removed for further investigation. The part was photographed

    and leak checked as a normal course of action. It passed the

    fine and gross leak checks. The part was then retested

    electrically at low temperature to demonstrate the issue was

    reproducible at the component level. The next step was

    performing real-time X-ray, which revealed a possible open

    circuit (refer to Figure 9).

    After the part exhibited similar anomalous behavior, it was

    delidded for internal inspection. The inspection revealed

    significant corrosion (refer to Figure 10).

    The corrosion primarily attacked the wire bond for Vcc,

    which brings in external DC power. The interconnect was

    degraded to the point of being intermittent over temperature.

    Such corrosion is typically caused by moisture trapped in

    packages prior to seal. Moisture can react with residual

    plating salts and cause significant corrosion between

    interconnecting joints, especially in the presence of an electric

    field.

    Figure 9. Real-time X-ray of Multiplexer IC Revealed Possible

    Corrosion

    Figure 10. Multiplexer IC After Lid Removal revealing

    corrosion on Pin 13

    Military-Standard packages require Residual Gas

    Analysis Test (RGA) as a qualification for low moisture

    content (i.e.