Fault Localization in ICs by Lock-in Thermal Imaging Techniques


Author: Aron Virginas-Tar

Master of Information Technology, Year 1

Topic: Thermal Monitoring (No. 27)

1. Introduction

Thermographic cameras detect infrared (IR) radiation in the 9–14 µm range and produce images of this radiation, called thermograms. Planck's black-body radiation law implies that all objects above absolute zero emit infrared radiation and that the amount of radiation emitted by an object increases with its temperature. Hence, thermography allows us to visualize and monitor temperature variations.

Thermal imaging techniques have been used for fault localization in integrated circuits (ICs) for more than two decades. Such techniques can be effective in detecting defects that lead to local heat generation, including oxide breakdown, latch-up, electrostatic discharge (ESD) damage or metallization shorts. [3] Because of the highly complex circuit structure, these defects usually cannot be detected electrically. [2] If the positions of the heat sources can be localized with sufficient precision, a surface inspection by scanning electron microscopy (SEM) or a focused ion beam (FIB) analysis may reveal the physical origin of the faults. [3]

There are several approaches to thermographic fault localization in ICs. They are presented briefly in the following subsection, without any claim to completeness. This article then analyzes and compares two techniques that rely on the lock-in correlation procedure to reduce statistical noise and improve accuracy.

1.1. Thermal Fault Detection Techniques

The most common thermal imaging techniques include liquid crystal microscopy (LCM), fluorescent microthermal imaging (FMI), infrared thermography (IRT) and Schlieren imaging. Each of these techniques has its inherent advantages and disadvantages.

LCM requires a light microscope with crossed polarizers. The temperature of the sample has to be precisely stabilized or varied very slowly. Only at a certain well-defined sample temperature do local heat sources appear dark in the image.
Hence, performing this technique and correctly interpreting its results requires some experience on the part of the operator. The temperature resolution of LCM is usually given as 0.1 K and the spatial resolution as 1 µm.

FMI, on the other hand, is much more straightforward to apply and interpret. The technique relies on UV stimulation of a fluorochrome (fluorescent dye) coating applied to the sample. Using a light microscope with a built-in UV source, FMI allows the detection of heat-generating fault sites on electrically driven ICs. [2] By averaging the image data over many images and over a number of neighboring pixels, a thermal resolution limit as low as 0.006 K was reported at a spatial resolution of 15 µm. However, the nominal spatial resolution of FMI may be below 1 µm (with a correspondingly higher thermal resolution limit). One limitation of FMI is the inevitable UV-induced bleaching of the dye, which limits the usable exposure time and may lead to artifacts in the image.

For both LCM and FMI, a foreign layer has to be applied to the surface. Hence, these techniques cannot be used in in-circuit wafer testers. Furthermore, neither technique can be used if the chips are mounted face-down (flip-chip technology). Backside thermal imaging is, however, possible with IRT and Schlieren imaging. [3] Another disadvantage of FMI is that its sensitivity limit for dissipated power is only about 10 mW, which makes it practical only for less integrated ICs and power components. FMI images also show a strong topographical contrast, which is caused by surface structure effects of the fluorochrome-coated sample and usually dominates the thermal contrast. [2]

IRT is based on the detection of thermally emitted radiation by an IR camera and can be viewed as the classical thermal imaging technique. Nevertheless, it has two important limitations: its spatial resolution is limited to about 5 µm by the wavelength range it uses, and it shows a strong IR emissivity contrast (ε-contrast). The emissivity contrast arises because the intensity of the thermally emitted radiation equals that of a black body at the same temperature, multiplied by the IR emissivity ε of the surface. By Kirchhoff's law, the emissivity equals the absorptance of the surface and is usually low (ε < 0.1) for highly reflecting metallized surfaces. Therefore, these regions appear dark in the IR image. Although silicon is quite transparent to IR radiation, free-carrier absorption may lead to a considerable IR emissivity of the silicon surface.
Thus, even for an unpowered IC with a homogeneous surface temperature, the IR image is characterized by a strong ε-contrast between metallized and non-metallized regions.

Schlieren imaging relies on the angular deflection of parallel light near a local heat source, which is caused by the temperature dependence of the refractive index n of the material. This technique can only be applied with a special optical setup in reflection mode from the backside of an IC, or if the surface is coated with a special diffracting layer. Schlieren imaging is thus generally considered impractical for fault detection in ICs and has only rarely been used for this purpose. It has a thermal resolution limit of 0.01 K and a spatial resolution of 1 µm. [3]

The detection limit of the thermal methods discussed above lies in the range of 10–100 mK. Decreasing operating voltages, multiple metallization layers and new package technologies complicate the use of these standard techniques. In [1], the authors propose a lock-in thermography technique that lowers the thermal detection limit considerably, to ≈ 0.1 mK, and eliminates the ε-contrast. In [2], an improved FMI technique based on the same lock-in method is presented. The authors show that their technique substantially lowers the thermal resolution limit and eliminates the topographical contrast from the images.

2. Lock-in Thermography Technique

A general problem of the microscopic thermal imaging techniques described above is that they are steady-state techniques. Steady-state techniques are useful when the temperature of the material does not change with time. However, since silicon has a high thermal conductivity, heat produced locally in an IC spreads rapidly into the surroundings. This is why steady-state microscopic thermal images always appear blurred. So, even if the optical resolution in an FMI experiment is below 1 µm, the effective spatial resolution of this and other steady-state techniques may be poor for fault localization.

This situation changes if time-dependent heat sources are considered, since heat conduction is a time-dependent process. If the power is switched on at t = 0, initially only the immediate surroundings of the source heat up. As power continues to be dissipated, the heat gradually diffuses farther away from the source position, leading to the blurred appearance of thermograms. Therefore, if one observed only the temperature response immediately after switching on the heat source, the image would be much less affected by lateral heat spreading and would appear less blurred. In principle, this is what lock-in thermography, a technique presented in [1] and [3], achieves. Owing to its averaging nature, the technique can reach a thermal sensitivity below 0.1 mK. This permits the thermal investigation of many processes that previously went undetected for lack of sensitivity. Lock-in thermography also overcomes the emissivity contrast problem, so that even weak heat sources lying below metal layers become visible. The technique is also easy to apply, and its results are straightforward to interpret. Hence, in spite of its limited spatial resolution of about 5 µm, lock-in thermography is a promising tool for thermal failure analysis.

This section describes the basic principles and the technique of lock-in thermography. Key aspects, such as its spatial resolution, are also discussed.

Lock-in thermography was invented in 1984 and has been used extensively in non-destructive testing and thermo-elastic investigations. It is also a standard technique for investigating shunting phenomena in solar cell research. Lock-in thermography means that the power dissipated in the object under investigation is periodically amplitude-modulated and the resulting surface temperature modulation is imaged by a thermographic camera.
The generated IR images are then digitally processed according to the lock-in principle. [3] The lock-in signal treatment can be described as a multiplication of the detected signal $F(t)$ by a correlation function $K(t)$, which in our case is a symmetric square wave function. [4] The effect of lock-in thermography is the same as if each pixel of the IR image were connected to a two-phase lock-in amplifier. Consequently, the two primary results of lock-in thermography are the image of the in-phase signal $S^{0^\circ}$ and that of the out-of-phase (or quadrature) signal $S^{-90^\circ}$. Note that in lock-in thermography the −90° signal is often used instead of the +90° one, since the latter is essentially negative. From these two signals, the phase-independent amplitude image $A$ and the phase image $\Phi$ of the surface temperature modulation can be derived pixel by pixel as follows:

$$A = \sqrt{\left(S^{0^\circ}\right)^2 + \left(S^{-90^\circ}\right)^2} \tag{1}$$

$$\Phi = \arctan\left(\frac{-S^{-90^\circ}}{S^{0^\circ}}\right) \tag{2}$$
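The per-pixel processing behind equations (1) and (2) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes harmonic weighting factors (a sine for the in-phase channel, a negative cosine for the quadrature channel), a normalization factor of 2 chosen here so that the amplitude comes out in signal units, and a hypothetical test signal generated by `simulate_pixel`. All function names are invented for this example.

```python
import math


def lockin_correlate(samples, frames_per_period):
    """Correlate one pixel's frame values with sine / minus-cosine weights.

    Returns (s_0, s_m90): the in-phase (0 deg) and quadrature (-90 deg) signals.
    The factor 2 is a normalization assumed here so that a sinusoidal input
    of amplitude a gives hypot(s_0, s_m90) == a.
    """
    n = frames_per_period
    if n < 3 or len(samples) == 0 or len(samples) % n != 0:
        raise ValueError("need whole periods with at least 3 frames per period")
    s_0 = 0.0
    s_m90 = 0.0
    for j, value in enumerate(samples):
        angle = 2.0 * math.pi * (j % n) / n
        s_0 += 2.0 * math.sin(angle) * value      # channel 1: sine weights
        s_m90 += -2.0 * math.cos(angle) * value   # channel 2: minus-cosine weights
    return s_0 / len(samples), s_m90 / len(samples)


def amplitude_and_phase(s_0, s_m90):
    """Apply equations (1) and (2); atan2 keeps the correct quadrant."""
    amplitude = math.hypot(s_0, s_m90)
    phase_deg = math.degrees(math.atan2(-s_m90, s_0))
    return amplitude, phase_deg


def simulate_pixel(emissivity, delay_deg, amplitude=2.0, offset=300.0,
                   frames_per_period=8, periods=50):
    """Hypothetical pixel signal: a constant offset plus a delayed sinusoidal
    modulation, both scaled by the surface emissivity."""
    delay = math.radians(delay_deg)
    return [
        emissivity * (offset + amplitude * math.sin(
            2.0 * math.pi * (j % frames_per_period) / frames_per_period - delay))
        for j in range(frames_per_period * periods)
    ]


if __name__ == "__main__":
    for eps in (1.0, 0.1):
        s_0, s_m90 = lockin_correlate(simulate_pixel(eps, delay_deg=30.0), 8)
        print(eps, amplitude_and_phase(s_0, s_m90))
```

In this toy model the amplitude scales with the emissivity, whereas the phase stays the same for both emissivity values, because the common factor cancels in the ratio of equation (2). With the sign convention of equation (2), a delayed response appears as a negative phase.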

The amplitude image displays the local temperature modulation amplitude, while the phase image essentially describes the time delay of the local temperature modulation relative to the periodic power supply pulse. [1] Both the 0° and the −90° image are proportional to the power of the pulsed heat source, so in the amplitude image the contrast of a heat source is also proportional to its dissipated power. The phase image, on the other hand, is expected to be independent of the power of the heat source. It can be regarded as a measure of the time delay of the surface temperature modulation relative to the power modulation, which is independent of the magnitude of the modulated power (as long as the temperature fields of neighboring heat sources do not overlap). In addition, the phase image is independent of the IR emissivity ε of the surface, since ε scales both signals equally and cancels in the ratio of equation (2). Hence, lock-in thermography is effective in eliminating the ε-contrast. The phase image also provides a kind of dynamic compression, so that local heat sources with different powers are displayed with a similar signal height. These properties greatly simplify the interpretation of the results. [3]

2.1. Experiments. Results

This subsection describes the lock-in thermography technique and system used by the authors of [1] and presents the experimental results they obtained. The experiments are illustrated by the authors' images, and the results are interpreted.

The lock-in thermography system used by the authors to acquire the data presented below was the TDL 384 M "Lock-in" from Thermosensorik GmbH, Erlangen, Germany. The scheme of this system is shown in Figure 1. It is based on a highly sensitive, Stirling-cooled HgCdTe (mercury cadmium telluride, MCT) focal plane array (FPA) IR detector head with 384×288 pixels, made by AIM, Heilbronn, Germany. It detects in the 3–5 µm wavelength range at a full frame rate of up to 140 Hz. The digital image information is captured by a frame grabber board and transferred online by direct memory access (DMA) to the RAM of a PC.

Figure 1: Scheme of the TDL 384 M “Lock-in” thermography system

The PC performs the lock-in correlation and controls the hardware that generates the lock-in reference signal, which triggers the pulsed power supply. The detector head may be equipped with a standard 28 mm IR objective, providing a pixel resolution down to 30 µm, or with special microscope objectives, providing a pixel resolution down to 5 µm. The temperature noise level of this system decreases with the inverse square root of the acquisition time and reaches about 70 µK after 1000 s (≈ 17 min).
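To put the quoted system parameters into perspective, the short sketch below converts them into two derived quantities: the imaged field of view for a given pixel resolution, and the number of camera frames available per lock-in period at the full frame rate. It assumes that "pixel resolution" denotes the size of one pixel projected onto the sample; the function names are hypothetical, and the lock-in frequencies of 3 Hz and 20 Hz are the values used in the experiments reported in this section.

```python
SENSOR_PIXELS = (384, 288)    # detector format quoted in the text
FULL_FRAME_RATE_HZ = 140.0    # maximum full-frame rate quoted in the text


def field_of_view_mm(pixel_resolution_um, pixels=SENSOR_PIXELS):
    """Width and height of the imaged area in mm, assuming the pixel
    resolution is the pixel size projected onto the sample."""
    return tuple(n * pixel_resolution_um / 1000.0 for n in pixels)


def frames_per_period(lockin_frequency_hz, frame_rate_hz=FULL_FRAME_RATE_HZ):
    """Number of camera frames recorded during one lock-in period."""
    return frame_rate_hz / lockin_frequency_hz


if __name__ == "__main__":
    for resolution_um in (30.0, 5.0):
        print(resolution_um, "um/pixel ->", field_of_view_mm(resolution_um), "mm")
    for f_lockin in (3.0, 20.0):
        print(f_lockin, "Hz ->", frames_per_period(f_lockin), "frames per period")
```

Under these assumptions the microscope objective at 5 µm per pixel images an area of roughly 1.9 mm × 1.4 mm, and a 20 Hz lock-in frequency leaves 7 frames per period at the full frame rate.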

Figure 2: Principle of the lock-in correlation procedure

The digital lock-in correlation procedure performed by this system is explained in Figure 2. The pixel information of each incoming frame is multiplied in two channels by different sets of weighting factors, approximating a sin(t) function in channel 1 and a −cos(t) function in channel 2. After the results have been summed in two frame storages over many periods, storage 1 contains the in-phase (0°) image and storage 2 the quadrature (−90°) image. These can be converted into the amplitude and phase images using formulae (1) and (2).

Before [1], only amplitude images had been published for lock-in thermography on ICs. Figure 3 shows an IR topography image (always measured before a lock-in measurement) together with amplitude and phase images of an intact integrated circuit in static operation, with a supply of 16 V and 2.4 mA pulsed at 3 Hz and at 20 Hz. The topography image was measured with the supply voltage applied, but the emissivity contrast obscures the weak heat sources. In the amplitude images, the stationary topography contrast is compensated and the dominant heat sources can be located. However, as the amplitude image is still modulated by the ε-contrast, it is hard to decide whether some bright features are weak local heat sources or regions of high IR emissivity. The phase image is entirely free of emissivity contrast, and individual heat sources are clearly visible. The influence of the lock-in frequency on the extent of the inevitable halos around heat sources can be seen in both the amplitude and the phase images. The dynamic compression property of the lock-in technique is also clearly observable in the phase image, as heat sources of different power appear with similar brightness.

Topography and phase images from the investigation of three positions in an 8-bit microprocessor IC are shown in Figure 4. This IC was working in dynamic operation with the internal 12 MHz clock generator running, but no ROM was connected. The supply voltage of 5 V was pulsed at 20 Hz. The authors investigated an intact IC consuming a current of about 8 mA and a defective IC drawing about 50 mA at 5 V. The experiments were performed from the opened backside of the chip, as required for a flip-chip configuration. The heat sources in images a) and b) of Figure 4, with maximum amplitudes of about 10 mK and <1 mK respectively, belong to the normal operation of this IC. They are found both in the intact IC and in the defective one. The dominant heat source in image c) of Figure 4, however, is the fault location. This heat source shows an amplitude of about 1 K. In an intact sample, no measurable heat is produced at this position. The acquisition times for Figure 4 a) and b) were 2 and 20 min respectively, whereas c) was recorded within a few seconds.

Figure 3: Topography, amplitude and phase images of an IC in pulsed static operation for two lock-in frequencies

Figure 4: Topography and phase images of 3 positions in an 8-bit microprocessor clocked at 12 MHz, supply voltage pulsed at 20 Hz

The last example, shown in Figure 5, demonstrates the high sensitivity of the proposed lock-in thermography technique. The sample was a CMOS device damaged by ESD pulses, which showed a leakage current of about 24 µA at an applied bias of 5.5 V, corresponding to a dissipated power of 132 µW. The device was measured under these conditions with a lock-in frequency of 3 Hz for an acquisition time of 7 min. The maximum measured signal amplitude at the fault position was about 500 µK, and the signal-to-noise ratio (SNR) was about 4. Hence, the noise level of this measurement was on the order of 125 µK. Since the noise level decreases with the inverse square root of the acquisition time, a heat source of the same geometry with a power of 45 µW could be detected with the same SNR after an acquisition time of 1 h.
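The arithmetic behind this sensitivity estimate can be checked with a short Python sketch. It assumes, as stated in Section 2, that the signal amplitude is proportional to the dissipated power, and that the noise scales with the inverse square root of the acquisition time; the function and variable names are hypothetical and introduced only for this example.

```python
import math


def scale_with_acquisition_time(value_ref, t_ref, t):
    """Scale a noise level (or a minimum detectable power at fixed SNR)
    from acquisition time t_ref to acquisition time t, assuming a
    1/sqrt(t) dependence. Both times must use the same unit."""
    if t_ref <= 0 or t <= 0:
        raise ValueError("acquisition times must be positive")
    return value_ref * math.sqrt(t_ref / t)


# Values quoted in the text for the 7 min measurement
signal_uK = 500.0
snr = 4.0
noise_7min_uK = signal_uK / snr        # 125 µK
power_7min_uW = 24.0 * 5.5             # 24 µA × 5.5 V = 132 µW

# Extrapolation to 1 h (60 min) of acquisition time
noise_60min_uK = scale_with_acquisition_time(noise_7min_uK, 7.0, 60.0)
power_60min_uW = scale_with_acquisition_time(power_7min_uW, 7.0, 60.0)

if __name__ == "__main__":
    print(f"noise after 1 h: {noise_60min_uK:.1f} µK")
    print(f"detectable power after 1 h at SNR {snr:g}: {power_60min_uW:.1f} µW")
```

The factor sqrt(7/60) ≈ 0.34 reduces the noise level to roughly 43 µK and the detectable power to about 45 µW, in agreement with the value given in the text.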

Figure 5: Topography and amplitude image of a defect in an IC dissipating 132 µW, measured at 3 Hz with an acquisition time of 7 min

The authors performed about 40 fault localizations on different IC types by IR lock-in thermography, all of them successful. In some cases the fault origin was verified by FIB cross-section preparation of the fault region followed by transmission electron microscopy (TEM) failure analysis. The results show that a fault causing a current increase of more than 10 mA can usually be localized by this technique with a spatial accuracy of 5–10 µm.

3. Lock-in Fluorescent Microthermal Imaging (FMI) Technique

In [2], the authors propose an improved version of the FMI thermal fault localization technique. It is based on the lock-in method presented earlier and uses specially developed hardware. This technique is also outlined in [1], alongside lock-in thermography. It can be applied to localize leakage currents in microelectronic devices and offers a spatial accuracy of about 1 µm.

Figure 6: Scheme of the FMI assembly

Lock-in FMI, like lock-in thermography, means that a periodically pulsed supply voltage (instead of a DC voltage) is applied to the sample device. The technique exploits the natural 100 Hz flickering of the UV lamp of the FMI optical microscope. The lock-in FMI system developed by the authors operates with a conventional light microscope and a low-noise, cooled slow-scan camera with a resolution of 1280×1024 pixels and 256 grey levels. The UV light that stimulates the fluorochrome is generated by a compact array of 20 mW UV diodes. Figure 6 shows the scheme of a generic FMI system. Because the fluorescence is weak, long exposure times (up to 1000 ms) are necessary. Using special hardware, the phase between the UV light pulses and the supply voltage pulses is shifted by 90° for successively captured images, so that four consecutive fluorescence images correspond to phase differences between light and bias pulses of 0°, 90°, 180° and 270°. Figure 7 illustrates the phase relations between supply voltage and UV stimulation. [3] From these four images, with intensities $I^{0^\circ}$, $I^{90^\circ}$, $I^{180^\circ}$ and $I^{270^\circ}$, the amplitude signal $A$ and the phase signal $\Phi$ are calculated for every pixel position by the following formulae:

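$$A = \sqrt{\left(I^{0^\circ} - I^{180^\circ}\right)^2 + \left(I^{90^\circ} - I^{270^\circ}\right)^2} \tag{3}$$

$$\Phi = \arctan\left(\frac{I^{90^\circ} - I^{270^\circ}}{I^{0^\circ} - I^{180^\circ}}\right) \tag{4}$$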

This can be viewed as a heterodyne technique, since the fluorescence signal is the product of the UV intensity and the quantum efficiency of the dye, the latter being temperature-dependent. Therefore, the total amount of light, averaged over many periods, contains amplitude and phase information about the periodic temperature modulation at 100 Hz. Because (3) and (4) use only differences between pairs of values shifted by 180°, the part of the fluorescence signal that is not affected by the temperature modulation cancels out completely. By averaging the result over an appropriate number of lock-in periods, the SNR of the lock-in measurement improves in proportion to the square root of the acquisition time and may become orders of magnitude better than the SNR of the primary temperature measurement. The main advantage of this technique over steady-state FMI is that it substantially lowers the minimum detectable heat generation and eliminates the topographical contrast from the images.
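The cancellation of the unmodulated background in equations (3) and (4) can be illustrated with a minimal Python sketch for a single pixel. The intensity model used in `simulate_images` (a constant background plus a cosine-shaped dependence on the phase difference) is an assumption made for this illustration only, and the function names are hypothetical.

```python
import math


def four_phase_lockin(i_0, i_90, i_180, i_270):
    """Amplitude and phase (in degrees) from four images taken at phase
    differences of 0, 90, 180 and 270 degrees, per equations (3) and (4).
    atan2 is used instead of arctan to keep the correct quadrant."""
    in_phase = i_0 - i_180
    quadrature = i_90 - i_270
    amplitude = math.hypot(in_phase, quadrature)
    phase_deg = math.degrees(math.atan2(quadrature, in_phase))
    return amplitude, phase_deg


def simulate_images(background, modulation, phase_deg):
    """Hypothetical pixel intensities: background + modulation * cos(step - phase)."""
    phi = math.radians(phase_deg)
    return [background + modulation * math.cos(math.radians(step) - phi)
            for step in (0, 90, 180, 270)]


if __name__ == "__main__":
    for background in (1000.0, 50.0):
        images = simulate_images(background, modulation=3.0, phase_deg=40.0)
        print(background, four_phase_lockin(*images))
```

Both background values give the same amplitude and phase, since the background drops out of each difference. Note that with equation (3) as written, the amplitude in this model equals twice the modulation depth.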

Figure 7: Phase relations between supply voltage and UV stimulation

3.1. Experiments. Results

The authors experimented with FMI analysis at a supply voltage of 6.5 V and a current of 300 µA. Figure 8 shows the result of lock-in FMI applied to an NMOS test structure with an oxide breakdown, leading to a leakage current of 6.5 mA at 5 V. The topography image (a single fluorescence image) was captured with a bias applied to the structure and shows a weak quenching of the fluorescence at the fault position. After an acquisition time of less than 10 min at 400× magnification, the amplitude image clearly shows the point-like fault position as well as a resistively heated line. The pixel resolution in these images is 0.5 µm, while the spatial resolution of the lock-in FMI technique is about twice this value. The physical cause of the fault was determined by TEM failure analysis, as in the case of the previous technique.

Figure 8: Topography (fluorescence) image and lock-in FMI amplitude image of an NMOS test structure containing an oxide breakdown

4. Comparison

The authors' experiments show that both lock-in thermal methods presented in this article are effective in improving the SNR. Table 1 compares the main characteristics of the two techniques. The spatial resolution of lock-in FMI is much better than that of lock-in thermography, which is limited to about 5 µm by the IR wavelength range of 3–5 µm. The thermal resolution of the lock-in thermography technique is about 0.1 mK, which is considerably better than that of the steady-state thermographic techniques presented in Section 1.1. The authors do not give a concrete value for the thermal resolution of the lock-in FMI technique, but they consider it unsatisfactory. However, they state that it can be improved by further research. Both techniques are easy to apply and effective in eliminating undesired contrast (emissivity contrast in the case of lock-in thermography and topographical contrast in the case of lock-in FMI). Lock-in thermography also has the advantage of being applicable in flip-chip configuration.

Characteristic                          Lock-in thermography   Lock-in FMI
Thermal resolution                      0.1 mK                 ?
Spatial resolution                      5 µm                   1 µm
Applicable in flip-chip configuration   yes                    no
Eliminates contrast                     yes                    yes
Easy to apply and interpret results     yes                    yes

Table 1: Comparison between lock-in thermography and lock-in FMI

5. Summary and Conclusions

The authors' comparison of different thermal fault detection techniques has shown that lock-in thermography is able to detect practically all defects that are also visible in light emission microscopy (LEM). Moreover, the considerably lower thermal resolution limit of lock-in thermography enables it to detect faults that were previously undetectable. If the lock-in frequency range were extended up to 1 kHz, the thermal resolution could be improved further. Lock-in FMI may become an interesting alternative in the future by detecting local heat sources with sub-micron spatial resolution, but only if a way is found to improve its detection sensitivity considerably.

6. References

[1] O. Breitenstein, J.P. Rakotoniaina, F. Altmann, J. Schulz, G. Linse: Fault Localization and Functional Testing of ICs by Lock-in Thermography, in: Proc. 28th ISTFA, 29–36 (2002)
[2] F. Altmann, Th. Riediger, O. Breitenstein, J.P. Rakotoniaina: Fault Localisation of ICs by Lock-in Fluorescent Micro-thermal Imaging (Lock-in FMI)
[3] O. Breitenstein, J.P. Rakotoniaina, F. Altmann, T. Riediger, O. Schreer: Thermal Failure Analysis by IR Lock-in Thermography
[4] O. Breitenstein, W. Warta, M. Langenkamp: Lock-in Thermography: Basics and Use for Evaluating Electronic Devices and Materials, 2nd Edition, Springer (2010)
