IDDQ Diagnosis

Paper 31.3 INTERNATIONAL TEST CONFERENCE 1 1-4244-0292-1/06/$20.00 © 2006 IEEE

The Power of Exhaustive Bridge Diagnosis using IDDQ: Speed, Confidence, and Resolution

Doug Heaberlin ([email protected])

IBM Systems and Technology Group Essex Junction, Vermont

Abstract A method is presented for high-speed diagnosis of all two-node bridging defects in a logic circuit using IDDQ. The method is tractable for large industrial circuit designs, requiring less than two CPU minutes to evaluate the ten trillion bridging defects on a 4.5-million gate ASIC. More significant than the speed of the method, however, is the precise diagnostic resolution typically achieved when the list of bridge faults diagnosed is examined in light of the circuit’s physical layout. The robustness of the IDDQ bridge fault model and the near impossibility of matching a long IDDQ signature by chance result in a confidence in the results rarely matched by diagnostic methods that must rely on modeling the logical behavior of a bridging defect. Performance data and results from physical failure analysis are presented for a variety of production ASIC designs.

1. Introduction The logical and electrical behavior of a bridging defect, an unintended electrical connection or “short” between nodes in a circuit, has long been a subject of research within the Test community. The prevalence of bridges on semiconductor devices and the potential complexity of their behavior have led to much creative work on practical methods for modeling, detection, and diagnosis of bridges [1-13].

Most published methods of bridge diagnosis share two attributes. First, they generally analyze only a small fraction of the theoretically possible bridge faults in a circuit. Because a circuit containing n nodes has $\binom{n}{2}$ pairs of nodes, the problem of analyzing all theoretically possible two-node bridges has computational complexity $O(n^2)$, and is generally considered computationally intractable for industrial circuit designs. Most bridge diagnostic methods address this problem by considering only so-called “realistic” bridges, physically adjacent node pairs identified through analysis of the circuit’s physical layout [14].

A second shared attribute of most bridge diagnostic methods is their built-in tolerance for defect behavior that is not anticipated by the fault models they employ.

The method presented in this paper shares neither of these attributes. On every diagnostic run, it analyzes all theoretically possible two-node bridges in the circuit model, which on many parts number in the trillions. And in contrast to methods that can accommodate bridge behavior that deviates from the fault model, the method presented here is decidedly “brittle,” requiring that a defect behave exactly as predicted by the IDDQ bridge fault model in order to be diagnosed successfully.

One might expect a diagnostic method that can be described as “brittle” and “exhaustive” to have little practical application. As will be demonstrated, however, the method is not only remarkably tractable and effective at diagnosing real-world bridges on industrial circuits, but typically produces results with diagnostic resolution and confidence unmatched by other bridge diagnostic methods.

The remainder of this paper is organized as follows: Section 2 discusses some previous approaches to bridge diagnosis and defines the IDDQ bridge diagnostic problem. Section 3 presents the diagnostic method. Section 4 presents results of the method when applied to a number of production ASIC designs sent through physical failure analysis. Section 5 discusses some unique advantages and limitations of the method, and section 6 concludes the paper.

2. Background and Previous Work

2.1 Bridge Diagnosis using Logical Fault Models

Diagnostic methods using logical fault models must grapple with the difficulty of modeling logical bridge behavior. Although a low-resistance bridge from a signal node to Ground or Vdd may behave as a simple stuck fault, in general the behavior of a short between two or more signal nodes depends on such physical details as the relative drive strengths of the transistors upstream of the bridged nodes, the input logic thresholds of the devices downstream of the bridge, and the resistance of the bridge itself. The possibilities of a bridge introducing a feedback path into a circuit or of exhibiting “Byzantine Generals” behavior (in which different circuits downstream of the bridge interpret the degraded voltage level on their inputs as different logical states) further complicate the problem of capturing the complexity of bridge behavior with simple fault models.

Because the cost of accurate circuit simulation of bridges for diagnosis is prohibitive, however, simple fault models are the norm. Several diagnostic methods have employed multiple stuck faults to approximate the behavior of a single bridge [3,6,13]. Such methods try to compensate for inaccuracies in the fault model through strategies ranging from applying sophisticated matching and scoring algorithms [6] to simply ignoring any test vector on which the defect fails to behave as a stuck fault [13].

Other researchers have employed simple two-node bridge fault models, such as “wired logic” models (in which the bridge functions as a logical AND or OR gate) and models in which the value on one node always dominates the other. At least one group has experimented with a bridge model that simply propagates the “unknown” logic state to all gates downstream of a bridge [15].

Bridge diagnostic methods that use logical fault models have relied almost universally on the use of realistic bridges to achieve acceptable run times. However, even limiting analysis to realistic bridges has not always been sufficient to render diagnosis tractable. One group of researchers noted that a single node in one circuit design they studied had more than 90,000 physically adjacent nodes, and concluded that simulation of the set of realistic bridges involving this node alone would require multiple CPU months [15].

2.2 Bridge Diagnosis using IDDQ

A review of the literature on bridge testing and diagnosis reveals an evolution in the Test community’s awareness of the potential effectiveness of IDDQ for diagnosis. The paper that first used the analogy of the “Byzantine Generals Problem” to describe certain bridge behavior [2] mentions IDDQ in passing, but concludes “IDDQ testing does not provide enough information to distinguish which fault causes the high current.” Around the same time, however, Aitken [8] demonstrated that combining the results of IDDQ testing with logic diagnosis could improve diagnostic resolution beyond that possible with logic diagnosis alone, and the following year published results using only IDDQ on a set of realistic bridging faults [9].

Chakravarty and Suresh subsequently published a technique for exhaustive diagnosis of all two-node bridges in a logic circuit using IDDQ [10]. The feasibility of the method was demonstrated only for circuits ranging in size from 200 to 40K nodes, however, and all subsequently published IDDQ diagnostic work appears to have continued the industry trend toward use of realistic bridges. Nigh et al. [11] relied on a set of realistic bridges for their experiment with IDDQ diagnosis and found that limiting the list of bridge candidates a priori to physically proximate node pairs produced very good resolution: a single bridge diagnosed in 76% of test cases.

Gattiker and Maly [16] revealed the rich set of diagnostic information available in IDDQ “current signatures” and showed how the relative amount (and number of levels) of current drawn by a circuit can distinguish faults that are equivalent under the simple IDDQ bridge fault model. Recently, Nigh and Gattiker [20] demonstrated that even as IDDQ testing loses effectiveness for manufacturing test, IDDQ signatures remain a rich source of information for defect characterization.

2.3 The IDDQ Bridge Diagnostic Problem

The IDDQ bridge fault model avoids the problems inherent in predicting the logical behavior of a bridge by ignoring logical behavior entirely. Instead, the model predicts only the circumstances under which the circuit, while in a quiescent state, will draw additional current (“defect current”) due to the presence of the defect. For fully complementary CMOS, the rules for predicting when IDDQ will be elevated due to the presence of a bridge are simple: whenever two bridged nodes are driven to opposite logical states (implying that one is connected to Vdd and the other to Ground), defect current will flow, because the bridge in that case provides a current path between Vdd and Ground. When the bridged nodes are at the same logical state, however, no such path exists, and the presence of the defect results in no additional current flow. Figure 1 shows a plot of IDDQ for a chip with a bridging defect, where the red points represent measurements taken on exactly those test vectors that drive the bridged nodes to opposite states.
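The rule above can be written down in a few lines. The following is a minimal sketch, not the author's software: the function name and the string encoding of logic states ('0'/'1' per IDDQ measurement) are assumptions made for illustration.

```python
def predicted_signature(states_a: str, states_b: str) -> str:
    """Predict the IDDQ signature for a bridge between two nodes.

    states_a and states_b hold the fault-free logic value ('0' or '1')
    of each node at every IDDQ measurement (hypothetical encoding).
    Defect current ('H') is predicted exactly where the two values differ.
    """
    if len(states_a) != len(states_b):
        raise ValueError("both nodes need one state per IDDQ measurement")
    return "".join("H" if a != b else "L" for a, b in zip(states_a, states_b))


print(predicted_signature("0011", "0101"))
```

Only the fault-free states of the two nodes enter the prediction; drive strengths, thresholds, and bridge resistance play no role in this model.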

Figure 1. IDDQ signature for a chip with a bridging defect

The IDDQ bridge diagnostic problem is thus the following: given the pattern of high and low current measured during an IDDQ test (the circuit’s “IDDQ signature”), find any pair of circuit nodes that were driven to opposite logical states on test vectors for which IDDQ was elevated, and that were at the same logical state on vectors for which IDDQ was low. For example, for the pattern of high and low current (signified by “H” and “L,” respectively) shown below, and the corresponding logic states on nodes A and B during those measurements, the IDDQ bridge fault model indicates that these nodes are not bridged:

IDDQ Signature:    LLLLLHHLLHHHHLLLHLLHHLLLL…
Values on Node A:  0111010010101011100100010…
Values on Node B:  0111001011010011010010010…

Contradiction to model: on the 18th measurement IDDQ is low, yet nodes A and B are at opposite logical states.

This reconciling of the logic states on two nodes with an observed IDDQ signature to determine whether the nodes may be bridged is a fundamental operation of IDDQ bridge diagnosis, and will be referred to in this paper as an “explicit evaluation” of a potentially bridged pair of nodes.

Such evaluations can be made very rapidly; the fastest server to which the author has access can make approximately 5.5 million such evaluations per second. However, the number of potential bridges for industrial circuits remains potentially daunting. The 14.5 million-gate ASIC used for one test case in Section 4 has almost 106 trillion potential two-node bridges. At a rate of 5.5 million evaluations per second, diagnosis of this chip would require more than seven CPU months.
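An explicit evaluation as defined above can be sketched with bit masks, one bit per IDDQ measurement. This is an illustrative sketch under stated assumptions, not the author's implementation: the helper names are hypothetical, and the data are the first 25 measurements of the example in Section 2.3 (the original strings are truncated).

```python
from typing import List


def to_mask(bits: str, one: str = "1") -> int:
    """Pack a per-measurement string into an integer, one bit per measurement."""
    mask = 0
    for ch in bits:
        mask = (mask << 1) | (ch == one)
    return mask


def explicit_evaluation(sig_mask: int, a_mask: int, b_mask: int) -> bool:
    """True if nodes A and B differ exactly where IDDQ was high."""
    return (a_mask ^ b_mask) == sig_mask


def contradictions(signature: str, a: str, b: str) -> List[int]:
    """1-based measurement numbers at which the pair contradicts the model."""
    return [k for k, (s, x, y) in enumerate(zip(signature, a, b), start=1)
            if (s == "H") != (x != y)]


# First 25 measurements of the example in the text (truncated in the source).
SIGNATURE = "LLLLLHHLLHHHHLLLHLLHHLLLL"
NODE_A = "0111010010101011100100010"
NODE_B = "0111001011010011010010010"

bridged = explicit_evaluation(to_mask(SIGNATURE, "H"), to_mask(NODE_A), to_mask(NODE_B))
print(bridged, contradictions(SIGNATURE, NODE_A, NODE_B))
```

A single XOR and compare per node pair is one plausible reason such evaluations can run at millions per second; the actual data layout used by the author is not described in the text.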

3. The Diagnostic Method

Prior to performing explicit evaluation of any node pairs, we apply three heuristic methods for reducing the number of bridges that must be evaluated explicitly. These methods are “heuristic” in that the computational complexity of the problem remains $O(n^2)$; in practice, however, sets of nodes that are resistant to performance speed-up using the method employed in phase I are highly susceptible to the methods used in phases II and III, and vice versa. Together, these heuristic methods typically enable diagnosis in minutes, and are guaranteed to produce the same result one would achieve by explicitly evaluating a bridge between every pair of nodes in the logic model.

3.1 Phase I: Partitioning Nodes by Logic State on Low-Current Vectors

The first method applies a procedure of the “divide and conquer” variety, in which a large problem is split into multiple smaller problems that together require less time to solve than the original. Figure 2 illustrates the fundamental operation applied in this phase, where the circles in this figure each represent a set of nodes in the circuit¹. For a given IDDQ measurement at which low current was measured, the n nodes in the parent set are assigned to two new sets: those nodes that were at logical 0 during that IDDQ measurement, and those that were at logical 1. This operation reduces the number of explicit evaluations required because we can immediately conclude that no node in the “zero” set can be bridged to any node in the “one” set. (If a bridge existed between a node in one set and a node in the other, IDDQ would have been elevated, and we know that IDDQ was low on this vector.) Thus we need evaluate bridges only between two nodes that are both members of the same set.

Using other low-current vectors, this operation is applied recursively to each new set, constructing a binary tree whose height is the number of low-current vectors considered. On completion of this process, the nodes in the circuit have typically been partitioned into over a million sets (the leaves of the tree), no member of which can be bridged to a node in any other set.
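The recursive split can be sketched as follows. This is a small illustration under assumptions, not the production algorithm: node states are stored as strings indexed by vector number, empty child sets are dropped, and all names are hypothetical.

```python
from typing import Dict, List


def split_on_low_vectors(nodes: List[str], states: Dict[str, str],
                         low_vectors: List[int]) -> List[List[str]]:
    """Recursively partition nodes by their state on each low-current vector.

    Returns the non-empty leaf sets; only nodes sharing a leaf set can
    still be bridged to one another.
    """
    if not low_vectors or len(nodes) <= 1:
        return [nodes]
    vector, rest = low_vectors[0], low_vectors[1:]
    zeros = [n for n in nodes if states[n][vector] == "0"]
    ones = [n for n in nodes if states[n][vector] == "1"]
    return [leaf
            for child in (zeros, ones) if child
            for leaf in split_on_low_vectors(child, states, rest)]


def remaining_evaluations(leaves: List[List[str]]) -> int:
    """Node pairs that still need explicit evaluation after partitioning."""
    return sum(len(leaf) * (len(leaf) - 1) // 2 for leaf in leaves)


# Toy data: four nodes, four vectors; vectors 0, 2 and 3 drew low current.
STATES = {"a": "0101", "b": "0001", "c": "1101", "d": "0011"}
LEAVES = split_on_low_vectors(list(STATES), STATES, [0, 2, 3])
print(LEAVES, remaining_evaluations(LEAVES))
```

In this toy case the six possible pairs shrink to the single pair (a, b), the only two nodes that agree on every low-current vector.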

¹ In order to avoid potential confusion when referring to the “nodes” of a circuit as well as to the “nodes” of the binary tree constructed by this method, outside of this footnote I will refer to nodes in the binary tree as “sets.” A “set” is a node in the tree representing a collection of circuit nodes (in some cases many or all circuit nodes, but also possibly zero or one). The term “node” will henceforth always refer to a circuit node.

The number of explicit evaluations eliminated as each set is split in two depends on the proportion of nodes that are at one logic state versus the other on the vector being considered. As shown in figure 2, for the ideal case in which the parent set divides into two sets that are each exactly half the size of the parent, the number of explicit evaluations required is reduced by just over half.
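The “just over half” reduction follows directly from counting pairs before and after an even split. The short derivation below uses only the pair counts already defined in the text:

```latex
\[
\binom{n}{2} = \frac{n(n-1)}{2},
\qquad
2\binom{n/2}{2} = 2\cdot\frac{(n/2)\,(n/2-1)}{2} = \frac{n(n-2)}{4},
\]
\[
\frac{n(n-2)/4}{n(n-1)/2} \;=\; \frac{n-2}{2(n-1)} \;=\; \frac{1}{2} - \frac{1}{2(n-1)} \;<\; \frac{1}{2}.
\]
```

The evaluations remaining after an even split are therefore slightly fewer than half of the original count, with the margin vanishing as n grows.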

As discussed below, however, there are many nodes in a circuit that tend to remain at one state for all or virtually all of a set of test vectors. Splitting a set composed mainly of these nodes is much less effective at reducing explicit evaluations, and the effectiveness of phase I approaches a limit as the construction of the tree progresses.

All of this is best illustrated with an example taken from an actual diagnostic run, shown in figure 3. The root of the tree represents all 4.6 million nodes in the circuit which, as shown to the right of the tree, represent almost 11 trillion potential bridges. At 5.5 million evaluations per second, we would require more than 22 CPU days to evaluate all of these explicitly.

Based on the logical state of all nodes during the first low-current vector, the root is split into two sets: those nodes that were at 0 and those that were at 1 on this vector. In this single step, we determine that none of the 2.7 million nodes in the first child set can be bridged to any of the 1.9 million nodes in the other, and this fact cuts the number of evaluations required almost in half, to 5.6 trillion. These two sets are each subsequently split into two new sets based on the logical state of their nodes on the second low-current vector, reducing the number of evaluations required to 3.3 trillion. Construction of the tree continues in this manner, using all 26 of the low-current vectors provided as input, and ultimately results in the creation of 1,498,521 sets at the leaves of the tree (only eight of which are shown.)
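The figures quoted in this example can be checked with a few lines of arithmetic. The sketch below assumes the set sizes shown for the first three levels of the tree in figure 3 and the evaluation rate of 5.5 million per second quoted earlier; the function names are illustrative only.

```python
RATE = 5.5e6  # explicit evaluations per second, as quoted in the text


def pairs(n: int) -> int:
    """Number of two-node bridges among n nodes."""
    return n * (n - 1) // 2


def evaluations(set_sizes) -> int:
    """Explicit evaluations needed when only same-set pairs remain."""
    return sum(pairs(size) for size in set_sizes)


def cpu_days(evals: int) -> float:
    """Projected CPU time in days at the quoted evaluation rate."""
    return evals / RATE / 86_400


# Set sizes for tree levels 0 to 2, as read from figure 3.
LEVELS = {
    0: [4_638_209],
    1: [2_743_464, 1_894_745],
    2: [2_062_710, 680_754, 683_293, 1_211_452],
}

for level, sizes in LEVELS.items():
    count = evaluations(sizes)
    print(f"level {level}: {count / 1e12:.1f} trillion evaluations, "
          f"{cpu_days(count):.1f} CPU days")
```

The counts come out at roughly 10.8, 5.6, and 3.3 trillion, matching the text, with just over 22 CPU days projected for the unpartitioned circuit.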

Over two-thirds of these leaf sets contain a single node, each of which has unique logical behavior over the 26 low-current vectors, and thus cannot be bridged to any other node in the circuit. Over 98% of the leaf sets contain 10 or fewer nodes and can be evaluated very quickly. The construction of the tree, which in this case required about 70 CPU seconds, has reduced the projected CPU time for all remaining evaluations from more than three weeks to approximately a day.²

Figure 2. Reduction in evaluations achieved in the ideal case, in which a set of n nodes is split exactly in half. Evaluations required: $\binom{n}{2} = \frac{n(n-1)}{2}$ for the parent set of n nodes; $\binom{n/2}{2} + \binom{n/2}{2} = \frac{n(n-2)}{4}$ for the two child sets (“0” and “1”) of n/2 nodes each.

As indicated in the figure, however, there are two sets of nodes that are still especially problematic, and which together represent almost all of the remaining projected CPU time: the set of nodes that remained at 0 on all low-current vectors and the set that were always at 1, shown at the far left and far right, respectively. (The projected CPU time required to evaluate the bridges represented by these sets only is shown in parentheses just below each.)

Two reasons are apparent for the large number of these “constant-state” nodes, which may be observed during virtually any diagnosis. First, there are many clock and control lines that tend to be at one state on most test vectors. If, for example, all IDDQ measurements are made with the clocks held inactive (as was true in this case), then any node that is part of a clock tree will be at one state not only during all low-current measurements, but on all high-current measurements as well. Second, the same nodes that tend to be “random pattern resistant” will have a “preferred” state that they will assume most of the time. To reduce the projected CPU time to an acceptable level, we must have a fast way of processing these nodes.

² The partitioning operation of phase I is similar to the method described in [10], with each level of the binary tree analogous to a “set of ordered pairs of sets” (SOPS) constructed by that method. Where the method of [10] is obliged to generate a new SOPS for every test vector used in the diagnosis, however, phase I partitions nodes into sets based on low-current vectors only, and leaves high-current vectors to be analyzed by the faster and less memory-intensive methods of phases II and III.

3.2 Phase II: Processing of “Constant-State” Nodes

The method of Phase II is based on the observation that many of the nodes that are at one state during all low-current vectors are in fact at the same state on all high-current vectors as well. The first step of this method, which is applied only to the leaf sets shown at the far left and far right in figure 3, is to identify all of the nodes that are at one state for all IDDQ measurements, both high and low.

Because a bridge involving a constant-state node can result in the observed IDDQ signature if and only if it is connected to another node from the same leaf set that is at the opposite logical state on all high-current vectors, the second step is to find any such nodes, and to report them as potentially shorted to any of the constant-state nodes. Note that a short to a signal node that remains at logical 1 or at logical 0 on all measurements would produce exactly the same IDDQ signature as a short to Vdd or Ground, respectively. Thus this step identifies any nodes potentially shorted to Vdd or Ground as well.
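The two steps of phase II can be sketched as below. This is a hedged illustration, not the author's code: it assumes the function is handed one of the two leaf sets whose nodes hold a single state on all low-current vectors, that the signature contains at least one low and one high measurement, and that states are encoded as strings; all names are hypothetical.

```python
from typing import Dict, List, Tuple


def phase_two(leaf_states: Dict[str, str], signature: str):
    """Classify the nodes of an all-0 or all-1 leaf set by their high-vector states.

    Returns (constant, opposite, mixed, suspects), where suspects pairs each
    constant-state node with every node that is at the opposite state on all
    high-current vectors.
    """
    low = [k for k, s in enumerate(signature) if s == "L"]
    high = [k for k, s in enumerate(signature) if s == "H"]
    constant: List[str] = []
    opposite: List[str] = []
    mixed: List[str] = []
    for node, states in leaf_states.items():
        low_value = states[low[0]]  # identical on every low vector by assumption
        high_values = {states[k] for k in high}
        if high_values == {low_value}:
            constant.append(node)   # one state on all measurements
        elif len(high_values) == 1:
            opposite.append(node)   # flips on every high-current vector
        else:
            mixed.append(node)      # left for phase III or explicit evaluation
    suspects: List[Tuple[str, str]] = [(c, o) for c in constant for o in opposite]
    return constant, opposite, mixed, suspects


# Toy leaf set of nodes at 0 on both low-current vectors.
RESULT = phase_two({"clk": "0000", "n1": "0011", "n2": "0010"}, "LLHH")
print(RESULT)
```

Following the text, any node that lands in `opposite` is also a candidate for a short to Vdd or Ground, since such a short would produce the same signature.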

Figure 3. Phase I: Partitioning circuit nodes by logical state on low-current vectors. (Binary tree with columns for tree level, evaluations remaining, and projected CPU time. Set sizes: root, 4,638,209; level 1, 2,743,464 and 1,894,745; level 2, 2,062,710, 680,754, 683,293, and 1,211,452; level 3, 1,730,923, 331,787, 376,695, 304,059, 380,842, 302,451, 377,791, and 833,661, with 2.2 trillion evaluations / 4.6 days remaining; level 26, 543 billion evaluations / 1.1 days remaining. Leaf sets shown at level 26 contain 994,380, 307,325, 6,840, 3,253, 14, 3, 1, and 1 nodes; the 994,380-node set accounts for 24 hrs, 58 min of projected CPU time and the 307,325-node set for 2 hrs, 23 min.)

Figure 4 illustrates this process for our example diagnostic run. The circles at the top of this diagram represent the leaf sets of the binary tree created in phase I. The contents of the sets at the far left and far right are each now split into three groups based on their logical states during high-current vectors: constant-state nodes that were at the same value on all IDDQ measurements, nodes that were at one state on all low-current vectors and at the opposite state on all high-current vectors, and the nodes that were at one state for all low-current vectors, but at both states at some point during high-current vectors. (The values shown in the figure under the headings “Low” and “High” represent the logical states of the nodes on low-current and high-current vectors, respectively. However, in order to make this example small enough to show easily, the values for only five vectors of each type are shown.) As shown, the vast majority of nodes in these leaf sets are constant-state nodes, represented by the entries 00000|00000 and 11111|11111.

Any of these nodes could be shorted only to a node that was at the opposite state on all high-current vectors (represented by the entries 00000|11111 and 11111|00000); however, as shown, there are no such nodes in this case. Thus, none of the constant-state nodes are shorted, and they can be eliminated from further consideration. This phase has consumed approximately four CPU seconds and has had a dramatic effect on the number of potential bridges remaining to consider, reducing their number from 543 billion to 11 billion. The projected CPU time to process the remaining bridge faults has dropped from more than a day to approximately half an hour.

3.3 Phase III: “Complement Set” processing

Although investing 30 CPU minutes to obtain a diagnosis that may be the basis for several days of painstaking physical failure analysis seems reasonable, these methods are heuristic, and are not guaranteed to be as effective in all examples as in this one. (For one test case, the projected remaining CPU time after phases I and II was 87 CPU hours.) Thus we apply a third method to process most of the remaining potential bridges without explicit evaluation.

The third method is in fact a generalization of the method applied in phase II, and is based on the observation that the leaf sets that still represent many potential bridges are composed mainly of nodes that during the low-current vectors have shown a “preference” for one logic state over the other. For nodes that are not bridged, this preference is typically an artifact of the logic design of the circuit, and these nodes will thus tend to exhibit a preference for the same state on high-current vectors as well.

In order for any two nodes to produce the observed IDDQ signature, their states at all high-current vectors must be complementary: whenever one node is at logical 0, the other must be at logical 1. This implies that if there are i high-current vectors total, and node A is at 0 on j of these vectors, node B need be considered as a potential bridge partner only if it is at 0 on i-j vectors (or, equivalently, at 1 on j vectors). Because the nodes comprising the sets to which we will apply this method have shown a preference for one state over another, however, there are likely to be few pairs of nodes whose state count over all high-current vectors is “complementary” in this way.

Complement set processing is performed on any set still containing a relatively large number of nodes, and proceeds as follows: each node in the set is assigned to one of i+1 new sets, where i is the number of high-current vectors, based on the number of times that node was at logical 0 during a high-current measurement. We will refer to the set containing nodes that were at 0 on j high-current vectors as “set j”. When all nodes have been assigned in this manner, the only bridges that need be considered are those between nodes in any set j and its “complement set,” i-j, for all j from 0 to $\lfloor i/2 \rfloor$. (Note that if the number of high-current vectors i is even, set i/2 is its own complement set.)

Leaf set at 0 on all low-current vectors (24 hrs, 58 min before; 23 min after):

Low  |High    Count
-----|-----   -------
00000|00000   870,963
00000|11111         0
00000|00010, 00000|00110, 00000|10000, ..., 00000|00001   123,417
Total:        994,380

Leaf set at 1 on all low-current vectors (2 hrs, 23 min before; 6 min after):

Low  |High    Count
-----|-----   -------
11111|11111   242,894
11111|00000         0
11111|01111, 11111|11101, 11111|10101, ..., 11111|11011    64,431
Total:        307,325

Figure 4. Phase II: Processing of “constant-state” nodes. The two large leaf sets shrink from 994,380 and 307,325 nodes to 123,417 and 64,431 nodes, reducing the work remaining from 543 billion evaluations / 1.1 CPU days to 11 billion evaluations / 33 CPU minutes.

0/1 Count   #Nodes     0/1 Count   #Nodes
---------   ------     ---------   ------
 0/23            0     23/ 0            0
 1/22            0     22/ 1       88,538
 2/21            0     21/ 2       24,803
 3/20            0     20/ 3        6,809
 4/19            0     19/ 4        2,173
 5/18            0     18/ 5          664
 6/17            0     17/ 6          276
 7/16            0     16/ 7          102
 8/15            0     15/ 8           35
 9/14            0     14/ 9           13
10/13            0     13/10            2
11/12            1     12/11            1

Figure 5. Phase III: Complement set processing. Each row shows a set (count of high-current vectors at 0 / at 1) opposite its complement set. The work remaining drops from 11 billion evaluations / 33 CPU minutes to 45 million evaluations / 9 CPU seconds.

Returning to our example, figure 5 shows how this procedure works when applied to the leaf set at far left, which contains nodes that were at 0 on all low-current vectors. There are 23 high-current vectors used as input, so the nodes are divided into 24 sets based on the number of these vectors (from 0 to 23) at which each was at 0. In the figure, each of these sets is shown opposite its complement set; the set of nodes on either side of an arrow need be explicitly evaluated only with nodes of the set on the other side. But since this entire collection of nodes has a strong preference for being at 0, the sets that contain nodes often at 0 all have complement sets that are empty, meaning that none of the nodes in these sets can be bridged to any other node. The only exception in this example is a single node pair: a node that was at 0 on 11 high-current vectors may be bridged to the single node that was at 0 on 12 high-current vectors, and so this pair must be explicitly evaluated. The number of nodes requiring explicit evaluation in this set is thus reduced from 123,417 to 2, and the CPU time projected for evaluation of nodes in this leaf set drops from 23 minutes to 0 seconds (out to several decimal places). As indicated in the figure, similar dramatic reductions occur when the other large sets of nodes are processed in this manner.
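Complement set processing can be sketched as a binning step followed by pairing each bin with its complement. The code below is an illustrative sketch with hypothetical names, not the author's implementation; the histogram reproduces the node counts listed for figure 5, keyed by the number of high-current vectors at which a node was at 0.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def complement_set_pairs(high_states: Dict[str, str]) -> List[Tuple[str, str]]:
    """Node pairs that still need explicit evaluation.

    high_states maps each node to its states on the i high-current vectors
    (must be non-empty). Nodes are binned by how often they were at 0; set j
    is paired only with its complement set i - j.
    """
    i = len(next(iter(high_states.values())))
    bins = defaultdict(list)
    for node, states in high_states.items():
        bins[states.count("0")].append(node)
    candidates: List[Tuple[str, str]] = []
    for j in range(i // 2 + 1):
        left, right = bins.get(j, []), bins.get(i - j, [])
        if j == i - j:  # i even: set i/2 is its own complement set
            candidates += [(a, b) for k, a in enumerate(left) for b in left[k + 1:]]
        else:
            candidates += [(a, b) for a in left for b in right]
    return candidates


def candidate_count(histogram: Dict[int, int], i: int) -> int:
    """Same pairing rule applied to bin sizes only."""
    total = 0
    for j in range(i // 2 + 1):
        if j == i - j:
            total += histogram.get(j, 0) * (histogram.get(j, 0) - 1) // 2
        else:
            total += histogram.get(j, 0) * histogram.get(i - j, 0)
    return total


# Node counts from figure 5, keyed by number of high-current vectors at 0.
FIGURE_5 = {11: 1, 12: 1, 13: 2, 14: 13, 15: 35, 16: 102, 17: 276,
            18: 664, 19: 2_173, 20: 6_809, 21: 24_803, 22: 88_538}

TOY = complement_set_pairs({"p": "001", "q": "011", "r": "001", "s": "110"})
print(TOY, candidate_count(FIGURE_5, 23), sum(FIGURE_5.values()))
```

The pairs returned are only candidates: nodes with complementary counts need not be complementary on every vector, so each surviving pair still goes through explicit evaluation. Applied to the figure 5 counts, the rule leaves a single candidate pair out of 123,417 nodes, as the text describes.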

For the diagnostic run used for this example, the complement set heuristic was applied to 69 of the remaining leaf sets, consuming about 8 CPU seconds, and reduced the total number of evaluations required from 11 billion to 45 million, which can be performed in about 9 seconds. The CPU time required to evaluate all 10.8 trillion potential bridges on this part is as follows:

1 min, 10 sec   Phase I: Partitioning nodes by logic state
        4 sec   Phase II: Processing constant-state nodes
        8 sec   Phase III: Complement set processing
        9 sec   Explicit evaluation of remaining node pairs
-------------
1 min, 31 sec   Total CPU Time

(This diagnostic run resulted in the diagnosis of 27 pairs of potentially bridged nodes, exactly one pair of which was subsequently found to lie physically close enough to short.)

Entry  Design/  Gate       Bridge Count    CPU Time  Number Bridges  Number Bridges Identified /                  Comments
       Sample   Count (n)  $\binom{n}{2}$  (mm:ss)   Diagnosed       Diagnostic Resolution after Layout Analysis
-----  -------  ---------  --------------  --------  --------------  -------------------------------------------  --------
1      A / #1   1.2M       664B            3:46      33              1 bridge / 30 um at M2                       Burn-in failure
2      A / #2   1.2M       664B            3:25      36              1 bridge / 470 um at M6                      Burn-in failure
3      B / #1   282K       40B             1:04      13              1 bridge / 133 um at M4 and M2               Scan chain failure
4      B / #2   282K       40B             3:02      46              1 bridge / 117 um at M4 and M3               4-node bridge
5      C / #1   404K       82B             :38       1               1 bridge / (layout analysis not attempted)   Reticle defect causing chain failures
6      D / #1   3.8M       7.2T            4:16      4               1 bridge / 190 um at M3                      Customer return
7      D / #2   3.8M       7.2T            1:55      1               2 bridges / 1000 um at M2, M1, PC, RX        Systematic IDDQ-only yield loss
8      E / #1   576K       166B            6:55      5,190           1 bridge* / 28 um on M2 and M1               (*Combined results with those of logic diagnostics prior to layout analysis)
9      E / #2   576K       166B            1:37      1               1 bridge / 12 um at M2 and M1                Stress fail
10     F / #1   2.0M       2.0T            12:57     78              1 bridge / 25 um at M2                       IDDQ-only yield loss
11     F / #2   2.0M       2.0T            14:18     2               1 bridge / 3 um at M1                        IDDQ-only yield loss
12     H        14.5M      105.8T          17:32     0               (n/a)                                        Dummy test case on large part

Table 1. Performance of exhaustive bridge diagnosis on 11 production parts with manufacturing defects. (Entry 12, with a run time approaching 20 CPU minutes, is a dummy test case used to test the performance of the software on a very large ASIC.)


4. Hardware Results

Table 1 summarizes the results of applying this method on a variety of stress failures and customer returns requiring physical failure analysis. All parts shown are 0.18 um or 130 nm designs, with IDDQ background currents ranging from 50 uA to 11 mA. The current elevation caused by the defects varied from 375 uA to 3.6 mA.

The third and fourth columns contain, respectively, the total number of nodes in the (gate-level) test models used for diagnosis and the number of potential bridges this represents. The next two columns provide raw performance data: the CPU time, in minutes and seconds, for each diagnostic run, and the number of potential bridges diagnosed. (Note that run times are for the bridge diagnosis only, and do not include time required for the one-time simulation to determine the state of the fault-free circuit at IDDQ measurements.) The seventh column contains a key result: the number of bridges implicated after examination of the physical proximity of all diagnosed node pairs. The first figure in this column is the number of node pairs diagnosed that run close enough at some point to be connected by a defect; the second is a list of process layers and the length in microns for which the nodes lie close enough to bridge. The last column contains some additional information of interest about each part.

Physical failure analysis based on these results was successful in all cases; figure 6 shows photos of six of the defects found.

Note that in 10 of 11 cases, no other diagnostic technique was utilized, yet the suspected defect location after layout analysis in every case except one³ is confined to a single bridge, and often to a single process layer.

For one module (entry 8) IDDQ diagnostic resolution alone was not sufficient for PFA. In this case, intersection of the IDDQ results with those of stuck-fault simulation resulted in the diagnosis of a resistive short to Vdd. This bridge exhibited “Byzantine Generals” behavior (propagating a faulty 1 through only two of four downstream circuits) that had resulted in a less-than-perfect match to the stuck-fault model.

Included in the table are two cases (entries 3 and 5) in which the defect caused the scan chains not to operate correctly. In one case (entry 4), the initial diagnostic run found no two-node bridge matching the signature. In this case, however, the results of multiple diagnostic runs against different discrete current levels in the IDDQ signature enabled confident diagnosis of a single bridge connecting four circuit nodes. In every case shown, the IDDQ bridge fault model perfectly explained the current elevation caused by the bridge.

³ Ironically, the case (7) in which two potential bridges were implicated after layout analysis is one for which the software had diagnosed only a single bridge. Examination of the layout revealed an unmodeled complement signal derived from the diagnosed net that also required inspection.

Figure 6. Some bridging defects diagnosed using the method. Clockwise from lower right: Foreign material shorting M3 metal lines, bridge between three signal nodes and ground bus due to catastrophic electromigration event, printed polysilicon bridge due to mask reticle defect, bridge at top of M3 metal lines due to underpolish, photolithography defect shorting diffusion areas, M2 short due to metal corrosion. The numbers in each image (1, 4, 5, 6, 7, and 9) correspond to entry numbers in Table 1. (PFA credits in same order: David Picozzi and Rick Wasielewski, David Picozzi and Bill Bentley, Herve Chincholle and Jean-Luc Kasnesralla of Altis Semiconductor, David Picozzi, David Picozzi and Pete Klinger, Jim Massucco and Bill Bentley.)


Three of these PFA submissions were requested in order to investigate IDDQ-only yield loss, illustrating a unique advantage of IDDQ bridge diagnosis: the ability to diagnose circuits that pass all logical tests applied, but which have a resistive bridge that may ultimately cause the part to fail in the application.

5. Advantages and Limitations

5.1 Speed

Because the evaluation of a bridge between any two circuit nodes requires knowledge only of their state in the fault-free circuit, IDDQ bridge diagnosis requires no “fault simulation” in the usual sense of the term. We perform a one-time simulation of the fault-free circuit to determine the state of every node at each IDDQ measurement, and may then use this information to diagnose any number of chips of that design, with no additional simulation cost.
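The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: it assumes a two-level signature in which a measurement is elevated exactly when the two bridged nodes hold opposite fault-free values, and the names `pack_states`, `diagnose_bridges`, and the toy node table are hypothetical. Each node's fault-free states are packed into one integer, so a pair matches when the XOR of the two integers equals the packed signature.

```python
def pack_states(bits):
    """Pack one 0/1 value per IDDQ measurement into a single integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def diagnose_bridges(node_states, signature):
    """Return node pairs whose fault-free states differ exactly where IDDQ is elevated.

    node_states: dict of node name -> list of 0/1 fault-free values (assumed input).
    signature:   list of 0/1 flags, 1 where the measured IDDQ was elevated.
    """
    target = pack_states(signature)

    # Group nodes by packed state so each partner lookup is a dictionary access.
    by_state = {}
    for name, bits in node_states.items():
        by_state.setdefault(pack_states(bits), []).append(name)

    matches = set()
    for state, names in by_state.items():
        partner_state = state ^ target  # state a partner node must have
        for a in names:
            for b in by_state.get(partner_state, []):
                if a < b:  # report each unordered pair once, never a node with itself
                    matches.add((a, b))
    return sorted(matches)


# Toy fault-free simulation results for four IDDQ measurements (hypothetical).
node_states = {
    "n1": [0, 1, 1, 0],
    "n2": [1, 1, 0, 0],
    "n3": [0, 0, 1, 1],
    "gnd": [0, 0, 0, 0],
    "vdd": [1, 1, 1, 1],
}
print(diagnose_bridges(node_states, [1, 0, 1, 0]))
```

The fault-free table is built once per design; each additional chip only supplies a new `signature`, which is why no per-chip simulation is needed in this sketch.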

If one compares the time required for exhaustive diagnosis of the example in section 3 with the “CPU months” estimate for fault simulation of 90,000 realistic bridges, one finds that the method presented in this paper can outperform traditional bridge fault simulation by a factor of a trillion or more. Even if one includes the time required for simulation of the fault-free circuit, exhaustive analysis of trillions of bridge faults using IDDQ still requires less time than fault simulation of a few hundred stuck faults over the same test vectors.
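The size of this factor can be checked with a rough per-fault calculation. The fault counts and the two-CPU-minute figure are taken from the paper; reading "CPU months" as a single 30-day month is an assumption made here, and it gives a lower bound on the ratio.

```python
# Assumption: "CPU months" is taken as one 30-day month (a lower bound).
MINUTES_PER_CPU_MONTH = 30 * 24 * 60


def per_fault_speedup(sim_faults, sim_minutes, iddq_faults, iddq_minutes):
    """Ratio of per-fault cost: bridge fault simulation vs. IDDQ diagnosis."""
    sim_cost = sim_minutes / sim_faults      # minutes per simulated bridge
    iddq_cost = iddq_minutes / iddq_faults   # minutes per evaluated bridge
    return sim_cost / iddq_cost


# 90,000 realistic bridges in one CPU month vs. ten trillion bridges in two CPU minutes.
speedup = per_fault_speedup(90_000, MINUTES_PER_CPU_MONTH, 10**13, 2)
print(f"per-fault speedup: about {speedup:.1e}")
```

Under these assumptions the per-fault ratio comes out at a little over two trillion, consistent with the "trillion or more" estimate.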

This performance is achieved, however, by abandoning any pretense at accommodating defect behavior that deviates from the fault model. One might ask, therefore, to what extent the applicability of this method is limited by its lack of tolerance for noise and uncertainty.

5.2 Accuracy of the IDDQ Bridge Fault Model

The variety of defect mechanisms illustrated in Figure 6, ranging from the neatly “printed” short between polysilicon wires (defect 5) to the explosion of amorphous copper oxide foam spanning four metal segments (defect 4), illustrates the versatility and robustness of the IDDQ bridge fault model. One could reasonably argue, however, that these PFA results are self-confirming: when the defect behaves according to the model, diagnosis succeeds; when the defect behaves otherwise, we fail to find it, and thus avoid direct evidence that the defect is in fact a bridge for which the IDDQ model is not perfectly accurate.

Over the past several years, there have been a number of parts for which this method failed, but for which other diagnostic methods and PFA were ultimately successful, and the defects found on these parts offer some insight into the types of mechanisms that cannot be diagnosed with this method. In one such case, the defect was found to be a (diodic) bridge at the transistor level. In this case, the same foreign material shorting the two transistor nodes was also blocking a contact. The result was an IDDQ signature whose elevated readings were largely due to “shoot-through” current in a transistor with a floating gate.

For cases other than this module, however, the primary reasons for the failure of the method were 1) the defect was an open, usually at the transistor level, and 2) the defect was a bridge, but the shorted nodes were not represented in the gate-level model used for diagnosis. In the latter case, when the logic states on the bridged nodes during IDDQ test were subsequently determined, the bridge was found to produce exactly the current signature predicted by the IDDQ bridge model.

Though this evidence is only anecdotal, the failure over several years of diagnostic work to encounter a simple bridging defect whose behavior is not perfectly predicted by the IDDQ model seems nonetheless remarkable. Though the method can clearly fail for complex transistor-level bridges, experience on real-world parts suggests that it has very broad applicability.

5.3 Diagnostic Resolution

One disadvantage of exhaustive IDDQ bridge diagnosis is that the worst-case diagnostic resolution possible is far worse than the worst-case resolution obtained from traditional logic diagnosis. In the latter case, one can at least confine the list of fault candidates to those in the input logic cone of failing latches. Because IDDQ measurements alone do not implicate any particular portion of the circuit, however, when IDDQ diagnostic resolution is bad, it can be almost comically bad. As an extreme example, the circuit discussed in section 3 has over 870,000 nodes that remain at logical 0 on all IDDQ measurements, and another 242,000 that are always at 1. If one were to determine that a defect was causing current elevation on all vectors, the list of bridge faults diagnosed would include any “always 0” node paired with any “always 1” node, and there are over 210 billion such node pairs (870,000 × 242,000)!
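The worst case above is easy to reproduce on a toy circuit. The sketch below is illustrative only, with a hypothetical node table and function name: it counts the nodes that never toggle across the IDDQ measurements and multiplies the two group sizes, which is the number of candidate pairs an "elevated on every vector" signature would implicate.

```python
def constant_state_pairs(node_states):
    """Count pairs of one always-0 node with one always-1 node.

    node_states: dict of node name -> list of 0/1 fault-free values,
    one per IDDQ measurement (assumed non-empty).
    """
    always_0 = [name for name, bits in node_states.items() if not any(bits)]
    always_1 = [name for name, bits in node_states.items() if all(bits)]
    # Every always-0 node differs from every always-1 node on all measurements.
    return len(always_0) * len(always_1)


# Hypothetical toy circuit: three always-0 nodes, two always-1 nodes, one toggling node.
toy_states = {
    "a": [0, 0, 0],
    "b": [0, 0, 0],
    "gnd": [0, 0, 0],
    "c": [1, 1, 1],
    "vdd": [1, 1, 1],
    "d": [0, 1, 0],
}
print(constant_state_pairs(toy_states))
```

Because the count is a product, even modest fractions of constant-state nodes in a multi-million-node design yield candidate lists in the billions.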

In general, an IDDQ signature that has very few elevated or very few non-elevated vectors tends to produce a long list of matching bridge candidates. (Entry 8 in Table 1, for which only 1 of 256 IDDQ measurements was elevated, is an example.) This seems to be a consequence of the large number of circuit nodes that tend to remain at one state on many test vectors. Most IDDQ signatures observed in practice, however, include several measurements on which the defect is active, and several on which it is not, and the list of bridges diagnosed for such a signature tends to be limited to bridge faults that are equivalent by construction of the circuit design. As illustrated by the results shown in column six of Table 1, such a list can still be quite large by the standards of traditional logic diagnosis.

However, as [15] has observed, because we know we are looking for a bridging defect, we can filter the “raw” results of diagnosis through an examination of the circuit’s physical layout, and concern ourselves only with node pairs that are physically proximate, with no intervening nodes. As illustrated in column 7 of Table 1, this analysis typically culls the list of bridges diagnosed to a single candidate, and often strictly limits the wire segments and process layers requiring inspection as well.
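The layout filtering step can be pictured as a simple set intersection. The sketch below is a hedged illustration rather than the tooling used in the paper: `filter_by_adjacency` and the sample data are hypothetical, and the list of neighboring node pairs is assumed to have been obtained from the layout for the diagnosed nodes only.

```python
def filter_by_adjacency(candidates, adjacent_pairs):
    """Keep only diagnosed node pairs that the layout reports as neighbors.

    candidates:     list of (node, node) pairs from IDDQ diagnosis.
    adjacent_pairs: iterable of (node, node) pairs that are physically
                    proximate with no intervening node (assumed input).
    """
    # frozenset makes the comparison independent of the order within a pair.
    adjacent = {frozenset(pair) for pair in adjacent_pairs}
    return [pair for pair in candidates if frozenset(pair) in adjacent]


# Hypothetical raw diagnosis result and layout neighbor list.
raw_candidates = [("n1", "n2"), ("n1", "n7"), ("n4", "n2")]
layout_neighbors = [("n2", "n1"), ("n4", "n9")]
physical_candidates = filter_by_adjacency(raw_candidates, layout_neighbors)
print(physical_candidates)
```

Because only the nodes named in the diagnosis need to be examined, this step does not require a full-chip extraction of realistic bridges beforehand.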

In practice, the worst diagnostic resolution achieved through this method is often for nodes that are diagnosed as bridged to Vdd or Ground (or, equivalently, to one of the many “constant-state” signal nodes). For such cases, analysis of the layout often will not significantly reduce the area requiring inspection, because there tend to be many locations and process layers at which such a short could occur. Note, however, that this result is essentially equivalent to that of a “perfect score” stuck fault – often considered a best-case scenario for logical fault diagnosis.

5.4 “Realistic” bridges versus Exhaustive Analysis

One might ask if there is any advantage to avoiding the use of “realistic” bridge faults if one must ultimately employ information from the physical layout in order to attain high diagnostic resolution. The answer is clearly yes, for two reasons.

First is the expense of extracting and storing a set of realistic bridges. The method described in [17], for example, requires over 63 CPU hours to extract a list of adjacent nodes for a part with only 2.28 million transistors. Although others have created extraction tools that run more quickly at the price of less exacting analysis of the critical area between shapes [18], layout extraction is still typically an “overnight” job. In an environment in which a fab manufactures hundreds of different ASIC designs a year, any one of which may become of interest for diagnosis and PFA, the ability to diagnose bridges without such an onerous prerequisite is clearly an advantage.

However, perhaps the most compelling advantage of exhaustive diagnosis is simply the ability to consider “everything,” without any assumption as to what bridges a manufacturing line is capable of producing. Though the idea of an ideal list of only those bridges that can actually occur is appealing, in practice extraction tools must deal with constraints on time and space, and ultimately must ignore many bridges that could reasonably be considered “realistic.” For example, most fault extractions identify bridges only between nodes in the same process layer. Even in Aitken’s seminal paper on IDDQ diagnosis [9], however, one of the two defects for which SEM photos are provided is a vertical bridge between process layers (see footnote 4). Anyone who has worked closely with PFA over many years can likely vouch for the ability of a semiconductor manufacturing line to produce defects that are “unexpected.” The ability to diagnose mechanisms that were not anticipated is clearly an advantage of exhaustive diagnosis.

Footnote 4: One thus wonders whether the ability to diagnose this defect successfully was due to inclusion of inter-layer bridges in the list of realistic faults used for diagnosis, or simply the fact that the IDDQ behavior of this defect, a short to Ground, happens to match the pseudo-stuck-at-0 fault on the same node, a model also used during this experiment.

5.5 Confidence

Intuitively, one might expect exhaustive IDDQ diagnosis to produce many “false positive” results. Given 10 trillion potential bridges, the odds of the behavior of at least one bridge matching an IDDQ signature purely by chance might seem very high. A propensity to send PFA to inspect many locations where there is no defect to be found would clearly be a major drawback for any diagnostic method. As discussed earlier, there are some IDDQ signatures, usually easily recognized by virtue of having only a few measurements that differ in level from the others, for which many node pairs will be diagnosed. These signatures aside, however, the odds of matching a long IDDQ signature “by accident,” even given 10 trillion chances, tend to be very low.

The reason is the sheer size of the “IDDQ signature space,” which doubles in size as each new measurement is made. Given 256 IDDQ measurements, there are 2^256 possible two-level IDDQ signatures, a number of literally astronomical proportions (see Table 2).

By way of analogy, the odds of getting a “false positive” result for a given IDDQ signature are somewhat akin to the likelihood of a computer cracker successfully breaking a 256-bit encryption key by making 10 trillion random guesses. Table 3 provides some odds for this as well as some other unlikely events.
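The analogy can be made concrete with a rough bound. The sketch below assumes, as a deliberate simplification, that each candidate bridge predicts an independent, uniformly random two-level signature; real node states are correlated, and signatures with very few elevated measurements are a known exception, so the result is an order-of-magnitude illustration only. Under that assumption, the chance that any of N candidates matches an m-measurement signature by accident is at most N / 2^m. The function name is hypothetical.

```python
import math


def log2_false_match_bound(num_candidates, num_measurements):
    """log2 of the union bound N / 2**m on the chance of an accidental match.

    Assumes each candidate's predicted signature is an independent,
    uniformly random string of num_measurements bits (a simplification).
    """
    return math.log2(num_candidates) - num_measurements


# Ten trillion candidate bridges against a 256-measurement signature.
bound = log2_false_match_bound(10**13, 256)
print(f"chance of an accidental match is at most about 2^{bound:.0f}")

# Each added measurement doubles the signature space and halves the bound.
assert log2_false_match_bound(10**13, 257) == bound - 1
```

Working in base-2 logarithms keeps the arithmetic exact enough to read and avoids floating-point underflow when the number of measurements grows.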

Thus, whereas most diagnostic techniques tend to err on the side of inclusiveness in order to accommodate noisy data and imperfect fault models, by requiring an exact match to the IDDQ model, we intentionally set a very high hurdle toward reporting any results at all. With few exceptions, any defect that can clear this hurdle is almost certainly the cause of the observed IDDQ signature.

Age of the universe (in seconds)              2^59
Number of atoms in the sun                    2^190
Number of atoms in the galaxy                 2^223
Number of unique 256-bit IDDQ signatures      2^256
Number of atoms in the known universe         2^265

Table 2. Some large numbers. (Physical constants and probabilities in Tables 2 and 3 are adapted from [19].)

Event                                                                  Odds of Occurrence
Winning the Powerball® lottery                                         1 in 121 million (2^26)
Being struck by lightning (per day)                                    1 in 9 billion (2^34)
Winning the lottery and being struck by lightning on the same day      1 in 2^60
Doing so on three consecutive days                                     1 in 2^180
Correctly guessing a random 256-bit string given ten trillion tries    1 in 2^211

Table 3. Probabilities of some unlikely events

6. Conclusion

The method described in this paper renders exhaustive two-node bridge diagnosis feasible for large industrial circuit designs, and offers a combination of speed, accuracy, diagnostic resolution, and confidence in the results rarely realized using traditional diagnostic methods. Experience over several years in using this method to diagnose production ASIC designs confirms the remarkable robustness of the IDDQ bridge fault model and the richness of diagnostic information present in IDDQ signatures. Section 2 discussed an evolution in the Test community’s awareness of the power of IDDQ for diagnosis. The author hopes and believes that this evolution will continue.

7. Acknowledgements

A complete list of the colleagues in Test Engineering, Research, Failure Analysis, Diagnostics, and management who have in some way supported the development of this work would be too long to include here. However, the author is particularly indebted to Phil Nigh, Anne Gattiker, Dale Grosch, Maroun Kassab, Bill Livingstone, Deb Korejwa, and David Picozzi. I would also like to express my appreciation to Stefan Eichenberger and my ITC reviewers for helpful comments on an earlier draft of this paper.

8. References

[1] E. Isern and J. Figueras, “IDDQ Test and Diagnosis of CMOS Circuits,” IEEE Design and Test of Computers, Winter 1995, pp. 60-67

[2] J. M. Acken and S. D. Millman, “Fault Model Evolution for Diagnosis: Accuracy vs. Precision,” Proc. IEEE Custom Integrated Circuits Conf., 1992, pp. 13.4.1-13.4.4

[3] S. D. Millman, E. J. McCluskey, and J. M. Acken, “Diagnosing CMOS Bridging Faults with Stuck-At Fault Dictionaries,” Proc. Int. Test Conf., 1990, pp. 860-870

[4] J. Acken and S. Millman, “Accurate Modeling and Simulation of Bridging Faults,” Proc. IEEE Custom Integrated Circuits Conf., 1991, pp. 17.4.1-17.4.4

[5] S. Millman and J. Acken, “Diagnosing CMOS Bridging Faults with Stuck-At, IDDQ, and Voting Model Fault Dictionaries,” Proc. IEEE Custom Integrated Circuits Conf., 1994, pp. 17.1-4

[6] D. B. Lavo, B. Chess, T. Larrabee, and F. J. Ferguson, “Diagnosing Realistic Bridging Faults with Single Stuck-At Information,” IEEE Transactions on Computer-Aided Design, March 1998, pp. 255-268

[7] D. B. Lavo, T. Larrabee, and B. Chess, “Beyond Byzantine Generals: Unexpected Behavior and Bridging-Fault Diagnosis,” Proc. Int. Test Conf., 1996, pp. 611-619

[8] R. C. Aitken, “Fault Location with Current Monitoring,” Proc. Int. Test Conf., 1991, pp. 623-632

[9] R. C. Aitken, “A Comparison of Defect Models for Fault Location with IDDQ Measurements,” Proc. Int. Test Conf., 1992, pp. 778-787

[10] S. Chakravarty and S. Suresh, “IDDQ Measurement Based Diagnosis of Bridging Faults in Full Scan Circuits,” Proc. 7th Int. Conf. on VLSI Design, Jan. 1994, pp. 179-182

[11] P. Nigh, D. Forlenza, and F. Motika, “Application and Analysis of IDDQ Diagnostic Software,” Proc. Int. Test Conf., 1997, pp. 319-327

[12] D. Lavo, B. Chess, T. Larrabee, and I. Hartanto, “Probabilistic Mixed-Model Fault Diagnosis,” Proc. Int. Test Conf., 1998, pp. 1084-1093

[13] L. Huisman, “Diagnosing Arbitrary Defects in Logic Designs Using Single Location at a Time (SLAT),” IEEE Transactions on CAD of Integrated Circuits and Systems, Vol. 23, No. 1, January 2004

[14] J. P. Shen, W. Maly, and F. J. Ferguson, “Inductive Fault Analysis of MOS Integrated Circuits,” IEEE Design and Test of Computers, Vol. 2, No. 6, Dec. 1985, pp. 13-26

[15] Z. Stanojevic, H. Balachandran, D. Walker, F. Lakhani, S. Jandhyala, J. Saxena, and K. Butler, “Computer-Aided Fault to Defect Mapping (CAFDM) for Defect Diagnosis,” Proc. Int. Test Conf., 2000, pp. 729-738

[16] A. Gattiker and W. Maly, “Current Signatures: Application,” Proc. Int. Test Conf., 1997, pp. 156-164

[17] S. Zachariah and S. Chakravarty, “A Scalable and Efficient Methodology to Extract Two Node Bridges from Large Industrial Circuits,” Proc. Int. Test Conf., 2000, pp. 750-759

[18] Z. Stanojevic, H. Balachandran, D. Walker, F. Lakhani, and S. Jandhyala, “Defect Localization Using Physical Design and Electrical Test Information,” Proc. Advanced Semiconductor Manufacturing Conf., 2000, pp. 108-115

[19] B. Schneier, “Applied Cryptography: Protocols, Algorithms, and Source Code in C,” 2nd ed., John Wiley and Sons, 1996

[20] P. Nigh and A. Gattiker, “Random and Systematic Defect Analysis Using IDDQ Signature Analysis for Understanding Fails and Guiding Test,” Proc. Int. Test Conf., 2004, pp. 309-318
