1072 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 30, NO. 7, JULY 2011

BIST-Based Fault Diagnosis for Read-Only Memories

Nilanjan Mukherjee, Member, IEEE, Artur Pogiel, Member, IEEE, Janusz Rajski, Senior Member, IEEE, and Jerzy Tyszer, Senior Member, IEEE

Abstract: This paper presents a built-in self-test (BIST)-based scheme for fault diagnosis that can be used to identify permanent failures in embedded read-only memories. The proposed approach offers a simple test flow and does not require intensive interactions between a BIST controller and a tester. The scheme rests on partitioning of rows and columns of the memory array by employing low-cost test logic. It is designed to meet requirements of at-speed test, thus enabling detection of timing defects. Experimental results confirm high diagnostic accuracy of the proposed scheme and its time efficiency.

Index Terms: Built-in self-test (BIST), deterministic partitioning, discrete logarithms, embedded read-only memory, fault diagnosis.

I. Introduction

The International Technology Roadmap for Semiconductors [13] predicts memories to occupy more than 90% of the chip silicon area in the foreseeable future. Due to their ultralarge scale of integration and vastly complex structures, memory arrays are far more vulnerable to defects than the remaining parts of integrated circuits. Embedded memories have already started introducing new yield loss mechanisms at a rate, magnitude, and complexity large enough to demand major changes in test procedures. Many types of failures, often not seen earlier, originate in the highest density areas of semiconductor devices where diffusions, polysilicon, metallization, and fabricated structures are in extremely tight proximity to each other. Failing to properly test all architectural features of the embedded memories can eventually deteriorate the quality of test, and ultimately hinder yield.

Embedded memories are more challenging to test and diagnose than their stand-alone counterparts. This is because their complex structures are paired with a reduced bandwidth of test channels resulting in limited accessibility and controllability. Consequently, the memory built-in self-test (MBIST) has established itself as one of the mainstream design for test (DFT) methodologies as it allows one to generate, compress, and store on chip very regular test patterns and expected responses by using a relatively simple test logic. The available input/output channels, moreover, suffice to control built-in self-test (BIST) operations, including at-speed testing and detection of timing defects.

Manuscript received September 7, 2010; revised December 15, 2010; accepted February 4, 2011. Date of current version June 17, 2011. A preliminary version of this paper appeared as “Fault diagnosis for embedded read-only memories” at the Proceedings of the IEEE International Test Conference in 2009, paper 7.1. This paper was recommended by Associate Editor D. M. Walker.

N. Mukherjee and J. Rajski are with Mentor Graphics Corporation, Wilsonville, OR 97070 USA (e-mail: nilanjan [email protected]; janusz−[email protected]).

A. Pogiel is with Mentor Graphics Polska, Poznań 61-131, Poland (e-mail: artur−[email protected]).

J. Tyszer is with the Faculty of Electronics and Telecommunications, Poznań University of Technology, Poznań 60-965, Poland (e-mail: [email protected]).

Digital Object Identifier 10.1109/TCAD.2011.2127030

Non-volatile memories are among the oldest programmable devices, but continue to have many critical uses. ROM, PROM, EPROM, EEPROM, and flash memories have proved to be very useful in a variety of applications. Traditionally, they were primarily used for long-term data storage, such as look-up tables in multimedia processors or permanent code storage in microprocessors. Due to the high area density and new submicrometer technologies involving multiple metal layers, ROMs have also gained popularity as a storage solution for low-voltage/low-power designs. Moreover, different methods such as selective pre-charging, minimization of non-zero items, row(s) inversion, sign magnitude encoding, and difference encoding are being employed to reduce the capacitance and/or the switching activity of bit and word lines. Such design, technology, and process changes have resulted in an increase in the number of ROM instances usually seen in a design. New non-volatile memories such as ferroelectric, magnetoresistive, and phase-change RAMs retain data when powered off but are not restricted in the number of operation cycles [1], [12]. They may soon replace other forms of non-volatile memory as their advantages, e.g., reduced standby power and improved density, are tremendous.

It has become imperative to deploy effective means for testing and diagnosing non-volatile memory failures. A functional model employed for these memories remains similar to that of RAMs with relevant fault types such as stuck-ats and bridges being tackled through functional test algorithms [25]. Also, all addressing malfunctions are covered by memory cell stuck-at fault tests as there are no writes in the mission mode. Typically, the basic test reads successive memory cells, and processes output responses by performing a polynomial division to compute a cyclic redundancy code (signature). The same procedure can be used to detect certain classes of dynamic faults provided memory cells are designed with additional DFT features [14].

No longer, however, is it sufficient to determine whether a memory failed or not [3], [27]. In ROM defect analysis and fine-tuning of a fabrication process, the ability to diagnose the cause of failure is of paramount importance. In particular, new defect types need to be accurately identified and well understood. It is also a common desire to verify if the programming device that is writing the ROM is working correctly. The method and accuracy of the diagnostic technique, therefore, is a critical factor in identifying failing sites of a memory array. It can be performed either on chip or off-line after downloading compressed test results.

0278-0070/$26.00 © 2011 IEEE

Until recently, the main strategy for ROM diagnosis was to have users provide an initialization file that describes the content of the ROM. The initialization sequence can be random as far as the test is concerned. During the MBIST session, the content of the ROM is read multiple times using different addressing schemes and compressed into a signature. The signature is downloaded at the very end of the algorithm. Although the data transferred to the memory is minimal, the diagnostic procedures employed for ROMs are cumbersome in nature. Some current techniques rely on downloading the signature value at certain intervals (based on binary search techniques) so that one can corner the test step at which the MISR gets corrupted. Other techniques suggest downloading the content of the entire ROM when a failure occurs. Such techniques can get to the failing address and data, but they are complex, time consuming, and often prohibitive in practice. Therefore, additional hardware is added to allow downloading the content of the entire ROM. As the ROM needs to be stopped after every read operation, the time needed to diagnose ROM failures increases significantly.

Susceptibility to different forms of failure has given rise to various memory diagnostic algorithms. Typically, they target RAMs by modifying the MBIST controller [5], [20] to carry out extra tests aimed at localizing all single-cell faults. A syndrome compression scheme, which requires a content addressable memory to perform data accumulation, is presented in [15]. Various techniques [2], [7], [9] are, in essence, off-line reasoning procedures geared toward accurate reconstruction of error bitmaps. The recent method of [17] achieves similar goals by employing flexible test logic to record test responses at the system speed with no interruptions of a BIST session.

Several diagnostic schemes have been patented. Solutions presented in [6] and [26] use dedicated circuits to compress diagnostic data at high speed and download them to a slow memory tester. Similarly, the scheme of [7]–[9] compresses memory test responses using a combinational logic and scans them outside the chip using a reduced bandwidth. Reference [28] proposed a fault syndrome compression scheme to identify failing patterns by means of coordinates compression. A technique similar to that of [11] is deployed in [29], where repetition of the same test is required. A dynamic switching between BIST and built-in self-diagnosis modes is introduced in [24]. It allows some failing patterns to be recognized and encoded into bit-strings.

In this paper, we propose a low-cost test and diagnostic scheme that allows uninterrupted test response collection to perform accurate identification of failing rows, columns, and cells in read-only memories [18], [23]. The method utilizes a concept of partitioning, originally introduced in [21] for scan-based fault diagnosis in BIST environment, and further refined in [4]. The proposed scheme partitions rows and columns of a ROM array deterministically and records signatures corresponding to array segments being currently read (observed), every time narrowing down possible error locations until the failing rows and columns are determined. Such an approach neither requires interactions between BIST and automatic test equipment (ATE) nor interrupts a test flow.

TABLE I

Basic Parameters of a ROM Array

R    The number of rows
B    The word size (the number of bits)
M    The number of words in a row (mux factor)
C    The number of columns (C = B × M)

This paper is organized as follows. In Section II, the overall architecture of the diagnostic environment is presented. Section III details foundations of row and column partitioning, while Section IV introduces its hardware implementation. Section V demonstrates how to locate single erroneous cells within failing rows and columns. In Section VI, we report experiments performed using the proposed approach. Section VII discusses area cost of the scheme. Finally, Section VIII concludes this paper. Compared to the earlier version of this paper [18], we have added: 1) a comparison regarding diagnostic time as offered by the proposed method and other techniques based on a complete ROM dumping or a binary search across the address space; 2) results of logic synthesis (in terms of area overhead) with respect to proposed diagnostic logic; and 3) detailed discussions of new diagnostic algorithms and adopted experimentation procedures. Moreover, this paper is significantly modified with respect to its presentation style, including several additional comments, new figures, and illustrative examples.

II. Test Logic Architecture

A. Memory Array Organization

Fig. 1 shows the salient architectural features of a ROM. Every row consists of M words, each B bits long. Bits belonging to one word can be either placed one after another or interleaved forming segments, as illustrated in the figure. Decoders guarantee the proper access to memory cells in either a fast row or a fast column addressing mode, i.e., with row numbers changing faster than word numbers or vice versa. Table I gives the main memory parameters that we will use in the next sections of this paper. It is worth noting that algorithms proposed in this paper do not impose any constraints on the addressing scheme so that the memory array can be read using either increasing or decreasing address order.

B. Collection of Diagnostic Data

Fig. 1. Memory array architecture and diagnostic environment.

The same Fig. 1 summarizes the architecture of a test environment used to collect diagnostic data from the ROM arrays. In addition to a BIST controller, it consists of two modules and gating logic that allow selective observation of rows and columns, respectively. Assuming permanent failures, the BIST controller sweeps through all ROM addresses repeatedly while the row and column selectors decide which data arriving from the memory rows and/or columns is actually observed by the signature register. Depending on a test scenario, test responses are collected in one of the following test modes.

1) Row disable = 0 and column disable = 1; the row selector may enable all bits of the currently received word, thereby selecting a given row; this mode is used to diagnose row failures and, in some cases, single cell faults.

2) Row disable = 1 and column disable = 0; assertion of the row disable signal effectively gates the row selector off; the column selector takes over as it picks a subset of bit lines to be observed (this corresponds to selecting desired columns and is recommended to diagnose column and single cell failures).

3) Row disable = 0 and column disable = 0; de-asserting both control lines allows observation of memory cells located where selected rows and columns intersect; this mode is discussed in Section IV-D.
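The three modes can be read as a small gating function. The sketch below is a behavioral assumption rather than the gating logic of Fig. 1 itself: the function name `cell_observed` and its boolean arguments are hypothetical, and it simply treats an asserted disable line as "let everything through" for that dimension.

```python
def cell_observed(row_selected: bool, column_selected: bool,
                  row_disable: bool, column_disable: bool) -> bool:
    """Decide whether a memory cell reaches the signature register.

    Assumed model: an asserted disable line gates the corresponding
    selector off, so that dimension no longer filters anything.
    """
    row_passes = row_disable or row_selected
    column_passes = column_disable or column_selected
    return row_passes and column_passes


# Mode 1: rows are filtered, every column of a selected row is observed.
print(cell_observed(True, False, row_disable=False, column_disable=True))   # True
# Mode 2: columns are filtered, every row of a selected column is observed.
print(cell_observed(False, True, row_disable=True, column_disable=False))   # True
# Mode 3: only intersections of selected rows and columns are observed.
print(cell_observed(True, False, row_disable=False, column_disable=False))  # False
```

Under this reading, mode 3 is simply the logical AND of the two selectors, which is why it observes only the cells where selected rows and columns intersect.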

Fault diagnosis has a simple flow. It proceeds iteratively by determining a signature, which corresponds to the selected rows or columns, followed by a transfer of such a test response to the ATE through an optional shadow register. If the obtained signature matches the reference (golden) signature, we declare the selected rows and/or columns fault-free. Time required to filter out failing sites accurately depends on how selection of observable rows and columns is carried out. Our scheme employs an enhanced version of deterministic partitioning originally proposed for scan-based diagnosis [4]. It assures the fastest possible identification of fault sources down to the array nodes that cannot be recognized as fault-free ones. Details of the partitioning procedure will be presented in Sections III and IV.

C. Signature Register

A signature register is used to collect all test responses arriving from selected memory cells. The register is reset at the beginning of every run (test step) over the address space. Similarly, the content of the register is downloaded once per run. A multiple input ring generator (MIRG) [16] driven by the outputs of gating logic is used to implement the signature register. The design of Fig. 2 features the injector network handling the increasing number of input channels. It is worth noting that connecting each input to uniquely selected stages of the compactor makes it possible to recognize errors arriving from different input channels. This technique visibly improves diagnostic resolution, as is demonstrated in the following sections.
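To illustrate how a signature is accumulated over one run and compared with a golden value, the following sketch uses a conventional multiple-input signature register (MISR), not the ring generator of [16]. The function name, the 8-bit width, the feedback polynomial x^8 + x^4 + x^3 + x^2 + 1, and the sample data are all assumptions chosen for illustration.

```python
def misr_signature(words, width=8, poly=0x1D):
    """Compact a sequence of `width`-bit words into one signature.

    Conventional MISR model (an assumption, not the MIRG of the paper):
    shift left, apply the feedback polynomial when the top bit falls out,
    then XOR the incoming word into the register.
    """
    mask = (1 << width) - 1
    state = 0  # the register is reset at the beginning of every run
    for word in words:
        top_bit = (state >> (width - 1)) & 1
        state = (state << 1) & mask
        if top_bit:
            state ^= poly
        state ^= word & mask
    return state


golden_data = [0x3C, 0xA5, 0x00, 0xFF, 0x12]   # hypothetical ROM words
faulty_data = list(golden_data)
faulty_data[2] ^= 0x08                          # one cell reads the wrong value

golden = misr_signature(golden_data)
observed = misr_signature(faulty_data)
print("fault-free" if observed == golden else "erroneous signature")
```

Because the register update is linear, the difference between the two signatures depends only on the injected error, which is the property a diagnostic scheme exploits when it compares each run against its reference signature.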

III. Deterministic Partitioning

Fig. 2. MIRG-based signature register.

In principle, selection of rows and columns that should be observed during a single diagnostic test run proceeds in accordance with a deterministic scheme sketched in [4] and, for the sake of completeness, briefly summarized in Section IV. The set of memory rows or columns is decomposed several times into groups of $2^n$ disjoint partitions of approximately the same size. In order to reduce test time, the number of partitions within each group should be small. Consequently, the same applies to the value of $n$. On the other hand, we need to guarantee that successive groups of partitions are formed in such a way that each partition of a given group shares at most one item with every partition belonging to the remaining groups. This implies that the partition size must not exceed the number $2^n$ of partitions. Hence, if the total number of memory rows or columns $v$ is an even power of 2, then the value of $n$ can be computed as $0.5 \log_2 v$. Otherwise, $n = \lceil 0.5 \log_2 v \rceil$. As a result, the size of partitions may vary from $2^{n-1}$ to $2^n$. This rule guarantees the most time-efficient tracking down of faulty rows or columns. Indeed, if the array has $x$ failing elements, then it suffices to run a test as indicated by $x + 1$ groups to determine the faulty items [4].
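The choice of n can be sketched in a few lines. The helper below is hypothetical (its name and return format are not from the paper); it finds the smallest n with 2^(2n) ≥ v using integer arithmetic, which equals the rounded-up value of 0.5 log2 v, and reports the resulting number of partitions and the largest partition size.

```python
def partition_parameters(v: int) -> tuple[int, int, int]:
    """Return (n, number of partitions, largest partition size) for v items."""
    if v < 1:
        raise ValueError("v must be positive")
    n = 0
    while 4 ** n < v:        # smallest n with 2^(2n) >= v
        n += 1
    partitions = 2 ** n
    largest = -(-v // partitions)   # ceiling division
    return n, partitions, largest


for v in (16, 32, 1024):
    n, partitions, largest = partition_parameters(v)
    print(f"v={v}: n={n}, {partitions} partitions of up to {largest} items")
```

For v = 16 this gives four partitions of four rows each, matching the 16-row examples of this section; for v = 32 the partition size drops to half the partition count, the lower end of the range stated above.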

Example: Let us consider a 16-row memory array. Two groups, each comprising four unique partitions, are shown on the left-hand side of Fig. 3(a). Suppose row 7 is faulty. After producing four signatures according to the scheme defined by the first group (0), it appears that signatures representing partitions 0, 1, and 2 are error-free, thereby rows that belong to these partitions can be cleared [the right-hand side of Fig. 3(a)]. Since the signature obtained by processing data from rows 3, 7, 11, and 15 (partition 3) is erroneous, these rows now become suspects [as marked in Fig. 3(a)]. The suspect rows belong to different partitions in the subsequent group (1). After running four tests for this group, it becomes evident that only the signature representing partition 2 is erroneous. Since rows 2, 8, and 13 were identified earlier as fault-free, we can easily isolate row 7 as the failing one.

Example: Assume now that x = 3 rows (5, 10, and 11) of the same memory are faulty. As can be seen in Fig. 3(b), after collecting four signatures for group 0, only rows 0, 4, 8, and 12 can be declared fault-free. Running tests for group 1 results in erroneous signatures associated with partitions 0 and 1; hence the number of suspects drops to 6 candidate rows: 1, 5, 10, 11, 14, and 15. The next round of tests produces three erroneous signatures, but due to new contents of partitions 0, 1, and 3, the possible suspects can be confined to rows 1, 5, 10, 11, and 14. Eventually, tests for group 3 produce two erroneous signatures for partitions 2 and 3, where only rows 5, 10, and 11 are still present. As a result, these rows are identified as faulty ones. Clearly, reading x + 1 = 4 groups of row partitions suffices to uniquely determine the failing rows.
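Both examples can be replayed with a short simulation. The sketch below makes several assumptions: it builds partitions with the selection rule (1) of Section IV-A and the diffractor trajectory 1 → 2 → 3 → 1 quoted there, it treats group 0 as leaving the diffractor in the all-0 state, and it idealizes a signature as erroneous exactly when its partition contains a faulty row (no aliasing). All function names are hypothetical.

```python
NEXT_STATE = {1: 2, 2: 3, 3: 1}   # diffractor trajectory for n = 2


def diffract(g, k):
    """State reached after k - 1 steps from g; 0 when k == 0 or g == 0 (assumed)."""
    if k == 0 or g == 0:
        return 0
    state = g
    for _ in range(k - 1):
        state = NEXT_STATE[state]
    return state


def partition_rows(g, p, size=4, count=4):
    """Rows of partition p in group g, following rule (1)."""
    return [size * k + (p ^ diffract(g, k)) for k in range(count)]


def diagnose(faulty, groups, total_rows=16, count=4):
    """Clear every row that appears in a partition with an error-free signature."""
    suspects = set(range(total_rows))
    for g in groups:
        for p in range(count):
            rows = set(partition_rows(g, p))
            if not rows & faulty:      # idealized: signature matches the golden one
                suspects -= rows
    return sorted(suspects)


print(diagnose({7}, groups=[0, 1]))               # [7]
print(diagnose({5, 10, 11}, groups=[0, 1]))       # [1, 5, 10, 11, 14, 15]
print(diagnose({5, 10, 11}, groups=[0, 1, 2, 3])) # [5, 10, 11]
```

The intermediate suspect lists agree with the ones quoted in the second example, which suggests the assumed partition layout matches Fig. 3 for this 16-row case.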

As with many other schemes, ROM diagnosis can be performed either in a non-adaptive mode where tests are selected prior to the actual diagnostic experiment, or in an adaptive fashion, where selection of tests is based on the outcomes of the previous runs. In the first case, the process targets a pre-specified number x of failing items and does not require any interaction with a tester, as only signatures for x + 1 partition groups have to be collected. In the second approach, if the current number of suspect rows or columns does not narrow down anymore, the failing items are assumed to be determined, and the test stops.

IV. Row and Column Selection

In this section, we introduce several hardware solutions for row and column selection. In particular, after presenting separate row and column selectors that implement a deterministic partitioning of a ROM array, we introduce a scheme that allows one to partition rows and columns simultaneously.

A. Row Selection

We start by introducing the general structure of the row selector shown in Fig. 4. Essentially, it is comprised of four registers. The up counters partition and group, each of size $n = \lceil 0.5 \log_2 R \rceil$, keep indexes of the current partition and the current group, respectively. They act as an extension of the row address register that belongs to the BIST controller (the leftmost part of the counter in Fig. 4). A linear feedback shift register (LFSR) with a primitive characteristic polynomial implements a diffractor providing successive powers of a generating element of GF($2^n$), which are subsequently used to selectively invert data arriving from the partition register. The same register can be initialized when its input load is activated. Similarly, one can initialize a down counter called offset by asserting its input load.

In principle, the circuit shown in Fig. 4 implements the following formula used to determine members $r$ of partition $p$ within group $g$:

$$r = S \cdot k + (p \oplus (g \otimes k)), \quad k = 0, 1, \ldots, P - 1 \tag{1}$$

where $S$ is the size of partition, $P$ is the number of partitions, $\oplus$ is a bit-wise addition modulo 2, and $g \otimes k$ is a state that the diffractor reaches after $k - 1$ steps assuming that its initial state was $g$. If $k = 0$, then $g \otimes k = 0$. For example, (1) yields successive partitions of Fig. 3 for $S = 4$ and $k = 0, 1, 2, 3$, assuming that the diffractor cycles through the following states: 1 → 2 → 3 → 1. Let $g = 3$ and $p = 2$. Then we have

$k = 0$: $r = 4 \cdot 0 + (2 \oplus (3 \otimes 0)) = 0 + (2 \oplus 0) = 2$
$k = 1$: $r = 4 \cdot 1 + (2 \oplus (3 \otimes 1)) = 4 + (2 \oplus 3) = 5$
$k = 2$: $r = 4 \cdot 2 + (2 \oplus (3 \otimes 2)) = 8 + (2 \oplus 1) = 11$
$k = 3$: $r = 4 \cdot 3 + (2 \oplus (3 \otimes 3)) = 12 + (2 \oplus 2) = 12$.
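Formula (1) is easy to evaluate directly. The following sketch reproduces the worked case g = 3, p = 2 and then checks, for the 16-row configuration only, the property required in Section III: partitions taken from different groups share at most one row. The function names are hypothetical, and treating g = 0 as an all-0 diffractor state is an assumption.

```python
from itertools import combinations

NEXT_STATE = {1: 2, 2: 3, 3: 1}   # diffractor cycle quoted in the text


def diffract(g, k):
    """g (x) k: 0 for k == 0, otherwise the state after k - 1 steps from g."""
    if k == 0 or g == 0:          # g == 0 staying at 0 is an assumption
        return 0
    state = g
    for _ in range(k - 1):
        state = NEXT_STATE[state]
    return state


def members(g, p, S=4, P=4):
    """Rows r = S*k + (p XOR (g (x) k)) for k = 0 .. P-1."""
    return [S * k + (p ^ diffract(g, k)) for k in range(P)]


def max_overlap(groups=range(4), P=4):
    """Largest number of rows shared by two partitions of different groups."""
    worst = 0
    for g1, g2 in combinations(groups, 2):
        for p1 in range(P):
            for p2 in range(P):
                shared = set(members(g1, p1)) & set(members(g2, p2))
                worst = max(worst, len(shared))
    return worst


print(members(3, 2))    # [2, 5, 11, 12]
print(max_overlap())    # 1
```

An overlap of one is what lets a clean signature in a later group clear suspects left over from an earlier group without ever clearing two rows that were implicated together.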

Fig. 3. Partition groups for 16-word memory. (a) Single faulty row. (b) Three faulty rows.

Fig. 4. Row selector.

With the ascending row address order, selection of rows within a partition, a group, and finally the whole test is done as follows. The offset counter is reloaded periodically every time the $n$ least significant bits of the row address register become zero (this is detected by the NOR gate N1). Once loaded, the counter is decremented to reach the all-0 state after $p \oplus (g \otimes k)$ cycles. This is detected by the NOR gate N2 associated with the counter. Hence, its asserted output enables observation of a single row within every $S$ successive cycles. As indicated by (1), the initial values of the offset counter are obtained by adding the actual partition number to the current state of the diffractor. The latter register is initialized by using the group number at the beginning of every test run, i.e., when the row address is reset. Subsequently, the diffractor changes its state every time the offset register is reloaded. As the period of the LFSR-based diffractor is $2^n - 1$, and the offset counter is reloaded $2^n$ times, the missing all-0 state is always generated at the beginning of a test run by means of the AND gates placed at the outputs of the diffractor.

Example: Fig. 5 illustrates operation of the row selector for the memory of Fig. 3, i.e., for 16 rows forming four partitions. In this case n = 2. Two state diagrams shown in the figure correspond to the content of the diffractor and the offset register when handling partition 2 ($10_2$) of group 3 ($11_2$). The last diagram depicts the output of gate N2. As can be seen, the diffractor is loaded at the beginning of a test run (row address = 0) with the group number 3 ($11_2$), and then it changes its state every four cycles by following the trajectory 1 → 2 → 3. At the same cycles, i.e., 0, 4, 8, and 12, the offset counter is reloaded with the sum of the partition number 2 ($10_2$) and the previous state of the diffractor, except for the first load, when only the partition number goes to the offset counter as the outputs of the AND gates are set to 0. After initializing, the offset register counts down and reaches zero at cycles 2, 5, 11, and 12, which yields an active signal on line “observe row” resulting, in turn, in observing data from the memory rows with addresses 2, 5, 11, and 12, respectively.
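The cycle-by-cycle behavior described in this example can be mimicked with a small behavioral model. It is a sketch under assumptions, not a description of the actual circuit of Fig. 4: reload and zero detection are taken to happen in the same cycle, the offset counter is an n-bit down counter that wraps around, and the function name and argument names are hypothetical.

```python
def row_selector_trace(group, partition, n=2, rows=16):
    """Return the row addresses for which "observe row" would be asserted."""
    next_state = {0: 0, 1: 2, 2: 3, 3: 1}   # diffractor trajectory for n = 2
    period = 1 << n
    diffractor = group        # loaded when the row address is reset
    first_load = True
    offset = 0
    observed = []
    for address in range(rows):
        if address % period == 0:            # N1: n low address bits are zero
            if first_load:
                masked = 0                   # AND gates force the all-0 state
                first_load = False
            else:
                masked = diffractor          # previous diffractor state is used
                diffractor = next_state[diffractor]
            offset = partition ^ masked
        if offset == 0:                      # N2: offset counter reached zero
            observed.append(address)
        offset = (offset - 1) % period       # n-bit down counter wraps around
    return observed


print(row_selector_trace(group=3, partition=2))   # [2, 5, 11, 12]
```

The wrap-around never causes a second observation within a block of four addresses, because the counter is reloaded before it can reach zero again.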

B. Column Selection

Fig. 6 shows the column selector used to decide, in a deterministic fashion, which columns should be observed. Its architecture resembles the structure of the row selector as both circuits adopt the same selection principles. The main differences include the use of a BIST column address register and a diffractor clocking scheme. Moreover, the offset counter is now replaced with a combinational column decoder, which allows selection of one out of B outputs of the word decoder (see Fig. 1). It is worth noting that the diffractor advances every time the column address increments. Its content added to the partition number yields a required column address in a manner similar to that of the row selection.

If the size B of the memory word is equal to M (the number of words per row), it suffices to select one out of B columns at a time to cover all columns of the memory array for one partition group. Typically, however, we observe that B > M. This requires more than one column of each word to be selected at a time, as far as the single test run is concerned for every partition. The number $\tau$ of columns observed simultaneously can be determined by dividing the maximal number of columns in a partition, which is $2^n$, by the number M of memory words per row

$$\tau = 2^n / M. \tag{2}$$

Fig. 5. Row selector operation.

Fig. 6. Column selector.

It is important to note that columns observed in parallel cannot be handled by a single “$\tau$ out of B” selector, as in such a case certain columns would always be observed together, thereby precluding an effective partitioning. Consequently, the output column decoder is divided into $\tau$ smaller “1 out of $B/\tau$” decoders fed by phase shifters (PS), and then the diffractor, as shown in Fig. 7. The phase shifters transform a given input combination in such a way that the resultant output values are spread in regular intervals over the diffractor state trajectory. Fig. 8 demonstrates this scenario for a 3-bit diffractor driving three phase shifters and using primitive polynomial $x^3 + x + 1$. Let the diffractor be initialized to the value of 1. The phase shifters PS1, PS2, and PS3 are then to output states of the original trajectory, but starting with the values of 4, 6, and 5, respectively. When various partition groups are examined, the diffractor traverses the corresponding parts of its state space while the phase shifters produce appropriate values that ensure generation of all $2^n - 1$ combinations. The missing all-0 state is again obtained by means of AND gates. Synthesis of phase shifters is thoroughly discussed in [22].

Fig. 7. Enhanced column selector.

Fig. 8. Use of three phase shifters.

Fig. 9. Example of column selector.

Fig. 10. Phase shifters for column partitioning.

Example: Let a memory row consist of M = 2 8-bit interleaved words arranged as shown in Fig. 9. From (2) we have that $\tau = 4/2 = 2$, so we need two “1 out of 4” column decoders and one phase shifter connected to the decoder selecting bits b4 to b7. Table II illustrates how columns are selected for partition groups 1 ($01_2$) and 2 ($10_2$). The first two rows of the table contain values generated by the diffractor (initialized with the group number 1) and a phase shifter for partition 0. As can be seen, despite the diffractor’s initial value, address 0 is first observed at the input of column decoder 0 due to the logic value of 0 driving the AND gates. The next state provided to the column decoder is 2, which is the second state produced by the diffractor. These two addresses at column decoder 0 result in the following selections: column 0 of word 0, and column 5, which, in fact, is the column 2 of word 1. Moreover, column decoder 1 receives states 3 and 1 produced by the phase shifter (see the corresponding diffractor trajectories). They facilitate selection of columns 14 and 11, respectively. As for the remaining partitions of group 1, the same states occur at the outputs of AND gates and the phase shifter, but they are further modified by adding successive partition numbers. It effectively results in selection of the remaining columns. Column selection for the next partition groups is carried out in a similar manner except for initialization of the diffractor. The diffractor trajectory and selection of columns for partition group 2 are presented in Fig. 10.
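The column selections of this example can be recomputed with a short model. Everything structural in it is inferred from the example and the entries of Table II rather than taken from a circuit description: the diffractor cycle 1 → 2 → 3 → 1, a phase shifter that maps diffractor states 1, 2, 3 to 3, 1, 2, and an interleaved layout in which bit b of word w sits in column b · M + w. The function name is hypothetical.

```python
NEXT_STATE = {1: 2, 2: 3, 3: 1}     # assumed 2-bit diffractor trajectory
PHASE_SHIFT = {1: 3, 2: 1, 3: 2}    # phase shifter outputs inferred from Table II


def observed_columns(group, partition, words=2, bits_per_decoder=4):
    """Return (word, decoder 0 input, decoder 1 input, observed columns) per word."""
    state = group                    # diffractor is initialized with the group number
    selections = []
    for word in range(words):
        masked = 0 if word == 0 else state          # AND gates give 0 for the first word
        decoder0 = partition ^ masked               # selects one of bits b0..b3
        decoder1 = partition ^ PHASE_SHIFT[state]   # selects one of bits b4..b7
        column0 = decoder0 * words + word
        column1 = (bits_per_decoder + decoder1) * words + word
        selections.append((word, decoder0, decoder1, (column0, column1)))
        state = NEXT_STATE[state]    # diffractor advances with the word address
    return selections


for p in range(4):
    for word, d0, d1, columns in observed_columns(group=1, partition=p):
        print(word, format(p, "02b"), format(d0, "02b"), format(d1, "02b"), columns)
```

For group 1 the printed rows follow the first half of Table II, starting with columns (0, 14) and (5, 11) for partition 0; calling the function with `group=2` yields the second half under the same assumptions.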

C. Combined Row and Column Selection

In order to reduce the area overhead, some components of the row selector and the column selector can be shared. The circuit by which this concept is implemented is shown in Fig. 11 where the partition and group registers feed both selectors. Since the word address increments prior to the row address, the memory array is read in the fast column addressing mode. As no interaction between control signals arriving from the word and row address registers is needed, the scheme enables reading the memory array in the fast row mode as well, after exchanging the row and column address registers. Furthermore, the combined row and column selector is designed in such a way that none of the components requires a clock faster than the one used to increment either the word or row address register. As a result, the proposed scheme allows reading memory at-speed, and thus detection of timing defects. Finally, as the combined selector makes it possible to collect the row and column signatures in parallel, such an approach allows one to reduce the diagnostic time by half. In this mode, however, two signature registers are required.

TABLE II

Column Partitioning

| Word Address | Partition | Group | Column Decoder 0 | Column Decoder 1 | Observed Columns |
|---|---|---|---|---|---|
| 0 | 00 | 01 | 00 | 11 | 0, 14 |
| 1 | 00 | 01 | 10 | 01 | 5, 11 |
| 0 | 01 | 01 | 01 | 10 | 2, 12 |
| 1 | 01 | 01 | 11 | 00 | 7, 9 |
| 0 | 10 | 01 | 10 | 01 | 4, 10 |
| 1 | 10 | 01 | 00 | 11 | 1, 15 |
| 0 | 11 | 01 | 11 | 00 | 6, 8 |
| 1 | 11 | 01 | 01 | 10 | 3, 13 |
| 0 | 00 | 10 | 00 | 01 | 0, 10 |
| 1 | 00 | 10 | 11 | 10 | 7, 13 |
| 0 | 01 | 10 | 01 | 00 | 2, 8 |
| 1 | 01 | 10 | 10 | 11 | 5, 15 |
| 0 | 10 | 10 | 10 | 11 | 4, 14 |
| 1 | 10 | 10 | 01 | 00 | 3, 9 |
| 0 | 11 | 10 | 11 | 10 | 6, 12 |
| 1 | 11 | 10 | 00 | 01 | 1, 11 |

D. Trellis Selection

Given x + 1 groups of signatures, the selection schemes presented earlier allow one to identify correctly up to either


Fig. 11. Combined row and column selector.

Fig. 12. Trellis selection. (a) Single stuck-at column and single stuck-at row failure. (b) Error-free response. (c) Erroneous response.

x failing rows or x failing columns, exclusively. The actual failure may comprise, however, faults occurring in rows and columns at once. Fig. 12(a) illustrates a failure that consists of a single stuck-at column and a single stuck-at row. The black dots indicate failing cells assuming a random fill; note that some cells of the faulty row and column store the same logic values as those forced by the fault. If diagnosed by using separate selection of rows and columns, such a fault would affect most of the signatures, as cells belonging to the failing column make almost all row signatures erroneous, and cells of the failing row would render almost all column signatures erroneous as well.

Collecting signatures in the so-called trellis mode provides a solution to this problem by partitioning rows and columns simultaneously. Selecting rows and columns in parallel substantially reduces the number of observed cells, thereby increasing the chance to record fault-free signatures and to sieve successfully failing rows and columns. Fig. 12(b) and (c) are examples of trellis compaction in the presence of a single-row-single-column failure. Only memory cells located at the intersections of the selected rows and columns are observed. The resultant signatures are therefore likely to be error-free, as shown in Fig. 12(b). Consequently, the selected rows and columns can be declared fault free. When the selected cells come across the failing row or the failing column, one may expect to capture at least one error, as in Fig. 12(c).
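A toy model may help to see why trellis selection leaves most signatures clean. The sketch below is purely illustrative and rests on simplifying assumptions that are not part of the scheme itself: rows and columns are assigned to partitions by a plain modulo rule (instead of the diffractor-driven selection), the trellis mode pairs row partition p with column partition p, and every cell of the failing row and column is treated as erroneous. The function name is hypothetical.

```python
def erroneous_signatures(rows, cols, parts, bad_row, bad_col, mode):
    """Return the set of partitions whose signature observes a failing cell.

    Assumptions (illustration only): row r belongs to partition r % parts,
    column c to partition c % parts, and all cells of the failing row and
    of the failing column are erroneous.
    """
    hit = set()
    for p in range(parts):
        for r in range(rows):
            for c in range(cols):
                in_row_part = r % parts == p
                in_col_part = c % parts == p
                if mode == "rows":
                    observed = in_row_part
                elif mode == "columns":
                    observed = in_col_part
                else:                   # "trellis": row and column both selected
                    observed = in_row_part and in_col_part
                if observed and (r == bad_row or c == bad_col):
                    hit.add(p)
    return hit


args = dict(rows=8, cols=8, parts=4, bad_row=1, bad_col=6)
print(erroneous_signatures(mode="rows", **args))      # {0, 1, 2, 3}
print(erroneous_signatures(mode="columns", **args))   # {0, 1, 2, 3}
print(erroneous_signatures(mode="trellis", **args))   # {1, 2}
```

In this small model, separate row or column selection corrupts every signature, whereas the trellis mode corrupts only the two partitions that actually contain the failing row or the failing column.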

There is an intrinsic rows-to-columns correlation in the trellis selection mode. In particular, using the same characteristic polynomial for both diffractors of the combined selector, and initializing them with the same group number causes predictable changes in this dependency: many row-column pairs always end up in the same partitions. As a result, the diagnostic algorithm is unable to distinguish fault-free rows and columns from defective ones since they are permanently paired by the selection scheme. The upper part of Table III illustrates a possible impact this phenomenon may have on diagnostic quality. The results were obtained for a memory array with 1024 rows and 1024 columns. The row and column selectors employ identical diffractors with a primitive polynomial $x^5 + x^2 + 1$. Each entry to the table provides the number of row-column pairs (out of total $1024^2$) that occur k times within the same partitions for arbitrarily chosen 3, 4, 5, and 32 partition groups. As shown in the table, 1024 rows and columns always get to the same partition regardless of the number of partition groups. A thorough analysis of these results has further revealed that every row is permanently coupled with a certain column due to this particular selection mechanism.

TABLE III

Correlation in the Trellis Mode

Column diffractor initialized with the group number

| k | 3 groups | 4 groups | 5 groups | 32 groups |
|---|---|---|---|---|
| 0 | 952 320 | 920 576 | 888 832 | 31 744 |
| 1 | 95 232 | 126 976 | 158 720 | 1 015 808 |
| 2 | 0 | 0 | 0 | 0 |
| 3 | 1024 | 0 | 0 | 0 |
| 4 | | 1024 | 0 | 0 |
| 5 | | | 1024 | 0 |
| 8 | | | | 0 |
| 16 | | | | 0 |
| 32 | | | | 1024 |

Column diffractor initialized with the group number + 1

| k | 3 groups | 4 groups | 5 groups | 32 groups |
|---|---|---|---|---|
| 0 | 953 312 | 922 560 | 892 800 | 297 600 |
| 1 | 92 256 | 122 016 | 149 792 | 527 744 |
| 2 | 2976 | 2976 | 3968 | 180 544 |
| 3 | 32 | 992 | 1984 | 31 744 |
| 4 | | 32 | 0 | 4960 |
| 5 | | | 32 | 3968 |
| 8 | | | | 992 |
| 16 | | | | 992 |
| 32 | | | | 32 |

It appears, however, that a simple n-bit arithmetic incrementer (a module “+1” in Fig. 11) placed between the group register and one of the diffractors alters this row-column relationship so that the resultant correlation is significantly decreased. This is confirmed by the experimental data gathered in the lower part of Table III. We assume here that the column diffractor is initialized with the group number increased arithmetically by 1. As can be seen, the enhanced selection technique clearly reduces the number of the row-column pairs that always end up in the same partitions. Interestingly, the number of such pairs is equal to the number of partitions in a group (32). This is due to the zero states that are always contributed by the AND gates at the beginning of each partition.

V. Single Cell Failures

The methods presented in the previous section allow identification of failing sites with single-row and/or single-column accuracy. It is also possible to take diagnosis a step further and determine the location of a single faulty cell within a row or a column. This section summarizes this approach.


Fig. 13. Single cell failure diagnosis.

Since the compactor (signature register) is a linear circuit, we work with the so-called error signature E, which replaces the actual signature A, and can be obtained by adding modulo 2 a golden (fault-free) signature G to A, i.e., $E = A \oplus G$. In terms of error signatures, the compactor remains in the all-0 state (Fig. 13) till a fault injection that moves the compactor to a certain state x determined by the compactor injector network. Subsequently, the compactor advances by additional d steps to reach state y. Typically, d is the number of steps required to complete a given memory run. The same value then provides the actual fault location, which is the distance between states x and y, as recorded by the compactor.

The value of d, and hence a fault site, can be found by using discrete logarithm-based counting [10], [17]. It solves the following problem: given an LFSR and a given state, determine the number of cycles needed to reach that state assuming that the compactor is initially set to 0...001. When working with a failing row signature (most likely representing a single cell failure), a fault injection site (the compactor input) is unknown. Thus, d must be computed B times by applying repeatedly the following formula:

$$d = d_y - d_x \qquad (3)$$

where $d_y$ and $d_x$ are distances between the state 0...01 and states y and x, respectively. Recall that state x depends on where a fault is injected, and so does $d_x$. Finally, only $d < M \cdot R$ is considered an acceptable solution. It is worth noting that once accepted, the corresponding state x identifies uniquely the memory segment from which a fault arrives. The following steps summarize the above procedure:

    compute d_y using discrete logarithms counting
    for i = 0 to B − 1 repeat
        recall d_x(i) from LUT
        compute d = d_y − d_x(i)
        if d < 0, then d ← d + LFSR period
        if d < M · R, then stop; the failing cell belongs to segment i and its distance to the end of this segment is d.
    end for.

Example: Suppose we use a simple 4-bit compactor with two inputs as shown in Fig. 14. The same figure illustrates its state trajectory. Let the compactor work with a memory having the following parameters: B = 2, M = 2, and R = 4. As can be seen, if an error is injected through input a, then the compactor will initially move to state 0110 (6). Similarly, an error reaching input b takes the compactor to state 1001 (9). From the figure we have that the distance $d_6$ between states 0001 (1) and 0110 (6) is 2. Also, $d_9 = 11$. Assume that the compactor has reached state 1110 (14). Its distance from state 0001 (1) is 7, and thus d = 7 − 2 = 5 for input a. Since $5 < M \cdot R = 2 \cdot 4 = 8$, we get an acceptable result, which indicates that the failing cell belongs to the segment connected with input a, and its distance from the end of the segment is 5. If one examined results obtained for input b, then we would get d = 7 − 11 = −4. After adjusting this number by using the compactor period, we would have d = −4 + 15 = 11. Clearly, 11 > 8, and thus the result could be discarded.

Fig. 14. Simple compactor and its state trajectory.
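The search over compactor inputs can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: it assumes that the distance $d_y$ has already been obtained by discrete logarithm counting and that the per-input distances $d_x$ are available as a look-up table; the function and parameter names are hypothetical. The sample call uses the values of the example above.

```python
def locate_single_fault(d_y, d_x_lut, lfsr_period, M, R):
    """Return (segment, distance) for the first acceptable solution, else None.

    d_y      -- distance from state 0...01 to the observed signature state y
                (assumed to be precomputed by discrete logarithm counting)
    d_x_lut  -- d_x_lut[i] is the distance from 0...01 to the state entered
                when an error is injected through compactor input i
    """
    for i, d_x in enumerate(d_x_lut):
        d = d_y - d_x
        if d < 0:
            d += lfsr_period        # wrap around the state trajectory
        if d < M * R:               # only distances within one segment are accepted
            return i, d
    return None


# Example values: inputs a and b correspond to distances 2 and 11, the final
# state 1110 lies 7 steps from 0001, and the compactor period is 15.
result = locate_single_fault(d_y=7, d_x_lut=[2, 11], lfsr_period=15, M=2, R=4)
print(result)   # (0, 5): input a, five steps from the end of its segment
```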

Having collected column signatures, a fault injection site can be determined in a straightforward manner. Consequently, the above diagnostic procedure simplifies and becomes more reliable, as there is no need to repeat all diagnostic steps for successive inputs of the compactor.

Moreover, information related to failing rows (or columns), obtained as shown in the earlier sections, can be used in further efforts to improve accuracy of diagnosis. Given distance d, one can easily determine a row r to which the suspect cell belongs. If r does not match the row indicated by virtue of the way the selection mechanism works, the algorithm continues to target the following memory segments. The same technique allows scaling down the size of the compactor itself. In fact, the compactor period can be shortened even below the size of a single memory segment. Failing row information used to eliminate inconsistent results effectively counterbalances a possible “wrap-around.” The following code summarizes the basic steps of the improved diagnostic procedure:

    compute d_y using discrete logarithms counting
    for i = 0 to B − 1 repeat
        recall d_{x,i} from LUT
        compute d = d_y − d_{x,i}
        if d < 0, then d ← d + LFSR period
        if d > segment size, start a new iteration
        j ← 0
        while (d + j · LFSR period < segment size) repeat
            r ← R − 1 − (d + j · LFSR period) / M
            c ← M − 1 + i · M − (d + j · LFSR period) mod M
            if r is a suspect row and c is the failing column (if known), then stop; the failing cell is (r, c)
            j ← j + 1
        end while
    end for.
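The improved procedure translates almost line by line into executable form. The following sketch makes a few assumptions that the pseudocode leaves open: the segment size is taken to be M · R cells (consistent with the acceptance test of the basic procedure), the division in the row formula is an integer division, and the distances $d_y$ and $d_{x,i}$ are supplied by the caller. All names are hypothetical, and the second call uses made-up values chosen only to show a wrap-around.

```python
def locate_fault_with_rows(d_y, d_x_lut, lfsr_period, M, R,
                           suspect_rows, failing_column=None):
    """Return (row, column) of the failing cell, or None if nothing matches."""
    segment_size = M * R                    # assumed: cells per memory segment
    for i, d_x in enumerate(d_x_lut):
        d = d_y - d_x
        if d < 0:
            d += lfsr_period
        if d > segment_size:
            continue                        # start a new iteration
        j = 0
        while d + j * lfsr_period < segment_size:
            dist = d + j * lfsr_period      # candidate distance after j wrap-arounds
            r = R - 1 - dist // M
            c = M - 1 + i * M - dist % M
            # Accept only candidates consistent with the known failing row(s)
            # and, when available, with the known failing column.
            if r in suspect_rows and failing_column in (None, c):
                return r, c
            j += 1
    return None


# Values of the earlier example, with row 1 reported as failing.
print(locate_fault_with_rows(7, [2, 11], 15, M=2, R=4, suspect_rows={1}))   # (1, 0)
# Hypothetical case with a compactor period (7) shorter than the segment (16).
print(locate_fault_with_rows(5, [2], 7, M=2, R=8, suspect_rows={2}))        # (2, 1)
```

In the second call the first candidate (row 6) is rejected because it is not a suspect row, and the match is found only after adding one compactor period, which is the "wrap-around" case described above.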


VI. Experimental Results

This section reports results of experiments carried out to characterize performance of the proposed diagnostic scheme. In particular, a diagnostic coverage is used as a primary figure of merit. Assuming that we target up to x failing rows or columns, all numbers presented in this section have been obtained by adopting the following procedure.

1) Run tests for x + 1 column partition groups. Let $x_c$ be the resultant number of failing columns.

2) Repeat the same tests for x + 1 row partition groups. Let $x_r$ be the resultant number of failing rows.

3) If neither $x_c$ nor $x_r$ is less than or equal to x, then carry out the trellis selection and stop. Otherwise:

4) If $x_c \le x$, then: Examine signatures (one per failing column) collected in step (1) against single cell faults (by using the discrete logarithms-based counting).

5) If $x_r \le x$, then: Examine signatures (one per failing row) collected in step (2) against single cell faults (again by using the discrete logarithms-based counting).
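The decision logic of steps 3) to 5) can be summarized as a small dispatcher. This is only a sketch of the control flow: the function name and the returned step labels are hypothetical, and the actual signature collection and analysis are outside its scope.

```python
def plan_diagnosis(x_c, x_r, x):
    """Return the ordered analysis steps for the counts obtained in steps 1 and 2.

    x_c, x_r -- numbers of failing columns and rows found with x + 1 groups
    x        -- maximal number of failing rows or columns being targeted
    """
    if x_c > x and x_r > x:
        return ["trellis selection"]                                  # step 3
    steps = []
    if x_c <= x:
        steps.append("single-cell analysis of column signatures")    # step 4
    if x_r <= x:
        steps.append("single-cell analysis of row signatures")       # step 5
    return steps


print(plan_diagnosis(x_c=1, x_r=1, x=3))   # columns first, then rows
print(plan_diagnosis(x_c=9, x_r=7, x=3))   # trellis selection only
```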

The order of actions proposed above plays a key role in optimizing the diagnostic performance. For example, if there are single cell faults only, then both $x_c$ and $x_r$ are less than or equal to x and the discrete logarithms-based counting can be applied to examine both column and row signatures (steps 4 and 5), and subsequently to cross-check their results. Typically, the columns are to be examined prior to the rows. This is because the failing column number is known when launching the diagnostic procedures for columns, so that we can run the discrete logarithms-based reasoning only once for a given (known) injector polynomial. It remarkably minimizes the likelihood of choosing a wrong segment number, which might be the case in step (5) as further illustrated by results of Table IV.

On the other hand, in a rare case of multiple row and column failures [19], the trellis selection is the only feasible diagnostic approach, and thus the condition of step (3) must be checked before launching the remaining techniques proposed in this paper.

The first group of experiments examines a relationship between the compactor size and the diagnostic coverage when attempting to identify single cell failures. Table IV presents the diagnostic coverage numbers as a function of the memory and compactor sizes. Each entry to the table indicates a fraction of failures that were correctly diagnosed out of 100 K randomly generated single cell failures. In order to increase statistical significance of the experiments, the compactor injector network was changed every 1000 failures.

It is worth noting that only failing row signatures were considered as starting points to trace faulty cells. This experiment can therefore be regarded as the worst case analysis as far as the discrete logarithm-based counting is concerned. Typically, one may expect substantial improvement once failing column signatures are also available.

As shown in the table, the size of the signature register can be crucial in achieving adequate diagnostic resolution and coverage. Interestingly, there is a coverage drop when comparing memories of the same capacity but having a different number of segments. Apparently, the increasing number of segments adversely impacts the diagnostic coverage. Fortunately, this phenomenon is gradually diminishing with the increasing size of the compactor itself.

TABLE IV

Diagnostic Coverage [%] Versus Compactor Size

| B | Segment Size | Memory Size [kB] | Compactor size 20 | Compactor size 24 | Compactor size 28 | Compactor size 32 |
|---|---|---|---|---|---|---|
| 32 | 256 | 8 | 98.49 | 99.22 | 99.43 | 99.74 |
| 32 | 1 K | 32 | 97.43 | 98.50 | 98.86 | 99.44 |
| 32 | 4 K | 128 | 96.67 | 97.90 | 98.41 | 99.13 |
| 32 | 16 K | 512 | 96.19 | 97.57 | 98.15 | 98.97 |
| 32 | 64 K | 2048 | 95.81 | 97.42 | 98.04 | 98.88 |
| 128 | 64 | 8 | 99.01 | 99.49 | 99.83 | 99.85 |
| 128 | 256 | 32 | 96.76 | 98.59 | 99.27 | 99.47 |
| 128 | 1 K | 128 | 92.45 | 97.01 | 97.95 | 98.75 |
| 128 | 4 K | 512 | 88.02 | 94.79 | 96.20 | 97.67 |
| 128 | 16 K | 2048 | 85.27 | 93.01 | 94.60 | 96.77 |

It also appears that the discrete logarithms-based counting works fine even for memories greater than the compactor period. As an example, consider a 2 MB array comprising over 16.8M memory cells. They may potentially produce 16.8M erroneous patterns. Nevertheless, a 32-input, 20-bit compactor with the period of 1 048 575 guarantees almost 96% diagnostic coverage. This is because the diagnostic algorithm targets only memory cells belonging to the indicated failing rows, as presented in Section V.
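The mismatch between the array size and the compactor period quoted above is easy to quantify. The short calculation below assumes 1 MB = $2^{20}$ bytes and one memory cell per bit; the variable names are illustrative only.

```python
cells = 2 * 2**20 * 8      # cells in a 2 MB array (assuming 1 MB = 2^20 bytes)
period = 2**20 - 1         # maximal period of a 20-bit compactor
print(cells)               # 16777216
print(period)              # 1048575
print(cells // period)     # 16: the array spans the period about 16 times
```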

The schemes proposed in this paper were further tested on 128 kB and 2 MB memories working with 16-bit and 32-bit compactors, respectively. This group of experiments was aimed at determining the overall diagnostic coverage for faults commonly exhibited by memories. They are listed in the first column of Table V. In principle, each entry to the table consists of two numbers. The first one is the percentage of faults of a given type that were correctly identified. The second number provides the percentage of test cases in which at least rows and/or columns that host the actual failure were part of the solution. Clearly, if the first number is 100%, the second assumes the same value and is, therefore, omitted in the table if the complete coverage was reached for all cases in a row. Note that each data point presented in the table was obtained by injecting 10 K and 5 K randomly generated failures to 128 kB and 2 MB devices, respectively. In each case, the number of failures was chosen arbitrarily.

TABLE V

Diagnostic Coverage [%] for Two Memory Arrays

Entries of the form a / b list the first and the second number of a table entry; a single value is given where the second number was omitted.

| Memory size | 128 kB | 128 kB | 128 kB | 2 MB | 2 MB | 2 MB |
|---|---|---|---|---|---|---|
| Architecture | 1024 × (32 × 32) | 1024 × (32 × 32) | 1024 × (32 × 32) | 4096 × (32 × 128) | 4096 × (32 × 128) | 4096 × (32 × 128) |
| Compactor size | 16 | 16 | 16 | 32 | 32 | 32 |
| Partition groups | 3 | 4 | 5 | 3 | 4 | 5 |
| Memory runs | 96 | 128 | 160 | 192 | 256 | 320 |
| Single cell | 100 | 100 | 100 | 100 | 100 | 100 |
| Two cells | 100 | 100 | 100 | 100 | 100 | 100 |
| Three cells | 70.48 / 100 | 100 / 100 | 100 / 100 | 83.08 / 100 | 100 / 100 | 100 / 100 |
| Single row | 100 | 100 | 100 | 100 | 100 | 100 |
| Two rows | 100 | 100 | 100 | 100 | 100 | 100 |
| Three rows | 83.77 / 100 | 99.98 / 100 | 99.98 / 100 | 91.16 / 100 | 100 / 100 | 100 / 100 |
| Single column | 100 | 100 | 100 | 100 | 100 | 100 |
| Two columns | 99.98 / 100 | 99.98 / 100 | 99.98 / 100 | 100 / 100 | 100 / 100 | 100 / 100 |
| Three columns | 83.98 / 100 | 99.99 / 100 | 99.99 / 100 | 91.42 / 100 | 100 / 100 | 100 / 100 |
| Single row and single cell | 99.99 / 100 | 99.99 / 100 | 99.99 / 100 | 98.02 / 100 | 98.02 / 100 | 98.02 / 100 |
| Single row and two cells | 84.42 / 100 | 99.98 / 100 | 99.98 / 100 | 87.16 / 100 | 95.26 / 100 | 95.26 / 100 |
| Two rows and single cell | 83.85 / 100 | 99.99 / 100 | 99.99 / 100 | 89.46 / 100 | 97.26 / 100 | 97.26 / 100 |
| Single column and single cell | 99.99 / 100 | 99.99 / 100 | 99.99 / 100 | 100 / 100 | 100 / 100 | 100 / 100 |
| Single column and two cells | 84.31 / 100 | 100 / 100 | 100 / 100 | 91.36 / 100 | 100 / 100 | 100 / 100 |
| Two columns and single cell | 84.11 / 100 | 100 / 100 | 100 / 100 | 91.46 / 100 | 100 / 100 | 100 / 100 |
| Single row and single column | 77.55 / 100 | 87.91 / 100 | 93.83 / 100 | 88.18 / 100 | 93.66 / 100 | 96.58 / 100 |
| Two rows and single column | 35.93 / 100 | 73.23 / 100 | 88.91 / 100 | 60.70 / 100 | 87.24 / 100 | 95.24 / 100 |
| Single row and two columns | 35.81 / 100 | 73.25 / 100 | 89.18 / 100 | 59.16 / 100 | 87.14 / 100 | 95.04 / 100 |
| Two rows and two columns | 7.43 / 100 | 54.01 / 100 | 83.81 / 100 | 24.24 / 100 | 74.38 / 100 | 92.42 / 100 |

As can be seen, the table presents a predictable trade-off between accuracy of diagnosis and test application time (measured in memory runs; see the “Memory runs” row of the table header). In particular, the increasing number of memory runs increases the diagnostic coverage as well. The best results are achieved for the largest partition groups. Predominantly, the coverage is complete. The proposed scheme always yields a solution that includes all columns and rows that host failing cells. It was meticulously verified during the experiments and is confirmed in Table V by the overwhelming presence of 100% numbers. A detailed analysis of the remaining test cases reveals that some diagnostic misses can be attributed to one of the following reasons.

1) Insufficient number of partition groups (mostly columns labeled 3 in the table). One may alleviate this drawback by simply collecting more signatures at the price of a longer test session.

2) Low diagnostic resolution due to a small compactor. It becomes apparent when looking for single cell failures. The diagnostic algorithm returns incorrect faulty sites that still belong to the same failing row/column as the actual faulty cell. As demonstrated earlier, a larger compactor can easily alleviate this problem.

3) Correlation between rows and columns (observed mainly when using the trellis selection). For larger memories this effect is negligible. For instance, there are 32 out of 1024 (≈3.1%) correlated rows and columns in the 128 kB memory, whereas a 2 MB array lowers this percentage to 1.6% (64 out of 4096).

It is also instructive to compare diagnostic times when employing the method proposed in this paper and some conventional techniques. In the rest of this section, we will delve into two of these techniques. According to the first one, each ROM address location is read and its content is dumped into an s-bit register, which is then immediately shifted out. This approach allows diagnosing any number of memory failures. The second method follows a binary search scheme. The ROM address space is divided in half, and the MBIST is run for both halves separately to collect two s-bit signatures. Once the failing half is determined, one can continue running MBIST for the corresponding sub-halves to match signatures again. Clearly, this technique allows for correct identification of single memory failures only.

Diagnostic time can be derived from the cycle time, memory size n (in terms of its words), and the number of cycles required to perform both read operations and serial download of the resultant signatures. The memory dump technique requires n read cycles and ns shift cycles to download the content of successive memory words. The binary search-based method proceeds as follows. First, it reads all n memory words and dumps two s-bit signatures. Next, it reads n/2 memory locations and again produces two signatures. This is roughly repeated $\log_2 n$ times. Hence, it takes $n + n/2 + n/4 + \ldots \approx 2n$ cycles to carry out the read operations, and additional $2s \log_2 n$ cycles to download all relevant signatures.

The approach presented in this paper reads all memory locations g times assuming that one targets at most g − 1 faults. Since test time in this case is memory-architecture dependent, we will assume the worst-case scenario where the number of rows is equal to the number of memory words n. Therefore, it requires ng cycles. In reality, as shown earlier in this paper, the presence of multiple-word rows may actually accelerate the diagnostic process. Moreover, since the number of partitions for each group is roughly equal to $2^h$, where $h = 0.5 \log_2 n$ (see Section III), this scheme produces approximately $g\sqrt{n}$ signatures, and it takes $sg\sqrt{n}$ cycles to shift them out. Let c denote the MBIST clock cycle, and assume that a shift clock used by a tester is r times slower than the MBIST clock. The three schemes discussed here would then have approximately the following diagnostic time:

1) the memory dump: $cn + rcns \approx rcns$;

2) the binary search: $2c(n + rs \log_2 n)$;

3) the new method: $cg(n + rs\sqrt{n})$.
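The three expressions can be evaluated directly. The sketch below is a plain transcription of the formulas with hypothetical function names; it uses n = 1024, s = 32, r = 10, and g = 4, the sample values adopted in the comparison that follows, and expresses all results in units of the MBIST clock cycle c.

```python
import math


def dump_time(n, s, r, c=1):
    """Memory dump: n read cycles plus n*s slow shift cycles (cn + rcns)."""
    return c * n + r * c * n * s


def binary_search_time(n, s, r, c=1):
    """Binary search: about 2n reads and 2*s*log2(n) slow shift cycles."""
    return 2 * c * (n + r * s * math.log2(n))


def proposed_time(n, s, r, g, c=1):
    """Proposed scheme: g memory runs and g*sqrt(n) signatures of s bits."""
    return c * g * (n + r * s * math.sqrt(n))


n, s, r, g = 1024, 32, 10, 4
print(r * n * s)                      # 327680, the approximation rcns
print(dump_time(n, s, r))             # 328704, the exact cn + rcns
print(binary_search_time(n, s, r))    # 8448.0
print(proposed_time(n, s, r, g))      # 45056.0
```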

Clearly, the binary search offers the shortest test time. Unfortunately, it will only work for single failures. Let us now assume that n = 1024, s = 32 (the signature size), and r = 10 (the ratio of BIST and ATE clocks).¹ Then in order to locate faults of multiplicity 3 (which implies g = 4), it will take 327 680 cycles c to diagnose the ROM by using the dump-based approach, whereas the method proposed in this paper requires only 45 056 such cycles, i.e., it is more than seven times faster. Interestingly, both techniques need a similar test time, if one wants to locate up to 28 different faults. For a 4096-word ROM and otherwise the same conditions, the new method would be almost 23 times faster.

¹ It is worth noting that typically the ratio of the speed at which BIST is run versus the speed at which data is shifted out to the ATE is very high. So interrupting BIST to dump the memory contents and to shift them out results in a much bigger overhead than that of calculating signatures for the entire algorithm and shifting them out at the very end of test. Even if one needs to run BIST multiple times (as proposed in this paper), the resultant overhead remains feasible as BIST clock speeds are orders of magnitude higher than shift speeds.

TABLE VI

Test Logic Area Overhead

| ROM size [kB] | 8 | 64 | 128 | 2048 |
|---|---|---|---|---|
| Architecture R × (M × B) | 256 × (4 × 64) | 1024 × (4 × 128) | 1024 × (32 × 32) | 4096 × (32 × 128) |
| n (see Fig. 4) | 4 | 5 | 5 | 6 |
| τ, see (2) | 4 | 8 | 1 | 2 |
| Combinational area [µm²] | 1084 | 2899 | 605 | 2708 |
| Non-combinational area [µm²] | 159 | 446 | 159 | 521 |
| Interconnections area [µm²] | 1473 | 4030 | 836 | 3837 |
| Total area [µm²] | 2716 | 7375 | 1600 | 7066 |
| ROM array area [µm²] | 82 575 | 660 602 | 1 321 205 | 21 139 292 |
| Percentage | 3.29 | 1.12 | 0.12 | 0.033 |

VII. Hardware Overhead

The silicon area of test logic amounts to a certain number of gates and flip-flops. The number of gates depends on the memory word size, which in turn affects the number of inputs of the signature register and the size of its XOR injection network. Furthermore, the number of gates is implied by the mux factor M, as the ratio of M and B determines the number c of columns to be observed at a time, and thereby gives the number of phase shifters, and thus XOR gates. Table VI provides the actual area costs computed with a commercial synthesis tool for four memory arrays of different capacity and architecture. All components of our test logic were synthesized using a 90 nm CMOS standard cell library under a 3.5 ns timing constraint. For the indicated memory sizes and the relevant parameters n and τ, the table reports the resultant silicon area with respect to combinational and non-combinational devices, as well as their interconnection network. The total area taken by the proposed test logic is subsequently compared with the corresponding ROM array area (determined based on independent data provided by two silicon manufacturers). The area occupied by test logic and expressed as a fraction of ROM area is reported in the last row of the table as a percentage. Clearly, the area overhead of the proposed diagnostic circuitry is an insignificant part of the entire real estate designated to host ROM arrays, their controllers, and a BIST infrastructure. In particular, a small amount of sequential circuitry is required in each test case. Consequently, the numbers of Table VI make the proposed diagnostic scheme very attractive as far as its silicon cost is concerned.
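The totals and percentages of Table VI follow from its area components by simple arithmetic, which the short check below reproduces. The dictionary layout and variable names are illustrative; the figures are those listed in the table.

```python
# ROM size [kB]: (combinational, non-combinational, interconnections, ROM array),
# all areas in square micrometers as listed in Table VI.
areas = {
    8:    (1084, 159, 1473, 82_575),
    64:   (2899, 446, 4030, 660_602),
    128:  (605, 159, 836, 1_321_205),
    2048: (2708, 521, 3837, 21_139_292),
}

overhead = {}
for size_kb, (comb, seq, wires, rom) in areas.items():
    total = comb + seq + wires                       # test logic area
    overhead[size_kb] = (total, 100 * total / rom)   # area and percentage of ROM array
    print(size_kb, total, round(100 * total / rom, 3))
```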

VIII. Conclusion

In this paper, we proposed a new fault diagnosis scheme for embedded read-only memories. It reduces the diagnostic data that needs to be scanned out during ROM test such that the minimum information to recover the failure data is preserved, and the time to unload the data is minimized.

The presented approach allows an uninterrupted collection and processing of test responses at the system speed. This has been achieved by using low-cost on-chip selection mechanisms, which are instrumental in very accurate and time-efficient identification of failing rows, columns, and single memory cells. In particular, the scheme employs the original designs of row and column selectors with phase shifters controlling the way the address space is traversed. Furthermore, the new combined selection logic allows the scheme to collect test results in parallel (leading to shorter test time) without compromising quality of diagnosis.

Results of experiments performed on several memory arrays for randomly generated failures clearly confirm high accuracy of diagnosis of the scheme provided the signature registers and the proposed selection logic are properly tuned to guarantee a desired diagnostic resolution.

Acknowledgment

The authors would like to acknowledge a private communication with F. Poehl of Infineon Technologies AG, Munich, Germany, concerning logic synthesis of semiconductor memories.

References

[1] R. D. Adams, High Performance Memory Testing: Design Principles, Fault Modeling and Self-Test. New York: Kluwer, 2003.

[2] D. Appello, V. Tancorre, P. Bernardi, M. Grosso, M. Rebaudengo, and M. Sonza Reorda, “Embedded memory diagnosis: An industrial workflow,” in Proc. ITC, 2006, paper 26.2.

[3] S. Barbagallo, A. Burri, D. Medina, P. Camurati, P. Prinetto, and M. Sonza Reorda, “An experimental comparison of different approaches to ROM BIST,” in Proc. Eur. Comput. Conf., 1991, pp. 567–571.

[4] I. Bayraktaroglu and A. Orailoglu, “The construction of optimal deterministic partitioning in scan-based BIST fault diagnosis: Mathematical foundations and cost-effective implementations,” IEEE Trans. Comput., vol. 54, no. 1, pp. 61–75, Jan. 2005.

[5] T. J. Bergfeld, D. Niggemeyer, and E. M. Rudnick, “Diagnostic testing of embedded memories using BIST,” in Proc. DATE, 2000, pp. 305–309.

[6] T. Boehler and G. Lehmann, “Using data compression for faster testing of embedded memory,” U.S. Patent 6 950 971, Sep. 27, 2005.

[7] J. T. Chen and J. Rajski, “Method and apparatus for diagnosing memory using self-testing circuits,” U.S. Patent 6 421 794, Jul. 16, 2002.

[8] J. T. Chen, J. Rajski, J. Khare, O. Kebichi, and W. Maly, “Enabling embedded memory diagnosis via test response compression,” in Proc. VTS, 2001, pp. 292–298.

[9] J. T. Chen, J. Khare, K. Walker, S. Shaikh, J. Rajski, and W. Maly, “Test response compression and bitmap encoding for embedded memories in manufacturing process monitoring,” in Proc. ITC, 2001, pp. 258–267.

[10] D. W. Clark and L.-J. Weng, “Maximal and near-maximal shift register sequences: Efficient event counters and easy discrete logarithms,” IEEE Trans. Comput., vol. 43, no. 5, pp. 560–568, May 1994.

[11] X. Du, N. Mukherjee, W.-T. Cheng, and S. M. Reddy, “Full-speed field-programmable memory BIST architecture,” in Proc. ITC, 2005, paper 45.3.

[12] D. Gizopoulos, Ed., Advances in Electronic Testing: Challenges and Methodologies. Dordrecht, The Netherlands: Springer, 2006.

[13] International Technology Roadmap for Semiconductors. (2009) [Online]. Available: www.itrs.net

[14] Y.-H. Lee, Y.-G. Jan, J.-J. Shen, S.-W. Tzeng, M.-H. Chuang, and J.-Y. Lin, “A DFT architecture for a dynamic fault model of the embedded mask ROM of SoC,” in Proc. Int. Workshop Memory Technol. Design Testing, 2005, pp. 78–82.

[15] J.-F. Li and C.-W. Wu, “Memory fault diagnosis by syndrome compression,” in Proc. DATE, 2001, pp. 97–101.

[16] G. Mrugalski, J. Rajski, and J. Tyszer, “Ring generators: New devices for embedded deterministic test,” IEEE Trans. Comput.-Aided Design, vol. 23, no. 9, pp. 1306–1320, Sep. 2004.

[17] N. Mukherjee, A. Pogiel, J. Rajski, and J. Tyszer, “High throughput diagnosis via compression of failure data in embedded memory BIST,” in Proc. ITC, 2008, paper 3.1.


[18] N. Mukherjee, A. Pogiel, J. Rajski, and J. Tyszer, “Fault diagnosis for embedded read-only memories,” in Proc. ITC, 2009, paper 7.1.

[19] P. Nagvajara and M. G. Karpovsky, “Built-in self-diagnostic read-only-memories,” in Proc. ITC, 1991, pp. 695–703.

[20] D. Niggemeyer and E. M. Rudnick, “Automatic generation of diagnostic memory tests based on fault decomposition and output tracing,” IEEE Trans. Comput., vol. 53, no. 9, pp. 1134–1146, Sep. 2004.

[21] J. Rajski and J. Tyszer, “Diagnosis of scan cells in BIST environment,” IEEE Trans. Comput., vol. 48, no. 7, pp. 724–731, Jul. 1999.

[22] J. Rajski, N. Tamarapalli, and J. Tyszer, “Automated synthesis of phase shifters for built-in self-test applications,” IEEE Trans. Comput.-Aided Design, vol. 19, no. 10, pp. 1175–1188, Oct. 2000.

[23] J. Rajski, N. Mukherjee, J. Tyszer, and A. Pogiel, “Fault diagnosis in memory BIST environment,” U.S. Patent Applicat. 20110055646, Sep. 18, 2008.

[24] C. Selva, R. Zappa, D. Rimondi, C. Torelli, and G. Mastrodomenico, “Built-in self diagnosis device for a random access memory and method of diagnosing a random access memory,” U.S. Patent 7 571 367, Aug. 4, 2009.

[25] A. K. Sharma, Semiconductor Memories: Technology, Testing and Reliability. New York: Wiley, 2002.

[26] J. Volrath, K. White, and M. Eubanks, “On-chip circuits for high speed memory testing with a slow memory tester,” U.S. Patent 6 404 250, Jun. 11, 2002.

[27] L.-T. Wang, C.-W. Wu, and X. Wen, VLSI Test Principles and Architectures. Design for Testability. New York: Morgan Kaufmann Publishers, 2006.

[28] C.-W. Wu, R.-F. Huang, C.-L. Su, W.-C. Wu, Y.-J. Chang, K.-L. Luo, and S.-T. Lin, “Method and apparatus of build-in self-diagnosis and repair in a memory with syndrome identification,” U.S. Patent 7 228 468, Jun. 5, 2007.

[29] H. Yamauchi, “Semiconductor memory device for build-in fault diagnosis,” U.S. Patent Applicat. 20 050 262 422, Nov. 24, 2005.

Nilanjan Mukherjee (S’87–M’89) received the B.Tech. (Hons.) degree in electronics and electrical communication engineering from the Indian Institute of Technology Kharagpur, Kharagpur, India, in 1989, and the Ph.D. degree from McGill University, Montreal, QC, Canada, in 1996.

He is currently a Software Development Director of the Design-to-Silicon Division, Mentor Graphics Corporation, Wilsonville, OR. With Mentor Graphics Corporation, he was a co-inventor of the embedded deterministic test (EDT) technology and was a Lead Developer for the leading test compression tool in the industry, TestKompress. Prior to joining Mentor Graphics Corporation, he was with Lucent Bell, Holmdel, NJ, where he primarily contributed to the areas of logic built-in self-test, RTL testability analysis, path-delay testing, and on-line testing. He has published more than 45 technical articles in various IEEE journals and conferences. He is a co-inventor on 27 U.S. patents. He was an invited author for the special issue of the IEEE Communications Magazine in June 1999. His current research interests include developing next generation test methodologies for deep submicrometer designs, test data compression, test synthesis, memory testing, and fault diagnosis.

Dr. Mukherjee was the co-recipient of the Best Paper Award at the 2009 VLSI Design Conference, the Best Paper Award at the 1995 IEEE VLSI Test Symposium, and the Best Student Paper Award at the Asian Test Symposium in 2001. His paper in EDT at the International Test Conference in 2002 was recognized as one of the most significant papers of ITC published in the last 35 years. He received the prestigious 2006 IEEE Circuits and Systems Society Donald O. Pederson Outstanding Paper Award recognizing the paper in EDT published in the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. He served on the program committees of several IEEE conferences, including the Asian Test Symposium, the International Test Synthesis Workshop, the VLSI Design and Test Symposium, the Symposium on Design and Diagnostics of Electronics Circuits and Systems, and VLSI Design.

Artur Pogiel (M’09) received the M.S. degree in electrical engineering and the Ph.D. degree in telecommunications from the Poznań University of Technology, Poznań, Poland, in 2002 and 2008, respectively.

He was a Teaching Assistant with the Faculty of Electronics and Telecommunications, Poznań University of Technology, until 2008. He is currently a Software Development Engineer with Mentor Graphics Polska, Poznań. He has published 12 technical papers in various IEEE journals and conferences. He is a co-inventor on five U.S. patents. His main research interests include design for testability, built-in self-testing, embedded testing, and fault diagnosis.

Dr. Pogiel was the co-recipient of the Best Paper Award at the 2009 VLSI Design Conference.

Janusz Rajski (A’87–SM’10) received the M.Eng. degree in electrical engineering from the Technical University of Gdansk, Gdansk, Poland, in 1973, and the Ph.D. degree in electrical engineering from the Poznań University of Technology, Poznań, Poland, in 1982.

From 1973 to 1984, he was a Faculty Member with the Poznań University of Technology. In June 1984, he joined McGill University, Montreal, QC, Canada, where he became an Associate Professor in 1989. In January 1995, he became the Chief Scientist with Mentor Graphics Corporation, Wilsonville, OR. His main research interests include design automation and testing of very large scale integration systems, design for testability, built-in self-test, and logic synthesis. He has published more than 150 research papers in these areas and is a co-inventor on 73 U.S. and international patents. He is the principal inventor of the embedded deterministic test technology used in the first commercial test compression product, TestKompress. He is the co-author of Arithmetic Built-In Self-Test for Embedded Systems (Englewood Cliffs, NJ: Prentice-Hall, 1997).

Dr. Rajski was the co-recipient of the 1993 Best Paper Award for the paper in logic synthesis published in the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, the co-recipient of the 1995 and 1998 Best Paper Awards at the IEEE VLSI Test Symposium, the co-recipient of the 1999 and 2003 Honorable Mention Awards at the IEEE International Test Conference, as well as the co-recipient of the 2006 IEEE Circuits and Systems Society Donald O. Pederson Outstanding Paper Award recognizing the paper in embedded deterministic test published in the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, and the 2009 Best Paper Award at the VLSI Design Conference. He was the Guest Co-Editor of the June 1990 and January 1992 special issues of the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems devoted to the 1987 and 1989 International Test Conferences, respectively. In 1999, he was a Guest Co-Editor of the special issue of the IEEE Communications Magazine devoted to testing of telecommunication hardware. He was the Associate Editor for the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, the IEEE Transactions on Computers, and the IEEE Design and Test of Computers Magazine. He has served on technical program committees of various conferences, including the IEEE International Test Conference and the IEEE VLSI Test Symposium.

Jerzy Tyszer (M’91–SM’96) received the M.Eng. (Hons.) degree in electrical engineering from the Poznań University of Technology, Poznań, Poland, in 1981, the Ph.D. degree in electrical engineering from the Poznań University of Technology in 1987, and the Dr.Hab. degree in telecommunications from the Technical University of Gdansk, Gdansk, Poland, in 1994.

From 1982 to 1990, he was a Faculty Member with the Poznań University of Technology. In January 1990, he joined McGill University, Montreal, QC, Canada, where he was a Research Associate and Adjunct Professor. In 1996, he became a Professor with the Faculty of Electronics and Telecommunications, Poznań University of Technology. He has published eight books, more than 100 research papers in his areas of expertise, and is a co-inventor on 53 U.S. and international patents. He is the co-author of Arithmetic Built-In Self-Test for Embedded Systems (Englewood Cliffs, NJ: Prentice-Hall, 1997), and is the author of Object-Oriented Computer Simulation of Discrete Event Systems (Boston, MA: Kluwer, 1999). His main research interests include design automation and testing of very large scale integration (VLSI) systems, design for testability, built-in self-testing, embedded testing, and computer simulation of discrete event systems.

Dr. Tyszer was the co-recipient of the 1995 and 1998 Best Paper Awards at the IEEE VLSI Test Symposium, the 2003 Honorable Mention Award at the IEEE International Test Conference, the 2006 IEEE Circuits and Systems Society Donald O. Pederson Outstanding Paper Award recognizing the paper in embedded deterministic test published in the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, and the 2009 Best Paper Award at the VLSI Design Conference. In 1999, he was a Guest Co-Editor of the special issue of the IEEE Communications Magazine devoted to testing of telecommunication hardware. He has served on technical program committees of various conferences, including the IEEE International Test Conference, the IEEE VLSI Test Symposium, and the IEEE European Test Symposium.