
https://doi.org/10.1007/s10836-019-05825-9

RSBST: an Accelerated Automated Software-Based Self-Test Synthesis for Processor Testing

Vasudevan Madampu Suryasarman1 · Santosh Biswas1 · Aryabartta Sahu1

Received: 12 May 2019 / Accepted: 6 September 2019
© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract Software-based self-test (SBST) techniques are increasingly being used for testing of modern processors because of the ease of synthesis using evolutionary approaches, coverage for difficult to test faults, non-intrusive nature, low hardware overhead, etc. However, the test synthesis time required by SBST is high. In this paper, an advanced SBST technique, termed as Rapid SBST (RSBST), is proposed that reduces the overall test synthesis time by reusing the simulation responses of existing test programs of identical observability. The test codes, developed using the evolutionary process, that produce similar fault simulation results are reused for the fault evaluation. We exploit this reusability to enhance the speed of the test synthesis process. The efficacy of the proposed scheme is demonstrated on a 32-bit MIPS processor and on a minimal configuration of 7-stage SPARC V8 Leon3 soft processor. The traditional SBST synthesis requires 122 hours for the MIPS processor and 142 hours for the Leon3 processor to develop test program sets that cover 93.9% and 92.9% of the behavioral-level faults of these processors, respectively. An existing enhanced greedy-cover method, that also detects the hard-to-test faults, improves the coverage towards 96.3% for the MIPS processor and 95.8% for the Leon3 processor, but this slower test development consumes 168 hours and 172 hours, respectively. In the proposed RSBST scheme, the synthesized test codes achieve an adequate fault coverage of 96.1% for the MIPS processor and 95.5% for the Leon3 processor. This accelerated test pattern generation takes 90 hours and 98 hours for these two processors. So it may be concluded that the proposed RSBST technique speeds up the traditional SBST synthesis by a factor of 1.35 while maintaining the fault coverage at or above 95.5%. To validate the test quality evaluation using behavioral fault models, a strong correlation (94.8%) between the behavioral faults and gate-level faults of MIPS processor is demonstrated and verified for the proposed RSBST scheme. Also, the simulation responses of the test programs synthesized by the RSBST scheme consume only 14.25% of storage space when compared with the storage consumption of the actual simulation used by the existing test code generation methods.

Keywords Software-based self-test (SBST) · Test generation · Evolutionary strategies (ES) · Observability · Behavioral fault simulation · Greedy cover

1 Introduction

As the processor technology is complex and expanding, the reliability of embedded processors is highly critical during the phases of chip manufacturing and operation. Traditionally, ATPG methods [28, 37] were used to generate the test patterns using fault sensitization techniques that were applied on the processor using an external automatic test equipment (ATE). Later, several design for testability (DFT) techniques [2, 13] were introduced which exploit design-level alterations to achieve high quality test patterns. However, the transfer of the processor testing approach from external testing to an internal built-in self-test (BIST) mechanism [25, 35] led to a significant reduction in the cost of test generation and application.

Responsible Editor: R. A. Parekhji

Corresponding author: Santosh Biswas

1 Department of Computer Science and Engineering, IIT, Guwahati, India

The advancements in manual test program generation approaches [11, 15, 21, 29] have substantially contributed to developing effective test programs. In these works, complex functional test patterns are developed for testing pipelined processors with multithreading, dynamic instruction execution, and multicores. Nonetheless, the cost of test pattern development is a tradeoff because the assembly programmer has to devise complicated, high-coverage test programs for larger processors manually.

Published online: 9 October 2019

Journal of Electronic Testing (2019) 35:695–714

http://orcid.org/0000-0003-3020-4154

Psarakis et al. [29] discuss the taxonomy of various structural and functional test generation approaches. To generate test programs, structural testing approaches use structural information, such as RTL descriptions, and functional approaches use functional information, such as the ISA. The hierarchical structural approaches, where test programs are generated module-by-module, can be automated using powerful formal verification engines, such as bounded model-checking or satisfiability-based methods [30, 31, 41]. But these formal techniques are computationally prohibitive for complex processor circuits. Constraint-based structural test generation methods [5, 40] consider the gate-level or the structural details of the module under test (MUT) whereas the remaining processor modules are considered at a higher level. These methods extract the constraints imposed by the execution of the instructions on the MUT and develop efficient test patterns for the MUT based on these extracted constraints. Finally, these module-level patterns are translated into software-based self-test (SBST) programs.

Other prevalent structural testing approaches are ATPG, pseudorandom test generation, and deterministic testing approaches. ATPG techniques [28, 37] develop the test stimuli with the help of the gate-level netlist of the processor. Pseudorandom pattern generators [19, 39] could be used for BIST to generate random but efficient patterns with a low area overhead. Deterministic testing approaches [15, 21, 27] generate test sets corresponding to the operations and functionalities of each processor module.

There are two types of functional testing techniques: 1) randomizer and 2) feedback-based techniques. Code randomizer techniques [1, 26, 33] target the functional faults using a random sequence generator. The evolutionary technique [7, 8, 12] is a feedback-based functional testing approach, where test patterns are evolved automatically using genetic algorithmic strategies. This population-based optimizer refines a set of test program solutions iteratively to develop high-quality test programs. In our approach, we have chosen the evolutionary technique for our test synthesis because this approach naturally develops smaller but efficient test patterns with lower computation cost.

Identification of all physical faults has always been challenging because the test patterns must be applied at the operational frequencies of processors, which are extremely high. Self-testing reduces yield loss with the help of actual at-speed testing while the overall test cost of the processor is lower. This at-speed testing feature is very difficult to achieve with external tester technologies as the ATE frequencies cannot reach the processor frequencies [14]. The use of hardware-based or software-based self-testing shortens the design cycle and, therefore, a better time-to-market is achieved. The Intellectual Property (IP) protection is also improved when compared with that of the scan-based DFT techniques.

In hardware-based self-testing, also termed as BIST, a dedicated hardware module is attached to the processor for testing. This module generates the test patterns and applies them to the module under test (MUT). Eventually, the responses are collected and delivered to another circuit, which does the response analysis. An apparent drawback of this approach is the hardware overhead spent for the additional testing circuit. Also, during the hardware-based self-testing, power consumption is more than that of the normal operational mode of the chip. To solve this, SBST methodologies [4, 5, 22, 27] have cultivated software-based test codes to be applied on the processors as test routines. These test codes are sequences of instructions with selected operands that could validate the processor functionality. The SBST approaches are non-intrusive because the chip design does not necessitate any modification for testing. These light-weight test codes are uploaded into the memory locations and the responses are downloaded and compared for the fault identification. Furthermore, SBST does not require any extra hardware, which leads to a reduced test cost and zero chip area penalty [21]. For these reasons, SBST is extensively used for embedded processor testing.

In [9], Corno et al. describe microGP (μGP), an evolutionary approach to automatically synthesize assembly code programs for target microprocessors. In the μGP evolutionary strategy, new individual assembly program solutions are generated which could be employed as SBST test programs. These new test programs are combined with existing parent test program solutions for breeding a new generation of the population of test program solutions. Also, the self-adaptive architecture of μGP [34] searches for enhanced test programs. The earlier μGP approaches [9, 10, 32, 34] employ statement, toggle, branch, expression, and condition coverages as the code coverage metric for the test program evaluation. However, the fault coverages of the synthesized test codes were inadequate.

Suriasarman et al. [36] have proposed a greedy framework for the self-adaptive μGP approach [34] of test synthesis to discover the hard-to-test faults of the processor. In their approach, a greedy component is integrated into the μGP-based SBST synthesis of processor cores along with a testability analysis feature. Eventually, 40% of the hard-to-test faults were traced and identified by Suriasarman et al. [36]. Nonetheless, the synthesis of high-coverage test programs consumed an undesirable amount of time.

This paper discusses a rapid software self-test technique, termed as rapid SBST (RSBST), where the test synthesis is faster compared to that of the greedy-based μGP approaches [36] and the conventional μGP approaches [9, 10, 32, 34] and, at the same time, does not drop the fault coverage. This faster test synthesis is realized by integrating the reusability of fault simulation results in the existing greedy-based μGP framework [36]. In RSBST, redundant test solutions are identified and their simulation results are reused for a faster fault coverage evaluation. This reusability scheme could expedite the greedy-based μGP evolutionary test synthesis which covers many hard-to-test faults.

Many of the test solutions developed using the evolutionary process could produce similar fault simulation results. RSBST reuses these fault simulation results to evaluate the fault coverage of test programs with similar characteristics of fault identification. This substitution could remarkably reduce the test synthesis time without compromising the fault coverage. So, the contributions of this paper are:

• In RSBST, a substantial reduction in the overall test synthesis time is demonstrated by reusing the simulation responses of existing test programs of identical observability.

• To maintain the fault coverage, RSBST is built on the existing greedy-based μGP framework [36] which greedily covers the hard-to-test faults, and achieves a high fault coverage.

• Different module-by-module test programs were synthesized for a 32-bit MIPS processor and monolithic test programs were synthesized for a Leon3 processor using a conventional μGP method [34], a greedy-based μGP method [36], and the proposed RSBST technique. From the results, it is observed that the test synthesis is noticeably expedited with the introduction of the proposed RSBST approach which also maintains adequate fault coverage.

• In this paper, we have demonstrated the close correlation (of 94.8%) between high-level behavioral faults and gate-level faults on a MIPS processor, which ensured the test quality for the proposed faster evolutionary test synthesis. A strong correlation of behavioral faults with gate-level faults in processors yields a high effective test quality (gate-level coverage > 91%) for the test codes synthesized using the proposed RSBST scheme.

• A storage analysis and comparison of the proposed RSBST method and the existing μGP techniques are conducted. This study shows that the simulation response of RSBST programs only consumes 14.25% of storage space when compared with the actual storage space consumption in the conventional μGP methods.

To summarize, we propose RSBST, which is an effective test synthesis, where hard-to-test faults are identified and the reusability of test programs helps in the reduction of test synthesis time. These hard-to-test faults are detected with the help of a greedy-cover based evolutionary test synthesis. The test synthesis time is reduced by reusing the simulation responses of equally-observable, existing test programs. In the next section, we discuss the significant advances in test synthesis to enhance the quality of SBST programs.

This paper is organized as follows: Section 2 analyzes and compares the existing SBST synthesis strategies. The overall scheme of the proposed RSBST is discussed in Section 3. This section has three subsections. In Section 3.1, the preliminaries required for manual and automated SBST test program synthesis procedures are defined. The proposed observability based reusability for the simulation responses is demonstrated in Section 3.2. Section 3.3 focuses on the algorithm of RSBST test synthesis and its components. In Section 4, we analyze the experimental results to assess the performance improvement of RSBST. Finally, Section 5 concludes the paper and gives further research directions.

2 Related Works in SBST Synthesis

In Table 1, different strategies for SBST test program synthesis are illustrated. In [34], G. Squillero illustrates μGP, an evolutionary framework specifically designed for developing the assembly code test program for microprocessors. Unlike the earlier evolutionary approaches, this μGP technique focuses on self-adaptive population-based search which tunes the search process internally. To validate the efficacy of self-adaptive μGP [34], a 5-stage pipelined DLX microprocessor with 79 instructions is tested using the test programs synthesized with the help of self-adaptive μGP [34]. These test programs are observed to be the compositions of instruction sets that yield high RTL statement coverage.

In [32], the existing test program solutions of previous generations are analyzed and assimilated with the new population for faster convergence of test synthesis with fewer generations. In addition to statement coverage, toggle, expression, branch, and condition coverages are also heuristically measured for the test program evaluation. Among these code coverage metrics, statement and branch coverages were effective (more than 90%) whereas the remaining coverages were discarded (less than 80% on the average).

Previous μGP techniques, proposed in [9, 10, 32, 34], could not guarantee the test quality because the hard-to-test faults were left undetected and the code coverage-based fault evaluation metrics do not hold strict correlation with gate-level fault models. These methods could not realize a gate-level fault coverage of more than 90% for the synthesized test programs. Although these μGP experiments converge in tens of hours [32, 34], which is reasonably fast, they lack adequate fault coverage.


Table 1 Analysis and comparison of several SBST synthesis strategies

Framework: G. Squillero [34]
Test synthesis approach: Self-adaptive μGP architecture
Test program evaluation framework: RTL statement coverage
Fault coverage & time for test synthesis: 100% statement coverage in 24 hours
Advantages and disadvantages: Pros: Fine-tuned evolutionary test synthesis. Cons: Test evaluation method is inefficient.

Framework: Sanchez et al. [32]
Test synthesis approach: Previously devised test programs are included and also assimilated in the new population and employed as parent test solutions to cultivate new test programs
Test program evaluation framework: Statement, Branch, Condition, Expression, and Toggle coverages
Fault coverage & time for test synthesis: Statement coverage = 99.7%; Branch coverage = 99.1%; Condition coverage = 80.3%; Expression coverage = 53.7%; Toggle coverage = 85.3%; CPU time = 212 hours
Advantages and disadvantages: Pros: Test solutions are reused to improve test solution search. Cons: Low code coverage.

Framework: Kranitis et al. [20]
Test synthesis approach: A hybrid SBST (H-SBST) methodology that combines low-cost structural SBST and high coverage RTPG
Test program evaluation framework: Statement, Branch, Condition, and Toggle coverages
Fault coverage & time for test synthesis: 92.5% overall code coverage in 646,185 cycles
Advantages and disadvantages: Pros: Reduced test synthesis time. Cons: Inadequate coverage.

Framework: Lu et al. [24]
Test synthesis approach: Combines deterministically developed test programs for each processor module and randomly generated instruction sequences for self-testing pipeline processors. Both can effectively compensate each other for fault detection.
Test program evaluation framework: Gate-level single stuck-at fault model
Fault coverage & time for test synthesis: Obtains a full processor fault coverage (> 98%) close to that of a full scan chain
Advantages and disadvantages: Pros: Improved coverage. Cons: Difficult and laborious test evaluation.

Framework: Suriasarman et al. [36]
Test synthesis approach: A greedy-based objective function that synthesizes test solutions that could detect hard-to-test faults
Test program evaluation framework: Behavioral fault coverage
Fault coverage & time for test synthesis: 96.32% of the testable behavioral faults of MIPS processor in 168 hours and 95.8% of the testable behavioral faults of Leon3 processor in 172 hours, with the detection of 40% of the hard-to-test faults
Advantages and disadvantages: Pros: Improved coverage. High correlation with gate-level faults. Cons: Longer test synthesis.

Framework: Proposed RSBST
Test synthesis approach: An objective function that could evaluate the test solutions effectively
Test program evaluation framework: Behavioral fault coverage
Fault coverage & time for test synthesis: 96.1% of the testable behavioral faults of the MIPS processor in 90 hours and 95.5% of the testable behavioral faults of the Leon3 processor in 98 hours
Advantages and disadvantages: Pros: Faster test generation with adequate coverage.

In [20], Kranitis et al. introduced a hybrid SBST (H-SBST) methodology for low-cost development of high-quality test programs. H-SBST has three phases. In the first phase, the modules-under-test (MUTs) are identified. During the second phase, MUTs are classified as functional and control components and are ranked based on their testability. Later, a combined test development strategy of structural SBST methodologies and random test program generation (RTPG) is discussed in the third phase. After applying the structural testing for the MUTs, RTPG is applied as the supplementary step to improve the fault coverage. In a case study of test development for the OpenRISC 1200 processor, an inadequate code coverage of 92.5% is achieved while consuming a low test execution time.

On the other hand, the recent SBST automation approaches [16, 20, 36] have improved in terms of coverage but rely on time-consuming test generation techniques. Lu et al. [24] have developed a hybrid test program that combines deterministic test programs for each module and randomly developed instruction sequences for a high-performance self-testing of pipeline cores. The experiments on ARMv4 and miniMIPS processors demonstrate an improved gate-level fault coverage of more than 98% for this hybrid test generation method, which is close to the coverage achieved for a full scan chain-based technique. But the consideration of exhaustive gate-level fault models leads to a longer test synthesis for real-world, complex, pipelined processors.

In the greedy-based framework of μGP, proposed by Suriasarman et al. [36], the test quality is evaluated using a behavioral-level fault model. Eventually, the synthesized test solutions could trace 95.8% of the testable behavioral faults of a Leon3 processor and 96.32% of the testable behavioral faults of a MIPS processor. Although the greedy component improves the test quality, this test generation procedure [36] would have to bear approximately 170 hours of fault evaluation and comparison. So, the test synthesis has a slower convergence because the evolutionary module focuses on the comprehensive search for the hard-to-test faults. To address this, in this paper we have extended the reusability scheme for faster test synthesis proposed in [38]. This extension introduces a detailed study of the reusability of simulation responses, a correlation analysis, and a storage analysis.

In this work, an accelerated greedy-based evolutionary test synthesis (RSBST) is developed for validating processor functionalities. To accelerate the test synthesis, the RSBST technique reuses the test responses of existing identical test programs. The greedy technique enhances the test solution search and improves the fault coverage but consumes huge time for test synthesis. RSBST ensures a faster convergence of evolutionary test synthesis while maintaining adequate fault coverage.


3 Proposed Framework of Rapid SBST (RSBST) Test Synthesis

As suggested in traditional SBST synthesis [9, 10, 32, 34], the automation of test programs for processors is performed with the help of an evolutionary approach that develops various test solutions using genetic operators. Each test solution is evaluated in terms of fault coverage for the selection of fittest programs. In the proposed approach, an enhanced greedy-cover based test synthesis, termed as RSBST, is used for faster cultivation of test programs with a sequence of instructions that could also trace and detect the hard-to-test faults.

Faulty and non-faulty processor models are simulated for tracing the manufacturing and online faults of the processor. The responses of these simulations would comprise the contents of the observable locations (registers, memory updates, primary output, etc.). Further, the contents of the observable destinations of faulty processor models are compared with the expected response generated by the good processor model to realize hardware fault detection. In the SBST approach, the test quality is evaluated by the fault coverage and fault list extracted from the simulation responses, i.e., the contents of the observable locations. Intuitively, the fault coverage and fault list of an SBST test program are completely associated with the observability of the processor modules.

As shown in Fig. 1, the processor is simulated for a good reference model and N faulty models. Each faulty processor model is inserted with a single fault which logically represents the physical faults. All of these faulty models must be simulated independently to collect the test responses. Later, these responses are compared with the golden responses of the good processor model to assess the fault coverage. As each single fault simulation is time-consuming, fault simulation of all faulty processor models would consume an enormous amount of time. Also, the response collection of test programs, where every observable point must be recorded in each cycle, is computationally intensive. So, the simulation of faulty processor models and the successive response collection and comparison are exceedingly time-consuming.

Fig. 1 Fault simulation overview
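The comparison of faulty responses against the golden response can be illustrated with a minimal Python sketch. It is not the fault simulator used in this work: `toy_simulate`, the fault encoding `("sa0", k)` / `("sa1", k)`, and the `FAULTS` list are hypothetical stand-ins for the processor models, chosen only so that the example is self-contained.

```python
from typing import Callable, Hashable, Optional, Sequence, Set, Tuple

Response = Tuple[int, ...]  # contents of the observable locations
Simulator = Callable[[Sequence[int], Optional[Hashable]], Response]


def fault_coverage(simulate: Simulator,
                   test_program: Sequence[int],
                   faults: Sequence[Hashable]) -> Tuple[float, Set[Hashable]]:
    """Run one good simulation and one simulation per single fault."""
    golden = simulate(test_program, None)       # good reference model
    detected: Set[Hashable] = set()
    for fault in faults:                        # N independent faulty models
        if simulate(test_program, fault) != golden:
            detected.add(fault)                 # response differs: fault is observed
    coverage = len(detected) / len(faults) if faults else 0.0
    return coverage, detected


def toy_simulate(test_program, fault):
    """Hypothetical model: every operand is copied to an observable output;
    a fault ("sa0", k) or ("sa1", k) sticks output bit k at 0 or 1."""
    outputs = []
    for value in test_program:
        if fault is not None:
            kind, bit = fault
            value = value & ~(1 << bit) if kind == "sa0" else value | (1 << bit)
        outputs.append(value)
    return tuple(outputs)


# Assumed fault list: stuck-at-0 and stuck-at-1 on output bits 0 and 1.
FAULTS = [("sa0", 0), ("sa1", 0), ("sa0", 1), ("sa1", 1)]

weak_coverage, weak_detected = fault_coverage(toy_simulate, [0b01], FAULTS)
full_coverage, _ = fault_coverage(toy_simulate, [0b01, 0b10], FAULTS)
```

The cost structure is visible in the loop: one call to the simulator per fault, which is why reducing the number of such evaluations matters for the overall synthesis time.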

In the proposed RSBST technique, we reuse the simulation responses for equally-observable test programs to reduce the overall test generation time. The overall approach of the RSBST automation scheme for a processor is shown in Fig. 2. In this scheme, an evolutionary test generator rapidly develops test solutions of optimum fault coverage exploiting an initial population of test programs, represented as directed acyclic graphs (DAGs), and an instruction library.

Formerly, all test programs were evaluated using an external evaluator as shown in Fig. 1. To reduce the cost of this external fault evaluation, we have introduced an observability comparator, which compares and identifies the test programs with similar observability. This could effectively reduce the number of fault simulations as the simulation responses of the parent test programs, stored in a database of simulation responses, could be reused for the offspring test programs with equal observability. The database of simulation responses stores the observability values, fault coverage, and the fault list of the parent chromosomes. This database is updated with the simulation responses after each external simulation.

The observability comparator calculates the contents of the observable destinations of a test program. To conduct this calculation, the observability comparator makes use of a high-level logic simulation technique which could rapidly evaluate test programs. The contents of the observable locations, which are the simulation responses obtained using the high-level logic simulation, are contrasted with the observability values of the parent chromosomes to check for the scope of reusability of the test program. If an offspring test solution and one of its parent test solutions have identical observability values, the fault simulation could be avoided for the offspring test solution.

The fault coverage of a test program, synthesized by the evolutionary core, is evaluated using either path ① with bold lines or path ② with dashed lines, as shown in Fig. 2. Path ① denotes the reuse of simulation responses using a rapid test evaluation method with a high-level response collection and comparison whereas path ② denotes the regular, external evaluation of the test program. If the simulation responses of a parent chromosome could be reused, external evaluation (path ②) is avoided for the test programs.

Fig. 2 RSBST automation scheme

Initially, when a test program is synthesized by the evolutionary core, path ① is selected for the rapid test evaluation. In path ①, the observability comparator conducts a rapid high-level logic simulation for the test program and collects the contents of the observable locations. If these observability values of the test program and one of its parent test programs are identical, the simulation responses (fault coverage, fault list, etc.) of that parent test program are reused for the offspring test program. Later, these responses are delivered to the evolutionary test generator.

If the observability values of the test program are not identical to those of any of its parent test programs, path ② is chosen. In path ②, the external evaluator conducts a time-consuming, detailed fault simulation, as shown in Fig. 1, for the test program and delivers the simulation responses to the evolutionary test generator. Later, the database of the simulation responses is updated with these simulation responses. So, in RSBST, high-level processor descriptions and high-level simulations are used for rapid test evaluation (path ①) whereas detailed HDL descriptions of the processor and time-consuming HDL simulations are used for the external evaluation of test programs (path ②). In the next section, the conventional framework for building SBST test programs with the help of a high-level fault modeling approach is discussed.
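The choice between the two evaluation paths can be summarized in a short Python sketch. This is an illustration under stated assumptions, not the RSBST implementation: the names `ResponseDatabase`, `SimulationResponse`, and `evaluate`, the use of string program identifiers, and the toy simulators are all hypothetical. Recording a reused response under the offspring's own identifier is also an assumption, made so that the offspring can later serve as a parent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Hashable, Optional, Sequence, Tuple

Observability = Tuple[int, ...]  # contents of the observable locations


@dataclass(frozen=True)
class SimulationResponse:
    fault_coverage: float
    fault_list: FrozenSet[Hashable]


class ResponseDatabase:
    """Maps a program id to its observability values and simulation response."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Observability, SimulationResponse]] = {}

    def store(self, program_id: str, observability: Observability,
              response: SimulationResponse) -> None:
        self._entries[program_id] = (observability, response)

    def lookup(self, parent_ids: Sequence[str],
               observability: Observability) -> Optional[SimulationResponse]:
        for parent_id in parent_ids:
            entry = self._entries.get(parent_id)
            if entry is not None and entry[0] == observability:
                return entry[1]          # a parent with identical observability
        return None


def evaluate(program_id: str, program: Sequence[int], parent_ids: Sequence[str],
             database: ResponseDatabase,
             high_level_sim: Callable[[Sequence[int]], Observability],
             external_fault_sim: Callable[[Sequence[int]], SimulationResponse]
             ) -> Tuple[SimulationResponse, int]:
    """Return the simulation response and the path (1 or 2) that produced it."""
    observability = high_level_sim(program)            # rapid high-level simulation
    reused = database.lookup(parent_ids, observability)
    if reused is not None:                             # path 1: reuse parent response
        database.store(program_id, observability, reused)  # assumption, see lead-in
        return reused, 1
    response = external_fault_sim(program)             # path 2: detailed fault simulation
    database.store(program_id, observability, response)
    return response, 2


external_calls = []


def toy_high_level_sim(program):
    # Hypothetical observability: final value of one accumulator register.
    return (sum(program),)


def toy_external_fault_sim(program):
    external_calls.append(tuple(program))              # count the expensive runs
    return SimulationResponse(0.5, frozenset({"fault_a"}))


db = ResponseDatabase()
_, parent_path = evaluate("p0", [1, 2], [], db, toy_high_level_sim, toy_external_fault_sim)
child_response, child_path = evaluate("c0", [2, 1], ["p0"], db,
                                      toy_high_level_sim, toy_external_fault_sim)
_, other_path = evaluate("c1", [5], ["p0"], db, toy_high_level_sim, toy_external_fault_sim)
```

In this toy run, the offspring `c0` shares its parent's observability values and skips the external simulation, whereas `c1` does not and is simulated in full.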

3.1 Preliminaries of SBST Code Development

The overall SBST code development procedure includes three phases: A) Information extraction, B) Processor component classification and test prioritization, and C) Test program synthesis. In phase A, the ISA information and RTL information of the processor are used to identify the processor components, the component operations, etc. With the help of the information from phase A, the processor components are classified in phase B as functional, control, and hidden components.

Functional components could be either computational functional modules, such as Arithmetic Logic Unit (ALU), adder, multiplier, etc. or storage functional modules, such as accumulator, register file, etc. Major control components, such as the control unit, generate the control signals for the functional components of the processor and thereby control the data flow and instruction flow. The hidden components, such as pipelining, increase the throughput of instruction execution but are functionally invisible.

These components are prioritized based on accessibility and testability to enhance the test development phase (phase C). The functional components are assigned a higher priority for test development as their operations are directly associated with instruction execution. In phase C, self-test codes are synthesized for each component as a module-under-test (MUT), based on the priority. The self-test synthesis is initially conducted for high-priority components because adequate coverage has to be achieved as quickly as possible.

For the self-test synthesis, we gather the ISA information, component information and test-priority of components from phase A, and the know-how of a low-cost test program evaluation. In fact, a high-level fault modeling approach that makes use of RTL information could reduce the cost of test program evaluation. In the next subsection, we discuss how efficient test programs could be synthesized using ISA information and RTL-level fault models.

3.1.1 Foundations of Test Program Synthesis

The test program synthesis uses the instruction library of the processor to constitute the test program. The instruction library comprises all the instructions of the ISA of the processor. Each entry of the instruction library is called a macro, which is an instruction with randomly selected operands. The instruction library may also contain multiple valid macros of the same instruction.

A sample macro for store instruction is shown in Fig. 3. Here, the store instruction is encoded between two registers and a 16-bit constant. The first and third parameters ($1 and $3) denote the two registers, each of them chosen from R1 and R2. The second parameter ($2) is the 16-bit constant, which may have a value between -128 and 127. Individual test solutions, which are self-test programs, such as the assembly code in Fig. 4, are valid sequences of macros.
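To make the notion of a macro concrete, the following Python sketch instantiates a store macro with randomly selected operands. The template string `"sw $1, $2($3)"`, the `Macro` class, and the `STORE` entry are hypothetical; only the parameter ranges (registers R1 and R2, constant between -128 and 127) are taken from the description above.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Macro:
    """An instruction template whose parameters are chosen at random."""
    template: str
    parameters: Dict[str, Callable[[random.Random], str]]

    def instantiate(self, rng: random.Random) -> str:
        line = self.template
        for placeholder, choose in self.parameters.items():
            line = line.replace(placeholder, choose(rng))
        return line


# Hypothetical store macro: $1 and $3 are registers, $2 is the constant.
STORE = Macro(
    template="sw $1, $2($3)",
    parameters={
        "$1": lambda rng: rng.choice(["R1", "R2"]),
        "$2": lambda rng: str(rng.randint(-128, 127)),
        "$3": lambda rng: rng.choice(["R1", "R2"]),
    },
)


def build_test_program(library: List[Macro], length: int,
                       rng: random.Random) -> List[str]:
    """A test solution is a sequence of instantiated macros."""
    return [rng.choice(library).instantiate(rng) for _ in range(length)]


rng = random.Random(0)
sample_line = STORE.instantiate(rng)
sample_program = build_test_program([STORE], 3, rng)
```

A library with several macros per instruction would simply contain more `Macro` entries; the synthesis procedure then selects among them.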

In most of the IC designs, the gate-level structural descriptions are not available to generate the conventional fault models. So, various high-level fault models are generated using behavioral level fault modeling, where different faults such as the stuck-at-0 and stuck-at-1 faults are injected into the Register Transfer Level (RTL) descriptions. For example, in the input stuck-at fault model, the input is stuck to 0 or 1 for a bit or bit vector type signal and stuck to false or true for a Boolean type signal in the RTL statements. Presumably, most of the hardware faults are covered if the behavioral fault coverage is good enough because of the robust correlation (above 95%) between the behavioral faults and the physical faults as demonstrated in [6, 17, 18].

Fig. 3 A sample macro

Fig. 4 A sample test program

This approach has a higher level of abstraction compared to the gate-level fault modeling because the fault models are associated with the behavioral level descriptions [3, 23]. For example, an if-stuck-else fault, shown in Fig. 5, is the failure to execute an if condition, i.e., the if condition statement is replaced by if (FALSE) then.

Test programs with branch instructions, as shown in Fig. 6, could be used to detect some of the instruction fetch and decode faults of the control components. This fragment of assembly code has loop-based branch instructions. Initially, operands R1 and R2 are loaded with immediate values where the value imm1, loaded in R1, is less than the value imm2, loaded in R2. Later, R1 is incremented until R1 and R2 have equal values. Finally, the content of R1 is stored in memory.

This test program could detect a behavioral if-stuck-else fault in the description shown in Fig. 5. In the presence of this fault, during the execution of a branch instruction, the if block is not executed and the control signal branch is not activated. Since this test program branches only if the branch signal is activated, R1 is not incremented up to R2 and, finally, the content of R1 stored in the memory is observed to identify the fault. To automatically develop test programs that could detect every possible behavioral fault, evolutionary test synthesis techniques are employed as discussed in the next subsection.
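The detection argument can be traced with a small Python model of the loop. The exact instruction sequence of Fig. 6 is not reproduced here; the sketch assumes a loop body that increments R1 and then branches back while R1 differs from R2, and it models the if-stuck-else fault with a hypothetical `branch_stuck` flag that keeps the branch signal inactive.

```python
def run_branch_test(imm1: int, imm2: int, branch_stuck: bool = False) -> int:
    """Return the value of R1 that the test program stores in memory."""
    if imm1 >= imm2:
        raise ValueError("the test program assumes imm1 < imm2")
    r1, r2 = imm1, imm2                # load the immediate values
    while True:
        r1 += 1                        # loop body: increment R1
        # The branch back to the loop start is taken only if the branch
        # signal is activated; the fault keeps it inactive.
        branch_taken = (r1 != r2) and not branch_stuck
        if not branch_taken:
            break
    return r1                          # content of R1 stored in memory


golden = run_branch_test(3, 7)                       # fault-free response
faulty = run_branch_test(3, 7, branch_stuck=True)    # response with the fault
detected = golden != faulty

# With imm2 = imm1 + 1 the loop runs once in both cases, so the stored
# values coincide and this operand choice does not expose the fault.
masked = run_branch_test(3, 4) == run_branch_test(3, 4, branch_stuck=True)
```

Under these assumptions, the fault-free run stores imm2 while the faulty run stores imm1 + 1, so the fault is observable whenever the two immediates differ by more than one. This illustrates why operand selection in the macros affects fault coverage.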

3.1.2 Evolutionary Approach for Automation of Test Synthesis

The evolutionary core develops a genetic algorithm (GA) based automated test code synthesis procedure as shown in Fig. 7. A population of DAGs that represent the test solutions is developed using the instruction macros selected from the instruction library. An external fault simulator F evaluates the test program quality in terms of fault coverage. The evolutionary core develops a new generation of DAGs using the fittest solutions of the previous generation.

Fig. 5 If-stuck-else fault in instruction decoding in RTL

Fig. 6 Test program with branch instructions

In this evolutionary method, a parent population of test program solutions is modified in each generation using mutation, crossover, and selection operators. Mutation and crossover operators explore the search space for diverse test solutions, and the selection operator selects the fittest of them in order to generate the offspring, which eventually become the parent population for the next generation.

Evolutionary strategies (ES) are employed to automate the test synthesis using a directed acyclic graph (DAG) method. A DAG represents a test solution which is a sequence of macros as shown in Fig. 8. A DAG node has pointers to a macro element in the instruction library and the set of parameters as shown in Fig. 9. Prologue and epilogue nodes ($I_0$ and $I_F$) are the initial and final empty nodes. A (μ + λ) strategy of ES with an initial generation of μ DAG test solutions is carried out to develop efficient assembly programs that could validate the processor components. In every generation, λ new offspring are created using a 1-point crossover and the following mutation operators:

• Add node: A node is added to the DAG. The new node could be inserted anywhere after the prologue node and before the epilogue node in the DAG. If a new macro is inserted between a branch and its target instruction, the target address in the branch instruction must be updated.

• Remove node: A node is removed from the DAG. Any node could be removed from the DAG except the prologue and epilogue nodes. If the removed instruction is in the region between a branch and its target instructions, the target address in the branch instruction must be updated.

• Modify node: A node is modified in the DAG. If the modified node is a branch instruction, its target instruction must be within the DAG.

Fig. 7 Basic scheme of μGP test program synthesis

Fig. 8 A DAG with pointers to instruction library
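The three mutation operators listed above can be sketched as follows. This is a minimal illustration, not the authors' ANSI C implementation: the test program is assumed to be a flat Python list of dictionaries, a branch is assumed to carry a relative `offset` to its target, and the names `add_node`, `remove_node`, and `modify_node` are hypothetical.

```python
def add_node(program, pos, node):
    """Insert `node` at index `pos` (after the prologue, before the epilogue)."""
    assert 1 <= pos <= len(program) - 1
    new = [dict(ins) for ins in program]
    for idx, ins in enumerate(new):
        if "offset" in ins:  # branch with a relative target
            target = idx + ins["offset"]
            new_idx = idx + int(idx >= pos)
            new_target = target + int(target >= pos)
            ins["offset"] = new_target - new_idx  # fix-up of the target
    new.insert(pos, dict(node))
    return new


def remove_node(program, pos):
    """Remove the node at `pos`; prologue and epilogue are protected."""
    assert 1 <= pos <= len(program) - 2
    new = []
    for idx, ins in enumerate(program):
        if idx == pos:
            continue
        ins = dict(ins)
        if "offset" in ins:
            target = idx + ins["offset"]
            new_idx = idx - int(idx > pos)
            # If the target itself was removed, the branch now points
            # at the instruction that followed it.
            new_target = target - int(target > pos)
            ins["offset"] = new_target - new_idx
        new.append(ins)
    return new


def modify_node(program, pos, node):
    """Replace the node at `pos`; a branch must keep its target in the DAG."""
    assert 1 <= pos <= len(program) - 2
    if "offset" in node:
        assert 0 <= pos + node["offset"] < len(program)
    new = [dict(ins) for ins in program]
    new[pos] = dict(node)
    return new


program = [{"op": "prologue"}, {"op": "beq", "offset": 2}, {"op": "addi"},
           {"op": "sw"}, {"op": "epilogue"}]
grown = add_node(program, 2, {"op": "ori"})
shrunk = remove_node(program, 2)
```

In both `grown` and `shrunk` the `beq` still targets the `sw` instruction, which is the fix-up the operator descriptions require.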

Among the μ + λ individuals of a generation, the μ fittest individuals are selected by the tournament selection operator of tournament size τ. In each generation, these individual test programs are evaluated using the behavioral fault model to select the fittest population. An efficient selection of the objective function could help in carrying out the progressive development of the genetic population in consecutive generations, and the test solutions evolve through the generations until an optimal solution is achieved.
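A tournament-based (μ + λ) survivor selection can be sketched as below. The paper does not state whether tournament winners are returned to the pool, so drawing without replacement of winners is an assumption here, as are the function name `tournament_select` and the use of a plain fitness list.

```python
import random


def tournament_select(population, fitness, mu, tau=2, rng=None):
    """Pick `mu` survivors from the mu + lambda individuals.

    Each tournament draws `tau` distinct contestants and keeps the fittest.
    Winners leave the pool (an assumption made for this sketch).
    """
    rng = rng or random.Random()
    pool = list(range(len(population)))
    survivors = []
    while len(survivors) < mu and pool:
        contestants = rng.sample(pool, min(tau, len(pool)))
        winner = max(contestants, key=lambda i: fitness[i])
        survivors.append(population[winner])
        pool.remove(winner)
    return survivors


# 15 individuals (mu = 10 parents plus lambda = 5 offspring), toy fitness values.
population = [f"S{k}" for k in range(15)]
fitness = list(range(15))
survivors = tournament_select(population, fitness, mu=10, tau=2,
                              rng=random.Random(0))
```

With τ = 2 the least fit individual can never win a tournament while at least two candidates remain, which shows how a small tournament size still applies selection pressure without being fully elitist.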

From a testing point of view, the test program instructions which test the functionality of the processor components gain high coverage as they could propagate nearly all faults of the functional components. So, the test programs in the intermediate generations that detect many of the arithmetic and logic functional faults in ALU are likely to survive to the later generations. But the test programs that detect certain non-functional hard-to-test faults are less likely to survive if they do not validate any such functional modules. For example, some faults in the control unit are excited only if a beq (i.e., branch) instruction immediately comes after a load instruction.

Fig. 9 A DAG node with a pointer to instruction library and a pointer to set of parameters

Although some test programs detect faults which are extreme corner cases, they may not survive to the subsequent generations of test program population. It is observed that these solutions are preserved only if the objective function of the evolutionary approach deals with the coverage of the freshly and exceptionally identified faults. So, a greedy-cover based objective function is used to synthesize test solutions with instruction sequences that could detect the uncovered faults. In greedy-cover based test synthesis, test programs that detect hard-to-test faults are greedily selected for the next generation of population.

The test synthesis time is high for the greedy-cover based test synthesis because the test solutions with high fault coverage are not always selected. In fact, test solutions with considerably lower fault coverage could even be selected for the future generations if they detect a few hard-to-test faults. So, the convergence of the traditional SBST requires more generations of chromosomes and is thus slower. Therefore, in the next subsection, we introduce a reusability technique for a faster test synthesis, which avoids the fault simulation of test programs that have a similar influence on the observable locations of the processor.

3.2 Observability-based Reusability of Test Programs

While the evolutionary process progresses, it is highly likely that the evolutionary core develops individual solutions with similarities in fault simulation results. If the instruction sequences of two test individuals have similar functionalities, the fault simulation results could be reused to reduce the test synthesis time. As the initial μ chromosomes of a generation are replicated from the population of the previous generation, their responses could be naturally reused, thereby avoiding re-simulation. But the new λ chromosomes, which are cultivated using the μ individuals of the current generation, have to be handled with a faster, high-level comparison of the states of the observable destinations.

Processor faults are identified by comparing the contents of the observable locations on the processor. If the values stored in these observable locations following the simulation of an offspring solution are the same as that of one of its parent solutions, the set of faults that they could identify are likely to be the same; i.e., the fault coverages of equally-observable test solutions are likely to be same. In that case, a re-simulation of the offspring solution could be avoided by reusing the identified fault list and the fault coverage of the parent solution.

In Fig. 10, the framework for the rapid test evaluation is elaborated. Here, a high-level test program simulation, which has low timing requirement, is used to get the contents of all observable locations. Further, the observability of a newly generated test program and its parents are compared to identify the scope of reusability. As the crossover operator is one of the genetic operators employed for the SBST synthesis, each offspring solution P would be composed of the genetic information of two parent solutions. If any of these parents holds the same observability as that of P, the simulation responses of that parent solution could be taken from the database of simulation responses, where the simulation responses of the test solutions of the previous generations are stored. If no parent holds the same observability as that of P, P is to be fault simulated for the responses, which is a time-consuming process.

Fig. 10 Test program evaluation in RSBST

In Fig. 10, the path ① with bold lines denotes the rapid test evaluation method and path ② with dashed lines denotes the external evaluation of the test program. After the external fault simulation, the database of simulation responses is updated with the achieved responses. Finally, the contents of the observable locations, the fault list, and the fault coverage are obtained from the responses stored in the simulation database. In the next subsection, we discuss how the database of simulation responses is developed and updated with respect to the fault simulation of each test program solution.

3.2.1 Database of Simulation Responses

The database of simulation responses, as shown in Table 2, stores these observability values along with the fault coverage and the fault list of each test program solution. Let $S_i^j$ be the $j$th individual of the population in the $i$th generation and $OBS_i^j$ be the values of the observable locations after the execution of the test program solution $S_i^j$ on the processor. The contents of these observable locations are the simulation responses using which the fault coverage, fault list, etc. are evaluated. The set $OBS_i^j$ is a combination of:

• $M_i^j$: Memory updates after the execution of the test program solution $S_i^j$.

• $R_i^j$: Contents of the register locations after the execution of the test program solution $S_i^j$.

• $O_i^j$: Primary output values after the execution of the test program solution $S_i^j$.

So, $OBS_i^j = \{M_i^j, R_i^j, O_i^j\}$ is the overall test response on which the quality of the test program solution $S_i^j$ is evaluated. The test quality is dependent on the fault detection parameters, such as fault covered list, fault coverage, etc., which are evaluated using the observability values $OBS_i^j$.

This database has a record for each test program solution $S_i^j$, as shown in Table 2. Following the execution of $S_i^j$, the contents of observable locations $OBS_i^j$ are stored in the record corresponding to $S_i^j$ in the database. Let $F_i^j$ be the fault coverage achieved by the test program solution $S_i^j$ and $FL_i^j$ be the set of faults covered by $S_i^j$. Now, $F_i^j$ and $FL_i^j$, obtained using $OBS_i^j$ values, are also stored in the record corresponding to $S_i^j$ in the database shown in Table 2. As the fault simulation that produces the simulation responses, i.e., $OBS_i^j$, is time-consuming, we adopt a dynamic, high-level logic simulation as discussed in the next subsection.

3.2.2 High-level Simulation

To conduct a high-level logic simulation of test programs, the encoding and syntax of each instruction are modeled and the instruction operations are simulated using the functionalities of high-level programming languages, such as C, Python, etc. Initially, this high-level component allots limited memory for the observable locations of updated memory, registers, and primary outputs. The memory updates are observed with the help of the occurrences of store instructions in the test program. The contents of all registers and primary outputs are observed for logic simulation. Later, each instruction is selected and is logic simulated. After the logic simulation, the contents of these observable locations are collected as the test program responses.

Table 2 Database of simulation responses for the test program solutions of $i$th generation

| Test solutions of $i$th generation | Values of observable locations | Fault coverage | Fault list |
|---|---|---|---|
| $S_i^1$ | $OBS_i^1 = \{M_i^1, R_i^1, O_i^1\}$ | $F_i^1$ | $FL_i^1$ |
| $S_i^2$ | $OBS_i^2 = \{M_i^2, R_i^2, O_i^2\}$ | $F_i^2$ | $FL_i^2$ |
| ... | ... | ... | ... |
| $S_i^{\mu+\lambda}$ | $OBS_i^{\mu+\lambda} = \{M_i^{\mu+\lambda}, R_i^{\mu+\lambda}, O_i^{\mu+\lambda}\}$ | $F_i^{\mu+\lambda}$ | $FL_i^{\mu+\lambda}$ |

Let $p$ denote the number of updated memory locations, $q$ the number of register locations, and $r$ the number of primary outputs, which together are the observable locations of the processor. During the fault simulation of the test program solution $S_i^j$, developed during the evolutionary test synthesis, the data and control signals are dumped into a simulation log file in each clock cycle. Further, this simulation log file is parsed to observe the memory updates $M_i^j = \{m_1, m_2, m_3, \ldots, m_p\}$, contents of register locations $R_i^j = \{r_1, r_2, r_3, \ldots, r_q\}$, and the primary outputs $O_i^j = \{o_1, o_2, \ldots, o_r\}$. Finally, the overall observability $OBS_i^j = \{m_1, m_2, m_3, \ldots, m_p, r_1, r_2, r_3, \ldots, r_q, o_1, o_2, o_3, \ldots, o_r\}$ is stored in the record corresponding to $S_i^j$ in the simulation database.

Each instruction is subjected to logic simulation using the information of its opcode and operands. To realize the operations of instructions, a high-level procedure is developed for each opcode which could be reused for the logic simulation of every instruction composed of the same opcode. The opcode of each instruction is parsed to learn the operation to be performed on the observable locations. These operations are executed on the source operands and simultaneously, the observable locations corresponding to the destination operands are modified. Finally, the contents of observable locations are compiled to form the test program response, based on which the reusability is determined.

A memory update value of the set $M_i^j$ is represented using an ⟨address, value⟩ pair, where address is the updated memory location and value is the updated value at address following the logic simulation of a store instruction; each pair corresponds to one store instruction. For example, a MIPS store word instruction sw R1, offset(R2) has ⟨(R2 + offset), R1⟩ as the ⟨address, value⟩ pair, i.e., the memory location (R2 + offset) is updated with the value in R1. So, if the test program has $n$ store instructions, $n$ ⟨address, value⟩ pairs are observed. These values are extracted using logic simulation to constitute the set of memory updates $M_i^j$. However, to reduce the cost of the observability comparison procedure, the test synthesis does not allow more than $p$ store instructions in the test program, which is much less than the overall number of memory locations.

In Fig. 11, an intermediate test program solution of the evolutionary test synthesis is shown. This test program has 8 internal nodes that represent 8 instructions, and a subgraph, with 2 internal nodes, that represents multiplication operation. Suitable procedures are defined corresponding to the opcode of each instruction in the ISA. Also, in the state before the test program execution, every observable location is initialized to zero. Each procedure passes the values of source operands as the input arguments, conducts the operation defined by the opcode, and returns the value of the updated destination operand.

Initially, we select the first instruction ori of the test program shown in Fig. 11, with register r2 and an immediate value 2 as operands. To conduct logic simulation of this instruction, the ori procedure is activated and sets the value of r2 to 2 in the set of register values $R_i^j$. Likewise, the procedure of each opcode in the ISA is activated for the logic execution of each instruction. For the logic simulation of all the 13 instructions of the test program shown in Fig. 11, 9 procedures (ori, addi, bne, jal, sw, li, syscall, mul, jr) must be activated. In this simulation, the addi procedure is activated 3 times and the sw procedure is activated 2 times.
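The per-opcode procedures described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' simulator: only `ori`, `addi`, and `sw` are modeled, the tuple encoding of instructions and the names `PROCEDURES` and `simulate` are hypothetical, and the three-instruction program is invented for the example rather than taken from Fig. 11.

```python
def ori(src, imm):
    """Bitwise OR of a register value with an immediate."""
    return src | imm


def addi(src, imm):
    """Add an immediate, wrapped to 32 bits."""
    return (src + imm) & 0xFFFFFFFF


# One reusable procedure per opcode (only two opcodes are modeled here).
PROCEDURES = {"ori": ori, "addi": addi}


def simulate(program, num_regs=32):
    """Logic-simulate a program and return its observable contents."""
    regs = [0] * num_regs  # every observable location starts at zero
    mem_updates = []       # one (address, value) pair per store
    for opcode, *operands in program:
        if opcode == "sw":  # sw rt, offset(base)
            rt, offset, base = operands
            mem_updates.append((regs[base] + offset, regs[rt]))
        else:               # rd = procedure(rs, imm)
            rd, rs, imm = operands
            regs[rd] = PROCEDURES[opcode](regs[rs], imm)
    return {"M": tuple(mem_updates), "R": tuple(regs), "O": ()}


# Hypothetical program: ori r2, r0, 2; addi r1, r2, 4; sw r1, 3(r2)
program = [("ori", 2, 0, 2), ("addi", 1, 2, 4), ("sw", 1, 3, 2)]
obs = simulate(program)
```

Keeping the procedures in a dictionary keyed by opcode mirrors the idea that one procedure is written per opcode and reused for every instruction that carries it.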

After the execution of every instruction in the test program shown in Fig. 11, the values in the register locations r1, r2, r3, r4, r5 of $R_i^j$ become 6, 2, 10, 10, 100, respectively. Also, the memory locations $m_1$ and $m_2$ of $M_i^j$ are updated with ⟨address, value⟩ pairs ⟨5, 6⟩ and ⟨6, 100⟩, respectively, and they correspond to two store word instructions. Primary outputs remain unchanged. So, $OBS_i^j$, which is the compilation of contents of $R_i^j$, $M_i^j$, and $O_i^j$ after the execution of $S_i^j$, is evaluated using the procedures of logic simulation and is stored in the record corresponding to $S_i^j$ in the simulation database. In the next subsection, we demonstrate how the observability values of two test programs are compared for the discovery of equally-observable test programs.

Fig. 11 Representation of an intermediate test program of μGP test synthesis

3.2.3 Observability Comparator

In a generation of test individuals, μ offspring are selected and replicated directly from their parents. So, the fault simulation results of the candidate test solutions of a specific generation could be reused for the selected μ offspring of the next generation. Therefore, the fault evaluation of μ test solutions out of the μ + λ test solutions of any generation becomes effortless. If the test synthesis undergoes a huge number of generations, time consumed for the fault evaluation of the remaining λ test solutions would be enormous. Therefore, while cultivating the remaining λ individuals using the genetic operators, an observability-based reusability method of RSBST could be used to further reduce the test synthesis time.

The observability comparator analyzes the contents of the observable locations of a test program and its parent test programs. Based on the analysis results, the method of test program evaluation is chosen. For example, the test program 1 shown in Fig. 12a has equal observability as that of the test program 2 shown in Fig. 12b. Let us assume that the contents of all the registers are the same before the execution of these test programs. In these programs, identical functionalities are executed on the same registers and the eventual memory updates are same; i.e., following the execution of both test programs, the registers r1 and r2 are assigned with X and X + 1, respectively. Also, the memory update corresponding to the store instruction of test program 1 is ⟨(r4 + offset), (r2)⟩ and the memory update corresponding to the store instruction of test program 2 is ⟨(r4 + offset), (r3)⟩. As the contents of r2 and r3 were initially the same and remain unchanged, these memory updates are also identical. So, when these two test programs are executed independently on equivalent initial processor states, all the updated memory values, register values, and primary output values are observed to be identical.
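The comparison itself reduces to an equality check over the three groups of observable values. The sketch below assumes that each observability record is a dictionary with keys `M`, `R`, and `O` holding tuples, and that memory updates are compared in program order; the function name `equally_observable` and all values are hypothetical.

```python
def equally_observable(obs_a, obs_b):
    """True when memory updates, register contents, and primary outputs match."""
    return all(obs_a[key] == obs_b[key] for key in ("M", "R", "O"))


# Hypothetical observability records of three test programs.
obs_program1 = {"M": ((20, 3),), "R": (0, 7, 8, 3, 16), "O": (0, 0)}
obs_program2 = {"M": ((20, 3),), "R": (0, 7, 8, 3, 16), "O": (0, 0)}
obs_program3 = {"M": ((20, 3),), "R": (0, 7, 9, 3, 16), "O": (0, 0)}

reusable = equally_observable(obs_program1, obs_program2)
not_reusable = equally_observable(obs_program1, obs_program3)
```

Only the final contents are compared, so two programs with different instruction sequences can still be classified as equally observable, which is the case the example in the text describes.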

Fig. 12 Equally-observable test programs

Now, consider the actual, computationally intensive fault simulation of these two test programs, as discussed in Section 3 and Fig. 1. The processor is simulated with each of these test programs for N faulty models and a good reference model. Initially, the test program 1 is executed on these models and the simulation responses are collected for every cycle of execution. The fault coverage is evaluated by comparing these simulation responses, which are the contents of the observable locations. However, test programs 1 and 2 would have identical contents of observable locations for all N + 1 processor models since both of them realize the same functionality on each observable location. So, test programs 1 and 2 would have equal coverages too.

As equally-observable test programs are likely to have equal fault coverage, the simulation responses of test program 1 could be reused for test program 2 and vice versa. The next subsection illustrates the algorithm of Rapid SBST technique which has two significant aspects: 1) observability-based reusability of fault simulation results of test programs, and 2) greedy-based test synthesis for the detection of hard-to-test faults.

3.3 Algorithm for Faster Test Synthesis using RSBST

The proposed method of reusing the fault simulation responses is described in Steps 1-10 of Algorithm 1. Let $S_{i-1}^{parent1(j)}$ and $S_{i-1}^{parent2(j)}$ be the two parent test program solutions of $S_i^j$. These two test solutions of the $(i-1)$th generation are subjected to crossover and mutation operators to synthesize $S_i^j$ in the $i$th generation. After the test program solution $S_i^j$ is synthesized, a high-level internal logic simulation is conducted for the test quality evaluation of $S_i^j$. Now, we extract the contents of the observable locations of $S_i^j$ from the simulation responses and of its parents $S_{i-1}^{parent1(j)}$ and $S_{i-1}^{parent2(j)}$ from the database shown in Table 2. If the observable contents $OBS_i^j$ of $S_i^j$ are equivalent to either $OBS_{i-1}^{parent1(j)}$ of $S_{i-1}^{parent1(j)}$ or $OBS_{i-1}^{parent2(j)}$ of $S_{i-1}^{parent2(j)}$, the fault coverage $F_i^j$ and the fault list $FL_i^j$ are taken from the record of the equally-observable parent in the database in Table 2 and are reused for $S_i^j$, as illustrated in Steps 1-6 of Algorithm 1.

If the observability of the offspring solution is identical to that of neither parent solution (Steps 7-10 of Algorithm 1), the fault coverage and fault list of $S_i^j$ are evaluated using behavioral fault simulation (Step 8 of Algorithm 1). After the test evaluation, the database record for $S_i^j$ is updated with the fault coverage $F_i^j$ and fault list values $FL_i^j$, obtained using the fault simulation (Step 9 of Algorithm 1). These simulation responses could be reused for the further generations, i.e., if $OBS_{i+1}^{offspring(j)}$ of the test program solution $S_{i+1}^{offspring(j)}$ in the $(i+1)$th generation matches $OBS_i^j$ of $S_i^j$, the fault list and coverage of $S_i^j$ are reused for $S_{i+1}^{offspring(j)}$.
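The reuse decision described above can be sketched as follows. This is an illustration of the idea rather than a transcription of Algorithm 1: the database is assumed to be a Python dictionary keyed by test program identifier, `fault_simulate` is a hypothetical callback standing in for the external behavioral fault simulator, and all identifiers and values are invented.

```python
def evaluate(child_id, child_obs, parent_ids, database, fault_simulate):
    """Return (coverage, fault_list, reused) for a newly synthesized program."""
    for parent_id in parent_ids:
        record = database[parent_id]
        if record["OBS"] == child_obs:  # equally-observable parent found
            database[child_id] = {"OBS": child_obs, "F": record["F"],
                                  "FL": record["FL"]}
            return record["F"], record["FL"], True
    # No parent matches: fall back to the time-consuming fault simulation.
    coverage, fault_list = fault_simulate(child_id)
    database[child_id] = {"OBS": child_obs, "F": coverage, "FL": fault_list}
    return coverage, fault_list, False


database = {
    "p1": {"OBS": (1, 2, 3), "F": 80.0, "FL": frozenset({"f1", "f2"})},
    "p2": {"OBS": (4, 5, 6), "F": 60.0, "FL": frozenset({"f3"})},
}
simulated = []  # records which programs needed a full fault simulation


def stub_fault_simulator(test_program_id):
    """Stand-in for the external simulator; returns fixed toy results."""
    simulated.append(test_program_id)
    return 70.0, frozenset({"f1", "f3"})


first = evaluate("c1", (4, 5, 6), ["p1", "p2"], database, stub_fault_simulator)
second = evaluate("c2", (9, 9, 9), ["p1", "p2"], database, stub_fault_simulator)
```

Here `c1` inherits the record of `p2` without any simulation, while `c2` triggers the fallback and its results are stored for possible reuse by later generations.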

In this approach, the set $CF_{i-1}$ refers to the list of all covered faults until the generation $i$, and $NF_i^j$ is the set of faults newly detected by $S_i^j$. In Step 11, the objective function is defined as $|NF_i^j|$, which is the cardinality of the set of the newly covered faults. This greedy approach tends to protect the chromosomes that detect the hard-to-test faults through the generations. Finally, $CF_i$ is created by merging $CF_{i-1}$ and the set of fresh faults $NF_i^j$ that are detected by the selected chromosomes in the $i$th generation, as described in Step 12 of Algorithm 1. The experimental results that validate a faster test synthesis with the help of RSBST technique are demonstrated in the next section.
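The greedy objective and the update of the covered-fault set can be sketched with Python sets. The function names and fault labels are hypothetical, and the sketch assumes that the fitness of a test program is the number of faults it detects that are not yet in the covered set.

```python
def newly_covered(fault_list, covered_so_far):
    """Faults detected by a test program that were not covered before."""
    return set(fault_list) - set(covered_so_far)


def greedy_fitness(fault_list, covered_so_far):
    """Objective value: cardinality of the newly covered fault set."""
    return len(newly_covered(fault_list, covered_so_far))


def update_covered(covered_so_far, selected_fault_lists):
    """Merge the fresh faults of the selected programs into the covered set."""
    merged = set(covered_so_far)
    for fault_list in selected_fault_lists:
        merged |= newly_covered(fault_list, covered_so_far)
    return merged


covered_prev = {"f1", "f2", "f3"}
broad_program = {"f1", "f2", "f3"}  # high coverage, nothing new
corner_program = {"f2", "f9"}       # low coverage, one hard-to-test fault

fitness_broad = greedy_fitness(broad_program, covered_prev)
fitness_corner = greedy_fitness(corner_program, covered_prev)
covered_now = update_covered(covered_prev, [broad_program, corner_program])
```

The program with lower raw coverage scores higher under this objective because it contributes a fault nobody else has detected, which is why corner-case solutions survive selection.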

4 Experimental Results

For the experimental evaluation of our RSBST test synthesis, we have used a similar kind of setup, benchmark processors, and fault models considered in [36]. We have also used a 32-bit MIPS processor and a Leon3 processor model of SPARC V8 architecture with a 7-stage pipeline to be tested with the help of 10 behavioral fault representations shown in Table 3. The MIPS processor is synthesized using 810 lines of VHDL code and the Leon3 processor has 5017 lines of VHDL code. The command-line options of ModelSim 10.5b simulator are used to execute the synthesized test programs on the faulty and non-faulty models of the processor. The fault simulation responses are extracted to evaluate the test programs and thereafter, the fittest solutions are selected. Generally, the synthesized test programs are of 40–60 lines of assembly code.

We employed a 4-core Intel i5 3.20 GHz processor to evaluate the conventional μGP [34], greedy GA [36], and the proposed RSBST test synthesis methods. In the existing μGP techniques, a test program simulation needs a huge storage space of 538 MB to store the responses since these responses comprise the clock-by-clock information of every processor RTL signal. In the proposed RSBST approach, we avoid storing the detailed simulation responses of the test solutions whose observabilities match those of their parent test solutions. However, the ⟨TP, OBS, FC, FL⟩ information of these test solutions is stored in a database, which altogether consumes a much smaller storage space of 1.65 MB. This implies a considerable reduction of the storage requirement for the fault simulations in the proposed RSBST scheme.

Table 3 10 categorical forms of behavioral faults [3]

| Fault representation | Failure type |
|---|---|
| Input stuck-at fault | Any primary input signal |
| Output stuck-at fault | Any primary output signal |
| If stuck then fault | If block is executed always |
| If stuck else fault | Else block is executed always |
| Elsif stuck then fault | Elsif block is executed always |
| Elsif stuck else fault | Elsif block is failed to get executed always |
| Assignment statement fault | Assignment of new values to a signal is failed always |
| Dead clause fault | A selected When clause in a case statement |
| Micro-operation fault | Micro-operations are failed always |
| Local stuck data fault | A signal object in a local expression |


Table 4 Storage space consumption of existing and proposed methods

| Method | Storage space for non-reusable test programs | Storage space for reusable test programs | Overall storage space consumed |
|---|---|---|---|
| Existing GA | 538 MB | 538 MB | 8.07 GB |
| Proposed RSBST | 538 MB | 1.65 MB | 1.15 GB |

In Table 4, we have shown the storage space consumed by reusable and non-reusable test programs. The total storage space consumption is the space consumed by each generation of test solutions (15 individuals) since we could remove the simulation responses and reuse the storage space after each generation. This final storage consumption is 8.07 GB for the existing μGP methods whereas the proposed RSBST consumes only 1.15 GB of storage space, which is much less. So, the test programs consume only 14.25% of the storage space consumed by the test programs in the existing μGP methods.
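The two headline figures can be checked with a few lines of arithmetic. The sketch assumes 1 GB = 1000 MB, which is the convention that reproduces the 8.07 GB total. The 1.15 GB figure is taken from Table 4 as reported and is not re-derived here.

```python
individuals_per_generation = 15  # test solutions kept per generation
response_mb = 538                # full simulation response per test program

# Existing methods store a full response for every individual.
existing_total_gb = individuals_per_generation * response_mb / 1000

rsbst_total_gb = 1.15            # reported value from Table 4
ratio_percent = rsbst_total_gb / existing_total_gb * 100

print(f"existing: {existing_total_gb:.2f} GB, RSBST share: {ratio_percent:.2f}%")
```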

The functional and control components of the MIPS processor are tested in a software simulation environment of 270 fault models. To evaluate the test program, memory updates, contents of general-purpose registers, and the primary outputs are extracted from the simulation responses using Python scripts and compared with the golden responses. The (μ + λ) evolutionary-based test synthesizer is developed using an ANSI C implementation of 934 lines with three mutation operators and a 1-point crossover operator.

The parameter values used for the proposed automated synthesis are shown in Table 5. The chromosomes of each generation are selected using a tournament selection operator where the tournament size (τ) is 2. The evolutionary core executes the test synthesis for 400 generations and terminates if there is no improvement for 40 generations, which is the steady-state threshold. For the conventional μGP and the greedy-based μGP, the size of the initial population (μ) is taken as 10 and the number of offspring to be generated in each generation (λ) is taken as 5. But the faster convergence of RSBST could be exploited for achieving adequate coverage using a larger population size of test solutions. So, for RSBST test synthesis, we expand the search space by adopting μ as 20 and λ as 10. In the next subsection, we discuss how the observability comparator makes use of test program observability for faster test synthesis.

Table 5 Specifications for the proposed automated test synthesis

| Specification | Value |
|---|---|
| Number of generations | 400 |
| Selection methodology | Tournament selection |
| Size (τ) of the tournament | 2 |
| Steady-state threshold | 40 |
| Evolutionary methodology | ES (μ + λ) approach |
| Fault coverage evaluation method | Behavioral fault evaluation |

4.1 Observability Analysis of Test Programs

To identify the equally-observable test programs, the observability comparator is developed using ANSI C on the greedy-based evolutionary test synthesizer with 1309 lines of code for the MIPS processor. This module stores the contents of the observable destinations to identify the redundant test programs which could be internally evaluated. We observe 64 memory updates (each corresponds to a store instruction), contents of all 32 registers, and 2 primary outputs for the high-level logic simulation of test programs.

Let the set of memory updates be $M_i^j = \{m_1, m_2, m_3, \ldots, m_{64}\}$, contents of register locations be $R_i^j = \{r_1, r_2, r_3, \ldots, r_{32}\}$, and the primary outputs be $O_i^j = \{o_1, o_2\}$ after the execution of the test program solution $S_i^j$, which is the $j$th individual of the population in the $i$th generation. So, the overall observability $OBS_i^j$ for test program solution $S_i^j$ becomes $\{m_1, m_2, m_3, \ldots, m_{64}, r_1, r_2, r_3, \ldots, r_{32}, o_1, o_2\}$, i.e., contents of 98 observable locations.

Now, the simulation database is loaded with the contents of the observable locations $OBS_{i-1}^{parent1(j)}$ and $OBS_{i-1}^{parent2(j)}$ of the parent test programs of $S_i^j$. Let $OBS_{i-1}^{parent1(j)}$ be the set of values $\{m'_1, m'_2, m'_3, \ldots, m'_{64}, r'_1, r'_2, r'_3, \ldots, r'_{32}, o'_1, o'_2\}$ and $OBS_{i-1}^{parent2(j)}$ be the values $\{m''_1, m''_2, m''_3, \ldots, m''_{64}, r''_1, r''_2, r''_3, \ldots, r''_{32}, o''_1, o''_2\}$. If the observability $OBS_i^j$ of test program $S_i^j$ is equivalent to either the observability $OBS_{i-1}^{parent1(j)}$ of one of its parents $S_{i-1}^{parent1(j)}$ or the observability $OBS_{i-1}^{parent2(j)}$ of the other parent $S_{i-1}^{parent2(j)}$, the fault lists and fault coverages of the parent solution can be reused for the evaluation of $S_i^j$. We have applied and validated these equivalences on a MIPS processor and a Leon3 processor and the results are illustrated in the next subsection.

Fig. 13 Average fault coverage of (a) a MIPS processor and (b) a Leon3 processor over 400 generations using 1) μGP [34] with behavioral fault model, 2) greedy-based GA [36], 3) proposed RSBST. Each panel plots Fault Coverage (%) against Generations (0 to 400) for the existing μGP, the greedy-based GA, and the proposed RSBST.

4.2 Case Studies of Faster SBST Synthesis on MIPS Processor and Leon3 Processor

A set of macros corresponding to the 9 instructions of the MIPS processor is used for developing the constituents of the instruction library. For the MIPS processor, the enhancement in the development of the test synthesis using the proposed RSBST scheme is shown in Fig. 13a. The average fault coverage of the conventional μGP scheme [34] with behavioral fault model achieves an adequate coverage (80–85%) after 50 generations, whereas the existing greedy-based GA [36] could only cover 85% of the faults after 75 generations. Eventually, 93.9% of the behavioral faults are detected by the μGP approach [34] and 96.3% of faults are detected by the greedy-based GA [36]. However, our RSBST test synthesis yields more than 85% of fault coverage before 50 generations and finally reaches an adequate coverage of 96.1%.

For the Leon3 processor, the progress in the achieved fault coverage using the RSBST scheme is shown in Fig. 13b. The conventional μGP approach [34] with the behavioral fault model yields a fault coverage of 80–85% before 150 generations whereas the greedy-based GA [36] accomplishes above 80% coverage only after 200 generations. Finally, the μGP approach [34] could detect 92.9% of the faults and the greedy-based GA [36] comes up with a fault coverage of 95.8%. However, the proposed RSBST test synthesis covers more than 85% of the possible faults before 100 generations and ends with a final fault coverage of 95.5%.

The test synthesis for MIPS processor was conducted module-by-module whereas monolithic test programs were generated for the Leon3 processor. The processor model describes the RTL model of the processor to be tested in the hardware description language VHDL, either in synthesizable or simulatable form. Here, the RTL model is subjected to module partitioning which is realized by breaking down the RTL design into several functional units and testing them separately. So, each processor module corresponds to a single hardware block and therefore, there are as many modules as the number of valid digital blocks in the processor model.

The coverage and test synthesis time for the five major modules of the MIPS processor are shown in Table 6. The overall test set consists of the test programs synthesized for the validation of each module. In the conventional μGP [34], the achieved coverage (83.83%) was inadequate for the control unit module but the synthesis was reasonably fast (49.5 hours). So, the coverage of the control unit was improved to 90.32% by the greedy coverage method [36], which incurs a longer test synthesis of 68.16 hours.

Table 6 Achieved coverage and synthesis time of MIPS processor modules

| MIPS processor module | Conventional μGP by G. Squillero [34]: Coverage (%) | Conventional μGP [34]: Synthesis time (hours) | Greedy GA by Suryasarman et al. [36]: Coverage (%) | Greedy GA [36]: Synthesis time (hours) | Proposed RSBST: Coverage (%) | Proposed RSBST: Synthesis time (hours) |
|---|---|---|---|---|---|---|
| ALU | 100 | 24.5 | 100 | 33.73 | 100 | 18.07 |
| PC | 100 | 11.5 | 100 | 15.83 | 100 | 8.48 |
| RF | 96.67 | 15 | 96.67 | 20.65 | 96.67 | 11.06 |
| ALU Control | 90.24 | 21.5 | 94.87 | 29.63 | 94.27 | 15.86 |
| Control Unit | 83.83 | 49.5 | 90.32 | 68.16 | 90.12 | 36.53 |
| Total | 93.9 | 122 | 96.3 | 168 | 96.1 | 90 |

Fig. 14 A block diagram of 32-bit CLA of MIPS processor with eight 4-bit CLA blocks

Some of the behavioral faults of the control unit can be tested only by rare sequences of instructions. We therefore use a larger solution space for the population of test individuals. Since the RSBST test synthesis is faster, this larger population helps in developing instruction sequences that detect the harder behavioral faults of the control unit. As a result, test programs with coverage above 90% are synthesized for the control unit within 36.53 hours using the proposed RSBST technique. For the remaining modules, a minimum coverage of 94% is achieved, and the overall evolutionary test synthesis terminates in 90 hours. The amount of simulation responses reused for the chromosomes, bypassing the fault simulation, is discussed in Section 4.3.
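As a consistency check on Table 6, the per-module synthesis times can be summed and compared across the three methods. The sketch below is illustrative only: the dictionary layout and function names are assumptions introduced here, and the numbers are copied from Table 6.

```python
# Per-module synthesis times in hours, copied from Table 6.
# Tuple order: (conventional uGP [34], greedy GA [36], proposed RSBST).
SYNTHESIS_HOURS = {
    "ALU": (24.5, 33.73, 18.07),
    "PC": (11.5, 15.83, 8.48),
    "RF": (15.0, 20.65, 11.06),
    "ALU Control": (21.5, 29.63, 15.86),
    "Control Unit": (49.5, 68.16, 36.53),
}


def total_hours(method_index: int) -> float:
    """Sum the module times of one method (0 = uGP, 1 = greedy GA, 2 = RSBST)."""
    return sum(times[method_index] for times in SYNTHESIS_HOURS.values())


def greedy_to_rsbst_ratio(module: str) -> float:
    """Ratio of greedy GA time to RSBST time for one module."""
    _, greedy, rsbst = SYNTHESIS_HOURS[module]
    return greedy / rsbst


if __name__ == "__main__":
    for index, name in enumerate(("uGP", "greedy GA", "RSBST")):
        print(f"{name}: {total_hours(index):.2f} h")
    for module in SYNTHESIS_HOURS:
        print(f"{module}: {greedy_to_rsbst_ratio(module):.2f}x")
```

With these figures the three sums come to 122, 168, and 90 hours, which matches the Total row of Table 6.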

4.2.1 A Correlation Study Between Behavioral-level and Gate-level Fault Modeling for MIPS Processors

Gate-level structural testing has become very difficult because of the circuit complexity of modern processors. This challenge has led test engineers to devise behavioral fault models at the RTL abstraction level. To evaluate the reliability of these fault models, a correlation study between behavioral-level fault models and gate-level fault models is conducted. As gate-level fault models are closely correlated with the real physical faults,

Fig. 15 Behavioral-level assignment fault

a high correlation between the behavioral-level and gate-level faults would be sufficient to establish the reliability of behavioral-level fault models. Some of the previous studies [6, 17, 18] have validated a high correlation between gate-level and RTL faults for generic complex circuits. We conduct a similar study for the behavioral-level faults of the MIPS processor circuitry.

In this study, we consider the behavioral fault models shown in Table 3, proposed by [3]. Our aim is to show that the synthesized high-quality test program, which detects most of the behavioral faults of a MIPS processor, can also detect almost every gate-level stuck-at fault of that processor. To this end, we partition the processor into functional modules and determine the correlation of behavioral faults for each module. If the overall correlation across all modules is adequate, we adopt this behavioral fault model for test evaluation.

We consider a 32-bit carry-lookahead adder (CLA) of the arithmetic and logic unit (ALU), built from eight 4-bit CLA blocks, as shown in Fig. 14. This CLA circuit performs fast addition of two 32-bit integers (A[31:0] and B[31:0]) with an input carry Cin, generating the sum S[31:0] and the output carry Cout. Let us assume that the behavioral fault shown in Fig. 15, which is an assignment stuck-at-zero fault, is injected into the RTL description of the CLA module. This description has two 32-bit input signals (in_a and in_b) and an output signal out_c. A 4-bit opcode alu_op selects the add operation to be performed by this CLA module. If the assignment fault is injected on the addition statement, the value of the out_c signal is always "0000 ··· 0000".

The test code routine shown in Fig. 16 can detect the assignment fault injected in the CLA module. In this test

Fig. 16 SBST test code for ALU for MIPS processor


Fig. 17 A 4-bit CLA block of MIPS processor

code, we focus on the procedure that adds two register values and on the related instructions. To add the signal values in_a and in_b, the add_proc procedure adds the contents of registers r1 and r2 and stores the result in the memory location <r4 + offset>. With r1 and r2 both set to 15 by the test program, the memory update corresponding to the store instruction becomes <r4 + offset, 30>. If the assignment fault shown in Fig. 15 is injected on the signal out_c, the memory update is observed to be <r4 + offset, 0>. Thus, the synthesized test code shown in Fig. 16 detects the behavioral-level assignment fault shown in Fig. 15 by comparing the memory updates corresponding to store instructions. Likewise, every behavioral fault of the CLA module is traced by the complete test code developed using the proposed RSBST scheme, of which Fig. 16 shows a short fragment.
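The detection argument above can be sketched in a few lines of Python. This is a toy behavioral model and not the VHDL description or the test code of Fig. 16: the opcode value `ALU_ADD`, the register contents, and the helper names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

MASK32 = 0xFFFFFFFF
ALU_ADD = 0b0010  # hypothetical 4-bit alu_op encoding for "add"


def alu(in_a: int, in_b: int, alu_op: int, assign_sa0: bool = False) -> int:
    """Behavioral adder model; only the add operation is sketched."""
    if alu_op != ALU_ADD:
        raise ValueError("only the add operation is modeled in this sketch")
    if assign_sa0:
        # Assignment stuck-at-zero: the faulty statement always drives zero.
        return 0
    return (in_a + in_b) & MASK32


def add_proc(regs: Dict[str, int], offset: int,
             assign_sa0: bool = False) -> List[Tuple[int, int]]:
    """Add r1 and r2, store at r4 + offset, return the memory updates."""
    out_c = alu(regs["r1"], regs["r2"], ALU_ADD, assign_sa0)
    return [(regs["r4"] + offset, out_c)]


def detects_fault(regs: Dict[str, int], offset: int) -> bool:
    """A fault is detected when golden and faulty memory updates differ."""
    golden = add_proc(regs, offset)
    faulty = add_proc(regs, offset, assign_sa0=True)
    return golden != faulty


if __name__ == "__main__":
    regs = {"r1": 15, "r2": 15, "r4": 0x1000}
    print(add_proc(regs, 8))                   # fault-free update
    print(add_proc(regs, 8, assign_sa0=True))  # update under the fault
    print(detects_fault(regs, 8))
```

Note that operands whose sum is zero would not activate this fault in the model, which is one reason the operand values chosen by the test program matter.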

Each 4-bit block of the CLA shown in Fig. 14 has four full adders and a lookahead logic that computes the carry for each adder, as shown in Fig. 17. The generate and propagate signals Gi and Pi, which are the inputs to the lookahead logic, are calculated from the Ai and Bi signals. The carry-out value cout is calculated from cin and the Gi and Pi signals. Let us assume that the value of the input carry cin is zero. If both A[3:0] and B[3:0] are 1111, the value of cout is 1. Now, consider a stuck-at-0 fault in the final OR gate of the 4-bit CLA block, as shown in Fig. 17. This stuck-at-0 fault would change the value of cout from 1 to 0.

Now, we test the 32-bit CLA block using the test code shown in Fig. 16. When this test program is executed, the value 15, which is 1111 in binary, is assigned to the registers r1 and r2. Consequently, A[3:0] and B[3:0] also become 1111. This activates one of the inputs of the final OR gate, so the carry-out cout becomes 1; the other input is zero since cin is zero. Now, if a stuck-at-0 fault is present on the active input of the final OR gate, as shown in Fig. 17, both inputs become zero and cout becomes 0, i.e., the test program execution has propagated the stuck-at-0 fault to cout. When the value of cout changes due to the fault, the output value of the next CLA block (S[7:4]) also changes.
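The propagation of this stuck-at-0 fault can be reproduced with a small bit-level model. The sketch below is not derived from the netlist of Fig. 17: it assumes that the final OR gate combines the group-generate term with the AND of the group-propagate term and cin, and that the fault sits on the group-generate input. All function and parameter names are hypothetical.

```python
from typing import Optional, Tuple

MASK4 = 0xF


def cla4(a: int, b: int, cin: int, final_or_sa0: bool = False) -> Tuple[int, int]:
    """One 4-bit CLA block; returns (sum, cout)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(a_bits, b_bits)]  # generate signals
    p = [x ^ y for x, y in zip(a_bits, b_bits)]  # propagate signals
    # Carries into each bit position, written iteratively; the values are
    # the same as those of a flattened lookahead network.
    carries = [cin]
    for i in range(3):
        carries.append(g[i] | (p[i] & carries[i]))
    total = 0
    for i in range(4):
        total |= (p[i] ^ carries[i]) << i
    group_g = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
               | (p[3] & p[2] & p[1] & g[0]))
    group_p = p[3] & p[2] & p[1] & p[0]
    # Assumed final OR gate: cout = group_g OR (group_p AND cin).
    # The stuck-at-0 fault forces the group_g input of that gate to 0.
    or_input = 0 if final_or_sa0 else group_g
    cout = or_input | (group_p & cin)
    return total, cout


def cla32(a: int, b: int, cin: int = 0,
          faulty_block: Optional[int] = None) -> Tuple[int, int]:
    """Eight chained 4-bit blocks; faulty_block selects the block with the fault."""
    result, carry = 0, cin
    for block in range(8):
        nibble_a = (a >> (4 * block)) & MASK4
        nibble_b = (b >> (4 * block)) & MASK4
        s, carry = cla4(nibble_a, nibble_b, carry,
                        final_or_sa0=(block == faulty_block))
        result |= s << (4 * block)
    return result, carry


if __name__ == "__main__":
    print(cla32(15, 15))                  # fault-free sum and carry-out
    print(cla32(15, 15, faulty_block=0))  # fault in the lowest block
```

Under these assumptions the fault-free sum of 15 and 15 is 30, while the faulty adder returns 14: the lost carry clears the least significant bit of S[7:4], so the store instruction of the test program observes a different value.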

Likewise, almost every gate-level stuck-at fault of the CLA circuitry can be detected using the test program synthesized using behavioral fault simulations. We modeled 1,180 gate-level single stuck-at faults for the 32-bit CLA module of the MIPS processor. A gate-level fault simulation with ModelSim showed that the test programs synthesized using the proposed RSBST scheme detect 99% of the gate-level stuck-at faults of the CLA module.

This correlation study is conducted for the other MIPS modules as well. Correlation values of 97% for the ALU, 94% for the PC, 92% for the RF, 91% for the ALU control unit, and 89% for the control unit are achieved, as listed in Table 9. Altogether, a correlation coefficient of 94.8% with the gate-level faults is achieved for the MIPS processor. The effective gate-level fault coverage of each MIPS processor module is also shown in Table 9. As these test codes cover 96.1% of the behavioral faults, the overall effective gate-level coverage becomes 91.1% (94.8% × 96.1%) for the proposed RSBST scheme. These results show that the behavioral faults ensure a test quality nearly equivalent to that of the gate-level faults and thus can be employed for test quality evaluation.
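The effective gate-level coverage values reported in Table 9 agree, to within rounding, with the product of the correlation coefficient and the behavioral fault coverage. The following sketch checks this relation; treating the effective coverage as a plain product is an inference from the reported numbers, and the names used are illustrative.

```python
# Rows copied from Table 9:
# (correlation %, behavioral coverage %, reported effective coverage %).
TABLE_9 = {
    "ALU": (97.0, 100.0, 97.0),
    "PC": (94.0, 100.0, 94.0),
    "RF": (92.0, 96.67, 88.93),
    "ALU Control": (91.0, 94.27, 85.78),
    "Control Unit": (89.0, 90.12, 80.2),
    "Total": (94.8, 96.1, 91.1),
}


def effective_coverage(correlation_pct: float, behavioral_pct: float) -> float:
    """Effective gate-level coverage as the product of the two percentages."""
    return correlation_pct * behavioral_pct / 100.0


if __name__ == "__main__":
    for module, (corr, behav, reported) in TABLE_9.items():
        computed = effective_coverage(corr, behav)
        print(f"{module}: computed {computed:.2f}%, reported {reported}%")
```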

Table 7 MIPS processor: achieved coverage and time of (1) μGP [34] with behavioral fault model, (2) greedy-based GA [36], and (3) the proposed RSBST method

| Framework | Simulation environment | Behavioral fault coverage | Test synthesis time | Chromosome reuse | Remarks |
|---|---|---|---|---|---|
| Conventional μGP by G. Squillero [34] | ModelSim version 5.7a | 93.9% | 122 hours | 66.6% | Lesser fault coverage but test synthesis is faster |
| Greedy GA by Suryasarman et al. [36] | GHDL | 96.3% | 168 hours | 66.6% | Improved fault coverage but test synthesis consumes huge time |
| Proposed RSBST | ModelSim version 10.5b | 96.1% | 90 hours | 82.1% | Adequate fault coverage and faster test synthesis |


Table 8 Leon3 processor: achieved coverage and time of (1) μGP [34] with behavioral fault model, (2) greedy-based GA [36], and (3) the proposed RSBST method

| Framework | Simulation environment | Behavioral fault coverage | Test synthesis time | Chromosome reuse | Remarks |
|---|---|---|---|---|---|
| Conventional μGP by G. Squillero [34] | ModelSim version 5.7a | 92.9% | 142 hours | 66.6% | Lesser fault coverage but reasonable test synthesis time |
| Greedy GA by Suryasarman et al. [36] | GHDL | 95.8% | 172 hours | 66.6% | Improved fault coverage but longer test synthesis |
| Proposed RSBST | ModelSim version 10.5b | 95.5% | 98 hours | 80.8% | Adequate fault coverage and faster test synthesis |

4.3 Chromosome Reusability of RSBST

Tables 7 and 8 show the fault coverage, the test synthesis time, and the amount of chromosome reuse. The chromosome reuse refers to the percentage of chromosomes (test program solutions) reused throughout the test synthesis, excluding the first generation. For the first generation of test program solutions, μ + λ fault simulations must be performed to load the simulation results into an empty simulation database. From the second generation onward, we investigate the scope of reusability and thereby reduce the test development time.
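The role of the simulation database can be illustrated with a small cache keyed by an observability signature. This is a minimal sketch and not the RSBST implementation: the class `SimulationDB`, the key function `store_signature`, and the stand-in simulator `toy_fault_sim` are all hypothetical, and the signature used here (the ordered store instructions of a program) is a deliberately naive placeholder for the notion of identical observability.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Tuple

Program = Tuple[str, ...]


class SimulationDB:
    """Caches fault simulation results by an observability signature."""

    def __init__(self,
                 simulate: Callable[[Program], FrozenSet[str]],
                 signature: Callable[[Program], Hashable]) -> None:
        self._simulate = simulate
        self._signature = signature
        self._results: Dict[Hashable, FrozenSet[str]] = {}
        self.simulated = 0  # number of real fault simulations run
        self.reused = 0     # number of evaluations served from the database

    def evaluate(self, program: Program) -> FrozenSet[str]:
        key = self._signature(program)
        if key in self._results:
            self.reused += 1
        else:
            self._results[key] = self._simulate(program)
            self.simulated += 1
        return self._results[key]


def store_signature(program: Program) -> Tuple[str, ...]:
    # Hypothetical observability key: the ordered store instructions.
    return tuple(ins for ins in program if ins.startswith("sw"))


def toy_fault_sim(program: Program) -> FrozenSet[str]:
    # Stand-in for a real fault simulation; the fault labels are made up.
    return frozenset(f"fault@{ins}" for ins in store_signature(program))


if __name__ == "__main__":
    db = SimulationDB(toy_fault_sim, store_signature)
    db.evaluate(("add r3,r1,r2", "sw r3,0(r4)"))
    db.evaluate(("nop", "add r3,r1,r2", "sw r3,0(r4)"))  # served from the database
    db.evaluate(("add r3,r1,r2", "sw r3,4(r4)"))
    print(db.simulated, db.reused)
```

In the first generation the database is empty, so every evaluation triggers a simulation; reuse only begins once signatures repeat. Such reuse is sound only if programs with equal signatures really produce the same fault simulation result.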

For the MIPS processor, the μGP approach [34] consumes 122 hours for the test synthesis and the greedy-based GA [36] takes 168 hours. The RSBST approach consumes only 90 hours, which is 46.4% less time than the greedy-based GA [36], with an adequate coverage of 96.1%, as shown in Table 7. To synthesize monolithic test programs for the Leon3 processor, the μGP approach [34] takes 142 hours and the greedy-based GA [36] consumes 172 hours. The proposed RSBST approach consumes only 98 hours, which is 43% less time than the greedy-based GA [36], with a coverage of 95.5%, as shown in Table 8.
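The quoted percentages are relative reductions in synthesis time with respect to the greedy-based GA. A short sketch of the arithmetic, using the hours from Tables 7 and 8 (the function name is illustrative):

```python
def time_reduction_percent(baseline_hours: float, new_hours: float) -> float:
    """Relative reduction in synthesis time with respect to a baseline."""
    return 100.0 * (baseline_hours - new_hours) / baseline_hours


if __name__ == "__main__":
    # RSBST against the greedy-based GA [36].
    print(f"MIPS:  {time_reduction_percent(168, 90):.1f}%")   # 46.4%
    print(f"Leon3: {time_reduction_percent(172, 98):.1f}%")   # 43.0%
    # RSBST against the conventional uGP [34], for comparison.
    print(f"MIPS vs uGP:  {time_reduction_percent(122, 90):.1f}%")
    print(f"Leon3 vs uGP: {time_reduction_percent(142, 98):.1f}%")
```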

The chromosome reusability is exploited to accelerate the convergence of the greedy-based GA [36]. For the μGP and the greedy-based GA [36], the simulation responses of the selected chromosomes (μ = 10) of each generation are adopted and reused directly from the parent chromosomes,

Table 9 Effective gate-level coverages of MIPS processor modules

| Module | Correlation coefficient (%) | Behavioral fault coverage (%) | Effective gate-level coverage (%) |
|---|---|---|---|
| ALU | 97 | 100 | 97 |
| PC | 94 | 100 | 94 |
| RF | 92 | 96.67 | 88.93 |
| ALU Control | 91 | 94.27 | 85.78 |
| Control Unit | 89 | 90.12 | 80.2 |
| Total | 94.8 | 96.1 | 91.1 |

which saves 66.6% of test synthesis time. For our proposed RSBST approach, the offspring chromosomes (λ = 5) are substituted by the equally-observable parent chromosomes, in addition to the reuse of the selected chromosomes (μ = 10). As a result, the overall share of reusable chromosomes reaches 82.1% for the MIPS processor and 80.8% for the Leon3 processor (Tables 7 and 8).
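The 66.6% figure is consistent with the ratio μ/(μ + λ) = 10/15. The sketch below makes that arithmetic explicit and, under the assumption that each generation evaluates μ + λ chromosomes and that the reuse percentage is the reused share of those evaluations, back-computes the share of offspring that would have to be substituted to reach the reported totals. The offspring shares are an inference for illustration and are not figures reported in this paper.

```python
def reuse_fraction(mu: int, lam: int, offspring_reused_share: float = 0.0) -> float:
    """Reused share of the mu + lam evaluations of one generation."""
    return (mu + offspring_reused_share * lam) / (mu + lam)


def implied_offspring_share(mu: int, lam: int, overall_reuse: float) -> float:
    """Offspring share that yields a given overall reuse fraction."""
    return (overall_reuse * (mu + lam) - mu) / lam


if __name__ == "__main__":
    mu, lam = 10, 5
    print(f"parents only: {100 * reuse_fraction(mu, lam):.1f}%")
    for name, overall in (("MIPS", 0.821), ("Leon3", 0.808)):
        share = implied_offspring_share(mu, lam, overall)
        print(f"{name}: about {100 * share:.1f}% of offspring reused (assumed model)")
```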

5 Conclusion

Evolutionary methods [9, 10, 32, 34] are widely used to synthesize effective, compact SBST codes for processors. Although these methods converge quickly, the coverage they yield is inadequate. An existing greedy-based μGP framework [36] searches extensively and detects many of the hard-to-test faults, but converges slowly. To realize a faster test synthesis, we have integrated the reuse of fault simulation results of test programs into the greedy-based μGP method.

From the results, we conclude that, using a more comprehensive fault model, our strategy develops test solutions that detect 96.1% of the testable behavioral faults of the MIPS processor in 90 hours and 95.5% of those of the Leon3 processor in 98 hours. This corresponds to a chromosome (test program) reuse of 82.1% for the MIPS processor and 80.8% for the Leon3 processor. Also, this reusability technique reduced the storage consumption of the simulation responses to 14.25% of that required by the cycle-by-cycle simulations conducted by the existing μGP methods. We also conducted a module-by-module correlation analysis between the high-level behavioral faults and the gate-level stuck-at faults of the MIPS processor. A strong correlation (94.8%) between the behavioral faults and the gate-level faults establishes a high effective test quality (gate-level fault coverage above 91%) for our scheme.

As part of future work, we need to validate the correlation of behavioral fault models with other intricate gate-level fault models, such as bridging fault models and delay fault models. This extension could help to fully establish the effectiveness of RTL fault models in test quality evaluation. A faster and more thorough test synthesis could be developed using the fragment-wise reusability of test programs. Even if the observability of two test programs differs, the identical, data-independent code fragments (chunks) could be extracted from these test programs and reused. Also, fault equivalence techniques could be used to reduce the volume of simulations and the test generation time. Two faults are declared equivalent only if they lead to identical output values for every processor module. Such equivalent faults can be classified into a group, since a test program that detects one fault also detects its equivalent faults. Every set of equivalent faults could then be tested using a single fault simulation, which would further accelerate the test synthesis.
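The grouping of equivalent faults proposed above can be sketched as a partition of faults by their output responses. The code is only an illustration of the idea: the fault names, the response tuples (one output value per processor module), and the function name are hypothetical, and how such responses would be obtained is left open.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_equivalent_faults(
        responses: Dict[str, Tuple[int, ...]]) -> List[List[str]]:
    """Partition faults into classes with identical per-module outputs."""
    classes: Dict[Tuple[int, ...], List[str]] = defaultdict(list)
    for fault, response in responses.items():
        classes[response].append(fault)
    return list(classes.values())


if __name__ == "__main__":
    # Made-up responses: one output value per processor module.
    responses = {
        "f1": (0, 1, 7),
        "f2": (0, 1, 7),  # same outputs as f1 in every module
        "f3": (1, 1, 7),
        "f4": (1, 1, 7),
        "f5": (0, 0, 0),
    }
    classes = group_equivalent_faults(responses)
    representatives = [members[0] for members in classes]
    print(classes)
    print(f"{len(representatives)} simulations instead of {len(responses)}")
```

One representative per class is then simulated for each test program, so the number of fault simulations scales with the number of classes and not with the number of faults.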

References

1. Bayraktaroglu I, Hunt J, Watkins D (2006) Cache resident functional microprocessor testing: avoiding high speed io issues. In: 2006 IEEE international test conference. IEEE, pp 1–7

2. Carbine A, Feltham D (1997) Pentium (r) pro processor design for test and debug. In: Proceedings international test conference 1997. IEEE, pp 294–303

3. Chen C-IH (2003) Behavioral test generation/fault simulation. IEEE Potentials 22(1):27–32

4. Chen L, Dey S (2001) Software-based self-testing methodology for processor cores. IEEE Trans Comput Aided Des Integr Circuits Syst 20(3):369–380

5. Chen L, Ravi S, Raghunathan A, Dey S (2003) A scalable software-based self-test methodology for programmable processors. In: Proceedings of the 40th annual design automation conference. ACM, pp 548–553

6. Corno F, Cumani G, Reorda MS, Squillero G (2000) An rt-level fault model with high gate level correlation. In: Proc. high-level design validation and test workshop, IEEE international, pp 3–8

7. Corno F, Cumani G, Reorda MS, Squillero G (2002) Efficient machine-code test-program induction. In: Proceedings of the 2002 congress on evolutionary computation. CEC’02 (Cat. No. 02TH8600), vol 2. IEEE, pp 1486–1491

8. Corno F, Cumani G, Reorda MS, Squillero G (2002) Evolutionary test program induction for microprocessor design verification. In: Proceedings of the 11th Asian test symposium, 2002. (ATS’02). IEEE, pp 368–373

9. Corno F, Sanchez E, Reorda MS, Squillero G (2004) Automatic test program generation: a case study. IEEE Des Test Comput 21(2):102–109

10. Corno F, Sanchez E, Squillero G (2005) Evolving assembly programs: how games help microprocessor validation. IEEE Trans Evol Comput 9(6):695–706

11. De Carvalho M, Bernardi P, Sanchez E, Reorda MS, Ballan O (2014) Increasing the fault coverage of processor devices during the operational phase functional test. J Electron Test 30(3):317–328

12. Deb K (2001) Multi-objective optimization using evolutionary algorithms, vol 16. Wiley, New York

13. Ghosh I, Raghunathan A, Jha NK (1999) Hierarchical test generation and design for testability methods for aspps and asips. IEEE Trans Comput Aided Des Integr Circuits Syst 18(3):357–370

14. Gizopoulos D, Paschalis A, Zorian Y (2013) Embedded processor-based self-test, vol 28. Springer Science & Business Media, Berlin. ISBN: 978-1-4020-2801-4

15. Gizopoulos D, Psarakis M, Hatzimihail M, Maniatakos M, Paschalis A, Raghunathan A, Ravi S (2008) Systematic software-based self-test for pipelined processors. IEEE Trans Very Large Scale Integr VLSI Syst 16(11):1441–1453

16. Hudec J, Gramatova E (2015) An efficient functional test generation method for processors using genetic algorithms. J Electr Eng 66(4):185–193

17. Karputkin A, Raik J (2016) A synthesis-agnostic behavioral fault model for high gate-level fault coverage. In: Proc design, automation & test in Europe conference & exhibition (DATE), pp 1124–1127

18. Karunaratne M, Sagahayroon A, Prodhuturi S (2005) RTL fault modeling. In: Proc 48th midwest symposium on circuits and systems, pp 1717–1720

19. Kim K, Ha DS, Tront JG (1988) On using signature registers as pseudorandom pattern generators in built-in self-testing. IEEE Trans Comput Aided Des Integr Circuits Syst 7(8):919–928

20. Kranitis N, Merentitis A, Theodorou G, Paschalis A, Gizopoulos D (2008) Hybrid-sbst methodology for efficient testing of processor cores. IEEE Des Test Comput 25(1):64–75

21. Kranitis N, Paschalis A, Gizopoulos D, Xenoulis G (2007) Software-based self-testing of embedded processors. In: Processor design. Springer, pp 447–481

22. Kranitis N, Paschalis A, Gizopoulos D, Zorian Y (2003) Instruction-based self-testing of processor cores. J Electron Test 19(2):103–112

23. Leveugle R, Hadjiat K (2003) Multi-level fault injections in vhdl descriptions: alternative approaches and experiments. J Electron Test 19(5):559–575

24. Lu T-H, Chen C-H, Lee K-J (2011) Effective hybrid test program development for software-based self-testing of pipeline processor cores. IEEE Trans Very Large Scale Integr VLSI Syst 19(3):516–520

25. McCluskey EJ (1985) Built-in self-test techniques. IEEE Des Test Comput 2(2):21–28

26. Parvathala P, Maneparambil K, Lindsay W (2002) Frits-a microprocessor functional bist method. In: Proceedings. International test conference. IEEE, pp 590–598

27. Paschalis A, Gizopoulos D, Kranitis N, Psarakis M, Zorian Y (2001) Deterministic software-based self-testing of embedded processor cores. In: Proceedings design, automation and test in Europe. Conference and exhibition 2001. IEEE, pp 92–96

28. Prinetto P, Rebaudengo M, Reorda MS (1994) An automatic test pattern generator for large sequential circuits based on genetic algorithms. In: Proceedings., International test conference. IEEE, pp 240–249

29. Psarakis M, Gizopoulos D, Sanchez E, Reorda MS (2010) Microprocessor software-based self-testing. IEEE Des Test Comput 27(3):4–19

30. Riefert A, Cantoro R, Sauer M, Reorda MS, Becker B (2015) On the automatic generation of sbst test programs for in-field test. In: Proceedings of the 2015 design, automation & test in Europe conference & exhibition. EDA Consortium, pp 1186–1191

31. Riefert A, Cantoro R, Sauer M, Reorda MS, Becker B (2016) A flexible framework for the automatic generation of sbst programs. IEEE Trans Very Large Scale Integr VLSI Syst 24(10):3055–3066

32. Sanchez E, Reorda MS, Squillero G (2006) Efficient techniques for automatic verification-oriented test set optimization. Int J Parallel Prog 34(1):93–109


33. Shen J, Abraham JA (1998) Native mode functional test generation for processors with applications to self test and design validation. In: Proceedings international test conference 1998 (IEEE Cat. No. 98CH36270). IEEE, pp 990–999

34. Squillero G (2005) Microgp—an evolutionary assembly program generator. Genet Program Evolvable Mach 6(3):247–263

35. Stroud CE (2006) A designer’s guide to built-in self-test, vol 19. Springer Science & Business Media, Berlin

36. Suryasarman VM, Biswas S, Sahu A (2018) Automation of test program synthesis for processor post-silicon validation. J Electron Test 34(1):83–103

37. Tupuri RS, Abraham JA (1997) A novel functional test generation method for processors using commercial atpg. In: Proceedings international test conference 1997. IEEE, pp 743–752

38. Vasudevan M, Biswas S, Sahu A (2019) Rsbst: a rapid software-based self-test methodology for processor testing. In: 2019 32nd international conference on VLSI design and 2019 18th international conference on embedded systems (VLSID). IEEE, pp 112–117

39. Wagner KD, Chin CK, McCluskey EJ (1987) Pseudorandom testing. IEEE Trans Comput 3:332–343

40. Wen C-P, Wang L-C, Cheng K-T (2006) Simulation-based functional test generation for embedded processors. IEEE Trans Comput 55(11):1335–1343

41. Zhang Y, Rezine A, Eles P, Peng Z (2012) Automatic test program generation for out-of-order superscalar processors. In: 2012 IEEE 21st Asian test symposium. IEEE, pp 338–343

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Vasudevan Madampu Suryasarman is a PhD student in the Department of Computer Science and Engineering, IIT Guwahati. His research interests include VLSI testing and processor design.

Santosh Biswas received the B.E. degree from NIT Durgapur, India, in 2001. He completed his M.S. and Ph.D. at IIT Kharagpur, India, in 2004 and 2008, respectively. He works as an Associate Professor at the Department of Computer Science and Engineering, IIT Guwahati. His research interests include networking, VLSI testing, and discrete event systems.

Aryabartta Sahu received the Ph.D. from IIT Delhi, India, in 2009. He works as an Associate Professor at the Department of Computer Science and Engineering, IIT Guwahati. His research interests include multiprocessor scheduling and high performance computing.
