[IEEE IMTC 2001. Proceedings of the 18th IEEE Instrumentation and Measurement Technology Conference....

IEEE Instrumentation and Measurement Technology Conference, Budapest, Hungary, May 21-23, 2001

Using run-time reconfiguration for fault injection applications

Lorinc Antoni+, Régis Leveugle, Béla Fehér+

Laboratory of Techniques of Informatics and Microelectronics for Computer Architecture (TIMA)
Institut National Polytechnique de Grenoble
46 Avenue Félix Viallet, 38031 Grenoble Cedex, France
Phone: +33 4 76 57 46 15, Fax: +33 4 76 47 38 14

+ Department of Measurement and Information Systems
Budapest University of Technology and Economics
H-1111 Budapest, Műegyetem rkp. 9, Bldg. R, I. 113
Mail: Budapest, Hungary, H-1521
Phone: +36 1 463-2057, Fax: +36 1 463-4112

E-mails: {Lorinc.Antoni, Regis.Leveugle}@imag.fr, feher@mit.bme.hu

Abstract - In this paper, approaches using Run-Time Reconfiguration for fault injection in programmable systems are introduced. In FPGA-based systems, an important characteristic is the time needed to reconfigure the hardware, including re-synthesis, place and route, and finally bitstream downloading. Modifications can be carried out at low level, directly in the bitstream, so that re-synthesizing the description can be avoided when injecting new faults. Moreover, with some FPGA families (e.g. Virtex or AT6000), it is possible to reconfigure the hardware partially at run-time. Important time savings can be achieved when taking advantage of these features. These characteristics suit fault injection well, since an injection requires reconfiguring only a few resources of the device with a few modifications. The time gains vary with the number and kind of faults to be injected and with the device used for the experiments. The experiments show that this approach could be several orders of magnitude faster than an implementation using Compile-Time Reconfiguration.

Keywords - Fault injection, Partial Run-Time Reconfiguration, FPGA

I. INTRODUCTION

Fault injection techniques have long been recognized as necessary to validate the dependability of a system by analysing the behaviour of the devices when a fault occurs. Reconfigurable hardware, in turn, is appropriate for implementing and testing prototypes by synthesizing descriptions written in high-level languages such as VHDL. Another advantage of prototyping is the possibility of performing in-system emulation before any manufacturing.

Run-Time Reconfiguration (RTR) is a well-known technique that reconfigures hardware during execution of the application. There are different kinds of RTR, depending on whether all the circuitry is reconfigured or only part of it; in the partial case, the rest of the circuit may either remain in operation or not.

In this paper, an approach is proposed to apply RTR to realise fault injection in circuit prototypes. Experiments have been made to realise the injection of stuck-at faults, and new approaches have been developed to implement the injection of Single Event Upsets (SEUs).

0-7803-6646-8/01/$10.00 © 2001 IEEE

The paper is organized as follows. Section II contains a summary of fault injection techniques and a brief description of Run-Time Reconfiguration (RTR). The possible fault injection techniques are described in Section III. The approach developed is presented in Section IV. Section V contains the results of the experiments. Section VI concludes on the novelty of the validated approach and on the ongoing work.

II. PRELIMINARIES

A. Fault injection and hardware prototyping

As previously mentioned, fault injection techniques have long been proposed to evaluate the dependability of a given circuit or system implementation. Most of the approaches proposed up to now apply once the system or circuit is available. Such approaches include pin-level fault injection, memory corruption [1], heavy-ion injection, power supply disturbances [2], laser fault injection [3] or software fault injection [4].

More recently, several authors proposed to apply fault injection early in the design process. The main approach consists in injecting the faults in high-level models (most often VHDL models) of the circuit or system. For example, [5] describes the injection of faults in behavioural VHDL descriptions of microprocessor-based systems. [6], and then [7] or [8], consider the injection of different types of faults in the VHDL model of a circuit, at several abstraction levels and using various techniques based on the modification of the initial VHDL description. As in the case of [5], simulations are used to evaluate the impact of the faults on the circuit behavior. As mentioned in [9], the main drawback of simulation is the huge amount of time required to run the experiments when many faults have to be injected in a complex circuit.

To cope with the time limitations imposed by simulation, it has been proposed to take advantage of hardware prototyping, using an FPGA-based hardware emulator [10]. Another advantage of emulation is that it allows the designer to study the actual behavior of the circuit in the application environment, taking into account real-time interactions [11]. When an emulator is used, the initial VHDL description must of course be synthesizable. In some limited cases, the approaches developed for fault grading using emulators (e.g. [12], [13]) may be used to inject faults. However, such approaches are classically limited to stuck-at fault injection. In the special case of the SimExpress emulator, faults can also be injected in the circuit prototype by using built-in facilities of the emulator [14]. Nevertheless, such facilities do not exist today in the other commercially available emulators. Furthermore, such an injection is limited to the single stuck-at fault model. In most cases, modifications must therefore be introduced in the initial description, taking into account that the description must remain synthesizable and must satisfy a set of constraints related to the emulator hardware [9]. The modifications are therefore not easy, and it is often necessary to generate several modified descriptions, each of them allowing the injection of a given set of faults. In such a case, the hardware emulator generally has to be completely reconfigured several times, which is quite time-consuming and reduces the gain in execution time compared with simulation.

The main goal of the work presented hereafter is to reduce the overall reconfiguration time spent when running a fault injection campaign on a hardware emulator. This gain can be achieved in two ways: reducing the generation time of the new configuration (replacing synthesis, placement and routing by bitstream modifications) and reducing the configuration loading time (by means of a partial reconfiguration of the emulator).

B. Run-Time Reconfiguration

The technique of Run-Time Reconfiguration (RTR) consists in reconfiguring hardware resources during application execution. This approach has been proposed, for example, in [15], [16]. Most FPGA-based systems configure their FPGA at compile time; this is called Compile-Time Reconfiguration (CTR) [16] (Fig. 1a). In this case, once the application is configured into the FPGA, the configuration remains constant during the whole execution period. Of course, several executions of the application can follow. In contrast to CTR, RTR systems reconfigure the FPGA several times during the execution of the application [16] (Fig. 1b).

One can distinguish different kinds of RTR methods. When the whole FPGA memory is reloaded during a reconfiguration (Global RTR), with an arbitrary number of executions for each configuration, the execution of the application must be stopped during reconfiguration. In this case, the application is divided into distinct temporal phases, where each phase is implemented as a single system-wide configuration that occupies all system FPGA resources [16].

Fig. 1. The difference between Compile-Time Reconfiguration and Run-Time Reconfiguration: (a) one configuration, several executions; (b) several configurations, several executions during execution of the application.

It is also possible to reconfigure only subsets of the reconfigurable circuit. This approach is called partial reconfiguration, or Local RTR (Fig. 2). In this case, important time savings are made compared with a complete reconfiguration of the components: reconfiguration is quite a time-consuming operation, and with Local RTR not all the circuitry has to be reconfigured to carry out changes.

Fig. 2. Local Run-Time Reconfiguration.

III. POSSIBLE WAYS OF FAULT INJECTION

Fault injection can be carried out in several ways. As previously mentioned, the most widely used approach is the injection of faults by modifying the VHDL description. In a second step, the VHDL code is synthesized and a netlist is generated that can be downloaded into the FPGA in bitstream format. Finally, the application is executed in the reconfigurable hardware and an analysis can be made. In our approach, the analysis ends up with a functional model of the system behaviour that shows how the system evolves when the faults occur. This model is similar to a Markov chain and gives information about the error propagation paths, the erroneous states actually reached, and the probability of each propagation in the context of the workload used during the experiments.
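The paper does not detail how this model is built. As a rough sketch only, assuming each experiment produces a sequence of observed state labels (the labels `ok`, `err_bus`, `err_ctrl`, `detected` and `failure` and the function name are hypothetical), propagation probabilities could be estimated by counting observed transitions:

```python
from collections import Counter, defaultdict


def estimate_transitions(traces):
    """Estimate state-to-state transition probabilities from observed traces.

    Each trace is the list of state labels seen during one fault injection
    experiment. The result maps a source state to {destination: probability}.
    """
    counts = defaultdict(Counter)
    for trace in traces:
        # Count every consecutive (source, destination) pair in the trace.
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1

    model = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        model[src] = {dst: n / total for dst, n in dsts.items()}
    return model


# Hypothetical traces from four experiments under the same workload.
traces = [
    ["ok", "err_bus", "detected"],
    ["ok", "err_bus", "failure"],
    ["ok", "ok", "ok"],
    ["ok", "err_ctrl", "failure"],
]
model = estimate_transitions(traces)
for src, dsts in sorted(model.items()):
    print(src, dsts)
```

The estimated probabilities are only meaningful for the workload and fault list used to produce the traces, which matches the caveat in the text.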

In this approach, it is often necessary to generate several designs that contain different kinds of fault injectors and correspond to very similar structures. These descriptions are then synthesized and downloaded separately to analyse the behaviour of the circuit with the different faults (Fig. 3a).

Fig. 3. Fault injection using VHDL modifications with CTR (a), or by injecting faults directly into the prototype with Global (b) or Local (c) reconfiguration. Stages shown in the flows: Modified VHDL; Synthesis, Place & Route; Bitstream; Fault injection; Download (Global or Local reconfiguration); Execution; Analysis.

It is also possible to inject faults directly in the bitstream that was generated after the synthesis of the initial VHDL description. In this case, the description need not be re-synthesized after each fault injection step, as the faults are injected at low level directly in the bitstream. Moreover, RTR can be applied in this approach, either as Global RTR or as partial reconfiguration. This approach to fault injection is shown in Fig. 3b and 3c.

A. Handling the system state during reconfiguration

When partial reconfiguration is applied, the parts of the circuit that are not reconfigured can even stay in operation while the other parts are reconfigured. In the case of Global RTR, the system state must be handled: to keep the information computed before each configuration step, a means must be provided to support inter-configuration communication [16]. Because each phase occupies all reconfigurable resources, interfaces between configurations are fixed and all circuit modules can be designed under the same general context. This approach was used to develop a hardware accelerator for image processing [17]. Another application implementing Global RTR, the Run-Time Reconfiguration Artificial Neural Network (RRANN), is described in [18], where a host PC stores all configuration information for the FPGAs, monitors the progress of each stage of execution and supplies the appropriate configuration data to the FPGA board.

B. Dependence of the complexity of the analysis on the relevant failure modes of the system

The complexity of the analysis depends on the duration of each experiment (the number of test vectors applied to the inputs of the circuit for each injected fault) and on the number of observed signals (primary outputs or internal nodes). The increase in complexity is roughly linear in both cases. The complexity of the resulting model depends basically on two parameters:

- the level of precision of the analysis, which can be related to the number of signals observed during the experiments. The number of possible erroneous configurations, and therefore the maximum number of states in the model, increases exponentially with the number of signals observed. In practice, however, the complexity can be reduced without noticeable information loss by considering groups of signals and their correctness rather than their exact value. For example, different erroneous values on a data bus may not be differentiated and may therefore be associated with a single error state in the model. The complexity can also be controlled by carefully choosing the signals to observe;

- the number of final states in the model, i.e. the number of states associated with a condition ending the analysis of the experiment results. Such conditions are typically related to either major failures or error detections and are defined when specifying the fault injection campaign. In general, the more faults can be tolerated or detected in the circuit, the fewer states are identified in the generated model.
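To make the effect of signal grouping concrete, here is a small worked sketch. It assumes binary signals and takes the upper bound on distinguishable configurations as two to the power of the number of observed items; the 16-bit bus with four control signals is a made-up example, and the function names are illustrative.

```python
def max_value_patterns(num_signals):
    """Upper bound on distinct value patterns over binary signals."""
    return 2 ** num_signals


def max_correctness_patterns(num_groups):
    """Upper bound when each group is only classified as correct or erroneous."""
    return 2 ** num_groups


# Hypothetical observation set: a 16-bit data bus plus 4 control signals.
bus_width = 16
control_signals = 4

# Observing exact values of every signal.
exact = max_value_patterns(bus_width + control_signals)

# Treating the whole bus as one group and each control signal as its own group.
grouped = max_correctness_patterns(1 + control_signals)

print(f"exact values: up to {exact} patterns")
print(f"grouped correctness: up to {grouped} patterns")
```

Under these assumptions the bound drops from 1,048,576 patterns to 32, which illustrates why merging all erroneous bus values into a single error state keeps the model tractable.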

IV. THE DEVELOPED FAULT INJECTION TECHNIQUE

In our research, the approaches of Fig. 3b and 3c are used. In this case, faults are injected at low level directly in the bitstream, so the circuit need not be re-synthesized. Stuck-at-0, stuck-at-1 and inverting faults were injected in our application by modifying the bitstream. To do this, the JBits toolset [19] was used; it is described in Section IV-A. Faults were injected on the inputs of the CLBs of the FPGA by modifying the contents of the LUTs of the CLBs, as described in Section IV-B.

A. The JBits API

The JBits API is a Java-based tool set, or application programming interface (API), that allows designers to write information directly to a Xilinx FPGA to carry out whatever custom logic operations were designed for it [19], [20]. The JBits API permits the FPGA bitstream to be modified quickly, allowing for fast reconfiguration of the FPGA. With Virtex FPGAs, the JBits API can partially or fully reconfigure the internal logic of the hardware device.

The Virtex architecture allows this reconfiguration to be as extensive as necessary and still maintain timing information [19], [20]. The JBits API also makes it possible to integrate the operations of the FPGA with other system components, such as an embedded processor, a graphics coprocessor or any digital peripheral device.


JBits applications or applets can use the Java API for Boundary Scan, unveiled by Xilinx, for platform-independent device configurations deployed locally or remotely over the Internet [19], [20]. These applets can be control programs, consumer interface programs or updates. Previously, Java applets were only used to send software updates via the Internet. The JBits API now makes it possible to create Java logic applets that can be used to send new hardware updates via the Internet.

B. The realisation of fault injection

To implement the approaches in Fig. 3b and 3c, the bitstream can be read from a file, and the modified bitstream can be written to a file that is then downloaded to the board. Alternatively, the bitstream can be read directly from the device and the modified bitstream written directly back to the device.

In this experiment, stuck-at and inverting faults were injected on the inputs of the CLB LUTs. When injecting a fault, the user must give some parameters to define it: the CLB row and column numbers, the input of the CLB (F1, G4, etc.) and the kind of fault. An algorithm has been developed that allows the user to give only the name of a net in the high-level description instead of the CLB input. For this, the application needs a file, generated during synthesis, that contains the mapping between the nets in the design and the CLB inputs.
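The parameters listed above can be pictured as a small record plus a lookup from net name to CLB inputs. The sketch below is illustrative only: the paper does not give the format of the synthesis-generated file, so the whitespace-separated `net row col input` layout, the net names and all identifiers here are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class FaultKind(Enum):
    STUCK_AT_0 = "sa0"
    STUCK_AT_1 = "sa1"
    INVERT = "inv"


@dataclass(frozen=True)
class FaultSite:
    row: int        # CLB row number
    col: int        # CLB column number
    lut_input: str  # LUT input name such as "F1" or "G4"


@dataclass(frozen=True)
class Fault:
    site: FaultSite
    kind: FaultKind


def parse_net_map(text):
    """Parse lines of the assumed form 'net_name row col input'."""
    net_map = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, row, col, lut_input = line.split()
        # One net may drive several CLB inputs, so keep a list per net.
        net_map.setdefault(name, []).append(FaultSite(int(row), int(col), lut_input))
    return net_map


def faults_for_net(net_map, net_name, kind):
    """Expand a net-level fault into one fault per CLB input the net drives."""
    return [Fault(site, kind) for site in net_map[net_name]]


# Hypothetical mapping file contents.
sample = """
# net row col input
data_valid 3 7 F4
data_valid 4 7 G2
reset_n 1 1 F1
"""
net_map = parse_net_map(sample)
faults = faults_for_net(net_map, "data_valid", FaultKind.STUCK_AT_1)
print(faults)
```

Keeping a list of sites per net reflects the fact that a single high-level signal can fan out to several LUT inputs, each of which would need its own LUT modification.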

The faults are injected by modifying the contents of the LUT(s). Consider the example shown in Fig. 4, in which a stuck-at-1 fault is injected on the F4 input of the CLB. In this case, the output value stored for F4F3F2F1 = 1000 must be written to the entry for F4F3F2F1 = 0000, the value for 1001 to the entry for 0001, and so on.

Fig. 4. Injection of a stuck-at-1 fault on the F4 input of a CLB: (a) original LUT state and (b) modified LUT.

A similar rule for changing the LUT values can be derived for each kind of fault (stuck-at-0, stuck-at-1 or inverting fault) and for each input of the CLB, so every fault injection is carried out in the same way.
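The LUT rewriting rule can be sketched as an address remapping on a 16-entry truth table. This is a minimal model, not the JBits-based implementation used in the paper: it assumes the table is indexed by the address F4F3F2F1 with F4 as the most significant bit, and it ignores the device-specific bit ordering and polarity of a real bitstream. The function and fault-kind names are illustrative.

```python
def inject_lut_fault(lut, input_index, kind):
    """Return a new 16-entry LUT that behaves as if one input were faulty.

    lut         -- list of 16 output bits, indexed by address F4F3F2F1
    input_index -- 0 for F1 up to 3 for F4
    kind        -- "sa0", "sa1" or "inv"
    """
    if len(lut) != 16:
        raise ValueError("expected a 4-input LUT with 16 entries")
    if not 0 <= input_index <= 3:
        raise ValueError("input_index must be between 0 and 3")

    mask = 1 << input_index
    if kind == "sa1":
        remap = lambda addr: addr | mask    # the input is always seen as 1
    elif kind == "sa0":
        remap = lambda addr: addr & ~mask   # the input is always seen as 0
    elif kind == "inv":
        remap = lambda addr: addr ^ mask    # the input is seen inverted
    else:
        raise ValueError(f"unknown fault kind: {kind}")

    # Each entry takes the value the fault-free LUT holds at the address
    # that the faulty input would actually present.
    return [lut[remap(addr)] for addr in range(16)]


# Example: a 4-input AND (only address 1111 gives 1).
and4 = [0] * 15 + [1]

# Stuck-at-1 on F4: entry 0000 receives the old value of 1000, and so on.
faulty = inject_lut_fault(and4, input_index=3, kind="sa1")
print(faulty)
```

With the stuck-at-1 fault on F4, the 4-input AND degenerates into an AND of F3, F2 and F1, so the entries at addresses 0111 and 1111 are both 1. Applying the inverting fault twice on the same input restores the original table, which gives a quick sanity check of the remapping.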

V. RESULTS

The experiments were realised on an XS40 board [21] using JBits 1.1. This board contains an XC4010XL FPGA device [22], which does not support partial reconfiguration. Thus, the advantages of low-level fault injection (in the bitstream) were validated by the experiments, but no experiments were made on partial reconfiguration. Partial reconfiguration through JBits can only be performed on some boards with the Virtex family [23]. However, avoiding several synthesis runs already results in important time savings, as the synthesis of a description can take from minutes up to hours depending on the size of the design. The application developed injects a fault in the bitstream in only a few seconds.

Evaluations made on Virtex devices have also shown that configuration time savings of several orders of magnitude can be expected when using partial reconfiguration [24]. As an example, injecting a stuck-at fault implies modifying 8 frames, which requires 0.424 ms on an XCV400 device instead of 169.738 ms for a complete reconfiguration. The same injection in an XCV2000E device would require 0.816 ms, compared with 677.309 ms for a complete reconfiguration.
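As a worked check of these figures, the ratio between complete and partial reconfiguration times can be computed directly. The sketch reads the timings as 0.424 ms versus 169.738 ms for the XCV400 and 0.816 ms versus 677.309 ms for the XCV2000E; the decimal placement is an assumption where the transcript has lost it, and the function name is illustrative.

```python
import math


def speedup(full_ms, partial_ms):
    """Ratio of complete to partial reconfiguration time."""
    return full_ms / partial_ms


# (complete reconfiguration, 8-frame partial reconfiguration) in milliseconds.
cases = {
    "XCV400": (169.738, 0.424),
    "XCV2000E": (677.309, 0.816),
}

for device, (full_ms, partial_ms) in cases.items():
    ratio = speedup(full_ms, partial_ms)
    print(f"{device}: about {ratio:.0f}x faster, "
          f"{math.log10(ratio):.1f} orders of magnitude")
```

Under this reading, the download step alone is roughly 400 times faster on the XCV400 and roughly 830 times faster on the XCV2000E, that is, between two and three orders of magnitude per injected fault.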

With this approach, it is also possible to inject transient faults in the circuit, since the device can be reconfigured and the fault(s) removed before the end of the execution of the application. However, for the injection of transient faults, newer FPGA families such as Virtex are better suited, as these devices can be reconfigured partially while the rest of the device remains in operation.
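The idea of a transient fault as "inject, run, then remove" can be illustrated with a toy cycle-by-cycle model. This is a simulation sketch only and does not touch any device: it assumes a single 4-input LUT addressed by F4F3F2F1 and a fault window given in clock cycles, and all names are hypothetical.

```python
def run_with_transient(lut, faulty_lut, addresses, start, end):
    """Evaluate a LUT per cycle, using faulty_lut for cycles in [start, end).

    Switching back to lut after the window models the reconfiguration
    that removes the fault before the application finishes.
    """
    outputs = []
    for cycle, address in enumerate(addresses):
        active = faulty_lut if start <= cycle < end else lut
        outputs.append(active[address])
    return outputs


# Fault-free LUT: 4-input AND. Faulty version: stuck-at-1 on F4 (address bit 3).
and4 = [0] * 15 + [1]
and4_f4_sa1 = [and4[addr | 0b1000] for addr in range(16)]

# One input address per cycle; 0b0111 has F4 = 0 and F3 = F2 = F1 = 1.
addresses = [0b0111, 0b0111, 0b0111, 0b1111, 0b0111]

golden = run_with_transient(and4, and4, addresses, start=0, end=0)
observed = run_with_transient(and4, and4_f4_sa1, addresses, start=1, end=3)
mismatches = [c for c, (g, o) in enumerate(zip(golden, observed)) if g != o]
print(golden, observed, mismatches)
```

Here the outputs differ from the fault-free run only in cycles 1 and 2, the window during which the faulty LUT contents are active. On a device without partial reconfiguration, the application would have to be stopped at both ends of that window, which is why the text points to Virtex-class devices for this use.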

VI. CONCLUSION

In the experiments, the approach of injecting faults at low level in the bitstream, without re-synthesizing the description, has been validated. Stuck-at and inversion faults can be injected on the input of any CLB in the CLB matrix. This approach is novel in fault injection, as all the alternative methodologies are based on injecting faults at high level, generally in the VHDL descriptions. Experiments concerning the approach using partial RTR of the FPGA device are ongoing.

References

[1] T. Michel et al., "Taking advantage of ASICs to improve dependability with very low overheads," in European Design and Test Conference, February-March 1994, pp. 14-18.
[2] J. Karlsson et al., "Two fault injection techniques for test of fault handling mechanisms," in ITC, 1991, pp. 140-149.
[3] J. R. Samson, W. Moreno, and F. Falquez, "Validating fault tolerant designs using laser fault injection," in IEEE Symposium on Defect and Fault Tolerance in VLSI Systems, October 1997, pp. 175-183.
[4] G. Kanawati et al., "FERRARI: a tool for the validation of system dependability properties," in FTCS, 1992, pp. 336-344.
[5] T. A. Delong, B. W. Johnson, and J. A. Profeta III, "A fault injection technique for VHDL behavioral-level models," IEEE Design and Test of Computers, vol. 13, pp. 24-33, Winter 1996.
[6] E. Jenn, J. Arlat, M. Rimen, J. Ohlsson, and J. Karlsson, "Fault injection into VHDL models: the MEFISTO tool," in 24th International Symposium on Fault-Tolerant Computing, June 1994, pp. 66-75.
[7] S. Svensson and J. Karlsson, "Dependability evaluation of the THOR microprocessor using simulation-based fault injection," Tech. Rep. 295, Chalmers University of Technology, Department of Computer Engineering, November 1997.
[8] J. Boué, P. Pétillon, and Y. Crouzet, "MEFISTO-L: a VHDL-based fault injection tool for the experimental assessment of fault tolerance," in 28th FTCS, June 1998, pp. 168-173.
[9] R. Leveugle, "Towards modeling for dependability of complex integrated circuits," in 5th IEEE International On-Line Testing Workshop, July 1999, pp. 194-198.
[10] R. Leveugle, "Behavior modeling of faulty complex VLSIs: why and how?," in The Baltic Electronics Conference, October 1998, pp. 191-194.
[11] E. Bohl, W. Harter, and M. Trunzer, "Real time effect testing of processor faults," in 5th IEEE International On-Line Testing Workshop, July 1999, pp. 39-43.
[12] K.-T. Cheng et al., "Fault emulation: A new methodology for fault grading," IEEE Trans. on Computer Aided Design of Integrated Circuits and Systems, vol. 18, no. 10, pp. 1487-1495, October 1999.
[13] R. W. Wieler, Z. Zhang, and R. D. McLeod, "Emulating static faults using a Xilinx based emulator," in IEEE Symposium on FPGAs for Custom Computing Machines, April 1995, pp. 110-115.
[14] J. Abke, E. Bohl, and C. Henno, "Emulation based real time testing of automotive applications," in 4th IEEE International On-Line Testing Workshop, July 1998, pp. 28-31.
[15] P. Lysaght and J. Dunlop, "Dynamic reconfiguration of field programmable gate arrays," in Proc. of the 1993 Int. Workshop on Field Programmable Logic and Applications, September 1993, pp. 82-94.
[16] B. L. Hutchings and M. J. Wirthlin, "Implementation approaches for reconfigurable logic applications," in 5th International Workshop on Field Programmable Logic and Applications, August 1995, pp. 419-428.
[17] O. Vellacott, D. Ross, and M. Turner, "An FPGA-based hardware accelerator for image processing," in More FPGAs: International Workshop on Field-Programmable Logic and Applications, September 1993, pp. 299-306.
[18] J. G. Eldredge and B. L. Hutchings, "Run-time reconfiguration: A method for enhancing the functional density of SRAM-based FPGAs," Journal of VLSI Signal Processing, 1996, vol. 12, pp. 67-86.
[19] S. A. Guccione, D. Levi, and P. Sundararajan, "JBits: Java-based interface for reconfigurable computing," in 2nd Annual MAPLD, 1999.
[20] Xilinx, 2100 Logic Drive, San Jose, CA 95124-3450, JBits, February 1999.
[21] XESS Corporation, 2608 Sweetgum Drive, Apex, NC 27502, XS40, XSP Board V1.4, September 1999.
[22] Xilinx, XC4000E and XC4000X Series Field Programmable Gate Arrays, May 1999.
[23] Xilinx, Virtex Field Programmable Gate Arrays, March 2000.
[24] L. Antoni, R. Leveugle, and B. Fehér, "Using run-time reconfiguration for fault injection in hardware prototypes," in IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems, October 2000, pp. 405-413.
