Evolvable Hardware

Rashad S. Oreifej, Rawad N. Al-Haddad, Heng Tan and Ronald F. DeMara, University of Central Florida. Layered Approach To Intrinsic Evolvable Hardware Using Direct Bitstream Manipulation Of Virtex II Pro Devices.


Page 1: Evolvable Hardware

Rashad S. Oreifej, Rawad N. Al-Haddad, Heng Tan and Ronald F. DeMara

University of Central Florida


Layered Approach To Intrinsic Evolvable Hardware Using Direct Bitstream Manipulation Of Virtex II Pro Devices

Page 2: Evolvable Hardware

Evolvable Hardware

• Automated construction: develop electronic circuits by intelligent search

• Applications: design, optimization, or failure-recovery phases

[Figure: Evolvable Hardware shown as the overlap of Intelligent Search (Bayesian, Simulated Annealing, Genetic Algorithms, Nearest Neighbor) and Hardware Design (Amplifiers, Antennas, Filters, FPGAs)]

[Figure: an Individual (Chromosome) drawn as a string of GENEs]

• GAs frequently use binary strings to represent candidate solutions: the genotype

• Translation to the FPGA configuration bitstream maps genotype to phenotype

• FPGAs are used for evolving digital logic

Page 3: Evolvable Hardware

GAs and Evolution

Extrinsic Evolution (simulation in the loop): the Genetic Algorithm evaluates a software model; when done, build it

• Functional models abstract physical aspects of device

• Representation has to undergo placement and routing before implementation

Intrinsic Evolution (hardware in the loop): the Genetic Algorithm evaluates the physical device

• Fitness measured using physical device output

• Observes constraints imposed by internal structure

Genetic Algorithms:

• Implement guided trial-and-error search using principles of Darwinian evolution

• Iterative selection enforces “survival of the fittest”

• Genetic operators (mutation, crossover, …) can be used to refurbish designs
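The three ingredients named above (selection, crossover, mutation) can be sketched on plain bit strings. This is a minimal illustration, not the platform's code: the function names, the LSB-first byte packing, and the choice of single-point crossover are all assumptions made for the example, and the caller is expected to supply the random choices (crossover point, bit index, tournament entrants).

```c
#include <stddef.h>
#include <stdint.h>

/* Single-point crossover on bit strings packed LSB-first into bytes
 * (packing is an assumption of this sketch): bits [0, point) are taken
 * from parent a, bits [point, nbits) from parent b. */
void crossover(const uint8_t *a, const uint8_t *b, uint8_t *child,
               size_t nbits, size_t point)
{
    for (size_t i = 0; i < nbits; i++) {
        const uint8_t *src = (i < point) ? a : b;
        unsigned bit = (src[i / 8] >> (i % 8)) & 1u;
        child[i / 8] = (uint8_t)((child[i / 8] & ~(1u << (i % 8)))
                                 | (bit << (i % 8)));
    }
}

/* Mutation: flip one bit of the genotype. */
void mutate(uint8_t *genotype, size_t bit)
{
    genotype[bit / 8] ^= (uint8_t)(1u << (bit % 8));
}

/* Tournament selection: return the index of the fittest individual
 * among the k entrants chosen by the caller. */
size_t tournament(const int *fitness, const size_t *entrants, size_t k)
{
    size_t best = entrants[0];
    for (size_t i = 1; i < k; i++) {
        if (fitness[entrants[i]] > fitness[best]) {
            best = entrants[i];
        }
    }
    return best;
}
```

Keeping the random draws outside these helpers makes each operator deterministic and easy to check in isolation.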

Page 4: Evolvable Hardware

Related Work

• Conventional vs. Evolutionary Design [Miller, Alg. Evol Strat. 98]
  - A GA is presented that can evolve 100% functional adder and multiplier circuits
  - Explored the effect of the device physical constraints (Xilinx 6216 FPGA)
  - Emphasized EH feasibility over FPGA implementation concerns

• Fitness-based vs. Population-based Evolution [Keymeulen, IEEE Trans. Rel 02]
  - Design fault-insensitive electronic components using evolutionary techniques
  - Online and offline repair techniques via an intrinsic design tool (EHWPack)
  - Fine-grained CMOS Field Programmable Transistor Array (FPTA) architecture is used to evolve analog multiplier and digital XNOR

• Intrinsic EHW on Virtex Devices [Hollingworth, ICES00]
  - Evolution by partial reconfiguration of bitstream for changes from baseline circuit
  - Runtime configuration using Xilinx’s JBits Interface (Java in the loop)

• Recent General-purpose Frameworks Support Bitstream Reuse
  - Blodget et al [Blodget FPL03]: two-layer framework for Virtex II devices using Xilinx Partial Reconfiguration Toolkit (XPART), utilizing a soft processor core within the FPGA
  - Williams et al [Williams ERSA04]: Egret focuses on a full SOC solution using ICAP and an embedded Linux system on a Xilinx Virtex II chip, with bash shell scripts to perform operations such as obtaining partial bitstreams from remote servers and initiating reconfiguration
  - Kalte et al [Kalte PDPS05]: REPLICA (Relocation per online Configuration Alteration) filter uses the SelectMAP interface to perform bitstream manipulation to carry out the relocation during the regular download process

Page 5: Evolvable Hardware

UCF Intrinsic Evolution Platform

[Figure: platform block diagram. Blocks: Host PC (GA Engine, Chromosome Manipulator, MRRA, Bitstream File), JTAG, Virtex-II Pro (GNAT, Circuit, RAM). Interface labels: PerformCrossOver(), PerformMutation(), EvaluateInput(), GetLUTConfig(), SetLUTConfig(), DownloadDesign(), Send(), Receive(), Read(), Write(), SerialToParallelInput(), ParallelToSerialOutput(), Program(), ApplyInput(), ReadOutput()]

The developed platform utilizes the following hardware components on the FPGA chip:

1. JTAG (IEEE 1149.1) Port
• Half-duplex serial communication interface
• Connects to the General-purpose Native jtAg Tester (GNAT) from the FPGA side, and to the parallel port (IEEE 1284) on the host PC using a Xilinx Parallel Cable
• Conveys input/output data exchanged between the host PC and the FPGA

2. GNAT
• Implemented in the bitstream to reside on the reconfigurable area
• Connects to the BSCAN_VIRTEX2 block via the Test Data Input (TDI), Test Data Output (TDO), and Control signals, and to the targeted circuit via a straightforward read/write bus interface

3. Evolved Circuit
• Circuit to be evolved on the FPGA chip
• Circuit peripherals are connected to the read/write bus of the GNAT to receive input data and deliver output data

Page 6: Evolvable Hardware

UCF Platform Software Components

The developed platform consists of the following software components:

1. GA Engine
• C++ based console application implemented using an object-oriented architecture
• Implements a conventional population-based GA with runtime customizable parameters

2. Chromosome Manipulator
• C based GA operators library (yet executed using Visual Studio .NET)
• Provides a logical abstraction and hardware transparency of genetic operators to the GA Engine module

3. MRRA
• Partitions operations into Logic, Translation, and Reconfiguration layers with a standardized set of APIs
• FPGA configurations are manipulated at runtime using on-chip resources on Xilinx Virtex II Pro via PC (JTAG) or PowerPC (SelectMAP)

4. Bitstream File
• Pre-compiled baseline bitstream generated using the Xilinx CAD tools
• The platform manipulates this bitstream to carry out the physical mapping of the crossover or mutation

Page 7: Evolvable Hardware

Intrinsic Evolution Workflow

1. Initialization: obtain configuration from .bit
• Target Circuit HDL is compiled with Xilinx ISE 9.1i / 6.3 into the Bitstream File
• Requests (GA Engine, Chromosome Manipulator, MRRA, Bitstream File): Request Genotype Data Structure, Request LUT Configuration, Read Binary Content
• Replies: Config Binary Content, LUT Configuration, Genotype Data Structure

2. GA Operations: derive new individuals
• GA Engine and Chromosome Manipulator: Perform Crossover or Mutation, returning Offspring or Mutated Individual

3. Fitness Evaluation: performed in two phases
• FPGA Reconfiguration (GA Engine, Chromosome Manipulator, MRRA, Bitstream File, JTAG): Start Fitness Evaluation, Download Individual onto Device, Bitstream Updates, Updated Bitstream, Download Bitstream File onto FPGA (custom Xilinx scripts), Initiate Bitstream Download, Downloaded Successfully, Ready
• Pattern Evaluation (GA Engine, Chromosome Manipulator, MRRA, JTAG, GNAT, Circuit): Evaluate Output for One Input Pattern, Send Input Pattern, Buffer Pattern, Write Input Pattern Serially to JTAG, Shift Pattern Into GNAT Register, Buffer Pattern and Apply to the Circuit, Evaluated Output, Shift Pattern from GNAT Register, Send Output Pattern Serially, Read Output to Determine Fitness

START: module-based flow. Iterate: frame-based flow.

Page 8: Evolvable Hardware

Multilayer Runtime Reconfiguration Architecture (MRRA)

Framework for Dynamic Reconfiguration

[Figure: MRRA system. Labels: High-Level Applications, Mapping Engine, RAM, Microprocessor, SelectMAP / JTAG / ICAP, Reconfigurable Units, PC and/or PowerPC. Layers: Logic Layer, Translation Layer, Reconfiguration Layer]

• Three layers (Logic, Translation, and Reconfiguration) with well-defined interfaces promoting modularity and reuse within a set of high-level APIs to carry out the partial reconfiguration process with reduced manual intervention.

• Task-level Modularity: provide support at levels down to and including task-level granularity. A task is defined as an arbitrary function synthesized to a module that can be dynamically downloaded into the reconfigurable device: Module-based or Frame-based.

• Runtime Scenario Support: provide the ability to generate and reconfigure task bitstreams at runtime as well as at design time. At design time it may not be known which tasks will arrive, when they will arrive, or, in selected cases, what some of their specific properties will be.

• Encapsulation: control logic of each layer self-contained with fixed interface to other layers. If new control algorithms are added or the device platform is changed, the system can be ported more readily.

[Figure: JTAG / SelectMAP / ICAP reconfiguration interfaces. Inside the FPGA: PowerPC, PLB, PLB/OPB Bridge, OPB, Block RAM, UART, SRAM Controller, ICAP Controller, ICAP, Reconfigurable Modules. Outside: Host PC, PCI, External SRAM, JTAG, SelectMAP. Legend: On Chip Data Flow, Reconfiguration Data Flow, External Data Flow]

Page 9: Evolvable Hardware

MRRA Logic Control Flow

[Figure: MRRA logic control flow. Design Time Flow: Top-Level Design, Module-Level Design, Module Implementation, 1D Area Management, 2D Area Allocation, Final Assembly, using the Synthesis Tool, PlanAhead, FloorPlanner, and Scripts / ISE, with .ngc/.edf, .ucf, and .bit files passed between steps. Run Time Flow: User Logic, Logic Management, Configuration Data]

• One-Dimensional Area Management performed on full physical FPGA device by partitioning into 1-dimensional column-based rectangles, for fixed and reconfigurable modules arranged based on size and specified area constraints. Tools, such as PlanAhead, are accommodated.

• Bus Macros maintain correct connections between modules by spanning boundaries of these rectangular regions. Next, the modules are implemented and verified individually to create the Module Implementation and optimized by additional Two-Dimensional Area Allocation placements inside each module to minimize the partial reconfiguration bitstream size.

• After initial bitstream download, precompiled partial bitstreams can be monitored by algorithms in Logic Layer and updated directly to device for dynamic reconfiguration. New modification requests can be generated by the user logic in the form of hardware-independent representation depicted by the Runtime Flow. Although boundary of module is fixed, physical logic resources inside can be modified at runtime.

• The Module-based Flow is adopted from the standard Xilinx flow and integrated with selected area management ability and a direct bit management process, which we term the Frame-based Flow.

• The Module-based Flow is utilized at design time. Later, the translation engine supports autonomous reconfiguration without a GUI.

Page 10: Evolvable Hardware

Direct Bitstream Manipulation: Concept and Case Study

• Change a one-bit full adder to a one-bit full subtracter
• Both have three one-bit inputs and two one-bit outputs, and 2 LUTs with identical logic connections between LUTs and I/O signals
• The only difference is the truth table stored inside one LUT, which changes from 0xE8 to 0x8E
• Practical case study: dynamically reconfigurable SHA-1/MD5 Message Digest hashing algorithms:

STEP FUNCTION RESOURCE UTILIZATION AND POWER EVALUATION

| Metric | SHA-1 Baseline | SHA-1 Module Based | SHA-1 Frame Based | MD5 Baseline | MD5 Module Based | MD5 Frame Based | Combined Baseline | Combined Module Based | Combined Frame Based |
|---|---|---|---|---|---|---|---|---|---|
| Area (slice) | 192 | 65 (33.9%) | 32 | 881 | 168 (19.1%) | 32 | 1068 | 324 (30.3%) | 32 |
| Dynamic Power (mW) | 234.35 | 20.69 (8.8%) | N/A | 255.20 | 39.32 (15.4%) | N/A | 274.12 | 79.98 (29.18%) | N/A |
| Total Core Power (mW) | 496.85 | 283.19 (57.0%) | N/A | 517.70 | 301.82 (58.3%) | N/A | 536.62 | 342.28 (63.8%) | N/A |

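[Figure: Adder / Subtracter block with inputs X, Y, Cin / Bin and outputs Cout / Bout, S / D]

(a) 1 Bit Full Adder: LUT 0x96 produces S and LUT 0xE8 produces Cout from X, Y, Cin

| X | Y | Cin | Cout | S |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 0 | 1 |
| 0 | 1 | 0 | 0 | 1 |
| 0 | 1 | 1 | 1 | 0 |
| 1 | 0 | 0 | 0 | 1 |
| 1 | 0 | 1 | 1 | 0 |
| 1 | 1 | 0 | 1 | 0 |
| 1 | 1 | 1 | 1 | 1 |

(b) 1 Bit Full Subtracter: after the Logic Switch, LUT 0x96 produces D and LUT 0x8E produces Bout from X, Y, Bin

| X | Y | Bin | Bout | D |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 | 1 |
| 0 | 1 | 0 | 1 | 1 |
| 0 | 1 | 1 | 1 | 0 |
| 1 | 0 | 0 | 0 | 1 |
| 1 | 0 | 1 | 0 | 0 |
| 1 | 1 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 1 |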
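The LUT constants in this case study can be derived mechanically from the truth tables. The sketch below assumes the LUT address is formed as X (most significant), Y, then Cin/Bin (least significant), an ordering chosen here because it reproduces the slide's constants; the function names are illustrative.

```c
#include <stdint.h>

typedef int (*logic3_fn)(int x, int y, int c);

/* Pack a 3-input Boolean function into an 8-bit truth table.
 * Bit `addr` of the result holds f(x, y, c) with addr = x*4 + y*2 + c
 * (address ordering is an assumption of this sketch). */
uint8_t truth_table(logic3_fn f)
{
    uint8_t tt = 0;
    for (int addr = 0; addr < 8; addr++) {
        int x = (addr >> 2) & 1;
        int y = (addr >> 1) & 1;
        int c = addr & 1;
        if (f(x, y, c)) {
            tt |= (uint8_t)(1u << addr);
        }
    }
    return tt;
}

/* Sum and difference share the same function: a 3-input XOR. */
int sum_or_diff(int x, int y, int c) { return x ^ y ^ c; }

/* Carry out of X + Y + Cin: majority of the three inputs. */
int carry_out(int x, int y, int c) { return (x & y) | (x & c) | (y & c); }

/* Borrow out of X - Y - Bin. */
int borrow_out(int x, int y, int b) { return (!x & y) | (!x & b) | (y & b); }
```

Under that ordering, `truth_table(sum_or_diff)` yields 0x96, `truth_table(carry_out)` yields 0xE8, and `truth_table(borrow_out)` yields 0x8E, so converting the adder into the subtracter only requires rewriting the carry LUT.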

Page 11: Evolvable Hardware

Direct Bitstream Management: equations deduced to locate logic content in the V2Pro bitstream

• Each CLB has 4 slices in 2 columns and 2 rows, labeled XiYj. X is the slice column number, 0 <= i <= 2N-1, counted from the left, where N = number of CLB columns. Y is the slice row number, 0 <= j <= 2K-1, counted from bottom to top, where K = number of CLB rows, e.g. XC2VP7: N=34, K=40

• Each configuration frame has a unique 32-bit address composed of a Block Address (BA), a Major Address (MJA), a Minor Address (MNA), and a byte number offset

• Let X denote the slice column and let overhead count the GCLK, leftmost IOB, and IOI columns (e.g. 3):

$$\mathrm{MJA} = \lfloor X/2 \rfloor + \mathrm{overhead}, \qquad \mathrm{MNA} = (X \bmod 2) + 1, \qquad \mathrm{Offset} = (2K - 1 - Y) \times 5 + 12$$

• Full configuration file: organized consecutively by frame without labeling:

$$\mathrm{Offset} = (F_{GCLK} + F_{LIOB} + F_{LIOI}) \times 424 + \lfloor X/2 \rfloor \times 22 \times 424 + ((X \bmod 2) + 1) \times 424 + (79 - Y) \times 5 + 12 + \mathrm{overhead}$$

• In the 5 bytes of a slice, the first 16 bits hold the G-LUT truth table (left to right, MSB to LSB) and the last 16 bits hold the F-LUT truth table (reverse order, LSB to MSB). Each LUT has at most 4 inputs and up to 16 truth table entries; when fewer than 4 inputs are used, the remaining unused entries are filled with duplicates of the used entries:

A Direct Bitstream Mapping Example

| Truth Table | Column Number | Row Number | MJA | MNA | Offset | Bitstream | Frame Address |
|---|---|---|---|---|---|---|---|
| 0x96 | 39 | 24 | 22 | 2 | 287 | 0x6969 | 0x2C0400 |
| 0x8E | 39 | 25 | 22 | 2 | 282 | 0x7171 | 0x2C0400 |
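The mapping example on this slide can be checked numerically. The sketch below is built on assumptions rather than the authors' code: it reads the address relations as MJA = floor(X/2) + overhead, MNA = (X mod 2) + 1, and Offset = (2K - 1 - Y) × 5 + 12, packs the frame address as MJA shifted left by 17 bits and MNA by 9 bits, and encodes a 3-input truth table by duplicating it to 16 bits and inverting it. Each choice was made because it reproduces the slide's numbers (MJA 22, MNA 2, offsets 287 and 282, frame address 0x2C0400, bitstream words 0x6969 and 0x7171); all names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    unsigned mja;            /* Major Address of the frame */
    unsigned mna;            /* Minor Address of the frame */
    unsigned offset;         /* byte offset inside the frame */
    uint32_t frame_address;  /* packed MJA/MNA (field positions assumed) */
} SliceLocation;

/* Locate the LUT bytes of slice X<x>Y<y>.
 * clb_rows is K; overhead counts the GCLK, left IOB and IOI columns. */
SliceLocation locate_slice(unsigned x, unsigned y,
                           unsigned clb_rows, unsigned overhead)
{
    SliceLocation loc;
    loc.mja = x / 2 + overhead;
    loc.mna = x % 2 + 1;
    loc.offset = (2 * clb_rows - 1 - y) * 5 + 12;
    loc.frame_address = ((uint32_t)loc.mja << 17) | ((uint32_t)loc.mna << 9);
    return loc;
}

/* Encode an 8-entry truth table as a 16-bit bitstream word:
 * duplicate the used entries, then invert (inversion inferred from
 * the slide's 0x96 -> 0x6969 and 0x8E -> 0x7171 pairs). */
uint16_t lut_bitstream_word(uint8_t tt8)
{
    uint16_t dup = (uint16_t)((tt8 << 8) | tt8);
    return (uint16_t)~dup;
}
```

For instance, `locate_slice(39, 24, 40, 3)` gives MJA 22, MNA 2, and offset 287, and `lut_bitstream_word(0x96)` gives 0x6969.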

Page 12: Evolvable Hardware

Experimental Setup

Target Circuit: 4-bit x 4-bit unsigned adder

Experiments

1) Unseeded Design: random .bit population

2) Seeded Design: single functional .bit individual

3) Repair: single randomly injected stuck-at-fault (0 or 1)

GA Parameters

| Parameter | Range Evaluated | Value Selected |
|---|---|---|
| Number of LUTs for design | 8 | 8 |
| Number of LUTs for repair | 8-13 | 13 |
| Population Size | 5-20 | 10 |
| Mutation Rate | 5%-90% | 50% |
| Crossover Rate | 30%-90% | 60% |
| Tournament Size | 1-8 | 6 |
| Elitism Size | 1-2 | 1 |

Page 13: Evolvable Hardware

Stuck-at Zero and One Fault Modeling

• Virtex II Pro chip has 16-bit LUTs with four input lines and one output

• If the Least Significant Bit (LSB) input pin is stuck-at zero, only the memory locations of the pattern (XXX0)₂ will be accessible

• This behavior can be achieved by copying the content of the memory locations of the pattern (XXX0)₂ into (XXX1)₂ and overwriting their old values

• The same concept can be extended where the location of the stuck input line (0,1,2,3) determines the stride (1,2,4,8) between the memory locations to copy, and the value of the stuck at condition (zero or one) determines the direction of the copy operation (left or right)

[Figure: 4-input LUT drawn as 16 memory locations addressed 0000 to 1111 (LUT address, MSB to LSB), with arrows showing the Stuck-at Zero Copy Direction and the Stuck-at One Copy Direction]
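The copy rule described on this slide can be expressed as a remapping of LUT addresses. In this sketch (an illustration, with a hypothetical function name and the assumption that bit `addr` of the 16-bit table holds the output for input address `addr`), every address is redirected to the location the faulty LUT would actually read.

```c
#include <stdint.h>

/* Model a stuck-at fault on one LUT input by rewriting the truth table.
 * pin   : index of the stuck input line (0 = LSB ... 3 = MSB)
 * value : 0 for stuck-at-zero, 1 for stuck-at-one
 * For each address, the faulty LUT sees the same address with the stuck
 * bit forced to `value`, so the entry at that forced address is copied
 * over the original entry. The copy stride is 1 << pin. */
uint16_t inject_stuck_at(uint16_t table, unsigned pin, unsigned value)
{
    uint16_t faulty = 0;
    for (unsigned addr = 0; addr < 16; addr++) {
        unsigned seen = (addr & ~(1u << pin)) | ((value & 1u) << pin);
        if ((table >> seen) & 1u) {
            faulty |= (uint16_t)(1u << addr);
        }
    }
    return faulty;
}
```

With `pin = 0` and `value = 0`, entries at the (XXX0)₂ locations overwrite their (XXX1)₂ neighbours, which matches the LSB stuck-at-zero case above.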

Page 14: Evolvable Hardware

Performance Metrics

• F_max: the numerical measure of fitness for the best individual of the final generation, e.g. 2^8 input patterns (two 4-bit inputs) × 5 output bits = 1280

• F_final: the arithmetic mean of the fitness of all individuals in the final generation of the run

• G: the total number of generations in the run

• CM_total: the time elapsed to perform the GA crossover and mutation during the entire run

• F_evaluation: the time elapsed to apply the input patterns and read back the corresponding outputs for all the fitness evaluations during the entire run

• C: the average time taken by a single genetic crossover for a certain GA run

• M: the average time taken by a single genetic mutation for a certain GA run
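The maximum fitness of 1280 follows from exhaustively testing the 4-bit adder. The sketch below assumes fitness is counted as the number of correct output bits over all 256 input patterns, one reading consistent with the 2^8 × 5 = 1280 figure; the slides do not spell out the scoring rule, and the function names are illustrative. In the intrinsic setting the `circuit` callback would stand in for applying a pattern to the device and reading the output back.

```c
#include <stdint.h>

/* A candidate circuit: maps two 4-bit operands to a 5-bit result. */
typedef uint8_t (*adder_fn)(uint8_t a, uint8_t b);

/* Count correct output bits over all 16 x 16 input patterns
 * (bit-level scoring is an assumption of this sketch). */
int adder_fitness(adder_fn circuit)
{
    int fitness = 0;
    for (unsigned a = 0; a < 16; a++) {
        for (unsigned b = 0; b < 16; b++) {
            uint8_t expected = (uint8_t)(a + b);
            uint8_t actual = circuit((uint8_t)a, (uint8_t)b);
            uint8_t diff = (uint8_t)((actual ^ expected) & 0x1F);
            for (int bit = 0; bit < 5; bit++) {
                if (!((diff >> bit) & 1)) {
                    fitness++;
                }
            }
        }
    }
    return fitness;
}

/* Reference behaviours used to exercise the scoring function. */
uint8_t ideal_adder(uint8_t a, uint8_t b) { return (uint8_t)(a + b); }
uint8_t no_carry_adder(uint8_t a, uint8_t b) { return (uint8_t)((a + b) & 0x0F); }
```

A fully functional adder scores 1280, while an adder whose carry-out bit is always zero loses one point for each of the 120 input pairs whose sum reaches 16.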

Page 15: Evolvable Hardware

Experimental Results Summary

[Results table not recovered. Callouts on the slide: "Fastest convergence"; "Repair must overcome failed resource limitation"; "Microsecond order"]

Page 16: Evolvable Hardware

Circuit Evolution: Fitness vs. Time

[Figure: Max Fitness vs. Generations for Run 1 through Run 5, in three panels. Unseeded Design: fitness axis 700 to 1300, generations 2 to 206. Seeded Design: fitness axis 1170 to 1290, generations 2 to 101. Repair: Stuck-at Fault: fitness axis 1150 to 1290, generations 2 to 240]

Page 17: Evolvable Hardware

Results Summary

• An intrinsic evolution platform is developed for genetic operators and fitness assessment using API layers which directly manipulate the configuration bitstream on Xilinx Virtex II Pro devices

• Three experiments were conducted: unseeded design, seeded design, and repair

• Full design/repair is achievable using this platform with an average time of 0.4 microseconds to perform the genetic mutation, 0.7 microseconds to perform the genetic crossover, and 5.6 milliseconds for one input pattern intrinsic evaluation

• Performance advantage of three orders of magnitude over JBits (millisecond order) and more than seven orders of magnitude over the Xilinx design tool driven flow (multiple seconds) for realizing intrinsic genetic operators on a Virtex II Pro device

• Current work is on utilizing partial reconfiguration to reduce JTAG transfer time and porting to the Virtex-4 platform

Page 18: Evolvable Hardware

References

[1] S. Vigander, "Evolutionary Fault Repair in Space Applications," in Dept. of Computer & Information Science, vol. Masters Thesis. Trondheim: Norwegian University of Science and Technology (NTNU), 2001.

[2] J. F. Miller, P. Thomson, and T. Fogarty., "Designing Electronic Circuits Using Evolutionary Algorithms. Arithmetic Circuits: A Case Study," in Algorithms and Evolution Strategy in Engineering and Computer Science, D. Quagliarella, J. Periaux, C. Poloni, and G. Winter, Eds. Chichester, England, 1998, pp. 105-131.

[3] D. Keymeulen, R. S. Zebulum, Y. Jin, and A. Stoica, "Fault-Tolerant Evolvable Hardware Using Field-Programmable Transistor Arrays," IEEE Transactions On Reliability, vol. 49, issue 3, September 2000.

[4] R. S. Oreifej, C. A. Sharma, and R. F. DeMara, "Expediting GA-Based Evolution Using Group Testing Techniques for Reconfigurable Hardware," in proc. International Conference on Reconfigurable Computing and FPGAs (Reconfig'06), San Luis Potosi, Mexico, September 20-22, 2006, pp. 106-113.

[5] R. F. DeMara and K. Zhang., "Autonomous FPGA Fault Handling through Competitive Runtime Reconfiguration," in Proc. of the NASA/DoD Conference on Evolvable Hardware (EH'05), Washington D.C., U.S.A, June 29-01, 2005.

[6] G. Hollingworth, S. Smith, and A. Tyrrell, "The intrinsic evolution of virtex devices through internet reconfigurable logic," in Proc. of the Third International Conference on Evolvable System, April 2000.

[7] H. Tan and R. F. DeMara, "A Device-Controlled Dynamic Configuration Framework Supporting Heterogeneous Resource Management," in proc. of the International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA'05), Las Vegas, Nevada, U.S.A, June 27-30, 2005.

[8] D. Wallace, "Using the JTAG Interface as a General-Purpose Communication Port," www.xilinx.com/publications/xcellonline/xcell_53/xc_pdf/xc_jtag53.pdf, 2005.

[9] Xilinx, "Parallel Cable IV Connects Faster and Better," Xcell Journal, Spring 2002.

[10] Xilinx, "Using a Microprocessor to Configure Xilinx FPGAs via Slave Serial or SelectMAP Mode," v1.4, November 2003.

[11] B. Blodget, P. James-Roxby, E. Keller, S. McMillan, and P. Sundararajan, "A Self-Reconfiguring Platform," in Proceedings of Field-Programmable Logic and Applications 2003, Lisbon, Portugal, September 1-3, 2003.

[12] J. Williams and N. Bergmann, "Embedded Linux as a Platform for Dynamically Self-Reconfiguring Systems-On-Chip," in Proceedings of Engineering of Reconfigurable Systems and Algorithms (ERSA 2004), Las Vegas, Nevada, USA, 21-24 June, 2004.

[13] H. Kalte, G. Lee, M. Porrmann, and U. Ruckert, “REPLICA: A Bitstream Manipulation Filter for Module Relocation in Partial Reconfigurable Systems”, in Proceedings of 19th IEEE International Proceedings of Parallel and Distributed Processing Symposium, Denver, Colorado, USA, April 04-08, 2005.

Page 19: Evolvable Hardware

MRRA Translation Process

[Flowchart: Idle; Receive LUT List (LUT 1, LUT 2, ..., LUT N); Read LUT i; Check Location Flag: Update Location Information, Clear Location Modification Request, Set Area Translation Indicator; Check Logic Flag: Update Logic Information, Clear Logic Modification Request, Set Logic Translation Indicator; i++; then, according to the Area Translation Indicator and Logic Translation Indicator, Call Location Translation Engine and Call Logic Translation Engine]

typedef struct tagLUTinfo {
    /* LUT status information */
    unsigned short source[4];         /* The 4 inputs of the LUT */
    unsigned char  iTruthTable[2];    /* Current output truth table */
    unsigned short cRow;              /* Current row position */
    unsigned short cColumn;           /* Current column position */
    unsigned short destination[255];  /* The outputs of the LUT */
    char           GorFLUT;           /* 0 = G_LUT; 1 = F_LUT */

    /* Modification request */
    unsigned short cFutureRow;        /* Future row */
    unsigned short cFutureColumn;     /* Future column */
    char           SwitchLUTFlag;     /* 0 = no change; 1 = move position between G and F LUT */
    unsigned char  iFutureTable[2];   /* Future truth table */
    char           PositionFlag;      /* 0 = no change; 1 = update */
    char           TableFlag;         /* 0 = no change; 1 = update */
} LUTInfo;

LUT Representation at Logic Layer
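The flag-driven translation pass can be sketched with a trimmed copy of this record. Everything below is illustrative: `LUTInfoSketch` keeps only the fields the pass touches, `TranslationIndicators` and `process_lut_list` are hypothetical names, and the calls into the location and logic translation engines are left out.

```c
#include <stddef.h>
#include <string.h>

/* Reduced copy of the LUT record: only fields used by this sketch. */
typedef struct {
    unsigned char  iTruthTable[2];   /* current truth table */
    unsigned short cRow, cColumn;    /* current position */
    unsigned short cFutureRow, cFutureColumn;
    unsigned char  iFutureTable[2];  /* requested truth table */
    char PositionFlag;               /* 1 = position update requested */
    char TableFlag;                  /* 1 = truth table update requested */
} LUTInfoSketch;

/* Which translation engines need to run after the pass. */
typedef struct {
    int area;   /* at least one LUT changed position */
    int logic;  /* at least one LUT changed truth table */
} TranslationIndicators;

/* Walk the LUT list once: apply each pending request, clear its flag,
 * and record which kind of translation is now required. */
TranslationIndicators process_lut_list(LUTInfoSketch *luts, size_t n)
{
    TranslationIndicators ind = {0, 0};
    for (size_t i = 0; i < n; i++) {
        if (luts[i].PositionFlag) {
            luts[i].cRow = luts[i].cFutureRow;
            luts[i].cColumn = luts[i].cFutureColumn;
            luts[i].PositionFlag = 0;
            ind.area = 1;
        }
        if (luts[i].TableFlag) {
            memcpy(luts[i].iTruthTable, luts[i].iFutureTable,
                   sizeof luts[i].iTruthTable);
            luts[i].TableFlag = 0;
            ind.logic = 1;
        }
    }
    return ind;
}
```

A mutation that turns the adder's carry LUT into a borrow LUT would set `iFutureTable` to the new table and `TableFlag` to 1; after the pass only the logic indicator is raised, so only the logic translation step would need to touch the bitstream.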

Page 20: Evolvable Hardware

Current Work: Direct Bitstream Management

TABLE VI TRANSLATION ENGINE EVALUATION

| Test Circuit | Original Equivalent Gates | Occupied Slices | Bit File Size (bytes) | Generation Time (V6.2) | Generation Time (V9.1) |
|---|---|---|---|---|---|
| MRRA | N/A | 1472 | 548K | 4m 31s | N/A |
| C17 | 6 | 8 | 66K | 67s | 101s |
| C1908 | 603 | 41 | 89K | 69s | 109s |
| B02 | 28 | 11 | 66K | 66s | 107s |
| B03 | 160 | 45 | 75K | 70s | 163s |
| MD5 | 2496 | 168 | 120K | 71s | 111s |

OPTIMIZATION RESULTS

| Module Name | # of LUTs | # of FFs | # of Block Multipliers | # of Slices | Original File Size (bytes) | Original Max. Delay (ns) | Optimized File Size (bytes) | Optimized Max. Delay (ns) | Area Saving |
|---|---|---|---|---|---|---|---|---|---|
| 4 LUTs | 4 | 16 | 0 | 12 | 64K | 1.371 | 55K | 1.347 | 14% |
| Shifter | 1 | 24 | 0 | 13 | 87K | 1.377 | 63K | 1.367 | 28% |
| Block Multiplier | 8 | 25 | 1 | 17 | 88K | 1.346 | 66K | 1.346 | 25% |
| LUT Multiplier | 22 | 22 | 0 | 22 | 96K | 1.367 | 68K | 1.346 | 29% |
| SECDED | 93 | 41 | 0 | 74 | 89K | 1.355 | 60K | 1.355 | 33% |
| MD5 | 292 | 128 | 0 | 168 | 120K | 1.380 | 84K | 1.322 | 30% |