1 July 2005
Autonomous FPGA Fault Handling through Competitive Runtime Reconfiguration
Ronald F. DeMara and Kening Zhang
University of Central Florida
Fault-Handling Techniques for SRAM-based FPGAs
[Figure: taxonomy of Reprogrammable Device Failure and fault-handling approaches]
Characteristics
• Duration: Transient (SEU); Permanent (SEL, Oxide Breakdown, Electron Migration, LPD)
• Target: Device Configuration; Processing Datapath
Methods, compared by Detection, Isolation, Diagnosis, and Recovery
• Repetitive Readback [Wells00]
• TMR (conventional spatial redundancy)
• BIST: STARS [Abramovici01]
• CED [McCluskey04]
• Evolutionary: Sussex [Vigander01]; CRR [DeMara05]
Entries appearing in the figure: Bitwise Comparison, Invert Bit Value, Ignore Discrepancy, Majority Vote, Supplementary Testbench, Cartesian Intersection, Worst-case Clock Period Dilation, Replicate in Spare Resource, Duplex Output Comparison, Fast Run-time Location, Select Spare Resource, Population-based GA using Extrinsic Fitness Evaluation, Evolutionary Algorithm using Intrinsic Fitness Evaluation, (not addressed), unnecessary.
Previous Work: Detection Characteristics of FPGA Fault-Handling Schemes
Approach | Fault Handling Method | Fault Detection: Latency | Fault Detection: Distinguish Transients | Resource Coverage: Logic | Resource Coverage: Interconnect | Resource Coverage: Comparator | Fault Isolation: Granularity
TMR | Spatial voting | Negligible | No | Yes | Yes | No | Voting element
[Vigander01] | Spatial voting & offline evolutionary regeneration | Negligible | No | Yes | No | No | Voting element
[Lohn, Larchev, DeMara03] | Offline evolutionary regeneration | Negligible | No | Yes | Yes | No | Unnecessary
[Lach98] | Static-capability tile reconfiguration | Relies on independent fault detection mechanism (all remaining columns)
STARS [Abramovici01] | Roving Test Area | Up to 8.5M erroneous outputs | Test pattern transients | Yes | Yes | No | LUT function
[Keymeulen, Stoica, Zebulum00] | Population-based fault insensitive design | Design-time prevention emphasis | No | Yes | Yes | No | Not addressed at runtime
Competitive Runtime Reconfiguration (CRR) | Competing configurations with temporal voting and online regeneration | Negligible | Transients are attenuated automatically | Yes | Yes | Yes | Unnecessary, but can isolate functional components
Strategies:
1) Evolve redundancy into design before anticipated failure
2) Redesign after detection of failure
3) Combine desirable aspects of both strategies 1) + 2) …
CRR Arrangement in SRAM FPGA
Configurations in Population
• $C = C^L \cup C^R$
• $C^L$ = subset of left-half configurations
• $C^R$ = subset of right-half configurations
• $|C^L| = |C^R| = |C|/2$
Discrepancy Operator
• Baseline Discrepancy Operator $\otimes$ is a dyadic operator with binary output:
  $$C_i^L \otimes C_i^R = \begin{cases} 0 & \text{if } Z(C_i^L) = Z(C_i^R) \\ 1 & \text{otherwise} \end{cases}$$
• $Z(C_i)$ is the FPGA data throughput output of configuration $C_i$
• Each half-configuration evaluates $\otimes$ using an embedded checker (XNOR gate) within each individual
• Any fault in the checker lowers that individual’s fitness, so that the individual is no longer preferred and eventually undergoes repair
Reconfiguration Algorithm
[Figure: SRAM-based FPGA holding an L Half-Configuration and an R Half-Configuration, each containing Function Logic (L, R) and a Discrepancy Check (L, R). Other labels: INPUT DATA, DATA OUTPUT, FEEDBACK, CONTROL, CONFIGURATION BIT STREAM, OFF-CHIP EEPROM.]
(NOTE: a non-volatile memory is already required to boot any SRAM FPGA from cold start ... this is not an additional chip)
RS: $C^L_{i,j}\ \mathrm{EOR}\ C^R_{i,j}$ (Hamming Distance)
WTA: ^ applied to $C^L_{i,j}\ \mathrm{EOR}\ C^R_{i,j}$ (Equivalence)
Terminology and Characteristics
In the conditions below, $\otimes$ denotes the discrepancy operator applied to the left and right half-configurations paired at evaluation K.
• Pristine Pool $C_P$: any $C_i \in C$ is a member of $C_P$ at generation G if and only if
  $$\sum_{K=1}^{G} \left( C_K^L \otimes C_K^R \right) = 0$$
• Suspect Pool $C_S$: any $C_i \in C$ is a member of $C_S$ at generation G if and only if at least one of
  $$C_K^L \otimes C_K^R \ne 0 \quad (1 \le K \le G)$$
• Under Repair Pool $C_U$: any $C_i \in C$ is a member of $C_U$ at generation G if and only if
  $$\sum_{K=1}^{G} \left( C_K^L \otimes C_K^R \right) \ [\text{relation lost in extraction}]\ 1$$
• Refurbished Pool $C_R$: after a Genetic Operator is applied, the newly generated individual is a member of $C_R$ at generation G if and only if
  $$\sum_{K=1}^{G} \left( C_K^L \otimes C_K^R \right) \ [\text{relation lost in extraction}]\ 0$$
$E_D$ is the Discrepancy Count of $C_i$ and $E_C$ is the Correctness Count of $C_i$
Length of Evaluation Fitness Window: $E_W = W = E_D + E_C$
Fitness Metric: $f(C_i) = E_C / E_W$
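The discrepancy operator and fitness metric defined above can be sketched in a few lines of Python. This is only an illustration, assuming each half-configuration's output Z is available as a plain value per evaluation; the function names `discrepancy` and `fitness` are hypothetical and not taken from the authors' implementation.

```python
def discrepancy(z_left, z_right):
    """Baseline discrepancy operator: 0 if the two outputs agree, 1 otherwise."""
    return 0 if z_left == z_right else 1


def fitness(outputs_left, outputs_right):
    """Fitness f = E_C / E_W over one evaluation window.

    E_D counts discrepancies, E_C counts agreements, and the window
    length is E_W = E_D + E_C (assumed here to be the number of samples).
    """
    e_w = len(outputs_left)
    e_d = sum(discrepancy(l, r) for l, r in zip(outputs_left, outputs_right))
    e_c = e_w - e_d
    return e_c / e_w
```

For instance, four evaluations with a single disagreement give `fitness([1, 2, 3, 4], [1, 2, 0, 4]) == 0.75`.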
Sketch of CRR Approach
Premise: Recovery Complexity << Design Complexity
1. Initialization
 • Population P of functionally-identical yet physically-distinct configurations
 • Partition P into sub-populations that use supersets of physically-distinct resources, e.g. size |P|/2 to designate physical FPGA left-half or right-half resource utilization
2. Fitness Assessment: fitness assessment via pairwise discrepancy (temporal voting vs. spatial voting)
 • Discrepancy Operator is some function of bitwise agreement between each half’s output
 • Four Fitness States defined for Configurations as {CP, CS, CU, CR} with transitions, respectively: Pristine → Suspect → Under Repair → Refurbished
 • Fitness Evaluation Window W determines comparison interval
3. Regeneration
 • Genetic Operators used to recover from fault based on Reintroduction Rate
 • Operators are applied only once, then offspring are returned to “service” without concern for increasing fitness
State Transitions during lifetime of i-th Half-Configuration
Configuration Health States
[Figure: state-transition diagram. States: primordial, pristine, suspect, under repair, refurbished. Transitions are numbered 1 to 11 and are labeled by the outcome of each pairing (L = R or L ≠ R) and by comparisons of the fitness $f_i$ against the Operational Threshold $f_{OT}$ and the Repair Threshold $f_{RT}$ (for example $f_i < f_{RT}$). Exits from under repair are labeled partial repair and complete repair. Regions of the diagram are labeled COMPETITION and EVOLUTION.]
Procedural Flow under Competitive Runtime Reconfiguration
[Flowchart: PRIMARY LOOP]
1. Initialization: population partitioned into functionally-identical yet physically-distinct half-configurations
2. Selection: choose FPGA configuration(s) labeled L and R
3. Detection: apply functional inputs to compute FPGA outputs using L, R
4. Decision: are L, R results discrepancy free (L = R)?
5. Fitness Adjustment: update fitness of only L and R based on detection results
6. Decision: is either L's or R's fitness < Repair Threshold? If YES, invoke Genetic Operators only once and only on L or R
7. Adjust Controls: detection mode, overlap interval, ...
• Integrates all fault handling stages using EC strategy
 – Detects faults by the occurrence of discrepancy
 – Isolates faults by accumulation of discrepancies
 – Failure-specific refurbishment using Genetic Operators: Intra-Module-Crossover, Inter-Module-Crossover, Intra-Module-Mutation
• Realize online device refurbishment
 – Refurbished online without additional function or resource test vectors
 – Repair during the normal data throughput process
Selection Process
[Flowchart]
• Any Pristine individuals? If YES, Select* one Pristine individual as L half-configuration.
• Otherwise, any Suspect individuals? If YES, Select** one Suspect individual as L half-configuration; if NO, Select*** one Refurbished individual as L half-configuration.
• Choose random number X on [0..1] and test X > Re-introduction rate (X > R). One branch performs "Select one Operational (Pristine*, Suspect**, or Refurbished***) individual as R half-configuration"; the other performs "Select*** one Under Repair individual as R half-configuration".
• goto Detection process
* = selection that favors inventory rotation
** = selection based on fitness ranking that favors correctness
*** = selection based on fitness ranking that favors correctness with optional second-order metric such as routing delay (to automatically evolve better throughput performance at no additional cost)
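The selection flowchart can be sketched as follows. This is one reading of it rather than the authors' code: it assumes that X > re-introduction rate leads to choosing an Operational individual for R (so Under Repair individuals are re-introduced with probability equal to the rate), that "inventory rotation" means taking the head of a queue, and that individuals are hypothetical `(name, fitness)` tuples grouped into pools by state.

```python
import random


def pick_operational(pools):
    """Pick one operational individual: Pristine first, then Suspect, then Refurbished."""
    if pools["pristine"]:
        # Assumption: inventory rotation is modeled as taking the head of the list.
        return pools["pristine"][0]
    for state in ("suspect", "refurbished"):
        if pools[state]:
            # Fitness ranking that favors correctness: highest fitness wins.
            return max(pools[state], key=lambda ind: ind[1])
    raise ValueError("no operational individual available")


def select_pair(left_pools, right_pools, reintroduction_rate, rng=random):
    """Return (L, R) half-configurations for the next detection step."""
    left = pick_operational(left_pools)
    x = rng.random()  # random number X on [0, 1)
    if x > reintroduction_rate or not right_pools["under_repair"]:
        right = pick_operational(right_pools)
    else:
        # Assumed branch: re-introduce the best individual still under repair.
        right = max(right_pools["under_repair"], key=lambda ind: ind[1])
    return left, right
```

With the re-introduction rate r = 0.1 reported later in the slides, this reading pairs an operational L with an Under Repair R on roughly one selection in ten.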
Fitness Adjustment Procedure
[Flowchart]
• Discrepancy? If NO, increase L's & R's fitness according to the fitness up-adjustment process; if YES, decrease L's & R's fitness according to the fitness down-adjustment process.
• After a decrease: Is the individual Pristine? If YES, mark individual as Suspect. Is its fitness < Repair Threshold ($f_{L,R} < f_{RT}$)? If YES, mark individual as Under Repair and invoke Genetic Operators only once and only on L or R.
• After an increase: Is individual Under Repair? Is its fitness > Operational Threshold ($f_{L,R} > f_{OT}$)? If YES, mark individual as Refurbished.
• adjust controls & goto Selection process
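A compact sketch of the fitness adjustment procedure as a state machine is given below. It is an interpretation of the flowchart, not the authors' implementation: the fixed `step` used for up- and down-adjustment, the dictionary layout of an individual, and the optional `repair` callback standing in for the Genetic Operators are all assumptions made for illustration.

```python
def adjust_fitness(ind, discrepancy, f_rt, f_ot, step=0.01, repair=None):
    """Update one individual's fitness and health state after a detection step.

    ind         -- dict with keys "state" and "fitness" (hypothetical layout)
    discrepancy -- True if the L and R outputs disagreed
    f_rt, f_ot  -- Repair Threshold and Operational Threshold
    """
    if discrepancy:
        ind["fitness"] = max(0.0, ind["fitness"] - step)   # down-adjustment
        if ind["state"] == "pristine":
            ind["state"] = "suspect"
        if ind["fitness"] < f_rt:
            ind["state"] = "under_repair"
            if repair is not None:
                repair(ind)   # Genetic Operators applied once per invocation
    else:
        ind["fitness"] = min(1.0, ind["fitness"] + step)   # up-adjustment
        if ind["state"] == "under_repair" and ind["fitness"] > f_ot:
            ind["state"] = "refurbished"
    return ind
```

A pristine individual therefore becomes suspect on its first discrepancy, enters repair only once its fitness drops below the Repair Threshold, and is marked refurbished after enough agreeing evaluations lift it above the Operational Threshold.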
Fitness Evaluation Window
• Fitness Evaluation Window: W denotes the number of iterations used to evaluate fitness before the state of an individual is determined
• Determination of W for 3x3 multiplier
 – 6 input pins articulating $2^6 = 64$ possible inputs
 – W should be selected so that all possible inputs appear
 – More formally: let rand(X) return some $x_i \in X$ at random; seek W such that $\bigcup_{i=1}^{W} [\,\mathrm{rand}(X)\,] = X$ with high probability
W Determination
$$x_K = K^D - \sum_{m=1}^{K-1} \binom{K}{m}\, x_m, \qquad P_K = \frac{x_K}{K^D}$$
• $x_K$ = distinct orderings of K inputs showing in D trials
• if D is constant, can calculate $P_{K>1}$ successively
• probability $P_K$ of K inputs showing after D trials is the ratio $x_K / K^D$
When K=64: [plot not recovered]
Impact of Fault on Viable Individuals
• Existence of Positive Test Vector
 – Input $I_p$ comprises an articulating test iff $C_i(I_p) \otimes C_{j \ne i}(I_p) = 1$, where $\otimes$ is the discrepancy operator
 – So if a discrepancy is detected, then some $I_p$ exists which manifests the fault
• Minimal Case when $I_p$ is Unique
 – $I_p$ is unique if the fault is observable under exactly one input pattern
• Probability Mass Function for Encountering Minimal Case $I_p$
 – Consider W = 600, yielding 99.5% coverage for a module with input space X = 64
 – The number of input occurrences i, $0 \le i \le 600$, that randomly encounter $I_p$ to identify the fault is governed by the probability mass function:
 $$\text{p.m.f.}(i) = \binom{W}{i} \left(\frac{n}{X}\right)^{i} \left(\frac{X-n}{X}\right)^{W-i}, \quad \text{where } W = 600,\ X = 64,\ n = 1,\ 0 \le i \le 600$$
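The coverage figure and the probability mass function can be checked numerically. The sketch below assumes equiprobable, independent inputs; it uses inclusion-exclusion for the probability that all X inputs appear within W draws, and a binomial p.m.f. for the number of times one specific articulating input is encountered. The function names are illustrative.

```python
from math import comb


def coverage_probability(num_inputs, window):
    """P(every one of num_inputs equiprobable inputs appears in `window` random draws)."""
    return sum(
        (-1) ** k * comb(num_inputs, k) * (1 - k / num_inputs) ** window
        for k in range(num_inputs + 1)
    )


def articulation_pmf(i, window=600, num_inputs=64, n=1):
    """Binomial probability that exactly i of `window` draws hit one of n articulating inputs."""
    p = n / num_inputs
    return comb(window, i) * p ** i * (1 - p) ** (window - i)


if __name__ == "__main__":
    print(coverage_probability(64, 600))                       # close to 0.995
    print(sum(i * articulation_pmf(i) for i in range(601)))    # mean = 600/64 = 9.375
```

Under these assumptions `coverage_probability(64, 600)` evaluates to roughly 0.995, in line with the 99.5% coverage quoted for W = 600, and the mean of the p.m.f. is 600/64 = 9.375 encounters per window.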
Integer Multiplier Case Study
• 3bit x 3bit unsigned multiplier automated design:
 – Building blocks: Half-Adder: 18 templates created; Full-Adder: 24 templates; Parallel-And: 1 template created
 – Randomly select templates for instantiation in modules
• GA operators: External-Module-Crossover, Internal-Module-Crossover, Internal-Module-Mutation
• GA parameters: Population size: 20 individuals; Crossover rate: 5%; Mutation rate: up to 80% per bit
• Experimental Evaluation: Xilinx Virtex II Pro on Avnet PCI board
Experiments Demonstrate …
• Objective fitness function replaced by the Consensus-based Evaluation Approach and Relative Fitness
• Elimination of additional test vectors
• Temporal Assessment process
Template Fault Coverage
[Figures: Half-Adder Template A; Half-Adder Template B]
Template A
– Gate3 is an AND gate
– Will lose correctness if a Stuck-At-Zero fault occurs in the second input line of Gate3
Template B
– Gate3 is a NOT gate and only uses the first input line
– Will work correctly even if the second input line is stuck at Zero or One
Regeneration Performance
[Figure: Half-Adder Template A]
Parameters:
• Difference (vs. Hamming Distance)
• Evaluation Window, Ew = 600
• Suspect Threshold: S = 1 - 6/600 = 99%
• Repair Threshold: R = 1 - 4/600 = 99.3%
• Re-introduction rate: r = 0.1
Repairs evolved in-situ, in real-time, without additional test vectors, while allowing device to remain partially online.
Discrepancy Mirror
• Mechanism for Checking-the-Checker (“golden element” problem)
• Makes checker part of configuration that competes for correctness [DeMara PDPTA-05]
Discrepancy Mirror Circuit: Fault Coverage
Component | Fault Scenario 1 | Fault Scenario 2 | Fault Scenario 3 | Fault Scenario 4 | Fault-Free
Function Output A | Fault | Correct | Correct | Correct | Correct
Function Output B | Correct | Fault | Correct | Correct | Correct
XNOR A | Disagree (0) | Disagree (0) | Fault: Disagree (0) | Agree (1) | Agree (1)
XNOR B | Disagree (0) | Disagree (0) | Agree (1) | Fault: Disagree (0) | Agree (1)
Buffer A | 0 | 0 | High-Z | 0 | 1
Buffer B | 0 | 0 | 0 | High-Z | 1
Match Output | 0 | 0 | 0 | 0 | 1
Influence of LUT utilization
[Plots: Perpetually Articulating Inputs with Equiprobable Distribution; Intermittently Articulating Inputs with Equiprobable Distribution]
• expected number of pairings grows sub-linearly in number of resources
• utilization below 20% or above 80% implicates (or exonerates) a smaller sub-set of resources
• at 50% utilization, the expected numbers of pairings for 1,000, 10,000, and 100,000 resources are 11.1, 14.9, and 17.6
• at 90% utilization, a mean value of 258 pairings is required to isolate the faulty resource
Future Work: Development Board to Self-Contained FPGA
[Figure: roadmap across Year 1, Year 2, Year 3, from the Avnet FPGA Development Board (PCI Interface, Virtex-II Pro FPGA, Off Chip RAM, Control hosted on PC; signals: Input Data, Output, Bit file) to CRR on a Chip (Xilinx Virtex-II Pro) (Control via on-chip Power PC, Configurations in On Chip RAM Blocks, Functional CLBs, ICAP; signals: Reconfig, Config Data, Bit file, Data, Output, Request, Device Fault)]
• Qualitative Analysis of CRR model
 – Number of iterations and completeness of regeneration repair
 – Percentage of time the device remains online despite physical resource fault (availability)
• Hardware Resource Management
 – Optimization of hardware profile for Xilinx Virtex II Pro
• Field Testing on SRAM-based FPGA in a Cubesat mission
Backup Slides
• On following pages …
Isolation: Block Duelling
• Algorithm based on group testing methods
• Successive intersection to assess health of resources
 – Each configuration k has a binary Usage Matrix $U_k[i,j]$, $1 \le i \le m$ and $1 \le j \le n$
 – m, n are the number of rows and columns of resources in the device
 – Elements $U_k[i,j] = 1$ are resources used in k
 – History Matrix $H[i,j]$, $1 \le i \le m$ and $1 \le j \le n$, initially all zero, exists in which entries represent the fitness of resources (i, j); information regarding the fitness of resources over time is stored
 – A discrepant output will lead to an increase in the value of $H[i,j]$ wherever $U_k[i,j] = 1$, $\forall k \in S$
 – All elements of H corresponding to resources used by a discrepant configuration will be incremented by one
 – At any point in time, $H[i,j]$ will be a record of the outcomes of competitions
 – Successive intersections are performed until |S| = 1
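The History Matrix update described above can be sketched with plain nested lists. This is a simplified illustration, assuming every loaded configuration in the example is discrepant and that the most implicated resource is simply the entry with the largest count; the helper names are hypothetical.

```python
def update_history(history, usage):
    """Increment H[i][j] for every resource (i, j) used by a discrepant configuration."""
    for i, row in enumerate(usage):
        for j, used in enumerate(row):
            if used:
                history[i][j] += 1


def most_implicated(history):
    """Return 1-based (row, column) positions holding the largest discrepancy count."""
    peak = max(max(row) for row in history)
    return [
        (i + 1, j + 1)
        for i, row in enumerate(history)
        for j, value in enumerate(row)
        if value == peak
    ]


if __name__ == "__main__":
    h = [[0] * 3 for _ in range(3)]
    u1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # resources used by discrepant configuration 1
    u2 = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]   # resources used by discrepant configuration 2
    update_history(h, u1)
    update_history(h, u2)
    print(most_implicated(h))   # the only resource shared by both: [(2, 2)]
```

Each additional discrepant configuration narrows the set of maximal entries, which mirrors the successive intersection that continues until a single suspect resource remains.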
Dueling Example

H[i,j] @ t = 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

U1
0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 0 0 1 0 1 0 0
0 0 0 1 0 0 0 0 0 0
0 0 1 0 0 1 1 0 0 0
0 0 0 0 1 0 0 0 0 0
0 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

U2
0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 1 1 0 0 0
0 0 1 1 0 0 1 0 0 0
0 0 1 0 1 0 0 0 0 0
0 0 1 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

H[i,j] @ t = 2
0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 1 0 0 0
0 0 2 1 0 0 1 0 0 0
0 0 1 0 1 1 0 1 0 0
0 0 1 1 0 1 0 0 0 0
0 0 1 0 0 1 1 0 0 0
0 0 0 0 1 0 0 0 0 0
0 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0

• H[i,j] changes after C1 and C2 are loaded
• U1 and U2 are corresponding Usage Matrices
• (3,3) is identified as the faulty resource
[Plot: Fitness of configuration k, plotted against k]
Isolation of a single faulty individual with 1-out-of-64 impact
• Outliers are identified after W iterations have elapsed
• E.V. = (1/64)*600 = 9.375 discrepancies from minimum impact faulty individual
• Isolated individual’s f differs from the average DV by 33 after 1 or more observation intervals of length W
Isolation of a single faulty L individual with 10-out-of-64 impact
• Compare with 1-out-of-64 fault impact
 – E.V. of (10/64)*600 = 93.75 discrepancies for faulty configuration
 – One isolation will be complete approx. once in every 93.75/5 = 19 Observation Intervals
 – Fault Isolation demonstrated in 100% of cases
Isolation of 8 faulty individuals L4&R4 with 1-out-of-64 impact
• Expected isolations do not occur approximately 40% of the time
 – Average discrepancy value of the population is higher
 – Outlier isolation difficult
 – Multiple faulty individuals, discrepancies scattered
Online Dueling Evaluation
• Objective
 – Isolate faults by successive intersection between sets of FPGA resources used by configurations
 – Analyze complexity of Isolation process
• Variables
 – Total resources available: measured in number of LUTs
 – Number of Competing Configurations: number of initial “Seed” designs in CRR process
 – Degree of Articulation: some inputs may not manifest faults, even if faulty resource used by individual
 – Resource Utilization Factor: percentage of FPGA resources required by target application/design
 – Number of Iterations for Isolation: measure of complexity and time involved in isolating fault
Isolation of Faulty Resource at the FPGA resource (LUT) granularity
• 50625 LUTs comparable to LUTs on a Xilinx Virtex II Pro FPGA
 – Xilinx Virtex II Pro has approximately 67 columns, 78 rows; 4 slices per CLB; 2 LUTs per slice
Isolation of Faulty Resource: Effect of Articulation
• No direct, uniform relation between % Articulation and Number of Isolations!
• Performance best when Articulation (%) = 50% ± 10%
 – Each successive intersection provides maximal information
 – Greatest number of resources are intersected out of “suspect” pool
For further info … EH Website: http://cal.ucf.edu
Fast Reconfiguration for Autonomously Reprogrammable Logic
• Motivation
 – Dynamic reconfiguration required by application
 – Exploit architectural & performance improvements fully
 – Reconfiguration delay – a major performance barrier
• Previous Work
• Methodology
 – Multilayer Runtime Reconfiguration Architecture (MRRA)
 – Spatial Management
• Prototype Development
 – Loosely-Coupled solution
 – Timing Analysis
 – System-On-Chip solution
Reconfiguration Demand during CRR
For a complete repair
– Approximately 2,000 generations ($G_{CR}$) may be required
– For each generation, the number of evaluations ($O_{new}$) may be up to 100
– Yielding the Cumulative Number of Reconfigurations (CNR) up to
  $$CNR = G_{CR} \times O_{new} = 200{,}000$$
– For each reconfiguration task
  $$L_i = T_{TA}(i) + T_{RT}(i) + T_{ED}(i)$$
– Therefore, the total delay
  $$L_{tot} = \sum_{i=1}^{CNR} L_i$$
Even if reconfiguration delay alone is assumed to be on the order of tens or hundreds of milliseconds, $L_{tot} \ge 5.5$ hours
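The arithmetic behind the delay estimate can be reproduced directly. The sketch assumes a uniform per-reconfiguration delay of 100 ms, which is the value at which 2,000 generations of up to 100 evaluations each yield roughly 5.5 hours; the function names are illustrative.

```python
def cumulative_reconfigurations(generations, evaluations_per_generation):
    """Upper bound on the Cumulative Number of Reconfigurations (CNR)."""
    return generations * evaluations_per_generation


def total_delay_hours(cnr, delay_per_reconfiguration_s):
    """Total delay in hours, assuming every reconfiguration takes the same time."""
    return cnr * delay_per_reconfiguration_s / 3600.0


if __name__ == "__main__":
    cnr = cumulative_reconfigurations(2000, 100)
    print(cnr)                            # 200000
    print(total_delay_hours(cnr, 0.1))    # about 5.56 hours at 100 ms each
```

At 10 ms per reconfiguration the same count would take about 33 minutes, so the multi-hour figure corresponds to the upper end of the stated delay range.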
Previous Work - Tool Level
Approach | FPGA Supported | On-chip System | Bit Stream Reuse | System Coupling Degree | Potential Limitations
Moraes, Mesquita, Palma, Moller | Virtex XCV300 devices | No | N | Loose | Lack of Area Relocation Capability
Raghavan, Sutton | Xilinx Virtex devices | No | N | Loose | Cumbersome CAD flow
Blodget, McMillan | Virtex II devices | Partial | Y | Medium | Limited hardware speed and capacity. Lack of information for bit stream reuse
Previous Work - Algorithm Level
Approach | Method | Partial Reconfig | Spatial Relocation | Temporal Parallelism | Area shape | Run-Time | Potential Limitations
Hauck, Li, Schwabe | Bit file compression | N/A | No | N/A | N/A | No | Full reconfiguration required
Shirazi, Luk, Cheung | Identifying common components | Yes | No | Yes | N/A | No | Design time work required
Mak, Young | Dynamic Partitioning | Yes | No | Yes | N/A | Yes | Only desirable for large designs
Ganesan, Vemuri | Pipelining | Yes | No | Yes | N/A | Yes | Limited pipeline depth
Compton, Li, Knol, Hauck | Relocation and Defragmentation with new FPGA architecture | Yes | Yes | No | Row-based | Yes | Special FPGA architecture required
Diessel, Middendorf, Schmeck, Schmidt | Task Remapped and Relocated | Yes | Yes | No | Rectangle | Yes | Overhead for remapping calculations
Herbert, Christoph, Macro | Partitioning and 2D Hashing | Yes | Yes | Yes | Rectangle | Yes | Rigid task modeling assumptions
Legend: compression method; temporal method; spatial method
Multilayer Runtime Reconfiguration Architecture (MRRA)
[Figure labels: Fault-Repair Genetic Algorithm, Reconfiguration Engine, Microprocessor, System Bus, Virtex-II Pro FPGA, RAM, Control System]
• Develop MRRA fast reconfiguration paradigm for the CRR approach
• Validate with real hardware platform along with detailed performance analysis
• First general-purpose framework for a wide variety of applications requiring dynamic reconfiguration
• Extend existing theories on reconfiguration
Loosely Coupled Solution
[Figure: Avnet FPGA Development Board with PCI Interface, Virtex-II Pro FPGA, and Off Chip RAM; Control hosted on PC; signals: Input Data, FPGA Output, Bit file]
• The entire system operates on a 32-bit basis
• The Virtex-II Pro is mounted on a development board which can then be interfaced with a WorkStation running Xilinx EDK and ISE.
Result Assessment
• Establish full functional framework of both prototypes
• Communication overhead, throughput and overall speed-up analysis
 – Communication overhead for the SOC solution is decreased to the microsecond or sub-microsecond order, vs. the millisecond order of the Loosely Coupled solution
 – Up to 5-fold speedup is expected compared to the Loosely Coupled solution
• Translation Complexity Analysis
 – The quantity of information that needs to be translated to generate the reconfiguration bitstream
 – Simplification from file level to bit level is expected
• Storage Complexity Analysis
 – The memory space required for the run-time algorithms
 – Decreased memory requirement is expected due to the translation complexity improvement
Project Milestones
[Figure: two timelines, each running from Nov 2004 (Start) through Jul 2007 in two-month steps]
HW Schedule milestones:
• API & SEC circuit
• Scripts
• GA representation for prototype 1
• Performance analysis for prototype 1 on 3*3 multiplier
• OS for the SOC
• ICAP circuit
• Reconfig. Performance Report
• SOC Final Report
• Performance analysis for prototype 1 on Quad Decoder circuit
SW Schedule milestones:
• Evaluate CRR Parameters in 3x3 multiplier design
• Design GUI of 3X3 multiplier
• Build VHDL module and incorporate into the hardware prototype
• FPGA-resident CRR
• Implement the SEC circuit design
• Optimized Parameters for layered comb/seq designs
• Regen. Final Report
• Performance analysis for prototype 1 on Quad Decoder circuit
Publications
Accepted Manuscripts
1. R. F. DeMara and K. Zhang, “Autonomous FPGA Fault Handling through Competitive Runtime Reconfiguration,” to appear in NASA/DoD Conference on Evolvable Hardware (EH’05), Washington D.C., U.S.A., June 29 – July 1, 2005.
2. H. Tan and R. F. DeMara, “A Device-Controlled Dynamic Configuration Framework Supporting Heterogeneous Resource Management,” to appear in International Conference on Engineering of Reconfigurable Systems and Algorithms (ERSA’05), Las Vegas, Nevada, U.S.A., June 27 – 30, 2005.
3. R. F. DeMara and C. A. Sharma, “Self-Checking Fault Detection using Discrepancy Mirrors,” to appear in International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA’05), Las Vegas, Nevada, U.S.A., June 27 – 30, 2005.
Submitted Manuscripts
1. R. F. DeMara and K. Zhang, “Populational Fault Tolerance Analysis Under CRR Approach,” submitted to International Conference on Evolvable Systems (ICES’05), Barcelona, Sept. 12 – 14, 2005.
2. R. F. DeMara and C. A. Sharma, “FPGA Fault Isolation and Refurbishment using Iterative Pairing,” submitted to IFIP VLSI-SOC Conference, Perth, W. Australia, October 17 – 19, 2005.
Manuscripts In-preparation
1. R. F. DeMara and K. Zhang, “Autonomous Fault Occlusion through Competitive Runtime Reconfiguration,” submission planned to IEEE Transactions on Evolutionary Computation.
2. R. F. DeMara and C. A. Sharma, “Multilayer Dynamic Reconfiguration Supporting Heterogeneous FPGA Resource Management,” submission planned to IEEE Design and Test of Computers.
Field Testing
Implementation of CRR on-board SRAM-based FPGA in a Cubesat mission
EHW Environments
• Evolvable Hardware (EHW) Environments enable experimental methods to research soft computing intelligent search techniques
• EHW operates by repetitive reprogramming of real-world physical devices using an iterative refinement process
Two modes of Evolvable Hardware:
• Extrinsic Evolution: Genetic Algorithm with software model, simulation in the loop; device “design-time” refinement (Done? Build it)
• Intrinsic Evolution: Genetic Algorithm with hardware in the loop; device “run-time” refinement; new approach to Autonomous Repair of failed devices
Application
• Stardust Satellite: >100 FPGAs onboard; hostile environment: radiation, thermal stress
• How to achieve reliability to avoid mission failure?
Genetic Algorithms (GAs)
Mechanism coarsely modeled after neo-Darwinism (natural selection + genetics)
[Figure: GA cycle. Labels: start, population of candidate solutions, evaluate fitness of individuals, Fitness function, Goal reached, selection of parents, parents, crossover, mutation, offspring, replacement]
Genetic Mechanisms
• Guided trial-and-error search techniques using principles of Darwinian evolution
 – iterative selection, “survival of the fittest”
 – genetic operators: mutation, crossover, …
 – implementor must define fitness function
• GAs frequently use strings of 1s and 0s to represent candidate solutions
 – if 100101 is better than 010001 it will have more chance to breed and influence future population
• GAs “cast a net” over entire solution space to find regions of high fitness
• Can invoke Elitism Operator (E=1, E=2 …)
 – guarantees monotonically non-decreasing fitness of best individual over all generations
GA Success Stories
Commercial Applications:
• Nextel: frequency allocation for cellular phone networks; $15M predicted savings in NY market
• Pratt & Whitney: turbine engine design; engineer: 8 weeks; GA: 2 days w/3x improvement
• International Truck: production scheduling improved by 90% in 5 plants
• NASA: superior Jupiter trajectory optimization, antennas, FPGAs
• Koza: 25 instances showing human-competitive performance such as analog circuit design, amplifiers, filters
Representing Candidate Solutions
[Figure: an Individual (Chromosome) composed of GENEs]
Representation of an individual can use discrete values (binary, integer, or any other system with a discrete set of values)
Example of Binary DNA Encoding: [figure not recovered]
Genetic Operators
[Figure: population at generation t transformed into generation t + 1 by selection, reproduction, recombination (crossover), and mutation]
Crossover Operator
Population: . . .
parents:   1 1 1 1 1 1 1   and   0 0 0 0 0 0 0   (cut after the third bit)
offspring: 1 1 1 0 0 0 0   and   0 0 0 1 1 1 1
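The crossover example above corresponds to single-point crossover on bit strings. A minimal Python sketch follows, assuming chromosomes are represented as strings of '0' and '1' and that the cut index counts bits from the left; the function name is illustrative.

```python
def single_point_crossover(parent_a, parent_b, cut):
    """Swap the tails of two equal-length bit strings after position `cut`."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


if __name__ == "__main__":
    # Parents from the slide, cut after the third bit.
    print(single_point_crossover("1111111", "0000000", 3))
    # ('1110000', '0001111')
```

Each offspring keeps the head of one parent and the tail of the other, so no bits are created or lost by the operator itself; new bit values enter the population only through mutation.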