Self-Healing Asynchronous Arrays

Song Peng - ASYNC06
Song Peng and Rajit Manohar
Computer Systems Laboratory, Cornell University, USA


Source: tima.univ-grenoble-alpes.fr/conferences/async/Technical... · 2006-03-27



Motivation

- Fault tolerant (FT) VLSI design
  - increase fabrication yield
  - improve circuit reliability
- No efficient general FT method for asynchronous circuits yet
  - no clock → different fault behavior
  - traditional FT techniques ineffective or inefficient


Contributions

- Developed a general design for self-healing asynchronous arrays
  - suitable for clockless circuits
  - any number (K) of hard and soft errors
  - automatic reconfiguration: self-healing
  - small hardware cost, low overheads, and good scalability (with K)


Outline

- A general self-healing asynchronous design framework
- Implementing self-healing async arrays
- Experimental evaluation
- Conclusions


FT Design Philosophies

- Hardwired replication-and-voting (NMR)
  - deadlock complicates voting procedure
  - voter: performance bottleneck
  - large H/W overhead
- Reconfigurable fault tolerant design
  - self-checking logic, spare resources, reconfiguration
  - no voting, less H/W cost
  - longer fault recovery time, but little overall impact


General Framework of Self-healing Asynchronous Circuit

[Block diagram: Reconfiguration Logic; Deadlock Detection; Fail-stop Circuit of K-FT Graph Topology]


Design Overview

- Quasi-delay-insensitive (QDI) circuits
- Asynchronous arrays
  - valid model for most VLSI modules with identical components (adder, multiplier, FIR)
  - node: a VLSI component
  - edge: a communication channel between two neighbor components
  - K-FT graph: K-FT array with external in/outs


Implement Fail-Stop Behavior*

- PCHB (Precharge Half Buffer) template
  - QDI circuit with domino style
  - can construct any QDI logic
- FS-PCHB template
  - PCHB + self-checking logic
  - a stuck-at fault or single event upset → FS-PCHB deadlocks*

* S. Peng and R. Manohar, “Efficient failure detection in pipelined asynchronous circuits”, in DFT 2005

[Figure: PCHB Circuit Diagram; labels: F, Control, f0/f1, fe, In, Ine, en]


Detect Deadlock

- A timer (delay element) watches the data channel activity
  - current-starved cascaded inverter chain
  - a valid transition + the next transition expected → start timer
  - the next valid transition does not occur within a specific amount of time → timer expires: deadlock
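The timer rule on this slide can be illustrated in software. The following is a minimal behavioral sketch, not the authors' circuit: the function name, the timestamp list, and the `timeout` and `observe_until` parameters are all hypothetical, and it assumes that a further transition is always expected after each valid one.

```python
from typing import Optional, Sequence


def detect_deadlock(transition_times: Sequence[float],
                    timeout: float,
                    observe_until: float) -> Optional[float]:
    """Return the time at which the watchdog expires, or None if it never does.

    transition_times: sorted timestamps of valid transitions on the channel.
    timeout: hypothetical timer delay (the delay-element setting in hardware).
    observe_until: end of the observation window.
    """
    if not transition_times:
        return None  # the timer is only started by a valid transition
    last = transition_times[0]
    for t in transition_times[1:]:
        if t - last > timeout:
            return last + timeout  # the next transition arrived too late
        last = t  # every valid transition restarts the timer
    # No further transition: the timer expires if the window is long enough.
    if observe_until - last > timeout:
        return last + timeout
    return None


# A healthy channel keeps restarting the timer; a stalled one lets it expire.
print(detect_deadlock([0.0, 1.0, 2.0], timeout=5.0, observe_until=4.0))   # None
print(detect_deadlock([0.0, 1.0, 2.0], timeout=5.0, observe_until=10.0))  # 7.0
```

The timeout has to exceed the longest legitimate gap between transitions, otherwise a slow but healthy channel would be reported as deadlocked.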


Online Self-Reconfiguration

- Reconfiguration overview
  - pass gates → connections of QDI circuit
  - reconfiguration → pass gate control signals
- No fault location
  - search for a workable configuration
  - hardware cost reduced
  - longer fault recovery
    - little overall performance impact
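The "no fault location" idea can be sketched as a plain search loop: step to the next configuration whenever the current one deadlocks, without ever asking which node failed. This is an illustrative model only; `self_heal`, the `deadlocks` predicate, and the toy fault set are hypothetical names, not part of the original design.

```python
from typing import Callable, Iterable, Optional, TypeVar

Config = TypeVar("Config")


def self_heal(configs: Iterable[Config],
              deadlocks: Callable[[Config], bool]) -> Optional[Config]:
    """Step through candidate configurations until one runs without deadlock."""
    for config in configs:
        if not deadlocks(config):
            return config  # workable configuration found; keep using it
    return None  # every configuration deadlocked: more faults than tolerated


# Toy model: configuration i routes data through copy i; copies 0 and 1 are faulty.
faulty = {0, 1}
chosen = self_heal(range(4), lambda i: i in faulty)
print(chosen)  # 2
```

The loop never identifies the faulty node, which is why the hardware can stay small; the price is that recovery may visit several configurations before one works.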


General Block Diagram of Self-Reconfiguration Logic

[Block diagram: Data Channel; Deadlock Detector; Finite State Machine; Combinational Logic; Reset Logic; outputs: Local Reset, Pass Gate Control Signals]


K-FT Array Models

- Full-duplication model
  - high redundancy, simple reconfiguration
- Min-spare model*
  - min redundancy (K spares), complex reconfiguration
- Small-degree model
  - medium redundancy, medium reconfiguration
- All three models
  - each external in/out: K+1 copies

* S. Peng and R. Manohar, “Fault tolerant asynchronous adder through dynamic self-reconfiguration”, in ICCD 2005.


Full-Duplication Array

- Construction
  - add K full copies of array: (K+1) in total
  - only external in/outs reconfigurable
- Self-reconfiguration
  - reconfigure = switch to another copy
  - simple reconfiguration logic
    - (K+1)-bit one-hot counter
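A (K+1)-bit one-hot counter selects exactly one array copy at a time, and each detected deadlock advances it by one position. The sketch below models that behavior in software; the function name is hypothetical, and the wrap-around from the last copy back to the first is an assumption, since the slide does not say what happens after the last copy.

```python
def one_hot_step(state: int, width: int) -> int:
    """Rotate a one-hot word of the given width left by one position."""
    mask = (1 << width) - 1
    return ((state << 1) | (state >> (width - 1))) & mask


K = 2                      # tolerate two faults -> K + 1 = 3 copies
width = K + 1
state = 0b001              # copy 0 is active after reset
sequence = [state]
for _ in range(width):     # one step per detected deadlock
    state = one_hot_step(state, width)
    sequence.append(state)
print([format(s, f"0{width}b") for s in sequence])
# ['001', '010', '100', '001']
```

Each bit of the counter can drive the pass gates of one copy directly, so no decoding logic is needed between the counter and the external in/outs.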


Small-Degree Array

- Construction
  - add a medium number (>K) of spare nodes
  - recursive construction
  - resulting graph
    - K+1 trees
    - max node degree constant: small fanout
- Self-reconfiguration
  - search a configuration = select a path from all trees
  - reconfiguration logic
    - one-hot counter: (K+1)-bit, to select a tree
    - multiple mod-3 counters: to choose different branches in a tree walk
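One way to picture this search order is as an odometer: the mod-3 counters step through branch choices within the selected tree, and the one-hot counter moves on to the next tree once they have all wrapped around. The sketch below is an interpretation under stated assumptions, not the authors' logic: it assumes one mod-3 counter per tree level, three branch choices per level, and a hypothetical `depth` parameter; the mapping to pass-gate signals is not modeled.

```python
from typing import Iterator, List, Tuple


def walk_configurations(num_trees: int,
                        depth: int) -> Iterator[Tuple[int, Tuple[int, ...]]]:
    """Yield (tree_index, branch_choices) pairs in odometer order."""
    for tree in range(num_trees):          # position of the one-hot counter
        digits: List[int] = [0] * depth    # one mod-3 counter per tree level
        while True:
            yield tree, tuple(digits)
            # Increment the mod-3 counters with carry, lowest level first.
            level = 0
            while level < depth:
                digits[level] = (digits[level] + 1) % 3
                if digits[level] != 0:
                    break
                level += 1
            if level == depth:             # all counters wrapped: next tree
                break


K = 1
configs = list(walk_configurations(num_trees=K + 1, depth=2))
print(len(configs))   # 18
print(configs[:4])    # [(0, (0, 0)), (0, (1, 0)), (0, (2, 0)), (0, (0, 1))]
```

Under these assumptions the number of configurations is (K+1) * 3^depth, which gives a feel for why the small-degree model sits between the other two models in reconfiguration complexity.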


Small-Degree Array Example: 1-FT 4-Node Array

[Figure: 1-FT 4-node small-degree array with pass gates; signal labels include a, b, c, deadlock, clk]


Experimental Evaluation

- Target circuit: 64-bit QDI adder
  - 1-bit adder: FS-PCHB
  - fine-grained fault tolerance
- Evaluate three FT graph models
  - H/W cost
  - performance, energy overhead
  - fault recovery time
- Compare with NMR (with voter core)


Evaluation: H/W Cost

- H/W cost: transistor count
- Critical circuit: self-reconfiguration logic/voter
- Normalize H/W costs to baseline adder

      Critical                      Total
K     MIN    SML    DUP    NMR      MIN    SML    DUP    NMR
1     0.14   0.03   0.01   0.62     2.48   3.16   4.09   3.62
2     0.21   0.04   0.01   1.41     3.01   5.21   6.14   6.41
4     0.21   0.09   0.02   17.3     4.68   7.44   10.2   26.3
8     1.80   0.16   0.04   5699     8.71   11.9   18.4   5716

MIN: min-spare, SML: small-degree, DUP: full-duplication


Other Evaluations

- Performance and Energy
  - HSPICE: TSMC 0.18 µm, 25 °C
  - normalize throughputs and energy to baseline adder
- Worst fault recovery time
  - total number of configurations in FT array
  - O(expected fault recovery time)


Evaluation: Performance

[Plot: performance versus K (x-axis: K = 1 to 8); series: MIN, SML, DUP, NMR]


Evaluation: Energy

[Plot: energy versus K (x-axis: K = 1 to 8); series: MIN, SML, DUP, NMR]


Evaluation: Fault Recovery Time

[Plot: fault recovery time versus K (x-axis: K = 1 to 8); series: MIN, SML, DUP]


Conclusion

- A general design for self-healing QDI circuits
  - applicable: no critical timing assumption
  - general: K hard/soft errors tolerated
  - self-healing: automatic reconfiguration
  - efficient: low overheads, good scalability
- Can be applied to synchronous design
  - as long as fail-stop



Backup Slides


Quasi-delay-insensitive (QDI) Circuits

- An important class of asynchronous circuits
  - no gate/wiring timing assumption other than isochronic forks
  - data communication by message passing
  - handshake → causality and event-ordering
  - self-checking potential
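The "self-checking potential" can be illustrated with a one-bit dual-rail channel, assuming dual-rail encoding (the slides do not name the encoding, although rail names such as X0/X1 and f0/f1 on the PCHB slides suggest it). In this hedged sketch, with hypothetical function names, a single stuck-at fault either produces the illegal codeword (1, 1) or leaves the channel neutral so that the expected data never arrives.

```python
from typing import Optional, Tuple

Rails = Tuple[int, int]  # (rail0, rail1) of a one-bit dual-rail channel


def decode(rails: Rails) -> Optional[int]:
    """Return 0 or 1 for a valid codeword, None for the neutral spacer."""
    if rails == (0, 0):
        return None            # neutral: no data on the channel
    if rails == (1, 0):
        return 0               # rail0 high encodes logic 0
    if rails == (0, 1):
        return 1               # rail1 high encodes logic 1
    raise ValueError("illegal dual-rail state (1, 1)")


def stuck_at(rails: Rails, rail: int, value: int) -> Rails:
    """Model a single stuck-at fault by forcing one rail to a fixed value."""
    forced = list(rails)
    forced[rail] = value
    return (forced[0], forced[1])


# Stuck-at-0 on rail1 while sending a 1: the receiver sees only the spacer.
print(decode(stuck_at((0, 1), rail=1, value=0)))   # None
# Stuck-at-1 on rail1 while sending a 0: both rails high, detectably illegal.
print(stuck_at((1, 0), rail=1, value=1))           # (1, 1)
```

Both outcomes are observable without a reference copy of the data, which is the property a fail-stop template can exploit.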


Implement Fail-Stop Behavior

- Fault Modeling
  - both hard and soft errors
  - hard error → single stuck-at fault (SSAF)
    - cover many defects and permanent faults
  - soft error → single event upset (SEU)
    - cover most transient faults
  - high reliability potential


Baseline QDI Circuit Template

- Pre-charge Half Buffer (PCHB)
  - pre-charge domino logic style → fast
  - can construct almost all QDI logic

[Figure: PCHB template; blocks: Control, Data Computation; signals: X0, X1, Xe, Y0, Y1, Ye, en]


Full-Duplication Array Example: 2-FT 2-Node Array

[Figure: 2-FT 2-node full-duplication array with pass gates]


Min-Spare Array

- Construction*
  - add K spare nodes
  - replicate each external connection
  - add redundant internal connections
- Self-reconfiguration
  - reconfigure = pick another set of N nodes from (N+K) nodes
  - FSM = log2 C(N+K, K)-bit incrementer, where C(N+K, K) is the binomial coefficient "N+K choose K"
  - combinational logic necessary

* S. Peng and R. Manohar, “Fault tolerant asynchronous adder through dynamic self-reconfiguration”, in ICCD 2005.
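The size of the min-spare search space follows from counting the ways to choose N active nodes out of N+K. The sketch below computes that count and the incrementer width; rounding the logarithm up to a whole number of bits is an assumption (the slide gives only the log2 expression), the function names are hypothetical, and treating the 64-bit adder as N = 64 one-bit nodes is inferred from the evaluation slide rather than stated there.

```python
from itertools import combinations
from math import comb


def incrementer_bits(n: int, k: int) -> int:
    """Bits needed to index every choice of n active nodes out of n + k."""
    # (x - 1).bit_length() equals ceil(log2(x)) for any integer x >= 1.
    return (comb(n + k, k) - 1).bit_length()


def configurations(n: int, k: int):
    """Iterate over each set of n active node indices drawn from n + k nodes."""
    return combinations(range(n + k), n)


# The 2-FT 2-node example: N = 2, K = 2 -> C(4, 2) = 6 configurations.
print(len(list(configurations(2, 2))), incrementer_bits(2, 2))   # 6 3
# A 64-node array with K = 1: C(65, 1) = 65 configurations.
print(incrementer_bits(64, 1))                                   # 7
```

Because the incrementer value is only an index, separate combinational logic is still needed to translate it into pass-gate control signals, which matches the "combinational logic necessary" bullet.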


Min-Spare Array Example: 2-FT 2-Node Array

[Figure: 2-FT 2-node min-spare array with pass gates; control blocks: Incrementer, Combinational Logic]


Summary

- All models outperform NMR for fine-grained FT design
  - smaller overheads, better scalability with K
- Min-spare model: minimum H/W cost
- Full-duplication model: smallest critical circuit, shortest fault recovery time
- Small-degree model: medium overheads