Asynchronous Circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona Collège de France May 14 th , 2013


Asynchronous Circuits

Jordi Cortadella
Universitat Politècnica de Catalunya, Barcelona

Collège de France
May 14th, 2013


Goals

• Convince ourselves that:
– designing an asynchronous circuit is easy
– synchronous and asynchronous circuits are similar
– asynchronous circuits bring new advantages

• Not to discourage designers with exotic and sophisticated asynchronous schemes


Clocking

Nvidia Kepler™ GK110: 28 nm, 7.1B transistors, 550 mm², 2688 CUDA cores, base clock: 836 MHz, memory clock: 6 GHz

• How to distribute the clock?

• How to determine the clock frequency?

• How to implement robust communications?

• How to reduce and manage energy?


Synchronous circuits


Synchronous circuit

[Figure: a PLL clocks two banks of flip-flops separated by combinational logic (CL)]

Two competing paths:
• Launching path
• Capturing path

Launching path < Capturing path + Period

CLKtree + CL < CLKtree + Period

CL < Period (no clock skew)
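
The three inequalities above can be checked numerically. The following is a minimal Python sketch; the function name `meets_setup` and the delay values (in picoseconds) are assumptions made for illustration and do not come from the talk.

```python
def meets_setup(clk_launch, cl, clk_capture, period):
    """Return True when the launching path is shorter than the capturing path.

    clk_launch  : clock-tree delay to the launching flip-flop
    cl          : combinational logic delay
    clk_capture : clock-tree delay to the capturing flip-flop
    period      : clock period
    """
    launching = clk_launch + cl
    capturing = clk_capture + period
    return launching < capturing


# Equal clock-tree delays (no skew): the test reduces to CL < Period.
print(meets_setup(clk_launch=400, cl=900, clk_capture=400, period=1000))
print(meets_setup(clk_launch=400, cl=1100, clk_capture=400, period=1000))

# Skew that shortens the capturing path can break a circuit with CL < Period.
print(meets_setup(clk_launch=400, cl=900, clk_capture=200, period=1000))
```

When the two clock-tree delays are equal they cancel, which is exactly the step from the second inequality to the third.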


Source-synchronous

[Figure: CLKgen followed by a chain of matched delays, one per stage, with the launching and capturing paths marked]

• No global clock required

• More tolerance to PVT variations

• Period > longest combinational path

• Good for acyclic pipelines


Source-synchronous with forks and joins


How to synchronize incoming events?


C element (Muller 1959)

A  B  C
0  0  0
0  1  C (unchanged)
1  0  C (unchanged)
1  1  1

[Figure: C element with inputs A and B and output C, and an implementation as a majority gate (MAJ) with the output C fed back as the third input]

(many implementations exist)
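
The truth table can be read as a small state machine: the output copies the inputs when they agree and keeps its value otherwise. The sketch below is a behavioural Python model; the class name `CElement` and the helper `maj` are illustrative names, not part of the talk.

```python
class CElement:
    """Behavioural model of a two-input Muller C element."""

    def __init__(self, state=0):
        self.state = state

    def update(self, a, b):
        # Inputs agree: the output follows them. Otherwise it holds.
        if a == b:
            self.state = a
        return self.state


def maj(a, b, c):
    """Three-input majority function, used with the output fed back as c."""
    return (a & b) | (a & c) | (b & c)


c = CElement()
trace = [c.update(a, b) for a, b in [(0, 1), (1, 1), (1, 0), (0, 0)]]
print(trace)  # [0, 1, 1, 0]

# The majority-gate form gives the same next state for every input and state.
for a in (0, 1):
    for b in (0, 1):
        for prev in (0, 1):
            assert CElement(prev).update(a, b) == maj(a, b, prev)
```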


Multi-input C element

[Figure: a tree of six two-input C elements combining the inputs a1 ... a7 into a single output c]
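
A multi-input C element can be modelled as a tree of two-input ones, each keeping its own state. The Python sketch below assumes a simple pairwise reduction, which may differ from the exact tree drawn on the slide; `CElement` and `CTree` are illustrative names.

```python
class CElement:
    """Two-input C element: follows the inputs when they agree, else holds."""

    def __init__(self, state=0):
        self.state = state

    def update(self, a, b):
        if a == b:
            self.state = a
        return self.state


class CTree:
    """n-input C element built from n - 1 two-input C elements."""

    def __init__(self, n):
        self.cells = [CElement() for _ in range(n - 1)]

    def update(self, inputs):
        level = list(inputs)
        cells = iter(self.cells)  # same cell order on every call
        while len(level) > 1:
            nxt = []
            for i in range(0, len(level) - 1, 2):
                nxt.append(next(cells).update(level[i], level[i + 1]))
            if len(level) % 2:  # an odd input passes up to the next level
                nxt.append(level[-1])
            level = nxt
        return level[0]


tree = CTree(7)
print(tree.update([1, 1, 1, 1, 1, 1, 0]))  # one input still low
print(tree.update([1] * 7))                # all inputs high
print(tree.update([1, 1, 1, 0, 1, 1, 1]))  # output holds until all are low
print(tree.update([0] * 7))
```

Seven inputs need six two-input elements, which matches the six C symbols on the slide.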


Completion detection

[Figure: CLKgen driving a stage through a fixed delay]

The fixed delay must be longer than the worst-case logic delay (plus variability)

Q: could we detect, as soon as possible, when a computation has completed?


Delay-insensitive codes: Dual Rail

• Dual rail: every bit encoded with two signals

[Figure: waveforms of A.t and A.f for the sequence A = 1, 0, 1, 1, with a spacer (SP) after each value]

A.t  A.f  A
0    0    Spacer
0    1    0
1    0    1
1    1    Not used
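
The encoding table translates directly into code. The following Python sketch uses (t, f) tuples for the two rails; the function names `encode`, `decode` and `send` are assumptions made for illustration.

```python
SPACER = (0, 0)


def encode(bit):
    """Map a bit to its (A.t, A.f) rail pair."""
    return (1, 0) if bit else (0, 1)


def decode(t, f):
    """Return 0 or 1 for a valid codeword and None for the spacer."""
    if (t, f) == (1, 1):
        raise ValueError("1 1 is not used in dual-rail encoding")
    if (t, f) == SPACER:
        return None
    return t


def send(bits):
    """Follow every codeword with a spacer, as in the waveform on the slide."""
    stream = []
    for bit in bits:
        stream.append(encode(bit))
        stream.append(SPACER)
    return stream


stream = send([1, 0, 1, 1])
print(stream)
print([decode(t, f) for t, f in stream])
```

Because every value is separated by a spacer, a receiver can tell two consecutive equal values apart without any clock.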


Dual-Rail AND gate


A B C

SP SP SP

0 - 0

- 0 0

SP 1 SP

1 SP SP

1 1 1
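
The table can be reproduced with one AND gate and one OR gate on the rails. The schematic did not survive extraction, so the gate-level mapping in this Python sketch (C.t = A.t AND B.t, C.f = A.f OR B.f) is an assumption, chosen because it matches every row of the table.

```python
SP, ZERO, ONE = (0, 0), (0, 1), (1, 0)  # (t, f) rail pairs


def dr_and(a, b):
    """Dual-rail AND: the output is 0 as soon as either input is 0."""
    a_t, a_f = a
    b_t, b_f = b
    return (a_t & b_t, a_f | b_f)


rows = [(SP, SP), (ZERO, SP), (SP, ZERO), (SP, ONE), (ONE, SP), (ONE, ONE)]
for a, b in rows:
    print(a, b, dr_and(a, b))
```

The rows marked with "-" in the table correspond to the early output: a single 0 input decides the result even while the other input is still a spacer.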

[Figure: dual-rail AND gate with inputs A.t, A.f, B.t, B.f and outputs C.t, C.f]

Dual-Rail Inverter


A Z

SP SP

0 1

1 0

[Figure: dual-rail inverter with inputs A.t, A.f and outputs Z.t, Z.f]

Dual-Rail AND/OR gate

[Figure: a dual-rail AND gate (A.t, A.f, B.t, B.f to C.t, C.f) next to the same gate with the true and false rails of every signal swapped, which gives a dual-rail OR gate]
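
Swapping rails is how inversion works in dual rail, and De Morgan's law then turns an AND gate into an OR gate at no cost. The Python sketch below assumes the AND mapping C.t = A.t AND B.t, C.f = A.f OR B.f; the function names are illustrative.

```python
SP, ZERO, ONE = (0, 0), (0, 1), (1, 0)  # (t, f) rail pairs


def dr_not(a):
    """Dual-rail inverter: swap the true and false rails (no gates needed)."""
    t, f = a
    return (f, t)


def dr_and(a, b):
    """Dual-rail AND on (t, f) pairs."""
    return (a[0] & b[0], a[1] | b[1])


def dr_or(a, b):
    """Dual-rail OR obtained from AND by swapping rails on inputs and output."""
    return dr_not(dr_and(dr_not(a), dr_not(b)))


print(dr_not(SP), dr_not(ZERO), dr_not(ONE))
print(dr_or(ZERO, ZERO), dr_or(ZERO, ONE), dr_or(SP, ONE), dr_or(SP, ZERO))
```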


Dual rail: completion detection

[Figure: a block of dual-rail logic; every input and output pair starts at the spacer 00 and then takes a valid codeword, 01 or 10]

[Figure: the output pairs of the dual-rail logic feed a completion detection tree ending in a C element that produces the signal done]
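
Completion detection can be sketched as follows: a pair is valid when either of its rails is high, and done rises only when all pairs are valid and falls only when all pairs are back at the spacer. For brevity this Python model collapses the tree into a single n-input C element, which is an assumption; the class name is illustrative.

```python
class CompletionDetector:
    """OR the two rails of each bit, then combine with an n-input C element."""

    def __init__(self):
        self.done = 0

    def update(self, pairs):
        valid = [t | f for t, f in pairs]
        if all(valid):
            self.done = 1      # every bit carries a codeword
        elif not any(valid):
            self.done = 0      # every bit has returned to the spacer
        return self.done       # otherwise hold the previous value


det = CompletionDetector()
print(det.update([(0, 0), (0, 0), (0, 0)]))  # all spacers
print(det.update([(1, 0), (0, 0), (0, 1)]))  # still computing
print(det.update([(1, 0), (0, 1), (0, 1)]))  # result complete
print(det.update([(0, 0), (0, 1), (0, 0)]))  # resetting, done holds
print(det.update([(0, 0), (0, 0), (0, 0)]))  # reset complete
```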

[Figure: a dual-rail circuit with AND, OR, INV and AND gates and CLKgen; a second version shows the same gates with two C elements]

Single rail data vs. dual rail

Some back-of-the-envelope estimations:

               Single rail  Dual rail
Area           1            2
Delay          1            << 1
Static power   1            2
Dynamic power  < 0.2        2

Dual rail:
• Good for speed
• Large area
• High power consumption


Handshaking

[Figure: CLKgen driving a stage whose delay is unknown]

Assume that the source module can provide data at any rate:

• When should the CLK generator send an event if the internal delays of the circuit are unknown?

Solution: handshaking

[Figure: two modules, one saying "I have data" and the other "I want data", connected by Data, Request and Acknowledge]

Asynchronous elastic pipeline

[Figure: a chain of C elements, with ReqIn and AckIn on one side and ReqOut and AckOut on the other]

• David Muller’s pipeline (late 50’s)
• Sutherland’s Micropipelines (Turing Award lecture, 1989)
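
A Muller pipeline can be simulated in a few lines: each stage is a C element whose inputs are the request from the left neighbour and the inverted acknowledge from the right neighbour. The Python sketch below makes several assumptions that are not in the talk: unit gate delays with all stages updated together, a producer that always has data, and a consumer that is always ready.

```python
def c_element(a, b, prev):
    """Two-input C element: follow the inputs when they agree, else hold."""
    return a if a == b else prev


def step(c):
    """Advance every stage by one unit delay."""
    n = len(c)
    nxt = []
    for i in range(n):
        req = (1 - c[0]) if i == 0 else c[i - 1]  # producer answers each ack
        ack = c[i] if i == n - 1 else c[i + 1]    # consumer acks at once
        nxt.append(c_element(req, 1 - ack, c[i]))
    return nxt


def run(n_stages, n_steps):
    c = [0] * n_stages
    history = [c]
    for _ in range(n_steps):
        c = step(c)
        history.append(c)
    return history


for state in run(3, 8):
    print(state)
```

In the printed states, the 1s entering on the left and moving right are request events travelling down the pipeline; each stage resets once its right neighbour has taken the event.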

Multiple inputs and outputs

[Figure: handshake control for a stage with multiple inputs and outputs, including a delay element]

Channel-based communication

• A channel contains data and handshake wires

[Figure: two modules connected by a channel made of Data, Req and Ack wires]

Two-phase protocol

• Every edge is active
• It may require double-edge triggered flip-flops or pulse generators

[Figure: timing diagram of Req, Ack and Data (Data 1, Data 2, Data 3); each transition on Req, rising or falling, marks one data transfer]
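
The two-phase protocol can be written out as an event list: one toggle of Req and one toggle of Ack per transfer. The Python sketch below is only an illustration; the function name and the tuple layout (wire, new level, data item) are assumptions.

```python
def two_phase(items):
    """Return the Req/Ack events for a sequence of transfers."""
    req = ack = 0
    events = []
    for item in items:
        req ^= 1                           # any edge on Req announces data
        events.append(("req", req, item))
        ack ^= 1                           # any edge on Ack confirms it
        events.append(("ack", ack, item))
    return events


for event in two_phase(["Data 1", "Data 2", "Data 3"]):
    print(event)
```

Three transfers take six transitions in total, and the wires are not required to return to zero between transfers.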


Four-phase protocol

• Valid data on the active edge of Req
• Req/Ack must return to zero before the next transfer
• Different variations of the 4-phase protocol exist

[Figure: timing diagram of Req, Ack and Data (Data 1, Data 2, Data 3); each data transfer takes a full rise and fall of Req and Ack]
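
The four-phase protocol needs four transitions per transfer. The Python sketch below shows one possible variation, in which Req rises, Ack rises, Req falls and Ack falls in strict sequence; the function name and tuple layout are assumptions made for illustration.

```python
def four_phase(items):
    """Return the Req/Ack events for a sequence of return-to-zero transfers."""
    events = []
    for item in items:
        events.append(("req", 1, item))  # data is valid on the rising edge
        events.append(("ack", 1, item))  # receiver has taken the data
        events.append(("req", 0, None))  # return to zero
        events.append(("ack", 0, None))  # ready for the next transfer
    return events


for event in four_phase(["Data 1", "Data 2"]):
    print(event)
```

Compared with two-phase, each transfer costs twice as many transitions, but only one edge of Req is active, so ordinary level-sensitive latches can be used.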


How to memorize?

[Figure: combinational logic between two latches (L), a delay on the control path and two C elements. Question: 2-phase or 4-phase?]

[Figure: 2-phase version of the same circuit, using pulse generators]

[Figure: 4-phase version of the same circuit]

Performance analysis


Ring oscillators

[Figure: a network of C elements whose feedback loops form rings, with nodes numbered 1 to 7]

• Every ring requires an odd number of inverters

• The cycle period is determined by the slowest ring

• The cycle period is adapted to the operating conditions (temperature, voltage)
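
The second and third bullets can be illustrated with a toy calculation. The Python sketch below rests on strong simplifying assumptions that are not in the talk: each ring is a list of stage delays, the time around a ring is the sum of its delays, and operating conditions scale every delay by a common factor `derate`. The delay values are invented.

```python
def cycle_period(rings, derate=1.0):
    """Cycle period set by the slowest ring, under a common delay scaling."""
    return max(sum(delay * derate for delay in ring) for ring in rings)


rings = [[3, 2, 4], [5, 1], [2, 2, 2, 2, 2]]  # hypothetical stage delays

print(cycle_period(rings))              # nominal conditions
print(cycle_period(rings, derate=2.0))  # e.g. lower voltage: every gate slows
```

No clock needs to be retuned when conditions change: the rings slow down or speed up with the gates they are made of.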


Why asynchronous?

Modularity

• Time-independent functional composability

– Performance may be affected (but not functionality)

[Figure: module A connected to module B through a channel (Data, Req, Ack); B can be replaced by B']

Tracking variability

[Figure: a matched delay placed alongside the logic; plot of the multi-corner matched delay against the critical paths at the best, typ and worst corners]

Good correlation for:

• Process variability (systematic)

• Global voltage fluctuations

• Temperature

• Aging (partially)

Margins

[Figure: the cycle period drawn as a bar for each clocking style.
Rigid Clocks: gate and wire delays (typ) + P + V + T + Aging + PLL Jitter + Skew.
Elastic Clocks: gate and wire delays (typ) + P + V + T + Aging + Skew, with a shorter cycle period.]

Margin reduction

Speed-up / Power savings

Clock elasticity

[Figure: with a rigid clock, each cycle period is split into computation time and wasted time; with an elastic clock, the cycle period follows the computation time]

Voltage scaling and power savings

[Figure: measurements for 3 ARM926 cores on the same die, annotated with -24% and -14%]

Design Automation

Design automation paradigms

• Synthesis of asynchronous controllers
– Logic synthesis from Petri nets or asynchronous FSMs

• Syntax-directed translation
– Correct-by-construction composition of handshake components

• De-synchronization
– Automatic transformation from synchronous to asynchronous

Synthesis of asynchronous controllers

[Figure: a controller with signals DSr, LDS, LDTACK, D and DTACK, specified by a signal transition graph over the transitions DSr+, LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS- and LDTACK-]

[Figure: the same signal transition graph next to a circuit with signals DSr, LDS, LDTACK, D and DTACK]

Example: Petrify
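
One basic property a specification of this kind must satisfy is consistency: every signal alternates between rising (+) and falling (-) transitions. The Python sketch below checks that property on a single trace. The trace order is an assumption (the arcs of the graph did not survive extraction), and the code illustrates the property only; it is not how Petrify works.

```python
def is_consistent(trace, initial):
    """Check that + and - alternate per signal and the cycle returns home."""
    value = dict(initial)
    for transition in trace:
        name, sign = transition[:-1], transition[-1]
        target = 1 if sign == "+" else 0
        if value[name] == target:  # e.g. a rise on a signal that is high
            return False
        value[name] = target
    return value == dict(initial)


# Hypothetical ordering of the transitions listed on the slide.
trace = ["DSr+", "LDS+", "LDTACK+", "D+", "DTACK+",
         "DSr-", "D-", "DTACK-", "LDS-", "LDTACK-"]
initial = {"DSr": 0, "LDS": 0, "LDTACK": 0, "D": 0, "DTACK": 0}

print(is_consistent(trace, initial))
print(is_consistent(["DSr+", "DSr+"], initial))
```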


Syntax-directed translation

P = (A || B) ; C

[Figure: the program is translated step by step into a tree of handshake components: a par component for A || B, and a seq component that runs the par component and then C]
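
The composition P = (A || B) ; C can be mimicked in software, with coroutines standing in for handshake components: starting a coroutine plays the role of the request, and its completion plays the role of the acknowledge. This Python sketch is an analogy under that assumption, not a hardware model; all names and delays are illustrative.

```python
import asyncio


async def component(name, log, delay=0.0):
    """A leaf process that logs when it is activated and when it finishes."""
    log.append(f"{name} start")
    await asyncio.sleep(delay)
    log.append(f"{name} done")


async def par(*procs):
    """par: activate all children, finish when every child has finished."""
    await asyncio.gather(*procs)


async def seq(*procs):
    """seq: activate each child only after the previous one has finished."""
    for proc in procs:
        await proc


async def program(log):
    a = component("A", log, 0.02)
    b = component("B", log, 0.01)
    c = component("C", log)
    await seq(par(a, b), c)  # P = (A || B) ; C


log = []
asyncio.run(program(log))
print(log)
```

The structure of the code follows the syntax tree of the program, which is the idea behind syntax-directed translation: each operator maps to one component, and correctness follows from the composition.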

P = (A ; B)

[Figure: a seq component activating A and then B]

c := a + b

[Figure: handshake components for variables a and b, an adder (+) and variable c]

[Figure: handshake circuit for the GCD program, with variables x and y, SEQ and do components, multiplexers (MUX), demultiplexers (DMX), the operators -, <> and <, and the output channel out]

int = type [0..255]
& gcd: main proc (in? chan <<int,int>> & out! chan int)
begin x, y: var int
| forever do
    in?<<x,y>>
  ; do x <> y then
      if x < y then y:=y-x
               else x:=x-y
      fi
    od
  ; out!x
  od
end
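
For readers unfamiliar with the notation, the same algorithm can be written in Python. This is a sketch of the behaviour only: a generator stands in for the process, its argument for the input channel and its yielded values for the output channel. The function name is an assumption, and inputs are assumed to be positive, since the subtraction loop does not terminate when one value is 0.

```python
def gcd_process(pairs):
    """For each input pair, reduce by repeated subtraction and emit x."""
    for x, y in pairs:      # in?<<x,y>>
        while x != y:       # do x <> y then ... od
            if x < y:
                y = y - x
            else:
                x = x - y
        yield x             # out!x


print(list(gcd_process([(12, 18), (35, 14), (7, 7)])))  # [6, 7, 7]
```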

Sources:

J. Kessels and A. Peeters. DESCALE: A Design Experiment for a SmartCard Application Consuming Low Energy, in Principles of Asynchronous Circuit Design, A Systems Perspective, Eds. J. Sparso and S. Furber, Kluwer Academic Publishers, 2001.

P.A. Beerel, R.O. Ozdag and M. Ferretti. A Designer’s Guide to Asynchronous VLSI, Cambridge University Press, 2010.

De-synchronization

• Strategy: replace the clock tree with local clocks and handshakes

• Combinational logic and latches are not modified

• More tolerance to variability
– Similar area, less power and/or more speed

• Cortadella, Kondratyev, Lavagno and Sotiriou. Desynchronization: Synthesis of asynchronous circuits from synchronous specifications. IEEE TCAD, Oct 2006.

Synchronous operation

Transforming a synchronous circuit into asynchronous (automatically)

[Figure: the synchronous circuit, clocked by CLKgen]

De-synchronization

[Figure: the same circuit after de-synchronization]

Conclusions

• Asynchrony offers flexibility in time
– Modularity
– Dynamic adaptability
– Tolerance to variability

• Better optimization of power/performance

• Why isn’t it an important trend in circuit design?
– Lack of commercial EDA support (timing sign-off)
– Designers do not feel comfortable with “unpredictable” timing
– Other aspects: testing, verification, …

• De-synchronization might be a viable solution