
Introduction to asynchronous circuit design:

specification and synthesis

Jordi Cortadella, Universitat Politècnica de Catalunya, Spain

Michael Kishinevsky, Intel Corporation, USA

Alex Kondratyev, Theseus Logic, USA

Luciano Lavagno, Università di Udine, Italy

Outline

• I: Introduction to basic concepts on asynchronous design

• II: Synthesis of control circuits from STGs

• III: Advanced topics on synthesis of control circuits from STGs

• IV: Synthesis from HDL and other synthesis paradigms

Note: no references in the tutorial

Introduction to asynchronous circuit design:

specification and synthesis

Part I:

Introduction to basic concepts on asynchronous circuit design

Outline

• What is an asynchronous circuit?

• Asynchronous communication

• Asynchronous logic blocks

• Micropipelines

• Control specification and implementation

• Delay models

• Why asynchronous circuits?

Synchronous circuit

[Figure: four registers (R) separated by three combinational logic blocks (CL), with a common clock signal CLK]

Implicit synchronization

Asynchronous circuit

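[Figure: four registers (R) separated by three combinational logic blocks (CL), with Req and Ack wires between neighboring registers]

Explicit synchronization: Req/Ack handshakes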

Synchronous communication

• Clock edges determine the time instants where data must be sampled

• Data wires may glitch between clock edges (set-up/hold times must be satisfied)

• Data are transmitted at a fixed rate (clock frequency)

[Figure: example waveform carrying the bit sequence 1 1 0 0 1 0]

Dual rail

• Two wires per bit
  – “00” = spacer, “01” = 0, “10” = 1

• n-bit data communication requires 2n wires

• Each bit is self-timed

• Other delay-insensitive codes exist

[Figure: dual-rail waveform example; the 1s are signaled on one wire and the 0s on the other]
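The encoding above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch assumes that the first wire of each pair is the "true" rail, so that “10” reads as 1 and “01” as 0.

```python
SPACER = (0, 0)  # "00": no data present on this bit


def encode_bit(bit):
    """Return the wire pair for one data bit: 1 -> (1, 0), 0 -> (0, 1)."""
    return (1, 0) if bit else (0, 1)


def decode_pair(pair):
    """Return 0 or 1 for a valid codeword, or None for the spacer."""
    if pair == (1, 0):
        return 1
    if pair == (0, 1):
        return 0
    if pair == SPACER:
        return None
    raise ValueError("(1, 1) is not a legal codeword in this encoding")


def encode_word(bits):
    """Encode n bits as n wire pairs, i.e. 2n wires in total."""
    return [encode_bit(b) for b in bits]


def word_is_valid(word):
    """A word is complete once every pair has left the spacer state."""
    return all(pair in ((1, 0), (0, 1)) for pair in word)


word = encode_word([1, 1, 0, 0, 1, 0])
print(word)
print([decode_pair(p) for p in word])
```

Because every bit carries its own validity information, the receiver needs no separate timing reference: it simply waits until `word_is_valid` holds.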

Bundled data

• Validity signal
  – Similar to an aperiodic local clock

• n-bit data communication requires n+1 wires

• Data wires may glitch when not valid

• Signaling protocols
  – level sensitive (latch)
  – transition sensitive (register): 2-phase / 4-phase

[Figure: example waveform carrying the bit sequence 1 1 0 0 1 0]

Example: memory read cycle

• Transition signaling, 4-phase

[Figure: timing diagram of the Address (A) and Data (D) signals, marking the "valid address" and "valid data" intervals]

Example: memory read cycle

• Transition signaling, 2-phase

[Figure: timing diagram of the Address (A) and Data (D) signals, marking the "valid address" and "valid data" intervals]
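The difference between the two protocols can be sketched as event sequences. The sketch below is an assumption-laden illustration, not a reconstruction of the timing diagrams: it treats A as the request (address valid) and D as the acknowledgment (data valid), and the function names are hypothetical.

```python
def four_phase(n_transfers, req="A", ack="D"):
    """4-phase: every transfer uses a rising handshake followed by a return to zero."""
    events = []
    for _ in range(n_transfers):
        events += [req + "+", ack + "+", req + "-", ack + "-"]
    return events


def two_phase(n_transfers, req="A", ack="D"):
    """2-phase: every transition is an event, so one toggle per wire per transfer."""
    events = []
    level = 0
    for _ in range(n_transfers):
        level ^= 1                      # toggle the wire level
        sign = "+" if level else "-"
        events += [req + sign, ack + sign]
    return events


print(four_phase(1))  # one read cycle
print(two_phase(2))   # two read cycles
```

Under these assumptions, one 4-phase read and two 2-phase reads produce the same four transitions, which is why 2-phase signaling needs fewer wire transitions per transfer.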

Asynchronous modules

• Signaling protocol:
  reqin+ start+ [computation] done+ reqout+ ackout+ ackin+
  reqin- start- [reset] done- reqout- ackout- ackin-

(more concurrency is also possible, e.g. by overlapping the return-to-zero phase of step i-1 with the evaluation phase of step i)

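[Figure: module made of a DATAPATH block (Data IN, Data OUT) and a CONTROL block (req in, ack in, req out, ack out); the two blocks are connected by the start and done signals]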

Asynchronous latches: C element

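[Figure: C-element symbol with inputs A and B and output Z]

A B | Z+
0 0 | 0
0 1 | Z
1 0 | Z
1 1 | 1

[Figure: transistor-level implementations of the C element between Vdd and Gnd, with inputs A and B and output Z]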
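The next-state table of the C element (Z+ as a function of A, B and the current Z) can be written as a small Python function. This is a behavioral sketch only; the function name is hypothetical and no delays are modeled.

```python
def c_element(a, b, z):
    """Next output of a C element: follow the inputs when they agree, else hold z."""
    if a == b:
        return a
    return z


# Walk through one full cycle of the inputs and record the output.
z = 0
trace = []
for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]:
    z = c_element(a, b, z)
    trace.append(z)
print(trace)
```

The output rises only after both inputs have risen and falls only after both have fallen, which is what makes the C element usable as an asynchronous latch.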

Dual-rail logic

[Figure: gate with inputs A.t, A.f, B.t, B.f and outputs C.t, C.f]

Dual-rail AND gate

Valid behavior for monotonic environment
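A common way to build a dual-rail AND gate is sketched below. The gate-level structure in the slide is not recoverable from the transcript, so the equations here (C.t = A.t AND B.t, C.f = A.f OR B.f) are an assumption, and the function name is hypothetical. Each signal is a pair (t, f) with (1, 0) = 1, (0, 1) = 0 and (0, 0) = spacer.

```python
def dual_rail_and(a, b):
    """Dual-rail AND on (t, f) pairs; assumes the simple form described above."""
    a_t, a_f = a
    b_t, b_f = b
    c_t = a_t & b_t   # output is 1 only when both inputs are 1
    c_f = a_f | b_f   # output is 0 as soon as either input is 0
    return (c_t, c_f)


ONE, ZERO, SPACER = (1, 0), (0, 1), (0, 0)

for a in (ZERO, ONE):
    for b in (ZERO, ONE):
        print(a, b, dual_rail_and(a, b))
```

In this form the output returns to the spacer when both inputs do, and it never produces the illegal pair (1, 1) as long as the inputs are legal. Note that a single 0 input already makes the output valid, so the sketch relies on the environment changing its inputs monotonically between spacer and valid data.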

Completion detection

[Figure: the per-bit signals are combined by a C element (C) to produce done]

Completion detection tree
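Completion detection for a dual-rail word can be sketched behaviorally as follows. The sketch assumes that each bit's validity is the OR of its two rails and that the tree of C elements behaves like one multi-input C element; the function names are hypothetical.

```python
def bit_valid(pair):
    """A dual-rail bit is valid when exactly one rail is high; spacer gives 0."""
    t, f = pair
    return t | f


def completion(word, prev_done):
    """done rises when all bits are valid, falls when all are spacers, else holds."""
    valids = [bit_valid(p) for p in word]
    if all(valids):
        return 1
    if not any(valids):
        return 0
    return prev_done


done = 0
for word in ([(0, 0), (0, 0)],   # all spacers
             [(1, 0), (0, 0)],   # first bit has arrived
             [(1, 0), (0, 1)],   # both bits have arrived
             [(0, 0), (0, 1)],   # first bit has returned to spacer
             [(0, 0), (0, 0)]):  # all spacers again
    done = completion(word, done)
    print(word, done)
```

The hold behavior matters: done must not fall until every bit has returned to the spacer, otherwise the next stage could see a partially reset word.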

Differential cascode voltage switch logic

[Figure: DCVSL gate with inputs A.t, B.t, C.t and A.f, B.f, C.f, control input start, outputs Z.t and Z.f, and completion signal done]

3-input AND/NAND gate

Bundled-data logic blocks

[Figure: a logic block in parallel with a delay element that takes start as input and produces done]

Conventional logic + matched delay

Micropipelines (Sutherland 89)

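[Figure: four latches (L) separated by three logic blocks; a chain of C elements (C) and delay elements controls the latches, with Rin and Aout on the input side and Rout and Ain on the output side]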
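The control chain of a micropipeline can be sketched as a row of C elements in which each stage sees the output of its left neighbor and the inverted output of its right neighbor. This is an assumption about the wiring (the usual 2-phase arrangement), since the figure is not recoverable from the transcript. Latches, logic and delay elements are ignored, and the function names are hypothetical.

```python
def c_element(a, b, z):
    """Follow the inputs when they agree, otherwise hold the current output."""
    return a if a == b else z


def settle(stages, r_in, a_in):
    """Re-evaluate every C element until no output changes.

    stages: current C-element outputs, left to right
    r_in:   request level from the left environment (Rin)
    a_in:   acknowledge level from the right environment (Ain)
    """
    c = list(stages)
    changed = True
    while changed:
        changed = False
        for i in range(len(c)):
            left = r_in if i == 0 else c[i - 1]
            right = a_in if i == len(c) - 1 else c[i + 1]
            new = c_element(left, 1 - right, c[i])
            if new != c[i]:
                c[i] = new
                changed = True
    return c


state = settle([0, 0, 0], r_in=1, a_in=0)
print(state)   # first event has travelled to the last stage
state = settle(state, r_in=0, a_in=0)
print(state)   # second event waits behind the unacknowledged first one
state = settle(state, r_in=0, a_in=1)
print(state)   # after the acknowledge, the second event moves on
```

With 2-phase signaling every transition is an event, so the pipeline behaves like an elastic FIFO: events advance until they run into a stage whose previous event has not yet been acknowledged.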

Data-path / Control

[Figure: the data path (four latches L separated by three logic blocks) driven by a single CONTROL block with ports Rin, Aout, Rout and Ain]

Control specification

[Figure: waveforms of signals A and B, and a specification with the transitions A+, B+, A-, B-]

A: input, B: output

Control specification

[Figure: specification with the transitions A+, B+, A-, B-, and a circuit with input A and output B]

Control specification

[Figure: specification with the transitions A+, B-, A-, B+, and a circuit with input A and output B]
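The two-signal specifications above can be replayed in Python to see which signal values they pass through. The sketch assumes that each specification is a simple cycle in the order in which its transitions are listed (A+, B+, A-, B- and A+, B-, A-, B+); the arcs of the original figures are not recoverable, and the function name is hypothetical.

```python
def replay(cycle, initial):
    """Fire the transitions of a cyclic specification once and list the states."""
    state = dict(initial)
    states = [dict(state)]
    for event in cycle:
        signal, sign = event[:-1], event[-1]
        new_value = 1 if sign == "+" else 0
        if state[signal] == new_value:
            raise ValueError(event + " is inconsistent with the current state")
        state[signal] = new_value
        states.append(dict(state))
    return states


same_phase = replay(["A+", "B+", "A-", "B-"], {"A": 0, "B": 0})
opposite_phase = replay(["A+", "B-", "A-", "B+"], {"A": 0, "B": 1})

for s in same_phase:
    print(s)
for s in opposite_phase:
    print(s)
```

Under this assumption, B settles to the value of A in the first specification and to the complement of A in the second, and both cycles return to their initial state.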

Control specification

[Figure: specification with the transitions A+, B+, C-, A-, B-, C+, and a gate with inputs A and B and output C]

Control specification

[Figure: specification with the transitions A+, B+, C-, A-, B-, C+, and a gate with inputs A and B and output C]

Control specification

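[Figure: FIFO controller (FIFO cntrl) with ports Ri, Ao, Ro and Ai, built from two C elements; its specification uses the transitions Ri+, Ao+, Ri-, Ao-, Ro+, Ai+, Ro-, Ai-]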

A simple filter: specification

y := 0;
loop
    x := READ (IN);
    WRITE (OUT, (x+y)/2);
    y := x;
end loop

[Figure: filter block with data ports IN and OUT and handshake signals Rin, Ain, Rout, Aout]
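The functional behavior of the filter specification can be checked with a short Python sketch. It models only the data computation, not the handshakes, and it assumes real-valued division; the generator name is hypothetical.

```python
def averaging_filter(samples):
    """Yield the average of each input sample and the previous one."""
    y = 0                    # y := 0
    for x in samples:        # x := READ (IN)
        yield (x + y) / 2    # WRITE (OUT, (x+y)/2)
        y = x                # y := x


print(list(averaging_filter([2, 4, 6])))
```

Each output depends on the current and the previous input, which is why the implementation needs two storage elements, x and y.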

A simple filter: block diagram

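[Figure: latches x and y and an adder (+) between IN and OUT; a control block with ports Rin, Ain, Rout and Aout drives the handshake pairs (Rx, Ax), (Ry, Ay) and (Ra, Aa)]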

• x and y are level-sensitive latches (transparent when R=1)

• + is a bundled-data adder (matched delay between Ra and Aa)

• Rin indicates the validity of IN

• After Ain+ the environment is allowed to change IN

• (Rout,Aout) control a level-sensitive latch at the output

A simple filter: control spec.

[Figure: the filter block diagram (latches x and y, adder, control) and the control specification over the transitions Rin+, Ain+, Rin-, Ain-, Rx+, Ax+, Rx-, Ax-, Ry+, Ay+, Ry-, Ay-, Ra+, Aa+, Ra-, Aa-, Rout+, Aout+, Rout-, Aout-]

A simple filter: control impl.

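[Figure: the control specification over the transitions Rin+, Ain+, Rin-, Ain-, Rx+, Ax+, Rx-, Ax-, Ry+, Ay+, Ry-, Ay-, Ra+, Aa+, Ra-, Aa-, Rout+, Aout+, Rout-, Aout-, and its implementation: a circuit with one C element (C) connecting Rin, Ain, Rx, Ax, Ry, Ay, Ra, Aa, Rout and Aout]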

Control: observable behavior

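[Figure: observable behavior of the control, over the transitions Rin+, Rx+, Ax+, Ra+, Aa+, Rout+, Aout+, z+, Rout-, Aout-, Ry+, Ay+, Ry-, Ay-, Rx-, Ax-, Ain+, Ain-, Ra-, Aa-, Rin-, z-; the circuit is the C-element implementation with ports Rin, Ain, Rx, Ax, Ry, Ay, Ra, Aa, Rout, Aout and the additional signal z]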

Taking delays into account

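[Figure: specification with the transitions x+, x-, y+, y-, z+, z-, and a circuit with signals x, y, z and internal signals x’ and z’]

Delay assumptions:
• Environment: 3 time units
• Gates: 1 time unit

event:  x+  x’-  y+  z+  z’-  x-  x’+  z-  z’+  y-
time:   3   4    5   6   7    9   10   12  13   14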

Taking delays into account

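[Figure: specification with the transitions x+, x-, y+, y-, z+, z-, and a circuit with signals x, y, z and internal signals x’ and z’; the figure is annotated "very slow" and "failure!"]

Delay assumptions: unbounded delays

event:  x+  x’-  y+  z+  x-  x’+  y-
time:   3   4    5   6   9   10   11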

Gate vs wire delay models

• Gate delay model: delays in gates, no delays in wires

• Wire delay model: delays in gates and wires

Delay models for async. circuits

• Bounded delays (BD): realistic for gates and wires.
  – Technology mapping is easy, verification is difficult

• Speed independent (SI): Unbounded (pessimistic) delays for gates and “negligible” (optimistic) delays for wires.
  – Technology mapping is more difficult, verification is easy

• Delay insensitive (DI): Unbounded (pessimistic) delays for gates and wires.
  – DI class (built out of basic gates) is almost empty

• Quasi-delay insensitive (QDI): Delay insensitive except for critical wire forks (isochronic forks).
  – Formally, it is the same as speed independent
  – In practice, different synthesis strategies are used

[Figure: diagram relating the classes BD, SI, QDI and DI]

Motivation (designer’s view)

• Modularity
  – Plug-and-play interconnectivity

• Reusability
  – IPs with abstract timing behaviors

• High performance
  – Average-case performance (no worst-case delay synchronization)
  – No clock skew (local timing assumptions)

• Many interfaces are asynchronous
  – Buses, networks, ...

Motivation (technology aspects)

• Low power
  – Automatic clock gating

• Electromagnetic compatibility
  – No peak currents around clock edges

• Robustness
  – High immunity to technology and environment variations (in-die variations, temperature, power supply, ...)

Dissuasion

• Concurrent models for specification
  – CSP, Petri nets, ...: no more FSMs

• Difficult to design
  – Hazards, synchronization

• Complex timing analysis
  – Difficult to estimate performance

• Difficult to test
  – No way to stop the clock

But ... some success stories

• Philips

• AMULET microprocessors

• Sharp

• Intel (RAPPID)

• IBM (interlocked pipeline)

• Start-up companies:
  – Theseus Logic, Cogency

• ...