CDA 3103 Computer Organization Review Instructor: Hao Zheng Dept. Comp. Sci & Eng. USF.
1 Bridging the gap between asynchronous design and designers Hao Zheng.
![Page 1: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/1.jpg)
Bridging the gap between asynchronous design
and designers
Hao Zheng
![Page 2: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/2.jpg)
Outline
What is an asynchronous circuit ?
Asynchronous communication
Asynchronous design styles (Micropipelines)
Asynchronous logic building blocks
Control specification and implementation
Delay models and classes of async circuits
Why asynchronous circuits ?
![Page 3: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/3.jpg)
Synchronous circuit
[Figure: registers (R) separated by combinational logic (CL) blocks, all driven by a global clock CLK]
Implicit (global) synchronization between blocks
Clock period > Max Delay (CL + R)
Time is an independent physical variable (quantity)
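The clock period constraint on this slide can be checked numerically. This is a minimal sketch, assuming hypothetical per-stage delays in nanoseconds; the numbers and the name `min_clock_period` are illustrative and do not come from the slides.

```python
def min_clock_period(cl_delays, reg_delay):
    """Lower bound on the clock period: slowest CL block plus register delay."""
    return max(cl_delays) + reg_delay

# Hypothetical delays (ns) for three CL blocks and for one register.
cl_delays = [3.0, 5.0, 4.0]
period_bound = min_clock_period(cl_delays, reg_delay=1.0)
print(period_bound)  # the clock period must exceed this value
```

Every stage is paced by the slowest block, including the stages whose logic settles earlier.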
![Page 4: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/4.jpg)
Asynchronous circuit
[Figure: registers (R) separated by combinational logic (CL) blocks, linked by Req / Ack signals]
Explicit (local) synchronization: Req / Ack handshakes
Time = events + quantity
Time does not exist if nothing happens (Aristotle)
![Page 5: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/5.jpg)
Motivation for Asynchronous
Asynchronous design is often unavoidable:
  Asynchronous interfaces, arbiters, etc.
Modern clocking is multi-phase and distributed, and virtually ‘asynchronous’ (cf. GALS, next slide):
  Mesochronous (clock travels together with data)
  Local (possibly stretchable) clock generation
Robust asynchronous design flows are coming (e.g. VLSI programming from Philips, NCL from Theseus Logic, fine-grain pipelining from Fulcrum)
![Page 6: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/6.jpg)
Motivation (Technology Aspects)
Low power
  Automatic clock gating
Electromagnetic compatibility
  No peak currents around clock edges
Security
  No ‘electro-magnetic difference’ between logical ‘0’ and ‘1’ in dual-rail code
Robustness
  High immunity to technology and environment variations (temperature, power supply, ...)
![Page 7: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/7.jpg)
Motivation (Designer’s View)
Modularity for system-on-chip design
  Plug-and-play interconnectivity
Average-case performance
  No worst-case delay synchronization
Many interfaces are asynchronous
  Buses, networks, ...
![Page 8: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/8.jpg)
Globally Async Locally Sync (GALS)
[Figure: a clocked domain (registers R, combinational logic CL, local CLK) inside an async-to-sync wrapper, connected to the asynchronous world through the handshake pairs Req1 / Ack1 to Req4 / Ack4]
![Page 9: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/9.jpg)
Key Design Differences
Synchronous logic design:
  Proceeds without taking timing correctness (hazards, signal ack-ing, etc.) into account
  Combinational logic and memory latches (registers) are built separately
  Static timing analysis of CL is sufficient to determine the Max Delay (clock period)
  Fixed set-up and hold conditions for latches
![Page 10: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/10.jpg)
Key Design Differences
Asynchronous logic design:
  Must ensure hazard-freedom, signal ack-ing, local timing constraints
  Combinational logic and memory latches (registers) are often mixed in “complex gates”
  Dynamic timing analysis of logic is needed to determine relative delays between paths
To avoid complex issues, circuits may be built as delay-insensitive and/or speed-independent (Muller’s theory vs Huffman asynchronous automata)
![Page 11: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/11.jpg)
Verification and Testing Differences
Synchronous logic verification and testing:
  Only the functional correctness aspect is verified and tested
  Testing can be done with standard ATE and at low speed
Asynchronous logic verification and testing:
  In addition to functional correctness, the temporal aspect is crucial: e.g. causality and order, deadlock-freedom
  Testing must cover faults in complex gates (logic + memory) and must proceed at normal operation rate
  Delay fault testing may be needed
![Page 12: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/12.jpg)
Synchronous communication
Clock edges determine the time instants where data must be sampled
Data wires may glitch between clock edges (set-up/hold times must be satisfied)
Data are transmitted at a fixed rate (clock frequency)
[Figure: clocked data waveform carrying the bit sequence 1 1 0 0 1 0]
![Page 13: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/13.jpg)
Dual Rail
Two wires with L (low) and H (high) per bit
  “LL” = “spacer”, “LH” = “0”, “HL” = “1”
n-bit data communication requires 2n wires
Each bit is self-timed
Other delay-insensitive codes exist (e.g. k-of-n), as does event-based signalling (choice criteria: pin and power efficiency)
[Figure: dual-rail waveforms carrying the bit sequence 1 1 0 0 1 0]
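The dual-rail code on this slide can be written down directly. This is a minimal sketch, assuming each bit is modelled as a pair of wire levels with H = 1 and L = 0, in the order the slide lists them; the function names are illustrative.

```python
# Dual-rail code from the slide, one (first wire, second wire) pair per bit:
# "LL" = spacer, "LH" = 0, "HL" = 1. "HH" is not a legal codeword.
SPACER = (0, 0)

def encode_bit(bit):
    """Return the wire pair for one data bit."""
    return (1, 0) if bit else (0, 1)

def encode_word(bits):
    """Encode n bits onto n wire pairs, i.e. 2n wires."""
    return [encode_bit(b) for b in bits]

def decode_pair(pair):
    """Return 0 or 1 for a valid codeword, None for the spacer."""
    if pair == SPACER:
        return None
    if pair == (1, 0):
        return 1
    if pair == (0, 1):
        return 0
    raise ValueError("illegal dual-rail codeword: HH")

word = encode_word([1, 1, 0, 0, 1, 0])
decoded = [decode_pair(p) for p in word]
```

Because the spacer is distinguishable from both data values, the receiver can tell from the wires alone when a bit has arrived, which is what makes each bit self-timed.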
![Page 14: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/14.jpg)
Bundled Data
Validity signal
  Similar to an aperiodic local clock
n-bit data communication requires n+1 wires
Data wires may glitch while the validity signal is not asserted
Signaling protocols
  Level sensitive (latch)
  Transition sensitive (register): 2-phase / 4-phase
[Figure: bundled-data waveform with validity signal, carrying the bit sequence 1 1 0 0 1 0]
![Page 15: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/15.jpg)
Example: Memory Read Cycle
Transition signaling, 4-phase
[Timing diagram: Address bus with its Valid address strobe (A), and Data bus with its Valid data strobe (D)]
![Page 16: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/16.jpg)
Example: Memory Read Cycle
Transition signaling, 2-phase
[Timing diagram: Address bus with its Valid address strobe (A), and Data bus with its Valid data strobe (D)]
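The 4-phase and 2-phase read cycles differ in how many signal transitions make up one transfer. This is a minimal sketch, assuming A stands for the valid-address strobe and D for the valid-data strobe, and that the two strictly alternate. The exact event ordering in the timing diagrams is not recoverable from this transcript, and the function names are illustrative.

```python
def four_phase(n_transfers):
    """Return-to-zero handshake: four transitions per transfer."""
    events = []
    for _ in range(n_transfers):
        events += ["A+", "D+", "A-", "D-"]
    return events

def two_phase(n_transfers):
    """Non-return-to-zero handshake: two transitions per transfer."""
    events = []
    a = d = 0
    for _ in range(n_transfers):
        a ^= 1  # any transition on A announces a new address
        events.append("A+" if a else "A-")
        d ^= 1  # any transition on D announces the data
        events.append("D+" if d else "D-")
    return events

print(four_phase(1))
print(two_phase(2))
```

In this model the same four transitions carry one read in the 4-phase protocol and two reads in the 2-phase protocol.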
![Page 17: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/17.jpg)
Asynchronous Modules
Signaling protocol:
reqin+ start+ [computation] done+ reqout+ ackout+ ackin+
reqin- start- [reset] done- reqout- ackout- ackin-
(more concurrency is also possible)
[Figure: module with a DATAPATH block (Data IN, Data OUT) and a CONTROL block (req in, ack in, req out, ack out), connected by the start and done signals]
![Page 18: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/18.jpg)
Asynchronous Latches: C element
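[Figure: C-element symbol with inputs A, B and output Z]
A  B  Z+
0  0  0
0  1  Z
1  0  Z
1  1  1
[Figure: transistor-level network between Vdd and Gnd with inputs A, B and output Z [van Berkel 91]]
Static Logic Implementation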
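The behaviour in the truth table (output follows the inputs when they agree, and holds otherwise) can be modelled in a few lines. This is a behavioural sketch only; the class name and the initial state are assumptions, and it says nothing about the transistor-level implementation.

```python
class CElement:
    """Behavioural model of a two-input C-element."""

    def __init__(self, z=0):
        self.z = z  # current output; also the stored state

    def update(self, a, b):
        """Apply inputs a and b and return the next output Z+."""
        if a == b:
            self.z = a  # both inputs agree: output follows them
        return self.z   # otherwise the previous output is held

c = CElement()
trace = [c.update(a, b) for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]]
print(trace)
```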
![Page 19: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/19.jpg)
C-element: Other Implementations
[Figure: two transistor-level C-element implementations between Vdd and Gnd with inputs A, B and output Z: quasi-static (with weak inverter) and dynamic]
![Page 20: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/20.jpg)
Dual-Rail Logic
[Figure: dual-rail AND gate with inputs A.t, A.f, B.t, B.f and outputs C.t, C.f]
Valid behavior for monotonic environment
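One common way to build a dual-rail AND is shown below. The gate-level structure of the slide's figure is not recoverable from this transcript, so this sketch assumes the simple form in which the true rail is the AND of the true rails and the false rail is the OR of the false rails. Signals are modelled as `(t, f)` pairs, and the function name is illustrative.

```python
def dual_rail_and(a, b):
    """Dual-rail AND of two (t, f) pairs; (0, 0) is the spacer."""
    a_t, a_f = a
    b_t, b_f = b
    c_t = a_t & b_t  # result is 1 only when both inputs are known to be 1
    c_f = a_f | b_f  # result is 0 as soon as either input is known to be 0
    return (c_t, c_f)

SPACER, ZERO, ONE = (0, 0), (0, 1), (1, 0)
print(dual_rail_and(ONE, ONE))
print(dual_rail_and(ONE, ZERO))
print(dual_rail_and(SPACER, SPACER))
```

In this form the output can become valid before both inputs have arrived (a 0 on one input is enough), so it relies on the environment changing inputs monotonically between spacer and valid codewords.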
![Page 21: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/21.jpg)
Completion Detection
[Figure: dual-rail logic block whose outputs feed a completion detection tree ending in a C-element that produces done]
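Completion detection can be sketched behaviourally: a bit is valid when one of its two rails is high, and done follows the per-bit validity signals the way a C-element would. This is a minimal model, assuming `(t, f)` wire pairs and collapsing the tree into a single multi-input C-element; the class name is illustrative.

```python
class CompletionDetector:
    """done rises when all bits are valid and falls when all are spacers."""

    def __init__(self):
        self.done = 0

    def update(self, pairs):
        valid = [t | f for (t, f) in pairs]  # per-bit validity (OR of rails)
        if all(valid):
            self.done = 1  # every bit carries a codeword
        elif not any(valid):
            self.done = 0  # every bit has returned to the spacer
        return self.done   # mixed case: hold the previous value

cd = CompletionDetector()
steps = [
    [(0, 0), (0, 0)],  # all spacers
    [(1, 0), (0, 0)],  # first bit arrived
    [(1, 0), (0, 1)],  # both bits arrived
    [(0, 0), (0, 1)],  # first bit reset
    [(0, 0), (0, 0)],  # all spacers again
]
trace = [cd.update(s) for s in steps]
print(trace)
```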
![Page 22: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/22.jpg)
Differential Cascode Voltage Switch Logic
[Figure: 3-input AND/NAND gate: N-type transistor network with inputs A.t, B.t, C.t, A.f, B.f, C.f, control signal start, outputs Z.t and Z.f, and a done signal]
![Page 23: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/23.jpg)
Examples of Dual-Rail Design
Asynchronous dual-rail ripple-carry adder (A. Martin, 1991)
  Critical delay is proportional to log N (N = number of bits)
  32-bit adder delay (1.6 µm MOSIS CMOS): 11 ns versus 40 ns for synchronous
  Async cell transistor count = 34 versus synchronous = 28
More recent success stories (modularity and automatic synthesis) of dual-rail logic come from Null Convention Logic (NCL) by Theseus Logic
![Page 24: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/24.jpg)
Bundled-Data Logic Blocks
[Figure: single-rail logic block with a delay element connecting start to done]
Conventional logic + matched delay
![Page 25: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/25.jpg)
Micropipelines (Sutherland 89)
[Figure: micropipeline (2-phase) control blocks: C-element (Join), Merge, Toggle (in, out0, out1), Select (in, sel, outt, outf), Call (r1, r2, ra, a1, a2), and Request-Grant-Done (RGD) Arbiter (r1, r2, g1, g2, d1, d2)]
Micropipeline (2-phase) control blocks
![Page 26: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/26.jpg)
Micropipelines (Sutherland 89)
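[Figure: micropipeline: latches (L) alternating with logic blocks, controlled by a chain of C-elements and delay elements between Rin / Aout on one side and Rout / Ain on the other]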
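The control chain of a micropipeline can be simulated at the event level. This is a minimal sketch, assuming the usual arrangement in which each stage's C-element takes the request from the previous stage and the inverted output of the next stage. Latches and matched delays are ignored, C-elements are fired in a fixed left-to-right order, and `settle` is a hypothetical helper.

```python
def settle(c, r_in, a_in):
    """Fire C-elements until the chain is stable; return the new outputs.

    c[i] is the output of stage i's C-element. Each element sees the request
    from its left neighbour and the inverted acknowledge from its right one.
    """
    c = list(c)
    changed = True
    while changed:
        changed = False
        for i in range(len(c)):
            req = r_in if i == 0 else c[i - 1]
            ack = a_in if i == len(c) - 1 else c[i + 1]
            # C-element on (req, NOT ack): both inputs agree when req != ack.
            if req != ack and c[i] != req:
                c[i] = req
                changed = True
    return c

state = [0, 0, 0]                      # empty three-stage pipeline
state = settle(state, r_in=1, a_in=0)  # first 2-phase request (Rin toggles)
state = settle(state, r_in=0, a_in=0)  # second request, output not yet acknowledged
print(state)
```

In this model Aout is `c[0]` and Rout is `c[-1]`. With no acknowledge arriving on Ain, a three-stage chain accepts three requests and stalls on the fourth, because `c[0]` stops following Rin until the right-hand side acknowledges.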
![Page 27: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/27.jpg)
DataPath / Control
[Figure: datapath of latches (L) alternating with logic blocks, driven by a single CONTROL block with ports Rin, Aout, Rout, Ain]
Synthesis of control is a major challenge
![Page 28: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/28.jpg)
Control specification
[Figure: control specification over the events A+, B+, A-, B-, with waveforms of A and B]
A input, B output
![Page 29: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/29.jpg)
Control specification
[Figure: control specification over the events A+, B-, A-, B+, with signals A and B]
![Page 30: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/30.jpg)
Control specification
[Figure: control specification over the events A+, B+, C+, A-, B-, C-, with signals A, B and C]
![Page 31: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/31.jpg)
Control specification
[Figure: control specification over the events A+, B+, C+, A-, B-, C-, with signals A, B and C]
![Page 32: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/32.jpg)
Control Specification
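[Figure: FIFO controller (FIFO cntrl) with ports Ri, Ao, Ro, Ai, a C-element implementation, and its control specification over the events Ri+, Ao+, Ri-, Ao-, Ro+, Ai+, Ro-, Ai-]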
![Page 33: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/33.jpg)
Gate vs Wire delay models
Gate delay model: delays in gates, no delays in wires
Wire delay model: delays in gates and wires
![Page 34: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/34.jpg)
Delay Models for Async. Circuits
Bounded delays (BD): realistic for gates and wires.
  Technology mapping is easy, verification is difficult
Speed independent (SI): unbounded (pessimistic) delays for gates and “negligible” (optimistic) delays for wires.
  Technology mapping is more difficult, verification is easy
Delay insensitive (DI): unbounded (pessimistic) delays for gates and wires.
  DI class (built out of basic gates) is almost empty
Quasi-delay insensitive (QDI): delay insensitive except for critical wire forks (isochronic forks).
  In practice it is the same as speed independent
[Figure: diagram relating the circuit classes BD, SI / QDI and DI]
![Page 35: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/35.jpg)
Environment models
Slow enough environment = Fundamental mode
(Inputs change AFTER system has settled)
Reactive environment = I/O mode
(Inputs may change once the first output changes)
![Page 36: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/36.jpg)
Correctness of a Circuit wrt Delay Assumptions
[Figure: two circuit diagrams with inputs a, b and output z]
C-element: z = ab + zb + za
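The next-state equation z = ab + zb + za can be checked exhaustively against the C-element behaviour (output follows the inputs when they agree, and holds otherwise). This is a small verification sketch with an illustrative function name; it checks only the logic function, not the delay assumptions this slide is about.

```python
from itertools import product

def c_next(a, b, z):
    """Next-state equation from the slide: z = ab + zb + za."""
    return (a & b) | (z & b) | (z & a)

# Compare against the behavioural definition for all eight input/state cases.
for a, b, z in product((0, 1), repeat=3):
    expected = a if a == b else z
    assert c_next(a, b, z) == expected

print("equation matches C-element behaviour")
```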
![Page 37: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/37.jpg)
Resistance
Concurrent models for specification
  CSP, Petri nets, ...: no more FSMs
Difficult to design
  Hazards, synchronization
Complex timing analysis
  Difficult to estimate performance
Difficult to test
  No way to stop the clock
![Page 38: 1 Bridging the gap between asynchronous design and designers Hao Zheng.](https://reader035.fdocuments.in/reader035/viewer/2022070306/5519b9495503467a578b490b/html5/thumbnails/38.jpg)
But ... some success stories
Philips
AMULET microprocessors
Sharp
Intel (RAPPID)
Start-up companies:
  Theseus Logic, Fulcrum, Self-Timed Solutions
Recent blurb: It's Time for Clockless Chips, by Claire Tristram (MIT Technology Review, v. 104, no. 8, October 2001: http://www.technologyreview.com/magazine/oct01/tristram.asp) ….