Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf ·...

29
Advances in Designing Clockless Digital Systems Advances in Designing Advances in Designing Clockless Clockless Digital Systems Digital Systems Prof. Steven M. Prof. Steven M. Nowick Nowick [email protected] [email protected] Department of Computer Science Department of Computer Science Columbia University Columbia University New York, NY, USA New York, NY, USA

Transcript of Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf ·...

Page 1: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

Advances in Designing Clockless Digital Systems

Advances in Designing Advances in Designing ClocklessClockless Digital SystemsDigital Systems

Prof. Steven M. Prof. Steven M. [email protected]@cs.columbia.edu

Department of Computer ScienceDepartment of Computer ScienceColumbia UniversityColumbia UniversityNew York, NY, USANew York, NY, USA

Page 2: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#2

IntroductionIntroduction

Synchronous vs. Asynchronous Systems?Synchronous vs. Asynchronous Systems?

Synchronous Systems:Synchronous Systems: use a use a global clockglobal clockentire system operates entire system operates at fixedat fixed--raterate

uses uses “centralized control”“centralized control”

clock

Page 3: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#3

Introduction (cont.)Introduction (cont.)

Synchronous vs. Asynchronous Systems? (cont.)Synchronous vs. Asynchronous Systems? (cont.)

Asynchronous Systems:Asynchronous Systems: no global clockno global clock

components can operate atcomponents can operate at varying ratesvarying rates

communicate locallycommunicate locally via via “handshaking”“handshaking”

uses uses “distributed control” “distributed control”

“handshakinginterfaces”(channels)

Page 4: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#4

Trends and ChallengesTrends and Challenges

Trends in Chip Design: Trends in Chip Design: next decadenext decade“Semiconductor Industry Association (SIA) Roadmap”“Semiconductor Industry Association (SIA) Roadmap” (97(97--8)8)

Unprecedented Challenges:Unprecedented Challenges:complexity and scale (= size of systems)complexity and scale (= size of systems)

clock speedsclock speeds

power managementpower management

reusability & scalabilityreusability & scalability

“time“time--toto--market”market”

Design becoming unmanageable using a centralized Design becoming unmanageable using a centralized single clock (synchronous) approach….single clock (synchronous) approach….

Page 5: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#5

Trends and Challenges (cont.)Trends and Challenges (cont.)

1. Clock Rate:1. Clock Rate:

1980: 1980: several several MegaHertzMegaHertz

2001: 2001: ~750 ~750 MegaHertzMegaHertz -- 1+ 1+ GigaHertzGigaHertz2005:2005: several several GigaHertzGigaHertz

Design Challenge:Design Challenge:

“clock skew”:“clock skew”: clock must be clock must be nearnear--simultaneoussimultaneous across across entire chipentire chip

Page 6: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#6

Trends and Challenges (cont.)Trends and Challenges (cont.)

2. Chip Size and Density:2. Chip Size and Density:

Total #Transistors per Chip: Total #Transistors per Chip: 6060--80% increase/year80% increase/year~1970: ~1970: 4 thousand4 thousand (Intel 4004 microprocessor)(Intel 4004 microprocessor)

today: today: 5050--200+ million200+ million

2006 and beyond:2006 and beyond: towards 1towards 1 billion+billion+

Design Challenges:Design Challenges:system complexity, design time, clock distributionsystem complexity, design time, clock distributionclock will require 10clock will require 10--20 cycles to reach across chip20 cycles to reach across chip

Page 7: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#7

Trends and Challenges (cont.)Trends and Challenges (cont.)

3. Power Consumption3. Power Consumption

Low power: everLow power: ever--increasing demandincreasing demand

consumer electronics:consumer electronics: batterybattery--poweredpowered

highhigh--end processors:end processors: avoid expensive fans, packagingavoid expensive fans, packaging

Design Challenge:Design Challenge:

clock clock inherentlyinherently consumes power consumes power continuouslycontinuously

“power“power--down” techniques: complex, only partly effectivedown” techniques: complex, only partly effective

Page 8: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#8

Trends and Challenges (cont.)Trends and Challenges (cont.)

4. Time4. Time--toto--Market, Design ReMarket, Design Re--Use, ScalabilityUse, Scalability

Increasing pressure for faster Increasing pressure for faster “time“time--toto--market”.market”. Need:Need:reusable components:reusable components: “plug“plug--andand--play” designplay” design

flexible interfacing:flexible interfacing: under varied conditions, voltage scalingunder varied conditions, voltage scaling

scalable design:scalable design: easy system upgradeseasy system upgrades

Design Challenge:Design Challenge: mismatch w/ central fixedmismatch w/ central fixed--rate clockrate clock

Page 9: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#9

Trends and Challenges (cont.)Trends and Challenges (cont.)

5. Future Trends: “Mixed Timing” Domains5. Future Trends: “Mixed Timing” Domains

Chips themselves becoming Chips themselves becoming distributed systems….distributed systems….contain many subcontain many sub--regions, regions, operating at different speeds:operating at different speeds:

Design Challenge:Design Challenge: breakdown of single centralizedbreakdown of single centralizedclock controlclock control

Page 10: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#10

Asynchronous Design: Potential AdvantagesAsynchronous Design: Potential Advantages

Several Potential Advantages:Several Potential Advantages:

Lower PowerLower Powerno clockno clock components use power only “on demand”components use power only “on demand”

Robustness, ScalabilityRobustness, Scalability

no global timingno global timing “mix“mix--andand--match” variablematch” variable--speed componentsspeed components

composablecomposable/modular design style /modular design style “object“object--oriented”oriented”

Higher PerformanceHigher Performance

systems not limited to “worstsystems not limited to “worst--case” clock ratecase” clock rate

Page 11: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#11

Asynchronous Design: Some Recent DevelopmentsAsynchronous Design: Some Recent Developments

1. Philips Semiconductors:1. Philips Semiconductors:commercial use: 100 million commercial use: 100 million asyncasync chips for consumer electronics:chips for consumer electronics:

pagers, cell phones, smart cards, digital passports, automotive pagers, cell phones, smart cards, digital passports, automotive 33--4x lower power4x lower power,, less electromagnetic interference (“EMI”)less electromagnetic interference (“EMI”)

2. Intel:2. Intel:experimental: Pentium instructionexperimental: Pentium instruction--length decoder = “RAPPID” (1990’s)length decoder = “RAPPID” (1990’s)33--4x faster 4x faster than synchronous subsystemthan synchronous subsystem

3. Sun Labs:3. Sun Labs:commercial use: highcommercial use: high--speed speed FIFO’sFIFO’s in recent “Ultra’s” (memory access)in recent “Ultra’s” (memory access)

4. IBM Research:4. IBM Research:experimental: highexperimental: high--speed pipelines, filters, mixedspeed pipelines, filters, mixed--timing systemstiming systems

Recent Startups:Recent Startups: Fulcrum, Fulcrum, TheseusTheseus Logic, Handshake Solutions, Logic, Handshake Solutions, SilistrixSilistrix

Page 12: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#12

Asynchronous CAD Tools: Recent DevelopmentsAsynchronous CAD Tools: Recent Developments

DARPA’sDARPA’s “CLASS” Program:“CLASS” Program: ClocklessClockless InitiativeInitiative (2003(2003--07)07)

Goals:Goals:-- CAD tool: CAD tool: produce viable produce viable commercialcommercial--grade grade asyncasync tool flowtool flow-- demonstration: demonstration: a complex Boeing ASIC chipa complex Boeing ASIC chip

Participants:Participants:Lead (PI):Lead (PI): BoeingBoeingIndustrial participants:Industrial participants:

Philips (via Philips (via asyncasync incubated startup, “Handshake Solutions”)incubated startup, “Handshake Solutions”)TheseusTheseus Logic, Logic, CodetronixCodetronix

Academic participants:Academic participants:Columbia, UNC, UW, Yale, OSUColumbia, UNC, UW, Yale, OSU

Targets:Targets: cover wide “design space” cover wide “design space” –– very robust to highvery robust to high--speed circuitsspeed circuits

Columbia’s role:Columbia’s role: (i) high(i) high--speed pipelines, (ii) CAD optimizationsspeed pipelines, (ii) CAD optimizations

Page 13: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#13

Asynchronous Design: ChallengesAsynchronous Design: Challenges

Critical Design Issues:Critical Design Issues:

components must components must communicate cleanly:communicate cleanly: ‘hazard‘hazard--free’ designfree’ design

highlyhighly--concurrent designs:concurrent designs: much harder to verify!much harder to verify!

Lack of Automated “ComputerLack of Automated “Computer--Aided Design” Tools:Aided Design” Tools:

most commercial “CAD” tools targeted to synchronous most commercial “CAD” tools targeted to synchronous

Page 14: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#14

What Are CAD Tools?What Are CAD Tools?

Software programs to aid digital designers = Software programs to aid digital designers = “computer“computer--aided design” toolsaided design” tools

automatically automatically synthesize synthesize and and optimizeoptimize digital circuitsdigital circuits

Output:optimized circuit

implementation

Input:desired circuit

specificationCADTOOL

Page 15: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#15

Asynchronous Design ChallengeAsynchronous Design Challenge

Lack of Existing Asynchronous Design Tools:Lack of Existing Asynchronous Design Tools:

Most commercial “CAD” tools targeted to synchronousMost commercial “CAD” tools targeted to synchronous

Synchronous CAD tools: Synchronous CAD tools: major drivers of growth in microelectronics industry major drivers of growth in microelectronics industry

Asynchronous “chickenAsynchronous “chicken--andand--egg” problem:egg” problem:

few CAD tools few CAD tools less commercial use of less commercial use of asyncasync designdesign

especially lacking: tools for designing/especially lacking: tools for designing/optmzngoptmzng. large systems. large systems

Page 16: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#16

Overview: My Research AreasOverview: My Research Areas

CAD Tools for Asynchronous Controllers (CAD Tools for Asynchronous Controllers (FSM’sFSM’s))“MINIMALIST” Package“MINIMALIST” Package: for synthesis + optimization: for synthesis + optimization

Other Research Areas:Other Research Areas:

CAD Tools for Designing LargeCAD Tools for Designing Large--Scale Scale AsyncAsync SystemsSystems

MixedMixed--Timing Interface Circuits:Timing Interface Circuits:

for interfacing sync/for interfacing sync/asyncasync systemssystems

HighHigh--Speed Asynchronous PipelinesSpeed Asynchronous Pipelines

Page 17: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#17

CAD Tools for CAD Tools for AsyncAsync ControllersControllers

MINIMALIST:MINIMALIST: developed at Columbia University [1994developed at Columbia University [1994--] ] extensible CAD package for synthesis of extensible CAD package for synthesis of asynchronous controllersasynchronous controllersintegrates synthesis, optimization and verification toolsintegrates synthesis, optimization and verification toolsused in 80+ sites/17+ countries (being taught in IIT Bombay)used in 80+ sites/17+ countries (being taught in IIT Bombay)URL:URL: httphttp://://www.cs.columbia.edu/asyncwww.cs.columbia.edu/async

Includes several optimization tools: Includes several optimization tools: State MinimizationState MinimizationCHASM: CHASM: optimal state encodingoptimal state encoding22--Level HazardLevel Hazard--Free Logic MinimizationFree Logic MinimizationVerilogVerilog backback--endend

Key goal: Key goal: facilitate designfacilitate design--space explorationspace exploration

Page 18: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#18

Example: “PEExample: “PE--SENDSEND--IFC” (HP Labs)IFC” (HP Labs)Inputs:req-sendtreqrd-iqadbld-outack-pkt

Outputs:tackpeackadbld

0

1

2

7

3

4

5

6

8

9

10

req-send+ treq+ rd-iq+/adbld+

adbld-out+/peack+

rd-iq-/peack- adbld-

tack+

adbld-out- treq-rd-id+/ adbld+

adbld-out+/peack+

rd-iq-/ peack-adbld- tack-

adbld-out- treq+ ack-pkt+/ peack+ tack+

ack-pkt- treq-/peack- tack-

treq-/tack-

treq+/tack+

ack-pkt+/peack- tack-

adbld-out-treq- ack-pkt+/

peack+

req-send-/--

adbld-out-treq+ rd-iq+/

adbld+

From HP Labs“Mayfly” Project:

B.Coates, A.Davis, K.Stevens,“The Post Office

Experience: Designing a Large Asynchronous Chip”,

INTEGRATION: the VLSI Journal, vol. 15:3,pp. 341-66 (Oct. 1993)

Page 19: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#19

EXAMPLE (cont.):EXAMPLE (cont.):

Examples:

Design-Space Explorationusing MINIMALIST:

optimizing for area vs. speed

Page 20: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#20

CAD Tools for LargeCAD Tools for Large--Scale Asynchronous SystemsScale Asynchronous Systems

Target Architecture:Target Architecture:Input Specification:Input Specification:= “Control Data= “Control Data--flow Graph”flow Graph”

RegisterRegister RegisterRegister

Functional Functional UnitUnit

Functional Functional UnitUnit

Functional Functional UnitUnit

CtrlrCtrlr 11

CtrlrCtrlr 22

CtrlrCtrlr 33

control unitcontrol unit

StartStart

LoopLoop C< 0C< 0

B:=2dx+dxB:=2dx+dx C:=X<aC:=X<a

M:=U*X1M:=U*X1

EndloopEndloop

EndEnd

X:=X:=X+dxX+dx C:=X<aC:=X<a

Target:Target:-- synthesize synthesize distributed control distributed control -- 1 controller per functional unit1 controller per functional unit[Theobald/Nowick, IEEE Design Automation Conf. (2001)]

Page 21: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#21

MixedMixed--Timing InterfacesTiming Interfaces

AsynchronousDomain

SynchronousDomain 2

AsynchronousDomain

SynchronousDomain 1

Goal: provide low-latency communication between “timing domains”

Challenge: avoid synchronization errors

Goal: provide low-latency communication between “timing domains”

Challenge: avoid synchronization errors

Page 22: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#22

MixedMixed--Timing Interfaces: SolutionTiming Interfaces: SolutionAsync-Sync FIFO

AsynchronousDomain

SynchronousDomain 1

SynchronousDomain 2

Asy

nc-S

ync

FIFO

Sync

-Asy

ncFI

FO

… developed complete family of mixed-timing interface circuits[Chelcea/Nowick, IEEE Design Automation Conf. (2001)]

… developed complete family of mixed-timing interface circuits[Chelcea/Nowick, IEEE Design Automation Conf. (2001)]

Solution: insert mixed-timing FIFO’s ⇒ provide safe data transferSolution: insert mixed-timing FIFO’s ⇒ provide safe data transfer

AsynchronousDomain

Mixed-Clock FIFO’s

Page 23: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#23

HighHigh--Speed Asynchronous PipelinesSpeed Asynchronous Pipelines

global clock

NON-PIPELINED COMPUTATION: “datapath component” =adder, multiplier, etc.

SYNCHRONOUS

Page 24: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#24

HighHigh--Speed Asynchronous PipelinesSpeed Asynchronous Pipelines

global clock

SYNCHRONOUS

“PIPELINED COMPUTATION”: like an assembly line

no global clock ASYNCHRONOUS

Page 25: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#25

HighHigh--Speed Asynchronous PipelinesSpeed Asynchronous Pipelines

Goal: Goal: extremely fast extremely fast asyncasync datapathdatapath componentscomponents

speed:speed: comparable to comparable to fastestfastest existing synchronous designsexisting synchronous designs

additional benefits:additional benefits:

dynamically adapt to variabledynamically adapt to variable--speed interfaces: speed interfaces: voltage scaling!voltage scaling!

“elastic” processing of data in pipeline“elastic” processing of data in pipeline

no clock distributionno clock distribution

Contributions: Contributions: 3 new 3 new asyncasync pipeline stylespipeline styles [SINGH/NOWICK][SINGH/NOWICK]

MOUSETRAP:MOUSETRAP: static logicstatic logicHighHigh--Capacity/Capacity/LookaheadLookahead:: dynamic logicdynamic logic

Obtain multiObtain multi--GigaHertzGigaHertz speedsspeedsUsed by IBM, currently incorporated into Philips tool flowUsed by IBM, currently incorporated into Philips tool flow

Page 26: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#26

MOUSETRAP: A Basic FIFO (no computation)MOUSETRAP: A Basic FIFO (no computation)

Stages communicate using Stages communicate using transitiontransition--signaling:signaling:

reqN

ackN-1

reqN+1

ackN

Data Latch

Latch Controller

doneN

Data in Data out

En

Stage NStage N-1 Stage N+1

[Singh/Nowick, IEEE Int. Conf. on Computer Design (2001)]

Page 27: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#27

“MOUSETRAP” Pipeline: w/computation“MOUSETRAP” Pipeline: w/computation

logicData Latch

Latch Controller

doneN

logiclogic

reqreqNN

ackN-1

reqreqN+N+11

ackN

Stage N+1

delaydelay

Stage N

delaydelaydelaydelay

Stage N-1

Function Blocks:Function Blocks: use “synchronous” singleuse “synchronous” single--rail circuits (not hazardrail circuits (not hazard--free!)free!)“Bundled Data” Requirement:“Bundled Data” Requirement:

each each ““reqreq”” must arrive must arrive afterafter data inputs valid and stabledata inputs valid and stable

Page 28: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#28

Page 29: Advances in Designing Clockless DDigital Systemsigital Systemsnowick/async-overview-pt1-revd.pdf · Advances in Designing Clockless DDigital Systemsigital Systems Prof. Steven M.

#29

MOUSETRAP: A Basic FIFOMOUSETRAP: A Basic FIFOStages communicate using Stages communicate using transitiontransition--signaling:signaling:

reqN

ackN-1

reqN+1

ackN

Data Latch

Latch Controller

doneN

Data in Data out

En

1 transition1 transitionper data item!per data item!

Stage NStage N-1 Stage N+1

One Data Item