Advances in Designing Clockless Digital Systems

Advances in Designing Clockless Digital Systems. Prof. Steven M. Nowick, [email protected], Department of Computer Science, Columbia University, New York, NY, USA


Transcript of Advances in Designing Clockless Digital Systems

Page 1: Advances in Designing  Clockless Digital Systems

Advances in Designing Clockless Digital Systems

Prof. Steven M. Nowick

[email protected]

Department of Computer Science
Columbia University
New York, NY, USA

Page 2: Advances in Designing  Clockless Digital Systems

#2

Introduction

Synchronous vs. Asynchronous Systems?

Synchronous Systems:
- use a global clock
- entire system operates at fixed-rate
- uses “centralized control”

[Figure: system driven by a single clock]

Page 3: Advances in Designing  Clockless Digital Systems

#3

Introduction (cont.)

Synchronous vs. Asynchronous Systems? (cont.)

Asynchronous Systems:
- no global clock
- components can operate at varying rates
- communicate locally via “handshaking”
- uses “distributed control”

[Figure: components connected by “handshaking interfaces” (channels)]
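The slide does not say which handshaking protocol is used, so the following is only a minimal sketch under an assumption: a four-phase (return-to-zero) request/acknowledge handshake on one channel. The names `Channel`, `sender_step`, `receiver_step` and `run` are illustrative and do not come from the talk. Each side looks only at its local channel wires, and the `schedule` string stands in for components running at different speeds.

```python
from dataclasses import dataclass

@dataclass
class Channel:
    """Wires of one handshaking interface (names are illustrative)."""
    req: int = 0
    ack: int = 0
    data: object = None

def sender_step(ch, pending):
    """Let the sender act once, looking only at its local channel wires."""
    if ch.req == 0 and ch.ack == 0 and pending:
        ch.data = pending.pop(0)  # put data on the channel first
        ch.req = 1                # then raise the request
    elif ch.req == 1 and ch.ack == 1:
        ch.req = 0                # receiver has the data: withdraw request

def receiver_step(ch, received):
    """Let the receiver act once, looking only at its local channel wires."""
    if ch.req == 1 and ch.ack == 0:
        received.append(ch.data)  # data is valid while req is high
        ch.ack = 1                # acknowledge receipt
    elif ch.req == 0 and ch.ack == 1:
        ch.ack = 0                # return to idle, ready for the next item

def run(items, schedule):
    """Run both sides; schedule is a string of 'S'/'R' saying who acts next."""
    ch, pending, received = Channel(), list(items), []
    for who in schedule:
        if who == "S":
            sender_step(ch, pending)
        else:
            receiver_step(ch, received)
    return received

# Lockstep, a sender three times faster, and a receiver three times faster.
print(run([1, 2, 3], "SR" * 6))
print(run([1, 2, 3], "SSSR" * 6))
print(run([1, 2, 3], "SRRR" * 6))
```

All three schedules deliver the same items in the same order: correctness depends on the order of events on the channel, not on how fast either side runs.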

Page 4: Advances in Designing  Clockless Digital Systems

#4

Trends and Challenges

Trends in Chip Design: next decade
- “Semiconductor Industry Association (SIA) Roadmap” (97-8)

Unprecedented Challenges:
- complexity and scale (= size of systems)
- clock speeds
- power management
- reusability & scalability
- “time-to-market”

Design becoming unmanageable using a centralized single clock (synchronous) approach….

Page 5: Advances in Designing  Clockless Digital Systems

#5

Trends and Challenges (cont.)

1. Clock Rate:
- 1980: several MegaHertz
- 2001: ~750 MegaHertz - 1+ GigaHertz
- 2005: several GigaHertz

Design Challenge:
- “clock skew”: clock must be near-simultaneous across entire chip

Page 6: Advances in Designing  Clockless Digital Systems

#6

Trends and Challenges (cont.)

2. Chip Size and Density:

Total #Transistors per Chip: 60-80% increase/year
- ~1970: 4 thousand (Intel 4004 microprocessor)
- today: 50-200+ million
- 2006 and beyond: towards 1 billion+

Design Challenges:
- system complexity, design time, clock distribution
- clock will require 10-20 cycles to reach across chip

Page 7: Advances in Designing  Clockless Digital Systems

#7

Trends and Challenges (cont.)

3. Power Consumption

Low power: ever-increasing demand
- consumer electronics: battery-powered
- high-end processors: avoid expensive fans, packaging

Design Challenge:
- clock inherently consumes power continuously
- “power-down” techniques: complex, only partly effective

Page 8: Advances in Designing  Clockless Digital Systems

#8

Trends and Challenges (cont.)

4. Time-to-Market, Design Re-Use, Scalability

Increasing pressure for faster “time-to-market”. Need:
- reusable components: “plug-and-play” design
- flexible interfacing: under varied conditions, voltage scaling
- scalable design: easy system upgrades

Design Challenge:
- mismatch w/ central fixed-rate clock

Page 9: Advances in Designing  Clockless Digital Systems

#9

Trends and Challenges (cont.)

5. Future Trends: “Mixed Timing” Domains

Chips themselves becoming distributed systems….
- contain many sub-regions, operating at different speeds

Design Challenge:
- breakdown of single centralized clock control

Page 10: Advances in Designing  Clockless Digital Systems

#10

Asynchronous Design: Potential Advantages

Several Potential Advantages:

Lower Power
- no clock
- components use power only “on demand”

Robustness, Scalability
- no global timing
- “mix-and-match” variable-speed components
- composable/modular design style: “object-oriented”

Higher Performance
- systems not limited to “worst-case” clock rate

Page 11: Advances in Designing  Clockless Digital Systems

#11

Asynchronous Design: Some Recent Developments

1. Philips Semiconductors:
- commercial use: 100 million async chips for consumer electronics: pagers, cell phones, smart cards, digital passports, automotive
- 3-4x lower power, less electromagnetic interference (“EMI”)

2. Intel:
- experimental: Pentium instruction-length decoder = “RAPPID” (1990’s)
- 3-4x faster than synchronous subsystem

3. Sun Labs:
- commercial use: high-speed FIFO’s in recent “Ultra’s” (memory access)

4. IBM Research:
- experimental: high-speed pipelines, filters, mixed-timing systems

Recent Startups: Fulcrum, Theseus Logic, Handshake Solutions, Silistrix

Page 12: Advances in Designing  Clockless Digital Systems

#12

Asynchronous CAD Tools: Recent Developments

DARPA’s “CLASS” Program: Clockless Initiative (2003-07)

Goals:
- CAD tool: produce viable commercial-grade async tool flow
- demonstration: a complex Boeing ASIC chip

Participants:
- Lead (PI): Boeing
- Industrial participants: Philips (via async incubated startup, “Handshake Solutions”), Theseus Logic, Codetronix
- Academic participants: Columbia, UNC, UW, Yale, OSU

Targets: cover wide “design space”: very robust to high-speed circuits

Columbia’s role: (i) high-speed pipelines, (ii) CAD optimizations

Page 13: Advances in Designing  Clockless Digital Systems

#13

Asynchronous Design: Challenges

Critical Design Issues:
- components must communicate cleanly: ‘hazard-free’ design
- highly-concurrent designs: much harder to verify!

Lack of Automated “Computer-Aided Design” Tools:
- most commercial “CAD” tools targeted to synchronous

Page 14: Advances in Designing  Clockless Digital Systems

#14

What Are CAD Tools?

Software programs to aid digital designers = “computer-aided design” tools
- automatically synthesize and optimize digital circuits

[Figure: CAD TOOL. Input: desired circuit specification. Output: optimized circuit implementation]

Page 15: Advances in Designing  Clockless Digital Systems

#15

Asynchronous Design Challenge

Lack of Existing Asynchronous Design Tools:
- Most commercial “CAD” tools targeted to synchronous

Synchronous CAD tools:
- major drivers of growth in microelectronics industry

Asynchronous “chicken-and-egg” problem:
- few CAD tools <=> less commercial use of async design
- especially lacking: tools for designing/optimizing large systems

Page 16: Advances in Designing  Clockless Digital Systems

#16

Overview: My Research Areas

CAD Tools for Asynchronous Controllers (FSM’s)
- “MINIMALIST” Package: for synthesis + optimization

Other Research Areas:
- CAD Tools for Designing Large-Scale Async Systems
- Mixed-Timing Interface Circuits: for interfacing sync/async systems
- High-Speed Asynchronous Pipelines

Page 17: Advances in Designing  Clockless Digital Systems

#17

CAD Tools for Async Controllers

MINIMALIST: developed at Columbia University [1994-]
- extensible CAD package for synthesis of asynchronous controllers
- integrates synthesis, optimization and verification tools
- used in 80+ sites/17+ countries (being taught in IIT Bombay)
- URL: http://www.cs.columbia.edu/async

Includes several optimization tools:
- State Minimization
- CHASM: optimal state encoding
- 2-Level Hazard-Free Logic Minimization
- Verilog back-end

Key goal: facilitate design-space exploration
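To illustrate what “hazard-free” means for two-level logic, here is a small sketch of the classical textbook check for static-1 hazards under single-input changes. It is not MINIMALIST's algorithm, which the slides do not describe; the function names and the cube encoding (a dict from variable index to required value) are assumptions made for the example. The rule checked is that when two adjacent input points both produce 1, a single product term must cover both, otherwise the output may glitch during the transition.

```python
from itertools import product

def covers(cube, point):
    """True if the product term (cube) evaluates to 1 at this input point."""
    return all(point[i] == v for i, v in cube.items())

def evaluate(cover, point):
    """Value of the sum-of-products cover at this input point."""
    return any(covers(cube, point) for cube in cover)

def static1_hazards(cover, n_vars):
    """List adjacent ON-set point pairs that no single cube covers."""
    hazards = []
    for point in product((0, 1), repeat=n_vars):
        if not evaluate(cover, point):
            continue
        for i in range(n_vars):
            if point[i] == 1:
                continue  # visit each adjacent pair only once (0 -> 1 side)
            other = point[:i] + (1,) + point[i + 1:]
            if not evaluate(cover, other):
                continue
            if not any(covers(c, point) and covers(c, other) for c in cover):
                hazards.append((point, other))
    return hazards

# f = a*b + a'*c with variables indexed a=0, b=1, c=2.
cover = [{0: 1, 1: 1}, {0: 0, 2: 1}]
print(static1_hazards(cover, 3))

# Adding the redundant consensus term b*c removes the hazard.
print(static1_hazards(cover + [{1: 1, 2: 1}], 3))
```

The example shows why a hazard-free minimizer cannot simply pick the smallest cover: the extra term `b*c` is logically redundant but required for glitch-free behavior.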

Page 18: Advances in Designing  Clockless Digital Systems

#18

Example: “PE-SEND-IFC” (HP Labs)

Inputs: req-send, treq, rd-iq, adbld-out, ack-pkt

Outputs: tack, peack, adbld

[Figure: state graph of the controller specification, states 0-10. Edge labels (input changes / output changes), in extraction order:]
- req-send+ treq+ rd-iq+ / adbld+
- adbld-out+ / peack+
- rd-iq- / peack- adbld- tack+
- adbld-out- treq- rd-iq+ / adbld+
- adbld-out+ / peack+
- rd-iq- / peack- adbld- tack-
- adbld-out- treq+ ack-pkt+ / peack+ tack+
- ack-pkt- treq- / peack- tack-
- treq- / tack-
- treq+ / tack+
- ack-pkt+ / peack- tack-
- adbld-out- treq- ack-pkt+ / peack+
- req-send- / --
- adbld-out- treq+ rd-iq+ / adbld+

From HP Labs “Mayfly” Project: B. Coates, A. Davis, K. Stevens, “The Post Office Experience: Designing a Large Asynchronous Chip”, INTEGRATION: the VLSI Journal, vol. 15:3, pp. 341-66 (Oct. 1993)
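The edge labels in this specification have the form “input changes / output changes”, the style used for burst-mode controllers. The sketch below shows how such a specification can be read, using the first three labels from the slide. The state numbers 0 to 3 and the assumption that these three edges form a chain are hypothetical, since the graph structure is not recoverable from the transcript; `run_burst_mode` is an illustrative name, not part of MINIMALIST. Real specifications also allow several outgoing edges per state, which this fragment omits.

```python
# state -> (input burst, output burst, next state); numbering is assumed.
SPEC_FRAGMENT = {
    0: ({"req-send+", "treq+", "rd-iq+"}, {"adbld+"}, 1),
    1: ({"adbld-out+"}, {"peack+"}, 2),
    2: ({"rd-iq-"}, {"peack-", "adbld-", "tack+"}, 3),
}

def run_burst_mode(spec, start, input_events):
    """Feed input events one at a time; return (final state, output bursts)."""
    state, seen, outputs = start, set(), []
    for event in input_events:
        if state not in spec:
            break  # fragment ends here
        burst, out, nxt = spec[state]
        if event not in burst:
            raise ValueError(f"unexpected input {event!r} in state {state}")
        seen.add(event)
        if seen == burst:          # the whole input burst has arrived
            outputs.append(sorted(out))
            state, seen = nxt, set()
    return state, outputs

# Inputs within one burst may arrive in any order.
events = ["treq+", "rd-iq+", "req-send+", "adbld-out+", "rd-iq-"]
print(run_burst_mode(SPEC_FRAGMENT, 0, events))
```

The outputs of a state fire only after every input of its burst has changed, which is why the order of arrival inside a burst does not matter.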

Page 19: Advances in Designing  Clockless Digital Systems

#19

EXAMPLE (cont.):

Examples: Design-Space Exploration using MINIMALIST:

optimizing for area vs. speed

Page 20: Advances in Designing  Clockless Digital Systems

#20

CAD Tools for Large-Scale Asynchronous Systems

Target:
- synthesize distributed control
- 1 controller per functional unit

Input Specification: = “Control Data-flow Graph”

[Figure: control data-flow graph with nodes Start; Loop; C < 0; B := 2dx + dx; C := X < a; M := U*X1; Endloop; End; X := X + dx; C := X < a]

Target Architecture:

[Figure: Register, Register; three Functional Units; control unit: Ctrlr 1, Ctrlr 2, Ctrlr 3]

[Theobald/Nowick, IEEE Design Automation Conf. (2001)]

Page 21: Advances in Designing  Clockless Digital Systems

#21

Mixed-Timing Interfaces

[Figure: Asynchronous Domain, Synchronous Domain 1, Synchronous Domain 2, Asynchronous Domain]

Goal: provide low-latency communication between “timing domains”

Challenge: avoid synchronization errors

Page 22: Advances in Designing  Clockless Digital Systems

#22

Mixed-Timing Interfaces: Solution

[Figure: Asynchronous Domain, Synchronous Domain 1, Synchronous Domain 2 and Asynchronous Domain, connected through Async-Sync FIFO, Async-Sync FIFO, Sync-Async FIFO and Mixed-Clock FIFO’s]

Solution: insert mixed-timing FIFO’s to provide safe data transfer

… developed complete family of mixed-timing interface circuits [Chelcea/Nowick, IEEE Design Automation Conf. (2001)]
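The slides do not describe the internals of these FIFO’s. As a behavioral sketch only, the code below assumes one common organization: a circular array of cells with a put token and a get token, plus full and empty detection, so that the sender and the receiver never touch the same cell at the same time. The class and method names are illustrative. The sketch deliberately leaves out what makes the real circuits hard, namely synchronizing the full/empty flags into each timing domain.

```python
class TokenRingFifo:
    """Behavioral model of a FIFO built from a ring of cells (assumed design)."""

    def __init__(self, capacity):
        self.cells = [None] * capacity
        self.valid = [False] * capacity  # one "holds data" bit per cell
        self.put_token = 0               # cell the sender writes next
        self.get_token = 0               # cell the receiver reads next

    def full(self):
        return all(self.valid)

    def empty(self):
        return not any(self.valid)

    def put(self, item):
        """Sender side: store an item, or refuse (return False) when full."""
        if self.full():
            return False
        i = self.put_token
        self.cells[i], self.valid[i] = item, True
        self.put_token = (i + 1) % len(self.cells)
        return True

    def get(self):
        """Receiver side: remove the oldest item, or return None when empty."""
        if self.empty():
            return None
        i = self.get_token
        item, self.valid[i] = self.cells[i], False
        self.get_token = (i + 1) % len(self.cells)
        return item

fifo = TokenRingFifo(2)
print(fifo.put("a"), fifo.put("b"), fifo.put("c"))  # third put finds the FIFO full
print(fifo.get(), fifo.get(), fifo.get())           # third get finds it empty
```

Because each side advances only its own token, the two sides can run at unrelated rates; in hardware the remaining risk is reading a stale full or empty flag, which is the synchronization problem the slide refers to.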

Page 23: Advances in Designing  Clockless Digital Systems

#23

High-Speed Asynchronous Pipelines

NON-PIPELINED COMPUTATION:

[Figure: SYNCHRONOUS datapath driven by a global clock]

“datapath component” = adder, multiplier, etc.

Page 24: Advances in Designing  Clockless Digital Systems

#24

High-Speed Asynchronous Pipelines

“PIPELINED COMPUTATION”: like an assembly line

[Figure: SYNCHRONOUS pipeline (global clock) and ASYNCHRONOUS pipeline (no global clock)]

Page 25: Advances in Designing  Clockless Digital Systems

#25

High-Speed Asynchronous Pipelines

Goal: extremely fast async datapath components
- speed: comparable to fastest existing synchronous designs
- additional benefits:
  - dynamically adapt to variable-speed interfaces: voltage scaling!
  - “elastic” processing of data in pipeline
  - no clock distribution

Contributions: 3 new async pipeline styles [SINGH/NOWICK]
- MOUSETRAP: static logic
- High-Capacity/Lookahead: dynamic logic

Obtain multi-GigaHertz speeds

Used by IBM, currently incorporated into Philips tool flow

Page 26: Advances in Designing  Clockless Digital Systems

#26

MOUSETRAP: A Basic FIFO (no computation)

Stages communicate using transition-signaling:

[Figure: Stage N-1, Stage N, Stage N+1. Stage N contains a Data Latch (Data in, Data out) and a Latch Controller driving the latch enable En; signals reqN, ackN-1, doneN, reqN+1, ackN]

[Singh/Nowick, IEEE Int. Conf. on Computer Design (2001)]
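The figure's signals can be exercised with a small behavioral model. The slide does not give the latch controller's logic; the sketch assumes the published MOUSETRAP arrangement in which the latch enable is En = XNOR(doneN, ackN), and doneN serves as both reqN+1 and ackN-1. The class `MousetrapFifo` and its methods are illustrative names, and the model ignores all gate delays.

```python
class MousetrapFifo:
    """Zero-delay behavioral model of an n-stage MOUSETRAP FIFO (a sketch)."""

    def __init__(self, n_stages):
        self.n = n_stages
        self.done = [0] * n_stages     # done_N: also req_{N+1} and ack_{N-1}
        self.data = [None] * n_stages  # contents of each stage's data latch
        self.req_in = 0                # request wire driven by the left side
        self.data_in = None
        self.ack_out = 0               # acknowledge wire driven by the right side

    def _enable(self, i):
        """Latch is transparent when done_N equals ack_N (XNOR)."""
        ack = self.done[i + 1] if i + 1 < self.n else self.ack_out
        return self.done[i] == ack

    def _settle(self):
        """Propagate requests and data until no wire changes."""
        changed = True
        while changed:
            changed = False
            for i in range(self.n):
                req = self.done[i - 1] if i > 0 else self.req_in
                dat = self.data[i - 1] if i > 0 else self.data_in
                if self._enable(i) and self.done[i] != req:
                    self.done[i], self.data[i] = req, dat  # capture, then close
                    changed = True

    def send(self, item):
        """Left environment: toggle req once per item, if the last was acked."""
        if self.req_in != self.done[0]:
            return False               # previous request not yet acknowledged
        self.data_in = item
        self.req_in ^= 1               # one transition = one data item
        self._settle()
        return True

    def receive(self):
        """Right environment: take the item at the output and toggle ack."""
        last = self.n - 1
        if self.done[last] == self.ack_out:
            return None                # no new item at the output
        item = self.data[last]
        self.ack_out ^= 1
        self._settle()
        return item

fifo = MousetrapFifo(3)
print([fifo.send(x) for x in (1, 2, 3, 4, 5)])
print([fifo.receive() for _ in range(5)])
```

In this model three items fill the three latches and a fourth request is left pending on the input wires, so the fifth `send` is refused until the right side starts acknowledging. Items leave in the order they entered.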

Page 27: Advances in Designing  Clockless Digital Systems

#27

“MOUSETRAP” Pipeline: w/computation

[Figure: Stage N-1, Stage N, Stage N+1, each with a logic block and a delay element; Stage N shows Data Latch, Latch Controller and doneN; signals reqN, ackN-1, reqN+1, ackN]

Function Blocks: use “synchronous” single-rail circuits (not hazard-free!)

“Bundled Data” Requirement:
- each “req” must arrive after data inputs valid and stable
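The bundled-data requirement is a one-sided timing constraint per stage: the delay on the request path must be at least the worst-case delay of that stage's logic block, plus whatever margin the designer wants. A minimal sketch of such a check follows; the function name, the tuple layout and all delay values are hypothetical and chosen only for illustration.

```python
def bundling_violations(stages, margin=0.0):
    """Return names of stages whose request could arrive before data settles.

    stages: iterable of (name, worst_case_logic_delay, request_delay),
    all in the same time unit.
    """
    return [name for name, logic, req in stages if req < logic + margin]

# Hypothetical delays in picoseconds.
stages = [("N-1", 300, 350), ("N", 420, 400), ("N+1", 250, 260)]
print(bundling_violations(stages))             # stage N violates
print(bundling_violations(stages, margin=20))  # N and N+1 violate with 20 ps margin
```

Because the data only has to be stable when the request arrives, glitches inside the logic block are harmless, which is why the function blocks need not be hazard-free.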

Page 28: Advances in Designing  Clockless Digital Systems

#28

Page 29: Advances in Designing  Clockless Digital Systems

#29

MOUSETRAP: A Basic FIFO

Stages communicate using transition-signaling: 1 transition per data item!

[Figure: Stage N-1, Stage N, Stage N+1; Data Latch (Data in, Data out), Latch Controller, En, doneN; signals reqN, ackN-1, reqN+1, ackN; annotated “One Data Item”]
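“1 transition per data item” can be made concrete by counting events on the request wire. The sketch below encodes a stream of items with transition signaling, where every toggle of the wire, rising or falling, announces one item. For contrast it also counts a return-to-zero (four-phase) encoding; that protocol is not on the slide and is included only as an assumed point of comparison. Function names are illustrative.

```python
def transition_signaling(n_items):
    """Return the req wire level after each item; every toggle is one item."""
    level, levels = 0, []
    for _ in range(n_items):
        level ^= 1
        levels.append(level)
    return levels

def four_phase(n_items):
    """Return req wire levels for a return-to-zero protocol, for comparison."""
    levels = []
    for _ in range(n_items):
        levels += [1, 0]  # raise for the item, then return to zero
    return levels

def count_transitions(levels, start=0):
    """Count how many times the wire changes level, starting from `start`."""
    prev, n = start, 0
    for level in levels:
        n += level != prev
        prev = level
    return n

items = 4
print(transition_signaling(items), count_transitions(transition_signaling(items)))
print(four_phase(items), count_transitions(four_phase(items)))
```

For the same number of items the return-to-zero wire switches twice as often, which is the overhead transition signaling avoids.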