Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ISPD 2005, San Francisco, CA, May 5th, 2005. Mario R. Casu - Politecnico di Torino and Luca Macchiarulo - University of Hawaii at Manoa


Transcript of Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Page 1: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ISPD 2005, San Francisco, CA

May 5th, 2005

Mario R. Casu - Politecnico di Torino

and Luca Macchiarulo - University of Hawaii at Manoa

Page 2: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Outline

Communication concerns at the physical layer

Great Expectations of “Wire Pipelining”
– No block delay
– Block delay limitation

Computation locality

Adaptive Communications

Floorplanning strategy for adaptive systems

Experimental results

Page 3: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Wire pipelining - concept

Wire delay: substantial share of overall delay

Global wires difficult to deal with

Global wires scaling does not follow
– Transistors
– Local wiring

[Figure label: Del]

Page 4: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Wire pipelining - concept

Introducing a latch/FF reduces the timing constraints

Similar to classical pipelining

[Figure labels: Del’, Del’’]

Page 5: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Critical Length

Maximal length for which the wire can be driven at a given frequency
– Optimum number of buffers
– Optimum buffer dimensions
– Optimum wire sizing

[Figure label: Del=1/f]

Page 6: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Wire Pipelining

Above critical length, clocked elements are needed (pipeline stages)

[Figure label: Del>1/f]

Page 7: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

“Wire Pipelining” techniques

Problem: maintaining functionality with a minimum loss in performance.

Solutions:
– Globally Asynchronous Locally Synchronous – GALS
– Retiming
– Regular Distributed Register (J. Cong)
– c-slowing (S. Sapatnekar)
– Latency Insensitive Protocols (L. Carloni)

Page 8: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

LIPs: Concept

[Figure labels: Shell, Pearl, Relay Station]

Page 9: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Shell – Relay Station Interaction

[Figure labels: valid, stop]

Page 10: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Feedback Topology

[Figure: feedback loop, animation frame; token labels τ and 0]

Page 11: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Feedback Topology

[Figure: feedback loop, animation frame; token labels 0 and τ]

Page 12: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Feedback Topology

[Figure: feedback loop, animation frame; token labels τ, 0 and 1; sequence shown: 0 τ 1]

Page 13: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Feedback Topology

[Figure: feedback loop, animation frame; token labels 1 and τ; sequence shown: 0 τ 1 τ]

Page 14: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Feedback Topology

[Figure: feedback loop, animation frame; token labels τ and 1; sequence shown: 0 τ 1 τ τ]

Page 15: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Feedback Topology

[Figure: feedback loop, animation frame; token labels τ and 2; sequence shown: 0 τ 1 τ τ 2]

Page 16: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Feedback Topology: Performance

Void data circulate in the loops: initially as many as relay stations (r)

“Period” of void-stop equal to the number of shells (s) and relay stations (r) in the loop

Worst loop fixes the throughput: T=s/(s+r)

Ta=2/4, Tb=2/5

T=2/5

[Figure: feedback topology with loops a and b; token labels τ and 2; sequence shown: 0 τ 1 τ τ 2]
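As a rough illustration of the rule above, the sketch below evaluates T=s/(s+r) per loop and takes the minimum. The function names are made up for this example, and the (s, r) counts for loops a and b are inferred from the fractions Ta=2/4 and Tb=2/5 rather than read off the figure.

```python
from fractions import Fraction

def loop_throughput(shells: int, relay_stations: int) -> Fraction:
    """Throughput of a single loop: T = s / (s + r)."""
    return Fraction(shells, shells + relay_stations)

def system_throughput(loops) -> Fraction:
    """The worst (lowest-throughput) loop fixes the system throughput."""
    return min(loop_throughput(s, r) for s, r in loops)

# Assumed counts: loop a has 2 shells and 2 relay stations,
# loop b has 2 shells and 3 relay stations.
loops = {"a": (2, 2), "b": (2, 3)}
per_loop = {name: loop_throughput(s, r) for name, (s, r) in loops.items()}
T = system_throughput(loops.values())
print(per_loop, T)
```

Exact fractions are used so the result can be compared directly with the values on the slide.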

Page 17: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Classical Floorplanning

Problem: find a placement of (soft or hard) blocks that optimally fits a floorplan

Optimality is whitespace, overall wirelength, critical path, or a combination

Page 18: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Floorplanning for Throughput [ISPD2004]

The optimal floorplan in our case is that which guarantees the maximum throughput compatible with given blocks’ dimensions

Maximum throughput is equivalent to the worst cost-to-time ratio loop

Page 19: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

New Heuristic Throughput Computation

Heuristic:
– Statically compute, for every edge e, the shortest loop l(e) in which e appears
– For every optimization iteration:
  Cost(e) = 1/l(e) * floor(length/C_length)
  TotCost = Σ cost(e)
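The heuristic can be sketched as follows. This is a minimal reading of the slide, not the floorplanner's actual code: l(e) is assumed to be the number of edges in the shortest loop through e, found by breadth-first search, and the names `shortest_loop_lengths`, `total_cost`, `wire_length` and `critical_length` are hypothetical.

```python
from collections import deque
from math import floor

def shortest_loop_lengths(edges):
    """For each directed edge (u, v), the edge count of the shortest loop through it.

    Computed as 1 + BFS distance from v back to u. Edges on no loop are omitted.
    """
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    result = {}
    for u, v in edges:
        dist = {v: 0}
        queue = deque([v])
        while queue and u not in dist:
            node = queue.popleft()
            for nxt in succ.get(node, []):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        if u in dist:
            result[(u, v)] = dist[u] + 1
    return result

def total_cost(wire_length, loop_len, critical_length):
    """TotCost = sum over edges of floor(length / C_length) / l(e)."""
    return sum(
        floor(wire_length[e] / critical_length) / loop_len[e]
        for e in loop_len
    )

# Static step, done once per netlist.
edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "A")]
loop_len = shortest_loop_lengths(edges)

# Per-iteration step: wire lengths come from the current candidate floorplan.
wire_length = {("A", "B"): 2.5, ("B", "A"): 1.0, ("B", "C"): 4.0, ("C", "A"): 0.5}
cost = total_cost(wire_length, loop_len, critical_length=1.0)
print(loop_len, cost)
```

Only `total_cost` depends on the placement, which is what makes the estimate cheap enough to evaluate inside an optimization loop.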

Page 20: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Throughput-frequency trade-off

f=1/L

T=1

DR0 = 1 · 1/L = 1/L

Page 21: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Throughput-frequency trade-off

f=2/L

T=2/(2+2)=1/2

DR = 1/2 · 2/L = 1/L

No advantage!

Page 22: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Throughput-frequency trade-off

[Figure labels: L, L, L/2]

f=1/L

T=1

DR0 = 1/L · 1 = 1/L

Page 23: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Throughput-frequency trade-off

[Figure labels: five wire segments of length L/2]

f=2/L

T=3/(3+2)

DR = 2/L · 3/5 = 6/(5L)
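The arithmetic of the trade-off examples can be checked with a few lines. This is only a sketch: the wire length L is normalised to 1 as an assumption, and the variable names are invented for illustration.

```python
from fractions import Fraction

def data_rate(throughput, frequency):
    """Data rate is throughput times clock frequency."""
    return throughput * frequency

L = Fraction(1)  # normalised wire length (assumption)

dr0 = data_rate(Fraction(1), 1 / L)                   # f = 1/L, T = 1
dr_loop_2_2 = data_rate(Fraction(2, 2 + 2), 2 / L)    # f = 2/L, T = 2/(2+2)
dr_loop_3_2 = data_rate(Fraction(3, 3 + 2), 2 / L)    # f = 2/L, T = 3/(3+2)

print(dr0, dr_loop_2_2, dr_loop_3_2, dr_loop_3_2 / dr0)
```

In the first pipelined case the doubled frequency is cancelled exactly by the halved throughput, while in the second the ratio to the non-pipelined data rate is 6/5.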

Page 24: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Data Rate as the basic performance metric – Speed-up

Wire pipelining allows increased frequency

But it decreases the throughput according to the previous considerations

Real performance is given by DATA RATE=Thr*f

Advantage w.r.t. non-pipelined systems to be assessed through DR measures

Speed-Up SU=DR/DR0

L/(l_m+l_max) < SU < L/l_m

Floorplanning can be extremely beneficial if it can reduce the average branch length l_m
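To see how the bounds react to the average branch length, here is a small sketch. The function name and the numeric lengths are invented for illustration, and l_max is taken to be a maximum branch length, which the slide does not state explicitly.

```python
def speedup_bounds(L, l_m, l_max):
    """Return (lower, upper) such that L/(l_m + l_max) < SU < L/l_m."""
    if min(L, l_m, l_max) <= 0:
        raise ValueError("lengths must be positive")
    return L / (l_m + l_max), L / l_m

# Hypothetical lengths in arbitrary units: halving l_m doubles the upper bound.
before = speedup_bounds(10.0, 2.0, 4.0)
after = speedup_bounds(10.0, 1.0, 4.0)
print(before, after)
```

Both bounds grow as l_m shrinks, which is the sense in which a floorplan that shortens branches helps.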

Page 25: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Block delay effect

Blocks put a cap on the max frequency
– f_max < 1/max_i(d_i)

We can measure delay in “length”, by using a proportionality factor

Block delay can enter the picture if signals are latched at the input or output side only

[Figure labels: L, ld]

Page 26: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Block delay models

We used two different models
– Delay proportional to block edge
  Rationale: complexity of logic is related to block size
  Minimum constant of proportionality=1: delay is the same needed for the fastest signal to traverse the entire block
  Optimistic assumption
– Delay constant, related to technology and equal to 13FO4
  Derived from assumptions in the roadmap
  More realistic for high performance design
  More pessimistic (see below)

Probably the reality is somewhere between the two cases

Page 27: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Speed-up with block delay

Taking the block delay into account modifies the previous considerations

max(L_i+d_i)/(l_m+d_m+d_max) < SU < max(L_i+d_i)/(l_m+d_m)

In general, much worse than previous case

Page 28: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Throughput driven floorplan experiments

We used the floorplanner described in ISPD’04 to evaluate the optimal frequency (maximum DR)

On GSRC and MCNC benchmarks with input-output information

No block delay:
– SU varies between 0.8 and 36%
– Better on benchmarks with greater complexity

Block delay
– Proportional to blocks’ edges: -7% to 44%
– Equal to 13FO4: -11% to 12%
– MCNC suite shows the worse behavior

High speed systems with highly optimized blocks lead to negligible or irrelevant SU, for a high increase of clock frequency.

Page 29: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Space for better performance?

Not all point to point connections are actually used at every clock cycle.

Ex. CPU to Cache communication.

[Figure: read cycle; channel labels: Addr, Data-in, Data-out]

Page 30: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Space for better performance?

Not all point to point connections are actually used at every clock cycle.

Ex. CPU to Cache communication.

[Figure: write cycle; channel labels: Addr, Data-in, Data-out]

Page 31: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Space for better performance?

Unused communication channels effectively break throughput-limiting loops

Pipelining without limitation can become possible

[Figure: stream write cycle; labels: Addr 1, Data-out 1, τ]

Page 32: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Space for better performance?

Unused communication channels effectively break throughput-limiting loops

Pipelining without limitation can become possible

[Figure: stream write cycle; labels: Addr 2, Data-out 2, Addr 1, Data-out 1]

Page 33: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Space for better performance?

Unused communication channels effectively break throughput-limiting loops

Pipelining without limitation can become possible

[Figure: stream write cycle; labels: Addr 3, Data-out 3, Addr 2, Data-out 2]

Page 34: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Adaptive Latency Insensitive Protocol

Need a mechanism to allow discarding useless “packets” by blocks: Adaptive communication

Details out of the scope of the paper but
– It is possible through a simple modification of the original protocol
– Requires the introduction of “oracles” predicting unused inputs for each block
– We designed a functional implementation in synthesizable VHDL
– We proved the correctness of the implementation (absence of deadlocks and correct signal sequencing)

Page 35: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ALIP performance evaluation

The adaptiveness of the approach prevents a static prediction of performance

However, a few conclusions can be reached:
– The performance is bounded above by static LIP
– Performance in long sequences of input independence is equivalent to the simplified network with the channel removed

If the system experiences infrequent “context switching” on its channels, such that at any given time the performance is static Th_i, the average performance can be approximated as:
– Th = Σ_i (fraction_i · Th_i)
  fraction_i: fraction of time with performance Th_i
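The time-weighted average can be written out as a short sketch. The function name and the two-phase example values are illustrative assumptions; in practice the time fractions would have to be measured or estimated for the actual workload.

```python
from fractions import Fraction

def average_throughput(phases):
    """phases: iterable of (time_fraction, static_throughput) pairs."""
    phases = list(phases)
    if sum(frac for frac, _ in phases) != 1:
        raise ValueError("time fractions must sum to 1")
    return sum(frac * th for frac, th in phases)

# Hypothetical workload: half the time at throughput 1, half at throughput 1/2.
phases = [(Fraction(1, 2), Fraction(1)), (Fraction(1, 2), Fraction(1, 2))]
print(average_throughput(phases))  # 3/4
```

The approximation ignores any cycles lost when the system switches between phases.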

Page 36: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ALIP performance evaluation - Example

[Figure: stream write cycle; labels: Addr 1, Data-out 1, τ]

Ck=1, Valid Data=1

Page 37: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ALIP performance evaluation - Example

[Figure: stream write cycle; labels: Addr 2, Data-out 2, Addr 1, Data-out 1]

Ck=2, Valid Data=2

Page 38: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ALIP performance evaluation - Example

[Figure: stream write cycle; labels: Addr 3, Data-out 3, Addr 2, Data-out 2]

Ck=3, Valid Data=3

Page 39: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ALIP performance evaluation - Example

[Figure: read cycle; labels: Addr 4, Addr 3, Data-out 3]

Ck=4, Valid Data=4

Page 40: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ALIP performance evaluation - Example

[Figure: read cycle; labels: -----, Addr 4, τ, τ]

Ck=5, Valid Data=5

Page 41: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ALIP performance evaluation - Example

[Figure: read cycle; labels: -----, Data-in 4, τ, τ]

Ck=6, Valid Data=5

Page 42: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ALIP performance evaluation - Example

[Figure: read cycle; labels: -----, τ, Data-in 4, τ]

Ck=7, Valid Data=5

Page 43: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ALIP performance evaluation - Example

[Figure: read cycle; labels: -----, τ, Addr 5, τ]

Ck=8, Valid Data=6

Page 44: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

ALIP performance evaluation - Example

[Figure: read cycle; labels: -----, τ, Addr 5, τ]

Ck=8, Valid Data=6

Throughput=3/4, Th1=1, Th2=1/2, with time fractions 1/2 and 1/2

Page 45: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Adaptive communication performance evaluation - assumptions

Assumption 1: No time lost in “context switching”
– Unrealistic, but acceptable for burst communication, and consistent with experiments

Assumption 2: Channels behave in a statistically independent fashion
– Only single clock cycle independence is important for our purposes

Under 1 and 2, we can compute channel activities and use them to weight the connections

Page 46: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Floorplanning for Throughput – adaptive case

The optimal floorplan in our case is that which guarantees the maximum throughput compatible with given blocks’ dimensions

Maximum throughput is equivalent to the worst cost-to-time ratio loop, weighted by the loop activation ratio

It can be approximated by taking into account the channel activation ratio

Page 47: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

New Heuristic Throughput Computation

Heuristic:
– Statically compute, for every edge e, the shortest loop l(e) in which e appears
– For every optimization iteration:
  Cost(e) = 1/l(e) * floor(length/C_length) * activation(e)
  TotCost = Σ cost(e)

The only change consists in the inclusion of the term activation(e), the channel activation ratio of edge e

Page 48: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Experiments

GSRC/MCNC benchmarks
– Burst mode
– Uniformly distributed phases and activation times
– Comparison between non-pipelined solution and adaptively pipelined (13FO4 case)
– After optimization, a VHDL netlist is automatically generated and simulated to measure the real performance of the system (as opposed to the approximation from the floorplanner)

Results:
– SU between 16 and 44%
– Monotonic behavior in the legal interval
– Limitations due mainly to FO4 delays

Page 49: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Experiments

MPEG decoder
– Strict data dependency
– Optimization as in other cases
– Simulation as before and with real channel utilization profiles

Results:
– SU of 42% with block delay, 76% without
– Real SU of 31% (effect of non-random correlation)

Page 50: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Conclusions and future work

Pure “blind” pipelining fails to achieve available optimization, due to neglect of common information

Adaptive protocols can take advantage of the information available to the blocks

We will concentrate on
– Automated extraction of information from the blocks
– Power optimization (power/timing trade-offs)
– Routing constraints effects

Page 51: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Thank you

Page 52: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Shell – Relay Station Interaction

[Figure labels: valid, stop; data item: a]

Page 53: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Shell – Relay Station Interaction

[Figure labels: valid, stop; data items: b, a]

Page 54: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Shell – Relay Station Interaction

[Figure labels: valid, stop; data items: c, b]

Page 55: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Shell – Relay Station Interaction

[Figure labels: valid, stop; data items: d, b, c]

Page 56: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Feedforward equalization

Maximum performance can be recovered by equalizing various paths

Longest path computation to obtain the appropriate number of added relay stations
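One way to read the longest-path step is sketched below. It is an assumption-laden illustration, not the authors' tool: latency is counted in relay stations only, all source blocks are taken as aligned at time 0, and the function name `equalize` and the example netlist are hypothetical.

```python
from collections import defaultdict, deque

def equalize(edges):
    """edges: dict {(u, v): relay_stations} describing a feedforward (acyclic) netlist.

    Returns {(u, v): extra relay stations} so that every path into a block
    carries the same total number of relay stations.
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for (u, v) in edges:
        succ[u].append(v)
        indeg[v] += 1
        nodes.update((u, v))

    # Longest-path arrival time of each block, in topological order (Kahn).
    arrival = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if indeg[n] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            arrival[v] = max(arrival[v], arrival[u] + edges[(u, v)])
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != len(nodes):
        raise ValueError("graph has a loop; equalization applies to feedforward paths")

    # The slack of each edge is the number of relay stations to add on it.
    return {(u, v): arrival[v] - arrival[u] - r for (u, v), r in edges.items()}

extra = equalize({("A", "B"): 2, ("A", "C"): 0, ("B", "D"): 1, ("C", "D"): 0})
print(extra)
```

In this example the short path A, C, D receives three extra relay stations on its last edge, so both paths from A to D end up with latency 3.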

Page 57: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

Critical Length and Pipelining Stages (ITRS projections)

Year  Node    Clock Frequency  Critical Length  Stages (10 mm)  Stages (34 mm)
2001  130 nm  1.684 GHz        17.11 mm         0               1
2002  115 nm  2.317 GHz        12.17 mm         0               2
2003  100 nm  3.088 GHz        8.95 mm          1               3
2004  90 nm   3.990 GHz        7.37 mm          1               4
2005  80 nm   5.173 GHz        5.28 mm          1               6
2006  70 nm   5.631 GHz        4.63 mm          2               7
2007  65 nm   6.739 GHz        4.16 mm          2               8
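The stage counts in this table are consistent with taking floor(wire length / critical length). That relation is an observation about the numbers, not something stated on the slide, and the names in the sketch below are made up for illustration.

```python
from math import floor

# (year, critical length in mm, stages for 10 mm, stages for 34 mm), copied from the table.
ITRS_ROWS = [
    (2001, 17.11, 0, 1),
    (2002, 12.17, 0, 2),
    (2003, 8.95, 1, 3),
    (2004, 7.37, 1, 4),
    (2005, 5.28, 1, 6),
    (2006, 4.63, 2, 7),
    (2007, 4.16, 2, 8),
]

def pipeline_stages(wire_mm, critical_mm):
    """Clocked elements needed on a wire, assuming one per full critical length."""
    return floor(wire_mm / critical_mm)

mismatches = [
    year
    for year, crit, s10, s34 in ITRS_ROWS
    if (pipeline_stages(10, crit), pipeline_stages(34, crit)) != (s10, s34)
]
print(mismatches)
```

An empty list means the floor rule reproduces every tabulated value for both wire lengths.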

Page 58: Floorplan Assisted Data Rate Enhancement through Wire Pipelining: A Real Assessment

General Performance Evaluation

Generic netlists of blocks are feedforward connections of loops

If feedforward connections are equalized, “worst” loop dominates throughput

Problem formulation: max cost-to-time ratio (polynomial time).
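For small netlists the worst loop can be found by simply enumerating simple loops, as in the sketch below. This brute-force search is exponential in general and is not the polynomial-time cost-to-time ratio algorithm the slide refers to. The function name, the edge encoding (relay stations per channel) and the example netlist are assumptions made for illustration.

```python
from fractions import Fraction

def worst_loop_throughput(edges):
    """edges: dict {(u, v): relay_stations}. Returns min s/(s+r) over simple loops.

    s is the number of blocks (shells) on the loop and r the relay stations on it.
    Returns 1 if the netlist has no loop.
    """
    nodes = sorted({n for e in edges for n in e})
    succ = {n: [] for n in nodes}
    for (u, v), r in edges.items():
        succ[u].append((v, r))
    best = Fraction(1)

    def dfs(start, node, shells, relays, on_path):
        nonlocal best
        for nxt, r in succ[node]:
            if nxt == start:
                # Closing edge found: the loop has `shells` blocks.
                best = min(best, Fraction(shells, shells + relays + r))
            elif nxt > start and nxt not in on_path:
                # Only visit nodes ordered after start, so each loop is seen once.
                dfs(start, nxt, shells + 1, relays + r, on_path | {nxt})

    for start in nodes:
        dfs(start, start, 1, 0, {start})
    return best

# Hypothetical netlist: loop A-B has 2 relay stations, loop A-C has 3.
netlist = {("A", "B"): 1, ("B", "A"): 1, ("A", "C"): 2, ("C", "A"): 1}
print(worst_loop_throughput(netlist))
```

Here the two loops give 2/4 and 2/5, and the smaller value is returned as the system throughput.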