Giuseppe S. Garcea
Delft University of Technology
Delft, The Netherlands
[email protected]
Ralph H.J.M. Otten
Eindhoven University of Technology
Eindhoven, The Netherlands
are wires plannable?
Ralph Otten
wire planning
1987: providing floorplan design with alignment constraints
• a floorplan is a data structure capturing the relative positions (i.e. no geometry, possibly overlap, several optimizations)
• alignment to save wire area (data path generator)
• often a tremendous reduction in routing complexity
• in essence not limited to "data path" regularity
1998: fixing (and maximizing) time budgets for modules
• remove global iteration from synthesis
• fix total path delay
• provide pre-placement and pin positioning data
• enable early retiming, layer assignments, system partitioning
• ensure satisfaction of system timing requirements
this talk:
• iteration free synthesis: what is needed?
• trends in chip industry: where do the wires go?
• some directions

iteration free synthesis (silicon compilers)
[Figure: flow diagram with the stages conceptual design, behavioral synthesis, logic synthesis, layout synthesis, foot print; supporting blocks library, technology, data preparation; annotations: gate and net list, weighted incidence structure, wire length and area minimization under technology constraints]
timing was an incidental, usually surprisingly good, result of a synthesis flow with size as its prime objective

iterative timing optimization
[Figure: flow diagram with the stages conceptual design, behavioral synthesis, logic synthesis, layout synthesis, foot print; supporting blocks library, technology, data preparation; a loop through timing analysis and timing optimization; annotations: buffer insertion, transistor sizing, fanout trees; wire loads, resistances, critical paths]

timing awareness in conventional flows
synthesis: uses delay models but has very limited information
layout synthesis: tries to reduce total wire length and area
resynthesis: accepts additional constraints and wire load models
timing is the (arbitrary) outcome of
• a sequence of optimizations with other objectives
• adding constraints and resynthesis, bringing it to a local optimum
• adding more constraints and resynthesis, bringing it to another local optimum
desired: a flow that satisfies timing constraints exactly whenever possible
TIMING CLOSURE
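
sutherland's delay formula
a gate scaled by a factor $s$ has output resistance $R = r_o/s$, input capacitance $C_{in} = s\,c_o$ and parasitic capacitance $C_p = s\,c_p$; driving a load $C_L$ its delay is
$$\tau = b\,R\,(C_p + C_L) = b\,r_o c_p + b\,r_o c_o\,\frac{C_L}{C_{in}} = p + \frac{g}{f}$$
with $p = b\,r_o c_p$, $g = b\,r_o c_o$ and $f = C_{in}/C_L$
note: the absence of resistance between driver and load
g: computing effort, size independent!
depends on:
• function
• topology
• device size
p: inherent (parasitic) delay, size independent
1/f: restoring effort
g/f: effort delay
if f is kept constant, then delay stays constant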
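The size independence claimed above can be checked numerically. The sketch below assumes the delay model $\tau = b\,(r_o/s)(s\,c_p + C_L)$ with $f = C_{in}/C_L$; the numeric values of `r_o`, `c_o`, `c_p` and `b` are hypothetical and only serve the illustration.

```python
def gate_delay(s, load, r_o=1.0, c_o=1.0, c_p=0.5, b=0.7):
    """Delay of a gate scaled by s that drives `load` (no wire resistance)."""
    resistance = r_o / s      # output resistance shrinks with the gate size
    parasitic = s * c_p       # parasitic capacitance grows with the gate size
    return b * resistance * (parasitic + load)


def delay_from_efforts(f, r_o=1.0, c_o=1.0, c_p=0.5, b=0.7):
    """The same delay written as p + g/f, with f = C_in / C_load."""
    p = b * r_o * c_p         # inherent (parasitic) delay
    g = b * r_o * c_o         # computing effort
    return p + g / f


# Keep f constant while the load grows: the gate is scaled so that
# C_in = s * c_o = f * load, and the delay does not change.
f = 0.25
c_o = 1.0
delays = [gate_delay(f * load / c_o, load) for load in (4.0, 40.0, 400.0)]
reference = delay_from_efforts(f)
```

Every entry of `delays` agrees with `reference`, which is the statement that a fixed f fixes the delay whatever the load.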

continuously sized networks
[Figure: a gate with input capacitances $c_a, c_b, \ldots, c_n$ driving a load $C$; every input capacitance is proportional to the load: $c_a = f g_a C$, $c_b = f g_b C$, ..., $c_n = f g_n C$; the gate size is obtained by summing over the inputs x of the gate]
the size of a gate with constant delay varies linearly with the load:
gate size = a f C
f, the scaling factor, is the same for all inputs
a, the area sensitivity, is a property of the gate, that is function, topology, sizing
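
continuously sized networks
the size of a gate with constant delay varies linearly with the load:
gate size = a f C
[Figure: gate i, with imposed capacitance $q_i$ on its output, driving the fanout gates j and k]
the load of gate i is the imposed capacitance plus the input capacitances of its fanout gates:
$$c_i = q_i + n_{ij} f_j c_j + n_{ik} f_k c_k$$
$$c_i = q_i + \sum_{j \in fo(i)} n_{ij} f_j c_j$$
in vector notation:
$$c = q + N D_f\,c \quad\Rightarrow\quad (I - N D_f)\,c = q \quad\Rightarrow\quad c = (I - N D_f)^{-1} q$$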
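As a minimal sketch of how such a system can be solved, the code below computes the loads from $c = q + N D_f\,c$ by repeated substitution. It assumes an acyclic network, for which $N D_f$ is nilpotent and n sweeps give the exact answer; the function name and the three-gate chain are hypothetical.

```python
def solve_loads(q, N, f):
    """Solve c = q + N * diag(f) * c for an acyclic network.

    q[i]    : imposed capacitance on the net driven by gate i
    N[i][j] : weight of the connection from gate i to its fanout gate j
    f[j]    : C_in / C_out of gate j, so f[j] * c[j] is its input capacitance
    """
    n = len(q)
    c = list(q)
    # Each sweep propagates the loads one more level back towards the inputs;
    # in an acyclic network no path has more than n gates.
    for _ in range(n):
        c = [q[i] + sum(N[i][j] * f[j] * c[j] for j in range(n))
             for i in range(n)]
    return c


# Hypothetical chain: gate 0 drives gate 1, gate 1 drives gate 2,
# and gate 2 drives an imposed load of 8.
q = [1.0, 2.0, 8.0]
f = [0.5, 0.5, 0.25]
N = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
loads = solve_loads(q, N, f)
input_caps = [f[i] * loads[i] for i in range(3)]
```

With the loads known, every gate size follows from gate size = a f C, using the area sensitivity a of that gate.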

timing closure
$\tau = p + g/f$
the size of a gate with constant delay varies linearly with the load:
gate size = a f C
Sutherland and Sproull: VLSI, 1991
Grodstein et al.: ICCAD, 1995
gain-based synthesis
constant delay methodology
fixed delays
fixed timing
performance planning
guaranteed timing

synthesis under timing constraints
[Figure: flow diagram with the stages conceptual design, behavioral synthesis, logic synthesis, area optimization, layout synthesis, foot print; supporting blocks timing analysis, data preparation, library, technology; a size assignment step yields the area]
solve the corresponding Leontief system
$$c = (I - N D_f)^{-1} q$$
insert buffers to reduce area
no iterative loop has been created!

size assignment
$$c = (I - N D_f)^{-1} q$$
[Figure: size assignment between synthesis + netlist, floorplan optimization and layout synthesis, exchanging N, f, q, sizes, N' and wire lengths]
N: weighted incidence matrix
f: restoring effort; a vector of effort reciprocals, that is Cin/Cout
q: imposed capacitances
sizes: a vector implied by the calculated input capacitances
N': netlist possibly modified by inserted buffers
TIMING GUARANTEED
- for f was fixed
- buffers inserted for area recovery only where enough slack is available!
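
resistive interconnect
problem 1: how to cope with resistive interconnect while their delay models cannot be made size independent?
[Figure: equivalent circuit between the nodes $v_{st}$ and $v_{tr}$: driver resistance $R_{tr}$, then a π-model of a wire of capacitance c.l with $0.5\,C_W + C_p$ at the driver side, series resistance $R_W$, and $0.5\,C_W$ at the far end]
$$\tau = p + g\,h + r$$
r, which collects the wire-dependent terms, is not size independent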

the new synthesis problem
problem 1: how to cope with resistive interconnect while their delay models cannot be made size independent?
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing correct networks exist?
logic synthesis is to provide an initial netlist and the restoring effort 1/f for every gate!
how can synthesis be guided to produce networks that lead to "fast enough" implementations?
• sutherland's principle of uniform stage effort
• brayton's uniform stage delay
• technology mapping for speed
wire planning

synthesis with wire planning
[Figure: flow diagram with the stages conceptual design, behavioral synthesis, logic synthesis, area optimization, layout synthesis, foot print; supporting blocks timing analysis, data preparation, library, technology; wire planning provides timing budgets, preplacement, pin assignment, layer assignment and wire structures; a size assignment step yields the area]
no iterative loop has been created!

global wire theory
global wires are interconnects whose delay can be improved by inserting restoring circuitry
assumptions
• global interconnections are always point-to-point wires
• first moment matching is accurate enough
• restoring circuits are modeled with sakurai's first order model
the length of a section, the critical length, depends on the wiring layer, but not on the buffer size, and tends to be constant when measured in feature sizes
the delay of an optimally segmented line is linear in its length; path delay is therefore independent of the position of the restoring circuits on the path
the delay of a section of an optimally buffered line is the same for all layers

wire planning considerations
the definition of global wires creates a two-level hierarchy
global wires will be optimally buffered
a wire planning scenario:
• allocate delays to global paths
• assign time budgets to modules
• create net lists for the modules
• assign size to all gates
given the path delays, and a convex trade-off between module size and delay, size optimization is efficiently solvable, and produces time budgets for each module
logic synthesis has to create net lists for the modules with given time budgets, and assign restoring efforts to the gates
size assignment is done by solving the Leontief system

remaining problems
problem 1: how to cope with resistive interconnect while their delay models cannot be made size independent?
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing correct networks exist?
problem 3: optimally buffered lines fix input and output capacitances, and therefore constrain the total effort along a path, and thus the delay of that path.
[Figure: an optimally buffered line between an input capacitance $c_{in}$ and an output capacitance $C_{out}$]
optimally buffered lines have fixed input/output capacitance

discrete libraries
problem 1: how to cope with resistive interconnect while their delay models cannot be made size independent?
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing correct networks exist?
problem 3: optimally buffered lines fix input and output capacitances, and therefore constrain the total effort along a path, and thus the delay of that path.
problem 4: does the fact that libraries are not continuously sizable defeat timing closure by fixing individual gate delays?
derivation assumes continuous sizability!
libraries are mostly discrete and offer limited range in sizes

some problems of timing closure
problem 1: how to cope with resistive interconnect while their delay models cannot be made size independent?
problem 2: how can we prevent synthesis from generating networks that preclude satisfying timing constraints, while timing correct networks exist?
problem 3: optimally buffered lines fix input and output capacitances, and therefore constrain the total effort along a path, and thus the delay of that path.
problem 4: does the fact that libraries are not continuously sizable defeat timing closure by fixing individual gate delays?
problem 5: can the efficiency of load independent mapping for speed be advantageous under a constant delay methodology?

are wires plannable?
iteration free synthesis: size assignment to achieve proper timing
resistive interconnect and guiding synthesis: wire planning
for that we need a solid basis for wire planning:
• pin placement for detour free routing
• valid retiming
• early layer assignment
• . . .

wire plans
a wire plan for a functional network is a position for each of its function nodes, and a pin assignment for all its primary inputs and outputs
a global wire plan is a wire plan of which all arcs represent global wires, and will be laid out as optimally buffered lines.
a wire plan is monotonic if all its arcs can be laid out such that the L1-length of every directed path in the network is equal to the L1-distance between its end points
given a pin assignment, no global wire plan is faster than a monotonic wire plan (if functions have fixed delays)
given a pin assignment, monotonic wire plans have the least wire capacitance
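The monotonicity condition can be tested directly from node positions. The sketch below assumes every arc is laid out as a shortest rectilinear route, so that a path's L1-length is the sum of the L1-distances between consecutive nodes; the node names, the positions and the explicit list of directed paths are hypothetical inputs.

```python
def l1(a, b):
    """L1 (Manhattan) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def path_is_monotonic(points):
    """True if the L1-length along the path equals the L1-distance of its end points."""
    length = sum(l1(p, q) for p, q in zip(points, points[1:]))
    return length == l1(points[0], points[-1])


def plan_is_monotonic(position, paths):
    """Check every listed directed path of a wire plan."""
    return all(path_is_monotonic([position[v] for v in path]) for path in paths)


# Hypothetical wire plan: input pin a, function nodes g and h, output pin z.
position = {"a": (0, 0), "g": (2, 1), "h": (3, 3), "z": (5, 4)}
paths = [["a", "g", "h", "z"]]
monotonic = plan_is_monotonic(position, paths)

# Moving g out of the bounding box of a and z forces a detour.
detour = dict(position, g=(2, 5))
with_detour = plan_is_monotonic(detour, paths)
```

Integer coordinates keep the equality test exact; with floating point positions a tolerance would be needed.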

wire plans for given pin assignment
the inbox of a node is the smallest iso-rectangle containing its support
the outbox of a node is the smallest iso-rectangle containing its range
a bridge of a node is a minimum L2-length line connecting the inbox and the outbox
a functional network has a monotonic wire plan with respect to a given pin assignment iff every node has one and only one bridge
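A minimal sketch of these definitions, assuming the support and the range of a node are available as lists of pin positions (a hypothetical representation). It computes the inbox and outbox as bounding boxes and the length of a bridge; it does not decide whether the bridge is unique.

```python
def bounding_box(points):
    """Smallest iso-rectangle (x_min, y_min, x_max, y_max) containing the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def box_gap(a, b):
    """Per-axis gap between two iso-rectangles (0 where their projections overlap)."""
    dx = max(b[0] - a[2], a[0] - b[2], 0)
    dy = max(b[1] - a[3], a[1] - b[3], 0)
    return dx, dy


def bridge_length(inbox, outbox):
    """L2-length of a shortest line connecting the two boxes."""
    dx, dy = box_gap(inbox, outbox)
    return (dx * dx + dy * dy) ** 0.5


# Hypothetical node: its support pins lie on a vertical iso-line,
# its range pins on a horizontal iso-line.
inbox = bounding_box([(0, 0), (0, 2)])
outbox = bounding_box([(4, 5), (6, 5)])
length = bridge_length(inbox, outbox)
```

In this example the two boxes do not overlap in either projection, so the shortest connection runs between the nearest end points (0, 2) and (4, 5).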

existence criterion
a functional network has a monotonic wire plan with respect to a given pin assignment iff every node has one and only one bridge
the existence of a monotonic wire plan of a functional network for a given pin assignment can be checked on a node-by-node basis:
• its in- or outbox is a single point
• its inbox and outbox are perpendicular iso-lines
• its outbox is in the projection of the inbox

are wires plannable?
iteration free synthesis: size assignment to achieve timing
resistive interconnect and guiding synthesis: wire planning
delay prediction is needed and should be enabled: optimally buffered global interconnect

trends in chip industry
many laws in chip industry fit a specific generic form:
$$\frac{dU}{dV} = \frac{f(U)}{h(V)}$$
differential equation with an integral (solvable by separation of variables)

moore's law
[Figure: number of transistors (logarithmic axis, 10^3 to 10^11) versus year (70 to 00) for intel microprocessors and static memory; memory generations marked 1 K, 4 K, 16 K, 64 K, 256 K, 1 M, 4 M, 64 M, 256 M, 1 G]
$$\frac{dN}{dt} = m\,N$$
the growth rate of chip complexity will be proportional to the achieved complexity to date [Gordon Moore, 1964]
proportionality constant, "moore exponent m": 0.2 for processors, and 0.4 for memory
N = numerical complexity of the module (e.g. the chip)
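Reading the law as $dN/dt = m\,N$, the complexity grows exponentially. The sketch below assumes that the moore exponent is a rate per year (the slide does not state the unit) and derives the doubling time that follows from that reading.

```python
import math


def complexity(n0, m, years):
    """Solution N(t) = N0 * exp(m * t) of dN/dt = m * N."""
    return n0 * math.exp(m * years)


def doubling_time(m):
    """Time after which N(t) has doubled, assuming m is a rate per year."""
    return math.log(2.0) / m


processor_doubling = doubling_time(0.2)   # exponent quoted for processors
memory_doubling = doubling_time(0.4)      # exponent quoted for memory
```

Under this assumption the memory exponent of 0.4 gives a doubling time half as long as the processor exponent of 0.2.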

rent's rule
$$\frac{dT}{dN} = r\,\frac{T}{N}$$
the growth rate of the terminal count with the complexity of the module will be proportional to the average number of terminals per submodule [Landman, Russo, 1971]
proportionality constant: "rent exponent r"
N = numerical complexity of the module (e.g. the chip)
T(N) = the number of terminals of a module with numerical complexity N
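The familiar power-law form of rent's rule follows from the differential form by separation of variables. The derivation below is a short worked version, with K entering as the integration constant:

```latex
\begin{align*}
\frac{dT}{dN} &= r\,\frac{T}{N}
  && \text{(rent's rule in differential form)} \\
\frac{dT}{T} &= r\,\frac{dN}{N}
  && \text{(separate the variables)} \\
\ln T &= r \ln N + \ln K
  && \text{(integrate; $\ln K$ is the integration constant)} \\
T(N) &= K\,N^{r}
\end{align*}
```

On logarithmic axes this is a straight line with slope r and intercept K at N = 1.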
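
rent's curves
[Figure: number of terminals T (10 to 10,000) versus complexity N (100 to 1,000,000) on logarithmic axes, with curves for static ram, dynamic ram, microprocessors, gate arrays and high performance computers (chip level and board level)] [Bakoglu, 1987]
values shown on the curves: r=0.45, K=0.82; r=0.63, K=1.4; r=0.25, K=82; r=0.5, K=1.9; r=0.12, K=6; r=0.1, K=4
$$\frac{dT}{dN} = r\,\frac{T}{N}$$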

process exponents
[Figure: gate oxide thickness, source/drain junction depth and minimum feature size in µm (logarithmic axis, 10^-3 to 10^1) versus year (1960 to 2010); a level of 3 atom layers is marked] [Status2000, ICE, 2000]
$$\frac{dL}{dt} = -p\,L$$
the reduction rate of device sizes will be proportional to the achieved device size
proportionality constants are pretty close in value, and will be called the "process exponent p"

straverius laws
many laws in chip industry have generic form:
$$\frac{dU}{dV} = \frac{f(U)}{h(V)}$$
differential equation with an integral (solvable by separation of variables)
moore's law on chip industry: $\frac{dN}{dt} = m\,N$
rent's rule on intra-module communication: $\frac{dT}{dN} = r\,\frac{T}{N}$
observed miniaturization in chip technology: $\frac{dL}{dt} = -p\,L$
there are many more!!!

another old rule
how primary memory should be supplied to a processor with a given speed
in a balanced computer system the size of primary memory in bytes is close to the number of instructions per second [Richard P. Case, 60's]
[Figure: memory size (Mb) versus processor speed (MIPS), both axes from 1 to 1k, with the line of amdahl's constant and the regions labelled massive memory machines and massive parallel machines; plotted systems: ibm 360, vax 11, cray 1, cray 2, 80486, pentium IV]
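
memory-to-compute ratio
down scaling forces the memory-to-compute ratio to increase
[Figure: memory and processor areas M(t_o) and C(t_o) before scaling, S_m M and S_c C after downscaling, and M(t) and C(t) after rebalancing]
• downscaling makes memory (by S_m) and processor (by S_c) smaller
• processing became A times faster due to downscaling
• to rebalance the system memory has to be extended
$$\frac{M(t)}{C(t)} = A\,\frac{S_m}{S_c}\,\frac{M(t_o)}{C(t_o)}$$
expressed in the feature size L and the process exponent p, the ratio increases very fast !!! [Paul Stravers, 2000]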
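
buffer area under global wire assumptions
[Figure: a line of length l cut into n sections; every section is driven by a buffer of size s with output resistance r_o/s and input capacitance s c_o, and has wire resistance r.l/n and wire capacitance c.l/n]
$$T(l,n,s) = n\left[\,b\,r_o(c_o + c_p) + b\left(\frac{c\,r_o}{s} + r\,c_o\,s\right)\frac{l}{n} + a\,r\,c\left(\frac{l}{n}\right)^{2}\right]$$
$$\frac{\partial T}{\partial n} = 0 \;\Rightarrow\; l_{crit} = \frac{l}{n_{opt}} = \sqrt{\frac{b\,r_o(c_o + c_p)}{a\,r\,c}}$$
$$\frac{\partial T}{\partial s} = 0 \;\Rightarrow\; s_{opt} = \sqrt{\frac{r_o\,c}{r\,c_o}}$$
$$\text{buffer area} \;\propto\; \frac{s_{opt}}{l_{crit}}\,N \int_{l_{crit}}^{l_{max}} l\,P(l)\,dl$$
note: buffer area is independent of wire resistance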
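The optimum of a buffered line can be checked numerically. The sketch below assumes the first order model $T = n\,[\,b\,r_o(c_o + c_p) + b\,(c\,r_o/s + r\,c_o\,s)\,l/n + a\,r\,c\,(l/n)^2\,]$ with coefficients a = 0.4 and b = 0.7; the coefficient values and all parameter values are hypothetical.

```python
import math


def line_delay(l, n, s, r, c, r_o, c_o, c_p, a=0.4, b=0.7):
    """Delay of a line of length l cut into n sections with buffers of size s."""
    seg = l / n
    return n * (b * r_o * (c_o + c_p)
                + b * (c * r_o / s + r * c_o * s) * seg
                + a * r * c * seg * seg)


def optimum(r, c, r_o, c_o, c_p, a=0.4, b=0.7):
    """Critical section length and optimal buffer size from dT/dn = dT/ds = 0."""
    l_crit = math.sqrt(b * r_o * (c_o + c_p) / (a * r * c))
    s_opt = math.sqrt(r_o * c / (r * c_o))
    return l_crit, s_opt


# Hypothetical wire and buffer parameters in arbitrary units.
params = dict(r=1.0, c=2.0, r_o=100.0, c_o=0.1, c_p=0.1)
l_crit, s_opt = optimum(**params)
length = 50.0
best = line_delay(length, length / l_crit, s_opt, **params)

# The buffer size per unit length does not change with the wire resistance.
l_crit_4r, s_opt_4r = optimum(**dict(params, r=4.0))
ratio, ratio_4r = s_opt / l_crit, s_opt_4r / l_crit_4r
```

Here n is treated as a real number; in an actual line the number of sections would be rounded to an integer, which this sketch ignores.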

wire length distribution
P(l), the wire length distribution, is usually obtained by requiring that rent's rule must be satisfied
$$\text{buffer area} \;\propto\; N \int_{l_{crit}}^{l_{max}} l\,P(l)\,dl$$
donath-feuer: pareto-levy distribution
$$P(l) = g\,l^{\,2r-3}$$
sastry-parker: weibull distribution
$$P(l) = g\,r\,l^{\,r-1}\exp(-g\,l^{\,r})$$
davis-de-meindl: explicit (long) formulas separate for two regions
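For the pareto-levy form $P(l) = g\,l^{2r-3}$ the integral of $l\,P(l)$ between the critical and the maximum length has a closed form. The sketch below evaluates it and cross-checks it with a midpoint rule; the normalisation g = 1, the length limits and the omission of the factor N are assumptions made for the illustration.

```python
def pareto_levy(l, r, g=1.0):
    """Wire length distribution P(l) = g * l**(2r - 3)."""
    return g * l ** (2.0 * r - 3.0)


def buffered_wire_length(r, l_crit, l_max, g=1.0):
    """Closed form of the integral of l * P(l) from l_crit to l_max (r != 0.5)."""
    k = 2.0 * r - 1.0
    return g * (l_max ** k - l_crit ** k) / k


def midpoint_integral(func, lo, hi, steps=20000):
    """Plain midpoint rule, used only to cross-check the closed form."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


# Hypothetical limits, measured in gate pitches.
l_crit, l_max = 10.0, 1000.0
closed = buffered_wire_length(0.63, l_crit, l_max)
numeric = midpoint_integral(lambda l: l * pareto_levy(l, 0.63), l_crit, l_max)
by_exponent = {r: buffered_wire_length(r, l_crit, l_max) for r in (0.45, 0.63, 0.75)}
```

With g held fixed, the integral grows with the rent exponent; a properly normalised g would itself depend on r, which this sketch leaves out.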

relative buffer area
[Figure: total buffer area / die area (linear axis, 0.1 to 0.5) versus L (0.25 down to 0.05) for rent exponents r = 0.45, 0.55, 0.63 and 0.75, using formulae of davis-de-meindl; a second panel shows the same curves on logarithmic axes (area ratio 10^-3 to 10^-1, L from 2.10^-1 down to 5.10^-2 µm)]
$$\text{buffer area} \;\propto\; N \int_{l_{crit}}^{l_{max}} l\,P(l)\,dl$$

are wires plannable?
iteration free synthesis: size assignment to achieve timing
resistive interconnect and guiding synthesis: wire planning
delay prediction is needed and should enable wire planning: optimally buffered global interconnect
the memory share of a balanced processor chip area will increase very fast with scaling: new architectures
optimal buffering forces almost all functionality from a single layer chip: new technologies

multilayer integration
[Figure: classification of multilayer integration into growing and stacking, with the techniques recrystallization, layer growth, seeding, sidewall metallization, film transfer and vertical integration]
already tried before 1980
the true 3D integration
main disadvantage: early layers have to go through many cycles
main disadvantage: poor alignment of inter-layer vias

benefits
• global interconnect length considerably reduced
• folding datapaths over layers and determining optimum crossing points can shorten cycle time
• much smaller total footprint for the same functionality
• different technologies for different layers are feasible
why not fully exploited today?
• industry sustained its miraculous growth up to now without it
• technological feasibility for vlsi only shown recently
• economical feasibility not yet proven
• virtually no adequate cad-support
• no design experience with multilayer integration

possible layer dedication
[Figure: stack of four Si layers, the upper three each carrying Al / SiO2 interconnect, separated by polyimide]
• optical clock receivers, line repeaters, regular i/o [Otten, 1980]
• processors (the main heat source), first level memory
• second level cache for performance improvement
• high density advanced memory technology
[M.B. Kleiner, S.A. Kühn, P. Ramm, W. Weber, 1995]

thermal analysis
[Figure: temperature increase (ºC, axis 0.5 to 1.5) against position in the layer stack (axis 250 to 400) for the stack of Si layers with Al / SiO2 interconnect separated by polyimide; layers: buffers, optical receivers, i/o; processor, first level cache; second level cache, interfaces; advanced memory technology]
[M.B. Kleiner, S.A. Kühn, P. Ramm, W. Weber, 1995]

are wires plannable?
iteration free synthesis: size assignment to achieve timing
resistive interconnect and guiding synthesis: wire planning
delay prediction is needed and should enable wire planning: optimally buffered global interconnect
the memory share of a balanced processor chip area will increase very fast with scaling: new architectures
optimal buffering forces almost all functionality from a single layer chip: new technologies
multilayer integration may ease all of the above: new theories
today we are far from plannable wiring!