Large-Scale Circuit Placementcadlab.cs.ucla.edu/~cong/slides/git_april03.pdfPeko01 12506 13865 113...

Large-Scale Circuit Placement:The Gap and Promise

Contributors: Chin-Chih Chang, Kenton Sze, Tim Kong, MichailRomesis, Joe Shinnerl, Min Xie, Xin Yuan

Jason CongComputer Science Department

University of California, Los Angeles

[email protected]

2

Outline

Optimality and scalability study of placement problem -- gap analysisResearch on multilevel large-scale placement problemOur research plan

3

Optimality and Scalability Study---Motivation

Lack of significant progress in wirelength reduction

Rate of reduction is about 5-10% every 2-3 yearsLatest developments in placement differ mainly in runtime

Where do we standHow much room for further improvement?Will existing placement engines scale well to 10+M gate designs?

Need to quantify the optimality and scalability of state-of-the-art placement engines

4

Optimality and Scalability Study---Motivation (II)

Most work compare only with existing heuristicsUse real design based benchmarks

ISPD98 [C. Alpert 1998]

Use synthetic benchmarkscirc and gen [M. D. Hutton et al, 1998]gnl [D. Stroobandt et al, 2000]

Little understanding about the gap from the optimal

5

Optimality and Scalability Study---Related Work

Quantified Suboptimality of VLSI Layout Heuristics [L. Hagen et al, 1995]

Construct scaled instance with known upperboundx

x x x

x x x

x x x

? Over 10% area suboptimality in TimberWolf Notable wirelength suboptimality in GORDIAN-L But test cases are small, the largest netlist is less than 40K

6

Our Contribution: Placement Example Construction with Known Optimal Wirelength

Construct instances with known optimal using the characteristic of the original problem

?

Optimality and Scalability Study of Existing Placement Algorithms [C. Chang et al, 2003]

Studied the optimality and scalability of existing algorithms on constructed instances

7

Construction of Placement Examples with Known Optimal Wirelength (PEKO Examples)

InputDesired number of placeable modules tNet Distribution Vector (NDV) D = ( d2, d3, … dp ), dkis the # of k-pin nets in the circuit

t and D are extracted from a real circuitOutput

Cell library LNetlist N with known optimal wirelength

ConstraintN has D as its NDV

8

Our Algorithm for Constructing PEKO Examples

All the modules are of equal size, and there is no space between rows and adjacent modulesFor 2-pin nets , connect any two adjacent modulesFor each n-pin net , connect the n modules in a rectangular region close to a square, i.e., the length of each side is close to sqrt(n)The wirelength is of each n-pin net is given by

/ 2n n n + −

9

Illustration: PEKO Example Construction

Input : t = 64, D = {d2=34,d3=20,d4=7,d5=4,d6=2, d7=1}

Total WL = 110

#2-pin nets = 34, WL = 34

#7-pin nets = 1, WL = 4

#3-pin nets = 20, WL = 40

#5-pin nets = 4, WL = 12

#6-pin nets = 2, WL = 6

#4-pin nets = 7, WL= 14

• Method first conceived by K. Boese (1995), but not implemented

10

White Space Insertion

Option 1: expanding one dimension of the chip

Option 2: removing some of the nets

Need for white spacemimic real designsEase for legalization

11

Four New Suites of Placement Examples with Known Optimal Wirelength

Module number t and NDV extracted from ISPD98 [C. Alpert, 1998]Two suites without pads (suite1 and suite2)

suite2 is derived by scaling t and NDV by a factor of 10

Two suites with pads (suite3 and suite4)suite4 is derived by scaling t and NDV by a factor of 10

15% white space by expanding on dimension of the chip

URL: http://ballade.cs.ucla.edu/~pubbench/peko.htm

12

PEKO Characteristics

ckt #cell #net #row Optimal WLPeko01 12506 13865 113 8.14E+05Peko02 19342 19325 140 1.26E+06Peko03 22853 27118 152 1.50E+06Peko04 27220 31683 166 1.75E+06Peko05 28146 27777 169 1.91E+06Peko06 32332 34660 181 2.06E+06Peko07 45639 47830 215 2.88E+06Peko08 51023 50227 227 3.14E+06Peko09 53110 60617 231 3.64E+06Peko10 68685 74452 263 4.73E+06Peko11 70152 81048 266 4.71E+06Peko12 70439 76603 266 5.00E+06Peko13 83709 99176 290 5.87E+06Peko14 147088 152255 385 9.01E+06Peko15 161187 186225 402 1.15E+07Peko16 182980 189544 429 1.25E+07Peko17 184752 188838 431 1.34E+07Peko18 210341 201648 460 1.32E+07

ckt #cell #net #row Optimal WLPeko01x10 125060 138650 335 8.14E+06Peko02x10 193420 193250 441 1.26E+07Peko03x10 228530 271180 479 1.50E+07Peko04x10 272200 316830 523 1.75E+07Peko05x10 281460 277770 532 1.91E+07Peko06x10 323320 346600 570 2.06E+07Peko07x10 456390 478300 677 2.88E+07Peko08x10 510230 502270 715 3.14E+07Peko09x10 531100 606170 730 3.64E+07Peko10x10 686850 744520 830 4.73E+07Peko11x10 701520 810480 839 4.71E+07Peko12x10 704390 766030 840 5.00E+07Peko13x10 837090 991760 916 5.87E+07Peko14x10 1470880 1522550 1214 9.01E+07Peko15x10 1611870 1862250 1271 1.15E+08Peko16x10 1829800 1895440 1354 1.25E+08Peko17x10 1847520 1888380 1360 1.34E+08Peko18x10 2103410 2016480 1451 1.32E+08

PEKO Suite1 ( 12.5k – 210k ) PEKO Suite2 ( 125k – 2.1M )

13

Tested four State-of-the-Art PlacersCapo [A. E. Caldwell et al, 2000]

based on multilevel partitioneraims to enhance the routability

Dragon [M. Wang et al, 2000]uses hMetis for initial partitionSA with bin-based swapping

mPL [T. Chan et al, 2000]nonlinear programming on the coarsest levelGoto based relaxation

QPlace [Cadence Inc.]quadratic programmingcomponent of Silicon Ensemble

14

Experiment with State-of-the-Art Placers Using PEKO Suite1

1.001.201.401.601.802.002.202.402.602.80

0 50000 100000 150000 200000 250000

#cells

Mul

tiple

of O

ptim

al

Dragon v.2.20 capo v.8.0 mPL v.1.2 qplace v.5.1.55

05000

1000015000200002500030000350004000045000

0 50000 100000 150000 200000 250000

#cells

runt

ime(

s)


Existing algorithms are 66-153% away from the optimal on PEKO On examples with pads

mPL and QPlace show improvement of 12% and 10% respectivelyDragon and Capo do not benefit much from the additional information

There is significant room for improvement in placement algorithms!

15

Experiment with State-of-the-Art Placers Using PEKO Suite1 & Suite2

0

10000

20000

30000

40000

50000

60000

10000 100000 1000000 10000000#cells

runt

ime(

s)


1.00

1.20

1.40

1.60

1.80

2.00

2.20

2.40

2.60

2.80

10000 100000 1000000 10000000

#cells

Mul

tiple

of O

ptim

al


Capo, QPlace and mPL scales well in runtimeAverage solution quality of each tool shows deterioration by an additional 4% to 25% when the problem size increases by a factor of 10QoR of the existing placement algorithms can be 80% - 180% away from the optimal for large designs

17

Limitation of PEKO Examples

Optimal solution includes local nets onlyUnlikely for real designs

Measure wirelength only Timing and routability are important objectives for placement algorithms as well

18

Impact of Global Connections in Real Examples

circuit height width WL of longest net

WL contribution of longest 10%

ibm01 8158 4530 7148 51%ibm02 8158 6430 14224 46%ibm03 8158 6740 10624 58%ibm04 8158 9140 15171 53%ibm05 8158 11055 19064 47%ibm06 8158 8715 13966 61%ibm07 8158 14605 14051 51%ibm08 8158 15895 16142 60%ibm09 8158 16395 13780 55%ibm10 8158 27890 30755 53%ibm11 16350 10925 19234 59%ibm12 16350 15545 26748 52%ibm13 16350 12230 19539 59%ibm14 16350 25475 26370 61%ibm15 16350 23785 27284 63%ibm16 16350 34015 42860 59%ibm17 16283 38895 45686 56%ibm18 16350 37065 52846 64%

Produced by Dragon on ISPD98The wirelengthcontribution from global connections can be significant!Need to consider the impact of global connections

19

Placement Examples with Known Upperbounds (PEKU)

Extend PEKO by introducing non-local nets to mimic global connections

All the modules are of equal size, and there is no space between rows and adjacent modules

For nets of degree ii, a subset of them are generated by randomly connecting ii modules, the rest are generated optimally as in PEKO

20

Placement Examples with Known Upperbounds (PEKU)

Input : t = 64, D = {d2=34,d3=20,d4=7,d5=4,d6=2, d7=1} α=0.2

Total WL = 160

Generate 28 2-pin optimally

Generate 6 2-pin randomly


Generate 4 3-pin randomly

Generate 6 4-pin optimallyGenerate 1 4-pin randomly


Generate 2 6-pin optimallyGenerate 1 7-pin optimally

21

Placement Examples with Global Connections only (G-PEKU)

Each net connects either a row or column

Obvious upper boundSum the length of each row and column

Similar to datapath examples

Input : t = 64

22

PEKU Suite

Module numbers and NDVs extracted from ISPD98 Remove connections with pads Vary α from 0 to 10% 15% white space by expanding one dimension of the chip

23

PEKU Suite% non-

local nets

circuit #cell #net #rowRow

utilization

LB UB

Peku01 12506 14111 113 85% 8.14E+05 8.14E+05Peku05 28146 28446 169 85% 1.91E+06 1.91E+06Peku10 68685 75196 263 85% 4.73E+06 4.73E+06Peku15 161187 186608 402 85% 1.15E+07 1.15E+07Peku18 210341 201920 460 85% 1.32E+07 1.32E+07Peku01 12506 14111 113 85% 8.14E+05 9.23E+05Peku05 28146 28446 169 85% 1.91E+06 2.24E+06Peku10 68685 75196 263 85% 4.73E+06 6.17E+06Peku15 161187 186608 402 85% 1.15E+07 1.71E+07Peku18 210341 201920 460 85% 1.32E+07 2.01E+07Peku01 12506 14111 113 85% 8.14E+05 1.02E+06Peku05 28146 28446 169 85% 1.91E+06 2.63E+06Peku10 68685 75196 263 85% 4.73E+06 7.52E+06Peku15 161187 186608 402 85% 1.15E+07 2.30E+07Peku18 210341 201920 460 85% 1.32E+07 2.75E+07

Up to 10%

0

0.25%

0.50%

…

URL: http://cadlab.cs.ucla.edu/~pubbench/peku.htm

24

G-PEKU Suite

circuit #cell #net #row UBGPeku01 12506 224 113 7.93E+05GPeku05 28146 336 169 1.79E+06GPeku10 68685 525 263 4.38E+06GPeku15 161187 803 402 1.03E+07GPeku18 210341 918 460 1.34E+07

Module numbers extracted from ISPD98

URL: http://cadlab.cs.ucla.edu/~pubbench/peku.htm

25

Studied Four State-of-the-Art PlacersCapo [A. Caldwell et al, 2000]

Based on multilevel partitionerAims to enhance the routability

Dragon [M. Wang et al, 2000]Uses hMetis for initial partitionSA with bin-based swapping

mPL [T. Chan et al, 2000]Nonlinear programming on the coarsest levelGoto based relaxation

mPG [C. Chang et al, 2002]Uses FC clustering and hierarchical density control Incremental A-tree for routability

26

Wirelength Comparison between State-of-the-Art Placers Using PEKU

1

1.2

1.4

1.6

1.8

2

2.2

0.00% 0.25% 0.50% 0.75% 1.00% 2.00% 5.00% 10.00%% of non-local nets

Qua

lity

Rat

io

Capo v.8.5 Dragon v.2.20 mPG v.1.0 mPL v.2.0

mPL’s QR increases when α is increased from 0 to 0.75%, while for the other three placers, QRs are steadily decreasingAbsolute value of the QRs may not be meaningful, but it helps to identify the technique that works best under each scenario

27

Wirelength Comparison between State-of-the-Art Placers Using G-PEKU

The gap between their solutions and the upper bound varies between 79% and 102% in the worst caseAnother validation that there is significant room for improvement for the placement problem

circuitDragon v.2.20

QRCapo v.8.5

QRmPG v.1.0

QRmPL v.2.0

QRGPeku01 1.98 1.56 1.91 1.69GPeku05 2.01 1.69 1.97 1.83GPeku10 2.02 1.72 1.98 1.94GPeku15 1.99 1.79 1.97 1.97GPeku18 2.02 1.78 1.98 1.98

28

Significant Opportunity for Better Circuit Placement Algorithms

Best available placement algorithms can be 80% - 180% away from optimalA significant research and business opportunity

Use of copper interconnect is equivalent to 30% wirelength reductionOne process generation (e.g. from 0.13um to 0.10um) is equivalent to 30% wirelength reductionBoth requires multi-billion dollar investment!

Better placement may extend/accelerate Moore’s law by 2-3 generations !

29

Outline

Optimality and scalability study of placement problemOur research on large-scale placement problemOur research plan

30

Overview of UCLA research on Large-Scale Placement Problem

Multilevel Optimization --- a highly scalable framework suitable for complex constraints

Hierarchy construction by recursive aggregationIntralevel optimization by various techniquesInterpolation to transfer from level to level

Two Parallel EffortsmPL: multilevel multiheuristic/hybrid optimizationmPG: multilevel simulated annealing with congestion control and mixed block support

31

Initial Fine-Grain Problem

Multilevel Optimization Framework

Intermediate Level

Intermediate Level

Coarse-Grain Problem Direct, nonhierarchical solution

Aggregate

Aggregate

Intermediate LevelRelaxation (Refinement)

Intermediate LevelRelaxation (Refinement)

Final Fine-Grain Problem.Thorough Relaxation and

Detailed Solution

Aggregate Interpolate

InterpolateAggregateetc. etc. Interpolate

Interpolate

32

Multilevel Methods in Scientific Computation

Originally developed to solve boundary-value PDE problems on spatial domains

Discretized elliptic PDE is a structured, positive-definite system of linear equations

Rapidly extended to problems without physical grids during the last 30 years

Algebraic Multigrid (AMG)

Most research has been for continuous models, but recent efforts have targeted large-scale discrete optimization

Graph/Hypergraph Partitioning, traveling salesman, VLSI Physical Design

33

Overview of mPL

Recursive ClusteringVersion 1.0 Edge-Separability (CapForest)Versions 1.1 – 2.0: FirstChoice

Nonlinear Programming at coarsest level(s)Slot assignment and discrete refinement at all levelsRecent Enhancements to Version 2.0

AMG-based weighted disaggregationQuadratic relaxation on subsets (QRS) at all levelsDistance-based reaggregation for iterated multilevel flow

34

Multilevel Placement Study

CoarseningESC, AMG-based, First Choice Clustering (FC)

Relaxation (Intralevel Optimization)Interior-point nonlinear programmingk-cycle Goto-style discrete exchangeQuadratic relaxation on subsets (QRS)

InterpolationDeclustering + partitioning + slot assignmentAMG-style weighted averaging

Iterated Multilevel FlowRepeated/Recursive V-cycles with distance-based reaggregation

35

Coarsening by Recursive Aggregation

Edge-Separability Clustering (ESC) [Cong and Lim, 2000]Use CAPFOREST [Ibaraki and Nagamochi, 1992] to

estimate all-pairs min-cut q(x,y) in N log N time. Rank pairs by connectivity and area.

First-Choice Clustering (FC) [Karypis, 1999]Match each vertex with a neighboring vertex with which it

shares the most total hyperedge weight, subject to area-balance constraints. Clusters are connected components.

Weighted Aggregation (AMG)Split the nodes into C-points and F-points. Associate each F-

point with several C-points by weighted average.

36

Multilevel Placement StudyCoarsening

ESC, AMG, FC, FC+opt, PD-FCRelaxation (Intralevel Optimization)

Interior-point nonlinear programmingk-cycle Goto-style discrete exchangeQuadratic relaxation on subsets (QRS)


Iterated Multilevel FlowRepeated/Recursive V-cycles with reaggregation

37

mPL Coarse-Level Formulation

Nonlinear-Programming FormulationDirect formulation for the coarse placement problemCells are modeled as circular disks for smoothnessQuadratic wirelength objective on a clique-modelPairwise nonoverlap constraints

Can accelerate evaluation with adaptation of Fast MultipoleMethod

Nonuniform sizes are OK, but reshaping is difficult to incorporate efficiently

Reasonable performance for coarse-level sizes N

38

Interior-Point-Based Nonlinear Programming

Analogous to force-directed methods, but with direct handling of nonconvex objectives and constraintsEffective but relatively expensive; affordable only at the coarsest level(s) of mPL 1.0—2.0.Net impact: 15% wirelength improvement overall

Global cell movement is not scalable to finer levelsPlan: Restrict to cell subsets to produce improved, scalable local relaxations at ALL levelsTo date, we have implemented Linear-Programming-based and

Quadratic-Programming-based subset relaxations

39

Goto-based Discrete Relaxation

Each cell’s optimal location is readily calculated when all other cells are held fixed.Compute a chain A, B, C, D, E, whereB is a randomly selected neighbor of A’s optimal location, etc.Examine all permutations of the chain and take the best one.Problem: the chain is not closed (A is not necessarily near any other cell’s optimal location).

40

Quadratic Relaxation on Noncontiguous Subsets (QRS)

Select a subset M of cells to move.M is obtained as segments of length 3 along a DFS vertex traversal of the netlist, where starting the DFS at a vertex connected with largest wirelength.Identify other cells and pads, F, connected to M by nets in

Decouple the horizontal and vertical problems.

}.|{ φ≠∩∈= MeEeEM

41

Solving the subproblem

Problem formulation (horizontal case):

Iterative solve the weighted quadratic minimization problem, using the current solution to determine the weight (Gordian-L).

number. small is , )(||

1 where

|)(|))((min )()(

2

ε

ε

∑

∑ ∑

∈

∈ ∈

=

+−−

eve

Ee evke

ke

vxe

x

xvxxvx

M

42

Ripple-move legalization [Hur and Lillis, 2000]Calculate a max-gain monotone path on the bin grid

Because QRS ignores overlap constraints, post-QRS cell swaps are used to remove the overlap.

43

Impact of QRS RelaxationmPL 1.2 mPL 2.0 (with QRS)2 V-cycles; AMG; No QRS 2 V-cycles; AMG; QRS

Circuit WL Time WL %improTime Rel. TimeBin sizeibm04 7.20E+06 506 6.83E+06 5.14% 866 1.71 2x2ibm07 1.11E+07 749 1.02E+07 8.11% 1302 1.74 2x2ibm09 1.22E+07 860 1.13E+07 7.38% 1569 1.82 3x3ibm10 1.97E+07 1285 1.91E+07 3.05% 2419 1.88 3x3ibm14 4.22E+07 2524 4.03E+07 4.50% 5846 2.32 3x3ibm16 5.72E+07 4018 5.25E+07 8.22% 16760 4.17 4x4ibm17 7.19E+07 5051 6.78E+07 5.70% 12240 2.42 4x4ibm18 5.63E+07 4743 5.45E+07 3.20% 13507 2.85 4x4

avg. 5.66% 2.36

44





Iterated Multilevel FlowRepeated/Recursive V-cycles with reaggregation

45

AMG-style Linear Interpolation

The inherited position of a cluster component ( ) is determined by several cluster positions, not just its own.

46

AMG-based Interpolation

Use the clique-model (graph) to define connectivity weightsFor each FC-cluster, select one node of maximal connectivity as a C-pointEach C-point is placed at its cluster’s position.Each F-point is placed at the weighted average of the C-points and F-points to which it is connectedThe F-points’ positions can be iteratively improved.

47

Impact of AMG Interpolation1 vcycle 1 vcycle wirelength runtimemPL1.1 mPL1.1 AMG %improved %incr

Circuit dom. WL Time dom. WL Time by AMGibm04 7.71E+06 261 7.33E+06 427 4.93% 63.60%ibm07 1.18E+07 396 1.12E+07 437 5.08% 10.35%ibm09 1.29E+07 455 1.28E+07 563 0.78% 23.74%ibm10 2.11E+07 661 2.08E+07 744 1.42% 12.56%ibm14 4.54E+07 1982 4.28E+07 2226 5.73% 12.31%ibm16 5.88E+07 3187 5.73E+07 3615 2.55% 13.43%ibm17 8.17E+07 4173 8.08E+07 4416 1.10% 5.82%ibm18 5.81E+07 4051 5.78E+07 4496 0.52% 10.98%

avg. 2.76% 19.10%

48





Iterated Multilevel FlowRepeated/Recursive V-cycles with distance-based reaggregation

49

Iterated Multilevel Flow

Iterated V-Cycles O(logN)) F-Cycle (O(logNlogN))

Backtracking V-Cycle O(logN)

50

Adjustable Vertex Affinity for Reggregation

Initially, affinity is connectivity + area balancing

Subsequently, distance is incorporated to retain the information from preceding cycles

51

Impact of 2nd V-cycleusing proximity and connectivity in the 2nd+ coarsening pass(es)

(NP+Goto-exchange only; AMG; FC Coarsening)

Circuit Rel. Time %improvedibm04 1.10 2.32%ibm07 1.40 3.57%ibm09 1.46 4.69%ibm10 1.49 1.44%ibm14 1.28 1.64%ibm16 1.31 2.27%ibm17 1.30 7.43%ibm18 1.26 5.36%

Avgs. 1.33 3.59%

52

Summary of Improvements: 12% wirelength reduction

Achieved using the following techniques (mPL2.0)Coarsening by area-balanced first-choice clusteringRelaxation (Intralevel Optimization)

Interior-point nonlinear programming at coarsest levelk-cycle Goto-style discrete exchange at every levelQuadratic relaxation on subsets (QRS) at every level

Interpolation by AMG-style weighted averagingIterated V-cycles with distance-based reaggregation

53

mPL2.0 vs. mPL1.1, Capo8.5, Dragon and Gordian-L

Circuit mPL1.0 Capo8.5 Dragon Gor-L mPL1.0 Capo8.5 Dragon Gor-Libm04 1.12 1.07 0.93 1.00 0.30 0.51 2.91 1.82ibm07 1.16 1.14 0.97 1.07 0.30 0.54 2.98 3.37ibm09 1.14 1.12 1.01 1.04 0.29 0.59 4.75 4.31ibm10 1.11 1.09 0.98 0.98 0.27 0.49 4.20 5.84ibm14 1.13 1.05 0.92 1.01 0.34 0.47 2.48 6.78ibm16 1.12 1.12 0.95 1.05 0.19 0.22 2.84 4.88ibm17 1.21 1.13 1.00 1.00 0.34 0.33 5.27 8.04ibm18 1.07 1.06 0.92 0.99 0.30 0.31 4.34 9.56

Averages 1.13 1.10 0.96 1.02 0.29 0.43 3.72 5.58

Wirelength/mPL2 CPU time/mPL2Uniform-Cell-Size IBM/ISPD 98 Circuits

54

mPL2.0 vs. mPL1.1, Capo8.5, Dragon and Gordian-L

0.00

1.00

2.00

3.00

4.00

5.00

6.00

0.95 1.00 1.05 1.10 1.15

scaled wirelength

scal

ed ru

ntim

e mPL2.0mPL1.1Capo8.5DragonGordian-L

55

Net Impact of ImprovementsFC, QRS relax, AMG, 2 V-cycles

mPL2.0-Dom. mPL2.0-Dom.vs. mPL 1.1-Dom.

Circuit WL CPU WL CPUibm04 6.83E+06 866 0.89 3.32ibm07 1.02E+07 1302 0.86 3.29ibm09 1.13E+07 1569 0.88 3.45ibm10 1.91E+07 2419 0.91 3.66ibm14 4.03E+07 5846 0.89 2.95ibm16 5.25E+07 16760 0.89 5.26ibm17 6.78E+07 12240 0.83 2.93ibm18 5.45E+07 13507 0.94 3.33

Average 0.88 3.52

56

Comparison of mPL on PEKO using PEKO Suite-I and Suite-II

Comparison with the optimal

0.00

0.50

1.00

1.50

2.00

0 100000 200000 300000 400000 500000 600000

#cells

Mul

tiple

of o

ptim

al

m PL v.1.2 m PL v.2.0

Total runtime

0

5000

10000

15000

20000

0 100000 200000 300000 400000 500000 600000

#cells

runt

ime(

s)

mPL v.1.2 mPL v.2.0

mPL v.2.0 improves the Quality Ratio by 8% on the averagemPL v.2.0 increases the runtime by 131% on the average

57

Multilevel SA-based Coarse Placement: mPG

Multi-level coarse placement for physical hierarchy generation

A multi-level SA-based frameworkSupport mixed-size large scale global placementSupport routing congestion controlIntegrate retiming with placement

58

Summary

There is significant opportunity for circuit placement improvement

Possibly equal to several process generation advancement

Multilevel method is a promising approach to large-scale circuit placement

59

Our Research Objectives

Fast, scalable, superior-quality placements by multilevel optimization

Aim at 20-30% improvement (= 1 technology generation!) in 3 yearsSupport

large-scale mixed-size placement problemConstraint-driven placement for delay, routing etc.Incremental placement to support dynamic netlistadjustment.

60

Related PublicationsT. Chan, J. Cong, T. Kong and J. Shinnerl, “Multilevel Optimization for Large-scale Circuit Placement,” ICCAD 2000. Tim Kong. Novel Techniques for Large-Scale Circuit Placement. Ph.D. Thesis, CS Dept., UCLA 2002.Chin-Chih Chang, Jason Cong, David Pan, Xin Yuan. “Physical Hierarchy Generation with Routing Congestion Control”, ISPD 2002.Chin-Chih Chang, Jason Cong, Min Xie. “Optimality and Scalability Study of Existing Placement Algorithms.” ASP-DAC, 2003.Chin-Chih Chang, Jason Cong, Xin Yuan. “Multi-level Placement for Large-Scale Mixed-Size IC Designs”, ASP-DAC 2003. Cong and Shinnerl (editors). Multilevel Optimization and VLSICAD. Kluwer Academic Publishers, 2002.

61http://www.wkap.nl/prod/b/1-4020-1081-8

Large-Scale Circuit Placementcadlab.cs.ucla.edu/~cong/slides/git_april03.pdfPeko01 12506 13865 113...

Documents

Transcript of Large-Scale Circuit Placementcadlab.cs.ucla.edu/~cong/slides/git_april03.pdfPeko01 12506 13865 113...