A Timing-Driven Soft-Macro Resynthesis Method in Interaction with Chip Floorplanning
Hsiao-Pin Su (1, 2), Allen C.-H. Wu (1), Youn-Long Lin (1)
(1) Department of Computer Science, Tsing Hua University, Hsinchu, Taiwan, R.O.C.
(2) Taiwan Semiconductor Manufacturing Co., Ltd.
{Email: [email protected]}
Outline
Introduction
Motivation
The Proposed Method
Experiments
Conclusions
A Typical HDL-based Design Flow
[Flowchart: HDL Description -> HDL Synthesis -> Floorplanning -> P & R -> RC Extraction / Delay Calculation -> Timing Analysis -> OK? If yes, Chip Layout; if no, the flow is repeated.]
Motivation
Develop a complete chip design method which incorporates a soft-macro placement and resynthesis method in interaction with chip floorplanning for area and timing improvement.
Motivation (cont'd)
[Figure (b): the design hierarchy (Top, with hard macros HM1 and HM2 and soft macros SM1 to SM4) and the corresponding floorplans.]
Motivation (cont'd)
Slack > 0: resynthesize SM3 by relaxing its timing constraints.
[Figures (a), (d): floorplans of HM1, HM2, and SM1 to SM4 showing the critical path delay and the area saved by resynthesizing SM3.]
Motivation (cont'd)
Timing violation: resynthesize SM2 by tightening its timing constraints.
[Figure (c): floorplan of HM1, HM2, and SM1 to SM4 with SM2 resynthesized.]
Considerations
How to decide the HDL design hierarchy?
How to guide soft-macro placement by utilizing hierarchy information?
How to integrate design tasks and point tools at different design levels to form a complete chip design methodology?
How to exploit the interaction between different design tasks?
The Proposed Method
[Flowchart: HDL Description -> HDL Synthesis -> Soft-Macro Formation -> Block Placement -> Soft-Macro Placement -> P&R -> RC Extraction & Delay Calculation -> Post-layout Timing Analysis -> "Timing OK & no more area improvement?" If yes, Chip Layout; if no, Module Resynthesis, then back to Soft-Macro Placement.]
Data exchanged between the steps: RTL netlist, timing constraint, soft-macro group, hard macro location, soft-macro location, routed database, SDF file.
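The iteration in the flowchart can be sketched as a small driver loop. This is only an illustration under stated assumptions: `evaluate` and `resynthesize` are hypothetical callables standing in for the placement, P&R, extraction, and timing-analysis tools, and the reading of "no more area improvement" as "area did not shrink compared with the best timing-clean iteration so far" is an assumption, not something the slides define.

```python
from typing import Callable, Dict


def run_flow(evaluate: Callable[[], Dict[str, float]],
             resynthesize: Callable[[Dict[str, float]], None],
             max_iters: int = 10) -> int:
    """Return the iteration at which the loop stops (hypothetical driver)."""
    best_area = None  # smallest area seen so far among timing-clean iterations
    for iteration in range(1, max_iters + 1):
        # evaluate() stands in for soft-macro placement, P&R, RC extraction,
        # delay calculation, and post-layout timing analysis.
        report = evaluate()
        if report["worst_slack"] >= 0.0:  # timing OK
            if best_area is not None and report["area"] >= best_area:
                return iteration  # timing OK and no more area improvement
            best_area = report["area"]
        # Otherwise resynthesize a module and go around the loop again.
        resynthesize(report)
    return max_iters
```

With a violated first pass followed by timing-clean passes whose area stops shrinking, the loop ends on the first pass that brings no further area gain.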
Hierarchy-tree Construction
The main objective is to preserve the design hierarchy information from the HDL design description during soft-macro formation.
[Figure: hierarchy tree rooted at Top, containing HM1, HM2, and SM1 to SM4.]
Soft-Macro Formation
Clock-based clustering: group the macros connected to the same clock source into the same cluster.
Decomposition of large soft macros: a large macro is too rigid for macro placement.
Clustering of small soft macros: many small macros increase the computational complexity.
Clock-based Clustering
Partition the circuit based on the clock connections.
Localize the distribution of each clock signal. If a clock signal is distributed to many modules, it may be difficult to balance the clock skew, or balancing the skew at the top module may incur an area penalty.
[Figure: floorplan with HM1, HM2, and SM4.]
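Clock-based clustering amounts to grouping macros by their clock source. A minimal sketch, assuming each macro is driven by exactly one clock and that the mapping from macro name to clock name (`macro_clock`, a hypothetical input) is already known:

```python
from collections import defaultdict
from typing import Dict, List


def cluster_by_clock(macro_clock: Dict[str, str]) -> Dict[str, List[str]]:
    """Group macros that share a clock source into one cluster."""
    clusters = defaultdict(list)
    for macro, clock in macro_clock.items():
        clusters[clock].append(macro)
    # Sort members so the result does not depend on input order.
    return {clock: sorted(members) for clock, members in clusters.items()}
```

Macros with several clock inputs would need an extra rule (for example, picking the dominant clock), which the slides do not specify.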
Large-Macro Decomposition
Split a cluster if its size is larger than the big size threshold, using the FM (Fiduccia-Mattheyses) partitioning method.
Big size threshold: U_th = 2 × S_avg, where S_avg = #TotalCells / #Macros.
Small-Macro Clustering
Merge clusters if the cluster size is smaller than the small size threshold.
Small size threshold: L_th = 1.0 × S_avg, where S_avg = #TotalCells / #Macros.
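The two size thresholds can be illustrated with a short sketch. It assumes the thresholds are read as U_th = 2 × S_avg and L_th = 1.0 × S_avg with S_avg = #TotalCells / #Macros; the function names and the example cluster sizes are hypothetical.

```python
from typing import Dict, List, Tuple


def size_thresholds(cluster_sizes: Dict[str, int]) -> Tuple[float, float]:
    """Return (U_th, L_th) from the average cluster size S_avg."""
    s_avg = sum(cluster_sizes.values()) / len(cluster_sizes)
    return 2.0 * s_avg, 1.0 * s_avg


def classify(cluster_sizes: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Return (clusters to split, clusters to merge)."""
    u_th, l_th = size_thresholds(cluster_sizes)
    to_split = sorted(n for n, s in cluster_sizes.items() if s > u_th)
    to_merge = sorted(n for n, s in cluster_sizes.items() if s < l_th)
    return to_split, to_merge
```

For example, with cluster sizes A=100, B=900, C=200, D=400 cells, S_avg is 400, so U_th is 800 and L_th is 400: B is a candidate for splitting, while A and C are candidates for merging.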
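Clustering Cost Function
Cost function: the cost C_ij of clustering macros i and j combines a connectivity term Conn_ij and a criticality term Crit_ij.
Connectivity consideration: Conn_ij is computed from the connection weights W_i, W_j, and W_ij and from the sizes S_i and S_j relative to the size threshold S_th; it is 0 when the size condition on S_i, S_j, and S_th is not met.
Criticality consideration: Crit_ij is 1 if V_i and V_j are on a critical path (Crit_path) and the size condition on S_i, S_j, and S_th is met; otherwise it is 0.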
Block Placement
[Figure: hard macros (HM) and IO pads placed on the chip.]
Soft-Macro Placement
Inputs: a set of soft macros and the area available for soft macros.
Outputs: the locations of all soft macros.
Algorithm:
Step 1: force-directed placement.
Step 2: line-sweep-based soft-macro assignment.
Force-directed-based Placement
[Figure: soft macros SM1 to SM4 positioned among the fixed hard macros (HM) and IO pads by force-directed placement.]
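A generic force-directed placement step can be sketched as follows. This is not taken from the slides, which show only figures: it assumes two-pin weighted connections, treats hard macros and IO pads as fixed points, and repeatedly moves each soft macro to the weighted centroid of the nodes it connects to. All names are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def force_directed(fixed: Dict[str, Point], movable: List[str],
                   nets: Dict[Tuple[str, str], float],
                   iters: int = 50) -> Dict[str, Point]:
    """Relax movable nodes toward the weighted centroid of their neighbors."""
    # Start every movable node at the centroid of the fixed nodes.
    cx = sum(x for x, _ in fixed.values()) / len(fixed)
    cy = sum(y for _, y in fixed.values()) / len(fixed)
    pos = dict(fixed)
    pos.update({m: (cx, cy) for m in movable})
    for _ in range(iters):
        for m in movable:
            sx = sy = sw = 0.0
            for (a, b), w in nets.items():
                if m in (a, b):
                    ox, oy = pos[b if m == a else a]
                    sx += w * ox
                    sy += w * oy
                    sw += w
            if sw > 0:
                # Zero-force position: weighted average of connected nodes.
                pos[m] = (sx / sw, sy / sw)
    return {m: pos[m] for m in movable}
```

The resulting positions ignore macro areas and overlaps, which is why a separate assignment step is needed afterwards.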
Soft-Macro Area Extraction
[Figure: the SM area extracted from the chip area around the hard macros (HM) and IO pads.]
Sweeping-based Soft-Macro Assignment (Y direction)
[Figure: SM1 to SM4 plotted on X and Y axes; sweeping in the Y direction assigns SM3 and SM1 to regions of the SM area, and SM2 and SM4 to a shared region.]
Sweeping-based Soft-Macro Assignment (X direction)
[Figure: sweeping in the X direction orders SM4 and SM2 within their shared region.]
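One possible reading of the two sweeps is sketched below. The slides give only figures, so everything here is an assumption: the SM area is modeled as a list of horizontal bands with area capacities (`band_capacities`, hypothetical), the Y sweep fills bands in order of increasing y-coordinate, and the X sweep orders the macros inside each band by x-coordinate.

```python
from typing import Dict, List, Tuple


def sweep_assign(positions: Dict[str, Tuple[float, float]],
                 areas: Dict[str, float],
                 band_capacities: List[float]) -> List[List[str]]:
    """Assign macros to bands by a Y sweep, then order each band by an X sweep."""
    # Y sweep: visit macros from the lowest to the highest y-coordinate.
    order = sorted(positions, key=lambda m: positions[m][1])
    bands: List[List[str]] = [[] for _ in band_capacities]
    band, used = 0, 0.0
    for m in order:
        # Move to the next band once the current one cannot hold the macro.
        while band < len(bands) - 1 and used + areas[m] > band_capacities[band]:
            band, used = band + 1, 0.0
        bands[band].append(m)
        used += areas[m]
    # X sweep: order the macros that share a band from left to right.
    return [sorted(b, key=lambda m: positions[m][0]) for b in bands]
```

With four unit-area macros and band capacities of 2, 1, and 1, the two macros with the lowest y-coordinates share the first band and are then ordered by x, similar to SM2 and SM4 in the figures.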
Module Resynthesis
Slack computation: calculate the slack value for each inter-macro signal path.
Soft-macro resynthesis candidate selection:
If any soft macro has a negative slack value, pick the one with the largest negative slack as the candidate and resynthesize it using a tightened timing constraint.
If all paths satisfy the timing constraints, pick the soft macro with the highest positive slack value as the candidate and resynthesize it using a relaxed timing constraint.
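The candidate selection rule can be written as a short function. This sketch assumes each soft macro has already been reduced to a single slack value (for instance, the worst slack over its inter-macro paths) and that the macro to tighten is the one with the most negative slack; both are assumptions about details the slides leave open, and the names are hypothetical.

```python
from typing import Dict, Tuple


def select_candidate(slacks: Dict[str, float]) -> Tuple[str, str]:
    """Return (macro, action), where action is 'tighten' or 'relax'."""
    violating = {m: s for m, s in slacks.items() if s < 0.0}
    if violating:
        # A timing violation exists: fix the worst one first.
        return min(violating, key=violating.get), "tighten"
    # All timing is met: trade the largest positive slack for area.
    return max(slacks, key=slacks.get), "relax"
```

For slacks of SM1 = 0.4, SM2 = -1.2, SM3 = -0.3 the function picks SM2 for tightening; once no slack is negative, it picks the macro with the largest slack for relaxation.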
The Experiment Environment Setup
[Flowchart: the proposed flow, annotated with the tools used in the experiments.]
Synopsys
Cadence (Block Placement)
Avant! (P&R)
Avant! (STAR-RC), Avant! (STAR-DC)
Synopsys (Design Time)
Benchmarks
EX    Nets    #IO  #HM  #SM(B/A)  GateSM   GateTotal
Ind1  15,373  83   13   157/22    38,240   75,000
Ind2  27,404  155  8    150/28    75,361   95,000
Ind3  53,344  73   31   292/50    124,180  230,000
Results (Ind1 @ TSMC 0.5um)
EX    #IO  #HM  #SM(B/A)  GateSM  GateTotal
Ind1  83   13   157/22    38,240  75,000
Iter  GateTot  Area (um2)  Delay (ns)  Tresyn (hr)  Teco (hr)
1     75,000   25,250,625  18.35       6            4
2     75,039   25,251,118  15.87       5            4
3     75,020   25,252,218  13.91       4            3
Reported change, iteration 1 to 2: area .002%, delay 13.5%.
Reported change, iteration 2 to 3: area .004%, delay 10.5%.
Results (Ind2 @ TSMC 0.5um)
EX    #IO  #HM  #SM(B/A)  GateSM  GateTotal
Ind2  155  8    150/28    75,361  95,000
Iter  GateTot  Area (um2)  Delay (ns)  Tresyn (hr)  Teco (hr)
1     95,000   27,957,500  38.25       9            7
2     95,140   28,037,599  33.71       8            5
3     95,172   28,039,755  33.30       6            3
Reported change, iteration 1 to 2: area .29%, delay 12%.
Reported change, iteration 2 to 3: area .01%, delay 1.2%.
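The percentages reported for Ind2 at 0.5um can be reproduced from the area and delay values in the table. The sketch below assumes each percentage is the change relative to the previous iteration; the helper name is hypothetical.

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Iteration 1 to 2 (values taken from the Ind2 @ 0.5um table).
area_up_12 = pct_change(27_957_500, 28_037_599)   # area increase
delay_down_12 = -pct_change(38.25, 33.71)         # delay reduction

# Iteration 2 to 3.
area_up_23 = pct_change(28_037_599, 28_039_755)
delay_down_23 = -pct_change(33.71, 33.30)

print(round(area_up_12, 2), round(delay_down_12), round(area_up_23, 2), round(delay_down_23, 1))
```

Under this assumption the rounded values agree with the reported .29%, 12%, .01%, and 1.2%.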
Results (Ind3 @ TSMC 0.5um)
EX    #IO  #HM  #SM(B/A)  GateSM   GateTotal
Ind3  73   31   292/50    124,180  230,000
Iter  GateTot  Area (um2)  Delay (ns)  Tresyn (hr)  Teco (hr)
1     230,000  52,560,000  19.83       13           9
2     230,299  52,563,260  18.95       8            7
3     230,431  52,565,788  17.64       7            6
Reported change, iteration 1 to 2: area .06%, delay 13.5%.
Reported change, iteration 2 to 3: area -.05%, delay 10.5%.
Results (Ind2 @ 0.25um)
EX    #IO  #HM  #SM(B/A)  GateSM  GateTotal
Ind2  155  8    150/28    75,361  95,000
Iter  GateTot  Area (um2)  Delay (ns)  Tresyn (hr)  Teco (hr)
1     95,000   6,250,000   27.68       10           7
2     95,249   6,388,250   25.67       9            5
3     96,561   6,566,400   21.67       9            5
4     97,808   7,022,500   19.32       8            4
Reported change, iteration 1 to 2: area 2.2%, delay 7.3%.
Reported change, iteration 2 to 3: area 3.8%, delay 14.4%.
Reported change, iteration 3 to 4: area 9.8%, delay 8.3%.
Results (Ind2 @ 0.5um)
[Figure: the original critical path and the new critical path of Ind2 using the 0.5um library after two resynthesis iterations.]
Conclusions
Preserving design hierarchy for soft-macro placement leads to significant improvements in circuit timing.
Exploiting the interaction between HDL synthesis, floorplanning, and place-and-route is important to design quality.
Many open problems remain to be studied, such as initial timing budgeting for each module and simultaneous placement of hard macros and soft macros.