A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK...
Transcript of A COMPUTER-AIDED DESIGN SOFTWARE MODULE FOR CLOCK...
A COMPUTER-AIDED DESIGN SOFTWARE MODULE
FOR CLOCK TREE SYNTHESIS
IN VERY LARGE SCALE INTEGRATION DESIGN
CHEW EIK WEE
UNIVERSITI TEKNOLOGI MALAYSIA
A COMPUTER-AIDED DESIGN SOFTWARE MODULE
FOR CLOCK TREE SYNTHESIS
IN VERY LARGE SCALE INTEGRATION DESIGN
CHEW EIK WEE
A thesis submitted in fulfilment of the
requirements for the award of the degree of
Master of Engineering (Electrical)
Faculty of Electrical Engineering
Universiti Teknologi Malaysia
FEBRUARY 2008
iii
Specially dedicated to
my beloved family
iv
ACKNOWLEDGEMENTS
First and foremost, I would like to express my deepest gratitude and
appreciation to my supervisor, Prof. Dr. Mohamed Khalil Mohd Hani, for guiding
me during my master’s study and helping me in discovering the wonderful field in
the computer-aided design of electronic systems. This study would not be possible
without his invaluable suggestions and support. Besides the technical knowledge, he
is also my teacher of life.
Special thanks to Encik Nasir Shaikh Husin for his mentoring, support,
sharing his knowledge and consistent help to my work over the last two years. I also
deeply thank to my friends and colleagues, Yuan Wen, Heng Sun, Illiasaak,
Norashikin, Arif, and En Zulkifli Yusof for their best friendship, help, and support.
They made my life in the UTM colorful and enjoyable. Many thanks also to the
technicians in VLSI-ECAD Lab, En. Zulkffli bin Che Embong and En.
Khomarudden bin Mohd Khair Juhari, in creating a conducive learning and research
environment in the lab.
I would like to thank my parents and my family members for their love
during my years in graduate school. Their care always provides the warmest support
in my life and work, wherever I am.
Most importantly, I would like to special thank to my friend, Bee Leng, who
has provided her meaningful and precious suggestions and consistent support during
the last few years, to the completion of this research and thesis.
v
ABSTRACT
With the evolution of Very Large Scale Integration (VLSI) fabrication
technology, circuit size has grown and line width has decreased. In effect, the
transistor transit time and the time to drive signal lines across chips have also
decreased. Thus, interconnections have become the dominating factor in
determining circuit performance and reliability in the design of a VLSI circuit.
Clock distribution network, which is one of the biggest and most important nets in
any synchronous VLSI chip, is sensitive to these variations. The increased line
resistance is one of the primary reasons for the increasing significance of clock
distribution networks on synchronous performance. Furthermore, failing to control
the clock skew can also severely limit the maximum performance of the entire
system and create catastrophic race conditions in which an incorrect data signal may
be latched by register. This thesis proposes a Computer Aided-Design (CAD)
software module for useful-skew tree synthesis in deep-sub-micron VLSI design.
Building the proposed CAD software module involves the implementations of the
abstract topology generation algorithm, skew constraints scheduling algorithm and
clock tree construction algorithm. Due to the lack of availability of circuit data and
necessary software tools, we introduced a different test methodology to test the
reliability of the proposed CAD software module. Two different test data have been
used to verify the functionality of the CAD software module. Tests on the random
input data show that our CAD software module successfully synthesizes clock trees
that satisfy the entire clock skew constraints, and at the same time, achieve a shorter
wire-length. Tests on benchmark circuit’s data show that our CAD software module
successfully synthesizes clock trees that not only satisfy the skew constraint value,
but also further reduced the computed applicable clock period of the circuit.
vi
ABSTRAK
Evolusi perkembangan teknologi fabrikasi “Very Large Scale Integraion”
(VLSI) telah menyebabkan saiz litar semakin besar dan saiz lebar pengalir mengecil.
Keadaan ini menyebabkan masa melintasi transistor telah berkurangan dan seterusnya
masa memandu isyarat melintasi cip turut berkurangan. Oleh itu, penyambung-
penyambung dalaman cip menjadi faktor utama menentukan prestasi dan
kebolehpercayaan rekabentuk litar VLSI sekarang. Rangkaian pengagihan isyarat jam,
merupakan salah satu rangkaian terluas dan terpenting bagi cip segerak VLSI, amatlah
sensitif kepada perubahan perkembangan ini. Penambahan rintangan akibat daripada
pengurangan saiz lebar pengalir merupakan salah satu sebab utama penambahan
kepentingan oleh rangkaian isyarat jam dalam menentukan prestasi kesegerakan litar.
Tambahan pula, kegagalan mengawal pencongan jam menghadkan lagi maksima
prestasi keseluruhan sistem dan boleh mewujudkan keadaan pelarian data, iaitu data-
data yang salah akan diselak oleh pendaftar. Tesis ini mencadangkan satu modul
perisian Rekabentuk Bantuan Komputer (CAD) yang dapat mensintesiskan “useful-skew
tree” untuk rekabentuk VLSI sub-mikro-dalam. Pembangunan modul perisisian yang
dicadangan tersebut melibatkan implementasi algoritma penjanaan topologi abstrak,
penjadualan pencongan jam dan pembinaan pepohon jam. Disebabkan kekurangan data-
data litar dan perisian-perisian yang penting, kami memperkenalkan metodologi ujian
yang berbeza untuk menguji kebolehpercayaan modul perisian CAD yang dicadangkan.
Terdapat dua data ujian yang berbeza telah digunakan untuk mengenalpasti kewibawaan
modul perisisan CAD tersebut. Keputusan ujian pada data-data rawak masukan
menunjukkan pepohon jam yang disintesiskan oleh modul perisisan CAD dapat
memuaskan semua kekangan pencongan jam, dan pada masa yang sama dapat mencapai
panjangan wayar yang lebih pendek. Keputusan ujian pada data-data litar tanda aras
menunjukkan modul perisian CAD kami berjaya mensintesiskan pepohon jam yang
bukan sahaja dapat memuaskan semua kekangan pencongan jam, bahkan juga dapat
mengurangkan tempoh jam litar.
vii
TABLE OF CONTENTS
CHAPTER TITLE PAGE
DECLARATION ii
DEDICATION iii
ACKNOWLEDGEMENTS iv
ABSTRACT v
ABSTRAK vi
TABLE OF CONTENTS vii
LIST OF TABLES xi
LIST OF FIGURES xii
LIST OF SYMBOLS xvi
LIST OF APPENDICES xvii
1 INTRODUCTION 1
1.1 Background
1.2 Problem Statement
1.3 Objective
1.4 Scope of Work
1.5 Project Approach and Tools
1.6 Research Contribution
1.7 Thesis Organization
1
4
5
5
6
8
8
2 LITERATURE REVIEW, THEORY AND CONCEPTS 10
2.1 Previous Related Work
2.2.1 Clock Scheduling
2.2.2 Clock Tree Construction
2.2 Clock Skew – Definitions and Concepts
10
10
12
16
viii
2.2.1 Double-Clocking
2.2.2 Zero-Clocking
2.3 Concept of Useful-Skew
2.4 Elmore Delay Model
2.2.1 Numerical Example of Elmore Delay
Computation
17
19
20
25
26
3 ABSTRACT TOPOLOGY GENERATION AND CLOCK
SKEW SCHEDULING −−−− DEFINITIONS AND
ALGORITHMS
28
3.1 Introduction
3.2 Abstract Topology Generation
3.2.1 The Octagon of S
3.2.2 Reference Set, REFi
3.2.3 Bi-partitioning the Set S
3.3 Clock Skew Scheduling
3.3.1 Bellman-Ford Algorithm
28
29
38
41
41
45
52
4 USEFUL-SKEW TREE CONSTRUCTION −
ALGORITHM AND DESIGN
54
4.1 Overview
4.2 UST/DME Algorithm
4.2.1 Skew Constraint Matrix
4.2.2 Deferred-Merge Embedding (DME) Approach
4.3 Construction of Merging Region
4.3.1 Joining Segment Identification
4.3.2 Skew Commitment
4.3.3 Shortest Distance Region (SDR) Determination
4.3.4 Exact Merging Region Construction
4.4 Internal Node Embedding
54
55
58
59
60
66
67
70
70
77
5 CLOCK TREE SYNTHESIS PROGRAM DESIGN 81
5.1 Useful-Skew Tree (UST) Construction sub-module 81
ix
5.1.1 Supporting Unit
5.1.2 Main Program Unit
5.1.3 Skew Constraint Matrix, D Computation Unit
5.1.4 Joining Segment Identification Unit
5.1.5 Incremental Skew Scheduling Unit
5.1.6 Merging Region Construction Unit
5.1.7 Internal Node Embedding Unit
5.2 Abstract Topology Generation sub-module
5.2.1 The Main Program
5.2.2 Sorting Subunit
5.2.3 Octagon Sinks Computation Unit
5.2.4 Partition Unit
5.2.5 Diameter Determination Unit
5.3 Clock Skew Scheduling sub-module
5.3.1 Constraint Graph Updating Unit
5.3.2 Graph Relaxation Unit
82
84
86
87
87
88
94
94
95
96
97
97
98
98
99
100
6 TESTING METHODOLOGIES 101
6.1 Overview
6.2 The Function of benchParse.pl
6.3 The Function of TPRParse.pl
6.4 The Function of CapacitanceParse.pl
6.5 The Function of delayTrace.pl
6.6 Source-Sink Delay Computation for Output
101
103
104
110
110
111
7 TEST AND RESULT 116
7.1 Random Data Test
7.1.1 Experimental Result for TEST_DATA_1
7.1.2 Experimental Result for TEST_DATA_2
7.1.3 Experimental Result for TEST_DATA_3
7.2 Real Circuit Data Test
7.2.1 Experiment Result of Circuit s208
7.2.2 Experiment Result of Circuit s298
116
117
120
122
124
125
126
x
7.2.3 Experiment Result of Circuit s382
7.3 Experimental Result of Previous Relevant Researches
127
128
8 CONCLUSIONS 132
8.1 Concluding Remarks
8.2 Recommendations for Future Work
132
133
REFERENCES 137
Appendices A - H 143 – 236
xi
LIST OF TABLES
TABLE NO. TITLE PAGE
3.1
3.2
3.3
7.1
7.2
7.3
7.4
7.5
Computation of partition cost for example in Figure 3.6
Coordination computation result of set S in Figure 3.7
The weight(s) of S based on REF1 = {s10, s17, s9}
Summary of skew evaluation result for TEST_DATA_1
Wire-length (µm) comparison between UST-BP and ZST
and BST [(Xi and Dai, 1997)]
Comparison of wire-lengths among ZST, BST/DME, UST-
BP, and UST/DME (with BST/DME topology) [(Tsao and
Koh, 2000)]
Comparison of wire-lengths among ZST, UST/DME, and
PV-UST/DME [(Chaturvedi and Hu, 2003)]
Results of PVR-UST/DME with buffer insertion
[(Chaturvedi and Hu 2003)]
37
40
42
120
129
129
130
131
xii
LIST OF FIGURES
FIGURE NO. TITLE PAGE
1.1
1.2
2.1
2.2
2.3
2.4
2.5
2.6
2.7
2.8
2.9
2.10
2.11
2.12
3.1
3.2
3.3
3.4
3.5
3.6
3.7
3.8
Physical design cycle
Project workflow of CAD software module development
Symmetric H-tree clock distribution network
Definition of clock skew
Scenario of double-clocking a) initial stage b) when arrival
of first clock pulse c) wrong data latch-in by FFj
Scenario of zero-clocking a) initial stage b) when arrival of
first clock pulse c) when clock pulse arrives at si d) FFj still
latch-in the old data
Zero skew clock distribution design
Non-zero skew clock distribution design
Compensation of negative delay value
A synchronous circuit modeling
Permissible range of skewij
Skew constraint graph
Equivalent π-model of a wire segment
Example of interconnect wire for delay computation a) wire
b) wire modeling based on the π-Model
Example of clock tree routing by (Wu et al., 2003)
An example of sink nodes and its coordination
Overview of bi-partitioning process
Abstract topology generation
An example of set S with 3 sink nodes
An example of dividing a set of S into SL and SR
Example of an octagon of S for a 10 sink nodes set, S
Octagon of S’s boundary construction
2
7
13
17
18
20
21
21
22
22
24
24
25
27
29
30
31
32
36
37
38
39
xiii
3.9
3.10
3.11
3.12
3.13
3.14
3.15
3.16
3.17
3.18
3.19
3.20
3.21
4.1
4.2
4.3
4.4
4.5
4.6
4.7
4.8
4.9
4.10
4.11
4.12
4.13
Partitioning result based on the set REF1
Balanced bipartition result that based on reference set REF2
Balanced bipartition result that based on reference set REF3
Balanced bipartition result that based on reference set REF4
Balanced bipartition result that based on reference set REF5
Balanced bipartition result that based on reference set REF6
(a) Example circuit and (b) its constraint graph
(a) Example circuit and (b) its constraint graph
Example circuit
Skew constraint graph (with p unknown), GC for circuit in
Figure 3.17
Applying Deoker’s algorithm
Skew constraint graph, GC (with p = 7.43) for the example
of Figure 3.17
Implementation of the optimized clock period
Design structure of proposed CAD software module
The DME merging process
Example input data to UST/DME algorithm a) clock pin
floorplan b) abstract topology c) skew constraint value
Merging Region Construction – Overview
Example for merging region construction process
Example of joining segment JS
Example of delay and skew computation based on joining
segment JS. (a) point of view in binary-tree structured. (b)
point of view in Manhattan plane
Example of shortest distance region SDR (u,w)
Start node visitation on Lu and the skew23 evaluation
Node visitation on arrow pointed node and the skew23
evaluation
Node visitation on arrow pointed node and the skew23
evaluation
Merging region, mr(v) is determined
Another set of merging region, mr(v) is determined
42
43
43
43
44
44
45
46
47
48
48
50
51
55
59
61
61
65
66
68
70
71
72
73
73
74
xiv
4.14
4.15
4.16
4.17
4.18
4.19
4.20
4.21
5.1
5.2
5.3
5.4
5.5
5.6
5.7
5.8
5.9
5.10
5.11
5.12
5.13
5.14
6.1
6.2
6.3
Node visitation on arrow pointed node and the skew23
evaluation
Node visitation on arrow pointed node and the skew23
evaluation
Node visitation on arrow pointed node and the skew23
evaluation
Last node visitation on Lw and the skew23 evaluation
Final result of merging region, mr(v) construction
Merging region mr(w) re-computation (by comparing to the
Figure 4.6)
Internal node embedding process
Completed useful-skew clock tree
Design structure of useful-skew tree construction sub-
module
Binary tree-like structures two-dimensional array
Programming flowchart of UST/DME algorithm
C Code for Skew Constraint Matrix, D Computation Unit
C Code for Incremental Skew Scheduling Unit
Example of SDR between joining segments
Flowchart of merging region construction within SDR
Detour-wiring when skew value > upper skew constraint
bound
Delay evaluation when skew value < lower skew constraint
bound
Design structure of abstract topology generation sub-
module
Programming flowchart for sink nodes partitioning process
C Code fragment of bubble sort procedure
Design structure of clock skew scheduling sub-module
Programming flowchart of clock skew scheduling
A sample of sink nodes netlist file
A sample of delay netlist file
A sample of .bench netlist
74
75
75
76
76
77
79
80
82
83
85
85
87
90
91
92
93
94
95
96
98
99
102
103
104
xv
6.4
6.5
6.6
6.7
6.8
6.9
6.10
6.11
6.12
6.13
6.14
6.15
6.16
6.17
6.18
6.19
7.1
7.2
7.3
7.4
7.5
7.6
7.7
7.8
7.9
7.10
7.11
7.12
8.1
8.2
A sample of .bch netlist
A sample of .bch netlist
A sample of .tpr netlist
Invoke standard place and route
Select a TPR netlist
SPR core setup
Run standard place and route (SPR)
Generated layout design through SPR
A sample of spice netlist
A sample of .pcp file
A sample of .cap File
A sample of pathDELAY.txt
Example of synthesized clock tree
C-code implementation of segmental delay computation
C-code implementation of source-to-sink delay
computation
Example of source-sink delay computation result
Synthesized useful-skew tree of TEST_DATA_1
Computation result of propagation delay of each sink node
Useful-skew tree in binary-tree structure view
Synthesized useful-skew tree of TEST_DATA_2
Computation result of propagation delay of each sink node
Useful-skew tree in binary-tree structure view
Synthesized useful-skew tree of TEST_DATA_3
Computation result of propagation delay of each sink node
Process flow of real circuit data testing
Synthesized useful-skew tree for circuit s208
Synthesized useful-skew tree for circuit s298
Synthesized useful-skew tree for circuit s382
Top-level block diagram of the proposed CAD software
module
Permissible skew range with safety margin ddition
104
105
105
106
106
107
107
108
109
109
110
111
112
113
114
114
118
118
119
120
121
121
123
123
125
126
127
128
133
134
xvi
LIST OF SYMBOLS
VLSI - Very Large Scale Integration
CAD - Computer Aided-Design
EDA - Electronic Design Automation
IC - Integrated Circuit
PC - Personal Computer
GHz - Giga Hertz
UST - Useful-Skew Tree
ZST - Zero-Skew Tree
DME - Deferred-Merge Embedding
BB - Balanced Bipartition
SDR - Shortest Distance Region
FSR - Feasible Skew Range
ISCAS - International Symposium on Circuit and System
FF - Flip-Flop
Perl - Practical Extraction and Report Language
GUI - Graphics User Interface
xvii
LIST OF APPENDICES
APPENDIX TITLE PAGE
A
B
C
D
E
F
G
H
Summary of CAD Software Module
Example of Balanced Bipartitioning Process
Numerical Example of Floyd-Warshall’s
Algorithm
Numerical Example of Incremental Scheduling
Method
Skew Evaluation Result for TEST_DATA_2
Skew Evaluation Result for TEST_DATA_3
C Code For Cad Software Module
Perl Script For myNetlistParser
143
145
150
156
160
162
165
217
CHAPTER 1
INTRODUCTION
The thesis proposes a Computer Aided-Design (CAD) software module for
useful-skew tree (UST) synthesis in deep-sub-micron VLSI integrated circuit design.
The CAD software module is coded in C and Perl, and runs in a Microsoft Windows
2000 PC environment. In this chapter the project background, problem statement,
research objectives, scope of work, and thesis organization are presented.
1.1 Background
The last few decades brought explosive growth in the electronics industry due
to the rapid advances in integration technologies and the different benefits of large
scale system design. The manufacturing capability and complexity, combined with
the economic benefits of large electronic systems, are forcing a revolution in the
design of these systems and challenging system designers who are involved in the
design of integrated circuits. Integrated circuits today consist of hundreds of millions
of transistors. Due to the tremendous increase in complexity, automating the design
process has become a crucial issue.
The phase associated with the task of automatically designing a circuit using
Computer Aided Design (CAD) tools is called Electronic Design Automation (EDA).
The objective of the EDA research field is to fully automate the tasks for nearly
every aspect of the development cycle, from the circuit specification and design entry
2
to the performance analysis, layout generation and verification. A large subset of
problems in VLSI EDA is computationally intensive, and future EDA tools will
require even more accuracy and computational capabilities. For a complicated
problem in the modern VLSI design, an appropriate approach is to use a “divide-and-
conquer” strategy, in which the whole design task is broken down into several sub-
tasks (Sherwani, 1996). These sub-tasks are then more manageable to be solved
using mathematical and heuristic techniques.
Due to the large number of components and the details required by the
fabrication process, the physical design needs the help of computers. As a result,
almost all phases of physical design extensively use CAD tools and many phases
have already been partially or fully automated. The design is carried out in stages.
The final circuit of such an IC can have up to a billion of components; it is delivered
in a step-by-step manner. This is accomplished in several stages such as partitioning,
floorplanning, placement, routing, and compaction. The different stages of physical
design cycle are shown in Figure 1.1. A fully tested and error-free design at the
switch level can be the starting point for a physical design. It is to be realized as the
final circuit using (typically) a million components in the foundry’s library.
Figure 1.1: Physical design cycle
Circuit Design
Floorplaning
&
Placement
Routing
Compaction
Extraction and
Verification
Physical Design Cycle
Partitioning
Fabrication
3
Among all the stages, routing is a difficult problem, and much research has
been performed on this subject. This is because almost all problems in routing are
computationally complex. Essentially, the task of routing stage is to complete the
interconnections between blocks according to a specified netlist. The goal of a router
is to complete all these circuit connections using the shortest possible wire length,
routing with least number of layers and achieving timing optimization. This is
usually performed in three phases, referred to as the Global Routing, Detailed
Routing and Specialized Routing.
In physical design of VLSI system, there are two types of nets that need
special attention in routing: Clock nets and Power/Ground nets. Clock nets need to
be routed with great precision, since the actual length of the path of a net from its
entry point to its terminals determines the maximum clock frequency on which a chip
may operate. In a synchronous system, chip performance is directly proportional to
its clock frequency. A clock router needs to take several factors into account,
including the resistance and capacitance of the metal layers, noise and cross talk in
wires, and the type of load to be driven. In addition, the clock signal must arrive
simultaneously at all functional units with little or no waveform distortion. The clock
is also known as major power-consuming net. The clock net is only about 4% of the
total routing length of the entire interconnects, but power consumption of clock net is
not minor (Magen et al., 2004). In addition, compared to power and ground routing,
clock routing is relatively more complex.
As a result, designing a clock distribution network has thus become critical
not only for correct synchronization of data, but also can improve circuit
performance and reduces power dissipation. Consequently, one of the challenges that
EDA engineers have to face is to develop a new modeling capability and synthesis
techniques that help to control the clock distribution network routing effectively.
4
1.2 Problem Statement
When the VLSI feature size becomes progressively smaller, moving into the
Deep Submicron or Ultra-Deep Submicron Technology (sometimes, referred as
nanometer VLSI) era, previously negligible variation effects start to affect circuit
performance and yield significantly. Clock skew, is one of the variation effects that is
always eliminated by previous clock distribution network design, where conventional
clock designs have placed emphasis on seeking zero clock skew. Due to deep-sub-
micron below 0.25 µm process, the long global interconnect lines become highly
resistive as line dimensions are decreased. This increased line resistance is one of the
primary reasons for the increasing significance of interconnection delay and now is
of greater importance than the device delays. The interconnection delays eventually
contribute to a large part of the clock signal delay. This leads to new problem to
arise, as achieving a near-zero skew design is more and more difficult. Thus, the
clock skew cannot longer be ignored. The control of clock skew can also severely
limit the maximum performance of the entire system and create catastrophic race
conditions in which an incorrect data signal may latch within a register.
As a result, in the modern high speed VLSI design era, clock design plays a
crucial role in determining chip performance and facilitating timing and design
convergence. Clock routing is important in the layout design of a synchronous digital
system as it influences correctness, area, speed, power dissipation, power/ground
supply noise, process variations and thermal issues of the synthesized system.
Consequently, the automatic synthesis of the clock tree network has gained
considerable attention in the design community.
The clock net is usually one of the first nets to be routed, and consequently
block own routing areas for nets routed subsequently. As it is one of the largest nets,
the area occupied is also a concern. Also, the clock net accounts for a significant
fraction of the system power dissipation as it switches most frequently and is a large
net. Carrying the heaviest load and switching at a high frequency, the clock typically
dissipates about 40% of the total interconnect power in a synchronous digital system
(Magen et al., 2004).
5
In short, drastically increased requirements for high performance and high
speed VLSI circuits have posed challenges to the clock routing design, where clock
distribution network with elaborated timing characteristics and minimal wire-length
is crucial for multi-GHz VLSI designs. However, today's CAD systems do not help
much with the problems of time and often leave the final analysis to the designer.
High performance clock design is indeed and area of active research.
Responding to the problem stated above, a CAD software module specifically
for clock tree synthesis is proposed. By providing as inputs, the post-placement
layout data, together with the data paths propagation delay and the clock pins’
coordination and loading capacitance, a Useful-skew Tree (UST) is synthesized. The
output clock tree data is written in a netlist file, which will contain the information
on how the entire clock sinks are routed toward the clock source.
1.3 Objective
The objective of this thesis is to propose the development of a Computer
Aided-Design (CAD) software module for Useful-skew Tree (UST) synthesis in
deep-sub-micron VLSI integrated circuit design. The design of this CAD software
module applies several routing algorithms, including the UST/DME algorithm for
useful-skew tree generation, Balanced Bipartition (BB) method for abstract topology
generation, and Deoker’s graph-theoretic approach for clock skew scheduling.
1.4 Scope of Work
(i) A demonstration application prototype is developed to be used in the system
validation of the proposed CAD software module.
(ii) Ideally, the clock tree synthesis should be performed on the post-placement
layout data. Due to lack of availability of actual circuit data and necessary
6
software tools, the performance evaluation to our CAD software module is
restricted to:-
(a) Actual circuit data, is applied with International Symposium on
Circuits and Systems (ISCAS89) benchmark circuits which are re-
created manually with Tanner Tools.
(b) The required data such as data path delays, coordination and loading
capacitance of each clock pin is extracted, and then transformed
manually to our CAD software module readable format.
(c) The reliability of synthesized clock tree is verified by manually
comparing the computed skew value with the prescribed skew
constraints.(as applying the synthesized clock tree to actual circuit
layout is impossible to be performed in current stage)
1.5 Project Approach and Tools
Figure 1.2 illustrates the overview of project workflow to the development of
the proposed CAD software module. On completing of the software sub-modules, the
sub-modules are integrated and tested. The tests are conducted using (i) 3
randomized input data and (ii) actual circuit data of ISCAS89 benchmark circuit. A
front-end GUI program is developed using Perl for use in demonstration of the
results. Satisfactory outcomes from the performance evaluation successfully
conclude the research.
7
Figure 1.2: Project workflow of CAD software module development
The following software tools are used in this work.
i. Microsoft Visual C++ 6.0 – used to develop the entire core program
of proposed CAD software module.
ii. Perl scripting – used to develop the netlist parsers and the front-end
prototyping program.
iii. Tanner Tools – used to re-produce ISCAS89 benchmark circuit in
layout drawing design.
START
Problem Formulation
Work Scope Determination
Literature
Review
Documentation
&
Thesis
Writing
NO
CAD Subsystem Development
Abstract
Topology
Generator
Development
Useful-Skew Tree Synthesizer
Development
Clock
Skew
Scheduler
Development
Functionality Test
Random
.Data
Test
Actual
Circuit
Data
Test
Meet
Functionality?
END
YES
8
1.6 Research Contribution
i. Investigation on the clock distribution network routing issue in deep
submicron technology VLSI design, and systematic study on the clock
skew scheduling and clock tree routing technique for an efficient
implementation.
ii. Delivering a CAD software module for useful-skew tree synthesis where
the synthesized clock tree layout is not only minimized in term of wire-
length, but also fulfills the prescribed skew constraint values.
1.7 Thesis Organization
This thesis is organized into seven chapters. First chapter is the introduction
chapter. It covers the background of the research, the problem statement, research
objectives, scope of work, project approach and tools, and the significance of the
research.
The second chapter reviews the fundamental concept of clock distribution
network design research. It consists of some previous related research work and
background theory.
Chapter three describes the algorithm of the abstract topology generation and
clock skew scheduling algorithm. Both of the algorithms are used to pre-process the
post-placement layout data so as to transform it to the required input data of
UST/DME algorithm.
Chapter four focuses on the description of our main clock tree construction
algorithm, UST/DME algorithm. The architecture design of the proposed CAD
9
software module is presented where the UST/DME algorithm is applied as the main
engine of the useful-skew tree synthesis.
Chapter five delivers the detailed implementation work of the proposed CAD
software module which is integrated from the useful-skew tree construction sub-
module, abstract topology generation sub-module and clock skew scheduling sub-
module.
In Chapter Six, the testing methodology is presented. We discuss on how the
test platform is set-up and how the skew values of synthesized clock tree can be
computed.
Chapter Seven shows the experimental results of the functionality test to the
developed CAD software module. Part of ISCAS89 benchmark circuits is used in the
experiments.
The final chapter summarizes the research findings and suggests potential
future work.
REFERENCES
Bakoglu, H., Walker, J.T. and Meindl, J.D. (1986). A symmetric clock-distribution
tree and optimal high-speed interconnection for reduced clock skew in ULSI and
WSI circuit. Proceedings of IEEE International Conference on Computer Design,
118-122.
Brglez, F., Bryan, D., and Kozminski, K. (1989) Combinational profiles of sequential
benchmark circuits. Proceedings of IEEE International Symposium on Circuits and
Systems. 1929–1934.
Ch’ng, H. S., Chew, E. W., Nasir, S. H., and Mohamed Khalil, H. (2006)
Accelerating graph algorithms with priority queue processor. In Proceedings of
Regional Postgraduate Conference on Engineering and Science (RPCES 2006) July
26-27 JB, Malaysia: UTM 257-262.
Chao, T. H., Hsu, Y. C., and Ho, J. M. (1992a) Zero skew clock net routing. In
proceedings of ACM/IEEE Design Automation Conferences. 518-523.
Chao, T. H., Hsu, Y. C. H., Ho, J. M., Boese, K. D., and Kahng, A. B. (1992b) Zero
skew clock routing with minimum wirelength. IEEE Transactions on Circuit
System,November 1992, 39(11):799–814.
Chaturvedi, R. and Hu, J. (2003) A simple yet effective merging scheme for
prescribed-skew clock routing. In Proceedings of the 21st IEEE International
Conference on Computer Design (ICCD’03) 282–287
138
Chen, Y. P. and Wong, D. F. (1996) An Algorithm for Zero-Skew Clock Tree
Routing with Buffer Insertion. In Proceedings of European Design and Test
Conferences (ED&TC). 230-236
Cho, J. D., and Sarrafzadeh, M. (1995) A buffer distribution algorithm for high-
performance clock net optimization. IEEE Transactions on VLSI Systems Vol.3,
No.1, pp.84–97.
Cong, J., and Koh, C. K. (1995) Minimum-cost bounded-skew clock routing.
Proceeding of International Symposium Circuits and Systems. 322–327
Cong, J., Kahng, A. B., Koh, C.K., and Tsao. C. W. A. (1998) Bounded-skew clock
and Steiner routing. In ACM Transactions on Design Automation of Electronics
Systems 3(3):341–388.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. (2001) Introduction to
Algorithms. 2nd
Edition. The MIT Press, McGraw-Hill Book Company.
Deokar, R. B., and Sapatnekar, S. S. (1995). A graph-theoretic approach to clock
skew optimization. Proceedings of the 1994 IEEE International Symposium on
Circuits and System volume 1, pages 407–410.
Desai, U., Tam, S., Kim, R., Zhang, J., and Rusu, S. (2000) Itanium processor clock
design. In Proceedings of the 2000 international symposium on Physical design 94–
98.
Dhar, S., Franklin, M. A. and Wann, D. F. (1984). Reduction of clock delays in VLSI
structure. Proceedings of IEEE International Conference on Computer Design 778-
783.
Donno, M., Macii, E, and Mazzoni, L. (2004) Power-aware. clock tree planning In
Proceedings International Symposium on Physical Design 138-147
139
Edahiro, M. (1993) A clustering-based optimization algorithm in zero skew routings.
Proceedings of the ACM/IEEE Design Automation Conference (DAC), 612–616.
Friedman, E. G. (1989). Performance limitations in synchronous digital systems.
University California, Irvine: Ph.D. dissertation.
Farrahi, A. H., Chen, C., Srivastava, A. (2001) Activity-driven clock design. In IEEE
Transactions nn Computer-Aided Design Of Integrated Circuits and Systems June
2001. Vol. 20, No. 6, pp. 705 – 714.
Fishburn, J. P. (1999). Clock skew optimization. IEEE Transactions on Computer
39(7):945–951.
Fisher, A. L., and Kung, H. T. (1982). Synchronizing large systolic arrays.
Proceedings of SPIE, 44-52.
Guthaus, M. R., Sylvester, D., and Brown, R. B. (2006) Clock buffer and wire sizing
using sequential programming. In Design Automation Conferences (DAC’06) July
24-28. San Francisco, California: ACM, 1041-1046.
Hatamian, M. (1988) Understanding clock skew in synchronous systems. Concurrent
Computations (Algorithms, Architecture and Technology). New York: Plenum, 87–
96.
Held, S., Korte, B., Maßberg, J., Ringe, Matthias., and, Vygen, J. (2003) Clock
scheduling and clock-tree construction for high performance ASICs In Proceedings
of the International Conference on Computer Aided Design (ICCAD’03). 232 – 239.
Hu, T. C. and Kuh, E. (1985) Theory and Concepts of Circuit Layout. VLSI Circuit
Layout:Theory and Design. IEEE Press, New York.
Huang, D. J. H., Kahng, A. B., and Tsao, C. W. A. (1995). On the bounded-skew
clock and steiner routing problems. Proceeding of 32nd Design Automation
Conferences 508–513.
140
Jackson, M. A. B. Srinivasan, A. and Kuih, E. S. (1990). Clock routing for high
performance IC’s. In Proceedings of the ACM/IEEE Design Automation Conference
322-327.
Kahng, A. B., Cong, J. and Robins, G. (1993). Matching based Models for high
performance clock routing. IEEE Transaction on CAD of Integrated Circuit and
System, August 1993, 12:1157-1169.
Kourtev, I. S., and Friedman, E. G. (1999). Clock skew scheduling for improved
reliability via quadratic programming. Proceedings of the 1999 IEEE/ACM
International Conference on Computer-Aided Design, 239–243.
Lam, W. C. D. and Koh, C. K. (2006) Process variation robust clock tree routing. In
Proceedings of the 2005 Conference on Asia South Pacific Design Automation 606 –
611
Magen, N., Kolodny, A., Weiser, U., and Shamir, N. (2004) Interconnect-Power
Dissipation in a Microprocessor. In Proceedings of the 2004 International Workshop
On System Level Interconnect Prediction, 7–13.
Neves, J. L., and Friedman, E. G. (1996). Optimal clock skew scheduling tolerant to
process variations. Proceedings of the 33rd
Annual Conference on Design
Automation, 623–628.
Pullela, S., Menezes, N., Omar, J., and Pillage, L. T. (1993) Skew and delay
optimization for reliable buffered clock trees. IEEE International Conferences on
Computer Aided Design. 556–562.
Sakallah, K. A., Mudge, T. N. and Olukoton, O. A. (1990). checkTc and minTc:
Timing verification and optimal clocking of synchronous digital circuits.
Proceedings of IEEE International. Conferences. on Computer-Aided Design,
November 1990, 552–555.
141
Shenoy, N. Brayton, R. K., and Sangiovanni-Vincentelli, A. L. (1992). Graph
algorithms for clock schedule optimization. Proceedings of IEEE International.
Conferences. on Computer-Aided Design, 132–136.
Sherwani, N. (1996). Algorithms for VLSI Physical Design Automation. 2nd
Edition.
Dordrecht, The Netherlands: Kluwer Academic Publishers.
Stinson, J. and Rusu, S. (2003) A 1.5GHz third generation Itanium-2 processor. In
Proceedings of the 40th conference on Design automation 706–709
Szymanski, T. G., and Shenoy, N. (1992). Verifying clock schedules. Proceedings of
IEEE International. Conferences. on Computer-Aided Design, November 1992, 124–
131.
Takahashi, A., Inoue, K., and Kajitani, Y. (1997) Clock-Tree Routing Realizing a
Clock-Schedule for Semi-Synchronous Circuits. In Proceedings of IEEE/ACM
International Conference on Computer Aided Design (ICCAD) '97. 260-265.
Tsai, J. L., Chen, T. H., and Chen, C. C. P. (2004) Zero-skew clock tree optimization
with buffer insertion/sizing and wire sizing. In IEEE Transactions on Computer-
Aided Design of Integrated Circuit and System. Vol. 23, No.4, pp. 565-527.
Tsai, Jeng-Liang (2005) Clock Tree Synthesis for Timing Convergence and Timing
Yield Improvement in Nanometer Technologies. University of Wisconsin-Madison:
Ph.D. Thesis.
Tsao, C. W. A., and Koh, C. K. (2000) UST/DME: a clock tree router for general
skew constraints. Proceedings of the IEEE/ACM ICCAD. 400–405.
Tsay, R. S. (1993). An exact zero-skew clock routing algorithm. IEEE Transactions
on Computer-Aided Design of Integration Circuit System, CAD-12, 242–249.
142
Wang, K., and Marek-Sadowska, M. (2004) Buffer sizing for. clock power
minimization subject to general skew. constraints. In Design Automation
Conferences (DAC’04). June 7-11. San Diego, California: ACM, 159–164.
Xi, J. G., and Dai, W. W. M. (1995) Buffer insertion and sizing under process
variations for low power clock distribution,” In Proceedings of 32nd
Design
Automation Conferences. June.
Xi, J. G., and Dai, W. W. M. (1997) Useful-skew clock routing with gate sizing for
low power design. Journal of VLSI Signal Processing. 16(2/3):163–179,
Zanella, S., Nardi, A., Neviani, A., Quarantelli, M., Saxena, S., and Guardiani, C.
(2000) Analysis of the impact of process variations on clock skew. IEEE
Transactions on Semiconductor Manufacturing, vol. 13, no. 4, pp. 401—407.
Zhu, Q., Dai, W. W. M., and Xi, J. G. (1993) Optimal sizing of high speed clock
networks based on distributed rc and transmission line models. IEEE International
Conferences on Computer Aided Design. November 1993, 628–633.