
IEEE TRANSACTIONS ON

EMERGING TOPICS IN COMPUTING

Received 3 October 2013; revised 31 December 2013; accepted 10 January 2014. Date of publication 9 April 2014; date of current version 4 February 2015.

Digital Object Identifier 10.1109/TETC.2014.2316503

Variation-Aware Layer Assignment With Hierarchical Stochastic Optimization on a Multicore Platform

XIAODAO CHEN1, DAN CHEN1,2, (Member, IEEE), LIZHE WANG1,3, (Senior Member, IEEE), ZE DENG1, RAJIV RANJAN4, ALBERT Y. ZOMAYA5, (Fellow, IEEE), AND SHIYAN HU6, (Senior Member, IEEE)

1School of Computer Science, China University of Geosciences, Wuhan 430074, China
2Collaborative and Innovative Center for Educational Technology, Wuhan 430079, China
3Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100093, China
4Commonwealth Scientific and Industrial Research Organisation Computational Informatics Division, Clayton South, VIC 3169, Australia
5School of Information Technologies, University of Sydney, Sydney, NSW 2006, Australia
6Department of Electrical and Computer Engineering, Michigan Technological University, Houghton, MI 49931 USA

CORRESPONDING AUTHOR: D. CHEN ([email protected])

This work was supported in part by the National Natural Science Foundation of China under Grant 61272314 and Grant 61361120098, in part by the Program for New Century Excellent Talents in University under Grant NCET-11-0722, in part by the Fundamental Research Funds for Central Universities, China University of Geosciences, Wuhan, China, under Grant CUG120114, Grant CUG130617, Grant CUG140614, and Grant CUG140612, and in part by the Beijing Microelectronics Technology Institute through the University Research Programme under Grant BM-KJ-FK-WX-20130731-0013.

ABSTRACT As the very large scale integration (VLSI) technology enters the nanoscale regime, VLSI design is increasingly sensitive to variations on process, voltage, and temperature. Layer assignment technology plays a crucial role in industrial VLSI design flow. However, existing layer assignment approaches have largely ignored these variations, which can lead to significant timing violations. To address this issue, a variation-aware layer assignment approach for cost minimization is proposed in this paper. The proposed layer assignment approach is a single-stage stochastic program that directly controls the timing yield via a single parameter, and it is solved using Monte Carlo simulations and the Latin hypercube sampling technique. A hierarchical design is also adopted to enable the optimization process on a multicore platform. Experiments have been performed on 5000 industrial nets, and the results demonstrate that the proposed approach: 1) can significantly improve the timing yield by 64% in comparison with the nominal design and 2) can reduce the wire cost by 15.7% in comparison with the worst case design.

INDEX TERMS Layer assignment, variation-aware design, stochastic programming.

I. INTRODUCTION
As technology enters the nanoscale regime, VLSI design is increasingly sensitive to process, voltage and temperature (PVT) variations, and circuit timing is significantly impacted by variations. In practice, a design that does not consider variations can suffer significant timing degradation. For instance, variation of the interconnect resistances along a global timing-critical net can make the net violate its timing constraint, which could even result in the malfunction of the whole circuit. Therefore, variation-aware design is highly desirable for VLSI design.

A large body of research has focused on statistical variation-aware design. Existing statistical optimization techniques generally choose gate sizing as the target problem, such as [1] and [2]. In these methods, each circuit parameter is characterized by a probability density function (PDF) to account for the variational effects, in contrast to a single deterministic value as in deterministic design.


2168-6750 2014 IEEE. Translations and content mining are permitted for academic research only.Personal use is also permitted, but republication/redistribution requires IEEE permission.

See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. VOLUME 2, NO. 4, DECEMBER 2014




They often need an assumption on the variation distribution, such as a Gaussian distribution, which limits the applicability of the techniques. There are some existing variation-aware optimization techniques without probability distribution assumptions, such as the interval arithmetic based technique proposed in [3]. The problem with this technique is that it is not always accurate to use interval arithmetic to approximate an arbitrary probability distribution.

In VLSI design, layer assignment, which assigns wires to different layers, is a powerful technique to effectively reduce the interconnect delay. It has been heavily used in the industrial physical synthesis flow [5]. Since wires on the thick metal layers have much smaller resistances compared to those on thin metal layers, assigning wires there can effectively reduce the delay. Layer assignment for efficient timing closure has been considered in [6], where the problem is formulated as using the minimum amount of wire resources to meet a timing target for a given buffered routing tree. This formulation is particularly interesting when design proceeds to the late stage of a physical synthesis flow [7]. At such a stage, significant layout tuning needs to be avoided. Meanwhile, layer assignment is desired because it only needs to change wire layers but not gate locations [6]. With the above formulations, Li et al. proposed an approach to perform concurrent buffer insertion and layer assignment, and Hu et al. proposed a polynomial time approximation scheme for minimum cost layer assignment [5], [6]. However, they do not consider variations, which are quite important and may significantly impact the design. There are some previous works on variation related layer assignment techniques, such as the redundant via insertion enhanced maze routing [9] and the lithography aware detailed routing [11]. However, they focus on specific types of variations. The other issue with the polynomial time approximation scheme for minimum cost layer assignment [6] is its assumption that every wire between an adjacent set of buffers is assigned to the same layer, i.e., no cross-layer assignment is allowed. This is appropriate for designs in the early stage but not in the late stage. It is well known that in the late stage, cross-layer assignment is very useful for the router to further tune the wire shapes/layer assignment to reduce coupling [8], optimize vias [9], and improve timing [10].

To address the issues mentioned above, this work considers a variation-aware layer assignment approach which can deal with general variation models and allow cross-layer assignment. We propose a layer assignment approach based on a stochastic programming method to achieve the above goals.

The proposed approach aims to minimize the wire cost with timing constraints using stochastic programming. The approach is focused on general variation models with a systematic Monte Carlo simulation. In this way, the proposed approach can handle variations without any assumptions on variation models. More importantly, this work considers the variations during layer assignment in the late design stage, in which no significant design change is desired. This work also modifies the classic stochastic programming framework into a new single-stage stochastic programming framework. In general, stochastic programming is a powerful technique for handling design with variations. The previous work [4] uses stochastic programming for the continuous gate sizing problem but not for the discrete layer assignment problem. It follows the classic two-stage stochastic programming framework; it only minimizes the expected delay and cannot directly control the timing yield. As a result, it usually achieves 100% yield in its simulations (see [4]), which indicates a waste of resources. Observing this critical limitation, the classic stochastic programming framework has been largely modified in this work. Our stochastic programming is a single-stage technique, which is able to directly control the timing yield via a single parameter. It is able to handle the variation-aware optimization on layer assignment from a global point of view and does not need assumptions on the variation distributions. In addition, the hierarchical optimization allows the stochastic program to be parallelized in a multi-core computing environment. Furthermore, the proposed approach uses the Latin Hypercube sampling technique for efficient Monte Carlo simulations. The main contributions of the paper are summarized as follows.

• A stochastic programming method is proposed for variation-aware layer assignment with timing constraints and cost consideration. It allows the usage of a parameter to control the yield of the obtained layer assignment solution.

• The stochastic programming formulation is able to handle the variation-aware optimization on layer assignment from a global point of view, and it does not need assumptions on the variation distributions.

• The new algorithm can significantly improve the timing yield compared to the nominal design and significantly reduce the wire cost compared to the worst-case design.

The rest of the paper is organized as follows: Section II introduces integer non-linear programming based layer assignment without considering variations, which is also the foundation for the proposed variation-aware design. Section III studies variations in layer assignment and proposes a parallel stochastic programming based variation-aware layer assignment. Section IV-B presents the experimental results with analysis. A summary of the work is given in Section V.

II. INTEGER NON-LINEAR PROGRAMMING BASED LAYER ASSIGNMENT APPROACH
For ease of presentation, this work first studies the layer assignment method without considering the variations, which is based on integer programming. This method is also the foundation for our proposed variation-aware design, which will be introduced in the next section. The following subsections describe the problem formulation and the integer programming formulation for layer assignment without considering variations.


A. PROBLEM FORMULATION
Without considering the variations, our minimum cost layer assignment formulation with timing constraints follows the one in [6]. For completeness, it is included as follows. A buffered routing tree $T = (V, E)$ is given as the input, where $V$ consists of the driver, sinks, Steiner nodes and buffer locations, $E \subseteq V \times V$, and $n = |V|$. Denote by $v_r$, $V_t(T)$, $V_b(T)$ the driver, the set of sinks and the set of buffers, respectively. The buffers in the buffered tree can be computed by various buffering techniques such as [14] and [15]. Given a buffer $b$, denote by $C_b$, $R_b$, $K_b$ the input capacitance, driving resistance and intrinsic delay, respectively.

A set of $m$ routing layers, denoted by $L = \{l_1, l_2, \ldots, l_m\}$, is also given. Given an edge $e = (u, v)$ on a layer $l$, let $D_{e,l}$ denote its delay. In this work, for simplicity in illustrating our approach, the Elmore delay model is adopted, which is widely used in physical synthesis [5]. Namely, $D_{e,l} = R_{e,l} \cdot (C_{e,l}/2 + C(v))$, where $R_{e,l}$, $C_{e,l}$, $C(v)$ refer to the edge resistance, the edge capacitance and the load capacitance viewed at the endpoint of the edge, respectively. Note that the accuracy of the Elmore delay model can be further improved by the technique in [16]. In fact, since our problem is formulated as a general non-linear integer mathematical program, more complicated and accurate delay models can be used.
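The Elmore expression above can be checked numerically with a few lines of Python. This is only a minimal sketch: the function name and the numeric values are illustrative assumptions, not part of the original formulation.

```python
def elmore_wire_delay(r_wire: float, c_wire: float, c_load: float) -> float:
    """Elmore delay of one wire segment: D = R * (C / 2 + C_load)."""
    return r_wire * (c_wire / 2.0 + c_load)


# Hypothetical values: wire resistance 20, wire capacitance 2, downstream load 5.
delay = elmore_wire_delay(20.0, 2.0, 5.0)
print(delay)  # 20 * (1 + 5) = 120.0
```

A thicker layer would typically enter this function with a smaller `r_wire`, which is why moving a wire to such a layer can reduce its delay.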

FIGURE 1. An example on layer assignment.

Each sink $v \in V_t(T)$ is associated with a sink capacitance denoted by $C_v$ and a required arrival time denoted by $RAT(v)$. $RAT(v)$ specifies the latest time a signal needs to arrive at the sink $v$. The driver $v_r$ is associated with an arrival time, denoted by $AT(driver)$. A net is said to satisfy the timing constraint if the arrival time at each sink is no greater than its required arrival time. Equivalently, this means that the required arrival time at the driver is no earlier than its arrival time $AT(driver)$. Given an edge $e$ on a layer $l$, denote by $w_{e,l}$ the cost of the edge. For example, wire cost could be defined using wire area or wire congestion estimation. In this paper, to illustrate the effectiveness of the approach, wire area is used as in [6]. Let tree cost denote the cost of a layer assignment for the whole tree. It is defined as the sum of the costs of all the edges in a layer assignment solution. Our problem is to meet a timing target by layer assignment with minimal tree cost. It can be formulated as follows [6].

Timing Constrained Minimum Cost Layer Assignment: Given a buffered binary routing tree $T = (V, E)$, a set of routing layers $L$, and the cost of each wire on each layer, compute a layer assignment solution such that the required arrival time at the driver is no earlier than its arrival time and the tree cost is minimized.

Take Figure 1 as an example. There are $m$ layers for the simple two-pin net, and different layers have different wire costs. It is desirable to compute the layer assignment solution with the minimum cost satisfying the timing constraint. For example, the layer $l_2$ will be used if the timing constraint is 32 ps in the example.
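For a two-pin net, the problem reduces to picking the cheapest layer whose Elmore delay still meets the timing target. The sketch below illustrates this selection. The layer parameters, costs, and the function name are hypothetical and are not taken from Figure 1.

```python
def cheapest_feasible_layer(layers, c_sink, rat_sink, at_driver):
    """Return (name, cost) of the cheapest layer meeting timing, or None.

    layers: list of (name, R, C, cost) tuples for the single wire of the net.
    Timing is met when the required arrival time propagated to the driver
    side of the wire is no earlier than the arrival time at_driver.
    """
    best = None
    for name, r, c, cost in layers:
        q_driver = rat_sink - r * (c / 2.0 + c_sink)  # Elmore delay subtracted from RAT
        if q_driver >= at_driver and (best is None or cost < best[1]):
            best = (name, cost)
    return best


# Hypothetical layers: thicker layers have lower resistance but higher cost.
LAYERS = [("l1", 20.0, 2.0, 1.0), ("l2", 10.0, 4.0, 2.0), ("l3", 5.0, 6.0, 4.0)]
print(cheapest_feasible_layer(LAYERS, c_sink=5.0, rat_sink=100.0, at_driver=0.0))
```

With these assumed numbers, `l1` is too slow, and `l2` is preferred over `l3` because it is cheaper while still meeting the target.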

B. INTEGER PROGRAMMING FORMULATION FOR LAYER ASSIGNMENT WITHOUT CONSIDERING VARIATIONS
We first formulate the timing driven layer assignment problem as an integer programming problem. This is in the same spirit as the dynamic programming formulation in [6] and [14]. Note that the timing driven layer assignment problem without considering variations is already an NP-complete problem, as shown in [6]. For convenience of illustration, the routing tree is modified by adding some nodes to form some zero-length wires. Precisely, it is modified such that there is one node immediately before and one node immediately after any buffer location. In addition, a node is introduced on each branch at a branching node. Refer to Figure 2 for an example. The wires between any adjacent nodes are classified into three categories, namely, standard wire, branch wire and buffer wire. A branch wire refers to a zero-length wire formed by adding extra nodes at the end of each branch. A buffer wire refers to a zero-length wire formed by adding extra nodes before and after a buffer. All other wires are standard wires. In Figure 2, wires (8, 4), (8, 7) are branch wires, wires (10, 9), (3, 2) are buffer wires, and wires (2, 1), (4, 3), (6, 5), (7, 6), (9, 8) are standard wires. The constraints associated with each type of wire are formulated as follows.

FIGURE 2. A modified routing tree where more nodes are added without changing the topology of the tree.

Standard wire: Given a standard wire $e$, introduce binary variables $x_{e,l}$ for the layer assignment on $e$. That is, let $x_{e,l} = 1$ denote assigning $e$ to layer $l$. Any edge $e$ can only be assigned to one layer. Thus, $x_{e,l_1} + x_{e,l_2} + \cdots + x_{e,l_m} = 1$. Given a node $v$, let $Q(v)$ denote the required arrival time at $v$ and $C(v)$ the downstream capacitance when viewing at $v$. Recall that $R_{e,l}$, $C_{e,l}$ denote the wire resistance and capacitance of $e$ when assigning $e$ to layer $l$, respectively. Consider a standard wire $e = (u, v)$. $Q(u)$ can be computed as

$$Q(u) = Q(v) - \sum_{i=1}^{m} \left[ R_{e,l_i} \cdot \left( C_{e,l_i}/2 + C(v) \right) \right] x_{e,l_i}.$$

Note that in the above, only one $x_{e,l_i}$ will be 1 due to the constraint $x_{e,l_1} + x_{e,l_2} + \cdots + x_{e,l_m} = 1$. $C(u)$ can be computed as

$$C(u) = C(v) + \sum_{i=1}^{m} C_{e,l_i} x_{e,l_i}.$$

Buffer wire: When a buffer is reached, $C$ will be set to the buffer capacitance since layer assignment will not change the buffer. Otherwise, ECO placement needs to be invoked, which is not desired in the late design stage. $Q$ will be updated considering the buffer delay. For the buffer wire $(v_1, v_2) = (3, 2)$ in Figure 2, we have

$$Q(v_1) = Q(v_2) - D_b(v_2), \qquad C(v_1) = C_b, \tag{1}$$

where $C_b$ is the buffer capacitance for buffer $b$ and $D_b(v)$ is the buffer delay computed as $D_b(v) = R_b \cdot C(v) + K_b$, where $R_b$ and $K_b$ are the buffer resistance and intrinsic delay, respectively. Note that the driver can be treated in the same way as a buffer (e.g., the buffer wire for the driver is (10, 9) in Figure 2). The timing constraint says that $Q(driver) \ge AT(driver)$ at the input of the driver.

Branch wire: Suppose that two branches $B_1$ and $B_2$ meet at a branching node $v$ and the newly introduced nodes are $v_1$ and $v_2$. For example, $(v, v_1) = (8, 4)$, $(v, v_2) = (8, 7)$ in Figure 2. $Q(v)$, $C(v)$ can be computed as follows.

$$Q(v) = \min\{Q(v_1), Q(v_2)\}, \qquad C(v) = C(v_1) + C(v_2). \tag{2}$$

Note that the minimum is taken since worst-case performance needs to be guaranteed. This follows the dynamic programming procedure in [6] and [14].
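The three update rules (standard wire, buffer wire, branch wire) can be sketched as small Python helpers that propagate the pair $(Q, C)$ from a downstream node to an upstream node for a fixed layer choice. The function names and the numbers in the usage lines are illustrative assumptions.

```python
def standard_wire(q_v, c_v, r_wire, c_wire):
    """Propagate (Q, C) across a standard wire e = (u, v) on its chosen layer."""
    q_u = q_v - r_wire * (c_wire / 2.0 + c_v)  # subtract the Elmore wire delay
    c_u = c_v + c_wire                         # wire capacitance adds to the load
    return q_u, c_u


def buffer_wire(q_v2, c_v2, r_b, k_b, c_b):
    """Propagate (Q, C) across a buffer wire (v1, v2), as in Eqn. (1)."""
    q_v1 = q_v2 - (r_b * c_v2 + k_b)  # buffer delay D_b = R_b * C(v2) + K_b
    return q_v1, c_b                  # the buffer shields the downstream load


def branch_node(children):
    """Merge the (Q, C) pairs of the branches meeting at a node, as in Eqn. (2)."""
    q = min(q_child for q_child, _ in children)  # the worst-case branch dominates
    c = sum(c_child for _, c_child in children)  # branch capacitances add up
    return q, c


# Hypothetical numbers: sink RAT 200, sink capacitance 5, wire R = 20, C = 2.
q, c = standard_wire(200.0, 5.0, 20.0, 2.0)
q, c = buffer_wire(q, c, r_b=5.0, k_b=1.0, c_b=5.0)
print(branch_node([(q, c), (60.0, 3.0)]))
```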

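Recall that $C_v$ denotes the sink capacitance when $v$ is a sink, and $AT(driver)$ and $RAT(v)$ denote the arrival time at the driver and the required arrival time at $v$, respectively. $w_{e,l}$ denotes the wire cost of a wire $e$ at layer $l$. The timing constrained minimum cost layer assignment can be formulated as in Eqn. (3).

$$
\begin{aligned}
\min\ & \sum_{\forall \text{ standard wire } e}\ \sum_{i=1}^{m} w_{e,l_i}\, x_{e,l_i} \\
\text{s.t.}\ & Q(v_j) - \sum_{i=1}^{m} \left[ R_{e,l_i} C_{e,l_i}/2 + R_{e,l_i} C(v_j) \right] x_{e,l_i} = Q(v_i) && \forall \text{ standard wire } e = (v_i, v_j) \\
& C(v_j) + \sum_{i=1}^{m} C_{e,l_i}\, x_{e,l_i} = C(v_i) && \forall \text{ standard wire } e = (v_i, v_j) \\
& Q(v_i) \le Q(v_j) && \forall \text{ branch wire } e = (v_i, v_j) \\
& C(v_i) = \sum_{\forall \text{ branch wire } e = (v_i, v_j)} C(v_j) \\
& Q(v_i) = Q(v_j) - \left[ R_{b,v_i} \cdot C(v_j) + K_{b,v_i} \right] && \forall \text{ buffer wire } e = (v_i, v_j) \text{ and buffer } b \text{ is located} \\
& C(v_i) = C_{b,v_i} && \forall \text{ buffer wire } e = (v_i, v_j) \text{ and buffer } b \text{ is located} \\
& Q(driver) \ge AT(driver) && \text{at the input of the driver} \\
& Q(v) = RAT(v) && \forall v \text{ is a sink} \\
& C(v_i) = C_{v_i} && v_i \text{ is a sink} \\
& x_{e,l_i} \in \{0, 1\} && \forall e, i.
\end{aligned}
\tag{3}
$$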

To concentrate on illustrating our main idea, this work does not consider via delay. However, it can be easily handled by modifying the constraints since vias can be modeled as resistors and capacitors as well. For example, for two connecting wires $(v_1, v_2)$ and $(v_2, v_3)$ with two layers, without handling vias the capacitance at $v_2$ is $C(v_2) = C_{(v_2,v_3),l_1} x_{(v_2,v_3),l_1} + C_{(v_2,v_3),l_2} x_{(v_2,v_3),l_2} + C(v_3)$. To handle vias, $C(v_2) = C_{(v_2,v_3),l_1} x_{(v_2,v_3),l_1} + C_{(v_2,v_3),l_2} x_{(v_2,v_3),l_2} + C(v_3) + x_{(v_1,v_2),l_1} \cdot x_{(v_2,v_3),l_2} \cdot C_{via,l_1,l_2} + x_{(v_1,v_2),l_2} \cdot x_{(v_2,v_3),l_1} \cdot C_{via,l_1,l_2}$, where $C_{via,l_1,l_2}$ is the capacitance of the via connecting $l_1$ and $l_2$. This just makes the integer non-linear program more complicated but does not impact our algorithmic framework.
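The via term is active only when the two connected wires sit on different layers. The following sketch evaluates the two-layer expression for $C(v_2)$ for given binary choices. The function name, argument layout, and numbers are assumptions made for illustration.

```python
def cap_at_v2(x12, x23, c_wire23, c_v3, c_via):
    """Capacitance at v2 with a via term, for two layers l1 and l2.

    x12, x23: (x_l1, x_l2) binary layer indicators of wires (v1,v2) and (v2,v3).
    c_wire23: (C on l1, C on l2) of wire (v2,v3).
    """
    base = c_wire23[0] * x23[0] + c_wire23[1] * x23[1] + c_v3
    # The product of indicators is 1 only when the two wires use different layers.
    via = (x12[0] * x23[1] + x12[1] * x23[0]) * c_via
    return base + via


# Hypothetical values: wire caps (2, 4), load 5 at v3, via capacitance 0.5.
print(cap_at_v2((1, 0), (0, 1), (2.0, 4.0), 5.0, 0.5))  # layer change: via counted
print(cap_at_v2((1, 0), (1, 0), (2.0, 4.0), 5.0, 0.5))  # same layer: no via
```

The product of two binary variables is what makes the program non-linear in the layer variables once vias are included.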

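It is helpful to illustrate Eqn. (3) using the simple example in Figure 2. Assuming that both buffers are of type $b$ in Figure 2, the corresponding integer programming formulation is given in Eqn. (4).

$$
\begin{aligned}
\min\ & \sum_{e \in \{(2,1),(4,3),(6,5),(7,6),(9,8)\}}\ \sum_{i=1}^{m} w_{e,l_i}\, x_{e,l_i} \\
\text{s.t.}\ & Q(1) - \sum_{i=1}^{m} \left[ R_{(2,1),l_i} C_{(2,1),l_i}/2 + R_{(2,1),l_i} C(1) \right] x_{(2,1),l_i} = Q(2), \\
& Q(3) - \sum_{i=1}^{m} \left[ R_{(4,3),l_i} C_{(4,3),l_i}/2 + R_{(4,3),l_i} C(3) \right] x_{(4,3),l_i} = Q(4), \\
& Q(5) - \sum_{i=1}^{m} \left[ R_{(6,5),l_i} C_{(6,5),l_i}/2 + R_{(6,5),l_i} C(5) \right] x_{(6,5),l_i} = Q(6), \\
& Q(6) - \sum_{i=1}^{m} \left[ R_{(7,6),l_i} C_{(7,6),l_i}/2 + R_{(7,6),l_i} C(6) \right] x_{(7,6),l_i} = Q(7), \\
& Q(8) - \sum_{i=1}^{m} \left[ R_{(9,8),l_i} C_{(9,8),l_i}/2 + R_{(9,8),l_i} C(8) \right] x_{(9,8),l_i} = Q(9), \\
& C(1) + \sum_{i=1}^{m} C_{(2,1),l_i}\, x_{(2,1),l_i} = C(2), \\
& C(3) + \sum_{i=1}^{m} C_{(4,3),l_i}\, x_{(4,3),l_i} = C(4), \\
& C(5) + \sum_{i=1}^{m} C_{(6,5),l_i}\, x_{(6,5),l_i} = C(6), \\
& C(6) + \sum_{i=1}^{m} C_{(7,6),l_i}\, x_{(7,6),l_i} = C(7), \\
& C(8) + \sum_{i=1}^{m} C_{(9,8),l_i}\, x_{(9,8),l_i} = C(9), \\
& Q(8) \le Q(4), \\
& Q(8) \le Q(7), \\
& C(8) = C(4) + C(7), \\
& Q(3) = Q(2) - \left[ R_{b,3} \cdot C(2) + K_{b,3} \right], \\
& Q(10) = Q(9) - \left[ R_{b,10} \cdot C(9) + K_{b,10} \right], \\
& C(3) = C_{b,3}, \\
& C(10) = C_{b,10}, \\
& Q(driver) = Q(10) \ge AT(driver) = AT(10), \\
& Q(1) = RAT(1), \\
& Q(5) = RAT(5), \\
& C(1) = C_1, \\
& C(5) = C_5, \\
& x_{e,l_i} \in \{0, 1\}\ \ \forall e, i.
\end{aligned}
\tag{4}
$$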
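For a tree as small as the one in Figure 2, Eqn. (4) can be solved by plain enumeration, which makes the structure of the program easy to inspect. The sketch below assumes two layers and hypothetical electrical values and costs. Exhaustive search stands in for an integer programming solver and is not the method used in the paper.

```python
from itertools import product

WIRES = [(2, 1), (4, 3), (6, 5), (7, 6), (9, 8)]  # standard wires of Figure 2
R = (20.0, 10.0)   # hypothetical wire resistance on l1, l2
C = (2.0, 4.0)     # hypothetical wire capacitance on l1, l2
W = (1.0, 2.0)     # hypothetical wire cost on l1, l2
C_GATE, R_BUF, K_BUF = 5.0, 5.0, 1.0  # hypothetical sink/buffer/driver parameters


def driver_rat(assign, rat):
    """Evaluate Q(10) of Eqn. (4) for an assignment {wire: layer index}."""
    def wire(e, q, c):
        l = assign[e]
        return q - R[l] * (C[l] / 2.0 + c), c + C[l]

    def buffer(q, c):
        return q - (R_BUF * c + K_BUF), C_GATE

    q2, c2 = wire((2, 1), rat, C_GATE)   # sink 1 -> node 2
    q3, c3 = buffer(q2, c2)              # buffer wire (3, 2)
    q4, c4 = wire((4, 3), q3, c3)
    q6, c6 = wire((6, 5), rat, C_GATE)   # sink 5 -> node 6
    q7, c7 = wire((7, 6), q6, c6)
    q8, c8 = min(q4, q7), c4 + c7        # branch wires (8, 4) and (8, 7)
    q9, c9 = wire((9, 8), q8, c8)
    return buffer(q9, c9)[0]             # driver treated as buffer wire (10, 9)


def solve(rat, at_driver):
    """Return (cost, assignment) of the cheapest feasible assignment, or None."""
    best = None
    for layers in product(range(len(R)), repeat=len(WIRES)):
        assign = dict(zip(WIRES, layers))
        cost = sum(W[l] for l in layers)
        if driver_rat(assign, rat) >= at_driver and (best is None or cost < best[0]):
            best = (cost, assign)
    return best


print(solve(rat=3000.0, at_driver=2400.0))
```

Enumeration visits $m^{|E|}$ assignments, so it is useful only as a reference for tiny nets.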

Note that in Eqn. (3), $C(v)$, $Q(v)$, $x_{e,l}$ are variables, and Eqn. (3) is not a linear program due to the product $C(v)\,x_{e,l}$. More importantly, when variations are considered, the constants in Eqn. (3) (which are coefficients and constant terms), such as the wire capacitance $C_{e,l}$, are random variables, as described in Section III-A. For example, each $C_{e,l}$ can be modeled as

$$C_{e,l} = C^0_{e,l} + \frac{\partial C}{\partial W} \sum_{(a,b)} \alpha_{(a,b)}\, \Delta W_{(a,b)} + \frac{\partial C}{\partial T} \sum_{(a,b)} \beta_{(a,b)}\, \Delta T_{(a,b)} + \delta_{e,l} + \varepsilon, \tag{5}$$

where $\Delta W$ and $\Delta T$ are random variables. In addition, buffer resistances (or capacitances and intrinsic delays) are written as $R_{b,v_i}$ to incorporate the fact that the buffer resistance at different locations $v_i$ may have different variations. The key problem is to solve an integer program considering the uncertainty in the coefficients and constant terms.

III. VARIATION-AWARE STOCHASTIC LAYER ASSIGNMENT
The focus of this paper is variation-aware design. To handle variations in layer assignment, a parallel stochastic optimization method based on the integer programming layer assignment method is proposed. The variation model and traditional stochastic programming are first studied. After that, a parallel yield-driven stochastic optimization framework is proposed.

A. VARIATION MODELS
In this work, the variations on gates and wires are considered. To model the circuit variations, the widely accepted modeling technique proposed in [17] is used. The model is a first order approximation of a general non-linear variation model and is able to capture the major components of most variations [17]. This model has been extensively used in statistical timing analysis and optimization works such as [4] and [18]. In this paper, to illustrate the effectiveness of the proposed approach, for interconnects, variations on metal width $W$ and metal thickness $T$ are considered. However, our approach is not limited to these variations, and other variations can be handled.

For a wire $e$ in layer $l$, the wire capacitance can be modeled as

$$C_{e,l} = C^0_{e,l} + \frac{\partial C}{\partial W} \Delta W + \frac{\partial C}{\partial T} \Delta T + \delta_{e,l} + \varepsilon, \tag{6}$$

where $C^0_{e,l}$ refers to the nominal value, and $\frac{\partial C}{\partial W}$ and $\frac{\partial C}{\partial T}$ refer to the sensitivities of wire capacitance to metal width and metal thickness, respectively. $\Delta W$ and $\Delta T$ are random variables which refer to variations in metal width and metal thickness, respectively. $\delta_{e,l}$ and $\varepsilon$ are random variables which refer to the local and global variations. We similarly model the wire resistance. Since the layer assignment is applied to a buffered routing tree, variations on the driver, buffers and sinks also need to be considered. In this work, to illustrate the effectiveness of the proposed approach, variations on gate length $L$ and threshold voltage $V_t$ are considered, and other variations can be handled as well. They can be modeled similarly. For example, gate capacitance can be modeled as

$$C_g = C_{g0} + \frac{\partial C}{\partial L} \Delta L + \frac{\partial C}{\partial V} \Delta V + \delta_g + \varepsilon. \tag{7}$$

As in [18], to handle spatial correlation, which is especially important for a global net, a mesh is layered on the circuit. Each segment of wire in a grid will be indexed by $(a, b)$. Following [18], taking spatial correlation into consideration, we have

$$C_{e,l} = C^0_{e,l} + \frac{\partial C}{\partial W} \sum_{(a,b)} \alpha_{(a,b)}\, \Delta W_{(a,b)} + \frac{\partial C}{\partial T} \sum_{(a,b)} \beta_{(a,b)}\, \Delta T_{(a,b)} + \delta_{e,l} + \varepsilon, \tag{8}$$

where $\alpha$, $\beta$ are the parameters, and $\Delta W_{(a,b)}$ and $\Delta T_{(a,b)}$ are independent components which can be obtained by performing principal component analysis on the correlated random variations. The models of gate capacitance and resistance considering spatial correlations can be similarly derived. Refer to [18] for more details.
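To make the first-order model concrete, the sketch below draws one sample of $C_{e,l}$ following the structure of Eqn. (6), without the spatial-correlation sums of Eqn. (8). The Gaussian distributions, the standard deviations, and all names are assumptions chosen only for illustration. The approach itself does not require a particular distribution, so any sampler could be substituted.

```python
import random


def sample_wire_cap(c0, dc_dw, dc_dt, sigma_w, sigma_t,
                    sigma_local, sigma_global, rng):
    """Draw one sample of C_{e,l} = C0 + dC/dW*dW + dC/dT*dT + delta + eps."""
    d_w = rng.gauss(0.0, sigma_w)          # metal width variation (assumed Gaussian)
    d_t = rng.gauss(0.0, sigma_t)          # metal thickness variation (assumed Gaussian)
    delta = rng.gauss(0.0, sigma_local)    # local variation term
    eps = rng.gauss(0.0, sigma_global)     # global variation term
    return c0 + dc_dw * d_w + dc_dt * d_t + delta + eps


rng = random.Random(0)  # fixed seed so that the run is repeatable
samples = [sample_wire_cap(2.0, 0.5, 0.3, 0.1, 0.1, 0.1, 0.1, rng)
           for _ in range(10000)]
print(sum(samples) / len(samples))  # close to the nominal value 2.0
```

In a full flow the global term would be drawn once per Monte Carlo sample and shared by all wires, while this sketch draws it per call for brevity.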

B. TWO-STAGE STOCHASTIC PROGRAMMING
Stochastic programming is a popular technique for solving mathematical programs with uncertainty. It has been applied to various fields such as scheduling, telecommunications, computational finance, and production control [19]. One of the most well-known stochastic programming techniques is two-stage stochastic programming [19].

In the first stage, one makes an optimization decision (i.e., computes a layer assignment solution) without considering the wire variations. In the second stage, when the variations are considered, the first-stage solution is adjusted such that a certain objective is optimized (e.g., the delay of the circuit is minimized). Simply optimizing the two stages separately would lead to an inferior solution since it relies solely on the second-stage adjustment to improve the solution quality. Thus, stochastic programming with recourse is often used to link the two stages. It computes the solution minimizing the expected delay considering the variations, which involves the second stage, and it is also an iterative procedure.





Following the conventions in the stochastic programming literature [20], the above can be formulated as follows. For convenience of illustration, only the stochastic linear program is shown.

$$
\begin{aligned}
\min\ & c^T x + E[Q(x, \vartheta(\omega))] \\
\text{s.t.}\ & Ax \ge b \\
& x \ge 0,
\end{aligned}
\tag{9}
$$

where $c^T x$ denotes the optimization objective without variations (e.g., nominal design), $E[\cdot]$ denotes the expected value, and $Q(x, \vartheta(\omega))$ denotes the optimal objective value for the second-stage problem. $E[Q(x, \vartheta(\omega))]$ serves as a penalty function (e.g., additional delay due to variations). The second-stage problem is defined as follows.

$$
\begin{aligned}
\min\ & p(\omega)^T y \\
\text{s.t.}\ & T(\omega)x + W(\omega)y = h(\omega) \\
& y \ge 0,
\end{aligned}
\tag{10}
$$

where $\omega$ are random variables, $p(\omega)$, $T(\omega)$, $W(\omega)$ are random coefficients and $h(\omega)$ are random constants. Note that in the first-stage problem, $x$ is the variable, while in the second-stage problem, $y$ is the variable and $x$ is treated as a constant. The above $\vartheta(\omega)$ refers to the tuples consisting of $p(\omega)$, $T(\omega)$, $h(\omega)$. $E[\cdot]$ is taken with respect to the probability distribution of $\omega$. To solve the two-stage stochastic linear program, a commonly used technique is the sample average approximation technique [12], [13], [20]. In this method, roughly speaking, random samples are generated in each iteration according to the probability density function of $\omega$, and the expected value $E[Q(x, \vartheta(\omega))]$ is approximated by averaging the $Q$ values over the samples. This procedure is iterated until some stopping criterion is met.
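The averaging step of sample average approximation is simple to state in code. The sketch below estimates $E[Q(x, \vartheta(\omega))]$ for a fixed first-stage decision $x$. The toy second-stage function, the uniform sampler, and all names are hypothetical stand-ins and do not come from the layer assignment problem.

```python
import random


def saa_expected_value(second_stage, x, draw_omega, n_samples, rng):
    """Approximate E[Q(x, omega)] by the mean of Q over n_samples random draws."""
    total = 0.0
    for _ in range(n_samples):
        omega = draw_omega(rng)          # one realization of the random data
        total += second_stage(x, omega)  # optimal second-stage value for this draw
    return total / n_samples


# Toy second stage (an assumption): a penalty equal to the shortfall of x below omega.
def shortfall(x, omega):
    return max(0.0, omega - x)


rng = random.Random(0)
estimate = saa_expected_value(shortfall, 0.5, lambda r: r.uniform(0.0, 1.0), 20000, rng)
print(estimate)  # the exact expectation for this toy case is 0.125
```

In the two-stage setting, each call to `second_stage` would itself be an optimization as in Eqn. (10), which is part of what makes the iterative procedure expensive.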

The main weakness of the above framework is that it triesto minimize the expected delay that is only weakly related tothe timing yield.

C. SINGLE-STAGE STOCHASTIC OPTIMIZATION FRAMEWORK
The target of variation-aware optimization is actually to compute the best solution satisfying a yield target. In the context of layer assignment, this means that the minimum cost layer assignment should be computed such that, for example, 99% of circuits satisfy a timing constraint $T$. The above classic stochastic programming framework does not apply in this case. The previous stochastic programming based gate sizing technique in [4] follows the above framework and thus cannot directly control the yield either.

Our idea is to formulate the stochastic layer assignment problem as a single integer program instead of a two-stage integer program (with two separate integer programs). This allows us to introduce a parameter called the robust parameter, denoted by $\gamma$, to directly control the yield. Further, such a new framework allows us to easily parallelize the algorithm. Our idea in this paper is motivated by, and in the same spirit as, the previous works [12], [13]. However, the works [12], [13] use the robust parameter idea to tackle embedded system design problems but not the layer assignment problem.

1) THE SINGLE-STAGE STOCHASTIC PROGRAM
The traditional two-stage stochastic program iterates between yield factor tuning and yield factor evaluation until it satisfies the stopping criteria. In contrast, the proposed single-stage stochastic program for layer assignment is designed to explicitly control the yield value by a one-step integer program. For this layer assignment problem, without loss of solution quality, the proposed single-stage method has a better runtime than the traditional two-stage one. We will transform the integer program with random coefficients in Eqn. (3) into an integer program without random variables. This is accomplished by instantiating random coefficients to deterministic constants through Monte Carlo samples. Let us denote the constraints in Eqn. (3) by $C(\Omega)$, where $\Omega$ is an instance of the set of all coefficients/constants in Eqn. (3). That is, an $\Omega$ corresponds to a Monte Carlo sample. Since in a Monte Carlo sample the wire width and wire thickness for all wires in all layers are known, the wire capacitances and wire resistances for all wires in all layers can be obtained. More importantly, they are constants in each Monte Carlo sample. Note that they are the coefficients/constants in Eqn. (3). The above also applies to the gate capacitances and resistances. Suppose that we have randomly generated $k$ Monte Carlo samples for the variation-aware layer assignment problem. Let $\Omega_i$ denote the $i$-th of the $k$ Monte Carlo samples. The new integer programming formulation is as follows.

$$
\begin{aligned}
\min\ & \sum_{\forall \text{ standard wire } e}\ \sum_{i=1}^{m} w_{e,l_i}\, x_{e,l_i} \\
\text{s.t.}\ & C(\Omega_1) \\
& C(\Omega_2) \\
& \ldots \\
& C(\Omega_k) \\
& x_{e,l_i} \in \{0, 1\}\ \ \forall e, i.
\end{aligned}
\tag{11}
$$

For example, Eqn. (4) can be instantiated as follows. For simplicity, suppose that there are only two layers, namely, $m = 2$. Further suppose that there are two Monte Carlo samples. In the first Monte Carlo sample, $C_{e,l_1} = 2$ and $C_{e,l_2} = 4$ for all wires $e$, and $R_{e,l_1} = 20$ and $R_{e,l_2} = 10$ for all wires $e$. The capacitance of each buffer/sink/driver is 5 and the resistance is 5. The buffer/driver intrinsic delay is 1. In the second Monte Carlo sample, $C_{e,l_1} = 4$ and $C_{e,l_2} = 8$ for all wires $e$, and $R_{e,l_1} = 40$ and $R_{e,l_2} = 20$ for all wires $e$. The capacitance of each buffer/sink/driver is 10 and the resistance is 10. The buffer/driver intrinsic delay is 2.
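The two samples above can be plugged into the constraints of Eqn. (4) to see how the same layer assignment yields a different required arrival time at the driver in each sample. The sketch below uses the sample values from the text. The sink RAT of 3000 and the chosen assignment are hypothetical, since the text does not specify them.

```python
# Values of the two Monte Carlo samples as given in the text (index 0 = l1, 1 = l2).
SAMPLE_1 = {"R": (20.0, 10.0), "C": (2.0, 4.0), "Cg": 5.0, "Rb": 5.0, "Kb": 1.0}
SAMPLE_2 = {"R": (40.0, 20.0), "C": (4.0, 8.0), "Cg": 10.0, "Rb": 10.0, "Kb": 2.0}

WIRES = [(2, 1), (4, 3), (6, 5), (7, 6), (9, 8)]  # standard wires of Figure 2


def driver_rat(assign, s, rat):
    """Return Q_i(10) for the assignment {wire: layer index} under sample s."""
    def wire(e, q, c):
        l = assign[e]
        return q - s["R"][l] * (s["C"][l] / 2.0 + c), c + s["C"][l]

    def buffer(q, c):
        return q - (s["Rb"] * c + s["Kb"]), s["Cg"]

    q2, c2 = wire((2, 1), rat, s["Cg"])
    q3, c3 = buffer(q2, c2)
    q4, c4 = wire((4, 3), q3, c3)
    q6, c6 = wire((6, 5), rat, s["Cg"])
    q7, c7 = wire((7, 6), q6, c6)
    q8, c8 = min(q4, q7), c4 + c7
    q9, c9 = wire((9, 8), q8, c8)
    return buffer(q9, c9)[0]


all_l2 = {e: 1 for e in WIRES}  # hypothetical assignment: every wire on l2
RAT = 3000.0                    # hypothetical required arrival time at both sinks
print(driver_rat(all_l2, SAMPLE_1, RAT))  # 2443.0
print(driver_rat(all_l2, SAMPLE_2, RAT))  # 776.0
```

The layer variables are shared between the two evaluations, while every $Q$ and $C$ value is recomputed per sample, mirroring how the samples are coupled in Eqn. (11).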




Let $Q_i$, $C_i$ denote the variables corresponding to the sample $\Omega_i$. The Monte Carlo simulation based integer program corresponding to Figure 2 is shown in Eqn. (16). Note that this is a deterministic integer program with no random variables. In the above formulation, the target variables are $x_{e,l_i}$ and they are shared by all of $C(\Omega_1), C(\Omega_2), \ldots, C(\Omega_k)$. However, each $C(\Omega_i)$ has a different set of variables for the capacitance $C_i(v)$ and the RAT $Q_i(v)$. In particular, we are interested in $Q_i(driver)$, which is the RAT at the driver (e.g., $Q_i(10)$ in Eqn. (16)) in $C(\Omega_i)$. Since we have $k$ samples, we have $Q_1(driver), Q_2(driver), \ldots, Q_k(driver)$.

Note that by default, we have set all of the $Q_i(driver) \ge AT(driver)$ as in Eqn. (16). This means that we compute the worst-case design, since all of the $k$ samples need to satisfy the timing constraint. We next investigate setting $\frac{1}{k} \sum_{i=1}^{k} Q_i(driver) \ge AT(driver)$ as a constraint. Roughly speaking, this means that we compute a design satisfying the timing constraint in an average-case sense.

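Let $C(\Omega_i) \backslash Q_i(driver)$ denote the set of constraints except the constraint "$Q(driver) \ge AT(driver)$" in Eqn. (3). Roughly speaking, the average-case design can be obtained from

$$
\begin{aligned}
\min\ & \sum_{\forall \text{ standard wire } e}\ \sum_{i=1}^{m} w_{e,l_i}\, x_{e,l_i} \\
\text{s.t.}\ & C(\Omega_1) \backslash Q_1(driver) \\
& C(\Omega_2) \backslash Q_2(driver) \\
& \ldots \\
& C(\Omega_k) \backslash Q_k(driver) \\
& \frac{1}{k} \sum_{i=1}^{k} Q_i(driver) \ge AT(driver) \\
& x_{e,l_i} \in \{0, 1\}\ \ \forall e, i.
\end{aligned}
\tag{12}
$$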

Consider a yield target $Y$, e.g., $Y = 99\%$. We actually want to compute a design such that 99% of the $Q_i(driver)$ satisfy the timing constraint, i.e., are no less than $AT(driver)$. Thus, if we were able to sort all $k$ values $Q_i(driver)$ and require the $0.01k$-th smallest one to be no less than $AT(driver)$, we would compute a design exactly matching the yield requirement with no overdesign. However, one cannot sort the constraints in a mathematical programming formulation. Thus, we propose to use the following approximate method.

Two additional variables are introduced. The first variable $Q_w$ computes the worst-case RAT among all $k$ layer assignment solutions, and the second variable $Q_a$ computes the average-case RAT value among all $k$ layer assignment solutions. Precisely, since a smaller RAT at the driver is worse, $Q_w = \min\{Q_1(driver), Q_2(driver), \ldots, Q_k(driver)\}$ and $Q_a = \frac{1}{k} \sum_{i=1}^{k} Q_i(driver)$. These can certainly be formulated in a mathematical program.

Subsequently, a parameter, called the robust parameter and denoted by $\gamma$, is introduced to control the tradeoff between the worst-case RAT $Q_w$ and the average RAT $Q_a$. Precisely, the constraint is set as $(1 - \gamma) Q_a + \gamma Q_w \ge AT(driver)$. Clearly, setting $\gamma = 0$ leads to the average-case design while setting $\gamma = 1$ leads to the worst-case design. $\gamma$ is directly related to the timing yield, and by varying $\gamma$, different tradeoffs can be obtained. Our formulation is as follows.

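$$
\begin{aligned}
\min\ & \sum_{\forall\,\text{standard wire } e}\ \sum_{i=1}^{m} w_{e,l_i}\, x_{e,l_i} \\
\text{s.t.}\ & C(\xi_1) \setminus Q_1(\text{driver}) \\
& C(\xi_2) \setminus Q_2(\text{driver}) \\
& \qquad\vdots \\
& C(\xi_k) \setminus Q_k(\text{driver}) \\
& Q_w \le Q_i(\text{driver}) \quad \forall i \\
& Q_a = \frac{1}{k} \sum_{i=1}^{k} Q_i(\text{driver}) \\
& (1 - \gamma)\, Q_a + \gamma\, Q_w \ge AT(\text{driver}) \\
& x_{e,l_i} \in \{0, 1\} \quad \forall e, i.
\end{aligned}
\tag{13}
$$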

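In the example of Eqn. (16), this means that we replace

$$
\begin{aligned}
Q_1(10) &\ge AT(\text{driver}) = AT(10) \\
Q_2(10) &\ge AT(\text{driver}) = AT(10)
\end{aligned}
\tag{14}
$$

with

$$
\begin{aligned}
& Q_w \le Q_1(10) \\
& Q_w \le Q_2(10) \\
& Q_a = [Q_1(10) + Q_2(10)]/2 \\
& (1 - \gamma)\, Q_a + \gamma\, Q_w \ge AT(\text{driver}) = AT(10).
\end{aligned}
\tag{15}
$$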

Note again that the above formulation is a deterministic integer program without any random coefficients, and $\gamma$ is a constant. The benefit of the technique is that the variation is handled globally, since the mathematical program solver considers all samples at once each time the program is solved. There are two issues with the above formulation, namely, how large the value of $k$ can be and how to set $\gamma$.
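To make the role of the robust parameter concrete, the following minimal Python sketch evaluates the blended constraint of Eqn. (13) for a given list of driver RAT values. It is only an illustration: the function names and the numeric RAT values are hypothetical, and $Q_w$ is taken to be the smallest driver RAT, which is what the constraints $Q_w \le Q_i(\text{driver})$ enforce at the optimum.

```python
def robust_rat(q_driver, gamma):
    """Blend the average-case and worst-case driver RAT: (1 - gamma)*Qa + gamma*Qw."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    q_w = min(q_driver)                  # worst case: the smallest (most critical) RAT
    q_a = sum(q_driver) / len(q_driver)  # average case over all samples
    return (1.0 - gamma) * q_a + gamma * q_w


def satisfies_timing(q_driver, at_driver, gamma):
    """Check the robust timing constraint against the driver arrival time."""
    return robust_rat(q_driver, gamma) >= at_driver


# Hypothetical driver RATs for k = 4 samples (illustrative values only).
q = [12.0, 10.0, 8.0, 14.0]
for gamma in (0.0, 0.5, 1.0):
    print(gamma, robust_rat(q, gamma), satisfies_timing(q, 9.0, gamma))
```

With these values, increasing $\gamma$ moves the left-hand side from the average RAT toward the worst-case RAT, so the same assignment passes the constraint for small $\gamma$ and fails it at $\gamma = 1$.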

2) HIERARCHICAL OPTIMIZATION

Suppose that $k = 5000$ Monte Carlo samples are used. This means that the resulting integer program is 5000× the size of the original layer assignment integer program in Eqn. (3). Solving such a large system is computationally prohibitive. We first reduce the number of Monte Carlo samples used in the stochastic programming formulation. For this, Latin Hypercube sampling based Monte Carlo simulation is used, which reduces the sample set from the commonly used large number of samples to a small one. Refer to Section III-C.4 for details.

The problem is that even for $k = 200$ Latin Hypercube samples, the integer program in Eqn. (13) would still be too large to be solved efficiently for large signal nets. We use the following hierarchical optimization technique to tackle this difficulty. First, pick $r$ samples out of the $k$ samples and remove them from the $k$ samples. We formulate the integer program as in Eqn. (13), except that it contains only $r$ samples rather than $k$ samples. $r$ is chosen such that the resulting integer program can be solved efficiently using the mathematical programming solver.

After computing the solution, we find the $Q_i(\text{driver})$ closest to $AT(\text{driver})$. This value and the corresponding sample roughly determine the yield on these $r$ samples, since counting the $Q(\text{driver})$ values smaller than the picked $Q_i(\text{driver})$ determines how many samples satisfy the timing constraint. For this reason, the sample corresponding to the picked $Q_i(\text{driver})$ is called a critical sample for those $r$ samples. We then proceed to another $r$ samples out of the remaining $k - r$ samples and repeat the procedure to find the second critical sample. This process is iterated until all $k$ samples have been considered.

In total, we have $\lceil k/r \rceil$ integer programs, each of which contains $r$ samples (except possibly the last one), and $\lceil k/r \rceil$ critical samples are computed. These are used to formulate a new integer program which contains $\lceil k/r \rceil$ samples. If $\lceil k/r \rceil$ samples are still too many to be solved efficiently, we can recursively apply the same decomposition procedure to them. As indicated by our experiments, a two-level decomposition is sufficient. Precisely, we set $k = 200$ and $r = 15$. Each of the 14 first-level integer programs contains $r = 15$ samples (except the last one). After solving all 14 such integer programs, we have 14 critical samples, which are formulated into the second-level integer program. When solving this second-level integer program, we need to guarantee that the worst-case design still satisfies the timing constraint, since these are the representative critical samples of each set. When all of them satisfy the timing constraint, there should be enough samples (namely, the ones less critical than them) satisfying the timing constraint, and therefore the yield target is satisfied. Note also that $r = 15$ is chosen to balance the size of the first-level integer programs against the size of the second-level integer program.
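The bookkeeping of this two-level decomposition can be sketched as follows. This is a minimal Python illustration, not the implementation used in the experiments: the helper names are hypothetical, samples are represented by integer indices, and `q_of` stands in for the driver RAT that would be obtained by solving the corresponding first-level integer program.

```python
import math


def split_into_groups(samples, r):
    """Partition the samples into ceil(k/r) consecutive groups of at most r samples."""
    return [samples[i:i + r] for i in range(0, len(samples), r)]


def critical_sample(group, q_of, at_driver):
    """Return the sample whose driver RAT is closest to AT(driver)."""
    return min(group, key=lambda s: abs(q_of(s) - at_driver))


k, r = 200, 15
samples = list(range(k))            # sample indices stand in for the LHS samples
groups = split_into_groups(samples, r)
print(len(groups), len(groups[-1]))  # number of first-level programs, size of the last one

# Hypothetical driver RATs for a tiny group, used only to show the selection rule.
toy_q = {0: 5.0, 1: 9.5, 2: 12.0}
print(critical_sample([0, 1, 2], toy_q.get, at_driver=10.0))
```

For $k = 200$ and $r = 15$ this yields 14 groups, the last of which holds the remaining 5 samples; one critical sample per group then forms the second-level program.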

Note that $\gamma$ and the yield target are correlated only in a weak sense. The first-level solution to Eqn. (13) (consisting of $r = 15$ samples) depends on $\gamma$. If the yield target is set to 99%, it does not mean that we need the $Q$ of all $r = 15$ samples to be $\ge AT(\text{driver})$. The number of such samples depends on $\gamma$. In fact, according to our experiments, $\gamma$ is typically much smaller than 1 for a 99% yield target. On the other hand, due to the change in sample space between the two levels, the criticality estimation might not be accurate. Therefore, the above decomposition technique is not guaranteed to compute the optimal solution. However, it works well, as indicated by our experiments.

Note that each integer program is hard to solve due to the integer constraints. Thus, we relax each integer constraint to a real-valued constraint, solve the relaxed program using a mathematical programming solver (IPOPT), and then apply the well-known sequential rounding technique to round the real-valued solution to an integer solution. Such a technique has been widely used in VLSI CAD (see, e.g., [21]). For completeness, we briefly describe it as follows. First, the integer constraints are relaxed as $0 \le x_{e,l_i} \le 1, \forall e, i$. The new mathematical program is then solved, and its solution may contain fractional numbers. All variables in the solution that are equal to 0 or 1 are fixed accordingly. This is accomplished by adding new constraints such as $x_{e,l_i} = 1$ to Eqn. (13) if $x_{e,l_i}$ is equal to 1 in the solution of the relaxed mathematical program. In addition, those variables close to 1 or 0 are rounded and fixed. Precisely, variables $< \delta$ are rounded to 0 and variables $> 1 - \delta$ are rounded to 1. If there is no such variable, the one with the smallest rounding error is rounded. This is again accomplished by adding new constraints such as $x_{e,l_i} = 1$ or $x_{e,l_i} = 0$ to Eqn. (13) for some $x_{e,l_i}$. The new mathematical program (with some variables fixed) is then solved again to obtain a new solution. This process is iterated until all variables are integers.
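The rounding loop described above can be sketched in Python as follows. This is a hedged illustration under stated assumptions: `solve_relaxed` is a hypothetical callback standing in for the IPOPT solve of the relaxed program, `make_toy_solver` is a made-up greedy solver for a tiny covering problem (not the layer assignment program), and the value $\delta = 0.1$ is chosen for the example only, since the text does not specify it.

```python
def sequential_rounding(solve_relaxed, variables, delta):
    """Round a relaxed 0/1 program by repeatedly fixing near-integral variables."""
    fixed = {}
    while len(fixed) < len(variables):
        x = solve_relaxed(fixed)                 # relaxed solution honoring fixed values
        free = [v for v in variables if v not in fixed]
        newly = {}
        for v in free:
            if x[v] < delta:
                newly[v] = 0                     # close to 0 (or exactly 0)
            elif x[v] > 1 - delta:
                newly[v] = 1                     # close to 1 (or exactly 1)
        if not newly:
            # No near-integral variable: round the one with the smallest rounding error.
            v = min(free, key=lambda u: min(x[u], 1 - x[u]))
            newly[v] = 1 if x[v] >= 0.5 else 0
        fixed.update(newly)                      # equivalent to adding x = 0 / x = 1 constraints
    return fixed


def make_toy_solver(cost, gain, need):
    """Hypothetical stand-in for the relaxed solve:
    min sum(cost*x) s.t. sum(gain*x) >= need, 0 <= x <= 1, solved greedily."""
    def solve(fixed):
        x = {v: float(fixed.get(v, 0)) for v in cost}
        remaining = need - sum(gain[v] * x[v] for v in fixed)
        free = sorted((u for u in cost if u not in fixed), key=lambda u: cost[u] / gain[u])
        for v in free:
            if remaining <= 0:
                break
            x[v] = min(1.0, remaining / gain[v])
            remaining -= gain[v] * x[v]
        return x
    return solve


cost = {"a": 1.0, "b": 2.0, "c": 3.0}
gain = {"a": 4.0, "b": 4.0, "c": 4.0}
solution = sequential_rounding(make_toy_solver(cost, gain, need=7.0), list(cost), delta=0.1)
print(solution)
```

In this toy run the first relaxed solve returns one fractional variable; the integral ones are fixed first, the program is re-solved, and the remaining variable is then rounded by the smallest-rounding-error rule.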

FIGURE 3. Illustration of the hierarchical optimization. Each integer program can be assigned to a different core (subject to availability), and synchronization is needed only when solving the final (second-level) program over the $\lceil k/r \rceil$ critical samples.

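$$
\begin{aligned}
\min\ & \sum_{e \in \{(2,1),(4,3),(6,5),(7,6),(9,8)\}}\ \sum_{i=1}^{m} w_{e,l_i}\, x_{e,l_i} \\
\text{s.t.}\ 
& Q_1(1) - [20 + 20 C_1(1)]\, x_{(2,1),l_1} - [20 + 10 C_1(1)]\, x_{(2,1),l_2} = Q_1(2), \\
& Q_1(3) - [20 + 20 C_1(3)]\, x_{(4,3),l_1} - [20 + 10 C_1(3)]\, x_{(4,3),l_2} = Q_1(4), \\
& Q_1(5) - [20 + 20 C_1(5)]\, x_{(6,5),l_1} - [20 + 10 C_1(5)]\, x_{(6,5),l_2} = Q_1(6), \\
& Q_1(6) - [20 + 20 C_1(6)]\, x_{(7,6),l_1} - [20 + 10 C_1(6)]\, x_{(7,6),l_2} = Q_1(7), \\
& Q_1(8) - [20 + 20 C_1(8)]\, x_{(9,8),l_1} - [20 + 10 C_1(8)]\, x_{(9,8),l_2} = Q_1(9), \\
& C_1(1) + 2 x_{(2,1),l_1} + 4 x_{(2,1),l_2} = C_1(2), \\
& C_1(3) + 2 x_{(4,3),l_1} + 4 x_{(4,3),l_2} = C_1(4), \\
& C_1(5) + 2 x_{(6,5),l_1} + 4 x_{(6,5),l_2} = C_1(6), \\
& C_1(6) + 2 x_{(7,6),l_1} + 4 x_{(7,6),l_2} = C_1(7), \\
& C_1(8) + 2 x_{(9,8),l_1} + 4 x_{(9,8),l_2} = C_1(9), \\
& Q_1(4) \ge Q_1(8), \quad Q_1(7) \ge Q_1(8), \\
& C_1(8) = C_1(4) + C_1(7), \\
& Q_1(3) = Q_1(2) - [5 \cdot C_1(2) + 1], \\
& Q_1(10) = Q_1(9) - [5 \cdot C_1(9) + 1], \\
& C_1(3) = 5, \quad C_1(10) = 5, \\
& Q_1(1) = RAT(1), \quad Q_1(5) = RAT(5), \\
& C_1(1) = 5, \quad C_1(5) = 5, \\
& Q_1(\text{driver}) = Q_1(10) \ge AT(\text{driver}) = AT(10), \\[4pt]
& Q_2(1) - [80 + 40 C_2(1)]\, x_{(2,1),l_1} - [80 + 20 C_2(1)]\, x_{(2,1),l_2} = Q_2(2), \\
& Q_2(3) - [80 + 40 C_2(3)]\, x_{(4,3),l_1} - [80 + 20 C_2(3)]\, x_{(4,3),l_2} = Q_2(4), \\
& Q_2(5) - [80 + 40 C_2(5)]\, x_{(6,5),l_1} - [80 + 20 C_2(5)]\, x_{(6,5),l_2} = Q_2(6), \\
& Q_2(6) - [80 + 40 C_2(6)]\, x_{(7,6),l_1} - [80 + 20 C_2(6)]\, x_{(7,6),l_2} = Q_2(7), \\
& Q_2(8) - [80 + 40 C_2(8)]\, x_{(9,8),l_1} - [80 + 20 C_2(8)]\, x_{(9,8),l_2} = Q_2(9), \\
& C_2(1) + 4 x_{(2,1),l_1} + 8 x_{(2,1),l_2} = C_2(2), \\
& C_2(3) + 4 x_{(4,3),l_1} + 8 x_{(4,3),l_2} = C_2(4), \\
& C_2(5) + 4 x_{(6,5),l_1} + 8 x_{(6,5),l_2} = C_2(6), \\
& C_2(6) + 4 x_{(7,6),l_1} + 8 x_{(7,6),l_2} = C_2(7), \\
& C_2(8) + 4 x_{(9,8),l_1} + 8 x_{(9,8),l_2} = C_2(9), \\
& Q_2(4) \ge Q_2(8), \quad Q_2(7) \ge Q_2(8), \\
& C_2(8) = C_2(4) + C_2(7), \\
& Q_2(3) = Q_2(2) - [10 \cdot C_2(2) + 2], \\
& Q_2(10) = Q_2(9) - [10 \cdot C_2(9) + 2], \\
& C_2(3) = 10, \quad C_2(10) = 10, \\
& Q_2(1) = RAT(1), \quad Q_2(5) = RAT(5), \\
& C_2(1) = 10, \quad C_2(5) = 10, \\
& Q_2(\text{driver}) = Q_2(10) \ge AT(\text{driver}) = AT(10), \\[4pt]
& x_{e,l_i} \in \{0, 1\} \quad \forall e, i.
\end{aligned}
\tag{16}
$$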

FIGURE 4. (a) 8 samples generated by standard sampling. (b) 8 samples generated by Latin Hypercube sampling.

3) PARALLEL OPTIMIZING PROCEDURES

The hierarchical optimization of layer assignment makes the large system efficiently solvable. More importantly, it can be easily parallelized in a multi-core programming environment, because the integer programs formed by each set of (first-level) $r$ samples can be solved completely independently. Synchronization is needed only when solving the second-level integer program (consisting of the $\lceil k/r \rceil$ critical samples). This process is illustrated in Figure 3.

It remains to show how to determine $\gamma$. This can be accomplished by searching over $\gamma$ with a certain step size. For example, if the step size is 0.1, one could try $\gamma = 0.1$, then 0.2, then 0.3, and so on, noting that $\gamma = 0$ refers to the average-case design and $\gamma = 1$ refers to the worst-case design. For each solution, we evaluate its yield and return the minimum cost solution satisfying the yield target. This procedure can also be parallelized, since the mathematical programs corresponding to different values of $\gamma$ do not interact with each other, and it is likewise easy to implement in a multi-core programming environment.
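The $\gamma$ search can be sketched with Python's standard `concurrent.futures` module. This is an illustrative sketch only: `solve_for_gamma` is a hypothetical callback that would run the hierarchical optimization for one $\gamma$ and return a (cost, yield) pair, the toy model below is invented for the example, and the grid here also includes the endpoints $\gamma = 0$ and $\gamma = 1$.

```python
from concurrent.futures import ThreadPoolExecutor


def search_gamma(solve_for_gamma, yield_target, step=0.1, workers=4):
    """Evaluate every gamma on a grid in parallel and return the cheapest
    (gamma, cost) whose yield meets the target, or None if none does."""
    n = int(round(1.0 / step))
    gammas = [i * step for i in range(n + 1)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(solve_for_gamma, gammas))  # independent solves
    feasible = [(cost, g) for g, (cost, y) in zip(gammas, results) if y >= yield_target]
    if not feasible:
        return None
    cost, gamma = min(feasible)
    return gamma, cost


def toy_solve(gamma):
    """Made-up model: cost and yield both grow with gamma (illustration only)."""
    return 100.0 + 50.0 * gamma, min(1.0, 0.5 + gamma)


best = search_gamma(toy_solve, yield_target=0.99)
print(best)
```

Threads are used here only to keep the sketch short; for CPU-bound solves in CPython, `ProcessPoolExecutor` from the same module would be the more natural choice, and a native solver invoked from C++ could use any multi-core mechanism.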

4) LATIN HYPERCUBE SAMPLING BASED MONTE CARLO SIMULATION

In our formulation, $k = 200$ Latin Hypercube samples are used. These are sufficient to obtain a close approximation to the full-fledged Monte Carlo simulation technique. The latter uses simple sampling and often needs over 5000 samples, and optimization with that many samples would be very time consuming. Latin Hypercube sampling allows us to sample the variation space more evenly, which avoids generating many samples in a small local region [22]. Consequently, one can cover most of the simulation space with a small number of samples. Such a technique has been used for statistical timing analysis and optimization in the VLSI CAD literature [23], [24], where it demonstrated excellent yield estimation accuracy compared to full-fledged Monte Carlo simulation with 5000 samples. For example, the work [22] shows that, compared to Monte Carlo simulation with 5000 samples, Monte Carlo simulation with 200 Latin Hypercube samples can have an error of <1%. Note that our approach is not restricted to the Latin Hypercube sampling technique. Other sample reduction techniques, such as Quasi-Monte Carlo techniques, can certainly be used.





TABLE 1. Comparison of the layer assignment solutions by nominal design (without considering variations), worst-case design(worst-corner) and the proposed stochastic programming based design. The results on 10 selected global or local nets and theaverage result on 5000 nets are shown. Note that most nets in 5000 nets have size < 20.

For completeness, the details of generating Latin Hypercube samples are included as follows. Let us use a simple two-dimensional simulation example to illustrate the Latin Hypercube sampling technique. Given two random variables $x$, $y$, we are to generate 8 simulation samples. Suppose that the standard sample generation technique produces the samples shown in Figure 4(a). These samples are not evenly distributed in the simulation space and, more importantly, they do not cover the simulation space well. Simulations with these samples would not lead to good yield estimation accuracy. To generate samples that are well distributed in the simulation space, one needs to remember the locations of the previously generated samples. The Latin Hypercube sampling technique accomplishes this by dividing the simulation space into rows and columns and requiring that only one sample be generated in each row and each column. For example, in Figure 4(b), we first divide the simulation space into 8 × 8 equal-probability bins. When a sample is generated from a bin, its corresponding row and column are removed to avoid generating more samples in that row and column. It can be seen that the generated samples are well distributed in the simulation space. Therefore, one can use a much smaller number of samples (200 samples as in [22], [23], and [25]) to approximate the full-fledged Monte Carlo simulation with 5000 samples well. Refer to [22] for further details. Latin Hypercube sampling has been successfully used in VLSI CAD (see, e.g., [23]–[25]).
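The row-and-column rule can be sketched in a few lines of Python using only the standard library. This is a generic textbook-style construction on the unit hypercube, given here as an illustration rather than as the sampler used in the experiments; the function name is hypothetical. Each dimension is split into $n$ equal-probability bins, and a random permutation of the bins guarantees exactly one sample per bin in every dimension.

```python
import random


def latin_hypercube(n, dims, rng=None):
    """Generate n points in [0, 1)^dims with exactly one point per bin in each dimension."""
    rng = rng or random.Random()
    columns = []
    for _ in range(dims):
        bins = list(range(n))
        rng.shuffle(bins)                      # random bin order for this dimension
        # Place one point uniformly inside each bin of width 1/n.
        columns.append([(b + rng.random()) / n for b in bins])
    return [tuple(col[i] for col in columns) for i in range(n)]


points = latin_hypercube(8, 2, random.Random(0))
for p in points:
    print(int(p[0] * 8), int(p[1] * 8))       # row and column bin of each sample
```

To obtain samples of a non-uniform variation source, each coordinate can be passed through the inverse cumulative distribution function of that source, which preserves the one-sample-per-bin property in probability.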

IV. EXPERIMENTS AND RESULTS

Extensive experiments were performed to validate the proposed layer assignment approach, which aims to reduce the wire cost while taking the timing yield into account. The following subsections describe the experimental setup and analyze the experimental results.

A. EXPERIMENTAL SETUPS

The proposed stochastic layer assignment approach is implemented in C++. Sequential rounding is used to compute the integer solutions, and in each step the nonlinear optimization tool IPOPT [26] is used to solve the relaxed mathematical programming problem. The approach is tested on a Pentium IV machine with a 2.5 GHz quad-core CPU and 8 GB of memory. The experiments are performed on a set of 5000 industrial buffered nets, and there are eight layers. Most buffered nets have ≤ 20 wires. The wire cost is measured by scaled wire area in this paper; however, other metrics can be easily incorporated in our stochastic programming approach. In the experiments, variations are assumed to follow Gaussian distributions. Because Monte Carlo simulation is used, our technique is not limited to Gaussian distributions and can handle other distributions as well. Specifically, metal width, metal thickness, gate length, and gate threshold voltage are assumed to follow normal distributions, and the 3σ value for each variation source is set to 15% of the mean value. The yield target is set to 99%. As mentioned above, Latin Hypercube sampling based Monte Carlo simulations with 200 Latin Hypercube samples are used for yield estimation. Note that after optimization, full-fledged Monte Carlo simulations with 5000 samples are used to accurately evaluate the timing yield of the layer assignment solution.
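The variation model and the yield evaluation described above can be illustrated with a short Python sketch. The function names and numeric values are hypothetical; the sketch only shows how a standard deviation follows from "3σ equals 15% of the mean" and how a timing yield is counted from sampled driver RATs.

```python
import random


def sigma_from_mean(mean, three_sigma_frac=0.15):
    """Standard deviation such that 3*sigma equals the given fraction of the mean."""
    return three_sigma_frac * mean / 3.0


def sample_parameter(mean, rng):
    """Draw one Gaussian sample of a variation source (simple sampling)."""
    return rng.gauss(mean, sigma_from_mean(mean))


def timing_yield(q_driver_samples, at_driver):
    """Fraction of samples whose driver RAT meets the arrival time."""
    ok = sum(1 for q in q_driver_samples if q >= at_driver)
    return ok / len(q_driver_samples)


rng = random.Random(1)
widths = [sample_parameter(100.0, rng) for _ in range(5000)]  # hypothetical metal width
print(sigma_from_mean(100.0))
print(timing_yield([1.0, 2.0, 3.0, 4.0], at_driver=2.5))
```

In the actual flow, each sampled parameter set would be propagated through the delay model to obtain $Q_i(\text{driver})$ before the yield is counted; that step is omitted here.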

B. EXPERIMENTAL RESULTS

Refer to Table 1 for the results. It shows the average result on the 5000 nets as well as the results on 10 selected nets at various scales. Note that the large nets among these 10 nets do not represent the typical nets among the 5000 nets, since most buffered signal nets are small, i.e., have ≤ 20 wires in our industrial signal nets. The nominal design is obtained by solving Eqn. (3) using the nominal value of each circuit parameter, and thus no variations are considered. The worst-case design is obtained using the worst-case corner, i.e., setting each capacitance and resistance to the worst-case µ + 3σ corner (µ is the mean and σ is the standard deviation). The stochastic design is obtained from our proposed stochastic layer assignment approach. Note that all of the above are integer programs, and thus sequential rounding is involved. Note also that some nets are global nets and some are local nets. Although they have a similar number of wires, they can have quite different total lengths, and thus the cost can be very different. We make the following observations.




FIGURE 5. Yield-Cost tradeoff curve for Net5.

• The nominal design does not consider variations andthus its yield is always too small (35.2% on average).

• The worst-case design is too conservative about the variations: its yield is always 100%, which leads to a significant waste of resources. This is clear from comparing the cost of our proposed stochastic programming approach with that of the worst-case design.

• Our stochastic programming approach computes a goodtradeoff between the nominal design and worst-casedesign. We are always able to closely match the yieldtarget. Compared to nominal design, our stochasticdesign largely improves its yield. Compared to worst-case design, our stochastic design achieves 15.7%reduction in cost.

• The stochastic optimization is slower than computing the nominal design and the worst-case design. However, one can tackle this by parallelizing our stochastic programming approach, since the proposed stochastic technique is parallelization friendly, as indicated in Section III-C.3. The parallelized version of our stochastic programming technique is implemented in a quad-core computing environment. It can be seen that on average we reduce the runtime by 2.54×. With more cores, the speedup is expected to be more significant.

• The average running time of the proposed algorithm in parallel is 27.2 seconds, but in some cases the running time can exceed 200 seconds. The reasons are: (1) the integer program takes longer to solve for complex nets, and (2) a large number of γ search iterations may be needed.

By varying the robust parameter γ, we are able to obtain different yield-cost tradeoffs. A sample yield-cost tradeoff curve is shown in Figure 5 for Net5. The curve shows the tradeoff between the area and the yield. To reach a larger yield value, more area cost is required. For this net, to achieve 98% yield, the area cost has to be greater than 165.

V. CONCLUSION

This study aims at handling variations in process, voltage, and temperature to avoid the timing violations that these variations cause in VLSI design. A variation-aware layer assignment approach has been developed for general variation models. The proposed approach explores a systematic stochastic programming method for timing yield driven layer assignment, such that the wire cost is minimized while the yield constraint is satisfied. The single-stage stochastic program has been designed to explicitly control the yield, which gives it an advantage in timing yield control over its traditional two-stage counterpart. In addition, the hierarchical optimization framework allows parallel execution of the optimization process on a multi-core platform to gain a significant improvement in runtime performance.

The results obtained from experiments on 5000 industrial nets demonstrate that the proposed approach (1) can significantly improve the timing yield by 64.0% in comparison with the nominal design and (2) can reduce the wire cost by 15.7% in comparison with the worst-case design. It is also observed that a quad-core implementation of our stochastic optimization approach can reduce the runtime by 2.54× in comparison with its sequential implementation.

REFERENCES

[1] D. Sinha, N. Shenoy, and H. Zhou, "Statistical gate sizing for timing yield optimization," in Proc. IEEE/ACM ICCAD, Nov. 2005, pp. 1037–1041.

[2] A. Davoodi and A. Srivastava, "Variability driven gate sizing for binning yield optimization," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 16, no. 6, pp. 683–692, Jun. 2008.

[3] A. Singhee, C. Fang, J. Ma, and R. Rutenbar, "Probabilistic interval-valued computation: Toward a practical surrogate for statistics inside CAD tools," in Proc. 43rd Annu. DAC, 2006, pp. 167–172.

[4] V. Khandelwal and A. Srivastava, "Monte-Carlo driven stochastic optimization framework for handling fabrication variability," in Proc. IEEE/ACM ICCAD, Nov. 2007, pp. 105–110.

[5] Z. Li, C. Alpert, S. Hu, T. Muhmud, S. Quay, and P. Villarrubia, "Fast interconnect synthesis with layer assignment," in Proc. ISPD, 2008, pp. 71–77.

[6] S. Hu, Z. Li, and C. Alpert, "A polynomial time approximation scheme for timing constrained minimum cost layer assignment," in Proc. IEEE/ACM ICCAD, Nov. 2008, pp. 112–115.

[7] C. Alpert et al., "Techniques for fast physical synthesis," Proc. IEEE, vol. 95, no. 3, pp. 573–599, Mar. 2007.

[8] R. M. D. Wu, J. Hu, and M. Zhao, "Layer assignment for crosstalk risk minimization," in Proc. ASP-DAC, 2004, pp. 159–162.

[9] G. Xu, L.-D. Huang, D. Z. Pan, and M. D. F. Wong, "Redundant-via enhanced maze routing for yield improvement," in Proc. ASP-DAC, 2005, pp. 1148–1151.

[10] M. Cho, K. Lu, K. Yuan, and D. Z. Pan, "Boxrouter 2.0: Architecture and implementation of a hybrid and robust global router," in Proc. IEEE/ACM ICCAD, Jun. 2007, pp. 503–508.

[11] M. Cho, K. Yuan, Y. Ban, and D. Z. Pan, "ELIAD: Efficient lithography aware detailed router with compact post-OPC printability prediction," in Proc. 45th Annu. DAC, Jun. 2008, pp. 504–509.

[12] X. Chen, T. Wei, and S. Hu, "Uncertainty-aware household appliance scheduling considering dynamic electricity pricing in smart home," IEEE Trans. Smart Grid, vol. 4, no. 2, pp. 932–940, Jun. 2013.

[13] T. Wei, X. Chen, and S. Hu, "Reliability-driven energy efficient task scheduling for multiprocessor real-time systems," IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 30, no. 10, pp. 1569–1573, Oct. 2011.

[14] J. Lillis, C.-K. Cheng, and T.-T. Y. Lin, "Optimal wire sizing and buffer insertion for low power and a generalized delay model," IEEE J. Solid-State Circuits, vol. 31, no. 3, pp. 437–447, Mar. 1996.

[15] S. Hu et al., "Fast algorithms for slew constrained minimum cost buffering," IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 26, no. 11, pp. 2009–2022, Nov. 2007.

[16] A. Abou-Seido, B. Nowak, and C. Chu, "Fitted Elmore delay: A simple and accurate interconnect delay model," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 12, no. 7, pp. 691–696, Jul. 2004.

[17] C. Visweswariah, K. Ravindran, K. Kalafala, S. G. Walker, and S. Narayan, "First-order incremental block-based statistical timing analysis," in Proc. 41st Annu. DAC, 2004, pp. 331–336.

[18] H. Chang and S. Sapatnekar, "Statistical timing analysis under spatial correlations," IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 24, no. 9, pp. 1467–1482, Sep. 2005.

[19] J. Birge and F. Louveaux, Introduction to Stochastic Programming. New York, NY, USA: Springer-Verlag, 1997.

[20] A. J. Kleywegt and A. Shapiro, "Stochastic optimization," in Handbook of Industrial Engineering, G. Salvendy, Ed., 3rd ed. New York, NY, USA: Wiley, 2001.

[21] S. Shah, A. Srivastava, D. Sharma, D. Sylvester, D. Blaauw, and V. Zolotov, "Discrete Vt assignment and gate sizing using a self-snapping continuous formulation," in Proc. IEEE/ACM ICCAD, Nov. 2005, pp. 705–712.

[22] K. Fang and L. Runze, Design and Modelling for Computer Experiments. Cleveland, OH, USA: CRC Press, 2005.

[23] S. Hu and J. Hu, "Unified adaptivity optimization of clock and logic signals," in Proc. IEEE/ACM ICCAD, 2007, pp. 125–130.

[24] A. Singhee, S. Singhal, and R. A. Rutenbar, "Practical, fast Monte Carlo statistical static timing analysis: Why and how," in Proc. IEEE/ACM ICCAD, Nov. 2008, pp. 125–130.

[25] S. K. Tiwary, P. K. Tiwary, and R. A. Rutenbar, "Generation of yield-aware Pareto surfaces for hierarchical circuit design space exploration," in Proc. 43rd Annu. DAC, 2006, pp. 31–36.

[26] (2009, Jan.). Ipopt Home Page [Online]. Available: https://projects.coin-or.org/ipopt

XIAODAO CHEN received the B.Eng. degree in telecommunication from the Wuhan University of Technology, Wuhan, China, and the M.Sc. degree in electrical engineering and the Ph.D. degree in computer engineering from Michigan Technological University, Houghton, USA, in 2006, 2009, and 2012, respectively.

He is currently an Assistant Professor with the School of Computer Science, China University of Geosciences, Wuhan.

DAN CHEN (M'02) received the B.Sc. degree in applied physics from Wuhan University, Wuhan, China, the M.Eng. degree in computer science from the Huazhong University of Science and Technology, Wuhan, and the M.Eng. and Ph.D. degrees in computer engineering from Nanyang Technological University, Singapore.

He is currently a Professor, the Head of the Department of Network Engineering, and the Director of the Scientific Computing Laboratory with the School of Computer Science, China University of Geosciences, Wuhan. He was an HEFCE Research Fellow with the University of Birmingham, U.K. His research interests include modeling and simulation of complex systems, neuroinformatics, and high-performance computing.

LIZHE WANG (SM'09) received the B.Eng. (Hons.) and M.Eng. degrees from Tsinghua University, Beijing, China, and the Dr. Eng. (magna cum laude) degree in applied computer science from University Karlsruhe (now Karlsruhe Institute of Technology), Karlsruhe, Germany.

He is a 100-Talent Program Professor with the Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing, and a ChuTian Chair Professor with the School of Computer Science, China University of Geosciences, Wuhan, China.

Prof. Wang is a fellow of the Institution of Engineering and Technology and the British Computer Society.

ZE DENG received the B.Sc. degree from the China University of Geosciences, Wuhan, China, the M.Eng. degree from Yunnan University, and the Ph.D. degree from the Huazhong University of Science and Technology, China. He is currently an Assistant Professor with the School of Computer Science, China University of Geosciences, where he is also a Post-Doctoral Researcher with the Faculty of Resources.

RAJIV RANJAN is a Research Scientist and Julius Fellow with the CSIRO Computational Informatics Division (formerly CSIRO ICT Centre). His expertise is in datacenter cloud computing, application provisioning, and performance optimization. He received the Ph.D. degree in engineering from the University of Melbourne in 2009. He has authored 62 scientific and peer-reviewed papers (seven books, 25 journals, 25 conferences, and five book chapters). His h-index is 20, with a lifetime citation count of 1660+ (Google Scholar). His papers have also received 140+ ISI citations. Of his papers, 70% of the journal papers and 60% of the conference papers are A*/A ranked ERA publications.

ALBERT Y. ZOMAYA (F'04) is currently a Chair Professor of High Performance Computing and Networking and an Australian Research Council Professorial Fellow with the School of Information Technologies, University of Sydney. He is also the Director of the Centre for Distributed and High Performance Computing, which was established in 2009. He was the CISCO Systems Chair Professor of Internetworking from 2002 to 2007, and the Head of the School from 2006 to 2007 in the same school. Prior to his current appointment, he was a Full Professor with the School of Electrical, Electronic and Computer Engineering, University of Western Australia, where he led the Parallel Computing Research Laboratory from 1990 to 2002. He served as an Associate-, Deputy-, and Acting-Head in the same department, and held numerous visiting positions and has extensive industry involvement. He received the Ph.D. degree from the Department of Automatic Control and Systems Engineering, Sheffield University, U.K.




SHIYAN HU (SM'10) received the Ph.D. degree in computer engineering from Texas A&M University, College Station, in 2008.

He is currently an Assistant Professor with the Department of Electrical and Computer Engineering, Michigan Technological University, Houghton, where he serves as the Director of the Michigan Tech VLSI CAD Research Laboratory. He was a Visiting Professor with the IBM Austin Research Laboratory, Austin, TX, USA, in 2010. He has authored over 50 journal and conference publications. His current research interests include computer-aided design for very large scale integrated circuits on nanoscale interconnect optimization, low-power optimization, and design for manufacturability.

Dr. Hu has served as a Technical Program Committee Member for a few conferences, such as the International Conference on Computer-Aided Design (ICCAD), International Symposium on Physical Design, International Society for Quality Electronic Design, International Symposium on Very Large Scale Integration, and International Symposium on Circuits and Systems. He was a recipient of the Best Paper Award Nomination from ICCAD in 2009.

