2001 Low-Power CMOS With Subvolt Supply Voltages


394 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 9, NO. 2, APRIL 2001

obtained over the 1000 experiments are presented in Table II for four of the benchmark circuits (similar trends were observed for the other circuits). We can see that our dual-threshold selection heuristic performs much better than a random selection of gates. By repeating the random experiments a large number of times (1000), we attain a high level of confidence that our algorithms can provide a very fast and near-optimal solution, as opposed to randomized optimization algorithms like simulated annealing.

VI. CONCLUSION

We have demonstrated a new approach to low power optimization of digital static CMOS circuits for dual-threshold voltage manufacturing processes. The algorithms developed allow the designer to assign one of two threshold voltages to all the gates in the circuit. The assignment is performed in such a way that subsequent optimization for low power operation yields a significant reduction in the total power consumption of the circuit. Experiments were conducted on several ISCAS89 benchmark circuits and results indicate that significant improvement in power consumption, over single high-threshold circuits, can be achieved. The algorithm is fast and typically completes in a few CPU seconds.

REFERENCES

[1] A. Chandrakasan and R. Brodersen, "Minimizing power consumption in digital CMOS circuits," Proc. IEEE, vol. 83, pp. 498–523, Apr. 1995.

[2] J. Cong and C.-K. Koh, "Simultaneous driver and wire sizing for performance and power optimization," IEEE Trans. VLSI Syst., vol. 2, pp. 408–425, Dec. 1994.

[3] D. Liu and C. Svensson, "Trading speed for low power by choice of supply and threshold voltages," IEEE J. Solid-State Circuits, vol. 28, pp. 10–17, Jan. 1993.

[4] Z. Chen and J. Plummer, "Low threshold voltage quarter micron MOSFETs for low power applications," in Proc. IEEE Symp. Low Power Electronics, 1995, pp. 78–79.

[5] P. Pant, V. De, and A. Chatterjee, "Simultaneous power supply, threshold voltage and transistor size optimization for low power operation of CMOS circuits," IEEE Trans. VLSI Syst., vol. 6, pp. 538–545, Dec. 1998.

[6] R. Gonzalez, B. M. Gordon, and M. Horowitz, "Supply and threshold voltage scaling for low power CMOS," IEEE J. Solid-State Circuits, vol. 32, pp. 1210–1216, Aug. 1997.

[7] J. Burr and J. Shott, "A 200 mV self-testing encoder-decoder circuit using Stanford ultra low power CMOS," in Proc. Int. Solid-State Circuits Conf., Feb. 1994, pp. 84–85.

[8] L. Wei, Z. Chen, M. Johnson, K. Roy, and V. De, "Design and optimization of low voltage high performance dual threshold CMOS circuits," in Proc. Design Automation Conf., 1998, pp. 489–494.

[9] Q. Wang and S. Vrudhula, "Static power optimization of deep submicron CMOS circuits for dual technology," in Proc. Int. Conf. Computer-Aided Design, 1998, pp. 490–494.

[10] S. S. Sapatnekar, V. B. Rao, P. M. Vaidya, and S. M. Kang, "An exact solution to the transistor sizing problem for CMOS circuits using convex optimization," IEEE Trans. Comput.-Aided Design, vol. 12, pp. 1621–1632, Nov. 1993.

[11] N. Hendenstierna and K. Jeppson, "CMOS circuit speed and buffer optimization," IEEE Trans. Comput.-Aided Design, vol. 6, pp. 270–281, Mar. 1987.

[12] B. Hoppe, G. Neuendorf, D. Schmitt-Landsiedel, and W. Specks, "Optimization of high-speed CMOS logic circuits with analytical models for signal delay, chip area and dynamic power dissipation," IEEE Trans. Computer-Aided Design, vol. 9, pp. 236–247, Mar. 1990.

Low-Power CMOS with Subvolt Supply Voltages

Mircea R. Stan

Abstract: We first present a circuit taxonomy along the space and time dimensions, which is useful for classifying generic low-power techniques, followed by an analysis of optimal power supply and threshold voltages and transistor sizing for minimizing the energy-delay product of a class of complementary metal-oxide-semiconductor (CMOS) digital circuits.

Index Terms: Digital-complementary metal-oxide-semiconductor (CMOS) VLSI, low-power design, low voltage, power consumption model.

I. INTRODUCTION

Power consumption in complementary metal-oxide-semiconductor (CMOS) has two components: ac (dynamic) power that varies with operating frequency and dc (static) power that is independent of frequency [1]–[3]. The two major sources of dynamic power are the capacitive current for charging and discharging load capacitances and the short circuit (or overlap) current [4]. When the supply voltage is aggressively scaled down the percentage of short circuit power becomes smaller and tends to zero as gets close to [5]. The two major sources of static power are the subthreshold current [6], [7] and the junction leakage current. In deep submicron technologies the junction leakage becomes negligible compared to the subthreshold current, but other leakage phenomena like gate oxide tunneling and gate induced drain leakage (GIDL) are likely to become important [8], [9].

Although recognized as an important method to reduce power [1], scaling the power supply voltage has been historically driven by reliability concerns (gate oxide breakdown voltage and leakage) and not by power reduction strategies. The SIA Technology Roadmap [10], [11] predicts a V V in the year 2009 for a 70 nm technology, and a V V in the year 2012 for a 50 nm technology. In what follows we show that a $V_{DD}$ as low as 0.8 V should be used for low-power circuits even with current 0.25 µm and 0.18 µm processes as it provides the optimum energy-delay product for the design.

A. Figures of Merit for Low-Power Design

The classic two-dimensional VLSI design space tries to minimize the circuit area $A$ and delay $T$ in order to reduce cost and improve performance, by using optimizations with objective functions such as $AT$ and $AT^2$ [12]. The new emphasis on low power adds a third dimension (power) to the previously two-dimensional design space [13], but, except for a few cases [14], most of the research in low-power design is still two-dimensional with objective functions such as $P$ (power), $PT$ (energy), and $PT^2$ (energy-delay product).¹ The power $P$ itself is a poor candidate for optimization as it can always be lowered trivially by reducing the clock frequency. The energy $PT$ is an appropriate figure of merit for applications without stringent performance requirements, but, when performance is critical, the energy-delay product $PT^2$ is a good compromise between the need to reduce power while still operating at reasonable speed [15].
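To make the three figures of merit concrete, the following minimal sketch evaluates P, PT, and PT^2 for a single operation. The function name and the 1 pJ and 1 ns values are hypothetical, chosen only for illustration, and static power is ignored, so this is an assumption-laden example and not a result from this paper.

```python
def figures_of_merit(energy_per_op_j, delay_s):
    """Return (P, PT, PT^2) for one operation of energy E finished in time T."""
    power = energy_per_op_j / delay_s   # P    = E / T
    energy = power * delay_s            # PT   = E
    edp = power * delay_s ** 2          # PT^2 = E * T
    return power, energy, edp


# Hypothetical design point: 1 pJ per operation, 1 ns per operation.
fast = figures_of_merit(1e-12, 1e-9)

# Same circuit clocked 10x slower. With static power ignored (an assumption
# of this sketch), the energy per operation does not change.
slow = figures_of_merit(1e-12, 10e-9)

print("P    fast/slow:", fast[0], slow[0])
print("PT   fast/slow:", fast[1], slow[1])
print("PT^2 fast/slow:", fast[2], slow[2])
```

Slowing the clock by 10x lowers the power by 10x while the energy per operation is unchanged and the energy-delay product grows by 10x, which is why power alone can be a misleading objective.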

Manuscript received February 20, 1999; revised September 23, 1999. This work was supported in part by NSF CAREER Award MIP-9703440.

The author is with the Electrical Engineering Department, University of Virginia, Charlottesville, VA 22903 USA (e-mail: [email protected]).

Publisher Item Identifier S 1063-8210(01)00699-0.

¹ The notation $PT$ for energy and $PT^2$ for energy-delay product is used to underscore the replacement of area by power in the classic $AT$ and $AT^2$ figures of merit.

1063-8210/01$10.00 © 2001 IEEE


Fig. 1. for and and .

Replacing (5) in (4) we obtain the desired result

(6)

This provides a simple voltage scaling rule-of-thumb for optimal energy-delay product simply by looking at the ratio of ac to dc power. A large ratio suggests that more parallelism should be used to reduce dynamic power, while a small ratio implies that a more serial implementation will be optimal by reducing leakage.

C. Variation of $\alpha$ with $V_{DD}$

Until now was assumed but at very low $V_{DD}$ the short-channel effects become less critical and $\alpha$ approaches the ideal value of 2. In order to observe the variation of $\alpha$ with supply voltage for a given technology we extracted an approximate value numerically from HSPICE simulations using BSIM3v3 [21] models. This extraction was done by curve fitting analytically the simulated characteristics for different values to the $\alpha$-power law equation [20]. Fig. 2 shows that indeed, as the voltage gets into the sub-Volt range , and for the optimal as was previously determined in [3]. It should be understood though that, as the technology scales to the nanometer range, the devices become velocity saturated at even smaller supply voltages which means that $\alpha$ becomes again , even for low $V_{DD}$ values.
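The curve-fitting step can be sketched as a least-squares fit in log-log space. This is only an illustration under stated assumptions: the saturation current is taken to follow I_D = K (V_GS - V_T)^alpha as in the alpha-power law [20], V_T is assumed known, the function name `fit_alpha` is hypothetical, and the data are synthetic values generated in the script and not HSPICE output.

```python
import math


def fit_alpha(v_gs, i_d, v_t):
    """Fit I_D = K * (V_GS - V_T)**alpha by linear regression on logarithms.

    Returns (alpha, K). Points at or below threshold are skipped because
    the power law only applies for V_GS > V_T.
    """
    xs, ys = [], []
    for v, i in zip(v_gs, i_d):
        if v > v_t and i > 0:
            xs.append(math.log(v - v_t))
            ys.append(math.log(i))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                         # slope of log I vs log(V - V_T)
    k = math.exp(mean_y - alpha * mean_x)     # intercept gives log K
    return alpha, k


# Synthetic "simulated" characteristic (hypothetical numbers).
v_t = 0.3
v_gs = [0.5 + 0.05 * n for n in range(11)]            # 0.5 V to 1.0 V
i_d = [2e-4 * (v - v_t) ** 1.6 for v in v_gs]         # true alpha = 1.6

alpha, k = fit_alpha(v_gs, i_d, v_t)
print(alpha, k)
```

Repeating such a fit over different voltage windows would show how the extracted exponent moves with the supply range, which is the kind of trend Fig. 2 reports.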

When both $V_{DD}$ and $V_T$ are aggressively scaled we cannot ignore any of the two terms in (2) and . This forces us to revisit the first order results presented in the previous sections. The results concerning do not change drastically with as was seen in Fig. 1, but the optimal supply voltage becomes V V which is larger than all previously reported first order analytical results [3], [16], [22].
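As a point of reference for first-order results of the kind cited above, the following derivation uses a deliberately simplified model. The model is an assumption of this sketch and not the paper's equation (2): leakage is ignored, the energy per operation $E$ is taken proportional to $C V_{DD}^2$, and the delay $T$ follows the $\alpha$-power law [20] for a fixed $V_T$.

```latex
\begin{align*}
E &\propto C\,V_{DD}^{2}, \qquad
T \propto \frac{C\,V_{DD}}{(V_{DD}-V_T)^{\alpha}}, \qquad
E\,T \propto \frac{V_{DD}^{3}}{(V_{DD}-V_T)^{\alpha}} \\[4pt]
\frac{d\,\ln(E\,T)}{dV_{DD}} &= \frac{3}{V_{DD}} - \frac{\alpha}{V_{DD}-V_T} = 0
\;\Longrightarrow\;
V_{DD}^{\mathrm{opt}} = \frac{3}{3-\alpha}\,V_T , \qquad 0 < \alpha < 3
\end{align*}
```

Under these assumptions the optimum is $1.5\,V_T$ for $\alpha = 1$ and $3\,V_T$ for $\alpha = 2$. Once leakage is included the optimum moves, so this serves only as a baseline for comparison.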

When the optimum ratio of ac and dc power is also larger than the value computed in the previous section. Numerically, for and , the optimal ratio is which again is much larger than previously published results.

These analytical results have been verified through simulation. A custom designed carry-lookahead adder [23], [24] was simulated with HSPICE using BSIM3v3 models [21] for a current 0.25 µm CMOS technology. The simulation results are a good confirmation of the analytical values, especially concerning the to ratio. The optimum values obtained from the simulation are mV, and V for a low activity case, and mV and V for a high activity case. The optimal values from simulation are indeed very close to , but the accuracy of is not as good. This is due to the low logic depth of the carry-lookahead adder and to the fact that it is not truly a homogeneous and stationary circuit. This results in a small value for which places the optimization on the steep part of the curve in Fig. 1.

D. Sizing for Minimum Energy-Delay Product

Scaling the voltages has assumed an effective capacitive load as a function of technology, switching activity, ratio of gate to parasitic and interconnect capacitance, fanout, etc. Here we show analytically (the result was also shown graphically in [15]) that the optimal transistor sizing for minimum energy-delay product is obtained when the transistor capacitance equals the interconnect capacitance. Furthermore, this optimal sizing is independent of voltage scaling, hence it can be done in parallel with optimizing $V_{DD}$ and $V_T$.

The circuit model for this section includes the fanout and an explicit interconnect capacitance as in Fig. 3. From Section III-A we use the fact that the optimal ratio of ac to dc power is a fixed fraction . Equation (1) can thus be approximated as

(7)

and can be rewritten such that transistor sizes become explicit (see the Appendix)

with

Fig. 2. $\alpha$ as a function of supply voltage.

where
the switching activity;
the fanout;
the ratio of the width of the PMOS to the width of the NMOS;
the logical effort [25];
the ratio of parasitic (e.g., diffusion, short-circuit equivalent, etc.) capacitance to gate capacitance;
and the transistor and interconnect capacitances, respectively.

can be replaced in (7) by denoting as the equivalent transistor width that has the same capacitance as the average wire

(8)

with

By taking the derivative of (8) with respect to and setting to zero we obtain the optimum sizing as . This means that the optimal sizing is such that the transistor capacitance is equal to the interconnect capacitance . This result is totally independent of voltage scaling and partially contradicts the common wisdom to use minimum size transistors in low-power design. Assuming an average interconnect capacitance of 40 fF [17] it results that the optimal transistor width should be µm which is many (30) times larger than the minimum size in a 0.25 µm technology.
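The capacitance-matching result can be checked numerically with a short sketch. The model below is a simplification assumed here, not the paper's equation (8): the switched load per stage is c_t*W + C_int, energy is proportional to that load, and delay is proportional to the load divided by W (drive current scales with width), so the energy-delay product is proportional to (c_t*W + C_int)^2 / W. The 40 fF wire load is taken from the text; the 2 fF/µm gate capacitance and all function names are hypothetical.

```python
def edp(width_um, c_t_ff_per_um, c_int_ff):
    """Energy-delay product up to a constant factor (simplified model)."""
    load_ff = c_t_ff_per_um * width_um + c_int_ff   # gate + wire capacitance
    return load_ff ** 2 / width_um                  # energy ~ load, delay ~ load / W


def argmin_unimodal(f, lo, hi, iters=200):
    """Ternary search for the minimizer of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


C_INT_FF = 40.0        # average interconnect capacitance quoted in the text [17]
C_T_FF_PER_UM = 2.0    # hypothetical gate capacitance per micron of width

w_opt = argmin_unimodal(lambda w: edp(w, C_T_FF_PER_UM, C_INT_FF), 0.1, 1000.0)
print("optimal width (um):", w_opt)
print("transistor cap (fF):", C_T_FF_PER_UM * w_opt, "wire cap (fF):", C_INT_FF)
```

In this model the minimizer satisfies c_t*W = C_int, here W = 20 µm. Any voltage-dependent factor multiplies the whole objective and does not move the minimizer, which illustrates why the sizing can be chosen independently of the voltage scaling. The numeric width depends on the assumed 2 fF/µm and is not the paper's value.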

The result above is generic in nature and needs to be used only as

a starting point for more detailed optimizations. One obvious missing

item is the variation of the average interconnect capacitance (which is

considered fixed here) as a function of increasing transistor sizes. A

more detailed analysis could also consider the effect of the transistor

to transistor ratio, the effect of rise time and fall time on short-circuit

power [26], or even the simultaneous sizing of transistors and wires

[27], [28].

E. Buffered Design for Optimal Energy-Delay Product

As the optimal sizing in the presence of interconnect parasitics is

much larger than minimum size, it is natural to examine the effect of a

buffered circuit style which can drive the interconnect more efficiently

as in Fig. 4. A buffered circuit style is widely used in dynamic Domino

circuits and was proposed for static logic as QuadRail [29], [30].

First we rewrite the equivalent of (8) by assuming that the effect of

the large interconnect capacitance is only seen at the output buffer

(9)

is the sizing for the logic gate and is the sizing for the buffer,

the fanout only affects the buffer, and the logical effort for the buffer

is 1.

By taking the derivative with respect to and and setting to

zero to obtain the optimal values

where was the optimal size for the unbuffered case. The first observation is that for (no fanout) and (no logical effort) the optimal sizes are and . The energy-delay product in this case remains the same as in the unbuffered case.

When and are the savings can be quite large. For example

for and , and

which leads to

a saving of 87% in energy-delay product

.


Fig. 3. Circuit with average fanout , average interconnect load , and logic depth .

Fig. 4. Circuit with logic and buffer stages.

Fig. 5. Constant energy-delay curves for .

F. Parameter Variations

Until now the analysis assumes that all parameters ($V_{DD}$, $V_T$, sizes, etc.) can be precisely controlled; in reality there are always going to be parameter variations due to process, temperature, etc. The inter-die variations can be compensated by back-biasing [31], but intra-die random variations [32] represent a bigger problem which may limit the effectiveness of aggressive threshold scaling in the future. Energy-delay curves as in Fig. 5 show a nearly flat region near the minimum for voltages slightly larger than the optimum values and a steep increase for smaller values [16]. This suggests a conservative approach: optimize for worst case such that the variations due to process and temperature will only make the actual values larger than optimum. This makes the typical design suboptimal but for small parameter variations the increase will not be significant. A variation of 25% in for example leads to an increase of 20% in energy-delay for the values in the Appendix. A better approach could use statistical methods [33] and let the voltages also be smaller than optimum but with a small probability and thus bring the typical case closer to the optimum.
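The asymmetry around the optimum can be illustrated with a small sketch. It assumes a simplified, leakage-free model in which the energy-delay product is proportional to V_DD^3 / (V_DD - V_T)^alpha, whose minimum for a fixed V_T lies at V_DD = 3*V_T / (3 - alpha). The model, the V_T = 0.3 V value, and the function name are assumptions for illustration; they do not reproduce the curves of Fig. 5 or the 20% figure quoted above.

```python
def edp_dynamic(v_dd, v_t, alpha):
    """Energy-delay product up to a constant, ignoring leakage (assumption)."""
    return v_dd ** 3 / (v_dd - v_t) ** alpha


V_T = 0.3      # hypothetical threshold voltage in volts
ALPHA = 2.0    # long-channel exponent of the alpha-power law

# Stationary point of the simplified model: V_DD = 3 * V_T / (3 - alpha).
v_opt = 3.0 * V_T / (3.0 - ALPHA)
base = edp_dynamic(v_opt, V_T, ALPHA)

# Relative energy-delay penalty for a supply 25% above and 25% below optimum.
penalty_high = edp_dynamic(1.25 * v_opt, V_T, ALPHA) / base - 1.0
penalty_low = edp_dynamic(0.75 * v_opt, V_T, ALPHA) / base - 1.0

print("V_DD optimum (V):", v_opt)
print("penalty at +25%:", penalty_high)
print("penalty at -25%:", penalty_low)
```

In this toy model a supply 25% above the optimum costs about 3% in energy-delay, while a supply 25% below costs about 8%, consistent with the qualitative advice to err toward larger voltages.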

IV. CONCLUSION

We have presented several new results related to optimal voltages and sizing of CMOS circuits for minimum energy-delay product. These results were presented in the context of a generic circuit classification along space and time which helps understand the limitations and applicability of various low-power techniques.

TABLE I. TYPICAL PARAMETERS FOR A 0.25-µm CMOS TECHNOLOGY

APPENDIX

All the computations in Section III are based on a large number of parameters depending on technology, circuit style, etc. Here some of the values are explained analytically although some of the choices are more or less arbitrary in order to track published data. Table I summarizes the results.

and are the capacitance and current for the average gate while and are the values per micron of transistor width [16]. The value of fF/µm can be justified by using the following formulas and a choice of parameters as in Table I

with

Assuming (PMOS transistor twice larger than NMOS) and (a fudge factor to account for parasitic capacitances besides interconnect and for the short-circuit current), . From Section III-D the transistor capacitance equals the interconnect capacitance which means that for an average fF [17], average fanout and average switching activity fF fF. The equivalent interconnect transistor width µm and the optimal transistor width µm, where fF/µm . Finally we obtain fF/µm as in [16].

The zero-threshold current A/µm can be computed with the following formulas [7]:

where the carrier mobility cm²/V·s and µm which results in the desired A/µm [16].

and can be computed using the following:

with as another fudge factor to account for the logical effort [25] and the reduced current drive in deep-submicron, compared to an ideal inverter. This results in ps and .

ACKNOWLEDGMENT

The author would like to thank A. Forestier for help with some of the simulations and T. Callaway for providing the carry-lookahead adder circuit used for simulation in Section III-C.

REFERENCES

[1] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, "Low-power CMOS digital design," Proc. IEEE, vol. 81, 1992.

[2] A. P. Chandrakasan and R. W. Brodersen, "Minimizing power consumption in digital CMOS circuits," Proc. IEEE, vol. 83, pp. 498–523, Apr. 1995.

[3] J. B. Burr and A. M. Peterson, "Ultra low power CMOS technology," in Proc. NASA VLSI Design Symp., 1991.

[4] S. Turgis, N. Azemard, and D. Auvergne, "Explicit evaluation of short circuit power dissipation for CMOS logic structures," in Proc. Int. Symp. Low Power Design, Dana Point, CA, Apr. 1995, pp. 129–134.

[5] A. Alvandpour, P. Ededefors, and C. Svensson, "Separation and extraction of short-circuit power consumption in digital CMOS VLSI circuits," in Proc. Int. Symp. Low Power Electronics Design, Monterey, CA, Aug. 1998, pp. 245–249.

[6] T. A. Fjeldly and M. Shur, "Threshold voltage modeling and the subthreshold regime of operation of short-channel MOSFET," IEEE Trans. Electron Devices, vol. 40, pp. 137–145, Jan. 1993.

[7] T. Grotjohn and B. Hoefflinger, "A parametric short-channel MOS transistor model for subthreshold and strong inversion current," IEEE Trans. Electron Devices, pp. 234–246, 1984.

[8] K. Roy, "Low power design and leakage control techniques for deep submicron ICs," in Tutorial at VLSI Design Conf., Jan. 1999.

[9] A. Keshavarzi, K. Roy, and C. Hawkins, "Intrinsic IDDQ: Origins, reduction, and applications in deep sub-low-power CMOS ICs," in Proc. Int. Test Conf., 1997, pp. 146–155.

[10] SIA, "The national technology roadmap for semiconductors," 1997.

[11] SIA, "The international technology roadmap for semiconductors," 1998.

[12] J. Ullman, Ed., Computational Aspects of VLSI. Rockville, MD: Comput. Sci. Press, 1984.

[13] D. Singh, J. M. Rabaey, M. Pedram, F. Catthoor, S. Rajgopal, N. Sehgal, and T. J. Mozdzen, "Power conscious CAD tools and methodologies: A perspective," Proc. IEEE, vol. 83, pp. 570–594, Apr. 1995.

[14] C. Chen and C. Tsui, "Toward the capability of providing power-area-delay trade-off at the register transfer level," in Proc. Int. Symp. Low Power Electronics Design, Monterey, CA, Aug. 1998, pp. 24–29.

[15] M. Horowitz, T. Indermaur, and R. Gonzalez, "Low-power digital design," in Proc. Symp. Low Power Electronics, Oct. 1994, pp. 8–11.

[16] R. Gonzalez, B. M. Gordon, and M. A. Horowitz, "Supply and threshold voltage scaling for low-power CMOS," IEEE J. Solid-State Circuits, vol. 32, pp. 1210–1216, Aug. 1997.

[17] A. J. Bhavnagarwala, B. Austin, and J. D. Meindl, "Minimum supply voltage for bulk Si CMOS GSI," in Proc. Int. Symp. Low Power Electronics Design, Monterey, CA, Aug. 1998.

[18] D. Liu and C. Svensson, "Trading speed for low-power by choice of supply and threshold voltages," IEEE J. Solid-State Circuits, vol. 28, pp. 10–17, Jan. 1993.

[19] C. Svensson and A. Alvandpour, "Low power and low voltage CMOS digital circuit techniques," in Proc. Int. Symp. Low Power Electronics Design, Monterey, CA, Aug. 1998, pp. 7–10.

[20] T. Sakurai and R. Newton, "Alpha-power law MOSFET model and its applications to CMOS inverter delay and other formulas," IEEE J. Solid-State Circuits, vol. 25, pp. 584–594, Apr. 1990.

[21] U. Berkeley. (1997) BSIM3v3.1 SPICE MOS device model. [Online]. Available: http://www-device.EECS.Berkeley.EDU/bsim3/.


[22] M. R. Stan, "Optimal voltages and sizing for low power," in Proc. VLSI Design Conf., Goa, India, Jan. 1999.

[23] T. K. Callaway and E. E. Swartzlander, "Estimating the power consumption of CMOS adders," in Proc. Symp. Computer Arithmetic, Windsor, ON, Canada, 1993, pp. 210–216.

[24] T. K. Callaway, "Area, delay, and power modeling of CMOS adders and multipliers," Ph.D. dissertation, Dept. Elect./Comput. Eng., Univ. Texas, Austin, TX, Dec. 1996.

[25] I. E. Sutherland and R. F. Sproull, "Logical effort: Designing for speed on the back of an envelope," in Proc. Conf. Advanced Research in VLSI, Nov. 1991.

[26] M. Borah, R. M. Owens, and M. J. Irwin, "Transistor sizing for minimizing power consumption of CMOS circuits under delay constraint," in Proc. Int. Symp. Low Power Design, Dana Point, CA, Apr. 1995, pp. 167–172.

[27] N. Menezes, R. Baldick, and L. T. Pileggi, "A sequential quadratic programming approach to concurrent gate and wire sizing," in Proc. Int. Conf. Computer-Aided Design, Nov. 1995, pp. 144–151.

[28] J. Cong, C.-K. Koh, and K.-S. Leung, "Simultaneous driver and wire sizing for performance and power optimization," IEEE Trans. VLSI Syst., vol. 2, pp. 408–425, Dec. 1994.

[29] R. K. Krishnamurthy and L. R. Carley, "Exploring the design space of mixed-swing QuadRail for low-power digital circuits," IEEE Trans. VLSI Syst., vol. 5, pp. 388–400, Dec. 1997.

[30] L. R. Carley and I. Lys, "QuadRail: A design methodology for low-power ICs," IEEE Trans. VLSI Syst., vol. 2, pp. 383–390, Dec. 1994.

[31] M. Miyazaki, H. Mizuno, and K. Ishibashi, "A delay distribution squeezing scheme with speed-adaptive threshold-voltage CMOS (SA-Vt CMOS) for low voltage LSIs," in Proc. Int. Symp. Low Power Electronics Design, Monterey, CA, Aug. 1998, pp. 48–53.

[32] X. Tang, V. De, and J. Meindl, "Effects of random MOSFET parameter fluctuations on total power consumption," in Proc. Int. Symp. Low Power Electronics Design, 1996, pp. 233–236.

[33] M. Orshansky, J. Chen, C. Hu, C.-P. Wan, and P. Bendix, "Direct sampling methodology for statistical analysis of scaled CMOS technologies," IEEE Trans. Semiconduct. Manufact., vol. 12, 1999.

Power Estimation for Large Sequential Circuits

Joseph N. Kozhaya and Farid N. Najm

Abstract: A power estimation approach is presented in which blocks of consecutive vectors are selected at random from a user-supplied realistic input vector set and the circuit is simulated for each block starting from an unknown state. This leads to two (upper and lower) bounds on the desired power value which can be quite tight (under 10% difference between the two in many cases). As a result, the power dissipation is obtained by simulating only a fraction of the potentially very large vector set.

Index Terms: Finite-state machine (FSM), power estimation, sequential circuit.

I. INTRODUCTION

Maximizing circuit speed and minimizing chip area used to be the only major concerns of VLSI designers. In recent years, power consumption of integrated circuits (ICs) has proved to be just as important a concern. Thus, VLSI designs nowadays emerge as a tradeoff among three goals: minimum area, maximum speed, and minimum power dissipation.

Manuscript received April 10, 1999; revised August 30, 2000.

J. N. Kozhaya is with the Electrical and Computer Engineering Department, University of Illinois, Urbana-Champaign, IL 61801 USA.

F. N. Najm is with the Electrical and Computer Engineering Department, University of Toronto, Toronto, ON M5S 3G4, Canada.

Publisher Item Identifier S 1063-8210(01)00692-8.

Power dissipation is a major concern of the semiconductor industry. This is because excessive power dissipation causes overheating, which may lead to soft errors or permanent damage. It also limits battery life in portable equipment. Thus, there is a need to accurately estimate the power dissipation of an IC during the design phase. We should note that by "power estimation" we refer to the problem of average power estimation. This is different from the estimation of the worst case instantaneous power. Chip reliability and equipment lifetime are directly related to the average power.

Several approaches have been proposed for power estimation [1], especially for estimation at the gate-level. However, even at the gate-level, the problem is not yet completely solved. At least two open problems remain: 1) accurate and fast estimation of the average power dissipated by individual gates, typically inside an optimization loop and 2) accurate and fast estimation of the total average power dissipation in large sequential circuits. The words "accurate" and "fast" are emphasized in both cases to indicate that existing techniques are either inaccurate and fast or accurate and slow. The fact that the first problem is not yet solved has been clearly illustrated in [2]. In this paper, we will argue and demonstrate that the second problem is also still open, and we offer a new method which provides accurate and fast estimation of the total average power of large sequential circuits.

Since the power is pattern-dependent, the average power dissipation of a circuit is not well-defined until a specific vector set is chosen. For combinational circuits, this may not be very critical, because different vector sets may dissipate approximately the same power, provided they have approximately equal values of switching activity. Thus, using a set of randomly generated vectors (with the right statistics) may be appropriate for these circuits. However, this does not hold for sequential circuits because a real vector set (as opposed to a randomly generated, artificial vector set) may contain specific vector sequences that put the circuit in specific operational modes or subspaces of its large state space and, in different operational modes, the circuit may dissipate quite different values of power. All one has to do is think of all the many different operational modes of a large microprocessor. Thus, for sequential circuits, the power may be critically dependent on the specific vector sequences that occur during typical operation.

Most existing techniques of power estimation consider simply the average switching activity and signal probability of the input signals and use either static probability propagation methods [3]–[6] or dynamic Monte Carlo simulation using randomly generated vectors [7], [8]. In either case, one runs the risk of taking the circuit into parts of its state space where it does not belong, i.e., into modes of operation that are unrealistic and may never be exercised in practice. When this happens, there is no guarantee that the estimated power has any relation to what the circuit will actually dissipate under typical operation.

To illustrate this problem, we have considered a number of sequential circuits and constructed two sets of input vectors for each. Both sets of vectors have the same switching activity and signal probability for each input node. However, in one vector set, the input signals were generated at random, without any correlation between them, and in the other nonzero correlations were considered, both in space (between pairs of bits in the same vector) and in time (between pairs of consecutive vectors). The intention is that these correlations would mimic to some degree the relationships that typically exist between signals, such as signals resulting from decoded instructions or general control signals. Note that these correlations are only the simplest kinds of correlation relations because they do not model the temporal correlations that can exist in vector streams over several clock cycles. We emphasize
