2001 Low-Power CMOS With Subvolt Supply Voltages


    394 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 9, NO. 2, APRIL 2001

    obtained over the 1000 experiments are presented in Table II for four of the benchmark circuits (similar trends were observed for the other circuits). We can see that our dual-threshold selection heuristic performs much better than a random selection of gates. By repeating the random experiments a large number of times (1000), we attain a high level of confidence that our algorithms can provide a very fast and near-optimal solution, as opposed to randomized optimization algorithms like simulated annealing.

    VI. CONCLUSION

    We have demonstrated a new approach to low power optimization of digital static CMOS circuits for dual-threshold voltage manufacturing processes. The algorithms developed allow the designer to assign one of two threshold voltages to all the gates in the circuit. The assignment is performed in such a way that subsequent optimization for low power operation yields a significant reduction in the total power consumption of the circuit. Experiments were conducted on several ISCAS89 benchmark circuits, and the results indicate that a significant improvement in power consumption over single high-$V_T$ circuits can be achieved. The algorithm is fast and typically completes in a few CPU seconds.


    Low-Power CMOS with Subvolt Supply Voltages

    Mircea R. Stan

    Abstract: We first present a circuit taxonomy along the space and time dimensions, which is useful for classifying generic low-power techniques, followed by an analysis of optimal power supply and threshold voltages and transistor sizing for minimizing the energy-delay product of a class of complementary metal-oxide-semiconductor (CMOS) digital circuits.

    Index Terms: Digital complementary metal-oxide-semiconductor (CMOS) VLSI, low-power design, low voltage, power consumption model.

    I. INTRODUCTION

    Power consumption in complementary metal-oxide-semiconductor (CMOS) has two components: ac (dynamic) power that varies with operating frequency and dc (static) power that is independent of frequency [1]–[3]. The two major sources of dynamic power are the capacitive current for charging and discharging load capacitances and the short circuit (or overlap) current [4]. When the supply voltage is aggressively scaled down, the percentage of short circuit power becomes smaller and tends to zero as $V_{DD}$ gets close to $V_T$ [5]. The two major sources of static power are the subthreshold current [6], [7] and the junction leakage current. In deep submicron technologies the junction leakage becomes negligible compared to the subthreshold current, but other leakage phenomena like gate oxide tunneling and gate induced drain leakage (GIDL) are likely to become important [8], [9].

    Although recognized as an important method to reduce power [1], scaling the power supply voltage has been historically driven by reliability concerns (gate oxide breakdown voltage and leakage) and not by power reduction strategies. The SIA Technology Roadmap [10], [11] predicts a $V_{DD}$ of [...] V to [...] V in the year 2009 for a 70 nm technology, and a $V_{DD}$ of [...] V to [...] V in the year 2012 for a 50 nm technology. In what follows we show that a $V_{DD}$ as low as 0.8 V should be used for low-power circuits even with current 0.25 and 0.18 µm processes, as it provides the optimum energy-delay product for the design.

    A. Figures of Merit for Low-Power Design

    The classic two-dimensional VLSI design space tries to minimize the circuit area $A$ and delay $T$ in order to reduce cost and improve performance, by using optimizations with objective functions such as $A$, $AT$, and $AT^2$ [12]. The new emphasis on low power adds a third dimension (power) to the previously two-dimensional design space [13], but, except for a few cases [14], most of the research in low-power design is still two-dimensional, with objective functions such as $P$ (power), $PT$ (energy), and $PT^2$ (energy-delay product).¹ The power $P$ itself is a poor candidate for optimization as it can always be lowered trivially by reducing the clock frequency. The energy $PT$ is an appropriate figure of merit for applications without stringent performance requirements, but, when performance is critical, the energy-delay product $PT^2$ is a good compromise between the need to reduce power and the need to still operate at reasonable speed [15].
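The difference between the three low-power figures of merit can be illustrated with a small numerical sketch. It assumes the textbook dynamic-power expression (switched capacitance times supply voltage squared times clock frequency) and a task of a fixed number of clock cycles; the function name and all numeric values are hypothetical and are not taken from the paper.

```python
def figures_of_merit(c_eff, v_dd, f_clk, cycles):
    """Return (power, energy, energy-delay product) for a fixed-work task.

    Assumption: dynamic power only, P = c_eff * v_dd**2 * f_clk.
    """
    power = c_eff * v_dd ** 2 * f_clk   # P, in watts
    delay = cycles / f_clk              # T, time to finish the task, in seconds
    energy = power * delay              # PT, in joules
    edp = energy * delay                # PT^2, in joule-seconds
    return power, energy, edp


# Same circuit and supply voltage, clock halved (illustrative numbers).
fast = figures_of_merit(c_eff=1e-9, v_dd=1.0, f_clk=100e6, cycles=1000)
slow = figures_of_merit(c_eff=1e-9, v_dd=1.0, f_clk=50e6, cycles=1000)

print("power ratio  (slow/fast):", slow[0] / fast[0])
print("energy ratio (slow/fast):", slow[1] / fast[1])
print("EDP ratio    (slow/fast):", slow[2] / fast[2])
```

Under these assumptions, halving the clock halves the power, leaves the energy per task unchanged, and doubles the energy-delay product, which is why power alone is a misleading objective.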

    Manuscript received February 20, 1999; revised September 23, 1999. This work was supported in part by NSF CAREER Award MIP-9703440.

    The author is with the Electrical Engineering Department, University of Virginia, Charlottesville, VA 22903 USA (e-mail: [email protected]).

    Publisher Item Identifier S 1063-8210(01)00699-0.

    ¹ The notation $PT$ for energy and $PT^2$ for energy-delay product is used to underscore the replacement of area by power in the classic $AT$ and $AT^2$ figures of merit.

    1063-8210/01$10.00 © 2001 IEEE


    Fig. 1. for and and .

    Replacing (5) in (4) we obtain the desired result

    (6)

    This provides a simple voltage scaling rule-of-thumb for optimal energy-delay product simply by looking at the ratio of ac to dc power. A large ratio suggests that more parallelism should be used to reduce dynamic power, while a small ratio implies that a more serial implementation will be optimal by reducing leakage.

    C. Variation of $\alpha$ with $V_{DD}$

    Until now a constant $\alpha$ was assumed, but at very low $V_{DD}$ the short-channel effects become less critical and $\alpha$ approaches the ideal value of 2. In order to observe the variation of $\alpha$ with supply voltage for a given technology, we extracted an approximate value numerically from HSPICE simulations using BSIM3v3 [21] models. This extraction was done by curve fitting the simulated characteristics for different $V_{DD}$ values to the $\alpha$-power law equation [20]. Fig. 2 shows that indeed, as the voltage gets into the sub-volt range, $\alpha$ [...], and $\alpha$ [...] for the optimal $V_{DD}$, as was previously determined in [3]. It should be understood though that, as the technology scales to the nanometer range, the devices become velocity saturated at even smaller supply voltages, which means that $\alpha$ again approaches 1, the velocity-saturated limit, even for low $V_{DD}$ values.
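The extraction step described above can be sketched as a log-log least-squares fit. This is a minimal illustration, not the author's procedure: it assumes the saturation-current form of the alpha-power law, I = K (V_GS - V_T)^alpha, a known threshold voltage, and synthetic data standing in for HSPICE output. The function name and all values are hypothetical.

```python
import math


def fit_alpha(v_gs, i_d, v_t):
    """Fit I = K * (V_GS - V_T)**alpha by linear regression in log-log space.

    Returns (alpha, K). Every V_GS sample must exceed v_t.
    """
    xs = [math.log(v - v_t) for v in v_gs]
    ys = [math.log(i) for i in i_d]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = s_xy / s_xx                        # slope of the log-log line
    k = math.exp(mean_y - alpha * mean_x)      # intercept gives log(K)
    return alpha, k


# Synthetic stand-in for simulated currents (assumed alpha = 1.5, K = 2e-4).
V_T = 0.3
v_gs_samples = [0.5 + 0.05 * i for i in range(11)]
i_d_samples = [2e-4 * (v - V_T) ** 1.5 for v in v_gs_samples]

alpha_fit, k_fit = fit_alpha(v_gs_samples, i_d_samples, V_T)
print(alpha_fit, k_fit)
```

Repeating such a fit over windows of data taken at different supply voltages would give one estimate of the exponent per supply voltage, which is the kind of curve Fig. 2 reports.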

    When both $V_{DD}$ and $V_T$ are aggressively scaled we cannot ignore either of the two terms in (2) and [...]. This forces us to revisit the first order results presented in the previous sections. The results concerning [...] do not change drastically with [...], as was seen in Fig. 1, but the optimal supply voltage becomes [...] V, which is larger than all previously reported first order analytical results [3], [16], [22].

    When [...], the optimum ratio of ac and dc power is also larger than the value computed in the previous section. Numerically, for [...] and [...], the optimal ratio is [...], which again is much larger than previously published results.
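The joint optimization of supply and threshold voltage can be explored numerically with a brute-force grid search. The sketch below is only an illustration under stated assumptions: it uses textbook forms (alpha-power law delay, exponential subthreshold leakage, capacitive switching energy) rather than the paper's equations (1)-(6), and every constant is a hypothetical placeholder rather than a value from Table I.

```python
import math

# Illustrative parameters (assumptions, not the paper's values).
C_GATE = 10e-15    # effective switched capacitance per gate (F)
K_DRIVE = 1e-4     # drive-current coefficient (A / V**ALPHA)
ALPHA = 1.5        # alpha-power law exponent
I_ZERO = 1e-6      # zero-threshold leakage current (A)
SWING = 0.1        # subthreshold swing (V per decade)
ACTIVITY = 0.1     # switching activity
DEPTH = 20         # logic depth (gate delays per clock cycle)
STEP = 0.01        # grid resolution (V)


def energy_delay(v_dd, v_t):
    """Energy-delay product of one gate over one clock cycle (J*s)."""
    if v_dd <= v_t:
        return math.inf  # no drive current in this simple model
    t_gate = C_GATE * v_dd / (K_DRIVE * (v_dd - v_t) ** ALPHA)
    e_ac = ACTIVITY * C_GATE * v_dd ** 2
    # The gate leaks during the whole cycle, taken as DEPTH gate delays.
    e_dc = I_ZERO * 10 ** (-v_t / SWING) * v_dd * DEPTH * t_gate
    return (e_ac + e_dc) * t_gate


def grid_search():
    """Scan V_DD in [0.2, 1.5] V and V_T in [0.0, 0.6] V on a STEP grid."""
    best = (math.inf, None, None)
    for i in range(131):
        v_dd = 0.2 + STEP * i
        for j in range(61):
            v_t = STEP * j
            edp = energy_delay(v_dd, v_t)
            if edp < best[0]:
                best = (edp, v_dd, v_t)
    return best


best_edp, best_vdd, best_vt = grid_search()
print("V_DD = %.2f V, V_T = %.2f V, EDP = %.3e J*s" % (best_vdd, best_vt, best_edp))
```

A grid search is used here because it is easy to verify; with only two variables and a smooth objective, a gradient-based optimizer would also work. Changing `ACTIVITY` or `DEPTH` shifts the balance between the ac and dc terms and therefore moves the optimum.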

    These analytical results have been verified through simulation. A custom designed carry-lookahead adder [23], [24] was simulated with HSPICE using BSIM3v3 models [21] for a current 0.25 µm CMOS technology. The simulation results are a good confirmation of the analytical values, especially concerning the [...] to [...] ratio. The optimum values obtained from the simulation are [...] mV and [...] V for a low activity case, and [...] mV and [...] V for a high activity case. The optimal [...] values from simulation are indeed very close to [...], but the accuracy of [...] is not as good. This is due to the low logic depth of the carry-lookahead adder and to the fact that it is not truly a homogeneous and stationary circuit. This results in a small value for [...], which places the optimization on the steep part of the curve in Fig. 1.

    D. Sizing for Minimum Energy-Delay Product

    Scaling the voltages has assumed an effective capacitive load as a function of technology, switching activity, ratio of gate to parasitic and interconnect capacitance, fanout, etc. Here we show analytically (the result was also shown graphically in [15]) that the optimal transistor sizing for minimum energy-delay product is obtained when the transistor capacitance equals the interconnect capacitance. Furthermore, this optimal sizing is independent of voltage scaling, hence it can be done in parallel with optimizing $V_{DD}$ and $V_T$.

    The circuit model for this section includes the fanout and an explicit interconnect capacitance as in Fig. 3. From Section III-A we use the fact that the optimal ratio of ac to dc power is a fixed fraction. Equation (1) can thus be approximated as

    (7)

    and can be rewritten such that transistor sizes become explicit (see the Appendix)

    with


    Fig. 2. $\alpha$ as a function of supply voltage.

    where

    [...] the switching activity;

    [...] the fanout;

    [...] the ratio of the width of the PMOS to the width of the NMOS;

    [...] the logical effort [25];

    [...] the ratio of parasitic (e.g., diffusion, short-circuit equivalent, etc.) capacitance to gate capacitance;

    [...] and [...] the transistor and interconnect capacitances, respectively.

    [...] can be replaced in (7) by denoting as $w_{int}$ the equivalent transistor width that has the same capacitance as the average wire

    (8)

    with [...]

    By taking the derivative of (8) with respect to the transistor width $w$ and setting it to zero we obtain the optimum sizing as $w = w_{int}$. This means that the optimal sizing is such that the transistor capacitance is equal to the interconnect capacitance. This result is totally independent of voltage scaling and partially contradicts the common wisdom to use minimum size transistors in low-power design. Assuming an average interconnect capacitance of 40 fF [17], it follows that the optimal transistor width should be [...] µm, which is many (about 30) times larger than the minimum size in a 0.25 µm technology.
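The sizing result can be checked numerically with a deliberately simplified stand-in for (8), whose exact form did not survive in this transcript. The sketch assumes that energy is proportional to the total switched capacitance (transistor plus wire) and that delay is proportional to that capacitance divided by the drive strength, which scales with transistor width. The names `w` and `w_int` and the numeric values are illustrative.

```python
def edp_vs_width(w, w_int):
    """Energy-delay product up to a constant factor.

    Assumed model: energy ~ (w + w_int), delay ~ (w + w_int) / w,
    where w_int is the width whose gate capacitance equals the wire capacitance.
    """
    return (w + w_int) ** 2 / w


W_INT = 10.0                                   # arbitrary units
widths = [0.1 * i for i in range(1, 1001)]     # candidate widths 0.1 .. 100
w_opt = min(widths, key=lambda w: edp_vs_width(w, W_INT))

# Cost of a transistor 30 times narrower than the optimum, relative to the optimum.
undersized_penalty = edp_vs_width(W_INT / 30.0, W_INT) / edp_vs_width(W_INT, W_INT)

print("optimal width:", w_opt)
print("EDP penalty of a 30x narrower device:", undersized_penalty)
```

Setting the derivative of (w + w_int)^2 / w to zero gives 2w = w + w_int, so the minimum sits at w = w_int, in agreement with the statement that transistor and interconnect capacitance should match. No supply or threshold voltage appears in this expression, which is consistent with the claim that sizing can be optimized independently of voltage scaling.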

    The result above is generic in nature and should be used only as a starting point for more detailed optimizations. One obvious missing item is the variation of the average interconnect capacitance (which is considered fixed here) as a function of increasing transistor sizes. A more detailed analysis could also consider the effect of the PMOS to NMOS transistor ratio, the effect of rise time and fall time on short-circuit power [26], or even the simultaneous sizing of transistors and wires [27], [28].

    E. Buffered Design for Optimal Energy-Delay Product

    As the optimal sizing in the presence of interconnect parasitics is much larger than minimum size, it is natural to examine the effect of a buffered circuit style which can drive the interconnect more efficiently, as in Fig. 4. A buffered circuit style is widely used in dynamic Domino circuits and was proposed for static logic as QuadRail [29], [30].

    First we rewrite the equivalent of (8) by assuming that the effect of the large interconnect capacitance is only seen at the output buffer

    (9)

    where [...] is the sizing for the logic gate and [...] is the sizing for the buffer, the fanout only affects the buffer, and the logical effort for the buffer is 1.

    Taking the derivative with respect to [...] and [...] and setting it to zero, we obtain the optimal values

    [...]

    where [...] was the optimal size for the unbuffered case. The first observation is that for [...] (no fanout) and [...] (no logical effort) the optimal sizes are [...] and [...]. The energy-delay product in this case remains the same as in the unbuffered case.

    When [...] and [...] are [...] the savings can be quite large. For example, for [...] and [...], [...] and [...], which leads to a saving of 87% in energy-delay product.


    Fig. 3. Circuit with average fanout, average interconnect load, and logic depth.

    Fig. 4. Circuit with logic and buffer stages.

    Fig. 5. Constant energy-delay curves for [...].

    F. Parameter Variations

    Until now the analysis assumed that all parameters ($V_{DD}$, $V_T$, sizes, etc.) can be precisely controlled; in reality there are always going to be parameter variations due to process, temperature, etc. The inter-die variations can be compensated by back-biasing [31], but intra-die random variations [32] represent a bigger problem which may limit the effectiveness of aggressive threshold scaling in the future. Energy-delay curves as in Fig. 5 show a nearly flat region near the minimum for voltages slightly larger than the optimum values and a steep increase for smaller values [16]. This suggests a conservative approach: optimize for the worst case, such that the variations due to process and temperature will only make the actual values larger than optimum. This makes the typical design suboptimal, but for small parameter variations the increase will not be significant. A variation of 25% in [...], for example, leads to an increase of 20% in energy-delay for the values in the Appendix. A better approach could use statistical methods [33] and let the voltages also be smaller than optimum, but with a small probability, and thus bring the typical case closer to the optimum.
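The asymmetry that motivates the conservative approach can be reproduced with a deliberately reduced model. The sketch below ignores leakage altogether and keeps only the switching energy and an alpha-power law delay, so it is not the model behind Fig. 5; the threshold voltage, the exponent, and the function names are assumptions chosen for illustration.

```python
def edp_dynamic_only(v_dd, v_t, alpha):
    """Energy-delay product up to a constant factor, leakage ignored."""
    energy = v_dd ** 2                        # switching energy ~ C * V_DD^2
    delay = v_dd / (v_dd - v_t) ** alpha      # alpha-power law gate delay
    return energy * delay


def penalty(deviation, v_t=0.2, alpha=1.5):
    """Relative EDP increase when V_DD misses its optimum by `deviation`.

    In this reduced model d(EDP)/dV_DD = 0 gives V_DD = 3 * V_T / (3 - alpha).
    """
    v_opt = 3.0 * v_t / (3.0 - alpha)
    at_optimum = edp_dynamic_only(v_opt, v_t, alpha)
    off_optimum = edp_dynamic_only(v_opt * (1.0 + deviation), v_t, alpha)
    return off_optimum / at_optimum - 1.0


below = penalty(-0.25)   # supply 25% lower than optimum
above = penalty(+0.25)   # supply 25% higher than optimum
print("EDP penalty, 25%% low: %.1f%%, 25%% high: %.1f%%" % (100 * below, 100 * above))
```

Even in this reduced model, undershooting the optimum supply costs noticeably more than overshooting it by the same fraction, because the delay grows without bound as the supply approaches the threshold. That is the reason for biasing the design point toward larger voltages.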

    IV. CONCLUSION

    We have presented several new results related to optimal voltages and sizing of CMOS circuits for minimum energy-delay product. These results were presented in the context of a generic circuit classification along space and time which helps understand the limitations and applicability of various low-power techniques.

    TABLE I. TYPICAL PARAMETERS FOR A 0.25-µm CMOS TECHNOLOGY

    APPENDIX

    All the computations in Section III are based on a large number of parameters depending on technology, circuit style, etc. Here some of the values are explained analytically, although some of the choices are more or less arbitrary in order to track published data. Table I summarizes the results.

    [...] and [...] are the capacitance and current for the average gate, while [...] and [...] are the values per micron of transistor width [16]. The value of [...] fF/µm can be justified by using the following formulas and a choice of parameters as in Table I

    [...]

    with

    [...]

    Assuming [...] (PMOS transistor twice as large as NMOS) and [...] (a fudge factor to account for parasitic capacitances besides interconnect and for the short-circuit current), [...]. From Section III-D the transistor capacitance equals the interconnect capacitance, which means that for an average interconnect capacitance of 40 fF [17], average fanout [...] and average switching activity [...], [...] fF [...] fF. The equivalent interconnect transistor width is [...] µm and the optimal transistor width is [...] µm, where [...] fF/µm. Finally we obtain [...] fF/µm as in [16].

    The zero-threshold current [...] A/µm can be computed with the following formulas [7]:

    [...]

    where the carrier mobility is [...] cm²/V·s and [...] µm, which results in the desired [...] A/µm [16].

    [...] and [...] can be computed using the following:

    [...]

    with [...] as another fudge factor to account for the logical effort [25] and the reduced current drive in deep-submicron, compared to an ideal inverter. This results in [...] ps and [...].

    ACKNOWLEDGMENT

    The author would like to thank A. Forestier for help with some of the simulations and T. Callaway for providing the carry-lookahead adder circuit used for simulation in Section III-C.

    REFERENCES

    [1] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, "Low-power CMOS digital design," Proc. IEEE, vol. 81, 1992.

    [2] A. P. Chandrakasan and R. W. Brodersen, "Minimizing power consumption in digital CMOS circuits," Proc. IEEE, vol. 83, pp. 498–523, Apr. 1995.

    [3] J. B. Burr and A. M. Peterson, "Ultra low power CMOS technology," in Proc. NASA VLSI Design Symp., 1991.

    [4] S. Turgis, N. Azemard, and D. Auvergne, "Explicit evaluation of short circuit power dissipation for CMOS logic structures," in Proc. Int. Symp. Low Power Design, Dana Point, CA, Apr. 1995, pp. 129–134.

    [5] A. Alvandpour, P. Ededefors, and C. Svensson, "Separation and extraction of short-circuit power consumption in digital CMOS VLSI circuits," in Proc. Int. Symp. Low Power Electronics Design, Monterey, CA, Aug. 1998, pp. 245–249.

    [6] T. A. Fjeldly and M. Shur, "Threshold voltage modeling and the subthreshold regime of operation of short-channel MOSFET," IEEE Trans. Electron Devices, vol. 40, pp. 137–145, Jan. 1993.

    [7] T. Grotjohn and B. Hoefflinger, "A parametric short-channel MOS transistor model for subthreshold and strong inversion current," IEEE Trans. Electron Devices, pp. 234–246, 1984.

    [8] K. Roy, "Low power design and leakage control techniques for deep submicron ICs," in Tutorial at VLSI Design Conf., Jan. 1999.

    [9] A. Keshavarzi, K. Roy, and C. Hawkins, "Intrinsic IDDQ: Origins, reduction, and applications in deep sub-low-power CMOS ICs," in Proc. Int. Test Conf., 1997, pp. 146–155.

    [10] SIA, "The national technology roadmap for semiconductors," 1997.

    [11] SIA, "The international technology roadmap for semiconductors," 1998.

    [12] J. Ullman, Ed., Computational Aspects of VLSI. Rockville, MD: Comput. Sci. Press, 1984.

    [13] D. Singh, J. M. Rabaey, M. Pedram, F. Catthoor, S. Rajgopal, N. Sehgal, and T. J. Mozdzen, "Power conscious CAD tools and methodologies: A perspective," Proc. IEEE, vol. 83, pp. 570–594, Apr. 1995.

    [14] C. Chen and C. Tsui, "Toward the capability of providing power-area-delay trade-off at the register transfer level," in Proc. Int. Symp. Low Power Electronics Design, Monterey, CA, Aug. 1998, pp. 24–29.

    [15] M. Horowitz, T. Indermaur, and R. Gonzalez, "Low-power digital design," in Proc. Symp. Low Power Electronics, Oct. 1994, pp. 8–11.

    [16] R. Gonzales, B. M. Gordon, and M. A. Horowitz, "Supply and threshold voltage scaling for low-power CMOS," IEEE J. Solid-State Circuits, vol. 32, pp. 1210–1216, Aug. 1997.

    [17] A. J. Bhavnagarwala, B. Austin, and J. D. Meindl, "Minimum supply voltage for bulk Si CMOS GSI," in Proc. Int. Symp. Low Power Electronics Design, Monterey, CA, Aug. 1998.

    [18] D. Liu and C. Svensson, "Trading speed for low-power by choice of supply and threshold voltages," IEEE J. Solid-State Circuits, vol. 28, pp. 10–17, Jan. 1993.

    [19] C. Svensson and A. Alvandpour, "Low power and low voltage CMOS digital circuit techniques," in Proc. Int. Symp. Low Power Electronics Design, Monterey, CA, Aug. 1998, pp. 7–10.

    [20] T. Sakurai and R. Newton, "Alpha-power law MOSFET model and its applications to CMOS inverter delay and other formulas," IEEE J. Solid-State Circuits, vol. 25, pp. 584–594, Apr. 1990.

    [21] U. Berkeley. (1997) BSIM3v3.1 SPICE MOS device model. [Online]. Available: http://www-device.EECS.Berkeley.EDU/bsim3/.

    [22] M. R. Stan, "Optimal voltages and sizing for low power," in Proc. VLSI Design Conf., Goa, India, Jan. 1999.

    [23] T. K. Callaway and E. E. Swartzlander, "Estimating the power consumption of CMOS adders," in Proc. Symp. Computer Arithmetic, Windsor, ON, Canada, 1993, pp. 210–216.

    [24] T. K. Callaway, "Area, delay, and power modeling of CMOS adders and multipliers," Ph.D. dissertation, Dept. Elect./Comput. Eng., Univ. Texas, Austin, TX, Dec. 1996.

    [25] I. E. Sutherland and R. F. Sproull, "Logical effort: Designing for speed on the back of an envelope," in Proc. Conf. Advanced Research in VLSI, Nov. 1991.

    [26] M. Borah, R. M. Owens, and M. J. Irwin, "Transistor sizing for minimizing power consumption of CMOS circuits under delay constraint," in Proc. Int. Symp. Low Power Design, Dana Point, CA, Apr. 1995, pp. 167–172.

    [27] N. Menezes, R. Baldick, and L. T. Pileggi, "A sequential quadratic programming approach to concurrent gate and wire sizing," in Proc. Int. Conf. Computer-Aided Design, Nov. 1995, pp. 144–151.

    [28] J. Cong, C.-K. Koh, and K.-S. Leung, "Simultaneous driver and wire sizing for performance and power optimization," IEEE Trans. VLSI Syst., vol. 2, pp. 408–425, Dec. 1994.

    [29] R. K. Krishnamurthy and L. R. Carley, "Exploring the design space of mixed-swing QuadRail for low-power digital circuits," IEEE Trans. VLSI Syst., vol. 5, pp. 388–400, Dec. 1997.

    [30] L. R. Carley and I. Lys, "QuadRail: A design methodology for low-power ICs," IEEE Trans. VLSI Syst., vol. 2, pp. 383–390, Dec. 1994.

    [31] M. Miyazaki, H. Mizuno, and K. Ishibashi, "A delay distribution squeezing scheme with speed-adaptive threshold-voltage CMOS (SA-Vt CMOS) for low voltage LSIs," in Proc. Int. Symp. Low Power Electronics Design, Monterey, CA, Aug. 1998, pp. 48–53.

    [32] X. Tang, V. De, and J. Meindl, "Effects of random MOSFET parameter fluctuations on total power consumption," in Proc. Int. Symp. Low Power Electronics Design, 1996, pp. 233–236.

    [33] M. Orshansky, J. Chen, C. Hu, C.-P. Wan, and P. Bendix, "Direct sampling methodology for statistical analysis of scaled CMOS technologies," IEEE Trans. Semiconduct. Manufact., vol. 12, 1999.

    Power Estimation for Large Sequential Circuits

    Joseph N. Kozhaya and Farid N. Najm

    Abstract: A power estimation approach is presented in which blocks of consecutive vectors are selected at random from a user-supplied realistic input vector set and the circuit is simulated for each block starting from an unknown state. This leads to two (upper and lower) bounds on the desired power value which can be quite tight (under 10% difference between the two in many cases). As a result, the power dissipation is obtained by simulating only a fraction of the potentially very large vector set.

    Index Terms: Finite-state machine (FSM), power estimation, sequential circuit.

    I. INTRODUCTION

    Maximizing circuit speed and minimizing chip area used to be the only major concerns of VLSI designers. In recent years, power consumption of integrated circuits (ICs) has proved to be just as important a concern. Thus, VLSI designs nowadays emerge as a tradeoff among three goals: minimum area, maximum speed, and minimum power dissipation.

    Manuscript received April 10, 1999; revised August 30, 2000.

    J. N. Kozhaya is with the Electrical and Computer Engineering Department, University of Illinois, Urbana-Champaign, IL 61801 USA.

    F. N. Najm is with the Electrical and Computer Engineering Department, University of Toronto, Toronto, ON M5S 3G4, Canada.

    Publisher Item Identifier S 1063-8210(01)00692-8.

    Power dissipation is a major concern of the semiconductor industry. This is because excessive power dissipation causes overheating, which may lead to soft errors or permanent damage. It also limits battery life in portable equipment. Thus, there is a need to accurately estimate the power dissipation of an IC during the design phase. We should note that by power estimation we refer to the problem of average power estimation. This is different from the estimation of the worst case instantaneous power. Chip reliability and equipment lifetime are directly related to the average power.

    Several approaches have been proposed for power estimation [1], especially for estimation at the gate-level. However, even at the gate-level, the problem is not yet completely solved. At least two open problems remain: 1) accurate and fast estimation of the average power dissipated by individual gates, typically inside an optimization loop, and 2) accurate and fast estimation of the total average power dissipation in large sequential circuits. The words "accurate" and "fast" are emphasized in both cases to indicate that existing techniques are either inaccurate and fast or accurate and slow. The fact that the first problem is not yet solved has been clearly illustrated in [2]. In this paper, we will argue and demonstrate that the second problem is also still open, and we offer a new method which provides accurate and fast estimation of the total average power of large sequential circuits.

    Since the power is pattern-dependent, the average power dissipation of a circuit is not well-defined until a specific vector set is chosen. For combinational circuits, this may not be very critical, because different vector sets may dissipate approximately the same power, provided they have approximately equal values of switching activity. Thus, using a set of randomly generated vectors (with the right statistics) may be appropriate for these circuits. However, this does not hold for sequential circuits because a real vector set (as opposed to a randomly generated, artificial vector set) may contain specific vector sequences that put the circuit in specific operational modes or subspaces of its large state space and, in different operational modes, the circuit may dissipate quite different values of power. All one has to do is think of all the many different operational modes of a large microprocessor. Thus, for sequential circuits, the power may be critically dependent on the specific vector sequences that occur during typical operation.

    Most existing techniques of power estimation consider simply the average switching activity and signal probability of the input signals and use either static probability propagation methods [3]–[6] or dynamic Monte Carlo simulation using randomly generated vectors [7], [8]. In either case, one runs the risk of taking the circuit into parts of its state space where it does not belong, i.e., into modes of operation that are unrealistic and may never be exercised in practice. When this happens, there is no guarantee that the estimated power has any relation to what the circuit will actually dissipate under typical operation.

    To illustrate this problem, we have considered a number of sequential circuits and constructed two sets of input vectors for each. Both sets of vectors have the same switching activity and signal probability for each input node. However, in one vector set, the input signals were generated at random, without any correlation between them, and in the other, nonzero correlations were considered, both in space (between pairs of bits in the same vector) and in time (between pairs of consecutive vectors). The intention is that these correlations would mimic to some degree the relationships that typically exist between signals, such as signals resulting from decoded instructions or general control signals. Note that these correlations are only the simplest kinds of correlation because they do not model the temporal correlations that can exist in vector streams over several clock cycles. We emphasize
