
Chapter 8

Optimizing Power @ Standby – Circuits and Systems

Jan M. Rabaey


Slide 8.1

Chapter Outline

Why Sleep Mode Management?
Dynamic power in standby
– Clock gating
Static power in standby
– Transistor sizing
– Power gating
– Body biasing
– Supply voltage ramping

Slide 8.2

Arguments for Sleep Mode Management

Many computational applications operate in burst modes, alternating between active and non-active modes
– General-purpose computers, cell phones, interfaces, embedded processors, consumer applications, etc.
Prime concept: Power dissipation in standby should be absolutely minimal, if not zero
Sleep mode management has gained importance with increasing leakage
Clock gating
Leakage elimination

Slide 8.3

Standby Power Was Not a Concern in Earlier Days

Pentium-1: 15 W (5 V - 66 MHz)
Pentium-2: 8 W (3.3 V - 133 MHz)

Floating-point unit and cache powered down when not in use

Processor in idle mode!

[Source: Intel]

Slide 8.4

Dynamic Power – Clock Gating

Turn off clocks to idle modules
– Ensure that spurious activity is set to zero
Must ensure that data inputs to the module are in stable mode
– Primary inputs are from gated latches or registers
– Or, disconnected from interconnect network

Can be done at different levels of system hierarchy

Slide 8.5

Clock Gating

Turning off the clock to non-active components
[Figure: Clk gated by Enable, feeding a register file and a logic module]
Disconnecting the inputs
[Figure: logic module separated from the bus under control of Enable]

Slide 8.6
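The gating arrangement of Slide 8.6 can be mimicked with a small behavioral model. This is only a sketch, assuming a latch-based gating cell (the enable is sampled while the clock is low so that the gated clock cannot glitch); the slides do not specify the cell, and the function names and waveforms are illustrative.

```python
def gated_clock(clk, enable):
    """Return the gated clock for sampled clk/enable waveforms (lists of 0/1).

    Assumed cell: a transparent-low latch on enable followed by an AND gate.
    """
    latched = 0
    out = []
    for c, e in zip(clk, enable):
        if c == 0:          # latch is transparent while the clock is low
            latched = e
        out.append(c & latched)
    return out


def count_rising_edges(wave):
    """Count 0 -> 1 transitions, a proxy for clocked switching activity."""
    return sum(1 for a, b in zip(wave, wave[1:]) if a == 0 and b == 1)


clk    = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
enable = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]   # module idle in the middle
gclk = gated_clock(clk, enable)
print(count_rising_edges(clk), count_rising_edges(gclk))
```

While Enable is low, the module sees no clock edges, so its registers and the clock wiring behind the gate stop switching.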

Clock Gating Efficiently Reduces Power

90% of FFs clock-gated.
70% power reduction by clock gating alone.

[Figure: power (mW) per block (DSP/HIF, DEU, MIF, VDE, 896 Kb SRAM) of an MPEG-4 decoder; total 30.6 mW without clock gating, 8.5 mW with clock gating]

© IEEE 2002 [Ref: M. Ohashi, ISSCC’02]

Slide 8.7
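As a quick check of the quoted saving, the two totals that survive from the chart of Slide 8.7 (30.6 mW and 8.5 mW, taken here to be the totals without and with clock gating) can be compared directly. The helper name is illustrative.

```python
def power_reduction(p_before_mw, p_after_mw):
    """Fractional power saving when going from p_before_mw to p_after_mw."""
    return (p_before_mw - p_after_mw) / p_before_mw


saving = power_reduction(30.6, 8.5)
print(f"{saving:.1%}")
```

The result is a little above 72%, in line with the "70% power reduction" stated on the slide.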

Clock Gating

Challenges to skew management and clock distribution (load on clock network varies dynamically)
Fortunately, state-of-the-art design tools are starting to do a better job
– For example, physically aware clock gating inserts gaters in the clock tree based on timing constraints and physical layout

[Figure: clock trees with clock gaters (CG) placed at different levels; labels: "Simpler skew management, less area" and "Power savings"]

Slide 8.8

Clock Hierarchy and Clock Gating

Example: Clock distribution of dual-core Intel Montecito processor

“Gaters” provided at lower clock-tree levels
Automatic skew compensation

[Ref: T. Fischer, ISSCC’05]

© IEEE 2005

Slide 8.9

Trade-Off Between Sleep Modes and Sleep Time

Typical operation modes
Active mode: normal processing
Standby mode: fast resume, high passive power
Sleep mode: slower resume, low passive power

Resume time from clock gating is determined by the time it takes to turn on the clock distribution network

Standby options:
– Just gate the clock to the module in question
– Turn off phase-locked loop(s)
– Turn off clock completely

Slide 8.10

Sleep Modes in μProcessors and μControllers

[Ref: S. Gary, Springer’95]

TI MSP430™
• 0.1-μA power down
• 0.8-μA standby
• 250-μA/MIPS @ 3 V
From standby to active in 1 μs using dual clock system

[Ref: TI’06]

Slide 8.11
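To see why the standby figure matters, the MSP430 numbers of Slide 8.11 (0.8 μA in standby, 250 μA/MIPS at 3 V) can be combined into an average supply current for a duty-cycled workload. This is a rough sketch: the duty cycles and the 1 MIPS operating point are hypothetical, and wake-up overhead is ignored.

```python
def average_current_ua(duty_cycle, mips=1.0,
                       active_ua_per_mips=250.0, standby_ua=0.8):
    """Average supply current (uA) for a given fraction of time spent active."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    active_ua = active_ua_per_mips * mips
    return duty_cycle * active_ua + (1.0 - duty_cycle) * standby_ua


# Hypothetical duty cycles, from always-on down to 0.1% active.
for d in (1.0, 0.1, 0.01, 0.001):
    print(f"duty {d:>6.1%}: {average_current_ua(d):8.3f} uA")
```

At low duty cycles the average approaches the standby current, which is why standby current dominates the battery life of bursty applications.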

The Standby Design Exploration Space

[Figure: wake-up delay versus standby power for the Standby, Sleep, Nap, and Doze modes]

Trade-off between different operational modes
Should blend smoothly with runtime optimizations

Slide 8.12

[Ref: T. Simunic, Kluwer’02]

Also the Case for Peripheral Devices

Wireless LAN Card
[Figure not recovered]

Hard disk   Psleep (W)   Pactive (W)   Tsleep (s)   Tactive (s)
IBM         0.75         3.48          0.51         6.97
Fujitsu     0.13         0.95          0.67         1.61

Slide 8.13
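The hard-disk figures of Slide 8.13 suggest a break-even calculation: how long must the idle period be before sleeping saves energy? The sketch below assumes that Tsleep and Tactive are the times needed to enter and leave the sleep state, and it needs a transition power that the slides do not give; here it is set equal to the active power purely as a placeholder. All function and variable names are illustrative.

```python
def break_even_idle_time(p_active, p_sleep, t_transition, p_transition):
    """Idle time (s) beyond which sleeping costs less energy than staying active.

    Staying active for T costs p_active * T. Sleeping costs
    p_transition * t_transition + p_sleep * (T - t_transition).
    Setting the two equal and solving for T gives the break-even point.
    """
    if p_active <= p_sleep:
        raise ValueError("sleep mode must draw less power than active mode")
    t_be = t_transition * (p_transition - p_sleep) / (p_active - p_sleep)
    # The device cannot be asleep for less than the transition time itself.
    return max(t_be, t_transition)


# (P_sleep W, P_active W, T_sleep s, T_active s) from the hard-disk table
disks = {"IBM": (0.75, 3.48, 0.51, 6.97), "Fujitsu": (0.13, 0.95, 0.67, 1.61)}
for name, (p_sleep, p_active, t_sleep, t_active) in disks.items():
    t_tr = t_sleep + t_active
    # Assumption: transitions draw the active power (not given in the slides).
    t_be = break_even_idle_time(p_active, p_sleep, t_tr, p_active)
    print(f"{name}: sleep pays off for idle periods longer than {t_be:.2f} s")
```

With a transition power above the active power (spin-up, for example), the break-even time moves further out, which is the trade-off between sleep depth and sleep time discussed on the preceding slides.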

The Leakage Challenge – Power in Standby

With clock gating employed in most designs, leakage power has become the dominant standby power source
With no activity in module, leakage power should be minimized as well
– Remember constant ratio between dynamic and static power …

Challenge – how to disable unit most effectively given that no ideal switches are available

Slide 8.14

Standby Static Power Reduction Approaches

Transistor stacking
Power gating
Body biasing
Supply voltage ramping

Slide 8.15

Transistor Stacking

Off-current reduced in complex gates (see leakage power reduction @ design time)
Some input patterns more effective than others in reducing leakage
Effective standby power reduction strategy:
– Select input pattern that minimizes leakage current of combinational logic module
– Force inputs of module to correspond to that pattern during standby

Pros: Little overhead, fast transition
Con: Limited effectiveness

Slide 8.16
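The two-step strategy of Slide 8.16 (select the minimum-leakage input pattern, then force it during standby) can be illustrated with an exhaustive search over a toy module. The leakage table and the two-gate netlist below are hypothetical and serve only to show the selection step.

```python
from itertools import product

# Hypothetical leakage (nA) of a 2-input NAND for each input state (a, b).
# Both inputs low turns off two stacked NMOS devices, so it leaks least.
NAND2_LEAK_NA = {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 3.0, (1, 1): 9.0}


def nand(a, b):
    """Logic value of a 2-input NAND."""
    return 1 - (a & b)


def module_leakage(x0, x1, x2):
    """Leakage of a toy module: g1 = NAND(x0, x1), g2 = NAND(g1, x2)."""
    g1 = nand(x0, x1)
    return NAND2_LEAK_NA[(x0, x1)] + NAND2_LEAK_NA[(g1, x2)]


def best_standby_vector():
    """Exhaustively search all 3-bit input vectors for the minimum leakage."""
    return min(product((0, 1), repeat=3), key=lambda v: module_leakage(*v))


vector = best_standby_vector()
print(vector, module_leakage(*vector))
```

Exhaustive search grows exponentially with the number of inputs, so a real module would need a heuristic or sampling-based search instead.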

Transistor Stacking

[Figure: combinational module between two banks of latches; the latches are controlled by Clk and Standby]

[Ref: S. Narendra, ISLPED’01]

Slide 8.17

Forced Transistor Stacking

Useful for reducing leakage in non-critical shallow gates (in addition to high VTH)

[Ref: S. Narendra, ISLPED’01]

Slide 8.18

Power Gating

Disconnect module from supply rail(s) during standby

Footer or header transistor, or both
Most effective when high-VTH transistors are available
Easily introduced in standard design flows
But … Impact on performance

Very often called “MTCMOS” (when using high- and low-threshold devices)

[Figure: logic block connected to the supply rails through sleep-controlled header and footer transistors]

[Ref: T. Sakata, VLSI’93; S. Mutoh, ASIC’93]

Slide 8.19

Power Gating – Concept

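Leakage current reduces because
Increased resistance in leakage path
Stacking effect introduces source biasing (similar effect at PMOS side)

[Figure: inverter with IN = 0 whose NMOS M1 connects to ground through the sleep transistor, modeled as a resistance RS. The leakage current Ileak raises the source node to VS = Ileak × RS, which adds a VTH shift on top of the extra resistance]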

Slide 8.20
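The relation VS = Ileak × RS on Slide 8.20 is self-referential: the leakage sets VS, and VS in turn throttles the leakage. The sketch below solves that loop by bisection, assuming a simple exponential subthreshold model in which a lumped factor k captures the combined effect of negative gate-source voltage, body effect, and DIBL. The model and all parameter values are assumptions made for illustration.

```python
def leakage_with_sleep_resistance(i0, rs, swing=0.1, k=2.0, iters=60):
    """Solve VS = Ileak * RS with Ileak = i0 * 10**(-k * VS / swing).

    i0    : leakage (A) with the source grounded
    rs    : off-resistance (ohm) of the sleep transistor
    swing : subthreshold swing (V/decade)
    k     : lumped sensitivity to VS (negative VGS, body effect, DIBL)
    Returns (VS, Ileak).
    """
    lo, hi = 0.0, i0 * rs            # the root is bracketed by these values
    for _ in range(iters):
        vs = 0.5 * (lo + hi)
        if i0 * rs * 10 ** (-k * vs / swing) > vs:
            lo = vs                  # current still too high: VS must rise
        else:
            hi = vs
    vs = 0.5 * (lo + hi)
    return vs, i0 * 10 ** (-k * vs / swing)


# Hypothetical values: 1 uA of leakage into a 1 Mohm off-resistance.
vs, ileak = leakage_with_sleep_resistance(i0=1e-6, rs=1e6)
print(f"VS = {vs * 1e3:.1f} mV, leakage reduced by {1e-6 / ileak:.1f}x")
```

Even a source voltage of a few tens of millivolts cuts the leakage substantially, because the exponential dependence amplifies the small bias.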

Power Gating Options

[Figure: three power-gating options around low-VTH logic, each controlled by sleep: footer + header, footer only, header only]

NMOS sleeper transistor more area-efficient than PMOS
Leakage reduction more effective (under all input patterns) when both footer and header transistors are present

Slide 8.21

Other option: Boosted-Gate MOS (BGMOS)

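Leak cut-off Switch (LS)
- high VTH
- thick TOX (eliminates tunneling)

CMOS logic
- low VTH
- thin TOX

[Figure: CMOS logic between VDD and virtual GND; the gate of the LS is at 0 V in standby and at VBOOST in active mode, with VDD marked for reference]

[Ref: T. Inukai, CICC’00]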

Slide 8.22

Other Option: Boosted-Sleep MOS

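Leak cut-off Switch (LS)
- normal (or high) VTH
- normal TOX

Area-efficient

CMOS logic
- low VTH
- thin TOX

[Figure: CMOS logic between VDD and virtual GND; the gate of the LS is at –Vboost in standby and at VDD in active mode, with 0 V marked for reference]

(also called Super-Cutoff CMOS or SCCMOS)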

[Ref: T. Inukai, CICC’00]

Slide 8.23

Virtual Supplies

[Figure: logic on virtual VDD and virtual GND rails, connected to VDD and GND through sleep transistors, shown in both modes]

Active mode: sleep transistors ON; noise on virtual supplies
Standby mode: sleep transistors OFF; virtual supply collapse

© IEEE 2003

[Ref: J. Tschanz, JSSC’03]

Slide 8.24

Decoupling Capacitor Placement

[Table: decap on supply rails versus decap on virtual rails, compared on performance, convergence time, and oxide leakage savings; entries not recovered]

[Ref: J. Tschanz, JSSC’03]

© IEEE 2003

Slide 8.25

Leakage Power Savings versus Decap

[Figure: normalized leakage power in idle mode (0 to 1) versus idle time (10 ns to 10 ms) at 1.32 V, 75°C, for a low-leakage 133 nF decap on virtual VCC and for no decap on virtual VCC; annotations mark 90% and 40%]

[Ref: J. Tschanz, JSSC’03]

© IEEE 2003

Slide 8.26

How to Size the Sleep Transistor?

Sleep transistor is not free – it will degrade the performance in active mode
Circuits in active mode see the sleep transistor as extra power-line resistance
– The wider the sleep transistor, the better
Wide sleep transistors cost area
– Minimize the size of the sleep transistor for given ripple (e.g., 5%)
– Need to find the worst-case vector

Slide 8.27
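The sizing rule of Slide 8.27 (smallest sleep transistor that keeps the virtual-rail ripple within a budget such as 5%) can be sketched with a first-order linear-region model. The on-resistance expression and every numeric value below are assumptions for illustration; a real flow would take the worst-case current from simulation of the worst-case vector.

```python
def sleep_transistor_width(vdd, i_peak, ripple, k_prime, length, vth):
    """Minimum sleep-transistor width (same unit as length) for a ripple budget.

    Assumes the device sits deep in the linear region with its gate at vdd:
        R_on ~= length / (k_prime * W * (vdd - vth))
    and that the virtual-rail drop i_peak * R_on may not exceed ripple * vdd.
    k_prime is the process transconductance mu * Cox in A/V^2.
    """
    r_max = ripple * vdd / i_peak
    return length / (k_prime * r_max * (vdd - vth))


# Hypothetical numbers: 1.2 V supply, 100 mA worst-case current, 5% ripple.
common = dict(vdd=1.2, i_peak=0.1, ripple=0.05, k_prime=300e-6, length=0.1e-6)
w_low = sleep_transistor_width(vth=0.3, **common)    # low-VTH switch
w_high = sleep_transistor_width(vth=0.5, **common)   # high-VTH switch
print(f"low-VTH: {w_low * 1e6:.0f} um, high-VTH: {w_high * 1e6:.0f} um")
```

Under this model the required width scales with 1 / (VDD − VTH), so a high-VTH switch needs more width for the same resistance, which is the point the next slide makes.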

Sleep Transistor Sizing

High-VTH transistor must be very large for low resistance in linear region
Low-VTH transistor needs less area for same resistance.

[Ref: R. Krishnamurthy, ESSCIRC’02]

Slide 8.28

Preserving State

Virtual supply collapse in sleep mode causes the loss of state in registers
Keeping the registers at nominal VDD preserves the state
– These registers leak …
Can lower the VDD in sleep
– Some impact on robustness, noise, and soft-error immunity

Slide 8.29

Latch-Retaining State During Sleep

[Figure: latch between D and Q with a transmission gate, clocked by Clk and controlled by sleep and its complement]

Black-shaded devices use low-VTH transistors
All others are high-VTH.

[Ref: S. Mutoh, JSSC’95]

Slide 8.30

MTCMOS Derivatives Preventing State Loss

[Figure: two MTCMOS derivatives, labeled Clamping and Retention, built from low-VTH logic, high-VTH sleep transistors on the virtual VDD and virtual GND rails, a small-W HVT device, and a Vretain supply]

Reduce voltage and retain state

Slide 8.31

Sleep Transistor Placement

[Figure: standard cell rows with “strapper” cells, shown with no sleep transistors (VDD and GND rails) and with headers and footers (M3 and M4 connecting the VDD′ and GND′ virtual rails to VDD and GND)]

Slide 8.32

Sleep Transistor Layout

[Figure: layout showing the sleep transistor cells and the ALU]

[Ref: J. Tschanz, JSSC’03]

Slide 8.33

Dynamic Body Biasing

Increase thresholds of transistors during sleep using reverse body biasing
– Can be combined with forward body biasing in active mode
No delay penalty

But
Requires triple-well technology
Limited range of threshold adjustments (<100 mV)
– Not improving with technology scaling
Limited leakage reduction (<10x)
Energy cost of charging/discharging the substrate capacitance

Slide 8.34
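The two limits quoted on Slide 8.34 are linked: a threshold shift below 100 mV at a subthreshold swing of roughly 100 mV/decade gives at most about one decade of leakage reduction. The sketch below uses the textbook body-effect expression; the body-effect coefficient, surface potential, and swing are assumed values, not figures from the slides.

```python
import math


def vth_shift(v_sb, gamma=0.2, two_phi_f=0.8):
    """Body-effect threshold shift (V) for a source-to-body reverse bias v_sb.

    gamma (V^0.5) and two_phi_f (V) are hypothetical device parameters.
    """
    return gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def leakage_reduction(delta_vth, swing=0.1):
    """Subthreshold leakage ratio for a threshold increase delta_vth (V)."""
    return 10 ** (delta_vth / swing)


shift = vth_shift(1.0)                 # 1 V of reverse body bias
print(f"shift = {shift * 1e3:.0f} mV, leakage / {leakage_reduction(shift):.1f}")
```

With these assumed parameters, a full volt of reverse bias yields a shift just under 100 mV and a leakage reduction just under 10x, in line with the limits on the slide.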

Dynamic Body Biasing

Active mode: Forward Body Bias (FBB); low threshold, high performance
Standby mode: Reverse Body Bias (RBB); high threshold, low leakage

[Figure: PMOS and NMOS body connections with their PMOS and NMOS bias generators between VDD and GND in each mode; in standby the bodies are tied to VHIGH and VLOW]

Can also be used to compensate for threshold variations

© IEEE 2003

[Refs: T. Kuroda, ISSCC’96; J. Tschanz, JSSC’03]

Slide 8.35

The Dynamics of Dynamic Body Bias

Needs level-shifting and voltage switch circuitry

[Figure: level shifter and voltage switch (transistors M1–M5, nodes V1–V4, control CE, capacitances CW) switching VNwell between 2 V and 4 V and VPwell between 0 V and –2 V, with VDD = 2 V, VSS = 0 V, VNBB = 4 V, and VPBB = –2 V]

[Figure: voltage (–2 V to 4 V) versus time (0 to 200 ns) for V1–V4, VNwell, and VPwell during the standby-to-active and active-to-standby transitions, annotated with CE on and CE off]

© IEEE 1995

[Ref: K. Seta, ISSCC’95]

Slide 8.36

Body Bias Layout

[Figure: ALU layout showing the sleep transistor LBGs and the ALU core LBGs]

LBG: Local bias generator

[Ref: J. Tschanz, JSSC’03]

Slide 8.37

DBB for Standby Leakage Reduction - Example

Application-specific processor (SH-mobile)
250 nm technology
core at 1.8 V
I/O at 3.3 V
3.3M transistors

[Figure: chip photograph with the VBC block (0.13 mm²) marked]

[Ref: M. Miyazaki, Springer’06]

© Springer 2006

Slide 8.38

Effectiveness of Dynamic Body Biasing

[Figure: VTH (V), 0 to 0.6, versus VBS (V), –2 to 2, with the reverse-VBS and forward-VBS regions marked]

Practical VTH tuning range less than 150 mV in 90 nm technology

Slide 8.39

Supply Voltage Ramping (SVR)

Reduce supply voltage of modules in sleep mode
– Can go to 0 V if no state retention is necessary
– Down to state retention voltage otherwise (see Memory in next chapter), or move state to persistent memory before power-down

Most effective leakage reduction technique
– Reduces current and voltage

But
Needs controllable voltage regulator
– Becoming present more often in modern integrated system designs
Longer reactivation time

Simplified version switches between VDD and GND (or VDDL)

[Ref: M. Sheets, VLSI’06]

Slide 8.40

Supply Ramping

Standby power = VDD(standby) × Ileak(standby)
Modules must be isolated from neighbors
Creating “voltage islands”

[Figure: full power-down (module supply switched between VDD and 0) and power-down with data retention (module supply switched between VDD and DRV)]

Slide 8.41

Supply Ramping – Impact

[Figure: leakage power (0 to 4 × 10^–9) as a function of the supply voltage (0 to 1 V) in 90 nm, for an inverter and a NAND4; annotation: factor 8.5]

Because of DIBL, dropping supply voltage causes dramatic reduction in leakage
– Can go as low as 300 mV before data retention is lost

Slide 8.42
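The relation "Standby power = VDD × Ileak" from the supply-ramping slides, combined with the DIBL dependence noted on Slide 8.42, explains why lowering the supply pays off twice. The sketch below assumes a purely subthreshold leakage model with an exponential DIBL term; the coefficient values are hypothetical and the model ignores gate and junction leakage, so it is not meant to reproduce the factor shown in the figure.

```python
def standby_power(vdd, i0=1e-9, dibl=0.1, swing=0.1):
    """Standby power (W) = VDD * Ileak, with DIBL-dependent leakage.

    Ileak = i0 * 10**(dibl * vdd / swing), where i0 is the leakage
    extrapolated to VDD = 0, dibl is the DIBL coefficient (V/V), and
    swing is the subthreshold swing (V/decade). All defaults are hypothetical.
    """
    return vdd * i0 * 10 ** (dibl * vdd / swing)


# Compare nominal 1 V operation with a 300 mV retention supply.
ratio = standby_power(1.0) / standby_power(0.3)
print(f"standby power drops by {ratio:.1f}x")
```

The voltage term contributes a factor of about 3.3 and the DIBL term the rest, which illustrates the slide's remark that supply ramping reduces both current and voltage.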

Integration in Standard-Cell Layout Methodology

Power switch cell easily incorporated into standard design flow

– Cell has same pitch as existing components
– No changes required to cell library from foundry

Switch design can be independent of block size

[Figure: power switch cell (VDDH, VDDL (RV), GND, and VvDD rails with Awake and Awake_buf signals) and its integration into the power grid (VDDH, VDDL, and GND straps)]

Slide 8.43

Standby Leakage Management – A Comparison

Slide 8.44

Some Long-Term Musings

Ideal power-off switch should have zero leakage current (S = 0 mV/decade)
Hard to accomplish with traditional electronic devices
Maybe possible using MEMS – mechanical switches have a long-standing reputation for good isolation

[Ref: N. Abele, IEDM’05]

Slide 8.45

Summary and Perspectives

Today’s designs are not leaky enough to be truly power–performance optimal! Yet, when not switching, circuits should not leak!
Clock gating effectively eliminates dynamic power in standby
Effective standby power management techniques are essential in sub-100 nm design
– Power gating the most popular and effective technique
– Can be supplemented with body biasing and transistor stacking
– Voltage ramping probably the most effective technique in the long range (if gate leakage becomes a bigger factor)

Emergence of “voltage or power” domains

Slide 8.46

References

Books and Book Chapters

V. De et al., “Techniques for Leakage Power Reduction,” in A. Chandrakasan et al., Design of High-Performance Microprocessor Circuits, Ch. 3, IEEE Press, 2001.

K. Roy et al., “Circuit Techniques for Leakage Reduction,” in C. Piguet, Low-Power Electronics Design, Ch. 13, CRC Press, 2005.

S. Narendra and A. Chandrakasan, Leakage in Nanometer CMOS Technologies, Springer, 2006.

Articles

N. Abele, R. Fritschi, K. Boucart, F. Casset, P. Ancey, and A.M. Ionescu, “Suspended-gate MOSFET: bringing new MEMS functionality into solid-state MOS transistor,” Proc. Electron Devices Meeting, 2005. IEDM Technical Digest. IEEE International, pp. 479–481, Dec. 2005.

T. Fischer et al., “A 90-nm variable frequency clock system for a power-managed Itanium® architecture processor,” IEEE J. Solid-State Circuits, pp. 217–227, Feb. 2006.

S. Gary, “Low-Power Microprocessor Design,” in Low Power Design Methodologies, Ed. J. Rabaey and M. Pedram, Chapter 9, pp. 255–288, Kluwer Academic, 1995.

T. Inukai et al., “Boosted Gate MOS (BGMOS): Device/Circuit Cooperation Scheme to Achieve Leakage-Free Giga-Scale Integration,” CICC, pp. 409–412, May 2000.

H. Kam et al., “A new nano-electro-mechanical field effect transistor (NEMFET) design for low-power electronics,” IEDM Tech. Digest, pp. 463–466, Dec. 2005.

R. Krishnamurthy et al., “High-performance and low-power challenges for sub-70 nm microprocessor circuits,” 2002 IEEE ESSCIRC Conf., pp. 315–321, Sep. 2002.

T. Kuroda et al., “A 0.9 V 150 MHz 10 mW 4 mm2 2-D discrete cosine transform core processor with variable-threshold-voltage scheme,” JSSC, 31(11), pp. 1770–1779, Nov. 1996.

M. Miyazaki et al., “Case study: Leakage reduction in Hitachi/Renesas microprocessors,” in S. Narendra and A. Chandrakasan, Leakage in Nanometer CMOS Technologies, Ch. 10, Springer, 2006.

T. Simunic, “Dynamic Management of Power Consumption,” in Power Aware Computing, edited by R. Graybill, R. Melhem, Kluwer Academic Publishers, 2002.

Slide 8.47

References (cont.)

S. Mutoh et al., “1V high-speed digital circuit technology with 0.5 μm multi-threshold CMOS,” Proc. Sixth Annual IEEE ASIC Conference and Exhibit, pp. 186–189, Sep. 1993.

S. Mutoh et al., “1-V power supply high-speed digital circuit technology with multithreshold-voltage CMOS,” IEEE Journal of Solid-State Circuits, 30, pp. 847–854, Aug. 1995.

S. Narendra, et al., “Scaling of stack effect and its application for leakage reduction,” ISLPED, pp. 195–200, Aug. 2001.

M. Ohashi et al., “A 27MHz 11.1mW MPEG-4 video decoder LSI for mobile application,” ISSCC, pp. 366–367, Feb. 2002.

T. Sakata, M. Horiguchi and K. Itoh, “Subthreshold-current reduction circuits for multi-gigabit DRAM's,” Symp. VLSI Circuits Dig., pp. 45–46, May 1993.

K. Seta, H. Hara, T. Kuroda, M. Kakumu and T. Sakurai, “50% active-power saving without speed degradation using standby power reduction (SPR) circuit,” IEEE International Solid-State Circuits Conference, XXXVIII, pp. 318–319, Feb. 1995.

M. Sheets et al., “A Power-Managed Protocol Processor for Wireless Sensor Networks,” Digest of Technical Papers 2006 Symposium on VLSI Circuits, pp. 212–213, June 15–17, 2006.

TI MSP430 Microcontroller family, http://focus.ti.com/lit/Slab034n/slab034n.pdf

J. W. Tschanz, S. G. Narendra, Y. Ye, B. A. Bloechel, S. Borkar and V. De, “Dynamic sleep transistor and body bias for active leakage power control of microprocessors,” IEEE Journal of Solid-State Circuits, 38, pp. 1838–1845, Nov. 2003.

Slide 8.48