L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf ·...

24
L16: 6.111 Spring 2006 1 Introductory Digital Systems Laboratory L16: Power Dissipation in Digital Systems L16: Power Dissipation in Digital Systems

Transcript of L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf ·...

Page 1: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 1Introductory Digital Systems Laboratory

L16: Power Dissipation in Digital SystemsL16: Power Dissipation in Digital Systems

Page 2: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 2Introductory Digital Systems Laboratory

Problem #1: Power Dissipation/HeatProblem #1: Power Dissipation/Heat

5KW 18KW

1.5KW 500W

40048008

80808085

8086286

386486

Pentium® proc

0.1

1

10

100

1000

10000

100000

1971 1974 1978 1985 1992 2000 2004 2008Year

Pow

er (W

atts

)

400480088080

8085

8086

286 386486

Pentium® procP6

1

10

100

1000

10000

1970 1980 1990 2000 2010Year

Pow

er D

ensi

ty (W

/cm

2)

Hot Plate

NuclearReactor

RocketNozzle

Sun’sSurface

Courtesy Intel (S. Borkar)

How do you cool these chips??How do you cool these chips??

chip

heat sink

Page 3: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 3Introductory Digital Systems Laboratory

Problem #2: Energy ConsumptionProblem #2: Energy Consumption

(40+ lbs)Battery

Year

Nom

ina l

Cap a

city(

Watt

-ho u

r s/lb

)

Nickel-Cadmium

Ni-Metal Hydride

65 70 75 80 85 90 95 0

10

20

30

40

50 Rechargable Lithium

(from Jon Eager, Gates Inc. , S. Watanabe, Sony Inc.)

No Moore’s law for batteries…Today: Understand where power goes

and ways to manage it

What can One Jouleof energy do?

Send a 1 Megabyte file over 802.11b

Operate a processor

for ~ 7s

The Energy Problem

7.5 cm3

AA battery

Alkaline: ~10,000J

Mow your lawn for

1 ms

Page 4: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 4Introductory Digital Systems Laboratory

Dynamic Energy DissipationDynamic Energy Dissipation

VDD

CL

E0→1 = CLVDD2

Ecap = 1/2CLVDD2iDD

Ediss, RP = 1/2CLVDD2

VDD

CL

IN =1Ediss,RN =1/2CLVDD

2

Charging Discharging

IN =0

P = CL VDD2 fclk

RN

RP

RN

RP

Page 5: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 5Introductory Digital Systems Laboratory

The Transition Activity Factor The Transition Activity Factor αα00−−>>11

Current Input

Next Input

Output Transition

00 00 1 −> 100 01 1 −> 100 10 1 −> 100 11 1 −> 001 00 1 −> 101 01 1 −> 101 10 1 −> 101 11 1 −> 010 00 1 −> 110 01 1 −> 110 10 1 −> 110 11 1 −> 011 00 0 −> 111 01 0 −> 111 10 0 −> 111 11 0 −> 0

ZAB

Assume inputs (A,B) arrive at f and are uniformly distributedWhat is the average power dissipation?

α0−>1 = 3/16

P = α0−>1 CL VDD2 f

Page 6: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 6Introductory Digital Systems Laboratory

Junction (Silicon) TemperatureJunction (Silicon) Temperature

Simple Scenario Realistic Scenario

Tj-Ta= RθJA PD

SiliconSinkCase

SiliconTJ

TC

TS

TATJRθJA is the thermal resistance between silicon and Ambient

RθJCPDTJ

RθJA

TC

RθCSPDTS

Tj= Ta + RθJA PD

TA RθSA

TA

RθCA = RθCS + RθSA Make this as low as possibleis minimized by facilitating heat transfer

(bolt case to extended metal surface – heat sink)

Page 7: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 7Introductory Digital Systems Laboratory

Intel Pentium 4 Thermal GuidelinesIntel Pentium 4 Thermal Guidelines

Pentium 4 @ 3.06 GHz dissipates 81.8W!Maximum TC = 69 °CRCA < 0.23 °C/W for 50 C ambientTypical chips dissipate 0.5-1W (cheap packages without forced air cooling)

Execution core

120oC

Cache70°C

Integer & FP ALUs

Temp(oC)

Courtesy of Intel (Ram Krishnamurthy)

Page 8: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 8Introductory Digital Systems Laboratory

Power Reduction StrategiesPower Reduction Strategies

P = α0−>1 CL VDD2 f

Reduce Transition Activity or Switching EventsReduce Capacitance (e.g., keep wires short)Reduce Power Supply VoltageFrequency is typically fixed by the application, though this can be adjusted to control power

Optimize at all levels of design hierarchyOptimize at all levels of design hierarchy

Page 9: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 9Introductory Digital Systems Laboratory

Clock Gating is a Good Idea!Clock Gating is a Good Idea!

+

X

Global Clock Adder Clock

Multiplier Clock

Clock gating reduces activityand is the most common low-power

technique used today

Adder Off

Enable_Adder

Multiplier On

Enable_Multiplier

100’s of different clocks in a microprocessor

Clock Gating Reduces Energy, does it reduce Power?Clock Gating Reduces Energy, does it reduce Power?

Page 10: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 10Introductory Digital Systems Laboratory

Does your GHz Processor run at a GHz? Does your GHz Processor run at a GHz?

Processor

ThermalSensor

ChipActivity Control

Note that there is a difference between average and peak power

On-chip thermal sensor (diode based), measures the silicon temperature

If the silicon junction gets too hot (say 125 °C), then the activity is reduced (e.g., reduce clock rate or use clock gating)

Use of Thermal FeedbackUse of Thermal Feedback

Page 11: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 11Introductory Digital Systems Laboratory

Power Supply ResonancePower Supply Resonance

Lboard Lpackage Rgrid

Switchingcurrents

Board decap

On-diedecap

Courtesy of Motorola(David Blaauw)

Courtesy of MotorolaCourtesy of Motorola(David Blaauw)(David Blaauw)

Can write a Virus to Activate Can write a Virus to Activate

Power Supply Resonance!Power Supply Resonance!

200MhzDesign

Page 12: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 12Introductory Digital Systems Laboratory

Number Representation:Number Representation:TwoTwo’’s Complement vs. Sign Magnitudes Complement vs. Sign Magnitude

Two’s complement Sign-Magnitude

0000

0111

0011

1011

11111110

1101

1100

1010

1001

1000

0110

0101

0100

0010

0001

+0+1

+2

+3

+4

+5

+6

+7-0

-1

-2

-3

-4

-5

-6

-7

Consider a 16 bit bus where inputs togglesbetween +1 and –1 (i.e., a small noise input)Which representation is more energy efficient?

Page 13: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 13Introductory Digital Systems Laboratory

Time Sharing is a Bad IdeaTime Sharing is a Bad Idea

2

Time Sharing Increases Switching ActivityTime Sharing Increases Switching Activity

Page 14: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 14Introductory Digital Systems Laboratory

Not just a 6Not just a 6--1 Issue: 1 Issue: ““CoolCool”” Software ???Software ???

CPU

0111111100000000

0111111100000001

0111111100000010

0111111100000011

1000000000000000

1000000000000001

1000000000000010

1000000000000011

a[0]a[1]a[2]a[3]

b[0]b[1]b[2]b[3]

float a [256], b[256];float pi= 3.14;

for (i = 0; i < 255; i++) {a[i] = sin(pi * i /256);}for (i = 0; i < 255; i++) {b[i] = cos(pi * i /256);}

float a [256], b[256];float pi= 3.14;

for (i = 0; i < 255; i++) {a[i] = sin(pi * i /256);b[i] = cos(pi * i /256);

}

address

MEMORY address

16

2(8)+2(2+4+8+16+32+64+128+256)= 1030 transitions

512(8)+2+4+8+16+32+64+128+256= 4607 bit transitions

Page 15: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 15Introductory Digital Systems Laboratory

GlitchingGlitching TransitionsTransitions

Balancing paths reduces glitching transitionsStructures such as multipliers have lot of glitching transitionsKeeping logic depths short (e.g., pipelining) reduces glitching

++

+

A B C D

(A+B) + (C+D)+

+

+

A B

C

D

Chain Topology Tree Topology

(((A+B) + C)+D)

Page 16: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 16Introductory Digital Systems Laboratory

Reduce Supply Voltage : But is it Free?Reduce Supply Voltage : But is it Free?

t =0+

IN OUT

VDD

+

-

VS D

CL

GVDD

2)(2 T

VDD

VK

S

DDV

DDTDD

DD

VVVV

TV

DDV

k

DDV

LC

Di

VL

CDelay

1)( 2

2)(2

2 ≈−

=

∆⋅

=

VDD from 2V to 1V, energy ↓ by x4, delay ↑ x2

Page 17: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 17Introductory Digital Systems Laboratory

Transistors Are FreeTransistors Are Free……(What do you do with a Billion Transistors?)(What do you do with a Billion Transistors?)

OUT

IN

X

Pserial = Cmult 22 f P

f =1GHzVDD=2V

parallel = (2Cmult 12 f /2) = Pserial/4

X X

INf = 500MhzVDD=1V

f = 500MhzVDD=1V

IN

SELECT

Trade Area for Low PowerTrade Area for Low Power

OUT

Page 18: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 18Introductory Digital Systems Laboratory

Algorithmic WorkloadAlgorithmic Workload

Receiver just updatesCompare Current Image...

...to Previous Image

Fre

quen

cyof

Occ

urre

nce

Number of IDCTs per Frame0 500 1000 1500 2000

0.00

0.02

0.04

0.06

Exploit Time Varying Algorithmic WorkloadExploit Time Varying Algorithmic WorkloadTo Vary the Power Supply Voltage To Vary the Power Supply Voltage

Page 19: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 19Introductory Digital Systems Laboratory

Dynamic Voltage Scaling (DVS)Dynamic Voltage Scaling (DVS)

Variable Power SupplyACTIVE IDLE

Fixed Power SupplyACTIVE

EFIXED = ½ C VDD2 EVARIABLE = ½ C (VDD/2)2 = EFIXED / 4

0.2 0.4 0.8 1.0

0.2

0.4

0.6

0.8

1.0

Normalized Workload

Nor

mal

ized

Ene

rgy

Fixed Supply

VariableSupply

00 0.6

[Gutnik97]

Page 20: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 20Introductory Digital Systems Laboratory

DVS on a ProcessorDVS on a Processor

Digitally adjustable DC-DC converter powers SA-1110 core

µOS selects appropriate clock frequency based on workload and latency constraints

SA-1110

Control

µOS

VoutController

3.6V

5

Page 21: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 21Introductory Digital Systems Laboratory

Hardware vs. SoftwareHardware vs. Software

1nJ/Op

Flex

ibili

ty

0.25nJ/Op

Embedded Processor

DSP

Direct MappedHardware

FPGA0.1-1pJ/Op

Energy/OperationCourtesy of R. Brodersen, J. Rabaey, TI, ARM/StrongARM

Page 22: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 22Introductory Digital Systems Laboratory

Energy Efficiency of SoftwareEnergy Efficiency of Software

FPGA (Xilinx)

05

1015202530354045

Pow

er (%

)

Cache Control GCLK EBOX I/O,PLL

Processor (StrongARM-1100)

[Montanaro, JSSC ‘96]

[A. Sinha, DAC]

65%21%

9%5%

InterconnectClock

I/OCLB

CLB CLB

CLBCLB

[Kusse ‘98, UCB]

““SoftwareSoftware”” Energy Dissipation has Large OverheadEnergy Dissipation has Large Overhead

Page 23: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 23Introductory Digital Systems Laboratory

Trends: Leakage and Power GatingTrends: Leakage and Power Gating

Duty Cycle (%)

Tota

l Ene

rgy/

Switc

hing

Ene

rgy

VDD

C

VDD

C

EE = = VVDDDDII001010--VVTT//SSEE = = CVCVDDDD

22

SwitchingSwitching(computing)(computing)

LeakageLeakage(standby)(standby)

0 1

Low VTdevices are

leaky - Use a High VT

device is used to gate leakage current

Sleep

Page 24: L16: Power Dissipation in Digital Systemsweb.mit.edu/6.111/www/s2006/LECTURES/l16.pdf · [Montanaro, JSSC ‘96] [A. Sinha, DAC] 65% 21% 9% 5% Interconnect Clock I/O CLB CLB CLB CLB

L16: 6.111 Spring 2006 24Introductory Digital Systems Laboratory

Trends: Energy ScavengingTrends: Energy Scavenging

MEMS Generator Power Harvesting Shoes

Joe Paradiso(Media Lab)Jose Mur Miranda/

Jeff Lang

After 3-6 steps, it provides 3 mAfor 0.5 sec

~10mW

Vibration-to-Electric Conversion

~ 10µW