CMOS Power Consumption (source: course.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf)
![Page 1: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/1.jpg)
CMOS Power Consumption
Lecture 13
18-322 Fall 2003
Textbook: Sections 5.5, 5.6, 6.2 (pp. 257-263), 11.7.1
![Page 2: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/2.jpg)
Overview
  • Low-power design
  • Motivation
  • Sources of power dissipation in CMOS
  • Power modeling
  • Optimization techniques (a survey)
![Page 3: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/3.jpg)
Why worry about power? Heat Dissipation
Handhelds
Portables
Desktops
Servers
![Page 4: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/4.jpg)
Power Density Trends
Courtesy of Fred Pollack, Intel. CoolChips tutorial, MICRO-32
![Page 5: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/5.jpg)
High End Power Consumption
While you can probably afford to pay for 100-200W of power for your desktop…
Getting that heat off the chip and out of the box is expensive
![Page 6: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/6.jpg)
A Booming Market: Portable Devices
What we’d like…
  • Video decompression
  • Speech recognition
  • Protocols, ECC, ...
  • Handwriting recognition
  • Text/Graphics processing
  • Java interpreter
Up to 1 month of uninterrupted operation!
What we would need…
[Figure: nominal battery capacity (watt-hours/lb, axis 0 to 50) vs. year (axis ticks 65 through 0) for Nickel-Cadmium, Ni-Metal Hydride, and Rechargeable Lithium batteries]
Expected battery lifetime increase over next 5 years: 30-40%
![Page 7: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/7.jpg)
Where Does Power Go in CMOS?
Switching power: due to charging and discharging of output capacitances:
Short-circuit power: due to non-zero rise/fall times
Leakage power (important with decreasing device sizes)
  • Typically between 0.1 nA and 0.5 nA at room temperature
![Page 8: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/8.jpg)
Short-Circuit Power
Inputs have finite rise and fall times
Depends on device sizes
Direct current path from VDD to GND while PMOS and NMOS are ON simultaneously for a short period
![Page 9: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/9.jpg)
Leakage Current
![Page 10: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/10.jpg)
New Problem: Gate Leakage
Now about 20-30% of all leakage, and growing
Gate oxide is so thin, electrons tunnel through it…
NMOS is much worse than PMOS
![Page 11: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/11.jpg)
Gate/Circuit-Level Power Estimation
It is a very difficult problem
Challenges
  • VDD, fclk, CL are known
      • Actually, the layout will determine the interconnect capacitances
  • Need node-by-node accuracy
      • Power dissipation is highly data-dependent
  • Need to estimate switching activity accurately
      • Simulation may take days to complete
![Page 12: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/12.jpg)
Dynamic Power Consumption - Revisited
Power = Energy/transition × transition rate

$$P = C_L V_{dd}^2 f_{0\to1} = C_L V_{dd}^2 P_{0\to1} f = C_{EFF} V_{dd}^2 f$$

$$P = C_L \frac{V_{dd}^2}{2} f_{clk} \, sw$$

where sw is the switching activity (factor) on a signal line.

$$C_{EFF} = \text{effective capacitance} = C_L P_{0\to1}$$

Power dissipation is data dependent: it is a function of switching activity
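The dynamic power relation on this slide can be sketched numerically. This is a minimal illustration, not part of the lecture: the function name `dynamic_power` and the sample values (1 pF load, 5 V supply, 100 MHz clock) are assumptions chosen only to show how the terms combine.

```python
def dynamic_power(c_load, vdd, f_clk, p01):
    """Average switching power in watts: C_EFF * Vdd^2 * f, with C_EFF = C_L * P(0->1)."""
    c_eff = c_load * p01  # effective capacitance
    return c_eff * vdd ** 2 * f_clk


# Hypothetical operating point: 1 pF load, 5 V supply, 100 MHz clock, P(0->1) = 3/16.
p_full = dynamic_power(c_load=1e-12, vdd=5.0, f_clk=100e6, p01=3 / 16)
p_half = dynamic_power(c_load=1e-12, vdd=2.5, f_clk=100e6, p01=3 / 16)

print(f"Vdd = 5.0 V: {p_full * 1e3:.4f} mW")
print(f"Vdd = 2.5 V: {p_half * 1e3:.4f} mW")
```

Because the supply voltage enters quadratically, halving Vdd in this sketch cuts the switching power to one quarter, while the other terms scale only linearly.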
![Page 13: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/13.jpg)
Example: Static 2 Input NOR
Assume:
  P(A=1) = 1/2
  P(B=1) = 1/2
Then:
  P(Out=1) = 1/4 (this is the signal probability)
  P(0→1) = P(Out=0) · P(Out=1) = 3/4 × 1/4 = 3/16 (this is the transition probability)
  CEFF = 3/16 CL
[Figure: static 2-input NOR gate with inputs A, B and output Out; P(Out=1) = ? P(0->1) = ?]
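The NOR numbers can be reproduced by enumerating the truth table. The sketch below assumes the inputs are statistically independent and that consecutive output values are independent, which is what the product P(Out=0) · P(Out=1) implies; the helper names are illustrative only.

```python
from fractions import Fraction
from itertools import product


def signal_probability(gate, input_probs):
    """P(out = 1) for independent inputs, found by enumerating the truth table."""
    p_one = Fraction(0)
    for bits in product((0, 1), repeat=len(input_probs)):
        weight = Fraction(1)
        for bit, p in zip(bits, input_probs):
            weight *= p if bit else 1 - p
        if gate(*bits):
            p_one += weight
    return p_one


def transition_probability(p_one):
    """P(0->1) = P(out = 0) * P(out = 1), assuming successive values are independent."""
    return (1 - p_one) * p_one


def nor2(a, b):
    return int(not (a or b))


half = Fraction(1, 2)
p_out = signal_probability(nor2, [half, half])
print(p_out, transition_probability(p_out))  # 1/4 3/16
```

The same two helpers work for any other gate function passed in place of `nor2`.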
![Page 14: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/14.jpg)
Power Consumption is Data Dependent
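P(0->1) = ?

| A(t) | A(t+1) | B(t) | B(t+1) | Out(t) | Out(t+1) |
|------|--------|------|--------|--------|----------|
| 0 | 0 | 0 | 0 | 1 | 1 |
| 0 | 0 | 0 | 1 | 1 | 0 |
| 0 | 0 | 1 | 0 | 0 | 1 |
| 0 | 0 | 1 | 1 | 0 | 0 |
| 0 | 1 | 0 | 0 | 1 | 0 |
| 0 | 1 | 0 | 1 | 1 | 0 |
| 0 | 1 | 1 | 0 | 0 | 0 |
| 0 | 1 | 1 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 1 |
| 1 | 0 | 0 | 1 | 0 | 0 |
| 1 | 0 | 1 | 0 | 0 | 1 |
| 1 | 0 | 1 | 1 | 0 | 0 |
| 1 | 1 | 0 | 0 | 0 | 0 |
| 1 | 1 | 0 | 1 | 0 | 0 |
| 1 | 1 | 1 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 0 | 0 |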
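Suppose now that only patterns 00 and 11 can be applied (w/ equal probabilities). Then:

| A(t) | A(t+1) | B(t) | B(t+1) | Out(t) | Out(t+1) |
|------|--------|------|--------|--------|----------|
| 0 | 0 | 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 1 | 1 | 0 |
| 1 | 0 | 1 | 0 | 0 | 1 |
| 1 | 1 | 1 | 1 | 0 | 0 |

=> P(0->1) = 1/4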
Similarly, suppose that every 0 applied to the input A is immediately followed by a 1 while every 1 applied to B is immediately followed by a 0. P(0->1) = ?
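The data dependence on this slide can be checked by counting rising output transitions over all pairs of consecutive input patterns. This sketch assumes a 2-input NOR gate and that consecutive patterns are drawn independently and uniformly from the allowed set; it covers the unrestricted case and the 00/11 case only, and leaves the last question on the slide open.

```python
from fractions import Fraction
from itertools import product


def nor2(a, b):
    return int(not (a or b))


def p_rise(allowed_patterns):
    """P(Out: 0 -> 1) over all (current, next) pairs of allowed input patterns."""
    pairs = list(product(allowed_patterns, repeat=2))
    rising = sum(
        1 for cur, nxt in pairs if nor2(*cur) == 0 and nor2(*nxt) == 1
    )
    return Fraction(rising, len(pairs))


all_patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(p_rise(all_patterns))        # 3/16
print(p_rise([(0, 0), (1, 1)]))    # 1/4
```

Restricting the input patterns changes the transition probability even though the gate itself is unchanged, which is the point of the slide.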
![Page 15: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/15.jpg)
Transition Probabilities for Basic Gates
![Page 16: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/16.jpg)
(Big) Problem: Re-convergent Fanout
[Figure: circuit with inputs A and B, internal node X, and output Z; the reconvergence point is marked]
In this case Z = B, as can easily be seen. The previous analysis simply fails because the signals are not independent!
P(Z=1) = P(B=1) · P(X=1 | B=1) = P(B=1)
Main issue: the analysis becomes complex and intractable very quickly!
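A small numeric check shows how far the independence assumption can be off. The circuit is not recoverable from the text, so this sketch assumes X = A OR B and Z = X AND B, which is one circuit consistent with Z = B and P(X=1 | B=1) = 1; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def exact_p_z(p_a, p_b):
    """Exact P(Z = 1), enumerating the primary inputs A and B."""
    total = Fraction(0)
    for a, b in product((0, 1), repeat=2):
        weight = (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b)
        x = a | b  # assumed gate: X = A OR B
        z = x & b  # assumed gate: Z = X AND B
        if z:
            total += weight
    return total


def naive_p_z(p_a, p_b):
    """Gate-by-gate estimate that wrongly treats X and B as independent."""
    p_x = 1 - (1 - p_a) * (1 - p_b)  # fine: A and B are independent
    return p_x * p_b                 # wrong: X depends on B


half = Fraction(1, 2)
print(exact_p_z(half, half))  # 1/2
print(naive_p_z(half, half))  # 3/8
```

The exact result equals P(B=1), while the gate-by-gate estimate underestimates it because it ignores the correlation introduced by the fanout of B.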
![Page 17: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/17.jpg)
Another (Big) Problem: Glitching in Static CMOS
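Also called: dynamic hazards
[Figure: circuit with inputs A, B, C, internal node X, and output Z; waveforms under a unit-delay model for the input transition ABC: 101 -> 000, with the glitch marked as wasted power]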
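A unit-delay simulation makes the glitch visible. The gate types are not recoverable from the text, so this sketch assumes two 2-input NOR gates, X = NOR(A, B) and Z = NOR(X, C), and applies the input change ABC: 101 -> 000 from the slide.

```python
def nor2(a, b):
    return int(not (a or b))


def simulate_unit_delay(old_inputs, new_inputs, steps=4):
    """Return (X, Z) at t = 0..steps, with the inputs switching at t = 0."""
    a0, b0, c0 = old_inputs
    x = nor2(a0, b0)
    z = nor2(x, c0)  # settled values before the input change
    a, b, c = new_inputs
    trace = [(x, z)]
    for _ in range(steps):
        # Each gate output at time t+1 is computed from its inputs at time t.
        x, z = nor2(a, b), nor2(x, c)
        trace.append((x, z))
    return trace


trace = simulate_unit_delay((1, 0, 1), (0, 0, 0))
z_wave = [z for _, z in trace]
print(z_wave)  # [0, 1, 0, 0, 0]
```

Under these assumptions Z starts and ends at 0 but pulses to 1 for one time unit, because C falls one gate delay before X rises. Both transitions charge and discharge the output node without changing the final result.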
![Page 18: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/18.jpg)
Example: A Chain of NAND Gates
[Figure: chain of NAND gates with outputs out1, out2, out3, out4, out5, ...; plot of V (volts, 0.0 to 6.0) vs. t (nsec, 0 to 3) for out1 through out8]
![Page 19: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/19.jpg)
Glitch Reduction Using Balanced Paths
[Figure: two arrangements of blocks F1, F2, F3 annotated with signal arrival times; one arrangement shows an arrival-time mismatch]
Equalize lengths of timing paths through design
![Page 20: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/20.jpg)
Delay is important: Delay vs. VDD and VT
Think about (Power Delay) product!
Delay for a 0->1 transition to propagate to the output:

$$t_{pLH} = \frac{C_L V_{DD}}{k_n (V_{DD} - V_{Tn})^2}$$

Similar for a 1->0 transition
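The dependence of delay on supply voltage can be tabulated from the first-order expression t_pLH = C_L V_DD / (k_n (V_DD - V_Tn)^2). This is a sketch only: the threshold voltage of 0.8 V and the 5 V reference supply are assumed values, and real devices deviate from this simple model.

```python
def propagation_delay(c_load, vdd, k_n, v_tn):
    """First-order delay: C_L * V_DD / (k_n * (V_DD - V_Tn)^2)."""
    if vdd <= v_tn:
        raise ValueError("model only applies for V_DD above the threshold voltage")
    return c_load * vdd / (k_n * (vdd - v_tn) ** 2)


def normalized_delay(vdd, vdd_ref=5.0, v_tn=0.8):
    """Delay relative to the delay at vdd_ref; C_L and k_n cancel in the ratio."""
    return (propagation_delay(1.0, vdd, 1.0, v_tn)
            / propagation_delay(1.0, vdd_ref, 1.0, v_tn))


for vdd in (5.0, 4.0, 3.0, 2.0, 1.5):
    print(f"VDD = {vdd:.1f} V -> normalized delay {normalized_delay(vdd):.2f}")
```

The delay grows slowly at first and then sharply as V_DD approaches V_Tn, which is why supply scaling trades power against performance.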
![Page 21: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/21.jpg)
Delay vs. VDD
![Page 22: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/22.jpg)
Power-Performance Trade-offs
Prime choice: VDD reduction
  • In recent years we have witnessed an increasing interest in supply voltage reduction (e.g. Dynamic Voltage Scaling)
      • High VDD on critical path or for high performance
      • Low VDD where there is some available slack
  • Design at very low voltages is still an open problem (0.6-0.9 V by 2010!)
      • Ensures lower power
      • … but higher latency (loss in performance)
Reduce switching activity
  • Logic synthesis
  • Clock gating
Reduce physical capacitance
  • Proper device sizing
  • Good layout
![Page 23: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/23.jpg)
How about POWER? Ways to reduce power consumption
Load capacitance (CL)
  • Roughly proportional to the chip area
Switching activity (avg. number of transitions/cycle)
  • Very data dependent
  • A big portion due to glitches (real-delay)
Clock frequency (f)
  • Lowering only f decreases average power, but total energy is the same and throughput is worse
[Figure: normalized delay (1.00 to 7.50) vs. Vdd (volts, 2.00 to 6.00) for an adder (SPICE), microcoded DSP chip, multiplier, adder, ring oscillator, and clock generator in 2.0 µm technology]
Voltage supply (VDD): biggest impact
![Page 24: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/24.jpg)
Using parallelism (1)
$P_{ref} = C_{ref} V_{DD}^2 f_{ref}$
Assume: tp = 25ns (worst-case, all modules) at VDD = 5V
![Page 25: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/25.jpg)
Using parallelism (2)
$C_{par} = 2.15\,C_{ref}$ (extra routing needed)
$f_{par} = f_{ref}/2$ ($t_{p,new} = 50$ ns => $V_{DD} \approx 2.9$ V; $V_{DD,par} = 0.58\,V_{DD}$)
$P_{par} = C_{par} V_{DD,par}^2 f_{par} = 0.36\,P_{ref}$
Area increases about 3.4 times!
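The 0.36 figure follows directly from scaling each factor of P = C · VDD² · f. The helper below is a sketch with an illustrative name; it takes the three scale factors quoted on the slide (2.15 for capacitance, 0.58 for supply voltage, 1/2 for frequency) and returns the power relative to the reference design.

```python
def power_ratio(c_scale, vdd_scale, f_scale):
    """Power relative to the reference when C, VDD and f are each scaled."""
    return c_scale * vdd_scale ** 2 * f_scale


parallel = power_ratio(c_scale=2.15, vdd_scale=0.58, f_scale=0.5)
print(round(parallel, 2))  # 0.36
```

The capacitance more than doubles, but the quadratic saving from the lower supply voltage and the halved clock rate per unit outweigh it.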
![Page 26: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/26.jpg)
Using pipelining
Cpipe = 1.15 C
Delay decreases 2 times (VDD,pipe = 0.58 VDD)
Ppipe = 0.39 P
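As a worked check of the pipelining figure, assuming the clock frequency stays at its reference value (the pipeline keeps the original throughput while each stage has half the logic depth), the power ratio works out as:

```latex
\frac{P_{pipe}}{P}
  = \frac{C_{pipe}}{C}
    \left(\frac{V_{DD,pipe}}{V_{DD}}\right)^{2}
    \frac{f_{pipe}}{f}
  = 1.15 \times (0.58)^{2} \times 1
  = 1.15 \times 0.3364
  \approx 0.39
```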
![Page 27: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/27.jpg)
Chain vs. balanced design
Question for you: Which of the two designs is more energy efficient?
  • Assume:
      • Zero-delay model
      • All inputs have a signal probability of 0.5
  • Hint: Calculate p0→1 for W, X and F
![Page 28: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/28.jpg)
Chain vs. balanced design
For the zero-delay model, the chain design is better
But this ignores glitching
  • Depending on the gate delays, the chain design may be worse
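The zero-delay comparison can be worked out by summing the transition probabilities of W, X and F. The circuits themselves are not recoverable from the text, so this sketch assumes 2-input AND gates with independent inputs A, B, C, D: the chain computes W = A·B, X = W·C, F = X·D, and the balanced design computes W = A·B, X = C·D, F = W·X.

```python
from fractions import Fraction


def and_prob(p, q):
    """P(out = 1) of a 2-input AND gate with independent inputs."""
    return p * q


def rise(p_one):
    """P(0->1) = P(node = 0) * P(node = 1)."""
    return (1 - p_one) * p_one


def total_activity(node_probs):
    return sum(rise(p) for p in node_probs)


half = Fraction(1, 2)

# Chain: W = A.B, X = W.C, F = X.D
w = and_prob(half, half)
x = and_prob(w, half)
f = and_prob(x, half)
chain = total_activity([w, x, f])

# Balanced: W = A.B, X = C.D, F = W.X
w = and_prob(half, half)
x = and_prob(half, half)
f = and_prob(w, x)
balanced = total_activity([w, x, f])

print(chain, balanced)  # 91/256 111/256
```

With equal node capacitances assumed, the chain has the lower total switching activity in the zero-delay model, in line with the slide. Glitching is not captured by this calculation.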
![Page 29: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/29.jpg)
Low energy gates – transistor sizing
Use the smallest transistors that satisfy the delay constraints
Increasing transistor size improves the speed but also increases power dissipation (since the load capacitance increases)
  • Slack time: difference between required time and arrival time of a signal at a gate output
      • Positive slack: size down
      • Negative slack: size up
Make gates that toggle more frequently smaller
![Page 30: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/30.jpg)
Low energy gate netlists – pin ordering
Better to postpone the introduction of signals with a high transition rate (signals with signal probability close to 0.5)
![Page 31: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/31.jpg)
Control circuits
State encoding has a big impact on the power efficiency
Energy driven -> try to minimize the number of bit transitions in the state register
  • Fewer transitions in state register
  • Fewer transitions propagated to combinational logic
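As an illustration of how state encoding affects register activity, the sketch below counts bit flips for a hypothetical 8-state machine that cycles through its states in order, once with a plain binary encoding and once with a Gray code. The FSM and both encodings are assumptions made for this example, not taken from the lecture.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def cycle_transitions(codes):
    """Total state-register bit flips over one full cycle through the states."""
    n = len(codes)
    return sum(hamming(codes[i], codes[(i + 1) % n]) for i in range(n))


binary_codes = list(range(8))                 # 000, 001, 010, ...
gray_codes = [i ^ (i >> 1) for i in range(8)]  # 000, 001, 011, 010, ...

print(cycle_transitions(binary_codes), cycle_transitions(gray_codes))  # 14 8
```

For this state sequence the Gray encoding flips exactly one bit per step. For a general FSM the best encoding depends on how often each state transition actually occurs.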
![Page 32: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/32.jpg)
Bus encoding
Reduces number of bit toggles on the bus
Different flavors
  • Bus-invert coding
      • Uses an extra bus line, invert:
          • if the number of transitions is < K/2, invert = 0 and the symbol is transmitted as is
          • if the number of transitions is > K/2, invert = 1 and the symbol is transmitted in a complemented form
  • Low-weight coding
      • Uses transition signaling instead of level signaling
[Figure: Encoder -> Bus -> Decoder]
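Bus-invert coding as described above can be sketched in a few lines. This is a simplified illustration: it compares each new word only against the K data lines currently on the bus, assumes the bus starts at all zeros, and sends the word as is when exactly K/2 lines would change (the slide does not specify that case). The function names are illustrative.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, k):
    """Return a list of (bus_value, invert) pairs for a K-bit bus."""
    mask = (1 << k) - 1
    bus = 0  # assumed initial value on the K data lines
    encoded = []
    for word in words:
        word &= mask
        if hamming(bus, word) > k / 2:
            bus, invert = ~word & mask, 1  # send the complement
        else:
            bus, invert = word, 0          # send as is
        encoded.append((bus, invert))
    return encoded


def bus_invert_decode(encoded, k):
    mask = (1 << k) - 1
    return [(~bus & mask) if invert else bus for bus, invert in encoded]


def data_line_toggles(values):
    """Total bit flips on the data lines, starting from an all-zero bus."""
    prev, total = 0, 0
    for value in values:
        total += hamming(prev, value)
        prev = value
    return total


words = [0x00, 0xFF, 0x00, 0xFF]
encoded = bus_invert_encode(words, k=8)
print(data_line_toggles(words))                    # 24
print(data_line_toggles([b for b, _ in encoded]))  # 0
```

In this worst-case pattern the data lines stop toggling entirely. The invert line itself still toggles, which this count ignores, and with this rule no transfer flips more than K/2 data lines.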
![Page 33: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/33.jpg)
Bus invert coding
Source: M.Stan et al., 1994
![Page 34: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/34.jpg)
Summary
Power Dissipation is already a prime design constraint
Low-power design requires operation at lowest possible voltage and clock speed
Low-power design requires optimization at all levels of abstraction
![Page 35: CMOS Power Consumptioncourse.ece.cmu.edu/~ece322/LECTURES/Lecture13/Lecture13.03.pdf · Power-Performance Trade-offs Prime choice: V DD reduction ⌧In recent years we have witnessed](https://reader035.fdocuments.in/reader035/viewer/2022081606/5e8966cca684bf01b7475a16/html5/thumbnails/35.jpg)
Announcements
Project M1:
  • Check off in lab session
  • Report by Friday
Exam Review Session:
  • Monday Oct 13, 4:30-6:30pm
  • PH 125C