Keeping Hot Chips Cool
Thermal Management for Green Computing
Yang Ge, Professor Qinru Qiu
Outline
• Background
– Need for green computing
– Adverse effects of high temperature
– Thermal management techniques
• Ongoing project
– Power and thermal management for single chip cloud computer (SCC)
The need for green computing
• Computers account for 3% of US energy use
– Saving 1% of data center energy saves more than the output of a power plant
• Each computer generates 1 ton of CO2 every year
– Equivalent to the CO2 emission of a car driving a round trip between New York and Los Angeles
Power and Cost for Cooling Systems
• The energy dissipation of the cooling system is high
– Cooling fan power can reach up to 51% of the overall server power budget
• Cooling is expensive in large data centers
– The total cooling costs for large data centers can run into tens of millions of dollars
Figure: IBM P670 server power breakdown (Fans 51%, CPU 24%, Mem 20%, Other 6%)
Adverse effects of high temperature on VLSI chips
• Affects system reliability and causes permanent device failure
• Doubles leakage power consumption for every 9 °C increase
• Requires higher fan speed, which could reduce fan lifetime
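The leakage rule of thumb above can be written as P(T) = P_ref · 2^((T − T_ref) / 9). The following is a minimal sketch of that relation only; the function name, the reference power, and the reference temperature are illustrative assumptions, not values from the slides.

```python
def leakage_power(p_ref_watts, temp_c, temp_ref_c, doubling_step_c=9.0):
    """Estimate leakage power under the 'doubles every 9 degrees C' rule.

    p_ref_watts and temp_ref_c are hypothetical reference values chosen
    by the caller; only the 9 degree doubling step comes from the slide.
    """
    return p_ref_watts * 2.0 ** ((temp_c - temp_ref_c) / doubling_step_c)


if __name__ == "__main__":
    # Assumed reference point: 1 W of leakage at 60 degrees C.
    for temp in (60, 69, 78):
        print(temp, leakage_power(1.0, temp, 60.0))
```

Under this rule an 18 °C rise quadruples leakage, which is why temperature and power feed back into each other.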
Thermal Management Techniques
Offline Techniques
Online Techniques
Temperature-aware scheduling
Dynamic voltage and frequency scaling (DVFS)
Temperature-aware task migration
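As one illustration of an online technique, dynamic voltage and frequency scaling can be driven by a simple threshold rule with hysteresis. This is a sketch under stated assumptions, not the method used in the project: the frequency levels, the two temperature thresholds, and all function names are hypothetical, and a real controller would also lower the supply voltage along with the frequency.

```python
# Hypothetical frequency levels in MHz, lowest first.
FREQ_LEVELS_MHZ = [400, 533, 800]


def next_level(level, temp_c, hot_c=80.0, cool_c=70.0):
    """Step down one level when hot, step up one level when cool.

    The gap between hot_c and cool_c acts as hysteresis so the
    controller does not oscillate around a single threshold.
    """
    if temp_c >= hot_c and level > 0:
        return level - 1
    if temp_c <= cool_c and level < len(FREQ_LEVELS_MHZ) - 1:
        return level + 1
    return level


def run_controller(temps_c, start_level):
    """Apply next_level to a sequence of temperature readings."""
    level = start_level
    trace = []
    for temp in temps_c:
        level = next_level(level, temp)
        trace.append(FREQ_LEVELS_MHZ[level])
    return trace


if __name__ == "__main__":
    print(run_controller([85, 85, 75, 65, 65], start_level=2))
```

The controller reacts only to measured temperature, which is what makes it an online technique; an offline technique would fix the schedule before execution.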
Ongoing Project
• Power and thermal management for single chip cloud computer (SCC)
Overview of SCC Architecture
• 24 tiles arranged in a 6×4 array
• 2 CPUs on each tile
• A router associated with each tile
• 4 memory controllers connect to on-board memory
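The tile layout above implies 48 cores in total. The sketch below maps a core number to its tile coordinates; the row-major numbering with two consecutive cores per tile is an assumption made for illustration, and the names are hypothetical.

```python
TILE_COLS = 6        # tiles per row, from the 6x4 array
TILE_ROWS = 4        # rows of tiles
CORES_PER_TILE = 2   # 2 CPUs on each tile
NUM_CORES = TILE_COLS * TILE_ROWS * CORES_PER_TILE


def core_to_tile(core_id):
    """Return the (x, y) tile coordinates that hold the given core.

    Assumes cores are numbered row by row, two consecutive core IDs
    per tile. This numbering is an assumption, not taken from the slides.
    """
    if not 0 <= core_id < NUM_CORES:
        raise ValueError(f"core_id must be in [0, {NUM_CORES})")
    tile = core_id // CORES_PER_TILE
    return tile % TILE_COLS, tile // TILE_COLS


if __name__ == "__main__":
    print(NUM_CORES)
    print(core_to_tile(13))
```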
Management Console PC (MCPC)
• SCC and MCPC communicate over the PCIe bus
• MCPC runs Ubuntu 10.04 x64 and software from Intel
• Load a Linux image on each core
• Read and modify SCC registers
• Load programs on the SCC cores
Power and Thermal Management
• 6 voltage domains
• 24 frequency domains, one for each tile
• 2 temperature sensors on each tile
• Voltage and frequency can be changed separately on each domain
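The domain counts above can be turned into a small lookup model. With 24 tiles and 6 voltage domains, each voltage domain covers 4 tiles; the sketch assumes those 4 tiles form a 2×2 block, which is an assumption for illustration, as are all the names used.

```python
TILE_COLS, TILE_ROWS = 6, 4
BLOCK_W, BLOCK_H = 2, 2     # assumed shape of one voltage domain, in tiles
SENSORS_PER_TILE = 2


def frequency_domain(x, y):
    """One frequency domain per tile, numbered row by row."""
    return y * TILE_COLS + x


def voltage_domain(x, y):
    """Voltage domain of tile (x, y), assuming 2x2 tile blocks."""
    blocks_per_row = TILE_COLS // BLOCK_W
    return (y // BLOCK_H) * blocks_per_row + (x // BLOCK_W)


def tiles_in_voltage_domain(domain):
    """List every tile (x, y) that shares the given voltage domain."""
    return [(x, y)
            for y in range(TILE_ROWS)
            for x in range(TILE_COLS)
            if voltage_domain(x, y) == domain]


if __name__ == "__main__":
    print(tiles_in_voltage_domain(0))
    print(TILE_COLS * TILE_ROWS * SENSORS_PER_TILE, "temperature sensors")
```

A consequence of this granularity is that frequency can be tuned per tile, while a voltage change affects every tile in the same domain.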
Thank you